There is an old problem in Drupal: when you copy/paste text containing references to anchors within the same page, clicking a reference sends you to the front page instead of to the anchor, because of the way Drupal handles URLs.
I find this particularly annoying because the site I am developing will take content (articles) copy/pasted from Word or OpenOffice, and these articles very often have footnotes generated by Word or OpenOffice. (Word/XP now does a particularly nice job of converting superscripted footnote references into something easily readable on the web when you copy/paste.)
I wrote this module (for 4.5.0) to handle the problem, but I feel a bit intimidated about submitting it to CVS because a) there are clearly lots of people who are far better than me at this kind of thing, b) I can't guarantee to maintain it, and c) although it seems to work for me, there may be other cases I've not taken into account.
What does the module do?
Basically, it works with page, story, or book type nodes: it searches the node body for any links with href=#xxx (or href=yyy#xxx) and replaces these with href=pathalias#xxx. In the teaser, such links are stripped out entirely, link text included (I don't want them to appear there, and anyway they could lead to confusion). In all other node types, it just strips the footnote links out of both body and teaser (this of course is fine for my purposes but not necessarily very generic).
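To make the rewrite concrete, here is a minimal standalone sketch of the two operations described above, run outside Drupal on a sample string. The function names footnotes_rewrite() and footnotes_strip(), the constant name, and the sample alias are all hypothetical and exist only for this illustration. The pattern is the one the module uses.

```php
<?php
// Same pattern as in the module. Group 1 is everything up to the start of
// the href value, group 6 is whatever precedes the hash, and group 7 is the
// fragment plus the rest of the anchor. (Constant name is hypothetical.)
define('FOOTNOTE_PATTERN', '/(<a(\s)+[^>]*href(\s)*=(\s)*(\'|")*)([^"\'>]*)#+([^"\'>]+(\'|")*[^>]*(\s)*>(.|\s)*?<\/a(\s)*>)/i');

// Hypothetical helper: point every "#fragment" link at the given path alias.
// ${1} is used instead of \1 so that an alias beginning with a digit is not
// read as part of the backreference number.
function footnotes_rewrite($html, $alias) {
  return preg_replace(FOOTNOTE_PATTERN, '${1}' . $alias . '#${7}', $html);
}

// Hypothetical helper: remove "#fragment" links altogether, as for teasers.
function footnotes_strip($html) {
  return preg_replace(FOOTNOTE_PATTERN, '', $html);
}

$body = 'Some claim<a href="#_ftn1" name="_ftnref1">[1]</a> in the text.';

// Prints: Some claim<a href="articles/my-article#_ftn1" name="_ftnref1">[1]</a> in the text.
echo footnotes_rewrite($body, 'articles/my-article'), "\n";

// Prints: Some claim in the text.
echo footnotes_strip($body), "\n";
```

Note that stripping removes the whole link, including the "[1]" text, which is what you want in a teaser where the footnote itself is not shown.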
In order for the module to work, you have to have the "path" module enabled, and enter a "Path alias" for the content. You can't just use the node number, because while you are creating a node its number is not yet known (logical!!).
Anyway, here is the code. Any comments would be very welcome. Otherwise, feel free to use it, and I hope this is helpful:
<?php
// $Id: node_footnotes.module,v 0.1 2004/10/20 Joel Guesclin $

/**
 * @file
 * Resolves the problem of dead footnotes (e.g. bottom of page notes) as
 * generated by Word or OpenOffice.
 */

/**
 * Implementation of hook_help().
 */
function node_footnotes_help($section) {
  switch ($section) {
    case 'admin/modules#description':
      return t('Converts footnote references to include node reference');
  }
}
/**
 * Implementation of hook_nodeapi().
 */
function node_footnotes_nodeapi(&$node, $op, $teaser = NULL, $page = NULL) {
  // Here is the original regular expression to get all anchors out of a chunk of HTML:
  // '/<a(\s)+[^>]*href(\s)*=(\s)*(\'|")*[^"\'>]+(\'|")*[^>]*(\s)*>(.|\s)*?<\/a(\s)*>/i'
  // Thankyou Sam Fullman at codewalkers.com for a sweet piece of coding!!!
  // And then, we add the hash in order to find only the footnote type href's.
  // Group 1 is everything up to the start of the href value, group 6 is
  // whatever precedes the hash, and group 7 is the fragment plus the rest
  // of the anchor.
  $pattern = '/(<a(\s)+[^>]*href(\s)*=(\s)*(\'|")*)([^"\'>]*)#+([^"\'>]+(\'|")*[^>]*(\s)*>(.|\s)*?<\/a(\s)*>)/i'; // "
  switch ($op) {
    case 'validate':
      if ($node->type == 'page' || $node->type == 'story' || $node->type == 'book') {
        if ($node->title && preg_match($pattern, $node->body)) {
          if (empty($node->path)) {
            form_set_error('path', t('You must supply a path.'));
            break;
          }
          // ${1} rather than \1, so that a path alias starting with a digit
          // is not read as part of the backreference number.
          $replacement = '${1}' . $node->path . '#${7}';
          $node->body = preg_replace($pattern, $replacement, $node->body);
          // Footnote links are removed from the teaser altogether.
          if ($node->teaser) {
            $node->teaser = preg_replace($pattern, '', $node->teaser);
          }
        }
      }
      else {
        // Other node types: strip the footnote links from body and teaser.
        if ($node->body) {
          $node->body = preg_replace($pattern, '', $node->body);
        }
        if ($node->teaser) {
          $node->teaser = preg_replace($pattern, '', $node->teaser);
        }
      }
      break;
  }
}
By the way, any stray // " comment at the end of a line is just there to unconfuse MPS PHP Designer so that it switches its colour coding back to something coherent!