Parsed links generates erroneous links when the link URL contains request parameters
| Project: | Related links |
| Version: | 5.x-2.1 |
| Component: | Code |
| Category: | bug report |
| Priority: | critical |
| Assigned: | Unassigned |
| Status: | active |
I have just installed the new 5.x-2.1 release (was running 5.x-1.0 before) to see if this is now fixed and it appears not.
Bug Description:
If you create a link within your node body to a URL with multiple request parameters (i.e. with ampersand characters), then this link will be broken when it is added into the parsed links list.
When Related links scans a link and adds it to the list of parsed links, it replaces any ampersand characters with the HTML entity &amp;amp;, which consequently breaks the link (as the destination site interprets the characters "amp;" as the beginning of the request parameter).
This happens in both 5.x-2.1 and 5.x-1.0.
I have marked this as critical since it represents a bug in core relatedlinks functionality for any link which contains request parameters.

#1
** This comment was incorrect -- see comment #3 for corrected info **
#2
** This comment was incorrect -- see comment #3 for corrected info **
#3
Upon further investigation I can confirm that this IS a bug in relatedlinks, and can provide further details on it:
The problem is that relatedlinks converts & characters into &amp;amp; entities WITHOUT checking the characters following each & to see if this has already been done. Therefore if the link already contained &amp;amp; entities, these are corrupted and become &amp;amp;amp;, and hence are broken.
This is therefore a critical bug in relatedlinks, because any link with a properly HTML-encoded URL containing multiple request parameters will be corrupted.
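To illustrate the double encoding, here is a minimal sketch in Python (not the module's actual PHP code; `naive_encode` and the example.com URLs are made up for illustration) of what a blind replacement does to a link that is already encoded:

```python
def naive_encode(url):
    # Hypothetical stand-in for the suspected behaviour: replace every "&"
    # with "&amp;" without checking whether it already starts an entity.
    return url.replace("&", "&amp;")


raw = "http://example.com/page?a=1&b=2"
already_encoded = "http://example.com/page?a=1&amp;b=2"

print(naive_encode(raw))              # http://example.com/page?a=1&amp;b=2
print(naive_encode(already_encoded))  # http://example.com/page?a=1&amp;amp;b=2
```

The first link comes out fine; the second, which was already valid HTML, ends up with a stray "amp;" in front of the second parameter.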
What's more, because WYSIWYG editors such as TinyMCE automatically clean up user-entered URLs in this way, encoding & characters as &amp;amp; (which is correct and valid behaviour), any such links will then be corrupted by RelatedLinks and will appear in the parsed-links list with &amp;amp;amp; in them, so the links are broken.
Suggested Fix a:
- when parsing URLs from the content, leave them exactly as they are (i.e. don't change any & characters).
Suggested Fix b:
- convert any &amp;amp; HTML entities that are already there back into single & characters, then convert all & characters into &amp;amp; HTML entities. This way, you ensure all & characters have been replaced by &amp;amp; and also ensure that you do not accidentally generate any erroneous &amp;amp;amp; entities in the process.
#4
Not only is it a problem with URL parameters, it is also a problem with any HTML entities in the text of the link.
Definitely, a critical bug....
#5
It's interesting that the parsed links block is the only one that actually works as expected on our site. I have to admit that this is the only module we are using in a beta release, and only because we desperately needed something like this. Unfortunately the other two blocks that this module generates (manual links and discovered links) create incorrect links for the node, so our users were a bit confused by the relationship of the links to what they were reading, and we had to disable them. I am hoping this gets resolved and sorted out in the official release for Drupal 5.7. If, however, there are fixes that have been posted elsewhere, please kindly email me through my contact form (http://drupal.org/user/275979/contact). Furthermore, the ability to exclude email addresses from the parsed links via the module settings would be great. Not sure if this is achievable via overrides, but I am not that savvy when module configuration gets that complex.
#6
Has this been fixed in 5.x-2.2 yet?
I'm sure that implementing either of the fixes I suggested in comment #3 above would be pretty simple, since PHP already provides functions for doing HTML entity conversions.
It is a pretty critical problem, and worthy of some attention, I would think, especially if it's easy to fix?
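For what it's worth, fix b from comment #3 can be sketched in a few lines. This is a minimal sketch in Python rather than the module's own PHP, and `normalize_ampersands` is a hypothetical name, not anything in relatedlinks. I would expect PHP's `htmlspecialchars_decode()` followed by `htmlspecialchars()` to play the same roles, but I have not checked how the module currently does its encoding.

```python
def normalize_ampersands(url):
    # Step 1: turn any existing "&amp;" entities back into plain "&".
    decoded = url.replace("&amp;", "&")
    # Step 2: encode every "&" exactly once.
    return decoded.replace("&", "&amp;")


raw = "http://example.com/page?a=1&b=2"
already_encoded = "http://example.com/page?a=1&amp;b=2"

# Both forms of the same link end up identical, with no "&amp;amp;".
print(normalize_ampersands(raw))              # http://example.com/page?a=1&amp;b=2
print(normalize_ampersands(already_encoded))  # http://example.com/page?a=1&amp;b=2
```

Note that this only deals with &amp;amp; itself. Other entities, such as the ones in link text mentioned in comment #4, would still have their leading & encoded again, so they would need the same decode-first treatment.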