SpamSpan should not touch URLs at all, I think.
[scheme]://[user]:[pass]@[host]/[path]?[query]#[fragment]
"[scheme]://" could be optional.
Try this as an example:
<a href="user:password@www.drupal.org/something?q=user@drupal.org#user@drupal.org">Link 1</a>
<a href="https://user:password@www.drupal.org/something?q=user@drupal.org#user@drupal.org">Link 2</a>
You could use the PHP function parse_url() to check that:
http://php.net/manual/en/function.parse-url.php
Perhaps add an additional check: does an invalid URL become valid when a scheme is prepended?
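A rough sketch of that check, written in Python with urllib.parse.urlsplit standing in for PHP's parse_url() (the two parsers do not behave identically). The function name looks_like_url, the default scheme, and the rule that a bare user@host does not count as a URL are all assumptions made for illustration; none of this is SpamSpan code.

```python
from urllib.parse import urlsplit


def looks_like_url(token, default_scheme="http"):
    """Return True if token parses as a URL, with or without a scheme.

    Hypothetical helper: the acceptance rules below are assumptions.
    """
    try:
        if "://" in token:
            # Explicit scheme: accept as soon as a host is present.
            return urlsplit(token).hostname is not None
        # No scheme: retry with one prepended, as suggested above.
        parts = urlsplit(f"{default_scheme}://{token}")
    except ValueError:
        # urlsplit raises ValueError on malformed input such as a bad IPv6 host.
        return False
    if parts.hostname is None:
        return False
    # "fred@example.com" also parses once a scheme is added, so require
    # something beyond user@host before calling it a URL.
    return bool(parts.password or parts.path or parts.query or parts.fragment)


link_1 = "user:password@www.drupal.org/something?q=user@drupal.org#user@drupal.org"
link_2 = "https://" + link_1
print(looks_like_url(link_1), looks_like_url(link_2), looks_like_url("fred@example.com"))
```

Note that a bare address such as fred@example.com also parses as a URL once a scheme is prepended, which is why the sketch asks for a password, path, query, or fragment as well.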
Comments
Comment #1
gaele commented
This sucks.
Flickr uses "@" a lot in their urls, e.g.
http://farm6.static.flickr.com/5060/buddyicons/1606911@N23.jpg
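To show why this URL trips an address matcher, here is a small Python sketch. NAIVE_EMAIL is a deliberately simple, hypothetical pattern and is not the regular expression SpamSpan actually uses. The "@" sits in the path, not in the user part of the URL, yet the tail of the path still looks like an address.

```python
import re
from urllib.parse import urlsplit

# Hypothetical, simplified address pattern (an assumption for illustration).
NAIVE_EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

url = "http://farm6.static.flickr.com/5060/buddyicons/1606911@N23.jpg"
parts = urlsplit(url)

print(parts.hostname)  # farm6.static.flickr.com
print(parts.username)  # None: there is no user part in this URL
print(parts.path)      # /5060/buddyicons/1606911@N23.jpg

false_hit = NAIVE_EMAIL.search(url)
print(false_hit.group(0))  # 1606911@N23.jpg
```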
Comment #2
peterx commented
There is no maintainer for the D6 version. The D7 version selects via a regular expression, and regular expressions usually break when you change them, introducing more errors than you fix.
A change like this should be made in the D7 version and then backported. D7 has a test system. This change needs someone who is an expert on regular expressions and the D7 test system, someone with the time to experiment and to create test cases in the Drupal test system.
Comment #3
peterx commented
The following line is the example presented as a failure.
user:password@www.drupal.org/something?q=user@drupal.org#user@drupal.org
I thought about changing the regular expression to exclude email addresses preceded by a colon, then I found a site with the following text.
Email example:fred@example.com
A regular expression will not fix the problem. The other addresses need a span or div around them to protect them, and SpamSpan would need appropriate code to identify the protected addresses. SpamSpan could have an option to only process addresses identified by a span or a div, but it would have to be off by default and you would have to find someone to develop the change.
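The colon idea can be tried out in a few lines of Python. The pattern below is a hypothetical stand-in, not SpamSpan's real regular expression; it only adds a "not preceded by a colon" lookbehind to a simple address matcher. Under that assumption it fails in both directions on the two strings quoted above.

```python
import re

# Hypothetical pattern: a simple address matcher that refuses to start
# right after a colon. \b stops it from starting mid-word instead.
EMAIL_NOT_AFTER_COLON = re.compile(r"(?<!:)\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

url_text = "user:password@www.drupal.org/something?q=user@drupal.org#user@drupal.org"
plain_text = "Email example:fred@example.com"

# The user:password part is skipped, but the query and fragment still match.
url_hits = EMAIL_NOT_AFTER_COLON.findall(url_text)
print(url_hits)  # ['user@drupal.org', 'user@drupal.org']

# The genuine address follows a colon, so it is no longer found at all.
plain_hits = EMAIL_NOT_AFTER_COLON.findall(plain_text)
print(plain_hits)  # []
```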
You could also add specialised fields to the content type and insert them into text through tokens. There are a few modules for that type of change.
If there is an easy, reliable way to tell an email address apart from the examples you provide, talk with regex experts about submitting a change.