Refactor regex for better validation and to allow for special chars in urls

neo2049 - August 2, 2009 - 19:07
Project:Follow
Version:7.x-1.x-dev
Component:Code
Category:task
Priority:normal
Assigned:q0rban
Status:closed
Description

Hi,

I've been able to add facebook, twitter and flickr to the follow module but when I try last.fm it says my url is invalid. I have checked the url so many times but no joy. I think the module doesn't like the .fm in the url.

Please help.

#1

q0rban - August 4, 2009 - 02:17
Status:active» postponed (maintainer needs more info)

Hi, I'm sorry you are having troubles.

I can't replicate this. Can you paste the url here that you are unable to use? Thanks!

#2

neo2049 - August 4, 2009 - 17:26

The URL I am trying is http://www.last.fm/music/Musica+Reverso

Thanks for your help.

#3

q0rban - August 4, 2009 - 17:38

hmm, I imagine it's the plus symbol actually. Try http://www.last.fm/music/Musica%2BReverso

#4

neo2049 - August 5, 2009 - 00:32

Thanks man! It works perfectly now.

#5

q0rban - August 5, 2009 - 13:15
Status:postponed (maintainer needs more info)» active

I'm going to keep this active in case it continues to be a problem for other people.

#6

q0rban - August 12, 2009 - 00:29
Priority:critical» normal

Bumping down to normal

#7

q0rban - September 28, 2009 - 14:31
Title:I'm unable to add last.fm» Refactor regex to allow for special chars in urls
Category:bug report» task
Assigned to:neo2049» q0rban

#8

pwolanin - November 2, 2009 - 18:40
Title:Refactor regex to allow for special chars in urls» Refactor regex for better validation and to allow for special chars in urls
Version:6.x-1.2» 7.x-1.x-dev

started working on this for D7, adding knwn service urls to build better validation

#9

pwolanin - November 2, 2009 - 18:44
Status:active» needs review

here's a pass at a D7 patch, not really tested yet.

AttachmentSize
follow-D7-538172-9.patch 2.48 KB

#10

pwolanin - November 2, 2009 - 20:31

picking up the suggestion here: http://drupal.org/node/619276#comment-2218064 for a more complex data structure to help with validation and display

#11

pwolanin - November 2, 2009 - 20:58
AttachmentSize
follow-validation-538172-11.patch 5.8 KB

#12

pwolanin - November 2, 2009 - 21:14

fixes a bug with the user block theme function call also.

AttachmentSize
follow-validation-538172-12.patch 6.31 KB

#13

pwolanin - November 2, 2009 - 21:52

remove debug code, improve error message.

AttachmentSize
follow-validation-538172-13.patch 6.59 KB

#14

pwolanin - November 3, 2009 - 15:48

Allowing special characters is hard, since there is only a single text area. Basically to do it correctly we need to parse the URL on input and store the parts separately in the DB. I think drupal core is doing something like this in the menu module.

I'm planning to commit the above patch, since solving the other concern seems liek a more substantial problem that at first sight.

#15

q0rban - November 3, 2009 - 16:33

Marked #622118: Follow doesn't accept valid Flickr link dues to @ symbol as a duplicate of this issue.

Also, anyone that is having trouble with special characters can refer to this doc and convert the special characters by hand.

@pwolanin, is there any way to parse out the url and pass all fragments to urlencode?

#16

pwolanin - November 3, 2009 - 17:07

@q0rban - yes, see: http://api.drupal.org/api/function/menu_edit_item_validate/7

but then I'd suggest the DB storage needs to change. In fact the module should not even need to urlencode anything if using the l() function properly due to http://api.drupal.org/api/function/drupal_http_build_query/7

#17

q0rban - November 3, 2009 - 17:20

Ok, then I think the only thing the regex needs to do is make sure it has the domain in the url. Then we still need to somehow filter for XSS before saving to the DB, correct?

#18

pwolanin - November 3, 2009 - 17:40

@q0rban - no, l() should filter this on output, so using that well will make everything both easier and more robust. In fact, if we are going to parse the URL, likely we can make the regex simpler.

We will essentially need to save to the DB 2 possible ways:

1) a path + serialized options array as 2 columns
2) a path, query, and fragment columns (query sill likely as a serialized array)

#19

q0rban - November 3, 2009 - 17:48

Hmm... I'm going to have to think about this.. Off the bat I'm not crazy about parsing it and saving it to the DB that way.

#20

pwolanin - November 3, 2009 - 18:06

@q0rban - well that's how drupal core saves links - what's the downside?

#21

pwolanin - November 3, 2009 - 18:32

#22

pwolanin - November 3, 2009 - 21:01
AttachmentSize
follow-validation-538172-21.patch 12.61 KB

#23

pwolanin - November 3, 2009 - 21:30

could also look at using http://api.drupal.org/api/function/drupal_parse_url/7, but that's not as readily backported to 6.x

#24

pwolanin - November 3, 2009 - 22:02

Minor tweaks to regex generation and help/error text. About ready to commit this.

AttachmentSize
follow-validation-538172-24.patch 13.42 KB

#25

pwolanin - November 3, 2009 - 22:20
Status:needs review» fixed

#26

pwolanin - November 4, 2009 - 15:25

opps - broke a clean install. Follow-up patch to fix this - committing.

AttachmentSize
fix-install-538172-26.patch 642 bytes

#27

System Message - November 18, 2009 - 15:30
Status:fixed» closed

Automatically closed -- issue fixed for 2 weeks with no activity.

 
 

Drupal is a registered trademark of Dries Buytaert.