Needs work
Project:
Drupal core
Version:
main
Component:
base system
Priority:
Normal
Category:
Task
Assigned:
Unassigned
Issue tags:
Reporter:
Created:
13 Aug 2008 at 21:53 UTC
Updated:
10 Sep 2024 at 19:24 UTC
Jump to comment: Most recent, Most recent file
Comments
Comment #1
grndlvl commentedcreated test for valid_url() used your proposed function for testing, function has some errors.
Here are my test results and the patch for the test is included.
Comment #2
grndlvl commentedtest patch will not work with this function, made alterations after creating the patch to work with mentioned function, will roll different patch that works with the following function tomorrow.** Note this message does not affect the test results show.
Comment #3
nancydruInvalid URL checking is not a feature - it's a bug. My users have reported that carets (^) do not work. Indeed they are not checked for in 5.x, 6.x, or 7.x. I have trouble reading "preg" patterns so I can't tell if your proposed patch fixes this.
Since this is causing "valid_url()" to fail, I'm changing this to a bug and hope it gets not only committed to 7, but all the way back to 5.
Here's one to add to your test:
Comment #4
nancydruI just checked, your patch does not include the caret.
Comment #5
nancydruThe patch (with the caret fix) is now incorporated into the Web Links module and seems to work fine.
May I suggest some way of taking a boolean also for the second param so that it doesn't change the current API?
Comment #6
damien tournoud commented@NancyDru: RFC1738 tells us that:
The caret symbol is not one of them and shouldn't appear in an URL unencoded.
Comment #7
nancydruThe example I gave above is from an actual issue in the Web Links queue. It is created from a large and active web site. Do you wish to inform them that their web site is invalid?
BTW, valid_url, in its present form, also fails when then caret is encoded.
Comment #8
nancydruSo, I went back and removed the caret that I added and tested, and this patch appears to have a debugging lone left in, because it showed above the heading a var_dump type line:
array(1) { [0]=> string(1) "^" }When I add a drupal_urlencode() before valid_url, I get
which still fails to validate.
Comment #9
nancydruJust to let you know that
function valid_url($url, array $options = array()) {causes a WSOD in PHP 5.2.6.Comment #10
flickerfly commentedFor those who see the bug side of this, you may want to watch: http://drupal.org/node/124492 . mfer has come up with a fix that is getting worked through the system towards D7, but also with an eye to back-port to D5 and D6.
For here, other urls to test and extend for: tel:// (for phone numbers) and xmpp:// (for jabber IM)
These may be useful in a social networking scenario, among others. Also, moving towards the mobile browsers that are also phones, these could become more desired.
The XMPP URI stuff is something I had a very small part in a long time ago, but you can find more information here: http://wiki.jabber.org/index.php/XMPP_URIs and the RFC here: http://tools.ietf.org/html/rfc4622.
The tel URI is explained in http://www.ietf.org/rfc/rfc3966.txt and http://www.ietf.org/rfc/rfc2806.txt
Here are some examples of URL's to test against provided by http://msdn.microsoft.com/en-us/library/ms709071(VS.85).aspx and http://xmpp.org/extensions/xep-0147.html.
callto:192.168.103.77+type=ipcallto:someone@example.com+type=directory
callto:msils/someone@example.com+type=directory
callto:msils:1002/someone@example.com+type=directory
callto:12345+type=phone
callto:12345+gateway=fusion+type=phone
callto:someone@example.com
callto:12345+type=phone
xmpp:romeo@montague.net?message
xmpp:romeo@montague.net?message;subject=Test%20Message;body=Here%27s%20a%20test%20message
Comment #11
mfer commentedI propose a 4 part proposal for moving forward with better validation. Instead of expanding valid_url to do everything we add some additional validation functions.
The proposed spec for IRIs is at http://tools.ietf.org/html/rfc3987. Thoughts?
Comment #12
flickerfly commentedI like the sound of mfer's solution. Configurability means flexibility without compromise to security. So would we provide for custom validation of the URL, meaning the user puts in their own regex? I'm thinking something like the date/time config that allows custom time config.
My guts says "really bad idea", but might as well toss it on the table so that can get slaughtered logically instead of just by my gut reaction. :-)
Comment #13
mfer commented@flickerfly for url/uri validation I don't think we should allow the users to input their own regex. As part of the fields in core effort there are validators for fields. Developers can assign those. I want to see less configuration options. Maybe an option to set the allow schemes that has now web interface.
Comment #14
hass commentedOnly to notify you all: #368472: valid_url() marks correct IDN domains as invalid
Comment #15
hass commentedAbout extending the function like discussed here, I'd like to please you to make sure module developers are able to define what protocols are valid via params. For e.g. I'd like to reuse the function in my modules, but make sure only
http/httpsis allowed. I need this because I' like to validate the URL prior to using it much later indrupal_http_request().Comment #16
alexanderpas commentedComment #17
alexanderpas commentedOnly to notify you all: #389278: Create IDN encoding and decoding functions
Comment #18
dropcube commentedSubscribe
Comment #19
sunLet's start with a re-roll of this patch against D8. Any takers?
Comment #20
neclimdulIn Drupal 8 we should probably be using filter_var($url, FILTER_VALIDATE_URL) as a more robust solution.
Comment #21
neclimdulactually, we can do that in 7 too, is there a reason we aren't? Anyone know some history here?
Comment #22
alexanderpas commentedwe would still need to check the protocol, and use idn_to_ascii() to handle IDN domain names
Comment #23
pasquallehttp://mathiasbynens.be/demo/url-regex
this seems like a good list of urls to test against
Comment #24
mfer commented@alexanderpas FILTER_VALIDATE_URL isn't a real validation option. It passes the work to parse_url which notes it can't be used for validation. URLs that are invalid pass it. If you look at the implementation it doesn't work.
Also, idn_to_ascii() is part of an add on for PHP and not part of PHP proper. Using that would create a dependency on functionality not there is many environments. Is the idea to add another dependency to Drupal?
@Pasqualle I had no idea about that page but it's great. Y'all should consider implementing the one from diegoperini if it passes all the tests.
I would love to see idn support and there are some other tickets for that floating around.
Comment #25
Mixologicidn_to_ascii() is merely an implementation of the punycode. Urls could be converted to their punycode equivalents to check for validity.
More info here:
https://drupal.org/node/389278
Comment #26
dave reidAgreed, we have the same problem with filter_var(). I would suspect we need to look into Symfony's UrlValidator class as it uses a regex approach, and may work better than our current standard. Or should be improved via pull request to add better support. I'd rather us re-use an existing code like that rather than implement something custom.
Comment #35
avpadernoComment #42
catchCore uses a (modified since this issue was opened) regular expression, not filter_var(), so the issue title here is wrong.
The two bugs in the issue summary were fixed in the meantime, I think we need an updated issue summary with which valid URLs are not allowed by UrlHelper::isValid() and which valid URLs are rejected. These could then be moved into test coverage and we could go from there. The idea in #24 seems fine but is an API addition rather than a bugfix.
Comment #43
catch