The code in http://api.drupal.org/api/function/valid_url/7 needs to support IDNs (internationalized domain names). The line that needs to be fixed is:
(?:[a-z0-9\-\.]|%[0-9a-f]{2})+ # A domain name or a IPv4 address
I don't have a regex ready for this validation. Maybe someone else does?
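To make the limitation concrete, here is a minimal sketch that applies only the host fragment quoted above to an ASCII host and to an internationalized one. It is written in Python rather than Drupal's PHP, and the case-insensitive matching, the helper name `matches_ascii_host`, and the sample hosts are assumptions for illustration; the rest of the valid_url() pattern is not reproduced.

```python
import re

# Host fragment copied from the line quoted above. IGNORECASE is an
# assumption about how the full pattern is applied; re.ASCII keeps the
# case folding limited to ASCII letters.
ASCII_HOST = re.compile(
    r"(?:[a-z0-9\-\.]|%[0-9a-f]{2})+",
    re.IGNORECASE | re.ASCII,
)


def matches_ascii_host(host: str) -> bool:
    """Return True if the whole host matches the ASCII-only fragment."""
    return ASCII_HOST.fullmatch(host) is not None


print(matches_ascii_host("example.com"))  # True
print(matches_ascii_host("例え.テスト"))  # False
```

The second host fails because none of its letters fall in `a-z` or `0-9`, which is the behavior this issue asks to change.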
Comments
Comment #1
mfer commented: Yes, we need to fix this. There are two parts to this...
First, would the name valid_url be correct? A URL is a kind of URI, and URIs do not allow international characters (only ASCII). Instead of a URI we would use an IRI, and the corresponding subset of that is an IRL. This may be a matter of semantics, but I'm still asking.
The current implementation is based on RFC 3986, which covers URIs. The IRI spec is RFC 3987, which is still only a Proposed Standard. That said, internationalized domain names are out in the wild, so this is a must.
We need to update the domain name and the path. What about the scheme portion (the http part)?
I think we need to replace \w with \pL_, a-z with \pL and 0-9 with \pN.
If someone writes up some tests for this I'll update the regex (unless someone else wants to).
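As a rough illustration of what the `\pL` / `\pN` substitution would accept for the host part, here is a sketch with a few test cases. It is in Python, whose standard `re` module has no `\p{...}` classes, so the Unicode general category is checked directly instead. The helper names and test hosts are hypothetical, and percent-encoded octets are left out for brevity.

```python
import unicodedata


def is_letter_or_number(ch: str) -> bool:
    """True if ch is in a Unicode L* (letter) or N* (number) category."""
    return unicodedata.category(ch)[0] in ("L", "N")


def looks_like_unicode_host(host: str) -> bool:
    # Rough stand-in for a host class built from \pL, \pN, "-" and ".";
    # a hypothetical helper, not Drupal's implementation.
    return bool(host) and all(
        is_letter_or_number(ch) or ch in "-." for ch in host
    )


# Hypothetical test cases: host -> expected result.
cases = {
    "example.com": True,
    "例え.テスト": True,
    "exa mple.com": False,  # whitespace is neither a letter nor a number
    "": False,
}
for host, expected in cases.items():
    assert looks_like_unicode_host(host) is expected, host
```

One caveat with this approach: letters and numbers alone do not cover combining marks (the Unicode M categories), so hosts in scripts that rely on them would still be rejected.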
Comment #2
hass commented: IDN support would only require an update to the domain name/hostname validation... the other parts don't need to change. I would also need http://drupal.org/node/295021#comment-1235860.
Comment #3
mfer commented: @hass - well, if we are going to go international, should we limit it to IDNs or flat out allow routable URLs like http://例え.テスト/メインページ (ICANN site)?
If we are going to allow international characters, and we should, we should allow them everywhere they will be used in a URL, IRL, or whatever.
Comment #4
hass commented: How should we ever check this with a regex? :-)
Comment #5
alexanderpas commented
Comment #6
alexanderpas commented: Postponed until #389278: Create IDN encoding and decoding functions is in.
Comment #7
dropcube commented: Subscribe
Comment #8
marcvangend commented: #389278: Create IDN encoding and decoding functions has been moved to D8 with priority 'normal'. What to do with this issue?
Comment #9
mfer commented: @marcvangend I'm marking this issue 'by design'. The intent of valid_url is to validate URLs. We are now talking about the IRI space and not the URI space.
So, the current setup is by design. The path forward, encoding/decoding along with validation to handle IDNs, is in that other issue. We can work from there.
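For readers wondering what "encode first, then validate" could look like, here is a minimal sketch in Python (not Drupal's PHP, and not the functions proposed in #389278). It assumes Python's built-in `idna` codec, which implements the older IDNA 2003 rules, and a hypothetical helper `iri_to_uri` that handles only the scheme, host, and path. Userinfo and ports are ignored, and the query and fragment are passed through unchanged.

```python
import re
from urllib.parse import quote, urlsplit, urlunsplit

# ASCII-only host fragment in the style of the one quoted in this issue.
ASCII_HOST = re.compile(
    r"(?:[a-z0-9\-\.]|%[0-9a-f]{2})+",
    re.IGNORECASE | re.ASCII,
)


def iri_to_uri(iri: str) -> str:
    """Convert the host to Punycode and percent-encode the path."""
    parts = urlsplit(iri)
    if parts.hostname is None:
        raise ValueError("no host in %r" % iri)
    # The "idna" codec turns each non-ASCII label into an xn-- label.
    host = parts.hostname.encode("idna").decode("ascii")
    # quote() encodes the path as UTF-8 and escapes the non-ASCII bytes;
    # "%" stays safe so existing escapes are not double-encoded.
    path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


uri = iri_to_uri("http://例え.テスト/メインページ")
print(uri)

# After encoding, the host passes the existing ASCII-only check.
print(ASCII_HOST.fullmatch(urlsplit(uri).hostname) is not None)  # True
```

With this split, the ASCII-only URL validation can stay as it is, and the internationalized handling lives entirely in the encoding step.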