Support from Acquia helps fund testing for Drupal Acquia logo

Comments

adschar’s picture

Assigned: Unassigned » adschar
FileSize
943 bytes

Indeed the old version accepted freely with/without final dot, which is incorrect. I've tried to follow RFC 1035 for the new version.

adschar’s picture

Dries’s picture

That's intentionally because people should be able to use e-mail addresses like root@localhost.

adschar’s picture

Dries,

root@localhost is a normal RFC1035 address, but are you trying to tell me that some people are using root@localhost. with a final dot?

If we chose the last alternative, we should patch common.inc explicitly stating that we want that possible final dot, and then close this issue.

Please note that neither the old nor the new version tries to limit the length of the top domain, or tries to say that it should come from the series of standard top domains like .biz, .eu, .com, .uk, .be etc.

Arie Dirk

killes@www.drop.org’s picture

Priority: Critical » Normal

Doesn't apply anymore.

Eglish’s picture

Version: » 4.7.0-beta2

Tested through /admin/user/create, and the trailing '.' is still allowed in 4.7 beta2

jadwigo’s picture

ok, root@localhost I can understand for testing purposes...

but out on the www it's not usefull to have people register on a site with jimbob@aol which would be an equivalent regexp... then I really want a TLD so my denial of approval mails get where they need to go...

So in my opinion it is a choice between ease of use (in real life situations where you don't want local email adresses to validate) and ease of local testing (where local email adresses are acceptable)

local testing seems to me is a very specialized case that should not break functionality for users who need the functionality the most ... people who really want to test locally could try root@localhost.lan instead.

webchick’s picture

> people who really want to test locally could try root@localhost.lan instead.

No, they can't. Local SMTP servers will only accept addresses literally addressed to 'something@localhost'.

jadwigo’s picture

that complicates things, but it still remains a one-off case which breaks desired behaviour for normal use IMO.

Isn't it possible to solve this by configuring for example your hosts file and the mailserver configuration to route localhost.lan?

webchick’s picture

Version: 4.7.0-beta2 » x.y.z
Category: bug » feature

Filing as feature request against CVS, since it would change standard Drupal behaviour.

webchick’s picture

Title: valid_email_address suffix bug » Properly validate e-mail suffixes
Status: Active » Needs work

...and title. Also, there's a patch.

PROMES’s picture

Version: x.y.z » 4.7.4
Category: feature » bug

Something is unclear for me in the above discussion.
I patched valid_email_address() in common.inc with patch common.inc_patch_12274 and still can use a very invalid email address: a@b. Is this correct by the RFC's? Or still a bug? Or is this for special case me@localhost?
If it's only valid for x@localhost, why don't you make it a special case?

Up till now I use the following validation function in non-drupal websites:
if (eregi("^[0-9a-z]([-_.]?[0-9a-z])*@[0-9a-z]([-.]?[0-9a-z])*\\.([a-z]{2}|com|edu|gov|int|mil|net|org|shop|aero|biz|coop|info|museum|name|shop|arpa|pro|mobi|jobs|travel)$", $mail, $check)) { ... }

I think not testing the 2 character second-level domain names and checking the few non-two character ones is a good idea. Or is this special-case list currently not valid any more? Due to ipv6?

Gerhard Killesreiter’s picture

Version: 4.7.4 » 5.x-dev

moving

seanbfuller’s picture

Version: 5.x-dev » 6.x-dev

I ran across this issue from a client request recently, so we had to take a quick crack at it. I rolled up this rough patch based on Cal Henderson's email validation routine. It also checks for a variable that will determine whether the site should allow email addresses without the ".com" -style extension. Essentially this is an option for admins to flip their email validation to use "real life" rules, essentially breaking the spec.

Quick things to note:
1. I built this against 6.0-beta1 (and updated the version accordingly)
2. The code is checking the variable, but there's no setting to flip it on or off. I wasn't sure the best place for it to live.
3. The code doesn't about for [] characters (cal@[hello.com]). Someone with a bit more knowledge of these email addresses might want to comment.

Obviously there are some potential issues with this approach, but as I was working on it, I thought I'd share. If there is a more up-to-date issue that I wasn't able to find, please let me know.

seanbfuller’s picture

Sorry, I forgot the patch. It's late in the day for me. Here it is.

wmostrey’s picture

Assigned: adschar » Unassigned

I really think we should tackle this one for once and for all. We could include an exception for "root@localhost" but not having decent e-mail validation sounds like a pretty big thing.

If we don't want to fix it, let's mark this "won't fix" for future reference.

cburschka’s picture

There are two parts to this:

1.) user@something. Definite bug. Trailing periods are not allowed in hostnames. Do not validate them.
2.) Possibly undesirable input of invalid hostnames. user@localhost is just as valid by the RFC as is user@aol and user@this.tld.is.stupid

Keep in mind that Drupal /can/ be installed in LAN environments, whose local nameservers /can/ resolve normally invalid TLDs. Filtering out by TLD at this low level is presumptuous. We already have a feature that filters email addresses more intelligently: the access rules. These could be improved to allow admins to allow, require or disallow certain domains or TLDs if that doesn't work properly yet. It shouldn't be duplicated here.

cburschka’s picture

Status: Needs work » Needs review
FileSize
768 bytes

The attached patch fixes the first point, ie. the actual bug.

Pre-patch:

admin@localhost - valid
admin@elsewhere - valid
admin@localhost.com - valid
admin@localhost. - valid
admin@ - invalid

Post-patch:

admin@localhost - valid
admin@elsewhere - valid
admin@localhost.com - valid
admin@localhost. - invalid
admin@ - invalid

I strongly believe that valid_email_address should not try to make intelligent guesses about the hostname namespace, and instead stick to the letter of RFC 822 as closely as feasible.

Edit: NOTE - this patch is racing against another one that expands validation from addr-spec to full mailbox format. The losing patch will require a re-roll.

cburschka’s picture

Title: Properly validate e-mail suffixes » Do not validate email addresses with trailing periods

Title updated to reflect the patch better.

dalin’s picture

What about using checkdnsrr($domain,"MX"). That way we could let someone@string validate only if 'string' has a valid MX dns entry?

eric.chenchao’s picture

Hi, I also meet this email validation problem now when I found one user registered with email like test@test. I really think this is not quite good for customer database.

So, I try to adjust the regex expression in the function valid_email_address,

//  $domain = '(?:(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.?)+';
   $subdomain = '(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])';
   $domaintype = '(?:[a-zA-Z]{2,})';
   $domain = "(?:$subdomain\.)+$domaintype ";
//  return preg_match("/^$user@($domain|(\[($ipv4|$ipv6)\]))$/", $mail);
   return preg_match("/^$user@($domain|(\[($ipv4|$ipv6)\])|localhost)$/", $mail);
Kuldip Gohil’s picture

HI I have found this

It is good to use filter_var php function for email validation...

like it is also used in D7

http://api.drupal.org/api/function/valid_email_address/7

jadwigo’s picture

filter_var only exists in php 5.2.x or higher, which is okay for drupal 7, but not for drupal 6

DamienMcKenna’s picture

Please note the work that went into #265548: valid_email_address() is not RFC compliant, including a patch that brings verification that is very accurate to the RFCs.

aufumy’s picture

FileSize
8.61 KB

The patch mentioned by DamienMcKenna, which includes gpl'd code from Dominic Sayers, is attached.

Applied against drupal-6.17

Cyberwolf’s picture

Subscribing.

Status: Needs review » Needs work

The last submitted patch, 12274.patch, failed testing.

mcfilms’s picture

Version: 6.x-dev » 6.19
Priority: Normal » Major

I realize I am more than six years late to the party. But I would like to mention that the issue the OP posted about is still present. One person can register as:
example@example.com
and a second person can go through the registration as:
example@example.com.

That trailing dot will be seen as a new address. But Drupal's email system will not send a confirmation mail. Worse, if that person was automatically enrolled in SimpleNews, they can cause the newsletter sending to error out. In my experience this can cause cron to get stuck in an endless loop and not do the other cron tasks.

I changed the status of this to "major" and I think it should be critical. I can see a scenario where a bad actor could bring a site to its knees by just flooding the user registrations with email addresses with a trailing period.

This does indeed seem to be fixed in D7. Is there any plan to backport it to D6?

DamienMcKenna’s picture

@mcfilms: This was "fixed" in D7 by using new functionality in PHP 5.2, but D6 can't do that so this patch is still relevant.

mcfilms’s picture

Thank you. Do you mean the commontest patch at http://drupal.org/node/265548#comment-868037 or the JeremyFrench version at http://drupal.org/node/265548#comment-1478880?

Or do you mean the version you posted at http://drupal.org/node/265548#comment-1502570. That one is the most recent. However I notice that it did not go through the QA tests the other two did. Is there a reason?

Also do you know if any of these will be rolled into the next Drupal 6.20 release? Seems like it could be a fairly dangerous bug. It is in my case.

DamienMcKenna’s picture

Version: 6.19 » 6.x-dev
Status: Needs work » Needs review
FileSize
8.63 KB

Rerolled the patch against the latest DRUPAL-6 branch, updated the header in the patch to note the relevant RFCs and remove the attribution (which had been mentioned elsewhere).

Status: Needs review » Needs work

The last submitted patch, drupal-n12274-rfc_email_address_check.patch, failed testing.

DamienMcKenna’s picture

Status: Needs work » Needs review
FileSize
8.55 KB

Trying again, this time without the unnecessary lines at the top (for unidentified files added to my local CVS working directory).

Status: Needs review » Needs work

The last submitted patch, drupal-n12274-rfc_email_address_check.patch, failed testing.

dalin’s picture

@DamienMcKenna I don't understand why the testbot doesn't like your patch, it applies just fine for me.

But I think this patch needs waaaay more comments. For example near the top is a large switch statement. What is this trying to do and why. The Drupal standard is generally to have a comment every ten or twenty lines explaining what's going on.

Kuldip Gohil’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, drupal-n12274-rfc_email_address_check.patch, failed testing.

joachim’s picture

Golly, that's a lot of code when surely a regex can do it!

DamienMcKenna’s picture

@Joachim, please read about it on http://www.dominicsayers.com/isemail/ to see why it's more complicated than that. I'm sure someone who's handy with regex could help improve the patch. BTW Dominic is working on an update, which I'm going to re-roll once it's out.

markshust’s picture

FileSize
8.31 KB

I was getting an error applying the patch in #33. Something related to the comments.

Trying to submit the same patch with edited comment section.

markshust’s picture

Status: Needs work » Needs review
FileSize
8.31 KB

Trying again

markshust’s picture

Status: Needs review » Needs work

Re: @dalin, #35: Not sure why it's failing either - works fine for me too.

markshust’s picture

Status: Needs work » Needs review
FileSize
559 bytes

Wowsers..., for some reason I completely overlooked the code that was being submitted. The extra appended period can easily be gotten rid of with regex. My regex is ok, there may be an easier way to do this but this is pretty straightforward to me.

This patch will pass email@domain, not pass email@domain., but pass email@domain.com

Please review it and let me know if this takes care of this 5 year old issue!!! This patch can definitely be improved upon to check for characters inbetween periods in the tld or domains ending in stuff like .co.uk, but can be submitted as-is to take care of the period issue and close out this darn ticket.

PS: In your own codebase if you wanted to me the .com required, all you would have to do is use a + instead of * near the end of the regex.

markshust’s picture

FileSize
558 bytes

Something must have gone awry when I copied and pasted this from linux to mac.... let's try again.

webchick’s picture

Issue tags: +Needs tests

Let's get some tests for this.

webchick’s picture

Issue tags: -Needs tests

LOL. Sorry! Didn't realize this was D6. :)

Status: Needs review » Needs work

The last submitted patch, common.inc_.patch, failed testing.

markshust’s picture

Status: Needs work » Needs review
FileSize
554 bytes

Ha, no prob. Hmm... maybe its failing because I had the previous filename wrong? I'm not sure what else it could be.

markshust’s picture

So.... I don't know if I'm the suck at submitting patches, if it's my fault, or drupal's fault. Someone please help out, there's no reason why this should be failing.

Status: Needs review » Needs work

The last submitted patch, common.inc_.patch, failed testing.

markshust’s picture

Status: Needs work » Needs review
FileSize
761 bytes

Trying one more against the cvs branch 6 head, last one before I'm all out of ideas...

Status: Needs review » Needs work

The last submitted patch, common.inc_.patch, failed testing.

markshust’s picture

Status: Needs work » Needs review
FileSize
780 bytes

Hmm, it says it is looking for the file on line 8. Let's try this patch.

Status: Needs review » Needs work

The last submitted patch, common.inc_.patch, failed testing.

markshust’s picture

Anyone? I have a fix for this, but not sure why it's not passing testing.

markshust’s picture

Status: Needs work » Needs review

#53: common.inc_.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, common.inc_.patch, failed testing.

markshust’s picture

perhaps this is something with the testing process. it says that it cannot find the file, but it is stated correctly in the diff as includes/common.inc

[15:22:01] Log initialzed.
[15:22:01] Environment variables and review information initialized.
[15:22:01] Review started.
[15:22:01] Invoking operation [setup]...
[15:22:01] Database backend [mysql] loaded.
[15:22:01] Invoking operation [fetch]...
[15:22:01] Fetched [common.inc__42.patch].
[15:22:01] Invoking operation [checkout]...
[15:22:01] Version control system backend [cvs] loaded.
[15:22:05] Main branch [drupal] checkout complete.
[15:22:07] Dependency [simpletest] checkout complete.
[15:22:07] Invoking operation [check]...
[15:22:07] Invoking operation [apply]...
[15:22:07] Command [patch -p0 -i /var/lib/drupaltestbot/sites/default/files/review/common.inc__42.patch 2>&1] failed with status [1] and output:
can't find file to patch at input line 9
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|? common.inc.patch
|Index: includes/common.inc
|===================================================================
|RCS file: /cvs/drupal/drupal/includes/common.inc,v
|retrieving revision 1.756.2.107
|diff -u -p -r1.756.2.107 common.inc
|--- includes/common.inc	15 Dec 2010 21:34:35 -0000	1.756.2.107
|+++ includes/common.inc	6 Jan 2011 20:21:49 -0000
--------------------------
File to patch:
Skip this patch? [y]
Skipping patch.
1 out of 1 hunk ignored.
[15:22:07] Encountered error on [apply], details:
array (
  '@filename' => 'common.inc__42.patch',
)
[15:22:07] Review complete.
[15:22:07] Static variables reset.
DrupalCuckoo’s picture

Status: Needs work » Needs review

#53: common.inc_.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, common.inc_.patch, failed testing.

markshust’s picture

I submitted a fix for this almost two months ago. Someone please holla to help out.

dalin’s picture

Issue tags: +Needs tests

This patch needs tests. Also there is the issue that
foo@bar.com.
is a completely valid email address. The final dot is the root of the entire Internet.

igor.ro’s picture

Status: Needs work » Needs review

#53: common.inc_.patch queued for re-testing.

Cyberwolf’s picture

Why not add a MX record check, as suggested in #20?

Several PHP libraries (Apache Zeta Components, Zend Framework) have this option as well.

Status: Needs review » Needs work

The last submitted patch, common.inc_.patch, failed testing.

mcfilms’s picture

I wonder why these patched continue to fail testing. Both markoshust and Cyberwolf turned in patches that failed.

Is there information on how to find out why these failed?

Cyberwolf’s picture

AFAIK I did not turn in a patch for this issue :)

markshust’s picture

i didn't even add/remove any lines... just modified the existing line. this works people, i guess just copy/paste this into your current install and every subsequent upgrade since drupal won't let this one through.

markshust’s picture

FileSize
585 bytes

@dalin #62: Revised the regedit to take this into account

This patch will now validate email addresses based on the following syntax:

Passes:
user@localhost.com
user@localhost.co.uk

Fails:
user@localhost
user@localhost.
user@localhost.com.

If this patch fails testing, please let me know if/how I can generate tests to get this to pass and into core. This patch is only for 6.x branch as 7.x uses different logic for email validation.

markshust’s picture

Status: Needs work » Needs review
markshust’s picture

Status: Needs review » Fixed

Finally.. I missed the trailing period issue in all of the other patches. This should be good to go.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

markshust’s picture

Status: Closed (fixed) » Active

I didn't see this fix in yesterday's release... why not?

mcfilms’s picture

I hear ya... Hope somebody can give this the old stamp of approval so we can get it into core before D6 is obsolete.

markshust’s picture

bump.

mcfilms’s picture

I've been Rick Rolled!

I wonder if there is anyone else left in this thread that can get this into the next D6 update.

markshust’s picture

this thread has converted into my own personal humor relief page. it is kinda funny that it's been open for 6 years..

wmostrey’s picture

Status: Active » Needs review

#69: 12274-69.patch queued for re-testing.

wmostrey’s picture

Priority: Major » Normal
Status: Needs review » Reviewed & tested by the community

The reason your patch didn't get in is because you moved it to "fixed" yourself.

The patch still applies cleanly. This is the current state:

Passes:
user@localhost.com
user@localhost.co.uk

Fails:
user@localhost
user@localhost.
user@localhost.com.

If we agree on this, it's good to go. If we don't agree on this, then please state which case should be included or removed.

markshust’s picture

Status: Reviewed & tested by the community » Needs work

Arg.

Well, re-reading this entire post, it seems these should PASS as they are valid by RFC spec:
user@localhost
user@anydomain

Everything else looks correct. I'll mess around with the regex and account for this.

wmostrey’s picture

Do note that those two e-mail addresses are not valid in Drupal 7 anymore.

markshust’s picture

Status: Needs work » Needs review
FileSize
585 bytes

Let's see if this one works. It should:

Passes:
user@localhost.com
user@localhost.co.uk
user@localhost

Fails:
user@localhost.
user@localhost.com.

markshust’s picture

Status: Needs review » Needs work

Oops. Well, if they aren't valid for D7, then they shouldn't be valid for D6. Let's close this one out after this first regex patch i submitted passes.

markshust’s picture

Status: Needs work » Needs review

#69: 12274-69.patch queued for re-testing.

wmostrey’s picture

Status: Needs review » Reviewed & tested by the community

Ok the test in #69 passes. It's completely conform with Drupal 7, so it's good to go.

Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Fixed

All right, committed, pushed.

johncs’s picture

Status: Fixed » Active

I did not discover this until now. I recommend it be rolled back.

Posts above refer to RFC 822. RFC 822 is obsolete, and so is its successor. The latest I can find is here

I am using Thunderbird on CentOS 5.7 or so to send email, and Postfix also on CentOS to receive my email. Both allow email addresses ending in a dot, I just tested it.

I am using ISC Bind 9.3.6 on my server, and that package provides two client programs to look up IP addresses. Both accept domain names ending in a dot, and distinguish between domains with an ending dot and without:

16:20 [root@Server httpd]# nslookup jeremiah.zed.
Server:         127.0.0.1
Address:        127.0.0.1#53

** server can't find jeremiah.zed.: NXDOMAIN

16:22 [root@Server httpd]# nslookup jeremiah.zed 
Server:         127.0.0.1
Address:        127.0.0.1#53

Name:   jeremiah.zed.js.id.au
Address: 203.161.75.23

16:22 [root@Server httpd]# 

This feature is useful as I use a wildcard domain name. This allows me to invent new subdomain names without having to update my domain information. Just for laughs, try looking up these domain names:

windows.suck.bunnies.through.garden.hoses.js.id.au
but.not.as.well.as.osx.js.id.au

I am in favour of validating email addresses, and I wish Drupal did it better because I have had a lot of spammers registering with bogus email addresses and causing havoc.

I think the best way to validate an email address is to test it to see whether it works. That means you open a TCP connexion to the preferred MX address (A address if there is no MX) and see whether the email address accepts mail/ If it does not, log the error someplace and decline the registration. Or, at Site Admin's option, accept it and mark it blocked and/or assign it a role chosen by Site Admin.

Note that you should check the response code from the remote SMTP servers, it's not unknown for mail server administrators to block connexions from "ADSL" servers as an antispam measure.

Gábor Hojtsy’s picture

I don't understand this. Your wildcard domain examples do not include a trailing dot. Why do you need a trailing dot? What does that not allowed stop you from doing?

markshust’s picture

I also have no clue what he is talking about. jeremiah.zed. (with the trailing period) is not a valid domain name.

mezitlab’s picture

Version: 6.x-dev » 6.24

The valid_email_address() function doesn't work proper with this domain pattern for me.
There are several e-mail addresses:
Dittrich.Peter@mail.t-com.hr
gottliebd@zg.t-com.sk

which were valid with this domain rule:
$domain = '(?:(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.?)+';

but not valid with this new one:
$domain = '(?:(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])(\.[a-zA-Z0-9]+)*)+';

Gábor Hojtsy’s picture

Uhm, right, the new pattern does not allow dashes in any component after the first dot in the email address if I'm reading it right. Doh.

Rolled back the patch and currently planning to release a Drupal 6.25 on Feb 29th. So we are back to the original bug with dots at the end of email addresses and can work out of there again.

Gábor Hojtsy’s picture

Version: 7.9 » 6.24
Component: CSS » base system

Drupal 6.25 with this rollback should be available in a matter of minutes at http://drupal.org/node/1461656 and at http://drupal.org/drupal-6.25

cafuego’s picture

I would point out that the regex as it stands also considers short notation IPv6 addresses invalid. For example with an IPv6 local address like so:

  • Valid: fe80:0000:0000:0000:f03c:91ff:fedf:8b07
  • Not valid: fe80::f03c:91ff:fedf:8b07
  • Not valid: fe80::

These are equivalent, the latter is just far easier to write, so it'd be nice if Drupal accepted it. The regex below is stolen from perls Regexp::IPv6.pm library and seems to be doing the business in my little test script:

(?-xism::(?::[0-9a-fA-F]{1,4}){0,5}(?:(?::[0-9a-fA-F]{1,4}){1,2}|:(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})))|[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}:(?:[0-9a-fA-F]{1,4}|:)|(?::(?:[0-9a-fA-F]{1,4})?|(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))))|:(?:(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|[0-9a-fA-F]{1,4}(?::[0-9a-fA-F]{1,4})?|))|(?::(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|:[0-9a-fA-F]{1,4}(?::(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){0,2})|:))|(?:(?::[0-9a-fA-F]{1,4}){0,2}(?::(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){1,2})|:))|(?:(?::[0-9a-fA-F]{1,4}){0,3}(?::(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){1,2})|:))|(?:(?::[0-9a-fA-F]{1,4}){0,4}(?::(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))|(?::[0-9a-fA-F]{1,4}){1,2})|:)))

Is it ok to leave this in this issue or should I open a new issue specifically for the ipv6 regex?

albeiq’s picture

Version: 6.24 » 6.25

We need to start from scratch following RFC1035.
Here is the EBNF grammar for domain names, stated in RFC1035:

<domain> ::= <subdomain> | " "

<subdomain> ::= <label> | <subdomain> "." <label>

<label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]

<ldh-str> ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str>

<let-dig-hyp> ::= <let-dig> | "-"

<let-dig> ::= <letter> | <digit>

<letter> ::= any one of the 52 alphabetic characters A through Z in upper case and a through z in lower case

<digit> ::= any one of the ten digits 0 through 9

Then, looking at the grammar, this should be the right regex:

$domain =
    '(?:[a-zA-Z](?:(?:[a-zA-Z0-9\-])*[a-zA-Z0-9])?)(?:\.(?:[a-zA-Z](?:(?:[a-zA-Z0-9\-])*[a-zA-Z0-9])?))*';

I tested it a bit and it solves the bug for the dots at the end of email addresses, still allowing hyphens in any part of the domain.

Please test it extensively and commit the patch for 6.25.

albeiq’s picture

I removed some unnecessary parenthesis:

$domain = '[a-zA-Z](?:(?:[a-zA-Z0-9\-])*[a-zA-Z0-9])?(?:\.[a-zA-Z](?:(?:[a-zA-Z0-9\-])*[a-zA-Z0-9])?)*';

DamienMcKenna’s picture

If you're going to backtrack to doing RFC-compliant addresses, just use the patch posted above of code from Dominic Sayers, or do a new patch based on his latest code, there's no need to write it from scratch (see: NIH Syndrome).

johncs’s picture

Those long names are examples of random subdomains of js.id.au, you can invent your own and they will work.

Read this again:
07:51 [summer@www ~]$ nslookup jeremiah.zed
Server: 127.0.0.1
Address: 127.0.0.1#53

Name: jeremiah.zed.js.id.au
Address: 203.161.75.23

07:52 [summer@www ~]$ nslookup jeremiah.zed.
Server: 127.0.0.1
Address: 127.0.0.1#53

** server can't find jeremiah.zed.: NXDOMAIN
07:52 [summer@www ~]$ nslookup jeremiah.zed.js.id.au
Server: 127.0.0.1
Address: 127.0.0.1#53

Name: jeremiah.zed.js.id.au
Address: 203.161.75.23

07:55 [summer@www ~]$

07:52 [summer@www ~]$

One has a trailing dot, one does not.

I have added another example this time, it shows what the expansion finds. The trailing dot prevents this expansion.

I learned about this years ago when the only book I could find about sendmail was one written by Paul Vixie, and that was well over ten years ago, so this is not new behaviour. I discarded it just two or three weeks ago, "Surely I won't need that again."

DNS and BIND by Paul Albitz & Cricket Liu, published by O'Reilly describes the use of wildcards in chapter 15, emphasising their importance for email routing.
Here is the text in the fourth edition: http://docstore.mik.ua/orelly/networking_2ndEd/dns/ch16_02.htm

This is essential reading:
http://www.dns-sd.org/TrailingDotsInDomainNames.html
In short, "A domain name that doesn't have a dot at the end is not fully-qualified and is potentially ambiguous."

Chi’s picture

Issue summary: View changes

Here is a good point about RFC email validation.
http://girders.org/blog/2013/01/31/dont-rfc-validate-email-addresses

Status: Active » Closed (outdated)

Automatically closed because Drupal 6 is no longer supported. If the issue verifiably applies to later versions, please reopen with details and update the version.