Cleanup for user_validate_name() + tests
MadHarold - June 4, 2008 - 08:37
| Project: | Drupal |
| Version: | 6.x-dev |
| Component: | user system |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | patch (code needs review) |
Description
The current code for user_validate_name() still uses ereg() and eregi() and the unit test for it is a bit bulky, so I decided to clean it up a little.
I ran into this piece of code:
<?php
if (strpos($name, '@') !== FALSE && !eregi('@([0-9a-z](-?[0-9a-z])*.)+[a-z]{2}([zmuvtg]|fo|me)?$', $name)) {
return t('The username is not a valid authentication ID.');
}
?>Which checks for a valid domain if the username contains a '@'.
What's the purpose of this piece of code? If it checks for a valid OpenID, it will fail miserably because OpenID's may contain more characters than Drupal allows at the 'user' part.
So, my question is: can this check be dropped from user_validate_name()? Or must we create a separate check for such user names?
| Attachment | Size |
|---|---|
| user_validate_name.patch | 4.18 KB |

#1
I don't pretend to understand that strange regexp, but it seems that there is no need to validate the server part because the authentication scheme don't use that anymore (it was used for Drupal Distributed Authentication, which is deprecated in favor of OpenID).
So feel free to drop that part.
One remark about the patch: the
strlen()there should be adrupal_strlen(), because we are in an unicode context.#2
One remark about the patch: the strlen() there should be a drupal_strlen(), because we are in an unicode context.
I was just about to comment on this ;)
Also the test cases have to be expanded with a lot more weirdo chars like
chr(0)orchr(128)and various weirdo UTF8 chars, etc (quoting chx).#3
and various weirdo UTF8 chars... or even invalid UTF8 characters for that matter.
#4
I've attached a patch that removes the 'authentication ID' check and replaces all remaining all
ereg()swithpreg_match(). I've also added a check on control characters.I'm still not sure how to handle
'foo' . chr(128) . 'bar'and if it's allowed in usernames... This is a handy table for those pesky 128(+) chars...Suggestions?
#5
Patch is goods so far.
I'm wondering if we could incorporate allowing apostrophes into this cleanup, since it's currently just an omission. Patch and discussion here: http://drupal.org/node/165226
#6
I've added apostrophe support(+test) and a small change in
check_plain()as proposed in http://drupal.org/node/165226 in the new version of the patch.I've chosen to use
ENT_COMPAT(Will convert double-quotes and leave single-quotes alone (default)) instead ofENT_NOQUOTES(Will leave both double and single quotes unconverted) because we still don't allow double quotes...#7
I don't understand at all why we are not using ENT_NOQUOTES in
check_plain()?It looks like this sort of things should not even be required. I guess some part of the code may call
check_plain()two times, and it fails because that function is not reentrant.#8
Attached a patch with
ENT_NOQUOTES.This goes well beyond my knowledge of Drupal so I can't predict the impact of it...
#9
Works for me. Will leave it at needs review for a second set of eyes but I think this is ready. Side note, patch isn't rolled from root of the Drupal directory, it's easier to apply that way.
#10
I've used this command in the root of my D7 installation:
harold@pi:/var/www/d7$ cvs diff -up modules/user includes > /var/www/user.patchas explained in http://drupal.org/patch/createSo the basic command to use is:
cvs diff -up original.php > filename.patch
What am I doing wrong?
#11
@MadHarold, the patch is perfectly rolled from the root directory, there is nothing wrong about it.
#12
whoops. Nothing wrong. user_100.patch wasn't rolled from root (or at least was prompting me for filenames), user_101.patch is completely fine. Sorry for the noise :(
#13
Testing the patch in #8:
Applied the patch to todays dev tarball - applies cleanly.
created a number of accounts - those with or without apostrophe are created correctly. Those with quotation marks (") give an error of an illegal character.
All that is left is for a "policy decision" - either patch is comment or comment comment 8 should go in.
Moved to RTBC as I am the second pair of eyes as requested by Catch.
#14
I've asked the security team to have a quick at this patch as well.
#15
Well, there are some XSS vectors that succeed due to injected quotes closing an img path or attribute, so I'm not that comfortable with this. Clearly check_plain() should not be called multiple time on the same text - but that's a separate issue.
#16
Using ENT_NOQUOTES allows values to 'run out' of attribute values:
Consider:
<?php$output = '<tag attribute="'. check_plain($somevalue) .'" />';
?>
Now, suppose $somevalue is 'foo" onerror="javascript:evil_js()" '
You'll get:
<tag attribute="foo" onerror="javascript:evil_js()" />#17
Would this analysis also apply for the original (otherwise almost identical) patch in comment 6 that uses ENT_COMPAT instead of ENT_NOQUOTES?
#18
In HTML 4, attributes may be delimited by either single or double quotes. Single quotes may be used unencoded in attribute values when the value is surrounded by double quotes and vv. As check_plain has no information about which quotes are used to embed something we need to play it safe and cater to both delimiters.
#19
Or in clear words; check_plain should not be modified.
#20
Is there another sane approach that can be used to allow apostrophes in usernames?
(this issue does not deal directly with that issue, the the failed segments of the patch are those that have been imported from the other issue mentioned in comments 5 and 6.)
#21
Of course there is. Investigate the causes of double encoding and fix them.
For instance: blog_link escapes the username
<?php$links['blog_usernames_blog'] = array(
'title' => t("@username's blog", array('@username' => $node->name)),
'href' => "blog/$node->uid",
'attributes' => array('title' => t("Read @username's latest blog entries.", array('@username' => $node->name)))
);
?>
Then theme_links escapes the title and attributes again (in l()):
<?phpif (isset($link['href'])) {
$output .= l($link['title'], $link['href'], $link['attributes'], $link['query'], $link['fragment'], FALSE, $html);
}
else if ($link['title']) {
//Some links are actually not links, but we wrap these in <span> for adding title and class attributes
if (!$html) {
$link['title'] = check_plain($link['title']);
}
?>
#22
I assume this patch is still useful without changing things to ENT_NOQUOTES, or to ENT_COMPAT, as they were added later on to the patch.
Attached patch is the same as above, but I have manually removed the bit for changing the ENT_QUOTES bit. I have kept in the bit allowing apostrophe's. I say let's assume the places where they do not work properly are bugs (they do work in most places).
#23
The patch in #22 can go in this way. Making check_plain() reentrant will require a new issue.
#24
I introduced a discussion about making check_plain() reentrant in #275308.
#25
Committed to CVS HEAD. Thanks.
#26
#27
Some tests are failing. Please keep that issue open on 7.x so we can figured out why.
#28
Dries: Looks like an incomplete version of the patch was committed. Only the user.test hunks were committed, not the user.module ones.
#29
Bumping to critical by convention (tests are failing...).
#30
While we are at it, here is the user.module hunk plus some small cleanups for the new username validation tests, that make test descriptions cleaner and remove a spurious '%s' in the test description.
#31
Looks good, works good too.
#32
Committed to CVS HEAD. Thanks Damien, and sorry for messing up.
#33
Ok, let's consider inclusion in the Drupal 6.x branch.
#34
Here's a port. I've excluded the user.test bit, as well as other things that actually changed functionality and/or strings.
#35
+ if (substr($name, 0, 1) == ' ') {+ return t('The username cannot begin with a space.');
+ }
+ if (substr($name, -1) == ' ') {
+ return t('The username cannot end with a space.');
Why can't we just trim() username? I realize this is already in the 7.x version; I didn't get a chance to review that patch before it was committed. It also shouldn't be a tab character or any other whitespace. trim() takes care of that.
#36
Followup issue for fixing instances of the username being check_plain'ed twice:
http://drupal.org/node/276174
#37
Patch in comment 34 applies cleanly, works as expected and is a port of what is in HEAD so setting to RTBC
As for comment 35, I would guess the proper place to trim would be upon input - before the validation function.
#38
While I completely agree with Angie's comment in #35, this is another issue.
However, two hunks were excluded from the patch in #34, that we should consider including again:
I'm putting on review the opportunity to add those back in.
#39
I have re-ported the patch from comment 30 - without any functional changes that were introduced to the port to drupal 6 in comment 34.
#40
@nbz: You're implementing a new feature though: allowing apostrophes.
Drupal 6 is in feature and string freeze both. The patch in #34 didn't change alter or add or remove any feature, even if #38 thinks it should contain more of the 7.x code. I'm still not sure about this though, as it will change the behaviour. So I'm leaving it for Gábor/Dries to decide on how much of it should be back ported.
#41
Not supporting a single quote in a username is a bug, not a lacking feature.
#42
@ #40 - I see where you are coming from, but from the mailing lists it seems Drupal 6 may get an extended life cycle (by virtue of delaying Drupal 7 code freeze date), and IMO allowing apostrophe's is a vital functionality, the lack of which bordering close to being a bug.
Not having explicit approval (yet) from Dries or Gabor for this additional functionality should not be a blocker for RTBC status. IMO the code review should be to make sure that it all works as expected without any caveats. nce the patch has been set RTBC, the branch maintainer can make up his mind about whether this should be allowed or not.
EDIT - not having this feature in core means that me (and potentially others who have migrated users from other systems that allow apostrophes) will run a patched core install, which may have security vulnerabilities. That should be considered a bug, not in the released software, but procedure.
#43
We have two patches here - comment 34 without the changes allowing apostrophes in usernames, and comment 39 with the changes allowing apostrophes.
Can somebody please review (and maybe help in the other issue (#276174: DO not check_plain() usernames more than once) too so that I can find out exactly where the other checkplain takes place)?
Once both of these patches are RTBC, the maintainer can decide if he wished to allow or not allow this bug fixing functionality.
#44
Patch in comment #34 (without new features and string change) was RTBC'd (in comment 37), so we're just short of a review of the patch in comment #39 (adding features and changing a string) before we can let Gábor-Dries decide on what to go with.