Cleanup for user_validate_name() + tests

MadHarold - June 4, 2008 - 08:37
Project:Drupal
Version:6.x-dev
Component:user system
Category:bug report
Priority:normal
Assigned:Unassigned
Status:patch (code needs review)
Description

The current code for user_validate_name() still uses ereg() and eregi() and the unit test for it is a bit bulky, so I decided to clean it up a little.

I ran into this piece of code:

<?php
if (strpos($name, '@') !== FALSE && !eregi('@([0-9a-z](-?[0-9a-z])*.)+[a-z]{2}([zmuvtg]|fo|me)?$', $name)) {
  return
t('The username is not a valid authentication ID.');
}
?>

Which checks for a valid domain if the username contains a '@'.

What's the purpose of this piece of code? If it checks for a valid OpenID, it will fail miserably because OpenID's may contain more characters than Drupal allows at the 'user' part.

So, my question is: can this check be dropped from user_validate_name()? Or must we create a separate check for such user names?

AttachmentSize
user_validate_name.patch4.18 KB

#1

Damien Tournoud - June 4, 2008 - 08:49

I don't pretend to understand that strange regexp, but it seems that there is no need to validate the server part because the authentication scheme don't use that anymore (it was used for Drupal Distributed Authentication, which is deprecated in favor of OpenID).

So feel free to drop that part.

One remark about the patch: the strlen() there should be a drupal_strlen(), because we are in an unicode context.

#2

MadHarold - June 4, 2008 - 09:06

One remark about the patch: the strlen() there should be a drupal_strlen(), because we are in an unicode context.
I was just about to comment on this ;)

Also the test cases have to be expanded with a lot more weirdo chars like chr(0) or chr(128) and various weirdo UTF8 chars, etc (quoting chx).

#3

Damien Tournoud - June 4, 2008 - 09:08

and various weirdo UTF8 chars... or even invalid UTF8 characters for that matter.

#4

MadHarold - June 5, 2008 - 08:41
Status:patch (code needs work)» patch (code needs review)

I've attached a patch that removes the 'authentication ID' check and replaces all remaining all ereg()s with preg_match(). I've also added a check on control characters.

I'm still not sure how to handle 'foo' . chr(128) . 'bar' and if it's allowed in usernames... This is a handy table for those pesky 128(+) chars...

Suggestions?

AttachmentSize
user.patch6.41 KB

#5

catch - June 5, 2008 - 10:23

Patch is goods so far.

I'm wondering if we could incorporate allowing apostrophes into this cleanup, since it's currently just an omission. Patch and discussion here: http://drupal.org/node/165226

#6

MadHarold - June 5, 2008 - 11:48

I've added apostrophe support(+test) and a small change in check_plain() as proposed in http://drupal.org/node/165226 in the new version of the patch.

I've chosen to use ENT_COMPAT (Will convert double-quotes and leave single-quotes alone (default)) instead of ENT_NOQUOTES (Will leave both double and single quotes unconverted) because we still don't allow double quotes...

AttachmentSize
user.patch7.63 KB

#7

Damien Tournoud - June 5, 2008 - 12:20

I don't understand at all why we are not using ENT_NOQUOTES in check_plain()?

It looks like this sort of things should not even be required. I guess some part of the code may call check_plain() two times, and it fails because that function is not reentrant.

#8

MadHarold - June 5, 2008 - 12:24

Attached a patch with ENT_NOQUOTES.

This goes well beyond my knowledge of Drupal so I can't predict the impact of it...

AttachmentSize
user.patch7.63 KB

#9

catch - June 5, 2008 - 12:28

Works for me. Will leave it at needs review for a second set of eyes but I think this is ready. Side note, patch isn't rolled from root of the Drupal directory, it's easier to apply that way.

#10

MadHarold - June 5, 2008 - 12:35

I've used this command in the root of my D7 installation: harold@pi:/var/www/d7$ cvs diff -up modules/user includes > /var/www/user.patch as explained in http://drupal.org/patch/create

So the basic command to use is:

cvs diff -up original.php > filename.patch

What am I doing wrong?

#11

Damien Tournoud - June 5, 2008 - 12:40

@MadHarold, the patch is perfectly rolled from the root directory, there is nothing wrong about it.

#12

catch - June 5, 2008 - 13:05

whoops. Nothing wrong. user_100.patch wasn't rolled from root (or at least was prompting me for filenames), user_101.patch is completely fine. Sorry for the noise :(

#13

nbz - June 17, 2008 - 13:00
Status:patch (code needs review)» patch (reviewed & tested by the community)

Testing the patch in #8:

Applied the patch to todays dev tarball - applies cleanly.

created a number of accounts - those with or without apostrophe are created correctly. Those with quotation marks (") give an error of an illegal character.

All that is left is for a "policy decision" - either patch is comment or comment comment 8 should go in.

Moved to RTBC as I am the second pair of eyes as requested by Catch.

#14

Dries - June 18, 2008 - 16:13

I've asked the security team to have a quick at this patch as well.

#15

pwolanin - June 18, 2008 - 17:13

Well, there are some XSS vectors that succeed due to injected quotes closing an img path or attribute, so I'm not that comfortable with this. Clearly check_plain() should not be called multiple time on the same text - but that's a separate issue.

#16

Heine - June 18, 2008 - 17:30
Status:patch (reviewed & tested by the community)» patch (code needs work)

Using ENT_NOQUOTES allows values to 'run out' of attribute values:

Consider:

<?php
$output
= '<tag attribute="'. check_plain($somevalue) .'" />';
?>

Now, suppose $somevalue is 'foo" onerror="javascript:evil_js()" '

You'll get:

<tag attribute="foo" onerror="javascript:evil_js()" />

#17

nbz - June 18, 2008 - 18:11

Would this analysis also apply for the original (otherwise almost identical) patch in comment 6 that uses ENT_COMPAT instead of ENT_NOQUOTES?

#18

Heine - June 18, 2008 - 20:13

In HTML 4, attributes may be delimited by either single or double quotes. Single quotes may be used unencoded in attribute values when the value is surrounded by double quotes and vv. As check_plain has no information about which quotes are used to embed something we need to play it safe and cater to both delimiters.

#19

Heine - June 18, 2008 - 21:00

Or in clear words; check_plain should not be modified.

#20

nbz - June 18, 2008 - 22:18

Is there another sane approach that can be used to allow apostrophes in usernames?

(this issue does not deal directly with that issue, the the failed segments of the patch are those that have been imported from the other issue mentioned in comments 5 and 6.)

#21

Heine - June 19, 2008 - 05:23

Of course there is. Investigate the causes of double encoding and fix them.

For instance: blog_link escapes the username

<?php
$links
['blog_usernames_blog'] = array(
 
'title' => t("@username's blog", array('@username' => $node->name)),
 
'href' => "blog/$node->uid",
 
'attributes' => array('title' => t("Read @username's latest blog entries.", array('@username' => $node->name)))
);
?>

Then theme_links escapes the title and attributes again (in l()):

<?php
if (isset($link['href'])) {
 
$output .= l($link['title'], $link['href'], $link['attributes'], $link['query'], $link['fragment'], FALSE, $html);
}
else if (
$link['title']) {
 
//Some links are actually not links, but we wrap these in <span> for adding title and class attributes
 
if (!$html) {
   
$link['title'] = check_plain($link['title']);
  }
?>

#22

nbz - June 26, 2008 - 14:37
Status:patch (code needs work)» patch (code needs review)

I assume this patch is still useful without changing things to ENT_NOQUOTES, or to ENT_COMPAT, as they were added later on to the patch.

Attached patch is the same as above, but I have manually removed the bit for changing the ENT_QUOTES bit. I have kept in the bit allowing apostrophe's. I say let's assume the places where they do not work properly are bugs (they do work in most places).

AttachmentSize
user.patch7.04 KB

#23

Damien Tournoud - June 26, 2008 - 15:16
Status:patch (code needs review)» patch (reviewed & tested by the community)

The patch in #22 can go in this way. Making check_plain() reentrant will require a new issue.

#24

Damien Tournoud - June 26, 2008 - 15:37

I introduced a discussion about making check_plain() reentrant in #275308.

#25

Dries - June 26, 2008 - 19:48
Status:patch (reviewed & tested by the community)» fixed

Committed to CVS HEAD. Thanks.

#26

nbz - June 26, 2008 - 20:44
Version:7.x-dev» 6.x-dev
Status:fixed» patch (to be ported)

#27

Damien Tournoud - June 26, 2008 - 20:46
Version:6.x-dev» 7.x-dev
Status:patch (to be ported)» active

Some tests are failing. Please keep that issue open on 7.x so we can figured out why.

#28

Damien Tournoud - June 26, 2008 - 20:53

Dries: Looks like an incomplete version of the patch was committed. Only the user.test hunks were committed, not the user.module ones.

#29

Damien Tournoud - June 26, 2008 - 20:59
Priority:normal» critical

Bumping to critical by convention (tests are failing...).

#30

Damien Tournoud - June 26, 2008 - 21:09
Status:active» patch (code needs review)

While we are at it, here is the user.module hunk plus some small cleanups for the new username validation tests, that make test descriptions cleaner and remove a spurious '%s' in the test description.

AttachmentSize
266488-user-test-cleanup.patch5.8 KB

#31

catch - June 26, 2008 - 21:15
Status:patch (code needs review)» patch (reviewed & tested by the community)

Looks good, works good too.

#32

Dries - June 27, 2008 - 07:25
Status:patch (reviewed & tested by the community)» fixed

Committed to CVS HEAD. Thanks Damien, and sorry for messing up.

#33

Damien Tournoud - June 27, 2008 - 11:36
Version:7.x-dev» 6.x-dev
Priority:critical» normal
Status:fixed» patch (to be ported)

Ok, let's consider inclusion in the Drupal 6.x branch.

#34

Freso - June 28, 2008 - 16:13
Status:patch (to be ported)» patch (code needs review)

Here's a port. I've excluded the user.test bit, as well as other things that actually changed functionality and/or strings.

AttachmentSize
266488-user-test-cleanup-34.patch2.32 KB

#35

webchick - June 28, 2008 - 16:48

+  if (substr($name, 0, 1) == ' ') {
+    return t('The username cannot begin with a space.');
+  }
+  if (substr($name, -1) == ' ') {
+    return t('The username cannot end with a space.');

Why can't we just trim() username? I realize this is already in the 7.x version; I didn't get a chance to review that patch before it was committed. It also shouldn't be a tab character or any other whitespace. trim() takes care of that.

#36

nbz - June 28, 2008 - 21:25

Followup issue for fixing instances of the username being check_plain'ed twice:

http://drupal.org/node/276174

#37

nbz - July 2, 2008 - 03:34
Status:patch (code needs review)» patch (reviewed & tested by the community)

Patch in comment 34 applies cleanly, works as expected and is a port of what is in HEAD so setting to RTBC

As for comment 35, I would guess the proper place to trim would be upon input - before the validation function.

#38

Damien Tournoud - July 5, 2008 - 14:59
Status:patch (reviewed & tested by the community)» patch (code needs review)

While I completely agree with Angie's comment in #35, this is another issue.

However, two hunks were excluded from the patch in #34, that we should consider including again:

  • The dropping of the "authentication ID" check: I think this should be dropped also for D6, because it is not used anymore (remote authentication schemes are done differently, see my comment in #1)
  • The addition of control characters in the 0x00-0x1f range, that we should probably include for sanity

I'm putting on review the opportunity to add those back in.

#39

nbz - July 9, 2008 - 16:49
Title:Cleanup for user_validate_name() + tests» Cleanup for user_validate_name() + allow apostrophes

I have re-ported the patch from comment 30 - without any functional changes that were introduced to the port to drupal 6 in comment 34.

AttachmentSize
user_validate.patch3.19 KB

#40

Freso - July 9, 2008 - 17:08
Title:Cleanup for user_validate_name() + allow apostrophes» Cleanup for user_validate_name() + tests

@nbz: You're implementing a new feature though: allowing apostrophes.

Drupal 6 is in feature and string freeze both. The patch in #34 didn't change alter or add or remove any feature, even if #38 thinks it should contain more of the 7.x code. I'm still not sure about this though, as it will change the behaviour. So I'm leaving it for Gábor/Dries to decide on how much of it should be back ported.

#41

Heine - July 9, 2008 - 17:35

Not supporting a single quote in a username is a bug, not a lacking feature.

#42

nbz - July 9, 2008 - 17:49

@ #40 - I see where you are coming from, but from the mailing lists it seems Drupal 6 may get an extended life cycle (by virtue of delaying Drupal 7 code freeze date), and IMO allowing apostrophe's is a vital functionality, the lack of which bordering close to being a bug.

Not having explicit approval (yet) from Dries or Gabor for this additional functionality should not be a blocker for RTBC status. IMO the code review should be to make sure that it all works as expected without any caveats. nce the patch has been set RTBC, the branch maintainer can make up his mind about whether this should be allowed or not.

EDIT - not having this feature in core means that me (and potentially others who have migrated users from other systems that allow apostrophes) will run a patched core install, which may have security vulnerabilities. That should be considered a bug, not in the released software, but procedure.

#43

nbz - July 21, 2008 - 16:02

We have two patches here - comment 34 without the changes allowing apostrophes in usernames, and comment 39 with the changes allowing apostrophes.

Can somebody please review (and maybe help in the other issue (#276174: DO not check_plain() usernames more than once) too so that I can find out exactly where the other checkplain takes place)?

Once both of these patches are RTBC, the maintainer can decide if he wished to allow or not allow this bug fixing functionality.

#44

Freso - July 28, 2008 - 09:46

Patch in comment #34 (without new features and string change) was RTBC'd (in comment 37), so we're just short of a review of the patch in comment #39 (adding features and changing a string) before we can let Gábor-Dries decide on what to go with.

 
 

Drupal is a registered trademark of Dries Buytaert.