From #21:

  1. It's semantically correct. Anonymous posts aren't all by the fictitious user zero, they're all by an undefined ("NULL") user.
  2. 0 = 0, but NULL <> NULL. We shouldn't be pretending any anonymous post is by the "same" author as another anonymous post.
  3. We lose the "magic" anonymous user row in the users table.
  4. Should we ever use foreign keys, we can automate anonymizing user content on user deletion.
  5. We have to do annoying things to even use the value zero in an auto-incrementing MySQL column.
  6. We can better enforce what UID columns allow anonymous values by choosing whether to allow or disallow NULL values.
  7. Because we already handle the anonymous user as a special case (a variable_get with no link) when rendering usernames, we can stop the confusing practice of joining anonymous user content onto the users table only to ignore everything we get from it.
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

chx’s picture

Title: This uid totally should be NULL instead of 0. » Anonymous uid totally should be NULL instead of 0.
Dave Reid’s picture

Could you please point to some specifics how it would benefit being NULL instead of 0 since this would be a lot of work to change this one little thing?

chx’s picture

Ever seen install.php ? tried to import a users table? not easy... that in alone would worth it. Also, it's theoretically more correct.

MrAdamJohn’s picture

agreed - it is theoretically more correct. might be better to approach the problem with a 'layer' in between... since a lot of problems (and re-coding presumably) would be caused by a return or use of NULL... why not return a -1 value instead? to be clear - the value in the table would be NULL and the value in use by drupal or returned on a call would be -1.

just an idea... ;)

Dave Reid’s picture

chx is referring to #204411: Anonymous user id breaks on MySQL export/import (xID of Zero get's auto-incremented on external xSQL events) and the following code in system.install (!= install.php):

  // This sets the above two users uid 0 (anonymous). We avoid an explicit 0
  // otherwise MySQL might insert the next auto_increment value.
  db_query("UPDATE {users} SET uid = uid - uid WHERE name = '%s'", '');
  // This sets uid 1 (superuser). We skip uid 2 but that's not a big problem.
  db_query("UPDATE {users} SET uid = 1 WHERE name = '%s'", 'placeholder-for-uid-1');
chx’s picture

Also note that if a field can be NULL and said field has a UNIQUE key on it then you can have multiple rows with NULL values because in SQL, the NULL value is never true in comparison to any other value, even NULL. This sounds like useful to me.

chx’s picture

-1 is not a good idea, you would break a lot of code totally unnecessary. Beauty of this feature is that if ($user->uid) still works happily.

killes@www.drop.org’s picture

Not sure this is a good idea. Wouldn't the performance of SQL queries suffer? I don't think you can do joind on fields containing NULL

Dave Reid’s picture

I'm taking a look at how this would be done and will help with the patch when it becomes possible to do so. One thing is we'll have to drop the primary key on {users} since uid cannot be a primary key if it allows NULL values. It will need to be changed to a unique key instead. From http://dev.mysql.com/doc/refman/5.1/en/create-table.html:

A UNIQUE index creates a constraint such that all values in the index must be distinct. An error occurs if you try to add a new row with a key value that matches an existing row. For all engines, a UNIQUE index allows multiple NULL values for columns that can contain NULL.

A PRIMARY KEY is a unique index where all key columns must be defined as NOT NULL. If they are not explicitly declared as NOT NULL, MySQL declares them so implicitly (and silently). A table can have only one PRIMARY KEY. If you do not have a PRIMARY KEY and an application asks for the PRIMARY KEY in your tables, MySQL returns the first UNIQUE index that has no NULL columns as the PRIMARY KEY.

Dave Reid’s picture

Side node while looking through this last night. Do we have a good reason for keeping an anonymous user record in the {users} table? user_load() could easily use drupal_anonymous_user().

David Strauss’s picture

+1000

chx’s picture

Dave, we keep that in the table to make INNER JOINs possible.

David Strauss’s picture

@killes: Joining on anonymous users it not a sensible operation because they're, well, anonymous. It doesn't mean anything for two items to both come from "anonymous," so the SQL model of NULL <> NULL is appropriate. Even the idea of the UID being NULL is entirely sensible because the user has no true UID.

Moreover, if we ever did start using foreign keys, there are two approaches we'd consider: CASCADE DELETE and SET NULL. SET NULL would anonymize a user's contributions without deleting them.

David Strauss’s picture

@Dave and @chx: We'd remove the anonymous user row from the users table. The current row is a hack-ish placeholder. I see no reason to use INNER JOINs on the users table if we want to include anonymous data. We should just LEFT JOIN.

Dave Reid’s picture

Yeah there's no point to having a NULL/empty record in the users table just so we can do inner joins. We won't have to implement a sequence API just to make the uid serial work, it won't have problems with import/exports anymore, and we can still make NULL the default uid (or leave it as 0 since the main problem would be fixed). Seems the optimal solution. I'll take a look at implementing that as well.

David Strauss’s picture

My casual benchmarks on the Drupal.org servers show this:

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n LEFT JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.43 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n INNER JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.43 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n LEFT JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.36 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n INNER JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.30 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n LEFT JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.42 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n INNER JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.34 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n LEFT JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.33 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n INNER JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.31 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n LEFT JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.33 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n INNER JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.31 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n LEFT JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.42 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n INNER JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.33 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n LEFT JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.41 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n INNER JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.35 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n LEFT JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.36 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM node n INNER JOIN users u ON u.uid = n.uid;
+----------+
| COUNT(*) |
+----------+
|   340862 | 
+----------+
1 row in set (0.31 sec)

So, it looks like a tiny performance hit of <0.05s for 340K nodes. There's also potential for performance gains for anonymously published data because MySQL knows that NULL <> NULL and doesn't have to bother JOINing with the user table on those rows.

The reason INNER is so much faster in some cases (through not in this one) is because INNER JOINs allow MySQL to conveniently start from either side of the JOIN; the JOIN is symmetric. Because a LEFT JOIN doesn't restrict its result set based on rows in the right-hand table, MySQL can't usefully restrict its working set based on starting there. This isn't a significant concern for the users table because we generally use it to pull extra data on users and not to restrict results.

If we did want to use the users table to restrict results, we would still use an INNER JOIN, even with the anonymous UID being NULL.

Damien Tournoud’s picture

I'm not sure removing the uid = 0 line is very wise.

The main benefit of this line is that we automatically get name = 'anonymous' when joining {node} or {comment} to {user} for anonymous content. Removing that line would mean special casing this everywhere, including in places that doesn't really require more complexity (think... Views).

David Strauss’s picture

@Damien We don't have to have special cases everywhere because everything should render usernames through theme functions. Moreover, we already have as much of a "special case" as we would need: we don't link "Anonymous" to /user/0. The extra step of also outputting "Anonymous" is no greater work than deciding to not link "Anonymous" and link all real accounts.

Dave Reid’s picture

Yeah there's literally nothing valuable when doing joins on users for anonymous users. Every field is empty, and it's up to the query to do something like echo (!$result->uid ? variable_get('anonymous', t('Anonymous')) : $result->name);.

drewish’s picture

After reading through this I'm still not seeing the win of switching to NULL...

David Strauss’s picture

@drewish:

* It's semantically correct. Anonymous posts aren't all by the fictitious user zero, they're all by an undefined ("NULL") user.
* 0 = 0, but NULL <> NULL. We shouldn't be pretending any anonymous post is by the "same" author as another anonymous post.
* We lose the "magic" anonymous user row in the users table.
* Should we ever use foreign keys, we can automate anonymizing user content on user deletion.
* We have to do annoying things to even use the value zero in an auto-incrementing MySQL column.
* We can better enforce what UID columns allow anonymous values by choosing whether to allow or disallow NULL values.
* Because we already handle the anonymous user as a special case (a variable_get with no link) when rendering usernames, we can stop the confusing practice of joining anonymous user content onto the users table only to ignore everything we get from it.

drewish’s picture

I'd say that's a pretty good list of reasons. Thanks for the summary ;)

chx’s picture

Dave Reid’s picture

Issue tags: +anonymous users
Dave Reid’s picture

Component: database system » base system
Status: Active » Needs review
FileSize
8.28 KB

First step, remove any comparisons to 0 and instead simply do $uid or !$uid.

chx’s picture

Hm, sure, but why did comment use === I wonder.

Dave Reid’s picture

The code was introduced in #147417: Only save cookie for anonymous users and looks like there was no real reason to do an identical check. Comparing !$user->uid works just fine in that case.

Status: Needs review » Needs work

The last submitted patch failed testing.

David Strauss’s picture

Not too long ago (post Drupal.org D6 upgrade), we were specifically seeing the semantic harm in uid 0 for anonymous visitors at drupal.org/project/user. It was showing projects without an assigned user, which is not equivalent to all anonymous visitors owning those projects. Having uid 0 for anonymous users invites those sorts of errors.

valkire’s picture

Did this ever go anywhere?

Just thought I'd point out that we'll want the uid column in {users} to start at 1, if we want to still be able to do if($user->uid).

Damien Tournoud’s picture

Version: 7.x-dev » 8.x-dev

Probably not going to happen in Drupal 7.

carlos8f’s picture

subscribing

jbrown’s picture

Totally!

Leeteq’s picture

Subscribing.

pillarsdotnet’s picture

Running into the same problem with the quotes module. I'd like "Anonymous" to be aid (author id) NULL instead of 0, but I can't because it's an AUTO_INCREMENT field. Therefore, subscribing...

carlos8f’s picture

Just to clear up some confusion, uid is technically no longer AUTO_INCREMENT, it was changed to a regular 'int' field as of #356074: Provide a sequences API. However no upgrade path was provided in that patch, so D6->D7 db's will still be set to AUTO_INCREMENT, which is a bug :)

sun’s picture

Interesting. No strong preference, but all I can say is that it's a Drupalism to have an actual value for uid 0 - other systems indeed have no value (NULL) for anonymous users.

So, while this change feels a bit like "ugh, wow." I can definitely see the case of sanity and consistency.

chx’s picture

you have no idea of what you are getting into by changing an often used value into NULL . SQL NULL handling is madness.

Edit: yes I said years ago theoretically more correct and theoretically it might be TRUE, in reality, ouch.

MrAdamJohn’s picture

@chx ... I second that. Recently had a series of cascading problems from a NULL field db change. Suffice to say, this change, if adopted, would best be part of a major version release (#31) and decided as early as possible. Real life db normalization should teach us many things; ie Best in theory is not always best in reality.

chx’s picture

Also, anonymous users should be just userS and handled in an anonymous user bundle to handle the need for different fields as in comment module ie shut this issue down in favor of #355513: Create an anonymous user API

pillarsdotnet’s picture

Status: Needs work » Closed (duplicate)

I agree with #40.

Better approach at #355513: Create an anonymous user API

moshe weitzman’s picture

Issue summary: View changes

Updated issue summary.

moshe weitzman’s picture

SInce #355513: Create an anonymous user API has stalled,m I think this might still be a good way forward. I updated the Summary.

moshe weitzman’s picture

Title: Anonymous uid totally should be NULL instead of 0. » Anonymous should not appear in the users table at all
sun’s picture

Status: Closed (duplicate) » Active

Can only guess you meant to re-open this issue.

That said, this change proposal seems to make sense on its own, regardless of whether we'll do #355513: Create an anonymous user API in addition or not.

I.e.: you either get NULL or you get actual account information. That is, because aforementioned issue only intends to provide anonymous account info under certain conditions, so getting NULL for anonymous users won't be completely obsolete.

EclipseGc’s picture

Status: Active » Needs review
FileSize
1.45 KB

Ok, Drupal 8 leverages typed data in a bunch of places, and short of special casing the ever loving crap out of users within it, I think we have to consider this is a requirement for release at this point. I'm moving this to major and providing the beginnings of a patch to see what fails and if I can get this moving some.

Eclipse

Status: Needs review » Needs work

The last submitted patch, 350407.patch, failed testing.

rlmumford’s picture

Status: Needs work » Needs review
FileSize
1.94 KB

This patch also comments out a line in theme.inc that was causing the install process to break. It is now left to user module to add the 'user' variable to the default variables using hook_template_preprocess_default_variables_alter(). [I think it does this already]

Status: Needs review » Needs work

The last submitted patch, 350407-47.patch, failed testing.

EclipseGc’s picture

FileSize
2.58 KB

Ok, another revision to fix that test.

EclipseGc’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work
Issue tags: -anonymous users

The last submitted patch, 350407.patch, failed testing.

EclipseGc’s picture

Status: Needs work » Needs review

#49: 350407.patch queued for re-testing.

I don't know why that failed, it passes for me locally on that test. retesting.

Eclipse

Status: Needs review » Needs work
Issue tags: +anonymous users

The last submitted patch, 350407.patch, failed testing.

EclipseGc’s picture

nvm, I'm crazy that doesn't pass locally. Will get it tomorrow.

rlmumford’s picture

Been looking through CommentFormController and it looks like $comment->is_anonymous is not being set to true in buildEntity() - it's actually not being set to anything - despite being set correctly in the form. That means the 'name' doesn't get stored with the comment (as they assume a real user has posted it).

rlmumford’s picture

Status: Needs work » Needs review
FileSize
2.71 KB

I've done some more digging and the problem actually seems to be in the CommentStorageController around line 150.

      // We test the value with '===' because we need to modify anonymous
      // users as well.
      if ($comment->uid->target_id === $user->uid && isset($user->name)) {
        $comment->name->value = $user->name;
      }

I'm not entirely sure what this code is trying to do however, as the anonymous user is a now an entity isset($user->name) is always going to be true and so the name set on the comment is never going to stick. I think we can change this to the following:

      // We test the value with '===' because we need to modify anonymous
      // users as well.
      if ($comment->uid->target_id === $user->uid && $user->uid) {
        $comment->name->value = $user->name;
      }
Dave Reid’s picture

Not sure why we can't just remove the commented-out code from _template_preprocess_default_variables().

Dave Reid’s picture

Status: Needs review » Active

For anyone else that didn't know, the current patches were moved to #1634280: drupal_anonymous_user() should return a User entity. Moving back to active for the original proposal.

Dave Reid’s picture

Issue summary: View changes

Updated issue summary.

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.6 was released on August 2, 2017 and is the final full bugfix release for the Drupal 8.3.x series. Drupal 8.3.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.4.0 on October 4, 2017. (Drupal 8.4.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.4.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.4 was released on January 3, 2018 and is the final full bugfix release for the Drupal 8.4.x series. Drupal 8.4.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.5.0 on March 7, 2018. (Drupal 8.5.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.5.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.6 was released on August 1, 2018 and is the final bugfix release for the Drupal 8.5.x series. Drupal 8.5.x will not receive any further development aside from security fixes. Sites should prepare to update to 8.6.0 on September 5, 2018. (Drupal 8.6.0-rc1 is available for testing.)

Bug reports should be targeted against the 8.6.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.8.x-dev

Drupal 8.6.x will not receive any further development aside from security fixes. Bug reports should be targeted against the 8.8.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.9.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.7 was released on June 3, 2020 and is the final full bugfix release for the Drupal 8.8.x series. Drupal 8.8.x will not receive any further development aside from security fixes. Sites should prepare to update to Drupal 8.9.0 or Drupal 9.0.0 for ongoing support.

Bug reports should be targeted against the 8.9.x-dev branch from now on, and new development or disruptive changes should be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.2.x-dev

Drupal 8 is end-of-life as of November 17, 2021. There will not be further changes made to Drupal 8. Bugfixes are now made to the 9.3.x and higher branches only. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.2.x-dev » 9.3.x-dev
quietone’s picture

Version: 9.3.x-dev » 8.0.x-dev
Issue summary: View changes
Status: Active » Closed (outdated)
Related issues: +#1851086: Replace admin/people with a View

I don't know when it was fixed but anyonymous is no longer appearing the user table at admin/people. It was certainly fixed when converted to a view in #1851086: Replace admin/people with a View.