This is the error in watchdog:

Duplicate entry '67c8c88248a6aebf6c273a50751374b3' for key 1 query: INSERT INTO sessions (sid, uid, cache, hostname, session, timestamp) VALUES ('67c8c88248a6aebf6c273a50751374b3', 0, 0, '<ip-address>', '', 1201162948) in /home/gelstle/public_html/client2/includes/database.mysql.inc on line 172.

Has happened about 8 times in the last 3 months.

Example from this morning: anonymous visitor lands on page that has 5 images whose src is defined in the form system/files/filename.jpg. What seems to be happening is:

First image: no problem, at end of request, http://api.drupal.org/api/function/sess_write/5 inserts the session into the {sessions} table.
2nd and 3rd images: the requests for these arrive at the server almost simultaneously with the first. At the end of the request, in sess_write(), the SELECT query returns a null result because the INSERT query has not run yet. These generate the error.
4th and 5th images: no problem

I can't see anything in 5.6 (or 6.x for that matter) that would change this behaviour in this situation.

Comments

gpk’s picture

This search brings up related but seemingly different issues that have all been fixed AFAICS: http://drupal.org/search/node/Duplicate+entry+for+%22key+1+query%22%3A+%...

c960657’s picture

StatusFileSize
new2.07 KB

I get this error several times a day. Because of the trigger_error() call in _db_query(), the error is written to the Apache error log and shows up in our log analyzer.

This patch for Drupal 5 just mutes any error from db_query("INSERT INTO ...") and in case of an error tries an UPDATE instead. What do you think of this?

An alternative approach is to use REPLACE INTO instead on MySQL and something similar on PostgreSQL, but I guess that makes the code more complicated with little gain. I may even be faster to just try the INSERT and then retry with an UPDATE in the rare occasion that the INSERT fails.

c960657’s picture

StatusFileSize
new2.33 KB

Here is a patch for Drupal 6.

BTW I don't know why the trigger_error() call shows up in my Apache error log. It should have been caught by Drupal's error handler, but apparently this doesn't happen. Whatever happens is probably not relevant for this issue, though.

gpk’s picture

Status: Active » Needs review

Looks promising, may need applying to 7.x (Drupal 6 patch looks fine) first then 6.x and 5.x finally.

Agree REPLACE INTO would be nice but not an option because of PostgreSQL incompatibility.

The @db_query("INSERT ... as you propose therefore seems a sensible option.

The only other way I can think is to try an UPDATE first and then if !db_affected_rows() try an INSERT, again muting the INSERT in case another thread got there first. Finally if the INSERT failed do an UPDATE after all. This offers a slight performance improvement in that existing sessions only require 1 query instead of 2; the improvement is not specifically related to this issue though. Also the line if ($user->uid || $value || count($_COOKIE)) { appears redundant, since if we've got that far then one or other of $_COOKIE[session_name()] or $value is non-empty, and hence $value || count($_COOKIE) will evaluate to TRUE anyway. Again, not really to do with this specific issue, these points would probably need to be dealt with separately and in 7.x only.

Not tested yet.. Have you tried the patch on your live site?

c960657’s picture

We put it on our production Drupal 5 site yesterday. So far we haven't seen any problems. I'll be out of town the next few days, but I'll get back with an updated status report on Monday.

You are right that the function could be optimized even further. It may be a good idea to completely eliminate the UPDATE {sessions} unless $value had actually changed or at most every X minutes has passed since the session was last written, similar to the session_write_interval check that was introduced in Drupal 6 for UPDATE {users}.

I suggest that we start out with a simple patch that can be applied to Drupal 5, 6, and 7 and then investigate further optimization for Drupal 7 in a separate issue. I don't mind giving it a shot.

c960657’s picture

So far there has been no problems.

I am new to Drupal development, so I'd like to hear your suggestion on how to proceed.

gpk’s picture

This will need to be fixed first in 7.x and 6.x before 5.x. (Actually Gábor seems to be committing patches to 6.x before Dries does same to 7.x - not sure if this is "normal"!) But before that it needs to be "Reviewed and Tested By the Community". On my list ...

jakeg’s picture

subscribing. I get these duplicates and also others for the following tables:
- node
- history
- cache_menu

See http://drupal.org/node/230029 re. node table duplicates

c960657’s picture

StatusFileSize
new2.33 KB

Patch updated for HEAD.

damien tournoud’s picture

Version: 5.1 » 7.x-dev
Status: Needs review » Needs work

The patch is on the right tracks, but we have concurrency issues at more places in the Drupal code base, and the solution used elsewhere is to do first an UPDATE and if it fails an INSERT. This is better because it is more common for the value to be present than absent. From variable_set(), for example:

  db_query("UPDATE {variable} SET value = '%s' WHERE name = '%s'", $serialized_value, $name);
  if (!db_affected_rows()) {
    @db_query("INSERT INTO {variable} (name, value) VALUES ('%s', '%s')", $name, $serialized_value);
  }

I would like to see that solution applied in session.inc, because its cleaner, more easily maintainable (we know the solution works because it is used elsewhere), and more efficient it the common case.

Also, it is common here that bugs get corrected in the current development version (7.x), and then only backported.

c960657’s picture

StatusFileSize
new2.93 KB

Patch updated to use the first-UPDATE-then-INSERT pattern. I also removed the redundant "if" clause (see comment 4).

One possible optimization is to only do the UPDATE if one of the values had changed since the user was loaded from the database in sess_read(), or if session_write_interval has passed since session.timestamp was last updated. This would reduce the number of UPDATE queries quite a bit. Would you prefer this to be discussed in a separate issue?

damien tournoud’s picture

Status: Needs work » Needs review

Putting this into "need review" mode.

The patch looks very sane.

@c960657: you can't really know if one of the value changed, except by saving the initial values. I'm not sure it really worth it.

c960657’s picture

FWIW, a similar suggestion was discussed (and rejected) in #40545: Improve speed by avoiding unnecessary updates in sess_write().

vpiotr’s picture

For me this patch is required, I've made it for 5.7 - every day I'm receiving this error (cased by imagecache probably, but it is sess_write() which is incorrect). Please include in D 5.8 too.

pwolanin’s picture

I've seen this on a 4.7 site with high volume, where it cause some really ugly stuff like profile fields for the anonymous user getting values for some other user.

moshe weitzman’s picture

Status: Needs review » Needs work

This will perform worse when the site is indexed by crawlers since you are doing a write query there. The patch removes the optimization for requests where no cookie is presented.

damien tournoud’s picture

Status: Needs work » Needs review

@moshe: the behavior didn't change.

What keeps the crawlers out of the session table is the check:

if (!session_save_session() || (empty($_COOKIE[session_name()]) && empty($value))) { (condition 1)
  return TRUE;
}

The logic that was below in the code is actually flawed:

if ($user->uid || $value || count($_COOKIE)) { (condition 2)
  [ insert the cookie ]
}

When we are here, we already passed condition 1, so we have either a not empty $value OR at least a cookie (!empty($_COOKIE[session_name()]). So condition 2 is always true, and thus pretty useless.

Back to code need review. Do you have any other comments Moshe?

c960657’s picture

StatusFileSize
new2.98 KB

Updated Drupal 6 backport of sess_write-drupal7_2.patch.

damien tournoud’s picture

This looks ready to go in. Does the patch from d7.x still apply (I don't have access to my test system from here)?

c960657’s picture

Yes, the D7 patch still applies.

dries’s picture

I'm not particularly fond of the @-trick. We try to avoid that at all costs because it is not an elegant way to do things.

damien tournoud’s picture

Status: Needs review » Reviewed & tested by the community

@Dries: as I explained in #10, this is currently our standard way of dealing with concurrency issues in the MERGE pattern.

From cache_set():

  db_query("UPDATE {" . $table . "} SET data = %b, created = %d, expire = %d, he
aders = '%s', serialized = %d WHERE cid = '%s'", $data, $created, $expire, $head
ers, $serialized, $cid);
  if (!db_affected_rows()) {
    @db_query("INSERT INTO {" . $table . "} (cid, data, created, expire, headers, serialized) VALUES ('%s', %b, %d, %d, '%s', %d)", $cid, $data, $created, $expire, $headers, $serialized);
  }

From variable_set():

    db_query("UPDATE {variable} SET value = '%s' WHERE name = '%s'", $serialized
_value, $name);
    if (!db_affected_rows()) {
      @db_query("INSERT INTO {variable} (name, value) VALUES ('%s', '%s')", $name, $serialized_value);
    }

The DB:TNG will have a merge() operation that is specifically meant to do this in a clever way. In the meantime, this should be committed as soon as possible, at least to D6 (which, without this is still not ready for primetime...).

dries’s picture

Version: 7.x-dev » 6.x-dev

OK, I can't think of a better solution for now so I'll swallow the @. Committed to CVS HEAD. Thanks for your comments, Damien. Much appreciated.

gpk’s picture

Clarification: a diff reveals that the D6 and D7 versions of this patch are almost identical. The only difference arises from changes in the comment block describing the decision on whether the session will be written to the DB or not. The D6 version refers to

the throttle module and the "Who's Online" block

whereas the D7 version only mentions

the "Who's Online" block

damien tournoud’s picture

The 6.x patch in #18 looks ready.

egfrith’s picture

A closely related patch to this one is at #293612: user_authenticate() should work when $_COOKIE is empty, which puts the $user->uid == 0 logic back where it belongs, in the remaining test for crawlers. It would be great if the patch at #293612: user_authenticate() should work when $_COOKIE is empty could be applied just after the patch here has been applied.

arhak’s picture

subscribing

gpk’s picture

#52662: race condition in sess_write looks like a duplicate.

Wesley Tanaka’s picture

Subscribing via #52662

yang_yi_cn’s picture

It's very interesting, but did it commit into the 6.x branch?

However, I don't think it actually solves the race condition problem.

The problem is originally because there could be race condition between the first SELECT query and the second INSERT query.

Now, same race condition can happen between the UPDATE query and the INSERT query. There will still be an error happening when the record is not available at the time of UPDATE but already exists at the time of INSERT. Of course, we hide the PHP error message by using @. But to me it's not acceptable, because I have a heavy load site and doing MySQL Master-Master replication, once there's an error in the sessions table, the replication was stopped.

What I'm suggesting is to use the

INSERT...ON DUPLICATE KEY UPDATE

syntax.

http://dev.mysql.com/doc/refman/5.0/en/insert.html

yang_yi_cn’s picture

    if ($user->uid || $value || count($_COOKIE)) {
      db_query("INSERT INTO {sessions} (sid, uid, cache, hostname, session, timestamp) VALUES ('%s', %d, %d, '%s', '%s', %d) ON DUPLICATE KEY UPDATE uid = %d, cache = %d, hostname = '%s', session = '%s', timestamp = %d ", $key, $user->uid, isset($user->cache) ? $user->cache : '', ip_address(), $value, time(), $user->uid, isset($user->cache) ? $user->cache : '', ip_address(), $value, time());
    }
damien tournoud’s picture

@yang_yi_cn: INSERT ... ON DUPLICATE is a MySQL-only construct. On Drupal 7, this will be a MERGE, but for Drupal 6 there are no cleaner ways to do this while maintaining cross-database portability.

#18 is still ready to be committeed on D6.

gábor hojtsy’s picture

Status: Reviewed & tested by the community » Fixed

Committed to 6.x, thanks!

c960657’s picture

Note that this problem has reoccurred in HEAD. This regression is being tracked in #297860: Regression: Convert session.inc to the new database API (and _sess_write should use a merge).

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.

gpk’s picture

Version: 6.x-dev » 5.x-dev
Status: Closed (fixed) » Patch (to be ported)

And now for 5.x, where we started ... #18 needs to be revised. (#2 is I think the only D5 patch here, and while it apparently works is not the way that #18 went.)

c960657’s picture

StatusFileSize
new3.04 KB

This is a backport of the D6 patch. It includes the D5 backport in #293612: user_authenticate() should work when $_COOKIE is empty which is RTBC, and a backport of the part of the part of #130971: The kitchen sink (remove notices, fix bugs, enhance code styling) that is related to session.inc.

When reviewing this, you may want to apply the patch and diff against the DRUPAL-6 branch. The only differences in sess_write should be the fixes for #142773: Drupal does not fully work when using a reverse proxy and the fix for #40545: Improve speed by avoiding unnecessary updates in sess_write() that have not been backported to D5.

c960657’s picture

Status: Patch (to be ported) » Needs review
c960657’s picture

StatusFileSize
new2.96 KB

Reroll of D5 patch (the backport of #293612: user_authenticate() should work when $_COOKIE is empty was just committed to the D5 branch).

drumm’s picture

Status: Needs review » Fixed

I did commit this to 5.x, with a bad commit message. I did a another trivial change to get the credit in the CVS log.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.