Problem/Motivation

The way we currently save node forces MySQL to lock the {node}.vid index longer then necessary, increasing the possibility of deadlock errors happening under load.

Test script:

for ($i = 0; $i < 5000; $i++) {
  $node = (object) array(
    'type' => 'article',
    'status' => 1,
    'uid' => 1,
    'title' => 'Dummy node',
    'body' => array(
      LANGUAGE_NONE => array(
        0 => array(
          'value' => 'Dummy body',
          'format' => 'filtered_html',
        ),
      ),
    ),
  );
  node_save($node);
}

Execute with concurrency (for example with seq 1 100 | xargs -L 1 -P 15 drush php-script stress-node.script), and you should see errors like:

SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction: INSERT INTO {node} (type, title, uid, status, created, changed) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5);

Proposed resolution

The problem lies in the fact that we insert 0 (the default value) in the {node}.vid for a short period of time during save. As a consequence, MySQL has to lock the begining of index attached to the {node}.vid unique key until the transaction ends. This increases the window during which a deadlock is possible.

Proposed resolution is that we insert NULL instead. NULL not being comparable to anything, it doesn't participate in uniqueness constraints and MySQL doesn't have to lock the index.

API changes

None.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Damien Tournoud’s picture

Status: Active » Needs review
FileSize
570 bytes

Patch for D8. No upgrade path is necessary for D8 as I expect this to be applied to D7 too.

Damien Tournoud’s picture

Version: 8.x-dev » 7.x-dev
FileSize
1016 bytes

Patch for D7 with the upgrade path.

Damien Tournoud’s picture

Version: 7.x-dev » 8.x-dev

Status: Needs review » Needs work

The last submitted patch, 1369332-deadlock-node7.patch, failed testing.

Damien Tournoud’s picture

Status: Needs work » Needs review

The test bot wants to apply the D7 patch to D8. Not sure why.

basic’s picture

This looks like an easy way to prevent deadlocks in heavily used node tables. I have seen this issue cropping up with a number of different D7 sites recently.

Are there any other places where the NULL work around could reduce locking?

basic’s picture

Version: 8.x-dev » 7.x-dev
FileSize
1016 bytes

Trying 7.x test again

Damien Tournoud’s picture

I think that's the only place we play with a unique key like this.

Status: Needs review » Needs work

The last submitted patch, 1369332-deadlock-node7.patch, failed testing.

Damien Tournoud’s picture

Status: Needs work » Needs review
FileSize
1023 bytes

I broke the upgrade path.

basic’s picture

I believe this is an important fix for high traffic D7 users. I'm ready to RTBC, but will wait for some more feedback.

jbrown’s picture

There definitely needs to be a comment in node_schema()

basic’s picture

Add a comment referencing this issue with details on the use of NULL vs 0.

andypost’s picture

This require testing for pgsql and sqlite too

sun’s picture

Fixed the comment syntax and formatting.

basic’s picture

This works on a local sqlite db afaics (tests passed). Is anyone available to easily test pgsql? I can spin up an environment but it will take some time.

Damien Tournoud’s picture

Version: 7.x-dev » 8.x-dev
Status: Needs review » Reviewed & tested by the community
FileSize
1018 bytes

Tested on PostgreSQL: works with or without the patch.

Here is a straight reroll for D8 without the upgrade function.

Seems ready to roll for me. RTBC-ed.

basic’s picture

Excellent, looks good to me. Thanks Damien.

basic’s picture

This needs to be committed to D7 (#19) and D8 (#21). Hoping to get it in before the next release.

catch’s picture

Status: Reviewed & tested by the community » Needs work

This looks good but I think we can drop the @see, git blame is good enough for that. Nice find.

sun’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
770 bytes

Removed the @see, back to RTBC.

dude4linux’s picture

I'm seeing this error during installs of Drupal7 sites. Is there a patch for drupal-7.12? Thx.

no_commit_credit’s picture

FileSize
750 bytes
xjm’s picture

@dude4linux -- The issue is tagged for backport to D7, so there is a fair chance it will be included in the next D7 release if nothing else comes up. A D7 version is in #27 for convenience.

dude4linux’s picture

@xjm -- Thanks, that patch worked for me. At least it allowed me to get past the errors and continue until the next problem caused a crash on out-of-memory. :)

webchick’s picture

Don't we need an update hook for this in D7?

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 1369332-27-d7.patch, failed testing.

sun’s picture

Assigned: Unassigned » webchick
Status: Needs work » Needs review

re: Update hook? Yes, totally, but given the current state of #1333898: Support unstable2unstable upgrades? [policy, no patch], it's completely and totally unclear to me which patches should have a D8 update and which not.

Damien Tournoud’s picture

Assigned: webchick » Unassigned
Status: Needs review » Reviewed & tested by the community

#25 is the patch for D8, which doesn't need an upgrade path because it will be in the D7 to D8 upgrade path.

(until the policy is changed)

xjm’s picture

Right, agreed. We need an update hook in the backport for D7, but not in the D8 patch.

webchick’s picture

Version: 8.x-dev » 7.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

Sorry, that's what I meant, sorry for the brevity. #27 isn't complete for D7 because it requires an update function. Since the current policy is still "no update hooks for head2head until beta", the D8 patch is good to go.

Committed and pushed #25 to 8.x, but the 7.x version still needs to be ported.

basic’s picture

D7 patch (from #19) without the @see

basic’s picture

Status: Patch (to be ported) » Needs review
sun’s picture

Status: Needs review » Reviewed & tested by the community

Looks good to me.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Committed and pushed to 7.x. Thanks!

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Gastonia’s picture

Has this already been committed to the downloadable d7 dev release last updated on March 31st?

attiks’s picture

@Gastonia: dev releases are build twice a day, so yes.

somatics’s picture

Has this been rolled into a recommended release for core? I just upgraded to 7.15 yesterday and I'm still getting this error. It's locking us out of critical parts of our live site, so I'm trying to figure out how to fix it asap.

alanom’s picture

Edit: Looks like this was committed in 7.14. I've also experienced this error since the patch (in 7.14), suggesting we're encountering a seperate but similar issue with similar symptoms.

Edit 2: Just occurred to me that this patch is specific to nodes, I think my crash was a different entity type. Is 0 or null the default default (sic) for other entities? If 0 is the default default, might it make sense to apply this logic to all entity types?

Gastonia’s picture

I am still getting the error as of 7.15

PDOException: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction: INSERT INTO {location_search_work} (lid) VALUES (:db_insert_placeholder_0); Array
(
[:db_insert_placeholder_0] => 16636
)
in location_search_locationapi() (line 446 of /Applications/MAMP/htdocs/bab/web/sites/all/modules/location/contrib/location_search/location_search.module).

can anyone confirm that this is no longer an issue after applying the patch?

alanom’s picture

Gastonia - it looks like, like me you're having an issue with a different type of entity to Nodes. Sounds like you're having the problem with Location entities, something to do with a Location Search module, which I'm guessing is part of or an extension of the Location module? I think my problem was with Relation entities.

So it would be great if someone could answer my question: is the 0 / NULL that causes this problem in Nodes the default default for entity types that don't have their own default? If it is, could there be serious consequences to re-working this patch, either:

  1. ...for the default entity objects, or...
  2. ...on a case-by-case basis in response to errors (e.g. for relations, for locations, etc)?
Heine’s picture

Gastonia’s picture

Am I interpreting correctly what alanomaly is saying above is that each module and maintainer is responsible for putting in the appropriate logic to prevent this from happening?

Should I put a copy of my error in the Locations project issue queue as well as others as this comes up for {there table}?

Our team has written serveral modules that use tables. Should we be rewriting our db code to avoid this? If so, what should we be looking for or incorporating? I am still constantly getting errors like this:

PDOException: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction: INSERT INTO {location_instance} (nid, vid, uid, genid, lid) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4); Array ( [:db_insert_placeholder_0] => 1059 [:db_insert_placeholder_1] => 1059 [:db_insert_placeholder_2] => 0 [:db_insert_placeholder_3] => cck:field_item_location:1059 [:db_insert_placeholder_4] => 1000 ) in location_save_locations() (line 974 of /var/www/buyagainbaby.com/sites/all/modules/location/location.module).

We have an extremely high traffic site for input via a team of people using the node forms and coders batch uploading content as data comes in.

What can be done to address this?

Thanks

alanom’s picture

@Gastonia - My knowledge of the Drupal entity system is pretty thin. But if you can prove that it is Location entities that cause the problem, then alerting the Location guys to the existence of the issue certainly can't hurt.

It's just a theory of mine that this issue exists at the generic entity layer and has been fixed only at the node layer - but it's a theory that fits all the available evidence so far. I'd suggest flagging it up with the Location guys who can hopefully cast an expert eye on the issue, and, if it's causing you a serious problem, maybe create an issue on the core queue opening discussion on getting a generic fix for all entities, not just nodes. Hopefully you can get input from the Location guys soon enough to be useful, and maybe you might see some useful progress on the Core issue in 2013.

Actually, scratch that last point - if you can demonstrate that this is indeed a generic issue with all entities, while the fix only applies to nodes, then I'd argue that the best action (in addition to flagging it up to the Location guys short-term) is probably re-opening this issue re-titled for Entity focus with 'Component' switched to 'Entity system', since relevant people are already following this thread and since relevant content and code is already in this thread. Potentially might get some useful action on it at core level this year that way (or, confirmation that my theory is wrong, which would also be useful).

Sudhanshu1987’s picture

FileSize
443 bytes

Hi All,

I applied the patch in #36 to allow null in vid. It alone did not help. I had to apply a patch in node.module to not allow vid to be added programatically. Then all things started wotking.
so essentially we need patch in #36 followed by the patch I am submitting.

Thanks,
Sudhanshu

igasi’s picture

Sorry comment in incorrect issue...

https://www.drupal.org/node/2032881

jhedstrom’s picture

For folks that stumble across this issue when trying to fix deadlocks on the node_field_data table, this updated script will result in deadlocks (for testing)

<?php

use \Drupal\node\Entity\Node;

for ($i = 0; $i < 5000; $i++) {
  $node = Node::create([
    'type' => 'article',
    'title' => 'A title',
    'uid' => 1,
    'body' => [
      'format' => 'filtered_html',
      'value' => 'Foo',
    ],
  ]);
  $node->save();
}

this script is then run as mentioned in the issue summary:

seq 1 100 | xargs -L 1 -P 15 drush php-script ./test.php

and the fix mentioned here regarding changing transaction isolation in settings.php seems to resolve the issues:

$databases['default']['default']['init_commands'] = array('isolation' => "SET SESSION tx_isolation='READ-COMMITTED'");
jhedstrom’s picture

I opened #2846880: Avoid deadlock issues on {node_field_data} table, as this is quite easy to produce on a vanilla D8 install for the node_field_data table.