Problem/Motivation
The way we currently save node forces MySQL to lock the {node}.vid
index longer then necessary, increasing the possibility of deadlock errors happening under load.
Test script:
for ($i = 0; $i < 5000; $i++) {
$node = (object) array(
'type' => 'article',
'status' => 1,
'uid' => 1,
'title' => 'Dummy node',
'body' => array(
LANGUAGE_NONE => array(
0 => array(
'value' => 'Dummy body',
'format' => 'filtered_html',
),
),
),
);
node_save($node);
}
Execute with concurrency (for example with seq 1 100 | xargs -L 1 -P 15 drush php-script stress-node.script
), and you should see errors like:
SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction: INSERT INTO {node} (type, title, uid, status, created, changed) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5);
Proposed resolution
The problem lies in the fact that we insert 0
(the default value) in the {node}.vid
for a short period of time during save. As a consequence, MySQL has to lock the begining of index attached to the {node}.vid
unique key until the transaction ends. This increases the window during which a deadlock is possible.
Proposed resolution is that we insert NULL
instead. NULL
not being comparable to anything, it doesn't participate in uniqueness constraints and MySQL doesn't have to lock the index.
API changes
None.
Comment | File | Size | Author |
---|---|---|---|
#50 | patch.txt | 443 bytes | Sudhanshu1987 |
#36 | drupal7.node-vid-deadlock.36.patch | 1.19 KB | basic |
#27 | 1369332-27-d7.patch | 750 bytes | no_commit_credit |
#25 | drupal8.node-vid-deadlock.25.patch | 770 bytes | sun |
#21 | 1369332-drupal.node-vid-deadlock.patch | 1018 bytes | Damien Tournoud |
Comments
Comment #1
Damien Tournoud CreditAttribution: Damien Tournoud commentedPatch for D8. No upgrade path is necessary for D8 as I expect this to be applied to D7 too.
Comment #2
Damien Tournoud CreditAttribution: Damien Tournoud commentedPatch for D7 with the upgrade path.
Comment #3
Damien Tournoud CreditAttribution: Damien Tournoud commentedComment #9
Damien Tournoud CreditAttribution: Damien Tournoud commentedThe test bot wants to apply the D7 patch to D8. Not sure why.
Comment #10
basic CreditAttribution: basic commentedThis looks like an easy way to prevent deadlocks in heavily used node tables. I have seen this issue cropping up with a number of different D7 sites recently.
Are there any other places where the NULL work around could reduce locking?
Comment #11
basic CreditAttribution: basic commentedTrying 7.x test again
Comment #12
Damien Tournoud CreditAttribution: Damien Tournoud commentedI think that's the only place we play with a unique key like this.
Comment #14
Damien Tournoud CreditAttribution: Damien Tournoud commentedI broke the upgrade path.
Comment #15
basic CreditAttribution: basic commentedI believe this is an important fix for high traffic D7 users. I'm ready to RTBC, but will wait for some more feedback.
Comment #16
jbrown CreditAttribution: jbrown commentedThere definitely needs to be a comment in node_schema()
Comment #17
basic CreditAttribution: basic commentedAdd a comment referencing this issue with details on the use of NULL vs 0.
Comment #18
andypostThis require testing for pgsql and sqlite too
Comment #19
sunFixed the comment syntax and formatting.
Comment #20
basic CreditAttribution: basic commentedThis works on a local sqlite db afaics (tests passed). Is anyone available to easily test pgsql? I can spin up an environment but it will take some time.
Comment #21
Damien Tournoud CreditAttribution: Damien Tournoud commentedTested on PostgreSQL: works with or without the patch.
Here is a straight reroll for D8 without the upgrade function.
Seems ready to roll for me. RTBC-ed.
Comment #22
basic CreditAttribution: basic commentedExcellent, looks good to me. Thanks Damien.
Comment #23
basic CreditAttribution: basic commentedThis needs to be committed to D7 (#19) and D8 (#21). Hoping to get it in before the next release.
Comment #24
catchThis looks good but I think we can drop the @see, git blame is good enough for that. Nice find.
Comment #25
sunRemoved the @see, back to RTBC.
Comment #26
dude4linux CreditAttribution: dude4linux commentedI'm seeing this error during installs of Drupal7 sites. Is there a patch for drupal-7.12? Thx.
Comment #27
no_commit_credit CreditAttribution: no_commit_credit commentedComment #28
xjm@dude4linux -- The issue is tagged for backport to D7, so there is a fair chance it will be included in the next D7 release if nothing else comes up. A D7 version is in #27 for convenience.
Comment #29
dude4linux CreditAttribution: dude4linux commented@xjm -- Thanks, that patch worked for me. At least it allowed me to get past the errors and continue until the next problem caused a crash on out-of-memory. :)
Comment #30
webchickDon't we need an update hook for this in D7?
Comment #32
sunre: Update hook? Yes, totally, but given the current state of #1333898: Support unstable2unstable upgrades? [policy, no patch], it's completely and totally unclear to me which patches should have a D8 update and which not.
Comment #33
Damien Tournoud CreditAttribution: Damien Tournoud commented#25 is the patch for D8, which doesn't need an upgrade path because it will be in the D7 to D8 upgrade path.
(until the policy is changed)
Comment #34
xjmRight, agreed. We need an update hook in the backport for D7, but not in the D8 patch.
Comment #35
webchickSorry, that's what I meant, sorry for the brevity. #27 isn't complete for D7 because it requires an update function. Since the current policy is still "no update hooks for head2head until beta", the D8 patch is good to go.
Committed and pushed #25 to 8.x, but the 7.x version still needs to be ported.
Comment #36
basic CreditAttribution: basic commentedD7 patch (from #19) without the @see
Comment #37
basic CreditAttribution: basic commentedComment #38
sunLooks good to me.
Comment #39
webchickCommitted and pushed to 7.x. Thanks!
Comment #41
Gastonia CreditAttribution: Gastonia commentedHas this already been committed to the downloadable d7 dev release last updated on March 31st?
Comment #42
attiks CreditAttribution: attiks commented@Gastonia: dev releases are build twice a day, so yes.
Comment #43
somatics CreditAttribution: somatics commentedHas this been rolled into a recommended release for core? I just upgraded to 7.15 yesterday and I'm still getting this error. It's locking us out of critical parts of our live site, so I'm trying to figure out how to fix it asap.
Comment #44
alanom CreditAttribution: alanom commentedEdit: Looks like this was committed in 7.14. I've also experienced this error since the patch (in 7.14), suggesting we're encountering a seperate but similar issue with similar symptoms.
Edit 2: Just occurred to me that this patch is specific to nodes, I think my crash was a different entity type. Is 0 or null the default default (sic) for other entities? If 0 is the default default, might it make sense to apply this logic to all entity types?
Comment #45
Gastonia CreditAttribution: Gastonia commentedI am still getting the error as of 7.15
PDOException: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction: INSERT INTO {location_search_work} (lid) VALUES (:db_insert_placeholder_0); Array
(
[:db_insert_placeholder_0] => 16636
)
in location_search_locationapi() (line 446 of /Applications/MAMP/htdocs/bab/web/sites/all/modules/location/contrib/location_search/location_search.module).
can anyone confirm that this is no longer an issue after applying the patch?
Comment #46
alanom CreditAttribution: alanom commentedGastonia - it looks like, like me you're having an issue with a different type of entity to Nodes. Sounds like you're having the problem with Location entities, something to do with a Location Search module, which I'm guessing is part of or an extension of the Location module? I think my problem was with Relation entities.
So it would be great if someone could answer my question: is the 0 / NULL that causes this problem in Nodes the default default for entity types that don't have their own default? If it is, could there be serious consequences to re-working this patch, either:
Comment #47
Heine CreditAttribution: Heine commentedPlease see #937284: DEADLOCK errors on MergeQuery INSERT due to InnoDB gap locking when condition in SELECT ... FOR UPDATE results in 0 rows for other deadlock issues
Comment #48
Gastonia CreditAttribution: Gastonia commentedAm I interpreting correctly what alanomaly is saying above is that each module and maintainer is responsible for putting in the appropriate logic to prevent this from happening?
Should I put a copy of my error in the Locations project issue queue as well as others as this comes up for {there table}?
Our team has written serveral modules that use tables. Should we be rewriting our db code to avoid this? If so, what should we be looking for or incorporating? I am still constantly getting errors like this:
We have an extremely high traffic site for input via a team of people using the node forms and coders batch uploading content as data comes in.
What can be done to address this?
Thanks
Comment #49
alanom CreditAttribution: alanom commented@Gastonia - My knowledge of the Drupal entity system is pretty thin. But if you can prove that it is Location entities that cause the problem, then alerting the Location guys to the existence of the issue certainly can't hurt.
It's just a theory of mine that this issue exists at the generic entity layer and has been fixed only at the node layer - but it's a theory that fits all the available evidence so far. I'd suggest flagging it up with the Location guys who can hopefully cast an expert eye on the issue, and, if it's causing you a serious problem, maybe create an issue on the core queue opening discussion on getting a generic fix for all entities, not just nodes. Hopefully you can get input from the Location guys soon enough to be useful, and maybe you might see some useful progress on the Core issue in 2013.
Actually, scratch that last point - if you can demonstrate that this is indeed a generic issue with all entities, while the fix only applies to nodes, then I'd argue that the best action (in addition to flagging it up to the Location guys short-term) is probably re-opening this issue re-titled for Entity focus with 'Component' switched to 'Entity system', since relevant people are already following this thread and since relevant content and code is already in this thread. Potentially might get some useful action on it at core level this year that way (or, confirmation that my theory is wrong, which would also be useful).
Comment #50
Sudhanshu1987 CreditAttribution: Sudhanshu1987 commentedHi All,
I applied the patch in #36 to allow null in vid. It alone did not help. I had to apply a patch in node.module to not allow vid to be added programatically. Then all things started wotking.
so essentially we need patch in #36 followed by the patch I am submitting.
Thanks,
Sudhanshu
Comment #51
igasi CreditAttribution: igasi commentedSorry comment in incorrect issue...
https://www.drupal.org/node/2032881
Comment #52
jhedstromFor folks that stumble across this issue when trying to fix deadlocks on the
node_field_data
table, this updated script will result in deadlocks (for testing)this script is then run as mentioned in the issue summary:
and the fix mentioned here regarding changing transaction isolation in settings.php seems to resolve the issues:
Comment #53
jhedstromI opened #2846880: Avoid deadlock issues on {node_field_data} table, as this is quite easy to produce on a vanilla D8 install for the
node_field_data
table.