Problem/Motivation
Core's implementation of the lock API uses the "semaphore" table to implement locking.
This table, like all others, uses the InnoDB storage engine, which introduces a serious bug: any lock can be released by any transaction.
Proposed resolution
Change the storage engine of the semaphore table to MEMORY on MySQL.
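For concreteness, the proposed resolution amounts to something like the following statement. This is a sketch only: the table name assumes an unprefixed default Drupal schema, and later comments in this thread recommend explicit BTREE indexes rather than MEMORY's default HASH.

```sql
-- Sketch of the proposed fix: move the semaphore table out of InnoDB
-- so lock rows are no longer subject to transaction rollbacks and deadlocks.
-- Adjust the table name if your site uses a table prefix.
ALTER TABLE semaphore ENGINE = MEMORY;
```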
Original report by @mikeytown2
#301362: Default to InnoDB in MySQL was great for D7 but the semaphore table should actually be MEMORY instead of InnoDB.
Problem reported elsewhere with wrong conclusion:
http://www.workhabit.com/blog/drupal-innodb-and-myisam-engine-issues-cac...
Recently reported here with the right conclusion: slide 26
http://www.percona.com/resources/technical-presentations/drupal-and-mysq...
Slide 26 contents
Memory engine for semaphore table
● Removes InnoDB overhead
● No disk access needed – table in RAM
● Native implementation (no plugin or modules)
● Non-durable
● Negative – table-level locking can reduce concurrency, but this table is typically small (and hopefully not used often)
Various links on drupal.stackexchange.com where people are complaining about this issue.
http://drupal.stackexchange.com/questions/113042/after-enabling-mysql-re...
http://drupal.stackexchange.com/questions/23500/enabling-modules-my-site...
http://drupal.stackexchange.com/questions/22145/pdo-deadlocks-galore-on-...
| Comment | File | Size | Author |
|---|---|---|---|
| #57 | semaphore-deadlock-on-dev-site.sql_.txt | 7.69 KB | kwerey |
| #46 | deadlock-march-21.sql_.txt | 43.11 KB | mikeytown2 |
| #46 | deadlock-april-9.sql_.txt | 25.11 KB | mikeytown2 |
| #34 | drupal-1898204-34-semaphore-memory-table-D8.patch | 399 bytes | mikeytown2 |
| #25 | drupal-1898204-25-semaphore-memory-table.patch | 381 bytes | mikeytown2 |
Comments
Comment #1
mikeytown2 CreditAttribution: mikeytown2 commented
Initial work for updating old sites:
Comment #2
mikeytown2 CreditAttribution: mikeytown2 commented
Patch to fix fresh installs.
Comment #3
mikeytown2 CreditAttribution: mikeytown2 commented
Comment #4
mikeytown2 CreditAttribution: mikeytown2 commented
Last patch has a typo in it.
Comment #5
mikeytown2 CreditAttribution: mikeytown2 commented
Note about using a memory table (something that could be done for semaphore):
http://stackoverflow.com/questions/4586559/how-to-change-a-heap-memory-m...
http://www.mysqlperformanceblog.com/2008/02/01/performance-gotcha-of-mys...
http://stackoverflow.com/questions/7306316/btree-vs-hashtable
http://dev.mysql.com/doc/refman/5.1/en/memory-storage-engine.html
Comment #6
mikeytown2 CreditAttribution: mikeytown2 commented
Modules that use drupal_get_schema_unprocessed() in order to get the cache table schema.
Output from
gotta_download_them_all/allmodules$ grep -l -r -i " = drupal_get_schema_unprocessed('system', 'cache');" ./
This is for 7.x code only
Comment #7
ron_s CreditAttribution: ron_s commented
Mike, question about this patch... are you suggesting the cache, cache_bootstrap, cache_menu, semaphore, and variable tables should be MyISAM due to performance benefits for high traffic sites?
From reading the WorkHabit post, I wasn't sure if Nick was suggesting that converting existing MyISAM tables to InnoDB is the problem, or if it is also an issue for sites which were originally built using all InnoDB tables.
Thanks for your insights.
Ron
Comment #8
mikeytown2 CreditAttribution: mikeytown2 commented
This issue occurs whether the tables were always InnoDB or were converted from MyISAM.
If the semaphore table is not MyISAM, multiple processes can acquire the same lock; thus it is critical for this table to be MyISAM. This is a bug. All other table changes are for speed.
From what I've seen, the only core cache tables that benefit from InnoDB are cache_form, cache_field, and cache_page. cache_menu is a toss-up: depending on the site, either InnoDB or MyISAM is better, but from what I've seen so far MyISAM performs better.
Writes to the variable table should be rare, and I was seeing issues with queries taking a long time with this table. Once I switched it over to MyISAM, things started to look better.
Comment #9
gielfeldt CreditAttribution: gielfeldt commented
Are you sure about this?
This sounds like a bug that would be discovered quite fast, yet I cannot find any information on this except this thread and the WorkHabit blog?
Comment #10
mikeytown2 CreditAttribution: mikeytown2 commented
The bug is very hard to reproduce because it requires 2 inserts hitting the semaphore table at the same time. Once a MySQL database gets enough traffic (baseline: thousands of transactions per second) the bug will happen about once a month. The end result is usually a corrupt cache that needs to be flushed and/or the database locking up for multiple seconds (deadlock).
Usually most people switch to memcache once issues like this arise, thus the locking issue goes away. We were starting to think about switching this site over to memcache but I wanted to try changing the db engine after issuing SHOW PROCESSLIST and seeing lots of entries for the semaphore table. Long story short, we are still not using memcache for this site (4 months later & with more traffic) and we haven't had issues with the cache getting corrupted.
An alternative to using MyISAM would be to lock the table before doing any writes to it.
This isn't the first race condition I've found BTW :)
http://drupal.org/node/261148#comment-5798678
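The table-locking alternative mentioned in #10 could be sketched roughly as below. This is a hypothetical illustration, not the patch: the lock name and value are placeholders, and the real Drupal code builds them inside lock_acquire().

```sql
-- Rough sketch of the LOCK TABLES alternative from #10.
-- 'my_lock' and 'lock-id-abc' are hypothetical placeholders.
LOCK TABLES semaphore WRITE;
INSERT INTO semaphore (name, value, expire)
VALUES ('my_lock', 'lock-id-abc', UNIX_TIMESTAMP() + 30);
UNLOCK TABLES;
```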
Comment #11
erikwebb CreditAttribution: erikwebb commented
I'm not sure how InnoDB would be causing this problem. 2 writing queries can't be executed at the exact same time, because row locking would jump in. You shouldn't need table-level locking, because each row represents a unique lock/variable/whatever and shouldn't need to be locked as a group. Are you saying this will only happen for a new entry or for any concurrent write to these tables?
Comment #12
mikeytown2 CreditAttribution: mikeytown2 commented
I've seen the processlist fill up with over 100 entries trying to interact with the semaphore table. There could be a bug in MySQL in regards to this.
I don't know why InnoDB is causing this issue but it is reproducible when MySQL is under a lot of load. I'm guessing it's not widely reported because once one sees a lot of errors related to the Drupal cache, most people switch away from database caching.
Comment #13
gielfeldt CreditAttribution: gielfeldt commented
Hmmm..
It could be some next/gap-lock deadlock, if a semaphore is placed/released within a transaction. If this is the case, maybe switching to read-committed isolation level will solve it as well?
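The isolation-level change suggested here would look something like the statement below (global form shown; on MySQL 5.x it can also be set per session or in my.cnf):

```sql
-- Switch from the default REPEATABLE READ to READ COMMITTED,
-- which removes the next-key/gap locks suspected above.
-- (The variable is named tx_isolation on the 5.x servers discussed here.)
SET GLOBAL tx_isolation = 'READ-COMMITTED';
```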
Comment #14
erikwebb CreditAttribution: erikwebb commented
Are there any conditions in Drupal where a transaction could be opened and hang before closing? That may cause some of these issues.
Comment #15
vinoth.3v CreditAttribution: vinoth.3v commented
I could not change that table back to MyISAM due to a 'MySQL: Specified key was too long; max key length is 1000 bytes' error!
But I tried changing it to the MySQL MEMORY engine, and it seems to work fine.
Comment #16
mikeytown2 CreditAttribution: mikeytown2 commented
Going to close this issue and mark it a duplicate of #2164849: Enforce READ COMMITTED transaction isolation level.
By switching the table to MyISAM, we break out of the transaction but lose row level locking. Fixing this properly by running the query outside of the transaction sounds like the proper fix for this.
Comment #17
mikeytown2 CreditAttribution: mikeytown2 commented
Still getting deadlocks on the semaphore table. Changing the engine to MEMORY seems to help; just be sure to use a BTREE index instead of a HASH index. HASH is the default for MEMORY and still has issues that BTREE fixes.
http://www.mysqlperformanceblog.com/2008/02/01/performance-gotcha-of-mys...
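A sketch of the conversion described in #17, forcing BTREE instead of MEMORY's default HASH. The index name `value` is an assumption based on the default Drupal 7 semaphore schema; verify it with SHOW CREATE TABLE first.

```sql
-- Convert the table, then rebuild the secondary index as BTREE
-- (MEMORY defaults to HASH, which the linked post shows is very slow
-- to delete from). Index name assumed; check SHOW CREATE TABLE semaphore.
ALTER TABLE semaphore ENGINE = MEMORY;
ALTER TABLE semaphore DROP INDEX `value`,
                      ADD INDEX `value` (`value`) USING BTREE;
```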
Comment #18
vinoth.3v CreditAttribution: vinoth.3v commented
Also consider adding an index on the (name, value) columns.
Comment #19
mikeytown2 CreditAttribution: mikeytown2 commented
@vinoth.3V
You do have a point.
lock_may_be_available() is the only place that has a condition on name. Almost all other locking code puts a condition on the name and value; the exception to this is lock_release_all which puts a condition on the value.
I do think that would be a different issue though, system_schema() defines the indexes used.
The question is what would be ideal. Make (name, value) the primary key and add a unique name key?
Or add another index called namevalue?
Or does it really matter, since there are almost always fewer than 100 rows in semaphore? If you have more, something is most likely wrong.
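The options weighed in #19 could be expressed as the statements below. The index and key names are hypothetical illustrations, not what core defines; system_schema() remains the source of truth.

```sql
-- Option A: composite primary key plus a unique key on name alone
-- (hypothetical; shown commented out).
-- ALTER TABLE semaphore DROP PRIMARY KEY,
--                       ADD PRIMARY KEY (name, `value`),
--                       ADD UNIQUE KEY name (name);

-- Option B: simply add a secondary composite index (name also hypothetical).
ALTER TABLE semaphore ADD INDEX namevalue (name, `value`);
```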
Comment #20
mikeytown2 CreditAttribution: mikeytown2 commented
MySQL might need the key names to be quoted. Remove the -- if running multiple times.
Or, if MySQL is having issues with BTREEs in memory tables, this might be the best option:
Comment #21
skomorokh CreditAttribution: skomorokh commented
I can confirm that this happens and that memory tables help. I was doing some load testing and MySQL had deadlocks on the semaphore table. Found this issue, switched to a memory table as per #20, and the deadlocks stopped.
Comment #22
mikeytown2 CreditAttribution: mikeytown2 commented
Starting to look into creating a patch for this and it doesn't look good...
DatabaseSchema_mysql::createKeysSql does not support index_type http://dev.mysql.com/doc/refman/5.0/en/create-index.html
| Storage engine | Permissible index types |
|---|---|
| MyISAM | BTREE |
| InnoDB | BTREE |
| MEMORY/HEAP | HASH, BTREE |
| NDB | BTREE, HASH |
Looks like no one has tried to create an index using a non-default index type, because MyISAM and InnoDB only support BTREE.
https://www.google.com/search?q=site%3Adrupal.org%2Fnode+createKeysSql+i...
Comment #23
mikeytown2 CreditAttribution: mikeytown2 commented
Well for now I'll create a patch using the default index type, which for memory tables is HASH.
This patch just changes the defined schema. I'll work on later patches to update existing installs.
Comment #25
mikeytown2 CreditAttribution: mikeytown2 commented
Doing something even simpler.
Comment #26
mikeytown2 CreditAttribution: mikeytown2 commented
Comment #27
mikeytown2 CreditAttribution: mikeytown2 commented
Comment #28
mikeytown2 CreditAttribution: mikeytown2 commented
danblack CreditAttribution: danblack commented
Nice patch in #25 @mikeytown2. Works for me.
The mysqlperformanceblog post describes an issue with HASH-based indexes when there are a large number of index duplicates because the indexed value is always the same. That isn't applicable here, as the value is random and the primary key is unique.
Comment #31
danblack CreditAttribution: danblack commented
25: drupal-1898204-25-semaphore-memory-table.patch queued for re-testing.
Comment #32
mikeytown2 CreditAttribution: mikeytown2 commented
Moving back to RTBC
Comment #33
Crell CreditAttribution: Crell commented
Shouldn't this go into 8.x first? I'm OK with it, but it should go to 8.x first. :-)
Comment #34
mikeytown2 CreditAttribution: mikeytown2 commented
D8 patch. Same as D7 except for paths.
Comment #35
mikeytown2 CreditAttribution: mikeytown2 commented
Patch passes tests for D8.
Comment #36
catch commented
This needs an issue summary update to summarize the issue better. The 20th slide behind a registration form is a bit hard work...
Moving this to memory means it won't respect transactions. i.e. if you acquire a lock within a transaction, and that transaction gets rolled back, then the lock will persist until the end of the request. Before we'd clear the lock with the rest of the transaction. This seems like it's probably fine, but didn't see that discussed anywhere.
Comment #37
mikeytown2 CreditAttribution: mikeytown2 commented
Looks like DatabaseLockBackend::releaseAll() will get run due to the constructor registering it in a shutdown function, similar to D7's _lock_id(). I wonder why the api.d.o site didn't pick up this usage; I ended up searching the code base to make sure this is happening.
Using a backend like memcache will have the same transactional issue. We could release the lock in the catch statement, but that would mean the lock id would need to be known at the catch scope; something that might be hard to get right. Also we could make a guideline that transactions should be started after a lock, similar to how it is being used in RouterRebuildSubscriber::menuLinksRebuild().
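The lock-before-transaction guideline suggested in #37 (mirroring how RouterRebuildSubscriber::menuLinksRebuild() is used) can be sketched in raw SQL. The lock name and value below are hypothetical placeholders for what the lock backend would generate:

```sql
-- Acquire the lock first, outside any transaction, so a rollback
-- cannot silently discard the lock row.
INSERT INTO semaphore (name, value, expire)
VALUES ('router_rebuild', 'lock-id-abc', UNIX_TIMESTAMP() + 30);

START TRANSACTION;
-- ... do the protected work here ...
COMMIT;  -- or ROLLBACK: either way the lock row survives

-- Then release the lock explicitly afterwards.
DELETE FROM semaphore
WHERE name = 'router_rebuild' AND value = 'lock-id-abc';
```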
Comment #38
Berdir commented
The first argument is fairly irrelevant I think, because that happens at the end of the request and not when a transaction is rolled back.
The second paragraph is spot on, though: lock is a pluggable backend, so the fact that it happens in the same transaction is not something that can be relied upon, just like caching (although core does rely on it in some cases, and there can be nasty side effects due to that, both when it's in the same transaction and when it isn't).
Comment #39
mikeytown2 CreditAttribution: mikeytown2 commented
Comment #40
catch commented
Right, it's true that with memcache etc. the same thing can happen; I'm just wondering if we might hit conditions with the database backend that we previously didn't, as a side effect of this patch.
Comment #41
catch commented
This doesn't feel right, but I've not tried using the memory storage engine before, so I don't have incredibly concrete reasons why. Asked nnewton to have a look and comment.
Comment #42
gielfeldt CreditAttribution: gielfeldt commented
I'm not sure if I lost track of this thread. I think maybe I did.
Did we find out if the table locking would cause an issue under high concurrent load? And did switching to READ-COMMITTED not solve the problem?
Comment #43
nnewton CreditAttribution: nnewton commented
I worry about this. Adding table-level locking to semaphores doesn't seem like a positive. I realize the table is small, but it is used often...very often. I would echo gielfeldt's question. The times I've seen semaphore being a large problem under load were with the default tx_isolation.
-N
Comment #44
mikeytown2 CreditAttribution: mikeytown2 commented
I was still having issues with this table when using READ-COMMITTED; once I switched to MEMORY the issues went away. Also note that I'm using a BTREE index instead of HASH due to this http://www.mysqlperformanceblog.com/2008/02/01/performance-gotcha-of-mys... where they report HASH gives 80 deletes per second (12.5ms table locked) and BTREE gives 15,000 deletes per second for the memory table. Having a table-level lock last for 0.07ms is a good tradeoff if that eliminates deadlocks.
Comment #45
nnewton CreditAttribution: nnewton commented
Interesting. I don't suppose you have any of the InnoDB output or lock monitor information? I'm curious as to what exactly you were hitting.
I was having a hard time articulating to myself why I was uncomfortable with this, because it's clearly a valid solution. The MEMORY engine, especially in recent versions, has gotten a lot more usable, and this is basically what it's for. I think at the end of the day it just feels like premature optimization for a lot of installations.
Comment #46
mikeytown2 CreditAttribution: mikeytown2 commented
Noted that most people switch to memcache once they start having issues with the cache and semaphore tables. I'll admit that there are not a lot of data points for this change, but a Google search for "drupal semaphore deadlock" returns 226k results, so people are having this issue. We have a nagios alert that will issue SHOW PROCESSLIST; and SHOW ENGINE INNODB STATUS; once the number of active database threads reaches a threshold. I've attached our last 2 to this issue.
Currently we'll get a deadlock about once every other day on the cache_field table (totally fine), but the last trigger of our nagios alert happened on April 9th (when things could be going bad).
The site in question is 99% logged in users with 3 webheads with master slave MySQL replication; all queries hitting the master currently at a rate of 1 million questions per hour.
The deadlocks on the semaphore table were not causing major issues for us, but checking SHOW ENGINE INNODB STATUS showed they were still happening; moving semaphore to MEMORY fixed those deadlocks. If you really need it, I could change the engine back to InnoDB and wait for a deadlock on the semaphore table. I would rather not, but if information like this is what is preventing this patch from getting accepted then I'll be willing to do it next week.
Comment #47
andypost commented
Interesting bug; I suppose this is a major issue, since a "lock" (expiring mutex) can be released by any transaction.
Updated summary with that
Comment #48
pwolanin CreditAttribution: pwolanin commented
We should backport it all the way. So the 7 and 6 patches will need an update function to change the engine?
Comment #49
mikeytown2 CreditAttribution: mikeytown2 commented
Created a different issue related to MySQL memory tables: #2285423: Define the data column last in cache tables in order to support memory tables on percona MySQL 5.5+
Comment #50
pwolanin CreditAttribution: pwolanin commented
Discussed with catch in IRC: given the unknowns (especially of the upgrade path), a core backport doesn't make sense. Possibly it could be a Pressflow feature.
Comment #51
David Strauss commented
chx asked me to comment here, but I don't have much to say other than recommending Redis (or similar) as the locking backend. Using MySQL/MariaDB for locking via the MEMORY engine may be a marginal improvement, but it will never be the best practice.
Comment #52
Fabianx CreditAttribution: Fabianx commented
To avoid the deadlock issues, can we just use another DB connection, so the lock is not part of the transaction?
For the lock backend this could be pretty simply implemented by using another 'database key' in D7/D8.
That would not even need a core patch then.
The same should be done for cache; the DB backend that uses a different DB key should then use the cache_consistent module.
If we were to switch 8.x to use a different MySQL connection for locks and caches, the nice thing is that we would get lots of new failures where before we took cache consistency for granted with the DB backend but broke all others.
On the other hand, it's one more DB connection per PHP thread.
Comment #53
mikeytown2 CreditAttribution: mikeytown2 commented
Using another DB connection would help; but as you mention, it's another DB connection, and people do have issues with hitting the 150 MySQL connection limit. It does give us more options, which is nice.
If we were to go this route I would say we create options to do sync and async queries, as async queries would have other benefits as well: faster cache sets, preemptive gets, etc. If we're going to play with multiple DB connections we might as well throw async into the mix as well.
Comment #54
Crell CreditAttribution: Crell commented
This all sounds like something that should be done in an optional alternate service now that we support those more directly. Definitely if we're talking about MySQL-specific features like memory tables and async queries. (Side note: async queries may be fine for caching but entirely defeat the point for a lock table.)
Comment #55
mikeytown2 CreditAttribution: mikeytown2 commented
Note: Async queries for cache_set #2336521: Use asynchronous MySQLi connection to avoid RECORD LOCKS (Deadlocks) on cache tables
Comment #56
kwerey CreditAttribution: kwerey commented
I'm glad to find this thread... I'm commenting to give a different example of this row-locking issue for a non-busy, non-production site. I've come across it before in Drupal 7.28 on a production site on shared hosting, and now it's shown up again after copying a production database down onto a local dev site. I don't have much experience in database optimisation, so it's good to see ideas being thrown around.
This is Drupal 7.34 using InnoDB and MySQL 5.5.23; the stray trx locks started after copying the database down from the production site (on a VPS) onto my work machine running Windows. There are a fair few third-party modules on the site, but all of the ones that were removed have been uninstalled correctly. As recommended elsewhere I've made sure the innodb_buffer_pool_size is generous - I set it to a gig.
I think what's happening isn't exactly the case Mikey described, with two transactions just happening to take place at the precise right time - it looks like transactions are hanging somehow and/or locks are never being released even though the transaction's finished. For example, whatever this is seems to be locking a lot of rows for 4 whole minutes.
I uploaded the whole InnoDB status log - if anyone's interested in troubleshooting let me know and I can try to replicate stuff.
Comment #57
kwerey CreditAttribution: kwerey commented
Comment #58
Crell CreditAttribution: Crell commented
Note that for Drupal 8, https://www.drupal.org/node/2306083 combined with #2371709: Move the on-demand-table creation into the database API may make this process a lot easier to optimize per-site. (Let an alternate backend provide a different schema with a different engine for that one use case, poof.)
Comment #59
nnewton CreditAttribution: nnewton commented
I had the opportunity to test this a little bit recently. The MEMORY engine isn't usable for this at high concurrency. Trading row-level locks for table-level locks... not an improvement :)
It's possible that changing this could help for some largish mid-level sites, but as David said, this just isn't the best practice. I don't see the point of changing this for a very small percentage of sites. Smaller/mid-level sites won't need it; InnoDB is fine (if it's not fine, it's misconfigured or something else is happening). For sites that have _just_ outgrown InnoDB for this and do not yet require Memcache/Redis for locking, it would help. However, that is quite a small window to be shooting for.
Comment #60
mikeytown2 CreditAttribution: mikeytown2 commented
@nnewton
Did you use hash or btree indexes on the memory table?
http://www.percona.com/blog/2008/02/01/performance-gotcha-of-mysql-memor...
Comment #61
nnewton CreditAttribution: nnewton commented
Ya, I used BTREE. Even if I had used HASH, though, I don't think I would have hit that issue, because it was failing 'above' that. These do table-level locking. They have the same issue that the query cache does at high concurrency. The lock is simply too hot.
-N
Comment #62
mikeytown2 CreditAttribution: mikeytown2 commented
Thanks for the info, nnewton.
Created this issue #2387371: Create a lock back end that uses the APDQC connection so locks can happen outside of a transaction when using a DB backend.
Comment #63
Crell CreditAttribution: Crell commented
nnewton: Then should we file this as a won't fix, since there are better options?
Comment #64
mikeytown2 CreditAttribution: mikeytown2 commented
The main issue is you can't guarantee a lock will not be taken/released inside of a transaction; thus using the default connection can result in bad things happening. Using a non-transactional table is a quick fix that works surprisingly well. I have gotten reports that switching the semaphore table to MEMORY helps with deadlocks:
https://twitter.com/btopro/status/502914571058024448
Using a secondary DB connection (what my new DB-backed cache module does) would be another way to fix this issue.
Obviously something like memcache is the preferred way to handle locks; but not everyone wants to setup a memcache server. It's true that InnoDB is crazy fast; so much so you can use the memcache api to interface with that engine directly http://dev.mysql.com/doc/refman/5.6/en/innodb-memcached-replication.html and it's been getting tuned like crazy in MySQL 5.7; and the InnoDB performance from 5.5 to 5.6 was a big improvement as well.
I'm thinking that if you're using MySQL 5.5 or less, the memory table might be better than InnoDB; if you're using MySQL 5.6 or higher, InnoDB will probably be better. I'll take this into consideration for apdqc: offering to convert to MEMORY if MySQL is 5.5 or less, and to InnoDB if MySQL is 5.6 or higher.
1st: non-db locking (memcache or equivalent).
2nd: apdqc locking (no eta currently).
3rd: memory table (no deadlocks).
4th: core.
So I'm thinking that even though using the memory table does fix things, the better fix is to use a new DB connection. For today I would say the memory table is better due to no deadlocks, but we don't want to force this via core. Using a 2nd DB connection would be the correct fix for core.
There is a ~8 month window between when PHP 5.4.5 was released and when MySQL 5.6 was GA, so we can expect to see some MySQL 5.5 installs with D8.
Comment #65
mikeytown2 CreditAttribution: mikeytown2 commented
There is one other alternative to MEMORY/InnoDB: MySQL Cluster. Seems nuts, but I thought I would point this out just in case anyone is interested in this http://www.clusterdb.com/mysql-cluster/replacing-memory-storage-engine-w...
BTW since switching the semaphore table back to InnoDB I've been getting deadlock reports but no down time due to the usage of APDQC. It puts lock queries in a different connection and thus outside of any transaction.
Latest deadlock report from prod below:
You can see that these 2 queries came in right around 20ms apart (I've seen 0.1ms before); same key, and the second one fails. All is good, just slightly annoying, because potential deadlocks on another table will get pushed out of this report, as this deadlocks about once an hour.
Comment #66
gonssal commented
Not fixing this for Drupal 8 would be a great failure. The semaphore table is one of the biggest problems in any big Drupal project I've worked on.
Comment #67
deminy commented
For Drupal 7, I added the following update function in my *.install file to make the change:
Comment #68
YuvalBH CreditAttribution: YuvalBH as a volunteer commented
"If the semaphore table is not MyISAM, multiple processes can acquire the same lock; thus it is critical for this table to be MyISAM. This is a bug. All other table changes are for speed."
This still exists with Drupal 7.43!
From what I've seen, the only core cache tables that benefit from InnoDB are cache_form, cache_field, and cache_page. cache_menu is a toss up depending on the site, InnoDB or MyISAM is better; but with what I've seen so far MyISAM performs better.
I changed the following tables to MyISAM in order to fix this issue:
1. Cache
2. Cache_View
3. Cache_bootstrap
4. cache_menu
5. Variables
6. Semaphore (changed to Rewrite according to the above)
7. menu_router
not much help with logs
I am shocked... I spent almost 10 hours on it!
I hope that this will help others.
Comment #69
mikeytown2 CreditAttribution: mikeytown2 commented
@yuval.j.benhur
If you're using MySQL 5.6 or better, odds are your InnoDB configuration needs help. The semaphore table is unique in that the data in it does not need long-term durability; thus the memory table seems like a good fit. The problem with MEMORY/MyISAM is that each write locks the entire table, so when a lot of concurrent writes start happening this can become a bottleneck (see #59). Using InnoDB for the semaphore table is fine; yes, it will create deadlocks, but they will be quickly resolved.
Have you tried the latest dev of https://www.drupal.org/project/apdqc ? It does a pretty good job of telling you what MySQL configuration changes need to happen in order to get innodb running smoothly.
Comment #70
Pronco CreditAttribution: Pronco commented
I am still having some issues converting semaphore to the MEMORY engine with MySQL.
I can set limits as in below and it seems to work
ALTER TABLE semaphore ADD PRIMARY KEY (name(50), value(255)) USING BTREE;
However, I am not quite sure whether it has an adverse impact on the Drupal lock mechanism or not. If it does, please let me know how to convert semaphore without any trouble.
Comment #71
sukh.singh CreditAttribution: sukh.singh commented
For me the MEMORY engine type is not working; I am able to alter the table but get the following errors.
I have talked to one of the DB administrators and they suggested the following...
I am not an expert in database administration, but someone may find this information useful.
I am using Server version: 5.6.24-72.2-log Percona Server (GPL), Release 72.2, Revision 8d0f85b
Comment #72
Elijah Lynn