While running tests I sometimes run out of MySQL connections. This is because Simpletest creates a new connection during setUp() and closes it again during tearDown(), but the database API does not properly close the connection.
There is no explicit PDO::close(). Instead the connection is closed when the connection object is garbage collected. However, this does not happen, because the database API keeps at least two references to the connection object.
In DatabaseConnection::__construct():
$this->setAttribute(PDO::ATTR_STATEMENT_CLASS, array($this->statementClass, array($this)));
The latter $this introduces a circular reference (I think), and these are not necessarily detected when the object comes out of scope (I think this has been improved in recent PHP versions). See this comment.
In DatabaseConnection_mysql::__construct():
register_shutdown_function(array($this, 'nextIdDelete'));
A reference is saved in the shutdown function.
Due to these two lines, the connection object is not destroyed and the connection is thus not closed when DatabaseConnection::closeConnection() or DatabaseConnection::removeConnection() is called.
The following reproduces what happens during setUp() and tearDown():
for ($i = 0; $i < 20; $i++) {
$connection_info = Database::getConnectionInfo('default');
Database::renameConnection('default', 'simpletest_original_default');
Database::addConnectionInfo('default', 'default', $connection_info['default']);
Database::getConnection('default')->nextId();
Database::removeConnection('default');
Database::renameConnection('simpletest_original_default', 'default');
}
This is an issue when running tests. During regular operations you usually don't open and close the database connection many times during a request.
Comment | File | Size | Author |
---|---|---|---|
#66 | db.connection-843114-66.patch | 12.73 KB | Berdir |
#63 | drupal8.db-connection-test-skip.63.patch | 2.68 KB | sun |
#60 | skip-tests.patch | 2.07 KB | Berdir |
#50 | db.connection.50.patch | 12.38 KB | sun |
#50 | interdiff.txt | 6.35 KB | sun |
Comments
Comment #1
BerdirSuggestions on how to fix it?
Your link suggests to avoid array($this) but I'm not sure if we can refactor our code. Is the connection aware of it's key? If yes, we could only pass the $key and then get the connection object throuh Database::getConnection().
The second was a static call in the beginning, but we "fixed" it to not be static anymore. That's actuallly already the second bug because of that line, maybe there is a better way?
Would it help to set the object to null before unsetting it as it is done here: http://ch2.php.net/manual/en/pdo.connections.php?
According to the descriptions on http://php.net/unset, unset can behave in unexpected ways when used on static/global variables.
Comment #2
Crell CreditAttribution: Crell commentedThe first is a requirement of the way PDO works. I am not sure that it actually causes a circular reference; if so, at least not in user-space where we could do anything about it. :-(
For the second, bah. I suggested in another thread, I believe, moving that from a shutdown function to the destructor of the connection itself. So when the connection is about to go down it cleans up after itself first.
We can also see about switching from unset() to = NULL in Database::removeConnection() if it helps.
Comment #3
Damien Tournoud CreditAttribution: Damien Tournoud commentedAgreed on moving nextIdDelete() to the destructor. That's way better then a shutdown function!
Comment #4
BerdirUntested patch to switch to __destruct().
While doing that, I've noticed that there is a bug which would cause the cleanup to run multiple times, that would need to be fixed anyway (we check on $this->shutdownRegistered but set $shutdownRegistered).
PS: Correct issue this time...
Comment #5
chx CreditAttribution: chx commented#4: use_destruct.patch queued for re-testing.
Comment #6
chx CreditAttribution: chx commentedSo create another fake connection, copy the current one, run its nextid twice, unset the connection object to close it and check whether nextid cleanup happened. This is a MySQL specific test but the functionality is MySQL specific too.
Comment #7
quicksketchThis patch is a great code improvement, but it doesn't seem to solve the problem of connections not getting closed properly. I'm writing an update hook that loops through several hundred databases and I keep running out of connections even though I'm calling
Database::removeConnection($key)
. In debugging things a bit, it doesn't look like the connection is truly released until the end of the request.I kept trying to dig deeper and deeper into the Database layer to figure out where this is coming from, but even at the most basic level I can't get the database connection to be freed by PDO.
In the Database::openConnection() method, I even tried immediately shutting down the connection like this:
But even wiping out the connection immediately after opening it won't call the __destruct() method on the connection. As mentioned in the original posting, we might have some kind of circular reference to the DatabaseConnection object somewhere inside itself. Because PHP won't disconnect the connection until all references to the connection variable are removed, it doesn't get cleaned up until the end of the request.
Comment #8
quicksketchTurns out it's not just "some kind of circular reference", it's actually the exact example that @c960657 pointed out in the original posting, with an exact description of the problem. http://www.php.net/manual/en/book.pdo.php#80127
That's exactly what we're doing in our code. Here's the incriminating line:
So by self-referencing the $this variable inside PDO, it makes it so that we can't disconnect the database. Fortunately it doesn't look particularly hard to correct this problem, as we can simply use setAttribute() again and remove the self-reference right before we unset the connection variable. It's not very pretty but it's the only thing we can do without modifying (and changing the API) of the other methods that assume the database connection already exists inside DatabaseStatementBase->$dbh.
Patches attached for both D8 and D7 backport.
Comment #9
quicksketchComment #10
quicksketchDoh, sorry about the status changes... getting it into the right spot.
Comment #12
quicksketchFor anyone needing a work-around in the mean time, you can manually call the setAttribute() function before a call to Database::removeConnection() to accomplish the same thing. Though the shutdown function won't get executed if you've used nextId() for anything, this approach could have other side effects until the patch is committed that fixes the entire problem.
Comment #13
quicksketchHm, actually looks like D8 doesn't have this problem. The whole segment of code that causes the problem was removed at some point. For D8 we can still clean up the code and use __destruct() method rather than a shutdown function though, so here's a slimmed down version for D8.
Comment #15
quicksketchSimpler D8 patch. Though less consistent with the D7 version, the code doesn't need to be so verbose considering the main problem doesn't exist in D8.
Comment #16
sun#15: database_disconnect-d8.patch queued for re-testing.
Comment #17
sunMarked #1791602: Database::closeConnection() does not actually close connection as duplicate of this issue, which contains further proof.
Bumping to critical, since this bug is causing random test failures on HEAD currently.
TestBase::tearDown() calls Database::removeConnection(), which is supposed to also close the connection. Manually injecting an additional Database::closeConnection() does not change the situation. Monitoring the result of
SHOW PROCESSLIST
during a test run shows that new connections are stacking up for every test method that is executed in a test case.Comment #18
sunI'm afraid, not even the major hackfest in attached patch made all the connections close. (Essentially, I explored all options that have been mentioned so far.)
That said, replacing $this with $key, $target for the Statement class did reduce the amount of open connections, but not to a complete level.
I suspect that there are various other pointers to the same connection object passed around; e.g., StatementPrefetch::__construct() also takes the connection object and sets it as a class property.
Help?
Comment #20
sunRemoved all the debugging stuff from #18.
Comment #21
sunSorry, forgot to remove a parent::__destruct().
#18 still applies though. Needs further investigation.
Comment #23
catchInteresting, is this also #930876: PDOException: SQLSTATE[08004] [1040] Too many connections in lock_may_be_available() then?
Comment #24
CrookedNumber CreditAttribution: CrookedNumber commentedRe: #7 and #12
I had a similar task to quicksketch -- looping over 100+ DB's in a single page request. Thankfully, his hack (#12) worked. Details here: http://drupal.org/node/1791602#comment-6528874
Big thanks to quicksketch and sun!
Comment #25
Crell CreditAttribution: Crell commentedThis is no good. The static methods should never ever be called from within the DB classes. Why do we need to do this? Is there no other way to avoid whatever is keeping the connections around? (I'm assuming some sort of circular dependency...)
Comment #26
sunAs far as I'm able to understand the problem space, the situation is this:
1) A new connection is opened.
2) If the db driver doesn't opt-out for the Statement class, the very first reference to the connection is stored in that class.
3) The connection object is handed further to other objects upon performing queries, which leads to further references.
4) When the connection object ought to be destroyed, the references are still intact, and there's no single point in the system that knows about the stack of connection object references elsewhere. Thus, the connection is not closed.
I'm fairly sure that this can even be reproduced fairly easy in a unit test.
(Silly me didn't think of that yet; ran the same web test again and again for debugging, which took minutes for each run. ;))
Comment #27
Crell CreditAttribution: Crell commentedPHP 5.3 should be able to handle circular reference cleanup when there are no remaining external references. I don't know why it wouldn't be doing that.
Is there a way to force PHP to kill an object entirely, rather than just its handle/variable?
Comment #28
sunThat was explicitly denied by PHP core: https://bugs.php.net/bug.php?id=52243
Putting that into a search query funnily yields a stunning amount of results: https://www.google.com/search?q=force+php+to+destroy+object
Gosh, I wish had the idea for this test yesterday already... totally wasted hours of debugging ;)
That said, the test passes with the changes here -- however, I'm going to enhance that overly simple test to run an actual query on the new connection in a minute, which I currently guess to be the point where it's going to fail.
Comment #30
sunNote that the 1 failure between the test-only and the full patch is a different failure.
The test-only patch fails in the right spot; ConnectionUnitTest - a newly opened connection is not closed when closing it directly after opening it.
Comment #31
Crell CreditAttribution: Crell commentedPerhaps we need some sort of ->commitSepuku() method on connections that will actively destroy any references that object has?
Although... if it's the default statement class that's doing it that sounds like a PHP bug, doesn't it? A circular reference in PHP itself?
Comment #32
sunI guess it's not a PHP bug, because PHP does not instruct you to pass the database connection object to the statement class (which isn't really needed in the first place - the only reason for passing it is that call to getLogger()).
Comment #33
Crell CreditAttribution: Crell commentedExcept PDOStatement has a dbh variable on it, which we're just inheriting. That's why it's public. :-) Could it be an issue that only triggers when providing a custom connection handle?
Comment #34
sunHum, where? I don't see a PDOStatement::$dbh property?
Comment #35
sunAdded Connection::destroy() method. Heavily enhanced tests.
Removed most but not all debugging output.
With attached patch, only one test failure remains in ::testOpenSelectQueryClose() -- I stepped through tons of Database component files, but wasn't able to locate where further references are stored.
Comment #37
sunI have absolutely no idea why, because I'm not doing anything at all with regard to database connections over there, but this bug is consistently exposed on all testbots with #1067408: Themes do not have an installation status and thus a critical blocker for that issue.
Comment #38
c960657 CreditAttribution: c960657 commentedUnsetting $this->schema in Connection::destroy() makes testOpenSelectQueryClose() pass for me.
It seems that these tests are not compatible with
run-tests.sh --concurrency 2 ...
.Comment #39
sun@c960657: YOU'RE GENIUS!!! :)
Fixed object reference in Connection::$schema is not unset.
Reverted all other changes, added destroy() docs.
Comment #40
sunFixed typo/grammar.
Comment #42
sunHeh, meh! :)
Fixed variable name clash.
Comment #44
sunFixed Unit tests do not necessarily open a connection, but tearDown() unconditionally removes/closes it.
Comment #46
sunColor me confused on this test failure:
Entity not found in the database.
Other EntityFormTest.php 66 Drupal\system\Tests\Entity\EntityFormTest->testFormCRUD()
Sending for re-test.
Comment #47
sun#44: db.connection.44.patch queued for re-testing.
Comment #48
c960657 CreditAttribution: c960657 commentedSometimes you just need another pair of eyes :-)
If this is MySQL-specific, we should check for this in checkRequirements() (see #1273478: Implement @requires and @dependencies within TestBase, mark tests as skipped). I've have read that on PostgreSQL you can get the number of connections using
select count(*) from pg_stat_activity
, so perhaps we can extend the tests to PostgreSQL also.The concurrence issue is hard to get around. Basically this test will fail if other processes are accessing the database at the same time. I'm not sure whether test bots on d.o run with concurrency > 1. If they do, then this test will generate random test failures on the test bots. One possible solution is to add another requirement (e.g. check for
$_SERVER['HTTP_USER_AGENT'] == 'Drupal command line' && $args['concurrency'] > 1
).An alternative (and probably better) strategy to just counting the number of concurrent connections is to check the specific connection IDs using CONNECTION_ID() and match those against the IDs returned by SHOW PROCESSLIST.
Comment #49
BerdirThe testbot runs at concurrency 8, so the chance of this failing in random ways would be quite high then?
Comment #50
sunAh, thanks for clarifying, now I understand what you meant with the concurrency remark. That's bad. :-/
What we could potentially do is to
1) create an additional, separate "monitor" connection in setUp()
2) retrieve and record the connection ID after opening it in each test
3) use the monitor connection to check whether the connection ID exists after closing the connection
Fixed countConnections() is incompatible with test runner concurrency.
Comment #51
sunI think this is RTBC :)
Comment #52
Crell CreditAttribution: Crell commentedI didn't look over the new tests in fine detail, but the changes to the Connection classes all look sane to me. +1
Comment #53
sunCan we move forward here?
The test uses
SHOW PROCESSLIST;
andSELECT CONNECTION_ID();
queries, which might be MySQL-specific.However, this issue 1) is a critical bug, 2) blocks at least one critical task for D8, and 3) figuring out how this test can work in a db-agnostic way can be a non-critical follow-up, IMHO.
Comment #54
Crell CreditAttribution: Crell commentedLet's do.
Comment #55
webchickCommitted and pushed to 8.x. Woohoo! $critical_bugs--; $random_failures--;
Moving down to 7.x for backport.
Comment #56
catchSHOW PROCESSLIST is MySQL only - can we make that test just run on MySQL for now?
Comment #57
sun#1273478: Implement @requires and @dependencies within TestBase, mark tests as skipped introduced checkRequirements(), but that leads to a test failure. I researched how to fix that properly and we've to change the API over there to make it possible to skip this test when not running against MySQL.
Comment #58
Crell CreditAttribution: Crell commentedA completely unscientific sampling of people in #drupal-contribute agreed that since this is in, and we're now talking follow-up, we can demote this issue.
Comment #59
catchDo any of those people use Drupal 7? Bumping back to critical until there's some justification.
Comment #60
BerdirCan't we do something as simple as this for the moment?
Gives a "No test results to display" error in the UI but seems to work fine in CLI. We could also add a dummy $this->pass() message to avoid that.
Comment #62
Berdir#60: skip-tests.patch queued for re-testing.
Comment #63
sunYes, let's do that for now. I moved the check and @todo into a central location to reduce duplication.
Looks good to go for me.
Comment #64
Berdirless duplication++.
Let's get this in, then backport it to 7.x and get rid of it :)
Comment #65
webchickThis'll work for the purposes of removing a critical, however it's a bit clunky. I would move #1273478: Implement @requires and @dependencies within TestBase, mark tests as skipped to a major task, but it already is one. ;) Lovely.
Committed and pushed to 8.x. Down to 7.x for porting.
Comment #66
BerdirAttaching a backport, connection unit tests pass locally.
Comment #67
sunThanks! That is a clean, cumulative backport.
I additionally verified that the new test appears in the list of test results and that its assertion count indicates that the test was executed on MySQL.
Comment #68
podarok#66 +1 RTBC
Comment #69
David_Rothstein CreditAttribution: David_Rothstein commentedDoes this happen on Drupal 7 too, or only on Drupal 8?
If it happens on Drupal 7, then presumably this would have been causing random test failures for months or years... I didn't think that was the case. If not, then we can probably demote this from critical for Drupal 7.
Comment #70
BerdirTo major? That wouldn't help, we're over the major bug threshold too anyway :)
I'm not sure if it causes test failures or not in 7.x, it is relatively hard to verify because the test failures caused by this are often very cryptic random assertion failures if e.g. a page request dies with a exception.
I've started a test run with --all --concurrency 8 on 7.x and the connection count was around ~30 most of the time and shortly got over 50 during the database tests (I've aborted after that), the connection count also always went quite a bit up and down. If that is enough to cause random failures, I don't know.
With this patch, the connection count stays at a relatively stable 16-21 connections, I haven't seen a higher amount, even during the database tests.
So I think this is quite a nice improvement that can also positively affect real sites which connect to multiple databases and *might* help with the quite often reported "too many open connections" problem, although my guess is that that one is actually caused by a database lock and many parallel connections.
Comment #71
sunThis issue was originally opened against D7, so yes, this is definitely a bug there.
Also, we did not notice the bug for a long time, because it seems like testbots have a maximum limit of ~100 connections. The bug was only exposed due to recent changes in D8, whereas I do not know exactly why - thus far, I assume that the introduction of drupal_container(), plus the database service being registered on the DIC, and additionally, the fact that tests are setting up a new container for every test, is ultimately what caused the connection count to steadily increase. But that's just a wild guess.
As @quicksketch mentioned in #7, you can run into the bug when dealing with many databases.
I won't comment on critical/major priority futzing, because this patch is RTBC and just needs a commit. :P
Comment #72
David_Rothstein CreditAttribution: David_Rothstein commentedOK, thanks. Sounds like we can safely move this back to "normal" for Drupal 7 then (its original status), even though it was critical for Drupal 8.
Still leaving it at RTBC and will look at it eventually, but now there's less of a rush :)
Comment #73
webchickThanks, David! Assigning to you so you can review it at your leisure.
Comment #74
kenyan CreditAttribution: kenyan commentedSo, go with #66?
I'm coming over from:
http://drupal.org/node/930876#comment-6599142
Comment #75
David_Rothstein CreditAttribution: David_Rothstein commentedSorry for not getting this into Drupal 7.17, but I wasn't too excited about committing a potentially dangerous patch like this right before a release.
I reviewed it now and it looked pretty good to me. It was actually less scary-looking than I thought it would be :) But it will still be good to have this in 7.x-dev for a while before the next release, in case people run into any problems with it.
It seems like the tests might need a little cleanup, e.g. duplicate
$connection_info = Database::getConnectionInfo('default');
line (D7 only), some confusion over variable names $originalCount vs. $originalTarget (both D7 and d8?), but I didn't see anything that would prevent this from being committed to fix the bug, so...Committed to 7.x. Thanks! http://drupalcode.org/project/drupal.git/commit/c7e45d9
Comment #76
David_Rothstein CreditAttribution: David_Rothstein commentedDrupal 7.18 was a security release only, so this issue is now scheduled for Drupal 7.19 instead.
Fixing tags accordingly.
Comment #78
pwaterz CreditAttribution: pwaterz commentedIm not sure if my situation is relevant but I am having an issue with the a stale connection object. I have applied the above patch and am still loosing connection. Here is what is going on:
I have drush command that makes several API calls to gather information for creating nodes. These api requests take about 20mins to complete. As soon as I have all the information and try and save the node I get
Cannot modify header information - headers already sent by (output started at /opt/drupal/drush/includes/output.inc:37) bootstrap.inc:1212 [warning]
PDOException: SQLSTATE[HY000]: General error: 2006 MySQL server has gone away: INSERT INTO {watchdog} (uid, type, message, variables, severity, link, location, referer, hostname, timestamp) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5, :db_insert_placeholder_6, :db_insert_placeholder_7, :db_insert_placeholder_8, :db_insert_placeholder_9); Array
(
[:db_insert_placeholder_0] => 0
[:db_insert_placeholder_1] => highwire
[:db_insert_placeholder_2] => Error in processing /ajpcell/299/1/C111.atom SQLSTATE[HY000]: General error: 2006 MySQL server has gone away
[:db_insert_placeholder_3] => a:0:{}
[:db_insert_placeholder_4] => 3
[:db_insert_placeholder_5] =>
[:db_insert_placeholder_6] => http://default/index.php
[:db_insert_placeholder_7] =>
[:db_insert_placeholder_8] => 127.0.0.1
[:db_insert_placeholder_9] => 1357256234
)
in dblog_watchdog() (line 154 of /opt/sites/freshd72/drupal-webroot/modules/dblog/dblog.module).
Drush command terminated abnormally due to an unrecoverable error.
From the little I read about this post, this patch was suppose to fix that.
Comment #79
pwaterz CreditAttribution: pwaterz commentedFor testing purposes I put a random SQL statement in my drush command that made a SQL call every 30 seconds or so, and that caused the connection to stay open. Is there any way to keep the connection alive longer?
Comment #80
David_Rothstein CreditAttribution: David_Rothstein commentedDrupal 7.19 was a security release only, so this issue is now scheduled for Drupal 7.20 instead.
Fixing tags accordingly.
Comment #81
David_Rothstein CreditAttribution: David_Rothstein commentedTag change didn't seem to stick - trying that again.
Comment #82
David_Rothstein CreditAttribution: David_Rothstein commentedDrupal 7.20 was a security release only, so this issue is now scheduled for Drupal 7.21 instead. For real this time... I think :)
Fixing tags accordingly.
Comment #83
David_Rothstein CreditAttribution: David_Rothstein commentedFixing tags since Drupal 7.21 only contained a fix to deal with fallout from the Drupal 7.20 security release. Hopefully this is really the last time I do this and this will be released in Drupal 7.22 for real :)
Comment #84
David_Rothstein CreditAttribution: David_Rothstein commented.
Comment #85
jackbravo CreditAttribution: jackbravo commentedI don't believe you anymore :(
:P
Comment #86
spencerthayer CreditAttribution: spencerthayer commentedThis bug seems to be pretty persistent, doubt we'll see it fixed in .22. Getting it now with 7.21. Updated the thread that ranks first in Google http://drupal.org/node/930876#comment-7165942.
Comment #87
pwaterz CreditAttribution: pwaterz commentedOk so I was able to get around this by chaining my wait time out during run time when I know I have a long process.
//Set the mysql waittime really high because sometimes things take a long time
db_query("SET SESSION wait_timeout=1000000");
I also have experiment with a few other methods.
I tried setting the PDO timeout attribute in my datebase array, but that had no effect.
I also considered extending the mysql connection class and unsetting
protected $statementClass
so that I could use a PDO persistant connection but never tried that, but in theory it should work.Comment #88
pwaterz CreditAttribution: pwaterz commentedSee my comment here http://drupal.org/node/1079892#comment-7195740 for more info about persistant connections
Comment #89
jackbravo CreditAttribution: jackbravo commented@pwaterz, this fix only made it until drupal 7.22. So this probably doesn't happen anymore.
Comment #90
Riki_tiki_tavi CreditAttribution: Riki_tiki_tavi commentedSurprise! Your little hack doesn't work.
I have a fresh shiny drupal 7.31 installation + php 5.4.4 + php-fpm + postgres 9.1.14 + pgpool and it doesn't work.
The code above shows
"PDOException: SQLSTATE[08006] [7] remaining connection slots are reserved for non-replication superuser connections"
after several iterations.So the problem is not in Drupal. The problem is in PDO.
Looks like they are completely ignore a begging to make a good connection close method more than a 2 years.
IMHO, PDO for a D7+ was a very bad choice.
I have a heavy loaded site on D6 + Postgres + pgpool and i can't just upgrade it because of PDO unable to close connection correctly? Are you joking?