Problem/Motivation
PDO driver has a limit to the number of named arguments you can use.
This varies by DB driver.
When using operations that perform bulk actions on entities, such as deleting multiple entities, it is possible to hit this limit.
Steps to reproduce
Proposed resolution
Document that such operations should be performed in chunks.
Remaining tasks
User interface changes
API changes
Data model changes
Release notes snippet
Original report
API page: http://api.drupal.org/api/drupal/modules--node--node.module/function/nod...
Describe the problem you have found:
It should be noted that the argument array to node_delete_multiple() should be limited in size to avoid a PDO error like #1210092: PDOException: SQLSTATE[HY000]: General error: 1 too many SQL variables: SELECT t.* FROM {field_data_body} t WHERE (entity_type =
Comment | File | Size | Author |
---|---|---|---|
#20 | error-1210606-20.png | 18.3 KB | pillarsdotnet |
#2 | documentation-limited_argument_length-1210606-2.patch | 2.32 KB | pillarsdotnet |
#1 | documentation-limited_argument_length-1210606-1.patch | 2.32 KB | pillarsdotnet |
Comments
Comment #1
pillarsdotnet CreditAttribution: pillarsdotnet commentedAdded documentation to the following functions:
node_delete_multiple()
node_load_multiple()
DrupalDefaultEntityController::buildQuery()
Comment #2
pillarsdotnet CreditAttribution: pillarsdotnet commentedThe same docs change, rolled for 8.x
Comment #3
pillarsdotnet CreditAttribution: pillarsdotnet commentedComment #4
jhodgdonWhat exactly is the limit? Adding this documentation seems like a reasonable idea, but maybe it should give me some idea what the limit is, or what it depends on?
Comment #5
pillarsdotnet CreditAttribution: pillarsdotnet commentedI imagine that the limit is dependent on the database driver and/or configuration. In my testing configuration (PHP 5.5.0 with memory_limit set to 256M, PDO/sqlite3 database), 2000 is clearly too high.
I didn't try to find exactly what number breaks the query in my configuration, let alone under a broad range of databases and memory limit configurations.
Comment #6
jhodgdonHm. Can we provide any guidance at all then?
Comment #7
pillarsdotnet CreditAttribution: pillarsdotnet commentedNot without extensive testing that I frankly am not interested in doing. For all I know, it's a fundamental hard-coded limit of PDO and not dependent on database/driver/config at all. If so, then our function should be rewritten to enforce the limits of the language we are using.
Comment #8
salvisYou've identified PDO as the bottleneck for the number of values inside IN()? Good!
Comment #9
pillarsdotnet CreditAttribution: pillarsdotnet commentedNo I have not identified the bottleneck. Nor am I interested in doing so. Please re-read #7 and do not mis-quote me.
Comment #10
salvisSorry — #7:
I disagree. The first priority should be to work around any such limits, as transparently as possible.
In a layered architecture, upper layers shouldn't have to worry about constraints that can be handled in lower layers.
Comment #11
pillarsdotnet CreditAttribution: pillarsdotnet commentedWill salvis please file his own bug report against the Drupal core database system instead of trying to sidetrack this documentation issue?
Comment #12
jhodgdonNo need to be snarky, pillarsdotnet... A polite request would be sufficient, but yes this is a doc issue.
So, back to the patch. You mentioned "out of memory errors", but do you know of this being a possibility? The only error I see in that other issue is a PDO error due to too many placeholders. The part you added to the doc doesn't even mention placeholders or explain how having "all the items in one query" would cause problems. Really, the problem is not that all the results come in one query, it's that the IN() clause has too many items in it, or that there are too may placeholders in the query.
Another thing... It looks like the problem is really in the entity loading controller called by entity_load(). You decided to document this in 3 functions, but there are many others that could be affected, right? So why did you decide on this particular list of 3 functions? It seems a bit arbitrary... and besides the entity loading controller is pluggable.
So I think the best thing would be to document it on entity_load as affecting the default entity controller, and on the functions in the default controller that are affected, and maybe put a statement like:
See entity_load()'s $ids parameter for information about limitations on the $nids argument.
on other functions that might be affected by this. And then in entity_load()'s $ids parameter, maybe something like this to better (I hope) document what is going on:
The default entity controller uses a query that puts all of the individual IDs into the query as placeholders in an IN() SQL statement. Having too many IDs in the array will lead to an overflow error.
Comment #13
pillarsdotnet CreditAttribution: pillarsdotnet commentedSince all of the node objects are being loaded into memory before being deleted, I imagine that in certain edge cases (very complex nodes in combination with a very small PHP memory_limit) there could be an out-of-memory issue. In my experience, this manifests as weird syntax errors in code that is actually syntactically correct.
I'm sure you are more knowledgable than I, but what experience I have in database optimization leads me to believe that extremely long and/or complex queries are detrimental to performance, quite aside from any limits built in to the PDO code.
I try to write documentation from a user standpoint, not from a systems engineering standpoint. All the user needs to know is that putting too many items in that array can lead to errors.
The problem is that the current code and documentation assumes that it is possible and practical to produce unlimited-length SQL queries which load unlimited amounts of data into RAM.
We scratch our own itch. But you're right; I should have submitted three patches, not one.
Marking this a duplicate in favor of:
Comment #14
jhodgdonWhy open those three issues when the real problem is entity_load and the default controller? Wrong solution, IMO.
Comment #15
pillarsdotnet CreditAttribution: pillarsdotnet commentedBecause then you can debate those three issues separately rather than arguing about whether I've managed to find every single function whose documentation exhibits this bug.
Actually, the lowest-level function in the call chain is
DatabaseCondition::compile()
.I got a fatal error while using a very popular Drupal module. I submitted a patch to fix the fatal error, and was told that the module is working as designed. I submitted a documentation patch to help future programmers avoid the fatal error, and am being told that I have wrongly documented the wrong functions. Therefore, I have posted comments to the relevant API pages as a stopgap measure.
At this point, I don't care anymore.
Comment #16
jhodgdonOK. Well let's just document it in *one* place, so that if the code gets fixed, we won't have to go back and fix the doc everywhere else, and put in some references.
Sorry you don't care any more. Hopefully someone else will make a patch that patches the doc in a way that is maintainable and correct.
Comment #17
droplet CreditAttribution: droplet commentedsub from #1212148: Document limited size of the $nids array argument to node_load_multiple() function.
I have a test on it, doesn't like memory issue (Mysql 5 + PHP5.2/5.3)
http://drupal.org/node/1212148#comment-4708288
But i notice @pillarsdotnet is using (PHP 5.5.0 with memory_limit set to 256M, PDO/sqlite3 database)
PHP 5.5.0 ??
can you test it with mysql ? maybe an issue only happen on sqlite3
Comment #18
pillarsdotnet CreditAttribution: pillarsdotnet commentedFrom http://www.sqlite.org/limits.html:
Comment #19
droplet CreditAttribution: droplet commentedwe are better to patch DBTNG to limit it in SQLite
Comment #20
pillarsdotnet CreditAttribution: pillarsdotnet commentedI recompiled sqlite with SQLITE_MAX_VARIABLE_NUMBER set to 65536, then recompiled PHP, and performed the same test with the same results.
Comment #21
kmajzlik CreditAttribution: kmajzlik commentedLimit of arguments for sqlsrv = 2100. Strange thing.
Comment #25
andypostComment #31
larowlanAs this is documentation, its just a task.
Updated the title and issue summary
Comment #32
geek-merlinI disagree with the direction of this. Yes, we have limits on the DB level, which depends on DB type.
Instead of putting the burden of messing with this on higher level, we should fix the DB driver.
#2031261: Make SQLite faster by combining multiple inserts and updates in a single query has working code for this, and even a report of chunking improving performance.
So let's treat this as a BUG or TASK for the DB driver like "Fix DB limits via chunking"