Closed (fixed)
Project:
Drupal core
Version:
7.x-dev
Component:
database system
Priority:
Normal
Category:
Bug report
Assigned:
Issue tags:
Reporter:
Created:
3 Jul 2010 at 08:31 UTC
Updated:
15 Oct 2010 at 21:46 UTC
Jump to comment: Most recent, Most recent file
Comments
Comment #1
chx commentedOngoing, I do not think this issue can be fixed so easily. I rewrote MergeQuery. The reason we struggle so much is that we started from INSERT ON DUPLICATE KEY UPDATE which is a MySQL-ism and turned out to be unusable (hey, I am not surprised).
This patch starts from the standard
and builds from there. It exposes exposes the fact we are building a condition by exposing a usual condition interface. We also expose a full UPDATE and a full INSERT query builder. In fact, it extends UpdateQuery. If I could have extended two classes, I would have done that, but alas I couldn't so I copypasted lots from InsertQuery. Better than copypasting the whole condition shebang from UpdateQuery.
Now, MergeQuery fields and key become simple wrappers around insertFields and updateFields and condition. And so the reason for needing an exception trying to add the same field to key() and value() is finally crystal clear: depending which one is called first, we would get different queries.
Methods like "update" and "updateExcept" are so gone. Did anyone grok this? I certainly did not!
I know the postfix setter is ugly. Sue me. Or roll a better version :)
Comment #3
Anonymous (not verified) commentedwow, interesting, no time to do anything except subscribe for now.
Comment #4
chx commentedSee, that's it. user_role_grant_permission used a db_merge with just a key. That's meaningless. It now involves in the primary key and makes key() a 'magical wish' function. In my version, key() merely builds the condition and adds the same field-value pairs to INSERT. In the original, key() was saying "make it so that this record exists in the table"...
I have updated the query to have module in fields. I think that's what this query wants to do.
Comment #6
chx commentedNow, I went over the MERGE tests and turns out that calling a db_merge with just a key() has a meaning because key() populates all three parts of the query making the UPDATE a no-op. Fine! We can do that. The original version optimized this query out so I have added a needsUpdate flag to optimize out myself.
This patch makes the merge tests pass. It tells volumes what changes were required. Note that nothing else in core now needed to change.
To sum up the advantages of this class:
Comment #7
chx commentedWith less debug but with proper forUpdate method added to select.
Comment #8
chx commentedLittle ugly duckling (=SQLite) support addded. This was not exactly hard...
Edit: for now I stop working on this. I think it's done. Then the reviews will tell me I need another 20-30 rerolls. Bring it on!
Comment #9
chx commentedSigh. Poll :( Just fixing poll fixed dblog too.
Comment #10
chx commentedHm, webchick have committed the SQLite code from the parent issue. This patch deletes it :)
Note that what I said about ready still stand, these last two patches only fixed poll and nuked the brand new SQLite specific MERGE.
Comment #11
Crell commentedNot quite. The equivalence of that query is built, but that's not the actual SQL that runs. It should be documented as such.
Grammar error here. I'm not sure what "this method can called more than once" means.
Actually no, it's now returning the MergeQuery, which is the called object, not the InsertQuery. The same error is present elsewhere, too.
We only need count($this->condition) here, as the count() method is automatically called. That's what the Countable interface is for.
This should explain what FOR UPDATE means/does, since I suspect most people have never heard of it. I certainly hadn't until the previous thread. :-)
This is different than the interface definition, which defaults to TRUE, not NULL.
The other drivers put their SELECT implementations in query.inc, since they're all very small and we don't need to burn another file I/O hit on them. That said, I'm not sure that it's even necessary to split SelectQuery off anymore since we're still hitting that file on every page request anyway due to the menu system. We'll deal with that later.
Overall, I really like that this patch goes back to the original MERGE definition and works forward from there. It looks like we end up in the same place, though, with the final SQL that runs.
However, architecturally extending Merge off of Update is not the right way to go. Suppose we later on want/need a driver-specific Merge? Does it extend off of MergeQuery or UpdateQuery_foo?
The better approach here is composition. MergeQuery should create an insert query and update query internally and use those as needed, and then extend from Query directly.
Also, this is an API change as seen in the unit tests. I don't quite grok the reasons for that, and since webchick is considering making the next release a beta we really really have to be careful with those. :-(
Comment #12
chx commentedIt's awesome that we put SelectQuery implementations in query.inc because I have already filed an issue (which you denied me and I still disagree) stating DBTNG is impossible to follow, now we are not even consistent in where classes are? Yummy. I will file a critical to remove the top level select.inc, then.
MergeQuery does not create an INSERT or an UPDATE query that's the default, rather degenerate implementation of the standard. We have nothing better. I inherited UpdateQuery to avoid copypasting the boilerplate for DatabaseCondition. Aside from expression, I do not use UpdateQuery at all. And so if you need MergeQuery_foo, you can happily inherit it from MergeQuery, as it's quite unlikely you override expression in UpdateQuery_foo as all that method does is collect expressions. We do not use anything important from UpdateQuery, most importantly we override execute. This is a degenerate way to implement the DatabaseConnection-composition trait. I wish I could use traits but they are not available so I cook with what I have. I could have inherited even DeleteQuery for this trait and copypaste the two LoC that UpdateQuery::expression is into MergeQuery::expression. I still do not want to copypaste the whole DC-composition trait in there. (And I dislike we did it already so many times. Can't we put it into, say, Query?)
Comment #13
chx commentedAddressed Crell's comments. Decided to allow and documented that subsequent calls setting the same key-value pairs are OK. Every word of the FOR UPDATE documentation comes from the relevant section of the PostgreSQL manual (though I deleted a few words), I believe this amount of copy-paste is covered by citation rights. Crell needs to review anyways, it's very handy that our legal counsel and database maintainer is the same guy. Thanks dude! :D
Comment #15
chx commentedWell, well. I studied the ANSI SQL standard (a late 2003 draft is available). The subquery I added from INSERT query is clearly not supported:
the INSERT needs VALUES. On the other hand, it's again quite clear that the condition table and the INSERT/UPDATE table need not to be the same so I added a conditionTable method. It's quite nice that this automatically adds support for using a subquery (aka anonymous view) because db_select() supports it.
(as for the override clause in the standard, we are not supporting that, although we could but that would be a feature request against InsertQuery and MergeQuery. Docs here. However, such support would be quite complicated and even Oracle does not seem to support it. Let it rest :) )
Comment #16
chx commentedOh, bah, PHP! Needs an abstract method defined, eh.
Comment #18
chx commentedSigh...
Comment #19
chx commentedAdding the API change tag.
Comment #20
Crell commentedupdateFields() is not documented to take multiple calls the way insertFields() is. Why?
Why can this function be called multiple times, but not others? Why should this one be special?
The first comma is a comma-splice. It should probably be a colon.
Isn't forUpdate() chainable?
Actually this is all quite reasonable. I think. I didn't get to fully read it. But I think I'm probably OK with the overall thing. I need to look at it again when not dog tired. chx, go ahead and address the above. :-)
Powered by Dreditor.
Comment #21
chx commentedupdateFields() is not documented to take multiple calls the way insertFields() is. Why? -- because update::fields can be called multiple times. insert::fields can't be. The comment says "unlike insertQuery::fields() this method can be called more than once" that's a worthy comment. "Like updateQuery::fields() this method can be called more than once" does not say anything. It's not a special function, rather insertQuery::fields was special and once again the comment says so. Added even more here to make this clear: Warning: unlike InsertQuery::fields() this method can be called more than
once, much like UpdateQuery::fields() and MergeQuery::updateFields().
It's NOT a comma splice, it's a list of things. Changed the comment to: The fields are copied to all three parts of the query: the condition, the UPDATE part and the INSERT part.
Yes, forUpdate is chainable. Fixed.
Comment #23
chx commentedwhile forUpdate is chainable addExpression is not. Reverted.
Comment #25
Crell commentedI just spent quite a while staring at this patch, and I think I finally understand it. :-) Some comments below:
This is not true. UpdateQuery::fields() just does a blanket overwrite.
The actual reason this method can be safely called multiple times is because it can be called from keys() and fields(), and needs to handle that gracefully.
Couldn't this be eliminated by just calling $this->insertFields() and $this->updateFields()? It's only one more line, and removes the need for two lines and an extra function call. :-)
This should be $this->connection->insert(...)
This should be $this->connection->update(...)
Other than those, I am OK with this patch. It is an API change but the footprint of it is quite small it and arguably does move the API closer to the actual SQL spec, which we have been trying to use as our baseline.
So let's fix the above, then commit this.
Powered by Dreditor.
Comment #26
chx commentedI have changed the logic inside fields and key both making some of the above not a problem any more. You can't call updateFields/insertFields any more twice. key and fields can be but it's recommended against. Changed db_insert/db_update.
Comment #27
Crell commentedLet's get this in before webchick gets antsy and tags something as beta. :-)
Comment #28
dries commentedReviewed this again and committed it to CVS HEAD. Thanks.
Comment #29
rfayBUSTED! We have an issue titled "Clarify merge queries" without even an issue summary.
Please, an issue summary.
This is marked as API change. Does it have BC implications?
Comment #30
Crell commented"Our previous Merge query implementation was modeled on MySQL's "INSERT ... ON DUPLICATE KEY UPDATE" functionality. However, that did not properly map to the definition of Merge queries in the SQL spec.
The new implementation is modeled directly on the SQL spec's Merge query definition (even if none of the core databases are capable of supporting it directly). In most cases there is no API change. However, code that was using MergeQuery::updateExcept() will need to switch to insertFields() and updateFields(). See the docblock of MergeQuery and the corresponding unit tests for examples."
There's your API change report, Randy. :-) Thanks for following up.
Comment #32
damien tournoud commentedThis introduced two (small) regressions, see #943042: MergeQuery regressions: key is used in the update query, exception is echoed.