Comments

chx’s picture

StatusFileSize
new19.96 KB

Ongoing, I do not think this issue can be fixed so easily. I rewrote MergeQuery. The reason we struggle so much is that we started from INSERT ON DUPLICATE KEY UPDATE which is a MySQL-ism and turned out to be unusable (hey, I am not surprised).

This patch starts from the standard

MERGE INTO table_name USING table_name ON (condition)
  WHEN MATCHED THEN
  UPDATE SET column1 = value1 [, column2 = value2 ...]
  WHEN NOT MATCHED THEN
  INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...

and builds from there. It exposes exposes the fact we are building a condition by exposing a usual condition interface. We also expose a full UPDATE and a full INSERT query builder. In fact, it extends UpdateQuery. If I could have extended two classes, I would have done that, but alas I couldn't so I copypasted lots from InsertQuery. Better than copypasting the whole condition shebang from UpdateQuery.

Now, MergeQuery fields and key become simple wrappers around insertFields and updateFields and condition. And so the reason for needing an exception trying to add the same field to key() and value() is finally crystal clear: depending which one is called first, we would get different queries.

Methods like "update" and "updateExcept" are so gone. Did anyone grok this? I certainly did not!

I know the postfix setter is ugly. Sue me. Or roll a better version :)

Status: Needs review » Needs work

The last submitted patch, stop_monkeying_around_with_merge.patch, failed testing.

Anonymous’s picture

wow, interesting, no time to do anything except subscribe for now.

chx’s picture

Status: Needs work » Needs review
StatusFileSize
new20.5 KB

See, that's it. user_role_grant_permission used a db_merge with just a key. That's meaningless. It now involves in the primary key and makes key() a 'magical wish' function. In my version, key() merely builds the condition and adds the same field-value pairs to INSERT. In the original, key() was saying "make it so that this record exists in the table"...

I have updated the query to have module in fields. I think that's what this query wants to do.

Status: Needs review » Needs work

The last submitted patch, stop_monkeying_around_with_merge.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
StatusFileSize
new24.33 KB

Now, I went over the MERGE tests and turns out that calling a db_merge with just a key() has a meaning because key() populates all three parts of the query making the UPDATE a no-op. Fine! We can do that. The original version optimized this query out so I have added a needsUpdate flag to optimize out myself.

This patch makes the merge tests pass. It tells volumes what changes were required. Note that nothing else in core now needed to change.

To sum up the advantages of this class:

  1. Makes it trivial to implement the MERGE in standard-following databases. The three we support are not ANSI SQL compatible in this regard.
  2. Consistent with the existing query builders. UpdateQuery is extended and InsertQuery is copypasted with only the relevant fields() method names changed to insertFields and updateFields.
  3. Makes it easy to understand what's going on by making key() and fields() a wrapper around the inherited methods described above.
  4. Disallows the confusing and already invalid addition of same column-value to key() and fields().
chx’s picture

StatusFileSize
new25.64 KB

With less debug but with proper forUpdate method added to select.

chx’s picture

StatusFileSize
new26.18 KB

Little ugly duckling (=SQLite) support addded. This was not exactly hard...

Edit: for now I stop working on this. I think it's done. Then the reviews will tell me I need another 20-30 rerolls. Bring it on!

chx’s picture

StatusFileSize
new26.78 KB

Sigh. Poll :( Just fixing poll fixed dblog too.

chx’s picture

StatusFileSize
new29.4 KB

Hm, webchick have committed the SQLite code from the parent issue. This patch deletes it :)

Note that what I said about ready still stand, these last two patches only fixed poll and nuked the brand new SQLite specific MERGE.

Crell’s picture

Status: Needs review » Needs work
+++ includes/database/query.inc	2010-07-03 19:38:55 +0000
@@ -1190,6 +868,319 @@ class UpdateQuery extends Query implemen
+ * The following query is built:
+ *
+ * @code ¶
+ * MERGE INTO table_name USING table_name ON (condition)
+ *   WHEN MATCHED THEN
+ *   UPDATE SET column1 = value1 [, column2 = value2 ...]
+ *   WHEN NOT MATCHED THEN
+ *   INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...
+ * @endcode

Not quite. The equivalence of that query is built, but that's not the actual SQL that runs. It should be documented as such.

+++ includes/database/query.inc	2010-07-03 19:38:55 +0000
@@ -1190,6 +868,319 @@ class UpdateQuery extends Query implemen
+   * Warning: unlike InsertQuery::fields() this method can called more than
+   * once for different fields. Subsequent additions of the same field throws
+   * an InvalidMergeQueryException.

Grammar error here. I'm not sure what "this method can called more than once" means.

+++ includes/database/query.inc	2010-07-03 19:38:55 +0000
@@ -1190,6 +868,319 @@ class UpdateQuery extends Query implemen
+   * @return InsertQuery
+   *   The called object.
+   */
+  public function insertFields(array $fields, array $values = array()) {

Actually no, it's now returning the MergeQuery, which is the called object, not the InsertQuery. The same error is present elsewhere, too.

+++ includes/database/query.inc	2010-07-03 19:38:55 +0000
@@ -1190,6 +868,319 @@ class UpdateQuery extends Query implemen
+      if (!$this->condition->count()) {
+        throw new InvalidMergeQueryException(t('Invalid merge query: no conditions'));
+      }

We only need count($this->condition) here, as the count() method is automatically called. That's what the Countable interface is for.

+++ includes/database/select.inc	2010-07-03 19:38:55 +0000
@@ -485,6 +485,17 @@ interface SelectQueryInterface extends Q
+  /**
+   * Add FOR UPDATE to the query.
+   *
+   * @param $set
+   *   IF TRUE, FOR UPDATE will be added to the query, if FALSE then it won't.
+   *  ¶
+   * @return QueryConditionInterface
+   *   The called object.
+   */
+  public function forUpdate($set = TRUE);

This should explain what FOR UPDATE means/does, since I suspect most people have never heard of it. I certainly hadn't until the previous thread. :-)

+++ includes/database/select.inc	2010-07-03 19:38:55 +0000
@@ -1014,6 +1035,12 @@ class SelectQuery extends Query implemen
+  public function forUpdate($set = NULL) {

This is different than the interface definition, which defaults to TRUE, not NULL.

+++ includes/database/sqlite/query.inc	2010-07-03 21:23:34 +0000
@@ -119,65 +119,6 @@ class UpdateQuery_sqlite extends UpdateQ
=== added file 'includes/database/sqlite/select.inc'

=== added file 'includes/database/sqlite/select.inc'
--- includes/database/sqlite/select.inc	1970-01-01 00:00:00 +0000

--- includes/database/sqlite/select.inc	1970-01-01 00:00:00 +0000
+++ includes/database/sqlite/select.inc	2010-07-03 19:16:41 +0000

+++ includes/database/sqlite/select.inc	2010-07-03 19:16:41 +0000
+++ includes/database/sqlite/select.inc	2010-07-03 19:16:41 +0000
@@ -0,0 +1,17 @@

The other drivers put their SELECT implementations in query.inc, since they're all very small and we don't need to burn another file I/O hit on them. That said, I'm not sure that it's even necessary to split SelectQuery off anymore since we're still hitting that file on every page request anyway due to the menu system. We'll deal with that later.

Overall, I really like that this patch goes back to the original MERGE definition and works forward from there. It looks like we end up in the same place, though, with the final SQL that runs.

However, architecturally extending Merge off of Update is not the right way to go. Suppose we later on want/need a driver-specific Merge? Does it extend off of MergeQuery or UpdateQuery_foo?

The better approach here is composition. MergeQuery should create an insert query and update query internally and use those as needed, and then extend from Query directly.

Also, this is an API change as seen in the unit tests. I don't quite grok the reasons for that, and since webchick is considering making the next release a beta we really really have to be careful with those. :-(

chx’s picture

It's awesome that we put SelectQuery implementations in query.inc because I have already filed an issue (which you denied me and I still disagree) stating DBTNG is impossible to follow, now we are not even consistent in where classes are? Yummy. I will file a critical to remove the top level select.inc, then.

MergeQuery does not create an INSERT or an UPDATE query that's the default, rather degenerate implementation of the standard. We have nothing better. I inherited UpdateQuery to avoid copypasting the boilerplate for DatabaseCondition. Aside from expression, I do not use UpdateQuery at all. And so if you need MergeQuery_foo, you can happily inherit it from MergeQuery, as it's quite unlikely you override expression in UpdateQuery_foo as all that method does is collect expressions. We do not use anything important from UpdateQuery, most importantly we override execute. This is a degenerate way to implement the DatabaseConnection-composition trait. I wish I could use traits but they are not available so I cook with what I have. I could have inherited even DeleteQuery for this trait and copypaste the two LoC that UpdateQuery::expression is into MergeQuery::expression. I still do not want to copypaste the whole DC-composition trait in there. (And I dislike we did it already so many times. Can't we put it into, say, Query?)

chx’s picture

Status: Needs work » Needs review
StatusFileSize
new30.72 KB

Addressed Crell's comments. Decided to allow and documented that subsequent calls setting the same key-value pairs are OK. Every word of the FOR UPDATE documentation comes from the relevant section of the PostgreSQL manual (though I deleted a few words), I believe this amount of copy-paste is covered by citation rights. Crell needs to review anyways, it's very handy that our legal counsel and database maintainer is the same guy. Thanks dude! :D

Status: Needs review » Needs work

The last submitted patch, merge_clarify.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
StatusFileSize
new31.13 KB

Well, well. I studied the ANSI SQL standard (a late 2003 draft is available). The subquery I added from INSERT query is clearly not supported:

<merge insert specification> ::=
INSERT [ <left paren> <insert column list> <right paren> ]
[ <override clause> ]
VALUES <merge insert value list>

the INSERT needs VALUES. On the other hand, it's again quite clear that the condition table and the INSERT/UPDATE table need not to be the same so I added a conditionTable method. It's quite nice that this automatically adds support for using a subquery (aka anonymous view) because db_select() supports it.

(as for the override clause in the standard, we are not supporting that, although we could but that would be a feature request against InsertQuery and MergeQuery. Docs here. However, such support would be quite complicated and even Oracle does not seem to support it. Let it rest :) )

chx’s picture

StatusFileSize
new31.17 KB

Oh, bah, PHP! Needs an abstract method defined, eh.

Status: Needs review » Needs work

The last submitted patch, merge_clarify.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
StatusFileSize
new31.18 KB

Sigh...

chx’s picture

Issue tags: +API change

Adding the API change tag.

Crell’s picture

+++ includes/database/query.inc	2010-08-16 08:53:24 +0000
@@ -1206,6 +880,376 @@ class UpdateQuery extends Query implemen
+   * Adds a set of field->value pairs to be updated.

updateFields() is not documented to take multiple calls the way insertFields() is. Why?

+++ includes/database/query.inc	2010-08-16 08:53:24 +0000
@@ -1206,6 +880,376 @@ class UpdateQuery extends Query implemen
+  /**
+   * Adds a set of field->value pairs to be inserted.
+   *
+   * Warning: unlike InsertQuery::fields() this method can be called more than
+   * once.
+   *

Why can this function be called multiple times, but not others? Why should this one be special?

+++ includes/database/query.inc	2010-08-16 08:53:24 +0000
@@ -1206,6 +880,376 @@ class UpdateQuery extends Query implemen
+   * The fields are copied to the condition, the UPDATE part and the ISNERT
+   * part all. If no other method is called, the UPDATE will become a no-op.

The first comma is a comma-splice. It should probably be a colon.

+++ includes/database/query.inc	2010-08-16 08:53:24 +0000
@@ -1206,6 +880,376 @@ class UpdateQuery extends Query implemen
+      $select = $this->connection->select($this->conditionTable)
+        ->condition($this->condition)
+        ->forUpdate();
+      $select->addExpression('1');

Isn't forUpdate() chainable?

Actually this is all quite reasonable. I think. I didn't get to fully read it. But I think I'm probably OK with the overall thing. I need to look at it again when not dog tired. chx, go ahead and address the above. :-)

Powered by Dreditor.

chx’s picture

StatusFileSize
new31.27 KB

updateFields() is not documented to take multiple calls the way insertFields() is. Why? -- because update::fields can be called multiple times. insert::fields can't be. The comment says "unlike insertQuery::fields() this method can be called more than once" that's a worthy comment. "Like updateQuery::fields() this method can be called more than once" does not say anything. It's not a special function, rather insertQuery::fields was special and once again the comment says so. Added even more here to make this clear: Warning: unlike InsertQuery::fields() this method can be called more than
once, much like UpdateQuery::fields() and MergeQuery::updateFields().

It's NOT a comma splice, it's a list of things. Changed the comment to: The fields are copied to all three parts of the query: the condition, the UPDATE part and the INSERT part.

Yes, forUpdate is chainable. Fixed.

Status: Needs review » Needs work

The last submitted patch, merge_clarify.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
StatusFileSize
new31.27 KB

while forUpdate is chainable addExpression is not. Reverted.

Crell’s picture

Status: Needs review » Needs work

I just spent quite a while staring at this patch, and I think I finally understand it. :-) Some comments below:

+++ includes/database/query.inc	2010-09-10 21:21:56 +0000
@@ -1206,6 +880,377 @@ class UpdateQuery extends Query implemen
+   * Warning: unlike InsertQuery::fields() this method can be called more than
+   * once, much like UpdateQuery::fields() and MergeQuery::updateFields().

This is not true. UpdateQuery::fields() just does a blanket overwrite.

The actual reason this method can be safely called multiple times is because it can be called from keys() and fields(), and needs to handle that gracefully.

+++ includes/database/query.inc	2010-09-10 21:21:56 +0000
@@ -1206,6 +880,377 @@ class UpdateQuery extends Query implemen
+    // The MergeQuery::fields method sets needsUpdate to TRUE unconditionally,
+    // however if only key() is called, the UPDATE is not necessary, so save
+    // the value of the needsUpdate flag and restore after.
+    $needsUpdate = $this->needsUpdate;
+    $this->fields($fields);
+    $this->needsUpdate = $needsUpdate;

Couldn't this be eliminated by just calling $this->insertFields() and $this->updateFields()? It's only one more line, and removes the need for two lines and an extra function call. :-)

+++ includes/database/query.inc	2010-09-10 21:21:56 +0000
@@ -1206,6 +880,377 @@ class UpdateQuery extends Query implemen
+          $insert = db_insert($this->table)->fields($this->insertFields);

This should be $this->connection->insert(...)

+++ includes/database/query.inc	2010-09-10 21:21:56 +0000
@@ -1206,6 +880,377 @@ class UpdateQuery extends Query implemen
+        $update = db_update($this->table)

This should be $this->connection->update(...)

Other than those, I am OK with this patch. It is an API change but the footprint of it is quite small it and arguably does move the API closer to the actual SQL spec, which we have been trying to use as our baseline.

So let's fix the above, then commit this.

Powered by Dreditor.

chx’s picture

Status: Needs work » Needs review
StatusFileSize
new31.62 KB

I have changed the logic inside fields and key both making some of the above not a problem any more. You can't call updateFields/insertFields any more twice. key and fields can be but it's recommended against. Changed db_insert/db_update.

Crell’s picture

Status: Needs review » Reviewed & tested by the community

Let's get this in before webchick gets antsy and tags something as beta. :-)

dries’s picture

Status: Reviewed & tested by the community » Fixed

Reviewed this again and committed it to CVS HEAD. Thanks.

rfay’s picture

If you are a human please disregard this issue for now. If you are the testing bot then I want to see what do you say.

BUSTED! We have an issue titled "Clarify merge queries" without even an issue summary.

Please, an issue summary.

This is marked as API change. Does it have BC implications?

Crell’s picture

"Our previous Merge query implementation was modeled on MySQL's "INSERT ... ON DUPLICATE KEY UPDATE" functionality. However, that did not properly map to the definition of Merge queries in the SQL spec.

The new implementation is modeled directly on the SQL spec's Merge query definition (even if none of the core databases are capable of supporting it directly). In most cases there is no API change. However, code that was using MergeQuery::updateExcept() will need to switch to insertFields() and updateFields(). See the docblock of MergeQuery and the corresponding unit tests for examples."

There's your API change report, Randy. :-) Thanks for following up.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

damien tournoud’s picture