Community Documentation

Merge queries using db_merge

Last updated December 11, 2012. Created by Crell on September 18, 2008.
Edited by jbrauer, greggles, texas-bronius, jhodgdon. Log in to edit this page.

Merge queries are a special type of hybrid query. Although a syntax is defined for them in the SQL 2003 specification, virtually no database supports the standard syntax. Most, however, provide some alternate implementation using a database-specific syntax. The Merge query builder in Drupal abstracts the concept of a Merge query out into a structured object that can be compiled down to the appropriate syntax for each database. These are sometimes called "UPSERT" queries, a combination of UPDATE and INSERT.

In the general sense, a Merge query is a combination of an Insert query and an Update query. If a given condition is met, such as a row with a given primary key already existing, then an Update query is run. If not, an Insert query is run. In the most common case, it is equivalent to:

<?php
if (db_result(db_query("SELECT COUNT(*) FROM {example} WHERE id=:id", array(':id' => $id))->fetchField())) {
 
// Run an update using WHERE id = $id
}
else {
 
// Run an insert, inserting $id for id
}
?>

The actual implementation varies widely from database to database. Note that while Merge queries are conceptually an atomic operation, they may or may not be truly atomic depending on the implementation for a specific database. The MySQL implementation is a single atomic query, for example, but the degenerate case (above) is not.

The most common idioms for Merge queries are listed below.

Just set it

<?php
db_merge
('example')
  ->
key(array('name' => $name))
  ->
fields(array(
     
'field1' => $value1,
     
'field2' => $value2,
  ))
  ->
execute();
?>

In the above example, we instruct the query to operate on the "example" table. We then specify one key field, 'name', with a value of $name. We then specify an array of values to set.

If a row already exists in which the field "name" has the value $name, then fields field1 and field2 will be set to the corresponding values in that existing row. If such a row does not exist, one will be created in which name has the value $name, field1 has the value $value1, and field2 has the value $value2. Thus at the end of the query, the end result is the same regardless of whether or not the row already existed.

Conditional set

In some cases, you may want to set values differently depending on whether or not the record, as identified by the key() fields, already existed. There are two ways to do that.

<?php
db_merge
('example')
  ->
key(array('name' => $name))
  ->
fields(array(
     
'field1' => $value1,
     
'field2' => $value2,
  ))
  ->
updateFields(array(
   
'field1' => $alternate1,
  ))
  ->
execute();
?>

The above example will behave the same as the first, except that if the record already exists and we are updating it field1 will be set to $alternate1 instead of $value1 and field2 will not be affected. The update() method accepts either a single associative array of values or two numeric arrays, one fields one values, that must be in the same order.

<?php
db_merge
('example')
  ->
key(array('name' => $name))
  ->
fields(array(
     
'field1' => $value1,
     
'field2' => $value2,
  ))
  ->
expression('field1', 'field1 + :inc', array(':inc' => 1))
  ->
execute();
?>

In this example, if the record already exists then field1 will be set to its current value plus 1. That makes it very useful for "counter queries", where you want to increment some counter in the database every time a certain event happens. field2 will still be set to the same value regardless of whether the record exists or not.

Note that expression() may be called multiple times, once for each field that should be set to an expression if the record already exists. The first parameter is the field, the second is an SQL fragment indicating the expression the field should be set to, and the optional third parameter is an array of placeholder values to insert into the expression.

There is also no requirement that a field used in expression() be already present in fields().

Precedence

Given the above API it is quite possible to define queries that do not logically make sense, say if a field is set to both be ignored and to be set to an expression if the record already exists. To minimize potential errors, the following rules apply:

  • If a field is set to an expression(), that takes priority over update().
  • If values are specified in update(), only those fields will be altered if the record already exists. Fields not specified in update() will not be affected.

Note that it may still be possible to define queries that do not make sense. It is up to the developer to ensure that a nonsensical query is not specified as the behavior in that case is undefined.

Comments

An exception will be thrown

Note the following from database.inc:

/**
* Exception thrown for merge queries that do not make semantic sense.
*
* There are many ways that a merge query could be malformed.  They should all
* throw this exception and set an appropriately descriptive message.
*/
class InvalidMergeQueryException extends Exception {}

This should get mentioned in the last section, Precedence, I think. You won't find it mentioned in the db_merge documentation either.

------------
We are born naked, wet, and hungry. Then things get worse.

updateFields, not update

The method for specifying update fields is updateFields(), not update(). And you should probably point out that the call to updateFields() should appear after the call to fields(), since the fields() logic overwrites the list of updated fields.

updateExcept() does not exist

The method updateExcept() does not exist, according to the documentation of MergeQuery.

I find MergeQuery a bit confusing. The most easy and safe way to use it is like this: MergeQuery::key() should have a list of keys used for SELECT, MergeQuery::insertFields() should have an EXPLICIT list of the fields that should be used for INSERT (including those listed by key() if needed), and MergeQuery::updateFields() should have an EXPLICIT list of the fields that should be used for UPDATE.

Example:

<?php
    db_merge
('block')
      ->
key(array(
         
'module' => $module,
         
'delta'  => $delta,
        ))
      ->
insertFields(array(
         
'module' => $module,
         
'delta'  => $delta,
         
'cache'  => -1,
         
'status' => 1,
         
'weight' => $weight,
         
'region' => 'right',
         
'theme'  => $theme_key,
        ))
      ->
updateFields(array(
         
'cache'  => -1,
         
'status' => 1,
         
'weight' => $weight,
        ))
      ->
execute();
?>

However, if insert and update fields are the same, and also the keys are auto-increment, the MergeQuery::fields() can be used instead of MergeQuery::insertFields() and MergeQuery::updateFields(). So, it is neither safe nor reasonable to mix fields() with insertFields() and updateFields().

Returns last insert ID?

Is there a way to return the insert IDs from a merge query, if an INSERT was made?

checking the status

By setting assigning the db_merge to a variable, you can then access the status 1 = insert, 2 = update

<?php
$DBWriteStatus
= db_merge('tablename')
  ->
key(array(
       
'keyfield' => $keyvalue,
        ))
  ->
fields(array(
     
'field2' => $field2value,
     
'field3' => $field3value,
  ))
  ->
execute();


switch (
$DBWriteStatus) {
  case
1:
    echo
"database insert completed";
    break;
  case
2:
    echo
"database update completed";
    break;
  Default:
    echo
"sorry... as far as I know, nothing happened";
    break;
}
?>

use STATUS_INSERT and

use MergeQuery::STATUS_INSERT and MergeQuery::STATUS_UPDATE constants

I wish to help open source!

How would you handle composite keys

for a table with a composite primary key such as:

CREATE TABLE IF NOT EXISTS {example} (
      `field1` bigint (20) NOT NULL,
      `field2` bigint(20) NOT NULL,
      `field3` int(11) NOT NULL,
      PRIMARY KEY `field1_field2_field3` (`field1`, `field2`,`field3`))

------
How would I write the db_merge statement? Thinking out loud, I would do something like this:
db_merge('example')
  ->key(array(
      'field1' => $field1,
      'field2' => $field2,
      'field3' => $field3,
))
  ->fields(array(
      'field1' => $value1,
      'field2' => $value2,
  ))
  ->execute();

The real question is how do I know that this is searching using the index?

axiom3279

Auto-increment primary key column

I struggled with the fact that the order of the methods fields() and insertFields() is important: a call to insertFields() clears any fields that have been set in fields().

<?php
// cid is the auto-increment primary key.
db_merge('client')
  ->
key(array('cid' => $edit['cid']))
  ->
insertFields(array('created' => REQUEST_TIME, 'cid' => NULL))
  ->
fields(array(
   
'name' => $edit['name'],
   
'description' => $edit['description'],
   
'public' => $edit['public'],
   
'updated' => REQUEST_TIME,
  ))
  ->
execute();
?>

The following does not work, insertFields clears any columns set in fields() and defines just the 'created' column. This leads to an SQL error because some of the fields above are defined to be not null.
<?php
db_merge
('client')
  ->
key(array('cid' => $edit['cid']))
  ->
fields(array(
   
'name' => $edit['name'],
   
'description' => $edit['description'],
   
'public' => $edit['public'],
   
'updated' => REQUEST_TIME,
  ))
  ->
insertFields(array('created' => REQUEST_TIME, 'cid' => NULL))
  ->
execute();
?>
nobody click here