While looking through drupal and the API documentation (drupaldocs.org) to learn how to create my first module, I noticed a potential database issue. This only relates to sites that use database replication. And it only relates to executing a delete followed by an insert of the same primary key. At the moment it's very very rare this would ever occur but I thought I'd mention it for future-proofing.

There are generally two types of replication (that I know of): transaction-based and replicator-master. For transaction-based the transaction logs are passed around to each server and queries are guaranteed to execute in the order submitted by the application. For master-based replication all queries are sent to a master server, usually balanced across threads, and passed to each slave. The master will usually not guarantee the order of queries (as each query can be passed to any thread). So a delete followed by an insert may end up executing on another database server as an insert followed by a delete. The insert fails on a duplicate key and the delete runs. Some or all of the servers have then lost that record.

I found the following code in includes/bootstrap.inc on drupaldocs.org:

function variable_set($name, $value) {
  global $conf;

  db_query("DELETE FROM {variable} WHERE name = '%s'", $name);
  db_query("INSERT INTO {variable} (name, value) VALUES ('%s', '%s')", $name, serialize($value));
  cache_clear_all('variables');

  $conf[$name] = $value;
} 

If this were run on the systems I use at my job there would be a good chance that the variable would disappear across all database servers. It depends on the systems used and the DBA's settings. It does create more code having to check if a record exists in order to choose between an update and an insert, but it eliminates potential problems on large systems. I haven't looked through all the code and contributed modules to see how common this is and I don't know how many systems out there are set up with master-slave replication, but I would just keep it in mind for the future as new code is written.

If anyone thinks it's important enough I'll write it up as a page in the handbook.

Comments

dries’s picture

It would be interesting to discuss some best practices, or to compare different solutions used by other projects (if any).

glen-1’s picture

what about using REPLACE INTO syntax?

http://dev.mysql.com/doc/refman/5.0/en/replace.html

   db_query("REPLACE INTO {variable} (name, value) VALUES ('%s', '%s')", $name, serialize($value));

it needs the primary key stored in the query, i assume name should be key in this query.

zoo33’s picture

I don't think REPLACE INTO is compatible with PostgreSQL.

dries’s picture

REPLACE is a MySQL-ism.

jvandyk’s picture

The above code now has database locking added to it, so this is no longer an issue.

ChrisKennedy’s picture

In case anyone stumbles upon this old thread, variable_set now uses database locking to avoid race conditions with DELETE+INSERT. See, for example, http://api.drupal.org/api/4.7/function/variable_set

johnquillen’s picture

This is an old thread, but thought it worth mentioning that according to http://drupal.org/node/146921#comment-236565 using database locking is not sufficient for multi-master replication.