What does the Drupal community think about a contrib module that changes node ID assignment from sequential numbers to some other method? One idea would be to use the md5 of the title, or something smaller, chosen from a menu of methods, or even a UI for setting up more complicated schemes.

I'd be happy to contribute $40 to anyone who does it. I think it would be useful if it could enable asynchronous node creation across multiple master SQL servers.
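To make the md5-of-the-title idea concrete, here is a minimal sketch. The function name title_to_id and the choice of 12 hex digits are my own assumptions for illustration, not anything that exists in Drupal, and it assumes a 64-bit PHP build.

```php
<?php
// Hypothetical helper, not part of Drupal: derive a numeric ID from a title.
function title_to_id(string $title, int $hexDigits = 12): int {
  // md5() returns 32 hex characters. Keep only the leading digits so the
  // value fits in a signed 64-bit integer (12 hex digits = 48 bits).
  $hexDigits = max(1, min($hexDigits, 15));
  return (int) hexdec(substr(md5($title), 0, $hexDigits));
}

echo title_to_id('Hello world'), "\n";
```

Note that two nodes with the same title would get the same ID under this scheme, so a real implementation would need to mix in something else (author, timestamp) or handle the conflict.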

Comments

MrTaco’s picture

yeah, i think this is a good idea as well

though i'd lean more towards generating a random ID out of a huge possible ID space

i don't think this could be done via a module though; i'm pretty sure it would require modifying the node_save function in core
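How huge the ID space needs to be can be estimated with the usual birthday approximation. This is only a rough sketch; the function name and the example sizes (one million nodes, 31-bit versus 63-bit IDs) are assumptions picked for illustration.

```php
<?php
// Approximate chance that at least two of $n random IDs collide when IDs are
// drawn uniformly from $space possible values (birthday approximation):
// p ~= 1 - exp(-n(n-1) / (2 * space)).
function collision_probability(float $n, float $space): float {
  // -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
  return -expm1(-($n * ($n - 1.0)) / (2.0 * $space));
}

// One million nodes in a 31-bit space versus a 63-bit space.
printf("31-bit: %.6f\n", collision_probability(1e6, 2 ** 31));
printf("63-bit: %.3e\n", collision_probability(1e6, 2 ** 63));
```

With these numbers a 31-bit space is almost certain to produce a collision somewhere, while a 63-bit space makes one very unlikely, so the column type for nid matters a lot here.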

mitchell’s picture

I'm not a programmer, but here's my attempt to figure out the problem:

This line ( $node->nid = db_next_id('{node}_nid'); ) in http://api.drupal.org/api/HEAD/function/node_save made me see that node_save is not the only function that would need this ability; every function that calls http://api.drupal.org/api/HEAD/function/db_next_id needs the updated logic.

So I hope a contrib module could take over that function, which is defined in each of the includes/database.*.inc files. http://groups.drupal.org/node/1947#comment-5226 talks about exactly this.

I think changes might take place here (from database.mysql.inc):
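function db_next_id($name) {
  $name = db_prefix_tables($name);
  // Lock the sequences table so concurrent requests cannot read the same value.
  db_query('LOCK TABLES {sequences} WRITE');
  // Read the current ID for this sequence and add one.
  $id = db_result(db_query("SELECT id FROM {sequences} WHERE name = '%s'", $name)) + 1;
  // Store the new value, then release the lock.
  db_query("REPLACE INTO {sequences} VALUES ('%s', %d)", $name, $id);
  db_query('UNLOCK TABLES');

  return $id;
}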

The ID randomization functions could take the place of the " $id = db_result(db_query( " query, and an admin page could manage which method is used.

The locking mechanism could be replaced by a more robust allocation/conflict resolution system. Some ideas:
1. I'm guessing locking wouldn't be necessary; the randomization process should prevent duplicate results for a dependable period of time, much longer than the time it takes to create content concurrently.
2. It should regenerate the ID if it picks one that it can see already exists.
3. If 2 takes too long, then we could even use a localized temporary sequential ID system, and the large primary key creation could be handled separately.
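Ideas 1 and 2 above could be sketched roughly as follows. This is not Drupal code: random_unused_id is a made-up name, and the $existing array is only a stand-in for looking up which nids are already in the node table.

```php
<?php
// Hypothetical sketch: $existing stands in for the set of nids already
// stored, keyed by nid (for example [17 => true, 42 => true]).
function random_unused_id(array $existing, int $max = PHP_INT_MAX, int $maxTries = 10): int {
  for ($try = 0; $try < $maxTries; $try++) {
    // Pick a candidate uniformly from 1..$max.
    $id = random_int(1, $max);
    // Idea 2: regenerate if the candidate is already taken.
    if (!isset($existing[$id])) {
      return $id;
    }
  }
  // Give up after $maxTries so a nearly full ID space cannot loop forever.
  throw new RuntimeException('No free ID found');
}

echo random_unused_id([17 => true, 42 => true]), "\n";
```

Checking first and inserting afterwards still leaves a small window where two servers can pick the same ID, so a unique key on nid would still be needed as the final safeguard.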

mitchell’s picture

Points 1, 2, and 3 are practically unnecessary: the MySQL clustering software will manage the primary nids. http://www.onlamp.com/pub/a/onlamp/2006/04/20/advanced-mysql-replication... starts off by discussing the problem with AUTO_INCREMENT, which is used in database/database.4.1.mysql, and shows what to change it to for a multi-master setup.
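For anyone wondering how AUTO_INCREMENT can avoid conflicts between masters: MySQL 5.0 and later have the auto_increment_increment and auto_increment_offset server variables, which I believe is the approach the article describes. The sketch below only simulates the arithmetic in PHP; interleaved_ids is a made-up name and the two-master setup is an assumed example.

```php
<?php
// Simulates the IDs one master would hand out; this is not MySQL itself.
// $offset plays the role of auto_increment_offset and $increment the role
// of auto_increment_increment (normally set to the number of masters).
function interleaved_ids(int $offset, int $increment, int $count): array {
  $ids = [];
  for ($i = 0; $i < $count; $i++) {
    $ids[] = $offset + $i * $increment;
  }
  return $ids;
}

// Two masters: one takes the odd IDs, the other the even ones.
$masterA = interleaved_ids(1, 2, 5);
$masterB = interleaved_ids(2, 2, 5);
echo implode(',', $masterA), "\n";
echo implode(',', $masterB), "\n";
```

Because each master only ever produces IDs from its own residue class, the two lists never overlap, and no locking between servers is needed.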

The data organization/replication/reorganization and the web server to DB connection management will probably have to take place somewhere other than in Drupal; so for now, a way to alter that database.*.*.* file and an update.php script would be really cool.

Let's see what other problems ensue or what other issues we'll need to deal with in the future, because it seems like most of the heavy work should be done by the clustering software or middleware proxies, and this module/patch could be rather small while bringing some major improvements.

Souvent22’s picture

I've implemented random ID generation for my last project, if you're interested. Is the $40 still up for grabs? Great beer money. :) But really, it wasn't that hard at all, and it gave my project a much-needed performance boost because there is no table locking of the sequences table. It's based on a "First Committer" scheme. Lemme know, and I'll post a snippet. I'm considering rolling a patch so that one can switch between sequential and random ID generation from the administration menu.
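For readers who have not met the term, one plausible reading of a "first committer" scheme is: pick a random ID, try to insert the row, and let the unique key decide who wins; whoever loses picks a new ID and tries again. The sketch below is only my assumption of the general idea, not the actual implementation from that project. FakeNodeTable and save_with_random_id are made-up names, and the class merely imitates a table with a unique primary key.

```php
<?php
// Stand-in for a table with a unique primary key; not a real database.
class FakeNodeTable {
  private array $rows = [];

  // Returns true if the row was inserted, false if the key already existed,
  // much like an INSERT that fails with a duplicate-key error.
  public function insert(int $nid, string $title): bool {
    if (isset($this->rows[$nid])) {
      return false;
    }
    $this->rows[$nid] = $title;
    return true;
  }

  public function count(): int {
    return count($this->rows);
  }
}

// Try random IDs until an insert succeeds; the first writer to commit a
// given ID keeps it, and later writers simply retry with another ID.
function save_with_random_id(FakeNodeTable $table, string $title, int $max = PHP_INT_MAX, int $maxTries = 10): int {
  for ($try = 0; $try < $maxTries; $try++) {
    $nid = random_int(1, $max);
    if ($table->insert($nid, $title)) {
      return $nid;
    }
  }
  throw new RuntimeException('Could not claim an ID');
}

$table = new FakeNodeTable();
echo save_with_random_id($table, 'First post'), "\n";
```

The appeal is that no sequences table has to be locked: the only coordination point is the unique key on the insert itself.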

Souvent22’s picture

I posted the outline of how I implemented the Random ID generation. I'll post a patch sometime soon hopefully.

Link: http://earnestberry.com/node/13