From the development list:

1. node's primary key is wrong.

Currently, the primary key of the node table is (nid, vid), and we have a unique index on (vid). I think this is incorrect. Semantically, if a node has nid 12 and vid 30, we do not think of it as "node 12/30", we think of it as "node 12." The path to access it is node/12, not node/12/30. More precisely, the nid is what uniquely
identifies the node as an entity; therefore, it needs to be the primary key.

So, we can change the primary key to be (nid), add a unique index on (nid,vid), and leave the unique index on (vid) too. There will be no performance consequences or code changes outside system.install.

2. Other tables have no primary key.

The tables block, filters, flood, permission, and term_relation have no primary key. They should, and the Schema module will need them to if they are to participate in upcoming functionality.

Note that we do not have to change the code to *use* these primary keys, we just need to add them. Since we have now resolved that using auto-incrementing fields is allowed in Drupal, we do not even need to
change module code to maintain the primary key.

These two changes will require a core schema update with patches in system.install.

Comments

bjaspan’s picture

Status: Active » Needs review
StatusFileSize
new10.71 KB

Patch attached. I added primary keys to the tables listed and changed node's primary key to nid. I had to add a column list to the 'INSERT INTO {permissions}' commands since that table now has an auto-incrementing id column. I searched for other instances of this same problem in core and did not find any.

This patch also contains my fixes to http://drupal.org/node/139673 (replacing int(1) with smallint for pgsql). If this patch gets committed first, that issue can be marked fixed, too. If that one commits first, I'll have to re-roll this patch.

moshe weitzman’s picture

subscribe

Chris Johnson’s picture

subscribe

bjaspan’s picture

Status: Needs review » Needs work

Ummm... my patch does not fix node's primary key for existing sites, only new ones. Buzzzz.

Also, I think I am going to retract my suggestion to add primary keys to those tables that are missing it. It is *probably* the right thing to do but I am not 100% certain of it and it does have a consequence: it increases the storage requirements for the table, and some of those tables might have a lot of rows. Since I do not actually have a current need for the primary keys, I do not think it is worth the uncertainty. Also, if I need it later for the Schema module, I can always use hook_schema_alter() to add it in dynamically, outside the core release cycle.

I'll submit a new patch later today.

Crell’s picture

Not adding keys to sites being upgraded but only having them on new sites introduces inconsistency between two otherwise equal 6.x configurations. That's a Bad Thing(tm).

I definitely think that, at very least, all tables should have a primary key. Even if Drupal itself doesn't use them, other database tools do. For instance, I use Navicat for managing databases. Navicat has a great feature where you can remote-copy a database from one server to another and even synchronize structure or data between them. That only works, though, if all involved tables have a primary key that it can use as a reference point. Because some Drupal tables don't have a primary key, I can't use the synchronization functionality.

None of the tables you list as not having primary keys at all is especially large, I think; at least they aren't on any of the sites I run. Adding a primary key index to them shouldn't be a big time-hit, and even if it does it's a one-time hit during an upgrade. The space usage is also not a big deal there.

+1 on adding primary keys to all tables, including existing databases.

bjaspan’s picture

Status: Needs work » Needs review

New patch attached. The primary and unique keys for existing node tables are now updated. I left in adding primary keys to the tables that do not already have them; crell's comment reminded me that I was right that we need them.

bjaspan’s picture

StatusFileSize
new9.94 KB

Forgot the attachment.

dries’s picture

Status: Needs review » Needs work

By convention we use the following format for IDs:

xid where 'x' is the first letter of the table name.

For example, blocks would have bid, not id.

bjaspan’s picture

Status: Needs work » Needs review
StatusFileSize
new9.99 KB

Fixed. For term_relation and term_synonym I used trid and tsid. Let me know if I should use something different.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community
StatusFileSize
new9.92 KB

i gave this a thorogh test with update and with new install and all is good.

the update function was mistakenly placed in the 1xxx series. i moved it to the 6xxx series.

dries’s picture

Status: Reviewed & tested by the community » Fixed

Tested, works, committed! Thanks. :)

hunmonk’s picture

Status: Fixed » Needs work

the node table now has a primary key nid, and an index on nid -- i'm pretty sure the nid index is unnecessary.

moshe weitzman’s picture

block module could use a revamp to use block id as a key instead of module+delta. this will affect hook_block().

bjaspan’s picture

Re #12: This will be much simpler to fix once http://drupal.org/node/144765 gets committed, so I'll wait for that.

david strauss’s picture

@moshe weitzman I don't see the benefit of a new block ID. We can just make module + delta the primary key.

david strauss’s picture

Priority: Normal » Critical

These keys are wrong! These changes make the database less consistent and slower.

david strauss’s picture

There should never be a key on UNIQUE nid, vid. vid alone is unique in Drupal.

Here's a first draft of the primary keys these tables should have:
node: nid (and a UNIQUE on vid)
menu_links: unsure right now
blocks: module, delta
filters: format, module, delta
history: uid, nid.
term_relation: tid1, tid2
term_synonym: tid, name
file_revisions: vid, fid (so the vid index becomes unnecessary)
permission: rid, tid (probably)
term_node: vid, tid
url_alias: src

Creating extra autoincrement columns to be easy PRIMARY keys provides no protection against duplicates. You end up having to create a UNIQUE key on what should be the PRIMARY key and have the database maintain at least two BTREEs for the table.

david strauss’s picture

I'm working on a patch.

bjaspan’s picture

Creating extra autoincrement columns to be easy PRIMARY keys provides no protection against duplicates.

That's correct. A table structure that already does not prevent duplicates, upon having an auto-increment primary key added, still does not prevent duplicates. Adding the primary key does not make the database less consistent, it maintains the (not-ideal) status quo. It does add a bit of overhead but none of the tables to which I added it are huge. I do agree about the uselessness of a unique nid,vid index but again, that was already present.

As I just posted in http://drupal.org/node/147298, I certainly support a patch that sets all the primary keys correctly and will help review it when it is ready.

david strauss’s picture

@bjaspan Those tables are not small on sites like Drupal.org. I welcome your help on my patch, though. :-)

drewish’s picture

David Strauss, it's a bit confusing to have the work spread between two issues. You should either switch this issue back to fixed or mark #147298 as a duplicate and move the patch over here.

david strauss’s picture

Status: Needs work » Fixed

@drewish The reason I split the issues is because this patch has already been committed. I'll mark this one fixed to resolve the confusion.

Anonymous’s picture

Status: Fixed » Closed (fixed)