This is against base system, because while I think the initial patch should probably tackle node, we'll want to generalize this to entity revisions at the same time.

I don't have time to do a full write-up, here's the basics of what I'd like to change though:

- have the node_revision table actually store all the information in the node table (for example we need separate columns for node->uid (primary author) and node->revision_uid (last edited) - since these are completely different values and have been the cause of serious bugs and misconceptions in various versions of core.

- change the order of operations when inserting nodes from:

drupal_write_record('node');
drupal_write_record('node_revision');
db_update('node') - to set the revision id.
Get an nid from sequences (same as user API).
drupal_write_record('node_revision');
drupal_write_record('node');

This would remove some complexity from node_save() I think, and also opens up the possibility (if we wanted to), to store one or more revisions of nodes before we actually create a record in the node table. This could support better full page previews, drafts (google docs-ish) and other things that are currently very difficult when the only choice is between creating a full node record with all that entails or trying to fake things with form values.

The other thing, and this needs its own issue, is that we could consider moving from a CRUD model to a CRAP model:

CREATE / READ / UPDATE / DELETE
CREATE / READ / ARCHIVE / PURGE

So always save a revision every node edit, but then allow old inactive revisions to be purged - something that could be made pluggable. - again this would simplify both code and user interface in a lot of areas.

CommentFileSizeAuthor
#8 node_schema.patch4.01 KBcatch
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

joachim’s picture

Subscribe.

One thing that's always bugged me about node revisions is the workflow to enable revisions for a single node requires you to create a new, pointless, revision. What I usually want to do is make sure other users can't lose information; I'm not making any changes myself.

> - have the node_revision table actually store all the information in the node table

So basically, we need entity table {foo} to also have an identical {foo_revisions} table. It would be pretty cool if the entity system and the schema system could work together to make that happen automatically.

bojanz’s picture

Subscribe.

Very interested in making revisions better.

catch’s picture

The revisions table may need to have additional fields, but it should at least have all the fields that the main {entity} table does yeah.

fago’s picture

While CRAP makes much sense to me for handling revisions, we really should stay with the usual CRUD interface any average developer coming to Drupal knows and is comfortable to work with. Still, we can go with CRAP in the back and simplify things that way.

bigjim’s picture

Would the idea be to extend the interface for rolling back revisions that's available for nodes out to the other entities? Not sure it's practical for things like comments but may be useful for users and taxonomy.

catch’s picture

Component: base system » entity system
Issue tags: -Entity system, -entity API
catch’s picture

Priority: Normal » Major

Bumping this to major, really needs to be fixed properly.

#1217190: Interaction between project issue's comment behavior and node_save is causing revision author to update to comment author is the latest fallout from this mess.

catch’s picture

Status: Active » Needs work
FileSize
4.01 KB

Here's a very rough start just on fixing the schema, don't have time to make a working patch today but wanted to get this moving.

Changes so far:

  • {node_revision} now inherits from {node}'s schema, but adds extra columns for tracking revisions and its own indexes. I'd like to eventually move the base schema handling to the entity system and have node extend that.
  • Started to remove {auto_increment} from {node}, IMO we should look at having {node_revision} responsible for both vid and nid via db_next_id() - this will remove the dance of $update_node etc. - just reserve an nid and vid when creating new revisions and call it a day, at least with core SQL storage. When we move this to a generic entity revision controller it means it'll also work for users, which can't use autoincrement.
dixon_’s picture

sub.

zhangtaihao’s picture

I agree with fago in #4. The CRUD workflow should manifest itself in code, regulating how modules and other components use objects/entities. The CRAP workflow should simply determine which variant of an object/entity is used in the CRUD workflow. The CRAP workflow can then be managed and altered/extended in a modular fashion, be it through additional alter mechanisms and/or user interfaces (e.g. to revert to/purge a revision).

Whether or not CRUD should trigger CRAP is a separate subject, one that should prove interesting in terms of whether race conditions would occur.

attiks’s picture

subscribing

jstoller’s picture

subscribing

colan’s picture

Subscribing.

yoroy’s picture

#218755: Support revisions in different states tells me this is the place to start working our way towards sane revisions handling in core. Seems like this is the one then that could become the meta issue. Would love to see a battle plan come together in here.

jstoller’s picture

The following was originally suggested in the conversation at #218755: Support revisions in different states, but seems to be relevant here.

What if we had four standard node tables: {node}, {node_revision}, {node_active} and {node_latest}.

{node} would contain only that information which always applies to the entire node: nid, type, language, tnid and translate.

{node_revision} would be an expanded version of the current table of that name, much like what's already been suggested here, containing any node data that could potentially vary from one version to the next.

{node_active} and {node_latest} would be denormalization tables for the two most basic views of node data. They would be similar to today's {node} table in structure. Their content could be completely rebuilt from scratch using the new {node} and {node_revision} tables. {node_active} would reference the most current, published, version of each node. {node_latest}, on the other hand, would reference the latest version of each node (Published or Unpublished).

This same concept could be expanded to all entities. For instance, we could have {users}, {users_revision}, {users_active} and {users_latest}, or {comment}, {comment_revision}, {comment_active} and {comment_latest}.

Here "active" and "latest" can be thought of as two standard site access modes (or something along those lines). You can think of these access modes as conceptually similar to the view modes we now offer for node displays, except they would apply on a more global level. If I'm viewing the site in the "active" mode, then I'm seeing only the publicly visible content. If I'm viewing the site in the "latest" mode, then I'm seeing all the latest content, even if it's still unpublished. This effectively gives me a virtual staging site for content, letting me see my entire site as if the current revisions of all my content were published. Yay!

Much like view modes, we could allow developers to define new access modes, in addition to the two defaults, opening up a whole new world of possibilities. I suppose each access mode would have a function determining what revisions it contains. When a revision is saved, Drupal would run through these functions and update the appropriate tables. For instance, {node_latest} might be updated every time a revision is saved, but {node_active} would only be updated if the saved revision is published. If and when we implement a more advanced state system, this could get even more fun!

agentrickard’s picture

I just posted a simple starter patch in #218755: Support revisions in different states

yoroy’s picture

Should we postpone this issue on #218755: Support revisions in different states or are they duplicates by now? 218755 seems to making solid progress.

catch’s picture

At least this bit:

- have the node_revision table actually store all the information in the node table (for example we need separate columns for node->uid (primary author) and node->revision_uid (last edited) - since these are completely different values and have been the cause of serious bugs and misconceptions in various versions of core.

Is a bug, and we should probably split it out of this issue completely.

yoroy’s picture

Opened #1528028: Add tests for reverting revisions where revision_uid and uid differ with that bit in it. Somebody who actually knows about this might want to clarify if necessary :)

Berdir’s picture

#1498634: [meta] Define revision/translation SQL schema for core entities (no patch) might also be related, which is discussing on how to implement generic entity based revisions *and* translations for properties.

fago’s picture

I've created #1612014: Create an interface for revisionable entities for getting started with the entity-generic implementation.

Berdir’s picture

See also #1723892: Support for revisions for entity save and delete operations, which does not really refactor but generalizes the current code so that it can be used for other entities. Yes, I'm hunting for reviews ;)

Berdir’s picture

A large port of it. The issue opened by yoroy hasn't been touched by this, but it is actually touched by the node property/field table refactoring, so we might be able to close it as a duplicate of that. Not sure if there's something else remaining here.

mitchell’s picture