Critical, since this is a data loss bug. Sorry, I don't know the standards for this project.

While troubleshooting what I thought was just a bum query at #1036132-199: Provide a mechanism for issue summaries, I see that Project Issue module (or Drupal.org customizations or something, but I'm starting here) will override the node_revisions.uid field with the uid of the person who last posted a comment, destroying its original value. :(

This makes it impossible to know who actually created a node revision.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

webchick’s picture

Stepping through in a debugger (thanks, NetBeans!) this happens in project_issue_update_by_comment():

  // Update the issue title.
  $node = node_load($comment_data->nid, NULL, TRUE);  // Don't use cached since we changed data above.
  $node->title = $comment_data->title;

  // This also updates the changed date of the issue.
  node_save($node); # Ouch. Right here.

Unfortunately, in addition to setting the changed date of the issue, it's setting $node->revision_uid to $user->uid.

I noticed earlier in the callstack, the node was drupal_clone()ed. I wonder if we need to do the same thing here.

webchick’s picture

Specifically, in the bowels of node_save(), it's doing this:

  if ($node->is_new) {
    // It's not, so this is skipped.
  }
  else {
    drupal_write_record('node', $node, 'nid');
    if (!empty($node->revision)) {
      // It's not, so this is skipped.
    }
    else {
      _node_save_revision($node, $user->uid, 'vid'); # Bingo. :( $user->uid. :(
    }
  }
mikey_p’s picture

In short, node_save() in D6 sucks, and ignores everything you send in and updates that most recent revision with the current user's ID, or saves a new revision with the current user's ID. Neither of those scenarios works for what we need with issue summaries.

I don't see a good way around this short of:

1) mocking global $user, which will obviously never fly in a million years.

2) mocking node_save() or something like it that can save the revision (if revision settings are set that way) or update the revision and node tables as needed, without overwriting node data with global vars. This would also require mocking all the nodeapi hooks, which is one of my bigger concerns as project* tends to rely on these for quite a bit of it's processing, and without them it's hard to tell what might break.

webchick’s picture

Status: Active » Needs review
FileSize
1.39 KB

Digging through git blame, looks like this was introduced in #187799: Revisions are associated with node owner, not node editor.

An option 3) is just not to call node_load()/node_save() here at all, and just update the title directly with a SQL query. This would have the disadvantage of not adhering to proper node API protocol; it's possible some module wants to react to the fact that an issue title was updated. OTOH, it does make this a dramatically faster action.

Here's a patch that does this, and in some light testing on localhost it seems to fix it. It's still pretty janky, but seems less scary than re-writing node_save(). :\ Though mikey_p was saying in IRC that he's nervous about what else might be relying on that node_save() call.

webchick’s picture

Also, a reasonable question was asked by mikey_p of why this wasn't caught in testing.

The drupal.org sandbox environments are all-around pretty awesome, but they fail big time on one count... all users (including uid 1) share the same pass. So if you give that out in a public issue, you open the site up for random morons to log in and make trouble. There's also no mail functionality, so individuals can't reset their passes.

So, what you end up doing is creating a dummy user with an easy password, and everyone who tests tests with that user. Therefore, the person commenting last on the issue was often the person who'd just edited the issue, so this wasn't obvious. Additionally, I had to click on 3-4 different issues to find one that actually exhibited this problem on the staging site, and that was when I was explicitly looking for it. Just by luck of the draw, the first 3-4 issues in the list were ones where the last person to comment was also the OP. D'oh. :(

webchick’s picture

Title: node_revision.uid updated on issue comment » When adding or deleting comments, node_save() blindly overwrites node_revision.uid data
Project: Project issue tracking » Drupal core
Version: 6.x-1.x-dev » 8.x-dev
Component: Comments » node system
Status: Needs review » Active

Actually, according to #1036132-201: Provide a mechanism for issue summaries and evidenced by http://drupal.org/node/1217224/revisions, this is a core issue, as it happens both when saving a comment in project_issue module, as well as when deleting spam.

The patch in #5 is bogus; restoring to active.

webchick’s picture

Here's a test that fails. Since this is the first test I've written in about 2 years, some sanity-checking would be good.

chx’s picture

Status: Active » Needs work

There is *absolutely no way* we are doing this. We are not going to change an entity without firing generic and specific entity update hooks. Either fire them manually (ohgawd) or fix node_save. Edit: eh sorry didnt realize webchick already discarded that idea. Oh well, it can't hurt writing down this for the Nth time :)

webchick’s picture

Status: Needs work » Active

Right...

#5 is bogus. Redacted in #7. Roundabout test for the bug in node_save() is in #8.

catch’s picture

Status: Active » Needs work

In D8, we should always save a revision, and we should store both the author and the revision author as separate tables in the {node_revision} table. #1082292: Clean up/refactor revisions is for that.

For D7/6 could we consider adding an extra column to {node_revision} so we can differentiate the two. I'd love it if we could force a new revision for everything but somehow doubt that is going to fly.

webchick’s picture

Status: Needs work » Active
Issue tags: +Needs backport to D6, +Needs backport to D7
catch’s picture

Status: Active » Needs review
Issue tags: -Needs backport to D6, -Needs backport to D7

Since there's a patch with a test, let's watch it fail.

catch’s picture

restoring tags..

webchick’s picture

Priority: Critical » Major

Summary of an IRC discussion about this.

I went in thinking there was absolutely no reason to ever obliterate node_revision.uid, but catch pointed out that if you log in as another user, edit the node and add a bunch of naughty words, *uncheck* the "Create new revision" checkbox, and save it, then the dirty words get attributed to that user.

This actually happened in the issue summaries issue at http://drupal.org/node/1036132/revisions/view/1550004/1551216. That line on the left (as well as some other stuff) was actually added by neclimdul, but because the code wasn't deployed to require the revision checkbox on issues, his edits got rolled into the revision attributed to me (not on purpose).

So it seems we do maybe need two columns here to fix this properly... one for the original revision author, one for the "last edited" revision author.

In the meantime, to stop Project issue module from obliterating data on account of having the audacity to call a core API function :P chx suggests this wailing-inducing, but secure, fix (pseudcode):

session_save_session(FALSE); $user = user_load($node->uid); node_save($node); session_save_session(TRUE);

Basically, impersonating the global $user as the original node revision_uid. :( session_save_session(FALSE) disallows the global user from changing in case of a server crash or similar. It still makes me barf, but I don't have any better ideas, beyond storing the $node->revision_uid and then doing a direct UPDATE after the fact when node_save() is done (suggested by msonnabaum). However, this approach would break the use case of using something like Materialized Views to create a denormalized table including that field... it wouldn't get updated since various node hooks wouldn't fire. Anyway, I'll split these off into two issues shortly so we can deal with PI sooner than core.

Also, chx pointed out that this behaviour has existed since Drupal 4.7, so this can't be critical. But "major" seems appropriate.

Status: Needs review » Needs work

The last submitted patch, node-revision-authors-test.patch, failed testing.

webchick’s picture

Ok, split the project_issue module thing into a separate issue: #1217286: Posting a comment changes too many issue node revision properties.

Let's leave this one for core.

sun’s picture

Sadly, I don't think any kind of fix for this can be backported. Production sites, contributed and custom modules as well as Views and stuff may rely on the currently broken behavior.

jhodgdon’s picture

So... One second to have a reality check.

In project_issue, when editing a comment, you can do all kinds of editing on issue "fields" (putting it in quotes because they might not be core Fields or cck Fields, but I mean things like the project, version, component, etc. that are "fields" on the issue, in the generic sense of added bits of data). As well as, of course, the title.

So doesn't that mean that the issue has been revised, and shouldn't it logically create a completely new revision of the issue node if any of the information (title, project, version, etc.) has been updated?

Looking at node_save (D6):

   if (!empty($node->revision)) {
      _node_save_revision($node, $user->uid);
      db_query('UPDATE {node} SET vid = %d WHERE nid = %d', $node->vid, $node->nid);
    }
    else {
      _node_save_revision($node, $user->uid, 'vid');
    }

So this is saying "If $node->revision is NOT set, create a new revision. If it's set, update the previous revision with the new information and the current user ID". That second case is what is causing our misery, but it does seem like the right thing to do in node_save(), because presumably if node_save() is being called, someone has updated the node, and why not update the revision to say they last updated the node?

I guess the question in my mind is: why is Project Issue bypassing creating a new revision when all of this stuff (title, "fields") has changed -- isn't that a new revision? And if none of this stuff has changed, why is it calling node_save in the first place (i.e., why does it need to save the node at all)? This seems like it could be fixed in Project Issue, and node_save() is not really brain dead... Or am I missing something? The problem seems to be that we're not seeing the issue "fields" as part of the issue node, which they definitely are... though I see that they don't show up in the diffs (not sure why that would be either).

sun’s picture

I guess the question in my mind is: why is Project Issue bypassing creating a new revision when all of this stuff (title, "fields") has changed -- isn't that a new revision?

You got it the wrong way around. project_issue does create a new node revision when a comment is posted, because the comment can change the node title.

Only the patch in #6 changed that code to no longer create a revision, but that's bogus.

I can see only two ways out of this situation:

  1. Completely fork node_save() into a _project_issue_node_save(), which is only invoked for the case when a comment is posted. Make that replacement not invoke _node_save_revision(), and instead, insert a custom node revision row that only changes 'title', taking over the remaining fields from the previous revision.

    That way, the new node revision (triggered through a comment) can only change the title, but still leads to a new revision. Unpublishing or deleting a comment still reverts to the previous node revision like now. And the issue summary gets the proper user information and changed date.

  2. Re-implement issue summaries to be not based on existing project_issue node revisions.
catch’s picture

project_issue doesn't create a brand new revision, it updates the existing revision in place - those are two slightly different things.

If it was creating new revisions, we'd see a revisions tab on this issue, but there isn't one.

sun’s picture

Status: Needs work » Closed (won't fix)

Meanwhile, I think this bug report is bogus. It's project_issue.module that performs a node_save() when posting a comment, not Drupal core.

Node module's behavior of setting the node revision author to the current user is correct, since the system wants to track who has updated a node, without changing the node author. So this information needs to come from somewhere, and that's global $user->uid. Of course, the whole situation with inserting and updating revisions as well as supplying meta-data for revisions could be improved, but that's not a bug, that's a feature request to do a major revamp of how we want to implement and use revisions in Drupal -- which requires much more solid discussion and shouldn't be driven by a bogus action of a contrib module.

jhodgdon’s picture

Title: When adding or deleting comments, node_save() blindly overwrites node_revision.uid data » Interaction between project issue's comment behavior and node_save is causing revision author to update to comment author
Project: Drupal core » Project issue tracking
Version: 8.x-dev » 6.x-1.x-dev
Component: node system » Comments
Status: Closed (won't fix) » Needs work

Just what I was going to say, thanks catch. It's updating the existing revision, and the problem is that node_save(), when you do that, says "Oh, it's a new user, update the revision authoring information", and that's what everyone is complaining about.

If it were a normal kind of a node with versioned CCK/core Fields, it would have versioning on the fields, and it would be creating a new revision any time any of the fields was updated. But I don't think Project Issue keeps revision history for its "fields", which IMO is also bad (makes it difficult to revert spamming).

Anyway, the problem is an interaction between Project issue doing something non-standard (doing node_save and telling it to update the existing revision, whenever a comment is added, because the commenter might have changed the title), and node_save doing something reasonable (on a node save that updates an existing revision, update the node revision author to the current reviser).

I don't think this is a core bug, much less an 8.x core bug.

sun’s picture

mikey_p’s picture

Status: Closed (duplicate) » Closed (won't fix)

Project issues does keep track of all of the properties that are changed and it keeps track of the places the changes were made either in project_issue_comments or in the comments table. Project issue isn't really at fault here, since its arguable that even if the title *wasn't* changed, that you'd still want to update the node so that it would have an accurate time stamp for an issue.