Whilst updating #2102417: Change Drupal\Core\Routing\RouteBuildEvent::getModule() to getProvider() to needs work with comment https://drupal.org/comment/8154399#comment-8154399 d.o crashed... I then resubmitted the comment but now the comment does not have the issue status change logged.

I guess this is because the issue node is saved before the comment and there was a problem whilst saving the comment. It would seem to be of utmost importance that updating an issue that updates the node and creates a comment are wrapped in a db transaction to prevent stuff like this from occurring.

CommentFileSizeAuthor
#17 2130059-17.status-update-weirdness.png63.47 KBdww
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

alexpott’s picture

Priority: Normal » Critical

So this has now manifested itself in yet another way. I added four related issues to #1851018: Improve breakpoint configuration implementation. and it took two goes! What was really interesting is that the first time the comment reflects the changes I wanted to make but the issue did not get updated https://drupal.org/comment/8161589#comment-8161589. So I did it again and it worked https://drupal.org/comment/8161601#comment-8161601

Bumping to critical as this type of inconsistent behaviour will really hurt our ability to manage issues.

chx’s picture

As far as I can see, the whole of node_save is wrapped in a transaction and the comment save in nodechanges happens inside nodechanges_node_update (the actual save is here) so I am a bit surprised of the symptoms described here.

jthorson’s picture

Priority: Critical » Major

I agree this is an annoyance, and can severely impact usability of the site if it goes on for long ... but bumping to major for now, to help prioritize this relative to thinks like project releases simply not working. :)

YesCT’s picture

Gábor Hojtsy’s picture

Gábor Hojtsy’s picture

My experience: some issue updates, eg. on #1952394: Add configuration translation user interface module in core were saved in the node revisions table (and posted as a comment) but DID NOT end up in the final node itself. See revisions with comments "Update again only showed up in node revisions but not in main node :/" and "Attempting to restore lost elements on the summary that did not even show up earlier....".

My *theory* is this is due to node saves taking so long and the actual node save happening too late, so possibly database replication kicks in in the meantime. My *theory* is creating the comment, sending emails, etc. took too long that before the code got to saving the final node, the db replication took over and did not let that happen. The revisions were *always* saved, so eg. files, etc. were on the revision but not the main node, so files needed uploading again.

I think I've seen @timplunkett experience this on other issues as well (with a comment like "d.o ate my patch"), but cannot find that issue right now. Will ask him to comment too.

Gábor Hojtsy’s picture

Title: Issue status change not displayed in comment » Issue status change not consistently saved in comment, node and revisions

Retitled for full scope of bug (AFAIS).

jthorson’s picture

Good theory in #6 ... the mail generation was definitely causing problems!

tvn’s picture

kscheirer’s picture

dww’s picture

Mail generation might be significantly improved via #2139561: Use a queue to manage issue notification emails

Meanwhile, I saw another possible symptom of this bug at that very issue. My first comment uploaded a patch, which sort of worked (nodechanges saw it as a new file, and the link is valid), but it wasn't actually attached to the issue node itself. So I needed a second comment, uploading the patch again, to have something attached to the issue.

chx’s picture

Still, #2. Anyone has a good idea on that one? I will ping DamZ.

jonathan1055’s picture

I had a similar experience to #11 above on Scheduler project #773510: Integration with Rules module comment 31 and 32. I updated the issue, adding a comment (which should have become 31) and uploaded a patch. The file was uploaded, I got the notification e-mail but no comment was added to the thread. The patch was attached to the issue summary (but not recognised as a patch and queued for testing). I then added comment 31 just as a comment.

Jonathan

tvn’s picture

Latest episode #2144751: Issue status is not being recorded (or reported) correctly.
In this issue: https://drupal.org/node/1503314 comment #35 changed the status. However there is no revision for that comment: https://drupal.org/node/1503314/revisions.
Comment #36 was supposed to change status, but didn't.

tvn’s picture

Closed #2147307: "System message" not marking issues "needs work" when they fail testing as duplicate.

For issue #2047633: Move definition of menu links to hook_menu_link_defaults(), decouple key name from path, and make 'parent' explicit comment #81 where dawehner changed status from "needs work" to "needs review" does not exist as a revision on the node https://drupal.org/node/2047633/revisions.

Further it seems that dawehner's revision got merged with System Message's revision:
https://drupal.org/node/2047633/revisions/view/6740375/6740739
status and files are from dawehner, but comment is from System Message

herom’s picture

Just saw a different case, wondering if it is same issue?
Issue #1840980: [meta] Bootstrap 3.0. It says "fixed" at top, but it's been "closed (fixed)" at the bottom.

dww’s picture

Here's another example:
#2137201-30: [META] Regressions in project_release handling of version element fields and release-history XML and comment #31.

Screenshot of duplicate updates changing the status of an issue

Both comments claim to move that issue from Active to Fixed. Fun.

webchick’s picture

Here's another instance. System message not setting the issue "needs work" again. #2141539-10: Rename FieldItemListInterface::filterEmptyValues() to filterEmptyItems()

klonos’s picture

...coming from from #2137073: Comments that alter the same issue settings should not be allowed..

Some more fun, same as that mentioned in #17...

Example use case in #1952394: Add configuration translation user interface module in core where comments #183 and #184 both change the issue status from "needs work" to RTBC:

comments change same issue settings

xjm’s picture

Ran into this today when running metrics on 8.x criticals that mysteriously appeared a lot bleaker than they should be. :P I found 35 (!!) fixed critical issues on staging that are missing the all-important revision where they were set to fixed. That's a majority of critical issues that have been marked fixed since the upgrade.

#2146027: Regression: Seven jquery dialog style removed is an example.

xjm’s picture

Also, are we sure this isn't crititcal? Oodles of missing node revisions seems like a pretty serious data integrity issue to me.

Damien Tournoud’s picture

This is really interesting. If you look at the example linked by @dww above: https://drupal.org/comment/8234005#comment-8234005

Looking at the revision table we have three revisions:

  • 6738079 is the original;
  • 6747269 triggered by @dww;
  • 6747277 triggered by @drumm

If you look at the revision table, it is obvious that the change by @drumm actually reverted all the changes done by @dww. For example, look at the field_issue_assigned values:

  • 6738079 has 0;
  • 6747269 has 46549;
  • 6747277 has 0.

Same for the body field value. The change made by @dww in 6747269 are actually reverted in 6747277.

But if you look at the diff made by nodechanges you will see that only the change actively made by @drumm is recorded (the change of the issue status), not the revert of all the other values.

From that we can conclude that when @drumm made his change, the both loaded node and the loaded unchanged node were wrong.

Damien Tournoud’s picture

I think the most likely explanation is that we are running into a cache consistency issue. The way this works is that:

  1. We open a transaction;
  2. We save the fields by calling field_attach_update(); in this function we clear the cache;
  3. Then we call the update hooks; some of them are relatively long running;
  4. We commit the transaction.

Now suppose that the cache is non-transactional (for example, you use memcache instead of the database cache). If anything loads the node after (2) but before (4), we are going to load (and cache) the *old* version of the fields.

This has been reported in #1679344: Race condition in node_save() when not using DB for cache_field. The easy fix is to key the cache by a generation identifier (worst case, the revision id, but that will only work if we create a revision).

helmo’s picture

Here's another example in the old Drush issue queue. #1000784: Another Fatal error: Out of memory

Marked as active even though it should be closed (Fixed).

xjm’s picture

#24 looks more like #2125263: Issue status not taken from last comment during D7 migration than this issue. Good find though!

xjm’s picture

xjm’s picture

Priority: Major » Critical

So, I've been running queries on staging, and far more nodes are affected by this than #2125263: Issue status not taken from last comment during D7 migration, which certainly makes some issues confusing to read, but at least is a one-time corruption of past data rather than an ongoing data loss issue like this one. This issue should be critical.

Wim Leers’s picture

Agreed. More often than not, testbot fails to set an issue to "needs work" because of this. So it also affects our daily workflow.

drumm’s picture

The core issue, #2130059: Issue status change not consistently saved in node revisions, needs to be fixed. When there is a promising patch, we can apply it on Drupal.org.

drumm’s picture

webchick’s picture

xjm’s picture

Title: Issue status change not consistently saved in comment, node and revisions » Issue status change not consistently saved in node revisions (and testbot sometimes fails to mark patches NW)

Retitling so we stop getting duplicates of this for the impact on PIFT. :)

drumm’s picture

Looks like no one has worked on #1679344: Race condition in node_save() when not using DB for cache_field.

With #2125755: System messages removed all issue tags during D7 upgrade and #2125263: Issue status not taken from last comment during D7 migration done, we no longer need the project issue migration tables and have cleared quite a bit of space on our DBs. We can now try Damien Tournoud's workaround of using the DB to back the field cache instead of MemCache. This change doesn't need downtime (unless it goes badly in production), so we are tentatively going to try it out this afternoon.

drumm’s picture

We are now running Drupal.org's field cache with the core DB cache. Let's see if this problem does not get worse, and the site stays up.

drumm’s picture

Assigned: Unassigned » drumm

I guess this is mine now. The site seems to be doing okay.

Please add any examples of bad data to the issue summary. And remove any examples that have since been updated.

webchick’s picture

Hm. Looks like it's still happening:

#1853536-53: Reintroduce Views integration for search.module (not supporting backlinks view) was at 3:17pm, after the last comment in this thread.

drumm’s picture

The test bot does have a case where it will post a followup, but not update the status, by design. From pift_cron_auto_followup():

  $node = node_load($nid);
  $files = $node->field_issue_files;
  if (empty($files['und'])) {
    return;
  }
  $lastfile = end($files['und']);
  // Only change the issue status if this is the last file in the
  // field_issue_files stack.
  if ($lastfile['uri'] == $file->uri) {
    $node->field_issue_status[LANGUAGE_NONE][0]['value'] = PIFT_FOLLOWUP_FAIL;
  }

I could imagine that logic being tightened up in a separate issue, end($files['und']) is often an interdiff.

klonos’s picture

Damien Tournoud’s picture

@klonos those two are normal given #37.

jthorson’s picture

Yup ... #37 is a known issue, and I was sure there was an existing issue for it - but I can't seem to find it now. Opened #2169435: Fix pift_cron_autofollowup_logic in the PIFT queue as a result.

drumm’s picture

Title: Issue status change not consistently saved in node revisions (and testbot sometimes fails to mark patches NW) » Issue status change not consistently saved in node revisions

Changing the title given #2169435: Fix pift_cron_autofollowup_logic.

This issue should cover any situations where the issue status is not the same in all of:

  • The issue's sidebar.
  • The most recent comment that has a status change.
  • Issue queue listings. (Issue queue listings are a search index, updated in the same page load as node save, but after node save. They can lag up to 30 seconds.)
drumm’s picture

Status: Active » Fixed

Since the field cache has been moved, it seems like there have not been more mismatches between the issue sidebar and last comment changing the issue status.

For the issue search index - I reindexed 200ish issues which had last comment timestamps mismatched between the search_api_db_project_issues and search_api_db_project_issues tables. These can be found with

SELECT item_id, from_unixtime(pi.last_comment_timestamp), from_unixtime(ncs.last_comment_timestamp) FROM search_api_db_project_issues pi INNER JOIN node_comment_statistics ncs ON pi.item_id = ncs.nid WHERE pi.last_comment_timestamp <> ncs.last_comment_timestamp ORDER BY ncs.last_comment_timestamp DESC;

I haven't noticed any of those with mismatched statuses.

It seems this can be called fixed.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.