Problem/Motivation

During a field_update_field() operation, field_has_data() is called to determine whether a field update should be allowed or not: if a field already contains any data, any changes are forbidden to prevent data loss. Only if a field doesn't have any database rows it will be allowed, causing the field to be dropped and recreated.

This works as long as the node access system isn't used. However, if

  • the site is using node access, and
  • there are non-public content types for some user roles,

and a field update is about to be performed by a user which happens to have no access to the associated content type, field_has_data() might return FALSE indicating a field update is possible, when in fact it isn't.

The root of the problem is field_has_data()'s use of EntityFieldQuery to determine whether a field contains any rows: this causes the query to be altered by the node access system, which is wrong – we need the raw, unfiltered query result, not a (possibly) crippled one. In the case of no publicly viewable nodes, this is causing a populated field table to be dropped.

Proposed resolution

  • Support a DANGEROUS_ACCESS_CHECK_OPT_OUT query tag in EFQ that allows queries internal to the Field API to bypass the field access check.
  • Patch in #38 adds the suggested tag and responds to it in the query builder in field_sql_storage_field_storage_query().
  • The patch also includes an automated test that fails when the bug is present.

Remaining tasks

API changes

  • Addition of DANGEROUS_ACCESS_CHECK_OPT_OUT tag to bypass access checks in EFQs.

Original report by [smk-ka]

During a field_update_field() operation, field_has_data() is called to determine whether a field update should be allowed or not: if a field already contains any data, any changes are forbidden to prevent data loss. Only if a field doesn't have any database rows it will be allowed, causing the field to be dropped and recreated.

This works as long as the node access system isn't used. However, if

  • the site is using node access, and
  • there are non-public content types for some user roles,

and a field update is about to be performed by a user which happens to have no access to the associated content type, field_has_data() might return FALSE indicating a field update is possible, when in fact it isn't.

The root of the problem is field_has_data()'s use of EntityFieldQuery to determine whether a field contains any rows: this causes the query to be altered by the node access system, which is wrong – we need the raw, UNFILTERED query result, not a (possibly) crippled one. In the case of no publicly viewable nodes, this is causing a populated field table to be dropped.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

catch’s picture

yched’s picture

Version: 7.x-dev » 8.x-dev
Status: Closed (duplicate) » Needs review
Issue tags: +Needs backport to D7
FileSize
539 bytes

See #997394-7: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition.
Attached patch is a workaround for the specific field_has_data() issue, suggested by Damien.

mh86’s picture

I've just hit this bug when reverting fields from a feature with drush, the data was gone afterwards. Workaround from #2 seems to work and applies for D7 as well.

David Stosik’s picture

Status: Needs review » Reviewed & tested by the community

Yeah, #2 patch seems to fix the issue for me too, on Drupal 7. Same use case : when I reverted a feature, I lost all my fields content. I even experienced, once, losing all my fields content, without even reverting the feature. This never happened while patch above was applied.

chx’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +Needs tests

Also, I am not entirely convinced this is the right solution. Can't we instead add metadata to explicitly stop field_sql_storage from adding that tag instead? We could name it DANGEROUS_SKIP_CHECK, we had a constant like that in D5 for a different check which was also dangerous to skip.

yched’s picture

@chx : well, the standard pattern is a query tag to opt *in* for access checks. EFQ has an odd behavior where it sometimes hardcode access check and sometimes not (that's what the more general #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition is about), but do we want to add a reversed opt-out tag to work around this oddity ?

kathyh’s picture

Status: Needs work » Needs review
FileSize
559 bytes

Patch updated for D8 /core. After botcheck, needs tests.

xjm’s picture

So, recipe for this automated test is:

  1. Enable the test node access module.
  2. Create a user with limited node access privileges, but also administer content types privileges (?).
  3. Create a node that the user does not have access to.
  4. Log in as the user from #2 and make a change to a field that exists on the node from #3 (?).
  5. Verify that the data from the field is not lost.

I don't quite understand "and a field update is about to be performed by a user which happens to have no access to the associated content type" so please clarify or correct the steps above if possible.

Aside, I do agree that the approach here seems kinda ishy... or at least merits a wider discussion. Given a test, though, we can play with more solutions if desired.

czigor’s picture

I had the same problem as in #3. #2 solved it for me. Somewhat more detailed explanation can be found here: #1463748: drush and cron destroys every field data that is not viewable by non-authenticated users

xjm’s picture

Can someone please see if this is reproducible using the steps in #8 ? And if not, please correct them. Thanks!

czigor’s picture

I could reproduce this the easiest way using the content_access module.

STEPS TO REPRODUCE:
1. Install drupal.
2. Login as admin, enable field ui, image.
3. On the admin/people/permissions page give 'administer content types' permission to anonymous user. ALL other permissions should be switched off (even 'view published content') for everyone.
4. Create a 'story' ct with an image field.
5. Add a story with an image field.
6. Enable content_access.
7. On the admin/structure/types/manage/story/access page all checkboxes should be empty. Rebuild permissions.
8. Log out
9. Go to admin/structure/types/manage/story/fields and change something on the image field (e.g make it required).

RESULT:
The image data disappears from the posted story node.

(For the test: instead of anonymous you might want to use an authenticated user with the above permission I guess)

Two points I don't really get:
1. I could not reproduce this with the body field of the story. The body data remain even if image data disappear.
2. Why do we need the content access module? Anonymous does not have read permission for story nodes even without the content access module. Still, if we disable content_access, the image field data remains.

xjm’s picture

Assigned: Unassigned » xjm
Issue tags: -Needs steps to reproduce

Excellent; thanks @czigor.

I'm a little unsure about leaving out the access content permission ("view published content" in the UI). How the node access system works is that you need access content to access any content at all, no matter what node access permissions are set. Then, if no node access module is installed, drupal has a special "global" grant that grants view access to all users. When you install a node access module, however, Drupal automatically deletes this global rule, allowing specific rules set by the access control modle to come into play.

(The metaphor I use to explain it is: If you need to go through a door to get access to node content, then the access content permission is a door chain--latched when the permission is unchecked--and node access grants as well as most other permissions are keys. A key does you no good if the door is chained.)

I'll try working on a test.

chris.leversuch’s picture

I had a look at this over the weekend but forgot to post the comment :) I'll post it now in case it's any help.

I used the following steps:

  1. Enable the test node access module.
  2. Rebuild permissions
  3. Create a user with administer content types privileges (by creating a role with the permission and assigning it to the user).
  4. Add a text field to Basic page content type
  5. Create a Basic page node including data in the field just created
  6. Log in as the user from #3 and check node/1 - See Access Denied
  7. Visit admin/structure/types/manage/page/fields, Edit the field created in step 4 e.g. change max length

Without the patch, user #3 with just administer content types is able to edit a field that has data - the global settings section has the text "These settings apply to the Test field everywhere it is used.". Editing the field results in data loss.

With the patch, the same user sees the text "These settings apply to the Test field everywhere it is used. Because the field already has data, some settings can no longer be changed.". This is the same as uid 1 sees.

Hope that all makes sense.

tunny’s picture

@czigor

For 1. where the "body data remains". I think this may be because the field is an instance on another entity (node). I had a similar situation where I lost data in fields exclusive to that entity, but data in a field that was an instance on more than one entity remained intact.

czigor’s picture

@tunny: You're right, that was it.

tim.plunkett’s picture

Assigned: xjm » Unassigned
Issue tags: +Needs issue summary update

Unassigning xjm per her request.

deepak manhas’s picture

xjm’s picture

Assigned: Unassigned » xjm

Alright, picking this back up. :)

xjm’s picture

Assigned: xjm » Unassigned
Issue tags: -Needs tests
FileSize
3.65 KB
3.1 KB

Attached test exposes the bug, failing against current HEAD and passing when combined with the fix from #7.

We still need to decide if we wish to try a different approach for this; see #5 and #6.

xjm’s picture

Assigned: Unassigned » xjm

Alright chx explained this to me. Rolling a patch. I'm actually not sure how I feel about the approach suggested in #5 but I do think it's better than the current one, at least.

xjm’s picture

Alright, the attached implements the solution chx suggested in #5, using the same test I wrote in #21. I think this is a more correct solution. My only remaining reservations are:

  1. Tag name... bikeshed away.
  2. It kinda feels like a workaround for a workaround.
  3. I'm not 100% convinced that we shouldn't try to fix this in field_sql_storage_field_storage_query() instead... on the other hand, #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition seems to imply that we should be moving stuff out of that, so I'm not sure.

On the other other hand, #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition also suggests hotfixing this first (the whole data loss thing...)

xjm’s picture

Assigned: xjm » Unassigned
chx’s picture

The only reason I am not RTBC'ing this because in fact I wrote it :) xjm provided the test, thanks!

tim.plunkett’s picture

Status: Needs review » Reviewed & tested by the community

I don't see any real reason to debate the name of that tag. DANGEROUS_NODE_ACCESS_OPT_OUT explains what it's about.

And I don't think #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition should hold this up.

chx’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
4.53 KB

Wat about tis?

tim.plunkett’s picture

I definitely prefer this approach, since it doesn't chop up the node_query_entity_field_access_alter()/_node_query_node_access_alter() combination.

However, is the docblock from the other patch (// Allow queries internal to the API to opt out of the access check, for etc) still relevant?

Status: Needs review » Needs work

The last submitted patch, 1302228_27.patch, failed testing.

tim.plunkett’s picture

Status: Needs work » Needs review
FileSize
4.54 KB

It was still using $this, and not $select_query.

Status: Needs review » Needs work

The last submitted patch, drupal-1302228-30.patch, failed testing.

xjm’s picture

Assigned: Unassigned » xjm
Status: Needs work » Needs review

There's $query, which is the query being executed when field_sql_storage_field_storage_query() is invoked, and then there's $select_query, which is the one we just built up one line so of course it doesn't have any tags yet. We need to check the former and then adjust the latter. That fixes this locally with my test.

Rolling it.

xjm’s picture

I also added back the inline comment.

xjm’s picture

Assigned: xjm » Unassigned
FileSize
984 bytes

Ahem.

xjm’s picture

Actually we should also remove the word "node" from the tag name because node.module is no longer part of this solution; it's just a generic access check opt-out.

xjm’s picture

+++ b/core/modules/field/modules/field_sql_storage/field_sql_storage.moduleundefined
@@ -515,7 +515,12 @@ function field_sql_storage_field_storage_query(EntityFieldQuery $query) {
+      //  the access grants for the current user.

Extra space, derp. Fixed here. Sorry testbot and issue-followers.

tim.plunkett’s picture

+++ b/core/modules/field/modules/field_sql_storage/field_sql_storage.moduleundefined
@@ -515,7 +515,12 @@ function field_sql_storage_field_storage_query(EntityFieldQuery $query) {
+      if (!isset($query->tags['DANGEROUS_ACCESS_CHECK_OPT_OUT'])) {

Should this be if (!$query->hasTag('DANGEROUS_ACCESS_CHECK_OPT_OUT')) { like it was in the previous patch?

tim.plunkett’s picture

Oh, EntityFieldQuery doesn't have that method, checking tags directly is the only option here.

RTBC anyone? I feel weird doing it after doing it in #26.

xjm’s picture

Issue summary: View changes

issue format

xjm’s picture

Summary updated.

xjm’s picture

Issue summary: View changes

(xjm) Updated issue summary.

Niklas Fiekas’s picture

Status: Needs review » Reviewed & tested by the community

Given that there's still #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition, the solid test coverage, that it must be a backportable fix and @tim.plunkett asking in #40, this looks like a good way to have the dataloss bug fixed. Really.

catch’s picture

Version: 8.x-dev » 7.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

OK this does feel like a bit of an odd workaround, but given we have the other critical issue open and it'd be nice not to randomly delete people's data based on the current user, let's get this one in.

Committed/pushed to 8.x, moving to 7.x for backport.

coleman.sean.c’s picture

Status: Patch (to be ported) » Needs review
FileSize
4.68 KB

Here's #38 rerolled for 7.x.

Status: Needs review » Needs work

The last submitted patch, field_has_data-1302228-44.patch, failed testing.

Niklas Fiekas’s picture

Thank you for rerolling.
NodeWebTestCase isn't in Drupal 7, yet. That is because #1373142: Use the Testing profile. Speed up testbot by 50% wasn't yet backported. I am not sure how consistent we must keep this, but the options are waiting (i.e. helping on that issue), slicing NodeWebTestCase off from the patch there and get it in together with this or making this test independent of NodeWebTestCase. The latter could in our case just be changing NodeWebTestCase to DrupalWebTestCase.

xjm’s picture

Assigned: Unassigned » xjm

Since this is a critical bug, we should fix this first and not wait on that much larger issue. The options are:

  1. Change the test here to extend DrupalWebTestCase directly, which means sun's patch will need to be rerolled to add a change to this test, or
  2. Add NodeWebTestCase here and extend it. This also means rerolling sun's patch but it's much simpler as we just take out that hunk. Here's the class:
    class NodeWebTestCase extends DrupalWebTestCase {
      function setUp() {
        $modules = func_get_args();
        if (isset($modules[0]) && is_array($modules[0])) {
          $modules = $modules[0];
        }
        $modules[] = 'node';
        parent::setUp($modules);
    
        // Create Basic page and Article node types.
        if ($this->profile != 'standard') {
          $this->drupalCreateContentType(array('type' => 'page', 'name' => 'Basic page'));
          $this->drupalCreateContentType(array('type' => 'article', 'name' => 'Article'));
        }
      }
    }
    

I personally like the second better so I'll reroll with that.

xjm’s picture

Status: Needs work » Needs review
FileSize
5.41 KB
5.41 KB

Attached simply adds the base class as described in #47; only this test case extends it at this point. #1373142: Use the Testing profile. Speed up testbot by 50% will get the rest.

I also forward-ported the docblock I added here to D8: #1540094: Add missing docblocks for classes and methods in node.test

Edit: Obviously, that is not the real interdiff.

xjm’s picture

FileSize
893 bytes

Actual interdiff from #44 to #48.

Status: Needs review » Needs work

The last submitted patch, 1302228-48-d7.patch, failed testing.

xjm’s picture

+++ b/modules/node/node.testundefined
@@ -2493,3 +2513,78 @@ class NodeAccessPagerTestCase extends DrupalWebTestCase {
+    $langcode = LANGUAGE_NOT_SPECIFIED;

Should be LANGUAGE_NONE.

xjm’s picture

Status: Needs work » Needs review
FileSize
560 bytes
5.4 KB
Niklas Fiekas’s picture

Status: Needs review » Reviewed & tested by the community

Thank you.

BTMash’s picture

The code looks great - +1 from me on rtbc.

webchick’s picture

I talked this over with xjm. On the whole, I would be more comfortable having a full month to test this, since it looks like it requires changes in field storage modules (e.g. mongodb, youtube, flickr, amazon), and I'm not sure what effect adding that tag to queries that somehow fail to get it attached would have. So I'd like to hold this until after Wednesday.

chx’s picture

Note most of what webchick listed are stream wrappers :) I am unaware of anything but mongodb and the civicrm sandbox http://drupal.org/sandbox/matt2000/1512090 implementing a field storage w/ EFQ (the Riak storage done by Dmitri doesnt have EFQ).

osopolar’s picture

I came here from issue #1211272: Reverting features deletes ctools components. I was loosing some field_data (the tables field_data_comment_body and field_data_endpoints where emptied) while a cache clear was rebuilding some updated features. After applying this patch no data loss happened.

webchick’s picture

Title: field_has_data() returns inconsistent results – possible data loss » Change notice for: field_has_data() returns inconsistent results – possible data loss
Status: Reviewed & tested by the community » Active

Yikes! Sorry, folks. My "couple of days" after 7.14 to check for regressions accidentally turned into 10 days. Oops. :(

So I have to say that this DANGEROUS_ACCESS_CHECK_OPT_OUT stuff really skeezes me out. OTOH, I can see why it's needed. It indeed makes sense that the field API itself would need to be more "neutral" on the question of access than the given current user, and I don't immediately have better suggestions on how to do this.

So... committed and pushed to 7.x. Thanks!

This will need a change notice.

Berdir’s picture

Category: bug » task
webchick’s picture

Issue tags: +7.15 release notes

.

star-szr’s picture

Status: Active » Needs review

Change notice ready for review: http://drupal.org/node/1597378

Berdir’s picture

Title: Change notice for: field_has_data() returns inconsistent results – possible data loss » field_has_data() returns inconsistent results – possible data loss
Status: Needs review » Fixed

Looks good to me. If you're as confused as I am about the current behavior (why it's an opt-out system and not opt-in like normal queries), that is being discussed in #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition.

David_Rothstein’s picture

Went ahead and added this to CHANGELOG.txt now: http://drupalcode.org/project/drupal.git/commit/746b5ac

Question: Should the change notice (http://drupal.org/node/1597378) also refer to the fact that field storage engines are now supposed to respond to the new DANGEROUS_ACCESS_CHECK_OPT_OUT tag? Or do we think there are so few field storage engines out there that it's not worth mentioning?

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

xjm’s picture

Assigned: xjm » Unassigned
xjm’s picture

Issue summary: View changes

fixed link to field_has_data()