Problem/Motivation

During a field_update_field() operation, field_has_data() is called to determine whether a field update should be allowed or not: if a field already contains any data, any changes are forbidden to prevent data loss. Only if a field doesn't have any database rows it will be allowed, causing the field to be dropped and recreated.

This works as long as the node access system isn't used. However, if

  • the site is using node access, and
  • there are non-public content types for some user roles,

and a field update is about to be performed by a user which happens to have no access to the associated content type, field_has_data() might return FALSE indicating a field update is possible, when in fact it isn't.

The root of the problem is field_has_data()'s use of EntityFieldQuery to determine whether a field contains any rows: this causes the query to be altered by the node access system, which is wrong – we need the raw, unfiltered query result, not a (possibly) crippled one. In the case of no publicly viewable nodes, this is causing a populated field table to be dropped.

Proposed resolution

  • Support a DANGEROUS_ACCESS_CHECK_OPT_OUT query tag in EFQ that allows queries internal to the Field API to bypass the field access check.
  • Patch in #38 adds the suggested tag and responds to it in the query builder in field_sql_storage_field_storage_query().
  • The patch also includes an automated test that fails when the bug is present.

Remaining tasks

API changes

  • Addition of DANGEROUS_ACCESS_CHECK_OPT_OUT tag to bypass access checks in EFQs.

Original report by [smk-ka]

During a field_update_field() operation, field_has_data() is called to determine whether a field update should be allowed or not: if a field already contains any data, any changes are forbidden to prevent data loss. Only if a field doesn't have any database rows it will be allowed, causing the field to be dropped and recreated.

This works as long as the node access system isn't used. However, if

  • the site is using node access, and
  • there are non-public content types for some user roles,

and a field update is about to be performed by a user which happens to have no access to the associated content type, field_has_data() might return FALSE indicating a field update is possible, when in fact it isn't.

The root of the problem is field_has_data()'s use of EntityFieldQuery to determine whether a field contains any rows: this causes the query to be altered by the node access system, which is wrong – we need the raw, UNFILTERED query result, not a (possibly) crippled one. In the case of no publicly viewable nodes, this is causing a populated field table to be dropped.

Files: 
CommentFileSizeAuthor
#52 1302228-52-d7.patch5.4 KBxjm
PASSED: [[SimpleTest]]: [MySQL] 38,738 pass(es).
[ View ]
#52 interdiff.txt560 bytesxjm
#49 interdiff.txt893 bytesxjm
#48 1302228-48-d7.patch5.41 KBxjm
FAILED: [[SimpleTest]]: [MySQL] 38,750 pass(es), 5 fail(s), and 1 exception(s).
[ View ]
#48 interdiff.txt5.41 KBxjm
#44 field_has_data-1302228-44.patch4.68 KBcoleman.sean.c
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed during invocation of run-tests.sh.
[ View ]
#38 field_has_data-1302228-38.patch4.74 KBxjm
PASSED: [[SimpleTest]]: [MySQL] 35,130 pass(es).
[ View ]
#37 field_has_data-1302228-37.patch4.74 KBxjm
PASSED: [[SimpleTest]]: [MySQL] 35,116 pass(es).
[ View ]
#37 interdiff-35-37.txt1.41 KBxjm
#36 interdiff.txt984 bytesxjm
#35 field_has_data-1302228-36.patch4.73 KBxjm
PASSED: [[SimpleTest]]: [MySQL] 35,126 pass(es).
[ View ]
#35 interdiff.txt0 bytesxjm
#30 drupal-1302228-30.patch4.54 KBtim.plunkett
FAILED: [[SimpleTest]]: [MySQL] 35,108 pass(es), 1 fail(s), and 0 exception(s).
[ View ]
#27 1302228_27.patch4.53 KBchx
FAILED: [[SimpleTest]]: [MySQL] 34,170 pass(es), 238 fail(s), and 88 exception(s).
[ View ]
#23 dangerous-tag-1302228-23-combined.patch4.42 KBxjm
PASSED: [[SimpleTest]]: [MySQL] 35,124 pass(es).
[ View ]
#21 field_has_data-1302228-21-test-only.patch3.1 KBxjm
FAILED: [[SimpleTest]]: [MySQL] 36,207 pass(es), 1 fail(s), and 0 exception(s).
[ View ]
#21 field_has_data-1302228-21-combined.patch3.65 KBxjm
PASSED: [[SimpleTest]]: [MySQL] 36,211 pass(es).
[ View ]
#7 field_has_data_node_access-1302228-7.patch559 byteskathyh
PASSED: [[SimpleTest]]: [MySQL] 33,978 pass(es).
[ View ]
#2 field_has_data_node_access-1302228-2.patch539 bytesyched
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch field_has_data_node_access-1302228-2.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Comments

Version:7.x-dev» 8.x-dev
Status:Closed (duplicate)» Needs review
Issue tags:+needs backport to D7
StatusFileSize
new539 bytes
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch field_has_data_node_access-1302228-2.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

See #997394-7: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition.
Attached patch is a workaround for the specific field_has_data() issue, suggested by Damien.

I've just hit this bug when reverting fields from a feature with drush, the data was gone afterwards. Workaround from #2 seems to work and applies for D7 as well.

Status:Needs review» Reviewed & tested by the community

Yeah, #2 patch seems to fix the issue for me too, on Drupal 7. Same use case : when I reverted a feature, I lost all my fields content. I even experienced, once, losing all my fields content, without even reverting the feature. This never happened while patch above was applied.

Status:Reviewed & tested by the community» Needs work
Issue tags:+Needs tests

Also, I am not entirely convinced this is the right solution. Can't we instead add metadata to explicitly stop field_sql_storage from adding that tag instead? We could name it DANGEROUS_SKIP_CHECK, we had a constant like that in D5 for a different check which was also dangerous to skip.

@chx : well, the standard pattern is a query tag to opt *in* for access checks. EFQ has an odd behavior where it sometimes hardcode access check and sometimes not (that's what the more general #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition is about), but do we want to add a reversed opt-out tag to work around this oddity ?

Status:Needs work» Needs review
StatusFileSize
new559 bytes
PASSED: [[SimpleTest]]: [MySQL] 33,978 pass(es).
[ View ]

Patch updated for D8 /core. After botcheck, needs tests.

So, recipe for this automated test is:

  1. Enable the test node access module.
  2. Create a user with limited node access privileges, but also administer content types privileges (?).
  3. Create a node that the user does not have access to.
  4. Log in as the user from #2 and make a change to a field that exists on the node from #3 (?).
  5. Verify that the data from the field is not lost.

I don't quite understand "and a field update is about to be performed by a user which happens to have no access to the associated content type" so please clarify or correct the steps above if possible.

Aside, I do agree that the approach here seems kinda ishy... or at least merits a wider discussion. Given a test, though, we can play with more solutions if desired.

I had the same problem as in #3. #2 solved it for me. Somewhat more detailed explanation can be found here: #1463748: drush and cron destroys every field data that is not viewable by non-authenticated users

Can someone please see if this is reproducible using the steps in #8 ? And if not, please correct them. Thanks!

I could reproduce this the easiest way using the content_access module.

STEPS TO REPRODUCE:
1. Install drupal.
2. Login as admin, enable field ui, image.
3. On the admin/people/permissions page give 'administer content types' permission to anonymous user. ALL other permissions should be switched off (even 'view published content') for everyone.
4. Create a 'story' ct with an image field.
5. Add a story with an image field.
6. Enable content_access.
7. On the admin/structure/types/manage/story/access page all checkboxes should be empty. Rebuild permissions.
8. Log out
9. Go to admin/structure/types/manage/story/fields and change something on the image field (e.g make it required).

RESULT:
The image data disappears from the posted story node.

(For the test: instead of anonymous you might want to use an authenticated user with the above permission I guess)

Two points I don't really get:
1. I could not reproduce this with the body field of the story. The body data remain even if image data disappear.
2. Why do we need the content access module? Anonymous does not have read permission for story nodes even without the content access module. Still, if we disable content_access, the image field data remains.

Assigned:Unassigned» xjm
Issue tags:-Needs steps to reproduce

Excellent; thanks @czigor.

I'm a little unsure about leaving out the access content permission ("view published content" in the UI). How the node access system works is that you need access content to access any content at all, no matter what node access permissions are set. Then, if no node access module is installed, drupal has a special "global" grant that grants view access to all users. When you install a node access module, however, Drupal automatically deletes this global rule, allowing specific rules set by the access control modle to come into play.

(The metaphor I use to explain it is: If you need to go through a door to get access to node content, then the access content permission is a door chain--latched when the permission is unchecked--and node access grants as well as most other permissions are keys. A key does you no good if the door is chained.)

I'll try working on a test.

I had a look at this over the weekend but forgot to post the comment :) I'll post it now in case it's any help.

I used the following steps:

  1. Enable the test node access module.
  2. Rebuild permissions
  3. Create a user with administer content types privileges (by creating a role with the permission and assigning it to the user).
  4. Add a text field to Basic page content type
  5. Create a Basic page node including data in the field just created
  6. Log in as the user from #3 and check node/1 - See Access Denied
  7. Visit admin/structure/types/manage/page/fields, Edit the field created in step 4 e.g. change max length

Without the patch, user #3 with just administer content types is able to edit a field that has data - the global settings section has the text "These settings apply to the Test field everywhere it is used.". Editing the field results in data loss.

With the patch, the same user sees the text "These settings apply to the Test field everywhere it is used. Because the field already has data, some settings can no longer be changed.". This is the same as uid 1 sees.

Hope that all makes sense.

@czigor

For 1. where the "body data remains". I think this may be because the field is an instance on another entity (node). I had a similar situation where I lost data in fields exclusive to that entity, but data in a field that was an instance on more than one entity remained intact.

@tunny: You're right, that was it.

Assigned:xjm» Unassigned
Issue tags:+Needs issue summary update

Unassigning xjm per her request.

Assigned:Unassigned» xjm

Alright, picking this back up. :)

Assigned:xjm» Unassigned
Issue tags:-Needs tests
StatusFileSize
new3.65 KB
PASSED: [[SimpleTest]]: [MySQL] 36,211 pass(es).
[ View ]
new3.1 KB
FAILED: [[SimpleTest]]: [MySQL] 36,207 pass(es), 1 fail(s), and 0 exception(s).
[ View ]

Attached test exposes the bug, failing against current HEAD and passing when combined with the fix from #7.

We still need to decide if we wish to try a different approach for this; see #5 and #6.

Assigned:Unassigned» xjm

Alright chx explained this to me. Rolling a patch. I'm actually not sure how I feel about the approach suggested in #5 but I do think it's better than the current one, at least.

StatusFileSize
new4.42 KB
PASSED: [[SimpleTest]]: [MySQL] 35,124 pass(es).
[ View ]

Alright, the attached implements the solution chx suggested in #5, using the same test I wrote in #21. I think this is a more correct solution. My only remaining reservations are:

  1. Tag name... bikeshed away.
  2. It kinda feels like a workaround for a workaround.
  3. I'm not 100% convinced that we shouldn't try to fix this in field_sql_storage_field_storage_query() instead... on the other hand, #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition seems to imply that we should be moving stuff out of that, so I'm not sure.

On the other other hand, #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition also suggests hotfixing this first (the whole data loss thing...)

Assigned:xjm» Unassigned

The only reason I am not RTBC'ing this because in fact I wrote it :) xjm provided the test, thanks!

Status:Needs review» Reviewed & tested by the community

I don't see any real reason to debate the name of that tag. DANGEROUS_NODE_ACCESS_OPT_OUT explains what it's about.

And I don't think #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition should hold this up.

Status:Reviewed & tested by the community» Needs review
StatusFileSize
new4.53 KB
FAILED: [[SimpleTest]]: [MySQL] 34,170 pass(es), 238 fail(s), and 88 exception(s).
[ View ]

Wat about tis?

I definitely prefer this approach, since it doesn't chop up the node_query_entity_field_access_alter()/_node_query_node_access_alter() combination.

However, is the docblock from the other patch (// Allow queries internal to the API to opt out of the access check, for etc) still relevant?

Status:Needs review» Needs work

The last submitted patch, 1302228_27.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new4.54 KB
FAILED: [[SimpleTest]]: [MySQL] 35,108 pass(es), 1 fail(s), and 0 exception(s).
[ View ]

It was still using $this, and not $select_query.

Status:Needs review» Needs work

The last submitted patch, drupal-1302228-30.patch, failed testing.

Assigned:Unassigned» xjm
Status:Needs work» Needs review

There's $query, which is the query being executed when field_sql_storage_field_storage_query() is invoked, and then there's $select_query, which is the one we just built up one line so of course it doesn't have any tags yet. We need to check the former and then adjust the latter. That fixes this locally with my test.

Rolling it.

StatusFileSize
new0 bytes
new4.73 KB
PASSED: [[SimpleTest]]: [MySQL] 35,126 pass(es).
[ View ]

I also added back the inline comment.

Assigned:xjm» Unassigned
StatusFileSize
new984 bytes

Ahem.

StatusFileSize
new1.41 KB
new4.74 KB
PASSED: [[SimpleTest]]: [MySQL] 35,116 pass(es).
[ View ]

Actually we should also remove the word "node" from the tag name because node.module is no longer part of this solution; it's just a generic access check opt-out.

StatusFileSize
new4.74 KB
PASSED: [[SimpleTest]]: [MySQL] 35,130 pass(es).
[ View ]

+++ b/core/modules/field/modules/field_sql_storage/field_sql_storage.moduleundefined
@@ -515,7 +515,12 @@ function field_sql_storage_field_storage_query(EntityFieldQuery $query) {
+      //  the access grants for the current user.

Extra space, derp. Fixed here. Sorry testbot and issue-followers.

+++ b/core/modules/field/modules/field_sql_storage/field_sql_storage.moduleundefined
@@ -515,7 +515,12 @@ function field_sql_storage_field_storage_query(EntityFieldQuery $query) {
+      if (!isset($query->tags['DANGEROUS_ACCESS_CHECK_OPT_OUT'])) {

Should this be if (!$query->hasTag('DANGEROUS_ACCESS_CHECK_OPT_OUT')) { like it was in the previous patch?

Oh, EntityFieldQuery doesn't have that method, checking tags directly is the only option here.

RTBC anyone? I feel weird doing it after doing it in #26.

Issue summary:View changes

issue format

Summary updated.

Issue summary:View changes

(xjm) Updated issue summary.

Status:Needs review» Reviewed & tested by the community

Given that there's still #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition, the solid test coverage, that it must be a backportable fix and @tim.plunkett asking in #40, this looks like a good way to have the dataloss bug fixed. Really.

Version:8.x-dev» 7.x-dev
Status:Reviewed & tested by the community» Patch (to be ported)

OK this does feel like a bit of an odd workaround, but given we have the other critical issue open and it'd be nice not to randomly delete people's data based on the current user, let's get this one in.

Committed/pushed to 8.x, moving to 7.x for backport.

Status:Patch (to be ported)» Needs review
StatusFileSize
new4.68 KB
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed during invocation of run-tests.sh.
[ View ]

Here's #38 rerolled for 7.x.

Status:Needs review» Needs work

The last submitted patch, field_has_data-1302228-44.patch, failed testing.

Thank you for rerolling.
NodeWebTestCase isn't in Drupal 7, yet. That is because #1373142: Use the Testing profile. Speed up testbot by 50% wasn't yet backported. I am not sure how consistent we must keep this, but the options are waiting (i.e. helping on that issue), slicing NodeWebTestCase off from the patch there and get it in together with this or making this test independent of NodeWebTestCase. The latter could in our case just be changing NodeWebTestCase to DrupalWebTestCase.

Assigned:Unassigned» xjm

Since this is a critical bug, we should fix this first and not wait on that much larger issue. The options are:

  1. Change the test here to extend DrupalWebTestCase directly, which means sun's patch will need to be rerolled to add a change to this test, or
  2. Add NodeWebTestCase here and extend it. This also means rerolling sun's patch but it's much simpler as we just take out that hunk. Here's the class:
    <?php
    class NodeWebTestCase extends DrupalWebTestCase {
      function
    setUp() {
       
    $modules = func_get_args();
        if (isset(
    $modules[0]) && is_array($modules[0])) {
         
    $modules = $modules[0];
        }
       
    $modules[] = 'node';
       
    parent::setUp($modules);
       
    // Create Basic page and Article node types.
       
    if ($this->profile != 'standard') {
         
    $this->drupalCreateContentType(array('type' => 'page', 'name' => 'Basic page'));
         
    $this->drupalCreateContentType(array('type' => 'article', 'name' => 'Article'));
        }
      }
    }
    ?>

I personally like the second better so I'll reroll with that.

Status:Needs work» Needs review
StatusFileSize
new5.41 KB
new5.41 KB
FAILED: [[SimpleTest]]: [MySQL] 38,750 pass(es), 5 fail(s), and 1 exception(s).
[ View ]

Attached simply adds the base class as described in #47; only this test case extends it at this point. #1373142: Use the Testing profile. Speed up testbot by 50% will get the rest.

I also forward-ported the docblock I added here to D8: #1540094: Add missing docblocks for classes and methods in node.test

Edit: Obviously, that is not the real interdiff.

StatusFileSize
new893 bytes

Actual interdiff from #44 to #48.

Status:Needs review» Needs work

The last submitted patch, 1302228-48-d7.patch, failed testing.

+++ b/modules/node/node.testundefined
@@ -2493,3 +2513,78 @@ class NodeAccessPagerTestCase extends DrupalWebTestCase {
+    $langcode = LANGUAGE_NOT_SPECIFIED;

Should be LANGUAGE_NONE.

Status:Needs work» Needs review
StatusFileSize
new560 bytes
new5.4 KB
PASSED: [[SimpleTest]]: [MySQL] 38,738 pass(es).
[ View ]

Status:Needs review» Reviewed & tested by the community

Thank you.

The code looks great - +1 from me on rtbc.

I talked this over with xjm. On the whole, I would be more comfortable having a full month to test this, since it looks like it requires changes in field storage modules (e.g. mongodb, youtube, flickr, amazon), and I'm not sure what effect adding that tag to queries that somehow fail to get it attached would have. So I'd like to hold this until after Wednesday.

Note most of what webchick listed are stream wrappers :) I am unaware of anything but mongodb and the civicrm sandbox http://drupal.org/sandbox/matt2000/1512090 implementing a field storage w/ EFQ (the Riak storage done by Dmitri doesnt have EFQ).

I came here from issue #1211272: Reverting features deletes ctools components. I was loosing some field_data (the tables field_data_comment_body and field_data_endpoints where emptied) while a cache clear was rebuilding some updated features. After applying this patch no data loss happened.

Title:field_has_data() returns inconsistent results – possible data lossChange notice for: field_has_data() returns inconsistent results – possible data loss
Status:Reviewed & tested by the community» Active

Yikes! Sorry, folks. My "couple of days" after 7.14 to check for regressions accidentally turned into 10 days. Oops. :(

So I have to say that this DANGEROUS_ACCESS_CHECK_OPT_OUT stuff really skeezes me out. OTOH, I can see why it's needed. It indeed makes sense that the field API itself would need to be more "neutral" on the question of access than the given current user, and I don't immediately have better suggestions on how to do this.

So... committed and pushed to 7.x. Thanks!

This will need a change notice.

Category:bug» task

Issue tags:+7.15 release notes

.

Status:Active» Needs review

Change notice ready for review: http://drupal.org/node/1597378

Title:Change notice for: field_has_data() returns inconsistent results – possible data lossfield_has_data() returns inconsistent results – possible data loss
Status:Needs review» Fixed

Looks good to me. If you're as confused as I am about the current behavior (why it's an opt-out system and not opt-in like normal queries), that is being discussed in #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition.

Went ahead and added this to CHANGELOG.txt now: http://drupalcode.org/project/drupal.git/commit/746b5ac

Question: Should the change notice (http://drupal.org/node/1597378) also refer to the fact that field storage engines are now supposed to respond to the new DANGEROUS_ACCESS_CHECK_OPT_OUT tag? Or do we think there are so few field storage engines out there that it's not worth mentioning?

Status:Fixed» Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Assigned:xjm» Unassigned

Issue summary:View changes

fixed link to field_has_data()