Problem/Motivation
During a field_update_field() operation, field_has_data() is called to determine whether a field update should be allowed or not: if a field already contains any data, any changes are forbidden to prevent data loss. Only if a field doesn't have any database rows it will be allowed, causing the field to be dropped and recreated.
This works as long as the node access system isn't used. However, if
- the site is using node access, and
- there are non-public content types for some user roles,
and a field update is about to be performed by a user which happens to have no access to the associated content type, field_has_data()
might return FALSE indicating a field update is possible, when in fact it isn't.
The root of the problem is field_has_data()
's use of EntityFieldQuery
to determine whether a field contains any rows: this causes the query to be altered by the node access system, which is wrong – we need the raw, unfiltered query result, not a (possibly) crippled one. In the case of no publicly viewable nodes, this is causing a populated field table to be dropped.
Proposed resolution
- Support a
DANGEROUS_ACCESS_CHECK_OPT_OUT
query tag in EFQ that allows queries internal to the Field API to bypass the field access check. - Patch in #38 adds the suggested tag and responds to it in the query builder in field_sql_storage_field_storage_query().
- The patch also includes an automated test that fails when the bug is present.
Remaining tasks
- Patch needs review.
- #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition is a followup issue to provide a more complete solution for node-based field access control.
API changes
- Addition of
DANGEROUS_ACCESS_CHECK_OPT_OUT
tag to bypass access checks in EFQs.
Original report by [smk-ka]
During a field_update_field() operation, field_has_data() is called to determine whether a field update should be allowed or not: if a field already contains any data, any changes are forbidden to prevent data loss. Only if a field doesn't have any database rows it will be allowed, causing the field to be dropped and recreated.
This works as long as the node access system isn't used. However, if
- the site is using node access, and
- there are non-public content types for some user roles,
and a field update is about to be performed by a user which happens to have no access to the associated content type, field_has_data() might return FALSE indicating a field update is possible, when in fact it isn't.
The root of the problem is field_has_data()'s use of EntityFieldQuery to determine whether a field contains any rows: this causes the query to be altered by the node access system, which is wrong – we need the raw, UNFILTERED query result, not a (possibly) crippled one. In the case of no publicly viewable nodes, this is causing a populated field table to be dropped.
Comment | File | Size | Author |
---|---|---|---|
#52 | 1302228-52-d7.patch | 5.4 KB | xjm |
#52 | interdiff.txt | 560 bytes | xjm |
#49 | interdiff.txt | 893 bytes | xjm |
#48 | 1302228-48-d7.patch | 5.41 KB | xjm |
#48 | interdiff.txt | 5.41 KB | xjm |
Comments
Comment #1
catchDuplicate of #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition.
Comment #2
yched CreditAttribution: yched commentedSee #997394-7: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition.
Attached patch is a workaround for the specific field_has_data() issue, suggested by Damien.
Comment #3
mh86 CreditAttribution: mh86 commentedI've just hit this bug when reverting fields from a feature with drush, the data was gone afterwards. Workaround from #2 seems to work and applies for D7 as well.
Comment #4
David Stosik CreditAttribution: David Stosik commentedYeah, #2 patch seems to fix the issue for me too, on Drupal 7. Same use case : when I reverted a feature, I lost all my fields content. I even experienced, once, losing all my fields content, without even reverting the feature. This never happened while patch above was applied.
Comment #5
chx CreditAttribution: chx commentedAlso, I am not entirely convinced this is the right solution. Can't we instead add metadata to explicitly stop field_sql_storage from adding that tag instead? We could name it
DANGEROUS_SKIP_CHECK
, we had a constant like that in D5 for a different check which was also dangerous to skip.Comment #6
yched CreditAttribution: yched commented@chx : well, the standard pattern is a query tag to opt *in* for access checks. EFQ has an odd behavior where it sometimes hardcode access check and sometimes not (that's what the more general #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition is about), but do we want to add a reversed opt-out tag to work around this oddity ?
Comment #7
kathyh CreditAttribution: kathyh commentedPatch updated for D8 /core. After botcheck, needs tests.
Comment #8
xjmSo, recipe for this automated test is:
I don't quite understand "and a field update is about to be performed by a user which happens to have no access to the associated content type" so please clarify or correct the steps above if possible.
Aside, I do agree that the approach here seems kinda ishy... or at least merits a wider discussion. Given a test, though, we can play with more solutions if desired.
Comment #9
czigor CreditAttribution: czigor commentedI had the same problem as in #3. #2 solved it for me. Somewhat more detailed explanation can be found here: #1463748: drush and cron destroys every field data that is not viewable by non-authenticated users
Comment #10
xjmCan someone please see if this is reproducible using the steps in #8 ? And if not, please correct them. Thanks!
Comment #11
czigor CreditAttribution: czigor commentedI could reproduce this the easiest way using the content_access module.
STEPS TO REPRODUCE:
1. Install drupal.
2. Login as admin, enable field ui, image.
3. On the admin/people/permissions page give 'administer content types' permission to anonymous user. ALL other permissions should be switched off (even 'view published content') for everyone.
4. Create a 'story' ct with an image field.
5. Add a story with an image field.
6. Enable content_access.
7. On the admin/structure/types/manage/story/access page all checkboxes should be empty. Rebuild permissions.
8. Log out
9. Go to admin/structure/types/manage/story/fields and change something on the image field (e.g make it required).
RESULT:
The image data disappears from the posted story node.
(For the test: instead of anonymous you might want to use an authenticated user with the above permission I guess)
Two points I don't really get:
1. I could not reproduce this with the body field of the story. The body data remain even if image data disappear.
2. Why do we need the content access module? Anonymous does not have read permission for story nodes even without the content access module. Still, if we disable content_access, the image field data remains.
Comment #12
xjmExcellent; thanks @czigor.
I'm a little unsure about leaving out the
access content
permission ("view published content" in the UI). How the node access system works is that you needaccess content
to access any content at all, no matter what node access permissions are set. Then, if no node access module is installed, drupal has a special "global" grant that grants view access to all users. When you install a node access module, however, Drupal automatically deletes this global rule, allowing specific rules set by the access control modle to come into play.(The metaphor I use to explain it is: If you need to go through a door to get access to node content, then the
access content
permission is a door chain--latched when the permission is unchecked--and node access grants as well as most other permissions are keys. A key does you no good if the door is chained.)I'll try working on a test.
Comment #13
chris.leversuch CreditAttribution: chris.leversuch commentedI had a look at this over the weekend but forgot to post the comment :) I'll post it now in case it's any help.
I used the following steps:
Without the patch, user #3 with just administer content types is able to edit a field that has data - the global settings section has the text "These settings apply to the Test field everywhere it is used.". Editing the field results in data loss.
With the patch, the same user sees the text "These settings apply to the Test field everywhere it is used. Because the field already has data, some settings can no longer be changed.". This is the same as uid 1 sees.
Hope that all makes sense.
Comment #14
tunny CreditAttribution: tunny commented@czigor
For 1. where the "body data remains". I think this may be because the field is an instance on another entity (node). I had a similar situation where I lost data in fields exclusive to that entity, but data in a field that was an instance on more than one entity remained intact.
Comment #15
czigor CreditAttribution: czigor commented@tunny: You're right, that was it.
Comment #16
tim.plunkettUnassigning xjm per her request.
Comment #17
deepak manhas CreditAttribution: deepak manhas commented#2: field_has_data_node_access-1302228-2.patch queued for re-testing.
Comment #18
xjmAlright, picking this back up. :)
Comment #21
xjmAttached test exposes the bug, failing against current HEAD and passing when combined with the fix from #7.
We still need to decide if we wish to try a different approach for this; see #5 and #6.
Comment #22
xjmAlright chx explained this to me. Rolling a patch. I'm actually not sure how I feel about the approach suggested in #5 but I do think it's better than the current one, at least.
Comment #23
xjmAlright, the attached implements the solution chx suggested in #5, using the same test I wrote in #21. I think this is a more correct solution. My only remaining reservations are:
field_sql_storage_field_storage_query()
instead... on the other hand, #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition seems to imply that we should be moving stuff out of that, so I'm not sure.On the other other hand, #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition also suggests hotfixing this first (the whole data loss thing...)
Comment #24
xjmComment #25
chx CreditAttribution: chx commentedThe only reason I am not RTBC'ing this because in fact I wrote it :) xjm provided the test, thanks!
Comment #26
tim.plunkettI don't see any real reason to debate the name of that tag.
DANGEROUS_NODE_ACCESS_OPT_OUT
explains what it's about.And I don't think #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition should hold this up.
Comment #27
chx CreditAttribution: chx commentedWat about tis?
Comment #28
tim.plunkettI definitely prefer this approach, since it doesn't chop up the
node_query_entity_field_access_alter()
/_node_query_node_access_alter()
combination.However, is the docblock from the other patch (
// Allow queries internal to the API to opt out of the access check, for
etc) still relevant?Comment #30
tim.plunkettIt was still using $this, and not $select_query.
Comment #34
xjmThere's
$query
, which is the query being executed whenfield_sql_storage_field_storage_query()
is invoked, and then there's$select_query
, which is the one we just built up one line so of course it doesn't have any tags yet. We need to check the former and then adjust the latter. That fixes this locally with my test.Rolling it.
Comment #35
xjmI also added back the inline comment.
Comment #36
xjmAhem.
Comment #37
xjmActually we should also remove the word "node" from the tag name because
node.module
is no longer part of this solution; it's just a generic access check opt-out.Comment #38
xjmExtra space, derp. Fixed here. Sorry testbot and issue-followers.
Comment #39
tim.plunkettShould this be
if (!$query->hasTag('DANGEROUS_ACCESS_CHECK_OPT_OUT')) {
like it was in the previous patch?Comment #40
tim.plunkettOh, EntityFieldQuery doesn't have that method, checking tags directly is the only option here.
RTBC anyone? I feel weird doing it after doing it in #26.
Comment #40.0
xjmissue format
Comment #41
xjmSummary updated.
Comment #41.0
xjm(xjm) Updated issue summary.
Comment #42
Niklas Fiekas CreditAttribution: Niklas Fiekas commentedGiven that there's still #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition, the solid test coverage, that it must be a backportable fix and @tim.plunkett asking in #40, this looks like a good way to have the dataloss bug fixed. Really.
Comment #43
catchOK this does feel like a bit of an odd workaround, but given we have the other critical issue open and it'd be nice not to randomly delete people's data based on the current user, let's get this one in.
Committed/pushed to 8.x, moving to 7.x for backport.
Comment #44
coleman.sean.c CreditAttribution: coleman.sean.c commentedHere's #38 rerolled for 7.x.
Comment #46
Niklas Fiekas CreditAttribution: Niklas Fiekas commentedThank you for rerolling.
NodeWebTestCase isn't in Drupal 7, yet. That is because #1373142: Use the Testing profile. Speed up testbot by 50% wasn't yet backported. I am not sure how consistent we must keep this, but the options are waiting (i.e. helping on that issue), slicing NodeWebTestCase off from the patch there and get it in together with this or making this test independent of NodeWebTestCase. The latter could in our case just be changing NodeWebTestCase to DrupalWebTestCase.
Comment #47
xjmSince this is a critical bug, we should fix this first and not wait on that much larger issue. The options are:
DrupalWebTestCase
directly, which means sun's patch will need to be rerolled to add a change to this test, orNodeWebTestCase
here and extend it. This also means rerolling sun's patch but it's much simpler as we just take out that hunk. Here's the class:I personally like the second better so I'll reroll with that.
Comment #48
xjmAttached simply adds the base class as described in #47; only this test case extends it at this point. #1373142: Use the Testing profile. Speed up testbot by 50% will get the rest.
I also forward-ported the docblock I added here to D8: #1540094: Add missing docblocks for classes and methods in node.test
Edit: Obviously, that is not the real interdiff.
Comment #49
xjmActual interdiff from #44 to #48.
Comment #51
xjmShould be LANGUAGE_NONE.
Comment #52
xjmComment #53
Niklas Fiekas CreditAttribution: Niklas Fiekas commentedThank you.
Comment #54
BTMash CreditAttribution: BTMash commentedThe code looks great - +1 from me on rtbc.
Comment #55
webchickI talked this over with xjm. On the whole, I would be more comfortable having a full month to test this, since it looks like it requires changes in field storage modules (e.g. mongodb, youtube, flickr, amazon), and I'm not sure what effect adding that tag to queries that somehow fail to get it attached would have. So I'd like to hold this until after Wednesday.
Comment #56
chx CreditAttribution: chx commentedNote most of what webchick listed are stream wrappers :) I am unaware of anything but mongodb and the civicrm sandbox http://drupal.org/sandbox/matt2000/1512090 implementing a field storage w/ EFQ (the Riak storage done by Dmitri doesnt have EFQ).
Comment #57
osopolarI came here from issue #1211272: Reverting features deletes ctools components. I was loosing some field_data (the tables field_data_comment_body and field_data_endpoints where emptied) while a cache clear was rebuilding some updated features. After applying this patch no data loss happened.
Comment #58
webchickYikes! Sorry, folks. My "couple of days" after 7.14 to check for regressions accidentally turned into 10 days. Oops. :(
So I have to say that this DANGEROUS_ACCESS_CHECK_OPT_OUT stuff really skeezes me out. OTOH, I can see why it's needed. It indeed makes sense that the field API itself would need to be more "neutral" on the question of access than the given current user, and I don't immediately have better suggestions on how to do this.
So... committed and pushed to 7.x. Thanks!
This will need a change notice.
Comment #59
BerdirComment #60
webchick.
Comment #61
star-szrChange notice ready for review: http://drupal.org/node/1597378
Comment #62
BerdirLooks good to me. If you're as confused as I am about the current behavior (why it's an opt-out system and not opt-in like normal queries), that is being discussed in #997394: field_sql_query hardcodes a 'node_access' tag on EFQs with a fieldCondition.
Comment #63
David_Rothstein CreditAttribution: David_Rothstein commentedWent ahead and added this to CHANGELOG.txt now: http://drupalcode.org/project/drupal.git/commit/746b5ac
Question: Should the change notice (http://drupal.org/node/1597378) also refer to the fact that field storage engines are now supposed to respond to the new DANGEROUS_ACCESS_CHECK_OPT_OUT tag? Or do we think there are so few field storage engines out there that it's not worth mentioning?
Comment #65
xjmComment #65.0
xjmfixed link to field_has_data()