File and image fields have an odd pattern of whenever an entity is loaded that has file or image fields attached, we do a full file_load() on every referenced file in file_field_load() and image_field_load(). So given an loaded entity, $entity->field_name[langcode][delta] looks like the following:
array (
'fid' => '1',
'display' => '1',
'description' => NULL,
'uid' => '1',
'filename' => 'filename.jpg',
'uri' => 'public://filename.jpg',
'filemime' => 'image/jpeg',
'filesize' => '62722',
'status' => '1',
'timestamp' => '1328200488',
'rdf_mapping' =>
array (
),
)
This data also gets cached into {cache_field}. This also means once we make file entities fieldable with File entity, then all the field data for that file alsoget cached with that node as well. Whenever someone edits the fields on that file and saves, we now have stale data in the field cache for any nodes that reference that file. This problem has arisen with #1431070: Field cache is not cleared on file-entity-update. This also breaks our pattern of keeping only what we need with the loaded entity - we only need the file ID. $entity->field_name[langcode][delta] SHOULD just be the following since that's only what the field schema stores:
array (
'fid' => '1',
'display' => '1',
'description' => NULL,
)
We can load the file objects using hook_field_prepare_view() when we need to. We also have a similar problem with text_field_load() running text processing on items on node load.
| Comment | File | Size | Author |
|---|---|---|---|
| #31 | 0001-Issue-1446464-avoid-file_field_load-on-empty-image-f.patch | 1.24 KB | fgm |
| #30 | 0001-Issue-1446464-avoid-file_field_load-on-empty-image-f.patch | 959 bytes | fgm |
| #29 | image_file.png | 60.48 KB | fgm |
| #25 | field_image-before-patch.png | 29.49 KB | michaellenahan |
| #25 | field_image-after-patch.png | 27.3 KB | michaellenahan |
Comments
Comment #1
catchYeah the right place to load referenced entities is in entity/field_prepare_view() - you can do multiple load in there so it's cheap, and saves duplicating information in the cache.
Comment #2
dave reidComment #3
aaron commentedassigning this to myself. Patch forthcoming.
Comment #4
aaron commentedand here we go
Comment #5
cfennell commentedI manually tested this: uploaded image, attached it to a node, removed it from the node. One minor nit, the $fids array initialization was removed in the above patch in #4; I added that back in.
Comment #6
cfennell commentedSince $item no longer has the full file object metadata payload, we now need to use file_load($item['fid']) to get the file info rather than type casting the $item array. There are almost certainly more such cases in the file, but we can address them as problems arise.
Comment #7
cfennell commentedAn NOW without a trailing whitespace patch error.
Comment #8
emmajane commentedPatch applies cleanly. Single file (and multiple files) can be added and removed without error. Files are unlinked and also deleted from the file system.
Comment #10
cfennell commented#7: load-files-on-view-1446464-7.patch queued for re-testing.
Comment #12
berdirAwesome, this probably doesn't apply anymore because the presave hook is gone :)
This would also conflict with #1361226: Make the file entity a classed object, so let's wait with a re-roll until that one is in. Also let's have another look at how and what we're passing to theme_file_link().
Comment #13
catchBumping to major bug, I bet we don't clear the field cache when those file objects get updated either. This is probably too big a change to backport to D7 though unfortunately.
Comment #14
berdirNodes and comments load user properties and add it to the main object, that results in exactly the same problem when using persistent entity caches. Not sure if there is already a bug report and if it should be major (because core doesn't do persistent entity caching, but supports pluggable caching) but it's something I want to fix.
Comment #15
berdirCompletely untested re-roll...
Comment #17
berdirI'm not sure if the tests passed before, but I don't think this could have been caused by the file entity change.
Obviously, after this change, we can't do an (object) cast anymore of the field item and expect a file object, so I converted all of them to a file_load().
That we are loosing some properties is for the tests not relevant, unlike the similar, reverted changes in the file entity patch. We know that file_link isn't going to be using them and most checks are just about the uri and filename.
There is also one instance that needs to be changed in file_file_download(). I made the change there as simple as possible, it will conflict with #1245220: file_file_download() passed bogus $field to field_access() anyway. Could be improved to do a single load although that has the downside that we might load more than necessary.
Comment #18
berdirAnd now with patch.
Comment #20
berdirMore similar test fixes for the image tests. This one should pass.
Comment #21
berdirComment #22
tim.plunkettNeeds reroll for the PSR-0-ification of test classes.
Comment #23
tim.plunkettRerolled.
Comment #24
tim.plunkettUgh.
Comment #25
michaellenahan commentedDoing this as part of core office hours, with help from tim.plunkett.
http://drupalofficehours.org/task/198
* Added an image to an article.
* Created a very simple module, which implements hook_node_load().
See attachments for before and after the patch has been applied.
field_image output is now like this:
Comment #26
grendzy commentedThis looks good to me, I'm getting the same result with node_load() as the other reviewers. The contents of e.g. {field_data_field_image} are loaded, but not the file entity. The change is also consistent with the docs for hook_field_load() which say "never use this hook to load fieldable entities... use hook_field_formatter_prepare_view() instead".
Comment #27
catchPatch looks great. Gets rid of lots of nasty casting etc. committed/pushed to 8.x.
Comment #29
fgmNeeds backport to D7: image fields on 7.x still carry the whole set of information. cf screen copy.
In addition, although not completely the same topic, from profiling an actual site where this information was taken, it looks like we could just save significant PHP time by not calling file_field_load() if empty($items), like:
Just a quick patch to see what breaks if we add this optimization.
Comment #30
fgmMissing patch..
Comment #31
fgmPrevious patch was wrong: $items is indexed per-entity. Rerolled.
Comment #32
tim.plunkettComment #33
davidwbarratt commentedI think this should be changed to something like:
which is a more common pattern in Drupal, I think it's also more efficient since you don't need to know how many items are in the array, just that it's not empty.
Comment #34
fgmThe formulations are not strictly equivalent, for example
count(FALSE) > 0isTRUE, while!empty(FALSE)isFALSE.This being said, is normal use cases of the field structure, they /should/ always be equivalent : content is always expected to be
NULL(equivalent) or an array (always equivalent ?).Comment #35
slashrsm commentedComment #36
slashrsm commentedMeh...