Problem/Motivation
In #1447790: image_load makes admin/content/media unusably slow, the {image_dimensions} table was added to cache image dimensions and avoid re-reading an image file whenever its dimensions are needed.
Instead of doing this only for image dimensions, the File entity module should provide a generic framework for its consumers to retrieve file meta-data. This way we could prevent the same kind of performance issue with other kinds of meta-data (EXIF, ID3, etc.).
Proposed resolution
On file insert, update and load, the meta-data for a file should be read, stored in some kind of generic store, and added as an array in a property of the $file object (something like $file->metadata).
Retrieving a file's meta-data should be pluggable: additional modules should be able to provide meta-data to File entity.
The meta-data storage should be flexible enough for any type of meta-data (string, number but also more complex types).
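To make the pluggable and flexible-storage requirements more concrete, here is a minimal sketch in plain PHP. It is not the API of any patch on this issue: the function names (example_collect_metadata, example_encode_metadata, example_decode_metadata), the provider callbacks and the hard-coded values are all hypothetical, and serializing each value is only one assumed way to let a single store hold strings, numbers and nested arrays.

```php
<?php
// Hypothetical sketch only: none of these names exist in File entity.

/**
 * Collects meta-data for a file from every registered provider.
 *
 * Each provider is a callable taking ($uri, $mimetype) and returning an
 * array of meta-data keyed by name. This stands in for a hook or plugin.
 */
function example_collect_metadata(array $providers, $uri, $mimetype) {
  $metadata = array();
  foreach ($providers as $provider) {
    $values = call_user_func($provider, $uri, $mimetype);
    if (is_array($values)) {
      // On a name collision the provider registered later wins.
      $metadata = array_merge($metadata, $values);
    }
  }
  return $metadata;
}

/**
 * Serializes every value so scalars and nested arrays fit one text column.
 */
function example_encode_metadata(array $metadata) {
  return array_map('serialize', $metadata);
}

/**
 * Reverses example_encode_metadata().
 */
function example_decode_metadata(array $rows) {
  return array_map('unserialize', $rows);
}

// Two assumed providers. Real ones would inspect the file at $uri; the
// values here are hard-coded purely for illustration.
$providers = array(
  'image' => function ($uri, $mimetype) {
    if (strpos($mimetype, 'image/') !== 0) {
      return array();
    }
    return array('width' => 640, 'height' => 480);
  },
  'audio' => function ($uri, $mimetype) {
    if (strpos($mimetype, 'audio/') !== 0) {
      return array();
    }
    return array('id3' => array('title' => 'Example', 'artists' => array('A', 'B')));
  },
);

$image_metadata = example_collect_metadata($providers, 'public://photo.jpg', 'image/jpeg');
$audio_metadata = example_collect_metadata($providers, 'public://track.mp3', 'audio/mpeg');
$stored = example_encode_metadata($audio_metadata);
$restored = example_decode_metadata($stored);
```

In this sketch a provider that does not recognise the MIME type simply returns an empty array, so consumers only ever see the keys that apply to the file.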
The main goal should be to provide a unified solution for consumers of file entities to efficiently retrieve meta-data when using file_load and file_load_multiple. To keep things simple, providing efficient querying over the meta-data should be the concern of separate, dedicated modules. For instance, one module could store selected ID3 audio and video meta-data as field values while another could store selected EXIF meta-data in a dedicated table exposed to Views.
An alternative solution may be to lazy-load meta-data when requested once we have a FileEntity class (needs #1361226: Make the file entity a classed object which is blocked by #1401558: Remove the usage handling logic from file_delete()). This way we avoid the overhead of loading too much rarely used data on file loads.
The two approaches could also be combined: basic meta-data would be loaded on file load, while extended meta-data would be lazy-loaded. For instance, image dimensions and base ID3 tags (title, artists, album, etc.) could be loaded on file load while things like video/audio play time, bitrate, channel count, codec, etc. could be lazy-loaded (in groups) when requested.
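The combined approach could look roughly like the following plain PHP sketch. The class name ExampleLazyMetadata, its get() method, the group names and the sample values are all hypothetical; this is not the FileEntity class discussed in #1361226, only an illustration of loading basic meta-data eagerly and each extended group at most once on first access.

```php
<?php
// Hypothetical sketch only: ExampleLazyMetadata is not part of File entity.

class ExampleLazyMetadata {

  // Groups that have already been loaded, keyed by group name.
  private $loaded = array();

  // Callables that load one extended group each, keyed by group name.
  private $loaders = array();

  public function __construct(array $basic, array $loaders) {
    // Basic meta-data is available immediately, as on a normal file load.
    $this->loaded['basic'] = $basic;
    $this->loaders = $loaders;
  }

  /**
   * Returns one value, loading its whole group on first access.
   */
  public function get($group, $key) {
    if (!array_key_exists($group, $this->loaded)) {
      if (!isset($this->loaders[$group])) {
        return NULL;
      }
      // The loader runs once; later reads of this group use the cached copy.
      $this->loaded[$group] = call_user_func($this->loaders[$group]);
    }
    return isset($this->loaded[$group][$key]) ? $this->loaded[$group][$key] : NULL;
  }

}

// Counts how often the expensive loader runs, for illustration only.
$loader_calls = 0;

$metadata = new ExampleLazyMetadata(
  array('title' => 'Example track', 'album' => 'Example album'),
  array(
    'audio' => function () use (&$loader_calls) {
      $loader_calls++;
      // A real loader would parse the file; these values are made up.
      return array('bitrate' => 192000, 'channels' => 2);
    },
  )
);

$title = $metadata->get('basic', 'title');
$calls_after_basic = $loader_calls;
$bitrate = $metadata->get('audio', 'bitrate');
$channels = $metadata->get('audio', 'channels');
$missing = $metadata->get('video', 'codec');
```

Reading basic values never triggers a loader, and the two reads from the audio group trigger it only once, which is the overhead saving the lazy-loading idea is after.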
API changes
A new hook or plugin should be added to allow modules to provide their own meta-data.
Comment | File | Size | Author |
---|---|---|---|
#20 | 1496942-20-metadata-api.patch | 19.95 KB | Devin Carlson |
#17 | 1496942-metadata-api-17.patch | 14.53 KB | aaron |
#16 | 1496942-metadata-api-16.patch | 14.6 KB | aaron |
#14 | 1496942-metadata-api-14.patch | 14.59 KB | aaron |
#10 | 1496942-metadata-api.patch | 14.52 KB | Dave Reid |
Comments
Comment #0.0
pbuyle: typo
Comment #0.1
pbuyle: Add lazy-loading idea.
Comment #1
Dave Reid
Comment #2
Dave Reid: We are going to go ahead with a {file_metadata} table and convert the {image_dimensions} table to use the new format since it will be compatible with the new Entity Property API for D8. The only things that should be stored in this table are things that can be extracted from the raw file in the file system without any kind of Drupal context: things like image dimensions, audio track length, video size and length, etc.
Comment #3
aaron: This was discussed in very early meetings, as far back as 2008. It is a very good idea that should allow for all kinds of cool uses, such as integration with GetID3, storing YouTube information such as title and duration, and lots of other useful tidbits.
Comment #4
Dave Reid: I would really like to add this to the blocker list as something to push for in D8 before freeze, and since it adds an API I want people to be able to start using it.
Comment #5
Devin Carlson: Marked #1845958: Alt and title support for all display format. as a duplicate.
Comment #6
Dave Reid: Initial patch adding a metadata API.
Comment #7
Dave Reid: Revised patch; still need to remove some references to image_dimensions.
Comment #8
Dave Reid: One more version.
Comment #10
Dave Reid: Revised patch that fixes a major bug in the new function.
Comment #12
Devin Carlson: #10: 1496942-metadata-api.patch queued for re-testing.
Comment #14
aaron
Comment #15
aaron
Comment #16
aaron
Comment #17
aaron
Comment #19
Dave Reid
Comment #20
Devin Carlson: A patch to address a number of small issues with #10 (mainly using db_merge instead of db_insert and accommodating changes in tests).
Comment #21
aaron: bravo!
Comment #22
Devin Carlson: Committed #20 to File entity 7.x-2.x. Thanks everyone!
Please file separate follow-up issues for any enhancements you require, locations the API could be implemented or troubles with the upgrade path you come across, etc.
Comment #23.0
(not verified): And related issues references.