filefield_meta: store audio metadata from getid3
moonray - June 3, 2009 - 12:19
| Project: | FileField |
| Version: | 6.x-3.x-dev |
| Component: | Code |
| Category: | feature request |
| Priority: | normal |
| Assigned: | moonray |
| Status: | needs work |
Description
There used to be an issue related to this, but it was closed, so I'm creating a new one so it doesn't get lost.
It'd be great if we could use the audio file metadata from the ID3 tags, which getid3 already gathers.
I'm willing to write the code, but I need some direction. What would be the best way to store/retrieve this data?

#1
This data is already stored, is it not? You can load the file with field_file_load($fid), and the file data should be included in the returned $file object if the file was uploaded after enabling the filefield_meta module.
#2
No, only basic data is stored: audio_format, audio_sample_rate, audio_channel_mode, audio_bitrate, audio_bitrate_mode.
I would like to see ID3 data as well: title, artist, album, track, year, etc.
#3
Ah, I see. You might try corresponding with drewish and browsing the Audio module issue queue, as I believe he is hoping to make Audio module depend on FileField eventually. Considering that Audio module already provides this exact functionality, merging the interests of FileField and Audio would be a huge boon to the community.
#4
Is there any news on this?
#5
nope.
#6
Is anybody working on this already? I have started doing something similar to what the Audio module does when it reads an audio file.
Maybe I will have a patch by the weekend.
#7
I'm attaching a patch. I apologize if I did something wrong; I am kind of new to developing for Drupal.
The patch saves the ID3 tags in a new database table.
The tags are available from $file->data['tags'].
Please tell me if you give it a try. Thanks!
Description of changes:
Added new table "filefield_meta_audio"
Added some functions from Audio module.
Get the info from id3tags and store them in the new table "filefield_meta_audio".
The tags are put in $file->data['tags']
I hope this is OK, but I am not so sure about the workflow I used.
#8
Has anyone tried the patch?
#9
I cleaned up the files and added a couple of lines; the updated patch is attached.
I also added code to the cron hook to delete the entries in the new filefield_meta_audio table.
#10
Solved an incompatibility with audio module
#11
I thought I posted a comment on this a few days ago, but I must've never finished it.
Anyway, this patch still has some work that needs to be done on it. Some configuration settings don't make sense in FileField Meta, things like "browsable" or "autocomplete". Is there any way to edit these tags once they've been set (or should there be)?
The call to filefield_meta_clean_tag() contains all kinds of unknown characters. It seems strange to do all this conversion ourselves, maybe we should look into using Transliteration module, which does the same thing but much, much better.
Rather than this crude error assembly:
$error .= '<ul><li>'. implode('</li><li>', $info['error']) .'</li></ul>';
We should do something like this to make our UL:
$error .= theme('item_list', $info['error']);
None of this data does us much good if there isn't a Views implementation; that should be included as part of this feature also.
#12
Hi again. I modified the function that cleans the strange characters: if the Transliteration module is present, it now gets the string from there. All the fields are now stored, but only the defined fields are exposed to Views. (Views does not work yet, but this could be a starting point.)
The error is not themed yet; I am not sure how to do that properly.
Thanks for the feedback.
#13
Has anyone tried the patch?
#14
Correct me if I'm wrong, but it seems the patch stores the ID3 tags for files in a dedicated table. FileField already provides a simple, extensible storage for file metadata: the 'data' field of the $file object returned by field_file_load($fid). filefield_meta already stores the extracted metadata in this field. Wouldn't it be logical to store the ID3 tags there as well? It is of course not as powerful as a dedicated table, but for more advanced usage, a dedicated module (such as Audio) could re-use the already extracted metadata and even duplicate it in its own table if needed. This way, filefield_meta is kept simple but rich enough to provide the foundation for other modules to build advanced features.
This can be done with four lines of code, as in the attached patch. The tags are then available in $file->data['tags']['id3'].
Again, this is a simple solution. Cleaning the data and selecting what is displayed or used is the consuming module's responsibility.
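For readers following along, here is a minimal sketch of the nested-storage idea from this comment, written in plain PHP so it runs outside Drupal. It is not the attached patch: the helper name example_store_raw_tags() is hypothetical, and the sample $info array is hand-written. The keys inside 'tags' (here 'id3v2') and the one-element value lists are assumptions about what getID3 returns.

```php
<?php
// Hypothetical helper, not part of filefield_meta: copies the raw tag groups
// from a getID3-style result into a file's 'data' array under 'tags'.
function example_store_raw_tags(array $data, array $info) {
  if (!empty($info['tags']) && is_array($info['tags'])) {
    // Keep the structure untouched; consuming modules clean it themselves.
    $data['tags'] = $info['tags'];
  }
  return $data;
}

// Hand-written stand-in for an analysis result (assumed structure).
$info = array(
  'tags' => array(
    'id3v2' => array(
      'title' => array('Blind'),
      'artist' => array('MzW!!'),
    ),
  ),
);

// Existing metadata is preserved and the tags are added next to it.
$data = example_store_raw_tags(array('audio_format' => 'mp3'), $info);
print $data['tags']['id3v2']['title'][0] . "\n";
```

A consuming module would then read whatever tag group it cares about from the nested array and do its own cleaning.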
#15
I will try this patch once I get filefield meta working at all: http://drupal.org/node/349693
#16
Thanks, it would be great to have all that information in the metadata.
#17
Filefield meta seems to work now, but the getid3 patch info is not available yet. How can we help to get this done? Thanks for the work!
#18
I tried the patch from http://drupal.org/node/480754#comment-1830770 for a few weeks and seems to be working really well. It would be great to have it in the main release.
#19
I realized the last patch was a little inconsistent with the way the metadata is stored, so I rewrote it and added Views support as well.
With the new patch, the field ['data'] is an array containing the following:
[data] => array(
  [description] => []
  [duration] => [266.762375]
  [height] => [0]
  [width] => [0]
  [audio_bitrate_mode] => [cbr]
  [audio_channel_mode] => [stereo]
  [audio_format] => [mp3]
  [audio_bitrate] => [192000]
  [audio_sample_rate] => [44100]
  [title] => [Blind]
  [artist] => [MzW!!]
  [album] => [Seven]
  [track_number] => [1]
  [recording_time] => [2006]
  [genre] => [Drum & Bass]
  [year] => [2006]
)
The fields are filled from the id3v2 tags of the file.
The album, title, track number, genre and year are now exposed to Views.
(The attached file is the patch for the project. Two files have been modified: filefield_meta.module and filefield_meta.views.inc.)
The diff is against the latest CVS 3.x checkout.
#20
You're probably going to want to add audio_ to the front, or something similar, since filefield_meta stores info on images etc. as well. (Can you get ID3 data on video files?) Just a thought. Looking good though. It will complement my mp3 player module well.
#21
I applied your suggestion by prepending audio_ to the key in the array so now the fields are like this:
[audio_title] => [Blind]
[audio_artist] => [MzW!!]
[audio_album] => [Seven]
[audio_track_number] => [1]
[audio_recording_time] => [2006]
[audio_genre] => [Drum & Bass]
[audio_year] => [2006]
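As an illustration of the prefixing described in this comment, here is a small sketch in plain PHP. It is not the code from the patch: example_prefix_tags() is a hypothetical helper, and the input assumes each tag value arrives as a list of strings, of which only the first is kept.

```php
<?php
// Hypothetical helper: flattens one group of tags into prefixed keys so they
// do not collide with metadata stored for images or other file types.
function example_prefix_tags(array $tags, $prefix = 'audio_') {
  $flat = array();
  foreach ($tags as $name => $values) {
    // Tag values are assumed to arrive as lists; keep the first entry.
    $value = is_array($values) ? reset($values) : $values;
    // Skip empty tags (reset() returns FALSE for an empty list).
    if ($value !== FALSE && $value !== NULL && $value !== '') {
      $flat[$prefix . $name] = $value;
    }
  }
  return $flat;
}

// Hand-written sample tags (assumed structure).
$tags = array(
  'title' => array('Blind'),
  'genre' => array('Drum & Bass'),
  'year' => array('2006'),
  'comment' => array(),
);

// Merge the prefixed tags into the metadata that is already stored.
$data = array_merge(array('audio_format' => 'mp3'), example_prefix_tags($tags));
print_r($data);
```

Passing a different prefix as the second argument would let the same helper serve other file types.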
#22
Added tokens to the previous patch.
The tokens cover the metadata information, and they work with filefield_paths.
#23
I've been following the progress on this issue for a week or so, applying the patches to dev versions of filefield in my development environment. The data's getting in there. I'm just at a bit of a loss as to how to access the information in a view. I have, for example, a node-based view filtered to my podcast node type, which has a filefield for mp3 files. It has a block display using the 'fields' row style, with a relationship set up on the mp3 filefield 'fid'. When I go to add fields to the display, I find the id3 tags in the 'file' category, but when I add any of them (other than the bitrate and bitrate mode), I get 'Error: handler for filefield_meta > audio_artist doesn't exist!' (for example).
How are you making use of this tag data?
#24
Per #12, Views doesn't work yet.
#25
I have only tried accessing the Views data from the "file" view type. The fields are not exposed in the node view type (I do not know why).
Apart from that, I am using the tokens and the id3 information in my templates, and they work as expected. Could you review the latest patch?
#26
Hi,
The idea for my patch in comment #14 is for filefield_meta to expose id3 (v1 and v2) to dedicated modules, not for direct usage. That's why it stores the whole 'tags' array from getid3() in filefield_meta's data. IMHO, the same should be done for all getID3() returned data/tags (yes, getID3() reads more than id3).
Exposing tags directly in data (and not in a nested array), cleaning, formatting, etc. should be done in dedicated modules. It's more flexible for future usage, as the same tags may be used differently by different modules. For instance, using the ID3 title tag as audio_title means that a generic module wanting to handle titles for both audio and image files would have to use audio_title plus whatever field is used to store the EXIF title (ImageTitle or ImageDescription), or worse, re-run getID3() to extract the EXIF data itself. But if filefield_meta stores all getID3()-extracted data/tags, that module only has to use a known and documented data structure[1].
If storing all getID3()-extracted data in filefield_meta's data is too much overhead for some sites, a separate module (with a heavy weight) can clean it up later.
Another solution would be to have a dedicated hook, invoked by filefield_meta, to handle getID3()'s extracted data and to merge the results of this hook into the data field. Something like hook_filefield_meta_data($info).
[1] Relying on the getID3() data structure means that developers who already know it won't have to learn much filefield_meta-specific data/tag naming and structure, while those who learn getID3()'s data/tag naming and structure in order to use filefield_meta gain knowledge that is re-usable outside filefield_meta (and even outside Drupal). The getID3() structure is already documented at http://getid3.sourceforge.net/source/structure.txt
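To make the hook proposal above more concrete, here is a rough sketch in plain PHP. Nothing in it exists in filefield_meta: example_invoke_meta_data() is a stand-in for Drupal's hook invocation, the two "modules" (mytitles and rawtags) are hypothetical, and the shape of $info is an assumed getID3-style array.

```php
<?php
// Stand-in for Drupal's hook invocation: calls MODULE_filefield_meta_data()
// for each listed module and merges the returned arrays into one.
function example_invoke_meta_data(array $modules, array $info) {
  $data = array();
  foreach ($modules as $module) {
    $function = $module . '_filefield_meta_data';
    // Modules that do not implement the hook are silently skipped.
    if (function_exists($function)) {
      $result = $function($info);
      if (is_array($result)) {
        $data = array_merge($data, $result);
      }
    }
  }
  return $data;
}

// Hypothetical module that exposes a generic title from the ID3v2 tags.
function mytitles_filefield_meta_data(array $info) {
  if (isset($info['tags']['id3v2']['title'][0])) {
    return array('title' => $info['tags']['id3v2']['title'][0]);
  }
  return array();
}

// Hypothetical module that keeps the whole tags array untouched.
function rawtags_filefield_meta_data(array $info) {
  return isset($info['tags']) ? array('tags' => $info['tags']) : array();
}

// Hand-written sample analysis result (assumed structure).
$info = array('tags' => array('id3v2' => array('title' => array('Blind'))));
$data = example_invoke_meta_data(array('mytitles', 'rawtags', 'missing'), $info);
print_r($data);
```

With this shape, filefield_meta would only run getID3() once and hand the result around; each module decides which keys end up in the data field.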
#27
That patch still lacks Views and token support. I tried to add it, but I did not know how to expose nested information to Views.
I kind of agree about saving the whole structure array, but maybe it makes it too complex to access the information and to decide what to allow into Views and tokens.
It would be good to have a debate about what to do with metadata information for files.