There used to be an issue related to this, but it was closed, so I'm creating a new one so it doesn't get lost.

It'd be great if we could use the metadata that getid3 gathers from the ID3 tags of audio files.

I'm willing to write the code, but I need some direction. What would be the best way to store/retrieve this data?

Comments

quicksketch’s picture

This data is already stored, is it not? You can load the file with field_file_load($fid), and the file data should be included in the returned $file object if the file was uploaded after the filefield_meta module was enabled.

moonray’s picture

No, only basic data is stored: audio_format, audio_sample_rate, audio_channel_mode, audio_bitrate, audio_bitrate_mode.

I would like to see things like id3 data: title, artist, album, track, year, etc.

quicksketch’s picture

Ah, I see. You might try corresponding with drewish and browsing the Audio module issue queue, as I believe he is hoping to make Audio module depend on FileField eventually. Considering that Audio module already provides this exact functionality, merging the interests of FileField and Audio would be a huge boon to the community.

davebv’s picture

Is there any news on this?

moonray’s picture

nope.

davebv’s picture

Is anybody working on this already? I have started on something similar to what the Audio module does when it reads an audio file.

Maybe I will have a patch by the weekend.

davebv’s picture

FileSize
7.72 KB

I am attaching a patch. I apologize if I did something wrong; I am fairly new to developing for Drupal.

The patch saves the ID3 tags in a new database table.

The tags are available from $file->data['tags']

Please tell me if you give it a try. Thanks!

Description of changes:
Added new table "filefield_meta_audio"

Added some functions from Audio module.
Get the info from id3tags and store them in the new table "filefield_meta_audio".
The tags are put in $file->data['tags']

I hope this is OK, but I am not sure about the workflow I followed.

davebv’s picture

Has anyone tried the patch?

davebv’s picture

FileSize
7.93 KB

I cleaned up the code and added a couple of lines to the files; an updated patch is attached.

The cron hook now also deletes the entries in the new filefield_meta_audio table.

davebv’s picture

FileSize
7.99 KB

Solved an incompatibility with the Audio module.

quicksketch’s picture

Status: Active » Needs work

I thought I posted a comment on this a few days ago, but I must've never finished it.

Anyway, this patch still has some work that needs to be done on it. Some configuration settings don't make sense in FileField Meta, things like "browsable" or "autocomplete". Is there any way to edit these tags once they've been set (or should there be)?

The call to filefield_meta_clean_tag() contains all kinds of unknown characters. It seems strange to do all this conversion ourselves, maybe we should look into using Transliteration module, which does the same thing but much, much better.

Rather than this crude error assembly:

$error .= '<ul><li>'. implode('</li><li>', $info['error']) .'</li></ul>';

We should do something like this to make our UL

$error .= theme('item_list', $info['error']);

None of this data does us much good without a Views implementation, so that should be included as part of this feature as well.

davebv’s picture

FileSize
10.07 KB

Hi again. I modified the function that cleans the strange characters: if Transliteration is present, the string now comes from there. All the fields are now stored, but only the defined fields are exposed to Views. (Views does not work yet, but this could be a starting point.)

The error is not themed; I am not sure how to do that properly.

Thanks for the feedback

davebv’s picture

Has anyone tried the patch?

pbuyle’s picture

FileSize
693 bytes

Correct me if I'm wrong, but it seems the patch stores the ID3 tags for files in a dedicated table. But filefield already provides a simple, extensible storage usable for file metadata: the 'data' field of a $file object returned by field_file_load($fid). filefield_meta already stores the extracted metadata in this field. Wouldn't it be logical to store the ID3 tags there as well? It is of course not as powerful as a dedicated database table, but for more advanced usage, a dedicated module (such as Audio) could re-use the already extracted metadata and even duplicate it in its own table if needed. This way, filefield_meta is kept simple but rich enough to provide the foundation for other modules to build advanced features.

This can be done with four lines of code as in the attached patch. The tags are then available in $file->data['tags']['id3'].

Again, this is a simple solution. Cleaning the data and selecting what is displayed or used by its features is the responsibility of the consuming module.
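A minimal sketch of the nested-storage idea in plain PHP, not the attached patch itself. The helper name example_store_id3_tags() and the sample values are hypothetical, and the exact shape of the tag array handed over by getID3 is an assumption here:

```php
<?php
// Hypothetical helper: keep the extracted tags untouched under
// $data['tags']['id3'] instead of flattening them into $data.
function example_store_id3_tags(array $data, array $id3_tags) {
  if (!empty($id3_tags)) {
    $data['tags']['id3'] = $id3_tags;
  }
  return $data;
}

// Made-up sample values; a real array would come from the file analysis.
$data = array('duration' => 266.76, 'audio_format' => 'mp3');
$id3_tags = array('title' => array('Blind'), 'artist' => array('MzW!!'));

$data = example_store_id3_tags($data, $id3_tags);
print $data['tags']['id3']['title'][0] . "\n";
```

A consuming module would then read whatever it needs from the nested array and do its own cleaning.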

Flying Drupalist’s picture

I will try this patch once I get filefield meta working at all: http://drupal.org/node/349693

davebv’s picture

Thanks. It would be great to have all that information in the metadata.

davebv’s picture

FileField Meta seems to work now, but the getid3 tag information from the patch is not available yet. How can we help to get this done? Thanks for the work!

davebv’s picture

I have been using the patch from http://drupal.org/node/480754#comment-1830770 for a few weeks and it seems to be working really well. It would be great to have it in the main release.

davebv’s picture

I realized the last patch was somewhat inconsistent with the way FileField Meta stores its information, so I rewrote it and added Views support as well.

With the new patch, the field ['data'] is an array containing the following:
[data] => array(
[description] => []
[duration] => [266.762375]
[height] => [0]
[width] => [0]
[audio_bitrate_mode] => [cbr]
[audio_channel_mode] => [stereo]
[audio_format] => [mp3]
[audio_bitrate] => [192000]
[audio_sample_rate] => [44100]
[title] => [Blind]
[artist] => [MzW!!]
[album] => [Seven]
[track_number] => [1]
[recording_time] => [2006]
[genre] => [Drum & Bass]
[year] => [2006]
)

The fields are filled from the id3v2 tags of the file.

And the album, title, track number, genre and year now are exposed to views.

(The attached file is the patch for the project. Two files have been modified: filefield_meta.module and filefield_meta.views.inc.)

The diff was against the latest cvs 3.x checkout

jdelaune’s picture

You're probably going to want to add audio_ to the front, or something similar (can you get ID3 data on video files?), since filefield_meta stores info on images etc. as well. Just a thought. Looking good though. It will complement my mp3 player module well.

davebv’s picture

I applied your suggestion by prepending audio_ to the key in the array so now the fields are like this:

[audio_title] => [Blind]
[audio_artist] => [MzW!!]
[audio_album] => [Seven]
[audio_track_number] => [1]
[audio_recording_time] => [2006]
[audio_genre] => [Drum & Bass]
[audio_year] => [2006]
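For illustration only, the renaming could be sketched in plain PHP as below. The function name example_prefix_keys() is hypothetical and is not taken from the patch:

```php
<?php
// Hypothetical helper: prepend a prefix to every key so that audio tags
// cannot collide with metadata stored for images or other file types.
function example_prefix_keys(array $tags, $prefix = 'audio_') {
  $prefixed = array();
  foreach ($tags as $key => $value) {
    $prefixed[$prefix . $key] = $value;
  }
  return $prefixed;
}

$tags = array('title' => 'Blind', 'artist' => 'MzW!!', 'year' => '2006');
$prefixed = example_prefix_keys($tags);
print_r($prefixed);
```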

davebv’s picture

Added TOKENS to the previous patch:

The tokens are for the metadata information, and it works with filefield_paths.

chiebert’s picture

I've been following the progress on this issue for a week or so, applying the patches on dev versions of filefield in my development environment. The data's getting in there. I'm just at a bit of a loss as to how to access the information in a view. I have, for example, node-based view filtering for my podcast node type, which has a filefield for mp3 files. A block display of 'fields' row style, with a relationship set up on the mp3 filefield 'fid'. When I go to add fields to the display, I find the id3 tags in the 'file' category, but when I add any of them (other than the bitrate, bitrate mode, and bitrate mode), I get 'Error: handler for filefield_meta > audio_artist doesn't exist!' (for example).

How are you making use of this tag data?

VM’s picture

Per #12, Views doesn't work yet.

davebv’s picture

I have only tried accessing the Views data from the "file" view type. The fields are not exposed in node-type views (I do not know why).

Apart from that, I am using the tokens and the ID3 information in my templates, and they work as expected. Could you review the latest patch?

pbuyle’s picture

Hi,

The idea for my patch in comment #14 is for filefield_meta to expose id3 (v1 and v2) to dedicated modules, not for direct usage. That's why it stores the whole 'tags' array from getid3() in filefield_meta's data. IMHO, the same should be done for all getID3() returned data/tags (yes, getID3() reads more than id3).

Exposing tags directly in data (and not in a nested array), cleaning, formatting, etc. should be done in dedicated modules. It's more flexible for future uses, as the same tags may be used differently by different modules. For instance, using the ID3 title tag as audio_title means that a generic module wanting to handle titles for both audio and image files would have to use audio_title and whatever field is used to store the EXIF title (ImageTitle or ImageDescription), or worse, re-run getID3() to extract the EXIF data itself. But if filefield_meta stores all getID3() extracted data/tags, that module only has to use a known and documented data structure[1].

If storing all getID3() extracted data in filefield_meta's data is too much overhead for some sites, a separate module (with a heavy weight) can clean it up later.

Another solution would be to have a dedicated hook, invoked by filefield_meta, that handles getID3()'s extracted data and whose results are merged into the data field. Something like hook_filefield_meta_data($info).
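A rough sketch of how such a hook could behave, written in plain PHP so it runs outside Drupal. Everything here is hypothetical: example_collect_meta() stands in for the hook invocation, the two callbacks stand in for module implementations, and the $info array is a made-up sample shaped like getID3 output:

```php
<?php
// Hypothetical stand-in for invoking hook_filefield_meta_data($info):
// each callback receives the analysis result and returns the values it
// wants merged into the file's data array.
function example_collect_meta(array $info, array $callbacks) {
  $data = array();
  foreach ($callbacks as $callback) {
    $result = call_user_func($callback, $info);
    if (is_array($result)) {
      $data = array_merge($data, $result);
    }
  }
  return $data;
}

// Hypothetical implementation from an audio-oriented module.
function example_audio_meta(array $info) {
  if (isset($info['tags']['id3v2']['title'][0])) {
    return array('title' => $info['tags']['id3v2']['title'][0]);
  }
  return array();
}

// Hypothetical implementation from a generic module.
function example_duration_meta(array $info) {
  if (isset($info['playtime_seconds'])) {
    return array('duration' => $info['playtime_seconds']);
  }
  return array();
}

$info = array(
  'playtime_seconds' => 266.76,
  'tags' => array('id3v2' => array('title' => array('Blind'))),
);
$data = example_collect_meta($info, array('example_audio_meta', 'example_duration_meta'));
print_r($data);
```

With this arrangement filefield_meta itself stays unaware of which tags matter; each module decides what to contribute.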

[1] Relying on the getID3() data structure means that developers who already know it won't have to learn much filefield_meta-specific data/tag naming and structure, while those who learn getID3()'s naming and structure in order to use filefield_meta gain knowledge that is reusable outside filefield_meta (and even outside Drupal). The getID3() structure is already documented at http://getid3.sourceforge.net/source/structure.txt

davebv’s picture

That patch still lacks Views and token support. I tried to add them, but I did not know how to expose nested information to Views.

I mostly agree with saving the whole structure array, but it may make it too complex to access the information and to decide what to expose to Views and tokens.

This would be a good debate about what to do with metadata information for files.

nicholas.alipaz’s picture

Any updates on this? It seems to have been left untouched since October 2009.

kvbbro’s picture

subscribing

quicksketch’s picture

Weighing in with my opinion here. I think gathering ID3 tag information is great, but I think there will likely be some limitations. Specifically, I don't really think it's worthwhile to store this data in individual columns or in a new dedicated table for efficiency reasons. This is mostly because I don't think that you'd want to sort by artist, album name, year, genre, or any ID3 tags directly. Sorting on any of these columns is going to be quite inefficient, since none of these things are consistent or given numeric IDs.

So instead I think it would make sense to have a single serialized column for all this extra information, making it available within the $node and $file objects but not directly sortable through SQL.
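As a plain-PHP illustration of the single serialized column, under the assumption that the column simply holds the output of serialize(). The helper names are hypothetical:

```php
<?php
// Hypothetical helpers: pack all tags into one string for a single
// database column, and unpack them again when the file is loaded.
function example_pack_tags(array $tags) {
  return serialize($tags);
}

function example_unpack_tags($column_value) {
  // unserialize() returns FALSE on damaged input; fall back to no tags.
  $tags = @unserialize($column_value);
  return is_array($tags) ? $tags : array();
}

$tags = array('title' => 'Blind', 'artist' => 'MzW!!', 'genre' => 'Drum & Bass');
$column_value = example_pack_tags($tags);
$restored = example_unpack_tags($column_value);
print $restored['genre'] . "\n";
```

The trade-off is the one described above: the values travel with the $file object, but SQL cannot sort or filter on an individual tag.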

Instead I think it would make sense (eventually) to provide mapping settings for the content type and automatically prepopulate other fields after a file has been uploaded. That is, if you upload an MP3, it would automatically prepopulate corresponding Taxonomy fields with year and genre, an artist node reference field, and the node title, all through JavaScript. Which ID3 tags map to which fields is configurable, in case you want to use text fields instead of taxonomies or node references. Then you have much better mechanisms for organizing and searching this data. The use of ID3 tags is almost entirely a matter of convenience rather than something that is used regularly for data retrieval. It also allows administrators to modify this information, and it means that providing tokens and (most) Views support isn't really needed, since they'll all be provided by other CCK fields or Taxonomy already.

Flying Drupalist’s picture

Totally, that's my wish as well.

quicksketch’s picture

Status: Needs work » Needs review
FileSize
11.7 KB

Okay here's a take at this functionality. It's based off davebv's excellent work so far.

- Provides direct reading and storage of ID3 tags, completely unmodified in a single "tags" column in the filefield_meta table.
- Includes an upgrade path to batch import ID3 tags for all existing files.
- Uses a "hidden" variable "filefield_meta_tags" to determine which tags are supported by default. Users may override this if they wish to support other ID3 tags. We may also form_alter() the admin/settings/getid3 form to make this a publicly accessible option, though translation is an issue there.
- Provides working Views integration for accessing these tags.
- Provides working Token integration (same as davebv's patch, except I moved it to filefield_meta.token.inc) for the supported tags.
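To illustrate the supported-tags idea from the list above, here is a small plain-PHP sketch. It is not code from the patch: example_filter_supported_tags() is a hypothetical name, and the list of supported tags is made up rather than the module's actual default:

```php
<?php
// Hypothetical helper: keep only the tags named in a configurable list,
// similar in spirit to a "filefield_meta_tags" variable.
function example_filter_supported_tags(array $tags, array $supported) {
  return array_intersect_key($tags, array_flip($supported));
}

// Made-up list; a site could override it to support other tags.
$supported = array('title', 'artist', 'album', 'year', 'genre');
$tags = array(
  'title' => 'Blind',
  'artist' => 'MzW!!',
  'encoder_settings' => 'unsupported example tag',
);

$exposed = example_filter_supported_tags($tags, $supported);
print_r($exposed);
```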

It does not provide the fancy ID3 tag to CCK/Taxonomy/Title mapping I described in #30, but it's a solid start and I'd be happy to get this in and then work on the JavaScript matching of values in a later iteration.

Speaking of future iterations of this module, anyone that has input on #780848: Merge getID3 with FileField Meta in Drupal 7, it would be good to figure out what the future of this module is now that FileField has been moved into core.

johnhanley’s picture

subscribing

jthomasbailey’s picture

The patch seems to be working (I originally had errors applying it, uninstalled Filefield and reinstalled the newest Dev, May-08 and it worked)

I don't think the Tokens are working though, unless I'm missing something. But enabling PHP evaluation in Automatic Nodetitles for example and entering something like <?php print $node->field_mp3[0]['data']['tags']['artist'] ?> - <?php print $node->field_mp3[0]['data']['tags']['title'] ?> does the trick.

Works perfectly with Contemplate.

asb’s picture

sub

deadman’s picture

Tested the patch in #34 and works ok although for files with large amounts of metadata, the serialized array is truncated when stored in the database and therefore is corrupted.

In my test case I was using ogg files with embedded coverart. As the coverart is added to the "tags_html" array, it is serialized along with the rest of the tags when saved to the database. My temporary solution was to adjust the code to discard the coverart tag:

  // Add in arbitrary ID3 tags.
  if (isset($info['tags_html'])) {
    // We use tags_html instead of tags because it is the most reliable data
    // source for pulling in non-UTF-8 characters according to getID3 docs.
    foreach ($info['tags_html'] as $type => $values) {
      // Typically $type may be IDv2 (for MP3s) or quicktime (for AAC).
      foreach ($values as $key => $value) {
        $value = isset($value[0]) ? (string) $value[0] : (string) $value;
        if (!empty($value) && $key != 'coverart') {
          $file->data['tags'][$key] = html_entity_decode($value, ENT_QUOTES, 'UTF-8');
        }
      }
    }
  }

There's probably a better way of handling that situation though (checking size of each tag?) as other tags may produce a similar result.
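One way the size check could look, as a sketch only. The function name and the 1024-byte limit are arbitrary assumptions, and unlike the snippet above it treats only the empty string as empty, so a value of "0" is kept:

```php
<?php
// Hypothetical variant of the loop above: skip any tag whose value is
// longer than $max_bytes instead of naming 'coverart' explicitly.
function example_collect_tags(array $tags_html, $max_bytes = 1024) {
  $tags = array();
  foreach ($tags_html as $type => $values) {
    foreach ($values as $key => $value) {
      if (is_array($value)) {
        $value = isset($value[0]) ? (string) $value[0] : '';
      }
      else {
        $value = (string) $value;
      }
      // Drop empty values and anything too large to store safely.
      if ($value === '' || strlen($value) > $max_bytes) {
        continue;
      }
      $tags[$key] = html_entity_decode($value, ENT_QUOTES, 'UTF-8');
    }
  }
  return $tags;
}

// Made-up sample: a long string stands in for embedded cover art.
$tags_html = array(
  'vorbiscomment' => array(
    'title' => array('Blind'),
    'genre' => array('Drum &amp; Bass'),
    'coverart' => array(str_repeat('A', 5000)),
  ),
);
$tags = example_collect_tags($tags_html);
print_r($tags);
```

A per-tag limit does not by itself guarantee that the whole serialized array fits in the column, so a check on the total serialized length may still be worthwhile.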

quicksketch’s picture

Status: Needs review » Fixed
FileSize
11.84 KB

Thanks deadman. I incorporated your changes and included them in the attached patch, which I've committed.

deadman’s picture

Fantastic, glad to be of assistance.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Flying Drupalist’s picture

What about the CCK integration mentioned in #30?

A Romka’s picture

Is there any way of editing tags and saving them back to the mp3 files?

BeaPower’s picture

I tried to apply this to Filefield v. 3.9 and I get errors while trying to patch with ssh. Any fix for this?

VM’s picture

A) when you get errors you should post them

B) This patch is already part of filefield 6.x-3.9 based on the date of this thread and when the patch was committed to -dev.

BeaPower’s picture

How can I get these to show in Views (author, title, genre, etc.)? I am not familiar with relationships.