Posted by aaron on September 2, 2011 at 7:24pm
28 followers
| Project: | Media |
| Version: | 7.x-2.x-dev |
| Component: | Code |
| Category: | bug report |
| Priority: | major |
| Assigned: | aaron |
| Status: | needs review |
| Issue tags: | file usage, Media Initiative, sprint |
Issue Summary
file_usage_list() does not reflect media added through WYSIWYG. it does when added through file, image, or media element fields.
Comments
#1
also see #1222576: file_usage_list() is defective (might return incomplete results), which will make this information even more crucial (especially as a views filter).
#2
It's not just about file_usage_add() (which would probably require our filter system to provide context information - #226963: Context-aware filters), but ensuring we can also call file_usage_delete() when appropriate which I'm not sure we'd be able to do.
#3
this likely depends on #226963: Context-aware filters, although Dave Reid pointed out on IRC we might be able to discover the context with debug_backtrace(), if we wanted to put it through the wringer... :P
#4
#5
#6
Combined with Drupal's "annoying" behavior of automatically, without warning, deleting files/things when it thinks its not used anymore this is a big problem.
I started a discussion about this on g.d.o yesterday: http://groups.drupal.org/node/191183 and also added a great use case of how bad it can get today.
To sort this out is tricky though. But letting Drupal decide what should be deleted or not is definitely not the right way of doing it!
#7
Bumping priority
#8
When I started my work three months ago with media, I saw this problem, I started a sandbox project designed by tracking an "advanced files usage" of media module. Since then the media module changed a lot, but this was my idea to solve this:
1. Using hook_node_insert to add file_usage record to database.
2. Using hook_node_update to compare embed media files in old and new version of node, and update file_usage if necessary.
3. In hook_node_delete remove the file_usage record.
To do this, we doesn't need context aware filters.
#9
If this goes in, media would also be the right place for this patch I've created: #1193036-13: Handle images inserted as binary blobs on Drag & Drop.
Goal is to enable drag & drop / copy & paste in the WYSIWYG editor.
I've talked to sun and as far as I understand him the main issue he sees with the patch is that the file usage wouldn't be tracked.
#10
@szantog: Like your idea, was actually discussing the exact same approach with one of our developers last Friday.
Two questions:
#11
@tsvenson, Yes, on the revision question was that, where my work paused - I completely misunderstood the file_usage system. Now, I think it should be work exactly the same, like a filefield.
I saw, where the file_usage_add funtion is used: The http://api.drupal.org/api/drupal/modules--file--file.field.inc/function/... and http://api.drupal.org/api/drupal/modules--file--file.field.inc/function/...
Based on this, if a new node is created, lets start play with file usage:
1. We use two files, with fid 1 and 2, in node 1.
The file usage table is this:
Fid Module Type Id Count1 media node 1 1
2 media node 1 1
2. Update the node and create new revision, all of two files stay in use.
Fid Module Type Id Count1 media node 1 2
2 media node 1 2
3. Update the node and create new revision, delete the FID 1, and add fid 3.
Fid Module Type Id Count1 media node 1 2
2 media node 1 3
3 media node 1 1
4. Switch back to an earlier revision (a new revision is created!):
Fid Module Type Id Count1 media node 1 3
2 media node 1 4
3 media node 1 2
5. Delete the last revision in 3:
Fid Module Type Id Count1 media node 1 2
2 media node 1 3
3 media node 1 1
6. Delete the last revision in 4:
Fid Module Type Id Count1 media node 1 1
2 media node 1 2
I don't know, what about workbanch moderation, but I think, if this behavior (what is the same, as a normal filefield) isn't good for some other module, this module should fix it. As I think there shouldn't be problem with workbanch moderation. What revision is live - it's irrelevant for file_usage table, only need decrease or increase the counter, when revision is added/deleted.
If the line of idea is good, I can work with this on the next week.
#12
This is a complete hack, but here it is to get the ball rolling.
<?php
/**
* Implements hook_filter_info().
*/
function altvideo_filter_info() {
$filters['altvideo_filter'] = array(
'title' => t('Record WYSIWYG usage of media'),
'description' => t('This filter will record any file usage in WYSIWYG, in the context of a node. Must be before the %convert filter. Provided by the custom altvideo module.', array('%convert' => t('Converts Media tags to Markup'))),
'process callback' => 'altvideo_filter',
'tips callback' => 'media_filter_tips',
);
return $filters;
}
/**
* Discover & record any embedded media in the {file_usage} table.
*/
function altvideo_filter($original) {
$nid = &drupal_static('altvideo_nid');
// Get the stack of function calls.
$backtrace = debug_backtrace();
// We are looking for the node_load function call to get the NID
// that is being processed.
foreach ($backtrace as $function_call) {
// Log nodes
if ($function_call['function'] == 'node_load') {
if ($nid != $function_call['args'][0]) {
// Extract the nid.
$nid = $function_call['args'][0];
// Delete any prior file usage from this embed.
db_delete('file_usage')
->condition('module', 'altvideo')
->condition('id', $nid)
->execute();
}
$text = ' ' . $original . ' ';
preg_replace_callback("/\[\[.*? ]]/s", 'altvideo_token_to_markup', $text); // Remove the space after the ?
}
}
return $original;
}
/**
* Replace callback to convert tag into markup
* @param string $match
* Takes a match of tag code
* @return
* Return the replaced markup
*/
function altvideo_token_to_markup($match) {
$match = str_replace("[[", "", $match);
$match = str_replace("]]", "", $match);
$tag = $match[0];
try {
if (!is_string($tag)) {
throw new Exception('Unable to find matching tag');
}
$tag_info = drupal_json_decode($tag);
if (!isset($tag_info['fid'])) {
throw new Exception('No file Id');
}
if (!isset($tag_info['view_mode'])) {
// Should we log or throw an exception here instead?
// Do we need to validate the view mode for fields API?
$tag_info['view_mode'] = media_variable_get('wysiwyg_default_view_mode');
}
$file = file_load($tag_info['fid']);
if (!$file) {
throw new Exception('Could not load media object');
}
// Track the fid of this file in the {file_usage} table.
altvideo_track_usage($file->fid);
}
catch (Exception $e) {
}
}
function altvideo_track_usage($fid) {
$nid = &drupal_static('altvideo_nid');
$file = file_load($fid);
file_usage_add($file, 'altvideo', 'node', $nid);
}
?>
#13
This approach looks real clever specially using backtrace as I had not ever considered using backtrace this way. I've recently been looking into how the file_usage table is used for another module 'filedepot' that I'm porting over to D7 and wanted to note there is a file_usage_delete() and it could be used instead of the direct SQL to delete the usage record in the above code.
#14
Something is a bit off here...
If media module allows to upload files via an editor, then the resulting files are still part of media, and thus, part of your site's "media library."
Hence, since a file in a library inherently defines usage on its own, media module should declare a usage.
#15
No. Wrong.
You have a node reference field. You create nodes meant to be referenced. Those nodes are data and do not yet have any "usage". Reference some of those nodes. Save. De-reference some of those nodes. Save.
Do those nodes still exist? YES.
Repeat with any other entity reference type EXCEPT FILES. We shouldn't have to track usage for something that is uploaded into the "Media library" (aka just file_managed) that is not yet associated with any content.
#16
You nicely repeated what I stated in #1399846-3: Make the file deletion in file_field_delete_file() optional already:
In that case, the file entity might have further data (or even fields) attached to it. It cannot be deleted if it's still in use.
The file usage API is essentially what makes up a managed file. If you don't want that "managed" behavior, then don't use managed files.
But it sounds pretty clear to me that you actually do want to have that managed file behavior (because you explicitly want to treat files as entities, and thus, attach further data to them, and also retain them, even in the case when they are deleted).
However, both use-cases formally are a file usage on their own. The file is in use when any additional data is attached to it, and the file is also used permanently if it should be retained forever (as in a "media library" concept).
Thus, the root cause is that no usage is registered for a file. Register it.
It's not the job of a low-level File API to care for the diversity of possible use-cases. All it knows about is: "This is a file I'm supposed to manage. Is it still used anywhere? No? Alright, then it's garbage and can be deleted."
#17
I think Sun may has a point here about Media needing to register files. The default delete functionality that we have now is exactly what is desired for use cases such as profile photos, where the upload should be deleted if the user changes their photo or cancels their account. These images probably shouldn't be shown in the Media library either. Perhaps what Media needs is configuration that defines which files it should track (marking as used by media). Files that are not tracked by Media shouldn't be shown in the browser and should be deleted when removed from a file field, as per now.
On the same token, WYSIWYG file usage should also be tracked. This would prevent problems where someone wants to remove a file from the media library and accidentally breaks node(s) where the file is embedded into the body.
How does this sound?
#18
By sun's logic it sounds as if file_entity should be claiming ownership of files-as-entities, which would then get folded in to Drupal 8.
I think, however, that discussion is not related to this actual issue, which is that a file added to a node via WYSIWYG is, in fact, managed by that node and needs to be tracked as such.
#19
Yes, I agree with that, agentrickard. sun's argument has its merits, but belongs more properly at #1239056: Remove Media at content edit page cause the file to be completely deleted from server/database?. It has no bearing on whether to mark a usage for an embedded media instance.
#20
Is this the feature I need if I want to know where was the image posted originally, so I can add a link in the image gallery that says something like "image originally posted at X"?
Is there any other way to link the images to the nodes they were attached to in the first place?
Thanks.
#21
Until filters are context-aware, is this something that could be done on hook_entity_{insert/update/delete}? If this seems like a sound approach, I can write a patch.
#22
I think hook_field_attach_insert / update / delete, actually, which give you more detailed data about the field.
See http://api.drupal.org/api/drupal/modules!field!field.api.php/function/hook_field_attach_insert/7
#23
Yes this is actually possible to do with hook_field_attach_insert/update. I have some code that I would like to eventually upload to this issue but I'm currently unable to do so because of client code release policy. If anyone wants to work on it until I can resolve it they should be more than welcome to do so, but I also don't want people to waste time on something that I've already kinda solved. :/
#24
As soon as this is fixed I'd like to port this patch to the media module: #1193036-13: Handle images inserted as binary blobs on Drag & Drop.
At least if there are no objections.
#25
OK, here's an initial pass at it. A couple notes:
1. I am likely doing the field stuff wrong, particularly lines 57-68 in the patch. Any suggestions?
2. I'm a little concerned about the "media" namespace in the file_usage table. It seems like we could inadvertently wipe out other file_usage entries from elsewhere in the media module. Should this be something like "media_filter"?
#26
#27
The last submitted patch, media-filter-file_usage-1268116-25.patch, failed testing.
#28
Sorry, this patch should pass.
#29
Whoops, blew away the updated regex.
#30
I like that approach. Unfortunately, it looks as though it will not remove file usage when an entity embedding the media has been deleted.
#31
maybe that is a simple matter of implementing hook_field_attach_delete().
#32
Seems like you could just implement
hook_entity_delete()to remove all entries in the file_usage table. Does this seem good?#33
#34
Cleaned up the field stuff a bit. I'm sure it's still wrong.
#35
#36
Rerolled with a few small cleanups. (Mainly what I did is break _media_filter_parse_from_fields() up into smaller helper functions, since I need to reuse this code in another patch.)
I think the field code looks pretty reasonable. I'm not sure if using field_get_items() is technically correct here (since it will use the current language that the field would happen to be displayed in, which can change) but in practice it might not matter much since the media tags are likely to be the same across different translations of the text.