Files inserted by the WYSIWYG need to create a record in the Drupal {file_usage} table. Patch from #101 provides code that handles the insertion of images and properly accounts for node revisions as well as providing tests for various use cases raised in this issue.
How to test
To review this patch, please run the tests included with it with an eye for the conditions identified in #86, #89, #102, and #104
Notes
Further consideration may be needed for non-node entities.
Original description: file_usage_list() does not reflect media added through WYSIWYG. it does when added through file, image, or media element fields.
Comments
Comment #1
aaron CreditAttribution: aaron commentedalso see #1222576: file_usage_list() is defective (might return incomplete results), which will make this information even more crucial (especially as a views filter).
Comment #2
Dave ReidIt's not just about file_usage_add() (which would probably require our filter system to provide context information - #226963: Context-aware text filters (provide more meta information to the filter system)), but ensuring we can also call file_usage_delete() when appropriate which I'm not sure we'd be able to do.
Comment #3
aaron CreditAttribution: aaron commentedthis likely depends on #226963: Context-aware text filters (provide more meta information to the filter system), although Dave Reid pointed out on IRC we might be able to discover the context with debug_backtrace(), if we wanted to put it through the wringer... :P
Comment #4
Dave ReidComment #5
Dave ReidComment #6
tsvenson CreditAttribution: tsvenson commentedCombined with Drupal's "annoying" behavior of automatically, without warning, deleting files/things when it thinks its not used anymore this is a big problem.
I started a discussion about this on g.d.o yesterday: http://groups.drupal.org/node/191183 and also added a great use case of how bad it can get today.
To sort this out is tricky though. But letting Drupal decide what should be deleted or not is definitely not the right way of doing it!
Comment #7
Dave ReidBumping priority
Comment #8
szantog CreditAttribution: szantog commentedWhen I started my work three months ago with media, I saw this problem, I started a sandbox project designed by tracking an "advanced files usage" of media module. Since then the media module changed a lot, but this was my idea to solve this:
1. Using hook_node_insert to add file_usage record to database.
2. Using hook_node_update to compare embed media files in old and new version of node, and update file_usage if necessary.
3. In hook_node_delete remove the file_usage record.
To do this, we doesn't need context aware filters.
Comment #9
das-peter CreditAttribution: das-peter commentedIf this goes in, media would also be the right place for this patch I've created: #1193036-13: Firefox handling of Drag & Drop images: inserted as binary blobs in WYSIWYG editors.
Goal is to enable drag & drop / copy & paste in the WYSIWYG editor.
I've talked to sun and as far as I understand him the main issue he sees with the patch is that the file usage wouldn't be tracked.
Comment #10
tsvenson CreditAttribution: tsvenson commented@szantog: Like your idea, was actually discussing the exact same approach with one of our developers last Friday.
Two questions:
Comment #11
szantog CreditAttribution: szantog commented@tsvenson, Yes, on the revision question was that, where my work paused - I completely misunderstood the file_usage system. Now, I think it should be work exactly the same, like a filefield.
I saw, where the file_usage_add funtion is used: The http://api.drupal.org/api/drupal/modules--file--file.field.inc/function/... and http://api.drupal.org/api/drupal/modules--file--file.field.inc/function/...
Based on this, if a new node is created, lets start play with file usage:
1. We use two files, with fid 1 and 2, in node 1.
The file usage table is this:
2. Update the node and create new revision, all of two files stay in use.
3. Update the node and create new revision, delete the FID 1, and add fid 3.
4. Switch back to an earlier revision (a new revision is created!):
5. Delete the last revision in 3:
6. Delete the last revision in 4:
I don't know, what about workbanch moderation, but I think, if this behavior (what is the same, as a normal filefield) isn't good for some other module, this module should fix it. As I think there shouldn't be problem with workbanch moderation. What revision is live - it's irrelevant for file_usage table, only need decrease or increase the counter, when revision is added/deleted.
If the line of idea is good, I can work with this on the next week.
Comment #12
aaron CreditAttribution: aaron commentedThis is a complete hack, but here it is to get the ball rolling.
Comment #13
blainelang CreditAttribution: blainelang commentedThis approach looks real clever specially using backtrace as I had not ever considered using backtrace this way. I've recently been looking into how the file_usage table is used for another module 'filedepot' that I'm porting over to D7 and wanted to note there is a file_usage_delete() and it could be used instead of the direct SQL to delete the usage record in the above code.
Comment #14
sunSomething is a bit off here...
If media module allows to upload files via an editor, then the resulting files are still part of media, and thus, part of your site's "media library."
Hence, since a file in a library inherently defines usage on its own, media module should declare a usage.
Comment #15
Dave ReidNo. Wrong.
You have a node reference field. You create nodes meant to be referenced. Those nodes are data and do not yet have any "usage". Reference some of those nodes. Save. De-reference some of those nodes. Save.
Do those nodes still exist? YES.
Repeat with any other entity reference type EXCEPT FILES. We shouldn't have to track usage for something that is uploaded into the "Media library" (aka just file_managed) that is not yet associated with any content.
Comment #16
sunYou nicely repeated what I stated in #1399846-3: Make unused file 'cleanup' configurable already:
In that case, the file entity might have further data (or even fields) attached to it. It cannot be deleted if it's still in use.
The file usage API is essentially what makes up a managed file. If you don't want that "managed" behavior, then don't use managed files.
But it sounds pretty clear to me that you actually do want to have that managed file behavior (because you explicitly want to treat files as entities, and thus, attach further data to them, and also retain them, even in the case when they are deleted).
However, both use-cases formally are a file usage on their own. The file is in use when any additional data is attached to it, and the file is also used permanently if it should be retained forever (as in a "media library" concept).
Thus, the root cause is that no usage is registered for a file. Register it.
It's not the job of a low-level File API to care for the diversity of possible use-cases. All it knows about is: "This is a file I'm supposed to manage. Is it still used anywhere? No? Alright, then it's garbage and can be deleted."
Comment #17
rickvug CreditAttribution: rickvug commentedI think Sun may has a point here about Media needing to register files. The default delete functionality that we have now is exactly what is desired for use cases such as profile photos, where the upload should be deleted if the user changes their photo or cancels their account. These images probably shouldn't be shown in the Media library either. Perhaps what Media needs is configuration that defines which files it should track (marking as used by media). Files that are not tracked by Media shouldn't be shown in the browser and should be deleted when removed from a file field, as per now.
On the same token, WYSIWYG file usage should also be tracked. This would prevent problems where someone wants to remove a file from the media library and accidentally breaks node(s) where the file is embedded into the body.
How does this sound?
Comment #18
agentrickardBy sun's logic it sounds as if file_entity should be claiming ownership of files-as-entities, which would then get folded in to Drupal 8.
I think, however, that discussion is not related to this actual issue, which is that a file added to a node via WYSIWYG is, in fact, managed by that node and needs to be tracked as such.
Comment #19
aaron CreditAttribution: aaron commentedYes, I agree with that, agentrickard. sun's argument has its merits, but belongs more properly at #1239056: Remove Media at content edit page cause the file to be completely deleted from server/database?. It has no bearing on whether to mark a usage for an embedded media instance.
Comment #20
Codecaster-2 CreditAttribution: Codecaster-2 commentedIs this the feature I need if I want to know where was the image posted originally, so I can add a link in the image gallery that says something like "image originally posted at X"?
Is there any other way to link the images to the nodes they were attached to in the first place?
Thanks.
Comment #21
theunraveler CreditAttribution: theunraveler commentedUntil filters are context-aware, is this something that could be done on hook_entity_{insert/update/delete}? If this seems like a sound approach, I can write a patch.
Comment #22
agentrickardI think hook_field_attach_insert / update / delete, actually, which give you more detailed data about the field.
See http://api.drupal.org/api/drupal/modules!field!field.api.php/function/ho...
Comment #23
Dave ReidYes this is actually possible to do with hook_field_attach_insert/update. I have some code that I would like to eventually upload to this issue but I'm currently unable to do so because of client code release policy. If anyone wants to work on it until I can resolve it they should be more than welcome to do so, but I also don't want people to waste time on something that I've already kinda solved. :/
Comment #24
das-peter CreditAttribution: das-peter commentedAs soon as this is fixed I'd like to port this patch to the media module: #1193036-13: Firefox handling of Drag & Drop images: inserted as binary blobs in WYSIWYG editors.
At least if there are no objections.
Comment #25
theunraveler CreditAttribution: theunraveler commentedOK, here's an initial pass at it. A couple notes:
1. I am likely doing the field stuff wrong, particularly lines 57-68 in the patch. Any suggestions?
2. I'm a little concerned about the "media" namespace in the file_usage table. It seems like we could inadvertently wipe out other file_usage entries from elsewhere in the media module. Should this be something like "media_filter"?
Comment #26
theunraveler CreditAttribution: theunraveler commentedComment #28
theunraveler CreditAttribution: theunraveler commentedSorry, this patch should pass.
Comment #29
theunraveler CreditAttribution: theunraveler commentedWhoops, blew away the updated regex.
Comment #30
aaron CreditAttribution: aaron commentedI like that approach. Unfortunately, it looks as though it will not remove file usage when an entity embedding the media has been deleted.
Comment #31
aaron CreditAttribution: aaron commentedmaybe that is a simple matter of implementing hook_field_attach_delete().
Comment #32
theunraveler CreditAttribution: theunraveler commentedSeems like you could just implement
hook_entity_delete()
to remove all entries in the file_usage table. Does this seem good?Comment #33
theunraveler CreditAttribution: theunraveler commentedComment #34
theunraveler CreditAttribution: theunraveler commentedCleaned up the field stuff a bit. I'm sure it's still wrong.
Comment #35
Dave ReidComment #36
David_Rothstein CreditAttribution: David_Rothstein commentedRerolled with a few small cleanups. (Mainly what I did is break _media_filter_parse_from_fields() up into smaller helper functions, since I need to reuse this code in another patch.)
I think the field code looks pretty reasonable. I'm not sure if using field_get_items() is technically correct here (since it will use the current language that the field would happen to be displayed in, which can change) but in practice it might not matter much since the media tags are likely to be the same across different translations of the text.
Comment #37
aaron CreditAttribution: aaron commentedreroll
Comment #38
Jackinloadup CreditAttribution: Jackinloadup commentedTested patch in #37. Works as expected. Great work!
When adding the same file to the same node twice file usage reflects that.
I'm using the interface provided by #1286508-12: Provide a file usage report tab at file/%file/usage
Comment #39
Jackinloadup CreditAttribution: Jackinloadup commentedThis issue is affected by #1451316: Clean up wysiwyg-media.js
After applying images still work great but due to changes "applications" and I would assume other non image types will most likely not be tracked. Heads up to all you great peoples.
Comment #40
Jackinloadup CreditAttribution: Jackinloadup commentedThe attached patch works with #1451316: Clean up wysiwyg-media.js and corrects the issue I brought up in #39.
All new code in this patch is related to MEDIA_TOKEN_REGEX_ALT
Please review.
Comment #41
mrfelton CreditAttribution: mrfelton commentedSlight issue with this. I created a node and attached an image to an image field. The usage count at /file/1/usage (provided by #1286508: Provide a file usage report tab at file/%file/usage) showed the file was used once on my node. Then, I edited the same node and added the same file to the body through the WYSIWYG. I then had two entries for the same file at /file/1/usage both showing a count of 1. I then edited the same node again and added the same file yet again through the WYSIWYG. This time one of the file usage entries incremented by one.
So the end result is like in this screenshot.
I would have thought that it show just one entry with a count of 3. Essentially it appears to be counting usage of the file through the WYSIWYG separate from usage of the same file attached to a field on the same node.
Comment #42
Dave Reid@mrfelton: I would actually say that it's working as designed. We aren't exposing the 'Type' information properly. It should be returning 'Node' and 'Media' for the two different rows' first column. Or #1286508: Provide a file usage report tab at file/%file/usage needs to be fixed to group all of the same entity type + ID together.
Comment #43
mrfelton CreditAttribution: mrfelton commentedSo I have 'file' and 'media' for the one added to the image field (using the media widget), and 'media' and 'node' for the two added to the node body through wysiwyg.
Comment #44
David_Rothstein CreditAttribution: David_Rothstein commentedQuick and trivial reroll of #40 to get rid of some annoying PHP notices caused by this patch (due to the fact that field_get_items() can sometimes return FALSE rather than an empty array).
Comment #45
ParisLiakos CreditAttribution: ParisLiakos commentedI am not sure, where this got stuck?
@mrfelton is dave's answer enough to get this back to NR?
Comment #46
HyperGlide CreditAttribution: HyperGlide commented@mrfelton and @David_Rothstein any thoughts per #45.
Comment #47
bbinkovitz CreditAttribution: bbinkovitz commentedmedia-filter-file-usage-1238116-44.patch works for me.
mysql> select * from file_usage;
+-----+--------+------+----+-------+
| fid | module | type | id | count |
+-----+--------+------+----+-------+
| 1 | media | node | 3 | 1 |
| 2 | media | node | 4 | 1 |
+-----+--------+------+----+-------+
2 rows in set (0.00 sec)
Comment #48
bbinkovitz CreditAttribution: bbinkovitz commentedComment #49
arthurf CreditAttribution: arthurf commentedI tested the following with success with the patch on #44
* adding via wysiwyg to the body field
* adding multiple to the wysiwyg field
* node in publish and unpublished state
* deleting the node
What other considerations do we need to take into account?
Comment #50
agentrickardNot storing false positives on the regex?
Comment #51
tsvenson CreditAttribution: tsvenson commented@arthurf: Have you tested it with revisions?
AFAIK a file that is deleted from a file/image field (core) in later revisions is still being counted. So, we need to comply with how that works too.
Comment #52
bbinkovitz CreditAttribution: bbinkovitz commentedRemoving the file markup from the node without deleting the node removes the file from the usage table. This is good.
Only FIDs that correspond to existing files are added to file_usage. Also good.
Changing "type" from "media" to something else (I used "dave", "image", and "panda" as tests) does not prevent inclusion in the usage table. This seems bad? Here's a version of the patch that only includes if type="media".
Comment #53
arthurf CreditAttribution: arthurf commented@tsvenson ok, so from what I understand to deal with revisions I think we can't delete the usage here. The delete query that's done on each update should be removed from the patch. However we're going to need to query the db to see if this fid is already counted. We also need to be aware of how many times this fid is on this node to ensure that we can increment correctly. So probably we need to do something like:
Comment #54
arthurf CreditAttribution: arthurf commented@agentrickard I think bbinkovitz's patch on #52 should address the regex concern because it's checking the type of the insert code based on the string- it isn't improving the regex but it is using the data itself.
Comment #55
tsvenson CreditAttribution: tsvenson commented@arthurf: Yeah, its a tricky one this how it should be counted. If the current revision is the only revision, then the count should go down if a file is removed. But if there exists earlier revisions, they need to be parsed as well to check if a file was used in any of them.
I suppose the easies is to simply parse all revisions of an entity and build a file use table from that.
How is used files kept track on? Is it capable of knowing if a file is used in both a file field and the text are for example?
On tsvenson.com I have used the model to upload images to a file field and then embedded them in the text area. Thus, those files are used "twice" on the same node/revision.
Comment #56
arthurf CreditAttribution: arthurf commented@tevenson: from what I understand of revisions we have to keep track of the usage on revisions no matter what. So what I'm thinking is that we simply never decrement the usage of a file because of this. However, as you note, there maybe instances where there is only one revision in which case we should.
So I think then the code has to check to see if there are revisions of this particular field (is this simply version id == entity id?) and in this case we can decrement the file usage, otherwise we only increase the file usage count if the current usage is greater than what is returned by file_usage_list() which is what the proto code I posted above is intended to do.
Does that seem to address both cases?
Also, per your comment about file upload with wysiwyg, the usage table tracks these separately based on the module that is using the file, so from my perspective that is correct. Are you thinking that the usage should only be one in this case?
Comment #57
agentrickardIntroducing parsing of all revisions is, IMO, waaaaaaay out of scope for this patch.
If you need that behavior, use File Lock.
Comment #58
arthurf CreditAttribution: arthurf commented@agentrickard - is it sufficient to simply not decrement file usage based on the current entity save operation? If I understand things correctly this should keep the file usage for previous revisions even if the file is removed from the field for this current revision. The only time we would need to increment the file usage is if the current file usage for file (ie: added more times to this field) is greater than the current file usage count. I think this will protect revisions which use this file while handling a file added more than once to a field. I think.
Comment #59
agentrickardGiven that the {file_usage} table is used for garbage collection, I would argue that any non-binary value in that table for an nid is meaningless.
Comment #60
arthurf CreditAttribution: arthurf commented@agentrickard- so are you in agreement that we should not decrement file_usage unless the current revision is the only revision of this entity? I'm a bit unclear on how you're thinking we should proceed.
Comment #61
tsvenson CreditAttribution: tsvenson commented@agentrickard: The problem with getting this right is that the file usage table and particularly the "garbage collection" was implemented without proper analysis of the consequences. In my view it has caused a lot more problems than it solved.
A massive amount of valuable resources has now been spent on compensating for that behaviour and coming up with methods, such as this one, to try and make sure that Drupal doesn't decide that a file is not relevant for the site any longer. Promptly delete it without any warning whatsoever as a result.
Based on that, I don't actually agree that parsing older revisions is out of the scope here. Simply because the goal is to implement a correct counting of file usage in text area fields. The only way, as I see it with my limited knowledge, to do that, and comply with how the file/image fields work, is to parse the various versions of the content that exists.
Comment #62
arthurf CreditAttribution: arthurf commentedFrom what I can tell the patch in #52 will correctly increment the current usage of files on this entity. The question we need to figure out is how and how much to deal with revisions.
I am proposing that we do not need to worry about revisions so long as we capture the initial usage. Once a revision exists, the file's usage is captured. I think we can safely never decrease the file usage quantity for this entity, we only have to increase it if this file is added a subsequent time (ie: two copies in the text) to this entity.
There maybe an exception to this if the current entity revision is the only revision- in this case we might have to decrement the file usage if the file is removed. Somebody with good knowledge about versioning should chime in here.
The only way to do this properly that I see is to add a version id to the file_usage table. I don't think that is going to happen and more importantly it would also require significant amount of work up stream. I think for now we're better off writing a wrapper around file_usage_add() which takes a version id argument. This might be appropriate for file entity- no idea, just throwing the idea out there.
@agentrickard I take your point that the usage can be thought of as binary as the file is either in use on this particular entity or it isn't. While I'm not sure if that's the right approach, I don't think your point would alter what I'm proposing above significantly- the code would still be basically the same.
I think we're actually pretty close on this- the code is mostly here in this thread, it's really just an issue of figuring out if we have correctly captured the use cases.
Comment #63
John Pitcairn CreditAttribution: John Pitcairn commentedWhat happens when:
1 - A file is added to an existing item which already has revisions.
2 - The file is removed in a later revision.
3 - Earlier revisions containing the file are manually or automatically deleted, but there is still more than one revision.
If you do not worry about revision-tracking, recorded usage will be incorrect, since the file is no longer used anywhere.
I can see this being a followup issue however.
Comment #64
arthurf CreditAttribution: arthurf commented@John Pitcairn I think you might have misunderstood me. I'm suggesting that we do not need to parse through versions of entities so long as we do not decrement the file usage count. There is one possible exception to this if the current revision is the only revision.
As per your third point, we can't address that without tracking vids in the file usage table and there would be a significant amount of code that would need to be written to support this for other use cases as well.
Comment #65
agentrickardFor the use case in #63, just use File Lock. That's what the module is for.
These discussions (#61 and #63) are out of scope of WYSIWYG tracking. Revision tracking in {file_usage} is a separate issue.
Comment #66
arthurf CreditAttribution: arthurf commented@agentrickard - I disagree. For one we don't need a second module to handle this. Secondly the proposed behavior is what is expected of modules that implement file_usage- if there is agreement the proposed functionality correctly handles revisions. Unless this is not the case I don't think yet another module is the answer.
Comment #67
agentrickardIncrement / decrement if you must, but do so on the current revision. Don't try to parse the entire version history. That way lies madness.
Comment #68
arthurf CreditAttribution: arthurf commented@agentrickard #67 does not address the solution that I've proposed.
Comment #69
agentrickardIdeally, we would be incrementing/decrementing the file usage based on the revision being saved. If removing a file turns out to be the only 'version' using the file, and file_usage drops to 0, that sounds like proper behavior.
But we would only be parsing the current change, not the entire revision history.
Comment #70
arthurf CreditAttribution: arthurf commentedThe following I think does what I've proposed in the thread above. This patch includes #42, #44, #52 from above. I've tested under the following conditions:
* content type revisions off: image add, image delete, image add multiple, image delete multiple.
* content type revisions on: image add, image add multiple, image delete, image delete multiple.
In each case I'm seeing what I think should be the correct behavior.
Comment #71
arthurf CreditAttribution: arthurf commentedHere is a revised version of the above patch which I believe should handle revision deletion correctly. Here's why I'm think we maybe able to do this safely:
* when a revision is created we count all the files in the field and increment the file_usage value
* the value in file_usage reflects the total number of file uses across all of this entity's revisions
* when a revision is deleted we can count the number of file uses in that revision the same way we do for saving them
* we can delete this file count from the file_usage table
I think this technique will actually reflect the proper number of files in use and allow for deletion of versions without leaving cruft in the file_usage table.
I've tested this a bit myself but it would be good to hear others thoughts on this.
Comment #72
tsvenson CreditAttribution: tsvenson commentedAfter a short IRC discussion with @arthurf the way this patch works as outlined in #71 seem to cover all needs. Great work all.
I haven't had the ability to actually test the patch so can't set it to RTBC (yet).
Comment #73
agentrickardShould this last bit be an IF, to ensure we pass a proper $file object:
To:
Comment #74
arthurf CreditAttribution: arthurf commentedGood point, re-rolled.
Comment #75
bbinkovitz CreditAttribution: bbinkovitz commentedSo, am I understanding correctly what this latest patch does? I am trying to write some tests for this.
* Upon save, if a new revision is not being created and a file is being removed from the body of the node, the usage table will decrement immediately.
* Upon save, if a new revision is being created and a file is being removed from the body of the node, the usage table will not decrement.
Is that correct?
Comment #76
arthurf CreditAttribution: arthurf commented@bbinkovitz both of those statements are correct.
Comment #77
HyperGlide CreditAttribution: HyperGlide commented@arthurf do you feel this patch is ready for testing on a wider scale? I
Great work so far!
Comment #78
arthurf CreditAttribution: arthurf commented@HyperGlide - yes, it would be great to have more testing under different circumstances. I tested what were the obvious cases to me but the more eyes the better.
Comment #79
bbinkovitz CreditAttribution: bbinkovitz commented@arthurf, Is it your intention for the following behavior to occur?
In other words, the usages of a given file on a given node = the sum total of all usages on all revisions.
The incrementing in step 2 was surprising to me, although I'm not sure what behavior I expected. I just wanted to make sure I understood your intention correctly.
Comment #80
bbinkovitz CreditAttribution: bbinkovitz commentedI've attached patch containing a test that ascertains four things:
Comment #82
bbinkovitz CreditAttribution: bbinkovitz commentedD'oh. Re-rolled. Includes arthurf's patch and the tests for it. And has the right issue number now too.
Comment #83
bbinkovitz CreditAttribution: bbinkovitz commentedComment #85
bbinkovitz CreditAttribution: bbinkovitz commentedThird time's the charm.
Comment #86
arthurf CreditAttribution: arthurf commented@bbinkovitz- I think this is a good start. There are some code style issues that need to be addressed and some additional tests particularly for revisions I'd like to see:
Comment #87
bbinkovitz CreditAttribution: bbinkovitz commentedIt appears that deleting the node doesn't set usages to zero.
Comment #88
arthurf CreditAttribution: arthurf commented@bbinkovitz - I just tested the patch from #74 with nodes with and without revisions and every case that I've tested with a node delete it correctly deletes the entries from file_usage. Is your comment specific to the tests you've written?
Comment #89
arthurf CreditAttribution: arthurf commentedbbinkovitz identified a case where the patch on #74 fails- when a node which is revisioned is saved not as a new revision the usage count was still incremented. In this case the usage has to be adjusted based on the previous revision compared to the current.
Comment #91
arthurf CreditAttribution: arthurf commented#89: file_usage_count_revisions_1.diff queued for re-testing.
Comment #93
arthurf CreditAttribution: arthurf commentedRerolled patch
Comment #95
arthurf CreditAttribution: arthurf commentedOne more go
Comment #96
bbinkovitz CreditAttribution: bbinkovitz commentedArgh. Ok, well just because I already spent a bunch of time on this, here's my (probably way less efficient) solution to this problem.
Also, arthurf, your patch works great. However, when I use it I get errors such as:
Notice: Undefined offset: 3 in _media_filter_add_file_usage_from_fields() (line 77 of /sites/all/modules/media/includes/media.filter.inc)
Comment #97
bbinkovitz CreditAttribution: bbinkovitz commentedHere's arthurf's latest patch, rerolled with the same tests that I wrote for mine. I also corrected what I think may have been a small typo on line 202 that looked like it might have been throwing some of the errors.
Comment #98
ParisLiakos CreditAttribution: ParisLiakos commentedseems test file is missing from last patch
Comment #99
bbinkovitz CreditAttribution: bbinkovitz commentedSo it is. Here's another go at actually including the tests.
Comment #100
bbinkovitz CreditAttribution: bbinkovitz commentedNew version of the previous patch to fix an undefined index error upon decrementing file usages within the same revision number.
Comment #101
bbinkovitz CreditAttribution: bbinkovitz commentedSomehow the last comment posted twice. Sorry about that.
Comment #102
wizonesolutionsDid not work for my test case:
1. Edited a node.
2. Inserted an existing PDF (Document file type) with fid 285 into the node. Prior to insertion, file_usage showed its count at 1.
3. Saved.
4. Checked file_usage again. Count was still 1.
Generic file formatter. The filter needs to look at the data attribute apparently because this formatter doesn't use the Media-style filter tags.
Comment #103
bbinkovitz CreditAttribution: bbinkovitz commentedI'm having trouble even testing this because I'm running into #1743040: "You have not selected anything!" error message, when a selection has clearly been made. It seems problematic to try to fix this because then my patch would rely on another patch. How to proceed?
Comment #104
arthurf CreditAttribution: arthurf commented@wizonesolutions
I've tested this and have not been able to confirm your experience. Here's what I did:
I created a new node and inserted a pdf into the body area with the wyswig. When I disable the the wysiwyg I see the following markup:
When saved, file usage reports expected results:
I edit the node again, copying the markup above so that there are two instances of the same PDF markup in this body area. Code looks like:
When I save the node, I see:
This is to be expected because I did not create a new revision of the node. Now I edit the node again and do not change anything other than to check the "create a new revision" box. I see:
These are the expected results.
@wizonesolutions it seems like either the patch didn't apply correctly or your markup is somehow wrong. Can you confirm that the markup that you're using looks something like the above? Also what view mode are you choosing to insert the file- in this example I'm using the default.
Comment #105
ParisLiakos CreditAttribution: ParisLiakos commented100+ comments..we need an issue summary here
Comment #106
arthurf CreditAttribution: arthurf commentedSummary:
Files inserted by the WYSIWYG need to create a record in the Drupal {file_usage} table. Patch from #101 provides code that handles the insertion of images and properly accounts for node revisions as well as providing tests for various use cases raised in this issue.
To review this patch, please run the tests included with it with an eye for the conditions identified in #86, #89, #102, and #104
Further consideration may be needed for non-node entities.
Comment #106.0
wizonesolutionsUpdate summary.
Comment #107
wizonesolutionsI put this into the issue summary.
arthurf: I will look at your notes and respond in a bit.
Comment #108
bbinkovitz CreditAttribution: bbinkovitz commentedI am not able to reproduce the problem either.
I am curious about this: "Generic file formatter. The filter needs to look at the data attribute apparently because this formatter doesn't use the Media-style filter tags."
The generic file formatter is a field display, not a view mode. If I'm understanding this issue correctly, field display settings are applied separately, and shouldn't affect file usage tracking. Am I misunderstanding something here? Picking a file formatter directly instead of a view mode is not standard behavior and is not supported.
Comment #109
Dave ReidI'm curious as to why _media_filter_add_file_usage_from_fields() is so complex and why we can't do something more like the following:
If we can at all avoid having file_usage_delete() that would be good since it would prevent files from getting deleted unnecessarily.
Comment #110
arthurf CreditAttribution: arthurf commented@Dave Reid - The reason why I changed the code in the original patch is to support revision handling, I might be missing something but it seems to me that if you delete from file usage you can't keep track of the total number of files that are used across all revisions. That was my thinking anyway.
Comment #111
Dave ReidThen how can we account for existing nodes with lots of revisions that have embedded files before the user updates to a version with this patch?
Comment #112
arthurf CreditAttribution: arthurf commented@Dave Reid - I tried to explain my thinking in #71 . Though that code is wrong, the fixes and tests further down the thread seem to confirm that this does work.
Comment #113
bbinkovitz CreditAttribution: bbinkovitz commented@Dave Reid depending on how many revisions exist, this could be pretty heavy. You'd be parsing the entire revision history on every save, which means essentially loading the full node as many times as there are revisions.
It also only works for nodes, until other entities start supporting revisions, so everything else would have to continue to be parsed the same way.
A better option would be a button that can be accessed from Media's administration page or some such, that, when pressed, searches through all nodes whose last update was within a selected date range, and parses through their revision histories, rewriting their file_usage table rows as needed. That way the user will understand the performance toll involved and can schedule that maintenance at their convenience. It also makes sense to bulk update everything if we're going to be updating anything.
Comment #114
wizonesolutionsThere are already other entities with revision support - Field Collection Item (the actual entity that gets created and associated with a Field Collection field), for example. Entity API supports revisions, so it should be assumed that any entity might support revisions. There are functions to check for this IIRC.
Comment #115
arthurf CreditAttribution: arthurf commentedThe method for finding media data in fields is handled in media_filter_parse_from_fields() and _media_filter_fields_with_text_filtering(). _media_filter_fields_with_text_filtering will get all the fields from an entity and parse each field that has text-filtering enabled. These functions do not have any node specific functionality. The major concern is the conditional _media_filter_add_file_usage_from_fields() which attempts to figure out if the entity being saved is a new revision or not.
My hunch is that the patch makes some node specific assumptions about the data being passed in .
@wizonesolutions can you test this patch with a field collection?
Comment #116
arthurf CreditAttribution: arthurf commentedThis updates the patch from #101. I wanted to more explicitly test for file deletion against file usage. I've done some abstraction of the tests and some code style cleanup.
Comment #117
arthurf CreditAttribution: arthurf commentedSorry, last diff didn't include the other changes, reroll.
Comment #118
aaron CreditAttribution: aaron commentedI have tested this against all the conditions listed in #86, #89, #102, and #104. I have also tested it with field collections. Everything checks out fine from my end.
Comment #119
ParisLiakos CreditAttribution: ParisLiakos commentedno need for $nid variable.
we should use LANGUAGE_NONE here instead of und.This happens in several places
Sending to bot for testing
Comment #120
ParisLiakos CreditAttribution: ParisLiakos commentedI went through the entire thread and patch..i cant find anything else besides some docs fixes, but i wont hold this patch for this..also we have really good test coverage here..congrats all and good job!:)
I have committed this.
We should consider backporting it to 1.x i think
Comment #121
HyperGlide CreditAttribution: HyperGlide commentedMuch needed patch. ++
With all the recent commits should there be a new release?
Comment #122
ParisLiakos CreditAttribution: ParisLiakos commentedsee #1889134: [meta] Toward a Media 2.0-alpha1 release
Comment #123
gmclelland CreditAttribution: gmclelland commenteddoes this mean we don't need to use http://drupal.org/project/file_lock anymore?
Comment #124
arthurf CreditAttribution: arthurf commented@gmclelland yes- media should record file usage properly now so files won't be deleted if they are in use in a text field.
Comment #125
gmclelland CreditAttribution: gmclelland commented@arthurf - thank you for the clarification
Comment #126
ParisLiakos CreditAttribution: ParisLiakos commented#1937864: Unable to remove missing image from body content
Comment #127
aaron CreditAttribution: aaron commentedHere you go, I have re-rolled this patch for version 7.x-1.x. This just needs to be manually tested, which I have not done yet.
Comment #128
ParisLiakos CreditAttribution: ParisLiakos commentedComment #129
treksler CreditAttribution: treksler commentedthe patch in #127 adds a reference to test/media.file.usage.test in media.info but omits the actual tests
Comment #130
aaron CreditAttribution: aaron commentedOops! Sorry about that.
Comment #131
treksler CreditAttribution: treksler commentedok the one in #130 refers to test/media.file.usage.test but puts the file in tests/media.file.usage.test
once, I move the file to test/ I get: Class 'MediaTestHelper' not found.
it seems to me that perhaps future versions of media module have a wrapper for DrupalWebTestCase?
is that necessary here? I changed it to DrupalWebTestCase, and all tests passed.
sorry, I am new to Drupal 7, I would submit a patch, but I have to read up on how to do that properly first.
Comment #132
aaron CreditAttribution: aaron commentedI'm sorry about the misnamed directory. This patch fixes that. I don't know offhand about the wrapper for the DrupalWebTestCase but I will let the test bot take care of checking that out.
Comment #134
treksler CreditAttribution: treksler commentedI took the last patch and swapped out MediaTestHelper with DrupalWebTestCase
Comment #136
treksler CreditAttribution: treksler commentedlooks like there were some exceptions in creating the test files
Undefined property: stdClass::$filemime Notice media.types.inc 196 media_is_type()
I am pretty sure that I did not get these exceptions last time, but anyway..
I have to admit that since I am not intimately familiar with Drupal 7 or the Media module codebase yet, I have no idea what is going on there.
BUT, upon taking a closer look, it seems that the other tests that are already there are setting up the test files in a round about way..
Eg. testQueryMedia() in media.entity.test calls:
$file = file_uri_to_object($files[0]->uri);
instead of:
$file = $files[0];
no Idea why, but making that change works.
maybe it fills in a default missing media type? just a guess.
I am uploading a patch that should work.
Comment #137
treksler CreditAttribution: treksler commentedComment #138
Dave ReidComment #139
ParisLiakos CreditAttribution: ParisLiakos commentedthis is already in 2.x
this issue is open for backport only
Comment #140
Dave ReidBlargh, thanks ParisLiakos.
Comment #141
treksler CreditAttribution: treksler commentedthe only review that was needed was to verify whether it was OK for the test to extend DrupalWebTestCase directly (instead of MediaTestHelper) since the tests pass, I think the review is unneccessary. correct me if I am wrong.
Comment #142
treksler CreditAttribution: treksler commentedthe odds of this ever getting committed are what exactly? should I bother testing patches in the future?
Comment #143
ParisLiakos CreditAttribution: ParisLiakos commented@treksler given this is a major bug, i would say it is gonna committed rather soon and be in next 1.x version
thanks for testing!
A maintainer will care for it once he finds time
Comment #144
tobiasbThis is a patch for the current version "2.0-unstable7". Useful for drush make or something.
Comment #145
aaron CreditAttribution: aaron commentedI've committed this. Thanks for the hard work everyone!
Comment #146
David_Rothstein CreditAttribution: David_Rothstein commentedIt looks like the test file (test/media.file.usage.test) was left out of the commit?
Comment #147
Devin Carlson CreditAttribution: Devin Carlson commentedA patch to add the missing test file (just checking that it's still passing).
Comment #148
Devin Carlson CreditAttribution: Devin Carlson commentedCommitted #147 to 7.x-1.x.
Comment #149.0
(not verified) CreditAttribution: commentedUpdated issue summary.
Comment #150
PascalAnimateur CreditAttribution: PascalAnimateur commentedI can only get the file usage to work on image files, not on document / audio / video. Is this normal?
I'm using the following:
media 7.x-2.0-alpha4+26-dev
file_entity 7.x-2.0-beta1+22-dev
wysiwyg 2.x-dev
CKEditor 4.4.7
Even with CKEditor's ACF disabled, the media token [[...]] isn't stored in the database for audio/video/document.
Am I missing something?
Update: It turns out the JS function replacePlaceholderWithToken didn't recognize the needed
<audio> / <video> / <iframe>
tags. The problem is around line 97 of media_wysiwyg.filter.js, where the code expects the media-element class to be found on either img or span tag. I managed to fix this by adding the following lines after the regex for span:Comment #151
PascalAnimateur CreditAttribution: PascalAnimateur commentedComment #152
cclafferty CreditAttribution: cclafferty commentedThanks for this PascalAnimateur your solution almost fixed this problem for me.
I found two problems with the regex in the replacePlaceholderWithToken code:
</span>
needs it's forward slash escaped like this<\/span>
[\\S\\s]+?
needed to be replaced with[\\S\\s]*?
Here is my final working combined combined code:
Comment #153
PascalAnimateur CreditAttribution: PascalAnimateur commentedRe-rolled the patch with information from #152.
Thanks @cclafferty !
Comment #154
rooby CreditAttribution: rooby commentedDoing
[\\S\\s]
seems unusual. It means any non-whitespace or any whitespace character, which means any character, so why not just do.*?
?Comment #155
cclafferty CreditAttribution: cclafferty commented@rooby I don't think
.
matches line breaks whereas[\\S\\s]
does.I was having a play around with the regex here if you want to have a look: https://regex101.com/r/bY2kY1/1
Comment #156
rooby CreditAttribution: rooby commentedAh very true. My mistake.
Comment #157
rdentry CreditAttribution: rdentry commentedI'm still having trouble with the File Usage not updating for image added via wysiwyg. I am currently using 7.x-2.0-rc3+3-dev . Does anyone know if this patch has since been rolled into 7.x-2.x-dev? I see that the patch proposed in #101 is for 7.x-1.x-dev.
Reproducing my issue is as follows:
Any help with this or guidance to finding the solution would be much appreciated.
Comment #158
darrell_ulm CreditAttribution: darrell_ulm as a volunteer commentedClosed issue, but am seeing same as #157 https://www.drupal.org/node/1268116#comment-11885837
Comment #159
joseph.olstad@darrell_ulm , open a new issue with your findings and link this one to it . We won't re-open this closed issue but feel free to make a new issue.
Comment #160
darrell_ulm CreditAttribution: darrell_ulm as a volunteer commentedThank you for that. Solved now by looking at config more.