When harvesting files from a CCK field, if the node already has media_mover files attached, the files are harvested multiple times. This creates extra file system cruft and wastes processing. I suspect the cause is the LEFT JOIN on the media_mover_files table in the harvest query. To fix this, I've added SELECT DISTINCT to the query.

File | Size | Author
fix-duplicate-cck-harvest.patch | 718 bytes | jethro

Comments

jethro:

Status: Active » Needs review

I think maybe this is the status I should have used when I filed this issue. I've been using this fix for months and it's resolved the issue for me.

arthurf:

I'm not quite sure this is happening. From what I understand of your description, you have a node that has a cck file field on it. I'm unclear as to why you'd be getting a second file with the same file path on that node? What does your configuration look like that is producing this?

jethro:

I had this happen on a variety of configurations, all of which harvested files from a CCK file field. It would happen if a configuration had already been run for the node and there were already mm files associated with it, for example if one configuration processed video and another created thumbnails.

The file would be harvested one extra time for every file that was attached. This came from the db_query on line 332 of the function mm_cck_harvest in mm_cck.module. Because of the left joins on the node, files and media_mover_files tables, the selected rows are duplicated and the same file path is returned multiple times. Adding DISTINCT after SELECT in the query, as in the patch, stopped the same file from being harvested and processed multiple times.
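
To illustrate the row duplication described above, here is a minimal sketch using Python's built-in sqlite3 module. It is not the actual query from mm_cck.module: the three tables are heavily simplified stand-ins (the real Drupal 6, CCK and Media Mover schemas differ), and the function name harvest_paths and the sample file path are made up for the example. It only shows how an extra left join can return the same file path more than once, and how DISTINCT collapses the repeats.

```python
import sqlite3

# Hypothetical, simplified schema: NOT the real Drupal 6 / Media Mover tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (nid INTEGER PRIMARY KEY);
CREATE TABLE files (fid INTEGER PRIMARY KEY, nid INTEGER, filepath TEXT);
CREATE TABLE media_mover_files (mmfid INTEGER PRIMARY KEY, nid INTEGER, cid INTEGER);

-- One node with one source file attached through the file field.
INSERT INTO node VALUES (1);
INSERT INTO files VALUES (10, 1, 'sites/default/files/video.mov');

-- Two configurations have already produced media_mover files for that node.
INSERT INTO media_mover_files VALUES (100, 1, 1);
INSERT INTO media_mover_files VALUES (101, 1, 2);
""")


def harvest_paths(distinct):
    """Return the file paths selected for harvesting (illustrative query only)."""
    keyword = "DISTINCT" if distinct else ""
    sql = f"""
        SELECT {keyword} f.filepath
        FROM node n
        LEFT JOIN files f ON f.nid = n.nid
        LEFT JOIN media_mover_files m ON m.nid = n.nid
    """
    return [row[0] for row in conn.execute(sql)]


# Without DISTINCT the path comes back once per matching media_mover_files row.
print(harvest_paths(distinct=False))
# With DISTINCT each path comes back once.
print(harvest_paths(distinct=True))
```

In this toy data the join with media_mover_files matches two rows for the node, so the single source path is selected twice unless DISTINCT is used. The exact number of repeats in the real module depends on its actual query and data.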

Thanks

arthurf:

Ah, now I get what you're having issues with. Your fix is committed to 6.1.x and 6.2.x; please give it a test when you have a chance.

kobnim:

Version: 6.x-1.x-dev » 6.x-1.0-beta9

fyi ... This patch did not make it into 6.x-1.0-beta9.

arthurf:

@kobnim - have you been testing the 6.x-1.x branch? If you think it's ready to cut another release, I'm glad to. Obviously I have not been paying this much attention over the last months (year).

kobnim:

Arthur,
No, I have been working with 6.x-1.0-beta9. I will try switching to 6.x-1.x and let you know whether it seems stable.

kobnim:

Arthur,
Just wanted to let you know that I have been using 6.x-1.x-dev for the past six weeks, and it seems stable.
- Mindy