Closed (fixed)
Project:
Drupal.org infrastructure
Component:
Other
Priority:
Major
Category:
Task
Assigned:
Issue tags:
Reporter:
Created:
22 Apr 2012 at 22:28 UTC
Updated:
4 Jan 2014 at 02:05 UTC
Comments
Comment #1
drumm
tag
Comment #2
drumm
For future reference, the following tables reference fids. A lot of the duplicates may be orphans.
Comment #3
drumm
tag
Comment #4
drumm
Comment #5
drumm
Attached are the 43836 rows from the files table that I'm deleting. They share filenames with other rows and are not referenced from the fields listed in #2.
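For anyone repeating this kind of cleanup, the selection described in this comment might look roughly like the sketch below. It is not the query that was actually run. It assumes a `files` table with `fid` and `filepath` columns, and it stands in a single hypothetical `file_refs` table for the referencing tables from #2, which are not reproduced here. SQLite is used only so the example runs on its own.

```python
import sqlite3

def unreferenced_duplicate_fids(conn):
    """Return fids whose filepath is shared with another row
    and which nothing in file_refs (hypothetical table) points at."""
    query = """
        SELECT f.fid
        FROM files AS f
        WHERE f.filepath IN (
            SELECT filepath FROM files
            GROUP BY filepath HAVING COUNT(*) > 1
        )
        AND NOT EXISTS (
            SELECT 1 FROM file_refs AS r WHERE r.fid = f.fid
        )
        ORDER BY f.fid
    """
    return [row[0] for row in conn.execute(query)]

# Tiny made-up data set: fids 1 and 2 share a path, only fid 1 is referenced.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (fid INTEGER PRIMARY KEY, filepath TEXT)")
conn.execute("CREATE TABLE file_refs (fid INTEGER)")
conn.executemany("INSERT INTO files VALUES (?, ?)", [
    (1, "files/images/a.png"),
    (2, "files/images/a.png"),
    (3, "files/images/b.png"),
])
conn.executemany("INSERT INTO file_refs VALUES (?)", [(1,), (3,)])

candidates = unreferenced_duplicate_fids(conn)
```

Note that if every copy of a duplicated path is unreferenced, this selects all of them, so the list should be reviewed before anything is deleted.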
Comment #6
drumm
That took care of all duplicated files in files/images/*. Only 85 duplicated filenames are left. Most have 2, or sometimes 3-4, rows, but 'files/issues/' has 93.
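A count like the one in this comment can be produced with a GROUP BY over the file path. The sketch below is only an illustration under assumptions: a `files` table with a `filepath` column, queried through SQLite so it is self-contained. The function name `duplicated_filepaths` is made up for this example.

```python
import sqlite3

def duplicated_filepaths(conn):
    """Return (filepath, row_count) for every filepath that
    appears on more than one row, most-duplicated first."""
    query = """
        SELECT filepath, COUNT(*) AS n
        FROM files
        GROUP BY filepath
        HAVING COUNT(*) > 1
        ORDER BY n DESC, filepath
    """
    return conn.execute(query).fetchall()
```

The length of the returned list is the number of duplicated paths still to resolve, and the second element of each tuple shows which paths, such as 'files/issues/', have unusually many rows.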
Comment #7
drumm
The 15 with filepath like 'files/releases/%' all had bad release nodes creating the duplicates, such as one node each for the CVS and Git tags. I deleted the bad ones, and we are down to 70 duplicated filepaths.
Comment #8
drumm
I removed the duplicated files which had the wrong size. We just don't seem to have those files, so they are essentially bad uploads. In many cases, people had already re-uploaded them. All the issues were quite old; I even saw one of mine from 2004.
That gets us down to 31. I'll try going by timestamps next. filepath = 'files/issues/' has 91 rows.
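One way to spot rows with the wrong size is to compare the size recorded in the database with the file on disk. The sketch below assumes that is what "wrong size" means here, which the comment does not spell out. The function name, the `(fid, filepath, filesize)` row shape, and the `root` parameter are all hypothetical.

```python
import os

def rows_with_wrong_size(rows, root="."):
    """Given (fid, filepath, filesize) tuples, return the fids whose
    file is missing under root or whose on-disk size differs from
    the recorded filesize."""
    bad = []
    for fid, filepath, filesize in rows:
        full_path = os.path.join(root, filepath)
        # A missing file, or a path that is only a directory, counts as bad.
        if not os.path.isfile(full_path):
            bad.append(fid)
        elif os.path.getsize(full_path) != filesize:
            bad.append(fid)
    return bad
```

A row whose filepath is just a directory, like 'files/issues/', is flagged by this check as well, since no regular file exists at that path.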
Comment #9
senpai commented
Tagging for sprint 3.
Comment #10
drumm
Done!