via http://drupal.org/node/67624 killes and i found the bug that was causing releases to not be saved and published for a handful of projects. the tarballs were getting made, but the release directory scan was failing to associate them with their project. the reason is that there were stale records in the {project_releases} table that had the same filename (e.g. "files/projects/version-cvs.tar.gz") but had a different nid (e.g. 57991 instead of 64590). because of the duplicate key, the insert failed. i'm going to fix the watchdog errors to regularly report these kinds of problems in the future. however, the real bug is that these stale records were left in the table in the first place.

whenever we delete a project node, we *must* also delete all records with the same nid from the {project_releases} table.

while we're at it, we should probably delete all records from {project_comments} and {project_subscriptions} with the same nid, too. (we're already deleting from {project_issues}).

Comments

nedjo’s picture

Several months ago we had a perplexing issue about duplicate release records being generated for projects, e.g., we would get two releases both with 4.6. We never determined the source. I wonder if the problem you're solving here was it.

dww’s picture

could be. i noticed that code and was wondering what it was about. ;) so, do you agree with this patch? is it RTBC?

nedjo’s picture

Status: Needs review » Reviewed & tested by the community

Since there's little documentation or organizational memory for project.module, it's often hard to figure out if something was done a certain way for a reason or through an oversight. In this case I can't think of a reason why we shouldn't be deleting this stuff. So RTBC I guess, unless someone pipes up with a reason.

AjK’s picture

Tested and RTBC.

regards
--AjK

dww’s picture

Status: Reviewed & tested by the community » Active

committed to head, 4.7 and 4.6.

the drupal.org DB still has stale records, so i created http://drupal.org/node/67934.

maybe i should make a project_update_N() hook to delete stale records from these tables in case any other sites have this problem? i guess that'd be an easy way to resolve #67934, too. ;)

dww’s picture

Status: Active » Needs review
StatusFileSize
new1.1 KB

attached patch for project_update_3() works fine on a local test site running MySQL. i have no idea if this is valid in pgsql or not, and i don't have access to my forthcoming pgsql test site yet. :( this obviously needs serious review before i commit it and have killes or dries run it. ;)

also, it'd be nice to propagate to the results page how many rows were deleted with each query. anyone know how to do that?

thanks,
-derek

dww’s picture

StatusFileSize
new1.43 KB

this might be kind of a hack, but it works to display how many rows we're deleting with each query... maybe update.php should do this automatically for update_sql()? or update_sql() should take an optional arg to indicate you want to see this? i guess i should file that as a separate issue against core. ;)

dww’s picture

for the interested reader, on my test site, this produces output like this:

<b>Update #3</b>
<ul>
<li>DELETE p FROM {project_releases} p LEFT JOIN {node} n ON p.nid = n.nid WHERE n.nid IS NULL
<li>5 rows deleted
<li>DELETE p FROM {project_comments} p LEFT JOIN {node} n ON p.nid = n.nid WHERE n.nid IS NULL
<li>0 rows deleted
<li>DELETE p FROM {project_subscriptions} p LEFT JOIN {node} n ON p.nid = n.nid WHERE n.nid IS NULL
<li>4 rows deleted
</ul>
dww’s picture

i'm curious how much of a problem this really is on drupal.org. can anyone run the read-only versions manually?

SELECT p.nid FROM project_releases p LEFT JOIN node n ON p.nid = n.nid WHERE n.nid IS NULL;
SELECT p.nid FROM project_comments p LEFT JOIN node n ON p.nid = n.nid WHERE n.nid IS NULL;
SELECT p.nid FROM project_subscriptions p LEFT JOIN node n ON p.nid = n.nid WHERE n.nid IS NULL;
killes@www.drop.org’s picture

mysql> SELECT count(p.nid) FROM project_releases p LEFT JOIN node n ON p.nid = n.nid WHERE n.nid IS NULL;
+--------------+
| count(p.nid) |
+--------------+
| 70 |
+--------------+
1 row in set (0.01 sec)

mysql> SELECT count(p.nid) FROM project_comments p LEFT JOIN node n ON p.nid = n.nid WHERE n.nid IS NULL;
+--------------+
| count(p.nid) |
+--------------+
| 73 |
+--------------+
1 row in set (0.18 sec)

mysql> SELECT count(p.nid) FROM project_subscriptions p LEFT JOIN node n ON p.nid = n.nid WHERE n.nid IS NULL;
+--------------+
| count(p.nid) |
+--------------+
| 3139 |
+--------------+
1 row in set (0.12 sec)

dww’s picture

Status: Needs review » Needs work

good news: i finally have a pgsql test site setup on my laptop. :) bad news: #7 is definitely broken in pgsql. :(
i'll see if i can find a simple way to accomplish the same thing in pgsql.

dww’s picture

Status: Needs work » Needs review
StatusFileSize
new1.16 KB

woo hoo. in IRC, halkeye clued me into a syntax that works on both mysql and pgsql for this. ;) new patch that works everywhere. it also is nice enough to show you how many rows were deleted. i think this is RTBC, and RTB run on d.o. however, someone should review, since this is clearly a very important update to get right. ;)

thanks,
-derek

dww’s picture

Status: Needs review » Needs work

argh, apparently subselects are only in mysql 4.1.x, and d.o is still running 4.0.

dww’s picture

Status: Needs work » Needs review
StatusFileSize
new1.84 KB

new version that figures out if we can use subselects. if so, we do the simple thing. if not, we do a separate LEFT JOIN to find all the nids we want to delete from each table, then do a DELETE with a WHERE clause constructed by implode()'ing the array. ok, *this* should hopefully do the trick. ;)

dww’s picture

Status: Needs review » Fixed

killes said this was RTBC in IRC. applied to HEAD and 4.7. phew! glad to have this complete. ;)

Anonymous’s picture

Status: Fixed » Closed (fixed)