as part of the new release system (http://drupal.org/node/77562) we need to have access to info about all CVS tags and branches in each project. the attached patch adds a {cvs_tags} table to the cvs.module, and changes the xcvs-* scripts so that as cvs tag commands happen, we record them in the table for each directory that's part of the tag command. we remove entries from the table when tags are deleted. we also keep track of whether each tag is a branch tag or not, since we'll need to know that in various places, too.
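for illustration only, here is a rough sketch of the bookkeeping described above, written in python with sqlite rather than the php/mysql the patch actually uses. it is not the code from the patch. the column names `nid` and `tag` follow the PRIMARY KEY mentioned in comment #7; the `branch` column, the `apply_tag_command()` helper, the `dir_to_nid` mapping, and the 'add'/'mov'/'del' operation names are all assumptions made for the example.

```python
import sqlite3


def create_cvs_tags_table(conn):
    # Assumed schema: the thread only implies nid, tag and a branch flag.
    conn.execute(
        """
        CREATE TABLE cvs_tags (
            nid    INTEGER NOT NULL,
            tag    TEXT    NOT NULL,
            branch INTEGER NOT NULL DEFAULT 0,
            PRIMARY KEY (nid, tag)
        )
        """
    )


def apply_tag_command(conn, op, tag, is_branch, directories, dir_to_nid):
    """Record or remove one tag for every project directory it touched.

    op is 'add', 'mov' or 'del' (hypothetical names for this sketch).
    dir_to_nid is a hypothetical mapping from directory to project node id.
    """
    for directory in directories:
        nid = dir_to_nid.get(directory)
        if nid is None:
            continue  # directory does not belong to a known project
        if op == "del":
            conn.execute(
                "DELETE FROM cvs_tags WHERE nid = ? AND tag = ?", (nid, tag)
            )
        else:
            # 'add' and 'mov' both leave exactly one row per (nid, tag).
            conn.execute(
                "INSERT OR REPLACE INTO cvs_tags (nid, tag, branch) "
                "VALUES (?, ?, ?)",
                (nid, tag, int(is_branch)),
            )
    conn.commit()
```

the composite primary key is what keeps a repeated tag command on the same project from creating duplicate rows.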
the only thing missing here is a way to import all the existing branches and tags. ;) at this point, i'm just planning to do that as a separate, stand-alone script that you'd run once. to be really complete, i suppose i should fix the automatic log parsing code in cvs.module to also populate the {cvs_tags} table. but, even if i did that, it wouldn't help us on d.o, so i'm going to need the separate script, anyway.
but, if someone else wants to take a look here and review what i've got so far, that'd be swell. ;)
thanks,
-derek
| Comment | File | Size | Author |
|---|---|---|---|
| #5 | cvs_store_tags.patch_4.txt | 17.66 KB | dww |
| #4 | cvs_store_tags.patch_3.txt | 15.08 KB | dww |
| #3 | cvs_store_tags.patch_2.txt | 10.36 KB | dww |
| #2 | cvs_store_tags.patch_1.txt | 10.44 KB | dww |
| | cvs_store_tags.patch.txt | 5.75 KB | dww |
Comments
Comment #1
dww
"what about the 'branch' field in the {cvs_files} table?" you might ask... ;)
well, a few problems:
... just in case you were wondering. ;)
Comment #2
dww
another step closer to RTBC -- now the automatic log importing code knows about {cvs_tags} and will populate that table, too.
so, in theory, the update path on drupal.org could be to just run all that stuff. however, i'll probably want to just split out this part of the functionality into a separate function or script as a 1-time thing for d.o...
Comment #3
dww
new patch that applies cleanly after a little cleanup in nearby code
Comment #4
dww
new patch that provides:
i ran xcvs-import-tags.php on a complete snapshot of the d.o cvs repositories on my laptop, and it took about 7 minutes. not ideal, and we might be able to optimize, but i'm not sure i really care. for example, one thing we could do is use the following workflow:
i'm not sure how easily we could optimize this 7 minutes, it's just an expensive operation on a ton of data. we basically have to dump the entire cvs log history on all files, and parse through all of it.
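to make the "dump the log and parse it" step concrete, here is a minimal python sketch of pulling tags out of `cvs log` output. it is not the logic in xcvs-import-tags.php, just an illustration under stated assumptions: it reads the "symbolic names:" section of each file's log, and it treats a tag as a branch when its revision has the CVS "magic" `.0.` in the second-to-last position (e.g. `1.3.0.2`) or an odd number of components (a vendor branch such as `1.1.1`). the function names are made up for the example.

```python
def is_branch_revision(revision):
    """Return True if a symbolic name's revision denotes a branch."""
    parts = revision.split(".")
    if len(parts) % 2 == 1:
        return True  # odd component count, e.g. vendor branch 1.1.1
    # Magic branch number: even count, at least 4 parts, '0' second to last.
    return len(parts) >= 4 and parts[-2] == "0"


def parse_cvs_log_tags(log_text):
    """Collect (rcs_file, tag, revision, is_branch) tuples from `cvs log` text."""
    results = []
    rcs_file = None
    in_symbols = False
    for line in log_text.splitlines():
        if line.startswith("RCS file: "):
            rcs_file = line[len("RCS file: "):]
            in_symbols = False
        elif line.startswith("symbolic names:"):
            in_symbols = True
        elif in_symbols and line[:1] in ("\t", " "):
            # Indented entries look like "<TAG>: <revision>".
            name, _, revision = line.strip().partition(": ")
            results.append(
                (rcs_file, name, revision, is_branch_revision(revision))
            )
        else:
            # Any non-indented line ends the symbolic names section.
            in_symbols = False
    return results
```

since every file's log has to be read to find every tag, the cost grows with the whole repository history, which is consistent with the runtime described above.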
i suppose i could try a more crafty approach where we only query all the existing nodes in {cvs_projects} (which includes the cvs-related data for each project), make some assumptions about what the files will be called, and try to only slurp log info on a subset of the files. but, i don't think it's worth my time to write/test all that code, when i've already tested this, and there's a fairly easy work-around for the 7 minute update problem. also, the update will probably go about twice as fast on the d.o hardware, so we're only really talking like 3 or 4 minutes...
anyway, i've tested this, it's good to go, and it's a prerequisite for the new release system, so getting this committed and installed on d.o is a top priority...
can i get a final review from someone before i proceed?
thanks!
-derek
Comment #5
dww
final patch with a few enhancements:
after "a passing look..." by killes, this is RTBC. ;)
Comment #6
dww
committed to TRUNK and 4.7. i'm working with killes in IRC to get this installed and imported on d.o right now... i'll set this to closed when that's all done.
Comment #7
dww
i was curious about the KEYs i defined, and asked killes... i thought if you have a PRIMARY KEY(nid, tag), but then you try to select on just nid, you can't use the key, since the key is specifically on both values. i know, for example, that this primary key allows you to add 2 rows with the same nid and different tags, but not 2 rows with the same nid and same tag...
upon further investigation, selecting on nid works fine. the problem comes when you want to select on tag:
http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
"If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to find rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3). MySQL cannot use a partial index if the columns do not form a leftmost prefix of the index."
so, i removed the extra KEY(nid), but left the one on tag... committed to TRUNK and 4.7.
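the leftmost-prefix behavior quoted above can be poked at with a small python sketch. big caveat: this uses sqlite (via python's built-in `sqlite3` module) as a stand-in, not mysql, so the planner output is only an analogy for what `EXPLAIN` would show on d.o. the `branch` column, the index name `cvs_tags_tag`, and the helper names are assumptions for the example.

```python
import sqlite3


def build_demo_table():
    """Create an in-memory table with the composite key from the thread."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE cvs_tags (
            nid    INTEGER NOT NULL,
            tag    TEXT    NOT NULL,
            branch INTEGER NOT NULL DEFAULT 0,
            PRIMARY KEY (nid, tag)
        )
        """
    )
    return conn


def query_plan(conn, sql):
    """Return the human-readable detail strings from EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


if __name__ == "__main__":
    conn = build_demo_table()
    # nid is the leftmost column of the primary key, so the key is usable.
    print(query_plan(conn, "SELECT * FROM cvs_tags WHERE nid = 1"))
    # tag alone is not a leftmost prefix, so this falls back to a scan.
    print(query_plan(conn, "SELECT * FROM cvs_tags WHERE tag = 'DRUPAL-4-7'"))
    # A separate index on tag (the KEY that was kept) makes it searchable.
    conn.execute("CREATE INDEX cvs_tags_tag ON cvs_tags (tag)")
    print(query_plan(conn, "SELECT * FROM cvs_tags WHERE tag = 'DRUPAL-4-7'"))
```

this matches the decision in the comment: a separate KEY(nid) is redundant next to PRIMARY KEY(nid, tag), while the key on tag still earns its keep.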
Comment #8
dww
everything is now installed on d.o, and the historical tag data has all been imported.
one step closer to the new releases system -- done! ;)