Note: We consider this task to be more important than the related task of implementing a Git backend for the Version Control API. However, you may choose whatever task interests you most.
The Version Control API module is a relatively new module that provides functions for interfacing with the server side of version control systems (VCS). In order to work, Version Control API needs at least one VCS backend module that provides the specific VCS's functionality. At the moment, only a backend for CVS has been written (see references).
For this task, you will create a module similar to the Version Control API -- CVS backend module but which instead provides an implementation of the Mercurial version control system. You should also have a look at the example "FakeVCS backend" that ships with Version Control API itself, and the overall OVERVIEW.txt for a better understanding of the API's main concepts.
Your module should have functionality similar to what is currently present in the CVS backend.
Deliverables:
* A new versioncontrol_hg.module that implements the required functions for Version Control API backends. These are mostly functions that transform the revision data from the database representation to the API's array format. Here's the exact list of required functions:
  * hook_versioncontrol_backends()
  * versioncontrol_hg_get_commit_actions()
  * versioncontrol_hg_get_directory_item()
  * versioncontrol_hg_get_commit_branches()
  * versioncontrol_hg_get_branched_items()
  * versioncontrol_hg_get_tagged_items()
  * versioncontrol_hg_get_current_item_branch()
  * versioncontrol_hg_get_current_item_tag()
  * versioncontrol_hg_get_parent_item()
  * ...and others that you might consider practicable - in particular, you'll probably need versioncontrol_hg_commit() for managing additional commit data in the database.
* Functionality to import commits from hg logs, similar to the CVS backend's "log fetching" functionality.
* Hook scripts that enable recording and access control for commits, similar to the CVS backend's xcvs-* scripts.
* The task will be complete when the submitted module is marked as RTBC by one of the mentors.
* Develop a database table structure to store information that is required for repository configuration, user account properties and displaying transactions.
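To illustrate the log import deliverable, here is a minimal sketch of reading commits from `hg log`. It is written in Python for brevity, whereas the actual backend would be a Drupal module in PHP. The separator characters, the dictionary keys, and the function names `parse_hg_log` and `fetch_hg_log` are assumptions made for this example and are not part of Version Control API. The template relies on Mercurial's standard log keywords (`{node}`, `{rev}`, `{author}`, `{date|hgdate}`, `{branch}`, `{desc}`).

```python
import subprocess

# Hypothetical separators: control characters that should not occur in
# commit messages, so multi-line descriptions survive the round trip.
FIELD_SEP = "\x1f"
RECORD_SEP = "\x1e"

# One record per changeset, built from standard Mercurial template keywords.
TEMPLATE = FIELD_SEP.join(
    ["{node}", "{rev}", "{author}", "{date|hgdate}", "{branch}", "{desc}"]
) + RECORD_SEP


def parse_hg_log(raw):
    """Turn templated `hg log` output into a list of commit dictionaries."""
    commits = []
    for record in raw.split(RECORD_SEP):
        if not record:
            continue  # skip the empty chunk after the final separator
        node, rev, author, hgdate, branch, desc = record.split(FIELD_SEP, 5)
        commits.append({
            "revision": node,          # full changeset hash
            "local_rev": int(rev),     # repository-local revision number
            "author": author,
            # hgdate has the form "<unix time> <timezone offset>"
            "timestamp": int(hgdate.split()[0]),
            "branch": branch,
            "message": desc,
        })
    return commits


def fetch_hg_log(repo_path):
    """Run `hg log` on a local repository and parse the result."""
    result = subprocess.run(
        ["hg", "log", "-R", repo_path, "--template", TEMPLATE],
        capture_output=True, text=True, check=True,
    )
    return parse_hg_log(result.stdout)
```

Keeping the parsing separate from the `hg` invocation means the parser can be exercised with canned log output, without a real repository.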
Resources:
* Version Control API Module (http://drupal.org/project/versioncontrol)
* Version Control API -- CVS Backend (http://drupal.org/project/versioncontrol_cvs)
* Mercurial (http://www.selenic.com/mercurial/wiki/)
* Mercurial commit hooks (http://hgbook.red-bean.com/hgbookch10.html)
Contact:
* chx (http://drupal.org/user/9446)
* jpetso (http://drupal.org/user/56020)
| Comment | File | Size | Author |
|---|---|---|---|
| #7 | hg-commitlog.png | 43.29 KB | ezyang |
Comments
Comment #1
jpetso commented:
This issue on Google's GHOP issue tracker: http://code.google.com/p/google-highly-open-participation-drupal/issues/...
Claimed by ezyang. (Go Edward go!)
Comment #2
jpetso commented:
ezyang has created a project for the Mercurial backend at http://drupal.org/project/versioncontrol_hg - please update your subscription settings there so this issue can be moved over to the new project.
@aclight: could you notify ezyang of this issue's existence, and that any updates should not only go into the GHOP issue tracker but in here as well? thanks!
Comment #3
aclight commented:
I'm fixing the title and moving this into the Mercurial queue.
Comment #4
ezyang commented:
Ok, I will post issues here. For now, the only problems are unimplemented features and some concerns raised at the bottom of the README.txt file.
Comment #5
jpetso commented:
Ok, thanks for posting here.
For the concerns in the README.txt file, I would consider the following:
* The exact time of the branch and tag operations is not really that critical, it's just a measure so that they'll show up correctly in the commit log (once displaying commits, branch ops and tag ops in one list is implemented there). So I'd suggest that tag operations are assigned the time of the changeset plus one (== a second after the commit) and for the branch operations as well, assuming it's feasible to retrieve the changeset that this branch was branched off.
* What you are using as 'node' is really the 'revision' property of {versioncontrol_commits}. Good point about the missing index for that column, I'm going to change that in versioncontrol.install. (Assuming that MySQL and Postgres let me assign an index to a 255-length varchar. Need to try that out, otherwise it may still be a good idea to shrink that field.) If you could replace 'node' with 'revision', I'd very much appreciate that.
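The timestamp suggestion in the first point could look roughly like the following sketch. It is in Python for illustration only, and the function names are hypothetical, not part of Version Control API. Timestamps are assumed to be Unix times in seconds, and a branch operation gets no timestamp when the changeset it was branched off cannot be determined.

```python
def tag_operation_time(changeset_time):
    """Place a tag operation one second after the tagged changeset."""
    return changeset_time + 1


def branch_operation_time(branch_point_time):
    """Place a branch operation one second after its branch point.

    Returns None when the branch point could not be retrieved, so the
    caller can decide how to handle that case.
    """
    if branch_point_time is None:
        return None
    return branch_point_time + 1
```

Sorting by these values lists a commit immediately before the tag or branch operation that refers to it.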
The helper library looks good so far, I'm looking forward to the rest of the module :)
Also note my write-ups about commit restrictions and user authentication from today - short version: you don't have to implement access control scripts, and account import/export is too complex to support all kinds of authentication methods that are provided by Mercurial, so at maximum an .htaccess/.htpasswd file would be nice. (That's the least pressing of all issues, though.)
If you're interested, you can also have a look at the issue for the Git backend, there's quite a bit of information there already, some of which also applies to Mercurial.
Comment #6
ezyang commented:
The module now works! I'd like some feedback on the things I posted in the KNOWN ISSUES section of README. In particular, I'd like to know which repository-specific information commitlog uses, what to do about hashes versus numeric revision IDs, docblock duplication, refactoring of versioncontrol to remove common code from git/hg/svn/cvs, and the nodeid lookup issue (i.e. how to let commitlog know that we're missing info). Thank you!
To set up the repository, enable the module (if you enabled it before, you may need to disable, uninstall, and then load again), use the standard repository creation interface, and point it to an existing Mercurial repository on your computer. Run cron, and then check the commit log.
Comment #7
ezyang commented:
With the latest commits, there has been quite a bit of code cleanup, and a proper implementation of source item detection. There are now no major issues with the module (save missing functionality!). I've attached an image of the commit log for the curious.
I've been keeping my eye on the git implementation, and I've specifically avoided (I think) all of the issues posed in #21, as well as the earlier ones.
Comment #8
jpetso commented:
Ok, I just lost like two hours (a bit less, maybe) of writing up stuff in here. So, 1. sorry for that, and 2. sorry for being late with a review altogether. Running out of time again (sounds familiar), so I'll just tackle the most important points:
I think that's it from my side... leaving until the 8th of February, so chx will do any remaining reviews and evaluate your work. There might be some edge cases that are not yet perfectly handled, but overall I'd say that you did good research and developed the backend from a solid base, so later changes should be relatively easy to do. Congrats, thanks, and good luck for the remaining days of the GHOP! I'm off now. *shwoop*
Comment #9
aclight commented:
Re #10 above (whether to display email addresses or not): What about somewhat hiding them like Google does on Google code? So, you might have ez.....@example.com instead of ezyang@example.com.
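A minimal sketch of that kind of masking, in Python for illustration. The function name and the choice of keeping two characters followed by five dots are assumptions taken from the example address above, not Google's actual rule.

```python
def mask_email(address, visible=2, filler="....."):
    """Hide most of the local part of an email address.

    Keeps the first `visible` characters of the part before the "@" and
    replaces the rest with a fixed filler, so the length of the original
    name is not revealed. Strings without an "@" are returned unchanged.
    """
    local, sep, domain = address.partition("@")
    if not sep:
        return address
    return local[:visible] + filler + "@" + domain


print(mask_email("ezyang@example.com"))  # prints "ez.....@example.com"
```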
Comment #10
ezyang commented:
Thank you for taking two hours to review this module. Your comments are greatly appreciated!
For reference, the points can be categorized as follows:
No action needed: 6
Upstream: 9, 10
New features: 2, 3, 4, 8
Modifications: 5, 12
Trivial: 1, 11
Unknown: 7
For point seven I have to respectfully disagree: Mercurial commits only apply to one branch. For example, in your use case, "commit on master and then branch off drupal-6.x", the commit on master would count as one commit, and then the "branch off" would be another commit. Or, in the case that drupal-6.x already existed, it would be a "merge" not a "branch" (but still would get its own commit). Re-recording the commit to the master branch under the "drupal-6.x" branch wouldn't make any sense, because the data would be duplicate, and it never actually happened to "drupal-6.x" (it's a "virtual" commit).
My proposed behavior does lead to a limitation, which is that branch history stops on copy/rename. We can, however, resolve this by checking the parents of the earliest branch revision and then further retrieving the history for them (I don't know if this can be done in one SQL query or not).
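The parent-following idea can be sketched in memory as below (Python, for illustration only). The data layout, a dictionary from local revision number to branch name and parent revisions, and the function name `branch_history` are assumptions for this example. Whether the same can be done in one SQL query is left open, as noted above.

```python
def branch_history(changesets, branch):
    """Return revisions for a branch, newest first, continuing past the
    point where the branch was created.

    `changesets` maps a local revision number to a dict with the keys
    "branch" (str) and "parents" (list of revision numbers).
    """
    on_branch = sorted(
        rev for rev, cs in changesets.items() if cs["branch"] == branch
    )
    if not on_branch:
        return []

    # Start from the parents of the earliest revision on the branch and
    # collect every ancestor reachable from there.
    seen = set(on_branch)
    ancestors = []
    stack = list(changesets[on_branch[0]]["parents"])
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue
        seen.add(rev)
        ancestors.append(rev)
        stack.extend(changesets[rev]["parents"])

    # Local revision numbers of parents are always lower than those of
    # their children, so sorting in descending order gives newest first.
    return sorted(on_branch, reverse=True) + sorted(ancestors, reverse=True)
```

For example, with revisions 0 and 1 on "default", revision 2 on "drupal-6.x" with parent 1, revision 3 on "default" with parent 1, and revision 4 on "drupal-6.x" with parent 2, the history of "drupal-6.x" comes out as 4, 2, 1, 0. Revision 3 is left out because it is not an ancestor of the branch.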
I agree with everything else and will be addressing them shortly.
Comment #11
ezyang commented:
I'm considering this GHOP task complete, although a few features are still unimplemented and out of scope for GHOP (and there are probably more kinks to work out with real-world use).
I've also chosen to disagree with point five, simply due to the fact that I would need really complicated joins otherwise (there's already some really convoluted ones for DELETEs), and there is never any situation in which the normalized (not denormalized; check the vocabulary) database would be easier to use/better.
Comment #12
ezyang commented:
I'm marking this as closed, since GHOP is over. Further issues with the module should get their own issues.
Comment #13
jpetso commented:
Ok, back from my holidays :)
I'm glad this task worked out so nicely, thanks a lot for working on the module!
Disagreement on points 5 and 7 accepted - I was outright wrong with 7 (changesets are indeed assigned to exactly one branch only) and the argument on 5 is valid. Sorry for wrong usage of vocabulary for normalization x-)
Yeah, and one more thing (unrelated to this issue, but I need to put it somewhere): you don't need to manually close other issues - "fixed" is a perfectly fine state for issues that have been fixed, and issues with status "fixed" will be "closed" by the closebot after two weeks of being marked as "fixed".
So, great work. I very much appreciate your involvement with upstream issues, and would be glad if you'd stick around for a while :)