The Version Control API module is a relatively new module that provides functions for interfacing with the server side of version control systems (VCS). In order to work, Version Control API needs at least one VCS backend module that provides the specific VCS's functionality. At the moment, only a backend for CVS has been written (see Resources).
For this task, you will create a module similar to the Version Control API -- CVS backend module, but one that provides an implementation for the Git version control system instead. You should also have a look at the example "FakeVCS backend" that ships with Version Control API itself, and the overall OVERVIEW.txt for a better understanding of the API's main concepts.
Your module should have functionality similar to what is currently present in the CVS backend.
Deliverables:
* A new versioncontrol_git.module that implements the required functions for Version Control API backends. These are mostly functions that transform the revision data from the database representation to the API's array format. Here's the exact list of required functions:
  * hook_versioncontrol_backends()
  * versioncontrol_git_get_commit_actions()
  * versioncontrol_git_get_directory_item()
  * versioncontrol_git_get_commit_branches()
  * versioncontrol_git_get_branched_items()
  * versioncontrol_git_get_tagged_items()
  * versioncontrol_git_get_current_item_branch()
  * versioncontrol_git_get_current_item_tag()
  * versioncontrol_git_get_parent_item()
  * ...and others that you might consider practicable - in particular, you'll probably need versioncontrol_git_commit() for managing additional commit data in the database.
* Functionality to import commits from Git logs, similar to the CVS backend's "log fetching" functionality.
* Hook scripts that enable recording and access control for commits, similar to the CVS backend's xcvs-* scripts.
* The task will be complete when the submitted module is marked as RTBC by one of the mentors.
* Develop a database table structure to store information that is required for repository configuration, user account properties and displaying transactions.
Resources:
* Version Control API Module (http://drupal.org/project/versioncontrol)
* Version Control API -- CVS Backend (http://drupal.org/project/versioncontrol_cvs)
* Git (http://git.or.cz/)
* Git commit hooks (http://www.kernel.org/pub/software/scm/git/docs/hooks.html)
Contact:
* chx (http://drupal.org/user/9446)
* jpetso (http://drupal.org/user/56020)
Comment | File | Size | Author |
---|---|---|---|
#6 | versioncontrol_git.tar_.gz | 10.76 KB | boombatower |
#6 | README.txt | 1.16 KB | boombatower |
#6 | versioncontrol_git.admin_.inc_.txt | 3.86 KB | boombatower |
#6 | versioncontrol_git.info_.txt | 217 bytes | boombatower |
#6 | versioncontrol_git.install_.txt | 5.15 KB | boombatower |
Comments
Comment #1
jpetso commented:
This issue on Google's GHOP issue tracker: http://code.google.com/p/google-highly-open-participation-drupal/issues/...
Claimed by boombatower. (Go Jimmy go!)
Comment #2
boombatower commented:
I found a bug in the install hook of the Version Control API and documented it here: http://drupal.org/node/213581.
Comment #3
boombatower commented:
I think I have completed most, if not all, of the requirements for this module. Please let me know if I'm missing anything.
Also note that if there is some functionality that is out of the scope of this task, I would not be opposed to working on it outside of the GHOP.
Comment #4
aclight commented
Comment #5
dww commented:
I'll have to take a closer look at this another time, but this is extremely promising and exciting. Thanks!
Comment #6
boombatower commented:
Fixed a few mistakes and created a project for this future module at: http://drupal.org/project/versioncontrol_git.
Comment #7
jpetso commented:
That's a great start, and the log parsing code looks like it could work nicely. (Haven't tested yet, I'm going to do that over the weekend.)
However, there are a few points, so I won't yet mark it as RTBC... I'd even go as far as to claim that the hardest part is still lying ahead of you. Here are my initial observations:
Those are all points that I can think of at the moment... there may be more once I get to test the module. Don't let this list get you down - the work you did up to now is very cool, it just needs some refinements :P
Would everyone agree that it's probably a good idea that the code should be committed to the versioncontrol_git project, and we go with patches to the then-current state from here on? Also, as the Git backend project exists now, I think this issue should move there - make sure to have issue subscriptions enabled for that module.
Comment #8
aclight commented:
Now that boombatower created a Version Control API -- Git Backend project, I agree that this issue should be moved there. I'll let boombatower do that, so that way he can subscribe to all issues of that project first if he hasn't already. (As a reminder, everyone will need to edit their Subscription prefs. since new projects don't inherit the "All issues", "My issues", etc. subscription options set before the new project was created).
As for committing the code, if it makes sense I think it will be easier to track this from the GHOP perspective if everything for that task is kept in one issue. However, I realize that this is a rather large task, and if it makes more sense for boombatower and/or jpetso to commit some starter code and then create new issues dealing with modifications to that code, that's perfectly fine with me. If that happens, a link to each issue that needs to be completed for this GHOP task to be marked complete on this issue would be helpful.
As others have said, this is great work so far.
Comment #9
dww commented:
FWIW (and I'm not learned in the ways of GHOP admin'ing), I think it'd be best to commit what he's got already to cvs.d.o/contributions/modules/versioncontrol_git, and fix these things as followup patches/commits, instead of doing it all via reposting changed files (or patches) in here. That's what version control is for, after all. ;) Better to get the code up where others can see it, since that'll make it easier to test and to get reviews. Clearly, after only a few more iterations, this will be "done" as far as GHOP is concerned, and you can just upload the latest revisions of the files to the code.google.com site to make them happy...
That's my preference, anyway.
Thanks again to everyone -- boombatower for the code, jpetso + aclight for the reviews, etc, etc.
p.s. Utterly out of scope for this issue, but I figured I'd point interested parties here: http://groups.drupal.org/node/8102 -- that's what needs to happen before we can deploy versioncontrol_api on d.o, which is obviously of interest to everyone involved in this effort, too. Seems like we're not going to be able to do that before the D6 port of project*, so it's not the top priority anymore, but it'd be great to get that list hammered out in the next few months. Getting vc_api on d.o will be a huge step towards hardening and stability that'll make vc_* much more legitimate in the eyes of site builders around the world.
Comment #10
aclight commented:
@dww: That's perfectly fine to commit what's available now and then patch from there. As you said, that is what version control is for, and we might as well take advantage of such features.
@all: Once the code from this issue is committed it will be a little unclear from looking at the official Google task tracker what issue to look at to monitor progress. So, if boombatower can just keep us posted occasionally there, and then post when you feel you've completed the task, we can look through the issues created in response to this GHOP task and make sure you've completed the requirements and mark this as closed. Or, if chx stays involved in the review process, he can just close the GHOP task when boombatower's done.
So, in summary, do as you'd like with respect to using the tools available to you, and we'll keep up with the GHOP task status as necessary.
Comment #11
boombatower commented:
I have fixed a few of the requests from jpetso and will work on the others, but I figured I would post a response to get the ball rolling on a few of them.
* versioncontrol_git_item_branch_points and made related changes. Fixed - commit #97718
The original code has been committed in commit #97667. Any items marked 'fixed' without a commit reference are in the original commit.
Other commits:
Comment #12
boombatower commented:
I updated the post above as I completed tasks, but now I believe the work needs to be reviewed.
If further discussion of the xgit scripts is necessary a new thread could be created.
As a reminder, the project for this is located at: http://drupal.org/project/versioncontrol_git. The code can be checked out from CVS at contributions/modules/versioncontrol_git.
Comment #13
jpetso commented:
Man, you're frickin' awesome. Started testing now, but I don't have much time right now (yay for meeting friends to play a game of Worms or two). Short report:
Ok, gotta go! Maybe I can catch you on IRC sometime, but this seems to work out really nicely :D
Comment #14
aclight commented:
@boombatower: A few things related to this task:
1. It's probably not a good idea to make major edits to your comments on issues, because it gets confusing quickly. If you need to open new issues to track certain tasks that's fine.
2. Find me in IRC and I'll send you a copy of the xsvn scripts I wrote for my port of the cvslog module to work with subversion. I don't know if those would be useful to you at all, but at least they will give you another example of some functional commit scripts.
3. Regarding getting the commit user, the way I do this in subversion is to call svnlook from the commit hook scripts. svnlook is a binary that comes with subversion which allows one to inspect information about a previous commit or a pending commit. I don't know if Git has a similar feature, but you might see if it does.
Comment #15
boombatower commented:
@jpetso: Summary of changes:
* git tag without -l works on my box, but I changed it so that it is more correct and possibly works on all setups. Fixed - commit #97863
@aclight: Response to your points:
Comment #16
jpetso commented:
@aclight: The main problem that I see with retrieving the committer in the access control scripts is that it doesn't have to be the person who actually contributed the patch. Git remembers who did the original commit even when all the commits are merged into a different repository, and we only want to check on the "uploader", not the one that initially did the commit.
So, in order to solve the problem of associating "uploaders" with accounts, we need to know how an uploader appears on the system. Does Git assume that any user with write access to the repository has sufficient rights? In that case, we might want to use the current system user for the commit scripts. There might also be a shortcoming in the Version Control API's access control API as it doesn't distinguish between committer and uploader, guess I should introduce an optional uploader username parameter or something.
In short, we need more research on 1. how Git's standard workflow works with regards to user identities, and 2. if it can be different to that standard workflow, and if so, how.
Apart from that, the module starts to become really nice. In order to keep the whole thing manageable, I'd suggest we have a list of "active" and "needs response" tasklets, and let's assign those to letters - a), b), c), etc.
Seems I don't find any time for a proper answering/improvement session, need to go to sleep now as I'm going to get up and drive in 6 hours and still haven't packed. But well... :P
Comment #17
morphir commented:
The SSH key should be submitted when you register or apply for a Git account on the central repository (git.drupal.org).
The RSA encryption makes this secure as hell. See git web on how this is done.
Short debrief:
Git opens up for having several trees against one project.
[Scenario]
Views module would have one master tree maintained by Earl Miles, and one "fork" maintained by jpetso. This way, earl can watch any research or crazy attempts that are being done trying to improve views, and ultimately merge that fork into the master tree when or if the fork matures.
Now, we could introduce "The Drupal developer forest" - which lists all trees, or links to all trees. How about that?
Comment #18
jpetso commented:
I had a short (...NOT!) research on commit access restrictions and user authentication today, and came to the conclusion that commit access scripts are not necessary for distributed version control systems. So that eliminates the need for a "pre-receive" or "update" hook script, and leaves us with migrating post-commit to post-receive.
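To make the post-receive side a bit more concrete: Git passes a post-receive hook one line per updated ref on standard input, in the form "old-sha new-sha refname". The following is a minimal sketch of parsing that input, written in Python for brevity rather than in the module's PHP. The names RefUpdate and parse_post_receive are hypothetical and are not part of versioncontrol_git.

```python
from typing import List, NamedTuple

# Git uses an all-zero hash to mean "this ref did not exist" (creation or deletion).
ZERO_SHA = "0" * 40


class RefUpdate(NamedTuple):
    """One line of post-receive input (hypothetical helper type)."""
    old_rev: str
    new_rev: str
    ref_name: str

    @property
    def is_branch(self) -> bool:
        return self.ref_name.startswith("refs/heads/")

    @property
    def created(self) -> bool:
        return self.old_rev == ZERO_SHA

    @property
    def deleted(self) -> bool:
        return self.new_rev == ZERO_SHA


def parse_post_receive(stdin_text: str) -> List[RefUpdate]:
    """Parse the text a post-receive hook reads from standard input."""
    updates = []
    for line in stdin_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is: <old-sha> SP <new-sha> SP <refname>
        old_rev, new_rev, ref_name = line.split(" ", 2)
        updates.append(RefUpdate(old_rev, new_rev, ref_name))
    return updates
```

In a real hook script the text would come from sys.stdin.read(), and each branch update would then be handed over to whatever records commits in the database.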
Comment #19
jpetso commented:
Oh right, and the nice people in #git also had a proposal on how to retrieve the latest commits. They advised not to base anything on the commit date/time (which is naturally a good idea for nonlinear commit trees like in Git) and rather go from the latest commit of each branch. That means we'll need a table like {versioncontrol_git_latest_commits} with the columns 'branch' and 'revision' (= SHA-1 hash) and on each update we'd store the latest commit on each branch (as retrieved by git log) in there. I could imagine a repository update algorithm like this - pseudo code, should only communicate the idea, not usable as is. Er... it grew a bit large... well what the heck, I like it :)
Something like this should work pretty well... the branches should be stored (and retrieved, and deleted) as branch ids, but apart from that I think I got it quite realistic. Bonus points for storing all previous revisions in {versioncontrol_git_item_source_revisions}(revision, source_revision). Well, bonus points... actually that should really be done.
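A minimal sketch of the idea described in this comment (remember the latest commit per branch and import only what is new since then), written in Python for brevity rather than in the module's PHP. It is not jpetso's original pseudo code. The commit graph is a plain dictionary standing in for what git log would report, the latest dictionary stands in for the {versioncontrol_git_latest_commits} table, and all function names are hypothetical.

```python
from typing import Dict, List, Optional, Set


def ancestors(parents: Dict[str, List[str]], start: str) -> Set[str]:
    """Return start plus every commit reachable from it via parent links."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue
        seen.add(rev)
        stack.extend(parents.get(rev, []))
    return seen


def commits_to_import(parents: Dict[str, List[str]], head: str,
                      last_seen: Optional[str]) -> Set[str]:
    """Commits reachable from head but not from last_seen (like 'old..new')."""
    known = ancestors(parents, last_seen) if last_seen else set()
    return ancestors(parents, head) - known


def update_repository(parents: Dict[str, List[str]],
                      branch_heads: Dict[str, str],
                      latest: Dict[str, str]) -> Dict[str, Set[str]]:
    """Work out the new commits per branch and refresh the 'latest' mapping."""
    new_per_branch: Dict[str, Set[str]] = {}
    for branch, head in branch_heads.items():
        new_per_branch[branch] = commits_to_import(parents, head, latest.get(branch))
        latest[branch] = head  # remember where this branch is now
    # Forget branches that no longer exist in the repository.
    for branch in list(latest):
        if branch not in branch_heads:
            del latest[branch]
    return new_per_branch
```

Note that no commit dates are involved: the decision is purely based on reachability from the stored branch heads, which is what makes it safe for nonlinear history. A branch seen for the first time reports its whole history, so the caller would still need to skip commits that were already recorded through another branch.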
Comment #20
boombatower commented:
@jpetso: summary of changes:
Comment #21
jpetso commented:
Very cool, it's really starting to get ready for real-life usage. Not yet perfect, however:
* '..[branch]' part of the comment should now be '[branch]' only, and the description of the get_current_revision() and get_previous_revision() functions doesn't match the code anymore. Also, a typo in line 45: s/obsolute/obsolete/.
Now, the {versioncontrol_git_item_source_revisions} table is supposed to help us to find out which source items there are for a given current item. (And the other way round, for item history purposes.) With the current approach, you can only distinguish between the first three cases, but especially the "merged" action is very prominent and widely used in distributed version control too. So, I don't know if it's feasible to find out, but it would help a lot to have {versioncontrol_git_item_source_revisions} hold the columns (item_revision_id, source_item_revision_id), that is, storing file-level parents instead of commit parents. In any case, you should clean up entries from that table in versioncontrol_git_commit() when a commit is being removed.
Sorry for all that nitpicking... don't get scared off by all that stuff, you're doing a fabulous job. It's only that distributed VCS provide a lot of possibilities, and therefore present a lot of edge cases to someone who tracks the commit history.
Those are all issues that remain, if I'm correct. Nothing left from earlier comments, so we can use this as the new starting point. I believe that when these issues are fixed (and no new ones arise by suboptimal implementation of those), the GHOP task will be done, with the Git backend as one hell of a module.
I'm leaving for skiing holidays in two days and taking tomorrow mostly off for preparation, so this is probably my last review in this issue. (No promises on that, though.) Need to get hold of chx and tell him to do the final reviews. Maybe aclight can also help out. Have fun, and thanks for the fish! boombatower, you rock!
...oh, and I'll be back on the 9th of February, see you then!
Comment #22
mlncn commented:
Subscribing in awe.
benjamin, Agaric Design Collective
Comment #23
boombatower commented:
* Renamed get_source_revision() and updated comment. I'm not sure what is wrong with the documentation for get_current_revision(). Fixed - commit #98670
* Clean up entries in versioncontrol_git_item_source_revisions when a commit is deleted. Fixed - commit #98757
A few other commits that fixed/changed some miscellaneous items were also committed.
Comment #24
gordon commented:
subscribing
e-Commerce is already using git at http://git.drupalecommerce.org, and if we were to use something like git we could restrict access even more, and patches can be passed around as git patches, which will carry information like the original author of the patch and who the committer is. These patches can be signed, giving a chain of evidence so you can track down exactly who did the code without any doubt.
Comment #25
jpetso@drupal.org commented:
Still a bit of time left before leaving! I know that I should review the Mercurial backend first, but whatever... let's get this finished (sorry ezyang).
ad 1.: Good point, let's keep the 'updated' value then. (maybe with a comment somewhere that it's only there for "documentation" purposes and not needed for the backend to work correctly.)
ad 3.: You're right, get_current_revision() is alright - sorry for the riot.
ad 5. What I meant was "it will replace the table as it is now", not "the table is unnecessary altogether". So as last major point, I would like to see this table transformed from a mapping of commits to a mapping of items. Stealing from my own Version Control API issue:
The difference is that you can retrieve all source items in get_commit_actions() with this structure, so if a file has been merged from two source files, you can provide two items in the 'source items' array. In other words, you'd track source revisions on a file level rather than on a commit level. Also, get_source_revision($revision) (single return value, identified by the commit id) will need to change to get_source_items($item) (return array possibly containing multiple items, identified by the item_revision_id) and move over to the .module file. I hope that explanation was good enough, in combination with re-reading 5. from above it should contain the info that is necessary to implement it. If not... well, don't know what to say more, and I'm running out of time again.
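The file-level mapping described above can be sketched roughly as follows, in Python with an in-memory SQLite database for brevity rather than the module's PHP and Drupal database layer. The table name is shortened, and the helper names add_source_item and delete_item_revisions are hypothetical; only get_source_items and the two column names come from the discussion.

```python
import sqlite3
from typing import Iterable, List


def create_schema(db: sqlite3.Connection) -> None:
    """One row per (item revision, source item revision) pair."""
    db.execute(
        """
        CREATE TABLE item_source_revisions (
            item_revision_id INTEGER NOT NULL,
            source_item_revision_id INTEGER NOT NULL,
            PRIMARY KEY (item_revision_id, source_item_revision_id)
        )
        """
    )


def add_source_item(db: sqlite3.Connection, item_revision_id: int,
                    source_item_revision_id: int) -> None:
    db.execute(
        "INSERT INTO item_source_revisions VALUES (?, ?)",
        (item_revision_id, source_item_revision_id),
    )


def get_source_items(db: sqlite3.Connection, item_revision_id: int) -> List[int]:
    """Return all source item revisions; a merged file has more than one."""
    rows = db.execute(
        "SELECT source_item_revision_id FROM item_source_revisions "
        "WHERE item_revision_id = ? ORDER BY source_item_revision_id",
        (item_revision_id,),
    )
    return [row[0] for row in rows]


def delete_item_revisions(db: sqlite3.Connection,
                          item_revision_ids: Iterable[int]) -> None:
    """Clean up mappings when the commit owning these item revisions is removed."""
    for rev_id in item_revision_ids:
        db.execute(
            "DELETE FROM item_source_revisions "
            "WHERE item_revision_id = ? OR source_item_revision_id = ?",
            (rev_id, rev_id),
        )
```

With this layout a file merged from two sources simply has two rows, which a commit-to-commit mapping cannot express.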
Other minor points:
8. It seems to me that the commit_branches_id in {versioncontrol_git_commit_branches} is unused and not necessary, if there's not a good reason for it then I'd suggest you remove it and set (vc_op_id, branch_id) as primary key. repo_id in that table is imho not totally necessary as well, because the linked vc_op_id already has that information. Haven't thought about this much, but that's what I currently think about this.
9. Just noticed that you are using {versioncontrol_git_item_revisions}(revision) for the commit id - that's not really what I meant before. CVS does it this way because there are no commit-wide revision numbers, so it has to be stored for each file individually. This is better (and easier) with other VCS as there is only one revision number / commit id per commit, and that is supposed to be stored as $commit['revision']. It then goes into {versioncontrol_commits}(revision) and will be autoretrieved from there, so when doing get_commit_actions() you can just assign $commit['revision'] to $item['revision'], and you can get rid of the hackish functionality that retrieves the commit id from {versioncontrol_git_item_revisions}. So, please delete 'revision' from that table and get it into $commit['revision'] instead (which you currently leave empty).
10. Related: if you fix 9. and add VERSIONCONTROL_CAPABILITY_ATOMIC_COMMITS to the backend capabilities, then commit log will display the SHA-1 hash as commit identifier instead of the vc_op_id. (At least, it should work this way - untested because I didn't have a matching backend yet.)
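Point 9 boils down to taking the revision from the commit rather than looking it up per file. A rough sketch of that idea, in Python for brevity rather than the module's PHP: the dictionary keys used here ('path', 'action', 'current item') are simplified assumptions for illustration and are not guaranteed to match the Version Control API's actual array format.

```python
from typing import Dict, List


def get_commit_actions(commit: Dict, item_rows: List[Dict]) -> Dict[str, Dict]:
    """Build per-file actions, reusing the commit-wide revision for every item.

    commit: has a 'revision' entry holding the commit id (SHA-1 hash).
    item_rows: per-file rows without any revision column of their own.
    """
    actions: Dict[str, Dict] = {}
    for row in item_rows:
        actions[row["path"]] = {
            "action": row["action"],
            "current item": {
                "path": row["path"],
                # No per-file lookup: every item shares the commit's revision.
                "revision": commit["revision"],
            },
        }
    return actions
```

The per-file table then only needs to store what actually differs between files, such as the path and the action.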
So, only one major issue left? Seems like next time this will be RTBC, or stuff. I love it.
Comment #26
dww commented:
2 minor points:
a) yes, i think we'll still need tag/branch validation checks, but that's outside the scope of this GHOP task (not that it'd be all that hard, but it's certainly not essential to consider this "done").
b) as rocking as all of this is, no one should confuse this effort with the idea that d.o itself will inevitably move to Git as soon as this is committed. there are still vast unanswered questions about such a switch, a lot of work to do before d.o can use VersionControl API at all, much less versioncontrol_git. i don't say this to squelch anyone's enthusiasm, especially not boombatower's, but just to appropriately manage everyone's expectations. ;)
i'll be offline most of today, but hopefully available tomorrow sometime to do some final reviews.
thanks everyone!
Comment #27
boombatower commented:
@jpetso: summary of changes:
* source_revisions code and table, redesigned delete commit and repository to handle the new changes. Fixed - commit #99025
* versioncontrol_git_commit_branches. Not exactly sure why that was in there. Fixed - commit #99028
* versioncontrol_git_item_revisions and use the versioncontrol_commits revision column. Fixed - commit #99029
It seems we had quite a bit of miscommunication. Many of the things that have been requested to be changed over the course of the thread are things that I purposely changed. Oh well, it seems that everything is working and the module should be just about ready, if not ready, to be reviewed and released.
One thing I noticed is that when a deleted file commit is being viewed in the log the first letter is cut off. For example: code.txt turns into ode.txt. I'm gonna guess this has something to do with everything being in the root folder, but that is purely a guess.
@all: I see there are a few things that I can add to the module, but since none of them are necessary I think it is safe to say I can add them after GHOP.
Comment #28
dww commented:
This code is clearly Done Enough(tm) for a first stab at a functioning git backend. It's all been committed, gone through the wringer for reviews, etc. In fact, chx seems to have already marked this closed at google's tracker. So, I'm marking this 'fixed' for the initial task here, as well.
I just started looking and found a few minor issues that I submitted separately (http://drupal.org/node/217424 and http://drupal.org/node/217425), but clearly, this is almost entirely ready, and we can move reviews, testing, etc into the issue queue for this new module just as any other.
Fantastic work, boombatower! Welcome aboard being a productive contributor to Drupal. ;) Looking forward to working with you more in the future.
p.s. boombatower: chx said over at the google tracker thingy that you still need to upload code there -- please do so to ensure that you get full credit for this work.
Comment #29
boombatower commented:
I fixed the two minor issues you found and I have uploaded the final code to GHOP.
Comment #30
Anonymous (not verified) commented:
Automatically closed -- issue fixed for two weeks with no activity.