One of the few things that vcapi could and should provide, but currently does not, is an 'activity' log for repositories. This makes, perhaps, somewhat less sense for a centralized vcs, since the 'activity' is pretty much by nature just commits, and maybe branches and tags. But there's a bit of an identity crisis, I'm realizing, inherent in the {versioncontrol_operations} table - recording branch/tag "operations" there doesn't make sense, because there is no vcs I'm aware of in which they're first-class objects in the way that commits are. More importantly, though, that table is kept as an up-to-date mirror of the actual _commits_ that live in a repository - at least, that's how we're using it now that we have dvcses where commits are entities that can actually disappear.

Anyway, that's a bit of a ramble - the point is, if we want a table that truly records 'activity' (that is to say, is never pruned of data unless a whole repository is deleted), it should be a separate table from the operations table. So I propose a {versioncontrol_repository_activity} table. Using such a table, it'd be trivially easy to create things like the activity stream we see on github, gitorious, etc. As a side benefit to this, we can probably return the versioncontrol_operations table to being JUST a versioncontrol_commit table, and cut out some of the abstracted silliness.
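
A minimal sketch of that split, using SQLite from Python purely for illustration. The column names and helper functions here are assumptions, not the actual vcapi schema: the point is only that the commit mirror can be pruned while activity rows survive until the whole repository is deleted.

```python
import sqlite3


def make_schema(conn):
    # Hypothetical columns; the real vcapi tables may look quite different.
    conn.executescript("""
        CREATE TABLE commits (
            repo_id  INTEGER NOT NULL,
            revision TEXT NOT NULL,
            message  TEXT NOT NULL,
            PRIMARY KEY (repo_id, revision)
        );
        CREATE TABLE repository_activity (
            activity_id INTEGER PRIMARY KEY AUTOINCREMENT,
            repo_id     INTEGER NOT NULL,
            uid         INTEGER NOT NULL,
            action      TEXT NOT NULL,
            ref_name    TEXT,
            created     INTEGER NOT NULL
        );
    """)


def record_activity(conn, repo_id, uid, action, ref_name, created):
    # Activity rows are only ever appended, never updated.
    conn.execute(
        "INSERT INTO repository_activity"
        " (repo_id, uid, action, ref_name, created) VALUES (?, ?, ?, ?, ?)",
        (repo_id, uid, action, ref_name, created),
    )


def prune_commit(conn, repo_id, revision):
    # A commit that disappeared from the repository leaves the mirror,
    # but the activity table is deliberately not touched.
    conn.execute(
        "DELETE FROM commits WHERE repo_id = ? AND revision = ?",
        (repo_id, revision),
    )


def delete_repository(conn, repo_id):
    # The only case in which activity rows are removed.
    conn.execute("DELETE FROM commits WHERE repo_id = ?", (repo_id,))
    conn.execute("DELETE FROM repository_activity WHERE repo_id = ?", (repo_id,))
```

With this shape, an activity stream is a single query over repository_activity ordered by created, and it stays stable no matter how the commit mirror is rewritten.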

Marking phase 3 because we don't strictly need this to launch, though I'd love it if a volunteer worked on it.

Comment  File                  Size      Author
#11      ops-to-commits.patch  62.85 KB  sdboyer
#6       ops-to-commits.patch  62.9 KB   sdboyer

Comments

sdboyer’s picture

Just to say a little more on this: there's a theoretical flaw in the plan that treats versioncontrol_operations as an activity table, namely that any vcapi-attached repo must necessarily operate in a position where the best unit of activity to be recorded is network interactions - not commits, not 'labelling', not 'operations', not blah blah blah. While recording activity may have been the original goal, it's simply not something the table is suited for - there's absolutely _no_ guarantee that a given network interaction has a 'message' associated with it, for example. So to ameliorate this conflicted identity the table has developed for itself, we move the true activity log of network interactions into the proposed, separate table.

Longer term, and something that should be covered in a separate issue, would be the effort to re-specialize the 'operations' table into one that's focused on capturing commits, plus, as necessary, tables for branches and tags. Making branch/tag tables would also, naturally, mean reworking some of how the versioncontrol_labels table works.

sdboyer’s picture

Issue tags: -git phase 3 +git phase 2

This actually ought to be phase 2, I think, since we'd probably like to be tracking pushes/network operations right from the start.

sdboyer’s picture

Priority: Normal » Critical
Issue tags: +git sprint 1

marking critical and putting in this sprint. really want to get this one out so tizzo isn't blocked on views

chrisstrahl’s picture

Assigned: Unassigned » sdboyer

Assigning to sdboyer

sdboyer’s picture

Status: Active » Postponed

This is gonna be more than can be gotten done in this sprint, unfortunately - just a little too much.

sdboyer’s picture

Title: Create a repository activity table » Separate "operations" concept back out: commits, branch/tag ops, and true activity
FileSize
62.9 KB

Changing title.

I've made progress on at least moving the commits concept back out, but I don't want to commit it back into CVS until I've solved some of the other problems with it, as the changes made thus far could blow up work other people are doing. Will have to make progress on this next sprint.

Patch posted so that people can at least have a looksee, though.

sdboyer’s picture

Status: Postponed » Needs work

sdboyer’s picture

Title: Separate "operations" concept back out: commits, branch/tag ops, and true activity » Separate "operations" concept back out into commits, branch/tag ops, and a true activity stream

Slightly improve title

sdboyer’s picture

Issue tags: +git sprint 2

adding git sprint 2 tag

sdboyer’s picture

Title: Separate "operations" concept back out into commits, branch/tag ops, and a true activity stream » Separate "operations" concept back out into commits, branch/tag ops, and a true activity stream [d2]
Issue tags: -git sprint 2 +git sprint 3

moving to sprint 3, sprint 2 is overloaded

sdboyer’s picture

FileSize
62.85 KB

OK, have a very preliminary patch here. Really this is just a quick first-round of changes, I need to work more on it before there's much to evaluate. The idea, though, is that we'll probably retain the original underlying table structure ({versioncontrol_operations}) and just make the VersioncontrolCommit class handle all of the loading logic transparently.

Anyway, there are basic changes to the classes and some test updates in the patch.

eliza411’s picture

Issue tags: +git sprint 4

Tagging Sprint 4 for completion Week 1.

marvil07’s picture

Now that #879858: Unify entity C(R)UD got in, we have a clean API for providing such an activity feature from another module, so I just want to point to #498156: Integrate with triggers and the activity module, where IMHO we should implement that.

On the other hand, I think it's better to just use this issue to rename operations to commits (and maybe also clean up that class a little more).

The only other thing we'd need is a way to detect network operations, but I think adding hooks on fetch would be straightforward.

eliza411’s picture

Issue tags: +git sprint 5

Tagging for consideration in git sprint 5

sirkitree’s picture

sdboyer and I discussed using Activity module as a means of recording and displaying activity messages, since that project has some key things figured out already. I don't exactly grok the whole operations/commits change here, but in order to even start on any Activity reporting stuff, I need some help getting set up. Once I have the following, I can open up a new ticket for the Activity integration.

1. Where do I get a copy of d.o cvs converted to git?
2. How do I hook this up in a local install so that I can see records being saved?

sdboyer’s picture

@sirkitree -

  1. Currently, there's no way to rsync all of git, but you can just clone some repos down individually from http://git.drupalcode.org. You really just need one, though, so that should be fine.
  2. Enable versioncontrol & versioncontrol_git, manually add a repository at admin/project/versioncontrol-repositories that points to wherever that local clone is, then on that same page just hit 'fetch logs' on the repository. Hitting that link will ultimately call VersioncontrolGitRepository::fetchLogs() and then the super-function _versioncontrol_git_log_update_repository. You should be able to follow it from there...well, depending on how decipherable the log parsing code itself is :)

Of course, since we're talking about actions, and not just getting data in, admin UI-triggered log fetches are quite different from those triggered by hook scripts. The resulting data will be the same, of course, but the activity logs are concerned with how it gets there, not what gets there.

eliza411’s picture

Issue tags: -git sprint 5

Although we anticipate progress on this issue especially in terms of decisions, it's not officially on the plate for sprint 5.

sirkitree’s picture

So I guess the big question here is: What do we want as a list of activity messages? If I can get a list of messages/desired-end-result - I think I have the necessary background to provide some integration.

Examples of what I'm looking for: (taken from github currently)

1. [username] [action:pushed/more?] to [branch-name] at [repo-name] [time-ago]
2. [username] [action:created/edited/removed/more?] a page in the [repo-name] wiki [time-ago]
3. [username] [action:created/removed/more?] branch [branch-name] at [repo-name] [time-ago]
4. [username] [action:opened/created/closed/more?] [pull-request] on [repo-name] [time-ago]
5. [username] started following [username2] [time-ago]
6. [username] [action:opened/closed/more?] issue [issue-number] on [repo-name] [time-ago]
7. [username] [action:merged/more?] [pull-request] on [repo-name] [time-ago]
8. [username] commented on [repo-name] [time-ago]

I'm not sure that comments and content creation will really be handled here - they'd probably be part of a larger activity integration on d.o - but this at least identifies a few of the actions we might be able to track. Just let me know which ones to start with.

I'm also not sure just yet of the extent of token integration vcapi provides, if any, so that'll be something else to look into/add.

marvil07’s picture

About token: token integration was removed some time ago, but we definitely want to add it for this and for #788134: Re-add rules integration when API is stable.

sdboyer’s picture

@sirkitree - sorry it took me so long to respond. Really, these are just PERFECT questions, exactly the kind we need to hopefully get this sucker done and knocked out :)

As marvil07 points out, we need to redo our token integration, it's been gone for some time.

Here are the ones that have been in my head as particularly useful, and that are centered on vcapi:

[username] [action:pushed] to [branch-name] at [project-name] [time-ago]
[username] [action:created/deleted] [branch-name] at [project-name] [time-ago]
[username1] granted [username2] access to [project-name] [time-ago]

Those all have the advantage of being events that occur serverside, so they make for more of a true activity stream. Git commits, by contrast, are all local and their timestamps are recorded locally, but they get transferred as a batch in a push. I imagine per-commit events could generate really confusing activity streams, so we'd want to stay away from those.
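
To illustrate the timestamp problem, here is a small sketch. The Commit and Push types and both stream functions are made up for this example and are not vcapi classes. It compares a stream built from locally recorded commit times with one built from server-side push times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    revision: str
    author: str
    authored_at: int  # timestamp recorded on the author's own machine


@dataclass(frozen=True)
class Push:
    pusher: str
    branch: str
    received_at: int  # timestamp recorded by the server
    commits: tuple


def commit_stream(pushes):
    """One entry per commit, ordered by the locally recorded time."""
    entries = [
        (c.authored_at, c.author, c.revision)
        for p in pushes
        for c in p.commits
    ]
    return [f"{author} committed {rev}" for _, author, rev in sorted(entries)]


def push_stream(pushes):
    """One entry per push, ordered by when the server received it."""
    ordered = sorted(pushes, key=lambda p: p.received_at)
    return [
        f"{p.pusher} pushed {len(p.commits)} commit(s) to {p.branch}"
        for p in ordered
    ]
```

If alice pushes two commits she wrote days apart and bob pushes in between, commit_stream interleaves their entries at times when nothing actually happened on the server, while push_stream shows each push once, in the order the server saw it.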

Once we have per-issue repositories, I'd love to have something like github's pull request note. For us, it might look like

[username] merged [branch-name] from [issue] into [project-name] [time-ago]

Also, something not strictly under vcapi, but still potentially interesting to track, would be releases, I think:

[username] created release [release-name] at [project-name] [time-ago]

That good enough to start with?
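
For what it's worth, message templates like the ones above could be filled in with something as small as the following sketch. The bracket syntax is taken from this thread, but the render and time_ago helpers are hypothetical and are not the token module's API.

```python
import re

# Matches [name] and [name:hint/other-hint]; only the name is used for lookup.
TOKEN = re.compile(r"\[([a-z0-9-]+)(?::[^\]]*)?\]")


def time_ago(seconds):
    """Turn an age in seconds into a rough human-readable phrase."""
    for unit, size in (("day", 86400), ("hour", 3600), ("minute", 60)):
        if seconds >= size:
            count = seconds // size
            return f"{count} {unit}{'' if count == 1 else 's'} ago"
    return "just now"


def render(template, values):
    """Replace each known token with its value; leave unknown tokens as-is."""
    def replace(match):
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)
    return TOKEN.sub(replace, template)
```

For example, rendering "[username] [action:pushed] to [branch-name] at [project-name] [time-ago]" with username sdboyer, action pushed, branch 6.x-2.x, project versioncontrol and an age of 7200 seconds gives "sdboyer pushed to 6.x-2.x at versioncontrol 2 hours ago".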

sirkitree’s picture

Yes, thanks Sam.

I'll be working on the patch for Activity integration over here: #498156: Integrate with triggers and the activity module

And also the prerequisite to #1020220: Reintroduce token integration

I'm a bit unclear on what, if anything, still needs to be done in this thread.

sdboyer’s picture

Title: Separate "operations" concept back out into commits, branch/tag ops, and a true activity stream [d2] » Meta: introduce an activity stream separate from commit logs
Issue tags: +git sprint 9

I don't think anything really needs doing here, so I'm re-marking this as a meta and retitling it appropriately, as #879730: Replace the entire VersioncontrolOperation "subsystem" with VersioncontrolCommit is where we're doing the separating-ops-and-commits bit.

Also tagging for sprint 9 consideration, maybe we can wrap those other issues up then, and this one too.

sdboyer’s picture

We don't have time to do this properly now, not in any way shape or form. However, we do still need to record push data, which I'll be adding as a bare-minimum logging service with no UI. Once this is done, we can use that backlog of pushes and retroactively generate activity for everything that's happened since launch. But we need that bare minimum service for audits & detecting abuse, etc.
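
A rough sketch of that two-step idea: log raw pushes now, derive activity later. The record format and both function names are assumptions made for illustration. The all-zero SHA convention for created and deleted refs is the one git passes to its receive hooks.

```python
import json

# Git reports a ref that did not exist before (or no longer exists after)
# with an all-zero object name.
ZERO_SHA = "0" * 40


def log_push(log, uid, repo_id, refs, received_at):
    """Append one raw push record; nothing is interpreted at this stage."""
    record = {
        "uid": uid,
        "repo_id": repo_id,
        "refs": refs,  # list of {"name", "old", "new"} dicts
        "received_at": received_at,
    }
    log.append(json.dumps(record, sort_keys=True))


def activity_from_backlog(log):
    """Derive activity events from stored pushes, oldest first."""
    events = []
    for line in log:
        push = json.loads(line)
        for ref in push["refs"]:
            if ref["old"] == ZERO_SHA:
                action = "created"
            elif ref["new"] == ZERO_SHA:
                action = "deleted"
            else:
                action = "pushed"
            events.append(
                (push["received_at"], push["uid"], action, ref["name"])
            )
    return sorted(events)
```

Because the log keeps the raw old and new values for every ref, the activity rules can change later and the whole stream can be regenerated from the same backlog.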

sdboyer’s picture

Issue tags: +vc-next

we have the gsoc-activity branch, which is where this'll happen whenever we get to it. kicking it down the road.

sdboyer’s picture

Status: Needs work » Closed (duplicate)

actually, gonna mark this a duplicate to #498156: Integrate with triggers and the activity module