Closed (fixed)
Project:
The Great Git Migration
Component:
Decisions
Priority:
Critical
Category:
Task
Created:
25 Aug 2010 at 14:36 UTC
Updated:
20 Dec 2010 at 21:57 UTC
In order to implement #852334: Add a mechanism to create Git repos on-demand for new projects, we need to choose a job queuing system for creating project repos.
After talks with Narayan, I think we're leaning towards beanstalkd, but we need to finalize this.
Comments
Comment #1
webchick commented:
Here's a post from the GitHub blog about all the different ones they tried and ruled in/out:
http://github.com/blog/542-introducing-resque
Comment #2
chrisstrahl commented:
sdboyer to talk with David and Narayan and follow up on the DrupalCon CPH discussion. Update to be published.
Comment #3
sdboyer commented:
Narayan and I had originally discussed using beanstalkd for our job queueing needs, but David made a convincing argument in CPH that we'd be better off sticking with Hudson, at least initially. Most important is the basic reality that a) we've already got Hudson set up and working swimmingly on infra, and b) it's perfectly capable of having jobs remotely triggered (my initial concern). We don't have, and probably won't have for the reasonably foreseeable future, the sort of scalability concerns wrt a job queue that GitHub does (as described in the link from #1). Hudson has the added benefit of all that logging, which'll probably be especially helpful for us early in the process while we iron out the kinks. In the long term, if we need to replace Hudson with something lighter, that'll be doable.
Using Hudson to manage our web-triggered jobs means a few things. From my (limited) understanding, it's doubtful we'll be able to do an effective DrupalQueue implementation that works directly with it; rather, PHP will just have to send a notification to Hudson and hope that everything works out right. Within Hudson, we'll have two types of jobs: direct passthroughs, where the calling PHP more or less determines exactly what logic should be run and Hudson's just a dumb wrapper, and more structured jobs that just take arguments from PHP. Ideally we'll start with the former and move towards the latter wherever possible. We lose a lot of Hudson's benefits if it just acts as a wrapper for different sorts of jobs, as all of its trend and failure reporting data becomes kinda arbitrary and unhelpful.
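To make the two job types concrete, here is a minimal sketch of the notification the web side would send. It is written in Python rather than PHP purely for illustration, and it only builds the trigger URL instead of sending it. It assumes Hudson's "trigger builds remotely" feature with an authentication token is enabled on the job; the host, job names, token, and parameter names (COMMAND, PROJECT) are all hypothetical and not something decided in this issue.

```python
from urllib.parse import quote, urlencode


def build_trigger_url(base_url, job_name, token, params=None):
    """Return the URL that remotely triggers a Hudson job.

    Without params the plain `build` endpoint is used; with params the
    `buildWithParameters` endpoint is used and each param becomes a
    query-string argument.
    """
    endpoint = "buildWithParameters" if params else "build"
    query = {"token": token}
    query.update(params or {})
    return "{}/job/{}/{}?{}".format(
        base_url.rstrip("/"),
        quote(job_name, safe=""),
        endpoint,
        urlencode(query),
    )


# Passthrough job: the caller decides exactly what runs and Hudson only
# wraps it. Host, job name, token and parameter are hypothetical.
passthrough_url = build_trigger_url(
    "http://hudson.example.com",
    "git-passthrough",
    "SECRET",
    {"COMMAND": "git init --bare /var/git/project/foo.git"},
)

# Structured job: the job owns the logic and only takes arguments.
structured_url = build_trigger_url(
    "http://hudson.example.com",
    "create-project-repo",
    "SECRET",
    {"PROJECT": "foo"},
)
```

Requesting either URL would be the whole "notification"; the caller learns that the request was accepted, not that the job succeeded, which is the "hope that everything works out right" part. The structured form also keeps one Hudson job per kind of work, so per-job trend and failure reports stay meaningful.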
Comment #6
tizzo commented:
I just opened a new issue, forgetting about this one: #1003642: Determine job queue for Git operations
I think deploying beanstalkd in production will be easier than getting the relevant people up to speed on Hudson. Hudson may also very well have issues handling the floods of new jobs that might come in when a batch of commits is queued to be added to Drupal (we think each commit may be queued individually when large pushes, like big merges, come in).
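The size of that flood is easy to see with a small sketch. It uses Python's standard `queue.Queue` as a stand-in for whichever job queue is chosen (it is not a beanstalkd or Hudson client), and the repo name, commit ids, and push size of 250 are made up for illustration.

```python
import queue


def enqueue_push(job_queue, repo, commits, per_commit=True):
    """Enqueue the work for one push and return how many jobs were created.

    per_commit=True creates one job per commit; per_commit=False creates a
    single job carrying the whole list of commits.
    """
    if per_commit:
        jobs = [{"repo": repo, "commits": [c]} for c in commits]
    else:
        jobs = [{"repo": repo, "commits": list(commits)}]
    for job in jobs:
        job_queue.put(job)
    return len(jobs)


# A hypothetical big merge of 250 commits.
commits = ["c{:03d}".format(i) for i in range(250)]

per_commit_queue = queue.Queue()
batched_queue = queue.Queue()

per_commit_jobs = enqueue_push(per_commit_queue, "drupal", commits, per_commit=True)
batched_jobs = enqueue_push(batched_queue, "drupal", commits, per_commit=False)
```

With per-commit queueing, a single push turns into as many jobs as it has commits, so whatever system is chosen has to absorb that burst; batching per push would trade that for larger individual jobs.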