I'd like to propose a refactoring of the cron system for 2.x. It would add some non-Drupal crontab management for scheduled tasks, and then, as a lightweight use case on top of that, run each Drupal site's cron tasks through the scheduled task system.

I talked to anarcat about this at Chicago, and he suggested that for scheduled tasks we should maybe use crontab directly and inject our tasks into it. From reading around, it seems that crontab should be able to cope with lots of tasks; we might just need to be careful about running lots of intensive tasks at the same time.
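One way to avoid many intensive tasks firing in the same minute is to derive each task's minute from its name. This is only a sketch of that idea, not anything Aegir does today: the function names and the hourly schedule are assumptions made for illustration.

```python
import zlib


def staggered_minute(name, period=60):
    """Map a task name to a stable minute offset in [0, period)."""
    # crc32 is deterministic across runs, unlike Python's built-in hash().
    return zlib.crc32(name.encode("utf-8")) % period


def hourly_line(name, command):
    """Build an hourly crontab line whose minute depends on the task name."""
    return f"{staggered_minute(name)} * * * * {command}"
```

Because the offset depends only on the name, regenerating the crontab does not move a task to a different minute.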

This issue is for hashing out ideas and getting an implementation plan together. I'm happy to lead the effort to get this architected and coded.

Essentially, we'd need a service that controls the crontab for the aegir user on any Aegir-controlled system. I think we'd only need a basic implementation, but the option would be there for people to build something for more exotic cron implementations.

I don't think cron has an include mechanism (beyond Debian's extra cron stuff?), so we can't manage the crontab the same way as, say, Apache. However, we can still use that as a model: write things that look like 'include' files, then assemble them and inject the result into the crontab.
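As a rough illustration of the "assemble and inject" idea, here is a minimal sketch. The marker comments, fragment names, and function names are all hypothetical. It assumes a cron implementation whose `crontab -` reads the new table from standard input, which is worth checking on the target system.

```python
import subprocess

# Hypothetical markers delimiting the part of the crontab we manage.
BEGIN = "# BEGIN aegir-managed (hypothetical marker)"
END = "# END aegir-managed (hypothetical marker)"


def assemble_crontab(existing, fragments):
    """Return `existing` with the managed block rebuilt from `fragments`.

    `fragments` maps a fragment name to its crontab lines. Lines outside
    the managed block are kept untouched.
    """
    kept = []
    inside = False
    for line in existing.splitlines():
        if line.strip() == BEGIN:
            inside = True
        elif line.strip() == END:
            inside = False
        elif not inside:
            kept.append(line)
    managed = [BEGIN]
    for name in sorted(fragments):
        managed.append(f"# fragment: {name}")
        managed.extend(fragments[name].strip().splitlines())
    managed.append(END)
    return "\n".join(kept + managed) + "\n"


def install_crontab(text):
    """Pipe the assembled text to `crontab -` for the current user."""
    subprocess.run(["crontab", "-"], input=text, text=True, check=True)
```

Rebuilding the whole managed block each time keeps the operation idempotent: running it twice with the same fragments yields the same crontab.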

Being a service means it can be tied to different servers. For example, we could have a server that executes specific cron tasks: the web server that a site is on could execute Drupal cron for that site. We might need to require drush to be installed for some tasks to complete on 'remote' servers, but this would be a requirement of the server, like needing MySQL on a DB server. Note that I don't see this as an answer to the running-tasks-remotely issue; I just mean that we should be able to run Drupal cron from drush on a remote machine fairly easily.
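To show what "the web server that a site is on runs Drupal cron for that site" could look like, here is a small sketch that builds a crontab fragment for one server. The site records, their keys, the paths, and the hourly schedule are assumptions for illustration; the only real command relied on is `drush cron` with the `--root` and `--uri` options.

```python
import shlex


def drupal_cron_fragment(sites, server, drush="drush"):
    """Build crontab lines that run Drupal cron for the sites on `server`.

    `sites` is a list of dicts with hypothetical keys "uri", "root" and
    "web_server". Each site gets a different minute to spread the load.
    """
    mine = sorted(
        (s for s in sites if s["web_server"] == server),
        key=lambda s: s["uri"],
    )
    lines = []
    for i, site in enumerate(mine):
        # Note: a literal % in a cron command must be escaped; this
        # sketch does not handle that case.
        cmd = "{} --root={} --uri={} cron".format(
            shlex.quote(drush),
            shlex.quote(site["root"]),
            shlex.quote(site["uri"]),
        )
        lines.append(f"{i % 60} * * * * {cmd}")
    return "\n".join(lines) + ("\n" if lines else "")
```

Each server would only ever see the lines for its own sites, so drush is only required on servers that actually host sites.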

Feedback? Questions? Please post some comments!

Comments

anarcat’s picture

I like this idea (hey, I was in the original discussion after all ;). I especially like the idea of "not reinventing the wheel" here: we shouldn't try to reinvent cron for ourselves, but instead use the existing facilities.

Also, by making this pluggable, we could make this run under Jenkins or other similar tools. Jenkins, in particular, supports the crontab syntax for specifying task recurrences, so it's a really good candidate.

I agree with the idea of assembling parts of the crontab into a single one. I have already improved the crontab management implementation a bit in 2.x so that it's simpler and fed from a pipe, btw...

I'm all go for this, and I also agree that this is a different problem scope than running tasks on demand.

Oh, and I wonder how we could assign certain servers to run certain tasks and not others ("this is the cron server")...

Steven Jones’s picture

Interestingly, http://drupal.org/project/job_scheduler has an implementation of crontab in PHP. I'm not sure it would be hugely useful for Aegir to use an all-Drupal solution, but it is interesting. More thought is needed about exactly what we want scheduled tasks to do. Note that Job Scheduler handles periodic and one-off jobs equally well.

anarcat’s picture

Note that for one-off jobs, there's that "at" daemon too. ;) Maybe we can make that modular? Things like Jenkins can also behave that way and would be lots of fun for a lot of people... :)

cweagans’s picture

Is this still on the table for 2.x? I'd love to move the entire queue into Gearman or Beanstalkd or something. Cron is not really a good tool for this - it'd be better to use something that will process tasks as soon as they are queued.

ergonlogic’s picture

Title: Replace cron (and scheduled tasks) system in 2.x » Replace cron (and scheduled tasks) system
Version: 6.x-0.4-alpha3 » 7.x-3.x-dev

@cweagans, we ship with hosting-queued in core now, which the Debian package installs by default. So we pretty much already "process tasks as soon as they are queued."

We won't likely be making this fully pluggable in Aegir 2, though.

anarcat’s picture

2.x should be hackable to do what you want. It won't be pretty because you would be basically duplicating the queue instead of replacing it, but I see no reason why you couldn't disable all queues and just process tasks another way.

Note, however, that this issue is specifically about running cron.php on multiple sites, not necessarily the whole queuing system. There isn't an issue for that yet, but you can see: http://community.aegirproject.org/roadmap/2.0#Modular_queuing_system