| Project: | Drupal core |
| Version: | 8.x-dev |
| Component: | cron system |
| Category: | feature request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | active |
Issue Summary
At least since 4.6, and probably earlier, cron has been (as I see it) an unclassified set of tasks that modules run on every cron run. Each module has to take uncoordinated steps of its own to manage its scheduling, and the net result is that every cron run has to fire all existing hook_cron implementations, potentially leading to memory exhaustion in many hosting situations (see the number of issues around this, notably regarding search).
One way to work around this would be to enable cron-based scheduling instead of module-based scheduling, and/or allow cron to be called to invoke specific cron-ed tasks instead of the whole lot, much like the original UNIX cron uses a crontab instead of relying on every cron job to know when it must be scheduled.
Upsides:
- one instance of scheduling code instead of many incompatible ones
- ...which can justify the work going into a nice scheduling UI
- ...without preventing modules from doing their scheduling themselves
- ...and can still work with the existing hook_cron specification
- ability to reduce the memory/CPU requirements of cron runs. For instance, long-running tasks like search indexing can be run on their own instead of along with the other tasks, reducing the probability of breakage
- ability to minimize the impact of choking tasks, like aggregator updates failing to obtain the upstream source, since these would be performed on their own runs instead of alongside other tasks
- probably still compatible with poormanscron
Downsides:
- potential compatibility loss for some situations (which ones?)
- some form of reentrancy would have to be considered: improperly scheduled cron tasks would be more susceptible than now to being invoked before the previous run has finished. A solution would probably be to handle the runs within cron as some form of critical section, not allowing a scheduled subtask to be started if the previous run has not returned (possibly with a failure situation)
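To make the critical-section idea concrete, here is a minimal sketch. It assumes a per-task lock with an expiry time, so that a crashed run cannot block the task forever. The class `CronTaskLocks`, the function `run_task_exclusively()` and the 240-second default are all hypothetical names and values for illustration, not Drupal APIs, and the in-memory array only stands in for whatever persistent store a real implementation would need.

```php
<?php
// Hypothetical sketch: an in-memory stand-in for a persistent lock store.
class CronTaskLocks {
  // Maps a task name to the timestamp at which its lock expires.
  private $locks = array();

  // Try to take the lock for $task; fail if an unexpired lock exists.
  public function acquire($task, $now, $timeout) {
    if (isset($this->locks[$task]) && $this->locks[$task] > $now) {
      return FALSE;
    }
    $this->locks[$task] = $now + $timeout;
    return TRUE;
  }

  // Drop the lock so the next scheduled run can start.
  public function release($task) {
    unset($this->locks[$task]);
  }
}

// Run $callback only if no previous run of $task still holds the lock.
function run_task_exclusively(CronTaskLocks $locks, $task, $callback, $now, $timeout = 240) {
  if (!$locks->acquire($task, $now, $timeout)) {
    return 'skipped';
  }
  try {
    call_user_func($callback);
    return 'done';
  }
  finally {
    // Released even if the callback throws.
    $locks->release($task);
  }
}
```

With something like this, a second invocation of the same task while the first is still running would be skipped rather than stacked on top of it.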
What do you think of it?
Comments
#1
Setting to future version.
#2
It would be nice to have Drupal emulate a "virtual crontab" (only that it can't work in real time, so it would have to be slightly different).
The problem with a simple solution - like having hook_cron() pass an argument to the module that tells it when it was last triggered - is that it's still relatively useless information to the module: It needs to know when it last did whatever it does, not when cron was last fired.
A better method might be a scheduling hook. Primitive example:
<?php
function hook_cron($op, $delta = NULL) {
  switch ($op) {
    // Describe the tasks this module provides and how often each should run.
    case 'list':
      return array(
        0 => array('description' => t("Optimize table"), 'interval' => 3600),
        1 => array('description' => t("Rebuild search index"), 'interval' => 300),
      );

    // Run the single task identified by $delta.
    case 'run':
      switch ($delta) {
        case 0: return optimize_table();
        case 1: return rebuild_index();
      }
  }
}
?>
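For what it's worth, the calling side of such a hook could look roughly like the sketch below. It assumes the cron system stores a last-run timestamp per task (not per cron run), which is the point made above. `example_cron()` and `run_due_tasks()` are hypothetical names, and the `$last_run` array stands in for persistent storage.

```php
<?php
// Hypothetical module implementing the proposed two-operation hook.
function example_cron($op, $delta = NULL) {
  switch ($op) {
    case 'list':
      return array(
        0 => array('description' => 'Optimize table', 'interval' => 3600),
        1 => array('description' => 'Rebuild search index', 'interval' => 300),
      );

    case 'run':
      return "ran task $delta";
  }
}

// Run every listed task whose interval has elapsed since its own last run.
// $last_run maps "function:delta" to a timestamp and is updated in place.
function run_due_tasks(array $functions, array &$last_run, $now) {
  $results = array();
  foreach ($functions as $function) {
    foreach ($function('list') as $delta => $task) {
      $key = "$function:$delta";
      $previous = isset($last_run[$key]) ? $last_run[$key] : 0;
      if ($now - $previous >= $task['interval']) {
        $results[$key] = $function('run', $delta);
        $last_run[$key] = $now;
      }
    }
  }
  return $results;
}
```

Calling `run_due_tasks(array('example_cron'), $last_run, time())` on every cron run would then only fire the tasks that are actually due.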
#3
Nice feature, this. Please don't forget to fix cron in D7.
#4
This would be great, and would help a lot with the issues that cause #131536: Make cron watchdog more granular and informative to be necessary for cron debugging.
#5
Added subtitle.
#6
Moving to new cron system component.
#7
Marked #246871: Flexibility in Drupal Cron scheduling a duplicate of this issue.
#8
What about a modification to hook_cron that would make it a registration of events, returned in an array much like hook_theme? The data in the array would contain default values for the cron task timings, and a UI would let site administrators reschedule those timings as they wish. For module foo, the hook_cron implementation would look something like the following, to be refined in the discussion to follow.
<?php
/**
 * Implementation of hook_cron().
 */
function foo_cron() {
  $items = array();
  $items[] = array(
    // Name of the function to call when the task is due.
    'task' => 'foo_batch_task',
    'time' => array(
      'period' => 'hour',
      // Times of day are quoted strings; a bare 08:00 is not valid PHP.
      'hours' => array(
        '08:00',
        '20:00',
      ),
    ),
  );
  return $items;
}

/**
 * A task to be performed in cron.
 */
function foo_batch_task() {
  ...
}
?>
Obviously I'm whiteboarding here, and the data returned from the hook_cron implementations needs to be refined. The cron.php script would be changed to call the registered tasks at the appropriate time instead of iterating through the list of hook_cron implementations.
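One way cron.php could decide that a registered task is due is sketched below. It assumes the 'hours' entries are 'HH:MM' strings interpreted in UTC, and that the system remembers when each task last ran. `task_is_due()` is a hypothetical helper, and the 'period' key is ignored here since its meaning has not been settled.

```php
<?php
// Return TRUE if any 'HH:MM' slot (UTC) falls after $last_run and at or
// before $now. Both arguments are Unix timestamps.
function task_is_due(array $item, $last_run, $now) {
  foreach ($item['time']['hours'] as $slot) {
    list($hour, $minute) = array_map('intval', explode(':', $slot));
    // Today's occurrence of the slot, in UTC.
    $occurrence = gmmktime($hour, $minute, 0, (int) gmdate('n', $now), (int) gmdate('j', $now), (int) gmdate('Y', $now));
    if ($occurrence > $now) {
      // Not reached yet today, so fall back to yesterday's occurrence.
      $occurrence -= 86400;
    }
    if ($occurrence > $last_run) {
      return TRUE;
    }
  }
  return FALSE;
}
```

A nice side effect of comparing against the last run is that a slot missed because cron was down fires once on the next run instead of being lost.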
#9
It would seem to me that the best way would be to be able to weight tasks for cron and then set parameters so that tasks with certain weights run in a different job. Also, a specific module should be able to specify itself as its own cron job. So, for example, the migrate module (http://drupal.org/project/migrate) could have its own cron.
Would there be one cron task that looks for something telling it which cron tasks to run, or should files be created that represent the various cron tasks to run? How does this integrate with the implementation of poormanscron in Drupal 7?
#10
I suggest hook_cron() to act like hook_menu() or hook_theme() (like already proposed). This also opens up for implementing a hook_cron_alter() like the hook_menu_alter().
Elysia Cron and Ultimate Cron already do this in some way or another, which could provide inspiration for a new and more versatile cron system in Drupal 8.
#11
I would love to see the likes of Elysia Cron as default install for core.
#12
I would like to see a weighted cron system. Simply specifying an interval does not take into account the specific site implementing the cron: not all sites need the recommended interval set on cron runs. An interval could also be problematic because, if several cron passes have been missed, all the cron tasks will fire at the same time, which should be OK but is probably undesirable. What I think would be better is for the cron system to take all of the cron hooks in the system (btw, I like the idea of being able to declare multiple cron tasks in one cron hook), look at each task's weight (configurable on a cron admin page), look at how long that task took to run on its last pass, and use an equation to decide which tasks to run this time, based on a calculated weighting system. I also think that implementing a hook_cron_info hook to declare cron functions would allow this to be backwards compatible.
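As a very rough illustration of the selection step, the sketch below assumes the simplest possible "equation": take tasks in weight order and skip any task whose last measured duration would push the run over a time budget. The function `select_tasks()`, the array keys and the budget are all hypothetical; the real weighting formula is still to be worked out.

```php
<?php
// Pick tasks for this run: lightest weight first, skipping any task whose
// last measured duration would push the run over $budget seconds.
function select_tasks(array $tasks, $budget) {
  uasort($tasks, function ($a, $b) {
    if ($a['weight'] == $b['weight']) {
      return 0;
    }
    return $a['weight'] < $b['weight'] ? -1 : 1;
  });
  $selected = array();
  $used = 0;
  foreach ($tasks as $name => $task) {
    if ($used + $task['last_duration'] <= $budget) {
      $selected[] = $name;
      $used += $task['last_duration'];
    }
  }
  return $selected;
}
```

A task that never fits the budget would starve under this naive rule, so a real version would probably also need to age tasks that keep getting skipped.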
@gielfeldt, what would hook_cron_alter be able to change? The weighting/interval? The callback? Stop the callback?
I might have a crack at writing this system in the next few days.
#13
BTW, http://drupal.org/project/job_scheduler allows scheduled jobs to have a crontab syntax for determining when they are run.
#14
@timhilliard I would say that hook_cron_alter() should be able to change anything that has been declared in hook_cron(), including removing a job by unsetting it in the array.
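A minimal sketch of that collect-then-alter flow follows, assuming tasks are registered in an array keyed by task name (the earlier whiteboard example used a plain list, so the keys are an assumption). `foo_cron()`, `bar_cron_alter()` and `build_cron_registry()` are hypothetical names used only for illustration.

```php
<?php
// Hypothetical registrations, keyed by task name.
function foo_cron() {
  return array(
    'foo_batch_task' => array('interval' => 3600),
    'foo_cleanup' => array('interval' => 86400),
  );
}

// Hypothetical alter implementation: changes one interval, removes one job.
function bar_cron_alter(array &$tasks) {
  $tasks['foo_batch_task']['interval'] = 7200;
  unset($tasks['foo_cleanup']);
}

// Collect all registrations, then let every alter function modify them.
function build_cron_registry(array $info_functions, array $alter_functions) {
  $tasks = array();
  foreach ($info_functions as $function) {
    $tasks = array_merge($tasks, $function());
  }
  foreach ($alter_functions as $function) {
    // $tasks is passed by reference because the alter function declares it so.
    $function($tasks);
  }
  return $tasks;
}
```

Keying by task name is what makes the unset() possible; with a plain numeric list an alter hook would have to search for the job first.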
#15
Let's suppose we have 3 cron hooks to be executed:
- node_cron
- system_cron
- search_cron
They are (as all cron hooks by default) included in the group "all" or "default". Let's call it a "channel".
Cron.php is called hourly from crontab as usual and the "all" channel is run by default.
Now we want to change search index scheduling:
We create a new channel "daily" and via some kind of drag interface we move search_cron from channel "all" to newly created channel "daily".
We add to the OS crontab another line like
cron.php?cron_key=KEY&channel=daily
with its own scheduling.
Channels do not have to match scheduling intervals: there can be channels like "housekeeping", "heavy_tasks", "offpeak" or whatever your particular installation requires, as long as you later add the specific channel crontab entry. Anyway, if you are tweaking cron, you should probably have at least the knowledge of how to add a crontab entry. I prefer some sort of weighted system instead of specifying time intervals from inside Drupal, as mentioned earlier.
This respects the assumption that cron is simply invoked externally from Drupal without having any control or beforehand knowledge of how often it will be called as I think it is currently.
Perhaps this can split responsibilities between the site admin and the host system admin, regarding who decides/controls what, better than is the case now, but that's a supposition since I usually wear both hats. This may or may not be desirable, too.
At crontab file level, it's "call cron.php often enough and let it decide the scheduling interval" vs. "call each channel with this exact scheduling and don't let Drupal [site admin] decide anything about the scheduling interval".
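The channel lookup itself could be as small as the sketch below. It assumes channel assignments are stored as a map from hook name to channel name, with unassigned hooks staying in "all". `hooks_for_channel()` is a hypothetical helper; in practice the channel name would come from the `channel` query parameter of cron.php.

```php
<?php
// Return the hooks to invoke for $channel. Hooks without an explicit
// assignment stay in the "all" channel.
function hooks_for_channel(array $hooks, array $assignments, $channel = 'all') {
  $selected = array();
  foreach ($hooks as $hook) {
    $assigned = isset($assignments[$hook]) ? $assignments[$hook] : 'all';
    if ($assigned === $channel) {
      $selected[] = $hook;
    }
  }
  return $selected;
}
```

With search_cron assigned to "daily", the usual hourly call would run only node_cron and system_cron, and the extra crontab entry would run search_cron alone.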
Note that I do not have a deep understanding of cron's implementation, so:
- Some base assumptions on my part may be plain wrong.
- I might be biased in favor of keeping it simpler, or at least not much different from how it works currently.
What do you think about this system?
#16
Having found elysia_cron (or an equivalent) to be a necessity on anything but the simplest of Drupal sites, and having helped a ton of people with cron issues (usually one module's cron task breaking cron on a site), I think at least having the ability to run different modules' cron tasks at different times would be helpful. The other thing I would love to see is the ability for a module to specify an array of cron tasks (rather than having to try to schedule different things to happen on different cron runs).
See #1442434-15: Collaboration and Drupal 8.
#17
Something like a registry of callback functions to execute with the registry entry giving the cron run frequency and time limit?
Something like
$registry[] = array(
  'callback' => 'mymod_foo',
  'parameters' => array(1, 2),
  'frequency' => 1,
  'time limit' => 60,
);
# A 'frequency' value of 1 means always, 2 means every 2nd run, 3 means every 3rd run, etc.
# An integer 'time limit' value is the max time in seconds; a string value of an integer followed by '%' is a percentage of the max time.
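Interpreting those two values could look like the sketch below. It assumes cron keeps a running count of its runs, and that "max time" is the total number of seconds a cron run is allowed. `runs_this_time()` and `time_limit_seconds()` are hypothetical helper names.

```php
<?php
// TRUE when a task with the given frequency should run on run number $run.
// A frequency of 1 matches every run, 2 every 2nd run, and so on.
function runs_this_time($frequency, $run) {
  return $run % $frequency === 0;
}

// Convert a 'time limit' value to seconds. Integers are taken as seconds;
// strings such as '25%' are taken as a share of $max_time.
function time_limit_seconds($limit, $max_time) {
  if (is_string($limit) && substr($limit, -1) === '%') {
    return (int) floor($max_time * (int) rtrim($limit, '%') / 100);
  }
  return (int) $limit;
}
```

For example, with a max time of 240 seconds, a 'time limit' of '25%' and a 'time limit' of 60 would both give the task 60 seconds.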
#18
Something along those lines, yes. Although I don't know if setting a frequency would be as flexible as I'd like. I often run cron every minute on a site, and I have some tasks run every minute, some every 5, some every hour, some twice a day, once a day, once a week, etc., so it'd be tough figuring out all those intervals (but much better than the current situation).