This project is not covered by Drupal’s security advisory policy.

What does this do?

Drunkins

  • defines a clear and (hopefully) simple interface for Drupal developers to write classes that represent a 'job', which processes items;
  • provides ways to run those jobs: interactively (through the Batch API), in the background (on cron), or using Drush;
  • allows a UI for interactive job runs to provide extra settings, which change job behavior;
  • provides a 'job runner' for cron that is explicitly suitable for processing items on short-lived HTTP requests. (It may still need to be tweaked to be fully reliable in some cases, but this is possible.)

Drunkins is only suitable for jobs that process items one by one; it is not meant for anything else (e.g. starting 'background' shell processes and monitoring them).

What can be done with this?

Widely different things :)

Personally, I've run this in production since 2012 (while developing it into a reusable 'job runner'), to interface between various Drupal Commerce websites and external ERPs (AFAS and Exact Globe). That is: I've written Drunkins jobs to perform product/stock/user imports and to send orders back into the external system in a reliable way, on cron. The importer code is included with this module; it's suitable for e.g. creating combinations of Products and Display nodes, creating customer profiles, and creating node references for 'linked' products.

I've also implemented other data synchronisations between external systems, e.g. from an AFAS contact database into Sharpspring. (The fetching and processing parts of the job are open sourced in different projects, and it takes only some configuration and a little custom code to put a daily automatic contact-database synchronization together.)

Obviously these processes can be implemented in other ways, but I've found it's quite nice to be able to interactively test (in browser/batch) processes and then be confident that my code will run unattended on cron, without changes.

What would be more interesting for Drupal in general is, for instance:

  • a job that safely deletes all (comments and) nodes, before automatically deleting the content type at the end (depending on whether any errors were encountered). This should be quite easy to code.
  • code that switches between 'runners'. For example, you could start the above comment/node deletion as a batch process while deleting a content type; if it turns out the batch process needs to delete tens of thousands of entities, a button would let you continue processing everything in the background, on cron. Or conversely, if some job is processing items on cron but you want to finish it faster, you could 'pick it up' in your browser and run the rest through the Batch API. This is possible to do safely within the Drunkins module (but still needs to be coded - and it's not so easy).
  • a 'drunkins job' that can handle Views Bulk Operations, thereby enabling running (many) bulk operations on cron / in drush.

How to work with this (as a developer)

You don't worry about the code that runs the jobs; you only implement the job itself. And if you want your job to run on cron, you take care that a single method call does not take up too much time.

Writing a job means writing a class that implements DrunkinsJobInterface. This has three main methods:

  • start(), which is run at the start of the process and returns the items to process;
  • processItem(), which is run for every item;
  • finish(), which is run after all items have been processed;
  • (and also settingsForm(), to be able to tweak settings before submitting a job run through the Batch API.)

All three methods receive a context array, which can be manipulated throughout the process. A job also has settings. As long as there is no other documentation, read the comments in job.inc and fetcher.inc for more info on how to use these.
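As a minimal illustration, a job class could look roughly like the sketch below. The interface and the four method names come from DrunkinsJobInterface as described above; the class name, the item structure, the 'results' context key and the exact method signatures are assumptions made for this example - check job.inc for the authoritative definitions.

```php
<?php

/**
 * Hypothetical example job. Interface and method names are from
 * DrunkinsJobInterface; everything else (class name, item structure,
 * the 'results' context key, exact signatures) is illustrative only.
 */
class ExampleDrunkinsJob implements DrunkinsJobInterface {

  /**
   * Run at the start of the process; returns the items to process.
   */
  public function start(array &$context) {
    $context['results']['processed'] = 0;
    // An item can be anything that processItem() understands.
    return array(
      array('id' => 1),
      array('id' => 2),
    );
  }

  /**
   * Run for every single item. If the job should run on cron, keep
   * this quick enough to fit into a short-lived HTTP request.
   */
  public function processItem($item, array &$context) {
    // ... do the actual work for one item here ...
    // The context array persists over the whole run, so it can hold
    // counters for reporting in finish().
    $context['results']['processed']++;
  }

  /**
   * Run once, after all items have been processed.
   */
  public function finish(array &$context) {
    return t('@count items processed.',
      array('@count' => $context['results']['processed']));
  }

  /**
   * Extra settings shown before submitting an interactive (batch) run.
   */
  public function settingsForm() {
    return array();
  }
}
```

Note how the context array ties the three phases together: start() initializes a counter, processItem() increments it, and finish() reports it - the kind of end-of-job reporting that plain Drupal queues cannot do.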

Common questions:

We already have this. It's called Drupal Queues.

Actually, no. Drupal queues can have items put into them, but they have no notion of a "job/process" that is "finished" at some point. We often want to do something like reporting the number of items processed by a job, or doing something (like the above: deleting a content type) after all items have been processed. The standard Drupal queueing mechanism does not support this; Drunkins jobs (which use Drupal queues) do.

We already have this. It's called the Batch API. Why make yet another abstraction layer?

Because I want to be able to run things (like daily imports/synchronisations/long deletions) silently in the background, unattended, and I want to only get notified if an error happens.

... but I have to admit that Drush (which can run batch processes) can do largely the same thing. IMHO what I have written is nice, but its added value is not very clear/big. I mean, who really wants to run things like imports on 'regular' cron runs if you can do it through command-line PHP, which does not time out?

    (This whole thing started at the end of 2011 with some existing code that was doing imports on cron, which was a bit buggy - plus duplicated code to test the same process through the Batch API. This made me want to de-duplicate the import code so the same code could run on the Batch API and on cron... and this module grew out of it. But if I had planned properly, and had known that I'd be spending so much work on it, I might have just used batch processes and run them through Drush. The added value isn't clear enough.

    I do still like my own system, however, so I am using it in several production websites. And since some of my clients use crappy hosting, I am happy that I now have several pretty advanced synchronization methods that can run on 'stock' Drupal, independent of Drush. I was always planning to do a proof of concept for bulk deletion of nodes that can run on cron... It just hasn't happened so far because I keep finding things to fix up first.)

Why make yet another import framework? We already have Migrate and Feeds.

This is very true. The intention was never to build an 'import framework' to replace the existing ones; the intention was to build generic 'runner' code that could run jobs on top of both the Batch API and cron/queues.

    I just started out with an existing Drupal Commerce website that was already importing items on cron, and the intention was to fix the existing code and make it work interactively too. In the process, this module happened.

    In an ideal world, I would have checked whether the existing import could be changed to work on top of Migrate. So far I've never really had the time and courage to do such a big overhaul of existing production code. Changing the existing code to fit new requirements always seemed safer - which, after a lot of small rewrites, has resulted in a generic Drupal-entity import class and a set of Drupal Commerce-specific subclasses.

    Having said that: around 2013, the Migrate (D7) code did not strike me as very capable of processing item by item on HTTP requests with a limited lifetime (cron). (I might be wrong.) And while Feeds has the ability to do processing on cron, it's not scalable: large feeds will just break processing on limited-lifetime HTTP requests.

Does this only work on D7?

So far, yes, and there are no real plans for porting to D8 yet, unless there is interest from others. I have not been able to spend as much time on it as I'd hoped.

I'm curious and have questions that are not answered here.

Please contact me. I'm interested in not letting this project slowly die.

Roadmap

Note: the roadmap is more like a wish list because I don't have time dedicated to work on this. The project always seemed to generate new requirements that should be included in an alpha release...

  • Change the interface definition so that:
    • start() can return any type of 'forward-only iterator', that is: not only an array but any generator as well. (This seems to be mostly a question of documentation.)
    • finish() does not return a message, but just logs any messages and has an 'undefined' return value.
  • Work out exactly how we want to implement settingsForm() (the submit functionality needs work) and create a separate interface for it.
  • Rework the current procedural 'runner' code into classes. (I already have a special-purpose subclass in mind for cron.)
  • Implement and document 'capabilities', and the possibility for jobs to only run on a runner that advertises a minimum set of capabilities. Document what this means. (It's in my head only at the moment.)
  • Write tests for everything. (It's seriously creepy that this still has none.)
  • Port to D8.
  • Implement "send to background" button in Batch screens processing a job, and "pick up and process the rest as batch" functionality for jobs running on cron.
  • Implement bulk processing jobs for Drupal core. (As in: bulk deletion of comments/nodes before deleting a content type, or before deleting a language.) This will be fairly easy.
  • ???
  • Profit!

Project information

  • Created by roderik on , updated
  • This project is not covered by the security advisory policy.
    Use at your own risk! It may have publicly disclosed vulnerabilities.
