The built in cron runner just runs at the end of the request, which is horrible. See #1554872: "Run cron every" setting (cron_safe_threshold) is not an interval but a Boolean for amusing things you can run into now due to this.

It was originally put in running from a 1px gif, but this was reverted later since that 1px gif meant a full Drupal bootstrap every time it was requested (including when that 1px gif was served from a cached page, so you'd hit Drupal even if you were serving pages from varnish).

However there's new tricks now, so we don't need to do either. If we have this, then it'll also be most of the necessary infrastructure for #1189464: Add a 'poor mans queue runner' to core.

This depends on #1447736: Adopt Guzzle library to replace drupal_http_request() since one of the criteria for finding a new http client is that it can handle non-blocking HTTP requests. So marking postponed, but leaving at major since this is a nasty gotcha on sites that don't want it, and crappy performance for those users who are unlucky enough to be the ones to trigger the cron run.

We could also fire each cron hook in its own separate PHP runner to keep memory usage down and prevent an error in one cron job killing the others, doesn't necessarily need to be done here though.

Files: 
CommentFileSizeAuthor
#15 cron-async-request-1599622-15.patch23.89 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 50,741 pass(es).
[ View ]
#13 cron-async-request-1599622-13.patch26.53 KBBerdir
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch cron-async-request-1599622-13.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#8 cron-async-request-1599622-8.patch12.1 KBBerdir
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch cron-async-request-1599622-8.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#8 cron-async-request-1599622-8-system-only-do-not-test.patch1.47 KBBerdir
#4 cron-async-request-1599622-4.patch11.83 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 49,400 pass(es).
[ View ]

Comments

For inspiration, Ultimate Cron already does this for D6+D7 using a non-blocking version of drupal_http_request() via Background Process.

Currently mikeytown2 and I are discussing merging the HTTP Parallel Request & Threading Library with Background Process to produce a cleaner separation and better API for respectively http requests and starting background processes.

I know improving the http request API in D8 is in the works, which among other things would probably include non-blocking request possibilities. Perhaps this task also warrants a library for running arbitrary code in the background?

Regarding the individual cron hooks, an extension of this API is being discussed here: #1442434: Collaboration and Drupal 8

EDIT: forgot the link to the background process discussion #1357652: Merge Projects: Replace Background Process's HTTP client with HTTPRL's.

Status:Postponed» Active

Guzzle has been added to core, so this should now be possible...

Status:Active» Needs review
StatusFileSize
new11.83 KB
PASSED: [[SimpleTest]]: [MySQL] 49,400 pass(es).
[ View ]

Here we go. Seems to be working fine for me. The API is not very intuitive, but works.

Had to add the async plugin, copied the version definition (*) from guzzle itself, but I think this wrong, it seems to do a git clone?

Doesn't this need to acquire a lock to avoid a stampede scenario? (If the site is busy and slow such that HTTP requests are having trouble connecting, then it seems like triggering a new HTTP request off of each one which itself will not connect could be a problem.)

I'm also wondering what happens here on servers that are configured to not be able to make HTTP requests to themselves (see #245990: HTTP request testing unreliable and may be undesirable). As I recall there are reports of 30-second delays in those scenarios, and I assume that would be even before any of this non-blocking stuff kicks in (though I'm not sure).

Regarding the async plugin, I think you can add that plugin to the request rather than the client. So you'd have something like this:

<?php
$request
= $client->get($url);
$request->addSubscriber($async_plugin);
$request->send();
?>

copied the version definition (*) from guzzle itself, but I think this wrong, it seems to do a git clone

See #1834594-28: Update dependencies (Symfony and Twig) follow up fixes for Composer. At some point, there may be a Guzzle release we can use instead, but at least as of a few weeks ago, there wasn't.

StatusFileSize
new1.47 KB
new12.1 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch cron-async-request-1599622-8.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Thanks for the reviews!

- Yes, adding the subscriber to the request is *much* easier, thanks for the pointer.
- Added a lock around the thing, agreed that this makes sense.
- The whole function should probably be moved into a separate listener (it is only invoked by RequestCloseSubscriber) that gets all the things injected. That is, however, not trivial as "the things" means CacheFactory, KeyValueFactory for state, Lock*, guzzle client and there's the do-not-execute-this-on-ajax-requests @todo in RequestCloseSubscriber. Separate issue/follow-up?

* which is broken as it is missing the shutdown function/releaseAll() part as a opposed to lock() and yes, there is an issue for it.

@effulgentsia: Thanks, that explains it. Looking at https://github.com/guzzle/guzzle/commits/master, it seems that the fixes is included in 3.0.6 and 3.0.7, so it should be save to change to that version in yet another separate issue?

Is this the issue referred to in #8

which is broken as it is missing the shutdown function/releaseAll() part as a opposed to lock() and yes, there is an issue for it.

#1825450: Use lock service in lock()

Follow up for

The whole function should probably be moved into a separate listener (it is only invoked by RequestCloseSubscriber) that gets all the things injected. That is, however, not trivial as "the things" means CacheFactory, KeyValueFactory for state, Lock*, guzzle client and there's the do-not-execute-this-on-ajax-requests @todo in RequestCloseSubscriber. Separate issue/follow-up?

needs clarification/correction on what function is to be moved.
#1883896: Move cron invocation to separate listener

Follow up for

Looking at https://github.com/guzzle/guzzle/commits/master, it seems that the fixes is included in 3.0.6 and 3.0.7, so it should be save to change to that version in yet another separate issue?

#1883904: Update to Guzzle 3.0.7

#8: cron-async-request-1599622-8.patch queued for re-testing.

Status:Needs review» Needs work

The last submitted patch, cron-async-request-1599622-8.patch, failed testing.

oops.
#1875818: Update composer.json to use the latest stable 3.0.x version of Guzzle is the one that should be the follow-up.

closed as a duplicate of that: #1883904: Update to Guzzle 3.0.7

Status:Needs work» Needs review
StatusFileSize
new26.53 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch cron-async-request-1599622-13.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Re-roll, no other chances.

Patch is a bit bigger, due to strange changes in the composer files. I updated my local composer version and now it ordered stuff differently it seems...

Status:Needs review» Needs work

The last submitted patch, cron-async-request-1599622-13.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new23.89 KB
PASSED: [[SimpleTest]]: [MySQL] 50,741 pass(es).
[ View ]

That was a re-roll against an old 8.x version, this should be better.

Status:Needs review» Needs work

<?php
+      // Only execute the request if we can aquire the lock. If this fails then
+      // another request is already taking care of it and we don't have to
+      // retry.
+      if (lock()->acquire('cron_request')) {
+       
// Execute the asynchronous request. This will return immediately with a
+        // HTTP/1.1 200 OK and X-Guzzle-Async as the singe header.
+        $url = url('cron/' . state()->get('system.cron_key'), array('absolute' => TRUE));
+       
$request = drupal_container()->get('http_default_client')->get($url);
+       
$request->addSubscriber(new AsyncPlugin());
+       
$request->send();
+       
lock()->release('cron_request');
+      }
?>

As implemented, the locking is pointless. ->send() is asynchronous, so this code path is always really fast.

What would make sense is to try to acquire the cron lock, so that we don't try to reschedule a cron run while the cron is actually already running (we update system.cron_last at the end of the run).

Right, that makes sense.

There is still a window between checking that and executing the HTTP request where we could try to start multiple cron multiple time. Wondering if we don't care (any bigger site should not use this anyway) or if we should add another state flag or something like that. The locking system unfortunately doesn't allow us to acquire a lock and keep it for the requested time even if the process is finished.

IMO it all comes down to if we can reacquire locks across requests.

There's another lock api thread in progress: http://drupal.org/node/1225404

Fixing the locks would be nice. In HTTPRL I hack around core to make the magic happen. As a result of this I also have a cron job to clean up orphaned locks.