Race Condition in the code.

pgregg - January 6, 2006 - 00:14
Project:Poormanscron
Version:HEAD
Component:Code
Category:bug report
Priority:critical
Assigned:Unassigned
Status:patch (code needs review)
Description

The code from the module has a large race condition:
1. Check time of last run
2. If > 1 hour, run cron
... time passes for cron to finish
3. Update db with current time.

This will result in multiple instances of the cron tasks running depending on how many requests there are between steps 1-3.

e.g. Say User 1 makes a page request and the script decides it is time to fire off the cron job
For every other page request (from this user or others) that make a request before this cron job finishes and the script falls through to update the time *another* cron task will be started.

The solution is to use lockfiles (or lock columns in a db table - more appropriate for drupal - this also means you should move your last run timestamp into the same table). So your check for the last time it was run will check:
SELECT lastrun FROM crontable WHERE lastrun < NOW()-3600 AND cronrunning=0
(horribly mysqlised, use $then = time()-3600; instead).

If you get a row back then you know that you *may* be running cron this time.
Next you try and lock the column:
UPDATE crontable SET lastrun=$timenow, cronrunning=1 WHERE cronrunning=0

If the number of affected rows / rows updated == 1, then you got the lock, if it ==0 then some other process got the lock and this script can exit without running the cron tasks.
Finally remember to update the cronrunning back to 0 when the cron tasks finish.

If you want to be able to run the cron tasks in the background then have php exec() a shell script which contains "nohup wget -O - -q http://domain/cron.php &" (unix only) - if you get a wget equiv for windows you can use a batch script to launch it. The exec should return immediately and you get control back in php so you can release the cron lock and the crontab will continue to run in the background.

There is a risk that if the cron tasks take > 1 hour to complete then you will again have 2 simultaneous crons running, but then that is true of the regular cron system.

Hope this helps.

Paul.

#1

greggles - January 6, 2006 - 01:06

Is this different from http://drupal.org/node/24001

I feel like this is a duplicate.

#2

pgregg - January 6, 2006 - 01:10

It is different in that the other report appears to try to describe it in relation to a user session. The bug is the same, just the description of the other report is wrong.

#3

James Harvard - January 6, 2006 - 01:46

Does PHP not have the ability to spawn a separate thread in which to execute some code? If so then the original thread (i.e. the user request) could continue and finish immediately, but the cron code could then run for as long as necessary in its new thread.

#4

pgregg - January 6, 2006 - 22:20

Yes, I already described this in my original post: exec() the shell script with nohup and & to put it into the background.

#5

James Harvard - January 9, 2006 - 04:17

Sorry, I actually meant a way to execute PHP code in a separate thread. As you pointed out, your exec() would only work on Unix. Also, would it work in a shared-hosting environment, or do ISPs disable exec()?

In the middleware I normally use (Lasso) one can very easily execute a block of code in a new thread. This is useful if you want to do something time-consuming (e.g. send a bunch of e-mails) and the current page doesn't need to wait for completion. However I can see that because of the way that (AFAIK) PHP is invoked for most web apps (embedded as an Apache module) this might not be possible.

Alternatively, is there a way for PHP to initiate an HTTP request for cron.php without waiting for the response? I assume that include('http://www.example.com/cron.php') would wait. Is there a lower-level TCP function that could be used to make the request but not wait for the response?

#6

dmuth - May 7, 2006 - 18:48
Title:Race Condition in the code.» I fixed this
Status:active» patch (code needs review)

I got bit by a very real instance of this race condition wherein my site got hit by a bot which caused several copies of poormanscron to run at the same time. This in turn caused a higher load which slowed down those copies, which caused even more copies to get run on subsequent pageloads, which sent Apache into a death spiral.

So, I wrote a fix for this. In the process, since the code in poormanscron_exit() was a little complicated (and not commented so well), I refactored that function into a few smaller ones.

The fix uses MySQL's GET_LOCK() function, so at the time this is not database agnostic. Sorry.

Also, I added in code to tell how long it took to execute the crontab, and to print out details on each module's cron hook that gets executed.

Comments welcome!

-- Doug

AttachmentSize
poormanscron_0.module8.19 KB

#7

dmuth - May 7, 2006 - 18:58
Title:I fixed this» Race Condition in the code.

Undoing my change to the title of the issue. I thought the title would be attached to my comment, and not the title of the bug. Sorry about that!

-- Doug

#8

conkdg - June 12, 2006 - 18:19

I've attached a version of dmuth's module that's updated for the 4.7.x form API. It seems to work fine for me. I noticed that it doesn't register a cron run under the settings at http://www.yoursite.com/admin/settings but it does register all the info under watchdog.

Maybe someone out there knows how to fix it up so it registers undr settings as well?

Cheers!

AttachmentSize
poormanscron_1.module8.51 KB

#9

enxox - January 22, 2007 - 20:43

I need to solve the problem descripted, but the poormanscron_1.module doesn't work on my 4.7.5 site.
How can I obtain a release or a patch?
thanks

#10

peterx - February 11, 2007 - 08:21

For PostgreSQL you could use a .install file to set up a table named poormanscron then use a table lock on that. You could also use a variable as a generic lock. I am using the following code in Drupal 5.0 and the code uses the MySQL lock when available. The code was tested first with the variable lock then with the MySQL lock.

Note that this code has limited testing and may not work on any computer that has a bit set to zero or any other value.

<?php
/* From PeterMoulding.com 2007
4/ Add lock processing.
*/
function poormanscron_lock()
    {
    if(
$GLOBALS['db_type'] == 'mysql' or $GLOBALS['db_type'] == 'mysqli')
        {
       
$query = db_query("select get_lock('poormanscron_lock_'" . conf_path() . ", 1) as get_lock");
       
$result = db_fetch_array($query);
        if(empty(
$result['get_lock']))
            {
            return
false;
            }
        }
    else
        {
       
$lock = variable_get('poormanscron_lock', false);
        if(
$lock)
            {
            return
false;
            }
       
variable_set('poormanscron_lock', time());
        }
    return
true;
    }
function
poormanscron_unlock()
    {
    if(
$GLOBALS['db_type'] == 'mysql' or $GLOBALS['db_type'] == 'mysqli')
        {
       
$query = db_query("select release_lock('poormanscron_lock_'" . conf_path() . ")");
        }
    else
        {
       
variable_set('poormanscron_lock', false);
        }
    }
?>

petermoulding.com/web_architect

 
 

Drupal is a registered trademark of Dries Buytaert.