Incompatible with page caching (incomplete cron runs, possible PHP fatal errors)

mrgoltra - March 8, 2008 - 20:11
Project:Poormanscron
Version:6.x-1.1
Component:Code
Category:bug report
Priority:critical
Assigned:Unassigned
Status:needs review
Description

I keep getting this

Fatal error: Call to undefined function format_interval() in
/hsphere/local/home/mysite.com/sites/all/modules/poormanscron/poormanscron.module on line 57

anyone have any ideas?

#1

gpk - August 12, 2008 - 10:46
Title:Fatal Error» PHP fatal error can occur with Drupal's normal cache mode (can render site unavailable to the visitor)
Category:support request» bug report

I've just had this error.

The function format_interval() http://api.drupal.org/api/function/format_interval/5 is in common.inc.

What I reckon may have happened is as follows:
- normal page cacheing is turned on
- I visited a page on the site as an anonymous, first time visitor (so the page was not already stored in the browser's local cache)
- Drupal found the page in the {cache_page} table and tried to return it to the browser (the specifics: code execution had reached http://api.drupal.org/api/function/_drupal_cache_init/5 with $phase == DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE) and we are in the elseif code block i.e. the CACHE_NORMAL case).
- walking through the code from this point, first of all module.inc gets included
- then we hit bootstrap_invoke_all('init'). This causes all modules with either init or exit hooks to be loaded, and those with init hooks have them called. In my case this means that content.module (CCK), og.module and statistics.module get loaded, and content_init() and og_init() get called.
- then we hit drupal_page_cache_header($cache); This sets headers, sends them to the browser, and prints the (cached) page content to the browser
- then we hit bootstrap_invoke_all('exit'). In my case, poormanscron_exit() and statistics_exit() will be called, in that order. Since I have progress logging of the poormanscron run enabled, lines 51 - 57 of poormanscron.module get executed (http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/poormanscro...)
- the first problem here is that in this situation only the cron hooks of modules that also have init or exit hooks will be called this these are the only ones that will be loaded at this point
- the second problem, as you have seen, is that in line 57, format_interval() is called. However, common.inc has not yet been included (it only gets loaded in the final bootstrap phase, at the bottom of http://api.drupal.org/api/function/_drupal_bootstrap/5).

The solution AFAICS would be to force a full Drupal bootstrap at around line 48 of poormanscron.module. As well as making sure that common.inc has been included, this would also ensure that all cron hooks get called.

By recreating the conditions above, I have successfully caused the error to occur again :-).

When I first hit this problem it actually interrupted the sending of the (cached) home page data to the browser, so the home page didn't display correctly. Because the data was then cached locally by the browser, the site was completely broken!!! Therefore changing status of this issue to critical.

#2

gpk - August 10, 2008 - 20:21
Priority:normal» critical

#3

gpk - August 11, 2008 - 14:00
Status:active» needs review

Here's a patch for 5.x. Seems to work fine, I'll leave the patched code active on my site and monitor behaviour.

AttachmentSize
poormanscron-bootstrap-231724-5.x.patch 565 bytes

#4

gpk - August 11, 2008 - 14:09
Version:5.x-1.x-dev» 6.x-1.x-dev

And a patch for 6.x/HEAD, untested.

AttachmentSize
poormanscron-bootstrap-231724-6.x.patch 481 bytes

#5

gpk - August 12, 2008 - 12:25

Hmm actually it's not so simple.

While the DRUPAL_BOOTSTRAP_PATH phase poses no problems, DRUPAL_BOOTSTRAP_FULL is another matter (e.g. see http://api.drupal.org/api/function/_drupal_bootstrap_full/5). It sets a header, but oh dear if a cached page has already been sent then the headers have already been sent -> PHP error..!

A few possible ways forward I can see:

1. Replicate much of _drupal_bootstrap_full() in poormanscron_exit() ... this may be one of the the simpler ways forward if not the most elegant. There could potentially be issues though with some of the code being run twice.

2. Use hook_init() (D5) or hook_boot() (D6+) in poormancron.module to do the full bootstrap there, but only if we are going to do the cron run and if we are serving the page from Drupal's page cache (note that actually if Drupal sends a 304 header then the exit hooks are *not* called!). Complexities here are firstly that we need to do the "Shall I run cron?" tests in init/boot hook instead of exit, and then probably store a positive result in the session. More importantly, checking if we are going to serve the page from Drupal's page cache is probably means replicating some of http://api.drupal.org/api/function/drupal_page_cache_header/5, which seems undesirable. Also one of the headers (the one set in _drupal_bootstrap_full() would get duplicated but this is probably not important.

3. A bit like 2, put the check for "Shall I run cron?" in hook init/boot and if the answer is yes then force Drupal not to serve a cached result, e.g. by adding a special drupal_set_message() which can be removed later. [update] But since the decision on whether to serve a cached page or not has already been made by the time that init hooks are called, it would be necessary then to do a drupal_goto() or similar to force the page request to be restarted. Also some page requests cause hook_init() to be invoked but not hook_exit() [e.g. those that return a 304 header, requests for system/files/*, ...]. This would need careful handling. This option is beginning to look rather complex.

4. Only do the cron run if we have fully bootstrapped. This means that the cron run may be slightly delayed because it won't be run if either a 304 is sent (as a present) or if the page is served from Drupal's page cache; however, that's probably not crucial. Only catch - I'm not sure how we tell if a full bootstrap has happened. [update] But in D6+ at least there is a way round this - in poormanscron_exit() we can check if the page cache is being used to serve the current page request by calling page_get_cache(TRUE). If this returns FALSE then we are safe to do the cron run.

5. An simple partial fix would be just to include common.inc and load all modules, but since modules' cron hooks can reasonably assume that a full bootstrap has been performed (see ) this is not a 100% answer.

[updated] Hence 4. looks to be most promising for D6+; for D5 the jury's still out!

#6

gpk - August 12, 2008 - 11:01
Status:needs review» needs work

Instead of the above patches at #3 and #4, here is a simple fix that should work in 90% of cases (this is essentially point 5 from #5 above). Inserting the following after the line $saved_messages = drupal_set_message();):

<?php
   
// Ensure that common.inc is loaded.
   
require_once './includes/common.inc';
   
// Set the Drupal custom error handler, so we get watchdog reporting of errors yay!
   
set_error_handler('error_handler');
   
// Load all enabled modules.
   
module_load_all();
?>

I'll continue to monitor the behaviour of this.

#7

gpk - August 12, 2008 - 12:42
Version:6.x-1.x-dev» 5.x-1.x-dev
Status:needs work» needs review

Here are new patches for 5.x (approach 1. at #5) [seems to work OK, currently monitoring its behaviour]

AttachmentSize
poormanscron-bootstrap-231724-5.x_0.patch 2.42 KB

#8

gpk - August 12, 2008 - 12:51
Version:5.x-1.x-dev» 6.x-1.x-dev

and for 6.x/HEAD (approach 4. at #5). This one completely untested but much simpler.

These should make poormanscron work much more reliably when Drupal's page cache is enabled. Also, if any PHP errors occur during the cron run then they will now be reported in the dblog/watchdog. Previously they would only have been visible in the Apache error log.

AttachmentSize
poormanscron-bootstrap-231724-6.x_0.patch 810 bytes

#9

Dave Reid - October 7, 2008 - 23:02

The trouble with the patch in #8 is that when caching is enabled, the user is anonymous, and the requested page has not been cached yet: before hook_exit is called, the cached data will be saved. Using page_get_cache will always return TRUE in that case. The problem is you should be able to call drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL) and not have PHP error on the drupal_set_header line if headers have already been sent.

#10

Dave Reid - October 7, 2008 - 23:03

Ah nevermind, page_get_cache has been changed for Drupal 7. This will work for Drupal 6.x, but will need to be changed for D7.

#11

gpk - October 7, 2008 - 23:09

@9/10: #168909: Drupal messages (status/error) are cached along with nodes is still waiting to be tested and applied to D7, which is why the D7 version of page_get_cache() is missing the $status_only parameter.

But agreed, #8 won't work when #318102: hook_exit() not invoked for some cached requests lands, your suggestion at #11 in that issue looks promising.

#12

gpk - November 6, 2008 - 04:47

Note that #8 won't work with Drupal 6.6 and higher because of #318102: hook_exit() not invoked for some cached requests. But may do for 6.0 - 6.5.

#13

gpk - April 12, 2009 - 14:00
Title:PHP fatal error can occur with Drupal's normal cache mode (can render site unavailable to the visitor)» Incompatible with page caching (incomplete cron runs, possible PHP fatal errors)

Making title clearer.

Executive Summary:
cron hooks can reasonably assume that a full Drupal bootstrap has occurred (http://api.drupal.org/api/file/cron.php/6/source). If Drupal is set to normal page cache mode, then poormanscron_exit() will sometimes initiate cron runs after a partial bootstrap. This will result in some cron hooks not being invoked (since not all modules are loaded) and will cause fatal PHP errors if for example functions defined in common.inc (which as not been loaded) are called from cron hooks.

Further notes:
These errors will not be logged to the database or screen (but should feature in the site's error log). It is also possible for the site to be rendered unavailable to the visitor, because a fatal error can terminate the cached page processing abnormally, and that can trash the response from Drupal's page cache, which will still be cached locally by the browser.

#14

vacilando - April 15, 2009 - 13:12

Same problem in 6.x-1.0.
(Caching is on 'normal'.)

Subscribing.

#15

fuquam - April 16, 2009 - 15:52

subscribing
same in 5.16

#16

Dave Reid - April 20, 2009 - 17:09

#17

mikesir87 - November 10, 2009 - 22:00
Version:6.x-1.x-dev» 6.x-1.1

Subscribing. Running into same issue with 6.x-1.1. With caching on, errors are thrown in the Apache logs, but not on the screen. With caching off, no errors thrown. Running Drupal 6.

 
 

Drupal is a registered trademark of Dries Buytaert.