CRON generates a "Page not Found" error...
AlexisWilke - October 31, 2009 - 23:36
| Project: | Boost |
| Version: | 6.x-1.x-dev |
| Component: | Code |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | closed |
Description
Not too sure (yet) what is causing this error, but if I disable the boost module, my cron.php works just fine.
Then I turn ON boost, make sure there is at least one page cached (i.e. my home page) and then I do cron.php again. At that point it breaks and I get no info in any log other than a "Page not Found", specifically saying that node/1, which is my front page, does not exist.
node/1 does work just fine, thank you. But I was wondering how can I debug that?!
Thank you.
Alexis
P.S. I am NOT using the CRON crawler.

#1
Interesting. Here's the boost_cron() hook
<?php
/**
 * Implementation of hook_cron(). Performs periodic actions.
 */
function boost_cron() {
  if (!BOOST_ENABLED) {
    return;
  }
  $expire = TRUE;
  if (BOOST_CHECK_BEFORE_CRON_EXPIRE) {
    $expire = boost_has_site_changed(TRUE);
  }
  // Expire old content
  if (!BOOST_LOOPBACK_BYPASS && variable_get('boost_expire_cron', TRUE) && $expire && boost_cache_expire_all()) {
    if (BOOST_VERBOSE >= 5) {
      watchdog('boost', 'Expired stale files from static page cache.', array(), WATCHDOG_NOTICE);
    }
  }
  // Update Stats
  if (module_exists('statistics') && variable_get('boost_block_show_stats', FALSE)) {
    $block = module_invoke('statistics', 'block', 'view', 0);
    variable_set('boost_statistics_html', $block['content']);
  }
  // Crawl Site
  if (BOOST_CRAWL_ON_CRON && !variable_get('site_offline', 0)) {
    boost_crawler_run((int)$expire);
  }
}
?>
Not really sure why it would be bombing on you... that's an odd error.
Well, we need to rule out which code is causing the error...
// Crawl Site - Not an issue, you said it's disabled.
// Update Stats - Try disabling this: go into the boost block for AJAX stats and uncheck the show stats checkbox.
Next, if you're still getting the error, uncheck BOOST_CHECK_BEFORE_CRON_EXPIRE (Check database timestamps for any site changes. Only if there's been a change will boost flush the expired content on cron.)
Finally, set variable_get('boost_expire_cron', TRUE) to Disabled (Clear expired pages on cron runs:).
If you're still getting the error then it's not in the boost_cron hook, since none of the effective code in there will run.
#2
I'm getting similar behavior. Though the log says cron runs, I'm getting a page not found error when I go to cron.php... and I have the related issue of pages not being purged on cron runs.
#3
mikeytown2,
The problem is BEFORE the drupal_cron_run() call, because when I add a statement that writes something to the output on entering that function, I get it at the END of the "Not found" page.
Now I see that you have one hook_boot() and one hook_exit() function...
I'm still testing. So far, it generates the pages and often it gets stuck (it does not even reach the drupal_cron_run() function!). 8-P
Thank you.
Alexis Wilke
#4
mikeytown2,
Okay, got the location, not the very reason, but I know what causes the problem.
<?php
// Make sure this is not a 404 redirect from the htaccess file
//$path = explode($base_path, request_uri());
//array_shift($path);
//$path = implode($base_path, $path);
//$path = explode('?', $path);
//$path = array_shift($path);
//if ($path != '' && $_REQUEST['q'] == '') {
// $GLOBALS['conf']['cache'] = CACHE_DISABLED;
// $GLOBALS['_boost_cache_this'] = FALSE;
// drupal_not_found();
// return;
//}
?>
That's part of the boost_init() function. I guess you wrote that a long time ago and did not remember you were calling drupal_not_found() in there...
So... as you can see, I commented the code out and it works like a charm. Of course, dropping that check may cause a problem of its own, since the special feature is then gone. I guess that your two array_shift() calls remove the cron.php name from the path, so it now looks empty and "fails" the test above.
Let us know how we should fix that function.
Thank you.
Alexis Wilke
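The check quoted in #4 can be traced outside Drupal with a small standalone sketch. The two functions below are hypothetical helpers (they are not part of Boost); they take the values of request_uri(), $base_path and $_REQUEST['q'] as plain arguments, and the example assumes a site installed at the web root, so $base_path is '/'. Under those assumptions the extracted path for cron.php comes out as 'cron.php' rather than empty, and since cron.php is requested without a q parameter, the condition would be true.

```php
<?php
/**
 * Hypothetical helper (not part of Boost): repeats the path extraction from
 * the commented-out check, with request_uri() and $base_path passed in.
 */
function example_extract_path($base_path, $request_uri) {
  // Drop everything up to and including the first occurrence of the base path.
  $path = explode($base_path, $request_uri);
  array_shift($path);
  $path = implode($base_path, $path);
  // Drop the query string, if any.
  $path = explode('?', $path);
  return array_shift($path);
}

/**
 * Hypothetical helper: TRUE when the quoted condition would reach
 * drupal_not_found().
 */
function example_would_404($base_path, $request_uri, $q) {
  $path = example_extract_path($base_path, $request_uri);
  return $path != '' && $q == '';
}

var_dump(example_would_404('/', '/', ''));                      // front page
var_dump(example_would_404('/', '/cron.php', ''));              // cron.php, no q
var_dump(example_would_404('/', '/node/1', 'node/1'));          // clean URL rewritten to ?q=node/1
var_dump(example_would_404('/', '/sites/default/files/', ''));  // the case from #345484
?>
```

In this sketch only the cron.php and /sites/default/files/ requests satisfy the condition, which would match the symptom reported at the top of the thread.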
#5
Nope, I wrote that very recently to fix an old bug, #345484: 404 hits to /files directory cached as homepage with broken form actions. Try this latest patch; based on my server's output, I can hopefully isolate this so the 404 only happens when it should. Can you test that you get a 404 when you visit this URL?
example.com/sites/default/files/
In short, I'm working around a core bug... if you hit a directory that exists but doesn't have an index.php file, then you get a 200 and the home page instead of a 404. This was bad because boost would then cache that page.
#6
#7
The following also works...
<?php
if ($path != '' && $path != 'cron.php' && $_REQUEST['q'] == '') {
?>
I tried with update.php and it works fine. The only other page would be install.php and I'm really not worried about that one!
I do not know whether we can call this a fix, but it does work. Another way is to use cron.php?q=something, but that could have some other unknown side effects...
Thank you.
Alexis Wilke
#8
Okay, got your patch... shall I replace the lone '=' with '==' in the last entry?
I tried the path you asked me to test, and I get a 403 (Forbidden). I guess that my Apache is forbidding me from listing folders.
Thank you.
Alexis Wilke
#9
I guess REDIRECT_STATUS is not set when I hit that line so it works. 8-)
I just changed the = 404 into == 404 as, I suspect, was intended.
<?php
if ($path != '' && $_REQUEST['q'] == '' && isset($_SERVER['REDIRECT_STATUS']) && $_SERVER['REDIRECT_STATUS'] == 404) {
?>
Thank you for your quick replies.
Alexis
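For anyone who wants to try the condition from #9 in isolation, here is a minimal sketch. The function name is hypothetical (it is not part of Boost or the patch), and the superglobals are passed in as arguments so it runs outside Drupal. It assumes that REDIRECT_STATUS is only present when the web server has internally redirected the request, which is what the discussion above relies on.

```php
<?php
/**
 * Hypothetical helper (not part of Boost): the condition from #9 with
 * $_REQUEST['q'] and $_SERVER passed in as arguments.
 */
function example_is_404_redirect($path, $q, array $server) {
  return $path != ''
    && $q == ''
    && isset($server['REDIRECT_STATUS'])
    && $server['REDIRECT_STATUS'] == 404;
}

// Direct hit on cron.php: REDIRECT_STATUS is not set, so no 404 is forced.
var_dump(example_is_404_redirect('cron.php', '', array()));

// Request that arrived through a 404 redirect.
var_dump(example_is_404_redirect('sites/default/files/', '', array('REDIRECT_STATUS' => '404')));

// With a lone '=', the last operand assigns 404 and is truthy whenever it is reached.
$server = array('REDIRECT_STATUS' => '200');
var_dump((bool) ($server['REDIRECT_STATUS'] = 404));
?>
```

The last call shows why the '=' to '==' change matters: the assignment form would treat any request that carries a REDIRECT_STATUS as a 404, whatever its actual value.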
#10
Can you check that it still fixes the bug?
#11
Here's a better patch taking a hint from you.
#12
The line in #9 resolves this issue and doesn't seem to break anything else...
#13
mikeytown2,
That works great. (#11) 8-)
I have another problem with CRON, but I think that's a node I created earlier that's invalid.
Thank you.
Alexis
#14
Committed #11
#15
I applied the patch at #11 and it works. The cron problem has disappeared. I guess all these fixes will be integrated in the next stable release?!
#16
@Geir19
Yep, next release should be soon I hope & hopefully it will be a very stable one like 1.03 was.
#17
Automatically closed -- issue fixed for 2 weeks with no activity.