Closed (fixed)
Project: Drupal core
Version: 6.16
Component: search.module
Priority: Normal
Category: Support request
Assigned: Unassigned
Created: 19 Jan 2009 at 19:59 UTC
Updated: 21 Aug 2013 at 11:06 UTC
Comments
Comment #1
marqpdx commented
Turns out there was bad content which was killing cron when it got to the search module.
I am closing this.
thx,
m
Comment #2
jyamada commented
I'm having this problem too. Wondering how you discovered the bad content.
Comment #3
Anonymous (not verified) commented
I have the same problem with cron not executing on 6.10. I'll try disabling the search module and report back.
Comment #4
dave reid commented
Also try this: #139537-63: search indexing gets stuck: node_update_index() - if comment.module is disabled
Comment #5
Anonymous (not verified) commented
Adding tag. Why do I need to add a comment sometimes?
Comment #6
jyamada commented
Just an update. I discovered that one of my unpublished pages contained PHP code with a function.
And I was using revisions, so there were apparently two functions with the same name, causing the search index to break.
Much time was lost discovering this.
Does anyone know about some general debugging techniques in checking cron runs?
Joe
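On the duplicate function problem: PHP stops with a fatal "Cannot redeclare" error when the same function is defined twice in one request, which can happen when indexing evaluates more than one piece of content carrying the same snippet. A minimal sketch of a common guard follows, assuming the embedded code can be edited. The function name mysite_format_price() is made up, and plain eval() stands in here for Drupal's PHP filter.

```php
<?php
// Hypothetical node body (without the opening PHP tag). The function name
// mysite_format_price() is invented for this illustration.
$body = '
if (!function_exists("mysite_format_price")) {
  function mysite_format_price($amount) {
    return number_format($amount, 2);
  }
}';

// Simulate two nodes (or revisions) carrying the same snippet being
// evaluated in a single request, as can happen during an indexing run.
eval($body);
// Without the function_exists() guard, this second evaluation would stop
// with a fatal "Cannot redeclare" error.
eval($body);
```

Moving shared functions into a small custom module is usually the cleaner fix; the guard only makes a repeated snippet harmless.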
Comment #7
halfiranian commented
Any tips on how you found the offending PHP code :) ?
I've got lots of nodes...
Comment #8
halfiranian commented
The tip about node revisions helped me!
I just searched through phpMyAdmin for php and deleted the culprit node...
Phew, that took me ages.
Comment #9
snorkers commented
Just spent the whole morning stuck on this... realized that only a fraction of my DEV machine had been indexed by Search, so eventually the indexer found some pages with dodgy embedded PHP code. So thanks to earlier posts, had a quick nose around in MySQL. Found offending nodes by running the following query (which you can do via phpMyAdmin or the MySQL command line):
(Which just flags up any node that either contains the text 'php' in the Body or has been set to use the PHP filter.)
This gave me a list of only a few nodes to then manually check - I had duplicated a function name in embedded PHP across 2 nodes (very bad practice BTW), and Search just did not like it. Complete success in running cron once these nodes had gone :)
If this approach still fails, check your blocks ('boxes' table).
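The query itself is not shown in the comment above. A minimal sketch of what such a query could look like, pieced together from the description and from the fragments quoted in #10 and #45, follows. It is written as a small PHP helper that only builds the SQL string. The helper name find_php_revisions_sql() is hypothetical, the column names assume an unprefixed Drupal 6 node_revisions table, and the ID of the PHP input format (3 here) differs between sites.

```php
<?php
/**
 * Hypothetical helper: builds a query listing node revisions that probably
 * contain embedded PHP.
 *
 * $php_format_id is the ID of the "PHP code" input format on your site
 * (assumption: check it under admin/settings/filters before running).
 * $match_body_text also matches any body containing the text "php".
 */
function find_php_revisions_sql($php_format_id, $match_body_text = TRUE) {
  // The cast keeps anything but an integer out of the statement.
  $conditions = array('`format` = ' . (int) $php_format_id);
  if ($match_body_text) {
    $conditions[] = "`body` LIKE '%php%'";
  }
  return 'SELECT `nid`, `vid`, `title` FROM `node_revisions` WHERE '
    . implode(' OR ', $conditions);
}

// Print the statement so it can be pasted into phpMyAdmin or the mysql client.
print find_php_revisions_sql(3) . "\n";
```

Passing FALSE as the second argument drops the body match, which is the narrower variant argued for in #10. For blocks, the same WHERE clause should apply to the boxes table, assuming it has the same body and format columns.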
Comment #10
Anonymous (not verified) commented
Hmm... Maybe search should skip those nodes that depend on the PHP filter, since they are most likely dynamic content anyway. I don't like your `body` LIKE '%php%' filter since that could hit any number of combinations, and I think the `format` = '3' is sufficient. I'll have to give it a look; I have definitely had cron failures because of search since the D6 upgrade.
Comment #11
jonathan1055 commented
I've had the same problem, where search indexing would get to a specific point then halt with 'Cron run exceeded the time limit and was aborted.' After tracing through node_update_index() in node.module, to _node_index_node(), node_prepare() and on to check_markup() in filter.module, I finally found where it was crashing. Several nodes with the PHP input filter and active PHP were being indexed fine, but the two which failed contained a call to drupal_goto(), which was the cause of the problem. I made the following addition to check_markup() to specifically avoid calling module_invoke php_filter('process') when a drupal_goto is in the PHP code to be evaluated.
The tricky bit was realising that I should only skip the php_filter when check_markup was being called as part of search indexing, and it must be allowed to continue as normal in all other cases! This solved my indexing problem, but I can see that there could be any number of PHP function calls which could break the indexing. But it is too strong to always skip every node with format=3. If anyone has noticed other functions which should be avoided, let me know here.
This bug should be fixed somehow, as a quick search of the forums shows that this is quite a common problem.
I changed the title to accurately reflect what is now the problem, hope that is OK.
Jonathan
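The addition to check_markup() is not shown above. The following is only a sketch of the decision described there, not Jonathan's actual change: skip the PHP filter when, and only when, the text is being rendered for search indexing, uses the PHP input format, and contains drupal_goto. The helper name, its parameters, and the way the caller knows that it is indexing are all assumptions.

```php
<?php
/**
 * Hypothetical helper: decides whether the PHP filter should be skipped
 * for a piece of node text.
 *
 * $is_indexing must be supplied by the caller. How to detect that
 * check_markup() was reached from search indexing is left open here.
 */
function should_skip_php_filter($text, $format, $php_format_id, $is_indexing) {
  // Never interfere with normal page views or with other input formats.
  if (!$is_indexing || (int) $format !== (int) $php_format_id) {
    return FALSE;
  }
  // Only skip when the embedded code would redirect and end the request.
  return strpos($text, 'drupal_goto') !== FALSE;
}
```

Any change of this kind inside filter.module is a core hack that has to be reapplied after every upgrade, so it is probably best treated as a temporary diagnostic rather than a fix.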
Comment #12
Zalatar commented
Subscribing
Comment #13
Breakerandi commented
Subscribing
Comment #14
Mguel commented
I was struggling with cron for several hours to get it to work after upgrading to Drupal 6, until I read something about the search module, so I disabled it and cron worked fine... and it was the same problem mentioned here... I deleted a node I had once made for testing PHP code, then re-enabled the search module and ran cron... This time I got an error message mentioning a PHP function, menu_get_active_trail, which I used on Drupal 5.
I searched with phpMyAdmin for menu_get_active_trail and deleted that PHP function in a couple of nodes, which solved the problem. Now cron is working with the search module enabled and I have my site indexed again for searches.
Cheers,
Mguel
Comment #15
Breakerandi commented
I hoped so much that this issue would be fixed in Drupal 6.15, but it isn't. Same problem.
Comment #16
gpk commented
@15: if you have a node with broken PHP code or that includes a redirect then search *will* fail. This is pretty much "by design" (i.e. for search.module to work you have to ensure you have no broken PHP code and no nodes that redirect).
To get round this you can use search config module or set the node to unpublished so it doesn't get indexed. See also #286263: Make search indexing smart to handle nodes wth php redirects.
Comment #17
marc.zonzon commented
I have got the same problem after copying my production site back to a test site. The production site was indexing correctly but cron on the test copy was aborting
with this cryptic message: 'Cron run exceeded the time limit and was aborted.' I finally found it was caused by one format (creole), used in a few pages, which depends on the PEAR Wiki filter module. That module asks for the path of a PEAR module, and the path was not the same on my test and production sites. So these pages could neither render nor be indexed for search.
After correcting the path everything was OK.
A sensible error message like "Search module cannot index the page node/123" would help a lot to localize the cause of the error.
And in the present case the core drupal must also have logged that the call to the module PEAR Wiki filter failed.
Comment #18
jhodgdon commented
Another issue that appears to be a duplicate of this one:
#643474: Invalid PHP code in node, causes whole cron task to crash
Comment #19
jhodgdon commented
Actually, this is a duplicate of #286263: Make search indexing smart to handle nodes wth php redirects, which has a clearer definition of the problem.
Comment #20
manderson725 commented
Having this same issue with 6.x, except I get this error:
Fatal error: Class 'view' not found in /var/www/drupal6/includes/common.inc(1685) : eval()'d code on line 3
Disabling Search allows cron to run, but I'm not sure what the error indicates. A view with bad PHP code? I would appreciate any help or suggestions on where to look and how to get around this.
Comment #21
jhodgdon commented
This issue has been closed as a duplicate of another issue. See comment #19 above.
Please comment on that other issue instead if you would like your comment to be noticed and replied to. Thanks!
Comment #22
gpk commented
@19, 21: While #286263: Make search indexing smart to handle nodes wth php redirects is related, the problem in this issue - namely broken PHP code in a node which probably isn't normally viewed - is different.
@20: most likely it's a node using PHP input format which contains invalid or buggy PHP code that is causing your error.
@17:
> core drupal must also have logged that the call to the module PEAR Wiki filter failed
Unfortunately not, PHP doesn't give Drupal a chance to react to the error.
Comment #23
jhodgdon commented
OK, then we should reopen this issue if it's really not the same issue.
It seems to me that the underlying issue is the same in both cases: If there is a PHP error while indexing a node, search_cron() causes cron to abort. But I agree they may not be quite the same.
Comment #24
jhodgdon commented
Apparently, making Drupal search indexing happy with someone's bad PHP code would be considered a feature request, rather than a bug in the core of Drupal. See http://drupal.org/node/286263#comment-2668562
Feature requests at this point are only considered for Drupal 8.
Comment #25
Junro commented
subscribe
Comment #26
jruberto commented
To troubleshoot cron problems:
- Add watchdog debug to module.inc right before the $result = call_user_func_array($function, $args); line & run cron to identify which module is crashing (probably search)
- Add watchdog debug to search.module (or other applicable module) in search_cron() & run cron to identify what is causing search to crash (probably node)
- Add watchdog debug to node.module in the loop at the bottom of node_update_index() to identify which node is causing search indexing to crash. Look at offending node. Slap forehead, delete or repair offending node (in my case I had a PHP node which called a function in a module that no longer existed in my new install)
If the problem is in other modules, debug accordingly. Cheers, j
Comment #27
Junro commented
The problem is the Search module. Disabling it fixes the problem, but we all need Search...
Comment #28
jhodgdon commented
The problem is that search.module needs to render nodes, and it is crashing when there is bad PHP code in a node, and cron isn't finishing.
Comment #29
Junro commented
How do I know which PHP code is the cause? For me it must be custom PHP code in a page node...
Comment #30
jhodgdon commented
Yes, it is some custom PHP code in a page that would cause the crash.
Comment #31
Junro commented
Is there a way to know which page? Which node has wrong PHP code? I have a lot of custom PHP content.
Comment #32
jhodgdon commented
See #26 above. The last suggestion is most applicable (put debug statements in node_update_index()). You will need to use watchdog() in your debugging, because cron() won't generate output otherwise.
Comment #33
jruberto commented
An example debugging /modules/node/node.module -- see watchdog line in the loop at the bottom. The last node that you see referenced in the watchdog log is the one that's causing cron to stop.
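The example itself is missing from the comment above. A minimal sketch of the idea follows, assuming Drupal 6's watchdog($type, $message, $variables) signature and the while loop at the bottom of node_update_index() that calls _node_index_node($node) for each row. debug_index_node() is a hypothetical wrapper, and the stub at the top exists only so the snippet runs outside Drupal.

```php
<?php
// Stand-in so this sketch runs outside Drupal. Inside Drupal 6 the real
// watchdog() is already defined and this branch is skipped.
if (!function_exists('watchdog')) {
  function watchdog($type, $message, $variables = array()) {
    error_log($type . ': ' . strtr($message, $variables));
  }
}

/**
 * Hypothetical wrapper around the per-node indexing call. $index_callback
 * would be '_node_index_node' when used inside node_update_index().
 */
function debug_index_node($node, $index_callback) {
  // Logged before indexing starts, so the last entry in the log names the
  // node that was being processed when cron died.
  watchdog('search debug', 'About to index node @nid', array('@nid' => $node->nid));
  return call_user_func($index_callback, $node);
}
```

Inside node_update_index() the loop body would become debug_index_node($node, '_node_index_node'); or simply the watchdog() line placed before the existing call. The same kind of watchdog() call, logging $function, can go before the call_user_func_array() line in module.inc as described in #26. Remove the debug lines once the offending node is found.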
Comment #34
Junro commented
Thanks jruberto but I have errors:
line 2587 is:
t('Change the author of a post'),
strange
Comment #35
gpk commented
@34: it looks as if you have accidentally changed http://api.drupal.org/api/function/node_action_info/6 further down in node.module...
Comment #36
Junro commented
I don't have the warning and parse error from #34 anymore. It was certainly due to some PHP code in my tpl files.
But with the code, I still can't run cron automatically.
So I put back the original node.module.
In /admin/reports/dblog, I have what I always had:
Comment #37
jhodgdon commented
Since this issue has been taken over by discussion of how to debug a particular site's bad PHP problem, I'm filing a separate issue to discuss whether search.module needs a feature where it can keep running cron in such a situation.
#752184: Keep running cron when bad PHP content is encountered
I'm therefore converting this issue into a 6.x support request, so you (and anyone else who has similar problems) can continue to debug this issue.
Comment #38
sinasalek commented
Same problem here. I suggest better error handling in the _node_index_node() function. It would be an easy issue to fix if there were a proper message in watchdog.
Comment #39
jhodgdon commented
We all agree there should be better error handling. It just hasn't been implemented by anyone yet.
I think the question of "how to debug" that this issue has become has been answered, so I'm going to mark this issue as fixed.
Comment #40
danharper commented
I have been struggling with this for a while now and this is what I did to fix it.
Install the following modules:
http://drupal.org/project/elysia_cron - Elysia cron allows you to run individual portions of the cron job manually.
http://drupal.org/project/dtools - Dtools has the white screen of death diagnostic tool, which you need to enable.
Once these have been enabled, go to admin/build/cron and click on the run link next to search_cron. If this part of cron has been crashing, you should receive a line of text showing where cron crashed rather than the WSOD.
The error will read something like "call requested to undefined function blablah". Basically you need to make a note of the function name and then search the database using something like phpMyAdmin; this will give you a list of tables and how many times the function "blablah" appears. In my case it was the revisions table, and once I clicked to browse the table I could see all the node revisions that contained the function blablah. I made a note of the node IDs and worked my way through, deleting the revisions or the complete nodes.
Cheers Dan
Comment #41
Sanilshet commented
I am using the spam module to detect spam posts. When I entered content (comment.txt attached below) my cron job got stuck, and so that particular comment was getting posted every 15 minutes (cron.php was set to run every 15 minutes).
Any idea what the real problem is?
Thanks in advance.
Comment #42
jhodgdon commented
That doesn't sound like a search module cron stoppage problem due to bad PHP content. Please file an issue on the spam module you are using instead.
Comment #43
Junro commented
#11 method is perfect!
I found the bad content :)
Thanks Jonathan
Comment #45
joris.verschueren commented
First of all, thanks to everyone who contributed solutions, this has kept me busy for a few days.
Although the issue is closed - and seems fixed for me - I'd like to add to this thread as I remain with a number of questions.
After running through the "Make search indexing ..." thread, it became clear that my issue was probably PHP-related. It was however only after finding this thread that I was able to locate the snippet that caused the issue (a remnant of a long disabled taxonomy redirect). Most of what can be found on the cron/broken PHP topic is hard if not impossible to understand if you're not into coding or hacking modules. I hope that this post may contribute to making the available solutions more accessible.
In the end I relied on snorkers' solution in #9, although earnie doesn't like the approach (#10). Indeed,
SELECT * FROM YOUR_DATABASE_NAME.`node_revisions` WHERE (`format` = '3');
was sufficient to find and delete the faulty code, but I used it not knowing what's so unsuitable about this method. Could someone please explain this for others?
By the way, I had to change the "3" into "2" - "PHP code" comes second in the list of input formats on my site.
I had to rely on this simply because I didn't actually understand jruberto's "add watchdog debug to ..." suggestions in #26. Would anyone (jruberto?) be ready to clarify this further in a HOWTO? (to which particular file has the suggested code to be added, where, ... - #33 makes a fine start)
Is this problem related to the upgrade from D5 to D6? The code snippet had been there for over a year without troubling the search module.
Moreover, I can't say with certainty whether my cron runs absolutely fine now. My "recent logs" indicate successful cron runs, but if I run cron manually (example.org/cron.php), I am directed to a white screen, whereas in D5 I'd return to the page where I started and be notified that cron ran successfully. Is this by design in D6? (On this ground, I reopen this issue.)
Especially if it is related to the upgrade, is there a possibility to make a documentation/handbook page on cron issues out of this and related threads? There's a huge number of posts and support requests about search and cron interactions, and solutions aren't easy to find (as said, it took me a few days). I'd very much like to provide or contribute to a kind of debug checklist, but I personally wouldn't know where to put it and don't have the coding expertise to present solutions. Any suggestions are welcome.
Kind regards,
Joris
Comment #46
jhodgdon commented
Thanks for the additional information. Marking this back to a closed/fixed support issue.
Comment #47
priapurnama commented
This worked like a charm for me. Indeed, there was some bad content that made cron get stuck when indexing the nodes for the search module. Tracked down the bad content, deleted it, and cron ran successfully. Thanks!
Comment #48
SchwebDesign commented
Thank you very much, this was incredibly helpful for me as well, specifically post #9. In my case the PHP code was an include of a file that didn't exist.
Comment #49
ikeigenwijs commented
keeping track
Comment #50
davidbessler commented
You, my friend, are a God! Worked exactly as you said. Found the offending node and deleted ... now Search works with Cron! yay!
Comment #51
math-hew commented
#9 was what I was looking for. Thanks.
Comment #52
rahim123 commented
Thanks VERY MUCH to jruberto for #33, that was a lifesaver for me.
Comment #53
beto_beto commented
When I run cron.php it gives me a blank page
and there is no back button.
It failed to run cron.
Can you please help me and tell me where that file or function you were talking about is?
Comment #54
danharper commented
I think http://drupal.org/project/dtools will help
It provides useful messages when you get WSOD.
Cheers
Comment #55
fuzzy76 commented
cron.php is supposed to give a blank page.
Comment #56
WaPoNe commented
Yeah, I fixed my issue.
I followed the steps indicated by jruberto (many thanks) in comment #26, and while debugging I found out that my search_cron did not finish its execution because some nodes were referring to a nonexistent db.