Calais Bulk Processing has encountered an error
| Project: | Calais |
| Version: | 6.x-3.1 |
| Component: | Miscellaneous |
| Category: | support request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | closed |
Hey Folks,
(Thanks, Febbraro for the recommendation - Sorry I opened that old issue up - I'm starting to learn this Drupal thing more and more now).
I applied a recent patch that fixed "Errors upon bulk processing", and now when I run bulk processing I no longer get the fatal error, but I DO get the "An error occurred" message on the bulk processing page once the batch is some percentage of the way done.
Here's an example:
Calais Bulk Processing has encountered an error.
Please continue to the error page
An error occurred. /batch?id=52&op=do
We checked our error logs through Watchdog and some of our other log files and couldn't find anything. Apache is reporting this process as a POST, so I increased the max_input_time in our WHM from 1 minute to 4 minutes. Looks like we are still getting this "error" that isn't reporting any logs.
Edit: After reviewing the logs for this batch attempt (id=52), I just found that dblog reported a 500 - Internal Server Error:
Message: Calais processing error: (500 - Internal Server Error)
Any ideas?
One suggestion made earlier by febbraro was that this is an issue with the Calais service itself, and not the module. We have been encountering this error quite often, and have been able to reproduce on different batches and on different node types. I have made the changes in Calais Node Settings for the nodes that we want to run Calais on, so that shouldn't be the issue.
Is anyone else encountering this error on a regular basis? Usually when we retry (from where Calais was interrupted and left off), it works, or at least gets more percentage done before failing again.
Any help or suggestions would be appreciated.
Thanks,
David

#1
Can you try to figure out if a specific content item in particular is causing this error over and over again? If we can reproduce it, we can find a solution by adding some additional debug statements.
#2
Febbraro, I apologize that I have not responded to this yet, as I thought I had.
As it turns out, I was running Calais on the Audio content type when this specific error was returned. However, yes, the Audio content type as well as several other content types cause this error to be returned over and over again.
- David
#3
Evan here - from the same place as David (dwrudy) - it looks like the issue is with particular nodes that have out-of-date PHP code in them. Apparently, the PHP is getting executed during Bulk Processing. The error can be resolved by editing the PHP in those nodes so that it no longer calls invalid functions. Ideally, it would be good if Calais didn't run the PHP during Bulk Processing, but that might not be possible, since it happens in node_load() I believe.
I tried modifying calais.admin.inc to not pull up nodes where the input format was PHP code, but I think I did that incorrectly, as it stopped working altogether after I did that.
Anyway, there may be other reasons why Bulk Processing wouldn't work, but this was the one that we discovered right now. We'll comment again on this issue if we run across other ones.
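
As a rough sketch of the filtering idea described above (not the module's actual code, and not the change that was attempted in calais.admin.inc): a bulk run could skip any node whose input format is one that evaluates PHP. The helper name `example_calais_should_skip` and the `$php_format_ids` list are hypothetical; the only Drupal 6 detail relied on is that a loaded node carries its input format ID in `$node->format`. Which format IDs run PHP differs from site to site, so that list has to be looked up separately.

```php
<?php
/**
 * Decide whether a node should be left out of bulk processing.
 * Hypothetical helper, not part of the Calais module.
 *
 * @param object $node           Loaded node; only ->format is read.
 * @param array  $php_format_ids Input format IDs that evaluate PHP
 *                               (site-specific, assumed to be known).
 * @return bool TRUE if the node should be skipped.
 */
function example_calais_should_skip($node, array $php_format_ids) {
  // Nodes without a format property are treated as safe to process.
  if (!isset($node->format)) {
    return FALSE;
  }
  // Compare as integers so '3' and 3 are treated the same.
  return in_array((int) $node->format, array_map('intval', $php_format_ids), TRUE);
}
```

Skipping avoids the error, but it also means those nodes never get Calais terms, so it would probably need to be optional.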
#4
Thanks for the reports. I'll leave this open in case we see more of this from people and there are some good ideas for ways around it.
There is only so much we can do though. While you may not want PHP bodies evaluated, others might, so it is hard to come up with a hard and fast rule, but we may be able to add some configuration options.
#5
I get the same error on bulk processing. K
#6
I am receiving the error also. Since it was happening just after install on a new site I had originally assumed that I had screwed up the RDF install. So I uninstalled and went on to other things. I just had some free time so I'm back to try and resolve. I did notice that I previously had an error in my key ID, but I fixed that
It works when I create individual pages, but fails on batch. It may be something site wide since I tried doing diff node types (to vary content) and still failed.
I am running an Acquia install
Calais Bulk Processing
Calais Bulk Processing has encountered an error.
Please continue to the error page
An HTTP error 500 occurred. /batch?id=12&op=do
Bulk processing of nodes is starting.
===The error page had ====
An error occurred while processing calais_batch_process with arguments :calais_batch_process
#7
Hi I also have this issue with 6.x-3.1.
Calais Bulk Processing has encountered an error.
Please continue to the error page
An HTTP error 500 occurred. /batch?id=10&op=do
Bulk processing of nodes is starting.
Additionally I notice this in error_log:
[Tue Jul 14 10:40:07 2009] [error] [client X.X.X.X] PHP Fatal error: Call to undefined function dpm() in /var/www/html/foo/drupal-6.13/sites/all/modules/opencalais/calais.admin.inc on line 416, referer: http:/foo.com/batch?op=start&id=10
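
The log line above shows a call to dpm(), a debugging helper that comes from the Devel module, so it is undefined on sites where Devel is not enabled. A minimal sketch of one way to guard such a call follows. The wrapper name `example_calais_debug` is hypothetical, and this is not necessarily how the module itself handles it.

```php
<?php
/**
 * Call Devel's dpm() only when it is available.
 * Hypothetical wrapper; the name is illustrative.
 *
 * @param mixed $value Value to print for debugging.
 * @return bool TRUE if the value was passed to dpm(), FALSE if skipped.
 */
function example_calais_debug($value) {
  if (function_exists('dpm')) {
    dpm($value);
    return TRUE;
  }
  // Devel is not enabled: skip the debug output instead of failing fatally.
  return FALSE;
}
```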
#8
HTTP Status 500 -
type Exception report
message
description The server encountered an internal error () that prevented it from fulfilling this request.
exception
java.lang.RuntimeException: An earlier attempt to initialize the calais-logic module has already failed: java.util.InvalidPropertiesFormatException: org.xml.sax.SAXParseException: The element type "entry" must be terminated by the matching end-tag "".
com.clearforest.calais.ClfCalaisImpl.getInstance(ClfCalaisImpl.java:38)
com.clearforest.calais.servlets.CalaisRESTservlet.processRequest(CalaisRESTservlet.java:73)
com.clearforest.calais.servlets.CalaisRESTservlet.doPost(CalaisRESTservlet.java:98)
javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
note The full stack trace of the root cause is available in the Apache Tomcat/5.5.9 logs.
#9
I've been getting the same error off and on today during processing of single posts. I think that is more likely related to server problems than to the bulk processor failures.
I posted about it on their site's forum since I think it's not a module problem.
http://www.opencalais.com/forums/about-website/server-giving-apache-tomc...
#10
deltab & MacRonin:
Yes, I've been getting that error today also, but, as you said, it looks like an issue on the remote Calais server. It can't be a Drupal issue since it references Java methods.
I think we should reduce the scope of this issue to prevent it from becoming a catch-all issue for bulk processing errors. Would it be OK if I limited this particular issue to bulk processing errors coming from evaluation of PHP bodies, as dwrudy & I determined was the case on our site? (see my reply #3)
#11
SemanticProxy processing error: Internal Exception when trying to use Calais WS (RemoteException)) - com.clearforest.calais.CalaisWSException: Calais Web-Service exception; nested exception is: java.lang.RuntimeException: An earlier attempt to initialize the calais-logic module has already failed: java.util.InvalidPropertiesFormatException: org.xml.sax.SAXParseException: The element type "entry" must be terminated by the matching end-tag "".
#12
Hi EvanDonovan
I'm all for keeping this thread focused on the Batch processing issue. I only commented on the server issue to try and head off a bunch of folks commenting on server problems here instead of the Calais site. The Calais folks won't be looking here for server problems. Why not add to my thread over on their site about that. Right now it looks to them like I am the only one experiencing the problem. I have already opened a thread for the server issue at http://www.opencalais.com/forums/about-website/server-giving-apache-tomc...
#13
My original reason for posting to this thread was to report that I am also having problems with the bulk processing function. And for one of my two sites experiencing the problem, it was also happening a few weeks ago and so is most likely unrelated to the current server issues.
At the moment I have no idea what is causing my bulk processing failures. On the new site I can be sure that I have not included any PHP code in any user entered content, but of course can not be 100% sure about any of the modules that are used to build the page such as embed media_field
For my upgraded site, I'm not as sure about the possibility of "out-of-date PHP code" since I have a LOT more nodes that go back a while. I do have PHP code in one of the block display control fields though.
#14
MacRonin:
I didn't mean to suggest that you were expanding the scope of the issue needlessly. I appreciate your pointer to the forums, which I didn't know about. I would like to keep this issue focused on batch processing errors, but since the batch processing error reporting is so vague it's hard to know whether the source of the errors is the same.
Anyway, if your issue is caused by PHP code you could check by looking in the node_revisions table to see if you have any nodes in the PHP input format & then looking at the body of those nodes. But if it's something else, which is quite possible, then I don't know.
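
The check above could be sketched roughly as follows. This assumes a stock Drupal 6 schema, where `node_revisions.format` holds the input format ID and the `filters` table lists which filters each format uses, and it assumes the PHP evaluator is registered as module 'php', delta 0. The function name is hypothetical; treat it as a starting point rather than a tested query for any particular site.

```php
<?php
/**
 * Build a query listing node revisions whose input format includes
 * the PHP evaluator filter. Hypothetical helper; table and column
 * names assume a stock Drupal 6 schema.
 *
 * @return string SQL using Drupal's {table} prefix placeholders.
 */
function example_php_format_revisions_sql() {
  return "SELECT nr.nid, nr.vid, nr.title
    FROM {node_revisions} nr
    INNER JOIN {filters} f ON f.format = nr.format
    WHERE f.module = 'php' AND f.delta = 0
    ORDER BY nr.nid";
}

// Inside a bootstrapped Drupal 6 site, db_query() expands the {table} names.
// Outside Drupal this part is skipped.
if (function_exists('db_query')) {
  $result = db_query(example_php_format_revisions_sql());
  while ($row = db_fetch_object($result)) {
    print $row->nid . ': ' . $row->title . "\n";
  }
}
```

The node IDs it prints are the ones whose bodies are worth inspecting for outdated PHP.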
I just don't want this issue to become like the "Javascript is required to display this map" issue for Gmap module which has 100+ responses, most of which have different causes.
#15
@bailsbails, see http://drupal.org/node/429560 for the 3 ways to fix it (install a patch, install devel, or use the dev version).
#16
@febbraro - I just looked at the issue you pointed out and, as it mentioned, I installed the DEVEL module (activating only the basic DEVEL entry), and things appear to be working for my new site. I haven't tried it yet on my older one. That it works on the new one doesn't surprise me much, since the node type I was running against was all just basic text (with some Farsi chars). It handled processing over 600 nodes with only a few small quirks (primarily duplicate category entries, and a few entries in the wrong category). But I'll dig into that later.
I'm going to try the same thing on my older upgraded from D5 site and see how it goes.
#17
OK, I tried my older site that was upgraded from D5 to D6 and then had Calais installed. In this case (like my first) I also have the DEVEL module installed and activated at its most basic (just the DEVEL entry).
Previously the batch update failed instantly, but this time it got a little over 30% of the way thru my 6330 nodes before failing. This time I have a similar but different problem. In this case I got a 404 instead of a 500.
Calais Bulk Processing has encountered an error.
Please continue to the error page
An HTTP error 404 occurred. /batch?id=22&op=do
The config page has a msg that seems to have a dual personality. I was wondering which was the true one.
Blog entry / 6330 / Interrupted, will continue (Start Over)
The first text says it will continue, but the link says it will start over. I was wondering which would happen. Will it be picking up where it left off, or will it be going back to the beginning and 'Start Over' thereby reprocess all the nodes already worked on?
#18
Sorry for the confusion. What it is saying is that if you attempt to run the process again for that particular node type, it will pick up where it left off. However, if you click the "Start Over" link, it will discard the previous run's results and start over from the beginning.
Hope that clears things up.
#19
Automatically closed -- issue fixed for 2 weeks with no activity.