SimpleXML error

lrobeson - September 25, 2009 - 18:23
Project: Google Search Appliance
Version: 6.x-2.0-beta1
Component: Code
Category: bug report
Priority: normal
Assigned: Unassigned
Status: active
Description

This error message shows up intermittently when customers search for items. I think it only affects IE users (6 and 7); so far I haven't been able to replicate it in Firefox. When the error shows up, no results are returned, even though the query is valid:

simplexml_load_string() [<a href='function.simplexml-load-string'>function.simplexml-load-string</a>]: ^ in C:\web\htdocs\sites\all\modules\google_appliance\GoogleMini.php on line 318.

simplexml_load_string() [<a href='function.simplexml-load-string'>function.simplexml-load-string</a>]: Temporary server error. Try again in a minute. in C:\web\htdocs\sites\all\modules\google_appliance\GoogleMini.php on line 318.

simplexml_load_string() [<a href='function.simplexml-load-string'>function.simplexml-load-string</a>]: Entity: line 1: parser error : Start tag expected, '<' not found in C:\web\htdocs\sites\all\modules\google_appliance\GoogleMini.php on line 318.

#1

lrobeson - September 29, 2009 - 13:44

Update: it's not IE-specific. I just got the error using Firefox 3.5 while trying to view page 2 of some search results.

#2

jweowu - September 29, 2009 - 23:48

Check near the end of this page:
http://www.justindeltener.com/google-mini-appliance-review-and-integrati...

This page also lists the error:
https://uisapp2.iu.edu/confluence-prd/display/SEA/Release+Notes

147312 - Front Ends with KeyMatch or synonym files of 75,000 lines or more may cause the appliance to intermittently return the following error: "Temporary server error. Try again in a minute."

It may or may not be the same cause, as that seems like it would be a HUGE number of keymatches/synonyms!

At any rate, it rather sounds like it's a problem with the GSA, and GoogleMini::resultFactory() ought to test for it so it can process the errors properly.

This looks appropriate:
http://www.php.net/manual/en/simplexml.examples-errors.php
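
For anyone unsure how to apply that manual page, here is a minimal sketch of the pattern it describes. It assumes nothing about the internals of GoogleMini.php: parse_gsa_response() is a hypothetical helper name, and the sample strings only stand in for real GSA responses.

```php
<?php
/**
 * Hypothetical helper (not part of the google_appliance module): parse an XML
 * string and collect libxml errors instead of letting PHP emit warnings.
 *
 * Returns array($xml, $messages), where $xml is a SimpleXMLElement or FALSE.
 */
function parse_gsa_response($body) {
  // Collect libxml errors internally rather than raising PHP warnings.
  $previous = libxml_use_internal_errors(TRUE);
  $xml = simplexml_load_string($body);
  $messages = array();
  if ($xml === FALSE) {
    foreach (libxml_get_errors() as $error) {
      $messages[] = trim($error->message);
    }
  }
  // Empty the error buffer and restore the caller's setting.
  libxml_clear_errors();
  libxml_use_internal_errors($previous);
  return array($xml, $messages);
}

// A non-XML body, like the one in the report: $bad_xml is FALSE and
// $bad_errors holds the parser messages, which a module could pass to
// watchdog() instead of displaying them.
list($bad_xml, $bad_errors) = parse_gsa_response('Temporary server error. Try again in a minute.');

// A well-formed (made-up) response: $good_xml is a SimpleXMLElement and
// $good_errors is empty.
list($good_xml, $good_errors) = parse_gsa_response('<GSP><Q>test</Q></GSP>');
```

Restoring the previous libxml_use_internal_errors() setting keeps the helper from changing error handling for any other XML parsing on the same request.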

#3

lrobeson - September 30, 2009 - 16:08

Thanks, we definitely don't have that many lines of KeyMatches or Synonyms.

That last link (http://www.php.net/manual/en/simplexml.examples-errors.php) seems like it might be close to helping, but I'm not familiar enough with PHP to know what to do with that info.

#4

jweowu - September 30, 2009 - 21:37

lrobeson: I think you really ought to be looking into software updates for your GSA.

All we can do with simplexml.examples-errors is to hide those error messages away (or rather, pass them to watchdog in a cleaner fashion). It won't make the query work.

Well, we could sleep and re-try the query a few times, I guess.

(I presume that on your live site you have error reporting set to log to the database only, btw?)

#5

lrobeson - October 1, 2009 - 15:16

Actually, I'm glad you mentioned that -- I still had errors writing to the log and screen, thanks for reminding me to change that! So now I'm guessing when this error comes up, they'll just see "no results" instead of the XML error and "no results". Still not good, but better.

I'm not sure how firmware updates work with the GSA, or whether we're automatically notified. I'll look into that.

#6

jweowu - October 3, 2009 - 13:51

No problem. I wasn't sure whether your previous comments meant that customers were seeing the errors, or merely that you were noticing them in the logs, but I thought it might be worth mentioning just in case. And yes, that will stop errors from being displayed to users.

If you wanted to try a work-around, something along the lines of the following might do the trick as a replacement for the $resultXML = curl_exec($ch); line in GoogleMini.php:

<?php
// Catch any XML errors in the GSA response. Attempt the query up to 3 times before aborting.
libxml_use_internal_errors(TRUE);
$max_attempts = 3;
while ($max_attempts-- > 0) {
  $retry = FALSE;
  $resultXML = curl_exec($ch);
  if ($payload = simplexml_load_string($resultXML)) {
    break; // Successful
  }
  else {
    foreach (libxml_get_errors() as $error) {
      watchdog('google_appliance', $error->message);
    }
    // Clear the error buffer so the same errors are not logged again on the next attempt.
    libxml_clear_errors();
    // Automatically re-try the request if the "Temporary server error." message was returned.
    // This checks the raw response, because the libxml messages only describe the parse failure.
    if (strpos($resultXML, "Temporary server error. Try again in a minute.") !== FALSE) {
      $retry = TRUE;
      sleep(1);
    }
  }
  if (!$retry) {
    break;
  }
}
if ($retry) {
  // Log before throwing; nothing after the throw would run.
  watchdog('google_appliance', t("Aborting query after maximum failed attempts."));
  // Provide friendly failure message for users.
  throw new GoogleMiniResultException("Temporary server error. Try again in a minute.");
}
?>

That's completely untested, and I'm unsure about raising the exception, but with any luck that will do what I intended (which is to automatically attempt the query up to three times if that error is being returned, and then provide a user-friendly failure message).

It relies on simplexml_load_string() having no side-effects, as it's now getting called in query() as well as in resultFactory().

Those max_attempts and sleep values should probably be constants, but otherwise this might be potential patch material.
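
As a rough illustration of pulling those values out into constants, here is a sketch. The constant names and the fetch_with_retry() helper are hypothetical and not part of the module. The fetch step is passed in as a callback so the loop can be tried without a GSA, and the sketch checks the raw response body for the error text.

```php
<?php
// Hypothetical names; the google_appliance module does not define these.
define('GOOGLE_APPLIANCE_MAX_ATTEMPTS', 3);
define('GOOGLE_APPLIANCE_RETRY_DELAY', 1); // Seconds between attempts.
define('GOOGLE_APPLIANCE_TEMP_ERROR', 'Temporary server error. Try again in a minute.');

/**
 * Call $fetch until it returns something other than the temporary-error body,
 * or until the attempts run out. Returns the last response body.
 */
function fetch_with_retry($fetch, $max_attempts = GOOGLE_APPLIANCE_MAX_ATTEMPTS, $delay = GOOGLE_APPLIANCE_RETRY_DELAY) {
  $body = FALSE;
  for ($attempt = 1; $attempt <= $max_attempts; $attempt++) {
    $body = call_user_func($fetch);
    $temporary = is_string($body) && strpos($body, GOOGLE_APPLIANCE_TEMP_ERROR) !== FALSE;
    if (!$temporary) {
      break;
    }
    // Only wait if another attempt will follow.
    if ($attempt < $max_attempts) {
      sleep($delay);
    }
  }
  return $body;
}

// Stand-in for curl_exec($ch): returns the error twice, then made-up XML.
$responses = array(GOOGLE_APPLIANCE_TEMP_ERROR, GOOGLE_APPLIANCE_TEMP_ERROR, '<GSP/>');
$fake_fetch = function () use (&$responses) {
  return array_shift($responses);
};
// Zero delay here just so the demonstration runs instantly.
$body = fetch_with_retry($fake_fetch, GOOGLE_APPLIANCE_MAX_ATTEMPTS, 0);
```

In the module itself the callback would wrap the existing curl_exec($ch) call, and the caller would still need to handle the case where the last body is the error text.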

#7

lrobeson - October 5, 2009 - 15:30

Thanks, that looks promising! I tried out the code (after removing the opening and closing php tags, of course), but now it only returns a white-screen-of-death page. There are no error messages, so I'm not sure how to go about getting it to work. Nothing is written to the log either; I don't think it gets that far.

 
 
