With GSA firmware version 6.14.0.G.28, search is broken: no results are found, and the error message "The response from the search service was unreadable. Please try your search again. If the problem persists, please contact us." is returned.

A few notes:

  • After a recent upgrade from GSA version 6.2 all search functionality has stopped. I have tested that the indexing is working correctly by visiting the GSA URL and directly running a search query.
  • GSA configuration in Drupal is also correct. I verified that the collection name and frontend_client name match (with case-sensitivity in mind).
  • After looking at the module's source code I found:
    case 'google_appliance#error_lib_xml_parse_error';
      $output = '<p>' . t('The response from the search service was unreadable. Please try your search again. If the problem persists, please <a href="@contact-url">contact us</a>.', array('@contact-url' => url('contact'))) . '</p>';
    

    So maybe this is an error with parsing the XML?

  • I am running PHP 5.3.6, curl 7.15.5, libxml 2.6.26, Drupal 7.
  • I also tried the dev version of the module with no luck.
  • Also attempted a search through mydomain.com/gsearch.

Any help would be greatly appreciated.

Thanks!

Comments

mpgeek’s picture

Title: Module not working with GSA version 6.14 » Persistent XML parse error when using GSA firmware version 6.14

@toshinobu, I cannot reproduce this error with the current release (dev version is in sync), but I do not have access to a GSA device with firmware version 6.14 (the GSA I have access to runs a different firmware version). The error is actually detected by google_appliance_parse_device_response (line 527 in google_appliance.module). You might use devel's dsm() to poke around inside that function and see what you are getting back from the device.

The XML parse error arises from parse errors in PHP's simplexml_load_string, so it would be good to know what the device is returning and whether the payload is XML or not. It's possible that the firmware update changed the structure of what the GSA returns, but I have not consulted the docs on that.
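To see why simplexml_load_string rejects a payload, something like the following could be dropped in temporarily. This is only a rough sketch: gsa_debug_parse() is a hypothetical helper, not a function from the module, and it assumes the libxml and SimpleXML extensions are available.

```php
<?php
/**
 * Hypothetical debugging helper (not part of the google_appliance module):
 * try to parse a payload and report why parsing failed.
 */
function gsa_debug_parse($payload) {
  // An empty payload is not an XML problem at all, so flag it separately.
  if (trim((string) $payload) === '') {
    return array('ok' => FALSE, 'errors' => array('empty response body'));
  }

  // Collect libxml errors internally instead of emitting PHP warnings.
  $previous = libxml_use_internal_errors(TRUE);
  libxml_clear_errors();
  $xml = simplexml_load_string($payload);

  // Turn each libxml error into a short readable message.
  $errors = array();
  foreach (libxml_get_errors() as $error) {
    $errors[] = sprintf('line %d: %s', $error->line, trim($error->message));
  }

  // Restore the previous libxml error handling state.
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  return array('ok' => $xml !== FALSE, 'errors' => $errors);
}
```

Passing the raw device response to this helper and running the result through dsm() should show whether the payload is empty, malformed, or fine.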

toshinobu’s picture

@mpgeek, thanks for your reply!

I tried your suggestion and inserted a dsm() call to see the value of the parameter being passed into the google_appliance_parse_device_response function, and it's actually blank!

/**
 * Parse the response from the Google Search Appliance device into a php array
 *
 * @arg $gsa_xml
 *    response text obtained from the device query
 * @return
 *    php array structure to iterate when displaying results
 */
function google_appliance_parse_device_response_xml($gsa_xml) {
  // Debug: dump the raw response text passed in from the device query.
  dsm($gsa_xml);

The error from simplexml_load_string is also blank.

Regarding the change to what is returned from GSA in this new version... Google had this to say:

Yes, the search results response has changed between 6.2 and 6.14.  Our Search Protocol Reference for 6.14 contains the results format, both for HTML and XML outputs.  You can view it here: https://developers.google.com/search-appliance/documentation/614/xml_reference

Link so you can click it: https://developers.google.com/search-appliance/documentation/614/xml_ref...

Do you have any other suggestions? I'm confused as to why the $gsa_xml variable is blank. Is there a GSA setting that I should be looking at?

Your continued help would be greatly appreciated.

Thanks!

mpgeek’s picture

There's no setting to look for, other than making sure you are actually connecting to the device. If this was working before and broke after the GSA firmware upgrade, I'm inclined to think this is not a connectivity problem, but rather a problem with how the module is handling the request-response round trip. My concern is that you have discovered a bug that will be a critical issue moving forward: as users upgrade their devices, the module will start failing.

What I think makes sense is for you to work with the code and come up with a patch that takes the new format into account. If you can come up with a working function that handles the new format properly, we can work with that. To keep the module backwards compatible, we'll need to keep both parsing functions, so we'll need to add version sensing. At the very least, perhaps a version string check in the code, then a switch to the right parse function. Or maybe we use a module setting to tell the module which parse function to use.
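As a very rough sketch of the version-check idea: every function name below is hypothetical, the two parsers are placeholders, and the firmware version string is assumed to come from somewhere like a module setting (the module does not currently store one).

```php
<?php
// Placeholder for the existing parser (hypothetical name, stub body).
function example_parse_legacy($gsa_xml) {
  return array('parser' => 'legacy');
}

// Placeholder for a parser written against the 6.14 format (hypothetical).
function example_parse_614($gsa_xml) {
  return array('parser' => '6.14');
}

/**
 * Pick a parse function based on a firmware version string.
 *
 * $firmware_version is an assumption: it might come from a module setting.
 */
function example_parse_dispatch($gsa_xml, $firmware_version) {
  // version_compare() compares dotted version strings segment by segment,
  // so "6.2" sorts before "6.14".
  if (version_compare($firmware_version, '6.14', '>=')) {
    return example_parse_614($gsa_xml);
  }
  return example_parse_legacy($gsa_xml);
}
```

A plain string comparison would sort "6.2" after "6.14", which is why the sketch leans on version_compare() instead.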

toshinobu’s picture

Thanks for the suggestions!

After a bit of digging I discovered that it seems there is no response from the GSA device. I inserted the following code at line 492, within the google_appliance_search_view() function:

    // query the GSA for search results
    $gsa_response = _curl_get(
      $search_query_data['gsa_host'],
      $search_query_data['gsa_query_params'],
      array(),
      $settings['timeout']
    );
    // Debug: dump the raw response returned by the device.
    dsm($gsa_response);

This returned nothing but a blank string.

I verified that the search data is valid by checking the value of $search_query_data and it all appears to be fine.
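A blank string on its own does not say whether the request failed or the device answered with an empty body. One way to tell the two apart might be to repeat the request outside the module with PHP's curl functions and look at the error details. This is a sketch only: gsa_debug_fetch() is a hypothetical helper, it assumes the curl extension is loaded, and it is not how the module's _curl_get() is implemented.

```php
<?php
/**
 * Hypothetical debugging helper: fetch a URL and return the body together
 * with curl's error details and the HTTP status code.
 */
function gsa_debug_fetch($url, $timeout = 10) {
  $ch = curl_init($url);
  // Return the body as a string instead of printing it.
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
  curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
  curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);

  // curl_exec() returns FALSE when the transfer itself fails.
  $body = curl_exec($ch);

  return array(
    'body' => $body === FALSE ? '' : $body,
    // Non-zero errno means a transport-level failure (DNS, refused, timeout).
    'errno' => curl_errno($ch),
    'error' => curl_error($ch),
    // 0 here means no HTTP response was received at all.
    'http_code' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
  );
}
```

A non-zero errno points at connectivity between the web server and the device, while an errno of 0 with an empty body or an unexpected http_code points at what the device chose to return.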

Any help would be greatly appreciated.

iamEAP’s picture

Chiming in, here. We're running 6.14.0.G.28 with no problems.

I don't recall if this was during update, but we did run into an issue around the same time where search results were failing if they happened to include a onebox that returned malformed XML. It's possible the latest GSA software handles malformed XML less gracefully than previous versions.

mpgeek’s picture

@iamEAP, thanks for your insight. @toshinobu, I would suggest creating a frontend on your GSA that's totally vanilla and using that for your troubleshooting in your Drupal site. Start bringing things into the frontend (oneboxes, etc.) and see what is breaking the results payload. Since we know that others are using 6.14 without issue, I'm guessing there's something funky with what the device is returning.

cdnsteve’s picture

Have you tried querying the GSA directly, through a browser pointed at the appliance, to make sure it is in fact returning results?

EG:
http://GSAURL/search?site=COLLECTION_NAME&client=FRONTEND_NAME&output=xml...

This will confirm that the GSA is returning XML properly.

You can then see whether the GSA is returning XML and whether it has crawled the site recently.
Check your web server logs.

Once you have confirmed that, I would run through the following checks:
1) Your web server has access to the GSA, and the GSA is crawling.
2) There are no host restrictions set in the GSA that block your web server by IP or otherwise.
3) Turn on Query inspection to see exactly what the GSA module in Drupal is querying the GSA with, then try that query directly in your browser to see what it does.

It sounds like an access host issue from the initial details.
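For the direct query, the test URL could be assembled along these lines. Treat it as a sketch: gsa_debug_build_url() is a hypothetical helper, and the q parameter and the xml_no_dtd output value are my assumptions about which parameters to send, so check them against the protocol reference for your firmware.

```php
<?php
/**
 * Hypothetical helper: build a direct search URL for manual testing.
 */
function gsa_debug_build_url($host, $collection, $frontend, $keywords) {
  $params = array(
    'q' => $keywords,           // search terms (assumed parameter)
    'site' => $collection,      // collection name, case-sensitive
    'client' => $frontend,      // frontend name, case-sensitive
    'output' => 'xml_no_dtd',   // assumed output format value
  );
  // http_build_query() URL-encodes each value; '&' is the separator.
  return rtrim($host, '/') . '/search?' . http_build_query($params, '', '&');
}
```

Pasting the resulting URL into a browser, and then requesting it from the web server itself, should show whether the two machines get the same answer from the device.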

iamEAP’s picture

Category: Bug report » Support request
Issue summary: View changes
Status: Active » Closed (cannot reproduce)

Cleaning up the issue queue; this has been open for over a year with no response regarding suggestions from #7. Closing.

If you're still experiencing this, feel free to reopen, but only do so if you can provide more details.