I'm having a bit of an odd issue I can't track down.

When viewing my search results on the site, they all have URLs like http://default/path/to/node instead of the expected http://mysite.com/path/to/node.

When I execute a query via the solr admin interface, the returned XML results each have the proper url set - that is, in the index, the document URL is correct. However, if I dump the $doc record in apachesolr_search_process_response(), it shows that the results returned by the apachesolr module have $doc->url set to the wrong path.

I am continuing to trace this down, but any input as to where this could be occurring would be very helpful!

Files:
#13 1852088-13.patch (1.25 KB, pwolanin)

Comments

brianV’s picture

To clarify, this doesn't happen with the core search module. However, I think that's implicit since the problem seems to come into play between the solr index (where the document URL is correct) and apachesolr_search_process_response().

Nick_vh’s picture

Perhaps the site URL in settings.php is set explicitly to something else, or an exported feature overrides it?

brianV’s picture

Tried those. $site_url isn't making a difference.

Also, wouldn't that affect the document prior to indexing when $document->url is set in _apachesolr_index_process_entity_get_document()? It appears that the proper $document->url is set in the index.

Nick_vh’s picture

Yeah, that would only take effect in the indexing process. Other things I can think of are contrib modules and/or theming issues.

This is what runs at indexing time (you can use it to try to replicate this). Perhaps multilingual content is making this awkward?

  $path = entity_uri($entity_type, $entity);
  // A path is not a requirement of an entity
  if (!empty($path)) {
    $document->path = $path['path'];
    $document->url = url($path['path'], $path['options'] + $url_options);
    // Path aliases can have important information about the content.
    // Add them to the index as well.
    if (function_exists('drupal_get_path_alias')) {
      // Add any path alias to the index, looking first for language specific
      // aliases but using language neutral aliases otherwise.
      $output = drupal_get_path_alias($document->path, $document->language);
      if ($output && $output != $document->path) {
        $document->path_alias = $output;
      }
    }
  }
brianV’s picture

Well, since this was a blocker and I was relatively out of ideas, I tried re-indexing again with $base_url set in settings.php, and that seems to have worked this time.

The odd thing is that the URL had been correct in the index all along. So in theory, if the problem was introduced somewhere between the correct index and the incorrect $doc in the results processing callback, setting $base_url without a reindex should have fixed it.

But it had no effect until I reindexed. Now it's working, and it appears to stay working even with $base_url commented out again. I'm not sure why this is happening, but it's working now...

rjbrown99’s picture

This happened to me as well, and adding a $base_url to settings.php and reindexing did fix the problem.
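
For anyone landing here, a minimal sketch of the settings.php change described above. The host is just the example site from the original report, so treat it as a placeholder, and note that per the comments above a reindex was still needed afterwards:

```php
// settings.php (sketch): pin the base URL so that absolute URLs built
// outside a normal web request (for example during indexing from the
// command line) do not come out as http://default/...
// 'http://mysite.com' is the example host from the original report;
// replace it with the real site URL. Do not add a trailing slash.
$base_url = 'http://mysite.com';
```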

julitroalves’s picture

Wow, Thanks man. ;D

Nick_vh’s picture

Status: Active » Closed (works as designed)
JeremyFrench’s picture

I didn't have a base URL set, but I did index via Drush without the URL being set, and ended up with this issue as well.

I'll reset and index with url set as a Drush flag and see if that helps.

mksweet’s picture

+1 I added $base_url to settings.php, deleted the index, and it worked. Still not sure why this happened to begin with though.

gdl’s picture

I had the same situation as #1852088-9: Apachesolr search results come back with $doc->url starting with 'default'. Without a $base_url set in the settings.php file, using "drush solr-index" to index content resulted in Solr documents with a 'site' property of "http://default", and a 'url' property that started with "http://default".

Using drush's "--uri" option with the proper URL resolved the problem.

FWIW, this appears to be an issue because _apachesolr_index_process_entity_get_document always tells the url() function to generate absolute URLs. I can see why you'd want to do that in some situations, but it makes my use case of prepopulating a Solr index from a staging site for eventual use on a live site more difficult. That could be done with proper _alter()ing, though, I suppose.

holtzermann17’s picture

Component: Code » Documentation
Status: Closed (works as designed) » Active

@gdl: thanks so much for the tip in #11! drush --uri=SITENAME solr-index for the win. Can this be documented in the drush help description?

  $items['solr-index'] = array(
    'callback' => 'apachesolr_drush_solr_index',
    'description' => dt('Reindexes content marked for (re)indexing.  You must use the --uri=SITENAME parameter.'),
    'options' => array(
      'environment-id' => 'The environment ID',
      'limit' => 'The total number of documents to index',
    ),
  );
pwolanin’s picture

Status: Active » Needs review
Attached: 1852088-13.patch (1.25 KB)

Like this?

damontgomery’s picture

We were able to use the --uri='http://www.mysite.com' parameter to have this work with drush.

Here is a super clear example:

drush solr-mark-all
drush --uri='http://www.mysite.com' solr-index

Nick_vh’s picture

Version: 7.x-1.x-dev » 6.x-3.x-dev
Status: Needs review » Patch (to be ported)

Committed, patch needs backport

enekoalonso’s picture

I like the drush --uri command, but what happens with subsequent index tasks triggered by cron jobs or node creation?

frankcarey’s picture

You might want to check out this option, which solves the issue a little more directly: #2223797: Wrong URL still stored in indexes when using drush.

Sylvain_G’s picture

I have the same issue on D6, on a much more complex setup:
* multisite search
* multilingual, with domain-based negotiation (http://fr.xxx.org and http://en.xxx.org)

So I cannot set base_url in settings.php, and I cannot add --uri to the drush command.

:(

Sylvain_G’s picture

My solution, finally, was:

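function XXXX_apachesolr_update_index(&$document, $node, $namespace = NULL) {
  // Enabled languages are keyed by status (1 = enabled), then by language code.
  $languages = language_list('enabled');
  $options = array('absolute' => TRUE, 'language' => $languages[1][$node->language]);
  $path = 'node/' . $node->nid;

  // Let domain-based language negotiation adjust the URL options for this language.
  language_url_rewrite($path, $options);
  $document->url = url($path, $options);
}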

This way, domain-based language negotiation and URLs generated from drush both work as expected.

Hope it helps