I've got an issue that I'm hoping has a simple solution.
We use the Secure Login module to force administrators and content editors onto https://mysite.com instead of the public-facing http://mysite.com. As a result, when we manually re-index the site at https://mysite.com/admin/config/search/apachesolr, every indexed page is stored with an https URL such as https://mysite.com/my-page.
The problem gets worse when a site has a development or preview URL before it goes live and the index was built from that test URL, for example https://test.mysite.com/my-page. Then it becomes really obvious that something's wrong.
When Drupal re-indexes content as a result of a save or edit, it works fine and uses the correct http://mysite.com URL.
Our current workaround is to turn off Secure Login, log out, log back in without the https:// in the URL, kick off the index, then turn Secure Login back on.
I want to be able to manually kick off the index, without having to disable Secure Login.
I can see a bunch of issues and changes where the links have been changed from being absolute to relative and vice-versa (#667650: Results of apachesolr_process_response should return absolute URLs, #337879: Store relative not absolute paths, #1765938: Move the variable_get() for "apachesolr_environments" after the cache_set() so that URLs can be modified dynamically), but I'm struggling to figure out what's current and how I can fix my issue. Any help appreciated.
Comments
Comment #1
marblegravy commented
This is very closely related to #1881164: Wrong domain in the path of the search results (using domain access module), and the comment at http://drupal.org/node/1881164#comment-7028786 was enough to help me solve this for now.
My version of hook_apachesolr_process_results() looks like this:
Comment #2
marblegravy commented
After thinking about this overnight, I think this version might be more robust in case clean_urls are turned off or the page has a query string for whatever reason.
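For anyone reading along, here is a rough sketch of that kind of rewrite. It is not the exact code from this comment. It assumes each result carries its URL in `$result['link']`, that the public site is `http://mysite.com`, and that the module is called `mymodule`; the helper `mymodule_rebase_url()` is a made-up name. It parses the stored URL and keeps the path, query string and fragment, so it should also cope with `?q=my-page` style URLs when clean URLs are off.

```php
<?php

/**
 * Hypothetical helper: move $url onto $public_base, keeping path and query.
 */
function mymodule_rebase_url($url, $public_base) {
  $parts = parse_url($url);
  // Leave relative or unparseable URLs untouched.
  if ($parts === FALSE || empty($parts['host'])) {
    return $url;
  }
  $rebased = rtrim($public_base, '/');
  $rebased .= isset($parts['path']) ? $parts['path'] : '/';
  // Preserve the query string, e.g. "?q=my-page" when clean URLs are off.
  if (isset($parts['query'])) {
    $rebased .= '?' . $parts['query'];
  }
  if (isset($parts['fragment'])) {
    $rebased .= '#' . $parts['fragment'];
  }
  return $rebased;
}

/**
 * Implements hook_apachesolr_process_results().
 *
 * Sketch only: the 'link' key and the hard-coded base URL are assumptions.
 */
function mymodule_apachesolr_process_results(array &$results, DrupalSolrQueryInterface $query) {
  foreach ($results as $key => $result) {
    if (!empty($result['link'])) {
      $results[$key]['link'] = mymodule_rebase_url($result['link'], 'http://mysite.com');
    }
  }
}
```

With this helper, `https://test.mysite.com/my-page?page=2` would come out as `http://mysite.com/my-page?page=2`, while a relative link is returned unchanged.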
Comment #3
marblegravy commented
Final version... completely different approach which should be the most reliable version provided that:
All I'm doing here is throwing out the URL sent from Solr, then letting Drupal generate the most appropriate URL it can from the raw path of the result.
This works with clean_urls on or off, works when paths are supposed to have query strings, and takes care not to re-create paths for content that the current site doesn't know about.
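A minimal sketch of this approach, for illustration only and not necessarily the exact code used here. It assumes the raw Drupal path (something like `node/123`) is available at `$result['fields']['path']` and that the displayed URL lives in `$result['link']`; `mymodule` is a placeholder module name. Using `drupal_valid_path()` as the "does this site know about it" check is also an assumption, and note that it checks access for the current user as well.

```php
<?php

/**
 * Implements hook_apachesolr_process_results().
 *
 * Sketch only: the 'link' and 'fields'/'path' keys are assumed result keys.
 */
function mymodule_apachesolr_process_results(array &$results, DrupalSolrQueryInterface $query) {
  foreach ($results as $key => $result) {
    // Raw Drupal path such as "node/123" (assumed to be stored with the result).
    $path = isset($result['fields']['path']) ? $result['fields']['path'] : NULL;
    // Keep the URL from Solr when this site cannot resolve the path, for
    // example content indexed by another site sharing the same Solr core.
    if ($path === NULL || !drupal_valid_path($path)) {
      continue;
    }
    // Discard the stored URL and let Drupal build one for the current
    // request; url() takes care of aliases and clean URLs being on or off.
    $results[$key]['link'] = url($path, array('absolute' => TRUE));
  }
}
```

Because the link is rebuilt at search time, a visitor on http://mysite.com should get http links regardless of which host the index was built from.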