I have a rather interesting situation. On a Drupal multisite setup, a Solr search on the main_site shows results from a sub_site. If, for example, a result is listed as node/18 on the results page, selecting this link takes me to main_site/node/18 even though the actual search result is at sub_site/node/18.

Just wondering if anyone else has come across this behavior, and how to tell Solr not to search subsites (such as anything in the sites/ directory).

Thanks.

Comments

JacobSingh’s picture

Status: Active » Closed (won't fix)

Multisite is not working.

I'm not sure how you managed to enable it, but it is not functional. If you want to work on it, be my guest, but please don't post any more support requests about it.

Thanks!
Jacob

rj.seward’s picture

Jacob:

Thank you for your reply. Perhaps I should have phrased my request differently. I do not have the Apache Solr multisite search module enabled and I am not attempting to get it working. I only have the Apache Solr framework and Apache Solr search modules enabled.

What I have is a problem where Solr appears to be searching through different sites on a multisite setup but I DO NOT want it to do this. I only wish to search the one site on which I have enabled the Apache Solr module and configured Search to use Solr.

As it is, I am getting results back with links to another site in the multisite install, but when followed they take me to the corresponding node on the first site. Here is a link to a search page for "apache" so you can see it firsthand: http://scis.wju.edu/drupal6/search/apachesolr_search/apache . The first link, for example, points to http://scis.wju.edu/drupal6/node/19 (which has nothing to do with Apache), whereas the teaser indicates that it should be pointing to http://scis.wju.edu/ralph/node/19, which is a page on a sub-site in the multisite install about setting up SVN on an Apache server. Or, more correctly, these links shouldn't be showing up at all.

It looks like a bug, but I was not sure and wanted to post it to see if anyone else had a similar experience.

Thanks!
Ralph

JacobSingh’s picture

Hi rj

When ApacheSolr indexes (on cron), it grabs nodes from the database of the site which is running. So there is no way it would include nodes from your other sites in your multisite setup unless you run cron against those sites and they have the same AS index settings.

Which, it appears, you are doing:
http://scis.wju.edu/ralph/search/apachesolr_search/apache

Hope that helps,
Jacob

rj.seward’s picture

Thanks again Jacob.

I disabled the Solr module on the subsite, then deleted and rebuilt the index on the main site. Now the problem appears to have been corrected.

So now, question: You wrote, "So there is no way it would include nodes from your other sites in your multisite setup unless you run cron against those sites and they have the same AS index settings." Could I theoretically have AS module enabled on more than one site on a Drupal install on a server? If so, which AS index settings would need to be changed? And where to change these?

Thanks for your help.

RJ

JacobSingh’s picture

Are you using Acquia Search?

If so, you're out of luck until we implement Solr Multi-site. Or, you can have two subscriptions (one for each site).

If you have your own setup, look into Solr multicore. You'll need to configure the settings at admin/settings/apachesolr.

hth,
jacob

kdes’s picture

Does this module now work with a multisite setup? I installed it (assuming it worked with multisite, based on http://drupal.org/node/322048#comment-1523900) and I see the same problem described in the first post. Also, per JacobSingh's post #3, the sites should not share the same "AS index settings" if results from another site are not to be displayed on the current site. How do you go about doing this? Where do I change the "AS index settings"? The settings at admin/settings/apachesolr would be the same for all sites, so there's nothing I can change there. What else needs to be changed?

In fact, I actually want the results from other sites to be displayed, which does happen right now, but when I click on a result it doesn't take me to the correct page.

E.g.: if I search from mydomain.com and the page is actually at subdomain.mydomain.com/node/13, instead of directing me to "subdomain.mydomain.com/node/13" it directs me to "mydomain.com/node/13".

kdes’s picture

Status: Closed (won't fix) » Active

robertdouglass’s picture

Status: Active » Fixed

Hi kdes - it's currently broken, as you've described. The framework for multisite to work is there, but a lot of parts have shifted around it and it needs repairing. If you have any resources that could be used to help fix it, I can help direct you. Otherwise you can wait until we get around to fixing it - which we will - but there's no guarantee about when.

kdes’s picture

Status: Fixed » Active

I don't have any resources in terms of money or programming skills. But I will try to look into this if you could direct me where. I guess the problem is that the URLs are currently being stored as relative URLs, and for it to work with a multisite setup they need to be absolute URLs.

JacobSingh’s picture

@kdes: Unfortunately, it's more than that. Most of the work around multisite is making sure it will work with facets which are single site only (such as userids, etc), or at least differentiating between which are multisite safe, and which are not. But there are other issues as well.

kdes’s picture

OK, but faceted search is optional, isn't it? And user IDs need not necessarily be different across multisites (I'm sharing the users tables, so user IDs are the same for all my sites). Anyway, if we forget about the other issues for now, how can the current problem with URLs not being stored correctly be resolved?

I guess at the moment the nodes are being indexed using only the nid, and the base URL of the current site is being used to display results, but the index needs to store the base URL for each node as well.

robertdouglass’s picture

@kdes - yes, you're right. It would be a step forward to get just the multisite searching working again with or without facets. I have to review what's changed since I wrote the multisite search code, but I think your analysis is correct. With the base URL and a node ID we should be able to fix this up.

One thing to look out for: if you run cron with www.example.com one time and example.com another, the base URL variable in Drupal might change. Therefore I suggest creating a new Solr-specific variable, set either automatically or by the admin, that can override the base URL. This might be handy for other needs too, like when you're building on dev.example.com and want to move to example.com later. Just a thought - it probably needs refinement, but I thought I'd point it out since it seems like you're interested in working on this issue.

pwolanin’s picture

@robertDouglass - I thought I added to the README the suggestion to set the $base_url. That's the basic solution to the problem, rather than creating an additional variable.
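For reference, pinning the base URL is a one-line change in the site's settings.php. This is a minimal sketch, assuming the site should always be indexed as http://www.example.com; the hostname and the file location in the comment are placeholders, not values from this thread.

```php
<?php
// sites/example.com/settings.php (placeholder path and hostname).
// Pin the base URL so it stays the same whether cron is triggered
// through www.example.com or example.com. No trailing slash.
$base_url = 'http://www.example.com';
```

With this set, absolute URLs built while indexing should no longer depend on which hostname was used to run cron.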

We are currently storing and retrieving the absolute url for each node, so I think a multi-site search without any facet support would work as easily as writing a module implementing the right _alter hook for search results and swapping the relative URL for the absolute one.
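The swap itself could look roughly like the sketch below. It is only an illustration: the function name example_use_absolute_links() is hypothetical, and the thread does not say which alter hook it would be called from. It assumes each result carries a 'link' entry plus the indexed document under 'node', with the absolute URL in its url property.

```php
<?php
/**
 * Hypothetical helper: point each result's link at the absolute URL
 * stored in the index instead of the site-relative path.
 *
 * Assumes each result has a 'link' key and a 'node' document object
 * with a 'url' property. Results without a stored URL are left alone.
 */
function example_use_absolute_links(array $results) {
  foreach ($results as $key => $result) {
    if (isset($result['node']) && !empty($result['node']->url)) {
      $results[$key]['link'] = $result['node']->url;
    }
  }
  return $results;
}

// Usage with a stand-in document object (illustrative values only).
$doc = new stdClass();
$doc->path = 'node/13';
$doc->url = 'http://subdomain.mydomain.com/node/13';

$results = example_use_absolute_links(array(
  array('link' => 'node/13', 'node' => $doc),
));

print $results[0]['link'] . "\n";
```

Keeping the swap in a separate module leaves single-site search untouched, which still links by relative path.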

kdes’s picture

@ pwolanin

Could you please post a patch for this?

robertdouglass’s picture

@kdes - unfortunately it doesn't work like that. This is an issue that pwolanin will most likely work on, and most likely relatively soon. You can know this because Acquia is paying him to work on the module, and multisite is one of the features that we'd like to support. However, pwolanin and myself largely have to work on Acquia's schedule. If you need to speed things up (ie you can't wait for pwolanin or someone else to get to this issue) you need to submit the patch yourself or hire someone who can help. Sorry to disappoint.

ronnbot’s picture

Attached file (status: new, size: 876 bytes)

I ran into the same problem with wrong node urls too. The issue is essentially that the search result is using the relative path instead of the absolute one, which is also available actually.

The username link is wrong too as the uid may not exist in the current searched site. To fix, you just have to take the base path from the node url/path and build the link manually instead of using the theme('username').

See the code below from apachesolr_search.module with the fix:

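      // Derive the indexing site's base URL by stripping the relative
      // path from the end of the absolute URL.
      $base_path = substr($doc->url, 0, strlen($doc->url) - strlen($doc->path));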
      $results[] = array(
        'link' => $doc->url,
        'type' => apachesolr_search_get_type($doc->type),
        'title' => $doc->title,
        'user' => $doc->uid ? l($doc->name, "$base_path"."user/$doc->uid") : theme('username', $doc),
        'date' => $doc->created,
        'node' => $doc,
        'extra' => $extra,
        'score' => $doc->score,
        'snippet' => $snippet,
      );

I've attached a patch to alter apachesolr_search.module to fix the node and user link issue. Hope this helps.
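To make the substr() step concrete, here is a small standalone sketch using the URLs mentioned earlier in this thread. The document object is a stand-in built by hand and the uid value is made up. The derivation only holds when the absolute URL ends with exactly the string in path, which this thread does not confirm for every indexed document.

```php
<?php
// Stand-in for a Solr result document; values are illustrative only.
$doc = new stdClass();
$doc->url = 'http://scis.wju.edu/ralph/node/19';
$doc->path = 'node/19';
$doc->uid = 4;

// Strip the relative path from the end of the absolute URL.
$base_path = substr($doc->url, 0, strlen($doc->url) - strlen($doc->path));

// Build the user link against the site the node was indexed from.
$user_link = $base_path . "user/$doc->uid";

print $base_path . "\n";  // http://scis.wju.edu/ralph/
print $user_link . "\n";  // http://scis.wju.edu/ralph/user/4
```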

Scott Reynolds’s picture

Why aren't we just using url('user/ID', array('absolute' => TRUE))? That gives you absolute paths.

ronnbot’s picture

@Scott Reynolds - true, but that would require changing apachesolr.index.inc and the schema, clearing all indexes, etc. It's just a more complicated solution for fixing simple URL issues. The bigger issue still persists with faceted search and so on, though. Getting all of that working would require a lot of changes to the apachesolr modules.

kdes’s picture

Thanks ronn. Applied your patch and it works.

pwolanin’s picture

Version: 6.x-1.0-beta5 » 6.x-1.x-dev

Good multi-site search is probably going to be a while yet in coming.

By design we don't use the $doc->url for single site search, since that is more fragile (e.g. if the indexing base url is not the same as the search base url).

xarbot’s picture

Subscribing.

xarbot’s picture

This patch works for me, but line 1303 of apachesolr.module also needs to change.

It has

$links[] = l($result->title, $result->path);

and I changed it to

$links[] = l($result->title, $result->url);

This is needed for the "More like this" block to work with absolute URLs.

Xarbot

xarbot’s picture

Hello. Now, what if I need a search restricted to only one of these subsites (for example, imagine we have both a general search and a local search)? How can I do this kind of search? Maybe with a hidden value that says to filter by URL, or something similar? I think it would have to be an argument passed to the Java search engine, wouldn't it?

Thanks in advance

Xarbot

pwolanin’s picture

See the 'hash' field in the schema.

In the alter hook, add a filter to the query:

$query->add_filter('hash', apachesolr_site_hash());

xarbot’s picture

Sorry, but I can't find the hook for the filter. Which file is it in?

Thanks

Xarbot

pwolanin’s picture

hook_apachesolr_modify_query(&$query, &$params)
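Putting the hook and the filter together, an implementation in a custom module might look like the sketch below. The module name "mymodule" is a placeholder. The code only runs inside Drupal with the apachesolr module enabled, since apachesolr_site_hash() and the query object's add_filter() method come from that module.

```php
<?php
/**
 * Implements hook_apachesolr_modify_query().
 *
 * "mymodule" is a placeholder; use your own module's name.
 */
function mymodule_apachesolr_modify_query(&$query, &$params) {
  // Restrict results to documents indexed by the current site,
  // identified by the 'hash' field in the Solr schema.
  $query->add_filter('hash', apachesolr_site_hash());
}
```

For a search that offers both local and global modes, the filter would be added only when the local mode is selected; how that choice reaches the hook is left open here.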
xarbot’s picture

Ok, it works!

If I have a bit of time this weekend, I will try to modify the form and the module to search either the global site or the local site. Then, with the patch in this thread, the module could work in a multisite environment and be searched in both modes (local and global).

Thanks in advance!

Xarbot

francewhoa’s picture

+1 for the Solr multisite search feature. Subscribing

robertdouglass’s picture

Version: 6.x-1.x-dev » 6.x-2.x-dev
Component: SolrPHP Client » Multisite
Category: support » feature
Status: Active » Needs work

Some good stuff here, including a patch. Moving to 6.2.

evoltech’s picture

subscribe

evoltech’s picture

Status: Needs work » Needs review
Attached file (status: new, size: 973 bytes)

Re-rolled the patch from @ronn abueg in #16 against the 6.x-2.x-dev branch.

robertdouglass’s picture

Title: Solr Multisite search » Prepare search results for multisite search

Adjusting title to reflect contents of current patch.

pwolanin’s picture

Status: Needs review » Needs work

The reason I did not do this previously is that the url field is the absolute url including the domain where the content was indexed. If your site content is accessible on multiple domains - or the same content is accessible on sub-domains - this patch breaks the existing functionality.

Certainly needs more discussion before putting this in as-is. Eventually I could imagine having some toggle for this.

pwolanin’s picture

Attached file (status: new, size: 3.98 KB)

Here's a patch, parts of which I'd like to commit in any case to support multisite search via that other module.

pwolanin’s picture

Attached file (status: new, size: 2.58 KB)

Here's the patch to just support multisite image snippets via another module doing the actual search.

jpmckinney’s picture

Is this still needed? If so, let's get it reviewed. Otherwise, let's close it.

pwolanin’s picture

$query->multisite

seems a little specific in its naming. Maybe we can make it more of a flag about wanting absolute links?

ataneja’s picture

Hi!

Can anyone please tell me exactly what needs to be done to show Nutch results in Drupal?

How do I need to merge the schema files? Do I need to copy and paste the above patch provided by David Stuart into the Nutch schema.xml?
What else needs to be done?

I have one more question: it seems we have to change the schema of the Solr server twice.
First, we need to copy the Nutch schema to the Solr schema so that the index from Nutch gets transferred to Solr.
Second, we need to copy the schema of the Solr module to the Solr server so that the Solr module can connect to the Solr server.

Is there any other way? Please reply.

Thanks

pwolanin’s picture

@ataneja - totally off topic for this issue.

ogi’s picture

subscribe

jpmckinney’s picture

Status: Needs work » Closed (won't fix)

6.x-2.x won't be maintained. It seems no one but a single subscriber cares about this issue in over a year. It's possible to make it work with multisite by implementing enough hooks (this is what I do in 7.x, still). Closing.

superfedya’s picture

Sad. I just installed Solr on my second site, and Solr shows the results from all my sites with incorrect URLs :(

Any fix?

barwonhack’s picture

Bump! On 7.x with Domain Access, the search result URLs are all effectively redirecting to the default site.

How to get the domain paths to match the current URL?