I don't think this issue really concerns the module, as the fault seems to lie entirely on Solr's side, but I thought someone here might be knowledgeable enough about Solr to explain it or at least offer helpful suggestions.

The issue itself: while testing several things lately, I discovered two kinds of (in my opinion) odd behaviour in Solr.

1) If you enter "Funny OR Wikipedia" in the Solr search of my demo site (it doesn't matter which of the two search forms you use, and it also works with the Solr admin interface), or look at http://soc2008.hotdrupal.com/~dmonkey/drupal6/?q=search/apachesolr_searc..., and then click on the "Filter by Author" > "drunken monkey" facet, there is suddenly one MORE result than without the additional "uid:1" query part. The same happens when adding "uid:1" manually, regardless of its position in the query.

2) The other problem seems to concern the boolean operators. The query "Psycho OR Wikipedia AND uid:1" at my test site (http://soc2008.hotdrupal.com/~dmonkey/drupal6/?q=solrsearch/Psycho%20OR%...) yields 4 results, the same results you get when omitting the "Psycho OR" part. Neither "(Psycho OR Wikipedia) AND uid:1" nor "Psycho OR (Wikipedia AND uid:1)" yields the same matches.

I could also reproduce both of these things locally, so it's at least not entirely specific to that installation. But maybe it's because of something in the schema.xml?

Any ideas or suggestions are highly appreciated.

Comments

mattconnolly’s picture

Is your default query operator "OR" (in the schema.xml file)?

When the default operator is "OR", filters don't work as expected. For example, the query above changes from:

Funny OR Wikipedia

to:

Funny OR Wikipedia OR uid:1

drunken monkey’s picture

No, the default operator was kept at "AND" as is the default for this module.
You can easily test this by searching for "Funny Wikipedia", which doesn't yield any results.

But thanks for the answer.

robertdouglass’s picture

Priority: Normal » Critical

We need resolution on this before 1.0 release.

kleung11’s picture

It seems that when you put an "OR" in the search term, Solr switches the default operator from "AND" to "OR", so what mattconnolly is saying is true. The query becomes Funny OR Wikipedia OR uid:1.

I tried these cases, which all behave the same way:
Funny OR uid:1
Wikipedia OR uid:1
Funny OR Wikipedia OR uid:1

So if you remove "OR", the following queries work correctly:
Funny uid:1
Wikipedia uid:1

I tried the same thing directly on Solr, so it's definitely not the module's fault. This is the behaviour of the standard request handler. After changing to dismax, I find that "OR" is dropped, as it is listed in stopwords.txt, so the default operator isn't overridden, which may or may not be desirable.

Also, I don't want to sound like I'm voting for the use of fq constantly, but *I think* fq should fix it as the filter is run after the actual query.
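
A minimal sketch of that idea, assuming the filter is sent in Solr's separate "fq" parameter instead of being appended to "q". The function name is hypothetical and not part of the module; it only builds the query string:

```php
<?php
// Hypothetical helper, not part of the apachesolr module.
// Keeps the user's keywords in "q" and sends every filter as its own "fq".
function example_solr_query_string($keys, array $filters = array()) {
  $parts = array('q=' . urlencode($keys));
  foreach ($filters as $filter) {
    // Solr expects repeated "fq" parameters, not PHP-style "fq[0]=...".
    $parts[] = 'fq=' . urlencode($filter);
  }
  return implode('&', $parts);
}

echo example_solr_query_string('Funny OR Wikipedia', array('uid:1')), "\n";
// Prints: q=Funny+OR+Wikipedia&fq=uid%3A1
```

Because the filter never becomes part of the "q" string, the OR between the keywords cannot change how "uid:1" is interpreted.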

robertdouglass’s picture

What Solr versions are you using?

kleung11’s picture

My version was 5.x alpha3 when I tried it. Try any search term and note the number of results. Try another (unrelated) search term and note the number of results. Try combining the two terms and note the number of results (this proves Solr is using AND). Then try combining the two terms with OR (case matters) in between and note the number of results (this shows Solr is using OR). Since the filter wasn't applied as a real filter query, the whole thing essentially became something like "term1 OR term2 uid:2". My guess is that "(term1 OR term2) uid:2" might work, but I'm not sure.
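
A rough sketch of that guess, with a hypothetical helper name that is not part of the module. It wraps the keywords in parentheses and marks both parts with "+" so that neither is left optional. It does not handle unbalanced parentheses in the user's input:

```php
<?php
// Hypothetical helper, not part of the apachesolr module.
function example_group_keys_with_filter($keys, $filter) {
  // The keywords become one required subquery; the filter is a second
  // required clause next to it.
  return '+(' . $keys . ') +' . $filter;
}

echo example_group_keys_with_filter('term1 OR term2', 'uid:2'), "\n";
// Prints: +(term1 OR term2) +uid:2
```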

robertdouglass’s picture

I meant: which version of the Java Solr server are you using, 1.2 or 1.3?

kleung11’s picture

Solr 1.2

JacobSingh’s picture

Status: Active » Closed (won't fix)

Closing for inactivity

jarchowk’s picture

Does anyone have an update on this? We are finding the search query too generic, and any attempt to use AND or OR in the query string doesn't seem to have any effect. However, +, - and NOT do work.

http://lucene.apache.org/java/2_3_2/queryparsersyntax.html

Our problem is that we have a few hundred thousand documents and our users need more control over the search.

jarchowk’s picture

OK, I think I found the answer by reading two posts on Nabble.com:

http://www.nabble.com/Question-on-query-syntax-td11571616.html#a11571832
--------------------
Lucene's "boolean" operators are not true boolean operators.
Instead, every clause is one of:

OPTIONAL
REQUIRED
PROHIBITED

for a query (or parenthesized subqueries) to match, all REQUIRED
clauses must match, zero PROHIBITED clauses must match, and if there
are not REQUIRED clauses, at least one OPTIONAL must match. You
cannot have only PROHIBITED clauses.

Now, the syntax for each is (nothing), +, -, and they can be applied
to entire subqueries using brackets:

+hello -(goodbye -night)

returns docs that have hello, and do not have (goodbye without night)

In lucene, AND/OR/NOT are syntactic sugar that translates clauses to
the above form. However, it imperfectly matches people's (rational)
expectations of how boolean operators work. Also, brackets _create
subqueries_, not just group operators. I suggest that AND and OR
never be used programmatically, if possible.

Try these alternatives:

docs (must) containing 'text' that do not match (col=pile1 or col=pile2)
> text -(collection:pile1 collection:pile2)

same as above
> text -collection:pile1 -collection:pile2

docs (must) contain 'text' that (must) match (col=pile1 or col=pile2)
> +text +(collection:pile1 collection:pile2)

Note in the last example, the + is necessary before the text because
otherwise it would be optional and not required (as there are other
required clauses).
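
To make the rules above concrete, here is a toy model of them in PHP. It is only a sketch of the quoted description, not Lucene's actual code, and all names in it are hypothetical. The last example assumes that, with the default operator set to AND, "Funny OR Wikipedia uid:1" is translated to "Funny Wikipedia +uid:1"; if that assumption holds, it would explain why adding the filter returns more results instead of fewer.

```php
<?php
// Toy model of the clause rules quoted above; this is NOT Lucene's real code.
const CLAUSE_OPTIONAL = 'optional';
const CLAUSE_REQUIRED = 'required';
const CLAUSE_PROHIBITED = 'prohibited';

/**
 * Checks whether a document (a list of terms) matches a list of clauses.
 *
 * Each clause is array(occur, query); query is a term string or a nested
 * list of clauses standing in for a parenthesized subquery.
 */
function example_clauses_match(array $clauses, array $doc_terms) {
  $required_seen = FALSE;
  $optional_hit = FALSE;
  foreach ($clauses as $clause) {
    list($occur, $query) = $clause;
    $hit = is_array($query)
      ? example_clauses_match($query, $doc_terms)
      : in_array($query, $doc_terms, TRUE);
    if ($occur === CLAUSE_PROHIBITED) {
      if ($hit) {
        // A matching PROHIBITED clause rejects the document.
        return FALSE;
      }
    }
    elseif ($occur === CLAUSE_REQUIRED) {
      if (!$hit) {
        // Every REQUIRED clause has to match.
        return FALSE;
      }
      $required_seen = TRUE;
    }
    elseif ($hit) {
      $optional_hit = TRUE;
    }
  }
  // Without REQUIRED clauses, at least one OPTIONAL clause must match;
  // this also rejects queries made only of PROHIBITED clauses.
  return $required_seen || $optional_hit;
}

// +hello -(goodbye -night)
$quoted = array(
  array(CLAUSE_REQUIRED, 'hello'),
  array(CLAUSE_PROHIBITED, array(
    array(CLAUSE_OPTIONAL, 'goodbye'),
    array(CLAUSE_PROHIBITED, 'night'),
  )),
);

// Assumed translation of "Funny OR Wikipedia uid:1": Funny Wikipedia +uid:1
$thread = array(
  array(CLAUSE_OPTIONAL, 'Funny'),
  array(CLAUSE_OPTIONAL, 'Wikipedia'),
  array(CLAUSE_REQUIRED, 'uid:1'),
);

var_dump(example_clauses_match($quoted, array('hello', 'goodbye')));          // bool(false)
var_dump(example_clauses_match($quoted, array('hello', 'goodbye', 'night'))); // bool(true)
var_dump(example_clauses_match($thread, array('uid:1', 'Unrelated')));        // bool(true)
```

In the last call the document contains neither "Funny" nor "Wikipedia", yet it matches, because once a REQUIRED clause is present the OPTIONAL clauses no longer restrict the result set.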

http://www.nabble.com/querying-with-two-words-returns-less-results-when-...
----------------

"Yep setting mm=1 did the trick for us. Thanks! "

So I just added mm=1 (the dismax "minimum should match" parameter) and my AND/ORs started working:

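/**
 * Implementation of hook_apachesolr_modify_query() in a custom module
 * named "customsolr".
 */
function customsolr_apachesolr_modify_query(&$query, &$params) {
  // Require only one of the optional clauses to match.
  $params['mm'] = 1;
}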