Project: Apache Solr Search Integration
Version: 6.x-2.x-dev
Component: Code
Category: feature request
Priority: normal
Assigned: Unassigned
Status: closed (fixed)

Issue Summary

http://localhost:8983/solr/select?fl=nid%2Ctype&facet=true&facet.mincount=1&facet.field={!ex=type}type&fq={!tag=type}type%3Apage%20OR%20type%3Afoo

This will produce a list of documents that include pages and foos, and will list other types not selected and their counts:

[X] page (15)
[X] foo (8)
[ ] story (11)
[ ] forum (3)

Note that this doesn't work with q.alt.
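
The request above can be assembled from its parts. This is only a minimal sketch: `example_or_facet_params()` is a hypothetical helper, not part of the apachesolr module, and it assumes the values need no Lucene escaping. It tags the filter with `{!tag=...}` and excludes that tag on the facet with `{!ex=...}`, as in the URL above.

```php
<?php
// Hypothetical helper (not module code): build the Solr parameters for one
// OR facet. Values are used as-is; no Lucene escaping is attempted.
function example_or_facet_params($field, array $values) {
  $clauses = array();
  foreach ($values as $value) {
    $clauses[] = "$field:$value";
  }
  return array(
    'fl' => 'nid,type',
    'facet' => 'true',
    'facet.mincount' => 1,
    // Ignore the tagged filter when counting this facet.
    'facet.field' => "{!ex=$field}$field",
    // Tag the filter so the facet can refer to it by name.
    'fq' => "{!tag=$field}" . implode(' OR ', $clauses),
  );
}

$params = example_or_facet_params('type', array('page', 'foo'));
// PHP_QUERY_RFC3986 encodes spaces as %20, as in the URL above.
$url = 'http://localhost:8983/solr/select?'
  . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
print $url . "\n";
```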

Comments

#1

This looks good if the syntax can be reliably decomposed/recomposed. Checkboxes / multiple selects would be a standard way of picking several inclusive items. CGI would normally encode them as separate instances of the parameter but I appreciate that fq= has to encode Lucene-syntax queries. That does mean you'd need an intermediate stage, where the checkboxes sent you to a URL of the format:

http://...?type=page&type=foo...

and that would then have to redirect to the proper Lucene syntax. Either that or get the ApacheSolr module to handle conversions between CGI and Lucene syntax and keep CGI-friendly URLs? I suppose it's a moot point if the initial search submission is HTTP POST: you'll have to redirect somewhere, as with standard non-Solr search.
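
A rough sketch of the CGI-to-Lucene conversion described above might look like this. Both function names are hypothetical and the values are not escaped. PHP's `$_GET` keeps only the last of several same-named parameters, so the sketch splits the raw query string by hand.

```php
<?php
// Hypothetical helper: collect repeated CGI parameters from a raw query
// string into arrays keyed by parameter name.
function example_collect_params($query_string) {
  $collected = array();
  foreach (explode('&', $query_string) as $pair) {
    if ($pair === '') {
      continue;
    }
    $parts = explode('=', $pair, 2);
    $name = urldecode($parts[0]);
    $value = isset($parts[1]) ? urldecode($parts[1]) : '';
    $collected[$name][] = $value;
  }
  return $collected;
}

// Hypothetical helper: turn all values submitted for one field into a
// single tagged OR filter, or an empty string if the field is absent.
function example_cgi_to_fq($query_string, $field) {
  $collected = example_collect_params($query_string);
  if (empty($collected[$field])) {
    return '';
  }
  $clauses = array();
  foreach ($collected[$field] as $value) {
    $clauses[] = "$field:$value";
  }
  return "{!tag=$field}" . implode(' OR ', $clauses);
}

print example_cgi_to_fq('type=page&type=foo', 'type') . "\n";
```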

As tweeted, we've done some work on OR queries, but we never exposed it to the user: it was for making complex initial-conditions facets for particular site searches (publications searches etc.) Alongside checkboxes and multiple selects you could maybe model using the widget Django uses for transferring users between groups: it's like two multiple selects with arrowed buttons in between, so you can pass terms back and forth. That's pretty cumbersome on the page but it does lessen the impact of accidentally not holding down CTRL during multiple select.

I'm trying to think if there's a non-form equivalent way of doing this. Radio buttons (AND searches) have their parallels in standard web links, which ApacheSolr AND facets already use to great effect to drill down. Is there any non-form parallel to the multiple non-exclusive select? Maybe Flickr-like edit-in-place, where (at least after you've made your initial multiple selection) you just have a set of items with Xs beside them to delete, and clicking on the link body turns into a set of checkboxes.

#2

So, like you said, there are two ways to go. Either try to encode a fair amount of Solr syntax in the URL ({!tag=type} for example), or have an intermediate step that parses to the right syntax. A third option would be to have application state logic (like a block configuration) that modifies the query. I don't like that option because it means one query could behave differently under different circumstances depending on configuration. So my current favorite is to prefix the filter with an underscore _ if it is to be an OR facet. This would make URLs like this:
http://localhost/drupal-6.13/search/apachesolr_search/?filters=tid%3A113%20_type%3Astory%20tid%3A335%20_type%3Apage
Not horrible looking, and it doesn't conflict with Lucene syntax. http://lucene.apache.org/java/2_3_2/queryparsersyntax.html
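
To illustrate the underscore convention, here is a minimal sketch of how such a filters string could be split into AND and OR filters. `example_parse_filters()` is a hypothetical name, not the patch's code, and the sketch assumes simple `field:value` tokens separated by whitespace (no quoted phrases).

```php
<?php
// Hypothetical helper: split a filters string into Solr fq values.
// Tokens whose field name starts with "_" are grouped per field into one
// tagged OR filter; all other tokens become separate (ANDed) filters.
function example_parse_filters($filters) {
  $and = array();
  $or = array();
  $tokens = preg_split('/\s+/', trim($filters), -1, PREG_SPLIT_NO_EMPTY);
  foreach ($tokens as $token) {
    $parts = explode(':', $token, 2);
    if (count($parts) < 2) {
      // Not a field:value pair; skip it in this sketch.
      continue;
    }
    list($field, $value) = $parts;
    if (strpos($field, '_') === 0) {
      $field = substr($field, 1);
      $or[$field][] = "$field:$value";
    }
    else {
      $and[] = "$field:$value";
    }
  }
  $fq = $and;
  foreach ($or as $field => $clauses) {
    $fq[] = "{!tag=$field}" . implode(' OR ', $clauses);
  }
  return $fq;
}

print_r(example_parse_filters('tid:113 _type:story tid:335 _type:page'));
```

One consequence of this convention is that a real index field whose name begins with an underscore could not be used as an AND filter without some extra escaping rule.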

#3

Status: active » needs review

Please test! This is a big patch. I only tested on the apachesolr_module facets. Please help test on other modules' facets.

You set the operator for the facet on the block configuration page.

Attachment       Size      Status                        Test result  Operations
or_facets.patch  12.86 KB  Ignored: Check issue status.  None         None

#4

Added a trim on the filters value: $queryvalues['filters'] = trim($queryvalues['filters']);

Attachment       Size      Status                        Test result  Operations
or_facets.patch  12.98 KB  Ignored: Check issue status.  None         None

#5

CCK facet field block deltas are not the same as their Solr index field names.

Attachment       Size      Status                        Test result  Operations
or_facets.patch  13.24 KB  Ignored: Check issue status.  None         None

#6

Status: needs review » fixed

#7

Status: fixed » active

The numbers next to the facets in OR blocks are not accurate, and there is a question around what they should be. In AND filters the numbers show you how many documents would be in the result set if you click the link. To be consistent the OR facets would then have to show the same - how many results will be in the result set. I'm not sure how easy/hard this will be to calculate, but it will involve arithmetic on the currently selected facets within the same filter to find a delta to add to the current document set.
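
The arithmetic can be illustrated with explicit document ID lists. This is only a sketch of the idea: `example_or_count_after_click()` is a hypothetical helper, and in practice the module receives facet counts from Solr rather than ID lists, so it shows what the number would mean, not how to obtain it.

```php
<?php
// Hypothetical helper: size of the result set after adding one more value
// to an OR facet.
// $current:   IDs of the documents in the current result set.
// $candidate: IDs of the documents matching the not-yet-selected value,
//             with all other filters still applied.
// Both arrays are assumed to contain no duplicate IDs.
function example_or_count_after_click(array $current, array $candidate) {
  // The delta is the documents the new value adds that are not already shown.
  $delta = count(array_diff($candidate, $current));
  return count($current) + $delta;
}

$current = array(1, 2, 3, 4);  // e.g. documents matching "page" OR "foo"
$candidate = array(3, 4, 5);   // documents matching "story"; 3 and 4 overlap
print example_or_count_after_click($current, $candidate) . "\n";
```

For a single-valued field such as type the overlap is always empty, so the delta is simply the facet count of the unselected value. The overlap only matters for multi-valued fields such as taxonomy terms.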

#8

One issue with the current OR facet search is that it appears to function fine for CCK fields, but not taxonomy fields. The reason seems to be related to this section of code in the apachesolr_modify_query function around line 1236.

      if (in_array($delta, $ors) || in_array($cck_delta, $ors)) {

In the case of taxonomy terms, the $ors array contains information on the individual vocabularies, but $delta is 'tid'. As a quick 'hack' I changed the line above to be:

      if ($delta == "tid" || in_array($delta, $ors) || in_array($cck_delta, $ors)) {

which worked, but is probably not the ideal solution.

#9

Tried the patch for taxonomy in #8, works for us... Interested in following up on this.

#10

OR faceting was not working for taxonomy facets. I fixed that in http://drupal.org/cvs?commit=358016. However, the facet counts are still wrong.

This fixes #8, #9, but leaves #7 open.

#11

Title: OR facets » OR facet counts are confusing

See #7 for a description of this bug.

#12

<?php
if (in_array($delta, $ors)) {
  $ex = "{!ex=$delta}";
  $op = 'OR';
}
$ff = implode(" $op ", $params['facet.field'][$delta]);
$params['facet.field'][] = $ex . $ff;
?>

What? facet.field={!ex=foo}foo OR bar is not valid Solr syntax, and yet that is exactly what the above code will do if $params['facet.field'][$delta] has multiple values. It looks like this code was reused from apachesolr_modify_query, which operates on the fq parameter, for which fq={!tag=foo}foo:1 OR foo:2 is valid. Fixed in http://drupal.org/cvs?commit=360436
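
For contrast, a form that stays within valid Solr syntax sends one facet.field value per field, each carrying its own exclusion prefix. The sketch below is not the code from the commit above. `example_facet_fields()` is a hypothetical helper, and it assumes the exclusion tag is named after the field.

```php
<?php
// Hypothetical helper: build one facet.field value per field. Fields listed
// in $ors get an {!ex=...} prefix so their own tagged filter is ignored
// when Solr counts that facet.
function example_facet_fields(array $fields, array $ors) {
  $facet_fields = array();
  foreach ($fields as $field) {
    $ex = in_array($field, $ors) ? "{!ex=$field}" : '';
    $facet_fields[] = $ex . $field;
  }
  return $facet_fields;
}

print_r(example_facet_fields(array('type', 'tid'), array('type')));
```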

At the same time, closing #726660: Tag & Exclude Filters Use Inconsistent Names and #648036: Don't move fq params to q.alt if fq params include local params.

We can now continue with the issue in #7.

#13

#14

subscribe

#15

re: @robertDouglass #7:

An example interface using OR facets is at http://www.lucidimagination.com/search/#/p:lucene,solr :

  • the counts on the "Project" block facet values show the # of items that match.
  • if two or more facet values from that block are picked, yes, the total count will be higher than any of the facet value's count, and in some cases it could be less than the sum of the picked facet values.
  • the numbers do not change while facet values from the *same block* (e.g. "Project" facet values) are chosen. They do change, however, when any other facet value from another facet block is.

There is also the issue of OR hierarchic facets; we are not really handling those very well (meaning, in a way that makes sense). The above example also (to me) makes sense.

We should probably have a mockup of how this should work.

#16

#17

@janusman: Perfect example. Is there any fix for that ever built? I desperately need this

#18

There's a recently-committed patch for the 7.x branch that actually rewrites the behavior of hierarchic taxonomy facets, and it seems to work perfectly there. I think we should backport to 6.x. See: http://drupal.org/node/1049114#comment-4198252

#19

As soon as James is happy enough to roll another 6.x-2.x release I think we'll call it EOL/unsupported and work instead on a 6.x-3.x. That's the place I'd like to see such a backport happen.

#20

I probably won't do more work on 6.x-2.x. I was mostly going through that queue because it is closest to 7.x. Any bugs in 6.x-2.x likely exist in 7.x. I just wanted to fix 7.x bugs.

There are 132 issues in 6.x-2.x, roughly half the issues for the entire apachesolr project (41 bugs, 20 tasks, 48 feature requests, 23 support requests). We should close/move those issues to other versions if we choose to EOL 6.x-2.x. No point in carrying 132 issues around forever.

#21

Indeed - we need to decide if they are relevant to 7.x-1.x or a new 6.x-3.x.

Probably support requests older than ~14 days can just be closed at that point.

Same for most feature requests unless they have a useful patch or it's something we have in the roadmap.

James - I'll leave it up to you if you like (since you've been looking the most) when to roll a final 6.x-2.x release. I think we should also roll a 6.x-1.x release within the next couple weeks, and ideally a 7.x-1.x-beta4 by early next week.

#22

I agree with that outline. I'll go through the 6.x-2.x queue again to evaluate things.

#23

I went through 2.x support requests and tasks, and the open/needs work feature requests. Remaining feature requests and bug reports are left. Would love some help with those.

#24

And just like that, 21 issues in the 6.x-2.x queue.

http://drupal.org/project/issues/search/apachesolr?text=&assigned=&submi...

I've updated #1098954: Short-term roadmap for 6.x-1.3 release with the "Needs Review" patches that I wrote and want feedback for (all bugs).

#25

Status: active » closed (fixed)