Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
I need to get Solr doing Field collapsing/result grouping which amounts to sending some extra parameters to Solr, and interpretting the results a little differently when they come back.
To do this cleanly I've proposed the notion of 'Query extenders' in #1720348: Add the concept of query extenders which mean that the collapsing behaviour can be turned on and off per query.
Comment | File | Size | Author |
---|---|---|---|
#30 | 1721262-30--field_collapsing.patch | 6.57 KB | drunken monkey |
#29 | 1721262-29--field_collapsing.patch | 4.75 KB | das-peter |
#29 | interdiff-1721262-23-29-do-not-test.diff | 2.4 KB | das-peter |
#23 | 1721262-23--field_collapsing.patch | 3.48 KB | das-peter |
#20 | 1721262-20--field_collapsing.patch | 3.57 KB | drunken monkey |
Comments
Comment #1
Steven Jones CreditAttribution: Steven Jones commentedThis patch requires the patch in http://drupal.org/node/1720348#comment-6330704
It does the following:
Comment #2
Steven Jones CreditAttribution: Steven Jones commentedAnd this patch allows you to specify a sort order within the groups.
Comment #3
Steven Jones CreditAttribution: Steven Jones commentedActually the previous patch was flawed and wouldn't actually return any results.
Comment #4
das-peter CreditAttribution: das-peter commentedHey Steven, thank you very much for the patch. I'm playing around with it to figure out how I'll create a feature I need for a customer :)
What would you think about changing
search_api_solr_field_collapse
tosearch_api_grouping
? I think collapsing / grouping isn't a feature only of / for solr - thus it would be nicer to use a more generic name so that other service adapters can hop on an implement this feature on their own without adding more feature names.Comment #5
das-peter CreditAttribution: das-peter commentedJust struggled over this:
As far as I can see
$type
is undefined here. I think it should be$field['type']
.Comment #6
das-peter CreditAttribution: das-peter commentedI've created a sandbox for my work on this topic Search API Grouping.
It provides processors for splitting indexing items and grouping results.
The grouping code relies on the code provided here. This now introduces the issue related to query extenders I mentioned here: #1720348-3: Add the concept of query extenders
I suggest to handle just the grouping support in this issue and the query extender feature in this one: #1720348: Add the concept of query extenders
To do so, I've split the patch into one that only deals with the grouping mechanism and one that provides the query extender. I fixed some coding standard issues and the issue mentioned in #5. Besides that I renamed the feature according to my suggestion in #4.
Comment #7
das-peter CreditAttribution: das-peter commentedI've updated the field collation support patch - I struggled over some small issues.
The query extender patch from #6 is still valid.
Comment #8
Steven Jones CreditAttribution: Steven Jones commentedThanks for the feedback.
My impression is that Solr field collapsing will have different parameters and options from a similar feature in say MongoDB, and so making the support of it specific to Solr is what's wanted here I think. If a little way down the line they actually end up being very similar, then yes, this could be combined in a search API backend feature.
I've added the ability to change the amount of results returned in each group, which meshes really well with Views' built in render-time grouping actually!
This patch is based on mine in #3, sorry.
Comment #9
Steven Jones CreditAttribution: Steven Jones commentedSorry that patch had a debugging line in it.
Comment #10
drunken monkeyThis looks like a handy feature, and if the results still look the same (i.e., adhere to the contract of
SearchApiQueryInterface::execute()
), I see no reason why we shouldn't include this feature and option.The only problem I see is that, if I understand it correctly, the returned results could be kinda weird in some cases (when grouping on a field with only a few different values), having a large result count but still returning less results than the limit, massive problems with the pager, and so on. That would, at least in some sense, violate the method's contract. Maybe we should just return the number of groups as the result count?
Thanks for splitting, Peter – I do think we should only handle the Solr support in this issue. The query extender isn't Solr-specific at all (and shouldn't be) and should probably go into another project. This project should then also contain clear documentation of the syntax and semantics of the
search_api_grouping
feature (or whatever name we end up using). The Solr module should then adhere to those specifications as closely as possible.I'll look at the patches more closely later.
Comment #11
Steven Jones CreditAttribution: Steven Jones commentedI'm not sure it's odd, Solr at least will return the correct number of groups, and the paging works just fine. If you're returning one item per group then it's really transparent to the outsider. If you are returning more than one result per group then yes, it is a little weird, e.g. ten items per page, but you actually get one hundred back from Solr.
I would suggest looking at adding this feature to the MongoDB Search API backend or similar and seeing if the feature can be generalised to those backends in a consistent manner before deciding to name this non-Solr specifically.
Comment #12
das-peter CreditAttribution: das-peter commentedGlad to see some movement here :)
As SQL understands grouping I'd bet there's need in other search backends too. Thus I'd say the point is more to figure out which options the different backends have in common to create a generalized options set.
Comment #13
drunken monkeyBut if you, e.g., have only three groups but twelve results, wouldn't you get a pager with two pages, but only three results listed and an empty second page? That doesn't seem very transparent to me.
You could return the total number of results in a separate key, something like
search_api_grouping_result_count
, so as to enable displaying the real number somewhere – but the pager should definitely use the number of groups as the basis.I agree with Peter here – it would definitely be possible in other backends, too (we also don't know what other backends might come in the future), and there is really no advantage to naming it in a Solr-specific way. Currently, nearly no feature is supported by a non-Solr backend, but still none is named in such a way.
Comment #14
Steven Jones CreditAttribution: Steven Jones commentedSo with the default settings that the Solr implementation provides, it returns 1 result per-group, and gets Solr to use the groups for paging information. So in the above scenario you get back from the backend that there were three results total, and these three results returned.
Solr supports returning N results per group (and then paging through those in addition to paging through the groups themselves) and so in the latest patches that is supported. What that looks like though is getting 12 results back when the paging information indicates that there are only 3 results (the 3 groups still). This is a little weird at this stage, and is clearly stated so in the UI where you are given the option. This actually works really well with Views result grouping though, because actually you can get all the results for each group back, and page through the groups, which is usually what you want to do with Views result grouping anyway.
Comment #15
Steven Jones CreditAttribution: Steven Jones commentedHere's a patch that builds on the patch in #9 but adds support for the new Solr 4 group.facet parameter.
Comment #16
dasjowhile working on Geocluster's solr integration, i found the need to use grouped results with search api solr. for a more details see #1800850: Use a solr plugin with search_api_solr for clustering
in my case i'm adding grouping functionality on the code-level, so i'm not depending on the query extender. i therefore sticked to the field collapsing patch from #6. find a re-roll attached (search_api_solr-field-collapsing-support-1721262-16.patch).
for completeness, i'm also attaching (search-api-solr-1721262-field-collapsing-16.patch) which is an update of #9.
Comment #17
stella CreditAttribution: stella commentedI got this working with the patches from comment 9 and comment 15. However, I'm using Solr 4.x (with config files from the sandbox at http://drupal.org/sandbox/cpliakas/1600962) and while it works perfectly, the above patches only allow field collapsing on string type fields (in line with what Solr 3.x supports).
In case anyone else is also using Solr 4.x, the attached patch removes that restriction. As this module only officially supports 3.x, I've just rolled this patch on top of the above patches, rather than rerolling them.
Comment #18
stella CreditAttribution: stella commentedHere's an additional patch to expose the per group 'numFound' field to Views, so we can display the number of records in a grouped result.
Comment #19
Anonymous (not verified) CreditAttribution: Anonymous commentedWhat are the steps to use the UI to configure field collapsing? I installed all patches and Search API Grouping, but fail to see where to configure anything.
Attached a re-roll of patches #9 + #15 + #17 + #18 against latest devx.
Comment #20
drunken monkeyAs already said in #10, the query extender isn't Solr-specific and should thus be contained in a different project, not in this patch. I also think the additional Views field, although probably tricky to add otherwise, is much too specific to be included in the module.
So, attached is a re-roll with those changes, along with some others (e.g., using
search_api_grouping
as the featurename and option key).One thing still needed is a module actually defining the
search_api_grouping
feature, so we have a clear spec on which options we actually need to recognize, and for other developers to know how they can use this feature. This would then also need to be referenced in ourREADME.txt
file, like the other features.A module with Steven Jones' query extender would be a candidate for that, if he is still interested (or someone else wants to go ahead and create it). Defining it right in this module would be possible as a “last ressort”, too, but would be anfavorable because the feature is neither Solr-specific nor actually used by this module.
In any case, we need the feature documented somewhere. (And when doing so, I would vote for removing the
search_api_grouping[collapse]
key. I don't think there's any use in specifying the option but setting this toFALSE
, or is there? Also, I'd combinegroup_sort
andgroup_sort_direction
into a single associative array which also allows for multiple sorts.)Other than that, the patch seems to work fine. The only thing that needs to be fixed there is not setting
group.field
for fields that aren't supported (multi-valued, etc. – essentially the same checks that were also in the query extender), maybe with adding a warning to the response additionally. I also think the query extender, if it gets added to some module, should support all fields, as other backends might have different restrictions.Comment #21
das-peter CreditAttribution: das-peter commentedI've started first working with this in a sandbox called Search API Grouping, later I've switched over to jgraham's sandbox Search API Denormalized Entity Index and created the similar functionality based on his approach in the 7.x-2.x branch.
As both projects still are sandboxes I could port the 7.x-2.x branch of "Search API Denormalized Entity Index" to "Search API Grouping" and add all the missing stuff needed to close this issue here.
Sounds that like a viable plan?
Comment #22
drunken monkeyYes, that would be great!
Comment #23
das-peter CreditAttribution: das-peter commentedHere we go
Comment #24
drunken monkeyOK, great!
However, as far as I can see a documentation for the
search_api_grouping
feature is still missing. Please add that to the README.txt, then I think we're all set. (If you don't know how you'd do that, see, e.g., the documentation of the features defined bysearch_api_views
andsearch_api_facetapi
, in their respective README.txt files.)Except for the last paragraph in #20, that is.
Comment #25
das-peter CreditAttribution: das-peter commentedDocumentation added: http://drupalcode.org/sandbox/daspeter/1783280.git/blobdiff/e89237d73c08...
That should be part of the service class, right? Since I don't think the Search API Grouping module itself should or can deal with the service specific restrictions?
Shall I promote Search API Grouping to a full project?
Comment #26
drunken monkeySeems fine, thanks!
As previously noted, I'd combine the two keys for sorting to a single one containing an associative array – but if you want to keep it like that, we can do that as well, of course. In that case, you should add that the first one contains the field name, though.
Anyways, as far as this issue here is concerned, that documentation will do fine, thanks!
Yes, exactly.
Oh, and we should also add documentation that we support this feature (and a link to the Grouping project) to the README.txt.
If you are confident it's usable, then please do so. I don't think it's necessary for our issue here, though.
Comment #27
das-peter CreditAttribution: das-peter commentedHmm, that makes sense.
How about:
This allows multiple fields and easy handling of fields or directions if necessary.
I'll do that as soon as we agree on the group_sort format. Then I've to adjust the patch anyway.
I'm never confident enough that something I wrote really is usable :P But since we use this code for months now I think I'll give it a try.
Comment #28
drunken monkeyWould be an option, of course – but I was more suggesting the format
$query->getSort()
already uses:I can sympathize with that – see my countless non-stable modules … ;)
But yes, do that!
Comment #29
das-peter CreditAttribution: das-peter commentedHere's the updated patch which includes the group_sort formatted as proposed in #28 as well as the requested update of the README.txt
Comment #30
drunken monkeyGreat, thanks for your work!
However, the check for which fields allow grouping was still missing. I added that in the attached patch. Please test whether this still works for you, and then we can finally commit.
Comment #31
das-peter CreditAttribution: das-peter commentedJust gave it a try: Works and looks good.
Only thing I noticed is this error reported from the code sniffer
Concatenating translatable strings is not allowed, use placeholders instead and only one string literal
, which is related to this:Should look like:
But for me that's not really a showstopper. Thus I'd say RTBC :)
Comment #32
Anonymous (not verified) CreditAttribution: Anonymous commentedMight there be an issue left with the sort field? After appyling the patch from #30 I go this:
This line:
The
$grouping['group_sort']
looks like 'price' instead of array('price' => 'ASC')(Using the 7.x-2.x branch from the search_api_grouping sandbox.)
Comment #33
drunken monkeyAh, yes, this occurs very frequently in my modules. I just think it's both ugly and impractical to have lines with several hundred characters. I also haven't really found documentation that this is forbidden, let alone why. Therefore, I'd leave that in there until someone gives me strong enough reasons to change this (at which point I'd have to patch nearly all of my modules).
Ah, yes, seems the module wasn't updated with the change in #29 yet. However, that has to be fixed there, I don't think we have to wait with this patch for that.
So, committed!
Thanks again for your work, everyone!
Comment #34
das-peter CreditAttribution: das-peter commented@morningtime Are you sure you used the latest version of Search API Grouping? Because the repository viewer tells me the latest version should properly generate the new group_sort format: http://drupalcode.org/sandbox/daspeter/1783280.git/blob/refs/heads/7.x-2... :
If the problem persists, please open an issue in the Search API Grouping issue queue and add information about how you've configured the processor.
Comment #36
fagoGreat work everyone. I initially thought that calculation of total-rows is wrong, as it does not match the number of documents returned. However still, I think it's the best approach as it matches limit + the number of groups returned - so it's the only way paging can work.
So with the default of group limit 1 everything is perfect, with something different this is still the best we can do. I posted a patch to allow users to configure the group limit over at #2026807: Make group limit configurable - I guess it could use some description that number calculation of queries still go to the number of groups though.
Then I filed #2027843: Make the solr response available as part of the search results to make the grouped-results and extra information available to the API consumer, e.g. to views via $query->getSearchApiResults();
Comment #36.0
fagoUpdated issue summary.