The facet links don't have any awareness of other GET parameters that might be in the URL other than the ones that AS puts there itself. In fact, anything in the query string should be preserved so that other modules can use it too.

Comments

pwolanin’s picture

Status: Needs review » Needs work

We worked hard to remove all references to $_GET from the class. Seems like it would make more sense to pass in $_GET to the constructor.

Also, drupal_query_string_encode() has a serious bug in that it does:

$key = drupal_urlencode($key);

I'll have to find the issue link.

JacobSingh’s picture

I don't love the class taking in $_GET... I suppose it is okay, but for DX, classes that take anonymous arrays of "stuff" are usually deadly. Seems to me that parsing $_GET should be outside of the class, and perhaps making sense of some random variables could be in a class factory. But the constructor should, if possible, be clean of that kind of logic and ambiguity.

If we just shove $_GET in as a property of the class along with other well defined ones in the constructor, I'm okay with that, as long as the $_GET munging doesn't happen there.

robertdouglass’s picture

Status: Needs work » Needs review
StatusFileSize
new1.88 KB

Well, this approach here is the most flexible, and keeps GET out of the class, but is crappy for DX. Please comment separately on the issue of the function signature, when/where/if we deal directly with GET, and on the static helper function that this introduces to work around the bug Peter mentioned.

robertdouglass’s picture

I thought about putting the static function in the apachesolr.module but decided against it. Even though the concept of static functions in Drupal is somewhat stupid, it's nice that our class is almost completely free of dependencies on the module, so I elected to keep it a static function to maintain that.

robertdouglass’s picture

StatusFileSize
new2.2 KB

This one munges $_GET into a private static variable with a pattern reminiscent of static caching (using if empty()). It then automatically includes that array in the query building. Since there is an incoming $excludes array the developer can override any of this behavior.

robertdouglass’s picture

StatusFileSize
new2.21 KB

my original version filtered $_GET by check_plain. Peter, do you think this is necessary?

pwolanin’s picture

It's not clear to me whether we need to worry about it, but maybe we shoudl based on this function: http://api.drupal.org/api/function/filter_xss_bad_protocol/6

Actually, if we are doing it we shoudl probably do it when we output the querystring, rather than on input?

robertdouglass’s picture

I think (again for DX) doing it once, and doing it in the class when the variable gets set, is the best option.

I'm seeing a possibly related issue (need to role back the patch to confirm); my URLs are coming back with spaces, not plus signs (+), and this is breaking the unclick links.

pwolanin’s picture

@Robert - we are using rawurlencode() now which shoudl encode spaces as %20, not +.

Filter on output is the Drupal way - I think we shoudl stick to that pattern.

pwolanin’s picture

@Robert - why would we use a static class variable for this? Seems to defealt the purpose of allowing overrides.

Scott Reynolds’s picture

Status: Needs review » Needs work

this patch makes changes that should also be reflected in the Interface.

I also don't understand the need for this change. Scares me a lot, I really like the class to know nothing but the stuff that it did makes it easy to test, easy to understand. What is the intended bug to be fixed or feature to be added?

pwolanin’s picture

StatusFileSize
new7.75 KB

Here's a start on a patch that's a more general cleanup. It's totally untested, but just so you can see what I'm thinking.

@Scott - I think the idea is that there might be extra GET params you care about, especially for custom search pages. I'm not sure what Robert's particualr motivation is in this case.

robertdouglass’s picture

Peter, that looks like it will meet my needs. I'll give it a test. I might need to simplify the patch some and have it not refactor so much all at once.

Scott, it's exactly like Peter said. If you have a module (say the print module) that sticks extra GET params onto URLs for whatever reasons, you need the $query to be aware of them or they just disappear in facet links, etc.

I was trying to find a pattern that didn't make you remember to send $_GET into the search_execute function, but if everyone else is more comfortable with Peter's pattern I'll adjust.

robertdouglass’s picture

Status: Needs work » Needs review
StatusFileSize
new13.56 KB

This patch updates the new method to be called set_extra_queryvalues since we're storing an array. It fixes the querystring building logic in get_url_querystring, adds, some PHP docs.

robertdouglass’s picture

StatusFileSize
new8.2 KB

This one is the same as the last rerolled after I cleaned up whitespace in the module. Less noise in the patch.

robertdouglass’s picture

This is a URL that I'm using in a custom search built on ApacheSolr: http://localhost/alz/s?xor=all&s=1&show=200

If I want to show facet links that retain the xor, s and show GET parameters, we need a mechanism like the one proposed. With the patch in #15 a typical facet link looks like this:
http://localhost/alz/s/?xor=all&s=1&show=200&filters=type%3Abook

Without #15 the same facet link looks like this:
http://localhost/alz/s/?filters=type%3Abook

Scott Reynolds’s picture

Title: Other GET parameters ignored by ApacheSolr » Other GET parameters ignored by Apache Solr Facet Blocks
Status: Needs review » Needs work
StatusFileSize
new746 bytes

Ok thanks for the explanation. Changing title to reflect the problem better and providing a demonstration patch. Unfort, I am stuck on another project, which has a solr version that is lower and I really don't have time to spend. But as this affects the Views implementation, and I feel pretty strongly about this patch, I created a mere demonstration patch. It does have bugs in it, namely it will repeat variables in some cases. But I believe the code is placed in the proper spot so it affects how we create links and doesn't affect how we create Query objects.

Probably doesn't apply to the latest dev, again don't have time, brain cycles to spare.

Works in ur example though just as you drive deeper repeats GETs.

Tested against Apache Solr Views.

pwolanin’s picture

Sure, we could roll something liek that - but I'm not sure how this avoid duplicating params like solrsort, and of replicating aprams you don't want like 'page'
$diff = array_diff_assoc($_GET, $options);

pwolanin’s picture

We should be able to use this helper function I think
http://www.php.net/manual/en/function.http-build-query.php

pwolanin’s picture

@Scott - the downside of doing it the way you suggest, is that there is no easy OO way to override the behavior. Not sure what the right trade-off is.

pwolanin’s picture

Actually - thinking more about it - this also would potentially alleviate the $_GET hack I just committed in the taxonomy browsing pages.

Scott Reynolds’s picture

is that there is no easy OO way to override the behavior

this is the point. why would the query object every want to override the behaviour. What would it do differently about things that it knows nothing about?

And as Robert stated in the opening of the ticket

The facet links don't have any awareness of other GET parameters that might be in the URL other than the ones that AS puts there itself

Its the facet links, not the object that should concern itself here.

pwolanin’s picture

StatusFileSize
new7.8 KB

This is more like what Scott suggests - but I see again that it's problematic in as much as we will fail to retain any GET params when we submit the search form.

pwolanin’s picture

@Scott - the query object is providing you with a querystring (or possibly an array of query values). How it generates that does seem to be something you might want to override - e.g. if you want your url to look like:

?facets=tid:99&mysort=created desc

or like project module does, to add keys to a 'text' GET param

pwolanin’s picture

StatusFileSize
new10.1 KB

This patch fixes the sort link for score and the retain filters behavior - though note that misc GET params are not retained except when we specify retain-filters.

I think this might be a bit of a flaw in this architecture.

Scott Reynolds’s picture

the query object is providing you with a querystring (or possibly an array of query values). How it generates that does seem to be something you might want to override

Right but those are parameters it knows about. things like xor or print it doesn't know about. By making the class retain these GETs we have the class maintain things it doesn't know how to handle and thats a lil scary. The class itself should be a self contained object. How do ever generate a link that doesn't have those GETs in it now. At least with my approach you could use the object to generate the link.

Which is why I was searching for real tangible reason why this needs to happen. So we couldn't just say "Look project module does it" or look "Solr Views does it as well". An tangible example of where the current architecture fails, where one module fails to function in the optimum because this change isn't present. The Views object adds the exposed filters' identifiers to the query string, those can be arbitrary.

we will fail to retain any GET params when we submit the search form.

Doh! probably need to add the additional GET params as hidden values in the form. Same problem would occur in the Views object.

But in the end, I can make the Views object handle this, it already stores all the information it needs. It is small and trivial I guess.

robertdouglass’s picture

Doh! probably need to add the additional GET params as hidden values in the form. Same problem would occur in the Views object.

This might be the best solution.

By the way, while I'm thinking about it, I just have to say that I hate the %menu_tail aspect of search.module and how we currently do things with keywords. They're nothing but a GET param but we treat them differently and there are assumptions about them throughout the module. In a next iteration I want us to consider just having ?key=foobar for the keywords. It would simplify a number of things.

robertdouglass’s picture

Peter, maybe we can move fixing the sorting to a different issue?

pwolanin’s picture

@Robert - sure, they were jsut touching some of the same code.

After thinking abotu this more I've wavering back towards Scott's approach.

To do that well, however, I think we want to return an array of query params so that we can more easily filter by keys.

robertdouglass’s picture

I'm closing in on a patch with scotts approach now. Will post shortly.

robertdouglass’s picture

Status: Needs work » Needs review
StatusFileSize
new8.68 KB

Based on Scott's approach. One question I have is whether more unsets are needed here:

+  // Retain GET parameters that ApacheSolr knows nothing about.
+  $diff = array_merge($_GET, $options['query']);
+  unset($diff['q']);
+  unset($diff['filters']);
+  foreach ($diff as $key => $value) {
+    $options['query'][$key] = $value;
+  }
robertdouglass’s picture

StatusFileSize
new8.35 KB

Oops, that version still handled encoding itself. This one lets it pass through to url().

Scott Reynolds’s picture

Status: Needs review » Needs work

I think that if you flip the array_merge() you solve your problem

array_merge($options['query'], $_GET);

I know for a fact that this won't work with Solr Views because it doesn't use $_GET['filters']. So with the array_merge flip described above, that should work with Solr Views, and I think fix the need to unset filters.

I applied the patch and try to test on a site Im building, and got a preg_match() error from drupal_validate_utf8 when rendering the hidden apachesolr_search_querystring form element. (Taking a guess at the form element, by looking at the backtrace).

pwolanin’s picture

Robert - why unset when use can use array_diff_key()?

Also - for example - when we are handling a form submission we don't want to carry over any filter param from $_GET since there may be no such key in the destination query values.

pwolanin’s picture

Status: Needs work » Needs review
StatusFileSize
new10.35 KB

ussing array_diff_key() and retaining GET params on form submission too.

JacobSingh’s picture

I read the patch, and although I haven't been involved in the issue, I think I get the gist. Turning the query into an array instead of a string seems to have not introduced any regressions.

Would be nice to see it go through our QA cycle, but AFAICT, no issues. I am not testing it with Views or anything like that.

-J

robertdouglass’s picture

Status: Needs review » Fixed

Great. Thanks for the help, guys. Committed #35.

mkalkbrenner’s picture

Scott Reynolds’s picture

Thought I would point out as you guys rev up to end the 1.x branch, that this fix does not work with Solr Views. #508826: Currently Broken with latest apachesolr dev

And the reason is

$get = array_diff_key($_GET, array('q' => 1, 'page' => 1, 'filters' => 1, 'solrsort' => 1, 'retain-filters' => 1));

That happens to be very hardcoded for apachesolr_search.module.

I pointed this out here:
1.) http://drupal.org/node/502976#comment-1764658
2.) http://drupal.org/node/502976#comment-1761590

I will fix this, but I busy busy busy. Hopefully, tonight before I pass out.

pwolanin’s picture

@Scott - sorry if I misunderstood your prior comments - you seemed to be arguing to not include this query parameter handling in the object.

Where do you suggest defining this filter list if it's going to be variable?

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Scott Reynolds’s picture

Status: Closed (fixed) » Needs review
StatusFileSize
new1.22 KB

This is now usable by Apache Solr Views. It makes no assumptions about what $_GET params are in the url

For instance, Apache Solr Views might have the search keys as $_GET['my_text'] and have node types exposed via a form like this $_GET['node_types']. And this will work by leveraging the Interface and the abstraction it provides

robertdouglass’s picture

This line is to be ignored, right?

 /**
  * Mark one node as needing re-indexing.
- */
+$get = array_diff_key */
Scott Reynolds’s picture

StatusFileSize
new1.82 KB

I swear I look at the patch through less... sigh. Reroll

I have fixed Apache Solr Views to now work :-D. This patch carries the params through with standard solr and the Views query.

Scott Reynolds’s picture

....

the patch is just supposed to change one line

-  $get = array_diff_key($_GET, array('q' => 1, 'page' => 1, 'filters' => 1, 'solrsort' => 1, 'retain-filters' => 1));
+  $get = array_diff_key($_GET, array('q' => 1, 'page' => 1), $options['query'], apachesolr_current_query()->get_url_queryvalues());

This would be why I stop committing code after 10pm. Brain refuses to function....

Scott Reynolds’s picture

Bumping this as this functionality currently borken with Apache Solr Views. Just a one line change.

robertdouglass’s picture

Status: Needs review » Patch (to be ported)
StatusFileSize
new1003 bytes

Committing this patch to 6.2.

robertdouglass’s picture

Status: Patch (to be ported) » Fixed

And to DRUPAL-6--1

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.