Hello,

We use this (excellent) module in combination with the pathauto module (to link taxonomy terms directly to the Solr module), but after upgrading to the new alpha 6 release something doesn't work anymore.

In the previous version of the apachesolr module (alpha 5 for Drupal 6), it was possible to search without a key, using only a filter, e.g. a tid and/or a content type.

Why do we want this?

When using the pathauto module, we can link the terms of a vocabulary directly to the apache solr module.

Example

Alpha 5 (this works)

/search/apachesolr_search/tid:47

Alpha 6 (this fails)

/search/apachesolr_search/key?filters=tid:47

Here we have to use a key, while in the previous version this wasn't necessary. Using something like

/search/apachesolr_search/?filters=tid:47

does not work.

How can we solve this? Is there a workaround?

greetz,
Wim

Comments

JacobSingh’s picture

Hi Wim,

We changed the URL format from:

http://buytaert.net/search/apachesolr_search/drupal+tid:19

to:

http://buytaert.net/search/apachesolr_search/drupal?filters=tid:19

We did this to make query parsing easier. So you just need to write a patch that accepts the keyword * or "all" or something similar to get all results.

This can be accomplished with q.alt=*:* added to the $params.
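As a rough illustration of that suggestion, here is a minimal sketch in plain PHP. The function name example_build_solr_params() and the choice of "*" and "all" as special keywords are assumptions made for this example; it is not code from the apachesolr module.

```php
<?php
// Sketch only: example_build_solr_params() is a hypothetical helper,
// not a function from the apachesolr module.
function example_build_solr_params($keys, array $filters = array()) {
  $params = array();
  $keys = trim($keys);
  // Treat "*" and "all" as "no keyword" (assumed convention).
  if ($keys === '*' || strtolower($keys) === 'all') {
    $keys = '';
  }
  $params['q'] = $keys;
  if ($keys === '') {
    // With an empty q, dismax falls back to q.alt; *:* matches every document.
    $params['q.alt'] = '*:*';
  }
  if ($filters) {
    // Each filter becomes its own fq value.
    $params['fq'] = array_values($filters);
  }
  return $params;
}

print_r(example_build_solr_params('*', array('tid:47')));
```

The filters are kept as a separate fq list so that narrowing by tid or content type still works when there is no keyword.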

Would love to see this patch! Please post it here when you get it working, or ping me if you need more help. I'm also on IRC sometimes (JacobSingh) or Skype (pajamadesign)

Cauliflower’s picture

Attachment (new): 885 bytes

Hello,

I've made a patch to add this functionality to the apachesolr module.

When you've applied this patch, you can define a wildcard, default '*', on the apachesolr settings page.

Using this wildcard gives you all results from the solr index. You can still define filters, sorting options,...:

/search/apachesolr_search/*?filters=tid:47

When using a wildcard, the snippet remains empty. This is a workaround to use the teaser:

- Make a new module
- Use the hook apachesolr_process_results for replacing the snippet:

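function modulename_apachesolr_process_results(&$results) {
  foreach ($results as $key => $result) {
    // Copy the node's fields into a plain array so they can be looked up by name.
    $fields = array();
    foreach ($result['node'] as $var => $content) {
      $fields[$var] = $content;
    }

    // A wildcard search returns no snippet, so fall back to a teaser of the body.
    if ($results[$key]['snippet'] == '' && !empty($fields['body'])) {
      $results[$key]['snippet'] = node_teaser($fields['body']);
    }
  }
}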
pwolanin’s picture

The patch looks reasonable - since we already set q.alt to default to '*:*'

An alternative would be to have that setting used as the basis to modify the form validation so that an empty search works.

Cauliflower’s picture

Status: Active » Needs review
Attachment (new): 994 bytes

Because I was in a hurry, there is an error in the patch.

The code for extending the settings form should not be in the apachesolr_menu() but, of course, in the apachesolr_settings() function.

Add this piece of code to apachesolr_settings() in apachesolr.module :

  $form['apachesolr_wildcard'] = array(
    '#type' => 'textfield',
    '#title' => t('Solr wildcard'),
    '#default_value' => variable_get('apachesolr_wildcard', '*'),
    '#description' => t('Wildcard to show all results from the solr index.'),
  );

A corrected version of the patch is attached.

greetz,
Wim

langworthy’s picture

$fields['body'] remains empty for me when using the wildcard, so I am unable to fill the snippet.

Cauliflower’s picture

You can also build your own snippet without using the body (strange that it is empty; are you using the right schema.xml and solrconfig.xml files?).

Look at the search index page of apachesolr (admin/settings/apachesolr/index) or do a print_r of the $fields variable to see which fields you can use.

We (www.jeugdwerknet.be) are busy rebuilding our website in Drupal. The site will contain 40 000 nodes and 10 different content types at launch. The apachesolr module will play an important role in it, both for searching and for using facets to narrow the search. The import/export script (from our old database to the new Drupal database) is ready, and we are now fine-tuning the Solr module for our own needs, using only 2 patches at the moment (the one on this page is one of them) plus the hooks the Solr module provides. E.g. each snippet is extended with some CCK fields (which are also indexed) and taxonomy-specific terms. Launch is planned for the end of February.

langworthy’s picture

I just double checked the two .xml files and I'm still not getting $fields['body']. I'm using firep($fields['body']); (drupal for firebug). Body is listed as a field name in the solr settings.

When I search using the wildcard, the wildcard is not inserted into the facet links so I am unable to narrow down the search. ie: ?filters=tid:{tid} rather than *?filters=tid:{tid}

//edited with more info on body field name

Cauliflower’s picture


When I search using the wildcard, the wildcard is not inserted into the facet links so I am unable to narrow down the search. ie: ?filters=tid:{tid} rather than *?filters=tid:{tid}

I also discovered that error today, will try to fix that tomorrow and will post a patch.

langworthy’s picture

Attachment (new): 5.67 KB

Here's a quick patch I just finished (my first drupal patch!)

It includes the patch in post #4 and addresses the issue in post #7 (facet links when wildcard used)

Cauliflower’s picture

This patch works fine for me, but when searching with the wildcard, there is an empty bullet in the list of the 'Current search' block.

Cauliflower’s picture

Attachment (new): 5.8 KB

I tried to make a patch for the new release of this module, 6.x-1.0-beta2.

blackdog’s picture

Version: 6.x-1.0-alpha6 » 6.x-1.x-dev
Status: Needs review » Needs work

cptnCauliflower: Your last patch includes another patch as well; try to keep them separated.

blackdog’s picture

Category: support » feature

Upon further investigation I've found that just setting $keys = '' won't do it.

For example: I have a menu item linking to an empty search for all items of a certain type: /search/apachesolr_search/*?filters=type:organisation which works great, it returns all nodes of type organisation. But when one tries to filter this search, the URL becomes: /search/apachesolr_search/?filters=type:organisation tid:13, i.e. it loses the wildcard. I don't know if it's an easy thing to add the wildcard again, but I haven't found the right place to do it.

It seems like the best approach would be what pwolanin said in #3 - to allow empty searches.

dreed47’s picture

Subscribing and +1 for this feature. My site will need this. I'm testing the patch now.

Scott Reynolds’s picture

I believe that in the Solr_Base_Query.php class you could change

/**
   * A function to get just the keyword components of the query,
   * omitting any field:value portions.
   */
  public function get_query_basic() {
    return $this->rebuild_query();
  }

To

/**
   * A function to get just the keyword components of the query,
   * omitting any field:value portions.
   */
  public function get_query_basic() {
    $basic = $this->rebuild_query();
    if (empty($basic)) {
      return variable_get('apachesolr_wildcard', '*');
    }
     else {
       return $basic;
     }
  }

And that would solve your problem.

blackdog’s picture

#15 - nope, that makes the search return empty.

blackdog’s picture

Status: Needs work » Needs review
Attachment (new): 6.33 KB

This actually seems to work now. Changed from empty($basic) to !isset($basic), and now wildcard search with * returns all nodes, and for example /search/apachesolr_search/*?filters=type:organisation returns all organisations, and facets and sorting seem to work just fine.

langworthy’s picture

Above patch generally works for me.

Using D6.9 and apachesolr 6.x-1.0-beta2

Remaining issues:
- when searching with wildcard, the body is empty. suggestions to this problem above do not work. note that body is present when using a keyword.
- an empty unordered list is printed in the 'current search' block

janusman’s picture

Status: Needs review » Reviewed & tested by the community

Patch works.

The patch modifies:

  • Solr_Base_Query.php's get_query_basic().
  • Adds an option to apachesolr.admin.inc
  • Adds a function apachesolr_get_path() to apachesolr.module
  • changes places in apachesolr_search.module and apachesolr.module where the path was originally generated with $path = 'search/' . arg(1) . '/' . $query->get_query_basic(); with a call to this new function.

    Works as advertised, for example: /search/apachesolr_search/* returns all results. /search/apachesolr_search/*?filters=tid:XXX returns all items filtered by tid XXX.
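For readers who have not opened the patch, the idea behind the new path helper can be sketched as follows. This is a hypothetical stand-in written for illustration (example_search_path() is an invented name); the real body of apachesolr_get_path() is not shown in this thread.

```php
<?php
// Hypothetical stand-in for the patch's apachesolr_get_path(); the real
// function body is not shown in this thread.
function example_search_path($search_type, $keys, $wildcard = '*') {
  $keys = trim($keys);
  // Fall back to the wildcard so generated links keep a non-empty key segment.
  if ($keys === '') {
    $keys = $wildcard;
  }
  return 'search/' . $search_type . '/' . $keys;
}

echo example_search_path('apachesolr_search', ''), "\n";
echo example_search_path('apachesolr_search', 'drupal'), "\n";
```

Building every facet and sort link through one helper like this is what keeps the wildcard from being dropped when a filter is added.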

damien tournoud’s picture

Status: Reviewed & tested by the community » Needs work

The dismax handler doesn't support wildcard queries. Searching for "*" on d.o correctly returns 0 results. I don't see how that could work.

damien tournoud’s picture

I believe that the cleanest way to do this is to implement the q.alt parameter.

blackdog’s picture

Status: Needs work » Reviewed & tested by the community

Damien: Without this patch, wildcard search doesn't work, so trying to search with a wildcard on d.org shouldn't work. This patch makes it work.

janusman’s picture

The patch by @blackdog is basically making the module issue an empty q= parameter to Solr, which then responds with all results.

The logic in @damien 's comment seems to be this (IMO): an empty q= seems "fragile" in that it is undocumented that an empty q= would have to return all results. Therefore, he is proposing we instead explicitly send an empty q= AND an additional q.alt=*:* parameter that would then return all results.

This seems more logical; q.alt takes over when q is empty. However, where would be the correct place to plug in that extra q.alt parameter? This works (when used in conjunction with the patch from #17), but perhaps it's not the best place:

@@ -765,6 +764,10 @@ function apachesolr_modify_query(&$query
   if ($query && ($fq = $query->get_fq())) {
     $params['fq'] = $fq;
   }
+  // Add q.alt=*:* if q is empty
+  if ($query && $query->get_query_basic() == '') {
+    $params['q.alt'] = '*:*';
+  }
 }
damien tournoud’s picture

Status: Reviewed & tested by the community » Needs work

Let's think about this a little bit more.

pwolanin’s picture

We set q.alt already in solrconfig.xml as a default param - that's why this patch worked at all.

     <str name="q.alt">*:*</str>

It is fine to send an empty q param for this reason - we are, in fact, taking advantage of this q.alt dismax feature already.

Cauliflower’s picture

Just as a comment:

This week we launched our site (20 000 users, 45 000 nodes), http://www.jeugdwerknet.be/, where the apachesolr module is highly integrated; it was the reason for starting this topic. The 'wildcard' search is fully integrated. When you look at e.g. http://www.jeugdwerknet.be/spelen you see taxonomy terms on the right (made with the advanced taxonomy blocks). When a user clicks on one of these terms, the Solr module is used to search through the nodes and the apachesolr filter blocks appear.
Clicking on a taxonomy term anywhere on the site also returns an apachesolr search result (e.g. http://www.jeugdwerknet.be/spelen/soort/bosspelen is an alias of a taxonomy term which links to a search with Solr and the wildcard). We also integrated different layouts for each content type on the Solr page. We plan to publish an article about the making of the site later this month.

janusman’s picture

So, then, we can set this as RTBC again? =)

damien tournoud’s picture

So, then, we can set this as RTBC again? =)

No.

This is a very elaborate way of not changing anything (isset($basic) is always true, because $basic = '' by default):

   public function get_query_basic() {
-    return $this->rebuild_query();
+    $basic = $this->rebuild_query();
+    if (!isset($basic)) {
+      return variable_get('apachesolr_wildcard', '*');
+    }
+    else {
+      return $basic;
+    }
   }

So the only thing that this patch changes is the apachesolr_get_path() logic.
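To see why the !isset() branch never fires, here is a small plain PHP sketch (no Drupal needed). It assumes, as stated above, that the rebuilt query is an empty string when there are no keys.

```php
<?php
// Plain PHP, no Drupal needed: how isset() and empty() treat an empty string.
$basic = '';
var_dump(isset($basic));   // bool(true): the variable exists and is not NULL
var_dump(empty($basic));   // bool(true): '' counts as empty
var_dump($basic === '');   // bool(true): the explicit check for "no keys"

// Caveat: empty() also treats the string '0' as empty, so a search for "0"
// would be mistaken for a key-less search.
$zero = '0';
var_dump(empty($zero));    // bool(true)
var_dump($zero === '');    // bool(false)
```

An explicit comparison with '' avoids both problems: it fires for an empty string and does not misfire for the keyword "0".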

janusman’s picture

Title: Search without key » Search for just facet(s)

Let me rephrase the problem, in hopes I can get different feedback: How can we gain the functionality to search just for a facet's (or facets') value(s)?

It seems the logic of the module is "there is always a key", and the patch just adds some code to bypass that logic (kind of signalling "if you see the wildcard, it's ok to issue a search with no key" whereas the current logic is "no key? then show the search form").

I don't understand if it's the logic ("there must always be a key") that's being evaluated, or the quality of the patch ("it's ok not having a key, but this patch in particular is badly written").

Care to elaborate a bit? =) BTW I apologize for ping-ponging on this so much =)

pwolanin’s picture

I need to review the patch again - I'm not sure we really want to have the "special" character, but this might be OK for the short term. We had been thinking instead that a search with no key would bring you to a page where the facet blocks are in the content area - clearly a list of all content is not very useful - so the idea would be to focus on the facets right away. Drupal.org is basically doing a key-less search for project browsing, but clearly the results are already filtered there.

The prohibition on a key-less search is basically coming from search module. We have another issue open to decouple this module from search module, so we could accelerate that, or use a form_alter to remove this particular validation.

fpahl’s picture

Hello,

I just installed the patch and it seems to work as described. Anyway I would prefer a solution that can just filter the results without the wildcard given.

For example I have a simple taxonomy block that allows the user to filter for the taxonomy:
/search/apachesolr_search?keys=*&filters=tid:1
After patching this works, but it is not very nice to find the "*" symbol in the search box after the filter has been applied.

Do you have any hint for me how to solve that?

Thanks - Florian

Cauliflower’s picture

Hello,

@pwolanin: You're right, the origin of using a wildcard in this patch ( http://drupal.org/node/358166#comment-1204232 ) is because using no key gives an error (by the core search module) and I didn't want to patch the core. It would be nice to decouple the apache search module from the core module. At jeugdwerknet.be we use the patch given by the coresearches ( http://drupal.org/project/coresearches ) module to disable the core search completely.

If you need some help for this (writing code, discussing, thinking,...) just let me know.

greetz,
Wim

fpahl’s picture

Hello,

After I installed the above wildcard patch, the coresearches patch and turned off the core searches, I was still not able to do a search without a key. The reason seems to be in the file search/search.pages.inc:

function search_view($type = 'node') {
    ...
    $keys = search_get_keys();
    // Only perform search if there is non-whitespace search term:
    $results = '';
    if (trim($keys)) {

After I replaced
if (trim($keys)) {
with something like
if (trim($keys) >= 0) {
searching without a key seems to work.
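Rather than relying on a loose comparison such as trim($keys) >= 0, the condition could state the intent directly: search when there are keys or when there are filters. A minimal sketch, assuming the filters arrive as a plain string (for example from the filters query parameter); example_should_search() is a hypothetical helper, not part of search.module or apachesolr.

```php
<?php
// Hypothetical helper, not part of search.module or apachesolr: decide
// whether to run a search from the keys and the filters string.
function example_should_search($keys, $filters) {
  // Search when at least one of the two contains non-whitespace text.
  return trim((string) $keys) !== '' || trim((string) $filters) !== '';
}

var_dump(example_should_search('', 'tid:47'));  // bool(true)
var_dump(example_should_search('bike', ''));    // bool(true)
var_dump(example_should_search('  ', ''));      // bool(false)
```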

Anyway I still get empty snippets in the search results. Did anyone already solve that?

Greetings,
Florian

Cauliflower’s picture

We solved the 'empty snippet' like this:

- we also indexed the teaser of the node separately
- a search result always includes this teaser
- if the snippet is empty, we show the teaser.
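The three steps above can be sketched in plain PHP. The array keys 'snippet' and 'teaser' are assumptions for this example; the real result structure depends on the module version and on how the teaser was indexed.

```php
<?php
// Sketch with assumed array keys ('snippet', 'teaser'); the real result
// structure depends on the module version and on what was indexed.
function example_fill_empty_snippets(array $results) {
  foreach ($results as $key => $result) {
    $snippet = isset($result['snippet']) ? trim($result['snippet']) : '';
    // Only fall back when Solr returned no snippet and a teaser is available.
    if ($snippet === '' && !empty($result['teaser'])) {
      $results[$key]['snippet'] = $result['teaser'];
    }
  }
  return $results;
}

$results = example_fill_empty_snippets(array(
  array('snippet' => '', 'teaser' => 'Short intro text.'),
  array('snippet' => 'A <strong>match</strong>', 'teaser' => 'Other intro.'),
));
print_r($results);
```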

This is not the most efficient way, but if there are suggestions, I would like to hear them.

greetz,
Wim

xnickmx’s picture

Subscribing +1

xnickmx’s picture

I am trying to test the patch posted in message 17, but getting errors while trying to apply the patch:

C:\Data\Workspaces\drupal\modules\apachesolr-HEAD>patch -p0 < wildcard_search.patch --binary --verbose
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|? solrconfig.xml
|Index: Solr_Base_Query.php
|===================================================================
|RCS file: /cvs/drupal-contrib/contributions/modules/apachesolr/Solr_Base_Query.php,v
|retrieving revision 1.1.4.19
|diff -u -p -r1.1.4.19 Solr_Base_Query.php
|--- Solr_Base_Query.php        6 Feb 2009 03:55:10 -0000       1.1.4.19
|+++ Solr_Base_Query.php        11 Feb 2009 12:40:07 -0000
--------------------------
Patching file `Solr_Base_Query.php' using Plan A...
Hunk #1 FAILED at 224.
1 out of 1 hunk FAILED -- saving rejects to Solr_Base_Query.php.rej
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: apachesolr.admin.inc
|===================================================================
|RCS file: /cvs/drupal-contrib/contributions/modules/apachesolr/apachesolr.admin.inc,v
|retrieving revision 1.1.2.6
|diff -u -p -r1.1.2.6 apachesolr.admin.inc
|--- apachesolr.admin.inc       28 Jan 2009 13:53:03 -0000      1.1.2.6
|+++ apachesolr.admin.inc       11 Feb 2009 12:40:07 -0000
--------------------------
Patching file `apachesolr.admin.inc' using Plan A...
Hunk #1 FAILED at 60.
1 out of 1 hunk FAILED -- saving rejects to apachesolr.admin.inc.rej
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: apachesolr.module
|===================================================================
|RCS file: /cvs/drupal-contrib/contributions/modules/apachesolr/apachesolr.module,v
|retrieving revision 1.1.2.12.2.107
|diff -u -p -r1.1.2.12.2.107 apachesolr.module
|--- apachesolr.module  10 Feb 2009 20:47:03 -0000      1.1.2.12.2.107
|+++ apachesolr.module  11 Feb 2009 12:40:09 -0000
--------------------------
Patching file `apachesolr.module' using Plan A...
Hunk #1 FAILED at 618.
Hunk #2 FAILED at 662.
Hunk #3 FAILED at 861.
Hunk #4 FAILED at 949.
4 out of 4 hunks FAILED -- saving rejects to apachesolr.module.rej
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: apachesolr_search.module
|===================================================================
|RCS file: /cvs/drupal-contrib/contributions/modules/apachesolr/apachesolr_search.module,v
|retrieving revision 1.1.2.6.2.68
|diff -u -p -r1.1.2.6.2.68 apachesolr_search.module
|--- apachesolr_search.module   9 Feb 2009 14:12:22 -0000       1.1.2.6.2.68
|+++ apachesolr_search.module   11 Feb 2009 12:40:11 -0000
--------------------------
Patching file `apachesolr_search.module' using Plan A...
Hunk #1 FAILED at 366.
Hunk #2 FAILED at 402.
Hunk #3 FAILED at 412.
3 out of 3 hunks FAILED -- saving rejects to apachesolr_search.module.rej
done

I am guessing there is some kind of problem with the version of the code that I have vs. the version of the code that the patch is trying to change. I got my version from CVS yesterday. It is the HEAD branch.

Does anyone know what I can do to get this patch installed and help test this code?

langworthy’s picture

we're running the patch in comment #17 against apachesolr 6.x-1.0-beta2

Scott Reynolds’s picture

Also see my comment http://drupal.org/node/254565#comment-1404872 on #254565: Integrate with Views front-end. There I provide a small 2-line patch to make facet apachesolr views work.

janusman’s picture

Attachments (new): 112.5 KB, 9.71 KB, 9.71 KB

This patch includes the one in #17 and tries to solve the issue completely (remember, "Search for just facets")

After applying, it does this:

  • ApacheSolr now accepts keyword searches for "*" which return all items
  • A keyword search can now include a "command" that filters by taxonomy terms.
    Examples:
    • [filters=tid:1] (or * [filters=tid:1]) would match all items that have term ID 1
    • bike [filters=tid:1 tid:2] would match all items containing "bike" that also have terms 1 and 2
  • Terms inside nodes, tagadelic, and all other modules that correctly use the API to build term links will now be associated to an ApacheSolr search instead of taxonomy/term/X.
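For illustration, the "[filters=...]" keyword syntax from the examples above might be parsed roughly like this. The function name and the regular expression are assumptions made for this sketch, not code taken from the attached patch.

```php
<?php
// Hypothetical parser for the "[filters=...]" keyword syntax; the regular
// expression is an assumption, not taken from the attached patch.
function example_parse_keys($input) {
  $filters = '';
  if (preg_match('/\[filters=([^\]]*)\]/', $input, $matches)) {
    $filters = trim($matches[1]);
    // Remove the command so that only the plain keywords remain.
    $input = str_replace($matches[0], '', $input);
  }
  $keys = trim($input);
  // A lone "*" means "match everything", i.e. no keywords.
  if ($keys === '*') {
    $keys = '';
  }
  return array('keys' => $keys, 'filters' => $filters);
}

print_r(example_parse_keys('bike [filters=tid:1 tid:2]'));
print_r(example_parse_keys('* [filters=tid:1]'));
```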

To test, install patch against latest 6.x-1.x-DEV, and *MAKE SURE* you go to settings/apachesolr/enabled-filters and SUBMIT it (again) with facets from taxonomy vocabularies. If you fail to do this then Drupal will not associate those vocabularies correctly and the patch will not work as advertised.

I also attached a screenshot to illustrate.

This patch will kill exactly 1.2 kittens.

janusman’s picture

Status: Needs work » Needs review

Forgot to set as needs review.

janusman’s picture

Attachment (new): 10.15 KB

Missed an apachesolr_get_path() in the new date range facet code.

janusman’s picture

Attachment (new): 10.15 KB

aaand another one. dang.

pwolanin’s picture

@janusman - what is the motivation for providing a pseudo-query syntax within the keyword search? this doesn't seem desirable.

Having just the '*' search would be a better first step - though I'm still wary of giving users the impression that they can use wildcards in the search.

janusman’s picture

The "pseudo-query syntax" (as you call it) is needed because the taxonomy_term_path() hooks can only provide a Drupal path with no arguments. That is, it can provide /foo/bar/xxxx but not /foo/bar?arg=foobar (which is needed for the "filters=tid:XXX" argument) AFAIK.

It's "desirable" in that it achieves the goal to be able to "cleanly" launch an ApacheSolr search from within taxonomy term links. Other options are welcome. (I'm using what the taxonomy_redirect module is using).

Or... do you think it's a bad idea overall to extend the query syntax? I don't see why it should be problematic, being that google and lots of other search engines have an ample query language to add filters or fielded search. For example: the "site:XXX", "title:", "inurl:" fielded searches in Google, phrase delimiters, wildcards, parentheses and boolean operators in lots of others =) It would probably be good to expose these in a RESTful way, though, and not the way I'm doing it in this patch. Perhaps we should open up a different issue to come up with a good way to expose filters cleanly in URIs and keep some sanity regarding keeping it simple for users (keep them from seeing complex syntax in the query box).

@pwolanin (and others!) please share your thoughts on this.

pwolanin’s picture

@janusman - ah I was thinking about more directly taking over the taxonomy paths via hook_menu_alter, but that taxonomy hook might indeed be lower impact in some ways.

If we assume we are not presenting a search box on the taxonomy pages (?) there is no reason we can't have a separate callback that just accepts a list of tids?

If we want to go to the actual search page, we can use hook_link_alter() to supply the query string. In fact I see some fun hacks to play with those hooks.

David Lesieur’s picture

Status: Needs review » Needs work

The patch no longer applies.
However, I prefer pwolanin's suggestion in #30: allowing empty keys, showing the facets in the content area. Solving #405206: Allow Apache Solr to be the default, let search module index 0 nodes per cron run first would help get rid of the "empty query" limitation.

janusman’s picture

I think we might be switching contexts here; again, the focus of this issue is (was?) to have a search for *just* a facet. @David is suggesting what would happen with NO search at all. (Which is cool too ;))

@pwolanin: re #45: I was thinking of clicks on taxonomy terms putting users in the middle of a "normal" Apache Solr search; e.g. replace taxonomy/term/XX pages with search/apachesolr/[filters=tid:XX]. I don't see a reason *not* to (although, of course, this could just be a switchable option by the admin; the patch in #42 hardcodes it for now).

The reason for the [filters=XXXX] query "language" (again, as I mention in #44) is that the taxonomy_term_path() hook can only specify paths and not arguments (e.g. ?filters=XXX)

David Lesieur’s picture

From what I understand (but I could not test it), janusman's proposed solution would allow "keyless" searches (through a wildcard) in order to get the facets, but we'd still be stuck on the search/apache_solr page. A sure thing is that I'd like to be able to put those facet blocks on any page, not just the search page.

Could the following approach help accommodate that?:

1) The "fake" wildcard does not seem necessary as the 'q' param can be left empty on a query, so as pwolanin has mentioned earlier, we might just want to override the search form's validation to allow empty keys.

2) With empty keys, hook_search() would not build any search query.

3) Instead of returning when apachesolr_has_searched() is false, hook_block() would build its own query (a slightly simpler one, with no 'q' param, no boosts, nor any other params useless to facets) and perform a search. This would be made easier after moving the query building logic into usable function(s). And when apachesolr_has_searched() is true, hook_block() would simply continue doing the same thing as it currently does (use the available search response).
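The control flow of step 3 could look roughly like the sketch below. All names are hypothetical, and the facet fields ('tid', 'type') are only examples taken from filters mentioned in this thread; the parameters themselves (q.alt, rows, facet, facet.field) are ordinary Solr request parameters.

```php
<?php
// All names here are hypothetical; this only illustrates the control flow.
function example_facet_response($has_searched, $cached_response, $run_query) {
  if ($has_searched) {
    // A search already ran on this page: reuse its response.
    return $cached_response;
  }
  // No search yet: issue a minimal query that asks only for facet counts.
  $params = array(
    'q.alt' => '*:*',
    'rows' => 0,
    'facet' => 'true',
    'facet.field' => array('tid', 'type'),
  );
  return call_user_func($run_query, $params);
}

// Stand-in for the real Solr call: it just echoes back what it was asked.
$fake_solr = function (array $params) {
  return array('requested' => $params);
};
print_r(example_facet_response(FALSE, NULL, $fake_solr));
```

Passing the query runner in as a callable keeps the sketch independent of any particular Solr client.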

Overriding the taxonomy paths to launch a search would be a really cool feature, but isn't this also a separate idea from this issue's main focus? :-)

I hope all this is not pure nonsense, from a Solr newbie. :-)

EDIT: As this comment was bringing up different matters, it has been posted to a separate issue: #442976: Facet blocks anywhere

janusman’s picture

Dave:

The issue (as originally proposed by @cptnCauliflower) was originally trying to get search results (a node listing) in a way that an earlier version of the module allowed: by entering a keyword like tid:1234. You can see an example of this in action here: http://khub.itesm.mx/en/search/apachesolr_search/tid:136... this is a D5 version.

This was useful because @cptn and others (like me) already have in place sites that use modules like taxonomy_redirect in order to make all taxonomy terms' links deposit the user inside a Apache Solr search rather than the normal taxonomy/term/XXX drupal path.

When some changes were made to the module in order to move facets from the keys into an argument (e.g. from search/apachesolr_search/foo+tid:136 into apachesolr_search/foo?filters=tid:136 ) modules like taxonomy_redirect don't work anymore, because: (1) taxonomy term's links can be altered only in path, not arguments (see API function taxonomy_term_path()) and (2) even if they would, apachesolr.module does not accept an empty query (keys) even when a filter is present (e.g. search/apachesolr_search/?filters=tid:136).

So, in summary, I'd like *some way* to get apache solr *search results* issuing just one taxonomy term as the query.

I think you're talking about getting *just* the facets for display in a block or somewhere else to *start* a search.

Perhaps this issue's title should be "Search for items *just by* facet value", and we should open a new issue for "displaying a list of available facets to launch a search".

David Lesieur’s picture

@janusman: You're right. To me "search for just facet(s)" meant to be able to start or refine a search with just facets and no keys, regardless of the filter syntax. I'll move my point to a separate issue. Sorry for the noise.

janusman’s picture

Attachment (new): 4.54 KB

New patch rerolled against latest 6.x-1.x-DEV.

Changes from previous patch:

  • Less code! Did away completely with the "apachesolr_wildcard" logic.
  • URLs for a search for a facet's value (with no keys) are in the format: search/apachesolr_search/[filters=fieldname:value]
  • Removing the filter after this search will return the normal "empty" search screen; e.g.: search/apachesolr_search
janusman’s picture

Status: Needs work » Needs review
pwolanin’s picture

Attachment (new): 2.58 KB

I still really don't see why we need to have this entry of filters into the keywords box - it's inconsistent anyhow since you enter them and then they are gone.

This patch is more at the level of simplicity I was thinking of. Not sure why we are even using %menu_tail in D6 core.

This doesn't yet include hijacking of taxonomy links, but entering a link with just a filter GET string now works (after a menu rebuild).

janusman’s picture

Attachment (new): 3.97 KB

Ok, I see what you're saying.

Rerolled based on your patch, added more to hook_menu_alter() to hijack taxonomy/term/XX calls and make them return apachesolr results.
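For anyone who wants to picture the hijacking without reading the patch, here is a minimal sketch of a hook_menu_alter() implementation. "example" is a placeholder module name and example_term_search_page is a hypothetical page callback; the sketch assumes the taxonomy module registers its term pages under 'taxonomy/term/%', and it is not the code from the attached patch.

```php
<?php
// Sketch of a hook_menu_alter() implementation. "example" is a placeholder
// module name, and example_term_search_page is a hypothetical callback that
// would run the Solr search with filters=tid:<term id>.
function example_menu_alter(&$items) {
  // Assumes the taxonomy module registered its pages at 'taxonomy/term/%'.
  if (isset($items['taxonomy/term/%'])) {
    $items['taxonomy/term/%']['page callback'] = 'example_term_search_page';
    // Pass the third path component (the term ID) to the callback.
    $items['taxonomy/term/%']['page arguments'] = array(2);
  }
}

// Simulate the menu router array that Drupal would pass in.
$items = array(
  'taxonomy/term/%' => array('page callback' => 'taxonomy_term_page'),
);
example_menu_alter($items);
print_r($items);
```

As noted elsewhere in this thread, menu changes only take effect after a menu rebuild.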

However, I'm using drupal_goto() for now, don't know how yet, exactly, to invoke the search from within.

pwolanin’s picture

@janusman - playing with this yesterday, it seems like we will possibly have to switch to the standard handler with q='*:*' in order to get snippets of the body text in the search results.

janusman’s picture

Attachment (new): 4.08 KB

Stop the presses! =)

This patch works and drops drupal_goto() (knew it could be done somehow!)

pwolanin’s picture

Ok, found a possible (but somewhat ugly) work-around if we have to use the standard handler to get a body fragment via the highlighter (see this post to solr-user).
It seems we can still get biasing by including extra terms in the q as OR clauses like:

http://localhost:8983/solr/ad/select/?q=(*:*%20OR%20promote:true^2%20OR%20_val_:%22recip(rord(comment_count),1,110,110)%22^2)&qt=standard&hl=true&hl.fl=body&hl.alternateField=body&hl.maxAlternateFieldLength=256&fl=nid,title,comment_count,type,created,changed,score,path,url,uid,name,promote

a more readable (non-encoded) version of the relevant part:

?q=(*:* OR promote:true^2 OR _val_:"recip(rord(comment_count),1,110,110)"^2)&qt=standard

which should be similar to doing a bq=promote:true^2 and bf=recip(rord(comment_count),1,110,110)^2 with dismax.
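A small sketch of how such a query string could be assembled and encoded in PHP. The helper name example_standard_query() is hypothetical; the clauses are the ones quoted above.

```php
<?php
// Sketch: assemble the standard-handler query from the boosts quoted above.
// example_standard_query() is a hypothetical helper name.
function example_standard_query(array $boost_clauses) {
  // *:* matches every document; the extra OR clauses are only meant to
  // influence the score of the documents they match.
  $clauses = array_merge(array('*:*'), $boost_clauses);
  return '(' . implode(' OR ', $clauses) . ')';
}

$q = example_standard_query(array(
  'promote:true^2',
  '_val_:"recip(rord(comment_count),1,110,110)"^2',
));
echo $q, "\n";

// http_build_query() encodes the value for use in a URL query string.
echo http_build_query(array('q' => $q, 'qt' => 'standard')), "\n";
```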

janusman’s picture

Attachment (new): 5.54 KB

You missed "id" in the fl=nid,title,comment_count,type,created,changed,score,path,url,uid,name,promote (needed to assign $snippet). Issuing the search to Solr works and gives us the data we need...

... however ...

I *think* issuing a query via the Solr q= parameter implies that a lot of the module will break; the facet blocks will include the keys "(*:* OR promote:true^2 OR _val_:"recip(rord(comment_count),1,110,110)"^2)", as will the search box.

E.g. as I understand it we need to do something like:

$keys = '(*:* OR promote:true^2 OR _val_:"recip(rord(comment_count),1,110,110)"^2)';
$query = apachesolr_drupal_query($keys, $filters, $solrsort, 'search/' . arg(1));

inside apachesolr_search_search() (or other) when a search with filters but no keys is issued. This means the $query object will now include the '(*:* OR ...' as search keys, which then means all the facet blocks' links have that as the search keys in them, etc. So while we can *fetch* results correctly, other interface elements break down.

Perhaps, instead of getting the snippets via Solr we could just try getting the complete body field and snipping that post-search. Here is a rewritten patch that does just this (builds on the one from comment #56). It's simpler, does the job, and fetching 10 complete body fields from Solr per request, only in this case, probably shouldn't break anything. =)

Scott Reynolds’s picture

So Re: getting the complete body field, this is what I did in the Views implementation http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/apachesolr_...

Views makes it easy of course because any field can be trimmed to a max length, which you would need to do here. I don't like it though, that's too much data to be bringing back and forth. And I don't know why hl.alternateField doesn't work.

David Lesieur’s picture

Attachment (new): 3.09 KB

The patch in #58 did not apply anymore. Re-rolled it, but without the hijacking of taxonomy links. If we are going to provide a starting point for faceted browsing (as suggested in #442976: Facet blocks anywhere), then I'm not sure everyone using apachesolr_search will want to hijack the taxonomy links. Perhaps the hijacking logic should be moved to a separate module, or made optional through a setting?

Instead of showing the full node body, I have added a call to search_excerpt() to generate a shorter version. However, that's still just working around the lack of a proper snippet from Solr...

janusman’s picture

Thanks, @david. I think it's easy enough to add an option to the module to let admins decide whether or not to hijack taxonomy term links.

Do others think this should go into a separate module? "Apache Solr Taxonomy Redirect"?

And yes *facepalm*, search_excerpt() =)

pwolanin’s picture

I'd rather not go back to using search_excerpt() - we removed that since it implied either running a node_load for each result, or getting a larger body field back from Solr.

We could potentially index a short excerpt using: <copyField source="body" dest="teaser" maxChars="300"/>

see the wiki

And then never ask the highlighter for an alternate field.

janusman’s picture

@pwolanin: any reason not to just store $node->teaser at index time?

Although I can see how just doing a copyField would be a generic solution if we're indexing non-node data.

Also, would appreciate your thoughts on where "replacing taxonomy/term/XXX pages with results from Apache Solr" should go: (a) in apachesolr_search.module as a switchable option, or (b) create a separate module for it.

David Lesieur’s picture

StatusFileSize
new4.94 KB

Here's a patch that uses a teaser copyField instead of running search_excerpt() on the body. I made the teaser 256 chars long, as in core search.

David Lesieur’s picture

Oops, forgot to disable the highlighter when there are no search keys... Will submit another patch.

David Lesieur’s picture

StatusFileSize
new6.45 KB
David Lesieur’s picture

@janusman: On #63, a new module would eliminate any weight in apachesolr_search.module when the feature is not needed. Also, no configuration settings would have to be implemented as enabling the new module would suffice to activate the feature.

janusman’s picture

Status: Needs review » Reviewed & tested by the community

Patched, copied schema.xml to apachesolr directory and restarted Solr.

Snippets show properly with recently-indexed items when searching just for facets, e.g.:

search/apachesolr_search/?filters=type:biblio

pwolanin’s picture

@David - I used 300 chars above, since we probably want to truncate back to the 1st whitespace.

David Lesieur’s picture

StatusFileSize
new6.47 KB

Pushed limit to 300 characters. Used truncate_utf8()'s word safe truncation.
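
To illustrate what word-safe truncation means here, the sketch below cuts a string to a character limit and then backs up to the last whitespace, which is why the copyField limit needs some headroom. This is a standalone stand-in written for this example, not Drupal's truncate_utf8() implementation; the name example_word_safe_truncate() is hypothetical.

```php
<?php
/**
 * Hypothetical stand-in for word-safe truncation (not Drupal's code):
 * return at most $max characters of $text without splitting a word.
 */
function example_word_safe_truncate($text, $max = 300) {
  // Capture the first $max characters and the single character after them.
  // /u counts UTF-8 characters rather than bytes; /s lets "." match newlines.
  if (!preg_match('/^(.{0,' . (int) $max . '})(.?)/us', $text, $m)) {
    // The pattern fails only on invalid UTF-8; leave such input untouched.
    return $text;
  }
  $cut = $m[1];
  $next = isset($m[2]) ? $m[2] : '';
  if ($next === '') {
    // Nothing follows the cut, so the text already fits.
    return $text;
  }
  if (preg_match('/\s/u', $next)) {
    // The cut ended exactly on a word boundary.
    return rtrim($cut);
  }
  // The cut landed inside a word: drop the partial word and the space before it.
  $trimmed = preg_replace('/\s+\S*$/u', '', $cut);
  // If there was no usable whitespace, fall back to the hard cut.
  return $trimmed !== '' ? $trimmed : $cut;
}

// Usage: the 8-character cut "alpha be" backs up to the previous word.
echo example_word_safe_truncate('alpha beta gamma', 8), "\n";
```

The result can be noticeably shorter than the limit when the last word is long, so the limit is an upper bound on snippet length, not a target.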

pwolanin’s picture

The other thing to consider then is whether it makes sense to use this field for all search results? i.e. never ask for a highlight alternate field?

janusman’s picture

@pwolanin: if you mean "don't send a highlighting request to Solr when $keys is empty", that's currently in the patch:

  if ($keys) {
    apachesolr_search_add_highlighting($params, $query);
    apachesolr_search_add_spellchecking($params, $query);
  }
  else {
    // No highlighting, use the teaser as a snippet.
    $params['fl'] .= ',teaser';
  }

nick_vh’s picture

Subscribing

nick_vh’s picture

StatusFileSize
new6.52 KB

I made a change to this patch that makes it easier to programmatically add default front pages for Apache Solr. For example, this makes it easier to have all the blocks you need (facets) on your front page without having weird arguments.

So, how do you make your own homepage after this patch? Create a custom module and put this in the module. In the future it might be easier if this could be implemented with settings of the module?

function custommodule_menu() {
  $items['customfrontpage'] = array(
    'title' => 'Search fields',
    'page callback' => 'apachesolr_search_view',
    'page arguments' => array('apachesolr_search', 'type:product'),
    'access callback' => TRUE,
    'type' => MENU_CALLBACK,
  );
  return $items;
}

David Lesieur’s picture

@Nick_vh: Could your change be more related to #442976: Facet blocks anywhere?

@pwolanin: Does janusman's comment in #72 answer your question? I'm also not sure if you meant something else.

pwolanin’s picture

@David - it's a partial answer - but I meant: even for normal queries, would it make sense to return this 300-character snippet for each doc?

David Lesieur’s picture

Status: Reviewed & tested by the community » Needs review
StatusFileSize
new7.28 KB

I guess the highlighter will find more relevant snippets by working with the full body.
But we could use the teaser instead of the body as the alternateField (as in this new patch). I'm not sure how useful the alternate field is.

nick_vh’s picture

@David my change was indeed related to both patches, that's why I made another issue which combines both patches + some modifications to enable certain functionality

David Lesieur’s picture

Unless I have misunderstood pwolanin's request about the highlighter, I think the patch is RTBC. If we need to support additional features, then separate patches could take care of that.

pwolanin’s picture

My thought was to not specify an alternate field at all. I think this change or the change in the last patch is less-than-ideal at this point since it will cause grief for people with existing indexes.

JacobSingh’s picture

I tried the patch in #77.

It worked, but I didn't get teasers...

Rebuilt my index from scratch on the latest version.

pwolanin’s picture

Status: Needs review » Needs work
pwolanin’s picture

The patch in #70 looks very close, but needs a little work - let me re-roll that.

pwolanin’s picture

Hmm, as a side note - reading http://www.derivante.com/2009/04/27/100x-increase-in-solr-performance-an..., this sort of pattern (using q.alt=*:* + a fq) may have extremely bad performance for larger indexes. Even playing locally with a small index, I can see that moving the fq params to q.alt is much faster.
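
A rough sketch of what "moving the fq params to q.alt" could look like: the filters are joined into a single query string instead of being sent as separate fq parameters next to q.alt=*:*. The helper name example_filters_as_qalt() is hypothetical, the speed-up is only what was observed above on a small local index, and note that clauses moved out of fq start to influence scoring.

```php
<?php
/**
 * Hypothetical helper (not part of apachesolr): fold filter clauses into
 * one q.alt string rather than sending q.alt=*:* plus separate fq params.
 */
function example_filters_as_qalt(array $filters) {
  if (empty($filters)) {
    // No filters at all: fall back to matching every document.
    return '*:*';
  }
  $clauses = array();
  foreach ($filters as $filter) {
    // Parenthesize each clause so its internal operators stay grouped.
    $clauses[] = '(' . $filter . ')';
  }
  // Require every clause, mirroring how multiple fq params intersect.
  return implode(' AND ', $clauses);
}

// Usage: two filters become one q.alt value.
echo example_filters_as_qalt(array('tid:47', 'type:story')), "\n";
```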

pwolanin’s picture

The existing #70 patch has a problem: the keys are not retained when switching from a different search tab.

pwolanin’s picture

StatusFileSize
new7.85 KB

ok, I think this is almost there, barring tweaking the submit function to allow one to retain filters and submit w/out keywords, plus possibly make an unclick link for the current search keywords.

JacobSingh’s picture

Works. I like it.

In terms of the ability to submit w/o keywords, it works for me, it just shows the search module warning about "Please enter some keywords." I suppose we'd have to gag this by doing a drupal_goto in the submit handler and not letting it complete?

Another good way to demonstrate this functionality (which currently can only be seen by typing in the filters=...) would be to have a default "start page". I would like to see a list of results sorted by date descending as the default, and perhaps allow the admin to specify a "default" query which shows on search/apachesolr_search.

I started messing around with this a little and I noticed this in apachesolr_search_view

// Collect the search results:
$results = search_data($keys, $type);

Why don't we just call:
$results = apachesolr_search_execute($keys, $filters, $solrsort, 'search/' . arg(1), $page);
directly?

If we did that, we could pass null keys and therefore get the latest 20 items of something. As it stands you would need to hack $_GET, which, as we all know, is bad voodoo.

Actually, I take back what I said about the sort. It would be the best behavior, but it is going to be a nasty sort across all records. Perhaps sorting descending by nid would be more or less the same? The temptation is just to return them without a sort, but people rarely want users to be presented with a set of the first nodes they ever created.

pwolanin’s picture

I think it makes sense to use

// Collect the search results:
$results = search_data($keys, $type);

here since we are overriding the page callback for the /search path, so this will be the submit path for anyone executing a new search, I think.

pwolanin’s picture

Status: Needs work » Needs review
StatusFileSize
new9.96 KB
pwolanin’s picture

Status: Needs review » Fixed

committing to 6.x

Open a new issue if needed for follow-ups

janusman’s picture

@jacobsingh: I opened an issue for something related a while back, #457826: On empty search, show enabled filters to start a search

It also seems @nick_vh opened another similar (?) issue: #481986: Combined #358166 and #457826 + default filter + be able to put solr as <front>

xarbot’s picture

Are all of these patches in the June 22 release?

Thanks

Xarbot

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

rjbrown99’s picture

I was going back over this and wanted to find the commit. To save some searching, here it is.
http://drupal.org/cvs?commit=224662

bozho’s picture

I tried to apply this patch on Apache Solr Search Integration 6.x-1.4 module, but without success:

Hunk #1 succeeded at 86 with fuzz 1 (offset 34 lines).
Hunk #2 succeeded at 205 with fuzz 1 (offset 91 lines).
Hunk #3 FAILED at 178.
Hunk #4 FAILED at 192.
Hunk #5 FAILED at 210.
Hunk #6 FAILED at 346.
Hunk #7 FAILED at 668.
Hunk #8 FAILED at 681.
Hunk #9 FAILED at 713.
Hunk #10 FAILED at 863.
8 out of 10 hunks FAILED -- saving rejects to file apachesolr_search.module.rej
patching file schema.xml
Hunk #1 FAILED at 11.
Hunk #2 FAILED at 263.
Hunk #3 FAILED at 293.
3 out of 3 hunks FAILED -- saving rejects to file schema.xml.rej

I examined the patch and the files and it seems that the changes from the patch are already implemented, but wild-card searching still doesn't work.

bozho’s picture

StatusFileSize
new1.97 KB
new852 bytes