Since rc1, it is possible to optionally "hijack" taxonomy pages and run a faceted search on the corresponding taxonomy term instead.

However, in our case, we use faceted search for only two *highly structured* content types (the bulk of our content, many thousands of nodes); we exclude all other content types, such as blog posts.
That is, we exclude these other (less structured or unstructured) content types from the Solr search index in the 'content bias settings'.

We would love to hijack taxonomy pages for the taxonomy vocabularies associated with the solr-searchable content types.

It would be great if the "hijack taxonomy pages" option automatically selected only those taxonomies that are actually indexed by Solr (as per 'content bias settings').

Comments

robertdouglass’s picture

Category: feature » bug

Agree.

janusman’s picture

Hmm, in fact it should only hijack terms for vocabularies that are indexed AND were specified as available filters for Apache Solr in admin/settings/apachesolr/enabled-filters ...

/**
 * Overrides taxonomy/term/X links
 */
function apachesolr_search_taxonomy_term_page($str_tids = '', $depth = 0, $op = 'page') {

  drupal_add_feed(url('taxonomy/term/' . $str_tids . '/' . $depth . '/feed'), 'RSS - '. $title);
  
  $vids_to_override = apachesolr_get_enabled_facets('apachesolr_search'); 

  [... stuff removed ...]

    // Check if term belongs to vocabulary selected by admin as an available filter
    $term = taxonomy_get_term($terms['tids'][0]);
    $vocabulary_facet_name = 'im_vid_' . $term->vid;

    if (!in_array($vocabulary_facet_name, $vids_to_override)) {
      $redirect_to_apachesolr = FALSE;
    }

  [... stuff removed ...]

Now, if a vocabulary that has been enabled as a filter is shared across several content types, and one of those content types is excluded from the Solr index, then yes, you will not get those nodes back when you visit the hijacked taxonomy/term/XX page. Are you saying this is what's happening?

pips1’s picture

Clicking on a taxonomy term should give you a list of *all* content that has been tagged with the term, regardless of whether it's in the Solr index or not, IMHO.

For the case you describe in #2, I'd say "hijacking" for that vocabulary should be automatically disabled / disallowed, in favour of the standard taxonomy term listing.

In other words, hijacking should only apply to taxonomy vocabularies that

  • were specified as Solr filters (as per 'Enabled filters') and
  • are assigned exclusively to content types that are indexed by Solr (as per 'content bias settings').

Makes sense?
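
A minimal sketch of the two conditions above, written as plain PHP so it runs outside Drupal. The function name and its array parameters are hypothetical and not part of the apachesolr module; only the 'im_vid_' facet naming is taken from the excerpt quoted earlier in this thread. In a real site the three arrays would have to be looked up from the module's settings and the vocabulary definition.

```php
<?php
/**
 * Hypothetical helper (not part of apachesolr): decides whether term pages
 * of a vocabulary may be hijacked under the rule proposed above.
 *
 * @param int   $vid                   Vocabulary ID.
 * @param array $enabled_facets        Enabled facet names, e.g. 'im_vid_1'.
 * @param array $vocabulary_node_types Content types the vocabulary is assigned to.
 * @param array $indexed_node_types    Content types that are indexed by Solr.
 */
function example_vocabulary_may_be_hijacked($vid, array $enabled_facets, array $vocabulary_node_types, array $indexed_node_types) {
  // Condition 1: the vocabulary was specified as a Solr filter.
  if (!in_array('im_vid_' . $vid, $enabled_facets)) {
    return FALSE;
  }
  // Condition 2: every content type using the vocabulary is indexed.
  return count(array_diff($vocabulary_node_types, $indexed_node_types)) == 0;
}

// Vocabulary shared by "story" and "page", but only "story" is indexed.
var_dump(example_vocabulary_may_be_hijacked(1, array('im_vid_1'), array('story', 'page'), array('story')));
// Vocabulary used only by the indexed "story" type.
var_dump(example_vocabulary_may_be_hijacked(1, array('im_vid_1'), array('story'), array('story')));
```

The first call yields FALSE (the standard taxonomy listing would be kept) and the second yields TRUE (the page could be hijacked).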

janusman’s picture

Got it =)

Although, it's really the admin's call to decide whether any of this happens or not... =)

We just need to make a decision (I tend to agree with your #3)

The task now becomes how to make this obvious to the admin... perhaps a message under admin/settings/apachesolr/settings > "Use Apache Solr for taxonomy links" ... perhaps also sprinkle some help messages on other forms, like admin/content/taxonomy/add/vocabulary for instance? @Robert added some contextual help that can be switched off by the admin; these messages could go there.

As for the actual patch, I'm swamped for now, so it's up for grabs.

pwolanin’s picture

I think there is a misunderstanding here - all the taxonomy terms are indexed afaik - the bias page only selects which ones to search or regard as important.

janusman’s picture

Title: "Hijack taxonomy pages" should only select vocabularies which are actually indexed by solr (as per 'content bias settings') » "Hijack taxonomy pages" should only hijack when no content is hidden because of node type exclusion in content bias settings.

@pwolanin: In hopes of making the issue clearer... changed the issue title, and here's an explanation:

Try this:
* Create 2 content types: "story" and "page"
* Create a taxonomy vocabulary, applicable to both. Say "Fruit" with the term "Apple" (let's say that term has tid = 1)
* Label one node in each content type as "Apple"
* Tell apachesolr to only index "story"-type content
* Tell apachesolr to hijack taxonomy pages.
* Index all the site's content.
* Visit taxonomy/term/1 ("Apple").

The problem is: you get only the story node tagged "Apple", but not the page node tagged "Apple".

@Pips in #3 is (I think) referring to content (and not terms) in the index:

"Clicking on a taxonomy term should give you a list of *all* content that has been tagged with the term, regardless of whether it's in the Solr index or not, IMHO."

So, he proposes that the taxonomy hijack function check whether hijacking that term's page would cause ANY exclusion of content and, if so, not hijack.

@pips, @robertDouglass: Hope I understood correctly =)
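
The per-term check described above could look roughly like the following. This is only an illustrative sketch in plain PHP: the function name and the shape of the input arrays are hypothetical, and a real implementation would have to query the site for the nodes tagged with the term rather than receive them as an array.

```php
<?php
/**
 * Hypothetical helper (not part of apachesolr): reports whether hijacking a
 * term page would hide content, i.e. whether at least one node tagged with
 * the term belongs to a content type that is not in the Solr index.
 */
function example_term_hijack_hides_content(array $tagged_nodes, array $indexed_node_types) {
  foreach ($tagged_nodes as $node) {
    if (!in_array($node['type'], $indexed_node_types)) {
      return TRUE;
    }
  }
  return FALSE;
}

// The "Apple" scenario from the steps above: one story node and one page
// node carry the term, but only "story" is indexed.
$apple_nodes = array(
  array('nid' => 1, 'type' => 'story'),
  array('nid' => 2, 'type' => 'page'),
);
$hides = example_term_hijack_hides_content($apple_nodes, array('story'));
echo $hides ? "fall back to the standard taxonomy listing\n" : "hijack with Solr\n";
```

With the data above the page node would be hidden, so the sketch chooses the standard listing.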

pips1’s picture

@janusman: You describe well what I meant!

pwolanin’s picture

@janusman - ok, yes, that makes it clear why there is a problem, but it's not clear whether apachesolr should catch it or the admin should catch it.

janusman’s picture

Version: 6.x-1.x-dev » 6.x-2.x-dev
Attachment status: new
Attachment size: 1.3 KB

That makes two of us. I thought the admin *should* catch it.

Perhaps we just need to add a line of documentation to the UI, like so?

(Patch for 2.x)

Index: apachesolr_search.module
===================================================================
RCS file: /cvs/drupal-contrib/contributions/modules/apachesolr/apachesolr_search.module,v
retrieving revision 1.1.2.6.2.111.2.7
diff -u -r1.1.2.6.2.111.2.7 apachesolr_search.module
--- apachesolr_search.module	18 Aug 2009 11:03:12 -0000	1.1.2.6.2.111.2.7
+++ apachesolr_search.module	18 Aug 2009 19:05:50 -0000
@@ -977,7 +977,7 @@
     '#type' => 'radios',
     '#title' => t('Use Apache Solr for taxonomy links'),
     '#default_value' => variable_get('apachesolr_search_taxonomy_links', 0),
-    '#description' => t('Note: Vocabularies that need this behavior need to be checked off on the <a href="@enabled_filters_url">enabled filters</a> settings page', array('@enabled_filters_url' => url('admin/settings/apachesolr/enabled-filters'))),
+    '#description' => t('Note: Vocabularies that need this behavior need to be checked off on the <a href="@enabled_filters_url">enabled filters</a> settings page. WARNING: Content types omitted from the Apache Solr index will not be shown.', array('@enabled_filters_url' => url('admin/settings/apachesolr/enabled-filters'))),
     '#options' => array(0 => t('Disabled'), 1 => t('Enabled')),
   );
   $form['advanced']['apachesolr_search_taxonomy_previous'] = array(

janusman’s picture

Status: Active » Needs review

Bah, forgot to set as needs review.

janusman’s picture

Issue tags: +Quick fix

Tagging

jpmckinney’s picture

I'm in favor of ripping this out into contrib/, as it has nothing to do with the core functionality of the module. I think the warning in #9 is preferable to #3, as #3 will lead to inconsistent UI between different taxonomy pages.

jpmckinney’s picture

Issue tags: +taxonomy hijack

Add tag.

jpmckinney’s picture

Title: "Hijack taxonomy pages" should only hijack when no content is hidden because of node type exclusion in content bias settings. » Document that "hijack taxonomy pages" only displays indexed content
Category: bug » task
Status: Needs review » Fixed
Attachment status: new
Attachment size: 1.11 KB

I think the best we can do is have better documentation per #9. You hijack, or you don't hijack. If you want fancy-hijack, you need to write some code, because there are mutually exclusive ways of being fancy.

Fixed in 6.1 and 6.2. 7.x not affected.

Status: Fixed » Closed (fixed)
Issue tags: -Quick fix, -taxonomy hijack

Automatically closed -- issue fixed for 2 weeks with no activity.