In Facet API, the "type" key in hook_facetapi_searcher_info() reflects the type of content being indexed. For example, it could be a node or a comment. It can also be a non-entity such as "external html" if you are crawling non-Drupal sites and indexing them in Solr. Depending on the type, different facets will be available.

Currently Apache Solr always adds node fields as facets regardless of the type. Even though the bug doesn't show up right now, it will be problematic when Apache Solr more easily indexes non-node data, because the system will think that you can facet on fields that are not in the index.

Comments

cpliakas’s picture

Status: Active » Needs review
Attached file (new): 587 bytes

The attached patch will prevent any issues in the future. Once non-node data is easily indexed, I'm sure some refactoring of the hook_facetapi_facet_info() implementation can happen to reuse that code to get fields for multiple entities.

In addition, the hook_facetapi_searcher_info() implementation will have to make sure to correctly set the "type" based on the type of content it is indexing.

pwolanin’s picture

Status: Needs review » Needs work

So, discussing this in IRC w/ Chris

Probably each searcher should have an array of types (or entity_types, assuming we are not using a strict Drupal definition of entity) that it has in the index, and each facet should be relevant to one or more of these types.

The facets need to know this, at least for the bundle dependency code. Maybe in some cases the facet should be entity_type agnostic, with the bundle dependency code disabled?
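To make the idea above concrete, here is a rough sketch in plain PHP of matching facets to a searcher's types. It is not Facet API code: the helper name example_filter_facets_by_type() and the per-facet 'types' key are assumptions made up for illustration, and treating a facet without 'types' as type agnostic is just one possible reading of the proposal.

```php
<?php
/**
 * Hypothetical helper, not part of Facet API: keeps only the facets that are
 * relevant to at least one of the types a searcher has in its index.
 *
 * @param array $searcher_types
 *   Types the searcher indexes, e.g. array('node', 'html').
 * @param array $facets
 *   Facet definitions keyed by name. The optional 'types' key is an
 *   assumption for this sketch.
 */
function example_filter_facets_by_type(array $searcher_types, array $facets) {
  $available = array();
  foreach ($facets as $name => $facet) {
    // A facet that declares no types is treated as type agnostic.
    if (empty($facet['types']) || array_intersect($facet['types'], $searcher_types)) {
      $available[$name] = $facet;
    }
  }
  return $available;
}
```

With this approach a searcher that only indexes "html" would never be offered node-only facets, while type-agnostic facets stay available everywhere.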

pwolanin’s picture

Attached files (new): 747 bytes, 2.02 KB

Here's what I was starting on, but needs more thought.

cpliakas’s picture

A "type" might be an entity, but it might not be. For example, a type could specify that you are indexing non-Drupal data outside of the entity system. Note that there is also a backwards compatible API change posted at #1167974: Allow for multiple types to be associated with a searcher, which applies to this issue.

Also, check out the example code for how this is intended to work.
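For readers without the example code at hand, a searcher that declares more than one type might look roughly like the sketch below. The module name "mymodule", the searcher key "mymodule@external", and the label are hypothetical. Only the 'adapter' and 'types' keys are taken from this thread, so treat the rest as an assumption rather than the module's actual definition.

```php
<?php
/**
 * Implements hook_facetapi_searcher_info().
 *
 * Sketch only: "mymodule" and the searcher key are hypothetical names.
 */
function mymodule_facetapi_searcher_info() {
  return array(
    'mymodule@external' => array(
      // A real module would normally wrap the label in t().
      'label' => 'External HTML search',
      'adapter' => 'apachesolr',
      // The searcher indexes Drupal nodes and crawled non-Drupal HTML.
      'types' => array('node', 'html'),
    ),
  );
}
```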

cpliakas’s picture

Status: Needs work » Needs review
Attached file (new): 593 bytes

Refreshed patch with the API change taken into account. Note that this change will not work with Facet API < 7.x-1.0-beta5.

cpliakas’s picture

Priority: Minor » Normal

Bumping this up in priority, because as we start to introduce the use case of indexing non-Drupal data, we will not want the node fields to show up as available facets.

nick_vh’s picture

+1 Approved.

pwolanin’s picture

Status: Needs review » Reviewed & tested by the community

Looks like we need a deeper review and discussion of the architecture, but this is fine as a stop-gap.

cpliakas’s picture

What's the concern here?

pwolanin’s picture

The architecture of dealing with indexing multiple entity types and their fields.

Also, of dealing with external (non-Drupal) data in the index and how one can facet on that (possibly more of a multisite module issue)

pwolanin’s picture

Status: Reviewed & tested by the community » Fixed

committed

cpliakas’s picture

Excellent. Regarding non-Drupal data, this system supports that because you can use any string for the "type". For example, if you are indexing non-Drupal data from Nutch or something similar, you could add another searcher that indexes data of the type "html". Then your hook_facetapi_facet_info() implementation would look similar to the code below:


/**
 * Implements hook_facetapi_facet_info().
 */
function apachesolr_facetapi_facet_info($searcher_info) {
  $facets = array();
  
  // 'types' is a key of $searcher_info itself, not of the 'adapter' string.
  if ('apachesolr' == $searcher_info['adapter'] && isset($searcher_info['types']['node'])) {
    // Add your node facets here.
  }

  if ('apachesolr' == $searcher_info['adapter'] && isset($searcher_info['types']['html'])) {
    // Add your non-Drupal facets here.
  }

  return $facets;
}

This system doesn't make any assumptions about what you are indexing, or what combination of data you are indexing. The type could be an entity, but it also might not be. That is perfectly fine, and there are no issues with that.

Automatically closed -- issue fixed for 2 weeks with no activity.