Using Sarnia to interact with external Solr data

This document refers to the 1.0 release of the Sarnia module. Sarnia allows a Drupal site to interact with and display external data from Solr, mainly by building views of data from Solr. This is useful for large external datasets that either aren't practical to store in Drupal or that are already indexed in Solr.

Sarnia is also the name of a town in Ontario, Canada, home of the largest photovoltaic power plant on the planet.

Table of contents

  1. Installation
  2. Generating a Solr core for testing
  3. Configuring Search API
  4. Creating Views of Solr data
  5. Advanced Solr
  6. Advanced Entities

Installation

Sarnia depends on Search API, Search API Solr, and Search API Views. The full list of dependencies includes:

Installing Apache Solr on Windows 2008 with Jetty Running as a Service

Please note that this article was reproduced with permission from Bill Beckelman's website. The original article can be found here. This guide has been used successfully with Drupal.

Initial Solr Setup

1. Install the latest Java JDK from http://www.oracle.com/technetwork/java/javase/downloads/index.html.
Make sure to select the 64-bit version if you need it.

2. Download Solr 1.4.1 from one of the mirrors at http://www.apache.org/dyn/closer.cgi/lucene/solr/
(at the time of writing, not all mirrors seem to be hosting 1.4.1, but most seem
to have at least 1.4.0)

3. Unzip the Solr download. You should have the files listed in the image
below. Open the example folder.

Manipulating Apache Solr search results using the QueryElevationComponent and elevate.xml

To use these directions, you must have access to your Apache Solr installation, including solrconfig.xml and elevate.xml.

By default, Apache Solr does not use the QueryElevationComponent (which enables the administrator to manipulate query results) in the default request handler. This page is a guide on how to use QueryElevationComponent to artificially promote nodes based on a search query.
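
As a rough illustration of what elevate.xml contains, the sketch below generates a minimal file that promotes specific documents for a given query text. The function name build_elevate_xml and the document ids are hypothetical; real ids must match the unique key values stored in your own index, and the QueryElevationComponent still has to be enabled in solrconfig.xml before the file has any effect.

```python
import xml.etree.ElementTree as ET


def build_elevate_xml(rules):
    """Return elevate.xml content for a mapping of query text -> document ids.

    `rules` is a hypothetical structure used only for this sketch:
    each key is the search text, each value is the ordered list of
    document ids to promote for that text.
    """
    root = ET.Element("elevate")
    for query_text, doc_ids in rules.items():
        # One <query> element per search phrase to manipulate.
        query = ET.SubElement(root, "query", {"text": query_text})
        for doc_id in doc_ids:
            # Documents are promoted in the order they are listed.
            ET.SubElement(query, "doc", {"id": doc_id})
    return ET.tostring(root, encoding="unicode")


# Example ids are made up; substitute ids that exist in your index.
elevate_xml = build_elevate_xml({"solr": ["abc123/node/42", "abc123/node/7"]})
print(elevate_xml)
```

The generated text would replace or extend the contents of elevate.xml in the Solr configuration directory.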

Filter search results by site when using the same Apache Solr Index for multiple sites

When you use one Apache Solr instance to index multiple sites, content from all sites ends up in the same index.

To ensure that the search results show only content from the current site, you can filter the Apache Solr index on an attribute called 'Site Hash'.
The Apache Solr module passes this attribute every time the site is indexed, and it is stored together with the related pages.

For those who are familiar with GSA (Google Search Appliance), if you are looking for a "collection" behavior, this filter may be the answer.

Note: the site hash is generated once from the base_url by a function called "apachesolr_site_hash()" and is then stored in a variable in your database. So if your sites share the base_url or the database, they will share the Site Hash.
In this case, you may need to add another filter (e.g. Domain ID if using the Domain Access module).
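
To make the filtering idea concrete, here is a minimal sketch of a Solr select URL that restricts results to one site through a filter query (the standard Solr fq parameter). It assumes the site hash is indexed in a field named "hash"; check your schema.xml, since the field name is an assumption here. The helper name, the Solr URL, and the hash value "abc123" are all hypothetical.

```python
from urllib.parse import urlencode


def build_site_filtered_query(solr_base_url, keywords, site_hash, hash_field="hash"):
    """Build a select URL limited to documents indexed by one site.

    `hash_field` is assumed to be the index field holding the site hash;
    adjust it to match your schema.
    """
    params = [
        ("q", keywords),
        # The filter query narrows results without affecting relevance scoring.
        ("fq", f"{hash_field}:{site_hash}"),
        ("wt", "json"),
    ]
    return f"{solr_base_url.rstrip('/')}/select?{urlencode(params)}"


# Hypothetical values for illustration only.
url = build_site_filtered_query("http://localhost:8983/solr", "drupal views", "abc123")
print(url)
```

If sites share a Site Hash, a second fq parameter on another distinguishing field could be appended in the same way.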
