NOTE: This page is outdated because some config paths and security defaults have changed. Please help to update this documentation.

Here are some of the steps to do a multi-core setup on Debian (Lenny or Squeeze, with some packages taken from a newer suite). This uses the Java, Solr, and Jetty packages from Debian rather than custom-installed versions, which should be easier to maintain. The Solr packages were removed from Squeeze but can be found in testing (wheezy) and unstable (Sid).

This page focuses on installing the Solr server on Debian, not on installing and configuring the related Drupal module. For more information refer to some of the other howtos.

This Solr set-up also works for Search API (http://drupal.org/project/search_api_solr).

The structure this ends up with:

  • have all cores in /var/lib/solr
  • enable cores via symlinks in /usr/share/solr
  • configure HTTP authentication in jetty

Packages

Since there are no packages for solr or jetty in Debian squeeze these instructions will use the packages from Debian testing (wheezy). For consistency the whole java/jetty/solr stack will be mainly using packages from wheezy. There was a suggestion that this isn't a very secure way of using Debian. The Debian testing suite does receive security updates though. From experience, there is an added advantage of being able to upgrade with much less hassle to the next Debian stable using this technique. But if you are worried that this may be less secure and less tested than using bleeding edge tarballs from solr with no Debian testing at all then stop here (and reconsider your logic).

There are no packages for sun-java in Debian wheezy so far and there is a dependency in one of the java libraries used by solr that breaks when using the sun-java from squeeze. That's why it's better to use openjdk's java packages.

Add this line to your /etc/apt/sources.list:

..
# packages for java/solr/tomcat6/jetty
deb     http://ftp.uk.debian.org/debian/     testing main contrib non-free

Create or edit /etc/apt/preferences like this (this ensures that only the necessary packages are taken from the testing suite):

Package:  *
Pin:  release a=stable
Pin-Priority:  999

Package:  *
Pin:  release a=*
Pin-Priority:  400
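
To confirm that the pinning behaves as intended, apt-cache policy reports which version apt would install for a package. The following is a minimal sketch, not part of the original recipe: the helper name candidate_version is made up for illustration, and it only reads the "Candidate:" line of the policy output.

```shell
# Hypothetical helper: extract the "Candidate:" version from
# `apt-cache policy <package>` output read on stdin.
candidate_version() {
    awk '$1 == "Candidate:" { print $2; exit }'
}

# Only query apt when it is available (i.e. on a Debian-like system).
if command -v apt-cache >/dev/null 2>&1; then
    for pkg in solr-jetty jetty openjdk-6-jdk; do
        printf '%s: %s\n' "$pkg" "$(apt-cache policy "$pkg" | candidate_version)"
    done
fi
```

If a package that exists in stable shows a testing version as its candidate, the priorities in /etc/apt/preferences are probably not being applied.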

This set-up uses jetty. Install the packages:

aptitude install jetty solr-common solr-jetty openjdk-6-jdk openjdk-6-jre openjdk-6-jre-headless openjdk-6-jre-lib javacc

Versions of packages
Use dpkg to see what versions you have ended up with, e.g.

dpkg -l \*openjdk\*|grep ^ii
ii  openjdk-6-jdk                         6b18-1.8.13-0+squeeze2       OpenJDK Development Kit (JDK)
ii  openjdk-6-jre                         6b18-1.8.13-0+squeeze2       OpenJDK Java runtime, using Hotspot JIT
ii  openjdk-6-jre-headless                6b18-1.8.13-0+squeeze2       OpenJDK Java runtime, using Hotspot JIT (headless)
ii  openjdk-6-jre-lib                     6b18-1.8.13-0+squeeze2       OpenJDK Java runtime (architecture independent libraries)
dpkg -l \*solr\*|grep ^ii
ii  libsolr-java                          3.6.0+dfsg-1                 Enterprise search server based on Lucene - Java libraries
ii  solr-common                           3.6.0+dfsg-1                 Enterprise search server based on Lucene3 - common files
ii  solr-jetty                            3.6.0+dfsg-1                 Enterprise search server based on Lucene3 - Jetty integration
dpkg -l \*jetty\*|grep ^ii
ii  jetty                                 6.1.24-6                     Java servlet engine and webserver
ii  libjetty-extra-java                   6.1.24-6                     Java servlet engine and webserver -- extra libraries
ii  libjetty-java                         6.1.24-6                     Java servlet engine and webserver -- core libraries
ii  solr-jetty                            3.6.0+dfsg-1                 Enterprise search server based on Lucene3 - Jetty integration

If you end up with different versions then leave a comment here. It's possible to install particular package versions using the aptitude gui.

You can check your java version using this command:

java -version

The output should look similar to this:

java version "1.6.0_18"
OpenJDK Runtime Environment (IcedTea6 1.8.13) (6b18-1.8.13-0+squeeze2)
OpenJDK Server VM (build 14.0-b16, mixed mode)

Once you have all of this in place, continue to the configuration. Check other documentation for getting the Drupal schema, solrconfig.xml, and the PHP connector.

File system layout

The solr-jetty integration package creates a symlink from /var/lib/jetty/webapps/solr to /usr/share/solr/webapp. This symlink is currently broken. Until it's fixed, the best solution is probably to create the missing target as a symlink to /usr/share/solr/web:

ln -s /usr/share/solr/web /usr/share/solr/webapp
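
Before and after creating the link, it can help to check whether the symlinks actually resolve. This is a small sketch under the assumption that the paths are the ones named above; the helper name is_broken_symlink is made up for illustration.

```shell
# Hypothetical helper: succeed if the path is a symlink whose target is missing.
is_broken_symlink() {
    [ -L "$1" ] && [ ! -e "$1" ]
}

# Paths taken from the text above; adjust if your package layout differs.
for link in /var/lib/jetty/webapps/solr /usr/share/solr/webapp; do
    if is_broken_symlink "$link"; then
        echo "broken symlink: $link -> $(readlink "$link")"
    fi
done
```

No output means that neither path is a dangling symlink (it may still be missing altogether).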

The Debian policy is to put configuration files into /etc and persistent data into /var. Sadly, the main configuration file for multi-core (solr.xml) needs to live in /usr/share/solr. Nowadays it is symlinked to from /etc/solr/solr.xml.

For multi core you'll need to uncomment or add a section similar to this:

/usr/share/solr/solr.xml

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="false">
  <cores adminPath="/admin/cores" shareSchema="true">
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
    <core name="core2" instanceDir="core2" />
  </cores>
</solr>

(Note: Somebody reported that if "name" and "instanceDir" are not the same they can't access solr (unconfirmed).)
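
Given that report, a quick way to spot cores whose name and instanceDir differ is to list both attributes from solr.xml. This is only a sketch: the helper names are hypothetical, and the sed expression assumes one core element per line with name before instanceDir, as in the example above.

```shell
# Hypothetical helper: print "name instanceDir" for each <core> element in a
# solr.xml written in the one-element-per-line style shown above.
list_cores() {
    sed -n 's/.*<core name="\([^"]*\)" instanceDir="\([^"]*\)".*/\1 \2/p' "$1"
}

# Hypothetical helper: print the names of cores whose instanceDir
# (ignoring a trailing slash) differs from the core name.
mismatched_cores() {
    list_cores "$1" | while read -r name dir; do
        [ "$name" = "${dir%/}" ] || echo "$name"
    done
}
# Example: mismatched_cores /usr/share/solr/solr.xml
```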

Change to /var/lib/solr - this is where our configurations and data will live.

mkdir /var/lib/solr/core0
cp -a /etc/solr/conf /var/lib/solr/core0/conf
cp solrconfig.xml /var/lib/solr/core0/conf/solrconfig.xml
cp schema.xml /var/lib/solr/core0/conf/schema.xml

If you wonder where the last two XML files come from: they are part of the Apache Solr search module for Drupal (or search_api).

Don't forget to give jetty write permissions to the data directory because it needs to create a data folder for every core.

chown -R jetty /var/lib/solr

To enable the core it needs a symlink in /usr/share/solr

ln -s /var/lib/solr/core0 /usr/share/solr/core0
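
For more than one core, the steps above can be wrapped in a small function. This is a sketch under stated assumptions, not a tested installer: the function name add_core and the three variables are made up for illustration, and their defaults are the paths used on this page. The core still has to be listed in solr.xml separately.

```shell
# Locations from the text above; override them to rehearse in a scratch directory.
SOLR_DATA=${SOLR_DATA:-/var/lib/solr}      # where the cores live
SOLR_SHARE=${SOLR_SHARE:-/usr/share/solr}  # where cores are enabled via symlink
SOLR_CONF=${SOLR_CONF:-/etc/solr/conf}     # packaged default configuration

# Hypothetical helper: add_core NAME SOLRCONFIG_XML SCHEMA_XML
# Expects that $SOLR_DATA/NAME does not exist yet.
add_core() {
    core=$1 solrconfig=$2 schema=$3
    mkdir -p "$SOLR_DATA/$core" &&
    cp -a "$SOLR_CONF" "$SOLR_DATA/$core/conf" &&
    cp "$solrconfig" "$SOLR_DATA/$core/conf/solrconfig.xml" &&
    cp "$schema" "$SOLR_DATA/$core/conf/schema.xml" &&
    ln -s "$SOLR_DATA/$core" "$SOLR_SHARE/$core"
}
# Example (as root):
#   add_core core1 solrconfig.xml schema.xml && chown -R jetty /var/lib/solr
```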

You may first have to enable jetty in /etc/default/jetty and set the host there to 0.0.0.0. After restarting jetty,
solr is available at http://127.0.0.1:8080/solr/admin/cores
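
The settings in /etc/default/jetty can be changed by hand or with a small helper like the one below. This is a sketch only: the function name set_default is made up, and the variable names NO_START and JETTY_HOST are an assumption about Debian's jetty package, so check the comments in your own copy of the file first. It relies on GNU sed for the -i option.

```shell
# Hypothetical helper: set_default KEY VALUE FILE
# Replaces an existing (possibly commented-out) KEY= line, or appends one.
set_default() {
    key=$1 value=$2 file=$3
    if grep -q "^[# ]*$key=" "$file"; then
        sed -i "s|^[# ]*$key=.*|$key=$value|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
# Example (as root; variable names are assumptions, see above):
#   set_default NO_START 0 /etc/default/jetty
#   set_default JETTY_HOST 0.0.0.0 /etc/default/jetty
```

Keep in mind that binding to 0.0.0.0 exposes jetty on all interfaces, which is why the authentication described in the next section matters.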

Authentication

Check the documentation on jetty security. On a global level you want to protect access to anything admin:
/etc/jetty/webdefault.xml

  <security-constraint>
    <web-resource-collection>
      <web-resource-name>Solr authenticated application</web-resource-name>
      <url-pattern>/*</url-pattern>
    </web-resource-collection>
    <auth-constraint>
      <role-name>admin</role-name>
      <role-name>solr-role</role-name>
    </auth-constraint>
  </security-constraint>

And this needs also a login-config:
/etc/jetty/webdefault.xml

  <login-config>
    <auth-method>BASIC</auth-method>
    <realm-name>Solr Realm</realm-name>
  </login-config>

(Both of these snippets are supposed to go into <web-app>)

For each core add something like this:
/etc/jetty/webdefault.xml (in <web-app>)

  <security-constraint>
    <web-resource-collection>
      <web-resource-name>Solr authenticated application core0</web-resource-name>
      <url-pattern>/core0/*</url-pattern>
    </web-resource-collection>
    <auth-constraint>
      <role-name>admin</role-name>
      <role-name>core0-role</role-name>
    </auth-constraint>
  </security-constraint>

/etc/jetty/realm.properties

core0: Password, core0-role
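
Since every core needs the same two pieces, they can be generated rather than typed. The sketch below simply prints the pattern shown above for a given core name; the helper names core_constraint and realm_line are hypothetical, and the output still has to be pasted into the right files by hand.

```shell
# Hypothetical helper: print the <security-constraint> for one core,
# following the pattern shown above (it belongs inside <web-app>).
core_constraint() {
    cat <<EOF
  <security-constraint>
    <web-resource-collection>
      <web-resource-name>Solr authenticated application $1</web-resource-name>
      <url-pattern>/$1/*</url-pattern>
    </web-resource-collection>
    <auth-constraint>
      <role-name>admin</role-name>
      <role-name>$1-role</role-name>
    </auth-constraint>
  </security-constraint>
EOF
}

# Hypothetical helper: print the matching /etc/jetty/realm.properties line.
realm_line() {
    printf '%s: %s, %s-role\n' "$1" "$2" "$1"
}
# Example: core_constraint core1; realm_line core1 'Password'
```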

Finally, to include the password realm add this to /etc/jetty/jetty.xml if it isn't already there.
In Debian the realm name is set to Test Realm by default. If you get a 500 server error "No realm", then you most likely forgot to change the realm here to Solr Realm (or whatever you set it to in login-config previously).

    <!-- =========================================================== -->
    <!-- Configure Authentication Realms                             -->
    <!-- Realms may be configured for the entire server here, or     -->
    <!-- they can be configured for a specific web app in a context  -->
    <!-- configuration (see $(jetty.home)/contexts/test.xml for an   -->
    <!-- example).                                                   -->
    <!-- =========================================================== -->
    <Set name="UserRealms">
      <Array type="org.mortbay.jetty.security.UserRealm">
        <Item>
          <New class="org.mortbay.jetty.security.HashUserRealm">
            <Set name="name">Solr Realm</Set>
            <Set name="config"><SystemProperty name="jetty.home" default="."/>/etc/realm.properties</Set>
            <Set name="refreshInterval">0</Set>
          </New>
        </Item>
      </Array>
    </Set>

Restart jetty and check it works.
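
One way to check is to compare the HTTP status with and without credentials. This is a sketch, assuming curl is installed and that the core answers on the standard select handler; the helper names are made up, and the hints only restate causes mentioned on this page.

```shell
# Print only the HTTP status code for a request; all arguments go to curl.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$@"
}

# Hypothetical helper: map a status code to a likely cause.
explain_status() {
    case $1 in
        200) echo "OK" ;;
        401) echo "authentication required or wrong credentials" ;;
        404) echo "not found: check the core name, its symlink and solr.xml" ;;
        500) echo "server error: check that the realm names match" ;;
        *)   echo "unexpected status $1" ;;
    esac
}

# Example, using the placeholder user from realm.properties above:
#   url='http://127.0.0.1:8080/solr/core0/select?q=*:*&rows=0'
#   explain_status "$(http_status "$url")"
#   explain_status "$(http_status -u core0:Password "$url")"
```

With authentication working, the first call should report that authentication is required and the second should report OK.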

Now go to your Drupal site, enable the apache_solr module and go to its configuration.

Solr host name: core0:Password@IP
Solr port: 8080
Solr path: /solr/core0

All done. If you have any questions, post a comment here.

Troubleshooting

Compilation of tips & tricks by myself and other people.

Fix error: "Unable to find a javac compiler"

If you get this error (under ubuntu?)

Unable to find a javac compiler;
com.sun.tools.javac.Main is not on the classpath.
Perhaps JAVA_HOME does not point to the JDK.
It is currently set to "/usr/lib/jvm/java-6-openjdk/jre"

Install the JDK and restart Jetty.

sudo apt-get install openjdk-6-jdk

My comment to this is that I've found it cumbersome mixing openjdk with Sun stuff. I'd install sun-java6-jdk (also added to the instructions above)

Notes

Although this is using packages from the Debian "unstable" distribution it is by no means an unstable set-up.

TODOs

  1. DIGEST authentication - how does this really work?
  2. SSL support

Comments

murrayw:

Many thanks. This recipe has got me up and running. I've now got basic auth working with Jetty :)

I'm a novice on Jetty config so following suggestions may be off base but maybe not.

1. It might be better to add the <security-constraint> and <login-config> sections into /usr/share/solr/WEB-INF/web.xml. That way you are setting security on just the Solr application rather than the whole Jetty server. The webdefault.xml config then doesn't have to be touched.

2. I think that realm.properties file may also need a "solr" user with a "solr-role" applied so that they can access the "Solr authenticated application". eg
solr: password, solr-role

murrayw:

I was also having massive problems getting certain JSPs to compile. The core indexing and querying was fine, as was listing the cores at /solr/admin/cores. The problematic paths were /solr, /solr/admin, /solr/core0/admin etc. The error was along the lines of:

Unable to find a javac compiler;
com.sun.tools.javac.Main is not on the classpath.
Perhaps JAVA_HOME does not point to the JDK.
It is currently set to "/usr/lib/jvm/java-6-openjdk/jre"

Thankfully I found this page after hours of banging my head:
https://bugs.launchpad.net/ubuntu/+source/solr/+bug/321889/comments/23

The solution is to install the JDK rather than the JRE. (I think the JRE might be installed as a dependency of Jetty but not 100% sure of that.) Once you install the JDK you can see all the HTML pages reporting on the state of your cores.

$ sudo apt-get install openjdk-6-jdk

The above link also reports a couple of other bugs which had me confused for a while:

1. you have to manually stop and start jetty rather than doing a restart

$ sudo service jetty stop
$ sudo service jetty start

2. Jetty binds to 127.0.0.1 rather than the hostname as suggested in the jetty doc comments.

miiimooo:

My tests show that it works best with the SUN Java SDK so I have updated the page accordingly.

Maxkupr:

It works for me but after I have changed url-pattern
from
/core0/*
to
/core0*

marcoka:

the path solr/core0 is not found

HTTP ERROR 404

Problem accessing /solr/core0. Reason:

NOT_FOUND

nerdcore:

As indicated on Debian's "Releases" page, "unstable" codenamed "sid" is for development purposes ONLY.

Just because you've used a package from sid once or even fifteen times does not mean that it should be EXPECTED to be STABLE, nor will it receive security updates in a timely fashion. "wheezy" a.k.a. "testing" is a bit closer to a stable release and is likely more appropriate if you need cutting-edge packages in your production environment.

I believe this is very bad advice, but as my previous updates have been reverted I will simply leave this little note down here at the bottom.

Also, Lenny (Debian 5.0) has not had any security updates since February 2012 and should not be used as a production environment. I would recommend upgrading to Squeeze (Debian 6.x).

USING PACKAGES FROM THE UNSTABLE DISTRIBUTION, CODENAMED "SID", IN A PRODUCTION ENVIRONMENT IS A VERY BAD IDEA.

I had originally posted this on top of the page not because it caused me any specific grief, but because it is a bad idea and could break at any update without any reasonable expectation of stability. I still think this deserves a big warning if you are going to instruct people to use this. It is considered very dangerous.

miiimooo:

Yeah that is the general approach. But for this particular situation I think it's not relevant. Have a quick think about it. You mentioned you use SOLR from tarballs. Surely that's less secure than using something that's been sitting in the Debian packages for years, quite literally.

So in conclusion, using SOLR packages from the unstable or testing distribution seems like A BETTER IDEA.