I'd be quite interested in using BOA with Nutch 2.1, sending results from web crawls to Solr 4.0 and then integrating those results into Drupal sites hosted within BOA 2.04 using Solr Views, Search API, etc.

I just got Nutch 2.1 / Solr 4.0 (Gora+MySQL) running using this:
http://nlp.solutions.asia/?p=180

Interestingly, Nutch 2.1 can also use Elasticsearch.

Comments

niccolox’s picture

ok, looks like MySQL vs MariaDB might have issues, perhaps simply with internationalization (UTF-8)

I am actually going to pause the test of merging the two installations and just use two VMs:

VM 1: Nutch 2.1 / Solr 4.0 (Gora+MySQL) running using this http://nlp.solutions.asia/?p=180
VM 2: BOA 2.04

I'm really interested in sending general web crawling results into Solr and then into Drupal Search API for Views etc.

The latest Nutch 2.1 / Solr 4.0 approach looks very promising (Nutch 1.x and the Drupal module have not been so easy to use)

omega8cc’s picture

Status: Active » Postponed

BOA comes with Solr 1.4 by default, and here is the explanation why:

http://drupal.org/node/1841230#comment-6747200
http://www.koumbit.org/en/articles/version-compatibility-chart-tomcat-ap...

We could think about an option to install Solr 3.x if you don't need backward compatibility, but not Solr 4, at least not yet, because it is not supported by either the Search API Solr Search or the Apache Solr Search Integration modules.

niccolox’s picture

thanks for the detailed and timely response

looks like Solr 4.0 support is not so far off
http://drupal.org/node/1550964#comment-6575468

Anonymous’s picture

The option to provide Solr 3.x would be greatly appreciated.

I have used Solr 3.3, 3.4, 3.5 and 3.6 with great success for Drupal 7 sites. I would love to have it included, or made the default, in BOA. It is significantly better than Solr 1.4 and works with Tomcat 6 or 7 (or even Jetty) quite easily. I am trying to use it with the BOA Solr now (3.6) and hope to succeed. I guess I misread the docs (SOLR.txt). I could not get BOA sites to run with Solr 4.0 (I changed all the files to set up Solr 4 multicore), and my first try with 3.6 didn't work: the sites couldn't connect to the Solr server. I'll try harder next weekend; otherwise, now that I've seen this issue / feature request thread, I'll use 1.4?

niccolox’s picture

Solr Nutch Search Sandbox Project Updated to Integrate with Common Schema
http://groups.drupal.org/node/273813

niccolox’s picture

well, I got Solr 3.6 and Nutch 1.6 working, more here: http://groups.drupal.org/node/273813#comment-869623

omega8cc’s picture

Component: Code » Solr/Jetty Server
Status: Postponed » Closed (duplicate)

niccolox’s picture

wonderful news !

is it possible to run Solr 1.x, 3.x AND 4.x side-by-side?

can I set _XTRAS_LIST="ALL SR1 SR3 SR4"?

niccolox’s picture

found the answer in your excellent docs !

http://drupalcode.org/project/barracuda.git/blob/HEAD:/docs/SOLR.txt

To install one or more supported versions of Apache Solr
with corresponding Jetty version, just add correct
keyword to the _XTRAS_LIST in /root/.barracuda.cnf

SR1 (for Solr 1.x with Jetty 7)
SR3 (for Solr 3.x with Jetty 8)
SR4 (for Solr 4.x with Jetty 8 or 9 on Precise)

It is even possible to add them *all* on upgrade when
you are already running now deprecated Tomcat 6 with Solr 1.x,
because new, Jetty based Solr instances use separate
ports and directories.

niccolox’s picture

ok, next question

is it possible to install them all (ALL SR1 SR3 SR4) on a clean, fresh install, without an upgrade?

omega8cc’s picture

Yes, but you would need to prepare /root/.barracuda.cnf with SR1 SR3 SR4 listed in the _XTRAS_LIST. Normally you could simply start the standard BOA in-head install and cancel it once it has created the /root/.barracuda.cnf file, then edit the file, delete the two pid files it will complain about, and start the install again to get them all installed.
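
For the "edit it" step, here is a minimal sketch of how the keywords could be added to _XTRAS_LIST without opening an editor. The helper name add_xtras is hypothetical and not part of BOA; it assumes the file already contains a single line of the form _XTRAS_LIST="..." and it does not touch the pid files mentioned above.

```shell
#!/bin/sh
# Sketch only: add_xtras is a hypothetical helper, not a BOA command.
# It appends any missing keywords to the _XTRAS_LIST="..." line of a
# barracuda.cnf-style file and leaves every other line untouched.
add_xtras() {
  cnf="$1"; shift
  # Read the current value between the double quotes.
  current=$(sed -n 's/^_XTRAS_LIST="\(.*\)"$/\1/p' "$cnf")
  for kw in "$@"; do
    case " $current " in
      *" $kw "*) ;;                              # keyword already listed
      *) current="${current:+$current }$kw" ;;   # append missing keyword
    esac
  done
  # Keep a backup copy, then rewrite the _XTRAS_LIST line from it.
  cp "$cnf" "$cnf.bak"
  sed "s/^_XTRAS_LIST=.*/_XTRAS_LIST=\"$current\"/" "$cnf.bak" > "$cnf"
}

# Example call (run as root, after the cancelled first install run):
#   add_xtras /root/.barracuda.cnf SR1 SR3 SR4
```

If the file has no _XTRAS_LIST line at all, the sketch leaves it unchanged, so it is worth checking the result with grep before restarting the install.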

niccolox’s picture

answering my own question again: the folders /opt/solr3 and /opt/solr4 are certainly installed (and I presume they work)

will check that next

this snippet seems to be good
_XTRAS_LIST="ALL BDD BND FMG GIT SR3 SR4"

niccolox’s picture

next, I need to open the firewall/csf rules and configure the hostname for Jetty, to allow Nutch on an external server to post to the BOA / Jetty hosted Solr 3
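
Once the firewall is open, a quick way to confirm that the Nutch server can actually reach Solr might look like the sketch below, run from the Nutch machine. Everything specific here is an assumption: the helper names are hypothetical, the host, port and core name are placeholders (this thread does not state which port the Jetty based Solr 3 listens on), and the check only works if the ping handler is enabled in that core's solrconfig.xml.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, not part of BOA or Nutch.

# Build the ping URL for a Solr instance; the core name is optional.
solr_ping_url() {
  host="$1"; port="$2"; core="$3"
  if [ -n "$core" ]; then
    printf 'http://%s:%s/solr/%s/admin/ping\n' "$host" "$port" "$core"
  else
    printf 'http://%s:%s/solr/admin/ping\n' "$host" "$port"
  fi
}

# Return 0 if the ping endpoint answers without an HTTP error within
# 10 seconds; curl -f turns HTTP errors (4xx/5xx) into a failure.
solr_reachable() {
  curl -fsS --max-time 10 -o /dev/null "$(solr_ping_url "$@")"
}

# Example call with placeholder values:
#   solr_reachable boa.example.com 8983 core0 && echo "Solr is reachable"
```

A timeout here usually points at the firewall, while a connection refused or an HTTP error points more towards the Jetty or Solr configuration.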

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.