Solr 4.0 is currently in alpha, and this version includes many big changes.

Some key features may be useful for the Drupal Search API:

Read Performance:
Real-time Get – The ability to quickly retrieve the latest version of a document, without the need to commit or open a new searcher
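As a rough illustration of the real-time get feature, here is a minimal Python sketch that builds a request URL for Solr's /get handler. The base URL, core name and document ID are assumptions for illustration only, and as far as I know the handler also needs the update log enabled in solrconfig.xml.

```python
from urllib.parse import urlencode


def realtime_get_url(base_url, doc_id):
    """Build a URL for Solr's real-time get handler (/get).

    base_url is the core URL, e.g. http://localhost:8983/solr/collection1
    (a hypothetical local install).
    """
    query = urlencode({"id": doc_id, "wt": "json"})
    return f"{base_url.rstrip('/')}/get?{query}"


# Hypothetical document ID; urlencode escapes the slash.
print(realtime_get_url("http://localhost:8983/solr/collection1", "node/42"))
```

Fetching that URL should return the latest version of the document even if it has not been committed yet, which is the point of the feature described above.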

Update Performance:
Atomic updates - the ability to add, remove, change, and increment fields of an existing document without having to send in the complete document again.
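To make the idea of atomic updates concrete, here is a small Python sketch that builds a JSON update body using the "set", "add" and "inc" modifiers. The field names are hypothetical, and the sketch assumes the unique key field is called "id".

```python
import json


def atomic_update(doc_id, set_fields=None, add_fields=None, inc_fields=None):
    """Return a JSON body for a Solr atomic update of one document.

    Each field is wrapped in a modifier object, so only the listed
    fields change and the rest of the document does not need to be sent.
    """
    doc = {"id": doc_id}
    modifiers = (("set", set_fields), ("add", add_fields), ("inc", inc_fields))
    for modifier, fields in modifiers:
        for name, value in (fields or {}).items():
            doc[name] = {modifier: value}
    return json.dumps([doc])


# Hypothetical fields: replace the title and increment a counter.
print(atomic_update("node/42", set_fields={"title": "New title"}, inc_fields={"views": 1}))
```

The resulting body would be posted to the core's update handler with a JSON content type.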

More Information: http://lucene.apache.org/solr/solrnews.html

In addition, Solr 4.0 requires a new schema.xml.

Comment  File                        Size       Author
#26      1676224-24--solr4.patch     103.38 KB  drunken monkey
#22      1676224-22--solr4.patch     103.3 KB   drunken monkey
#21      1676224-21--solr4.patch     99.92 KB   drunken monkey
#9       SolrPhpClient-r60-4.x.zip   265 KB     Anonymous (not verified)
#9       solrconfig-4.x.zip          24.05 KB   Anonymous (not verified)
#6       solrconfig.zip              24.83 KB   Anonymous (not verified)
#3       solrconfig_40.xml_.txt      21.64 KB   dasjo

Comments

drunken monkey’s picture

Category: feature » task

Thanks for the note!
Currently, we are developing new config files for the different Solr versions in this sandbox project, together with the apachesolr module.
There, we will of course also include 4.x config files, as soon as 4.0 is released (or maybe even a bit before that).

dasjo’s picture

i just gave this a quick try.

when indexing, i received the following error caused by the SolrPhpClient library that search_api_solr uses:

Unknown commit parameter 'waitFlush'
http://code.google.com/p/solr-php-client/issues/detail?id=75
fixed the problem by removing waitFlush from the SolrPhpClient library manually
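For illustration, this is roughly what a commit message without waitFlush looks like. It is a Python sketch, not the actual SolrPhpClient change (that library is PHP); it only shows a commit element limited to the waitSearcher and softCommit attributes.

```python
from xml.etree import ElementTree as ET


def commit_xml(wait_searcher=True, soft_commit=False):
    """Build a <commit/> update message without the waitFlush attribute."""
    attrs = {
        "waitSearcher": str(wait_searcher).lower(),
        "softCommit": str(soft_commit).lower(),
    }
    return ET.tostring(ET.Element("commit", attrs), encoding="unicode")


print(commit_xml())
```

The point is simply that waitFlush is never emitted, so Solr 4 has nothing to reject.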

still got errors:

no field name specified in query and no default specified via 'df' param
http://wiki.apache.org/solr/SchemaXml#The_Default_Search_Field
http://lucene.472066.n3.nabble.com/jira-Created-LUCENE-4339-Allow-deleti...
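One way to avoid the 'df' error is to name a default field explicitly. The Python sketch below passes df as a request parameter; the field name "text" and the base URL are assumptions, and the same default can presumably be set in the request handler defaults in solrconfig.xml instead.

```python
from urllib.parse import urlencode


def select_url(base_url, query, default_field):
    """Build a /select URL that names the default search field via df."""
    params = urlencode({"q": query, "df": default_field, "wt": "json"})
    return f"{base_url.rstrip('/')}/select?{params}"


# "text" is a hypothetical field name; use one defined in your schema.xml.
print(select_url("http://localhost:8983/solr/collection1", "sweater", "text"))
```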

halting here for now and keeping with solr 3x :)

dasjo’s picture

Title: Support Apache Solr 4.0 » Apache Solr 4.0 available
Category: feature » task
Priority: Normal » Minor
Status: Needs work » Active
FileSize
21.64 KB

here's the solrconfig that i've used.

had to declare luceneMatchVersion

<indexDefaults> and <mainIndex> needed to be removed
https://issues.apache.org/jira/browse/SOLR-1052
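The two changes above can be checked mechanically. Here is a minimal Python sketch that scans a solrconfig.xml for the removed sections and for a missing luceneMatchVersion; the sample config is made up for illustration and is not the attached file.

```python
from xml.etree import ElementTree as ET

# Sections that had to be removed for Solr 4, per the comment above.
REMOVED_SECTIONS = ("indexDefaults", "mainIndex")


def check_solrconfig(xml_text):
    """Return a list of problems found in a solrconfig.xml string."""
    root = ET.fromstring(xml_text)
    problems = [
        f"<{tag}> should be removed"
        for tag in REMOVED_SECTIONS
        if root.find(tag) is not None
    ]
    if root.find("luceneMatchVersion") is None:
        problems.append("<luceneMatchVersion> is missing")
    return problems


# Made-up 3.x-style config fragment.
old_config = """<config>
  <indexDefaults><useCompoundFile>false</useCompoundFile></indexDefaults>
  <mainIndex><unlockOnStartup>false</unlockOnStartup></mainIndex>
</config>"""

for problem in check_solrconfig(old_config):
    print(problem)
```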

also see
Apache Solr Common Configurations sandbox:
#1550964: Support Solr 4.0 schema

dasjo’s picture

Title: Apache Solr 4.0 available » Support Apache Solr 4.0
Category: task » feature
Priority: Minor » Normal
Status: Active » Needs work

updating status

Anonymous’s picture

Solr 4.1 is now available.

Using dasjo's example from #3, I get:

org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error loading class 'geocluster.GeoclusterComponent'

Anonymous’s picture

Status: Needs work » Needs review
FileSize
24.83 KB

Attached a set of working solrconfig.xml and schema.xml for Solr 4.1 (LUCENE_41 and Schema version 1.5)

Anonymous’s picture

#6 does throw another error:
org.apache.solr.common.SolrException: Unknown commit parameter 'waitFlush'
waitFlush appears to be deprecated:
http://code.google.com/p/solr-php-client/issues/detail?id=75

and a bunch of warnings like

15:25:00 WARNING XMLLoader Unknown attribute id in add:allowDups
15:25:00 WARNING XMLLoader Unknown attribute id in add:overwritePending
15:25:00 WARNING XMLLoader Unknown attribute id in add:overwriteCommitted
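These warnings come from attributes on the add element that Solr 4 no longer recognizes. As a hedged Python sketch (again, not the SolrPhpClient code), an add message can carry a single overwrite attribute instead, which as far as I know is what Solr 4 expects. The field names below are hypothetical.

```python
from xml.etree import ElementTree as ET


def add_xml(fields, overwrite=True):
    """Build an <add> message without allowDups, overwritePending
    or overwriteCommitted; only the overwrite attribute is set."""
    add = ET.Element("add", {"overwrite": str(overwrite).lower()})
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Hypothetical document fields.
print(add_xml({"id": "node/1", "title": "Hello"}))
```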
Anonymous’s picture

Regarding #7, it seems that the SolrPhpClient itself needs to be updated
https://groups.google.com/forum/?fromgroups=#!topic/php-solr-client/_KX_...

Anonymous’s picture

Also attached is the SolrPhpClient with a few very small changes for 4.x, based on the r60 version.

Solr 4.x is promising. It uses a lot less memory (about 2/3 less) and seems to run queries faster. It also supports SolrCloud, so if you have very large data sets you can scale over multiple servers, as with MongoDB.

Anonymous’s picture

Just wanted to let you know, I'm now successfully running Solr 4.1 with Search API Solr using the code from #9. Running a few million entities and a dozen facets. All good. I think Solr 4.x is the way forward.

rockaholiciam’s picture

I tried setting up Solr 4.1. It starts up fine and can be accessed via the admin panel, but it doesn't index. So I tried replacing the schema and solrconfig XML files with the ones provided in #9 and started getting the errors below:

SolrCore Initialization Failures
collection1: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load config for solrconfig.xml

There are no SolrCores running.
Using the Solr Admin UI currently requires at least one SolrCore.

Unable to load environment info from null/admin/system?wt=json.
This interface requires that you activate the admin request handlers in all SolrCores by adding the following configuration to your solrconfig.xml:

rockaholiciam’s picture

Got it working using the config files from the sandbox project mentioned in #1.

andypost’s picture

I got it to work with the client files from #9 (SolrPhpClient-r60-4.x.zip) and the sandbox config.
It seems no compatible change has been committed to the client library :( See https://code.google.com/p/solr-php-client/downloads/list

Anonymous’s picture

You're right, the client library requires a patch for 4.x. I guess technology moves so fast nowadays that even the developers aren't keeping up anymore.

rockaholiciam’s picture

What I am still unsure of is whether, having upgraded to Solr 4, the query syntax is still dismax or is now edismax. The whole point of upgrading from 3.6 was to be able to perform fuzzy queries, but either the syntax is wrong or the code is not using edismax(?), which doesn't seem right, since edismax is simply an extension of dismax with more or less the same syntax.

Anonymous’s picture

The solrconfig.xml uses edismax in one location. If by 'fuzzy search' you mean partial word search, in some cases it seems to work for me, but not consistently.

I can search for 'sweate' and get results for 'sweater'. But searching for 'weater' gives no results.

rockaholiciam’s picture

Hi morningtime, thanks for the response. That part I have already sorted out. It had to do with the difference between edgengram and ngram: ngram lets you search on strings located in the middle of words, not just those at the edges. The issue I am trying to resolve now is when someone types 'swaeter' instead of 'sweater'. It should be possible for Solr to still return results, but somehow the dismax/edismax query parsers don't seem to handle it...
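The difference between the two tokenizations can be shown with plain Python. This sketch only mimics the idea of edge n-grams versus n-grams on a single word; it is not Solr's filter implementation, and the minimum gram length of 2 is an arbitrary assumption.

```python
def edge_ngrams(word, min_len=2):
    """Prefixes of the word, similar in spirit to an edge n-gram filter."""
    return {word[:n] for n in range(min_len, len(word) + 1)}


def ngrams(word, min_len=2):
    """All substrings of at least min_len characters."""
    return {
        word[i:j]
        for i in range(len(word))
        for j in range(i + min_len, len(word) + 1)
    }


word = "sweater"
print("sweate" in edge_ngrams(word))  # True: matches from the start of the word
print("weater" in edge_ngrams(word))  # False: starts in the middle of the word
print("weater" in ngrams(word))       # True: n-grams cover the middle too
print("swaeter" in ngrams(word))      # False: a typo is not a substring
```

The last line is why neither n-gram variant helps with typos. Lucene query syntax has a fuzzy operator for that case (for example sweater~), but whether the module passes it through to the query parser is not established in this thread.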

nicksanta’s picture

After going through this thread a few times I was able to get it working. For anyone who is struggling I'm going to recap how I got the following configuration to work:

  • Apache Solr 4.1
  • Drupal 7
  • Search API Solr 7.x-1.0rc3

I'll start assuming you've gotten Solr installed and working with the default configuration.

  1. Clone the sandbox mentioned in #1 - http://drupal.org/sandbox/cpliakas/1600962
  2. Grab all of the files from the 4.x directory, and copy them into the conf directory for the core you are using. The default location of this directory for the collection 'collection1' is: ./example/solr/collection1/conf/
  3. Restart Solr and confirm it is working by visiting the admin page
  4. Download the solr php client library from #9 - SolrPhpClient-r60-4.x.zip
  5. Extract that zip into your libraries directory, overwriting the existing SolrPhpClient directory if it exists
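Step 2 above can be scripted. Here is a minimal Python sketch that copies the config files from a sandbox checkout into a core's conf directory; the function name and the example paths in the comment are assumptions, so adjust them to your own layout.

```python
import shutil
from pathlib import Path


def install_solr_config(sandbox_conf_dir, core_conf_dir):
    """Copy every file from the sandbox's 4.x directory into the core's
    conf directory, overwriting files with the same name.

    Returns the names of the copied files. Subdirectories are skipped.
    """
    src, dst = Path(sandbox_conf_dir), Path(core_conf_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        if path.is_file():
            shutil.copy2(path, dst / path.name)
            copied.append(path.name)
    return copied


# Example call with hypothetical paths (not run here):
# install_solr_config("sandbox/4.x", "./example/solr/collection1/conf/")
```

Solr still needs to be restarted afterwards, as in step 3.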
Anonymous’s picture

Cool, the sandbox from #1 seems perfect. So all you need is the library from #9.

chaby’s picture

Thanks !
Works great for me too with Solr 4.1.0 (but remove the included schema extra. Maybe just comment it out in the default schema.xml, so there is an example that isn't loaded? ...)

drunken monkey’s picture

FileSize
99.92 KB

Sorry it took me a while to get back to this issue!
Attached is a patch which just adds the Solr 4.x config files to the module. After #1846254: Remove the SolrPhpClient dependency, nothing else seems to be necessary.
Please confirm this works and I'll commit.

drunken monkey’s picture

FileSize
103.3 KB

And here's a version which also includes the appropriate INSTALL.txt changes.

dcam’s picture

I've tested #22. I'm not sure what all should be tested, but here's what I did on my dev workstation:

I downloaded and extracted Solr 4.2.1.
I copied the config files to the conf directory.
I started the example application.
The two sites I normally have using search_api_solr with a Solr 3.x application detected the 4.x server automatically.
I re-indexed the content from the two sites, about 900 nodes.

It was that easy and I was done in about 5 min. The search views on the sites work as if nothing had ever changed.

For the record, the fields that are being indexed include the usual node properties (title, status, created, changed), several text fields, some numeric fields, and several taxonomy fields.

So this is RTBC +1 from me.

drunken monkey’s picture

I've tested #22. I'm not sure what all should be tested, but here's what I did on my dev workstation:

Thanks! Testing your use cases is help enough. Of course, it could be that some niche features aren't correctly supported, then, but that can hardly ever be avoided with certainty. So, a few people making sure their use cases work is great.

Attached is a slightly altered patch, adding the changes from #1984546: Add back buildOnOptimize for the spellchecker to the Solr 4 configs (and fixing some white space). (If you tested #22, you don't need to re-test this one, I think.)

Anyone else want to test?

kaizerking’s picture

Attached is a slightly altered patch,

where is that patch forgot to attach?

drunken monkey’s picture

FileSize
103.38 KB

where is that patch forgot to attach?

Oops. Yes, thanks. Here it is.

Owen Barton’s picture

Status: Needs review » Reviewed & tested by the community

Just tested this with a fairly new site, and can confirm that searching, term/entityreference facets and autocomplete all appear to work perfectly.

drunken monkey’s picture

Status: Reviewed & tested by the community » Fixed

Thanks a lot for testing!

Committed.

hgurol’s picture

Jeeez, I am just a little late.

I was supposed to say...
Since support for 4.2.1 is not in yet and version 4.3 was just released, would it be better to delay this issue a little longer and make sure to include 4.3 support as well?

I guess it's a little too late for that now...
http://mirror.cc.columbia.edu/pub/software/apache/lucene/solr/4.3.0/chan...

drunken monkey’s picture

Is there any difference in the config files or handling? The upgrade notes for 4.2 and 4.3 don't sound like it.

Hm, OK, a short test revealed a small problem: see #1992806: Fix wrong path to contrib dir in 4.x (and comment there). If you find other problems with using Solr 4.3 with our configs, please also list them there.

Nick_vh’s picture

This does not support the softCommit as far as I can see. Please check this issue in the apachesolr queue http://drupal.org/node/1974124 for implementations and http://drupal.org/files/1550964-47_0.patch

I think Search API has the connector classes now so that should be resolved afaik. Just need to implement the boolean to allow or disallow soft commits. This will enable direct gets from Solr right after the commit.

drunken monkey’s picture

Thanks for the tip! However, I think this is already taken care of. I already included that when re-writing the connection class and here just added the configs, too.

Nick_vh’s picture

If I were you, I'd rename optimizeOrCommit to just Commit. Optimize is deprecated in Solr 4, and in Solr 3 it does not have any effect.
Where does the waitSearcher variable get populated?

drunken monkey’s picture

I didn't know it was deprecated or had no effect. The official documentation doesn't mention it either. Where have you read that?
One thing we should probably change, though, is letting the user decide how often to optimize, if ever.

$waitSearcher is passed as a parameter. Or what do you mean?

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

alesr’s picture

I updated to Search API 7.x-1.7 and Search API Solr Search 7.x-1.1, and now my Solr server cannot be reached.

I'm getting this in solr's log:

Aug 29, 2013 4:28:49 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Ping query caused exception: no field name specified in query and no default specified via 'df' param

I'm using Solr 3.6.1 and didn't notice any requirement for Solr 4.x in order to use the Search API 7.x-1.7 and Search API Solr Search 7.x-1.1 modules.
Do I need to update schema.xml?

pinkonomy’s picture

Will the 8.x version of Search API Solr use Solr 4 by default?

pinkonomy’s picture

@morningtime:
Hi, can you tell us which site you are referring to? Are you using Solr Cloud on that site?
Thanks!

pinkonomy’s picture

I read in this article: http://www.openlogic.com/wazi/bid/283711/How-to-set-up-Solr-4-2-on-Drupa...
that "As of the version of 13 April 2013, this module should work with 4.x Solr; older versions only work with 3.x Solr." This is about Apache Solr Search.
Does this also apply to Search API Solr, i.e. does the latest Search API Solr version work with Solr 4.x?
Thanks

pinkonomy’s picture

Could someone write some instructions on how to use Solr 4?

MickC’s picture

Got it going after just adding this line into schema.xml in the <fields> section.

<field name="_version_" type="long" indexed="true" stored="true"/>

This is on a Bitnami Solr server with what I believe is the default Solr 4 configuration.
I started by copying the existing 'collection1' core, then copied the Drupal Apache Solr schema.xml and solrconfig.xml
It refused to load complaining about solrconfig.xml so I copied the one from collection back.
That worked, then got an error saying the '_version_' field was required.

So in summary
- solrconfig.xml from solr
- schema.xml from the Drupal module, plus the line as above.
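The schema change above can also be applied with a script. This Python sketch adds the _version_ field to the fields section of a schema.xml string if it is not there yet. It assumes the schema defines a "long" field type, and note that ElementTree drops XML comments when it rewrites the file, so editing by hand may be preferable for a heavily commented schema.

```python
from xml.etree import ElementTree as ET


def ensure_version_field(schema_xml):
    """Return schema XML with a _version_ field inside <fields>."""
    root = ET.fromstring(schema_xml)
    fields = root.find("fields")
    if fields is None:
        raise ValueError("schema has no <fields> section")
    names = [field.get("name") for field in fields.findall("field")]
    if "_version_" not in names:
        ET.SubElement(
            fields,
            "field",
            {"name": "_version_", "type": "long", "indexed": "true", "stored": "true"},
        )
    return ET.tostring(root, encoding="unicode")


# Made-up minimal schema for illustration.
sample = '<schema name="drupal" version="1.5"><fields><field name="id" type="string"/></fields></schema>'
print(ensure_version_field(sample))
```

Running it twice on the same schema leaves a single _version_ field in place.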