I am trying to get this module working with Solr 4.0 (next major release).
There seem to be differences in how the solrconfig.xml needs to be prepared for 4.0 versus 3.5 (current stable release).
Does anyone have experience using this module with 4.0 who could share some tips or a copy of their configs?
Are there any plans to release a version of this module configured for 4.0?
Thanks for any tips you can offer.


Comments

Nick_vh’s picture

Version: 6.x-2.0-beta5 » 7.x-1.x-dev

First of all, can we start using D7 for Solr 4.0? See if that works?

Secondly, can you list the errors that show up?

Thanks for your efforts!

C-P’s picture

If we look at the solrconfig.xml in the example directory of the Solr distribution, we can see that its structure changed in Solr 4.0 compared with 3.x. For example, a single <indexConfig> section is used instead of the <indexDefaults> and <mainIndex> sections, and some other options were added or removed. Therefore the solrconfig.xml shipped with the module (for both D6 and D7) does not work with Solr 4.0.

If I try to use it with 4.0, I get an error at localhost:8080/solr:

This interface requires that you activate the admin request handlers, add the following configuration to your solrconfig.xml:

<!-- Admin Handlers - This will register all the standard admin RequestHandlers. -->
<requestHandler name="/admin/" class="solr.admin.AdminHandlers" />

The error persists even if I replace the old <requestHandler name="/admin/"... line in the module's solrconfig.xml with this one.

Has anybody succeeded in building a solrconfig.xml for the apachesolr module and Solr 4.0?

pwolanin’s picture

Status: Active » Postponed

Not supported. Postponed until 4.0 is released.

drzraf’s picture

FileSize
1.25 KB

The attached patch did the trick for me.
Additionally the following patch will need to be ported :
https://code.google.com/p/solr-php-client/issues/detail?id=75

drzraf’s picture

In /admin/reports/apachesolr, apachesolr_index_report should be updated, because $solr->getLuke(2); no longer appears to return $data->index->numTerms.

tinflute’s picture

Thank you thank you thank you for posting this patch.
Will test this out asap.

merryseeker’s picture

I'm having the same issue. I'm just learning Drupal and I have tried to apply the patch provided (solr4-solrconfig-xml.patch). I'm not sure exactly what drzraf means by "Additionally the following patch will need to be ported :
https://code.google.com/p/solr-php-client/issues/detail?id=75". Are further instructions available, or would it be advisable, given my lack of experience, to go back to a previous version of Apache Solr such as 3.6.1?

tinflute’s picture

If you are new to this, then yes, you should definitely stick to an earlier Apache Solr release like 3.6.x, which works out of the box with this module. You don't need to think about the (not yet released) Solr 4.0 unless you have special search needs, e.g. grouping/collapsing fields.

merryseeker’s picture

So I installed version 3.6.1 of Apache Solr. A test of the server in Drupal shows that it connects.
When I run an index in Drupal, it shows:

5 items successfully processed. 0 documents successfully sent to Solr.

If I "View more details on the search index contents" the report shows the following error message:

Notice: Undefined property: stdClass::$numTerms in apachesolr_index_report() (line 556 of C:\Documents and Settings\Lori\Desktop\commerce_kickstart-7.x-1.8\sites\all\modules\apachesolr\apachesolr.admin.inc).

tinflute’s picture

First go to your Solr admin panel to check whether the items are indeed indexed there.
On a Tomcat server, the Solr panel is at yoursite.com:8080/solr/admin or yoursite.com:8683/solr/admin
If that page doesn't load, then Solr isn't installed (or it's not running on Tomcat).
If it does, then click 'full interface' and run a Solr query to see what's in the index. (some docs here)
If you get your 5 items back, then your stuff is being indexed.
Try to set up some queries on your site, using for example the Solr views module or whatever you want. If that's working, then you're golden; who cares about the error message.

sylus’s picture

Title: Does v6.x support Solr 4.0. How to configure solrconfig.xml » Support Solr 4.0 schema. How to configure solrconfig.xml (backport to 6.x branch)
Status: Postponed » Active

As of August 14th, the 4.0 series of Apache Solr has reached beta status.

It would be great to try to attach a supported 4.x schema + solrconfig to support this version.

As per: http://wiki.apache.org/solr/Solr4.0 they commit to:

4.0 (final) release no sooner then 30 days after 4.0-BETA. The final release may contain additional features and API additions compared to the beta release, but should not change any APIs (or the index format) from the beta release unless absolutely necessary to fix a bug.

Is this enough to change the status to active? (Please change it back if that is too presumptuous.)

Of course apachesolr works great for the 3.6 version but the new interface in 4.0 is arguably improved! :)

rajivk’s picture

FileSize
9.09 KB
rajivk’s picture

I am very sorry: I attached a patch for apache_solr_autocomplete here by mistake and I don't know how to remove it.

Nick_vh’s picture

No problem. Just create a new one in the respective queue. We'll ignore it here! :)

rajivk’s picture

I am running Solr 4.0 Beta on my development site after removing all instances of waitFlush, and it is running fine except for error messages and missing statistics on indexed items (number of items indexed and number of distinct items for each term).

rjbrown99’s picture

I'm playing around with Solr 4.0 beta with Drupal 6 and will post thoughts here.

Item #1: The solr-php-client needs an update, or commit and optimize may not work. This will cause me to roll forward and test with solr-php-client trunk.

Update: This is an issue with apachesolr 6.x-3.x and 7.x as well since the Drupal_Apache_Solr_Service.php class uses optimize. The waitFlush parameter is also gone which is a second issue.

The fix for optimize is the same as this update to solr-php-client.

The fix for waitFlush is to remove it, per this documentation.

rajivk’s picture

But how do you test the Apachesolr Integration module with solr-php-client, other than by porting some changes into Drupal_Apache_Solr_Service, given that the two have diverged so much? I tried porting the change from solr-php-client's Service.php and replaced optimize with expungeDeletes as in solr-php-client trunk, but I still got the error that waitFlush was not found. Also, numTerms is not used in Service.php, so maybe we have to change the code of ASIM to request it from the Term Vector Component.

rjbrown99’s picture

I have two choices - backport the fix from the later release of solr-php-client to work with Drupal's apachesolr module, or update the Drupal apachesolr module to work with the new solr-php-client. I'm not yet sure of which direction to take but I am more than open to suggestion from those that have been down this path. I'm off to research the issue queue and code.

rajivk’s picture

First of all let us have a list of changes in solr-php-client which are useful for DrupalApacheSolr.php.

rjbrown99’s picture

FileSize
2.43 KB

OK, here is a patch and a rough order of what I did to make it work.

First, I set up Solr 4.0. This involved a number of changes to solrconfig.xml. I started with my old solrconfig.xml and added a few things, such as luceneMatchVersion (I'm using LUCENE_40), and replacing both the indexDefaults and mainIndex sections with indexConfig. The patch from #4 was helpful. Because I moved to LUCENE_40 for the version I also deleted my entire index and started from scratch. I can post my solrconfig.xml if needed. I did not change my schema - not a single line was modified. I am using schema drupal-2.0-beta4, version 1.2. Yes, I use a slightly modified 6.x-2.x. Blah blah unsupported, it still has features missing from 6.x-1.x.
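
To make the solrconfig.xml change described above more concrete, here is a rough sketch of the two layouts side by side. Only luceneMatchVersion, LUCENE_40 and the section names (indexDefaults, mainIndex, indexConfig) come from this thread; the child settings and their values are placeholder assumptions, not a copy of my actual config. Both fragments live inside the top-level <config> element.

```xml
<!-- Solr 3.x style: index settings split across two sections.
     The child elements and values here are illustrative assumptions. -->
<indexDefaults>
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <mergeFactor>10</mergeFactor>
  <lockType>native</lockType>
</indexDefaults>
<mainIndex>
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <mergeFactor>10</mergeFactor>
</mainIndex>

<!-- Solr 4.0 style: one indexConfig section replaces both of the above,
     and luceneMatchVersion states which Lucene behaviour to match. -->
<luceneMatchVersion>LUCENE_40</luceneMatchVersion>
<indexConfig>
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <mergeFactor>10</mergeFactor>
  <lockType>native</lockType>
</indexConfig>
```

If you change luceneMatchVersion on an existing core, plan on deleting the index and reindexing from scratch, as I did.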

On to the patch... This is not intended to be considered for a commit, it just illustrates what changes I made to get it working. I am using the apachesolr 6.x module code, and the patch is against the SolrPhpClient, revision 22 - which is the default suggested client version. I first started playing around with the 6.x-3.x branch and its built-in Solr client code, then I tested with later revisions of SolrPhpClient (including trunk), but I finally came back around to the recommended v22 as it had the smallest number of changes required to make it work.

1) Modified the apachesolr module's Drupal_Apache_Solr_Service.php file and changed the commit function to call the variable $expungeDeletes to match the patch from solr-php-client that I linked to in post 16 above. One important note - the previous $optimize defaulted to TRUE, and $expungeDeletes defaults to FALSE. I did not investigate any upstream calls for $optimize, but I expect that this may be an issue because we are now including a different variable with different behavior in the same position. This still needs investigating as I can see that apachesolr.module calls optimize in at least one place.

2) Modified the solr-php-client in accordance with the patch from post 16 above. $optimize is gone and is replaced by $expungeDeletes.

That was it. I reindexed a development site with about 400K nodes with no errors or issues. Faceting and filtering still seem to work. I have not yet explored all of it but so far so good. My ultimate goal is to introduce NRT soft commit functionality for immediate updating, which is why I'm rolling down this path in the first place.

Hope that helps and inspires more people to try this. If all of my testing is successful I will be moving to production with 4.0 at some point in the near future.

rjbrown99’s picture

Tonight I learned something new about Solr in my quest to migrate. I'm coming from 1.4.1 straight to 4.0 beta.

I have a taxonomy field that was previously defined as im_vid_3_myfield, which contained a single numeric taxonomy term. I am using this field with apachesolr_views to perform sorting. This was working just fine for me, except now in Solr 4 I get a SEVERE error. That's because Solr 4 won't allow sorting on multivalued fields, and fields defined as im_* are multivalue. In my case I was only ever storing one value in the field which is why it used to work: older versions of Solr were more forgiving and would treat that as a single value field.

I had to redefine my field as is_vid_3_myfield and reindex. This is now in progress and I expect it to work, but as a note to the community I would recommend evaluating your multivalue fields in relation to sort behavior. If you are doing multivalue+sort, you should redefine the fields now before you waste cycles indexing and then reindexing.
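
For anyone unsure what the prefix change means on the Solr side, the dynamic field definitions in schema.xml look roughly like the sketch below. The is_* and im_* prefixes are the ones discussed above; the type name "integer" and the indexed/stored values are assumptions on my part, so check them against the schema.xml you actually run.

```xml
<!-- is_*: single-valued integer field, safe to sort on.
     The type name "integer" is an assumption; use the type your schema defines. -->
<dynamicField name="is_*" type="integer" indexed="true" stored="true" multiValued="false"/>

<!-- im_*: multi-valued integer field; newer Solr versions refuse to sort on it,
     even if every document happens to store only one value. -->
<dynamicField name="im_*" type="integer" indexed="true" stored="true" multiValued="true"/>
```

The only difference that matters for sorting is multiValued, which is why renaming the field from im_vid_3_myfield to is_vid_3_myfield and reindexing is enough.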

According to this commit, it appears this new sanity check was added in Solr 3.x. So my note only really applies to folks upgrading from 1.4.x.

Update: Also see #1635624: Cannot sort by Taxonomy fields with Solr 3.5 for this issue.

pwolanin’s picture

In the newer versions of the module we are indexing the 1st value of any field as an additional single-value field for just this reason (and since in CCK or fields API, a value may at any time be switched in the UI from single to multi valued).

rjbrown99’s picture

Thanks for the tip, that's funny - same approach I came up with and implemented for my 6.x tree. Next time I need to read the newer versions of the code, it could save me some time.

rjbrown99’s picture

For what it's worth, I have been in production with Solr 4 (with Drupal 6) for the past month. So far - bulletproof. No issues at all on searching or faceting, and I defaulted to using soft commits. Working quite well. I did incorporate pwolanin's feedback from #22 and re-implemented the indexing of the 1st value of fields as single value to fix my other issues.
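
I have not described here exactly how the soft commits are wired up, so treat the following as one possible sketch rather than my production config. It assumes the autoCommit and autoSoftCommit settings of the Solr 4 update handler in solrconfig.xml; the time values are placeholders chosen for illustration.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes to disk periodically, but does not open a new searcher.
       The 15 second interval is a placeholder assumption. -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: makes new documents searchable quickly (near real time)
       without the cost of a hard commit. The 1 second interval is a placeholder. -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

The idea is that hard commits handle durability while soft commits handle visibility, which is what makes immediate updating affordable.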

Nick_vh’s picture

Title: Support Solr 4.0 schema. How to configure solrconfig.xml (backport to 6.x branch) » Support Solr 4.0 schema

renaming this thread to reflect the challenges tackled here

Nick_vh’s picture

Status: Active » Needs review
FileSize
2.56 KB
61.57 KB

Making minor changes to the schema so Solr 4.0 at least starts. Have not tested indexing yet but this is a good start

Status: Needs review » Needs work

The last submitted patch, 1550964-26-diffsolrconfig.patch, failed testing.

Nick_vh’s picture

Status: Needs work » Needs review
FileSize
67.74 KB

With this patch you can already index and search.

Note : This is just to make it work, this patch does need work as it is not backwards compatible with 3.x and 1.4...

Nick_vh’s picture

FileSize
6.17 KB

The difference for the service class

Nick_vh’s picture

FileSize
67.71 KB

This should be a backwards compatible patch for all solr versions. oh, and indexing works :)

Nick_vh’s picture

And only the diff for the solr service, to make reviewing easier

devtherock’s picture

Any good news for solr4.0 users :)

Nick_vh’s picture

Was that a question? Please try the patch, it should work for solr 4.0

devtherock’s picture

Hi Nick

That was a question :). Yes, I applied patches #28 and #29 to the conf files. Indexing and search work fine, but Solr throws an error when deleting the index, and the module isn't showing the status of content in the Drupal backend. I was wondering if anyone has a completely upgraded module and configuration files for Solr 4.0.

Thanks for the patch :)

Thanks
Kuldev

Nick_vh’s picture

Can you be a bit more explicit as to which actions fail?
Screenshots or a list with actions would be super handy

devtherock’s picture

For everyone who is facing the two issues below:

1. Apache Solr is not showing content status when indexing content from the backend interface

Solution: Add const STATS_SERVLET = 'admin/mbeans?wt=xml&stats=true'; to the "Drupal_Apache_Solr_Service.php" file around line number 78.

2. An "optimize" or "waitFlush" error appears in the Solr logs

Solution: Find the "commit" tag in "Drupal_Apache_Solr_Service.php", replace the "optimize" attribute with "expungeDeletes", and remove the "waitFlush" attribute. Also find the "optimize" tag and remove its "waitFlush" attribute.

Things should work nicely after the above changes. Hope my lil contribution will be helpful to others. :)
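
To illustrate the second fix, these are roughly the XML update messages involved. The attribute names come from this thread and the linked solr-php-client issue; the true/false values are assumptions for illustration, and each element is sent to Solr as its own message rather than all together.

```xml
<!-- Old commit message: Solr 4.0 rejects it with
     "Unknown commit parameter 'optimize'" (waitFlush is gone as well). -->
<commit optimize="true" waitFlush="true" waitSearcher="true" />

<!-- Commit message after the change: optimize replaced by expungeDeletes,
     waitFlush removed. Note that expungeDeletes is not the same operation
     as optimize, so the value here is an assumption. -->
<commit expungeDeletes="false" waitSearcher="true" />

<!-- Optimize message after the change: waitFlush removed. -->
<optimize waitSearcher="true" />
```

Because expungeDeletes behaves differently from optimize, code that relied on commit also optimizing the index should send a separate optimize message instead.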

Nick_vh’s picture

Status: Needs review » Needs work

devtherock, have you actually tried my patch? I add a STATS_SERVLET especially for Solr 4. Please see and review that. As for the other changes, can you make a patch that complements my patch in #31?

j0rd’s picture

@Nick_vh Thanks for your work on this. I personally need solr for field aliases, and returning pseudo fields (think geodist()) in the result documents.

I've hit a bug with deleting a full index. I think I've tracked it down to this Solr issue:

"SEVERE: org.apache.solr.common.SolrException: Unknown commit parameter 'optimize'"
http://code.google.com/p/solr-php-client/issues/detail?id=75

Code looks like it's in ./Drupal_Apache_Solr_Service.php in function commit();

Attached is a patch to remove waitFlush & optimize. I'm not sure if this is the correct way to go about things, and someone with more solr knowledge should take a look and resolve any outstanding issues.

heacu’s picture

patches above work well for me so far... solr 4.0 has so many new killer features!

Sylvain Lecoy’s picture

If patch is working for you feel free to put this issue as "Reviewed and Tested By Community" aka RTBC. :)

j0rd’s picture

Status: Needs work » Reviewed & tested by the community

With my additional patch, I've had no issues using solr 4 for the past 1.5 weeks under heavy development while grepping the logs.

Nick_vh’s picture

Status: Reviewed & tested by the community » Needs work

The additional patch is not backwards compatible with Solr 3.6 so we should work on that

Nick_vh’s picture

FileSize
106.23 KB

Adding a patch that includes support for all solr versions and includes the schema and config from the common solr initiative for Solr 4.0

Nick_vh’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 1550964-43.patch, failed testing.

Nick_vh’s picture

Status: Needs work » Needs review

Common configs come from here : http://drupal.org/node/1857862

Nick_vh’s picture

FileSize
106.23 KB

Status: Needs review » Needs work

The last submitted patch, 1550964-47.patch, failed testing.

Nick_vh’s picture

Status: Needs work » Needs review
FileSize
107.09 KB

meh, cached diff

Nick_vh’s picture

Committed, we will do the rest in a follow-up

Nick_vh’s picture

Version: 7.x-1.x-dev » 6.x-3.x-dev
Status: Needs review » Patch (to be ported)

Also, do we want to support this for 6.x-3.x? If you have additions to this committed fix, please open a new issue for 7.x-1.x!

j0rd’s picture

Personally I'm not using 6.x, but no reason not to support solr 4.x in D6.

Solr is kind of a black box solution, and as long as you have some glue code in the module for it, no reason 6.x can't get solr 4.x as well.

---
For those who are curious what you can do with 4.x and not with 3.x (at least not as easily or as performantly):

I've made a pretty awesome and performant Google Map using apachesolr + facetapi + field aliasing + pseudo fields, and instead of re-encoding the JSON, I just return it directly from Solr. I also have a helper function which returns the facetapi HTML for my Solr query's facets; I merge that with my results from Solr, replace the facet HTML in my sidebar, and update the markers and HTML facets when the map moves. I use field aliasing to shorten the field names, thus reducing the size of the returned JSON. With 100s of markers per ajax request this can be a significant difference. With pseudo fields, I don't always need to score via distance and can add it as a field instead.

Now if only Solr could do geospatial bundling / clustering of markers server side, I'd be set. I think with Solr 4.x it may be possible, but I haven't looked into it too much.

dirtabulous’s picture

Status: Patch (to be ported) » Needs review
FileSize
107.09 KB

I created a patch for 6.x-3.x based on the one above. It appears to work as expected with Solr4, though I currently have a very basic solr setup.

jvandyk’s picture

Patch applied to 6.x-3.x-dev from 2012-Nov-08. Tested with Solr 4 on a stock Drupal 6.27 site. Created a couple of Basic Pages and indexed them. Search for term on page worked fine.

j0rd’s picture

For all those using Solr 4.0, please watch out for this bug.
#1874420: Solr4 Entites Not Being Removed with deleteByQuery

AntiNSA’s picture

In the 6.x-3.x dev version I can't see any solr-4 in the solr-conf dir... When will Solr 4 be supported for Drupal 6?

AntiNSA’s picture

Priority: Normal » Critical
AntiNSA’s picture

Any reason why the March 09 dev version doesn't include this patch?

j0rd’s picture

Priority: Critical » Normal

AntiNSA, there's a patch in #53. If you want Solr4 support for D6 you should apply that patch, test it, and give feedback on whether it works or not.

Reason it's not committed, is because it's not committed. Most likely due to the fact that it's not widely tested. Your moaning isn't helping with that. If you want it committed, test the patch and let the developers know if it works for you. Fix / report any problems you encounter, and help move this forward.

As for the Major priority. I don't think it's major. ApacheSolr module works fine with Solr 3. If Solr4 is a major requirement for your project, I suggest you either install, test and fix any problems and then provide the developer with a properly working patch, or else pay him for his time to do your work for you.

Anonymous’s picture

No disrespect to anyone here (#59), but I applaud j0rd for articulating how one should view the "give-and-take" nature of working within an open source community while trying to figure things out.

Peace,
Michael Clendening

AntiNSA’s picture

Hey pal... open source goes beyond the computer. I'm using it with an open learning project with thousands of students in China. If you can do my job teaching, I can have more time to learn how to write patches. At this time I don't have that time....

So try to broaden your scope from the open source development side of things to the people who actually implement the open source solutions and their environment.

Barracuda just released support for Jetty/Solr 4. Jetty 9 is working with Solr 4 and Solr 3 is working with Jetty 8.... I don't even know the difference between the two, but would welcome any increase in efficiency from a newer rev in all cases.

soulfroys’s picture

@AntiNSA, The vast majority of people who use drupal are in a situation like yours, including me, but... please, read the Issue Queue Etiquette:

Do: Be grateful. Talented, hard working people are giving you their time for free. Saying "thank you" costs you nothing but makes the person who has just given you 10 minutes (or 30 minutes) of their day glad that they did so.
Don't: Be ungrateful. It can be very frustrating when something doesn't work as you'd hoped but don't take this out on the developer or the issue queue.

(Don't get me wrong, I Wish You All The Luck!)

Nick_vh’s picture

Aside from the netiquette that is very well explained above, I don't see any actionable items. If someone is willing to mark this as RTBC, I'm willing to commit this. Also, to make sure: this discussion is only for the backport of the patch that was committed to the D7 module.

Nick_vh’s picture

#53: 1550964-53_0-6.x-3.x.patch queued for re-testing.

Nick_vh’s picture

Status: Needs review » Reviewed & tested by the community
pwolanin’s picture

Status: Reviewed & tested by the community » Fixed

Committed.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.