Summary: I have a fairly clean install of Drupal 7 with Apachesolr-1.0-beta18. I have created a content type called document with a number of fields. I am working with 30k+ records, most of which are related to "Marion, IA" in some way. A search for "library" (without the quotes) returns no results, while a search for "marion library" returns thousands of results. That doesn't make any sense to me at all.

Details:

  • Drupal 7 (latest stable version)
  • Apachesolr-1.0-beta18
  • Custom content type with many fields
  • LAMP stack running on a CentOS Linode
  • PHP 5.2.x

I also ran the same searches through the Solr admin interface and got similar results, so I can't rule out that something is configured wrong on the Solr side. However, I am using the solrconfig.xml and schema.xml files provided with the modules, so the issue could lie there as well. I watched the logs during the searches that should return results but don't, and there is no output besides the regular [INFO] line about the query.

I am stumped and I am past a deadline with this project, so any help would be greatly appreciated.

UPDATE: With some help, I got this figured out. The fields containing the text for "library" were not being indexed because of the display settings.

To fix the issue:

  • Add a custom display for "search index" under Admin > Structure > Content types > [Your content type] > manage display, in the Custom Display settings fieldset.
  • Make sure to switch to the "search index" display settings. This is the step I forgot, which made the process take longer.
  • Set all the fields that you want to be indexed to not be hidden.
  • Re-index.
  • Profit!

Comments

nick_vh

Can you copy and paste the exact query as it is being sent to Solr here?

pyrello

Here are the relevant lines from the log:

Apr 4, 2012 10:53:36 AM org.apache.tomcat.util.http.Parameters processParameters
INFO: Invalid chunk starting at byte [16] and ending at byte [16] with a value of [null] ignored
Apr 4, 2012 10:53:36 AM org.apache.solr.core.SolrCore execute
INFO: [marion] webapp=/solr-multicore path=/select params={f.itm_field_document_year.facet.mincount=1&spellcheck=true&facet=true&f.sm_field_document_month.facet.limit=50&facet.mincount=1&spellcheck.q=library&qf=taxonomy_names^2.0&qf=label^5.0&qf=content^40&qf=tos_content_extra^0.1&qf=tos_name^3.0&f.itm_field_document_day.facet.limit=50&hl.fl=content&json.nl=map&f.im_field_document_pubtitle.facet.limit=50&wt=json&rows=10&f.itm_field_document_day.facet.mincount=1&f.sm_field_document_month.facet.mincount=1&f.itm_field_document_year.facet.limit=50&fl=id,entity_id,entity_type,bundle,bundle_name,label,is_comment_count,ds_created,ds_changed,score,path,url,is_uid,tos_name&start=0&facet.sort=count&q=library&f.im_field_document_pubtitle.facet.mincount=1&facet.field=im_field_document_pubtitle&facet.field=itm_field_document_year&facet.field=sm_field_document_month&facet.field=itm_field_document_day} hits=0 status=0 QTime=3 

As you will notice, the "Invalid chunk" message from Tomcat seems to indicate what the problem might be. I had missed it earlier.

nick_vh

solr/select?qf=taxonomy_names^2.0&qf=label^5.0&qf=content^40&qf=tos_content_extra^0.1&qf=tos_name^3.0&hl.fl=content&wt=xml&rows=10&fl=id,entity_id,entity_type,bundle,bundle_name,label,is_comment_count,ds_created,ds_changed,score,path,url,is_uid,tos_name&start=0&q=library&indent=true&debugQuery

In the worst case you can try this query:
solr/select/?q=library&rows=10&wt=xml&indent=true&debugQuery

I removed the facet code from your query. Would you be so kind as to run this query directly against your Solr instance?
As far as I can see, it should query the content field. If your fields are hidden in the display settings, it is possible that Solr can't find your content. Make sure the "search index" display mode is set up correctly with the right fields.
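For anyone who wants to reproduce the two debug queries above without hand-editing URLs, here is a minimal sketch in Python. The host, port, and core path in `SOLR_SELECT` are assumptions (only the webapp name and the core name come from the log), and `build_debug_url` is a hypothetical helper, not part of the apachesolr module. It sets `debugQuery=on` explicitly so that the debug section is returned.

```python
from urllib.parse import urlencode

# Assumption: host, port, and core path are placeholders for your own setup.
SOLR_SELECT = "http://localhost:8080/solr-multicore/marion/select"


def build_debug_url(term, query_fields=None, base=SOLR_SELECT):
    """Build a select URL with debugging on and no facet parameters."""
    params = [
        ("q", term),
        ("rows", "10"),
        ("wt", "xml"),
        ("indent", "true"),
        ("debugQuery", "on"),
    ]
    # qf may be repeated, so pairs are kept in a list rather than a dict.
    for field in query_fields or []:
        params.append(("qf", field))
    return base + "?" + urlencode(params)


# Boosted query fields copied from the logged request.
FULL_QF = [
    "taxonomy_names^2.0",
    "label^5.0",
    "content^40",
    "tos_content_extra^0.1",
    "tos_name^3.0",
]

full_url = build_debug_url("library", FULL_QF)
minimal_url = build_debug_url("library")
print(full_url)
print(minimal_url)
```

Paste either printed URL into a browser or fetch it with curl. Note that `urlencode` percent-encodes the `^` in the boosts, which Solr decodes again on its side.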

pyrello

Okay... I tried adding "Search Index" as a custom display and set every field to not be hidden. I then deleted the index and re-indexed. I am still getting no results for "library" (without quotes).

I notice in the Tomcat logs that I am still seeing the following message whenever I run the "library" search through my Drupal site:

Apr 4, 2012 11:43:26 AM org.apache.tomcat.util.http.Parameters processParameters
INFO: Invalid chunk starting at byte [16] and ending at byte [16] with a value of [null] ignored

When I run the first query above (I added "=on" to the very end to get the debugging to work), I get the following response back:

<response>
<result name="response" numFound="0" start="0" maxScore="0.0"/>
<lst name="highlighting"/>
<lst name="debug">
<str name="rawquerystring">library</str>
<str name="querystring">library</str>
<str name="parsedquery">
+DisjunctionMaxQuery((content:librari^40.0 | taxonomy_names:librari^2.0 | label:librari^5.0 | tos_name:librari^3.0 | tos_content_extra:librari^0.1)~0.01) DisjunctionMaxQuery((content:librari^2.0)~0.01)
</str>
<str name="parsedquery_toString">
+(content:librari^40.0 | taxonomy_names:librari^2.0 | label:librari^5.0 | tos_name:librari^3.0 | tos_content_extra:librari^0.1)~0.01 (content:librari^2.0)~0.01
</str>
<lst name="explain"/>
<str name="QParser">DisMaxQParser</str>
<null name="altquerystring"/>
<null name="boostfuncs"/>
<lst name="timing">
<double name="time">2.0</double>
<lst name="prepare">
<double name="time">2.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">1.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">1.0</double>
</lst>
<lst name="org.apache.solr.handler.component.SpellCheckComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">0.0</double>
</lst>
</lst>
<lst name="process">
<double name="time">0.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.SpellCheckComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">0.0</double>
</lst>
</lst>
</lst>
</lst>
</response>

When I run the second query above (with the same modification), I get the following results:

<response>
<result name="response" numFound="0" start="0"/>
<lst name="highlighting"/>
<lst name="debug">
<str name="rawquerystring">library</str>
<str name="querystring">library</str>
<str name="parsedquery">
+DisjunctionMaxQuery((content:librari)~0.01) DisjunctionMaxQuery((content:librari^2.0)~0.01)
</str>
<str name="parsedquery_toString">+(content:librari)~0.01 (content:librari^2.0)~0.01</str>
<lst name="explain"/>
<str name="QParser">DisMaxQParser</str>
<null name="altquerystring"/>
<null name="boostfuncs"/>
<lst name="timing">
<double name="time">0.0</double>
<lst name="prepare">
<double name="time">0.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.SpellCheckComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">0.0</double>
</lst>
</lst>
<lst name="process">
<double name="time">0.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.SpellCheckComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">0.0</double>
</lst>
</lst>
</lst>
</lst>
</response>
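The two values worth reading in responses like these are `numFound` and `parsedquery`. Here is a minimal sketch that pulls them out with Python's standard library; `summarize_debug` is a hypothetical helper and `SAMPLE` is a trimmed copy of the second response above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the second debug response from this thread.
SAMPLE = """<response>
<result name="response" numFound="0" start="0"/>
<lst name="debug">
<str name="rawquerystring">library</str>
<str name="parsedquery">
+DisjunctionMaxQuery((content:librari)~0.01) DisjunctionMaxQuery((content:librari^2.0)~0.01)
</str>
</lst>
</response>"""


def summarize_debug(xml_text):
    """Return the hit count and parsed query from a Solr XML debug response."""
    root = ET.fromstring(xml_text)
    result = root.find("./result[@name='response']")
    parsed = root.find("./lst[@name='debug']/str[@name='parsedquery']")
    return {
        "num_found": int(result.get("numFound")),
        "parsed_query": parsed.text.strip() if parsed is not None else None,
    }


summary = summarize_debug(SAMPLE)
print(summary["num_found"], summary["parsed_query"])
```

A `num_found` of 0 together with a parsed query that only touches `content` suggests the term is simply not present in the indexed `content` field.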

Thanks for the prompt responses. It is greatly appreciated!

Sean

nick_vh

It seems it is searching for "librari" instead of "library"? Did you modify any other Solr config files?

http://www.junlu.com/list/15/972593.html might be able to help you. It could be that your Solr is corrupt. Are you running Tomcat or Jetty? For debugging purposes I'd recommend Jetty.
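A note on "librari" for later readers: it is most likely just the stemmed form of "library" rather than a sign of corruption, assuming the schema applies a Snowball/Porter-style English stemmer to the text fields at both index and query time. The sketch below implements only one rule of that stemmer (step 1c, the trailing y to i rewrite), not the full algorithm, to show where the spelling comes from. `snowball_step_1c` is a hypothetical name.

```python
# In the Snowball English stemmer, "y" counts as a vowel for this rule.
VOWELS = set("aeiouy")


def snowball_step_1c(word):
    """Apply only step 1c of the Snowball English stemmer.

    A trailing y or Y becomes i when the letter before it is a non-vowel
    and that letter is not the first letter of the word.
    """
    if len(word) > 2 and word[-1] in "yY" and word[-2] not in VOWELS:
        return word[:-1] + "i"
    return word


for term in ("library", "cry", "by", "say"):
    print(term, "->", snowball_step_1c(term))
```

Because the same analysis runs on indexed text and on the query, "library" in a document and "library" in a search both end up as "librari" and still match.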

pyrello

Okay, I just discovered that earlier, when I thought I was setting up the "Search Index" display settings, I was actually just changing the default display settings and leaving the search index settings as they were. I am going to re-index and see if this makes a difference.

Thanks!

Sean

pyrello

This fixed the issue I was having. Thanks for all the help!

nick_vh

It would be good for others to know how you fixed it.

pyrello

Status: Active » Fixed

When I created the content type "Document," it did not automatically have a "Search Index" custom display. I had set all except one field to not display in the default display settings. So, it appears that only that one field was being indexed (which explains why the indexing went so fast for 30k+ nodes).

To fix the issue:

  • Add a custom display for "search index" under Admin > Structure > Content types > [Your content type] > manage display, in the Custom Display settings fieldset.
  • Make sure to switch to the "search index" display settings. This is the step I forgot, which made the process take longer.
  • Set all the fields that you want to be indexed to not be hidden.
  • Re-index.
  • Profit!
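To confirm the fix after re-indexing, a quick check is to compare the hit count for a term such as "library" before and after. Here is a minimal sketch, assuming the select handler returns JSON when `wt=json` is set (as in the logged request). `parse_hits` and `count_hits` are hypothetical helpers, and the select URL you pass in is whatever your own setup uses.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def parse_hits(body):
    """Extract the total hit count from a Solr JSON response body."""
    return json.loads(body)["response"]["numFound"]


def count_hits(select_url, term):
    """Ask Solr how many documents match term; rows=0 skips the documents."""
    query = urlencode({"q": term, "rows": 0, "wt": "json"})
    with urlopen(select_url + "?" + query) as resp:
        return parse_hits(resp.read().decode("utf-8"))


# Offline example using the shape of a response with no matches.
sample = '{"response": {"numFound": 0, "start": 0, "docs": []}}'
print(parse_hits(sample))
```

Call `count_hits` with your own select URL once before changing the display settings and once after the re-index. A count that moves from 0 to something non-zero indicates the fields are now reaching the index.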

Many thanks to @Nick_vh for his help in figuring out this issue!

pyrello

Updated issue summary.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous

Adding fix information.