Make indexing comments optional.

robertDouglass - September 17, 2009 - 15:52
Project:Apache Solr Search Integration
Version:5.x-2.x-dev
Component:Code
Category:feature request
Priority:normal
Assigned:claudiu.cristea
Status:closed
Description

Add a variable, apachesolr_index_comments_with_node (TRUE | FALSE), which controls whether or not comments are globbed onto nodes during indexing. There is no UI for this, so at this point you have to use $conf in settings.php to set the variable.
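
As a rough sketch of the override described above, a line like the following could be added to the site's settings.php. The variable name comes from this issue. The assumption here is that FALSE turns comment indexing off and that leaving the variable unset keeps the existing behavior; check the committed patch before relying on either.

```php
// In sites/default/settings.php (or the site-specific settings.php).
// Assumption: FALSE stops comments from being appended to the node's
// indexed text; the default when unset is presumed to keep indexing them.
$conf['apachesolr_index_comments_with_node'] = FALSE;
```

Content that is already indexed would most likely need to be reindexed before the change shows up in search results.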

#1

robertDouglass - September 17, 2009 - 15:53

Committing this.

Attachment            Size
index_comments.patch  2.65 KB

#2

robertDouglass - September 17, 2009 - 15:53
Status:needs review» fixed

#3

Scott Reynolds - September 17, 2009 - 15:55
Status:fixed» needs work

I just ran into this problem and did something more hack-ish because I needed it on a per-node-type basis.

Would you consider switching this to a per node type setting?

#4

robertDouglass - September 17, 2009 - 15:59

Hmm. I'd consider it. Maybe you can roll a patch against this commit?

My next step is to create a module that indexes comments as documents so that you can search for comments exclusively.

#5

pwolanin - September 17, 2009 - 16:02

Indeed - per node type does seem like the appropriate level of selection.

For 2.x, we should think about several possible options, including:

  1. on/off per node type
  2. with the node body or in a separate field per node type
  3. All the comments for one node as a separate document (though this would potentially give multiple search results for the same node)

Depending on getting the highlighting right, some variant of option 2 would be good, since it would allow you to search node only, node + comments, or comments only, either as user options or admin options, without reindexing and without duplicating the other metadata.

#6

Scott Reynolds - September 17, 2009 - 16:13

To piggyback off of option 2, you could then specify an MLT (More Like This) query that only looks at the node body, instead of body + comments.

Which would be a big win I think.

#7

robertDouglass - September 17, 2009 - 17:08

Ok. I agree with all that. I don't think, though, that we want to pollute our interface with that level of complexity, and we also don't want to automatically implement every strategy because it causes index bloat. So I'm open to ideas about how we can architect it to be lean and mean, but give the admin the right amount of flexibility without overwhelming them. Suggestions?

#8

pwolanin - October 13, 2009 - 14:38
Version:6.x-2.x-dev» 6.x-1.x-dev
Status:needs work» needs review

Here's a patch for 1.x for per-type exclusion.

Attachment                       Size
comment-exclsion-580404-8.patch  1.51 KB

#9

pwolanin - October 13, 2009 - 15:42

with README change

Attachment                       Size
comment-exclsion-580404-9.patch  1.67 KB

#10

pwolanin - October 13, 2009 - 17:28

+ code comment

Attachment                        Size
comment-exclsion-580404-10.patch  2.33 KB

#11

pwolanin - October 13, 2009 - 20:43
Version:6.x-1.x-dev» 6.x-2.x-dev

committed #10 to 6.x-1.x

#12

pwolanin - October 14, 2009 - 13:48
Status:needs review» patch (to be ported)

needs to be ported to other branches

#13

robertDouglass - October 19, 2009 - 14:37
Version:6.x-2.x-dev» 5.x-2.x-dev

Committed to DRUPAL-6--2

#14

claudiu.cristea - October 23, 2009 - 10:15
Assigned to:Anonymous» claudiu.cristea

Here's the patch against 5.x-2.x-dev

Attachment                         Size
comment-exclusion-580404-D5.patch  3.09 KB

#15

claudiu.cristea - October 23, 2009 - 10:37
Status:patch (to be ported)» fixed

Committed to CVS in #278734

#16

System Message - November 6, 2009 - 10:40
Status:fixed» closed

Automatically closed -- issue fixed for 2 weeks with no activity.

#17

dark_religion - November 14, 2009 - 08:21

I had this problem. And couldn't solve it for a long time lol...
