Closed (duplicate)
Project:
Apache Solr Multilingual
Version:
6.x-2.x-dev
Component:
Documentation
Priority:
Normal
Category:
Feature request
Assigned:
Unassigned
Reporter:
Created:
5 Oct 2010 at 14:07 UTC
Updated:
27 Aug 2012 at 21:13 UTC
Two requests, but I think both are related.
It would be nice to include stopwords for different languages, not only English (and not such poor ones). I would also like to see German stopwords included by default.
In addition, even with the "unique" setup, those stopword files should have a suffix, like stopwords-de.txt or similar. This makes it easier to store them in apachesolr_multilingual and to reference them in the filter etc., depending on the language(s).
I have attached my version of the stopwords.
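To illustrate the naming idea, here is a minimal sketch, not anything the module ships. The helper names (`stopwords_filename`, `load_stopwords`, `stop_filter_xml`) are hypothetical, and it assumes plain two-letter style language codes and the usual one-word-per-line file format with `#` comment lines. It only shows how a per-language suffix makes the file easy to locate and to reference from a `solr.StopFilterFactory` filter.

```python
from pathlib import Path


def stopwords_filename(lang):
    """Return the per-language stopword filename, e.g. 'stopwords-de.txt'.

    Assumption: language codes are purely alphabetic (e.g. 'de', 'en').
    """
    code = lang.strip().lower()
    if not code.isalpha():
        raise ValueError(f"unexpected language code: {lang!r}")
    return f"stopwords-{code}.txt"


def load_stopwords(conf_dir, lang):
    """Read one word per line, skipping blank lines and '#' comment lines."""
    path = Path(conf_dir) / stopwords_filename(lang)
    words = set()
    for line in path.read_text(encoding="utf-8").splitlines():
        word = line.strip()
        if word and not word.startswith("#"):
            words.add(word.lower())
    return words


def stop_filter_xml(lang):
    """Render a stop filter element that points at the per-language file."""
    return (
        '<filter class="solr.StopFilterFactory" '
        f'ignoreCase="true" words="{stopwords_filename(lang)}"/>'
    )
```

With this scheme, `stop_filter_xml("de")` refers to `stopwords-de.txt`, so adding another language only means dropping in another file with the matching suffix.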
| Comment | File | Size | Author |
|---|---|---|---|
| | stopwords-en.txt | 1.97 KB | eugenmayer |
| | stopwords-de.txt | 7.68 KB | eugenmayer |
Comments
Comment #1
mkalkbrenner
That's already on a to-do list. But your stopword list may be too extensive. For example, I wouldn't put words like "anerkannt" or "veröffentlicht" on a German stopword list.
Comment #2
eugenmayer commented
It was only a suggestion; there's no need to use mine, and I'm fine with more general ones. I guess everybody will maintain their stopwords themselves later on, but even so, we should provide good out-of-the-box defaults for most users.
Comment #3
mkalkbrenner
Work in progress for 7.x.