I'm having a lot of difficulty getting this to work, so I'll document my findings here.
As I understand it, Chinese needs a different strategy for indexing content: written Chinese does not put spaces between words, so an analyzer that splits on whitespace cannot break the text into useful terms.
I haven't been able to dig up a good working schema.xml file that has both English and Chinese configured, just a lot of "use this class, not this one" advice.
For starters, I've added:
<!-- Smart Chinese analyzer -->
<fieldType name="text_chinese" class="solr.TextField">
  <analyzer class="org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer"/>
</fieldType>
to schema.xml.
After this change, both the field count and the index term count listed on the "Apache Solr search index" page increased.
Some helpful links:
- http://bit.ly/gD7qPd - search of mailing list for SmartChineseAnalyzer
- http://wiki.apache.org/solr/LanguageAnalysis
- http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
Comments
Comment #1
emackn
OK, after carefully reading the instructions, I found my problems. The zip file would not extract for me.
I also had to manually add the Lucene Smart Chinese Analyzer jar:
http://mirrors.ibiblio.org/pub/mirrors/maven2/org/apache/lucene/lucene-s...
So copy your Solr war file to a new directory and unpack it:
mkdir new_war
cd new_war
cp /path-to-war/solr.war .
jar -xf solr.war
rm solr.war
Copy lucene-smartcn-2.9.0.jar into WEB-INF/lib, then repackage:
jar -cf ../solr-with-smartcn.war .
Now move your new war file back to where you copied the original from.
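Before deploying, it may be worth confirming that the jar actually landed inside the repackaged war. The following is a minimal sketch, not something from the original instructions: has_smartcn is a hypothetical helper name, and it only inspects an archive listing passed on standard input.

```shell
# has_smartcn (hypothetical helper): reads an archive listing on stdin and
# succeeds if it contains a lucene-smartcn jar under WEB-INF/lib.
has_smartcn() {
  grep -q '^WEB-INF/lib/lucene-smartcn-.*\.jar$'
}

# Usage (adjust the path to wherever the repackaged war ended up):
#   jar -tf /path/to/solr-with-smartcn.war | has_smartcn && echo "smartcn jar present"
```

If the check fails, the jar was probably copied to the wrong directory before repackaging.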
Comment #2
emackn
I'm having some trouble migrating our multilingual search enhancements to production. The indexes are the same and the config files are identical. I have deleted and re-indexed several times and restarted Apache. Can anyone think of other debugging methods to figure out why the Chinese titles appear in the results on our dev server but not on production?
Thanks.
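One way to double-check that the config files really are identical is a byte-for-byte comparison of the two conf directories. This is only a sketch under assumptions: compare_conf and the directory arguments are hypothetical, and the file list (schema.xml plus solrconfig.xml) is a guess at which files matter here. It will not explain the missing titles by itself, but it rules out one cause.

```shell
# compare_conf DEV_DIR PROD_DIR (hypothetical helper)
# Prints the name of each listed config file that does not compare equal
# and returns non-zero if any do.
compare_conf() {
  dev_dir=$1
  prod_dir=$2
  status=0
  # Assumed file list; add any other config files your setup relies on.
  for f in schema.xml solrconfig.xml; do
    # cmp -s prints nothing for differing files; its exit status is
    # non-zero whenever the two files are not byte-identical.
    if ! cmp -s "$dev_dir/$f" "$prod_dir/$f" 2>/dev/null; then
      echo "differs: $f"
      status=1
    fi
  done
  return "$status"
}

# Usage, after copying production's conf directory next to the dev one:
#   compare_conf ./dev-conf ./prod-conf && echo "configs match"
```

It is also worth listing the war deployed on production to confirm that it is the repackaged one containing the smartcn jar, since matching config files do not guarantee a matching war.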
Comment #3
mkalkbrenner
Really interesting. Unfortunately I can't read Chinese, so I'm not able to test the solution.
But if anyone is interested in creating an extension for Apache Solr Multilingual that generates the required config files, I'll support them.
Comment #4
gilzero
Sub.
Could this perhaps be integrated with this one?
http://code.google.com/p/ik-analyzer/
Comment #5
mkalkbrenner
Comment #6
mkalkbrenner
Postponed until someone offers help, at least with testing.