Closed (fixed)
Project:
Search API Solr
Version:
7.x-1.0-beta3
Component:
Code
Priority:
Normal
Category:
Support request
Assigned:
Unassigned
Reporter:
Created:
6 Jul 2011 at 15:09 UTC
Updated:
23 Jul 2014 at 12:50 UTC
Comments
Comment #1
drunken monkey
The status is that there aren't any language-specific settings in the schema. Spell checking is language-independent (if you want to use multiple languages with a single index, I don't really know how to do that, though) and if you want any of the others, you'll have to configure them locally.
You can also set up several Solr servers with configurations for different languages, and put indexes on them indexing only items of a certain language. (Although you'd need a few lines of code for that last step, or wait a few weeks until this gets added.)
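As a rough sketch of those "few lines of code", assuming one index per language: the module name `mymodule`, the index machine names `content_en` and `content_de`, and the helper function are all hypothetical, and the sketch assumes each item carries its language code in a `language` property, as Drupal 7 nodes do. It relies on `hook_search_api_index_items_alter()`, where items removed from the array are skipped for that index.

```php
<?php

/**
 * Keeps only the items whose language matches $langcode.
 *
 * Hypothetical pure helper, so the filtering rule can be checked on its own.
 */
function mymodule_filter_items_by_language(array $items, $langcode) {
  foreach ($items as $id => $item) {
    // Drop items without a language or with a different language.
    if (!isset($item->language) || $item->language !== $langcode) {
      unset($items[$id]);
    }
  }
  return $items;
}

/**
 * Implements hook_search_api_index_items_alter().
 */
function mymodule_search_api_index_items_alter(array &$items, SearchApiIndex $index) {
  // Assumed mapping from index machine name to language code.
  $languages = array(
    'content_en' => 'en',
    'content_de' => 'de',
  );
  if (isset($languages[$index->machine_name])) {
    $items = mymodule_filter_items_by_language($items, $languages[$index->machine_name]);
  }
}
```

Each index can then live on a Solr server (or core) whose configuration is tuned for that language.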
Comment #2
Fidelix commented
I'll wait a few weeks.
Is internationalization being considered in this module's and Search API's roadmap?
I'm in the process of planning the architecture of a big project, which will be multilingual, and I'm trying to figure out if I can rely on this project for the future.
Everything seems awesome so far, but multilingual search is very important for this project.
Thank you.
Comment #3
drunken monkey
In principle, yes, internationalization is considered important, and I also tried to keep it in mind when designing the basic architecture.
However, in Drupal 7 this became even harder to do than before, so I can't really make any promises regarding which use cases will be supported. Generally, I hope the Search API is flexible enough to allow all necessary customizations at least locally, though.
Comment #4
mac_weber commented
As you said in [#1], the fastest way to get it working is to use Solr multi-core with one language per core. I'd be interested in testing it.
Also, is this module considering entity_translation? I'm using both i18n and ET.
Comment #5
drunken monkey
No, at the moment it is not. This would be something to be dealt with in the Search API itself, and I'm really unsure of how to best tackle this. (That's one of the things I meant with Drupal 7 being an even harder environment for i18n in searches.) Currently, I think the current site language is used for retrieving the field data, which might actually be a bit random in certain cases.
Comment #6
marcoka commented
I am interested in the "workflow, how to do it" for multilanguage searches too. For example: translate the page, and then when you switch the language (with the language switcher), the search runs in that language (like a user would expect).
Comment #7
Anonymous (not verified) commented
@e-anima if you use a Search API view, can you choose "language = current language" as a preset filter? Or you can add a hardcoded language filter in hook_search_api_query_alter(). (Assuming your entities have a valid language property/field.)
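A minimal sketch of the query-alter variant, assuming a hypothetical module named `mymodule` and an index that has the `search_api_language` field indexed. Also matching language-neutral items (`und`) is an extra assumption of this sketch, not something suggested above.

```php
<?php

/**
 * Returns the language codes a search should match.
 *
 * Hypothetical helper: the current language plus "und", which is the value
 * of Drupal 7's LANGUAGE_NONE constant.
 */
function mymodule_search_languages($current_langcode) {
  return array_values(array_unique(array($current_langcode, 'und')));
}

/**
 * Implements hook_search_api_query_alter().
 */
function mymodule_search_api_query_alter(SearchApiQueryInterface $query) {
  // In Drupal 7 the global $language holds the current interface language.
  global $language;

  // Match any of the allowed languages (OR), on top of the existing filters.
  $filter = $query->createFilter('OR');
  foreach (mymodule_search_languages($language->language) as $langcode) {
    $filter->condition('search_api_language', $langcode);
  }
  $query->filter($filter);
}
```

Because the filter follows the current language, the regular language switcher block changes the search language without any exposed Views filter.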
Comment #8
marcoka commented
Choose the language with a Views exposed field? Hm. The best usability would be to choose the language with the normal language switcher block provided by Internationalization/core.
I am thinking about that, as it seems very important.
Comment #9
Anonymous (not verified) commented
Hi, why an exposed field? Why not just an unexposed "language = current language" filter?
Comment #10
marcoka commented
morningtime, good idea, I will try that.
I played around a little and found one possibility: setting the language in code, for example with a query alter. This may be a bad solution, I don't know yet. Digging deeper and testing.
Comment #11
drunken monkey
Doing this with hook_search_api_query_alter() instead of the Solr-specific variant would be both simpler and cleaner. But in the end, morningtime's solution should work equally well without any custom code whatsoever.
Comment #12
blackice2999 commented
Hi @all and merry Christmas ;)
All the multi-language approaches I have seen usually share the same fields and either filter by a language field value or use different indexes. This works at first, but if you want to use Apache Solr as the backend, we run into another problem: you can't vary the tokenizer based on the language. This can be a problem if you want to use the SnowballFilter or PorterStemmer filter with both German and English.
So I think it's necessary to also use the content's language as part of the document key and the field key. In the apachesolr schema in particular, the dynamic fields are named like this:
<dynamicField name="t_*" type="text" termVectors="true" />
but each can only use one "fieldType".
I think it's a good idea to add the language key to the dynamic field name, so we can use a different fieldType based on the language.
Example:
Or:
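The snippets for the two variants are not shown here. Purely as an illustration of the idea, and not the commenter's original schema, language-specific dynamic fields might look like the sketch below. The `t_de_*` and `t_en_*` prefixes and the `text_de` and `text_en` type names are assumptions; the tokenizer and filter factories are standard Solr classes.

```xml
<!-- Hypothetical per-language field types, each with its own stemmer. -->
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German"/>
  </analyzer>
</fieldType>
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>

<!-- Hypothetical dynamic fields: the language code becomes part of the field name. -->
<dynamicField name="t_de_*" type="text_de" termVectors="true" />
<dynamicField name="t_en_*" type="text_en" termVectors="true" />
```

With a layout like this, the indexing side would have to write each fulltext field to the dynamic field that matches the item's language.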
Comment #13
danielnolde commented
For anyone interested in the latest effort to bring support for Entity Translation based multilingual content search to Search API:
Search API Entity Translation Module:
http://drupal.org/project/search_api_et
At the moment, this module introduces a new fulltext field to Search API which simply concatenates all ET translations of an entity for indexing, so a search keyword can be found in all translations of a piece of content. This works, but of course it is very crude and blunt and somehow wrong (but it works!). To find and decide on a better way of supporting ET in Search API, there is a discussion going on in the module's issue queue at http://drupal.org/node/1393058.
How language-specific search server settings can be achieved is a very important and interesting topic, so any discussion about it here can also feed good ideas into the progress of Search API Entity Translation.
Feel free to try the module and to share your thoughts on how to progress in that issue!
Comment #14
Carsten Müller commented
Hi,
I agree with #12. In the Apachesolr Multilingual module, the languages are set in the schema.xml.
Because of the multilanguage problems in D7, Apachesolr Multilingual (http://drupal.org/project/apachesolr_multilingual) has not been ported to D7 yet. But there are plans to start this soon.
The question now is: is it a good idea to port it, or to help Search API support multilingual fields, stemming for different languages (plurals in German differ from plurals in English), different stopwords for each language and so on?
I think one really good solution would be better than two separate ones ...
Comment #15
danielnolde commented
Carsten, the settings and support for language-specific stemming, stopwords etc. are part of the Apache Solr configuration. The apachesolr_multilingual.module only helped you by preparing these Solr config files for you (based on the Solr config needed by apachesolr.module). You can quite easily configure stemming, stopwords etc. directly via the Solr config files.
What's missing in Apache Solr, and therefore hard to come up with, is the ability to index multiple language/translated versions of fields or of a search document within one index, and to have multiple different language-specific config settings within one index.
I think Blackice tries to show us a way of working around both these solr shortcomings by utilizing dynamic solr fields via search_api.
Comment #16
klonos
note-to-self: ...coming from #1335394: Search API integration
Comment #17
stefan.r commented
As of 9 months ago, there is a 2.x branch in the Search API Entity Translation module which, along with the Search API Entity Translation Solr search module, addresses all of these concerns.
See also:
#1393058: Decide on strategy for language aware search
#2147489: Merge with Apache Solr Multilingual?
@drunken monkey, we can probably close this issue at this point?
Comment #18
drunken monkey
I guess, yes. Thanks!