Add support for translated (localized) taxonomy facet blocks
pips1 - April 17, 2009 - 12:03
| Project: | Apache Solr Search Integration |
| Version: | 6.x-2.x-dev |
| Component: | Language |
| Category: | feature request |
| Priority: | normal |
| Assigned: | mkalkbrenner |
| Status: | needs review |
Description
Hi,
After upgrading from beta7 to beta8, all taxonomy-based search facets only appear in English (default site language), but not in any other language anymore.
Other facet translations work fine. E.g. the 'language' facet still appears in the selected language, i.e. changes from "German" -> "Deutsch" -> "Allemand".
But all taxonomy-based facets are only shown in English.
Can anyone reproduce this?
Our setup:
Drupal 6.10
Apache Solr 6.x-1.0-beta8
Internationalization (i18n) 6.x-1.0
Translation table 6.x-1.0-beta1
PHP/5.2.8

#1
I don't think I made any significant changes to anything related to the taxonomy facets between these versions, so I'm surprised by this report.
Attached is a diff of the changes from beta7 to beta8. A bunch of code was moved to a .inc file, but essentially no meaningful changes occurred for taxonomy-related code.
I have not used any of the translation modules you cite - perhaps you updated one of those at the same time?
#2
We also need localized taxonomy facet blocks. To translate taxonomies we use i18n.
From my point of view the apachesolr module never supported this in the past. So I created a patch that adds i18n support for taxonomy facets.
#3
I made a little mistake in my patch. Here's a fixed version.
#4
Next step:
Use i18n to translate node type facet, too.
#5
Time to leave off work ...
Check for module i18ncontent and not for i18ntaxonomy when translating content types.
#6
I changed the title of this issue to reflect the change in focus.
mkalkbrenner is correct that apachesolr module never supported this in the past.
#7
We should have a separate issue for at least using the correct timezone for dates.
#8
see #463886: localize apachesolr for apachesolr localization in general and http://drupal.org/node/463886#comment-1664164 for localization of date facet.
#9
Tested in fresh checkout of DRUPAL-6--1 branch. Works!
Need this, sorely =)
#10
I'm not familiar with the i18ntaxonomy module - is that part of the overall i18n module or stand-alone? I guess this functionality didn't make it into D6 core in any form?
In any case, I'm a little puzzled about how this should work - this patch seems to show the localized taxonomy terms in the search block, but those localized term names would not be available in the search index?
#11
i18ntaxonomy is part of http://drupal.org/project/i18n
I designed the patch to use i18n only if it's available and not to depend on it.
You're right that the patch only translates the terms and vocabulary names when showing them in facet blocks. For our needs this was enough because we search for terms using term ids and not via full text search.
If terms are also indexed as text then this patch might need some work. But maybe this part is more related to #463886: localize apachesolr.
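To illustrate the approach described above (translate only at display time, and only when a translation module is available), here is a minimal sketch. It is not the code from the patch: the function names, the translator callback, and the 'taxonomy:term:<tid>:name' string id are hypothetical and chosen for illustration. On a real site the callback would presumably wrap whatever i18n provides.

```php
<?php
/**
 * Hypothetical helper: returns the label to show for a term in a facet block.
 *
 * @param object $term
 *   Term object with ->tid and ->name, as stored in the default language.
 * @param callable|null $translate
 *   Optional translator taking (string $string_id, string $default).
 *   NULL stands for "no translation module available".
 *
 * @return string
 *   The translated name, or the original name as a fallback.
 */
function example_facet_term_label($term, $translate = NULL) {
  if ($translate === NULL) {
    // No translation layer: show the default-language name unchanged.
    return $term->name;
  }
  // The string id format is an assumption made for this example.
  $string_id = 'taxonomy:term:' . $term->tid . ':name';
  $label = call_user_func($translate, $string_id, $term->name);
  return ($label === NULL || $label === '') ? $term->name : $label;
}

/**
 * Stand-in translator for this example only (one German string).
 */
function example_translate_de($string_id, $default) {
  $strings = array('taxonomy:term:7:name' => 'Nachrichten');
  return isset($strings[$string_id]) ? $strings[$string_id] : $default;
}

$term = new stdClass();
$term->tid = 7;
$term->name = 'News';

print example_facet_term_label($term) . "\n";                         // News
print example_facet_term_label($term, 'example_translate_de') . "\n"; // Nachrichten
```

Note that only the displayed label changes. The term id used for filtering stays the same, which is why this works for faceting but does nothing for full text search on translated term names.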
#12
Terms *are* indexed as text also, so at best this helps in terms of pure faceting for a single-site implementation.
#13
For our multilanguage site, we configured i18ntaxonomy so that every taxonomy term has a unique term id (i18n option "Localise terms") and is localised via the standard Drupal translate interface.
The alternative i18n option, "Per language terms", where terms in different languages have different term ids, wouldn't have worked for us, since we want to be able to use search facets *across all languages* and then be able to refine the search by language facet.
We have now updated to Solr 1.4 and apachesolr 6.x-1.0-rc1 and have applied mkalkbrenner's patch above. For our purpose, the translation of the facet terms in the blocks works.
#14
I did some test searches and I think I now understand pwolanin's comment better.
Apachesolr only indexes the taxonomy terms of the *default* language. It doesn't index the translation strings of those terms, right?
As far as I can see, this patch here only translates the terms in the facet blocks. That helps for refining (filtering) searches by translated facets. But the taxonomy terms in the languages other than the default language *won't* be found if you actually search for that other-language term.
Also, if I understand pwolanin correctly, this solution won't scale to searches across multiple sites either, since the translation strings are drupal site specific?
Does this describe the actual situation correctly?
#15
#14 pips1:
I think you described it correctly. My patch here only fixes one use case: The one you originally posted. (We also have two sites live now using this patch to solve the same issue).
I also agree with pwolanin that a perfect solution needs additional work. But from my point of view the context for multi language or multi site searches is much bigger and the current architecture of apachesolr module doesn't fit.
If you have a look at #463886: localize apachesolr you'll see that different languages require different stemming, stopwords, synonyms, acronym handling, concatenating rules, date and number formats, ...
So a multi site / language search might require multiple indexes (one per language) or language specific fields.
(And if you want to take it to extremes, it's not enough to distinguish between languages; you'd have to distinguish between locales.)
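As a rough sketch of the "language specific fields" idea mentioned above: each language's term names could go into their own field, so that each field can get its own analysis (stemming, stopwords, ...) in the Solr schema. The field base name 'taxonomy_names' and both helper functions are made up for this example and are not part of apachesolr.

```php
<?php
/**
 * Hypothetical helper: builds a per-language field name,
 * e.g. ('taxonomy_names', 'pt-BR') becomes 'taxonomy_names_pt_br'.
 */
function example_language_field($base, $langcode) {
  $suffix = strtolower(preg_replace('/[^A-Za-z0-9]+/', '_', $langcode));
  return $base . '_' . $suffix;
}

/**
 * Hypothetical helper: adds one multi-valued field per language to a
 * document represented as a plain array.
 *
 * @param array $document
 *   Field name => value pairs collected so far.
 * @param array $names_by_language
 *   Language code => list of term names in that language.
 *
 * @return array
 *   The document with the per-language fields added.
 */
function example_add_localized_terms(array $document, array $names_by_language) {
  foreach ($names_by_language as $langcode => $names) {
    $field = example_language_field('taxonomy_names', $langcode);
    $document[$field] = array_values($names);
  }
  return $document;
}

$doc = example_add_localized_terms(
  array('nid' => 42),
  array(
    'en' => array('News'),
    'de' => array('Nachrichten'),
    'pt-BR' => array('Notícias'),
  )
);
print_r($doc);
```

The alternative mentioned above, one index per language, would keep the field names the same and move the language split to the level of the Solr core instead.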
#16
I think we should have a BoF at DrupalCon about the larger context of multi language search. Some good brainstorming is in order.
#17
Count me in for the BoF. I'm coming to Paris. :-)
#18
The Doxygen comments don't say what the function parameters are. Typically there would be a one line description following each @param (on the next line).
<?php
+ * @param $term
+ * @return object
?>
I despise function names that start with an underscore. _apachesolr_search_localize_taxonomy Can we change them?
+function _apachesolr_search_localize_taxonomy() doesn't sound like a function that returns TRUE or FALSE. Can we find a name more like apache_solr_search_needs_localization() ? (Open to other suggestions).
I don't like packing the module full of if(module_exists('foo')) ... wish there were a better way :[
#19
I'm also interested in such a meeting at DrupalCon. Seems I have to order a ticket ...
#20
Indeed, it would be good to officially announce a BoF session "multi language search" so that it can be properly scheduled in and other people can join in. Unfortunately, the official deadline for proposing sessions has passed. I have now contacted the organisers via the DrupalCon Paris contact form - let's see what happens.
#21
@pips - for DC, BoFs could not even be scheduled in advance. Not sure what they are doing for Paris, but certainly the BoF schedule is not closed.
#22
Isabell from the DrupalCon organizers kindly opened a BoF session for us:
http://paris2009.drupalcon.org/session/multi-language-search
Anyone interested in the topic, please vote the BoF session up! :-)
@robertdouglass, @pwolanin I boldly added you to the co presenters list. Let me know if you want that changed.
#23
@pips1 - great work!
now... back to this patch...
#24
@pwolanin: re #12: Yes, this won't change the index, it just helps navigation. It goes without saying that search.module doesn't index different localized versions of strings; just the default language, so I don't think we should attempt any of that in this patch. Like @mkalkbrenner mentioned, #463886: localize apachesolr is the place for that.
I looked at @robertDouglass's comments in #18 and [hopefully] addressed them in this new patch... except the two module_exists() calls...
In @mkalkbrenner's favor, there *is* this "offending" piece of commenting above theme_apachesolr_search_snippets() :
/**
 * Theme the highlighted snippet text for a search entry.
 *
 * @param object $doc
 * @param array $snippets
 *
And there is also a module_exists() to build the book facet...
=) And no, I didn't kill any kittens by addressing this last bit in the patch =)
#25
Moving to 2.x; new patch attached.
#26
Patch did not apply anymore. New patch.
#27
We'll be discussing this patch Thursday at DrupalCon.