Will there be a version 5.x ?

beert - April 30, 2007 - 12:28
Project: French stemmer
Version: 4.7.x-1.0
Component: Code
Category: support request
Priority: normal
Assigned: Unassigned
Status: postponed (maintainer needs more info)
Description

Currently I'm running Drupal 5.1 for a multilingual site.
I already installed the Dutch stemmer 5.x-1.x-dev.
To also get more user-friendly search results from the French pages, I would like to install the French stemmer.

Will the French stemmer be made available in version 5.x?

Thank you for any advice.

#1

zigazou - May 1, 2007 - 06:10
Status: active » postponed (maintainer needs more info)

I have no Drupal 5 installation to develop with. I'm using the 4.7 version.

However, a 5.x version should not be difficult to produce, since the frenchstemmer module was developed entirely from the dutchstemmer module.

Here are the points:

  • I can make the port, but I need a tester to ensure everything goes well. Neither dutchstemmer nor frenchstemmer writes to the database itself; the index is managed entirely by the search module. Testing should therefore be harmless, since it can be reverted by clearing one table.
  • I don't know how multilingual content is handled by the search engine or by Drupal (the same open question applies to dutchstemmer).

#2

beert - May 1, 2007 - 12:56
Title: Will be there a version 5.x ? » Will there be a version 5.x ?

After reading your remarks, I did some testing with Dutch stemmer. The result: when Dutch stemmer is enabled, it is applied to all installed languages.

  • search without Dutch stemmer:
    With a language selected (e.g. Dutch), the search results are only from the corresponding language.
    When another language is subsequently selected (e.g. French), the same search request returns the results for that other language.
    So, the search is based on the page locale.
  • search with Dutch stemmer enabled:
    Ditto, but Dutch stemmer has been applied to the Dutch and the French pages.

Clearly, the page locale is not taken into account by the Dutch (or any other?) stemmer.
Taking this into account, the question arises whether it's possible to enable two language stemmers without conflict...

I'll be happy to test the module; please advise me on any specific testing procedure (or terminology).

#3

zigazou - May 1, 2007 - 14:23

I've been doing some research on the search module. It is not locale-aware at all. So, if enabled together, the dutchstemmer and frenchstemmer modules will interfere with each other.

These modules work as follows: whenever a user types text into the search form, or the search module is about to index content, the search module first passes the text to every module that implements hook_search_preprocess (such as dutchstemmer and frenchstemmer). The hook is responsible for simplifying the text for a better search experience (it could, for example, reduce sizeable, resizeable, resize etc. to size). The text is the only thing the search module passes to hook_search_preprocess (no node id, no revision id, no locale...), so dutchstemmer and frenchstemmer cannot determine whether the given text is Dutch or French.
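To make the problem concrete, here is a rough, language-neutral model of that preprocessing chain, written in Python rather than Drupal's PHP. Everything in it is an assumption made for illustration: the function names are hypothetical, and the two "stemmers" are one-rule toys, not the real dutchstemmer or frenchstemmer algorithms. It only shows that a chain which receives nothing but the text applies every stemmer to every word.

```python
# Illustrative model only. Drupal's actual hook is PHP (hook_search_preprocess);
# all names below are hypothetical and the stemming rules are deliberately naive.

def toy_dutch_stemmer(text):
    """Toy rule: strip a trailing 'en' from words longer than 4 letters."""
    return " ".join(
        w[:-2] if w.endswith("en") and len(w) > 4 else w
        for w in text.split()
    )


def toy_french_stemmer(text):
    """Toy rule: strip a trailing 'ment' from words longer than 6 letters."""
    return " ".join(
        w[:-4] if w.endswith("ment") and len(w) > 6 else w
        for w in text.split()
    )


# Stand-in for "every enabled module that implements the preprocess hook".
PREPROCESSORS = [toy_dutch_stemmer, toy_french_stemmer]


def search_preprocess(text):
    """Pass the text through every preprocessor in turn.

    Only the text is available here (no node id, no locale), so no
    preprocessor can tell which language it is looking at.
    """
    for preprocess in PREPROCESSORS:
        text = preprocess(text)
    return text


if __name__ == "__main__":
    # French word mangled by the Dutch toy rule.
    print(search_preprocess("citoyen"))
    # Dutch word mangled by the French toy rule.
    print(search_preprocess("experiment"))
```

With these toy rules, the French word "citoyen" is cut by the Dutch rule and the Dutch word "experiment" is cut by the French rule, which is the kind of cross-language interference described above.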

Fixing this would require a modification to the search module itself. It could pass the locale along with the text to hook_search_preprocess, so that each module can decide whether it should apply its filters.
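As a sketch of what such a change might look like, here is a small Python model (not Drupal's PHP API) in which the preprocessing step also receives a locale. The `locale` argument is exactly the part that does not exist in the hook as described in this thread; it, the function names, and the one-rule toy stemmers are all assumptions made for illustration.

```python
# Illustrative model only: a hypothetical locale-aware preprocess step.
# The real hook receives text only; the 'locale' parameter is the proposal.

def toy_dutch_stemmer(text):
    """Toy rule: strip a trailing 'en' from words longer than 4 letters."""
    return " ".join(
        w[:-2] if w.endswith("en") and len(w) > 4 else w
        for w in text.split()
    )


def toy_french_stemmer(text):
    """Toy rule: strip a trailing 'ment' from words longer than 6 letters."""
    return " ".join(
        w[:-4] if w.endswith("ment") and len(w) > 6 else w
        for w in text.split()
    )


# Each stemmer declares the one locale it is meant for (hypothetical registry).
STEMMERS_BY_LOCALE = {
    "nl": toy_dutch_stemmer,
    "fr": toy_french_stemmer,
}


def search_preprocess(text, locale=None):
    """Apply only the stemmer registered for the given locale.

    Text with an unknown or missing locale is returned unchanged.
    """
    stemmer = STEMMERS_BY_LOCALE.get(locale)
    return stemmer(text) if stemmer is not None else text


if __name__ == "__main__":
    print(search_preprocess("boeken", "nl"))   # Dutch rule applies
    print(search_preprocess("citoyen", "fr"))  # Dutch rule is skipped
```

In this model both stemmers can be enabled at once without touching each other's text. The open question it leaves aside is where the locale of a search query typed by the user would come from.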

As of now, enabling both dutchstemmer and frenchstemmer at the same time would generate messy index words. As a general rule, you should wipe the index table each time you switch from one to the other, and each time you update one of these modules.

Concerning the 5.x version of frenchstemmer, I've attached the frenchstemmer.txt file to this post. To test it, you should:

  • Install the frenchstemmer files (from the 4.7.x version) like you do for any other module (/modules/frenchstemmer)
  • Copy frenchstemmer.txt into the same directory and rename it to frenchstemmer.info
  • Enable the frenchstemmer module in Drupal

Attachment          Size
frenchstemmer.txt   208 bytes

#4

beert - May 12, 2007 - 10:52

I've installed French stemmer with the frenchstemmer.info file and did some testing.

  • search with only French stemmer enabled
    First I tested with only French stemmer enabled, and had Dutch stemmer disabled.
    French stemmer seems to be working fine with Drupal 5.1.
  • search with Dutch and French stemmer enabled
    Secondly I enabled both French and Dutch stemmer; after all, if you don't try it out...
    In practice, they both can be enabled simultaneously.
    But as this can create 'messy index words', this might not be such a good idea.
    After some testing, and for the time being, I have not come across any 'weird' or unexpected search results.
 
 
