Drupal Search Functionality

Smogger - July 13, 2007 - 16:39

I'm not a developer and this is my first post, so if anyone can point me in the right direction, it's much appreciated.

I'm looking for information on Drupal's site search capabilities. More specifically, does Drupal support query builders such as spell-checkers that autocorrect search terms and still deliver relevant results; phonetic tools that recognize similar spellings (like John McLean and John MacLean); Natural Language Processing ("how do I return an item?"); and metadata such as a document's meta keywords tag? The keywords tag would hold alternate terms and misspellings (for example: cellphone, cellular phone, mobile phone, celular phone, celphone) that may not appear in the document itself, but that the search logs show people are looking for.

Can a third party search engine like Endeca be integrated with a Drupal ecommerce site?

Thanks,
Linda

One more thing

Smogger - July 13, 2007 - 17:06

Also, where can I find more information on the existing modules for search? I had a hard time finding info on drupal.org.

search add ons

adminfor@inforo... - July 13, 2007 - 17:52

Things that can improve your searching capabilities in Drupal are:

1) Taxonomy terms (by default) and their synonyms (through this module: http://drupal.org/project/synonyms). You have to associate nodes with taxonomy terms in advance, of course. Later changes to taxonomy synonyms do not affect results until the search index is rebuilt.

2) Stemmer modules (English, French, etc.). These index words in their simplest form, without plurals and reduced to the verb root; for example, lovers, love, etc. are all indexed as lov. The English stemmer is marked for 4.7 but runs in 5.1 without any problem. Stemmers for other languages could be available soon; I have been testing my Spanish stemmer for a while before uploading it. Once it runs OK, Italian and Portuguese stemmers should follow closely (they are similar Romance languages). I'm also trying a "Spanglish" stemmer, because of the spread of English words in the Spanish-speaking community.

There you can also filter stop words, i.e. common articles and prepositions like ' a ', ' the ', etc. that you do not want in your queries (or transform them into synonyms).

3) The Accents module, which removes accents before indexing and searching.
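
To make the three ideas above more concrete, here is a minimal Python sketch of what accent removal, stop-word filtering and stemming do to text before it is indexed. It is not the code of any Drupal module: the stop-word list, the suffix list and all function names are made up for illustration, and the stemmer is a toy suffix stripper, not the Porter algorithm.

```python
import unicodedata

# Hypothetical stop-word list; a real site would tune this per language.
STOP_WORDS = {"a", "an", "the", "of", "to", "in"}

# Toy suffix list, checked longest first. This is NOT the Porter algorithm.
SUFFIXES = ("ers", "ing", "es", "er", "ed", "s", "e")


def strip_accents(text):
    """Decompose characters and drop the combining marks (accents)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(text):
    """Lowercase, strip accents, drop stop words, then stem what is left."""
    words = strip_accents(text.lower()).split()
    return [naive_stem(w) for w in words if w not in STOP_WORDS]


print(index_terms("The lovers of canción"))
```

The same function would be applied to the search query, so that a search for "love" and a page containing "lovers" meet at the same indexed term.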

There are also other modules like Swish-e, and some work on integrating Lucene / Nutch, but I've never tried them.

I think it could be very useful to add a new category at http://drupal.org/project/Modules grouping "search"-related modules and tools.

Hope this helps. If anyone has more things to add to the list, please go ahead.
Gustavo

romance languages stemmer

bonzinip - September 25, 2007 - 06:29

I doubt that an Italian stemmer will be similar to a Spanish one. For one thing, the rules for plurals are completely different, Spanish being relatively similar to English (e.g. mensaje -> mensajes), and Italian to languages with declensions such as German or Russian (messaggio -> messaggi, that is o -> i; сообщение -> сообщения, that is е -> я).
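
A rough Python sketch of the difference, using only the two example words above. The rules and function names are toy assumptions for illustration, not the Snowball rules or any existing module: the Spanish rule strips a trailing -s, while the Italian rule has to strip the final vowel ending so that singular and plural meet at the same stem.

```python
def spanish_plural_stem(word):
    """Toy Spanish rule: the plural adds -s, so drop a trailing 's'."""
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def italian_plural_stem(word):
    """Toy Italian rule: the ending changes (-io / -o to -i), so drop it."""
    for ending in ("io", "i", "o"):
        if word.endswith(ending) and len(word) - len(ending) >= 3:
            return word[: -len(ending)]
    return word


# Spanish: singular and plural reduce to the same form.
print(spanish_plural_stem("mensaje"), spanish_plural_stem("mensajes"))

# Italian: the Spanish rule leaves the two forms different...
print(spanish_plural_stem("messaggio"), spanish_plural_stem("messaggi"))

# ...while the vowel-stripping rule brings them together.
print(italian_plural_stem("messaggio"), italian_plural_stem("messaggi"))
```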

Paolo

You may find stemmer rules

adminfor@inforo... - March 12, 2008 - 11:59

You may find stemmer rules here (Spanish, French, Italian, Portuguese, German, etc.):
http://snowball.tartarus.org/

Spanish stemmer module for Drupal 6

gonzalo.koeln - January 5, 2009 - 16:13

I ported this Spanish stemmer algorithm to a Drupal 6 module.

Please check http://drupal.org/project/spanishstemmer.

Other featured languages in Drupal:

http://drupal.org/project/porterstemmer (English, Drupal 5 and 6)
http://drupal.org/project/frenchstemmer (French, Drupal 4.7)
http://drupal.org/project/dutchstemmer (Dutch, Drupal 4 and 5)
http://drupal.org/project/de_stemmer (German, Drupal 6 and 5)

Portuguese stemmer

andrewsuth - March 15, 2009 - 12:47

Did you make any progress with your Portuguese stemmer?

I think it would be a good addition to the list of stemmers for Drupal.

article here

sepeck - July 13, 2007 - 19:05

http://www.lullabot.com/articles/drupals_search_module_and_scoring_factors
And the documentation in the code itself, as seen on api.drupal.org:
http://api.drupal.org/api/5
http://api.drupal.org/api/5/group/search

Also check out the porter-stemmer module, which extends the built-in search capabilities and is in use here on drupal.org. Drupal core search was meant to be extended.

-Steven Peck
---------
Test site, always start with a test site.
Drupal Best Practices Guide -|- Black Mountain

See also Solr

batsonjay - July 13, 2007 - 19:40

I'm not sure how much "programming" support this is going to need, because I've never actually used the module. But the Solr module uses Lucene, and Lucene is as good as Endeca or anything similar to it. Solr provides fields/facets, which give great navigational capabilities, etc. Lucene (the engine under Solr) does a great job with stemming, etc.
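
For anyone unsure what facets mean in practice, here is a small Python sketch of the idea only. It does not use Solr or its API; the sample results, field names and function names are invented for illustration. A facet is a count of results per field value, and choosing a value narrows the result list.

```python
from collections import Counter

# Hypothetical search results; the field names are made up for illustration.
RESULTS = [
    {"title": "Flip phone", "brand": "Acme", "category": "phones"},
    {"title": "Smart phone", "brand": "Acme", "category": "phones"},
    {"title": "Phone case", "brand": "Globex", "category": "accessories"},
]


def facet_counts(results, field):
    """Count how many results carry each value of the given field."""
    return Counter(doc[field] for doc in results if field in doc)


def drill_down(results, field, value):
    """Keep only the results whose field matches the chosen facet value."""
    return [doc for doc in results if doc.get(field) == value]


print(facet_counts(RESULTS, "brand"))
print([doc["title"] for doc in drill_down(RESULTS, "category", "phones")])
```

On an ecommerce site the counts would be shown next to the result list as clickable filters (brand, category, price range and so on).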

 
 
