
Option to display teasers instead of excerpts to get around stemming issues

Project:Search by Page
Version:6.x-1.0
Component:Main Search by Page module
Category:feature request
Priority:normal
Assigned:Unassigned
Status:closed (fixed)

Issue Summary

I think there should be an option to display a teaser rather than the search extract (Faceted Search has this as an option too). When you use a stemmer, the extract doesn't work. Say I have some text "banana, oranges, lemons". If I search for oranges, the extract works fine: I get that bit of text shown in the extract, with oranges in bold. But if I search for orange, it shows just the first line, because it doesn't find a match for the extract.
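To make the failure concrete, here is a minimal sketch in Python. It is not Drupal code: `toy_stem` is a hypothetical stand-in for a real stemmer, and `literal_excerpt` only assumes that the excerpt step looks for the typed keyword as a whole word. The search index stores stems, so "orange" finds the page, but the excerpt step has nothing to highlight.

```python
import re

def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips one trailing 's'."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word

def literal_excerpt(text, keyword):
    """Bold whole-word literal matches of keyword; return None if there are none."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    if not pattern.search(text):
        return None
    return pattern.sub(lambda m: "<strong>" + m.group(0) + "</strong>", text)

text = "banana, oranges, lemons"

# The index holds stems, so both "orange" and "oranges" find this text.
index = {toy_stem(word) for word in re.findall(r"\w+", text)}
page_is_found = toy_stem("orange") in index

# The excerpt step compares against the raw text, so only "oranges" highlights.
excerpt_for_plural = literal_excerpt(text, "oranges")
excerpt_for_singular = literal_excerpt(text, "orange")
```

In this sketch `page_is_found` is true while `excerpt_for_singular` is `None`, which is the "shows just the first line" case described above.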

BTW if you try this on Google it highlights both orange and oranges. I think perhaps a stemmer works the wrong way. It is faster, but a better approach would be to expand the search key orange into "orange OR oranges". That would fix the extract problem but would probably increase the search time.
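The "orange OR oranges" idea could be sketched like this in Python. Everything here is hypothetical: `VARIANTS` is a hand-written table standing in for whatever would generate word forms, and neither function belongs to Drupal or any stemmer module. The point is only that once the key is expanded, plain literal highlighting works again.

```python
import re

# Hypothetical table of word forms; a real system would have to generate these.
VARIANTS = {"orange": ["oranges"]}

def expand_keyword(keyword, variants):
    """Return the keyword plus its known variants, longest form first."""
    key = keyword.lower()
    forms = {key} | set(variants.get(key, ()))
    return sorted(forms, key=lambda form: (-len(form), form))

def highlight_any(text, forms):
    """Bold every whole-word occurrence of any of the given forms."""
    alternation = "|".join(re.escape(form) for form in forms)
    pattern = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: "<strong>" + m.group(0) + "</strong>", text)

forms = expand_keyword("orange", VARIANTS)
highlighted = highlight_any("banana, oranges, lemons", forms)
```

The cost mentioned above shows up as a longer list of forms: every extra variant is one more term to look up at search time.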

Comments

#1

I had noticed that the stemmers don't work with excerpts, and filed an issue on the Porter Stemmer project:
#437084: Excerpt fails to find stemmed keyword
They are unlikely to fix that issue right now though, because the problem is really that the core Search module has a search_excerpt() function that is inflexible:
#493270: search_excerpt() doesn't work well with stemming

So I agree that until these issues are fixed, allowing Search by Page to display the teaser instead of an extract is a good idea to overcome this problem. It should be pretty easy to do.

#2

Title:Stemming issues» Option to display teasers instead of excerpts to get around stemming issues
Category:bug report» feature request

However, this is not a bug in Search by Page. It is a bug in the stemmers and the core Drupal Search module. So I will make it a feature request of Search by Page to allow for teasers.

#3

Component:Main Search by Page module» Search by Page Nodes module

Although this option to use teasers would only apply to the Node search component of Search by Page... There are no teasers for Users, Pages, etc.

#4

There is an update to the core Search module and to Porter Stemmer, available on issue #493270: search_excerpt() doesn't work well with stemming, that fixes the underlying problem of excerpts not working with stemming. Please test and review there.

#5

If you think this issue is important, please visit #493270: search_excerpt() doesn't work well with stemming and leave a comment. Otherwise, it is possible that no one will think it is important to get into Drupal 7 (much less Drupal 6). The code freeze for Drupal 7 is coming up on September 1st.

#6

Component:Search by Page Nodes module» Main Search by Page module
Status:active» needs review

An update on the status of this issue:

My proposed fix for Drupal core did not get into Drupal 7 -- no one reviewed it (see #493270: search_excerpt() doesn't work well with stemming).

So, I added a fix similar to what I had proposed for Drupal 7 to the development version of Search by Page instead. This fix allows stemming modules (or any other module that preprocesses search terms and search text) to find their own matches when building search excerpts for display on the search results page. Since I am also the maintainer of the Porter Stemmer module for American-English stemming, I implemented the fix there as well. It should not be difficult for other language stemming modules, and other search pre-processing modules, to implement a similar fix.
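The general shape of that fix can be sketched as follows. This is an illustrative Python sketch only, not the actual Search by Page or Porter Stemmer code: the function names, the callback signature, and `toy_stem` are all assumptions made for the example. The excerpt builder tries a literal match first and, if that fails, asks each registered preprocessing callback which word in the text corresponds to the keyword.

```python
import re

def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips one trailing 's'."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word

def stem_matcher(keyword, text):
    """Return the first word in text with the same stem as keyword, or None."""
    target = toy_stem(keyword)
    for match in re.finditer(r"\w+", text):
        if toy_stem(match.group(0)) == target:
            return match.group(0)
    return None

def excerpt(text, keyword, matchers=()):
    """Bold the keyword in text, letting callbacks supply a match if needed."""
    literal = re.search(r"\b" + re.escape(keyword) + r"\b", text, re.IGNORECASE)
    word = literal.group(0) if literal else None
    if word is None:
        # No literal hit: let each preprocessing callback look for its own match.
        for matcher in matchers:
            word = matcher(keyword, text)
            if word:
                break
    if word is None:
        return text  # nothing to highlight; fall back to the plain text
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return pattern.sub(lambda m: "<strong>" + m.group(0) + "</strong>", text)

text = "banana, oranges, lemons"
without_callback = excerpt(text, "orange")
with_callback = excerpt(text, "orange", matchers=[stem_matcher])
```

Keeping the matching logic in the callback means the excerpt builder does not need to know anything about a particular language or stemming algorithm.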

What this means for now is that if your site is in English, and you download the development versions of both Porter Stemmer (6.x-2.x) and Search by Page (6.x-1.x) [make sure the build date reads September 10 or later], you should be able to have working search excerpts. If you are using a different search preprocessor or stemming module, I would be glad to work with the module maintainers to get their module working with Search by Page as well.

I've tested this and it appears to work well with Porter Stemmer. I'd appreciate it if someone else would give it a try, and reply back here with positive or negative comments. Thanks!

(I realize that the feature request was to have an option to display teasers instead of excerpts, but I thought that fixing the excerpt problem was a better idea. Thoughts?)

#7

Sometimes a teaser might be more useful (that is, a teaser could be a full summary of the article, not just the text around the search word), but otherwise good job on fixing the base problem.

#8

Thanks for your comments -- your point is a good one.

I will add the option to SBP Nodes to display teasers (not appropriate for any other modules; the SBP Paths module already has such an option).

#9

Status:needs review» needs work

The excerpt function fix is now released in Porter Stemmer 6.x-2.2 and Search by Page 6.x-1.4 -- you can use these versions if you want better search excerpts.

I will now add an option to the Search by Page Nodes module to display teasers, in case that is preferred.

#10

Status:needs work» needs review

I just checked in (to the 6.x-1.x-dev branch) a change that allows you to choose teasers instead of excerpts on search results pages, for Search by Page Nodes. Any comments welcome!

This new version should be available from the Downloads page within 24 hours (or available from CVS immediately).

#11

Status:needs review» fixed

This is now released in version 6.x-1.5, which should be ready for download in about 10 minutes or less.

#12

Status:fixed» closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.
