I guess I'm allowed to say that ;-)

Just after we started filling our publication list I realized a huge drawback of the search function: by re-using the node search we never get more than 10 results, because this limit is hard-coded in do_search in search.module. Those 10 results are then ordered by year (by default), and thus the list looks like we randomly picked several entries from the global publication list.
Especially if you search for an author name, you would expect to find all his/her papers and not just 10. The biblio search should act more like a "biblio entry contains" filter, not like the normal "give me the 10 best" drupal search.
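
To illustrate the difference, here is a toy sketch in Python (not the biblio code; the records, scores and function names are made up for the example). It compares "take the 10 best-scoring hits, then sort by year" with "take every hit, then sort by year":

```python
# Hypothetical publication records: (title, year, relevance score).
# The scores are arbitrary but distinct, so the "10 best" are well defined.
entries = [(f"Paper {i}", 1990 + i, (i * 7) % 25) for i in range(25)]

def top_n_then_sort(entries, n=10):
    """Mimic the node search: keep the n best-scoring hits, then order by year."""
    best = sorted(entries, key=lambda e: e[2], reverse=True)[:n]
    return sorted(best, key=lambda e: e[1], reverse=True)

def contains_filter(entries):
    """Mimic a 'biblio entry contains' filter: keep every hit, order by year."""
    return sorted(entries, key=lambda e: e[1], reverse=True)

limited = top_n_then_sort(entries)
full = contains_filter(entries)

print([year for _, year, _ in limited])  # a gappy selection of years
print(len(full))                         # every matching entry
```

The first list has gaps between the years, which is what makes the result look like a random pick from the full list.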

So I rewrote the search function to build my own query that fetches all results (it's much simpler, as we don't need all the scoring stuff when we take all results anyway). But I'm not an SQL expert and copied the matching parts from do_search, so I guess the construction of the SQL statement should be reviewed carefully.

(http://drupal.org/node/376802 is not handled yet in this patch because I'm not yet running 1.0 and so I can't test this).

Comments

rjerome’s picture

Hmmm, this one only sucks a bit less :-0

It seems that both AND and OR return the same result set. On my test site "dataset AND Quantitative" returns the same as "dataset OR Quantitative"

Frank Steiner’s picture

Could you check your normal drupal search against the AND operator? When I search for "heun AND zimmer" with the normal drupal search, my first hit is the homepage of the user heun, which has no reference to the word "zimmer" anywhere in the body, and the second hit is zimmer's homepage with no reference to "heun". Same with "heun zimmer" (without quotes), which should be the same according to search_parse_query.

It seems that (at least my) drupal search ignores the AND, although search_parse_query says that every AND'ed keyword should match once. But I'm not 100% sure how exactly this is meant.

I will investigate it, but I would be interested to know what you find when you search for "term1 AND term2" on a site that has no pages containing both.

Frank Steiner’s picture

At the moment I'm not sure I understand anything about the drupal search function... Go to drupal.org (which is running 6.10 now) and type

rjerome OR asdfasdf

into the searchbox. I would expect to find 10 nodes with rjerome. Instead I get an empty list...

rjerome’s picture

Try rjerome OR asdfasdf now and you get exactly one hit (this issue node :-)). Of course we're just assuming it actually works; perhaps a cruise through the issue queues is in order to see if maybe we're not the first to discover this.

rjerome’s picture

You might find this post enlightening ... http://drupal.org/node/186242 Although I've never seen any of these warnings they are talking about.

Frank Steiner’s picture

> Try rjerome OR asdfasdf now and you get exactly one hit (this issue node :-))

Right, but that's not expected, is it? "rjerome AND asdfasdf" should return this single node, not "OR".

About the issue thread: yes, I also realized that my site ignores the AND keyword and just searches for the string "and". And if I search for "a or b" instead of "a OR b", I do get a warning.

The point is that the patch is supposed to be in drupal, but in my D6 installation at least it doesn't do anything for and/AND. I.e., at my site there is no difference between the normal drupal search and the biblio search, and there shouldn't be, because I just use the same query without scores.

So how's that at your site? If you enter "dataset AND Quantitative" in the normal search box, not in the biblio one, do you get only pages which contain both or not? Results are only really comparable if there are fewer than 10 results...

I need to understand if something is wrong with my search stuff or with the drupal search in general...

Frank Steiner’s picture

Ok, I got it. The problem was the missing COUNT statement. When you search for "term1 AND term2" drupal does an "OR" sql search but requires a match count of 2, one for each term.
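
A minimal sketch of that idea, written in Python with sqlite3 rather than as the actual query from do_search or from the patch. The two-column search_index table and the search() helper are simplifications assumed for the example, and it assumes one index row per (sid, word) pair:

```python
import sqlite3

def search(conn, words, require_all):
    """OR the words together in SQL, then enforce AND via the group count."""
    clause = " OR ".join("word = ?" for _ in words)
    # AND: one matching row per word is required; OR: a single match is enough.
    min_matches = len(words) if require_all else 1
    sql = (
        "SELECT sid FROM search_index "
        f"WHERE {clause} "
        "GROUP BY sid HAVING COUNT(*) >= ? "
        "ORDER BY sid"
    )
    return [row[0] for row in conn.execute(sql, (*words, min_matches))]

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema: one row per (node id, indexed word).
conn.execute("CREATE TABLE search_index (sid INTEGER, word TEXT)")
conn.executemany("INSERT INTO search_index VALUES (?, ?)", [
    (1, "dataset"), (1, "quantitative"),
    (2, "dataset"),
    (3, "quantitative"),
])

and_hits = search(conn, ["dataset", "quantitative"], require_all=True)
or_hits = search(conn, ["dataset", "quantitative"], require_all=False)
print(and_hits)  # [1]
print(or_hits)   # [1, 2, 3]
```

Without the HAVING COUNT(*) part, both calls would return the same three nodes, which matches the "AND and OR return the same result set" symptom reported above.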

My problem is that we were using the partial_word_search patch for drupal, and then you usually get the second match from the partial match, i.e., term1 is matched as "term1" and as "somethingelseTERM1andsoON", because the search_index table stores a name like "Volker Heun" as "volkerheun" if it is often found in this combination. So for me, the biblio and the normal search returned the same wrong results :-)
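
The effect can be reproduced with a small sketch, again Python with sqlite3 and an assumed, simplified search_index table. Modelling the partial word matching as a LIKE '%term%' comparison is an assumption for the example, not necessarily how the partial_word_search patch implements it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema: one row per (node id, indexed word).
conn.execute("CREATE TABLE search_index (sid INTEGER, word TEXT)")
conn.executemany("INSERT INTO search_index VALUES (?, ?)", [
    (1, "volker"), (1, "heun"), (1, "volkerheun"),  # no "zimmer" here
    (2, "heun"), (2, "zimmer"),
])

def and_search(conn, words, partial):
    """AND search via group count; 'partial' switches to substring matching."""
    if partial:
        clause = " OR ".join("word LIKE ?" for _ in words)
        params = [f"%{w}%" for w in words]
    else:
        clause = " OR ".join("word = ?" for _ in words)
        params = list(words)
    sql = (
        "SELECT sid FROM search_index "
        f"WHERE {clause} "
        "GROUP BY sid HAVING COUNT(*) >= ? "
        "ORDER BY sid"
    )
    return [row[0] for row in conn.execute(sql, (*params, len(words)))]

exact_hits = and_search(conn, ["heun", "zimmer"], partial=False)
partial_hits = and_search(conn, ["heun", "zimmer"], partial=True)
print(exact_hits)    # [2]
print(partial_hits)  # [1, 2]
```

With substring matching, node 1 reaches the required count of 2 through "heun" and "volkerheun" alone, so it passes the AND check although it never mentions "zimmer".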

Anyway, that was our local problem. The new patch considers the group count and should work, please let me know.

rjerome’s picture

Ahh, I thought it might have something to do with that, but I hadn't quite figured out what that count was doing.

I'll give it a try later today.

Ron.

rjerome’s picture

Status: Needs review » Fixed

Seems to work better.

Committed to CVS.

Ron.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.