Search problem in Drupal 5

FredJones - August 4, 2008 - 14:50

We imported a large set of files with the import_html module and made
them all book nodes. We actually imported them into a local install
first and then ran a PHP script to massage the imported data (making
some adjustments and corrections, some of them to the body field of the
records in node_revisions). I then uploaded that DB to the web server
and clicked "Re-index site", and now 100% of the site has been
indexed.

When I run a search for "Hull", however, I get 7 results, which is very
nice, but node 21206 contains that word (twice, in fact) and yet it
doesn't appear in the results list. Nodes 10037 and 23089 do show up in
the results, so nodes both before and after the 'missing' one are working.

I am using the standard search facility. I didn't change the default
setting of 3 for "Minimum word length to index:", and "Simple CJK
handling" is checked.

Any ideas how I can debug this?

Thanks!

Hetta - August 4, 2008 - 15:00

For Drupal 5.x, trip_search works admirably, without the need to reindex things all the time.

FredJones - August 4, 2008 - 15:45

Looks great! Pity I never knew about this long ago. Thank you very much.

FredJones - August 5, 2008 - 11:21

Even trip_search doesn't find all search results, as far as I can see. For example, if I run a search using Drupal's regular search module for a certain word like "Hull," I get 2 results. With trip_search, I get 19, but if I execute this SQL:

select * from node_revisions where body like '%Hull%';

I get 34 results! And I confirmed that all 34 are unique nodes as well.
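
One thing worth checking when comparing these counts: LIKE '%Hull%' matches substrings (for example "Hulls") and looks at every row in node_revisions, while an index-based search typically matches whole words only. The following is a minimal sketch of how the two result sets could be diffed, written in Python against an in-memory SQLite database. The table layouts are simplified stand-ins and the rows are made-up toy data, not the real site's schema or content; the function names are hypothetical.

```python
import re
import sqlite3

# Toy stand-ins for the Drupal tables. Columns and rows are illustrative
# assumptions, not data from the site discussed in this thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node_revisions (nid INTEGER, vid INTEGER, body TEXT);
    CREATE TABLE search_index (word TEXT, sid INTEGER, type TEXT);
    INSERT INTO node_revisions VALUES
        (1, 1, 'The Hull of the ship'),
        (2, 2, 'Several Hulls were repaired'),
        (3, 3, 'Hull is a city'),
        (3, 4, 'Hull is a city in England');
    INSERT INTO search_index VALUES ('hull', 1, 'node');
""")


def like_matches(conn, term):
    """Node ids whose body contains the term as a substring (any revision)."""
    rows = conn.execute(
        "SELECT DISTINCT nid FROM node_revisions WHERE body LIKE ?",
        ("%" + term + "%",),
    )
    return {nid for (nid,) in rows}


def indexed_matches(conn, term):
    """Node ids that the (toy) search index lists for the lowercased word."""
    rows = conn.execute(
        "SELECT DISTINCT sid FROM search_index WHERE type = 'node' AND word = ?",
        (term.lower(),),
    )
    return {sid for (sid,) in rows}


def whole_word_matches(conn, term):
    """Node ids whose body contains the term as a whole word."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    rows = conn.execute("SELECT nid, body FROM node_revisions")
    return {nid for nid, body in rows if pattern.search(body)}


# Nodes that LIKE finds but the index does not list for this word.
missing = like_matches(conn, "Hull") - indexed_matches(conn, "Hull")
# Nodes that only match because the term is part of a longer word.
substring_only = like_matches(conn, "Hull") - whole_word_matches(conn, "Hull")

print(sorted(missing))
print(sorted(substring_only))
```

Nodes that end up in `missing` but not in `substring_only` contain the whole word and are still absent from the index, so those are the ones worth inspecting individually.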

I suppose if no better solution exists, I will just build a module around that SQL. It seems to be the only 100% solution.

Unless I am missing something here.

Thanks.

Hetta - August 5, 2008 - 15:01

Is it possible that some of your nodes are unpublished or access-restricted?

FredJones - August 6, 2008 - 08:41

Nope. They are all the same: published and "Access restricted for non-premium users" (using the premium module). That is, the nodes that are found and the ones that are not found have the same two settings.

Thanks.

 
 
