Closed (fixed)
Project:
Mailing List Archive
Component:
Code
Priority:
Minor
Category:
Feature request
Assigned:
Unassigned
Reporter:
Created:
15 Oct 2007 at 20:49 UTC
Updated:
6 Dec 2007 at 12:41 UTC
This patch implements the Drupal _search and _update_index hooks.
An additional parameter (mailarchive_search_limit) has been added to the mailarchive configuration page. It limits the number of messages indexed per cron run, or disables indexing altogether.
I didn't test it on large mailing list archives, so keep an eye on the search_index table, it may become huge...
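As a rough illustration of the per-cron-run limit described above, here is a minimal Python sketch. It is not the patch itself (which is PHP against Drupal's hooks): the function name `update_index`, the message dictionaries, the high-water-mark bookkeeping, and the convention that a limit of 0 disables indexing are all assumptions made for the example.

```python
import re
from collections import Counter


def update_index(messages, index_rows, last_indexed_id, search_limit):
    """Index at most `search_limit` not-yet-indexed messages.

    Hypothetical stand-in for a cron-driven indexing hook. Returns the id of
    the last message indexed so the next cron run can resume from there.
    """
    # Assumption: a limit of 0 (or less) means indexing is disabled.
    if search_limit <= 0:
        return last_indexed_id

    pending = sorted(
        (m for m in messages if m["id"] > last_indexed_id),
        key=lambda m: m["id"],
    )
    for message in pending[:search_limit]:
        # One index row per distinct word in the message, scored by frequency.
        words = Counter(re.findall(r"\w+", message["body"].lower()))
        for word, score in words.items():
            index_rows.append((word, message["id"], score))
        last_indexed_id = message["id"]
    return last_indexed_id
```

Each call plays the role of one cron run: with three pending messages and a limit of 2, the first call indexes two messages and the next call picks up the third.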
| Comment | File | Size | Author |
|---|---|---|---|
| #8 | mailarchive_disablesearch.patch | 476 bytes | rgareus |
| | mailarchive_search.patch | 5.17 KB | BartHanssens |
Comments
Comment #1
jeremy commented
Very cool, thanks! I'll be merging it shortly. However, first I want to give some more thought to whether there's any way I can better optimize it for very large datasets. Something tells me I'm going to need a bigger database server... ;)
Comment #2
BartHanssens commented
Well, since this patch uses the normal Drupal search, it sure is a nice way to test the number of rows your preferred database can handle :-)
I think I'll take a look at SQLSearch (http://drupal.org/project/trip_search), which might be a better approach.
Comment #3
jeremy commented
I see that the SQLSearch module offers more functionality (which is cool), but I didn't read anything that implies it offers better performance... am I missing something?
In any case, I'd love to see a patch that supports both Drupal's built-in search and the more advanced SQLSearch module. And yes, I still plan to merge your search patch, as soon as I find a little more time.
Comment #4
BartHanssens commented
SQLSearch (aka trip search) uses the native MySQL fulltext search instead of Drupal's cron indexing and storing of keys in the search_index table. While this might or might not enhance search performance, it eliminates the indexing cron runs and saves you a few million rows in the search_index table (at the expense of a fulltext index).
If I recall correctly, fulltext search has recently been added to PostgreSQL core as well, although PostgreSQL isn't fully supported by trip search (yet).
Multi-million row tables shouldn't be a problem for any database, but it would indeed be nice to support both sqlsearch and drupal search. I'm willing to implement it, but I'm short on spare time at the moment.
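To make the row-count trade-off above concrete, here is a small Python sketch under a simplified model. It assumes a word-index table stores one row per distinct (word, message) pair, while a native fulltext index searches the message table directly, so one row per message. The function names are hypothetical, and Drupal's actual search_index schema and scoring are not reproduced here.

```python
import re


def index_table_rows(bodies):
    """Rows needed by a word-index table: one per distinct (word, message) pair."""
    return sum(len(set(re.findall(r"\w+", body.lower()))) for body in bodies)


def fulltext_table_rows(bodies):
    """Rows searched with a native fulltext index: one per message."""
    return len(bodies)


if __name__ == "__main__":
    sample = ["the patch works", "the patch the patch", "thanks"]
    print(index_table_rows(sample), fulltext_table_rows(sample))
```

Under this model the index table grows with the number of distinct words per message, which is why a large archive can reach millions of rows while the message table itself stays comparatively small.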
Comment #5
jeremy commented
I integrated the mailarchive module with sphinx for very impressive search performance. You can try it out here. Or, a specific search.
I need to dust off the code before I merge it, it's a little ugly in places. And I still plan to merge in your support for Drupal native search, too.
Comment #6
rgareus commented
Hi, I only just found this thread, after seeing it on kerneltrap.org - really cool!!
I had a similar inspiration and hacked away last Sunday using swish-e.org for indexing the mailarchive. The Drupal module is available at http://mir.dnsalias.com/oss/swishmail/ - I'm not yet sure whether to continue this endeavour. swishmail might have a few use cases, but sphinxsearch is great!
Comment #7
jeremy commented
I finally merged this patch, with a few minor changes. Thanks Bart!!
Comment #8
rgareus commented
Huge email archives are beyond Drupal's built-in search capabilities. Even if one disables indexing, the "Mail archive message" search button is still displayed on the /search page. The attached one-liner fixes this.
Comment #9
BartHanssens commented
Nice work, the module keeps getting better :-)
Comment #10
jeremy commented
In the future, please open a new issue when fixing new bugs, and be sure to actually re-open the issue if updating an existing issue, or it may get ignored. (You left the status set to "fixed"; in a larger issue queue I wouldn't have noticed.)
Patch applied, thanks!
Comment #11
(not verified) commented
Automatically closed -- issue fixed for two weeks with no activity.