The performance of content searches in Drupal is generally poor. A content search query will generate two temporary tables and do a node_load on each search result. This makes search results a logical target for caching.
Yet we can't really cache all search results since there is an infinite number of different search queries. Furthermore, we can't cache search results for authenticated users since we then have no way of guaranteeing node_access restrictions.
The solution reached in this patch is to cache a limited number of popular search results for anonymous users.
The first thing this patch does (besides creating a cache_search table) is to use the location column of the watchdog table to determine which search queries have been popular in recent history. While this isn't a perfect predictor of future searching patterns, it should do pretty well. When cron runs, an array of the 100 most popular search queries is saved using variable_set.
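The ranking step described above could look roughly like the following. This is a minimal Python sketch of the idea, not the patch's actual code: the function name `popular_queries`, the `/search/node/` path prefix (which assumes clean URLs), and the lowercasing of keys are all illustrative assumptions.

```python
from collections import Counter
from urllib.parse import urlparse, unquote

# Assumption: content searches are logged with clean URLs of the form
# http://example.com/search/node/<keys>.
SEARCH_PREFIX = "/search/node/"


def popular_queries(locations, limit=100):
    """Return up to `limit` search keys, most frequent first.

    `locations` stands in for the values of the watchdog table's
    location column; the name and shape are hypothetical.
    """
    counts = Counter()
    for location in locations:
        path = urlparse(location).path
        if not path.startswith(SEARCH_PREFIX):
            continue  # not a content search request
        # Normalizing case is an assumption made for this sketch.
        keys = unquote(path[len(SEARCH_PREFIX):]).strip().lower()
        if keys:
            counts[keys] += 1
    return [keys for keys, _ in counts.most_common(limit)]
```

In the scheme described here, the equivalent of this list would be rebuilt on each cron run and stored with variable_set.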
Then, when an anonymous user searches, if the query is in the 100 most popular array, the search results are cached. From that point on, the cached results can be served.
The cache is invalidated on node_delete and on cron runs. The cache is not invalidated on node_update or node_insert because the search index is not updated in those cases.
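The caching and invalidation rules above can be modeled in a few lines. This is a hedged Python sketch under stated assumptions, not the patch itself: the class `SearchCache`, its method names, and the `run_search` callback are hypothetical, and an in-memory dict stands in for the cache_search table.

```python
class SearchCache:
    """Caches results for popular queries made by anonymous users only."""

    def __init__(self, popular):
        self.popular = set(popular)  # e.g. the 100 most popular queries
        self.entries = {}            # stand-in for the cache_search table

    def search(self, keys, is_anonymous, run_search):
        # Authenticated users always get a fresh search, so per-user
        # access restrictions are never bypassed by a shared cache entry.
        if not is_anonymous or keys not in self.popular:
            return run_search(keys)
        if keys not in self.entries:
            self.entries[keys] = run_search(keys)
        return self.entries[keys]

    def invalidate(self):
        # Called when a node is deleted and on every cron run.
        self.entries.clear()

    def on_cron(self, popular):
        # Cron refreshes the popular-query list and drops stale results.
        self.popular = set(popular)
        self.invalidate()
```

Clearing the whole cache on every invalidation keeps the sketch simple; with at most 100 cached queries, rebuilding the entries costs at most 100 uncached searches.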
Benchmarking search for anonymous users with only core modules enabled, 1000 nodes, and 5000 comments shows an average of 1.69 to 1.91 search requests per second. This sucks pretty badly compared to serving other things to anonymous users. After applying the patch, this rose to 7.65-7.66 requests per second, roughly four times the original rate. Searches by authenticated users should not be impacted in any significant way, as the added overhead is tiny.
| Comment | File | Size | Author |
|---|---|---|---|
| | search_cache_D6.patch | 5.63 KB | robertdouglass |
Comments
Comment #1
robertdouglass commented
On Drupal.org, around half of all search queries are from anonymous users. The most frequently searched terms include the following (these numbers include both authenticated and anonymous):
The implication then is that from the top 10 search terms alone, Drupal.org would be able to serve around 1,400 cached search results. Many more for the top 100.
Comment #2
bdragon commented
Autopatch Results:
patching file modules/search/search.module
Hunk #1 succeeded at 302 (offset 28 lines).
Hunk #3 succeeded at 942 (offset 28 lines).
Hunk #4 succeeded at 934 with fuzz 2.
patching file modules/search/search.install
Hunk #1 FAILED at 32.
Hunk #2 FAILED at 68.
2 out of 2 hunks FAILED -- saving rejects to file modules/search/search.install.rej
patching file modules/system/system.install
Hunk #1 FAILED at 3724.
1 out of 1 hunk FAILED -- saving rejects to file modules/system/system.install.rej
Installation stuff is out of date...
Comment #3
robertdouglass commented
This patch is flawed.
Comment #4
jhodgdon
I'm not sure why a flawed patch caused this feature request to be marked as "won't fix".
In any case, there is now a request for the same thing at #242187: Cache search results so I will change this to be marked "duplicate".