This is a great module and I use it for a lot of client sites. However, the following scalability issue comes up regularly, and it is time to have it fixed in the module itself rather than patching.
The issues are:
1. In taxonomy_browser_form_validate(), there is a call to taxonomy_browser_count_nodes(), which causes hideous queries to be run like the following one. It ran for 9902 seconds before it had to be killed!
SELECT COUNT(n.nid) FROM node n
INNER JOIN term_node tn0 ON n.nid = tn0.nid
INNER JOIN term_node tn1 ON n.nid = tn1.nid
INNER JOIN term_node tn2 ON n.nid = tn2.nid
INNER JOIN term_node tn3 ON n.nid = tn3.nid
INNER JOIN term_node tn4 ON n.nid = tn4.nid
INNER JOIN term_node tn5 ON n.nid = tn5.nid
INNER JOIN term_node tn6 ON n.nid = tn6.nid
INNER JOIN term_node tn7 ON n.nid = tn7.nid
INNER JOIN term_node tn8 ON n.nid = tn8.nid
INNER JOIN term_node tn9 ON n.nid = tn9.nid
INNER JOIN term_node tn10 ON n.nid = tn10.nid
INNER JOIN term_node tn11 ON n.nid = tn11.nid
INNER JOIN term_node tn12 ON n.nid = tn12.nid
INNER JOIN term_node tn13 ON n.nid = tn13.nid
INNER JOIN term_node tn14 ON n.nid = tn14.nid
INNER JOIN term_node tn15 ON n.nid = tn15.nid
INNER JOIN term_node tn16 ON n.nid = tn16.nid
INNER JOIN term_node tn17 ON n.nid = tn17.nid
INNER JOIN term_node tn18 ON n.nid = tn18.nid
INNER JOIN term_node tn19 ON n.nid = tn19.nid
INNER JOIN term_node tn20 ON n.nid = tn20.nid
INNER JOIN term_node tn21 ON n.nid = tn21.nid
INNER JOIN term_node tn22 ON n.nid = tn22.nid
INNER JOIN term_node tn23 ON n.nid = tn23.nid
INNER JOIN term_node tn24 ON n.nid = tn24.nid
INNER JOIN term_node tn25 ON n.nid = tn25.nid
INNER JOIN term_node tn26 ON n.nid = tn26.nid
INNER JOIN term_node tn27 ON n.nid = tn27.nid
INNER JOIN term_node tn28 ON n.nid = tn28.nid
INNER JOIN term_node tn29 ON n.nid = tn29.nid
INNER JOIN term_node tn30 ON n.nid = tn30.nid
INNER JOIN term_node tn31 ON n.nid = tn31.nid
INNER JOIN term_node tn32 ON n.nid = tn32.nid
INNER JOIN term_node tn33 ON n.nid = tn33.nid
INNER JOIN term_node tn34 ON n.nid = tn34.nid
INNER JOIN term_node tn35 ON n.nid = tn35.nid
INNER JOIN term_node tn36 ON n.nid = tn36.nid
INNER JOIN term_node tn37 ON n.nid = tn37.nid
INNER JOIN term_node tn38 ON n.nid = tn38.nid
INNER JOIN term_node tn39 ON n.nid = tn39.nid
INNER JOIN term_node tn40 ON n.nid = tn40.nid
INNER JOIN term_node tn41 ON n.nid = tn41.nid
INNER JOIN term_node tn42 ON n.nid = tn42.nid
INNER JOIN term_node tn43 ON n.nid = tn43.nid
INNER JOIN term_node tn44 ON n.nid = tn44.nid
INNER JOIN term_node tn45 ON n.nid = tn45.nid
INNER JOIN term_node tn46 ON n.nid = tn46.nid
INNER JOIN term_node tn47 ON n.nid = tn47.nid
INNER JOIN term_node tn48 ON n.nid = tn48.nid
INNER JOIN term_node tn49 ON n.nid = tn49.nid
INNER JOIN term_node tn50 ON n.nid = tn50.nid
INNER JOIN term_node tn51 ON n.nid = tn51.nid
INNER JOIN term_node tn52 ON n.nid = tn52.nid
INNER JOIN term_node tn53 ON n.nid = tn53.nid
WHERE tn0.tid IN (1)
AND tn1.tid IN (2)
AND tn2.tid IN (3)
AND tn3.tid IN (4)
AND tn4.tid IN (5)
AND tn5.tid IN (6)
AND tn6.tid IN (7)
AND tn7.tid IN (8)
AND tn8.tid IN (9)
AND tn9.tid IN (10)
AND tn10.tid IN (11)
AND tn11.tid IN (12)
AND tn12.tid IN (13)
AND tn13.tid IN (14)
AND tn14.tid IN (15)
AND tn15.tid IN (16)
AND tn16.tid IN (17)
AND tn17.tid IN (18)
AND tn18.tid IN (19)
AND tn19.tid IN (20)
AND tn20.tid IN (21)
AND tn21.tid IN (22)
AND tn22.tid IN (23)
AND tn23.tid IN (24)
AND tn24.tid IN (25)
AND tn25.tid IN (26)
AND tn26.tid IN (27)
AND tn27.tid IN (28)
AND tn28.tid IN (29)
AND tn29.tid IN (30)
AND tn30.tid IN (31)
AND tn31.tid IN (32)
AND tn32.tid IN (33)
AND tn33.tid IN (34)
AND tn34.tid IN (35)
AND tn35.tid IN (36)
AND tn36.tid IN (37)
AND tn37.tid IN (38)
AND tn38.tid IN (39)
AND tn39.tid IN (40)
AND tn40.tid IN (41)
AND tn41.tid IN (42)
AND tn42.tid IN (43)
AND tn43.tid IN (44)
AND tn44.tid IN (45)
AND tn45.tid IN (46)
AND tn46.tid IN (47)
AND tn47.tid IN (48)
AND tn48.tid IN (49)
AND tn49.tid IN (50)
AND tn50.tid IN (51)
AND tn51.tid IN (52)
AND tn52.tid IN (53)
AND tn53.tid IN (54)
An option to do away with this query should do the trick.
2. When you have terms listed as checkboxes, and then a user selects too many, or all checkboxes (think of 30 or so terms), slow queries result and can bog down the server.
In the past, I used to solve this by some code like this in a separate module.
efine('MISC_TAXONOMY_BROWSER_TERM_LIMIT', 5);
/**
* Add a validator function to the taxonomy browser selection form
*/
function misc_form_taxonomy_browser_form_alter(&$form, &$form_state) {
$form['#validate'] = array_merge(array('misc_taxonomy_browser_validate'), $form['#validate']);
}
/*
* Validate that we don't have more than a certain number of terms
* This makes sure we don't have too many joins in the query, which
* hurts performance.
*/
function misc_taxonomy_browser_validate($form_id, $form_state) {
$count = 0;
foreach($form_state['values']['taxonomy'] as $vocab => $data) {
foreach($data as $k => $v) {
if ($v) {
$count++;
}
}
}
if ($count > MISC_TAXONOMY_BROWSER_TERM_LIMIT) {
form_set_error('taxonomy', t('Too many selections (@count). Please select a fewer number of categories and try again',
array('@count' => $count)));
}
}
It works, but it would be helpful to have an option "Limit the number of search terms to : [ ]" and the user can select a reasonable number. I found that 5 usually works, but it depends on the hosting hardware really.
Both the above new options should have a description that says if you do not select a reasonable value, you can get severe performance problems, ....etc.
| Comment | File | Size | Author |
|---|---|---|---|
| #1 | taxonomy_browser.patch | 1.36 KB | jax |
Comments
Comment #1
jax commentedHad the same problem. Just change the join to join on vid and the query executes much, much faster. Joining on nid is in fact incorrect but produces correct results as long as you're not using versions.