New hooks hook_hierarchical_select_offspring_count() and hook_hierarchical_select_children_count()
| Project: | Hierarchical Select |
| Version: | 6.x-3.x-dev |
| Component: | Code |
| Category: | task |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | postponed |
| Issue tags: | Performance |
(Sorry for my English.)
I wrote a long post explaining the "why?" and the "what?", but I accidentally pressed the browser's "back" shortcut and lost it, so for now I'll just post the code.
I'm writing an HS implementation that works with a table (with the structure [id, parent_id, title, weight, ...]), and I don't use tree caching.
hook_hierarchical_select_children_children_count
The function _hierarchical_select_hierarchy_add_childinfo() is a performance bottleneck for HS implementations that don't use tree caching.
A new variant of _hierarchical_select_hierarchy_add_childinfo() for hierarchical_select.module:
<?php
function _hierarchical_select_hierarchy_add_childinfo($hierarchy, $config) {
  if (module_hook($config['module'], 'hierarchical_select_children_children_count')) {
    // New path: one hook call per level instead of one call per item.
    $parent_item = NULL;
    foreach ($hierarchy->lineage as $depth => $item) {
      $hierarchy->childinfo[$depth] = module_invoke($config['module'], 'hierarchical_select_children_children_count', $parent_item, $config['params']);
      $parent_item = $item;
    }
    // $hierarchy->childinfo[] now has data, but the items are not in the same
    // order as in $hierarchy->levels[]. I don't know whether that is a problem.
  }
  else {
    // Fallback: the current behaviour, one hook call per item.
    foreach ($hierarchy->levels as $depth => $level) {
      foreach (array_keys($level) as $item) {
        if (!preg_match('/(none|label_\d+|create_new_item)/', $item)) {
          $hierarchy->childinfo[$depth][$item] = count(module_invoke($config['module'], 'hierarchical_select_children', $item, $config['params']));
        }
      }
    }
  }
  return $hierarchy;
}
?>
Example of the new hook implementation (from my module):
<?php
function datafield_hierarchical_select_children_children_count($parent, $params) {
  $item_list = array();
  // For each child of $parent, count that child's own children in a single query.
  if (is_null($parent)) {
    $result = db_query('SELECT p.'.$params['id_field_name'].', COUNT(c.'.$params['id_field_name'].') AS cnt FROM {'.$params['table_name'].'} AS p LEFT JOIN {'.$params['table_name'].'} AS c ON (c.'.$params['parent_id_field_name'].' = p.'.$params['id_field_name'].') WHERE (p.'.$params['parent_id_field_name'].' IS NULL) GROUP BY p.'.$params['id_field_name']);
  }
  else {
    $result = db_query('SELECT p.'.$params['id_field_name'].', COUNT(c.'.$params['id_field_name'].') AS cnt FROM {'.$params['table_name'].'} AS p LEFT JOIN {'.$params['table_name'].'} AS c ON (c.'.$params['parent_id_field_name'].' = p.'.$params['id_field_name'].') WHERE (p.'.$params['parent_id_field_name'].' = %d) GROUP BY p.'.$params['id_field_name'], $parent);
  }
  while ($item = db_fetch_array($result)) {
    $item_list[$item[$params['id_field_name']]] = (int) $item['cnt'];
  }
  return $item_list;
}
?>
hook_hierarchical_select_children_count
Use
<?php
module_invoke($config['module'], 'hierarchical_select_children_count', $parent, $config['params']);
?>
instead of
<?php
count(module_invoke($config['module'], 'hierarchical_select_children', $parent, $config['params']));
?>
Example of the new hook implementation (from my module):
<?php
function datafield_hierarchical_select_children_count($parent, $params) {
  // Count the direct children of $parent without fetching them.
  $result = db_result(db_query('SELECT COUNT(*) FROM {'.$params['table_name'].'} WHERE ('.$params['parent_id_field_name'].' = %d)', $parent));
  return $result;
}
?>
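To illustrate what the grouped query above computes, here is a minimal stand-alone sketch that does the same counting over an in-memory array instead of a database table. The function name sketch_offspring_count() and the sample rows are hypothetical and exist only for this illustration; they are not part of Hierarchical Select or of my module.

```php
<?php
// Hypothetical in-memory stand-in for a table with (id, parent_id) columns.
// Returns, for every direct child of $parent, the number of that child's own
// children, which is what the LEFT JOIN ... GROUP BY query yields.
function sketch_offspring_count(array $rows, $parent) {
  $counts = array();
  // First pass: every direct child of $parent starts at zero, so children
  // without offspring still appear in the result (the LEFT JOIN behaviour).
  foreach ($rows as $row) {
    if ($row['parent_id'] === $parent) {
      $counts[$row['id']] = 0;
    }
  }
  // Second pass: count the rows whose parent is one of those children.
  foreach ($rows as $row) {
    if ($row['parent_id'] !== NULL && isset($counts[$row['parent_id']])) {
      $counts[$row['parent_id']]++;
    }
  }
  return $counts;
}

// Sample hierarchy (made up): 1 and 2 are roots, 3 and 4 are under 1, 5 is under 3.
$rows = array(
  array('id' => 1, 'parent_id' => NULL),
  array('id' => 2, 'parent_id' => NULL),
  array('id' => 3, 'parent_id' => 1),
  array('id' => 4, 'parent_id' => 1),
  array('id' => 5, 'parent_id' => 3),
);
$root_counts = sketch_offspring_count($rows, NULL);
$level_counts = sketch_offspring_count($rows, 1);
```

One call returns the counts for a whole level, so the number of calls grows with the depth of the lineage rather than with the number of items shown.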
#1
Hm, I can definitely see where this is useful. But I'm not sure this is significantly faster than fetching the data for the children itself as well. As you say yourself already, with proper caching in place, it's not slow.
The collecting of child info is an optimization by the way: thanks to that, no callbacks to the server are needed when an item has no children.
Please post either benchmarks comparing the current API with the modified API, or at least post your "I have write a large post with explanation "why?" and "what?"."
Thanks! :)
#2
> The collecting of child info is an optimization by the way: thanks to that, no callbacks to the server are needed when an item has no children.
I know this.
> Please post either benchmarks comparing the current API with the modified API, or at least post your "I have write a large post with explanation "why?" and "what?"."
Hierarchy:
total items: 11997
max depth: 3
"DataField" is my HS implementation, which does not use the tree cache:
<?php
function datafield_hierarchical_select_children($parent, $params) {
  $item_list = array();
  // Order by weight first when a weight column is configured, then by title.
  if ($params['weight_field_name']) {
    $result = db_query('SELECT '.$params['id_field_name'].', '.$params['title_field_name'].' FROM {'.$params['table_name'].'} WHERE ('.$params['parent_id_field_name'].' = %d) ORDER BY '.$params['weight_field_name'].', '.$params['title_field_name'], $parent);
  }
  else {
    $result = db_query('SELECT '.$params['id_field_name'].', '.$params['title_field_name'].' FROM {'.$params['table_name'].'} WHERE ('.$params['parent_id_field_name'].' = %d) ORDER BY '.$params['title_field_name'], $parent);
  }
  while ($item = db_fetch_array($result)) {
    $item_list[$item[$params['id_field_name']]] = $item[$params['title_field_name']];
  }
  return $item_list;
}
?>
Test: pressing F5 20 times on the page http://localhost/node/%/edit (a node with one field whose value is set at the maximum depth), with Devel's "Collect query info" and "Display query log" options enabled.
Results
DataField (no tree caching): 324 SQL queries
408.23, 400.23, 373.33, 397.01, 351.79, 616.23 (276.56 - sess_write), 396.77, 359.06, 370.17, 373.91, 412.96, 382.76, 387.08, 374.4, 350.4, 403.8, 1013.39 (402.03 - cache_set, 288.33 - sess_write), 371.04, 360.13, 352.24
TaxonomyField (tree caching): 86 SQL queries
247.82, 226.23, 221.07, 232.55, 781.08 (588.41 - sess_write), 200.47, 214.61, 226.61, 217.02, 222.35, 245.67, 227.82, 268.49, 377.6 (156.66 - sess_write), 220.97, 202.57, 370.6 (163 - sess_write), 233.43, 756.38 (524.65 - cache_clear_all), 244.5
DataField (no tree caching + new hook): 107 SQL queries
216.77, 191.62, 182.26, 186.53, 298.45 (109.72 - sess_write), 168.05, 214.68, 262.84 (76.49 - sess_write), 199.02, 179.42, 245.62 (74.31 - sess_write), 224.99 (31.95 - sess_write), 167.94, 181.84, 195.69, 848.65 (492.68 - cache_set), 161.22, 162.9, 173.45, 199.11
Results (with values far from the average filtered out)
DataField (no tree caching):
408.23, 400.23, 373.33, 397.01, 351.79, 396.77, 359.06, 370.17, 373.91, 412.96, 382.76, 387.08, 374.4, 350.4, 403.8, 371.04, 360.13, 352.24
TaxonomyField (tree caching):
247.82, 226.23, 221.07, 232.55, 200.47, 214.61, 226.61, 217.02, 222.35, 245.67, 227.82, 268.49, 220.97, 202.57, 233.43, 244.5
DataField (no tree caching + new hook):
216.77, 191.62, 182.26, 186.53, 168.05, 214.68, 199.02, 179.42, 167.94, 181.84, 195.69, 161.22, 162.9, 173.45, 199.11
Total results (average of the filtered values)
DataField (no tree caching): 379.18
TaxonomyField (tree caching): 228.26
DataField (no tree caching + new hook): 185.37
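The averages can be reproduced from the filtered lists above. The following is just a small sketch of that arithmetic; the helper name sketch_mean() and the variable names are made up for this example.

```php
<?php
// Arithmetic mean of a list of timings (hypothetical helper).
function sketch_mean(array $samples) {
  return array_sum($samples) / count($samples);
}

// Filtered timings, copied from the lists above.
$no_cache = array(408.23, 400.23, 373.33, 397.01, 351.79, 396.77, 359.06,
  370.17, 373.91, 412.96, 382.76, 387.08, 374.4, 350.4, 403.8, 371.04,
  360.13, 352.24);
$tree_cache = array(247.82, 226.23, 221.07, 232.55, 200.47, 214.61, 226.61,
  217.02, 222.35, 245.67, 227.82, 268.49, 220.97, 202.57, 233.43, 244.5);
$new_hook = array(216.77, 191.62, 182.26, 186.53, 168.05, 214.68, 199.02,
  179.42, 167.94, 181.84, 195.69, 161.22, 162.9, 173.45, 199.11);

$avg_no_cache = round(sketch_mean($no_cache), 2);
$avg_tree_cache = round(sketch_mean($tree_cache), 2);
$avg_new_hook = round(sketch_mean($new_hook), 2);
```

With these numbers, the new hook brings the DataField average down to roughly half of the value measured without it.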
#3
#4
Thanks!
I agree that this can simplify the implementation/performance optimization a lot. So I'm going to add this. But it's going to become an *optional* hook. Meaning that you can implement it, but if it's not implemented, it'll fall back to the current method of counting children.
But instead of hook_hierarchical_select_children_children_count(), I'd like to see it named hook_hierarchical_select_offspring_count(). hook_hierarchical_select_children_count() is a fine name already.
Care to roll a proper patch? :)
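A rough sketch of what "optional hook with a fallback" could look like, written without Drupal so it runs on its own. Plain function_exists() and variable function calls stand in for module_hook() and module_invoke(), and the demo_slow and demo_fast "modules" as well as sketch_children_count() are hypothetical names used only for this illustration, not the code that will be committed.

```php
<?php
// A hypothetical "module" that only implements the existing children hook.
function demo_slow_hierarchical_select_children($parent, $params) {
  return array(10 => 'First child', 11 => 'Second child');
}

// A hypothetical "module" that also implements the optional count hook.
function demo_fast_hierarchical_select_children($parent, $params) {
  return array(20 => 'Only child');
}
function demo_fast_hierarchical_select_children_count($parent, $params) {
  return 1;
}

// Use the count hook when the module provides it; otherwise fall back to
// fetching the children and counting them, as the current code does.
function sketch_children_count($module, $parent, $params) {
  $count_hook = $module . '_hierarchical_select_children_count';
  if (function_exists($count_hook)) {
    return (int) $count_hook($parent, $params);
  }
  $children_hook = $module . '_hierarchical_select_children';
  return count($children_hook($parent, $params));
}

$slow = sketch_children_count('demo_slow', 1, array());
$fast = sketch_children_count('demo_fast', 1, array());
```

Existing implementations keep working unchanged, and only modules that care about the extra queries need to implement the new hook.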
#5
(Sorry for my English.)
You're welcome.
I really don't know how to make a proper patch. Also, I don't like patches...
But if you post a DETAILED guide here on how to make a patch on a local computer (WinXP), and which software I need to use, I'll do it.
#6
http://drupal.org/patch/create :)
#7
mmm... thanks :)
I read it earlier. It's confusing and too abstract for me.
I have no time to read it all again right now...
Maybe later, when I have a lot of free time, I'll do it... But that won't be very soon!
#8
If you're a Linux user, then it's really simple.
0) Add these Drupal command-line shortcuts: http://wimleers.com/blog/handy-drupal-command-line-shortcuts
1) Do a CVS checkout of the Drupal module.
2) Make your changes
3) Run "ddiff"
#9
#10
#11
This is a performance feature.