While reviewing the recent patch that adds a new hook_ranking, I noticed that some of the equations used to normalize certain ranking factors are not functioning as expected. Robert Douglass posted a blog entry about scoring factors not behaving as expected here: http://acquia.com/blog/drupals-search-compared-google-and-yahoo.
Statistics and Comments
In the statistics scoring algorithm (currently residing in node.module) we can see:
Statistics:
'score' => '2.0 - 2.0 / (1.0 + node_counter.totalcount * %f)',
'arguments' => array(variable_get('node_cron_views_scale', 0)),
Comments:
'score' => '2.0 - 2.0 / (1.0 + node_comment_statistics.comment_count * %f)',
'arguments' => array(variable_get('node_cron_comments_scale', 0)),
The %f is the setting that the site admin applies to the score on the search settings page (1 - 10). As the screenshot of the graphed function shows, the range of this function is (0,2) and not (0,1).
Normalizing these score factors to (0,1) matters because the other score factors ARE normalized properly, so these two factors end up with higher score modifiers than the others when set to the same weight in the search settings. Can anyone else verify this?
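As a rough check of the range claimed above, the bounds can be worked out by hand. This is only a sketch: the symbols x (the view or comment count) and s (the value substituted for %f) are names introduced here for illustration, and it assumes s is a fixed positive constant while x can grow without limit.

```latex
% Assumption: x >= 0 is the count, s > 0 is the value substituted for %f.
\[
  f(x) = 2 - \frac{2}{1 + s\,x}, \qquad
  f(0) = 2 - \frac{2}{1} = 0, \qquad
  \lim_{x \to \infty} f(x) = 2 .
\]
% f is increasing in x, so under these assumptions its values lie in [0, 2),
% and f(x) = 1 exactly when s x = 1.
```

Under those assumptions the score exceeds 1 as soon as s·x is greater than 1, which is the situation the graph describes.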
| Comment | File | Size | Author |
|---|---|---|---|
| | Picture 4.png | 25.07 KB | BlakeLucchesi |
Comments
Comment #1
cwgordon7 commented
Agreed. Node promotion and stickiness are automatically normalized on a 0 - 1 scale; comments and statistics should be too. This can probably simply be changed to:
Comment #2
cwgordon7 commented
False alarm, the algorithms are fine. %f is the reciprocal of the maximum number of comments on a single node, so the maximum possible score is 1 and the minimum possible is 0. Both the comment and statistics rankings are therefore normalized properly.
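The reasoning in this comment can be checked numerically with a small standalone sketch. It is written in Python rather than as the module's actual query, the function names and sample counts are hypothetical, and the guard against a zero maximum is an assumption rather than something stated in the thread.

```python
def score(count, scale):
    """Mirror of the ranking expression 2.0 - 2.0 / (1.0 + count * scale)."""
    return 2.0 - 2.0 / (1.0 + count * scale)


def scale_for(counts):
    """Reciprocal of the largest count (assumed guard: never divide by zero)."""
    return 1.0 / max(1, max(counts))


# Hypothetical comment counts for four nodes.
counts = [0, 3, 12, 40]
scale = scale_for(counts)
scores = [score(c, scale) for c in counts]

for c, s in zip(counts, scores):
    print(f"count={c:>3}  score={s:.4f}")
```

With the scale tied to the largest count, count * scale never exceeds 1, so the score stays between 0 and 1. Passing a fixed admin-style value instead, for example score(100, 5), gives a result above 1, which is the case the original report was worried about.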
Comment #3
Anonymous (not verified) commented
Automatically closed -- issue fixed for two weeks with no activity.