Filtering Read Count stats by Agent?
| Project: | Drupal |
| Version: | 7.x-dev |
| Component: | other |
| Category: | feature request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | active |
I've noticed for some time that my sidebar Popular Content is largely noise, and sifting through my referrer log, it's easy to see why: these aren't nodes being read; they are primarily nodes being indexed (or referrer-spammed).
There was some talk (see access log / referrer filter) of letting us filter our own IPs from the logs. That is still useful, since it would exclude not only self-references but also testing from the dev sites, but this issue is different. I already exclude referrer spammers directly with "deny from" rules in the Apache .htaccess; what I want here is a way to keep spiders and other bots from polluting the Popular Content counts. If I want hard hit counts, I can still refer to my webserver logs, so it's no great loss if this also means losing all Stats-log page counts from spiders and bots. Removing those referrers would greatly improve the meaning of the Today's and especially the Last Viewed sidebar.
One potential problem: this may be a fairly large list of exemptions, as almost every web crawler has its own unique Agent string. The filter would need to be one or more regexes, since a list of literal string exemptions is probably impractical. Since most non-MSIE browsers identify themselves as "Mozilla compatible" (or something like that), it may also be easier to specify a positive matching regex than to attempt any meaningful exclusion rule.
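To make the two options concrete, here is a minimal, language-neutral sketch in Python rather than actual Drupal code. The function name `should_count` and both patterns are hypothetical and deliberately incomplete; a real filter would need an admin-configurable, maintained list. It shows an exclusion regex, with an optional positive match layered on top.

```python
import re

# Hypothetical exclusion pattern: substrings commonly seen in crawler
# Agent strings. Illustrative only, not an exhaustive list.
BOT_PATTERN = re.compile(r"bot|crawl|spider|slurp", re.IGNORECASE)

# Hypothetical positive pattern: Agent strings that start like a browser.
BROWSER_PATTERN = re.compile(r"^Mozilla/\d")


def should_count(user_agent, positive_match=False):
    """Return True if a page view with this Agent string should be counted."""
    if not user_agent:
        # No Agent string at all: treat as a non-browser client.
        return False
    if BOT_PATTERN.search(user_agent):
        # Matches the exclusion regex: skip the read-count update.
        return False
    if positive_match:
        # Stricter mode: also require a browser-like prefix.
        return BROWSER_PATTERN.match(user_agent) is not None
    return True
```

One caveat on the positive-match idea: some crawlers also send a "Mozilla/5.0 (compatible; ...)" style Agent string, so a positive regex on its own would probably still let them through. That is why this sketch applies the exclusion pattern first in both modes.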

#3
Feature requests go to CVS.