The code below causes the statistics day timestamp to drift by the time it takes to execute the db operation that clears all the daycount fields. A small site will never notice, but with tens of thousands of nodes it becomes noticeable.

  if ((time() - $statistics_timestamp) >= 86400) {
    // Reset day counts.
    db_query('UPDATE {node_counter} SET daycount = 0');
    variable_set('statistics_day_timestamp', time());
  }

It should say something like

  if ((time() - $statistics_timestamp) >= 86400) {
    // Reset day counts.
    db_query('UPDATE {node_counter} SET daycount = 0');
    variable_set('statistics_day_timestamp', $statistics_timestamp + 86400);
  }
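
One caveat with adding exactly 86400: if cron is skipped for more than a day, the stored stamp stays more than a day behind, and every following cron run resets the counts until it catches up. A minimal sketch of one way around that, assuming PHP 7+ (for intdiv) and a hypothetical helper name advance_day_stamp() that is not part of Drupal:

```php
<?php
// Hypothetical helper (not a Drupal API): move the stored day stamp forward
// by whole days only, so it keeps its time of day and never lags more than
// one day behind $now.
function advance_day_stamp(int $stamp, int $now): int {
  if ($now - $stamp < 86400) {
    // Less than a day has passed; nothing to do.
    return $stamp;
  }
  // Number of complete days that have elapsed since the stored stamp.
  $days = intdiv($now - $stamp, 86400);
  return $stamp + $days * 86400;
}

echo advance_day_stamp(2, 90000), "\n";          // 86402 (one day later)
echo advance_day_stamp(2, 3 * 86400 + 10), "\n"; // 259202 (three days later)
```

The returned value could be passed to variable_set() in place of $statistics_timestamp + 86400; whether that fits the module is a judgment call, not something confirmed here.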

Comments

This is actually noticeable on small sites as well. It was after I noticed my daycount stats were resetting in the late afternoon that I started looking into it and found this (http://drupal.org/node/521858).

This is also somewhat beyond "drift." Even if the db operation takes only a second or two, with cron running hourly, the first test (if ((time() - $statistics_timestamp) >= 86400)) generally won't pass until the 25th hour because in the 24th hour it will be just a few seconds shy of 86400, so statistics_day_timestamp gains an hour a day.
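
The hour-a-day effect can be reproduced with a small stand-alone simulation. This is only a sketch under stated assumptions: cron starts exactly on the hour, the db operation takes 2 seconds, and simulate_resets() is a made-up name with no Drupal calls involved.

```php
<?php
// Simulate hourly cron runs and return the hours at which a reset happens.
// $anchored = FALSE stores the current time after the reset (current code);
// $anchored = TRUE stores the old stamp plus 86400 (proposed change).
function simulate_resets(bool $anchored, int $hours, int $db_seconds = 2): array {
  // Assume a first reset at t = 0, stamped once the db operation finished.
  $timestamp = $db_seconds;
  $resets = [];
  for ($hour = 1; $hour <= $hours; $hour++) {
    $now = $hour * 3600; // cron starts on the hour
    if (($now - $timestamp) >= 86400) {
      $resets[] = $hour;
      $timestamp = $anchored ? $timestamp + 86400 : $now + $db_seconds;
    }
  }
  return $resets;
}

echo implode(', ', simulate_resets(FALSE, 100)), "\n"; // 25, 50, 75, 100
echo implode(', ', simulate_resets(TRUE, 100)), "\n";  // 25, 49, 73, 97
```

With the current code the resets land 25 hours apart; with the anchored stamp they stay 24 hours apart after the first one.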

Here is the code I'm using now. It needs to be made timezone aware (the 3600*8 is a hack for Pacific time), and I've commented out the log delete because I keep all my logs.

/**
 * Implementation of hook_cron().
 */
function statistics_cron() {
  $statistics_timestamp = variable_get('statistics_day_timestamp', '');
  // Round the current time down to the last midnight UTC.
  $t = time();
  $t = $t - ($t % 86400);
  // Hack: shift by 8 hours for Pacific time.
  $t = $t + (3600*8);
  if ((time() - $statistics_timestamp) >= 86400) {
    // Reset day counts, touching only rows that are not already zero.
    db_query('UPDATE {node_counter} SET daycount = 0 WHERE daycount != 0');
    variable_set('statistics_day_timestamp', $t);
  }
  // Clean up expired access logs (disabled here to keep all logs).
  //db_query('DELETE FROM {accesslog} WHERE timestamp < %d', time() - variable_get('statistics_flush_accesslog_timer', 259200));
}
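
For the timezone-aware part, one option is to let PHP's date classes find local midnight instead of adding a fixed offset. The following is a sketch only, assuming PHP 5.5+ (for DateTimeImmutable); last_local_midnight() is a hypothetical helper, and the timezone names are examples rather than anything read from Drupal's settings.

```php
<?php
// Hypothetical helper: Unix timestamp of the most recent midnight in the
// given timezone, at or before $now. Daylight saving time is handled by
// the timezone database rather than by a hard-coded offset.
function last_local_midnight(int $now, string $timezone): int {
  $local = (new DateTimeImmutable('@' . $now))
    ->setTimezone(new DateTimeZone($timezone));
  return $local->setTime(0, 0, 0)->getTimestamp();
}

$now = 1250000000; // 2009-08-11 14:13:20 UTC
echo last_local_midnight($now, 'America/Los_Angeles'), "\n"; // 1249974000
echo last_local_midnight($now, 'America/Chicago'), "\n";     // 1249966800
```

In August the Pacific result is 7 hours after midnight UTC, not 8, which is the kind of case a fixed 3600*8 gets wrong. The result is also never later than $now, whereas midnight UTC plus a fixed offset can land in the future during the first hours of the UTC day.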

Thanks for this. I'll give it a whirl. I've been going into the db and resetting this manually about every two weeks.

Two questions:

1 - Then Central time zone would be "$t = $t + (3600*6);" ?
2 - What is "$t = $t - ($t % 86400);" doing? (I'm not clear on the "%" operator in this context)

One thing I've been curious about and have not been able to find documented: when and how does statistics_day_timestamp get initialized the very first time? One might assume that daycounts are meant to start over at midnight, local time, but would it not depend on what time of day an instance of Drupal was fired up for the first time?

The % operator is modulo, so $t = $t - ($t % 86400) gives you the number of seconds at the last whole multiple of 86400 (i.e. midnight UTC). Then you add the fudge for the local timezone to get local midnight.
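
As a worked example of that arithmetic (the timestamp is just an arbitrary sample value):

```php
<?php
$t = 1250000000;                      // 2009-08-11 14:13:20 UTC
$since_midnight = $t % 86400;         // 51200 seconds = 14h 13m 20s
$utc_midnight = $t - $since_midnight; // 1249948800

echo gmdate('Y-m-d H:i:s', $utc_midnight), "\n"; // 2009-08-11 00:00:00
```

The remainder is how far into the current UTC day $t is, so subtracting it leaves the start of that day.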

At the end of it all, $t == last midnight in seconds since the Unix epoch. Then you just need to see if 86400 seconds have elapsed, and if so we've hit midnight, so do our thing. The other optimization is to reset the day count only on the records where it's not already zero. On sites like mine with 30K+ nodes, that makes a big difference.

Version: 6.13 » 7.x-dev
Status: Active » Needs work

This issue should be fixed in 7.x first!

Marked as duplicate #733636: Statistics cron does not reset the day-count properly

Day statistics are used only in the block: http://api.drupal.org/api/function/statistics_block_view/7

And because cron runs every 3 hours by default in D7, we should only change how the variable is set so that it takes the site's time zone into account.

Version: 7.x-dev » 8.x-dev
Issue tags: +needs backport to D7

Moving to D8.

Issue tags: -needs backport to D7

I have tested this code with D6 and seems to work well. Any love for the old-timers?

Fixing tags.

There are two different problems:
1 - the statistics reset performed by statistics_cron()
2 - the statistics display, which does not take the site's timezone into account

Priority: Normal » Major
Issue tags: +Needs tests

This functionality is broken at both the storage and display levels.

Priority: Major » Normal

Right, that makes it a bug, but I don't think it's major in the slightest. Statistics counts are not anywhere in the critical path, and this only affects accuracy, not anything else.

Is this actually still an issue now that those instances of time() have been replaced with REQUEST_TIME? This is true even for 7.x.

Yes, because the timing of cron runs is unpredictable.