The introduction of the accesslog id was implemented as an update that drops the accesslog table and then recreates it, just to add a primary key. Adding the key is perfectly possible without losing any data, and is documented in the MySQL docs. Patch attached.

BTW, the earlier update that introduced 'path' in place of 'nid' was also buggy: it does not fill in the 'path' values of existing items, so all previous log items are invisible. That also needs to be fixed, but before fixing it I need input on the intended value of the 'path' field.

In HEAD, statistics.module looks up the path alias, if one is available, and stores that in 'path' (currently line 77). But when a node access tracker is printed, the path is matched against node/%d (currently line 174), so nodes with path aliases do not work with it. It seems to me that the storage should be changed to not store the alias and to compute it when needed instead (e.g. when printing links). But I am unsure why the alias was stored in the first place. Once this is fixed, we can get back to fixing the update so that it actually provides the 'path' value before removing 'nid', so that old accesslog items remain fully usable for the site admin after the update.
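
To illustrate the mismatch described above, here is a minimal sketch in Python (not the module's actual PHP code). The alias table, the helper names and the sample paths are all hypothetical; only the node/%d pattern comes from the description above.

```python
import re

# Hypothetical alias table: internal path -> alias.
# A stand-in for wherever the site keeps its URL aliases.
ALIASES = {"node/12": "about-us"}

# Equivalent of matching the stored path against node/%d.
NODE_PATH = re.compile(r"^node/(\d+)$")


def node_id_from_path(path):
    """Return the node id if path is an internal node path, else None."""
    match = NODE_PATH.match(path)
    return int(match.group(1)) if match else None


def display_path(internal_path):
    """Resolve the alias only when a link is printed.

    Falls back to the internal path when no alias exists.
    """
    return ALIASES.get(internal_path, internal_path)


# If the alias is stored, the node lookup fails ...
print(node_id_from_path("about-us"))
# ... while the stored internal path still matches,
print(node_id_from_path("node/12"))
# and the alias can be computed at output time.
print(display_path("node/12"))
```

Storing the internal path keeps both operations possible, whereas a stored alias cannot be matched against node/%d without an extra reverse lookup.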

Comments

Dries’s picture

Committed the patch against updates.inc to HEAD. Good catch.

As for the path/nid problem: you are right that we should store the internal URLs and not the path aliases. Storing the aliases is a bug.

Accessing the various statistics pages under "administer > logs" is still quite slow, so if you see room for improvement, let me know or implement it in your next patch. ;) On drupal.org, the accesslog table grows big (see below) and gets joined against one or more other tables, or queried using pattern matching... I think the table needs an index on accesslog.url, and possibly on other columns as well.

mysql> SELECT COUNT(aid) FROM accesslog;
+------------+
| COUNT(aid) |
+------------+
|     431674 |
+------------+
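
As a rough illustration of what an index on accesslog.url changes, the sketch below uses Python with an in-memory SQLite database as a stand-in for MySQL. The table layout, the index name accesslog_url and the sample rows are assumptions made for the example, and the query planner output differs between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified layout; the real accesslog table has more columns.
conn.execute(
    "CREATE TABLE accesslog (aid INTEGER PRIMARY KEY, path TEXT, url TEXT)"
)
conn.executemany(
    "INSERT INTO accesslog (path, url) VALUES (?, ?)",
    [("node/%d" % (i % 50), "http://example.com/ref/%d" % i) for i in range(1000)],
)


def plan(sql):
    """Return the detail strings SQLite reports for a statement's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT COUNT(*) FROM accesslog WHERE url = 'http://example.com/ref/7'"

before = plan(query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX accesslog_url ON accesslog (url)")
after = plan(query)   # the planner can now search via accesslog_url

print(before)
print(after)
```

An exact-match or prefix lookup can use such an index; a pattern with a leading wildcard (LIKE '%...') generally cannot, so the pattern-matching queries may need to be rewritten as well.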

Lastly, on the mailing list it was suggested that we track user agents. Maybe it is worth adding an accesslog.agent column. Or maybe that should be watchdog.agent? Or both?

Gábor Hojtsy’s picture

Well, first things first: fix at least the association between the accesslog data and the nodes. The attached patch fixes the path alias issue in the statistics module and also adds the proper path values when doing the accesslog table update. I also came up with some title update code, but that only works with MySQL 4.0.4 and up, so I commented it out. It can still be used by advanced users who do not want to lose their data and have an up-to-date MySQL setup.
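
The idea of backfilling 'path' from 'nid' during the update can be sketched as follows. This is not the attached patch: it uses Python with an in-memory SQLite database instead of MySQL, the table layouts and sample rows are assumed, and the title backfill is written as a correlated subquery, which is not necessarily how the patch does it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed pre-update state: rows carry 'nid', while 'path' and 'title' are empty.
conn.execute("CREATE TABLE accesslog (nid INTEGER, path TEXT, title TEXT)")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO node VALUES (?, ?)", [(1, "Welcome"), (2, "About")])
conn.executemany(
    "INSERT INTO accesslog (nid, path) VALUES (?, ?)",
    [(1, None), (2, None), (0, None)],  # nid 0: a hit that was not a node view
)

# Fill in the internal path for rows that refer to a node,
# before the 'nid' column is removed.
conn.execute("UPDATE accesslog SET path = 'node/' || nid WHERE nid > 0")

# Optional: copy the node title into the log row as well.
conn.execute(
    "UPDATE accesslog "
    "SET title = (SELECT title FROM node WHERE node.nid = accesslog.nid) "
    "WHERE nid > 0"
)

rows = conn.execute(
    "SELECT nid, path, title FROM accesslog ORDER BY nid"
).fetchall()
print(rows)
```

Rows without a node id are left untouched, so only log entries that can be tied to a node get a node/%d path.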

Dries’s picture

Title: Do not drop the complete accesslog on 4.6 update » Statistics module's performance

Committed to HEAD.

killes@www.drop.org’s picture

Status: Active » Fixed

Somebody forgot to mark this as fixed...

Anonymous’s picture

Status: Fixed » Closed (fixed)