Download & Extend

Should statistics be collected on every page?

Project:Better Statistics
Version:7.x-1.x-dev
Component:Code
Category:feature request
Priority:normal
Assigned:Unassigned
Status:closed (fixed)

Issue Summary

Taking clues from comment #6 in #1776748: Should the browser ask for location on every page?, it would be nice to have the ability to define the range of pages where the access log statistics should be collected or not.

Say you want to avoid collecting statistics for hits to admin pages, or ajax callbacks, or private file downloads. Or include only a section of the site.

Comments

#1

So this would be global for all accesslog hits as opposed to on a field-by-field basis, is that correct?

#2

Status:active» needs review

Yes, that is the idea. Have a look at the patch attached.

AttachmentSizeStatusTest resultOperations
better_statistics-include-exclude-paths-1879726-2.patch4.82 KBIdlePASSED: [[SimpleTest]]: [MySQL] 38 pass(es).View details

#3

Status:needs review» needs work

I'm a little weary to include this type of functionality by default for a couple of reasons:

  • Path functionality is nice, but what if someone wants to run Statistics (or not) based on more advanced conditions? Like... I only care about collecting stats on pages served to robots. Or I only care about collecting stats on authenticated users.
  • Seems to defeat the idea of an "access" log.

I toyed with the idea of integration with Context, but realized we'd run into similar issues we're having now with Drupal's incomplete bootstrap state on cached pages.

All of that being said, I recognize limiting statistics as a legitimate need in some cases. What if we added a hook like hook_statistics_skip(). Where by default, statistics will run on all pages, but if any function implementing hook_statistics_skip() returns true, it doesn't?

#4

Nice idea.

This would also turn to be helpful for the other issue on optionally avoiding logging cached requests overall (point 4, comment 1, #1879512: API - Dealing with statistics data collection on cached page requests).

A hook calls for development though, whereas the patch above is for an admin to decide inclusions/exclusions. Do you fancy Better Statistics to include a basic implementation of such hook for, say, path-based and role-based inclusions/exclusions? And leave more sophisticated ones, like bots, to other modules?

#5

Sorry for the delayed response recently. Yes, I think I'd go for that.

#6

Just a question... will the hook be part of the plugin API? Just wondering, now the footprint of Better Statistics in the .module file of implementing modules is very limited (just hook_statistics_api), and all code can be put in the module.statistics.inc. Can this logic be kept so that this hook will be implemented there as well?

#7

I think we'll need to refactor where the files are included, but that should definitely be the goal.

Is this something you'd be able to take on? Shall I do the work to add the hook and refactoring mentioned previously, and then from there you can port your work from #2 to use the hook?

#8

Shall I do the work to add the hook and refactoring mentioned previously, and then from there you can port your work from #2 to use the hook?

This could work - when you're done I will

  1. port #2 for path-based
  2. implement role-based
  3. implement a general switch for cache-based as per comments #6 and #7 in #1879512: API - Dealing with statistics data collection on cached page requests - for this one, I can place the switch in the 'caching' fieldset of admin/config/development/performance, and reference it in the fieldset 'access log settings' of admin/config/system/statistics as a reminder

#9

Status:needs work» needs review

This should pass tests; if it does, feel free to run with it. Also includes hook documentation to get you started on hook_better_statistics_skip().

AttachmentSizeStatusTest resultOperations
better_statistics-conditionally_skip_stats_collection-1879726-9.patch5.32 KBIdlePASSED: [[SimpleTest]]: [MySQL] 48 pass(es).View details

#10

By the way, it'd be great if your path-based implementation could take some cues from how Drupal's default block configuration screen does it. E.g. checkbox for inclusive vs. exclusive (but no PHP filter option).

After you're done, I'll add tests for the role, path, and cached page stats skipping functionality.

#11

Here we go, a first attempt.

Note:

  1. I am not familiar with interdiffs, so patch includes both #9 and further
  2. for cache-based, there is a todo left, to use a common function to determine whether the page is being served from cache (see #1879512: API - Dealing with statistics data collection on cached page requests comments #6 and #7)
  3. not sure about your comment in #10 about "take some cues from how Drupal's default block configuration screen does it", can you please comment further/include screenshots

Cheers

AttachmentSizeStatusTest resultOperations
better_statistics-conditionally_skip_stats_collection-1879726-11.patch12.21 KBIdleFAILED: [[SimpleTest]]: [MySQL] 47 pass(es), 1 fail(s), and 0 exception(s).View details

#12

Status:needs review» needs work

The last submitted patch, better_statistics-conditionally_skip_stats_collection-1879726-11.patch, failed testing.

#13

Status:needs work» needs review

Test fails on cache hit with a ctools error:

Invalid plugin module/type combination requested: module context and type plugins

can you help?

#14

The error I'm getting when running the tests locally is...

Fatal error: Call to undefined function truncate_utf8() in /path/to/better_statistics/better_statistics.statistics.inc on line 145

Looks like there's a call to truncate_utf8 in your hook_better_statistics_skip() implementation before unicode.inc is loaded. If we move the session bootstrap and unicode.inc include before the hook invocation, we shouldn't see any problems. Updated patch attached. Have not reviewed any of the other code, though.

AttachmentSizeStatusTest resultOperations
better_statistics-conditionally_skip_stats_collection-1879726-14.patch11.86 KBIdlePASSED: [[SimpleTest]]: [MySQL] 48 pass(es).View details

#15

Here's a screenshot of the path-based configuration for blocks; we should continue that pattern to minimize confusion.

Block configuration screen path selection

AttachmentSizeStatusTest resultOperations
Screen shot 2013-01-16 at 9.59.02 PM.png49.72 KBIgnored: Check issue status.NoneNone

#16

Status:needs review» needs work

Thanks a lot.

I agree, the whole 'filters' fieldset as implemented in the patch is cluttering the Statistics configuration page too much...

One idea could be to move it to an independent 'Filtering' form, where cache-based, path-based, role-based can be separate tabs in a vertical tabs form - and where implementing modules might add their own tabs. Then, in the Statistics configuration page just put a link to the 'Filtering' form.

Thoughts?

#17

Hmm... Before we open it up to multiple admin pages/forms, I wonder how it would look if we created a separate fieldset (collapsed by default) for filtering? And maybe clarify the current "Better Statistics Settings" fieldset to read "Access Log Data Collection" or something.

Also, I might suggest using "Access Log Restrictions" instead of "filters." "Filter" implies to me that the data is still being collected, it's just not being shown (like in a View), and that if I 'unfilter" it, the data will be there all of a sudden.

#18

OK - simpler. I will update next week.

just to check, are you going to develop the better_statistics_cache_status() function as per comment 7.2 in #1879512: API - Dealing with statistics data collection on cached page requests? We will need it to finalise this one.

#19

Status:needs work» needs review

Here we go.

Note I learnt tests a bit :) so I included some already - although definitely need review as I was not sure where you would like to put them in the code.

Cheers

AttachmentSizeStatusTest resultOperations
better_statistics-conditionally_skip_stats_collection-1879726-19.patch18.61 KBIdlePASSED: [[SimpleTest]]: [MySQL] 94 pass(es).View details

#20

Status:needs review» needs work

This looks excellent.

Couple of notes:

  • I think I'd still prefer to have the contents of Access Log - Restrictions be in a vertical tab set (like on the blocks configuration page),
  • The "pages to be logged" radio seems superfluous. Like the block configuration page, seems like we should be able to have a set of radios with options (All pages except those listed, Only the listed pages), with a single textfield. That textfield respects the radio selection. By default, we could have it set to "all pages except those listed" and the textfield blank.
  • The "roles to be logged" radio also seems superfluous. Seems like we could get rid of it entirely, and just assume that if no roles are checked, all roles should be logged (might be good to have a description that indicates that too).
  • Some of the description strings could use a little more polish. I don't have specific comments, but I can take a crack at them.

The tests look good to me. I think I'd like to add a couple of admin configuration page tests too (for example: the switching of cache enabled vs. cache disabled).

Getting closer!

#21

Status:needs work» needs review

This still needs work, but just verifying Testbot still likes it.

Most importantly, I changed how the API will work. I think it makes more sense to mirror drupal_page_is_cacheable() with a better_statistics_request_is_loggable() and providing a hook_better_statistics_prelog() to allow modules to react and set it if necessary.

I also tried (but failed) to put a vertical tab inside of a fieldset. I'm definitely not in love with it. I also did not do any of the additional tests/block configuration form copying I noted above.

AttachmentSizeStatusTest resultOperations
better_statistics-conditionally_skip_stats_collection-1879726-21.patch19.14 KBIdlePASSED: [[SimpleTest]]: [MySQL] 94 pass(es).View details

#22

Thanks, I'll leave you lead the dance now :)

Few comments:

  • if you move 'restrictions' to vertical tabs, then I'd suggest to consider doing the same for the 'data collection' to keep consistency.
  • page logging - I was actually copying the concept of the two separate inclusion/exclusion textareas from 'IP Geolocation views and maps' as this seems to me more flexible since you can have two level subsetting - first you define all the inclusions (say A), then out of these, only the exclusions to these (say A-B). This gives you a little more restriction then 'only listed' (again A) or 'all but listed (say total-A).
  • last - when running tests locally, I still get the error message Invalid plugin module/type combination requested: module context and type plugins in the watchdog (the main watchdog, not the watchdog of the test environment). I suspect sth related to low bootstrap phase on cached pages again. However, no errors in the tests themselves.

Cheers

#23

I cannot for the life of me get this to pass locally, even though it's working as expected. Checking to see what Testbot thinks.

AttachmentSizeStatusTest resultOperations
better_statistics-conditionally_skip_stats_collection-1879726-23.patch19.51 KBIdleFAILED: [[SimpleTest]]: [MySQL] 124 pass(es), 1 fail(s), and 0 exception(s).View details

#24

Status:needs review» needs work

The last submitted patch, better_statistics-conditionally_skip_stats_collection-1879726-23.patch, failed testing.

#25

Status:needs work» needs review

For some strange reason, an access to the 'node' page was being logged in patch in #23 while accessing admin/config.

I repeated the truncate and drupalGet twice, and now it works for me.

AttachmentSizeStatusTest resultOperations
better_statistics-conditionally_skip_stats_collection-1879726-25.patch19.68 KBIdlePASSED: [[SimpleTest]]: [MySQL] 127 pass(es).View details

#26

Thanks very much, mondrake! Must be an oddity of the test framework. I added one more test (for the cache enabled/disabled), and in doing so, discovered it was broken.

If everything is working on your end I'll go ahead and commit this and likely put out the next point release.

AttachmentSizeStatusTest resultOperations
better_statistics-conditionally_skip_stats_collection-1879726-26.patch20.88 KBIdleFAILED: [[SimpleTest]]: [MySQL] 135 pass(es), 2 fail(s), and 0 exception(s).View details

#27

Status:needs review» needs work

The last submitted patch, better_statistics-conditionally_skip_stats_collection-1879726-26.patch, failed testing.

#28

Status:needs work» needs review

#26: better_statistics-conditionally_skip_stats_collection-1879726-26.patch queued for re-testing.

#29

Status:needs review» needs work

The last submitted patch, better_statistics-conditionally_skip_stats_collection-1879726-26.patch, failed testing.

#30

It's possible d.o's tsetbot might need a $this->refreshVariables(); after changing those configurations...

#31

Just checking, will you post a new patch?

#32

Yes, shortly.

#33

Status:needs work» needs review

Ah. Was checking for strings that only appear on the page when page cache is enabled.

AttachmentSizeStatusTest resultOperations
better_statistics-conditionally_skip_stats_collection-1879726-33.patch21.13 KBIdlePASSED: [[SimpleTest]]: [MySQL] 137 pass(es).View details

#34

Status:needs review» reviewed & tested by the community

Looks all good here. Thanks!

#35

Status:reviewed & tested by the community» fixed

Great. Committed: http://drupalcode.org/project/better_statistics.git/commit/6460aaf

Thanks as always for your contribution, mondrake!

Per #1892876: META: 1.x Roadmap, I'll be releasing 7.x-1.2 shortly.

#36

Status:fixed» needs work

How unfortunate.

Sorry but I just came across a php notice

Notice: Trying to get property of non-object in drupal_lookup_path() (line 77 of /..../includes/path.inc).

This is happening when serving a cached page, and you have specified page-based restrictions.

There is a drupal_get_path_alias() on line 147 of better_statistics.statistics.inc, which calls a drupal_lookup_path(), which in turns expects the global $language_url to be set - but with low bootstrap phase it looks like this is not defined, hence the notice.

#37

Are the page-based restrictions still working and just barking a notice, or are they not working at all? Perhaps they're only broken in a multilingual site?

#38

Hi,

nothing broken AFAICS. I updated the module on a live site where I applied the following restrictions:
- cache: enabled
- roles: only anon users
- pages: all but an excluded path

Everything gets logged as expected, but any time a page is served from cache, the php notice gets logged in the watchdog.

And yes, the site is multilingual, albeit URL detection is not enabled.

I just gave a try at raising the bootstrap phase to DRUPAL_BOOTSTRAP_LANGUAGE in the hook_exit implementation - which takes away the notice, but unfortunately is very late in the page processing request and raises some 'headers already sent' warnings, which is definitely worse.

Cheers

#39

Status:needs work» fixed

Thanks for the report. For my sanity, I'm closing this and opening up a follow-up with your notes.

#1907684: Notice: Trying to get property of non-object in drupal_lookup_path()

#40

Sure, thanks

#41

Status:fixed» closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

nobody click here