Currently the module works by requesting a ton of data from Google Analytics, looping over that data, and storing what it considers relevant.

The Google Analytics API module contains a submodule for reports that works the opposite way: collecting data by node and then dealing only with that data.

Given how ga_importer_start_index works and how max_results is calculated, I think it would make more sense for this module to work on a per-node basis as well.

Thoughts?

Comments

jpwarren00 wrote:

Issue tags: +feature request

I did envision that sort of functionality for the future of the module. Currently it is designed for some specific sites that sit behind a CDN and so do not register a node "hit" on most pageviews, hence the current batch-processing setup. For smaller sites, though, this seems like a good option to add to the admin settings.

greggles wrote:

I think it makes sense for any site where the proportion of nodes to other content is "low" though I'm not sure what "low" is in terms of real values.

Here's what led me to that conclusion. Right after the $status = process_ga_data($ga); line I added a dsm() call to see what data is being returned:

  $status = process_ga_data($ga);
  // Debug only: dump the GA result using the Devel module's dsm().
  dsm($ga);

I get hundreds of paths like "/403.html?page=/content/acupuncture&from=" and "/403.html?page=/users/paulrobin25&from=".

Those have nothing to do with the nodes on my site.

I certainly understand that requesting a single big report from GA will be faster than one report per node, but I also hope that by requesting only valid nodes we can avoid wasting time looping over things that aren't really nodes.
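
To make the "skip things that aren't nodes" idea concrete, here is a rough sketch of filtering the returned rows before any further processing. It is not the module's code: the function name example_filter_node_rows(), the 'pagePath' and 'pageviews' row keys, and the assumption that nodes are served at /node/NID are all hypothetical. Aliased paths such as /content/acupuncture would need an alias lookup that is not shown here.

```php
<?php
/**
 * Hypothetical helper: sum pageviews per node ID, ignoring non-node paths.
 *
 * Assumes each row is an array with 'pagePath' and 'pageviews' keys and that
 * nodes are reachable at /node/NID. Neither assumption comes from the module.
 */
function example_filter_node_rows(array $rows) {
  $counts = array();
  foreach ($rows as $row) {
    // Drop the query string so "/node/12?page=1" is treated as "/node/12".
    $path = parse_url($row['pagePath'], PHP_URL_PATH);
    // Anything that is not exactly /node/NID (for example /403.html) is skipped.
    if (is_string($path) && preg_match('#^/node/(\d+)$#', $path, $matches)) {
      $nid = (int) $matches[1];
      if (!isset($counts[$nid])) {
        $counts[$nid] = 0;
      }
      $counts[$nid] += (int) $row['pageviews'];
    }
  }
  return $counts;
}

// Example rows shaped like the paths seen in the dsm() output (values made up).
$rows = array(
  array('pagePath' => '/403.html?page=/content/acupuncture&from=', 'pageviews' => 7),
  array('pagePath' => '/node/12', 'pageviews' => 3),
  array('pagePath' => '/node/12?page=1', 'pageviews' => 2),
);
$counts = example_filter_node_rows($rows);
```

With these rows, the 403 entry is dropped and the two /node/12 entries are combined under node ID 12. Filtering after the fact still pays for downloading the junk rows, so the real saving would come from asking GA only for the node paths in the first place.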

greggles wrote:

Status: Active » Closed (duplicate)

I did the work for this over in #1006210: Going beyond node_counter.