It seems that for some nodes I have duplicated values. Looking in the 'ga_stats_count' table I saw something like:

nid    url                                    metric      timeframe   count
3277   /current_alias_of_the_node_3277        pageviews   forever     1715
3277   /node/3277/revisions/view/4857/5085    pageviews   forever     4

Is this an error in the regex extracting the "/node/nid" path? I couldn't correct this behaviour myself.

Comments

twohlix’s picture

Good question. Do we want revision views to be counted toward the original node or not?
If we do then it is a bug. Otherwise it is not.

I'm not actually sure which is expected here. I'd say revision views are different than simple node views and are the edge case and that this isn't a bug, because viewing a revision is usually an admin operation and revisions aren't usually linked to.

Thinking about this a bit more, I'm guessing that having two entries with the same nid is going to mess with sorting operations. I think revision views probably shouldn't even be tracked if they're an admin-only option. However, if you want them tracked, they are technically the same node but with different data and should be treated differently.

I'm curious about jec006's input on this.

Balbo’s picture

Maybe those shouldn't be tracked, as you say. Or /node/3277/* could be treated as /node/3277, with the #views summed.

"...revision views are different than simple node views...": true, but the point is that in my View, where I show "node" and "#views", I get the same node twice with different #views (1715 and 4, as in the example above) and can't tell why. Only by looking in the DB do I see that one row is for revisions.

btw: great module, this is going to take over statistics core ;-)

jec006’s picture

It seems to me that tracking views of revision urls doesn't really fit into the generalized use case for this module. It seems to me like storing and retrieving this data is really just clouding the real data and storing data we don't need.

I would suggest the best course of action is just removing these urls from the stats before they are stored - during the step where we normalize all the aliases.
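
A minimal sketch of what that filtering could look like, assuming each stats row is an object with a 'url' property like the rows in the 'ga_stats_count' table above. The function names are hypothetical and are not part of ga_stats; where exactly this would hook into the normalization step is left open.

```php
<?php
// Hypothetical helper, not part of ga_stats: TRUE when a path looks like a
// node revision URL such as /node/3277/revisions/view/4857/5085.
function example_is_revision_path($url) {
  return preg_match('#^/node/\d+/revisions(/|$)#', $url) === 1;
}

// Assumption: $rows is an array of objects that each carry a 'url' property.
// Returns the rows whose URL is not a revision path, re-indexed from 0.
function example_drop_revision_rows(array $rows) {
  return array_values(array_filter($rows, function ($row) {
    return !example_is_revision_path($row->url);
  }));
}
```

Applied to the two rows from the original report, this would keep the alias row and drop the /node/3277/revisions/... row.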

Balbo’s picture

News here.
I'm not sure why, but I have duplicated entries even without revisions (or other "after slash" sub-pages).
I changed the name of the node (and now have 2 aliases, both alive). Since the nid is the same I wanted it to be counted as one, but 2 different rows are stored in the DB. See the attached screenshot.

Balbo’s picture

(attached screenshot)

perke’s picture

I'm also getting duplicates when an alias is changed (and both still work). I guess the Google Analytics data isn't matched to aliases properly, and the duplicates remain even when the Distinct option in the view is turned on.

The view sorts by page views in the last 24h.

Any idea how to get rid of the duplicated results?

ditcheva’s picture

I'm having the same issue.. any ideas about what is causing this? Is it really just aliases?

I have a page with 16 duplicate ga_stats entries in my view, and I definitely don't have 16 aliases to it (just one). What else may be causing this?

This is not happening for the majority of my pages - just one...

ditcheva’s picture

OK, I got it - it really does have to do with aliases.

The issue is that I was displaying daily, weekly *and* monthly views for this node, and since it has two URLs (2 aliases), I can see it's logging two separate counts for daily views, weekly views, monthly views, and then 2 separate counts for unique daily views, unique weekly views, etc.

(Screenshot: ga_stats module duplicate view counts for multiple aliases)

So, if I remove all the weekly & monthly counts from my view and just leave the daily ones, my view is reduced to just two duplicates: one row for each of the two aliases.

Just wanted to confirm that the issue really is down to aliases and that there was a reason for my huge number of duplicates.

captcodemonkey’s picture

I had this issue with node/nid paths and the alias paths.

I chose to go ahead and exclude the node/nid path views to avoid the duplicates (you can see what is going on by looking at the ga_stats_count table for any given nid).

Testing was tricky at first until I realized I had to delete the cache entry then run cron to get it to repopulate the data.

In this function: ga_stats_get_data
In the file: ga_stats.module
I slightly modified the last bit to exclude the node/nid paths (because it was showing duplicate entries for each node).

// only log nodes
if ($count->nid) {
  // skip raw /node/... paths so only the aliased URL is counted
  if (substr($count->url, 0, 6) !== "/node/") {
    $counts[] = $count;
  }
}

I would like to see a more long term option to either exclude node/nid paths (and other things like revisions), or to bundle them all together into a summed count.

I don't think getting duplicates is what anyone wants, but I understand deciding proper behavior would be difficult (maybe people don't want node/nid paths excluded, or revisions, etc).
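
For the "summed count" option, a rough sketch, assuming rows shaped like the 'ga_stats_count' table (nid, url, metric, timeframe, count). The function name is hypothetical and not part of ga_stats. It groups by nid, metric and timeframe and drops the url, since several URLs collapse into one row.

```php
<?php
// Hypothetical helper, not part of ga_stats. Merges rows that share the same
// nid, metric and timeframe into one row whose count is the sum.
function example_sum_counts_by_node(array $rows) {
  $totals = array();
  foreach ($rows as $row) {
    // One bucket per (nid, metric, timeframe) combination.
    $key = $row->nid . ':' . $row->metric . ':' . $row->timeframe;
    if (!isset($totals[$key])) {
      $totals[$key] = (object) array(
        'nid' => $row->nid,
        'metric' => $row->metric,
        'timeframe' => $row->timeframe,
        'count' => 0,
      );
    }
    $totals[$key]->count += $row->count;
  }
  return array_values($totals);
}
```

With the two rows from the original report (1715 and 4 pageviews for nid 3277, timeframe forever), this would return a single row with a count of 1719. Whether revision or node/nid views should be folded into the total at all is still the open question in this thread.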

yatendrasingh121’s picture

I had the same issue. There are two entries in the ga_stats_count table for the same node, metric and timeframe. In my case the duplicate entries are caused by URL aliases: one entry contains the alias and the second contains node/[node-id] in the url column.

The attached patch solves this duplicate entry issue.

yatendrasingh121’s picture

Correcting notices.