Over on #344344: Weekly usage stats not processed if we have a full week of data but we processed less than 1 week ago it was pointed out that the tables shouldn't list weeks if they're not "active" per roject_usage_get_active_weeks(). I wanted for that bit off into it's own issue since it's display related and #344344 is solely related to the batch processing.

Comments

drewish’s picture

Humm... it looks like this is a bit more involved that I'd though. The problem with project_usage_get_active_weeks() is that it only takes {project_usage_week_project} into account but there's actually two separate expiration times for the weekly data: per project and per release. We either need to sync up the the settings and then we can use the same $weeks values for both tables or we need to specify which type of weeks we're interested in to project_usage_get_active_weeks().

dww’s picture

A) I can't think of a good reason you'd want separate lifespans for project vs. release weekly data. I'd be fine merging those unless there's a good argument you can think of for why we'd ever want to save less weekly summary data of one than the other.

B) I can see a good reason to have different display lengths for project vs. release and for different pages. E.g. for the charts, it's nice to have more data points on the project data across API version. However, the table of release data on the project page can't be too wide (8 seems like about the max). OTOH, on the release-specific page (with a chart and a table with only two columns and new rows for each data-point) the more data the better.

C) This makes me think the way project_usage_get_active_weeks() works is a little off. Instead of caching the array at a fixed length, the query to find the distinct timestamps should just find all distinct timestamps we have weekly summary data for and store that in the variable. Even if you're storing 3 years of data, unserializing an array of 156 integers can't be that bad compared to all the other crap that's unserialized in the $conf array. Then, when you call the function, you can pass a $week_count argument. It grabs the full array from the setting, and then returns you that many elements of it. If the $week_count is NULL you get all rows back.

This way, the per-release page can just call it with a NULL count and get all the data. The per-project page can call it twice -- once for the per-release detail table at the top with a $week_count of variable_get('project_usage_show_weeks'), and once with a NULL count for the per-API data.

If it seems crazy to have basically unlimited data and we're worried about (un)serializing an array of potentially 156 ints, I'd be happier if we still merged the lifespan settings into one, and introduced new settings to make 'project_usage_show_weeks' more fine-grained, and then inside project_usage_get_active_weeks() we'd limit the query to the max of all the *_show_* settings.

Thoughts?

drewish’s picture

the rationale for splitting them was the the difference in the sizes of the datasets. the per project data is one row for each project,api,week tuple where the per release data is one row for each release,week tupple. each project will probably only have two or three api versions while it might have tens of releases. so you might be interested in the 5.x 6.x for a lot longer than you'd be interested in the usage for each release. but that said it might be saner to just link them up.

but that aside i think you've outlined a good approach. we should sync up the weekly lifetime values (we can just grab which ever is larger at this point for the update function) and in project_usage_get_active_weeks() query search for distinct timestamps in each of those tables and cache them... i don't even think it's needs to be a parameter.. just unserialize it and call array_slice if you want less.

i'll have some time to work on this over the weekend.

dww’s picture

Version: 5.x-1.x-dev » 6.x-1.x-dev
Status: Active » Needs review
bdragon’s picture

Issue tags: +Needs deployment

Copying over tags and subscribing.

donquixote’s picture

+1, subscribe

bdragon’s picture

Assigned: Unassigned » bdragon

A review would be nice.

  • drumm committed 0616f34 on 7.x-2.x
    Issue #346046: Do not show unprocessed weeks in project usage
    
drumm’s picture

Assigned: bdragon » drumm
Issue summary: View changes
Status: Needs review » Fixed

I went ahead and fixed this since I'm tinkering with #2870226: Replace flot with d3.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

drumm’s picture

Issue tags: -Needs deployment