i ran into some timezone issues and realized that gmt timestamps are best. we might need an update to correct stuff in the db.

Comments

drewish’s picture

StatusFileSize
new4.69 KB

here it is with an update (well two actually, splitting them up avoided timeouts).

dww’s picture

All looks reasonable, but I have no time to test this now. Have you already tried all of this on test data? Thanks.

drewish’s picture

Status: Needs review » Reviewed & tested by the community
StatusFileSize
new5.03 KB

I did test it but after your prompting I added a little debugging code and re-ran it on a clean copy of the d.o database to double check. In the process realized that I'd swapped the rounding code for days and weeks. I reversed that and added some code to extend the script time limit because the updates were timing out on my machine.

Once I got that straight, I put another clean copy of the d.o db on and re-ran it. Everything looked good so I'd say it's RTBC.

drewish’s picture

Status: Reviewed & tested by the community » Needs work

viewing the data i think it might need a little more tweaking...

drewish’s picture

Status: Needs work » Needs review
StatusFileSize
new7.13 KB

I started changing the timezone on my machine around to see what would happen and realized that getdate() uses the server's timezone. I've updated the patch to remove the cancel out the timezone.

Also, now the update now only runs UPDATE queries when the timestamp changes which, since the timestamps in the d.o'd database are already GMT, turns it into a very quick update.

I think this is finally good to go, but after the back and forth I've had with myself on this, I'm going to mark it needs review.

dww’s picture

Status: Needs review » Needs work

Upon further consideration...

A) It's too bad the 2 updates duplicate so much code. If they run so fast now, can't we merge them back into a single update that just loops over all 4 tables?

B) Once this is actually RTBC and committed, how do we safely run this update on d.o? Aren't we screwed if any usage data comes in between the time the source is updated and the update runs? Should we temporarily disable usage data collection while we do the upgrade? I haven't really thought about it much, but I'd appreciate you spending a little time to come up with a coherent upgrade plan, so we don't break anything.

Thanks,
-Derek

dww’s picture

p.s. if it's still important to keep those 2 DB updates seperate for some reason, can we at least put the meat of the code into a helper function, invoked by both updates?

drewish’s picture

StatusFileSize
new7.7 KB

a) good call, they can be in the same function now.
b) won't be a problem. as soon as the patch hits new data will be rounded existing data that's not on a GMT will be rounded... based on the last db dump from d.o. there should be no values changed.

p.s) yeah, i actually think i should refactor some of the code to make it into functions for the module.

I'm rolling a patch now so I've got a "clean" copy of the changes. I'm leaving it as needs work because I need to apply some of the visibility patch to test it.

dww’s picture

Maybe this is crazy, but the only difference between the day and week cases in the DB update now is the rounding function to use. Couldn't you further consolidate via something like:

$tables = array(
  'project_usage_raw' => 'day',
  'project_usage_day' => 'day',
  'project_usage_week_project' => 'week',
  'project_usage_week_release' => 'week',
);
foreach ($tables as $table => $type) {
  $query = db_query("SELECT DISTINCT timestamp FROM {$table} ORDER BY timestamp ASC");
  while ($row = db_fetch_object($query)) {
    $old = (int) $row->timestamp;
    // Round the timestamp to the begining of the week.
    $new = $type == 'day' ? (int) project_usage_daily_timestamp($old) : (int) project_usage_weekly_timestamp($old);
    if ($old != $new) {
      $ret[] = update_sql("UPDATE {$table} SET timestamp = $new WHERE timestamp = $old");
    }
  }
}
drewish’s picture

Status: Needs work » Needs review
StatusFileSize
new5.31 KB

i wrote some simpletest tests which i'm attaching. i couldn't figure out how to add a file in a subdirectory (tests) to the patch. to get it to upload i had to add the .txt extension.

patch to follow.

drewish’s picture

StatusFileSize
new7.13 KB

here's the patch... little bit different take on the changes you'd asked for in the update function. thanks for being such a pain in the ass, the code's looking much better ;)

restored the clean d.o database dump and re-ran the update. as i'd hoped, no queries were executed.

the patch also adds the simpletest hook needed to run the tests.

dww’s picture

Status: Needs review » Needs work

Wrong patch at #11? We've regressed to separate DB updates with a bunch of duplicated code.

drewish’s picture

Status: Needs work » Needs review
StatusFileSize
new7.69 KB

yeah wrong patch.

drewish’s picture

Status: Needs review » Reviewed & tested by the community
StatusFileSize
new8.12 KB

on IRC dww pointed out a typo. i fixed that and another one in a comment.

dww’s picture

Status: Reviewed & tested by the community » Fixed

Final review, tested on s.d.o, committed, installed on d.o and update_5000 run (without queries). Thanks!

Anonymous’s picture

Status: Fixed » Closed (fixed)