Improve speed by avoiding unnecessary updates in sess_write()

moshe weitzman - December 9, 2005 - 15:17
Project:Drupal
Version:6.x-dev
Component:user system
Category:feature request
Priority:normal
Assigned:Unassigned
Status:closed
Description

In sess_write(), we issue 2 UPDATE queries for every page view for a logged in user, and one for anon users. These writes happen even for a cached page! Since these are UPDATE queries, they are relatively expensive. We do this in order to persist any $_SESSION info that was changed and to update the user's last acess time. However, $_SESSION rarely changes and we should be able to notice if it changes and only write when it does. Furthermore, a user's last access time need not be updated with every page view. Once every 5 minutes would be acceptable IMO. So in many cases, both of these writes can be avoided.

Karoly and I worked on this in Amsterdam but didn't finish. I attach our patch here in order to inspire someone to finish it.

AttachmentSizeStatusTest resultOperations
session_optimized.patch2.26 KBIgnoredNoneNone

#1

pwolanin - October 4, 2006 - 01:30

just found this patch playing patch bingo. Seems like it could provide a genuine speed gain.

I'm unclear as to what's stored in SESSION- this DB update different from the one done by the tracker module, etc. to keep track of a user's page views?

Also, is important for any reason to know the true "last access" time? With this patch it seems that this time may be off the true value by ~299 seconds.

#2

moshe weitzman - June 14, 2007 - 14:48
Version:x.y.z» 6.x-dev

the last access time in session is used for who's online block and for throttle and for aggressive cache calculation. i think those features could be reworked in favor of performance as proposed here. core should not guarantee up to the second accuracy for last access IMO.

#3

kbahey - June 15, 2007 - 03:53

Discussed this with Moshe today on IRC.

What I would like to see is leave the present behavior as the default (each access updates the tables and we have up to the second accuracy). However, we add an option under Performance that would allow an admin defined interval to be set (e.g. 1 minute, 5 minutes, 10, 15, 30) where updates would not be done for subsequent accesses.

Moshe said that the impact on throttle (how many people are accessing the site in the last X minutes) would need to be assessed as it will impact throttling (site can be under stress, yet throttle does not kick in due to delayed writes by this patch). Also the aggressive cache code need to be assessed as well (unknown at this point).

As usual, benchmarks needed too.

#4

moshe weitzman - July 19, 2007 - 03:01
Status:needs work» needs review

I looked again and all the issues kbahey listed apply to the write into sessions table. The write into the users table is just cosmetic - no core code relies on the users.access column being up to date.

Here is a patch which defaults to writing into the users table no more frequently than once every 5 minutes for a given user. The only repurcussion that i found is that the user's last access is displated on the admin/user/user page and it won't be 100% accurate but we never claim otherwise.

The 5 minute interval can be changed in settings.php or by a contrib module. I don't see much value in core providing a UI for this. A user wouldn't really care unless they knew a lot - enough to use settings.php to change.

AttachmentSizeStatusTest resultOperations
mw_62.patch1.55 KBIgnoredNoneNone

#5

kbahey - July 19, 2007 - 16:27
Status:needs review» needs work

Minor style issue:
$ago = time()-$access;

Should be (spaces around the minus:
$ago = time() - $access;

Otherwise, looks good and needed.

#6

moshe weitzman - July 19, 2007 - 16:32
Status:needs work» needs review

fixed code style isues.

AttachmentSizeStatusTest resultOperations
mw_63.patch1.55 KBIgnoredNoneNone

#7

kbahey - July 19, 2007 - 16:37
Status:needs review» reviewed & tested by the community

Can someone give feedback on the throttle impact of this?

Setting this as RTBC regardless.

#8

moshe weitzman - July 19, 2007 - 17:04

throttle cares about sessions, not users. zero impact.

#9

Dries - July 20, 2007 - 08:42
Status:reviewed & tested by the community» needs work

I'm not sure we have to make it this complex. If we change the code not to write more often than every 30 seconds, that should be OK. I don't think we need a setting for this. Let's try to compact this code a bit?

#10

kbahey - July 20, 2007 - 13:40

Dries

This is not overly complex, IMHO.

The fact that it is configurable (without a UI) is also good, because it means sites can override it in settings.php if 300 is too low or too high. I would imagine 60 to be a popular value.

A hard coded value means we need to hack core on big sites.

If you want the new function inlined, then maybe that can be done, but the configurability has to be there either way.

#11

moshe weitzman - July 23, 2007 - 01:57
Status:needs work» reviewed & tested by the community

atached is a much more compact version. in fact, we only add one line of code. gone is the extra function, and the variable_get().

dries is proposed to 30 seconds as a minimum interval. 30 seconds is a bit better than no interval (current behavior), but not by much. few authenticated users request pages more frequently than once every 30 seconds? i propose 180 seconds in the attached patch. again, there is very little down side to using a large number here. $user->access is not used except for display to admins.

AttachmentSizeStatusTest resultOperations
mw_64.patch1.02 KBIgnoredNoneNone

#12

kbahey - July 23, 2007 - 03:10

180 is better than the status quo.

Still, I would like to see a variable_get().

Hope Dries sees the light.

#13

Dries - July 23, 2007 - 07:30
Status:reviewed & tested by the community» fixed

I've committed a slightly modified version of this patch.

#14

Anonymous - August 6, 2007 - 07:32
Status:fixed» closed
 
 

Drupal is a registered trademark of Dries Buytaert.