Nice little module. Thanks for the contrib.
When you first enable the module, the functionality is enabled, which means that it will fire the next time cron is fired. In my case, cron fires frequently and, in fact, the deletions were triggered before I even had a chance to configure the settings. Which meant that thousands of session rows were deleted that I didn't want deleted because I wanted the settings different.
Ok, so I was testing this against my dev server, so no big deal. Also, you argue, I could have overridden the variables in my settings.php, but that is pretty obscure.
I suggest your module is configured off by default so admins have a chance to review the settings before the trigger is pulled. (By the way, I use Elysia cron as you suggest, and have for two years now -- great module -- but I also didn't have a chance to configure the job to fire in the wee hours.) Bit of an issue when you have 5 million session table rows. At least it gave me plenty of time to send this post...
Comments
Comment #1
kbahey commentedDo you have a patch we can all look at?
Comment #2
dwightaspinwall commentedSure - I'll be happy to roll one, but there's another issue here. The initial delete is way expensive -- I'm getting about 1000 rows deleted in 20 sec. Do the math for a sessions table with 5 million rows. My thinking is to put a LIMIT 1000 into the query (or this could be configurable too). Thoughts?
Comment #3
kbahey commentedThe LIMIT is a good idea, but I am not sure how much of a performance benefit it would provide.
If you have that dev machine with gazillions of rows, you can benchmark it and confirm.
Also, is it ANSI and hence would work on PostgreSQL as well, or is it MySQL specific?
Comment #4
dwightaspinwall commentedI benchmarked the LIMIT on my dev machine. The first row comes back fast; it's the actual delete operation that seems to be slow. I.e. there are no indexing issues.
The ANSI question is a good one. I'll look into it.
Comment #5
kbahey commentedDon't understand your conclusion from the benchmark.
You are not talking about SELECT with LIMIT, are you?
So, say I have a table with 5 million rows. I do:
DELETE FROM sessions WHERE ...; // No LIMIT clause.
Say it takes 10 seconds.
Then you restore the table and do the same with a LIMIT.
DELETE FROM sessions WHERE ... LIMIT 5000;
How long does that take compared to the first case?
Comment #6
dwightaspinwall commentedYeah - sorry for my terseness. I did the test using a select, and the rows came back immediately, which implies no huge temp table and filesort (which may indicate an indexing problem).
As for your question -- the length of time the delete takes appears to be linear with respect to the limit #.
Comment #7
dwightaspinwall commentedHere's a patch which does two things:
1. It allows the module to be enabled and disabled. It is disabled by default. When first installed, user is directed to settings page.
2. It has a configurable limit on the max number of deletions which can be performed in a single (cron) run. This is set by default to 1000. This should prevent sites from locking up when the module is first enabled and the # of rows in the sessions table is high.
.install file also attached for completeness (as a zip since I apparently can't upload .install extension)
Comment #8
dwightaspinwall commentedDisregard the patch in #7; it has a bug. The install file in #7 is fine. Attached is new patch file.
Comment #9
dwightaspinwall commentedThe other obvious benefit to this patch (#8) is that cursoring and doing individual deletes avoids the long-running table lock.
Incidentally, we've been running this in production for a week with no problems.
Comment #10
dwightaspinwall commentedI've made an improved patch (against dev), Replaces the above patches and does the following:
1. Removes the expire interval concept as this is now unnecessary given that long sessions table locks are avoided I.e. only one {sessions} row is deleted at a time.
2. Doesn't included enable/disable as this in unnecessary for the same reason as #1.
3. Retains the maximum deletions per invocation as this is important to avoid long-running cron jobs.
4. New install file (cleans up variable deletion code).
Marking this as critical because enabling this module as currently written in a production environment with a large session table will lock up the site under a common database configuration (mysql, myisam).
Comment #11
dwightaspinwall commentedComment #12
xurizaemonI believe DELETE ... LIMIT is invalid except in certain DBs (MySQL). This patch may need to check for a supported engine before applying the limit. The question was raised above but I don't see an answer to the question. Doesn't appear valid for Postgres.
Comment #13
dwightaspinwall commentedThe patch doesn't use the LIMIT clause in the DELETE statement. It cursors over candidate rows using SELECT and does individual DELETEs by session id.
Comment #14
dwightaspinwall commentedComment #15
dwightaspinwall commentedI think we need to get this committed. Khalid, if you'd like a co-maintainer let me know; glad to help.
Comment #16
kbahey commentedNeed someone to test this on a large database and report back.
I am concerned that this cursor approach will be too slow and resource intensive.
Comment #17
dwightaspinwall commentedThe maximum number of deletions per cron invocation (SESSION_EXPIRE_MAX_DELETIONS) is user-settable, and by default is 1000, which perhaps should be set by default to 100 for sites on underpowered stacks.
Can you define what you mean by large database? We have 200,000 users on http://www.hci.org. I started running this module (modified with the above patch) against the site when {sessions} had over 5,000,000 rows. Does that qualify?
Comment #18
dwightaspinwall commentedBump. This issue has been running for eight weeks. Time to get this code committed unless there are valid objections.
Comment #19
xurizaemonRTBC. 2.2M entries in session_expire on a moderately powered webserver, no significant delay while cron runs.
Much better than the status quo of having the site hang when session_expire_cron runs, too :)
Backport to D5 attached.
Comment #20
dwightaspinwall commentedThanks grobot. kbahey, can you update on your plans to get this into a release?
I reiterate my offer to co-maintain this module.
Comment #21
xurizaemonD6 patch for Git (I've used format-patch but please credit Dwight, not me). Just a re-roll of Dwight's earlier work.
The only effective changes are (1) retention of Khalid's copyright message and (2) correct attribution of the watchdog stmt as being from this module, not cron module. I don't either merits a separate issue.
Related issues -
* #263130: Big session tables can crash SQL server
* #559814: Sessions table (InnoDB) locked when expired sessions are being deleted
Comment #22
dwightaspinwall commentedThanks grobot. Unfortunately, it doesn't appear that this module is being maintained. Khalid - what's the maintenance status on this module?
Comment #23
rmiddle commentedCommitted to the dev branch on both the 6.x and 7.x branch but wasn't in time for the alpha1 release will be in the alpha2 release.
Thanks
Robert