Using session_api with flag for anonymous users on Pressflow, everything is working beautifully after #785292, except that all anonymous sessions are being cleared on cron. This makes sense given the cron task:
function session_api_cron() {
// Fetch list of outdated sids.
$result = db_query("SELECT sap.sid FROM {session_api} sap LEFT JOIN {sessions} s ON (sap.session_id = s.sid) WHERE s.sid IS NULL");
$outdated_sids = array();
while ($session = db_fetch_object($result)) {
$outdated_sids[] = $session->sid;
}
if (!empty($outdated_sids)) {
module_invoke_all('session_api_cleanup', $outdated_sids);
db_query('DELETE FROM {session_api} WHERE sid IN (' . implode(',', $outdated_sids) . ')');
}
}
Basically, since a "faux session" is started without an entry in the sessions table, all of these entries are cleared on cron. It seems like there could be some expiration logic in the session_api table to combat this clearing. I realize this is an edge case, but it's a pretty big issue for Pressflow users.
| Comment | File | Size | Author |
|---|---|---|---|
| #39 | 1058960-expiration_logic-38.patch.txt | 937 bytes | a.ross |
| #38 | 1058960-expiration_logic-38.patch | 937 bytes | a.ross |
| #36 | 1058960-expiration_logic-36.patch | 937 bytes | a.ross |
| #29 | 1476828-bad-on-clause.patch | 622 bytes | jaydub |
| #22 | cron_expiration_logic_d6-1058960-22.patch | 3.02 KB | jessehs |
Comments
Comment #1
jide commentedEncountered this issue too with d7.
Moreover, unless I'm missing something, session_api and sessions tables are never left joined ? And the condition for the sid to be null is even more obscure to me. Something wrong here ?
Comment #2
jide commentedOkay, here is my attempt at a patch for this. TBH, I'm not sure I understood what the original intent of the author was, so I may be wrong with this, but I can't see how this could be different.
Comment #3
jide commentedStill not used to git patches. Here we go.
Comment #4
jide commentedOuch that
Shouldn't be here, must reroll the patch.
Comment #5
jide commentedCorrected the patch.
Comment #6
rbayliss commentedThat's pretty much exactly what I ended up with for D6/Pressflow as well. Just to make it clear for those coming later,
This is the Drupal 6 Patch. The Drupal 7 patch is in #5.
Comment #7
jide commentedNice, we should sync those two patches so that d6 and d7 versions are similar.
One thing I don't get though :
How would core sessions handling have any impact on this ? Am I missing something ?
Comment #8
jide commentedBTW, any clue on this join ? :
And this condition :
Since session_id is never NULL, I can't see what this JOIN and this WHERE purpose is.
Comment #9
rbayliss commentedThe issue occurs when anonymous users do something that requires Session API, but not a core session. Since both Pressflow D6 and D7 try to avoid starting a session for anonymous users, this happens quite a bit. In my case, it is anonymous flagging that causes a Session API cookie to be set and a Session API record to be written to the database, but no core session to be created (rightly so).
Before this patch, every cron run would clear these Session API sessions out of the database since the
WHERE s.sid IS NULLpart of the query was matched. So to answer your question, core has some control over Session API sessions in that as long as a sessions table entry exists, the matching Session API record will not be deleted. IMO, this is good and expected behavior to let core expire sessions (and with it Session API sessions) wherever possible, which is the reason I'm suggesting setting a 0 created date for any Session API records that match up to the sessions table:I admit this is semantically confusing to have a created column with a bunch of 0 values though, so maybe there's a better way to do this. Just to be clear, I'm suggesting we add expiration logic for the Session-API-only sessions, not every session API record.
Comment #10
rbayliss commentedRight, session_api.session_id is never NULL, but sessions.sid will be NULL if no matching core session exists.
Comment #11
chaps2 commentedI don't get this either. When will the session_id in session_api ever match the sid in the session table? They are created independently and are not the same. As far as I can tell, the left join will never be satisfied so all rows in the session_api table will be deleted with every cron run.
Is the original code in the cron hook just plain wrong or am I missing something too?
Comment #12
rbayliss commentedAh yeah, I see what you mean. Guess I missed that change. Seems like the intent of this module (prior to #785292: Incompatible with Pressflow) was to have the session_api.session_id column be the core session ID if one is set, which is why the left join is there to begin with. This patch just moves back to using the core session ID if one exists so that left join makes sense again. Otherwise, it should be the same as above.
Comment #13
jhedstromMarked #1140064: Cron hook incorrectly deletes all session api records as a duplicate.
Comment #14
jhedstromI'm afraid the approach in 12 may revert the logic added for Pressflow, since the session_id() function is used to "get and/or set the current session id", but I haven't had time to test it.
The patch for 7.x is looking good, but I'd love to add a test here to make certain that sessions aren't cleared out on cron that shouldn't be.
Comment #15
jhedstromchanging status.
Comment #16
ju.ri commentedhas anyone got this to work? I just tried the patch in #12 and got lot of duplicate entries errors :
User warning: Duplicate entry '2cb29cdbc3bafe0e...' for key 'session_id' query: INSERT INTO session_api (session_id, created)
Comment #17
jaydub commentedI've attempted to sync up the patches in #5 and #6. I avoided #12 since as noted in #14 that would seem to create issues with Pressflow which is the environment we are testing session_api under.
I added an index to the timestamp field since the core sessions table has one and in theory since the queries that operate on session_api table with these patches have where clauses on the timestamp field then it should be indexed. Feel free to strip that out.
Comment #18
jhedstromChanging status--I'd love some feedback on how the patches in #17 are working for people before I commit them.
Comment #19
jaydub commentedThere's definitely an issue with a simple check of isset($_SESSION) as a proxy for a drupal {sessions} table entry.
In our case we have Pressflow in place which does not write a row to the sessions table unless a data is explicitly added to the $_SESSION array and is meant to persist beyond the page request. There are other parts of Drupal that can add to the $_SESSION array but that will not persist beyond the page request. An example is a drupal_set_message() message. The $_SESSION array will be populated with messages during the building of the page and then emptied out when the page is rendered.
So in our case we have a flag that uses a confirmation form upon flagging. So for an anonymous user the following happens:
* User clicks flag link
* User is taken to page with flag confirmation form
* Upon confirmation, the flag is set and a session is created in session_api
** In the session_api.module code in session_api_get_sid(), this results in a row in session_api and since $_SESSION is not set at this point, the timestamp used for the session_api record is time().
** A message is set using drupal_set_message() to tell the user that the content was flagged and so $_SESSION is now populated (but after the record to session_api is added
* The form redirects back to the page that the user first flagged from.
** In the session_api.module code the session_api_get_sid() call now is an update. The same test for isset($_SESSION) is now true because the drupal_set_message() string is in the $_SESSION array but there still is not a row in the sessions table (Pressflow here).
** This results in a $row->timestamp value of 0 which is then used in the update to the row in session_api.
Now when the session_api_cron() runs, since this row has a timestamp value of 0 and there still is no row in the sessions table, this session data will be deleted from session_api instead of persisting until (timestamp + session_api_cookie_expire_time).
I'm looking for a reason why $row->timestamp shouldn't just always be set to time() in session_api_get_sid() so please correct me if I'm wrong here...
Comment #20
jerry commentedSubscribing.
Comment #21
jerry commentedChanging status--I'd love some feedback on how the patches in #17 are working for people before I commit them.
FWIW, the 6.x patch didn't help in my non-Pressflow installation. The symptom was that anonymous node flags were no longer visible to Views after a cron run, though the node itself still indicated that it was flagged. The behavior was unchanged post-patch.
I'm looking for a reason why $row->timestamp shouldn't just always be set to time() in session_api_get_sid() so please correct me if I'm wrong here...
I noted in the sessions table that the timestamp for my anonymous flag sessions was consistently being set to 0. Changing the timestamp to unconditionally use time() as above appeared to solve the Views problem described. I haven't had time to look at this closely enough to be confident that this is the right approach.
Comment #22
jessehsI re-rolled the D6 patch in #17, except I removed the logic that would save the timestamp as 0 when the $_SESSION variable is not set, following jerry's comment in #22 above.
I think the problem occurred (at least for me) when a session_api row was updated while the $_SESSION variable was set, as this set the timestamp to 0.
Rolled against 6.x-1.x head. Also note that this adds the $outdated_sids array back to the query in the hook_cron function, which apparently got removed in the 6.x branch.
Comment #23
mattcasey commentedThe patch in #22 worked for me on my site -- running Cron did not clear my flags this time. Using Sessions with Flag 6.x-2.0-beta6
Comment #24
trrroy commentedThe D7 patch in #17 is working for me.
Comment #25
ashokpola commentedThe D7 patch #17 is working fine for me. But still getting Duplicate entry errors in log. Is there any way of get rid off them?
Comment #26
ashokpola commentedI posted a patch to fix duplicate errors, please review it & let me know if any issues are found.
http://drupal.org/node/483176#comment-5699138
Comment #27
jhedstromI committed a slight modification of the patch in #17 to the 7.x branch (the db_add_table and db_add_index calls in 7.x no longer have the $ret parameter). Thanks all!
Comment #28
jhedstromComment #29
jaydub commentedThe patch first submitted in #5 and carried through in subsequent patches (including my own) for d7 added a JOIN on the wrong column in moving from the d6 SQL to the d7 DBTNG SQL:
The join should be on sap.session_id = s.sid on: + $query->leftJoin('sessions', 's', 'sap.sid = s.sid');
I think this is responsible for the issues #1401000: session_api_cron() not valid for PostgreSQL and #1476828: Postgresql and possibly other d7 related issues.
Patch is attached.
Comment #30
olarin commentedseems simple enough - caused no problems for me with latest dev version (although i'm not using pressflow nor postgresql anyway). any reason not to commit?
Comment #31
jm.federico commentedHi Guys
Sorry to move this needs work
But i'm still unsure on why we are doing the Join.
As I can see we are only fetching one field
If so, the LEFT JOIN is useless as it will always return the column in sap (session_api) regardless of what we have in session.
And a second problem is that we assume here that we have sessions in the SQL database, but what if we are using say mongodb for session storage?
I don't think we should use the join, I see no point in it.
If I missed something and the join ias needed, please let me know.
Cheers
Comment #32
jibus commented#29 resolve my problem on PostgreSQL
Comment #33
tic2000 commentedI have to say the the Pressflow patch was not very well thought through.
Before that patch there was a "link" between session_api and session tables. Now that link is totally gone. If that broken link is a good or a bad thing is not for me to decide.
Given that the left join is not wrong because it's on the wrong columns, is wrong because it was left there. sap_session_id and s.sid will NEVER be the same. The first is md5(ip_address() . time() . drupal_get_private_key()) the second is session_id().
That left join has to go because it does absolutely nothing.
Someone was concerned that session_id() not only gets, but also sets a session id and that would brake the pressflow patch. session_id() sets the session id only if an argument is passed to it AND if that is done before session_start(). Without an argument the function is just a getter and if no session id exists it returns "".
Comment #34
ju.ri commentedHi, I have tested the code in #22 for Drupal 6 and it seems to work well.
Will this be committed, aka good to use on a live site? There's a database change, that's why I'm asking.
The unresolved issues seem to be in D7?
Comment #35
goldAs it stands in #29 this fixes my issue with posgres. Reading further up though, that doesn't appear to be the original problem. I arrived here from #1401000. I'm going to steal the patch in #29 and apply it there. The patch at #29 does address the issue at #1401000.
Comment #36
a.ross commented#31 and #33 are correct and I came to that same conclusion. Here's a patch to remove the join, I also took the opportunity to use
db_deleteinstead ofdb_query.Comment #37
a.ross commentedComment #38
a.ross commentedComment #39
a.ross commentedRe-uploading for testbot
Comment #40
a.ross commentedI guess I'll commit this as it's labeled critical. I'd like some test coverage for the cron run though.
Comment #41
a.ross commented