The $mega_query in uc_followup_cron() is very aptly named. On a site with a lot of orders, it can take a LONG time to run. On a site that I'm currently maintaining, I'm using the Elysia Cron to restrict the UC Followup's cron to once a day at 2am, because it takes an average of 15 seconds to run.
Recently, it took 459 seconds to run, which combined with a large flush of traffic caused my database to spin out of control.
I'm creating this issue because I'm interested in exploring ways of improving uc_followup_cron() to make it more efficient.
Comments
Comment #1
DanGarthwaite commentedIt has been a long time since I fought with this, but I know a little more about SQL pattterns.
What if the unix timestamp of the future date is calculated at the time it is entered, and then the query would look like:
SELECT * FROM {table} WHERE next_reminder < CURRENT_TIMESTAMP() AND already_sent != 1;
If you want to repeat sending, recalculate the next current_timestamp, otherwise set already_sent = 1.
I've feelings of anxiety just remembering all the date logic in that mega_query....
Comment #2
m.stentaI found two small things that improve performance a bit:
For one, the module creates a combined index of order_id and followup_id in hook_schema:
... but in the mega query, it does a GROUP BY with them switched (followup_id first, then order_id):
In order for that GROUP BY to work with the MySQL index, it needs to be in the same order. So changing it to the following in the query helps a little:
For more info, see:
http://www.ovaistariq.net/17/mysql-indexes-multi-column-indexes-and-orde...
http://stackoverflow.com/questions/3915574/mysql-optimization-group-by-m...
The second thing that would improve query speed is adding an index to the date column in the {uc_followup_sent} table. The mega query performs an ORDER BY using the date column, which can be improved by indexing it. Adding the following to hook_schema would help (a hook_update_n will also be required):
Neither of these changes improved my queries significantly, but they did help a little. I think it's going to require more of an overhaul than just some improved indexing. I'm doing more digging, but will try to put together a patch that includes these improvements as well. Just thinking out loud for now...
Comment #3
m.stentaCommitted the two above-mentioned changes to the 6.x-1.x branch.
Still organizing my thoughts regarding overhauling the uc_followup_cron() function...
Comment #4
m.stentaamorsent and myself spent a good bit of time yesterday (on the plane to DrupalCon) dissecting the query to better understand what it's actually doing. We found a number of issues with it, which I'm willing to bet are responsible for some of the other bug reports in the issue queue.
I copied the query in its entirety below, restructured it a bit for easier reading (without changing what it's doing), and added lots of comments to explain it step by step. I noted the possible issues where applicable.
Conclusion: this query needs some serious work. I'm actually surprised that none of this has been discovered sooner. I guess the fact that *some* emails were going out hid the fact that almost everything else about it is NOT working correctly.
I'd love to get some second opinions on this, to make sure that the issues described above actually are issues. I've gone over this query many times, to double-check my assumptions, so I'm fairly confident about it. But I may have made a mistake or misunderstood something, so any/all feedback is welcome!
I've started reworking the query from scratch, and also reworking the uc_followup_cron() function. I'll post my ideas as soon as they are fleshed out.
Comment #5
m.stentaI think this issue is the cause of these other three issues:
#620800: Sending way too many emails!
#832416: Repeat send out emails every cron runs.
#724160: Followups not being sent
I'm going to redirect efforts from those issues to this one.
Comment #6
m.stentaPossible issue: if this module is installed on a site that has a lot of past orders, there's the potential for unwanted followups to be sent out for them.
It seems that a previous issue (#620800: Sending way too many emails!) tries to fix this by adding the BETWEEN date logic to the query, but that logic doesn't seem right.
Two possible approaches:
1) Add an "effective" timestamp to followup rules, which determines when they should go into effect. This would allow you to only send out emails for orders AFTER a followup rule is added.
And/or...
2) Add a global setting for uc_followup that allows you to define a sort of "statute of limitation" for followups... meaning only orders that have been modified in the past X days would be included as possible followups.
Comment #7
m.stentaChanging the name of this issue to better reflect it's new intentions... it's more than just a performance issue.
Comment #8
wqmeng commentedHello,
As some other reporter said before, the logic of the send out should be totally rewrite.
I have already give up to use this module for a long time, I have created a custom module and with a cron_hook which works just as I expected .
Comment #9
m.stentawqmeng: Thanks for the confirmation that it needs rewriting. If you have any code that you think would be helpful, please feel free to share it! I am working on a rewrite right now, and I think I have it mostly figured out, but if you have any additional ideas they are certainly welcome. I will be posting my findings soon.
Comment #10
m.stentaAttached is a proposed patch to 6.x-1.x-dev that is a full rewrite of the query and some of uc_followup_cron().
It also adds an 'effective_date' field to followups so that you don't run into the issue where an existing site with lots of past orders wants to send out tons of unnecessary followups as soon as you install this module (see #620800). It defaults all existing followups to become effective when you run update.php (although this may require more thought... it could end up missing some emails that way... ideas?)
Please review!
Note, this only changes the query in uc_followup_cron(), which takes care of automatic followups (the bigger problem). I still need to fix the query for determining which manual emails need to be sent, because right now it's still using the old (incorrect) logic.
Comment #11
m.stentaCouple of things wrong with the patch above, fixed in this new one:
-Forgot to put curly brackets around table names.
-Forgot to replace UNIX_TIMESTAMP() with %d placeholders.
-Moved $i++; out of the email sending "if" structure (see below).
I moved the $i++; because of the following scenario: imagine a site that, for whatever reason, ends up with orders that don't have email addresses. These will eventually build up in the query's results, resulting in needless uc_order_load()s. This is a problem regardless of where the $i++ is, but at least moving it down two lines will stop it from executing uc_order_load() more than the 'uc_followup_send_limit'.
I'm open to ideas on how to avoid that problem too.
Comment #12
m.stentaOk, here is a patch that includes the stuff in #11, but also fixes all the other queries in the module. I abstracted the query out into a helper function so that it can more easily be reused by other functions (and altered slightly depending on what they need).
Please test this patch. I'm starting to test it myself in a live site, so I'll let you know how that goes, but second opinions are welcome!
Comment #13
m.stentaJust to give you an idea of the performance improvements... on the site that I originally reported this ticket for, the max cron execution time has dropped from 459 seconds to 2 seconds! YEA!!!
I'm keeping an eye on the sent emails after every cron run and it appears to be working as expected.
Comment #14
amorsent commentedHurray!
Comment #15
m.stentaI'm committing this to 6.x-1.x-dev and closing this ticket as fixed. It's working well for me, which is 100% better than the current release of the module was, so I see no reason to wait for more reviews. Please submit issues if you find any bugs with it!
This update is also a great reason to make a 6.x-1.2 release (along with some of the other recent bugfixes). I might wait a bit on that to see if any issues are found with this... or I might not. At the very least I want to run a Code Review cleanup of the module to bring it all up to standards.