Persistent caching refreshed on CRON
| Project: | Block Cache |
| Version: | 4.7.x-1.x-dev |
| Component: | Code |
| Category: | feature request |
| Priority: | critical |
| Assigned: | Unassigned |
| Status: | needs work |
I discovered this limitation of blockcache while trying to optimize a fairly sizeable instance. Here is our deadly mix:
1) Several blocks that run long, complex, user-unfriendly queries
2) Several background processes (via cron) that run node_save, expiring the cache
3) Significant traffic that requests these complex blocks as often as every few seconds
Obviously, blockcache is the only hope here. Here is how I tried to configure cache lifetime and what goes wrong:
1) EMPTY: blocks are cached as temporary and flushed almost immediately by the background process (node_save). This means site visitors get no benefit from block cache and are stuck with long-running block rebuilds.
2) POSITIVE: blocks are cached for a set lifetime. If it is too small, the result is much like the case above, since the cache is cleared so often. If it is larger, refreshes happen without regard to the application's logic and content can stay stale for a long time.
3) ZERO: blocks are cached permanently and never flushed. Of course this removes any interference with background processes and streamlines end user response time. But the blocks are not refreshed.
Worst of all, under the status quo, rebuilding long-running blocks while cron is running can easily deadlock the DB, because the same tables are read and updated at the same time. So this leads to real crashes, not just a performance hit.
Now, here is a proposed solution (to be configured on per-block basis):
1) Store blocks in cache permanently
2) Enable block refresh only on cron
3) Control timing by supplementing "cache lifetime" with "refresh delay" (how long to wait after the last refresh)
This makes cached blocks immune to constant flushes (by background processes), ensures rapid response time for the end user (since only the cached version is ever served) and allows for a single blockcache refresh after a batch cron job.
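To make the three steps above concrete, here is a minimal sketch of the idea in Python rather than Drupal code. It is not the module's implementation: the class name, the method names (`register`, `get`, `flush`, `cron`) and the choice to return nothing for a persistent block until cron has built it once are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class CachedBlock:
    content: str
    refreshed_at: float


class BlockCacheSketch:
    """Hypothetical model of 'persistent, refreshed only on cron' caching."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._builders: Dict[str, Callable[[], str]] = {}
        self._delays: Dict[str, float] = {}
        self._persistent: Dict[str, bool] = {}
        self._entries: Dict[str, CachedBlock] = {}

    def register(self, block_id: str, builder: Callable[[], str],
                 refresh_delay: float = 0.0, persistent: bool = True) -> None:
        self._builders[block_id] = builder
        self._delays[block_id] = refresh_delay
        self._persistent[block_id] = persistent

    def _rebuild(self, block_id: str) -> CachedBlock:
        entry = CachedBlock(self._builders[block_id](), self._clock())
        self._entries[block_id] = entry
        return entry

    def get(self, block_id: str) -> Optional[str]:
        """Serve a block during a page request."""
        entry = self._entries.get(block_id)
        if entry is None and not self._persistent[block_id]:
            # Temporary blocks are rebuilt on a cache miss (the slow path).
            entry = self._rebuild(block_id)
        # Persistent blocks are never rebuilt here; assumption: None until
        # the first cron run has populated them.
        return entry.content if entry else None

    def flush(self) -> None:
        """Regular flush (e.g. after a node save): persistent entries survive."""
        for block_id in list(self._entries):
            if not self._persistent[block_id]:
                del self._entries[block_id]

    def cron(self) -> List[str]:
        """Rebuild persistent blocks whose refresh delay has elapsed."""
        now = self._clock()
        refreshed = []
        for block_id in self._builders:
            if not self._persistent[block_id]:
                continue
            entry = self._entries.get(block_id)
            if entry is None or now - entry.refreshed_at >= self._delays[block_id]:
                self._rebuild(block_id)
                refreshed.append(block_id)
        return refreshed
```

In this sketch a regular flush never touches a persistent block, and the expensive builder runs only inside `cron()`, at most once per refresh delay.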
What do you think? I am going to try and implement this option but hope this can be merged back into the module.

#1
OK, I took a stab at the solution and it has been working perfectly for me. The fix adds one option that stores a block in the cache persistently (unaffected by regular flushes) and refreshes it only on cron, and only after the specified lifetime elapses.
I do not know how to make a patch, so I am attaching a file. This is ready to go but could benefit from further review and comment.
#2
Here are some resources to help you create a patch.
http://drupal.org/diffandpatch
http://drupal.org/node/128209
It's really hard for me to review your changes to the module without a patch of some kind. It sounds like you've got a really good point and I'd love to commit this to the module, but there's not an easy way for me to audit your changes to the module.
Thanks!
#3
dkruglyak, are you going to make a patch here?
#4
My final 4.7.x code is a mess and I have been working on a 5.2 upgrade...
It is not really ready to release, but if you want to help and jjeff helps with review and committing - I could put some effort into doing a release...
#5
Yes, please submit a patch.
#6
Subscribing...
What's the status of this issue? I'm using the development version for a 5.2 site. I have 4 cached blocks displayed on every page. One of them comes from the weather module. I have set up cron to run every hour. The problem is that even after a cron run, and after the weather module fetches new weather information, the cached version is displayed instead of the new one. Do you think this behavior is caused by what is described in this issue?
#7
What is the status of this module? I have what I THINK is a similar issue... I just checked my MySQL Slow Query log and Block Cache features VERY heavily in it. For example, I just noticed this query...
DELETE FROM cache_block WHERE cid LIKE "bc_1::%" OR cid LIKE "bc_2::%" OR cid LIKE "bc_3::%" OR cid LIKE "bc_4::%" OR cid LIKE "bc_5::%" OR cid LIKE "bc_6::%" OR cid LIKE "bc_7::%" OR cid LIKE "bc_8::%" OR cid LIKE "bc_9::%" OR cid LIKE "bc_10::%" OR cid LIKE "bc_11::%" OR cid LIKE "bc_12::%" OR cid LIKE "bc_13::%" OR cid LIKE "bc_16::%" OR cid LIKE "bc_17::%" OR cid LIKE "bc_18::%" OR cid LIKE "bc_19::%" OR cid LIKE "bc_20::%" OR cid LIKE "bc_21::%" OR cid LIKE "bc_22::%" OR cid LIKE "bc_23::%" OR cid LIKE "bc_24::%" OR cid LIKE "bc_25::%" OR cid LIKE "bc_26::%" OR cid LIKE "bc_27::%" OR cid LIKE "bc_28::%" OR cid LIKE "bc_29::%" OR cid LIKE "bc_30::%" OR cid LIKE "bc_34::%" OR cid LIKE "bc_35::%" OR cid LIKE "bc_36::%" OR cid LIKE "bc_37::%" OR cid LIKE "bc_38::%" OR cid LIKE "bc_39::%" OR cid LIKE "bc_40::%" OR cid LIKE "bc_41::%" OR cid LIKE "bc_42::%" OR cid LIKE "bc_43::%" OR cid LIKE "bc_44::%" OR cid LIKE "bc_45::%" OR cid LIKE "bc_46::%" OR cid LIKE "bc_47::%" OR cid LIKE "bc_48::%" OR cid LIKE "bc_49::%" OR cid LIKE "bc_50::%" OR cid LIKE "bc_51::%" OR cid LIKE "bc_57::%" OR cid LIKE "bc_58::%" OR cid LIKE "bc_59::%" OR cid LIKE "bc_60::%" OR cid LIKE "bc_61::%" OR cid LIKE "bc_62::%" OR cid LIKE "bc_63::%" OR cid LIKE "bc_64::%" OR cid LIKE "bc_65::%" OR cid LIKE "bc_66::%" OR cid LIKE "bc_71::%" OR cid LIKE "bc_72::%" OR cid LIKE "bc_75::%" OR cid LIKE "bc_76::%" OR cid LIKE "bc_77::%" OR cid LIKE "bc_78::%" OR cid LIKE "bc_79::%" OR cid LIKE "bc_80::%" OR cid LIKE "bc_81::%" OR cid LIKE "bc_82::%" OR cid LIKE "bc_83::%" OR cid LIKE "bc_84::%" OR cid LIKE "bc_85::%" OR cid LIKE "bc_86::%" OR cid LIKE "bc_87::%" OR cid LIKE "bc_88::%" OR cid LIKE "bc_89::%" OR cid LIKE "bc_90::%" OR cid LIKE "bc_91::%" OR cid LIKE "bc_92::%" OR cid LIKE "bc_93::%" OR cid LIKE "bc_94::%" OR cid LIKE "bc_95::%" OR cid LIKE "bc_96::%" OR cid LIKE "bc_97::%";
That's about 80 'LIKE' lookups in 1 query...
This query took nearly 4 seconds on a dedicated MySQL box with a dual core Opteron 3400 + 2 GB DDR RAM. Imagine the hit on an already overloaded box like the ones Dreamhost provide!
Now I'm not bad with MySQL but I can't actually see any better way of doing this... As I understand the function which does this, I don't think you can simply do "bc_%" because you might not want to clear all the cache entries.
Is this related to the same problem or should this be opened in a new issue?
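For readers following along, a query of this shape can be reproduced on a small scale. The sketch below is illustrative only: it uses Python's sqlite3 rather than MySQL, and the two-column table layout, the function names (`build_clear_query`, `clear_blocks`, `demo`) and the sample cid suffixes after "::" are all assumptions, not taken from the module. It shows why one LIKE clause per block is generated: only the listed blocks are cleared and the others stay cached.

```python
import sqlite3
from typing import List, Sequence, Tuple


def build_clear_query(block_ids: Sequence[int]) -> Tuple[str, List[str]]:
    """Assemble one 'cid LIKE ?' clause per block id, joined with OR."""
    clauses = " OR ".join("cid LIKE ?" for _ in block_ids)
    params = ["bc_%d::%%" % bid for bid in block_ids]
    return "DELETE FROM cache_block WHERE " + clauses, params


def clear_blocks(conn: sqlite3.Connection, block_ids: Sequence[int]) -> int:
    """Delete cache rows for the given blocks; return the number removed."""
    if not block_ids:
        return 0
    sql, params = build_clear_query(block_ids)
    cur = conn.execute(sql, params)
    conn.commit()
    return cur.rowcount


def demo() -> Tuple[int, List[str]]:
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified table layout.
    conn.execute("CREATE TABLE cache_block (cid TEXT PRIMARY KEY, data TEXT)")
    conn.executemany(
        "INSERT INTO cache_block VALUES (?, ?)",
        [
            ("bc_1::front", "a"),
            ("bc_1::node/5", "b"),
            ("bc_2::front", "c"),
            ("bc_14::front", "d"),  # not in the list below, must survive
        ],
    )
    removed = clear_blocks(conn, [1, 2])
    remaining = [row[0] for row in conn.execute(
        "SELECT cid FROM cache_block ORDER BY cid")]
    conn.close()
    return removed, remaining
```

Calling `demo()` clears the rows for blocks 1 and 2 and leaves the row for block 14 in place. Note that in SQL LIKE patterns the underscore is itself a single-character wildcard, so a pattern such as "bc_1::%" is slightly broader than a literal prefix match.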
#8
A patch for the 4.7 version of Block Cache based on dkruglyak's module in #1. This is completely untested but the point of this patch is to see what code is new.
dkruglyak, thanks for your contribution. It would really help if you would tell your text editor not to use tabs and add extra lines all over the place. It makes creating and reviewing patches a real pain.
#9
christefano, thanks for making sense of the patch. At this point my editor is fixed - that was really old code.
I suggest leaving this issue for 4.7.x and creating a new one for 5.x. My code went through many transformations and I am not sure if / how it could be untangled from custom logic and released. I might still do that though...
What is the block caching strategy in 6.x? If I were to put effort into releasing a 5.x version, it would make a lot of sense to make it as consistent as possible with a future 6.x migration.