boost_cache_kill() is very database intensive
dalin - November 6, 2009 - 04:32
| Project: | Boost |
| Version: | 6.x-1.x-dev |
| Component: | Expiration logic |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | closed |
Description
On this particular site there are > 100,000 pages cached by Boost. When a node is saved Boost will set the appropriate files as expired to be cleaned up on the next cron run. The problem being that the database is updated with one query per file. This is incredibly inefficient if 80,000 pages need to be expired. This means 80,000 queries, which results in a node save taking > 1 minute to complete.
A better way would be to do
UPDATE {boost_cache} SET expire = %d WHERE hash IN ('%s', '%s', ...)
Such a change is a bit tricky due to the surrounding logic, but I think it's doable.
I'm going to poke around a bit more and then start on a patch.

#1
Can you describe the problem in more detail? (cron flush, node flush, ect...) But I think this would be very doable do to the latest rewrite of boost.
In terms of a single node flush, boost_cache_expire_router() is the key, it already sorta has a batch gathering thing, it reads in multiple file names per lookup. The next step is to create a new function that is very similar to boost_cache_kill() but it would do a bulk kill. I believe you still have to kill 1 file at a time but setting the expiration time in the database in 1 big batch operation would be good. boost_cache_expire_router & boost_cache_kill are the 2 functions to look at. Let me know what your thinking.
One thing to remember is that on a large operation I'm guessing that the disk is taking more time then the database side of the equation. If this is the case for you you can:
* Enable the cron crawler
* Do not flush expired content on cron run, instead re-crawl and overwrite it.
* Expire content in DB, do not flush file.
* Overwrite the cached file if it already exists.
This will not "lock up" the system it will overwrite each file and update the database at a slower pace, allowing for your site to always be serving content from the cache, while avoiding the database issue.
#2
I'm not too concerned if cron takes a while to do its thing. It's the page taking a minute to load when content editors save a node. While the disk time for the file deletion probably contributes, in my case the problem is more on the database side of things. 80,000 queries to a database that lives on another server will take at least 40 seconds (ping time to the DB server is 0.5ms). While one big query might only take 1 second.
I wonder if you can batch the file deletions with the glob stream wrapper.
#3
When editing a node, why does the entire cache get flushed? You might want to look into ajax so you don't have to flush 80k pages at once; store the changing content in an ajax block. Nonetheless doing a batch delete would be the best way to do this, it's not 100% guaranteed though; by doing a batch there is a slight chance that it can get out of sync; db says it's expired when its not, leading to a file never getting flushed. So I will need to create some sort of checker... actually using that instability to do the file deletions might be the safe way to do this.
So in your case node gets edited, 80k pages need to get flushed. Set database to expire=0. Call up a second function that looks for all expired files and makes sure they don't exist. This checker would be run at the end of cron as well. This ensures that if a file was set to be flushed in the database but for some reason it didn't get killed, eventually it will be. So in your case you could delay the massive deletion until cron runs. I would still want the actual node to be flushed, just the related ones (79,999) could get flushed at a later time when your editor is not trying to save a node.
U follow, thoughts?
#4
Here's a patch.
I sort-of follow your train of logic, but I'm not quite intimate enough with Boost to keep up with you. Not sure that I understand the chance of things getting out-of-sync. This patch keeps everything the same, it just avoids 80,000 little update queries and puts them all in one.
#5
Sweet, I'll look it over and probably put it in once I do some checking on this end. Reason for the instability is PHP could time out; when doing large operations, I try to take into account the fact that it might not to finish, thus in the end it will almost always work because I got that covered with logic and code.
#6
Looking over your patch, there is a lot of good! One issue though... you unlink the file and then set the database. This is not ideal. The database should be set first, and then the files cleared.
If a request comes in on a file that you just deleted, that file will be created and the database will be updated. Then when the the database gets expired (in the new bulk operation) the newly created item will have it's database set to expired, but that file will exist now; thus creating a file that will never be expired, because the database says it doesn't exist.
#7
#8
Here's my minor tweak on your patch.
#9
Added some comments and made the if statement in the kill function look more like the comment above.
#10
I would like to commit this soon so I can get this patch in as well; they both conflict, but yours goes in first since you wrote it. #623536: Expiration validation (be smart in short)
#11
committed
eh good for now; that other patch needs to get in.
#12
boost_remove_db() has incorrect SQL syntax
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '\'4d4833b0c6ee1e10c832ccc5d14599cc\')' at line 1 query: DELETE FROM boost_cache WHERE hash IN (\'4d4833b0c6ee1e10c832ccc5d14599cc\') in /boost/boost.module on line 2252.
Odds are I will be replacing this function with the old one.
<?phpfunction boost_remove_db($files) {
$hashes = array();
foreach ($files as $file) {
if (empty($file['hash'])) {
$file['hash'] = md5($file['filename']);
}
$hashes[] = "'" .$file['hash'] ."'";
}
if ($hashes) {
$hashes = implode(', ', $hashes);
db_query("DELETE FROM {boost_cache} WHERE hash IN ($hashes)");
}
}
?>
#13
never mind I already fixed it. I need some sleep.
#14
Automatically closed -- issue fixed for 2 weeks with no activity.