Download & Extend

Boost uses up every available MySQL connection without stopping

Project:Boost
Version:6.x-1.18
Component:Caching logic
Category:support request
Priority:critical
Assigned:Unassigned
Status:active

Issue Summary

Boost is currently using up every single MySQL thread available on the server and every single php process (500). The majority of the requests are variations of this: (I'm guessing on the version since I just handle the hardware and server side stuff, I don't touch Drupal)

| 20208 | snip | localhost | snip | Query | 2 | Locked | UPDATE boost_cache_relationships SET base_dir = 'cache/normal/mgoblog.com', page_callback = 'node', page_type = 'forum', page_id = '29536', child_page_callback = 'node', child_page_type = 'blog', child_page_id = '29529', hash_url = '05312f3588e8cae3de4c859da04cf812', timestamp = '1266282588' WHERE hash = '62facbe345f92292a34624c753f7fb1c'|

| 20212 | snip | localhost | snip | Query | 2 | Locked | UPDATE boost_cache_relationships SET base_dir = 'cache/normal/mgoblog.com', page_callback = 'node', page_type = 'forum', page_id = '29542', child_page_callback = 'node', child_page_type = 'blog', child_page_id = '29529', hash_url = '3fd359911d9d585f304ee4e5cb59f3a9', timestamp = '1266282589' WHERE hash = 'af8c9daea9fea6d006aa233e4eba29d7'|

What is causing it to take every single PHP instance (and in turn every single MySQL thread)? Is there any way to set a max limit on what it can do?

Comments

#1

Whats your crawlers settings at? Also if your site has a very low cache lifetime with lots of hits then boost will be always be recreating all the pages every time cron is run. The way drupal works is each apache thread gets 1 mysql thread, so you should be seeing a lot of apache threads as well.

This tells me your cache lifetime is set to 1 hour.

<!-- Page cached by Boost @ 2010-02-15 20:53:31, expires @ 2010-02-15 21:53:31 -->

What is currently cached by boost? Only the front page? Looking at it I can see why this would be an issue in this case. Your front page contains 630 pages in total, boost is creating a relationship for the front page. To make this problem "go away" Uncheck both of these settings on the boost configuration page
admin/settings/performance/boost
* Clear all cached views pages associated with a node on update/delete
* Clear all cached views pages associated with a node on insert

And then you might want to experiment with turning this on if it's not right now.
* Clear the front page cache when promoted

#2

To answer some of the questions, I believe the entire site, including many of the nodes from user-generated content is cached. There is a ton of user-generated content as well as a ton of anonymous traffic (by last count there's an average of 180,000 requests/hour), so I believe it pays off to cache more than just the home page. And of course that makes sense that this kills the server since it's caching thousands of pages.

I'll see if those settings help. Is it possible to limit how fast boost caches (I know how ironic this statement is)? Can it be specified to only create X amount of concurrent connections? (Oh, and the server is running ngninx + php-fpm, though I don't think that makes much of a difference.) I believe it would be fine if Boost was limited to 100, 200 or so concurrent connections.

#3

try the above settings first, they are slightly DB intensive & thus it should reduce the load by turning off the views settings. In terms of a throttle for boost that's a little nuts IMHO. If all else fails you can turn on retro mode for boost which then doesn't use the database; this is not recommended if you have lots of pages as it has to do a directory scan when flushing the cache (more html files means worse performance, in retro mode).

#4

This seems a duplicate of #658942: Boost hook_nodeapi() silently fails in some update cases (I don't mark as duplicate since I'm not sure).

I have the same problem: many table locks, many slow queries. From my slow-query-log (those take more than 3 seconds to finish):

DELETE FROM boost_cache_relationships WHERE base_dir = 'cache/normal/www.thongtincongnghe.com' AND page_callback = 'node' AND page_type = 'story' AND page_id = '11621';
DELETE FROM boost_cache_relationships WHERE base_dir = 'cache/normal/www.thongtincongnghe.com' AND page_callback = 'node' AND page_type = 'story' AND page_id = '11464';
UPDATE boost_cache_relationships SET base_dir = 'cache/normal/www.thongtincongnghe.com', page_callback = 'taxonomy', page_type = 'Từ khoá', page_id = '6548', child_page_callback = 'node', child_page_type = 'story', child_page_id = '6441', hash_url = '568b4dd930c0ffc49306ece3e896bae0', timestamp = '1266790032' WHERE hash = '54b5ea151c3abf47798c17fca14d3cfc';
DELETE FROM boost_cache_relationships WHERE base_dir = 'cache/normal/www.thongtincongnghe.com' AND page_callback = 'node' AND page_type = 'story' AND page_id = '11672';
DELETE FROM boost_cache_relationships WHERE base_dir = 'cache/normal/www.thongtincongnghe.com' AND page_callback = 'node' AND page_type = 'story' AND page_id = '14190';
UPDATE boost_cache_relationships SET base_dir = 'cache/normal/www.thongtincongnghe.com', page_callback = 'taxonomy', page_type = 'Từ khoá', page_id = '3082', child_page_callback = 'node', child_page_type = 'story', child_page_id = '14565', hash_url = 'e60813a522a1ba15ac091e8926312def', timestamp = '1266790071' WHERE hash = '63ea22a018b6ddcb4b6fd31db68cc186';
DELETE FROM boost_cache_relationships WHERE base_dir = 'cache/normal/www.thongtincongnghe.com' AND page_callback = 'node' AND page_type = 'story' AND page_id = '11629';
SELECT hash, timestamp FROM boost_cache_relationships WHERE hash_url = 'c3236dcfa728b23eab018e559a0b51d6' ORDER BY timestamp DESC;
UPDATE boost_cache_relationships SET base_dir = 'cache/normal/www.thongtincongnghe.com', page_callback = 'node', page_type = 'story', page_id = '7550', child_page_callback = 'node', child_page_type = 'story', child_page_id = '15304', hash_url = '4061fe5da4456a54254c053ca80b3e0d', timestamp = '1266790275' WHERE hash = '50d349b892eae12725fa52092cb6332f';
DELETE FROM boost_cache_relationships WHERE base_dir = 'cache/normal/www.thongtincongnghe.com' AND page_callback = 'node' AND page_type = 'story' AND page_id = '12680';
DELETE FROM boost_cache_relationships WHERE base_dir = 'cache/normal/www.thongtincongnghe.com' AND page_callback = 'node' AND page_type = 'story' AND page_id = '11730';
SELECT * FROM `boost_cache_relationships`
DELETE FROM boost_cache_relationships WHERE base_dir = 'cache/normal/www.thongtincongnghe.com' AND page_callback = 'node' AND page_type = 'story' AND page_id = '8671';
DELETE FROM boost_cache_relationships WHERE base_dir = 'cache/normal/www.thongtincongnghe.com' AND page_callback = 'node' AND page_type = 'story' AND page_id = '1986';

On the hook_nodeapi $op='update', when I update a node, it often takes more than 30s to finish the update query and Drupal return a page not found on PHP timeout.

It's holiday in my country so I have rather low traffic now, about 10k visits/day, but this error happens fairly *often*. About the db size: 14k nodes, 14k terms, 20k users, node_revisions has 60 MB. About boost:

  • boost_cache: 72k rows, 25 MB
  • boost_cache_relationships: 480k rows, 108 MB

#5

Yeah that makes a lot of since if you have a view that has a lot of nodes it in; when you insert a new node into that view all pages of that view need to be expired. Thus the nodes that where all previously on page one, the last one will be on page 2, to make room for the latest node. If the view has a lot of pages, this can get expensive. I either need a better way of relating what nodes belong on what cached views pages or to somehow optimize this. Optimizing is very unlikely since I do no joins; thus feature this doesn't scale well with a very large view, and I don't see a good way to make it happen. I'm open to ideas, luckily for the front page it's a special case so I do have some ideas to try out, but for other views that are not the front page, I don't see a good way to do this ATM.

Long story short if you can live with your views (not the front page) not being expired at the same time a node is updated then turn the views thing off, Boost will stop using the relationship table for the most part. I have a special case for the front page (wither its a view or Drupal's built in front page), that if the node has Promoted to Front page checked then all cached front pages will be expired at the same time. I know where that node belongs (front page) and thus I don't need the relationship table.

So the solution that I need to code up is preventing the relationship code from running if the view is also the front page. This should help in whelck's case. As for jcisio, looking at your site I don't see a pager at the bottom.

@jcisio
What version of boost are you running? Have you run the latest update.php for boost?

#6

I don't use pager in frontpage as it is build with a lot of Panels, content is shown in categories. But effectively there is a Views with ten thousands of nodes http://www.thongtincongnghe.com/frontpage

And yes, I use the latest stable version of Boost (6.x-1.18), just installed it a few days ago. I see that all fields have their indices.

So, at the moment, what I should do is to turn off all the "clear cache when..." options. IMHO people don't care if a node get updated 15' later since they don't post content. Maybe the comments block (for not logged users) is a problem. Any view can get the comment of a node.

#7

Subscribing

#8

subscribe

#9

subscribe

#10

@arcane, @p4trizio, @locomo
I need details on how you have boost setup, if your using the crawler, cache lifetime, & if you have a view that contains every node (like the front page).

#11

Hello mikeytown2,

If I unchecked the following two:
* Clear all cached views pages associated with a node on update/delete
* Clear all cached views pages associated with a node on insert
And if I turn off the cron cache clear option, is there any different from using the reactro option?

I was using boost, had problem, then turn on the reactro, then disable boost and use the normal Drupal cache. It works great, except that my site is down for eventual peak (not really at peak time, it dies at any moment :( ). So I've returned to boost now :) But as I've said before, sites with hugh views, dozens of thousands of nodes/terms, don't like relationship table.

I think that the extreme setting, not using db at all, can be adjusted to work better: when a comment is submitted, update the node_view page (node/%nid if this file exists, and the corresponding alias). Because it is the most frustrating thing, when a guest tries to submit comment multiple times without seeing it appear. For block like "latest comments", maybe he doesn't care.

#12

Don't bother the comment above. I've just seen that boost uses hook_comment to clear appropriate pages. But still don't understand why some users post the same comments 3 times in a post in 3 minutes.

#13

Starting to work on some ideas for this
#779192: Even smarter views expiration

#14

Try the latest dev, Boost should be better behaved; still not ideal, but better.

#15

for a site with 65,398 nodes and quite a few DB hungry views, I had to wait for a 5-10 minutes untill a a complex CCK heavy node was updated. The worst part is when I was getting the blank page, all the content of the page was gone. I had to restore previous revision manually.
i followed jcisio comment (#6) what I should do is to turn off all the "clear cache when..." options. and I don't loose my content... Thanks jcisio!!!

nobody click here