Hey fellow Drupalers :)
I'm converting my large community site to Drupal (the site's current content is just a forum and has about 20,000 users, 100,000 nodes, and 800,000 comments, running a custom-coded CMS/forum). However I've run into a relatively serious snag regarding performance on Drupal core's forum.module with this number of nodes and comments, and I'm hoping for advice on how to get a Drupal forum this size to a "usable" level of performance.
My current forum was custom coded, and although not modular and totally awesome like Drupal, the forum loads the same amount of data in about 0.02 seconds with no caching enabled (I don't expect performance "that" good of course).
I decided to run the following test, both to help me understand, as well as to see if I can help CentOS.org decide whether to convert their site/forum to Drupal.
Test setup details:
- Drupal 6.4, no contrib modules other than Devel (used Generate content function to create users and forum nodes).
- Users: 20,000
- Forum nodes: sets of 10,000 up to 100,000
- Comments: 8 per node, total of 800,000
- Test system: MacBook Pro (2.33 GHz core2duo, 2GB RAM), running MAMP (APC opcode cache enabled, no performance tweaks or MySQL query_cache).
Test procedure:
-
First added 1000 nodes to the forum for a baseline (8 comments per node), and then increased to 10,000, and continued in increments of 10,000 until I reached 100,000 nodes and 800,000 comments. At each step I saved PDF output of a set of pages, including Devel's query/performance log of: 1) The main forum list, 2) a Topic List that includes a sub-forum, 3) a Topic List with no sub-forum, and 4) a thread.
Test results:
- Google Chart showing a line graph of performance as nodes and comments are added.
- Google Spreadsheet containing the raw page load times.
- PDF files saved from each phase of the test.
Overview of page load time at 100,000 nodes / 800,000 comments:
- Forum list: 12.8 seconds
- Topic list w/sub-forum: 8.2 seconds
- Topic list: 1.2 seconds
- Thread: 0.6 seconds
- (In case anyone is curious, 100,000 nodes and 800,000 comments takes about 1.8GB in the database)
Synopsis / Assessment
The Forum List and the Topic List w/sub-forum, without a caching solution for logged in users, are clearly the tricky ones. The other page load times are ok, though do show a pattern of continued slow-down as more nodes/comments are added (so a forum with several multiples of the amount of nodes/comments would probably begin feeling the pinch on these pages as well).
Almost all of the performance bottleneck is related to the following queries (I'm aware these are hard-coded in, and can't be fixed until D7 at the earliest, preferably with some kind of pre-stored comment counts so it doesn't have to be calculated on the fly every page load):
- forum_get_forums (2 instances on the main Forum list serving different purposes, 1 at up to 6.3 seconds, and then about 0.6 seconds times the number of forums listed on the page... the larger the forum's content, the longer the delay).
- _forum_topics_unread (about 0.2 seconds times the number of forums, totaling 1-2 seconds depending on forum size).
- pager_query (on Topic listings, 2 instances adding up to about 1.2 seconds).
My caching attempts - thus far unsuccessful
Anyhow, I'm sure there must be a way to cache the forum sufficiently for logged in users (or else custom tweak the bad queries somehow), as I've seen several Drupal forums (including drupal.org of course) handling Forum node/comment quantities this high or higher (for instance Gallery2, libcom.org, etc). On Drupal 5 I did try the Advanced Cache patches/module (which are now broken in the current release, as well as not ported to D6)... they did make the forum blazing fast, but only until a new post was made, at which time the next unlucky visitor gets the whoooole delay :)
I also tried enabling the MySQL query_cache by adding the following lines into a my.cnf file:
query_cache_type = 1
query_cache_size = 25M
Using query_cache gave similar great results as Advanced Cache, cutting load time from the tens of thousands of ms down to just hundreds of ms - usually under 1 second, but only until a new post is made, causing the cache to need to be regenerated. It's understandable that the cache needs to be regenerated at this point, but ideally it would be nice to have the server take care of it immediately on the back-end the instant the cache becomes invalid, instead of waiting for the next user to trigger it and make them wait.
If anyone has a large Drupal forum (or knows how drupal.org caches its forum) and would share how you're pulling it off, that would be terrific! :)
Also in the long term: I'm also willing to spend at least in the high hundreds of dollars to hire a developer to ensure (if possible) that the bad queries (at least forum_get_forums) are repaired by Drupal 7's release (for my site's and everyone's benefit)... I believe it would require a number of core patches. If it will cost more than that I'll be happy to lead a fundraiser.
Anyhow, thanks everyone in advance for any advice you can give.
-- David
absolutecross.com
[new guide/lesson in progress: Creating a CCK and Views powered Drupal site - feedback welcome]
Comments
You'll have to find it...
... but somewhere there is a discussion about how taking the previous/next forum links out of the forum query makes things fast.
Also, if you want to work on the advcache module patches I'd help in any way I can (including committing them and bringing out a D6 version of the advcache patches).
Cache stampede patch in memcache queue
This makes me think of chx's cache stampede patch that is in the memcache queue. It would mean you have to run memcache (which is cool and a good idea anyway), but it might help with cache rebuilds. Another strategy would be to simulate that unlucky user and call the URL as a shutdown function any time the cache clears so that the cache gets warmed before a human actually calls it.
- Robert Douglass
-----
my Drupal book
Have you made...
Any progress on this problem? I'm in the same exact situation. Any advice would be greatly appreciated!
Warm regards,
Mike
--
Veneer | Crack-proof Creative
http://www.veneerstudio.com
Civicsolar
http://www.civicsolar.com
Subscribe - Corey
Subscribe
- Corey
High performance
I think this would make a truly interesting thread on Drupal's High performance group, probably the best place for these topics. There is also a list of related resources.
Update
I wrote to Liam McDermott, who runs a large Drupal forum called webmaster-forums.net and he gave me some very excellent tips.
He noted that my test here was not quite accurate since I did not restart Apache/MySQL between importing the data into Drupal and measuring the results. He said (since he states it better than I can): "After you've inserted a huge amount of data you should probably restart Apache and MySQL to get them to clear their caches. When running similar tests I've found Apache and MySQL will eat quite a bit of memory (this is by design I believe, more to do with OS design than anything else)."
He also recommended I run these commands to make MySQL check for errors and make sure its indexes are up to date:
mysqlcheck -cep --all-databases
mysqlcheck -op --all-databases
Since I have many databases I didn't need to run this on, I ran the commands as mysqlcheck -u root -cep mydatabase ... mysqlcheck -u root -op mydatabase ... doing this fixed overhead on the database tables.
Liam also noted that Devel module introduces its own overhead on the test as well, so in my followup testing, I tried measured both with Devel on, and Devel off/fully uninstalled and timed using a stopwatch.
I then loaded up my Drupal site where I had left off at the end of my previous test, ensured Apache/MySQL were freshly restarted and Drupal's caches were initially cleared out. The result was an improvement of about 4 seconds on the main forum listing page. My average load time was now about 8.5 seconds instead of 12.8 seconds. I reloaded the page many times to confirm the average.
In my tests to determine Devel's overhead on the test, my results were that Devel was adding approximately 0.25 seconds to the time (less than I expected it was adding, but good to know).
Since I noticed that the webmaster-forums.net main forum listing is customized, I asked about how this was done. Liam confirmed that it is done with only theming, no custom/overridden queries that I expected.
Liam also provided some suggestions for adjusting /etc/my.cnf ... I will test them and report back the settings if they improve performance. He also recommended using YSlow extension for Firefox to figure out how to improve the front-end performance.
I also tested out Memcache and have not yet achieved any performance benefit in this case from it (Devel query/load stats are approximately the same with and without Memcache... the improvement looks like about 200ms at best). Hopefully I have simply misconfigured something. I used Nate from Lullabot's guide: Setup a Memcached-Enabled MAMP Sandbox Environment. Memcached is installed correctly and shows output for memcached -h. For my settings.php I used the example given in the Lullabot article. I tested with both memcache.db.inc and memcache.inc. When testing with memcache.inc I emptied Drupal's cache tables and confirmed that they remained empty after visiting pages throughout the site (therefore Memcache is definitely handling the caching, though I'm not really sure if it is successfully caching, or if Drupal is just loading the pages with no caching. I'm not sure how to check what the hit/miss ratio on Memcache is). Memcache statistics showing in the page footer includes data for get, set, bins (shows entries for cache and cache_menu), but the hits area is empty.
In case it is helpful in troubleshooting, it's notable that when loading my /forum page, my CPU monitor shows a spike up to about 100% CPU on a mysqld process for the majority of the time the /forum page is being loaded. It seems to me that this could be why Memcache isn't helping as much as I'd hoped, since the slow-down could really be that the CPU has to crunch through such huge tables to calculate the output for the forum_get_forums queries (whether the tables are in memory or not it still has to calculate them). Just an uneducated theory though.
If anyone can provide further advice, it would be greatly appreciated :)
-- David
davidnewkerk.com | absolutecross.com
View my Drupal lessons & guides
Re-testing on high-end hardware
I'm repeating my testing on a demo site running on my production server tonight... will post my results. The server is a SoftLayer hosted SuperMicro 3220-Quad Xeon 2.4GHz (Kentsfield) - 2 x 4MB cache, 4GB RAM, 7200RPM SATAII drive, running CentOS 5 and no control panel software.
-- David
davidnewkerk.com | absolutecross.com
View my Drupal lessons & guides
Updated test on server-class hardware
Here are my new test results running on the production server (hardware specs mentioned in my previous post, also of note is the server runs APC, PHP 5.2.x, Apache 2, MySQL 5), produced on a clean install of Drupal 6.9, with no modifications and only core modules, and Devel + Devel Generate. This is a test "only" of the main /forum page. As before, the test has 20,000 users, and builds up to a level of 100,000 nodes and 800,000 comments (every node has 8 comments). You can see my live demo here: http://dev.davidnewkerk.com/forum (Page caching is not enabled, so you can see about the load time of a logged in user I believe). Though the real site runs on the server, it has almost no measurable effect on the server, so the server's resources are nearly 100% dedicated to this test.
Between each phase of the test, I restarted MySQL and Apache, and ran mysqlcheck -u root -op databasename. The posted result is about the average (I reloaded the /forum page many times and chose a result that represented the average. Also as mentioned in my previous post, Devel module adds overhead to the results. Using a stopwatch it was difficult to measure how much, though it is likely about 0.15 seconds on this server.
I measured the effect of restarting mysql/apache and running the database check/repair... the difference (on this server) afterward was between 50 - 75ms faster.
Fortunately the speed of my real server makes a definite impact on the load times, more than twice as fast as my test system, finishing up at about 3.6 seconds for 100,000 nodes and 800,000 replies (almost 100% of this time is MySQL). Though the real site with more modules will likely increase this, performance on this server is at least "into the ballpark" of where I have a hope of getting it to a good level, given every other performance tweak I can throw at Drupal (and the front end using YSlow) to make up for it. I will probably also resort to significantly pruning out old data, and put up a static archive of it (so ultimately I could reduce the data enough to get as low as 1.5 to 2 seconds). My brief research into the subject of multi-core use for MySQL implied that the quad core is not actually helping in this case (correct me if wrong, though it appears that a single MySQL query gets handled by only one core at this point).
As soon as possible I'd like to test further to see whether the distribution of nodes/comments into a larger number of forum categories has any positive effect (as of course a real-world forum doesn't usually have such a uniform distribution of so many nodes/comments in a small group of forums), or whether it's mostly/completely the huge size of the node/comment tables in any case (and the joins in the query) that causes the performance bottleneck.
I also still need to test MySQL optimizations, install memcache on the production server (and get it to work), and run a test using a copy of the real data of the site.
The number of nodes is a bit odd at the beginning due to a bug I experienced in the current version of Devel (would not generate more than 50 nodes at a time). I switched to an older working version and continued after that.
-- David
davidnewkerk.com | absolutecross.com
View my Drupal lessons & guides
Updated test on sever-class hardware
Hi
I'm just curious -- how many simultaneous user connections are your simulating in your tests?
Thanks
Dave
Subscribing
+1
Multi Lingual Peformance Loading Problems
We are having forum performance loading problems now that we have activated multilingual versions of the web site.
/ forum_get_forums / SELECT r.tid, COUNT(n.nid) AS topic_count, SUM(l.comment_count) AS comment_count FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r ON n.vid = r.vid LEFT JOIN node i18n ON n.tnid > 0 AND n.tnid = i18n.tnid AND i18n.language = 'zh-hans' WHERE (n.language ='zh-hans' OR n.language ='' OR n.language IS NULL OR n.language = 'en' AND i18n.nid IS NULL) AND ( n.status = 1 )GROUP BY r.tid
I think it can be because the query uses Count(n.nid) while the table is in innodb format.
Any suggestions would be appreciated.
Same problem here. The query
Same problem here. The query is very very large.
http://www.DROWL.de || Professionelle Drupal Lösungen aus Ostwestfalen-Lippe (OWL)
http://www.webks.de || webks: websolutions kept simple - Webbasierte Lösungen die einfach überzeugen!
http://www.drupal-theming.com || Individuelle Responsive Themes