Hello,

We had our first customer running a Drupal site make the front page of digg.com yesterday, and we have some interesting observations. My hope in posting this thread is twofold: 1) to provide Drupal users with some information on how to better optimize their installs, and 2) to provide some feedback to the Drupal development team.

First off, we run a clustered environment where all services (control panel, dns1, dns2, web, email, MySQL, etc.) are separated onto dedicated servers.

As soon as our client's site (www.medopedia.com) made the home page of digg.com, our server loads skyrocketed. The link to the digg.com story:

http://digg.com/health/6_Common_Myths_about_Sleep

Server Specifications:
Web Server - Dual Xeon, 4 GB RAM, SCSI Drives, baseline load average (before digg effect) = 0.50
MySQL Server - Dual Xeon, 8 GB RAM, SCSI Drives, baseline load average (before digg effect) = 0.20

As soon as the site was featured on digg.com, the web server load shot over 40 (effectively rendering the service unavailable) and the MySQL server load increased slightly.

Caching was already enabled on this site. However, adding the following mod_gzip directives to .htaccess reduced the web server load to below 6:

<IfModule mod_gzip.c>
# Enable mod_gzip and re-chunk responses after compression
mod_gzip_on       Yes
mod_gzip_dechunk  Yes
# Compress text-like files, CGI output, and text/JavaScript MIME types
mod_gzip_item_include file      \.(html?|txt|css|js|php|pl)$
mod_gzip_item_include handler   ^cgi-script$
mod_gzip_item_include mime      ^text/.*
mod_gzip_item_include mime      ^application/x-javascript.*
# Skip images and anything already gzip-encoded
mod_gzip_item_exclude mime      ^image/.*
mod_gzip_item_exclude rspheader ^Content-Encoding:.*gzip.*
</IfModule>
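
For Apache 2.x environments, note that mod_gzip is an Apache 1.3 module; the usual equivalent is mod_deflate. A minimal sketch using standard mod_deflate directives (not the exact configuration from this server):

<IfModule mod_deflate.c>
# Compress text-like content types on the fly
AddOutputFilterByType DEFLATE text/html text/plain text/css application/x-javascript
# Images are already compressed; skip them
SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip
</IfModule>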

After a couple of outages on this particular web server, we were able to stabilize load, and the client's site remained up for the rest of the duration of their front page listing. However, I am a little concerned about Drupal here, as it does not seem to perform as well as other CMS solutions we regularly use (i.e. Joomla). We have been watching Drupal for quite some time and are anxious to include it in our supported applications list - it's a wonderful solution, it supports pgsql, and clients love it - however, we need to be able to handle these sorts of loads without issue in a shared environment in order to take on this application.

Even with the various modifications we made to the web server, the .htaccess changes, etc., a load average of 6 on a server dedicated to web service only is still far too high for this amount of traffic. We will regularly see 2-3 times the traffic on a Joomla site and loads will not reach much above 2 - and this is without any specific optimization, caching, etc. My concern is that a presence on digg.com crashed our server until we were able to identify the issue and modify the install so that loads could be sustained - and this just makes us, as providers, look bad.

I would like to see the Drupal team further improve this product with respect to server resource utilization. I believe the queries can be cleaned up and streamlined to reduce relative server load during periods of high traffic burst. Obviously, we understand that database-driven applications will inherently produce higher loads than static sites, for example - however, I see no reason, based on our experience with other solutions, that the relative load effect of Drupal under higher-than-normal traffic cannot be further optimized.

Your thoughts, comments and any other suggestions towards optimizing Drupal in a shared environment are all welcome.

Comments

Michelle’s picture

Might want to have a look at this one for a similar discussion: http://drupal.org/node/137898

Michelle

--------------------------------------
My site: http://shellmultimedia.com

cartika’s picture

Michelle, though I appreciate your response, I have read that thread, and frankly, it is not at all related.

Our entire business model is to host applications. Our primary shared cluster, including VDC (Virtual Dedicated Cluster) instances, is pushing 100 servers. We are hosting thousands of applications across many platforms (Linux, Unix, Windows, etc.). This is not a case of our environment being oversold and requiring us to suspend accounts for resource utilization. We are not selling 1 TB packages for $9. The only time we suspend accounts for CPU violations is typically when traffic bursts are MUCH higher than what you would see from a digg.com listing (in which case the account would typically be moved rather than suspended), or when poorly written code cannot scale to handle the load.

What I am trying to tell you here is that Drupal generated far too much load, compared to similar scripts, for the relative burst it was asked to handle. If Drupal cannot handle this burst in our shared environment, then I highly doubt there is any shared environment that can.

I would prefer if this discussion was focused on improving Drupal performance, specifically under bursting conditions, versus trying to blame an overloaded hosting environment, as that is not the case here (which is why I posted baseline loads).

From a server perspective, increasing the number of connections and altering the timeout settings, in combination with activating gzip at the user/domain level, really helped Drupal. However, in my opinion, it was still not sufficient, and this needs to be addressed either at the server optimization level (all suggestions welcome) or at the Drupal core code level (again, all comments welcome).
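
For reference, the connection and timeout changes were along the following lines; an illustrative httpd.conf sketch (values are examples only, not the exact ones we used; tune to available RAM and traffic profile):

# httpd.conf (prefork MPM) - illustrative values only
# Drop stalled connections faster than the 300-second default:
Timeout 30
# Short keep-alive frees workers quickly under burst traffic:
KeepAlive On
KeepAliveTimeout 2
MaxKeepAliveRequests 100
<IfModule prefork.c>
StartServers 10
MinSpareServers 10
MaxSpareServers 20
# Raise the concurrent-connection ceiling:
MaxClients 256
# Recycle children periodically to contain memory growth:
MaxRequestsPerChild 4000
</IfModule>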

www.cartikahosting.com
business grade, clustered application hosting

Michelle’s picture

Well, if figuring out ways to improve Drupal's performance isn't related, then ok...

Michelle

--------------------------------------
My site: http://shellmultimedia.com

pwolanin’s picture

You might draw the attention of some of the most relevant people if you post a link to this or repost to these groups:

http://groups.drupal.org/high-performance

http://groups.drupal.org/coding-standards-and-performance-optimization

---
Work: BioRAFT

VM’s picture

Considering your position as a host, you have probably already read the handbook page on server tuning for Drupal. In case you haven't, see: http://drupal.org/node/2601

Michelle was more than likely just pointing out the thread, not as a problem solver, but as a representation of a recent discussion where resources and optimization are being discussed, although evidently not very actively.

kbahey had a long post on optimization a few months back, but for the life of me I can't find it easily. If I track it down in his tracker, I'll be sure to post it back here.

edit: he's consolidated much of the same information on his consulting site here: http://2bits.com/articles/drupal-performance-tuning-and-optimization-for...

Michelle’s picture

Michelle was more than likely just pointing out the thread, not as a problem solver, but as a representation of a recent discussion where resources and optimization are being discussed

Yeah, I thought the discussion of the various caching methods, of whether or not Drupal uses too many database queries, and of what the developers are doing to improve performance was related to this one. But he seems to be hung up on the fact that the poster is on shared hosting, so dunno.

Michelle

--------------------------------------
My site: http://shellmultimedia.com

cartika’s picture

Thank you for the responses and resources. We have seen some of these previously, others we have not - and I certainly do appreciate the references.

Michelle - Sorry for coming off harsh, I just really did not want to get into a discussion around oversold hosting, etc., as this really isn't relevant here...

Looking through some of those resources, my core concerns still remain. We will obviously implement a lot of those suggestions for this client once we move them to their own VDC (which is basically a dedicated web/db server plugged into our cluster).

However, I still have concerns around core Drupal code, especially in a shared environment. I fully understand that making a DB-driven application run more efficiently is a science in itself - is Drupal planning on spending more time streamlining the actual code base in future releases? Again, I have very high hopes for this application and believe a LOT of clients will find value in it. However, I feel it is important for both hosting providers and developers to work on reducing the relative CPU load of an install for a given volume of traffic (i.e., Drupal should have NO issue whatsoever accommodating a front page listing on digg in any reasonable shared environment - obviously, once you see much larger bursts, more dedicated environments need to be considered).

My primary focus here is not unrealistic - no one (well, at least not myself) is expecting Drupal to accommodate massive bursts in oversold shared hosting environments; that is unrealistic. However, it is certainly realistic to expect the script to operate efficiently enough in a good environment, with "standard" application optimization techniques, to accommodate a burst as small as a digg.com listing.

I think if this is focused on now (again, by both hosting providers and developers), Drupal can avoid the pitfalls found with scripts like WordPress and really excel as more companies adopt dynamic web solutions and move towards integration of their various systems.

Thanks
Andrew

sepeck’s picture

Drupal developers have always spent time working on the performance aspects of the code base. The presentation Rasmus did, with a shot of the callgrind tool run against a Drupal site, showed that there wasn't 'one' place to work on the inefficiencies; it was a whole series of little steps. Dries has posted performance tests he's done on his blog.

Drupal 6 already has some improvements in the menu module. Depending on your site's structure, tests showed 6-15% performance improvements. There are others being worked on as well, and for many of them you will see performance data being submitted before the patch is accepted.

Check the issue queue for Drupal 6. Subscribe to the development list. People are working on things, but in order to help, you have to spend some time tracking and testing people's patches and giving feedback. If you have experience, then please put it to use for the next version.

-Steven Peck
---------
Test site, always start with a test site.
Drupal Best Practices Guide -|- Black Mountain


cartika’s picture

Drupal developers have always spent time working on the performance aspects of the code base. The presentation Rasmus did, with a shot of the callgrind tool run against a Drupal site, showed that there wasn't 'one' place to work on the inefficiencies; it was a whole series of little steps. Dries has posted performance tests he's done on his blog.

Interesting - and overall, this is what one would expect to find (a whole series of little steps), as any really obvious ones would be easily identifiable.

Check the issue queue for Drupal 6. Subscribe to the development list. People are working on things, but in order to help, you have to spend some time tracking and testing people's patches and giving feedback. If you have experience, then please put it to use for the next version.

We have no issue posting/submitting our findings - this entire thing caught us a little off guard, as something as simple as a digg effect has never brought one of our servers down before. I read some other posts comparing the "speed" of Joomla vs Drupal in various scenarios - and honestly, in the real world, this doesn't matter. We do not care about relative speed, as they are both "fast enough" - however, we have often found that speed and the resulting impact on CPU/memory usage are not always linked as linearly as people believe. We have Joomla sites with specific event dates where traffic is significantly more than what was seen here, and the resulting impact on server resources was significantly less (and keep in mind, no caching is being used on these Joomla sites). So, although I appreciate the stats comparing the performance of Joomla and Drupal in various scenarios, they aren't addressing the problems we have noticed with Drupal that do not exist with Joomla.

We will analyze our logs of this event and will happily contribute them to the Drupal team to see if additional bottlenecks can be identified/removed in v6. Again, I know this thread sounds like I am focusing on the negatives of Drupal; in actuality, it is because we see the potential in Drupal that we are even bothering to discuss our concerns with the community and work towards a resolution. If this client was using WordPress, we simply would have told them to either get off WordPress or continue to throw hardware at it - but I truly believe that Drupal should be able to handle these sorts of bursts in a proper shared hosting environment without issue, and hearing that this is something being actively worked on is encouraging.

Thanks for all of the comments, resources and feedback so far...

www.cartikahosting.com
business grade, clustered application hosting

Anonymous’s picture

I am the owner of the site in question, (medopedia) and wanted to add some background information.

This is a relatively small site, with around 900 nodes, only 1 user (admin), around 20 taxonomy terms, and the following contributed modules:

    Adsense, Adsense Injector, GSitemap, Pathauto, Nodewords, Quicktags, PoormansCron, Update Status, and Page Title.

Cache was on but not in Aggressive Mode.

The node in question got around 55,000 views in a 24-hour period, with probably 1/4 of that in the first 2 hours of going on Digg's front page (roughly 13,000-14,000 views in 2 hours, or about 2 page requests per second).

From a site admin point of view, the one thing that could have helped lower the server load would have been to turn on Aggressive Cache mode; perhaps combined with the gzip enabling, that would have been enough to survive as well as other CMSs did, per Andrew's original post.

Does anyone have any experience or input in this regard, i.e., how much difference aggressive mode can make in a traffic spike like this?
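
For reference, aggressive mode can also be forced from settings.php via Drupal's variable override; a minimal sketch, assuming Drupal 5, where a cache value of 2 corresponds to CACHE_AGGRESSIVE:

<?php
// In sites/default/settings.php: force the page cache into aggressive mode.
// 0 = disabled, 1 = normal, 2 = aggressive (CACHE_AGGRESSIVE).
// Note: aggressive mode skips module init/exit hooks on cached pages,
// so check that enabled modules (e.g. statistics) tolerate it.
$conf['cache'] = 2;
?>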

VM’s picture

I have not seen benchmarking tests of aggressive versus normal caching; that doesn't mean they aren't around, just that I haven't seen them. Dries' blog has some benchmarking on it, but I think it was D4 vs D5 or Joomla vs D5. In some of the other discussions about resource usage, the claim is that aggressive caching is the way to go. Did you set up throttling and throttle any of the modules, either contrib or core, during this spike?

For example, Poormans Cron. While on that topic: why not remove it from the equation altogether and add a cron job in the server setup? This would reduce that one module's usage, which may help some.
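
A typical replacement is a system crontab entry that requests Drupal's cron.php on a schedule; a minimal sketch with an illustrative URL and interval:

# crontab -e : run Drupal's cron hourly instead of Poormans Cron
# (URL and schedule are illustrative; adjust for the actual site)
0 * * * * wget -O /dev/null -q http://www.example.com/cron.php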

I'd be interested in seeing the devel module's query output for the front page as well as the page in question. Have you considered installing it to help with diagnosis?

Anonymous’s picture

I'm going to install the devel module and have a look, and try to post the output here as well.

Good call on the cronjob setup, that goes on my to do list, too.

From Dries blog post on benchmarking:

    "When caching is disabled Joomla can serve 19 pages per second, while Drupal can serve 13 pages per second. Hence, Joomla is 44% faster than Drupal.

    However, when caching is enabled Joomla can serve 21 pages per second, while Drupal can serve 67 pages per second. Here, Drupal is 319% faster than Joomla.

    In other words, Joomla's cache system improves performance by 12%, while Drupal's cache system improves performance by 508%."

and:

    "Lastly, when serving gzip-compressed pages Drupal becomes slightly faster compared to having to serve non-compressed pages. Joomla, on the other hand, becomes a little bit slower. The reason is that Drupal's page cache stores its content directly in a compressed state; it has to uncompress the page when the client does not support gzip-compression, but can serve a page directly from the page cache when the client does support gzip-compression."

This page shows benchmark comparisons for Drupal cache modes:

    "The figure above shows that generating a page in Drupal 5 is 3% slower than in Drupal 4.7. However, when serving a cached page using the normal database cache, Drupal 5 is 73% faster than Drupal 4.7, and 268% faster when the aggressive database cache is used."

So, it seems to be saying that aggressive mode is roughly twice as fast as the normal cache mode? (268% vs. 73% faster than the Drupal 4.7 baseline works out to about a 2x difference.)

cartika’s picture

For example, Poormans Cron. While on that topic: why not remove it from the equation altogether and add a cron job in the server setup? This would reduce that one module's usage, which may help some.

This is really solid advice - you can actually just utilize the cron feature in your control panel and establish any cron jobs you require directly at the server level, without requiring this module.

Did you set up throttling and throttle any of the modules either contrib or core during this spike ???

I highly doubt any of this was done. What we did from a server admin point of view, once we noticed what was happening, was temporarily take the site offline, remove permissions for certain modules that we noticed were causing load spikes, tweak Apache settings, and add some directives to .htaccess, then put the site back up. We obviously did not have time to research this extensively in the heat of the moment; the objective was simply to get the site back up and keep it up during the digg listing, while obviously not impacting our other clients. Overall, it went pretty well, and after we brought the site back up, service proceeded fairly seamlessly.

In retrospect, I wish we had just recreated that one page in static format, temporarily turned off mod_rewrite on this install, and placed the static file in the location digg was linking to. This would have solved the issue immediately, as all traffic from digg would have been served a static page, and the client's site would have still been live (mind you, without SEO URLs). Hindsight is 20/20, I guess - but either way, I think this is all a little dramatic simply to handle a digg listing in a shared environment.
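
A variant using a single rewrite rule (rather than disabling mod_rewrite entirely) would look something like this; the node path and filename here are hypothetical:

# 1) Snapshot the rendered page once, e.g.:
#      wget -O digg.html http://www.medopedia.com/node/123
# 2) In .htaccess, above Drupal's default rewrite rules, serve the static
#    copy for that one path and let everything else fall through to Drupal:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^node/123$ digg.html [L]
</IfModule>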

VM’s picture

Just to clarify, the throttling mechanism I speak of is a Drupal module (throttle) distributed with core. When enabled, it allows the admin of the website to choose which modules are "throttled" during high loads. Thus, that question is more for the site admin than the host :) My apologies for the confusion.

cartika’s picture

Just to clarify, the throttling mechanism I speak of is a Drupal module (throttle) distributed with core. When enabled, it allows the admin of the website to choose which modules are "throttled" during high loads. Thus, that question is more for the site admin than the host :) My apologies for the confusion.

Hello,

No fear, I actually understood what you meant. However, when I said I doubt this was done, it was because we didn't actually enter the client's admin area and make any of these changes - and I am fairly certain the site owner did not do this either. Though, it sounds like an interesting feature.

Anonymous’s picture

The settings on my throttle admin page were:

    Auto-throttle on anonymous users: 0
    Auto-throttle on authenticated users: 0
    Auto-throttle probability limiter: 10%

Modules having the throttle status set to on were: search, taxonomy, adsense

As the documentation on the throttle settings was a little ambiguous to me, I left them at the installed defaults.

Can someone give a recommendation for how to set them up for anticipated high spike loads please? And also which modules to enable throttling on?

VM’s picture

More information about the throttle module is here: http://drupal.org/handbook/modules/throttle

zoon_unit’s picture

I'm also very concerned about performance for two sites I'm developing. I'm encouraged by feedback I've seen on these two approaches:

http://drupal.org/project/memcache

http://drupal.org/project/blockcache

With your extensive knowledge of server architecture, I'd love to hear your feedback on these two approaches.
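
For background, the memcache module is wired in from settings.php; a minimal sketch, assuming the module's documented cache_inc override and a memcached daemon on localhost:11211:

<?php
// sites/default/settings.php: route Drupal's cache layer through memcached.
// The include path is the module's default; adjust to where it is installed.
$conf['cache_inc'] = './sites/all/modules/memcache/memcache.inc';
// Server list; localhost:11211 is the memcached default.
$conf['memcache_servers'] = array('localhost:11211' => 'default');
?>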

misty3’s picture

First of all, a very warm welcome to Cartika. Those of us who have been regulars at WebHostingTalk at some point in our lives know Cartika is a respected and well-known name in those circles, as an honest and efficient host.

I am not a Drupal expert, but what I feel is that core Drupal and Drupal+modules perform significantly differently. The Pathauto module, for example, can add to resource consumption.
It also appears from various discussions that Drupal probably performs better (or is more suited) on VPS, semi-dedicated (whatever that may mean) or dedicated servers (for example, a dedicated web server + dedicated MySQL server together for this site only). Some people have also noted in this forum that performance-wise, 4.7.x and 5.x do not differ significantly.

To Cartika: do you recommend that adding those gzip directives to .htaccess in any installation of Drupal (e.g. 4.7.x too) will help clients on shared servers to some extent? Can you post the full .htaccess if possible? (Can any expert please test this for both 4.7.x and 5.x?)
If this is a solution that helps ease the situation even a little, I think this post should be made sticky and recommended, as it may help a lot of users on shared servers.

Best regards

cartika’s picture

Hello Misty,

Thank you for the warm welcome and the very kind words - much appreciated :)

Yes, I do think gzip would help on all installs - however, this alone will not accommodate something like a digg front page listing. Some of the resources posted here are outstanding as far as optimizing Apache and MySQL to handle Drupal under high loads. Having said this, I do believe that Drupal should be able to handle greater loads than it currently does before requiring a more dedicated environment. As for providing the entire .htaccess, I believe the snippet above is the only thing we added - however, the owner of the site (who has posted in this thread) is certainly welcome to post their entire .htaccess file if they feel it has relevance here.

We will be more thoroughly documenting the optimization techniques we utilize for Drupal with this site as it grows into dedicated environments and inevitably gets relisted on digg.com. I will certainly update this thread with our findings, which optimization techniques we utilized, and what their relative effects were.

Even though we do not yet formally provide application support for Drupal, we do host a fair number of installs in our shared environment, and its install base there is growing (no surprise really, as it is a wonderful solution) - so it is probably wise for us to spend some time evaluating its actual limitations in a shared environment, as well as determining the best growth pattern and environment configuration for this application. As of this moment, we will need to let our clients know that Drupal will need a dedicated environment sooner than other similar scripts - however, this isn't necessarily a bad thing; it just represents a slightly different target audience. If we can create a custom dedicated server/cluster configuration that is optimized for Drupal, we can still make a very strong total cost of ownership case for Drupal, compared to its competitors, for medium to large organizations.

Having said this, your point is well made re "stock Drupal" vs "Drupal with 3rd party modules". I think this statement is valid for any script and is probably very solid advice. If you expect to use a script in a shared hosting environment, the closer to stock you can keep that script, the better.

The objective here is to keep people in a shared environment for as long as possible. This keeps startup costs down, allows a client to grow with a solution, and allows clients to more easily begin on the platform, as the cost barriers to entry (vs a dedicated environment) are greatly reduced. Obviously, the Drupal development team may disagree with this; their objective may simply be to ensure this script runs well and is scalable in a dedicated environment - I certainly could not blame them if that was their approach.

ajwwong’s picture

Just a note of support for Andrew and his team...

I was checking out hosting packages a while ago and while I did not in the end go with Cartika [due to my need for MySQL 4.1 when, at the time, they only had 4.0], I found them to be one of the most professional, intelligent and responsive operations out there. From my limited interactions with them, they were honest and professional and truly customer / service-oriented.

Good luck everyone.

http://drupal.org/node/68777

Albert
Esalen Alumni Group

cartika’s picture

Hello Albert, and thank you for your comments. I am not certain when you last dealt with us, but obviously we upgraded to MySQL 4.1.x some time ago and are currently in the process of upgrading to MySQL 5.x.

I really appreciate your comments and hope that everything has worked out for you with the provider you decided on.

Thanks
Andrew

spooky69’s picture

Hi,

Interesting post. Could someone clarify exactly what the .htaccess gzip mod posted at the beginning of this thread does, and whether it is of benefit to all sites (obviously with gzip installed...) in terms of speed of page presentation? I am on a Linux VPS, not on standard shared hosting. I have just installed eAccelerator for PHP and can confirm that it appears to speed things up quite well (to my eye, anyway).

-- http://www.inventionmail.com --

cartika’s picture

Interesting post. Could someone clarify exactly what the .htaccess gzip mod posted at the beginning of this thread does, and whether it is of benefit to all sites (obviously with gzip installed...) in terms of speed of page presentation? I am on a Linux VPS, not on standard shared hosting. I have just installed eAccelerator for PHP and can confirm that it appears to speed things up quite well (to my eye, anyway).

Hello spooky,

eAccelerator certainly helps in any application hosting environment (other standard items would include Zend Optimizer, etc.). gzip will actually somewhat increase the CPU load on a server, since pages have to be compressed - however, when used in combination with caching as well as optimized connection settings, keep-alive settings, etc., it can improve performance and decrease overall load.
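
A quick way to verify that compressed pages are actually being delivered (the URL here is illustrative):

# Request a page with gzip support advertised and inspect the response headers:
curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' http://www.example.com/ | grep -i content-encoding
# Output containing "Content-Encoding: gzip" means compression is active.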

www.cartikahosting.com
business grade, clustered application hosting

Anonymous’s picture

I installed the devel module on medopedia.com, and this is what I found. Not sure if this is good, bad or normal?

For the front page:

    Executed 103 queries in 74.36 milliseconds. Queries taking longer than 5 ms and queries executed more than once, are highlighted. Page execution time was 368.15 ms.

The largest query in terms of time was cache_get, at 20.181 ms.

For nodes, a typical result was:

    Executed 74 queries in 55.01 milliseconds. Queries taking longer than 5 ms and queries executed more than once, are highlighted. Page execution time was 249.23 ms.

cache_get was similarly the longest query here too.

kbahey’s picture

Here is another example of a large Drupal site under the digg effect, with graphs.

Surviving the Digg effect.

We tuned this server and it did hold up well.
--
Drupal development and customization: 2bits.com
Personal: Baheyeldin.com


speedhost’s picture

Please post more examples of large Drupal sites under the digg effect.

Thanks,

Jefferson
SpeedHost - http://www.speedhost.com.br

Summit’s picture

Subscribing, greetings, Martijn