Drupal Performance with MySQL, PHP, and structure!
Hey all,
I'm having a big argument with my C++ programmer friends about how Drupal works. These guys are not dumb, and they are more experienced than I am, but I am sure I will win this argument. It all started when we had a couple of server performance problems that, to me, seem to point at their server program; they argue that it's Drupal causing the problem.
First of all, I am the one managing the Drupal site, and I take very good care of optimization of custom modules as well as other modules, and disable any low performance modules.
Right now, my site's performance can be summed up by Devel:
Executed 109 queries in 73.85 milliseconds. Queries taking longer than 5 ms and queries executed more than once, are highlighted. Page execution time was 935.38 ms.
Now, when we first started with the same number of features, we had 598 queries per page, until I started optimizing by changing the queries in my custom modules to JOINs, boosting performance with AdvCache, and enabling more caching features (not aggressive caching, though).
My site is very resource-hungry, although I use a dedicated server with amazing stats, quite expensive, probably more than even Drupal.org. We usually get about 10,000 unique users per day, with 15-70 people online at any one time.
What do you guys think?
I also tried following some tips for Apache and MySQL configs: I gave MySQL larger buffers and more memory, and I reduced the number of clients, servers etc. so that we would have LOW CPU consumption and our server program could run more easily.
What are your thoughts? I have always wondered how many queries your mega-site runs per page (and with how many visitors per day), and what you did to optimize your site. I have to provide many features and blocks, so yes, it can be resource-hungry, but none of my queries take over 2 seconds, except for a few cache queries that Drupal does.
My friends argue that Drupal is designed inefficiently and uses too many DB queries. Is 100 queries per page a lot?
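
For scale, the Devel figures quoted above can be broken down with a little arithmetic. This is only a rough sketch in Python; the variable names are made up for illustration and the numbers are copied from the Devel line.

```python
# Figures copied from the Devel output quoted in the post.
query_count = 109
query_time_ms = 73.85
page_time_ms = 935.38

avg_query_ms = query_time_ms / query_count   # average cost of one query
db_share = query_time_ms / page_time_ms      # fraction of page time spent in queries
non_db_ms = page_time_ms - query_time_ms     # everything that is not query time

print(f"average query: {avg_query_ms:.2f} ms")
print(f"database share of page time: {db_share:.1%}")
print(f"time outside the database: {non_db_ms:.2f} ms")
```

On these numbers the queries account for well under a tenth of the page execution time, so whatever takes the remaining time, it is not the 109 queries themselves.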

Don't forget eAccelerator or
Don't forget eAccelerator or some other PHP opcode cache.
Time spent in PHP drops out of the page generation time, almost to the point that only the time spent in queries remains.
For sample numbers, I had a small site go from 1500-2000 ms total to 700-1000 ms with eAccelerator.
A large amount of RAM previously held by httpd was freed, from 700-900 MB to 400-500 MB on average.
This depends on usage, but cache expiration no longer slowed the site and the former slow-queries vanished.
In fact, the RAM still in use is taken by an embedded gallery setup that generates most of the traffic.
There are still many queries being run by Drupal, but overall the results are acceptable.
Queries are reduced with both Drupal and MySQL caching.
This makes it worthwhile to run contributed modules where there are no special needs to cover.
Thus, it is easy to say that piling queries is bad design, but development time is worth money too.
Queries use RAM, which is costly, so the options are caching or building a specially designed module.
Even with optimizations, chances are that reducing RAM usage will instead raise CPU and IO expenditures.
The result is that where costs matter, tuning and calculations must be done to keep resources balanced.
Modules help build a flexible site at low cost, at the expense of having to process the code in them.
Since these are open source, they are under public review, which helps get them improved.
Any slow queries or bad designs should be reported to the issue queues.
Great
Great idea, but I don't know if I wanna go to those extremes. Don't they have downsides?
The downside is that an httpd
The downside is that an httpd process can crash. See:
http://2bits.com/articles/logwatcher-restart-apache-after-a-segmentation...
I used the script there to deal with restarting apache.
I have seen my site crash at a peak hour, but without it the server would have crashed anyway.
Of course, this isn't an option on shared hosting, but it is still worth a try.
Also see http://drupal.org/node/2603 and http://drupal.org/node/46048
Well I'm not sure, hopefully
Well I'm not sure, hopefully though, I can get more statistics info from other Drupal users...
How are everyone's MySQL queries etc., and how big/popular are their sites?
Still wondering
Well, still no one has commented with their statistics or their SQL usage, or the number of users they have with Drupal.
This is why hundreds if not thousands of developers don't switch to using Drupal: they think Drupal is an inefficient "piece of crap", as a couple of people told me in the past.
=-=
Questions like this should be directed to the Drupal performance group at groups.drupal.org. And while there "may" be hundreds or thousands of devs who don't like Drupal, there are tens of thousands of Drupal sites working perfectly well on the internet.
Is 100 queries a lot?
For a dynamic site? I don't think so.
However, the core developers do work on performance. If you have something to add to performance, or your "C++" friends have something to add to performance, by all means, submit a patch.
Hmm... I've yet to see a
Hmm... I've yet to see a site coded in C++.
Well I didn't mean that,
Well I didn't mean that, they coded a program in C++, not the site. They use their program and they only load like 10 queries per user, but that's because the program rarely makes updates/inserts from the site, while a website needs to be re-queried on each page rather than once per user.
My site performs 50 queries
My site performs 50 queries on average with a load time of around 75ms. Using about 10 add-on modules.
Nice, but what is your
Nice, but what are your uniques per day, or some other statistic? My site seems to get a lot of load time during the day, maybe because I didn't allow many clients/threads for Apache or something.
Note that statistics and
Note that statistics and queries are two different issues and their effects aren't directly proportional.
On the one hand, you need to know whether you have slow queries, because a single one can be the culprit of a slow site.
The rest of the hundreds of queries are usually fast queries and have no significant impact on the load.
You need not use the slow query log in MySQL, just take a look at Devel.
The devel results can show multiple slow queries, but one slow query can delay the next few queries.
So troubleshooting is needed to find the real bottleneck, and it is often the slowest query.
For example, a pager_query on views is run twice, once to get the values and once to count the results.
If the first is slow then the second is slow as well, as it is cached separately.
Devel would show two slow queries when there is really only one, and the related queries get queued up too.
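
The two-queries-per-pager pattern described above can be sketched roughly as follows. This is not Drupal's pager_query code: it is a Python sketch using SQLite in place of MySQL, and the function name paged_query and the node/nid/status names are borrowed purely for illustration.

```python
import sqlite3

def paged_query(conn, page, per_page):
    """Return (rows, total) for one page: one COUNT query plus one LIMIT query."""
    # Query 1: count all matching rows so the pager knows how many pages exist.
    total = conn.execute(
        "SELECT COUNT(*) FROM node WHERE status = 1"
    ).fetchone()[0]
    # Query 2: fetch only the rows that belong on the requested page.
    rows = conn.execute(
        "SELECT nid, title FROM node WHERE status = 1 "
        "ORDER BY nid LIMIT ? OFFSET ?",
        (per_page, page * per_page),
    ).fetchall()
    return rows, total

# Hypothetical sample data: 25 published nodes in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO node (nid, title, status) VALUES (?, ?, 1)",
    [(i, f"node {i}") for i in range(1, 26)],
)

rows, total = paged_query(conn, page=1, per_page=10)
print(total, [nid for nid, _ in rows])
```

Both statements share the same WHERE clause, so a filter that is slow for one is slow for the other; fixing that one condition fixes both entries in the query log.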
Now, for statistics, the load depends on the site. Server statistics can show if there's "peak hours".
Errors when reaching MaxClients in Apache are recorded in the error log.
Lots of RAM is allocated per httpd process, but if there are no slow queries it will be released fast.
With 20+ MB of RAM per process and 150 MaxClients, RAM usage can build up quickly if pages aren't served properly.
The resources are allocated anyways so statistics are only used to calculate the maximum of resources you want in the server.
In this case you need to find out the "simultaneous visits" to know what the averages and peaks are.
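
As a rough illustration of that calculation, here is a small Python sketch. It assumes about 20 MB per httpd process (the figure above is given as 20+ with no unit, so MB is an assumption), and the 2048 MB budget in the second call is a hypothetical number, not one from this thread.

```python
def worst_case_httpd_ram_mb(max_clients, mb_per_process):
    """Upper bound on RAM if every allowed httpd process is alive at once."""
    return max_clients * mb_per_process

def max_clients_for_budget(ram_budget_mb, mb_per_process):
    """Largest MaxClients value that fits in the RAM set aside for httpd."""
    return ram_budget_mb // mb_per_process

# 150 MaxClients at roughly 20 MB each (unit assumed, see above).
print(worst_case_httpd_ram_mb(150, 20))

# Hypothetical budget: 2048 MB reserved for Apache.
print(max_clients_for_budget(2048, 20))
```

The first figure is the ceiling Apache can reach at full MaxClients; the second shows how to work backwards from the RAM you are willing to give it.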
Another setting that helps
Another setting that helps is KeepAliveTimeout in Apache.
If all your pages have a generation time of 1 second (1000 ms), and there are no multiple pages or many images viewed simultaneously or successively, then you can lower that setting to release the process sooner.
It is usually set at a default of 15 seconds, which is too much. It can be lowered to 3-5 seconds.
Of course, this depends on the site usage, and it can't be modified on shared hosting.
A value of 15 seconds or higher is often used to send images, which are sent one after another.
If those seconds aren't used then there's a slight waste of resources that can have an effect.
Now, simultaneous users depend on the site usage.
The who is online block has a default window of 15 min per visit.
I set the who is online block to 5 min, so they disappear after 10 minutes.
If the site is slow, they will pile up, won't disappear, and the number will be misleading.
The Google Analytics statistics can tell how long visitors stayed on the site
(for example, averages of 25 pages and 10 minutes per visit).
Google Analytics has a window of 30 minutes, but lowering it spoils the figure for time spent per visit.
With a 5 min window and visits of 10 minutes the visitors would double, and this goes for any statistics software.
Webalizer on cPanel also has a 30 minute window by default; lowering it will make the numbers skyrocket.
Web counters usually count page views, so they don't track visitors.
For load, with 70 simultaneous users in a 15 min window (from who is online block) and 2 pages per minute (from google analytics) there should not be any delays or lag, unless the site is slow.
That is, assuming the server can handle those 70 users in any 30 seconds span.
If there's delay, then those 70 users grow as there will be users and pages waiting to be served.
For example, image galleries with large images often queue those images, so there's a slight delay that keeps users waiting and processes busy.
The number of queued processes in Apache can be seen with the top command, although iostat can also be run from cron to track it.
If everything is right there will be almost no queue and lots of idle processes waiting for connections.
When there's a shortage of resources the queue grows and if the visitors pile up it will reach the maximum capacity, show blank pages, begin swapping and eventually crash.
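
The load figures above can be turned into a back-of-the-envelope estimate. This Python sketch applies Little's law (average busy processes = arrival rate times the time each request holds a process); it counts page requests only, ignores images and other static files, and assumes the 1 second page time and 3 second keep-alive mentioned earlier. The function name is made up for illustration.

```python
def busy_processes(users, pages_per_user_per_min, page_time_s, keepalive_s=0.0):
    """Little's law estimate of httpd processes busy with page requests."""
    pages_per_second = users * pages_per_user_per_min / 60.0
    # Each request holds a process while the page is generated, plus any
    # keep-alive time spent waiting for a follow-up request.
    return pages_per_second * (page_time_s + keepalive_s)

rate = 70 * 2 / 60.0
print(f"{rate:.2f} pages/s")
print(f"{busy_processes(70, 2, 1.0):.1f} processes busy without keep-alive")
print(f"{busy_processes(70, 2, 1.0, keepalive_s=3.0):.1f} with a 3 s keep-alive hold")
```

Even with the keep-alive hold this comes to roughly ten busy processes, far below a MaxClients of 150, which fits the point above that 70 users should not cause lag unless pages are being served slowly.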
My keepalivetimeout is very
My KeepAliveTimeout is very low, I made sure it's incredibly low, it's about 3 seconds I believe, but I know how 15 seconds can cause problems. I also lowered MaxClients and other process options in Apache, because I don't want to be overloaded on CPU or RAM usage.
Here's my Apache Settings, and you already know my statistics so I bet you can make a fair assumption:
Timeout 30
MaxKeepAliveRequests 100
KeepAliveTimeout 3
StartServers 8
MinSpareServers 9
MaxSpareServers 30
ServerLimit 156
MaxClients 156
MaxRequestsPerChild 100
StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
Basically I tried to make sure there weren't too many active clients, and also tried to make sure it wouldn't use too many resources because I was worried, but I am not sure. At peak hours, top d1 tells me that we have like 10-30% CPU usage on httpd even with these settings. A couple of times I caught MySQL at 99% usage, but it hasn't happened for a while, so I don't think that's related to the website.
I also edited my.cnf for MySQL so that it can take larger buffers.
Usually with the 15 minute window I guess the Who's Online block would say around ~40-80 online visitors at our peak times. And I don't use Google Analytics or other crappy statistics software; I use one that actually gets good information like Unique Visitors, First-Time Visitors, Returning Visitors, where users come from, what keywords linked back to me, etc.
50 queries? ha-ha
My newest site performs about 500 queries on the homepage alone, using about 30 CCK with 20/110 fields and about 15 modules.
Martin GERSBACH,
www.gersbach.net
Paris, FRANCE
500 queries is not a lot
500 queries is not a lot depending on your amount of visitors. If you're getting 100 visitors a day, that's nothing, but if you're getting thousands and thousands, then it could be a MAJOR MAJOR problem, and for my site, 500 queries would be unacceptable! I wonder though, what is the amount of visitors you get per day (not page views)!?
wow
500 queries is a LOT. I have a feeling that isn't going to scale so nicely without some changes.