Drupal.org configuration changes

raema - June 19, 2006 - 20:30

To make better use of the Drupal.org servers, we've set up PHP to use an opcode cache (APC). After a few tweaks, the cache is helping out by almost halving load and CPU usage. Drupal.org traffic has continued to grow over the last year, nearly doubling. Additionally, with sites such as themes.drupal.org, api.drupal.org, and groups.drupal.org, the two web nodes weren't far from hitting their limits. APC ought to allow drupal.org and friends to run happily, for longer, on the same hardware.

Attached is a graph showing the CPU usage on drupal1 before and after APC was enabled (at about 16:30).

Attachment                    Size
drupal1_load_after_apc.png    41.99 KB

How did you do that?

James Prather - June 19, 2006 - 22:13

How did you utilize an opcode cache? I run a PHP site that gets a lot of traffic, and we are running out of bandwidth, so I'd like to know how you did this so that we could possibly do the same thing.

Lower CPU usage, not bandwidth

tostinni - June 19, 2006 - 23:33

Applying this technique won't save you any bandwidth. The pages served are the same (maybe you should find out what is eating your bandwidth and optimize that).

APC was used here to serve cached content to anonymous users and avoid making several queries against the DB, which reduced CPU usage (killes posted a graph on the Infrastructure list showing the APC effect).

Byte code caches remove overhead of interpretation

Samat Jain - June 20, 2006 - 00:08

APC and other bytecode caches don't reduce the number of SQL queries made; they remove some of the overhead of PHP's interpreter.

Rather than having to interpret the script on every request, PHP "compiles" it (I use the term loosely) to an intermediate/binary code, which is then cached. Each subsequent request for that page executes the cached version instead of re-interpreting the original script.
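
To make the idea concrete, here is a minimal sketch in Python. It is only an illustration of compile-once-then-reuse, not how APC itself is implemented: the `OpcodeCache` class and its `run` method are hypothetical names, and Python's built-in `compile()` stands in for PHP's compile step.

```python
import os
import tempfile


class OpcodeCache:
    """Toy bytecode cache: compile each script once, reuse the code object."""

    def __init__(self):
        self._cache = {}       # path -> (mtime, compiled code object)
        self.compilations = 0  # how many times we actually had to compile

    def run(self, path):
        mtime = os.path.getmtime(path)
        entry = self._cache.get(path)
        if entry is None or entry[0] != mtime:
            # Cache miss (or the file changed on disk): parse and compile it.
            with open(path, encoding="utf-8") as handle:
                source = handle.read()
            entry = (mtime, compile(source, path, "exec"))
            self._cache[path] = entry
            self.compilations += 1
        namespace = {}
        exec(entry[1], namespace)  # execute the cached code object
        return namespace


def demo():
    """Serve the same script three times; it is compiled only once."""
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "page.py")
        with open(script, "w", encoding="utf-8") as handle:
            handle.write("result = sum(range(10))\n")
        cache = OpcodeCache()
        results = [cache.run(script)["result"] for _ in range(3)]
        return results, cache.compilations
```

Calling `demo()` runs the script three times but compiles it once. Note that the script still executes in full on every request, so any SQL queries it makes are still made; only the parse/compile work is saved.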
__
Personal home page | Rhombic Networks: Drupal-friendly web hosting

Right. The graph can be seen

killes@www.drop.org - June 20, 2006 - 09:58

Right.

The graph can be seen here: http://lists.drupal.org/private/infrastructure/attachments/20060615/3336...

APC was enabled at about 16:30.

--
Drupal services
My Drupal services

No show

george@dynapres.nl - June 20, 2006 - 10:54

I can't see the graph: it's a "Private Archives"... :(

I've attached the graph to

killes@www.drop.org - June 20, 2006 - 12:05

I've attached the graph to the forum topic.
--
Drupal services
My Drupal services

eAccelerator

gordon - June 20, 2006 - 00:18

I remember an earlier message saying that eAccelerator had been installed.

Was there a problem with it and Drupal, or is APC better? Can you give a short rundown of why you changed from eAccelerator to APC (if you did), or alternatively why you chose APC over other opcode caches such as eAccelerator or even Zend Optimizer?
--
Gordon Heydon
Heydon Consulting

I can remember reading

brashquido - June 20, 2006 - 05:55

I can remember reading somewhere that eAccelerator had some issues with PHP5, but I was under the impression it was king of the hill among open-source PHP4 opcode accelerators. It seems a lot of large sites have been saddling up with a Memcached + APC combo of late to reduce web/DB server loads.

----------------
Dominic Ryan
www.it-hq.org

eAccelerator was tried on

killes@www.drop.org - June 20, 2006 - 09:56

eAccelerator was tried on the old hardware and used to crash Apache.
--
Drupal services
My Drupal services

Well, now maybe it's just the

jakeg - June 20, 2006 - 06:39

Well, now maybe it's just the time of day, but it definitely feels more speedy.

At some point, though, it's important that the MySQL/Postgres wrapper functions build in the ability to send write queries to the 'master' database and read queries to slave databases. Otherwise MySQL will no doubt become a bottleneck sooner or later.
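
As a rough sketch of what such a wrapper might do (this is not the patch mentioned in this thread, and not Drupal's database API), the routing decision could look like the following Python. `ReplicatedConnection`, `pick_server`, and the server names are hypothetical, and the list of write verbs is an assumption that a real wrapper would need to extend.

```python
import itertools


class ReplicatedConnection:
    """Route write queries to the master and read queries to the slaves."""

    # Statements that modify data or schema must go to the master.
    WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE",
                   "CREATE", "ALTER", "DROP", "LOCK", "UNLOCK"}

    def __init__(self, master, slaves):
        self.master = master
        # Rotate through the slaves; fall back to the master if there are none.
        self._readers = itertools.cycle(slaves or [master])

    def pick_server(self, sql):
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb in self.WRITE_VERBS:
            return self.master
        return next(self._readers)
```

For example, `ReplicatedConnection("master", ["slave1", "slave2"])` sends an `INSERT` to `master` and alternates `SELECT` queries between the two slaves. One design caveat: with asynchronous replication a slave can lag behind the master, so a read that must see a write made a moment ago may still need to go to the master.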

Jake
---
School and university yearbooks and Drupal web services, London

Patch?

Dries - June 20, 2006 - 07:41

There was a patch for that, but I can no longer find it ...

PS: the site is more responsive indeed.

mod gzip

bertboerland@ww... - June 20, 2006 - 08:54

Judging from the headers, drupal.org doesn't send out gzipped pages, does it? I think implementing this would make drupal.org even faster.

--
groets
bertb

Correct me if I'm wrong, but ...

shane - June 20, 2006 - 16:31

... Sending out GZIPd pages will only reduce the bandwidth utilized (which isn't necessarily a bad thing). The server would still have to GZIP the page up, which is an expensive CPU operation (compression = math = CPU crunching) - that would increase the CPU usage, which seems to have been a (the?) problem with drupal.org.

GZIPd pages will make things appear to load quicker for those with smaller amounts of bandwidth available to them, and the extra CPU horsepower to unzip the page after they receive it. If you have a fair amount of bandwidth, or your bandwidth isn't the bottleneck, it won't do you much good - it might even slow things down.

For example - a 21 Kbyte page (the Drupal front page is roughly 21 Kbyte - images don't get GZIPd into the stream) would take only approx. 1.3 seconds to load on a 128 kbit/sec link - which is MINUSCULE compared to most broadband. Of course, a 56 kbit/sec dialup modem may be able to achieve some compression to get close to that, but still - about 3 seconds to download the uncompressed page isn't that much - even for a modem user.
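
The arithmetic behind these figures is easy to check. The sketch below is an illustration only: `transfer_seconds` is a hypothetical helper, 1 Kbyte is taken as 8 kbit, and latency, protocol overhead, and modem compression are ignored.

```python
def transfer_seconds(size_kbytes, link_kbits_per_sec):
    """Seconds needed to move size_kbytes over a link, ignoring latency."""
    size_kbits = size_kbytes * 8  # 1 byte = 8 bits
    return size_kbits / link_kbits_per_sec


def demo():
    page_kbytes = 21  # approximate front page size quoted above
    return {rate: transfer_seconds(page_kbytes, rate) for rate in (128, 56)}
```

Under those assumptions the 21 Kbyte page works out to roughly 1.3 seconds at 128 kbit/sec and about 3 seconds at 56 kbit/sec.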

Of course - the longer/bigger the page is, the greater your return for GZIPing the page up for bandwidth-challenged users...

Some references:

  • Lehigh University study on Benefits and Drawbacks of HTTP Compression
  • 15 Seconds article Web Site Compression

    In order for the server to respond with a compressed page, the page needs to be compressed. Compression happens in two distinct ways: for static requests, the page can be compressed ahead of time and served to multiple requests. For dynamic pages, like Active Server pages or Cold Fusion pages, the response has to be compressed on every request, since the output is different for every request.

    Since compression takes time, roughly 100 - 1000 milliseconds for every page depending on size and compression quality, compressing static pages ahead of time and serving them to multiple requests saves CPU cycles and makes for faster responses. Compressing dynamic pages is harder on the server, since all requests need compression and none can be done ahead of time.
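
The static-versus-dynamic distinction in the quoted article can be sketched with Python's standard `gzip` module. This is an illustration, not mod_gzip's implementation; `StaticPageCache`, `serve_dynamic`, and the sample page are hypothetical.

```python
import gzip


class StaticPageCache:
    """Compress each static page once and reuse the result for every request."""

    def __init__(self):
        self._compressed = {}    # page name -> gzip bytes
        self.compress_calls = 0  # how often we paid the compression cost

    def get(self, name, body):
        if name not in self._compressed:
            self._compressed[name] = gzip.compress(body)
            self.compress_calls += 1
        return self._compressed[name]


def serve_dynamic(body):
    """Dynamic output differs per request, so it is compressed every time."""
    return gzip.compress(body)


def demo():
    page = b"<p>Hello from a fairly repetitive HTML page.</p>\n" * 200
    cache = StaticPageCache()
    responses = [cache.get("about.html", page) for _ in range(5)]
    return len(page), len(responses[0]), cache.compress_calls
```

In `demo()` five requests for the static page cost a single compression, whereas `serve_dynamic` pays the CPU cost on every call. How large that cost is depends on page size and compression level, so it is worth measuring on your own hardware.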

Also - here is a phpLens HOWTO article on Optimizing PHP, with tips and methodologies - an excellent primer and resource on how PHP impacts your server and what you can start doing to tune your server to work at optimal performance and capacity.

right

bertboerland@ww... - June 20, 2006 - 17:29

thanks for the excellent links and your reaction!

yes, sending zipped pages will reduce the load on the network and increase the load on the server (and the client).

However, I was talking about making Drupal.org faster from an end user's point of view. Sending zipped pages would make the site faster, provided there are enough cycles to burn. And since we do have some spare cycles hanging around (more than 50% if I recall correctly), it does make sense to implement mod_gzip or a similar application that sends out zipped pages.

--
groets
bertb

Drupal stores Gzipped pages in cache

Bèr Kessels - June 21, 2006 - 10:02

... so gzipping on each request is not necessary for cached pages. I doubt Drupal.org will benefit much from gzipping because, due to its nature, it won't use the cache a lot. But for sites which serve the majority of their pages from cache, gzip is a good option.

That is, if I understood this gzipped caching part correctly :)
---
Professional | Personal
| Sympal: Development and Hosting

no

bertboerland@ww... - June 21, 2006 - 10:18

No, it doesn't, and it shouldn't! Compressing pages is something that should be done by the upper layers, like the webserver or a reverse proxy.
--
groets
bertb

For those stuck on slow connections...

Samat Jain - June 20, 2006 - 19:05

Vaguely related, but I felt like mentioning it: For those stuck on slow connections (e.g. me, I am on 144k IDSL), the trend to move away from using gzip compression on pages has been making everything load that much slower for me.

If you've a server off-site on some fast connection, there is a nice piece of HTTP proxy software called RabbIT. It's billed as a proxy to speed up web surfing, and it does that: one of its features is gzip compression of pages. Using a proxy does increase latency a little, but on average the benefits of gzip compression tend to cancel the increased latency out.

__
Personal home page | Rhombic Networks: Drupal-friendly web hosting

Output compression and slower page loads

neclimdul - June 20, 2006 - 21:20

This is somewhat misleading. Compression does have a CPU overhead, but it can be useful in speeding up page delivery. This is hidden in a couple of the linked pages, but this says it very clearly:

"In real-world test of content compression (using the mod_gzip solution) found that not only did I get a 30% reduction in the amount of bandwidth utilized, but I also got an overall performance benefit: approximately 10% more pages/second..."

-George Schlossnagle in Advanced PHP Programming

For reference, George is one of the leads for APC and has other roles in the PHP community. So gzipping should not be discarded as a means of speeding up page delivery if you have some CPU overhead to spare. This may not be true for everyone, but it should be considered before discounting compression for your website.

One reason for this: the outbound connection stays open for the duration of the transfer. If PHP's output is buffered, compressed, and then sent, the connection sending the page should be open for a shorter time.

Just wanted to clear this up. Cheers.

Here is the patch for

killes@www.drop.org - June 20, 2006 - 09:55

Here is the patch for replication:
http://cvs.drupal.org/viewcvs/drupal/contributions/sandbox/crackerjm/rep...

Hope we don't need it too soon.
--
Drupal services
My Drupal services

This is great. Definitely

Ian Ward - June 23, 2006 - 12:01

This is great. Definitely faster. Now, say you have a handful of sites, some are multisiting, single code base, and others are not. I've read that maybe there is a PHP cache that will not cache duplicate files on a server - meaning, if you've got 30 code bases, the cache could tell which files are duplicates, and just cache and use one of them. Is this really possible? It doesn't seem like it - basically this would mean the PHP cache is telling where to pull scripts from, and it would be like a virtual multisiting...

Does anyone know about this?

cheers,
Ian

I think it hardly matters,

killes@www.drop.org - June 23, 2006 - 13:19

I think it hardly matters, we only use 60MB of diskspace for the cache on the machine with more sites. The other one is ok with only 30MB.
--
Drupal services
My Drupal services

well

moshe weitzman - July 3, 2006 - 12:20

it is true that disk space hardly matters. but it is nice if sites are multi-site because even the infrequently used sites will have fast pages because their scripts are being thrown into the cache by the frequently used sites.
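
A toy model shows why a shared code base helps here. It assumes the cache identifies scripts by their file path (an assumption for illustration, not a statement about APC's internals); `PathKeyedCache`, `simulate`, and the paths in the usage note are hypothetical.

```python
class PathKeyedCache:
    """Toy opcode cache that identifies scripts by their file path."""

    def __init__(self):
        self.entries = set()
        self.hits = 0
        self.misses = 0

    def load(self, script_path):
        if script_path in self.entries:
            self.hits += 1
        else:
            # A real cache would compile and store the script here.
            self.misses += 1
            self.entries.add(script_path)


def simulate(requests):
    """Replay a list of script paths and return (hits, misses)."""
    cache = PathKeyedCache()
    for path in requests:
        cache.load(path)
    return cache.hits, cache.misses
```

With multi-site, a busy site and a quiet site both load the same file, say `/srv/drupal/includes/bootstrap.inc`, so the quiet site's first request is already a cache hit. With separate code bases (`/srv/site1/...` and `/srv/site2/...`) each copy is a separate entry and each site pays for its own first compile.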

APC Install in 2 minutes, massive immediate benefit!

jakeg - June 24, 2006 - 10:27

Following this article and reading Dries' blog, I've installed APC on my own server and have seen immediate benefits far beyond any expectations I may have had. Here's the easy install for those on Fedora Core (I'm using FC5, may work on older versions as well):

- $ yum install php-devel
- $ pecl install apc (choose no to apxs question as that's not installed)
- add 'extension=apc.so' to php.ini
- $ service httpd restart

Check your Apache log after a restart and look at the phpinfo() output to make sure it's installed.

I benchmarked my server before and after this:

ab -n100 -c5 http://www.allyearbooks.co.uk/yearbooks/demo

BEFORE:
Requests per second: 4.39 [#/sec] (mean)
Time per request: 1139.755 [ms] (mean)

AFTER:
Requests per second: 11.12 [#/sec] (mean)
Time per request: 449.562 [ms] (mean)

... that's a massive improvement, roughly 2.5 times the number of page requests I can serve, and users will see pages faster too. And it took about 2 minutes to implement.
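
For anyone comparing their own before/after runs, the improvement can be worked out from the two ab figures. The helper names `speedup` and `latency_reduction` below are hypothetical; the numbers in the usage note are the ones reported above.

```python
def speedup(before_rps, after_rps):
    """Ratio of requests per second after the change to before it."""
    return after_rps / before_rps


def latency_reduction(before_ms, after_ms):
    """Fraction by which the mean time per request dropped."""
    return (before_ms - after_ms) / before_ms
```

With the figures above, `speedup(4.39, 11.12)` comes to a little over 2.5 and `latency_reduction(1139.755, 449.562)` to about 0.6, i.e. roughly 60% less time per request.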

On a side note, how does one protect against the use of a command like ab for DOS attacks?!

Jake
---
School and university yearbooks and Drupal web services, London

typo

bertboerland@ww... - June 24, 2006 - 19:00

use pear install apc

Regarding preventing a DoS: well, you can't. The basic rule here is: if you can use something, you can abuse it.

You can block source IP addresses and user agents and try all kinds of things, but you will always have to react instead of being proactive (some will tell you that you can prevent DoSes, but they have probably never seen a huge DoS in progress).

If there is a way to "prevent" a DoS, it is bigger pipes, more cycles.
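
As an example of the reactive kind of measure mentioned above, here is a minimal per-IP request throttle in Python. It is a sketch only: `RequestThrottle` and its parameters are hypothetical, timestamps are passed in explicitly to keep it testable, and it would do nothing against a distributed flood that saturates the pipe before requests reach the application.

```python
from collections import defaultdict, deque


class RequestThrottle:
    """Flag clients that exceed max_requests within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip, now):
        recent = self._seen[ip]
        # Forget requests that have fallen out of the time window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False  # over the limit: reject (or block upstream)
        recent.append(now)
        return True
```

A throttle created as `RequestThrottle(3, 10.0)` lets a single address make three requests in any ten-second window and rejects the fourth, while other addresses are unaffected.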
--
groets
bertb

That wasn't a typo. If I

jakeg - June 24, 2006 - 22:20

That wasn't a typo. If I type 'pear install apc' it says its not available and to try 'pecl install apc' instead, which works. Your OS may vary.

Jake
---
School and university yearbooks and Drupal web services, London

DoS attacks

phacka - June 26, 2006 - 11:23

no

bertboerland@ww... - June 26, 2006 - 12:06

interesting: yes
protection against (d)DoS: no. people will use zombie networks, ICMP flooding and all kinds of services. Your pipes, your servers, your application, your DNS, mail... one of them will fall down in the end.

Akamai might be some defense against this kind of threat, but even they have been taken down in the past.

--
groets
bertb

Thanks for the mini-tutorial

RobRoy - July 10, 2006 - 03:26

Thanks for the mini-tutorial, jakeg. I just installed APC on my server and noticed instant results. I run a lot of sites (read: way too many) on my dedicated server and this will help with CPU load.

Before APC:
# ab -n100 -c5 http://www.mp3pig.com/index.php
Requests per second: 6.37 [#/sec] (mean)
Time per request: 785.228 [ms] (mean)

After APC:
# ab -n100 -c5 http://www.mp3pig.com/index.php
Requests per second: 35.30 [#/sec] (mean)
Time per request: 141.629 [ms] (mean)
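
The two ab figures in each run are related: the mean time per request is the concurrency level (`-c5` here) divided by the requests per second. The one-line helper below, with the hypothetical name `mean_time_per_request_ms`, makes that explicit.

```python
def mean_time_per_request_ms(concurrency, requests_per_second):
    """Mean time per request in ms, derived from concurrency and throughput."""
    return concurrency / requests_per_second * 1000.0
```

Plugging in the numbers above, `mean_time_per_request_ms(5, 6.37)` and `mean_time_per_request_ms(5, 35.30)` land within a millisecond of the reported 785.228 ms and 141.629 ms; the small differences come from the requests-per-second values being rounded to two decimals. The throughput itself went up by a factor of about 5.5.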

Rob
————————————————————————————
MP3PIG — An Audio Blog Aggregator built on Drupal 4.7

Segfaults?

agentrickard - June 27, 2006 - 17:50

Have you seen any problem with Segfaults after installing APC?

We've been seeing them running PHP 5.x and Apache 1.3.

--
http://ken.blufftontoday.com/
Search first, ask good questions later.

No problems whatsoever, but

killes@www.drop.org - June 27, 2006 - 21:06

No problems whatsoever, but we use php4.

--
Drupal services
My Drupal services

Ah

agentrickard - June 28, 2006 - 14:10

I should say that ours aren't insurmountable, just annoying.
--
http://ken.blufftontoday.com/
Search first, ask good questions later.

 
 
