Every 2 or 3 days we are getting a very high server load (5 minutes average up to 70), the server starts to swap, and mysql reports some slow queries. After 15 to 20 minutes monit restarts the apache server because it does not respond anymore and after about 25 minutes everything is normal again. I was not able to find out the reason why this happens. There seem to be only 1 to 2 anonymous users and no logged-in users when this happens.

Drupal 5.2 is running on a dedicated server with 700 Mhz and 250 Mb RAM (Linux Sarge). The main server load is coming from Drupal.

I wrote a script to log the output of „top“ when this happens. Here is the result of top:

top - 14:47:58 up 3 days,  5:43,  0 users,  load average: 9.37, 3.67, 1.53
Tasks:  86 total,   4 running,  82 sleeping,   0 stopped,   0 zombie
Cpu(s):  4.0% us,  3.8% sy,  3.0% ni, 87.1% id,  1.9% wa,  0.0% hi,  0.0% si
Mem:    255380k total,   253268k used,     2112k free,      992k buffers
Swap:  2097136k total,   175396k used,  1921740k free,    12524k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
24399 drupal    17   0 25396 9864  20m R  3.1  3.9   0:00.32 php                
24407 drupal    16   0 25396 9844  20m D  3.1  3.9   0:00.29 php                
24412 drupal    17   0 25436 9764  20m R  2.0  3.8   0:00.30 php                
24414 drupal    17   0 25396 9864  20m R  2.0  3.9   0:00.29 php                
   47 root      15   0     0    0    0 S  1.0  0.0   0:27.87 kswapd0            
24360 drupal    16   0 40020  17m  20m D  1.0  7.2   0:00.97 php                
24364 drupal    16   0 40000  19m  20m D  1.0  7.8   0:00.93 php                
24367 drupal    16   0 40024  18m  20m D  1.0  7.6   0:00.95 php                
24417 root      15   0  2060  944 1852 R  1.0  0.4   0:00.02 top                
    1 root      16   0  1504  400 1352 S  0.0  0.2   0:01.12 init               
    2 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0        
    3 root      34  19     0    0    0 S  0.0  0.0   0:00.20 ksoftirqd/0        
    4 root       5 -10     0    0    0 S  0.0  0.0   0:04.86 events/0           
    5 root      10 -10     0    0    0 S  0.0  0.0   0:00.00 khelper            
    6 root      14 -10     0    0    0 S  0.0  0.0   0:00.00 kacpid             
   35 root       5 -10     0    0    0 S  0.0  0.0   0:07.12 kblockd/0          
   48 root       5 -10     0    0    0 S  0.0  0.0   0:00.00 aio/0              
  184 root      25   0     0    0    0 S  0.0  0.0   0:00.00 kseriod            
  313 root      15   0     0    0    0 S  0.0  0.0   0:18.89 kjournald          
  795 root      15   0     0    0    0 S  0.0  0.0   0:00.00 khubd              
 1690 daemon    16   0  1612  260 1440 S  0.0  0.1   0:00.00 portmap            
 2072 root      16   0  2260  636 2092 S  0.0  0.2   0:05.68 syslogd            
 2075 root      15   0  2384  428 1344 S  0.0  0.2   0:00.63 klogd              
 2083 bind      18   0 30856 2180 4252 S  0.0  0.9   0:00.00 named              
 2132 root      16   0  2628  408 2192 S  0.0  0.2   0:00.00 rpc.dracd          
 2138 root      19   0  2240  388 2084 S  0.0  0.2   0:00.00 inetd              
 2154 root      25   0  2508  404 2380 S  0.0  0.2   0:00.01 mysqld_safe        
 2191 mysql     15   0  122m  14m 8920 S  0.0  6.0   0:02.84 mysqld             
 2192 root      15   0  1488  364 1332 S  0.0  0.1   0:00.00 logger             
 2236 root      16   0  4416 1072 2624 S  0.0  0.4   0:11.35 logdog.pl          
 2243 root      19   0  6556  488 6168 S  0.0  0.2   0:00.00 saslauthd          
 2245 root      19   0  6556  488 6168 S  0.0  0.2   0:00.00 saslauthd          
 2246 root      19   0  6556  488 6168 S  0.0  0.2   0:00.00 saslauthd          
 2247 root      19   0  6556  488 6168 S  0.0  0.2   0:00.00 saslauthd          
 2248 root      19   0  6556  488 6168 S  0.0  0.2   0:00.00 saslauthd          
 2253 root      16   0  3468  520 3092 S  0.0  0.2   0:00.59 sshd               
 2334 root      16   0  7596 1040 6564 S  0.0  0.4   0:00.24 sendmail-mta       
 2349 root      15   0  1764  552 1576 S  0.0  0.2   0:00.53 cron               
 2367 root      16   0  6884 1172 3636 S  0.0  0.5   0:01.81 munin-node         
 2399 root      16   0 11712  588 3156 S  0.0  0.2   0:10.81 monit              
 2477 root      16   0  9672 1160 5300 S  0.0  0.5   0:00.21 miniserv.pl        
27667 root      16   0  7076 1544 5556 S  0.0  0.6   0:00.92 apache             
14953 www-data  15   0  7212 1600 5556 S  0.0  0.6   0:00.85 apache             
14958 www-data  16   0  7344 1580 5556 S  0.0  0.6   0:00.97 apache             
13307 www-data  15   0  7216 1672 5556 S  0.0  0.7   0:00.65 apache             
16247 www-data  16   0  7348 1580 5556 S  0.0  0.6   0:00.27 apache             
18378 www-data  16   0  7216 1600 5556 S  0.0  0.6   0:00.25 apache             
18684 www-data  16   0  7216 1672 5556 S  0.0  0.7   0:00.27 apache             
20386 www-data  16   0  7348 1580 5556 S  0.0  0.6   0:00.21 apache             
22473 www-data  15   0  7216 1600 5556 S  0.0  0.6   0:00.09 apache             
22566 www-data  16   0  7216 1732 5556 S  0.0  0.7   0:00.05 apache             
23574 www-data  17   0  7212 1792 5556 S  0.0  0.7   0:00.02 apache             
24337 drupal    16   0 39996  14m  20m D  0.0  6.0   0:00.94 php                
24338 drupal    16   0 39732  11m  20m D  0.0  4.8   0:00.91 php                
24339 drupal    16   0 41296  10m  20m D  0.0  4.2   0:01.08 php                
24340 drupal    16   0 39996  14m  20m D  0.0  5.8   0:00.92 php                
24341 www-data  15   0  7212 1596 5556 S  0.0  0.6   0:00.00 apache             
24347 drupal    16   0 39996  15m  20m D  0.0  6.4   0:00.94 php                
24348 www-data  17   0  7212 1596 5556 S  0.0  0.6   0:00.00 apache             
24349 www-data  16   0  7212 1596 5556 S  0.0  0.6   0:00.00 apache             
24351 drupal    16   0 39996  17m  20m D  0.0  7.1   0:00.96 php                
24353 www-data  16   0  7212 1596 5556 S  0.0  0.6   0:00.00 apache             
24354 www-data  15   0  7212 1596 5556 S  0.0  0.6   0:00.00 apache             
24355 www-data  15   0  7212 1596 5556 S  0.0  0.6   0:00.00 apache             
24356 www-data  17   0  7212 1816 5556 S  0.0  0.7   0:00.02 apache             
24357 drupal    16   0 41296  11m  20m D  0.0  4.8   0:01.06 php                
24359 drupal    16   0 39996  17m  20m D  0.0  7.0   0:00.94 php                
24361 www-data  17   0  7208 1768 5556 S  0.0  0.7   0:00.05 apache             
24365 www-data  16   0  7212 1884 5556 S  0.0  0.7   0:00.00 apache             
24366 www-data  16   0  7212 1884 5556 S  0.0  0.7   0:00.00 apache             
24370 drupal    16   0 39864  15m  20m D  0.0  6.1   0:00.94 php                
24371 www-data  17   0  7212 1884 5556 S  0.0  0.7   0:00.00 apache             
24372 www-data  16   0  7212 1884 5556 S  0.0  0.7   0:00.00 apache             
24373 www-data  18   0  7076 1524 5556 S  0.0  0.6   0:00.00 apache             
24374 www-data  17   0  7076 1536 5556 S  0.0  0.6   0:00.00 apache             
24380 root      18   0  2052  768 1660 S  0.0  0.3   0:00.00 cron               
24381 root      16   0  2732 1260 2528 S  0.0  0.5   0:00.03 load-average       
24382 root      15   0     0    0    0 S  0.0  0.0   0:00.00 pdflush            
24408 root      15   0     0    0    0 S  0.0  0.0   0:00.00 pdflush            
24416 root      15   0  8204 2972 6580 S  0.0  1.2   0:00.00 sendmail-mta       

During the high load-average, there are up to 20 php processes! I can't find out what causes this to happen. There were no users logged into drupal and the only unusual I could find in the apache access log were some strange calls to nonexisting files, like e.g.

196.0.7.129 - - [31/Aug/2007:14:44:39 +0200] "GET /effects-of-global-warming HTTP/1.1" 200 6150 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" "-"
196.0.7.129 - - [31/Aug/2007:14:44:44 +0200] "GET /Effects%20of%20global%20warming%20%20Time%20for%20change_files/aggregator.css HTTP/1.1" 404 6068 "http://example.com/effects-of-global-warming" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" "-"
196.0.7.129 - - [31/Aug/2007:14:44:48 +0200] "GET /Effects%20of%20global%20warming%20%20Time%20for%20change_files/book.css HTTP/1.1" 404 6068 "http://example.com/effects-of-global-warming" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" "-"
196.0.7.129 - - [31/Aug/2007:14:44:50 +0200] "GET /Effects%20of%20global%20warming%20%20Time%20for%20change_files/forum.css HTTP/1.1" 404 6068 "http://example.com/effects-of-global-warming" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" "-"
122.2.92.219 - - [31/Aug/2007:14:44:51 +0200] "GET /modules/aggregator/aggregator.

196.0.7.129 - - [31/Aug/2007:14:47:49 +0200] "GET /adsa_files/book.css HTTP/1.1" 404 6068 "http://example.com/some-page" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" "-"

What can I do to find out what happens and how can I prevent this?

All hints will be much appreciated.

Juerg

Comments

juerg’s picture

Installation of the module badbehavior and modifications to .htaccess in order to let Apache be the handler for non-existing flat files (404 error) improved the situation.

However about three to five times per week I am still getting a server overload due to 10 to 20 calls for non-existing pages or files within a short time. Usually the same (non-existing) page is called 10 to 20 times within a few seconds. This seems to cause the overloads.

I would appreciate any comments or suggestions how I could avoid the overload situations!

Juerg

SamRose’s picture

I have had bots/open proxy hacker attacks cause this problem.

Google for the pages that are being called to see if anyone else is talking about that pattern. If so, it is likely a bot. Also, traceroute the IP address doing the page calls