We stumbled across this by accident after moving some Drupal apps off a system (its network traffic stats changed completely). We are seeing a ratio of almost exactly 10 to 1 between the network traffic for 1) database to web server and 2) web server to rest of world. This suggests that more database data than necessary is being SELECTed: extra rows, unnecessary columns, or both.

I will be spending some more time on this to (hopefully) pinpoint the statements and tables that are the source of most of the traffic, and will update this when I have them.

Comment  File           Size       Author
#2       query-log.txt  105.61 KB  markir

Comments

ibeardslee’s picture

Can't work out how to follow or subscribe to this issue, so making this comment instead.

markir’s picture

Status  FileSize
new     105.61 KB

Attached file has counts and total query text length for 1 database over 24 hours.

The total database net traffic for this time is:

statements 1890965
bytes recv 4484517926
bytes sent 11582608376

Over 3G of the 4G recv is composed of:

INSERT INTO cache (cid, data, created, expire, headers, serialized) V ...
UPDATE cache SET data = 'a:511:{s:28:"remember_me_settings_display";a ...
UPDATE cache SET data = 'a:607:{s:13:"filter_html_1";i:1;s:18:"node_o ...

The query text length only counts towards traffic *to* the database from the web server. However, one can deduce the likely length of SELECT result rows from the corresponding INSERT or UPDATE statements.

e.g:

SELECT data, created, headers, expire, serialized FROM cache WHERE...

occurred 38583 times. Taking the average length from the cache INSERT (140232) suggests that this SELECT accounts for 5410571256 (i.e. approx 5G) of the 11582608376 total bytes sent.
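The arithmetic behind that estimate can be reproduced with a few lines of Python. This is only a sketch using the figures quoted above; the variable and function names are made up for illustration, and it assumes each SELECT returns one row of roughly the average INSERT length.

```python
# Figures quoted in this comment (one database, 24 hours).
SELECT_COUNT = 38583            # times the cache SELECT ran
AVG_CACHE_ROW_LEN = 140232      # average length derived from the cache INSERT text
TOTAL_BYTES_SENT = 11582608376  # total bytes sent by the database


def estimate_select_bytes(count, avg_row_len):
    """Estimate bytes returned, assuming one average-sized row per SELECT."""
    return count * avg_row_len


estimated = estimate_select_bytes(SELECT_COUNT, AVG_CACHE_ROW_LEN)
share = estimated / TOTAL_BYTES_SENT  # fraction of all bytes sent

print(estimated)
print(f"{share:.1%}")
```

On these numbers the single cache SELECT would account for a little under half of everything the database sent, which is why it stands out.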

So it looks like usage of the various cache* tables is the major cause of the network traffic.

This raises the question of why the cache retrieval data is 10 times larger than what is finally served to the client. Are we perhaps SELECTing from the cache several times (e.g. about 10) and only using the results once per request?
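One way to test that guess would be to capture the queries issued for a single page request and count how often the same cache SELECT repeats. The sketch below is hypothetical: the helper name, the sample trace, and the cid values are invented for illustration, and a real trace would have to come from whatever per-request query logging is available.

```python
from collections import Counter


def repeated_cache_selects(queries):
    """Return {query_text: count} for cache* SELECTs issued more than once."""
    counts = Counter(
        q.strip()
        for q in queries
        if q.lstrip().upper().startswith("SELECT") and " FROM cache" in q
    )
    return {q: n for q, n in counts.items() if n > 1}


# Hypothetical trace for one page request (cid values are made up).
sample_request = [
    "SELECT data, created, headers, expire, serialized FROM cache WHERE cid = 'variables'",
    "SELECT nid FROM node WHERE status = 1",
    "SELECT data, created, headers, expire, serialized FROM cache WHERE cid = 'variables'",
    "SELECT data, created, headers, expire, serialized FROM cache_menu WHERE cid = 'links'",
]

repeats = repeated_cache_selects(sample_request)
for query, n in repeats.items():
    print(n, query)
```

If real requests show the same large cache row being fetched many times, that would support the repeated-SELECT explanation; if each row is fetched once, the extra volume must come from somewhere else.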

BrockBoland’s picture

I'm seeing similar traffic on a client site. I suspect that the DB would still be handling as much data if it were running on a single server, but you might not notice it.

Since much of this traffic is back and forth to cache tables, I'm tempted to try out the Memcache API module (http://drupal.org/project/memcache) to handle Drupal core caching without going back and forth to the DB, but I'm curious if anyone else has found a better way to address this.

siogwah’s picture

Has anything come of this issue? I've noticed the same.
typical traffic

Status: Active » Closed (outdated)

Automatically closed because Drupal 6 is no longer supported. If the issue verifiably applies to later versions, please reopen with details and update the version.