I recently have mysql high loads on 2 different servers. I have analyzed the problem and i have found that with i18n enabled the mysql has an heavy load caused by some query that are in i18n module. I also checked for the indexes on the tables and are properly set.

the responsible seems to be the query that adds a self left join on the node table: (i18n.module line 467)

the query generated is:

SELECT r.tid, COUNT(n.nid) AS topic_count, SUM(l.comment_count) AS comment_count FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r ON n.vid = r.vid  LEFT JOIN node i18n ON n.tnid = i18n.tnid AND i18n.language = 'fr' WHERE (n.language ='fr' OR n.language ='' OR n.language IS NULL OR n.language = 'it' AND i18n.nid IS NULL) AND (  n.status = 1 )GROUP BY r.tid;

this query cause high loads on mysql (5.0.81) and 194seconds are needed to extract 722 rows.
(with 2 or 4 simultaneous query you can hit a load average(1) of 10 on a dual xeon...)

without i18n the server does not have high loads, and the query:

SELECT r.tid, COUNT(n.nid) AS topic_count, SUM(l.comment_count) AS comment_count FROM node n INNER JOIN node_comment_statistics l ON n.nid = l.nid INNER JOIN term_node r ON n.vid = r.vid WHERE n.status = 1 GROUP BY r.tid;

use only 0,0084seconds to perform

so I think I have hit a bug.

I am not sure that mysql is not responsible... the problem is the same on mysql 5.0.45 and 5.0.81 and on a busy server could cause the server to become unresponsive (!). But i cannot have at this time the possibility to check on a mysql 5.1..

Comments

inalto’s picture

Status: Active » Fixed

Well, after some day googling around and reading some books I have found the solution.

The main thing I have learned is: "If you have slow queries, maybe you missed an index somewhere".

first of all I have searched what in the mysql caused high loads.
with:

watch mysqladmin pr

i've found a lot of
"Copying to tmp table" and "Waiting for table" with some of the i18n queries on the top...

MyIsam has table-locking db and this has some disadvantages as you can read here: http://dev.mysql.com/doc/refman/5.0/en/table-locking.html

In my server this rapidly degradates the performances and under some circumstances the load averages goes up and up until the system become unresponsive (40-60!).
I have immediately installed the monit daemon to restart mysql on high load averages at first.
Then I identified the query above in the mysql slow log. and then i have done an "explain extended" an then i analyzed by hand the query:

SELECT r.tid, COUNT(n.nid) AS topic_count, SUM(l.comment_count) AS comment_count FROM node n
                INNER JOIN node_comment_statistics l ON n.nid = l.nid
                INNER JOIN term_node r ON n.vid = r.vid 
                LEFT JOIN node i18n ON n.tnid > 0 AND n.tnid = i18n.tnid AND i18n.language = 'fr' WHERE (n.language ='fr' OR n.language ='' OR n.language IS NULL OR n.language = 'it' AND i18n.nid IS NULL) AND (  n.status = 1 )GROUP BY r.tid;

the last left join uses "n.language" in the where clause.
At first this seems not a big problem because the the language variable is made by two characters, small enough to avoid the usage of an index (but this is wrong!).
If I look at the node table the language field is a varchar(12)... so i tried:

ALTER TABLE `mydrupaldatabase`.`node` ADD INDEX `language` USING HASH(`language`);

and voilà the 194 seconds query that locked my server for a week causing high load disappeared, and the query response time falled to 1.4seconds...

now i have my load averages <2 and i feel better :-)

The strange behaviour was that the server load average flapped because I have a monit daemon that restarted mysql when load average goes over 10.

Now everything works fine.

I have to say that problem does not happens if you have few nodes / taxonomy terms relations. In my case the term_node has > 23.000 records and there are >9000 nodes. All worked, without problem (but with high load averages) until approximately 6000 nodes....

With < 3000 nodes you even cannot notice any difference.

I hope this could be useful for people having high mysql loads...

hass’s picture

Title: high database load with i18n enabled » Missing index on node.language causes high database load with i18n enabled
Project: Internationalization » Drupal core
Version: 6.x-1.1 » 7.x-dev
Component: Code » language system
Status: Fixed » Active

This may need to be fixed in core... not sure if i18n module should alter core tables... I believe not. This sounds like a serious performance issue that needs some attention of the core developers.

If i18n is allowed to add indexes to core tables we may add one index named like "i18n_language"

hass’s picture

http://api.drupal.org/api/function/hook_schema_alter/6 may be the way for i18n, but I've never used it myself. Need some confirmation of core if i18n need to change or core is able to change.

inalto’s picture

StatusFileSize
new57.71 KB

I can confirm that after a week of testing with mysql 5.0.81 I dont have any high load peaks or performance issues on the server anymore, so if anyone has performance issues, this index could be a quick solution.
now load average il mostly <2 on a dual xeon.
In the past i had peaks of load average of 40-50 and if the server hitted the swap could even crash.
the peaks in the "before" chart are <10 just because monit restarted mysql if load average reached 10 for 2 cycles.

I think that i18n is relatively new and peraphs is for that reason that the index is not already here.

On sites with few hundred nodes you can even not notice the problem, (at least if you dont turn on log-queries-not-using-indexes in your my.cnf and then look in the mysql.slow.log)

adding an index to the node table in the core is a small patch after all.

catch’s picture

node module should probably add the index, even if it's only used when translation.module/i18n is enabled. I'd be really happy with locale.module hook_schema_alter()ing the index in, since then sites not using it get faster insert performance, but we don't have any hook_schema_alter() in core, and really locale could be altering the column in too, which it doesn't.

plach’s picture

@catch: would unilingual sites be penalized by adding the index even if the inserted value for node.language is always the same ('und')?

catch’s picture

afaik index buffer gets flushed on every insert, so rebuilding the index has to happen for every index on a table. Not sure if identical values affects this or not.

damien tournoud’s picture

If we can add this at the begining or at the end of an existing index, we won't have any performance penalty.

catch’s picture

I don't see an obvious place to do that. The query above would want language, status (in which case it'd never be used for just status because MySQL reads indexes left to right), or status, language - and we don't have an existing index like either of those.

There's separate indexes on tnid and translate though already - would need to look at which queries are using those indexes in core already, but they seem like the most likely options if they could take a condition on language in all cases.

Crell’s picture

The index is updated, not completely wiped and rebuilt, AFAIK. I have no problem with adding another index in core if that makes it easier. Drupal is a read-heavy system, so if it makes saving a new node a few nanoseconds slower to make reading it back it substantially faster that's a good trade-off.

catch’s picture

I'm erring on the side of locale.install (or translation if the indexes are only used for those) implementing hook_schema_alter() and adding both the existing indexes, and the new one there - as long as that's not ridiculously complicated to implement, and if it is then we've got a problem with hook_schema_alter(). While I agree with Crell in general, it's also good not to have redundant or indexes or ones which are never used. Drupal may be read heavy, but language-based queries on the node table are probably less frequent than inserts taken across the install base.

Crell’s picture

I am not sure that hook_schema_alter will even have an effect if that module isn't enabled until post-install. It's black magic that only quasi works.

If we want to only have an index when locale module is enabled, it will need to explicitly add it itself with an addIndex() call. That is, of course, messy because then some D7 sites will have an index there and some won't, which makes future upgrades trickier.

Let's just cut to the chase and put an index on it from the get go if we know it's going to be used.

catch’s picture

We don't know it's going to be used, it looks like it's only currently used if you use the contrib i18n module at the moment, not by queries which core generates. If we have db_index_exists() then that's really not too messy to add.

damien tournoud’s picture

Status: Active » Closed (won't fix)

Let's won't fix this. i18n should add this index itself.

hass’s picture

Project: Drupal core » Internationalization
Version: 7.x-dev » 6.x-1.x-dev
Component: language system » Code
Status: Closed (won't fix) » Active

Moving case back

yang_yi_cn’s picture

subscribe

andrewsuth’s picture

Any more follow ups with good results with this one?

dkg’s picture

It seems a little bit worrisome that this issue has been open for 16 months with a known simple fix that has not yet made it into a packaged version of drupal.

The simple fix is that sites with the i18n module installed need an index on node.language.

is drupal's architecture getting in the way of actually implementing this fix for users? Or is there some other reason for the delay?

catch’s picture

i18n module can add this with db_add_index() (in hook_update_N(), hook_enable()) and hook_schema_alter().

Drupal 7 has http://api.drupal.org/api/drupal/includes--database--database.inc/functi... which makes it easier to avoid errors from adding duplicate indexes.

So all that's really needed here is for someone to write a patch, perhaps you'd like to volunteer dkg?

qazxswedc’s picture

Version: 6.x-1.x-dev » 6.x-1.9
StatusFileSize
new1.29 KB

Thanks for digging into this problem, all. I spent a solid 15 minutes panicking when I couldn't get db load to calm down. Attached is a patch to i18n.install that adds an index to default.node.language in i18n_install, or adds it in an update. I tested the installation against a fresh Drupal 6.20.

lestu’s picture

Component: Code » Blocks

I have the same problem, but unfortunately the index doesn't show any effect at all.
After executing the ALTER TABLE statement to create the index, it should be working immediately, right?

I have roughly 16'000 nodes.
What I have found, this problem is in part caused by nodes having tnid=0. Setting tnid=nid on all the nodes that have tnid=0 solves the speed problem, but of course it is not a good workaround in practice because it messes up the content translation system.

lestu’s picture

This patch fixed my problem: http://drupal.org/node/337089#comment-4482720
So I guess it wasn't exactly the same, but quite similar in its effect.

alanpeart’s picture

Literally saved my life (and perhaps more importantly, my server!). Thank you.