Location module creates queries which perform full table scan of locales_source table:

SELECT s.lid, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid WHERE s.source = 'Users can use their own name or handle and can fine tune some personal configuration settings through their individual my account page. Registered users need to authenticate by supplying either a local username and password, or a remote username and password such as DelphiForums ID, or one from a Drupal powered website. A visitor accessing your website is assigned an unique ID, the so-called session ID, which is stored in a cookie. For security\'s sake, the cookie does not contain personal information but acts as a key to retrieve the information stored on your server. ' AND t.locale = 'pl'

Solution:
Add index on 'source' field for 'locales_source' table
(for MySQL)

ALTER TABLE `schema_name`.`locales_source` DROP INDEX `source`,
 ADD INDEX `SOURCE`(`source`(30));
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

janekb’s picture

Assigned: Unassigned » janekb
Priority: Critical » Normal
Status: Active » Needs review
FileSize
1.05 KB

Problems are only when location module is enabled and there are some untranslated strings. In such case no caching is performed and Drupal atempts to look up translation in this table.
Patch hasn't been tested with PostgresSQL (I'm not shure if index can be created on 'text' type fields).

killes@www.drop.org’s picture

Title: Full table scan in location module » Full table scan in locale module
Status: Needs review » Needs work

Thanks, can you provide a patch that includes an update for updates.inc?

Stefan Nagtegaal’s picture

Title: Full table scan in locale module » locale.module tremendous speedup!
Version: x.y.z » 4.7.0-beta6
Assigned: janekb » Stefan Nagtegaal
Status: Needs work » Needs review
FileSize
2.26 KB

Here it is, a new formed patch includes:
- updates database.mysql and database.pgsql;
- updates update.inc!

This does _really_ speed up your site when using the locale.module..

Please try, and see if we can get this little speed improvement into 4.7

Stefan Nagtegaal’s picture

As an example:

I took at the following query:
"SELECT s.lid, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid WHERE s.source = '\n <p>Welcome to your new <a href=\"%drupal\">Drupal</a>-powered website. This message will guide you through your first steps with Drupal, and will disappear once you have posted your first piece of content.</p>\n <p>The first thing you will need to do is <a href=\"%register\">create the first account</a>. This account will have full administration rights and will allow you to configure your website. Once logged in, you can visit the <a href=\"%admin\">administration section</a> and <a href=\"%config\">set up your site\'s configuration</a>.</p>\n <p>Drupal comes with various modules, each of which contains a specific piece of functionality. You should visit the <a href=\"%modules\">module list</a> and enable those modules which suit your website\'s needs.</p>\n <p><a href=\"%themes\">Themes</a> handle the presentation of your website. You can use one of the existing themes, modify them or create your own from scratch.</p>\n <p>We suggest you look around the administration section and explore the various options Drupal offers you. For more information, you can refer to the <a href=\"%handbook\">Drupal handbooks online</a>.</p>' AND t.locale = 'nl'"

It takes 20.22 ms to execute before this patch was in. After that, it only took 1.02 ms..

That would mean that the query has been executed 20 times faster as before, which means a tremendous speed up for sites which do use the locale.module..

Steef

Cvbge’s picture

Version: 4.7.0-beta6 » x.y.z
Category: bug » feature
Status: Needs review » Needs work

Postgresql part looks good, but you should use switch statement instead of if and also include mysqli case (same as mysql), thus setting to 'needs work'

Zen’s picture

Would this also be a good place to check/find out if there is any reason why we use a BLOB here (MySQL) instead of TEXT ?

-K

Zen’s picture

Category: feature » task

Caveats:

  • http://dev.mysql.com/doc/refman/4.1/en/create-index.html : You can add an index on a column that can have NULL values only if you are using MySQL 3.23.2 or newer and are using the MyISAM, InnoDB, or BDB storage engine. This is also true for MEMORY tables as of MySQL 4.0.2. You can only add an index on a BLOB or TEXT column if you are using MySQL 3.23.2 or newer and are using the MyISAM or BDB storage engine, or MySQL 4.0.14 or newer and the InnoDB storage engine.
  • http://dev.mysql.com/doc/refman/5.0/en/create-index.html :
     In MySQL 5.0:
    
        * You can add an index on a column that can have NULL values only if you are using the MyISAM, InnoDB, BDB, or MEMORY storage engine.
        * You can add an index on a BLOB or TEXT column only if you are using the MyISAM, BDB, or InnoDB storage engine.

The field length also bears investigating along with the NULL/duplicates issue.

Cheers
-K

Zen’s picture

Title: locale.module tremendous speedup! » locale.module: adding an index on the source field improves speeds
Stefan Nagtegaal’s picture

Would this also be a good place to check/find out if there is any reason why we use a BLOB here (MySQL) instead of TEXT ?
@Zen: IMO that should be a new issue. I don't want to cover more than absolutely nessecary in this issue because I would like to see this committed against HEAD before release of 4.7.

@Zen: What do you ment with your comment about the caveats? I don't know if I understand what you mean.. Does it give you any errors?

@Cvbge: Formatting of updates.inc has been improved as you pointed out.

Both, thanx for the review, new patch is attached.

Stefan Nagtegaal’s picture

Status: Needs work » Needs review

Setting status back to "patch (code needs review)"..

Bèr Kessels’s picture

FileSize
2.95 KB

I did an apache benchmark. And attached the outcome. AFAIK the patch had no real effect, the unpatched version looks faster even.

Disclaimer, i am clueless when it comes to benchmarks ;)

killes@www.drop.org’s picture

Can you do the test again with ab -n 200 -c1? Your test did indeed no tell very much. You did have locale enabled, did you? I think the patch will only affect pages where there are long strings (strlen > 75) which will be retrieved directly from the db.

Zen’s picture

http://drupal.org/node/50669 <-- This patch makes MySQL support engine agnostic, basically stating that Drupal will work on all (common) engines on MySQL v3.23.58+. As stated above, innoDB does not support TEXT/BLOB indexing in versions before 4.0.14, thus making this patch rather dicey.

Just a thought.
-K

Cvbge’s picture

And: have you had cache enabled? (should be disabled). How many translations (different languages) you had imported? (I suppose 1 non-english translation is most common, but testing with 2-3 could be good too).
Also, I wonder if index on the first 30 characters is optimal.

Zen’s picture

The rest of my comment appears to have been eaten up.

The issue in my last statement makes MySQL support in Drupal engine agnostic. So we are essentially supporting all mainstream MySQL engines (and innoDB is one of them) on versions of MySQL in v.3.23.58 onwards. As stated above, innoDB does not support TEXT/BLOB field indexing on MySQL versions older than 4.0.14, thereby making this patch a little on the dicey side.

Cheers
-K

Bèr Kessels’s picture

I will try to rerun that benchmark later.
1) no cache enabled
2) I ran it on ?q=tracker. I am not aware of any page with large strings that are available to anonymous. Ill create a php page for that.
3) I did the c4 on purpose; in my system the slow queries, those wayting while the lock is on affect speed most.
4) modules: all standard modules + locale + statsictics (the imo one-but biggest slow-downer)
5) 3 translations. approx 600 strings 'imported'.
6) only four strnigs were translated. That could be the cause?

Bèr

killes@www.drop.org’s picture

if the locale_target table is empty an index cannot have an impact.

chx’s picture

tavon tested changing blob to text in both source and target tables and it's ok.

killes@www.drop.org’s picture

I've included the change the blob to text change

Steven’s picture

AFAIK we use blob because mysql treats text in a case insensitive fashion.

killes@www.drop.org’s picture

Don't think so:

http://dev.mysql.com/doc/refman/4.1/en/blob.html

BLOB columns have no character set, and sorting and comparison are based on the numeric values of the bytes in column values. TEXT columns have a character set, and values are sorted and compared based on the collation of the character set assigned to the column as of MySQL 4.1. Before 4.1, TEXT sorting and comparison are based on the collation of the server character set.

killes@www.drop.org’s picture

Seems Steven was right and I wasn't.

we have two choices:

1) not apply the patch
2) apply it (minus the blob2text change), and change the system requiremts to state that you can't use old innodb and locales.

killes@www.drop.org’s picture

3) make the locales_source table explicitly MyISAM again.

yched’s picture

I run a french localized site. A lot of admin pages are not translated, and the views/edit page for instance (quite heavy, all untranslated strings) was taking ages to build.

I tested the patch (well I just created the index - no blob2text thing)
This is in no way a regular benchmark - i just rely on query times displayed by devel.module

Before :
40 to 50 ms (!!) for every locale query

After :
all these queries drop to 2 or 3 ms...

local server (Win), MySQL5
one language and 3000 records in locale_target

I guess my personnal opinion is settled : commit it or not, I keep it ! :-)

Naive question : why don't we store in cache the fact that a string has no translation ?
We have to go through all those unsuccessful queries on each page load ?
views/edit page triggers something like 200+ of these queries, that's half a second...

Stefan Nagtegaal’s picture

So killes, what method do you prefer and would let you commit this to HEAD before the release of drupal 4.7?

I'll write the patch which is needed, if time permits...

Steef

Zen’s picture

Status: Needs review » Postponed

This has been postponed to 4.8 - I've created a documentation issue so that this goes into the handbooks for now.. http://drupal.org/node/56943

-K

Dries’s picture

Status: Postponed » Needs work

Let's get this into the next version!

Gerhard Killesreiter’s picture

IIRC the problem was that adding an index on blob colums wasn't supported for all database backends we currently support. IIRC, InnoDB on mysql 3 was not supported.

jadwigo’s picture

+1 for getting some work on this ... the speed improvements are really cool > from ~2500ms to ~550ms page generation time on /node for my homepage for example, more than 4 times faster, so less load on the server, so my hosting provider is happier too

IMO it's really important too fix this at least for incomplete translations, because many translations are incomplete as soon as you use contrib modules in a site. For example translating of admin pages is not a priority for most modules, and even worse translation sources will not be deleted when you remove a module.

Another question: why are translations not cached when a translation is incomplete anyway?
This has significant impact, and only if you really need all settings pages of a site translated does it have a real reason. I would not mind that some untranslated strings are cached if that speeds up my site, as long as the cache get refreshed when I change strings (which would not be a large technical problem to do, and should already work that way)

bwynants’s picture

applying the patch gave an amazing increased of speed for my site (speed was on of the reasons not to upgrade my other sites)

If you guys are serious about 4.7 this patch is a must. Or someone needs to rewrite locale.....

Stefan Nagtegaal’s picture

Dries, now that time is on my side I'm willing to update the patch.
Question remains how you want this to be implemented..

smk-ka’s picture

OT:

@jadwigo,
@yched:
Just out of interest: could it be possible that you suffer from this bug? Are you using Organic Groups or SMTP module together with localization?

Dries’s picture

Patch no longer applies.

yched’s picture

@smk-ka : no OG nor SMTP on my side.
I'm not home right now, so I cant't really check for other possible modules triggering the bug you mention

jadwigo’s picture

@smk-ka
no og.module or smtp.module on my side

Dries’s picture

killes wrote "IIRC, InnoDB on mysql 3 was not supported.". I can't find proof of that ... do you have a pointer or a reference?

Dries’s picture

Just re-roll the patch. The index is added in a seperate SQL statement so in the event that your database doesn't allow an index to be added, the table would still be created.

meba’s picture

These queries are not using indexes either:

SELECT s.lid, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid WHERE s.source = '' AND t.locale = 'cs'
SELECT locale, name FROM locales_meta WHERE enabled = 1 ORDER BY isdefault DESC, name ASC;

Could anyone update the patch to include indexes for these queries?

killes@www.drop.org’s picture

Since weve decided o drop mysql 3 support, my objections from #28 are no longer valid.

killes@www.drop.org’s picture

Since weve decided o drop mysql 3 support, my objections from #28 are no longer valid.

drumm’s picture

Status: Needs work » Needs review
Dries’s picture

Probably needs to be re-rolled. Let's get this patch in ...

Heine’s picture

FileSize
2.19 KB

Updated to apply against HEAD.

Dries’s picture

Status: Needs review » Fixed

Committed to CVS HEAD.

Anonymous’s picture

Status: Fixed » Closed (fixed)