For performance, Drupal loads only short strings into its locale cache and retrieves longer strings from the database when required. I noticed a huge number of duplicate queries for "login or register to post comments". As it turns out, when no translation is found in the database, that result is not cached. So if a longer string occurs multiple times on a page, the same DB query is sent each time.
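
The behaviour described above amounts to a missing negative cache: hits are remembered, misses are not. Below is a minimal sketch in Python of what caching the miss looks like. It is not Drupal's PHP code; `TranslationLookup`, its `db` dictionary and the `queries` counter are hypothetical names used only to make the effect visible.

```python
class TranslationLookup:
    """Toy translation lookup that also caches "no translation found"."""

    def __init__(self, db):
        self.db = db        # hypothetical stand-in for the database: source -> translation
        self.cache = {}     # source -> translation, or True meaning "no translation"
        self.queries = 0    # number of simulated database queries

    def _query(self, source):
        # Simulated database round trip.
        self.queries += 1
        return self.db.get(source)

    def translate(self, source):
        if source in self.cache:
            cached = self.cache[source]
            # True is the marker for "looked up before, nothing found".
            return source if cached is True else cached
        translation = self._query(source)
        if translation:
            self.cache[source] = translation
            return translation
        # Remember the miss so the next call skips the database.
        self.cache[source] = True
        return source


lookup = TranslationLookup(db={})
for _ in range(3):
    lookup.translate("login or register to post comments")
print(lookup.queries)
```

Without the line that stores `True`, every one of the three calls in this sketch would reach the simulated database.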

Patch attached.

PS: we only load strings shorter than 75 characters. At the time I chose that limit so that this rather frequently used string would be cached, but it seems the length of the string has changed since. Maybe we should review this limit for D6?

Comments

dries’s picture

Priority: Minor » Normal

It would be great if Gabor could take a look at this one -- it might be necessary for D6 as well.

catch’s picture

Version: 5.x-dev » 6.x-dev
Priority: Normal » Minor
Status: Needs review » Needs work

Most things will go to 6.x and then get backported. Changing status accordingly. Patch sounds sensible.

catch’s picture

Priority: Minor » Normal
Status: Needs work » Needs review

Cross-posted and reset to minor by mistake. Leaving at 6.x/needs review.

bart jansens’s picture

Version: 6.x-dev » 5.x-dev

I checked D6 and the code has been rewritten there, already fixing this issue.

Changing version back to 5.x.

gábor hojtsy’s picture

Status: Needs review » Needs work

Yes, there seems to be a solution for this in Drupal 6. I looked at the Drupal 5.x code though, and it did not seem like this fix is enough:

- the $trans object is only filled if a row exists for the string in both the source and the target tables
- if the $trans->translation is empty, this patch caches a TRUE, which already means the string has no translation
- BUT if there is no source and/or target table row (the remaining code of locale()), TRUE is not cached

So this patch only solves one third of the problem/cases as far as I can see. (Drupal 6 has a solution for all cases, as far as I can see.)
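
The cases listed above can be written out as a small decision function. This is a sketch in Python of the behaviour as described in this comment, not the actual locale() code. `cached_by_patch` and its parameters are hypothetical names, and caching a non-empty translation is assumed rather than stated in the thread.

```python
def cached_by_patch(source_row_exists, target_row_exists, translation):
    """Return what the patched lookup would cache for one string.

    Returns the translation itself, True (meaning "no translation"),
    or None when nothing is cached at all.
    """
    if not (source_row_exists and target_row_exists):
        # The joined query returns no row, so the patched branch never runs.
        return None
    if translation:
        # Assumption: a found, non-empty translation is cached as-is.
        return translation
    # Row exists in both tables but the translation is empty: the patch caches True.
    return True


for case in [(True, True, "Vues"), (True, True, ""), (True, False, ""), (False, False, "")]:
    print(case, "->", cached_by_patch(*case))
```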

bart jansens’s picture

Indeed, nothing is cached when either source or target is empty. However, unless I'm missing something, this should only happen the very first time a string is used. The remaining code in locale would insert an empty translation in the database and wipe the cache.

So you'll execute the query twice the first time a string is found (the third time it will be cached). As this happens only once, I didn't see a reason to optimize for this unlikely case in D5.

dgtlmoon’s picture

Attached file: status new, size 445 bytes

I can confirm this is still happening in 5.x. I'm getting 2500 (yes, nearly two thousand five hundred) SQL queries because I imported a large set of strings but have translated only a few (albeit very important) of them.

Using this method I can see:


     50 SELECT s.lid, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid WHERE s.source = 'ad group: @name' AND t.locale = 'en-custom'
     50 SELECT s.lid, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid WHERE s.source = 'User List: Users with %role' AND t.locale = 'en-custom'
     80 SELECT r.rid, r.name FROM role r WHERE r.name = '0'
     80 SELECT s.lid, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid WHERE s.source = 'Menus' AND t.locale = 'en-custom'
     80 SELECT s.lid, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid WHERE s.source = 'refine with terms from !voc' AND t.locale = 'en-custom'
    110 SELECT s.lid, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid WHERE s.source = 'Custom' AND t.locale = 'en-custom'
    120 SELECT s.lid, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid WHERE s.source = 'Core blocks' AND t.locale = 'en-custom'
    306 SELECT s.lid, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid WHERE s.source = 'Views' AND t.locale = 'en-custom'
    565 SELECT s.lid, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid WHERE s.source = 'Contributed modules' AND t.locale = 'en-custom'

(left column is number of queries for that string)
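
A count like the one above can be produced from any list of logged queries. The following is a minimal sketch in Python, assuming the queries are already available as plain strings. `count_duplicates` and `sample_log` are hypothetical and are not the method used for the figures in this comment.

```python
from collections import Counter


def count_duplicates(queries):
    """Return (query, count) pairs, least frequent first."""
    counts = Counter(q.strip() for q in queries)
    return sorted(counts.items(), key=lambda item: item[1])


# Hypothetical log excerpt; real input would come from a query log.
sample_log = [
    "SELECT dst FROM url_alias WHERE src = 'node/1'",
    "SELECT s.lid FROM locales_source s WHERE s.source = 'Views'",
    "SELECT s.lid FROM locales_source s WHERE s.source = 'Views'",
]

for query, count in count_duplicates(sample_log):
    print(f"{count:7d} {query}")
```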

A more general grouping reveals

     22 SELECT data, created, headers, expire FROM cache
    100 SELECT r.rid, r.name FROM role r
    113 SELECT dst FROM url_alias
   2425 SELECT s.lid, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid
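
The more general grouping drops the part of each query that varies. A rough sketch in Python, assuming that cutting each query at its first ` WHERE ` is a good enough normalisation. `group_by_shape` and `sample` are hypothetical names, and this is not necessarily how the numbers above were obtained.

```python
from collections import Counter


def group_by_shape(queries):
    """Count queries by the text that precedes the first WHERE clause."""
    shapes = (q.split(" WHERE ", 1)[0].strip() for q in queries)
    return Counter(shapes)


# Hypothetical log excerpt with two different source strings.
sample = [
    "SELECT s.lid FROM locales_source s WHERE s.source = 'Views'",
    "SELECT s.lid FROM locales_source s WHERE s.source = 'Menus'",
    "SELECT dst FROM url_alias WHERE src = 'node/1'",
]

for shape, count in group_by_shape(sample).most_common():
    print(f"{count:7d} {shape}")
```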

So the main offenders are url_alias and the locale module. Attached is a similar patch that caches the case where no translation is found.

After applying that patch, my results look like this:

     17 SELECT *
     18 SELECT data, created, headers, expire FROM cache
    100 SELECT r.rid, r.name FROM role r
    113 SELECT dst FROM url_alias
    211 SELECT s.lid, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid
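
For reference, the drop in locale queries between the two listings works out as follows. This is only arithmetic on the two counts quoted in this comment (2425 before, 211 after).

```python
before = 2425  # locale queries before the patch, from the listing above
after = 211    # locale queries after the patch

saved = before - after
reduction = saved / before

print(f"{saved} fewer queries ({reduction:.1%} reduction)")
# prints "2214 fewer queries (91.3% reduction)"
```
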

tr’s picture

Status: Needs work » Closed (won't fix)

This has been fixed in 6.x according to #4 and #5, and now that 5.x is officially unsupported we can mark this as won't fix in 5.x.