Problem/Motivation

When importing large translation files (e.g. drupal core translation) with Drupal using a SQLite database the translation ends with a "too many SQL variables" error. This error occurred in #1848490: Import translations automatically during installation but also occurs when manually importing large translation files (on the Language Interface translation import page at admin/config/regional/translate/import). The failure is not language dependent, but may be environment dependent since the maximum number of SQL variables is determined by the SQLite setting SQLITE_LIMIT_VARIABLE_NUMBER).
The error is caused by a select query ran by _locale_refresh_translations() to determine if javascript strings were affected by the import.

Proposed resolution

The attached patch collects the affected locations during the import process, and invalidates the javascript translation cache based on this result. This makes it also future resistant in case CMI translation will land in core and we want to flush CMI translation cache too.

Remaining tasks

t.b.d.

User interface changes

none

API changes

none

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Sutharsan’s picture

Gábor Hojtsy’s picture

Thanks for the great find! I think this slows down .po file import in favor of speeding up the JS cache file regeneration. So for each string imported, it looks up the location in the DB (one query per string?) to avoid regenerating all JS translation files later. Not sure if this is a reasonable trade-off. Would love to read @reyero's feedback.

Also are only JS locations stored? I believe there are other type of locations stored (.po source file or URL path where string was found?!) if so, the locations array might be HUGE at the end of importing a few thousand strings. So maybe we should limit it to a few location types only. Core will need javascript and CMI locations. (CMI-locale integration is not yet in core, it is being proposed in #1648930: Introduce configuration schema and use for translation).

Status: Needs review » Needs work

The last submitted patch, locale-sqllite-translation-import-1884182-1.patch, failed testing.

Sutharsan’s picture

The import is slowed down by approximately 20%. due to this patch. I have only found two locations: javascript and path. po source file is no location (it used to be in D7?).

Gábor Hojtsy’s picture

20% is a pretty big price to pay :/ I'd rather track back to rebuilding all the JS, #1677026: Less aggressive locale js file rebuilds has other ideas to alleviate that.

Gábor Hojtsy’s picture

Status: Needs work » Needs review
FileSize
4.45 KB

So the bug was introduced in http://drupalcode.org/project/drupal.git/commitdiff/10e58bdd6400cb097fe0... (issue: #1777070: Refactor and clean up source string location handling). Looking at how the per-lid cache refresh is used, I think it makes sense to keep for the manual UI based editing, where only up to 20 strings could be affected, and the sqlite bug would not appear. However on .po file import, the per-lid placeholders are not going to work, so we need to roll back to a state close to where it was before #1777070: Refactor and clean up source string location handling. I did not roll back exactly to the state before because the API function to clear the cache looked useful, so I just made the lids optional on it. If no lids are provided, per-string lookups are not performed but still all languages are affected which were passed.

This should actually speed up the import process (although I'm not sure that would be measurable), since we store less baggage with the import now that we don't need / use the per-string data.

Gábor Hojtsy’s picture

As for automated tests, to cover this, we'd need a .po file with a huge amount of source strings (as per http://www.sqlite.org/limits.html#max_variable_number the default max limit is 999), so a .po file with 1000 strings would suffice (that can in itself be a pretty huge file). Then we'd need to import it and see if it fails or not. It would obviously never fail on testbot because it runs MySQL everywhere. So I'm not sure there is a point in attempting to write tests for this specific bug.

webchick’s picture

Priority: Normal » Major

Escalating to major, since not having working translation imports for one of our supported DBs is probably really a critical bug, but hey, it's only SQLite. ;P

Status: Needs review » Needs work
Issue tags: -D8MI, -language-ui

The last submitted patch, locale-sqlite-translation-import-1884182-6.patch, failed testing.

Gábor Hojtsy’s picture

Status: Needs work » Needs review
Issue tags: +D8MI, +language-ui
YesCT’s picture

steps to test patch in #6

sudo rm -r sites
git checkout sites
drop db
git checkout 8.x
git pull --rebase
apply patch (drush am 1884182)
manually install (no drush si)
select sqlite for the db
extend: enable interface translation module
go to configuration -> user interface translation
click import tab, follow link to translation server
pick french, scroll down on french page and get the 7.x .po file
get french po file (curl -O http://ftp.drupal.org/files/translations/7.x/drupal/drupal-7.18.fr.po )
back on the import tab, browse to select the french .po file
import
wait

====
I confirm that with the patch (and sqlite) there are no errors anymore.

Also tested with mysql: no errors there either.

code changes look ok to me.

Status: Needs review » Needs work

The last submitted patch, locale-sqlite-translation-import-1884182-6.patch, failed testing.

Gábor Hojtsy’s picture

The test fails seem to be preserved. The LocaleTranslationTest.php is probably related to lack of or too agressive cache invalidation in the translation editing UI code (although theoretically the above patch does not affect that, it retains the individual string based invalidation there). I could not figure out what might be the cause of the negotiation fail. Could use another pair of eyes.

Sutharsan’s picture

Status: Needs work » Needs review
FileSize
4.45 KB

The test failed because a non breaking space slipped into the patch. Mere mortals can't see the difference with a space, but PHP can. For info:
Call to undefined function  () in /Users/erik/www/drupal8/core/modules/locale/locale.module on line 1038

This patch should do better.

Gábor Hojtsy’s picture

Oh, gosh :D I figured out earlier how to obliterate that from my system but then reinstalled. Will do that again to avoid such issues in the future. Sorry for wasting your time.

YesCT’s picture

Status: Needs review » Reviewed & tested by the community

cool! it's green, and I tested this manually earlier.

@webchick wants this in before #1848490: Import translations automatically during installation

After this get's in, do we want to keep it open and re-pursue some of the earlier approach?

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Awesome!! Thanks a lot, folks, especially for the fast turnaround. And I think Sutharsan gets "Sherlock Of The Year" award for tracking down invisible characters. :D

It would be nice to have Jose chiming in here too, but looks like he hasn't been on IRC in a couple of days, and looking back at #1777070-12: Refactor and clean up source string location handling where it seems this string tracking was introduced, it looks like yet another elaborate workaround for #1677026: Less aggressive locale js file rebuilds, so we should probably just fix that bug... seems like it would solve a lot of problems. And in any case, a performance improvement for JavaScript string importing doesn't seem like a good trade-off for breaking SQLite on installation, so I'm comfortable rolling this chunk back.

Therefore, committed and pushed to 8.x. Thanks!!

Gábor Hojtsy’s picture

Woot, thanks!

Jose Reyero’s picture

The workaround looks good to me, as long as it fixes the bug :-)
Still we need to look into better optimization for refreshing translations, I'll follow up on the other issue, #1677026: Less aggressive locale js file rebuilds

About the SQLite issue, really, the issue may happen on may other places, like multiple entity loading, or anywhere that uses a list of ids to load something, and the real issue seems to me not a bug in this part of the code but the absence of limits for these parameters (when the underlying system actually has them). I mean you can expect similar issues on other parts of the code when there's a lot of data in the database.

If we claim to support SQLite but also as a generic safeward - I guess Mysql will have a limit too- we should have some policy like 'do not use more than 999 variables for an sql query, break into multiple steps when that may happen'.

webchick’s picture

We're not talking about 999 variables, though, we're talking about *5,000*. To me, that's a clear indication that we're Doing Something Wrong.™

Jose Reyero’s picture

Umm, looking again at the patch, I think the part we've overlooked is we don't really need to pass the list of strings after a batch import but it may be useful to see whether we've updated any string or not.

So, instead of completely removing this:

-      if ($strings) {
-        // Clear cache and force refresh of JavaScript translations.
-        _locale_refresh_translations($langcodes, $strings);
-      }

We could have used this (still tracking strings, but not passing list of strings)

      if ($strings) {
         // Clear cache and force refresh of JavaScript translations.
         _locale_refresh_translations($langcodes);
      }
Gábor Hojtsy’s picture

@Jose Reyero: well, with l10n_update in core now, the .po files are managed by Drupal and it will not attempt to import files again that it knows did not change. So it is *very* unlikely in Drupal 8 that an import would result in no strings changed.

Automatically closed -- issue fixed for 2 weeks with no activity.