Problem/Motivation
When importing large translation files (e.g. Drupal core translations) with Drupal using an SQLite database, the import fails with a "too many SQL variables" error. This error occurred in #1848490: Import translations automatically during installation, but also occurs when manually importing large translation files (on the interface translation import page at admin/config/regional/translate/import). The failure is not language dependent, but may be environment dependent, since the maximum number of SQL variables is determined by the SQLite setting SQLITE_LIMIT_VARIABLE_NUMBER.
The error is caused by a select query run by _locale_refresh_translations() to determine whether JavaScript strings were affected by the import.
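The failing pattern can be illustrated with a sketch (Python's sqlite3 module rather than Drupal's PHP; the table and column names are only illustrative): one bound variable is generated per imported string id (lid), which blows past SQLite's variable limit on a large import.

```python
# Hypothetical reconstruction of the failing pattern: one bound variable
# per imported string id (lid). Table and column names are illustrative.
lids = list(range(5000))  # a core .po import easily yields thousands of lids
placeholders = ",".join("?" * len(lids))
sql = f"SELECT lid, location FROM locales_source WHERE lid IN ({placeholders})"
# With SQLITE_LIMIT_VARIABLE_NUMBER at its historical default of 999,
# executing this raises sqlite3.OperationalError: too many SQL variables.
print(sql.count("?"))  # 5000
```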
Proposed resolution
The attached patch collects the affected locations during the import process and invalidates the JavaScript translation cache based on this result. This also makes it future-proof: if CMI translation lands in core, we can flush the CMI translation cache too.
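The idea can be outlined as follows (a hypothetical Python sketch, not the actual PHP patch; all names are invented for illustration): lightweight bookkeeping during import replaces the big post-import query.

```python
# Minimal sketch of the patch's idea: record each affected location type
# while strings are imported, then invalidate the matching caches once at
# the end, with no per-string DB lookup afterwards.
affected = {}  # location type -> set of langcodes touched

def record_string(langcode, location_type):
    """Called once per imported string; pure bookkeeping, no DB query."""
    affected.setdefault(location_type, set()).add(langcode)

def finish_import(invalidate):
    """Invalidate each cache once per (location type, language) pair."""
    for location_type, langcodes in affected.items():
        for langcode in sorted(langcodes):
            invalidate(location_type, langcode)

# Example run: two JS strings and one path string imported in French.
record_string("fr", "javascript")
record_string("fr", "javascript")
record_string("fr", "path")
calls = []
finish_import(lambda loc_type, lang: calls.append((loc_type, lang)))
print(calls)  # [('javascript', 'fr'), ('path', 'fr')]
```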
Remaining tasks
t.b.d.
User interface changes
none
API changes
none
Comment | File | Size | Author |
---|---|---|---|
#14 | locale-sqlite-translation-import-1884182-14.patch | 4.45 KB | Sutharsan |
#6 | locale-sqlite-translation-import-1884182-6.patch | 4.45 KB | Gábor Hojtsy |
#1 | locale-sqllite-translation-import-1884182-1.patch | 5.15 KB | Sutharsan |
Comments
Comment #1
Sutharsan CreditAttribution: Sutharsan commented
Comment #2
Gábor Hojtsy
Thanks for the great find! I think this slows down .po file import in favor of speeding up the JS cache file regeneration. So for each string imported, it looks up the location in the DB (one query per string?) to avoid regenerating all JS translation files later. Not sure if this is a reasonable trade-off. Would love to read @reyero's feedback.
Also, are only JS locations stored? I believe other types of locations are stored too (.po source file or URL path where the string was found?!). If so, the locations array might be HUGE at the end of importing a few thousand strings, so maybe we should limit it to a few location types only. Core will need javascript and CMI locations. (CMI-locale integration is not yet in core; it is being proposed in #1648930: Introduce configuration schema and use for translation.)
Comment #4
Sutharsan CreditAttribution: Sutharsan commented
The import is slowed down by approximately 20% due to this patch. I have only found two locations: javascript and path. The .po source file is not a location (it used to be in D7?).
Comment #5
Gábor Hojtsy
20% is a pretty big price to pay :/ I'd rather track back to rebuilding all the JS; #1677026: Less aggressive locale js file rebuilds has other ideas to alleviate that.
Comment #6
Gábor Hojtsy
So the bug was introduced in http://drupalcode.org/project/drupal.git/commitdiff/10e58bdd6400cb097fe0... (issue: #1777070: Refactor and clean up source string location handling). Looking at how the per-lid cache refresh is used, I think it makes sense to keep it for manual UI-based editing, where only up to 20 strings could be affected and the SQLite bug would not appear. However, on .po file import the per-lid placeholders are not going to work, so we need to roll back to a state close to where it was before #1777070: Refactor and clean up source string location handling. I did not roll back exactly to the prior state, because the API function to clear the cache looked useful, so I just made the lids optional on it. If no lids are provided, per-string lookups are not performed, but all languages that were passed are still affected.
This should actually speed up the import process (although I'm not sure that would be measurable), since we store less baggage with the import now that we don't need / use the per-string data.
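The approach described above might look roughly like this (a hedged Python sketch; the real change is in locale.module in PHP, and all names here are illustrative):

```python
# Hedged sketch: the cache-clear helper keeps its per-string (lid) mode
# for small manual UI edits, but makes lids optional. On .po import no
# lids are passed, so the per-string lookup (and its huge IN clause on
# SQLite) is skipped and all passed languages are refreshed.

def clear_js_cache(languages, lids=None, lookup_js_lids=None):
    """Return the set of langcodes whose JS translation files need a rebuild."""
    if lids is None:
        # Import path: no per-lid query at all.
        return set(languages)
    # UI path: at most ~20 lids, safe to check one language at a time.
    return {lang for lang in languages if lookup_js_lids(lang, lids)}

# Import: every passed language is refreshed without touching the DB.
print(sorted(clear_js_cache(["fr", "de"])))  # ['de', 'fr']
# UI edit: only languages whose JS strings actually changed.
print(sorted(clear_js_cache(["fr", "de"], lids=[7],
                            lookup_js_lids=lambda lang, lids: lang == "fr")))  # ['fr']
```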
Comment #7
Gábor Hojtsy
As for automated tests: to cover this, we'd need a .po file with a huge number of source strings. Per http://www.sqlite.org/limits.html#max_variable_number the default maximum is 999 variables, so a .po file with 1000 strings would suffice (which can in itself be a pretty huge file). Then we'd need to import it and see whether it fails. It would obviously never fail on testbot, because testbot runs MySQL everywhere. So I'm not sure there is a point in attempting to write tests for this specific bug.
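For reference, the fixture such a test would need could be generated along these lines (an illustrative Python sketch; the file layout is a minimal .po approximation, not a full gettext header):

```python
import os
import tempfile

# Generate a minimal .po fixture with 1000 entries, enough to exceed
# SQLite's historical 999-variable default.
def make_po(path, count=1000):
    lines = ['msgid ""', 'msgstr "Content-Type: text/plain; charset=UTF-8\\n"', ""]
    for i in range(count):
        lines.append(f'msgid "Source string {i}"')
        lines.append(f'msgstr "Translated string {i}"')
        lines.append("")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(lines))
    return count

path = os.path.join(tempfile.gettempdir(), "big-import-test.po")
print(make_po(path))  # 1000
```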
Comment #8
webchick
Escalating to major, since not having working translation imports for one of our supported DBs is probably really a critical bug, but hey, it's only SQLite. ;P
Comment #10
Gábor Hojtsy
#6: locale-sqlite-translation-import-1884182-6.patch queued for re-testing.
Comment #11
YesCT CreditAttribution: YesCT commented
Steps to test the patch in #6:
sudo rm -r sites
git checkout sites
drop db
git checkout 8.x
git pull --rebase
apply patch (drush am 1884182)
manually install (no drush si)
select sqlite for the db
extend: enable interface translation module
go to configuration -> user interface translation
click import tab, follow link to translation server
pick french, scroll down on french page and get the 7.x .po file
get french po file (curl -O http://ftp.drupal.org/files/translations/7.x/drupal/drupal-7.18.fr.po )
back on the import tab, browse to select the french .po file
import
wait
====
I confirm that with the patch (and SQLite) there are no errors anymore.
Also tested with MySQL: no errors there either.
The code changes look OK to me.
Comment #13
Gábor Hojtsy
The test failures seem to persist. The LocaleTranslationTest.php failure is probably related to missing or too aggressive cache invalidation in the translation editing UI code (although theoretically the above patch does not affect that; it retains the individual string-based invalidation there). I could not figure out what might be the cause of the negotiation failure. Could use another pair of eyes.
Comment #14
Sutharsan CreditAttribution: Sutharsan commented
The test failed because a non-breaking space slipped into the patch. Mere mortals can't see the difference from a regular space, but PHP can. For info:
Call to undefined function  () in /Users/erik/www/drupal8/core/modules/locale/locale.module on line 1038
This patch should do better.
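This class of bug is easy to scan for; a small illustrative Python check (not part of any patch here) along these lines would have flagged the character:

```python
# U+00A0 (non-breaking space) looks identical to a space but makes PHP
# (or any parser) see a different token, e.g. an undefined function name.
SUSPECTS = {"\u00a0": "NO-BREAK SPACE", "\u200b": "ZERO WIDTH SPACE"}

def find_invisible_chars(text):
    """Return (line, column, name) for each suspicious character found."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), 1):
        for col, ch in enumerate(line, 1):
            if ch in SUSPECTS:
                hits.append((lineno, col, SUSPECTS[ch]))
    return hits

# A patch hunk with a non-breaking space hiding before the parentheses.
patch = "  locale_js_translate\u00a0();\n"
print(find_invisible_chars(patch))  # [(1, 22, 'NO-BREAK SPACE')]
```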
Comment #15
Gábor Hojtsy
Oh, gosh :D I figured out earlier how to obliterate that from my system but then reinstalled. Will do that again to avoid such issues in the future. Sorry for wasting your time.
Comment #16
YesCT CreditAttribution: YesCT commented
Cool! It's green, and I tested this manually earlier.
@webchick wants this in before #1848490: Import translations automatically during installation
After this gets in, do we want to keep it open and re-pursue some of the earlier approach?
Comment #17
webchick
Awesome!! Thanks a lot, folks, especially for the fast turnaround. And I think Sutharsan gets the "Sherlock Of The Year" award for tracking down invisible characters. :D
It would be nice to have Jose chime in here too, but it looks like he hasn't been on IRC in a couple of days. Looking back at #1777070-12: Refactor and clean up source string location handling, where it seems this string tracking was introduced, it looks like yet another elaborate workaround for #1677026: Less aggressive locale js file rebuilds, so we should probably just fix that bug; it seems like it would solve a lot of problems. In any case, a performance improvement for JavaScript string importing doesn't seem like a good trade-off for breaking SQLite on installation, so I'm comfortable rolling this chunk back.
Therefore, committed and pushed to 8.x. Thanks!!
Comment #18
Gábor Hojtsy
Woot, thanks!
Comment #19
Jose Reyero CreditAttribution: Jose Reyero commented
The workaround looks good to me, as long as it fixes the bug :-)
Still, we need to look into better optimization for refreshing translations; I'll follow up on the other issue, #1677026: Less aggressive locale js file rebuilds.
About the SQLite issue: really, it may happen in many other places, like multiple entity loading, or anywhere that uses a list of ids to load something. The real issue seems to me not a bug in this part of the code but the absence of limits for these parameters (when the underlying system actually has them). I mean, you can expect similar issues in other parts of the code when there's a lot of data in the database.
If we claim to support SQLite, but also as a generic safeguard (I guess MySQL will have a limit too), we should have some policy like "do not use more than 999 variables for an SQL query; break into multiple steps when that may happen".
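The policy described here can be sketched like this (Python sqlite3 for illustration only; Drupal's actual DB layer is PHP, and the table name is invented): chunk the id list so no single query binds more than 999 variables.

```python
import sqlite3

# 999 is SQLite's historical default for SQLITE_LIMIT_VARIABLE_NUMBER.
MAX_VARS = 999

def load_by_ids(conn, table, ids):
    """Load rows for an arbitrarily long id list in limit-safe chunks."""
    rows = []
    for i in range(0, len(ids), MAX_VARS):
        chunk = ids[i:i + MAX_VARS]
        marks = ",".join("?" * len(chunk))
        rows += conn.execute(
            f"SELECT id FROM {table} WHERE id IN ({marks})", chunk
        ).fetchall()
    return rows

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE strings (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO strings VALUES (?)",
                 [(i,) for i in range(5000)])
rows = load_by_ids(conn, "strings", list(range(5000)))
print(len(rows))  # 5000, fetched in 6 queries of at most 999 variables each
```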
Comment #20
webchick
We're not talking about 999 variables, though, we're talking about *5,000*. To me, that's a clear indication that we're Doing Something Wrong.™
Comment #21
Jose Reyero CreditAttribution: Jose Reyero commented
Umm, looking again at the patch, I think the part we've overlooked is that we don't really need to pass the list of strings after a batch import, but it may be useful to see whether we've updated any strings at all.
So, instead of completely removing this:
We could have used this (still tracking strings, but not passing the list of strings):
Comment #22
Gábor Hojtsy
@Jose Reyero: well, with l10n_update in core now, the .po files are managed by Drupal and it will not attempt to re-import files that it knows did not change. So it is *very* unlikely in Drupal 8 that an import would result in no strings changed.