After importing a .po file that contains the following text after the header section:
#: /mysite/?q=fr; sites/all/modules/webfm/js/webfm.js
msgid ""
msgstr ""
information contained in the header section of the .po file, like the plurals settings, is erased from the database. i.e. I get this diff between the previous and current database dumps:
-INSERT INTO `mysite_dev_languages` VALUES ('fr','French','Français',0,1,2,'($n!=1)','','fr',0,JAVASCRIPT-HASH-1');
+INSERT INTO `mysite_dev_languages` VALUES ('fr','French','Français',0,1,0,'','','fr',0,'JAVASCRIPT-HASH-2');
After digging in the code a bit I discovered this bit:
includes/locale.inc:1268:
case 'db-store':
// We got header information.
if ($value['msgid'] == '') {
$header = _locale_import_parse_header($value['msgstr']);
which seems to express that any section containing an empty string for msgid will blank the header info. Ordinarily this would be okay, except that apparently in this case some rogue module has tossed a wrench in the works (unintentionally of course) by arranging for there to be a blank msgid that's not a header.
Seems like this routine could benefit from a bit of robustification.
This may be corrected later, I haven't done any testing to see if it's causing problems. I don't expect I will, I'll just make sure that these outliers, ie empty msgid's, are taken care of. Sorry if this "issue" is completely inconsequential.
Comment | File | Size | Author |
---|---|---|---|
#37 | skip_killing_formula-655048-d6-37.patch | 1022 bytes | Albert Volkman |
#33 | skip-killing-formula-33-D7.patch | 3.51 KB | gumanist |
#30 | skip-killing-formula-30-D7.patch | 3.51 KB | gumanist |
#27 | skip-killing-formula-27-D7.patch | 3.41 KB | gumanist |
#16 | skip-killing-plural-formula-16.patch | 3.23 KB | Gábor Hojtsy |
Comments
Comment #1
Gábor HojtsyYes I agree that we could make this code more robust to only take the msgstr value for the first empty msgid as the header. We should also make sure that empty source strings are never exported or handled for that matter in t() when saving data. Because the same source string appearing two times in a .po file makes it invalid for editing in editors which respect the .po format.
Bugs should be fixed in Drupal 7 and then backported.
Comment #2
intuited CreditAttribution: intuited commentedGood point -- I wasn't attacking the root problem. Mostly because I didn't know what it was; I have a (somewhat) better idea now, in part thanks to your guidance.
This patch is not all that thoroughly tested, partly because I'm not running a 7.x install. Initial testing of the same changes made to 6.14 indicates that it is effective.
There are three things that this patch (hopefully) does:
The three levels of checking should help to catch empty strings lurking in the DB or in previously-exported .po* files.
Comment #3
intuited CreditAttribution: intuited commentedComment #4
plachCleaning-up the "language system" issue queue as per http://cyrve.com/criticals.
Comment #5
Dries CreditAttribution: Dries commentedPatch has some code style issues. For example:
} else {
.Comment #6
intuited CreditAttribution: intuited commentedSorry for the delay, I haven't been working on Drupal stuff lately.
Here's a revised patch. The
} else {
is fixed, but I didn't see any other style violations introduced by the patch itself. There's some other stuff, eg the calls todb_update
, that seemed questionable. However that code is not actually introduced by the patch; it is only in the changeset because of an indentation adjustment. It seems like it would be confusing to rewrite them, especially since people might diff the revisions with whitespace ignoring turned on.So this patch is the same as the last but with the
} else {
fixed, and is against current HEAD.Comment #7
intuited CreditAttribution: intuited commentedComment #8
Gábor HojtsyThe locale code changed a bit in the meantime so it does not overwrite the plural formula anymore if you did not use the overwrite mode. It will still kill the plural formula if you have a malformed header or missing plural formula *in that case*.
Closed #993670: Do not kill plural formula if the one in the .po file was broken or missing as duplicate by the way.
Comment #9
Gábor HojtsyA much simpler solution that I've used in my D7 module gettextapi that is under active development is to just not drop the plural formula from the database in any case. There are no languages on earth which have no rules for plurals, some have pretty simple rules, but all of them have rules. So emptying that out even for a malformed Plural-Forms value (which the above patch still did) is an error. We should just leave it alone if the .po file did not contain one or it was malformed. The Plural-Forms parser will already signal user feedback that the formula was malformed, so no need to inform the user again. I think this is as simple as the attached patch.
Comment #10
Gábor HojtsyLikely needs tests and (then) reviews. Reviews welcome anyway.
Comment #11
Gábor HojtsyAdding UI language translation tag.
Comment #12
Gábor HojtsyAll right, the code moved ahead a bit from under this patch, but the same problem applies. Updated patch. We have plenty of tests for plural formulas, especially the new ones added in #1221276: locale_get_plural() only works for a single language within a request and does not work for English. We can extend that one with a "broken .po file" (missing plural formula) once that lands.
Comment #13
Gábor HojtsyPostponed for now until #1221276: locale_get_plural() only works for a single language within a request and does not work for English gets committed and we can expand the tests from there to cover this issue.
Comment #14
Gábor HojtsyFix the title to be more correct as to what happens.
Comment #15
Gábor Hojtsy#1221276: locale_get_plural() only works for a single language within a request and does not work for English landed, so we can extend those tests to verify this functionality.
Comment #16
Gábor HojtsyAll right, attached test extensions based on those. Added a .po file with missing formula and a file with a broken plural formula. The test only patch proves that both French and Croatian plural data will break with the pre-patch code. The patched + test code proves that the missing plural formula or the broken plural formula will not kill the known plural formula.
Comment #18
Gábor HojtsyThe fail was on purpose. So please help review the passing (patch + test).
Comment #19
tstoecklerI'm marking this RTBC.
The code is fine and this is well tested. I did not try this out locally, so please move to needs review if you think that is necessary. I personally don't think it is.
Comment #20
Gábor HojtsyMarking for backport too as it was originally submitted against 6.x.
Comment #21
webchickNormally I would prefer manual confirmation of a patch, yeah, but the tests here seem pretty solid and it's looking like we won't need to do a D7 release this month so we should hopefully have a buffer to find any problems.
This patch didn't apply to 7.x, so committed it just to 8.x. Marking for backport, since gettext.inc doesn't exist there. :)
Comment #22
Gábor HojtsyThe Drupal 7 patch from be in #9 plus the tests from #1221276: locale_get_plural() only works for a single language within a request and does not work for English and #16 merged. So I think we should postpone this in D7 on #1221276: locale_get_plural() only works for a single language within a request and does not work for English like we did for D8 since that sets up the tests we can easily build on for this instead doing tests from scratch. That bug is also in the D7 backport business.
Comment #23
Gábor HojtsyNot going to work on the Drupal 7 backport. Previous patches from above can be used as guidance for that.
Comment #24
Gábor HojtsyNeither I have hordes of people to direct here as part of our current focus. Unfortunately. Doing Drupal 8 patches is hard enough.
Comment #25
gumanist CreditAttribution: gumanist commentedHas a suggession.
Would it be better to move all plural formulas out of the .po files. It could be implemented via one file in locale module with all the plural formulas for all the languages and be accessible from the admin interface on language editing|adding page? data for the file could be gathered from localize.drupal.org.
Formulas is known for all the languages, and it just needed to provide inteface to change it, if someone need it.
Comment #26
Gábor HojtsyThe formulas do need to be in .po files, or they would not be editable in most translation tools. For Drupal 7/6, the .po files from localize.drupal.org are pulled with l10n_update module, which does the plural initialization with the first .po file. There is also l10n_pconfig, that includes a list of plural formulas and preconfigures without .po files. providing an editing UI like that would be a no-go for core though. All in all we do plan to build in the l10n update functionality in D8, but that does not help for cases, when people do import broken .po files manually. Also not for D7, which still needs this bug fixed.
Comment #27
gumanist CreditAttribution: gumanist commentedThank you for explanation.
D7 patch tests are based on other test case which was in
Comment #28
gumanist CreditAttribution: gumanist commentedtest
Comment #29
andypostcomment should be 2-line not longer 80chars per line, could be copied from original patch
Same
Comment #30
gumanist CreditAttribution: gumanist commentedUpdated
Comment #31
andypostComment #33
gumanist CreditAttribution: gumanist commentedre-rolled patch
Comment #34
andypostGreat!
Comment #35
webchickGreat. Thanks folks!
Committed and pushed to 7.x. Moving down to 6.x.
Comment #36
andypostuntagging! @gumanist yay!
Comment #37
Albert Volkman CreditAttribution: Albert Volkman commentedD6 backport.
Comment #38
andypostComment #39
Gábor HojtsyThanks, committed, pushed.
Comment #42
xjm