After importing a .po file that contains the following text after the header section:

#: /mysite/?q=fr; sites/all/modules/webfm/js/webfm.js
msgid ""
msgstr ""

information contained in the header section of the .po file, like the plurals settings, is erased from the database. i.e. I get this diff between the previous and current database dumps:

-INSERT INTO `mysite_dev_languages` VALUES ('fr','French','Français',0,1,2,'($n!=1)','','fr',0,JAVASCRIPT-HASH-1');
+INSERT INTO `mysite_dev_languages` VALUES ('fr','French','Français',0,1,0,'','','fr',0,'JAVASCRIPT-HASH-2');

After digging in the code a bit I discovered this bit:
includes/locale.inc:1268:

case 'db-store':
  // We got header information.
  if ($value['msgid'] == '') {
  $header = _locale_import_parse_header($value['msgstr']);

which seems to express that any section containing an empty string for msgid will blank the header info. Ordinarily this would be okay, except that apparently in this case some rogue module has tossed a wrench in the works (unintentionally of course) by arranging for there to be a blank msgid that's not a header.

Seems like this routine could benefit from a bit of robustification.

This may be corrected later, I haven't done any testing to see if it's causing problems. I don't expect I will, I'll just make sure that these outliers, ie empty msgid's, are taken care of. Sorry if this "issue" is completely inconsequential.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Gábor Hojtsy’s picture

Version: 6.14 » 7.x-dev

Yes I agree that we could make this code more robust to only take the msgstr value for the first empty msgid as the header. We should also make sure that empty source strings are never exported or handled for that matter in t() when saving data. Because the same source string appearing two times in a .po file makes it invalid for editing in editors which respect the .po format.

Bugs should be fixed in Drupal 7 and then backported.

intuited’s picture

Good point -- I wasn't attacking the root problem. Mostly because I didn't know what it was; I have a (somewhat) better idea now, in part thanks to your guidance.

This patch is not all that thoroughly tested, partly because I'm not running a 7.x install. Initial testing of the same changes made to 6.14 indicates that it is effective.

There are three things that this patch (hopefully) does:

  1. Returns early from t() in the case of an empty string. This should be more efficient, since it's not wasting time checking for an empty source string; it also prevents the new source string from being entered into the DB.
  2. prevents empty strings that have made their way into the DB from being written out when exporting.
  3. if importing of a string with empty msgid is attempted after reading the header, it drops the string and watchdogs a warning-level message.

The three levels of checking should help to catch empty strings lurking in the DB or in previously-exported .po* files.

intuited’s picture

Status: Active » Needs review
plach’s picture

Component: language system » locale.module

Cleaning-up the "language system" issue queue as per http://cyrve.com/criticals.

Dries’s picture

Status: Needs review » Needs work

Patch has some code style issues. For example: } else {.

intuited’s picture

Sorry for the delay, I haven't been working on Drupal stuff lately.

Here's a revised patch. The } else { is fixed, but I didn't see any other style violations introduced by the patch itself. There's some other stuff, eg the calls to db_update, that seemed questionable. However that code is not actually introduced by the patch; it is only in the changeset because of an indentation adjustment. It seems like it would be confusing to rewrite them, especially since people might diff the revisions with whitespace ignoring turned on.

So this patch is the same as the last but with the } else { fixed, and is against current HEAD.

intuited’s picture

Status: Needs work » Needs review
Gábor Hojtsy’s picture

Version: 7.x-dev » 8.x-dev

The locale code changed a bit in the meantime so it does not overwrite the plural formula anymore if you did not use the overwrite mode. It will still kill the plural formula if you have a malformed header or missing plural formula *in that case*.

Closed #993670: Do not kill plural formula if the one in the .po file was broken or missing as duplicate by the way.

Gábor Hojtsy’s picture

Assigned: Unassigned » Gábor Hojtsy
Issue tags: +D8MI
FileSize
633 bytes

A much simpler solution that I've used in my D7 module gettextapi that is under active development is to just not drop the plural formula from the database in any case. There are no languages on earth which have no rules for plurals, some have pretty simple rules, but all of them have rules. So emptying that out even for a malformed Plural-Forms value (which the above patch still did) is an error. We should just leave it alone if the .po file did not contain one or it was malformed. The Plural-Forms parser will already signal user feedback that the formula was malformed, so no need to inform the user again. I think this is as simple as the attached patch.

Gábor Hojtsy’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Likely needs tests and (then) reviews. Reviews welcome anyway.

Gábor Hojtsy’s picture

Issue tags: +language-ui

Adding UI language translation tag.

Gábor Hojtsy’s picture

Status: Needs work » Needs review
Issue tags: +sprint
FileSize
1.24 KB

All right, the code moved ahead a bit from under this patch, but the same problem applies. Updated patch. We have plenty of tests for plural formulas, especially the new ones added in #1221276: locale_get_plural() only works for a single language within a request and does not work for English. We can extend that one with a "broken .po file" (missing plural formula) once that lands.

Gábor Hojtsy’s picture

Status: Needs review » Postponed

Postponed for now until #1221276: locale_get_plural() only works for a single language within a request and does not work for English gets committed and we can expand the tests from there to cover this issue.

Gábor Hojtsy’s picture

Title: Header information blanked when importing a poorly-formed .po file » Plural formula information blanked when importing a poorly-formed .po file

Fix the title to be more correct as to what happens.

Gábor Hojtsy’s picture

Status: Postponed » Needs work
Gábor Hojtsy’s picture

Status: Needs work » Needs review
FileSize
1.99 KB
3.23 KB

All right, attached test extensions based on those. Added a .po file with missing formula and a file with a broken plural formula. The test only patch proves that both French and Croatian plural data will break with the pre-patch code. The patched + test code proves that the missing plural formula or the broken plural formula will not kill the known plural formula.

Status: Needs review » Needs work

The last submitted patch, skip-killing-plural-formula-16-test-shoud-fail.patch, failed testing.

Gábor Hojtsy’s picture

Status: Needs work » Needs review

The fail was on purpose. So please help review the passing (patch + test).

tstoeckler’s picture

Status: Needs review » Reviewed & tested by the community

I'm marking this RTBC.
The code is fine and this is well tested. I did not try this out locally, so please move to needs review if you think that is necessary. I personally don't think it is.

Gábor Hojtsy’s picture

Issue tags: +needs backport to 6.x

Marking for backport too as it was originally submitted against 6.x.

webchick’s picture

Version: 8.x-dev » 7.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

Normally I would prefer manual confirmation of a patch, yeah, but the tests here seem pretty solid and it's looking like we won't need to do a D7 release this month so we should hopefully have a buffer to find any problems.

This patch didn't apply to 7.x, so committed it just to 8.x. Marking for backport, since gettext.inc doesn't exist there. :)

Gábor Hojtsy’s picture

Status: Patch (to be ported) » Postponed

The Drupal 7 patch from be in #9 plus the tests from #1221276: locale_get_plural() only works for a single language within a request and does not work for English and #16 merged. So I think we should postpone this in D7 on #1221276: locale_get_plural() only works for a single language within a request and does not work for English like we did for D8 since that sets up the tests we can easily build on for this instead doing tests from scratch. That bug is also in the D7 backport business.

Gábor Hojtsy’s picture

Assigned: Gábor Hojtsy » Unassigned

Not going to work on the Drupal 7 backport. Previous patches from above can be used as guidance for that.

Gábor Hojtsy’s picture

Issue tags: -sprint

Neither I have hordes of people to direct here as part of our current focus. Unfortunately. Doing Drupal 8 patches is hard enough.

gumanist’s picture

Has a suggession.
Would it be better to move all plural formulas out of the .po files. It could be implemented via one file in locale module with all the plural formulas for all the languages and be accessible from the admin interface on language editing|adding page? data for the file could be gathered from localize.drupal.org.
Formulas is known for all the languages, and it just needed to provide inteface to change it, if someone need it.

Gábor Hojtsy’s picture

The formulas do need to be in .po files, or they would not be editable in most translation tools. For Drupal 7/6, the .po files from localize.drupal.org are pulled with l10n_update module, which does the plural initialization with the first .po file. There is also l10n_pconfig, that includes a list of plural formulas and preconfigures without .po files. providing an editing UI like that would be a no-go for core though. All in all we do plan to build in the l10n update functionality in D8, but that does not help for cases, when people do import broken .po files manually. Also not for D7, which still needs this bug fixed.

gumanist’s picture

Thank you for explanation.

D7 patch tests are based on other test case which was in

gumanist’s picture

Status: Postponed » Needs review

test

andypost’s picture

Status: Needs review » Needs work
+++ b/modules/locale/locale.testundefined
@@ -793,6 +793,23 @@ class LocaleImportFunctionalTest extends DrupalWebTestCase {
+    // Import has broken plurals. This import should not have changed number of plural forms.

comment should be 2-line not longer 80chars per line, could be copied from original patch

+++ b/modules/locale/locale.testundefined
@@ -793,6 +793,23 @@ class LocaleImportFunctionalTest extends DrupalWebTestCase {
+    // Import has no plurals. This import should not have changed number of plural forms.

Same

gumanist’s picture

Issue tags: -Needs tests
FileSize
3.51 KB

Updated

andypost’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, skip-killing-formula-30-D7.patch, failed testing.

gumanist’s picture

Status: Needs work » Needs review
FileSize
3.51 KB

re-rolled patch

andypost’s picture

Status: Needs review » Reviewed & tested by the community

Great!

webchick’s picture

Version: 7.x-dev » 6.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

Great. Thanks folks!

Committed and pushed to 7.x. Moving down to 6.x.

andypost’s picture

untagging! @gumanist yay!

Albert Volkman’s picture

Status: Patch (to be ported) » Needs review
FileSize
1022 bytes

D6 backport.

andypost’s picture

Status: Needs review » Reviewed & tested by the community
Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Fixed

Thanks, committed, pushed.

Status: Fixed » Closed (fixed)
Issue tags: -D8MI, -needs backport to 6.x, -language-ui

Automatically closed -- issue fixed for 2 weeks with no activity.

xjm’s picture