I noticed that new database structure broke search in the admin interface for strings-handling.
How to reproduce:
- Drupal 6-dev fresh install
- Enable Locale module, add a new (empty) locale
- Search for untranslated strings.
- You would expect to see a lot of strings (as shown if searching for all strings), but this search never finds a single string.
The reason is that untranslated strings no more have entries in locales_target with empty translation, there's no entry at all. The SQL query in includes/locale.inc therefore match nothing. I managed to make it find "all languages, all groups, untranslated strings" by replacing "WHERE t.translation = '' " with "WHERE t.lid IS NULL", but if combined with something else (searching for untranslated strings of a given language, which seems quite useful choice) it's broken again, as the language code is also stored in locales_target, which is now missing. I can't see a way out, being not a MySQL guru.
Should I say that removal of empty entries in locales_target broke the translation interface badly, or is there someone with a solution?
I'm unable to supply a patch. Just reporting the bug for now.
| Comment | File | Size | Author |
|---|---|---|---|
| #16 | locale-fixes-4.patch | 12.22 KB | JirkaRybka |
| #14 | locale-fixes-3.patch | 12.56 KB | JirkaRybka |
| #12 | locale-fixes-2.patch | 11.61 KB | JirkaRybka |
| #11 | locale-fixes.patch | 11.5 KB | JirkaRybka |
| #2 | locale-fixes1.patch | 2.2 KB | JirkaRybka |
Comments
Comment #1
meba commentedConfirming and subscribing
Comment #2
JirkaRybka commentedUnfortunately, it's much worse :-(((
The patch, which originated from http://drupal.org/node/150521 broke much more. I'm running MySQL 4.1.19, and have serious problems with the locale.
It seems that the LEFT JOIN on locale tables never works if there's no matching row in locales_target. Code expects the query to return data from locales_source even if nothing is found in locales_target, but (on my MySQL version at least) this returns empty result.
Things broken by this:
1. Search in admin interface, as said above
2. Runtime registration of new strings (!!!) I see that locale() looking for translations receives an empty result on every untranslated string, existing or not, but this is interpreted as source-string non-existence, and therefore DUPLICATE STRINGS ARE INSERTED into locales_source on every occurence, filling the table up very quickly with garbage. This is serious. Also triggers expensive database-stored cache rebuild on EVERY page request.
3. Cache re-building by locale_refresh_cache() only stores translated strings, the condition for caching TRUE is never met (tested). Untranslated strings are only cached in the static array on-the-fly.
4. While translation is found in database (not cached in static array), the currently existing code caches the translation for future use, but forgets to return it via $string variable, so uncached strings are returned in non-translated English!
I'm attaching a quick patch, which should correct #2 and #4 (not fully tested yet), but the other problems (especially #3, which is important performance-wise) seems unsolvable to me, unless adding much more queries, or discovering tricky ones I can't imagine.
I'm inclined to say, that the best way would be to back-out and undo the database-change (the removal of empty translations from locales_target) completely.
I discovered this while working on another performance-patch to the locale system (which seems to work fine on 5.1), but now I should probably wait, as both is in the very same piece of code and I can't integrate into this mess. Soon I'm going to a week's holidays, so probably I'll not have time to finish this (fix) patch myself - please please help!
Oh, and the info how to reproduce: Fresh Drupal 6-dev install, enable Locale module, add a new (empty) locale. Browse a few pages. Go to 'Translate interface' and search for some existing string. See if there are more (duplicate) entries for the same string. (I did search for "Administer" after a very short browsing, and got 14 duplicates already, probably one-per-page). This is the problem #2. #1 reproduction is described above, #3 seems to cause locale entries missing from the {cache} table completely (inserted OK, but flushed VERY frequently), #4 should be seen on long (>75) texts, but I didn't test.
Any help or info about behaviour on other database versions would be useful.
Comment #3
JirkaRybka commentedForgot to update title...
Comment #4
gábor hojtsyWell, the LEFT JOIN should return NULL for those rows where there was no value, that is how it should basically work...
Comment #5
JirkaRybka commentedIf I got your point correctly, then my version of MySQL behaves normally, no strange effect. But then the removal of empty rows from {locales_target} made it impossible to find untranslated strings - the JOIN is necessary to look into target table for existence of translation, but if there's none, then query returns NULL (is this what you meant?) hiding the existence of source string also! Is there a way out? The code really needs to select/recognize untranslated strings, and not only by simple absence of a translation, but even by absence of one particular translation (if more locales are used). The old solution with dummy entries in target table seems more safe to me.
Comment #6
JirkaRybka commentedI slept over it, and I seem to understand finally...
Look at the query used in locale() to find translation for a string, as well as check the source string existence:
db_query("SELECT s.lid, t.translation FROM {locales_source} s LEFT JOIN {locales_target} t ON s.lid = t.lid WHERE s.source = '%s' AND t.language = '%s' AND s.textgroup = 'default'", $string, $langcode)The show-stopper is this part:
WHERE ..... t.language = '%s'Having a WHERE condition on the "target" right-table means that WHERE never matches for a row with no entry in the "target" table. So in the end, a LEFT JOIN with such a WHERE behaves just like INNER JOIN. Doesn't matter which JOIN used, we're completely unable to receive partial results (i.e. from one table if no more exists), unless dropping multi-locale support (the WHERE condition on target table), or returning dummy entries into the table (which I consider a good idea now).I seem able to fix some parts by adding extra query to check for source string existence, but that's awful performance-wise and still works not at all places - particularly in locale_refresh_cache(), which does iterate through all short strings of all locales on any single page request under certain conditions (new string found), there's absolutely necessary a single-query solution to return both translated and untranslated strings for a given locale, which is (due to the above) impossible without dummy entries in target table.
Agreed? Not agreed? Any ideas? Confirming the malfunctions described above? I would like to encourage more comments, as this is really serious in my opinion (given the Drupal 6 release cycle progressing).
Comment #7
meba commentedI am not currently able to review code, but i can confirm, that searching strings doesn't work...
Comment #8
JirkaRybka commentedThanks for comment. But please, don't overlook that more serious problems are now described in #2.
Comment #9
gábor hojtsyWe can move where conditions which filter out rows we need to the JOIN condition, so we get the NULL values properly. Try moving the offending WHERE parts to the JOIN and you will see magic :)
Comment #10
JirkaRybka commentedThanks for tip, I'm rather new to SQL. I'll try as soon as possible, and if this works, then provide a patch.
Comment #11
JirkaRybka commentedFirst: MANY MANY THANKS to Gábor for sending me to the right direction. The tip, however simple might look, is the heart of my new patch.
It took me quite a while, but finally I managed to fix all the above-described bugs, plus some more which I found on the way:
- Search in admin-localization edits for all the conditions
- Runtime string-registration
- (partially from the above originated) faulty locale caching
- Return string from locale()
New-discovered, and also fixed ones:
- Broken languages overview: If there was only one of installed locales enabled, the others disappeared from the list, giving no way to re-enable. Caused by misunderstanding in language_list(), where the count of enabled locales equal to 1 was treated as non-existence of whole localization, and therefore no locales-list returned (only current default language). I used
module_exists('locale')instead, which I believe is correct.- Long standing problem with translated-flags on strings in admin edit-strings interface, if there are more locales and search is performed for one of them: Limiting the query to only one language makes all other locales showing as non-translated "strike through" on all strings, which is not correct. I can't see any way around this, but in my opinion there's no point to show status of other locales if searching explicitly for just one - so I limited the flags only to the requested locale in this case. If searching for all languages, or for English (source strings), then flags are shown correctly for all locales just like before.
- Also changed option-label "All strings in that language" to just "All strings", because now (with no dummy entries in {locales_target} simply all languages have all strings, translated or not. Old title was a bit confusing I think.
- Some minor code clean-up...
I tested this patch over and over, both fresh install and upgrade with real localized-site data, no locale, one locale, two locales, editing strings and searching in all combinations, import/export/template export from/to .po files, observed effects in database... I believe that everything works fine now.
Patch attached!
Comment #12
JirkaRybka commentedAttaching new patch, almost identical to the previous one. Just one line added to cache-rebuild function, to avoid pointless caching of empty locale for English, which is never used (see the t() function, never calls locale() for 'en' language).
Still needs review!
Comment #13
meba commentedPatch did not applied smoothly (but HUNK succeeded), please use correct paths in filenames (drupal-6.x-dev/includes/...). But after patching, everything seems to be OK and bugs are fixed now.
I still think that Gábor needs to review this one...
Comment #14
JirkaRybka commentedThanks for comment. Now attaching a new patch, again almost identical:
- Fixed dir.name and outdated source-file, now applies smoothly (here at least, I'm still a bit unsure about the options of "path" tool)
- Added the very last fix, single line in locale install to add default for "textgroup" column, which we already did in update. Now consistent (install vs. update).
I'm done with my other work, which resulted in discovery of all these bugs, so I don't expect myself to find any more bugs now. Finished from me, unless someone thinks else.
Comment #15
gábor hojtsyMost of the patch look ok now, but I need some clarification:
- good catch with he language_list() bug, I noticed this happening, but did not know why
- in the locale_translate_overview_screen(), why don't you simply do an inner join... that should filter out the empty translations properly
- again, there if I understand correctly, $data->language will be empty if there are no translations yet for that language?
- "All strings in that language" was worded, so it does not give the impression, that you actually search across all *translations* of all strings... "All strings" might suggest that (although I admit that the language selector should stop people from thinking this)
- I think the limit language stuff is quite clever :)
Other parts I did not mention here seem to be quite fine with me. This is an enormous bug fix package, and I'll make sure to get it in before the beta early next week. Thanks a bunch!
Comment #16
JirkaRybka commentedOK, attaching improved patch with requested changes:
- INNER JOIN: Thanks for tip again, I overlooked this. Now changed to INNER, works fine and is more clean solution than before.
- $data->language condition was only just fixing the mess caused by LEFT JOIN, now removed, no more needed.
- "All strings" - OK, I see your point, but now I should explain also mine: The original title "All strings in that language" was extremely misleading to me, honestly, I never quite understood what it meant until examining the code directly. It gave an impression, that we're limiting the search either to all TRANSLATED, all UNTRANSLATED, or all strings OF A SPECIFIC LANGUAGE (which is mysterious, if all languages were selected above), leaving no option to show really ALL strings (i.e. if I want to limit my search by nothing else than the string-pattern, to be sure I didn't miss anything, then I somehow can't find my way through the selectors). Putting the ideas together, I now propose the wording "Both translated and untranslated strings", which matches its real function, and doesn't give false impressions hopefully.
Once this patch is in, I hope you can give a quick attention to http://drupal.org/node/171646, where I propose some performance improvement (patch now blocked by this one here), and I hope to have the thing briefly-considered at least, by someone competent, before the first beta rolls. Thanks in advance.
Comment #17
gábor hojtsyThanks, this now looks much better. Committed.
Comment #18
(not verified) commented