String location is a field barely used atm, only for keeping the list of strings that belong to JavaScript files but for that it's not specially easy either, since we need to do a lot of 'LIKE' searches and regexp strings.

We are fixing that field moving it to a different table which can hold different locations for a string, like now, but properly indexed by location type ('javascript', 'path'....) and location name (file name, path).

1. This will mean an important clean up of all the functions searching strings by location. Location conditions (type, name) which is atm only the JavaScript parser.
2. It will improve performance of the javascript translation regenerating code which currently needs to refresh everything evey time a new translation is added. We'll be able to do more selective update/refresh with this not needing to invalidate every js translation every time.
3. It will open up some more options, like tracking configuration names for strings belonging to configuration, which is needed for this (next step), #1648930: Introduce configuration schema and use for translation
4. On the way, we'll fix some minor issues like strings not being properly tagged with Drupal version when they're used with JavaScript translations.

And overall, we are building a cleaner/healthier Locale Storage API, mainly aiming at translating hardcoded configuration strings (yml files), but also helping other efforts like a more reliable and cleaner string translation during install, upgrades, etc... Related:

#1807272: Mixed locale issues with upgrade scripts, unit tests, 'Table simpletest..locales_source doesn't exist', etc...
#1813762: Introduce unified interfaces, use dependency injection for interface translation

... which eventually will allow us to fix the way the whole locale system is initialized using DIC...

#1806756: How to register storage controllers / 'target' database parameter (Looking for a common standard)

There are a lot of cirtular dependencies between these other issues so this looks like a good first step to clean up what we've got, which is some kind of 'dirty' location field, making it more fitted for what we are using it for atm (js translation) and also useful for the other issues. Other modules like l10n_client will also benefit of properly handled string locations.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Jose Reyero’s picture

Adding first patch

Jose Reyero’s picture

FileSize
978 bytes

Attaching an example po file which creates the locale source properly tagged for configuration. As sad as it may sound we only have one translatable string so far into config system: Site maintenance message.

0. Add spanish language.
1. Import po file
2. Translate string and see how configuration translation changes.

Jose Reyero’s picture

Updated patch fixing some issues with po import. Now config strings are updated properly upon po import.

Note that only non customized config strings are updated, as we check the po source string is the same as the current config string value.

webchick’s picture

Status: Active » Needs review

Marking needs review, since there's a patch.

Gábor Hojtsy’s picture

Very tricky reuse of context and location. I still need to think about whether this is great or just something that works :D As for the location, can we be sure that its unique for all modules core/contrib or at least relatively unique that the same string with the same location will not appear for different things?

As for a code review, the approach of saving and hooking into the locale system seems like done well. I'm not sure the config translation code altogether belongs to the locale module per say (now that we finally moved out everything that was not related to code based text translation). So while locale sounds like the most logical place for storing the community provided translations before it is being used, the actual translation application to config, etc. I don't think should be in locale. So most/all code in locale.config.inc would be in language_config or config_translation module or something along those lines. (We hope to rename locale to interface_translation or interface_language later as well).

We might need a small API addition to locale to add new source strings, so we don't directly mess with the DB in another module, but that should be relatively easy.

Jose Reyero’s picture

@Gábor

Very tricky reuse of context and location. I still need to think about whether this is great or just something that works :D As for the location, can we be sure that its unique for all modules core/contrib or at least relatively unique that the same string with the same location will not appear for different things?

It is just something that works :-D
For the context (='configuration') I think it may make sense, will keep these strings separated from the rest, it's nice to have it shown when you translate a string, etc..
The location hack, yeah, it's ugly, just a temporary thing till we add a proper field (both in the table and the PO). However it should be unique enough and work temporarily for a proof of concept...

We might need a small API addition to locale to add new source strings, so we don't directly mess with the DB in another module, but that should be relatively easy.

Right, I was about to ask you about that too. I think a LocaleSource string class may make sense here so we can pass these strings around easily and then add some API funcions there...

As for a code review,...

I think it should be in locale, besides as you say "being the most logical place...", I find it fits the module's purpose of translating hardcoded strings (only these are not in code by in yaml files which is a very small difference since we're actually moving default strings from code to config files)
It also makes a lot of sense from the workflow and UX point of view, so we don't push upon the users another different way to update string translations, we can build on our current po file workflow, etc..

So it seems we are more or less ok about translating "default configuration strings", aren't we? And then the question remains of whether we should do some kind of 'updating the locale source' when the default config string changes but I think we can postpone that question for later.
However we must be aware that once we have the locale-to-config-names mapping in place that would be a relatively small patch and somehow a logical follow up, mostly in order to not duplicating the translation interface. The current approach in i18n of "you must translate that string here, but that other string there because..." has been always a hard story to tell to end users.

I would like to build on this patch for now:
- Adding a proper config_name field and mapping in both locale source table and po file.
- Using it only for translating default config strings.

And then let the question open of what to do once we update that configuration string. Because if we ever get a proper UI in place for config translation it may also make sense keeping that strings in the db (so they work with the po workflow) but filtering them out in the locale translation UI. And if not it will be easy enough to handle it in contrib.

But in any case, however you look at it, this would mean a big improvement on the current D7 approach for hardcoded-but-configurable strings of "you get them translated on install if you're lucky".

Gábor Hojtsy’s picture

I agree this is a huge improvement over "you get them translated if you are lucky". Kudos for figuring this out :) Thanks for highlighting this is really just an extension of "built in UI translation" where parts of the UI is shipped as part of configuration.

We should indeed be careful to only translate the default configuration and not the config once edited.

Also, I hope we will reach some agreement on language marking for shipped configuration, so if you build a Spanish only site, you can ship Spanish default config with your custom modules without them being intermixed with English stuff as source strings on the translation system. A very basic system to mark a config file with a language inside the file would suffice for that, so we'd skip files marked with a language different from English.

And then the question remains of whether we should do some kind of 'updating the locale source' when the default config string changes but I think we can postpone that question for later.

Agreed on that.

However we must be aware that once we have the locale-to-config-names mapping in place that would be a relatively small patch and somehow a logical follow up, mostly in order to not duplicating the translation interface. The current approach in i18n of "you must translate that string here, but that other string there because..." has been always a hard story to tell to end users.

I don't think the locale system is a good place / UI for config translation en-masse for all runtime modified / created configuration at least for the English source string problem but also many other things that you worked around in i18n. I'd be inclined to postpone this problem to contrib instead of trying to ship with something very sub-par in core just to have some solution.

I would like to build on this patch for now:
- Adding a proper config_name field and mapping in both locale source table and po file.
- Using it only for translating default config strings.

That sounds good. I'm still not sure what should we do with the location field in D8. Localize.drupal.org does not provide any file/line location info anymore in the name or making files smaller and their usefulness almost close to 0, but the location field is still used for 2 things: (a) JS file name for JS strings (b) path for strings found dynamically in Drupal. The first one is obviously a requirement for JS translation file generation, and I'm not 100% sure about the usefullness of (b) although the l10n_client module relies on it a bit... it could also just keep that info in itself (and have a much richer set of data of multiple places a string is used to begin with). So I think it sounds like it would make sense to have per-use-case specific location information properties for JS and config separately. Then instead of needing to pattern match when we look for strings, we can just filter on the proper columns.

And then let the question open of what to do once we update that configuration string. Because if we ever get a proper UI in place for config translation it may also make sense keeping that strings in the db (so they work with the po workflow) but filtering them out in the locale translation UI. And if not it will be easy enough to handle it in contrib.

Indeed, let's do that one later (or I think more likely in contrib).

Jose Reyero’s picture

Gábor Hojtsy’s picture

Priority: Normal » Major

Elevating to major.

Jose Reyero’s picture

Same concept but fully reworked implementation:
- Moved locations to new table, locales_location (type, name, version). The 'name' will map to each of the files contained in old 'location'.
- Locations have a many-to-one relationship with strings, or in other words, a string can have any number of 'locations'. Each file will be a different location.
- Added methods to StringInterface: hasLocation(), getLocation(), addLocation()
- Reworked code using old locations accordingly. It was mostly the javascript part making some use of locations.

Tip: Location is now handled internally by string objects and storage and from outside becomes a kind of 'tagging' system on which we can add/check for locations on a string, also use 'type' and 'name' as parameters for string search but very little more.

Usage examples from locale.module:

      // Handle string locations.
      if (!$source->hasLocation('javascript', $filepath)) {
        $source->addLocation('javascript', $filepath);
        $source->save();
      }
     // Get all strings with a given location type (javascript).
     $strings = locale_storage()->getTranslations(array('language' => $language->langcode, 'type' => 'javascript'));

Locations and configuration:
For strings coming from configuration files, location type will be 'configuration' and location name will be the configuration name. This will allow multiple mappings like getting the strings for a configuration object or knowing which configuration object we need to update after updating a translation.
There's some API in place for localization / configuration operations though not yet used.
- See locale_config() function, intended for translation of configuration strings.
- See locale.config.inc, intended for keeping in sync modules/configuration/translations
See #1648930: Introduce configuration schema and use for translation

Note: If giving the patch a try, instead of reinstalling Drupal, you can uncomment the db update code in locale.install, update 8015, and run that one.

Questions:
- When doing the table update, can we just drop old location data? (it's suppossed to happen on D8 upgrade so D7 locations are very little use here). There's some commented out code that would try to update some locations.
- How to better move on with this: Should we leave here only the 'location' updates and create a third patch that will be the one working with the config metadata?

Next steps:
- Some updates for configuration metadata patch needed (mostly to work with default configuration for which current config objects are not very reusable), coming next.
- Fill in the workflow implementing hooks for module enable/disable, language enable. Improve the po file import reports so they provide a list of updated lids and we can use it for updating the configuration translation. see locale_config_update_translations() ready for it.

Status: Needs review » Needs work

The last submitted patch, locale_config_workflow-1777070-10.patch, failed testing.

Jose Reyero’s picture

Title: Translation workflow for configuration strings » Translation workflow for configuration strings (better locations)
Status: Needs work » Needs review
FileSize
25.65 KB

Updated patch which includes:
- Rebase for latest updates to locale module from #1785086: Introduce a generic API for interface translation strings
- Moved the locale.update.inc API to the related patch in #1648930: Introduce configuration schema and use for translation (coming in next iteration).
- Added better reporting to string import (get all updated lids) so we can do more selective cache updates. Though this is just a performance improvement for updating javascript strings (only when actually updating js strings we flush the caches), this is required for upcoming configuration strings (we cannot just update every config object on each translation update).

So after thinking about it, this patch is focusing now only on improving string locations and making some good use of it for javascript translations.
The part of the of the 'config translation workflow', that will work independently of this, will be worked out in the metadata issue.

The reason of this is having two cleaner independent patches that are meaningful (and work) by themselves. If twe get both of them in, the only remaining functional gap will be 'updating translated configuration after updating translations' (Maybe wait for next iteration of the metadata patch to see what I mean...)

Gábor Hojtsy’s picture

Status: Needs review » Needs work

Not sure about the repurposing of this issue for this task, but it does not directly relate to config system anymore (even though it is a requirement for the config translatability introduction). Also not sure the existing metadata issue should have even more code to review moved from here, it is going relatively slow already :/ Concrete feedback on the issue:

+++ b/core/includes/update.incundefined
@@ -412,6 +412,52 @@ function update_prepare_d8_language() {
+          'locid' => array(
+            'type' => 'serial',
+            'not null' => TRUE,
+            'description' => 'Unique identifier of this location.',
+          ),
+          'lid' => array(
+            'type' => 'int',
+            'not null' => TRUE,
+            'description' => 'Unique identifier of this string.',
+          ),

Ideally the string would have a sid, the location would have a lid. Not sure how much change would that mean, but "locid" type of columns are not very standard.

+++ b/core/includes/update.incundefined
@@ -412,6 +412,52 @@ function update_prepare_d8_language() {
+            'description' => 'Drupal path in case of online discovered translations or file path in case of imported strings.',

I'd lead this with "Type dependent location information. Eg. Drupal path..."

+++ b/core/includes/update.incundefined
@@ -412,6 +412,52 @@ function update_prepare_d8_language() {
+      // Drop old locales_source.location field.
+      db_drop_field('locales_source', 'location');

Where is the data migration happening?

+++ b/core/modules/locale/lib/Drupal/locale/StringDatabaseStorage.phpundefined
@@ -86,6 +86,21 @@ public function findTranslation(array $conditions) {
+   *   Array of locations indexed by locale indentifier, type

identifier, type?!

+++ b/core/modules/locale/lib/Drupal/locale/StringInterface.phpundefined
@@ -158,6 +158,53 @@ public function setValues(array $values, $override = TRUE);
+   * information about the procedence of the string, like the file name it
+   * was found on, the path on which it was discovered, etc...

procedence?!

+++ b/core/modules/locale/lib/Drupal/locale/StringInterface.phpundefined
@@ -158,6 +158,53 @@ public function setValues(array $values, $override = TRUE);
+   * @param bool $load
+   *   (optional) Whether to load the string locations if not loaded yet.
+   *   Defaults to TRUE.

What would getLocation() do if $load = FALSE? Is it not useless in that case?

+++ b/core/modules/locale/lib/Drupal/locale/StringInterface.phpundefined
@@ -158,6 +158,53 @@ public function setValues(array $values, $override = TRUE);
+   * @param string $type
+   *   Location type: 'javascript', 'path', 'configuration'...

Should we have constant for these?

+++ b/core/modules/locale/locale.bulk.incundefined
@@ -532,7 +536,11 @@ function locale_translate_batch_finished($success, $results) {
+      // Eliminage duplicates from string list and language codes.

Eliminage :) - the sentence is not clear even if the typo is fixed. What kind of duplicates are removed? What does "and language codes" mean?

+++ b/core/modules/locale/locale.installundefined
@@ -741,6 +777,46 @@ function locale_update_8013() {
+  //$table = drupal_get_schema('locales_location', TRUE);
+  //db_create_table('locales_location', $table);
+  // Copy only js file names which are the only ones that we are using atm.
+  /*
+  $result = db_select('locales_source', 's')

Needs more work. Also would not have any effect here since the location data is dropped earlier in the update. I think this migration code should be there too.

Gábor Hojtsy’s picture

Title: Translation workflow for configuration strings (better locations) » Refactor and clean up source string location handling

I did not say but I like the concept very much and I was always at a discomfort about the reuse/overload of the location field, which this patch nicely cleans up. Also needed for CMI pre-translation introduction :)

Jose Reyero’s picture

@Gábor Hojtsy,
- Ok with the title changing / repurposing.
- Data migration is commented out in the code, possibly it should go in install.core.inc
- getLocation(FALSE) is used by the StringDatabaseStorage to check whether the string has locations to save without loading them if not. The problem of the storage is that it deals with 'StringInterface' objects so it cannot access the properties directly.
- Constants, it would be fine either way, each type (javascript, etc..) is only used by each subsystem and it should be extendable so as long as we don't add the constants in the base classes, that's ok.

One more possible feature: Since locations have also versions (to map on which file of which version the string was found) we could maybe use it for updating the main string version in locales_source, so we could drop StorageInterface::checkVersion, and handle they internally in the storage. (I.e. Storage could check itself the version of every translation loaded through findTranslation() method )

Jose Reyero’s picture

Status: Needs work » Needs review
FileSize
25.13 KB
33.52 KB

Next version, which includes:
- All code style issues pointed out by @Gábor in #13
- Renames 'locid' to 'lid' (I would postpone renaming existing 'lid' to 'sid' because that's going to be an easy patch but with a lot of lines. Anyway, there's nothing wrong with two tables using the same field name...)
- Adds migration script, and some (minimal) test for it in the Language Upgrade system test. Also it adds missing foreign keys in the table.
- Moves version handling into locale storage, assuming every time we call findTranslation() we need to check latest version, and every time we add a location too. This means checkVersion() is not an interface method anymore (simplify), and there are other minor related changes in the code.
- Fixes a related pre-existent issue which was javascript translation not updating versions of strings.

Attached patch and inter-diff.

Gábor Hojtsy’s picture

I don't think it is rerasonable to expect that all lookups would give a location, most strings a standalone in core and come with no specific location. Maybe that was an incorrect statement about the code, as I don't see how that could work.

Gábor Hojtsy’s picture

Status: Needs review » Needs work

This is shaping up really well. I think most of my comments are relatively easy to fix :)

+++ b/core/includes/update.incundefined
@@ -412,6 +412,99 @@ function update_prepare_d8_language() {
+      // @todo Rename 'lid' field to 'sid'.

Not the lid field in the table schema that is coming, so I think this todo is confusing. I'd drop it or move it to the foreign key where it is apparent.

+++ b/core/includes/update.incundefined
@@ -412,6 +412,99 @@ function update_prepare_d8_language() {
+            'name' => array(
+              'type' => 'varchar',
+              'length' => 255,
+              'not null' => TRUE,
+              'default' => '',
+              'description' => 'Type dependent location information (file name, path, etc).',

I'm not very happy about naming this 'name'. I came back here from the query builder where it said adding a location subquery if a 'type' or 'name' keys existed, which was pretty puzzling.

I understand the location is a combination of type and name, and don't really have a better idea for 'name' except maybe 'value' which is an sql protected name definitely (if not 'name' :).

So it looks understandable after some digging, not ideal but I can live with it :)

+++ b/core/includes/update.incundefined
@@ -412,6 +412,99 @@ function update_prepare_d8_language() {
+        $result = db_select('locales_source', 's')
+          ->fields('s', array('lid', 'location'))
+          ->condition('s.location', NULL, 'IS NOT NULL')
+          ->condition('s.location', '', '<>')
+          ->execute();
+        while ($string = $result->fetchObject()) {

For a lot of strings, this could take considerable time. I don't know how much, maybe not that much, but sounds like to save on this one, we might be better off to just grab the .js locations and process those, which should be minimal. Sites nowadays import translations from localize.drupal.org, so code based information is not available for them (its not in the .po file anymore to limit size). So that is unlikely to be there. Path information is not very useful unless you run l10n_client and happen to stumble on the right page. So looks like sites who don't import from localize.drupal.org this might be a very slow update-start off experience, and I'd rather we not keep the data we never use anyway and focus on the .js data only.

+++ b/core/includes/update.incundefined
@@ -412,6 +412,99 @@ function update_prepare_d8_language() {
+          $locations = preg_split('~\s*;\s*~', $string->location);

Would be good to note why are you doing this here (in a comment). Is this about JS locations providing multiple files? When is the ; separator used? Is this applicable for when the location is .module for example? Are string with this separator saved without whitespace (looks like the regular expression would not necessarily match all whitespace before/after)?

+++ b/core/modules/locale/lib/Drupal/locale/StringStorageInterface.phpundefined
@@ -69,6 +87,9 @@ public function findString(array $conditions);
+   * This function must only be used when actually translating strings as it
+   * will have the effect of updating the string version.
+   *

What should be used otherwise?

+++ b/core/modules/locale/lib/Drupal/locale/Tests/LocaleJavascriptTranslation.phpundefined
@@ -37,12 +37,9 @@ function testFileParsing() {
-    $source_strings = db_select('locales_source', 's')
-      ->fields('s', array('source', 'context'))
-      ->condition('s.location', $filename)
-      ->execute()
-      ->fetchAllKeyed();
-
+    foreach (locale_storage()->getStrings(array('name' => $filename)) as $string) {
+      $source_strings[$string->source] = $string->context;
+    }

Looks like a great update, however should we not provide JS as the location type?

+++ b/core/modules/locale/locale.bulk.incundefined
@@ -548,7 +556,7 @@ function locale_translate_batch_finished($success, $results) {
-    _locale_invalidate_js();
+    _locale_refresh_translations($langcodes, $strings);

Looks like a superb performance improvement, however the name of the function does not have anything to do now with JS but it exclusively works with JS only.

+++ b/core/modules/locale/locale.moduleundefined
@@ -771,8 +771,35 @@ function locale_string_is_safe($string) {
+    // @todo Update translations for configuration strings too.

Those do not exist yet, so I don't think this todo has anything to do here.

+++ b/core/modules/locale/locale.moduleundefined
@@ -771,8 +771,35 @@ function locale_string_is_safe($string) {
+ *
+ * Note this function must be called one final time with a FALSE argument to save
+ * pending updates.
+ *

Erm, what? This sounds puzzling to me.

+++ b/core/modules/locale/locale.pages.incundefined
@@ -403,10 +398,16 @@ function locale_translate_edit_form_validate($form, &$form_state) {
+  // Preload all translations, faster and also not to use findTranslation().

"also not to use..."? should be fixed for grammer + why not? I guess because of the version update, but that should be documented.

Jose Reyero’s picture

Status: Needs work » Needs review
FileSize
8.04 KB
33.61 KB

Wow, that was an in-depth review.

@Gábor Hojtsy,
New patch addressing all the issues in #18, some notes:
- About the update, yes, it was taking some time that I don't think happens now (with only js strings). Anyway if later we find that's still a problem (how many js strings?) we can rename the field in update.inc and move the migration + dropping the renamed field to a regular update in install.inc.
Anyway dropping other locations looks like a possitive clean up to me.
- About location separator, it seems to be '; ' from old code, but I also copied the regexp from locale own code so it should work :-)
- That puzzling comment was a left over from previous patches.
- About _locale_refresh_translations() repurposed it a little bit (does the cache refresh too, which seems silly not to) so I guess it's clearer now. Also I've added some consistency in the comments.
- About #17, what i meant is: every time you add a location to the string, we tag it for current Drupal version. (the StringStorage does it).
- Names for location type, name? I don't have a better idea either. Specially since I'm planning on using that 'name' field for the configuration name too :-) But if the problem is with the conditions passed, we could think of 'location.type', 'location.name' for future locale storage clean ups. Since we've got the thing quite abstracted, search conditions don't need to match field names anymore, which are storage-internal-only.

Gábor Hojtsy’s picture

Status: Needs review » Needs work

Only one minor thing found in the update: :)

+++ b/core/modules/locale/lib/Drupal/locale/StringDatabaseStorage.phpundefined
@@ -430,7 +430,8 @@ protected function dbStringSelect(array $conditions, array $options = array()) {
+    // If we have conditins for location's type or name, then we need the
+    // location table, for which we add a subquery.

conditins => conditions

Jose Reyero’s picture

Status: Needs work » Needs review
FileSize
33.61 KB

Let's try fixing the typo manually :-)

Gábor Hojtsy’s picture

Status: Needs review » Reviewed & tested by the community

Looks good to me now.

Gábor Hojtsy’s picture

Issue summary: View changes

Updated issue summary.

Status: Reviewed & tested by the community » Needs work
Issue tags: -Configuration system, -D8MI, -sprint, -language-config

The last submitted patch, locale_config_workflow-1777070-20.patch, failed testing.

Gábor Hojtsy’s picture

Status: Needs work » Needs review
Issue tags: +Configuration system, +D8MI, +sprint, +language-config
Jose Reyero’s picture

Status: Needs review » Reviewed & tested by the community

So it was a testbot issue, moving back to RTBC

webchick’s picture

Status: Reviewed & tested by the community » Needs work

I ran out of steam to do a full review on this tonight, but here's a commit-blocker:

+++ b/core/includes/update.incundefined
@@ -412,6 +412,79 @@ function update_prepare_d8_language() {
+        // Drop old locales_source.location field.
+        db_drop_field('locales_source', 'location');

This is a no-no to do in update_prepare_d8* functions. Those should be harmless to the parent site; we can't drop database tables/fields here.

Gábor Hojtsy’s picture

Status: Needs work » Reviewed & tested by the community

@webchick: that is an interesting comment. We are currently (before patch) drop 7 fields in the same function, and we introduced those field drops with multiple patches: http://api.drupal.org/api/drupal/core%21includes%21update.inc/function/u...

Given language is so core to boostrapping Drupal, it is important that the language schema is in the form expected by the runtime system, so not seeing how we could do this clearly later (eg. without bugos data being loaded/cached and us needing to prepare the code of unexpected fields / field name collisions, etc. to possibly appear due to the temporary state in the update).

Jose Reyero’s picture

@webchick,
Would it be ok renaming it (to location_old), and then dropping it in a regular update later?

The problem is if we keep the field with the same name, then the upgrade process crashes later as soon as it attempts to read a translation. (For which we could add workarounds in the code itselt, but it doesn't make sense to me adding workarounds for obsoleted fields ?).

Also the data migration needs to be there because otherwise that data is rebuilt during core upgrade and we lose that information. Though it's old string files/version so that wouldn't be a catastrophe either, just we cannot postpone the data migration because doing it later may overwrite newer data.

Btw, just in case, we've been doing some research about why the upgrade process is so sensitive to changes in the language system, #1807272: Mixed locale issues with upgrade scripts, unit tests, 'Table simpletest..locales_source doesn't exist', etc...
(Which will take a few more "rework" patches to fix, then maybe we can move stuff from this upgrade.inc to regular locale upgrades)

podarok’s picture

#21 looks good for me
+1 RTBC

Jose Reyero’s picture

Status: Reviewed & tested by the community » Needs work

It seems, from a talk with @Gábor Hojtsy and @webchick on IRC, we need to avoid dropping fields in update.inc so this is the plan:
- Rename $string->location to $string->locations (locationS)
- Maybe getLocation() to getLocations() too for it to be consistent.
- We move the code for dropping the old 'location' field to a regular update in locale.install
(only that part, the data migration needs to be in update.inc, reasons explained in #28)

Everything should run fine and pass tests with these changes.

webchick’s picture

Category: feature » task

Yeah, created #1816864: update_prepare_d8_bootstrap() makes irreversible changes to D7 databases, so update UPGRADE.txt accordingly to track this at a larger meta-level.

Thank you, Jose!

Also, from the looks of things, this is more of a refactoring task than a new feature so I am re-categorizing accordingly so this is not blocked by thresholds.

Gábor Hojtsy’s picture

Status: Needs work » Reviewed & tested by the community

@reyero: @webchick's point was that no irreversible schema changes should be done in the update. Not sure where would irreversible starts. If we rename a field? If we create a new table and copy the field data over? Both would need manual DB adjustments to turn back on just to different degrees.

However, consensus on #1816864: update_prepare_d8_bootstrap() makes irreversible changes to D7 databases, so update UPGRADE.txt accordingly so far is that this is not in fact a problem (should be documented rather) and no other concerns were posted here, so moving back to RTBC.

Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
33.7 KB
2.97 KB

Upon more discussion with @webchick in IRC I implemented these two changes mentioned above by Jose. It should be back to RTBC when passes (given the update related concerns are not anymore valid).

- Rename $string->location to $string->locations (locationS)
- Maybe getLocation() to getLocations() too for it to be consistent.

catch’s picture

+++ b/core/modules/locale/lib/Drupal/locale/StringBase.phpundefined
@@ -152,6 +152,35 @@ public function getValues(array $fields) {
+  public function getLocations($load = TRUE) {
+    if ($load && !isset($this->locations)) {
+      $this->locations = array();
+      foreach ($this->getStorage()->getLocations(array('sid' => $this->getId())) as $location) {
+        $this->locations[$location->type][$location->name] = $location->lid;
+      }
+    }
+    return $this->locations;

When would you ever want to call this method with $load = FALSE while $this->locations isn't set?

+++ b/core/modules/locale/lib/Drupal/locale/StringInterface.phpundefined
@@ -158,6 +158,54 @@ public function setValues(array $values, $override = TRUE);
+   * @param bool $load
+   *   (optional) Whether to load the string locations if not loaded yet.

This also doesn't explain why you'd ever set this to false.

+++ b/core/modules/locale/locale.installundefined
@@ -126,6 +120,54 @@ function locale_schema() {
+      'version' => array(
+        'type' => 'varchar',
+        'length' => 20,
+        'not null' => TRUE,
+        'default' => 'none',
+        'description' => 'Version of Drupal, where the location was found (for locales optimization).',

This is no longer really for optimization, since the locale caching strategy has changed. It's still for (potential) garbage collection of old strings though.

Otherwise looks good to me.

Gábor Hojtsy’s picture

@catch: about $load = FALSE, I had the same question above and it was discussed between #13-15. The answer from Jose was:

getLocation(FALSE) is used by the StringDatabaseStorage to check whether the string has locations to save without loading them if not. The problem of the storage is that it deals with 'StringInterface' objects so it cannot access the properties directly.

In other words the 'locations' in a String stores possibly yet unsaved locations. The updateLocation() in StringStorage calls out to ask for these. If it does not have any unsaved changes, this makes it work quicker. Otherwise it would always load all locations when attempting to update locations even if there are no changes. So it would exercise the storage and then iterate through locations that it just got from storage so none need to be updated anyway. This is what is skipped.

Do you think this optimisation should be removed or have a better suggestion for the argument name / description? :) As seen above, I also stumbled onto it and took some time to realize what it is for :/

Jose Reyero’s picture

Just two notes:
- About the location version, yes that is for garbage collection, but this garbage collection means really locales optimization, since we do load all strings with 'javascript' locations for building the xxx.lang.js file (this behavior is just the current one, may be fixed on future optimization patches if we get this feature in).
- About migrating old js locations from Drupal 7, thinking more about that, we should drop that part, since we'll need to rebuild anyway all js translation files for Drupal 8 it doesn't make sense to keep old string locations here (because we don't want obsoleted strings in the lang.js files, they should be as small as possible).

Gábor Hojtsy’s picture

Status: Needs review » Needs work

That makes sense indeed. Marking for needs work for that then.

catch’s picture

In reply to #35 - what about either 2 methods, or changing the argument to $check_only?

Jose Reyero’s picture

Status: Needs work » Needs review
FileSize
10.41 KB
32.59 KB

Implemented suggestions from #34 to #38, summary:
- Changed parameter to getLocations($check_only), updated descriptions, variable names to be consistent ($location -> $locations), etc...
- Removed the script for upgrading old locations and fixed related test (#36, #37)

Gábor Hojtsy’s picture

Status: Needs review » Reviewed & tested by the community

Looks good to me. Now even invasive updates are pushed to an update function, so resolved @catch's and @webchick's concerns.

Gábor Hojtsy’s picture

catch’s picture

Status: Reviewed & tested by the community » Fixed

Committed/pushed. fwiw completely agree with removing the update for locations.

Gábor Hojtsy’s picture

Issue tags: -sprint

Woot, thanks!

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Gábor Hojtsy’s picture

This introduced a bug in sqlite due to the selective JS cache refresh based on string IDs (when there are too many strings affected). A tiny part of this patch is suggested to be rolled back to proceed at #1884182: Import of large number of translations fails on SQLite. That bug blocks the commit of #1848490: Import translations automatically during installation, so it would be good to resolve quickly :)

Gábor Hojtsy’s picture

Issue summary: View changes

Updating description since patch has been repurposed.