Problem/Motivation

Prior to Drupal 7.7, core initialized the field 'translatable' property inconsistently; see #1164852: Inconsistencies in field language handling for details. The solution outlined there was committed without the related update, since it was not possible to provide test coverage for it. We still need an update ensuring that D7 sites created before D7.7 also get consistent behavior.

Proposed resolution

Quoting from the parent issue summary:

> The tricky part is how to deal with existing sites: since translatability can be changed in any moment, originally we agreed that only sites that never dealt with field languages should get a fix, this means sites with no translation handler enabled and no field value in the database with a language assigned. This could be checked in a field-storage-safe fashion through EntityFieldQuery. However since update functions cannot rely on entity information (#1199946: Disabled modules are broken beyond repair so the "disable" functionality needs to be removed) and they cannot cause hooks to be invoked, there is no way to implement a perfectly working update function.

We decided to fall back to an update function that only checks whether fields have any language associated, in which case no change to the field's translatability should be performed.
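The fallback described above could look roughly like the following. This is a minimal sketch, not the patch in #1: the function name `example_update_7001()` is hypothetical, and the sketch assumes SQL field storage with the default `field_data_FIELDNAME` table naming. It reads and writes the `field_config` table directly so that no hooks are invoked.

```php
<?php
/**
 * Hypothetical sketch: mark fields as not translatable unless they already
 * hold values with a language assigned. Assumes SQL field storage.
 */
function example_update_7001() {
  // Read raw field records; update functions cannot rely on entity info.
  $fields = db_select('field_config', 'fc')
    ->fields('fc', array('id', 'field_name'))
    ->condition('translatable', 1)
    ->condition('deleted', 0)
    ->condition('storage_type', 'field_sql_storage')
    ->execute()
    ->fetchAllKeyed();

  foreach ($fields as $id => $field_name) {
    // Assumption: default table name used by SQL storage for active fields.
    $table = 'field_data_' . $field_name;
    if (!db_table_exists($table)) {
      continue;
    }
    // Look for at least one value stored under a real language.
    $found = db_select($table, 'fd')
      ->fields('fd', array('entity_id'))
      ->condition('language', LANGUAGE_NONE, '<>')
      ->range(0, 1)
      ->execute()
      ->fetchField();
    // Only touch fields that never had a language assigned.
    if ($found === FALSE) {
      db_update('field_config')
        ->fields(array('translatable' => 0))
        ->condition('id', $id)
        ->execute();
    }
  }
}
```

The sketch checks only the current-data table; a more careful version would probably inspect the revision table as well before changing a field.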

Remaining tasks

The update code is ready but we need test coverage for it, which is not possible to provide at the moment, since we have no support for testing minor updates. #1182296: Add tests for 7.0->7.x upgrade path (random test failures) is postponed to #1182290: Add boilerplate upgrade path tests.

Comment    File                  Size       Author
#1         tf-1266430-1.patch    2.29 KB    plach

Comments

plach’s picture

Rerolled the patch posted by @sun in #1164852-70: Inconsistencies in field language handling.

plach’s picture

Issue tags: +Needs tests

tagging

plach’s picture

Issue summary: View changes

Minor cleanup

tdurocher’s picture

Have searched mightily and this is the closest I have found to my issue. I would say that my issue is probably buried somewhere in the parent of this issue except that is supposed to be fixed in 7.8, which I am now running. After all this I see that 7.9 is out, but don't see my issue in the list of those fixed anyway.

What I am seeing is that enabling the locale module and adding a language (Spanish) changes all existing node.language database fields (all nodes) to 'en'. This may be okay in itself, though it is unexpected as the UI says that adding a language does not affect existing content. But the real problem is that the field_data_body.language is NOT set to 'en' but rather left at 'und'. This causes the body field to be blank when trying to edit the node. If I save the node like this, the associated field_data_body is actually deleted. If I set all node.language back to 'und', the problem goes away. Turning off locale then also removes the problem, even though it does not set the node.language back to 'und'. I guess node.language isn't checked anymore.

My feeling is that adding a language should leave existing content as 'und', which is nicely associated in the switcher with "Language Neutral". In any case, nodes must not become uneditable simply because a language was added.

Also, could someone knowledgeable let me know if setting all node.language to 'und' is a safe workaround? Or is there a better one? Messing with the database is not the kind of workaround I like to use, due to unknown associations, etc. Also keeping in mind that I don't know what language modules I may end up with besides Locale and Content Translate. Thanks.

plach’s picture

> What I am seeing is that enabling the locale module and adding a language (Spanish) changes all existing node.language database fields (all nodes) to 'en'.

There is nothing in core that switches node languages upon Locale activation. You should check if you have some contributed module that's doing it.

> But the real problem is that the field_data_body.language is NOT set to 'en' but rather left at 'und'. This causes the body field to be blank when trying to edit the node. If I save the node like this, the associated field_data_body is actually deleted.

This is not how core is supposed to work, neither before the bug fix cited in the OP nor after it. The current intended core behavior is: when you enable Locale, nodes might or might not get a language assigned, depending on your configuration and whether you installed the i18n module, which assigns no language instead of the default language to nodes with multilingual support disabled.

Node bodies are now always untranslatable, so they are always assigned LANGUAGE_NONE ('und'), no matter which language the node has. Core should be able to handle this scenario and not lose node bodies. In any case they should not be lost: they should still be present in the DB, although inaccessible.

Check your field_config table and see if node bodies are set to translatable. You may want to try the unofficial update scripts you can find in http://drupal.org/node/1164852#comment-4761416 to see if they fix your site. Back up your site before using them! See also the D7.8 release notes for details.
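One way to run that check from inside Drupal (for example through the devel module's PHP execution page) is sketched below. It only reads the `field_config` table and prints each active field with its translatable flag; the output format is purely illustrative.

```php
<?php
// Read-only check: list every non-deleted field and whether it is
// currently flagged as translatable in {field_config}.
$result = db_query('SELECT field_name, type, translatable FROM {field_config} WHERE deleted = 0 ORDER BY field_name');
foreach ($result as $row) {
  $status = $row->translatable ? 'translatable' : 'not translatable';
  print $row->field_name . ' (' . $row->type . '): ' . $status . "\n";
}
```

A 'body' row reported as translatable on a site that does not use any field translation module is the situation discussed in this issue.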

tdurocher’s picture

Thanks for your response, @plach. I've seen that you are one of the heavy hitters in this area. I appreciate it.

My field_config does indeed show the 'body' field as translatable (==1). In fact, the 2 or 3 nodes I have translated so far did have their field_data_body.language set to 'es' or 'en' during the save.

It is good to hear that core should not be setting node.language upon enabling locale and content translate, or assigning a new language to locale or enabling language on content type. Unfortunately, I saw it on all 3 copies of this site (7.4, 7.8, 7.8). And the saved empty node bodies are not in the table anywhere. They are deleted (I guess an empty data body is considered no data body).

Regarding other modules I have no other language modules enabled. I do have entity_types and entity_tokens, ckeditor, date, and several others but don't think the others would be involved in editing page nodes. But these guys would have no business setting node.language either, right?

I will take a look at your scripts but there is a terrible bug occurring here, though possibly only to me.... I'm thinking I need to write a critical issue on this. Currently, I appear to be doing alright with my workaround of manually setting all node.language back to 'und'. Would you consider this safe to do? It looks good so far but I can see that I'm only at the beginning of making a fully multilingual site.

plach’s picture

> Regarding other modules I have no other language modules enabled. I do have entity_types and entity_tokens, ckeditor, date, and several others but don't think the others would be involved in editing page nodes. But these guys would have no business setting node.language either, right?

No idea of what could be causing this behavior. There is no module I'm aware of (but it might exist of course :) that changes node languages automatically.

> I will take a look at your scripts but there is a terrible bug occurring here, though possibly only to me.... I'm thinking I need to write a critical issue on this.

Well, if this bug can be reproduced outside your environment and it turns out to be a core bug, it would be difficult to imagine a more critical one. Be sure to investigate it thoroughly and if you decide to submit a bug report against core you should provide enough information to reproduce it on a clean installation.

> Currently, I appear to be doing alright with my workaround of manually setting all node.language back to 'und'. Would you consider this safe to do? It looks good so far but I can see that I'm only at the beginning of making a fully multi-lingual site.

Surely resetting your field_config table to have all fields non translatable, through the scripts cited above, should fix the problem of the disappearing node bodies, since node language would not matter anymore. For the rest it's hard to tell without any idea of what's going on :)

tdurocher’s picture

Well, looking at your scripts, the first one seems to avoid doing anything if any fields have a language set. Since I have fields with a language set (I already translated a couple of node bodies), then this script won't do anything. For the same reasons as mentioned in the script comments and in the 7.8 release notes, I am hesitant to run the "brute force" 2nd script. Those comments do not give a warm fuzzy feeling regarding setting field_translatable always to false.

I guess when I get a chance I'll try to reproduce this on a clean install. If I can do it we'll have a critical issue for somebody to work on.

effulgentsia’s picture

#1 takes the approach of not touching fields that already have language-specific content, which is the safer thing to do in a core update function. For anyone who wants to set their fields to not translatable, even if they already have language-specific content, here's a snippet you can use in a custom update function or any other place where you can execute a PHP snippet within Drupal (e.g., the devel module).

// Set every field to not translatable, since that is the default as of
// Drupal 7.7.
foreach (field_read_fields(array('translatable' => 1), array('include_inactive' => TRUE)) as $field) {
  $field['translatable'] = FALSE;
  field_update_field($field);
  // If the locale module is enabled, then it's been negotiating between the
  // entity language and LANGUAGE_NONE. Once the field is no longer
  // translatable, only LANGUAGE_NONE data is used. For entities containing
  // field data in a single language, move that to LANGUAGE_NONE. Leave
  // entities containing field data in multiple languages (or a language and
  // LANGUAGE_NONE) alone. Those need to be dealt with manually.
  if (module_exists('locale')) {
    foreach (array(_field_sql_storage_tablename($field), _field_sql_storage_revision_tablename($field)) as $table) {
      if (db_table_exists($table)) {
        $ambiguous_entity_ids = db_select($table, 'fd')
          ->fields('fd', array('entity_id'))
          ->groupBy('entity_id')
          ->having('COUNT(DISTINCT language) > 1')
          ->execute()
          ->fetchCol();

        $query = db_update($table)
          ->fields(array('language' => LANGUAGE_NONE))
          ->condition('language', LANGUAGE_NONE, '!=');
        if ($ambiguous_entity_ids) {
          $query->condition('entity_id', $ambiguous_entity_ids, 'NOT IN');
          // Make sure we know about fields that need manual inspection.
          watchdog('field', 'Field %field has been changed to not translatable, but has language-specific content in table %table for entities %ids. That content will no longer appear on this site.', array(
            '%field' => $field['field_name'],
            '%table' => $table,
            '%ids' => implode(',', $ambiguous_entity_ids),
          ));
        }
        $query->execute();
      }
    }
  }
}

@tdurocher: this might help you with your already translated body field, but note that it only moves records from language-specific to LANGUAGE_NONE for entities containing that data in a single language. You may have some entities reported in watchdog() after running this that require you to manually inspect their data records and decide what to do.
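For the manual inspection mentioned above, something like the following could list the stored languages of the affected entities. It is a read-only sketch that assumes SQL field storage and the standard `field_data_body` table; swap in another field's table name as needed.

```php
<?php
// Assumption: the body field lives in the default SQL storage table.
$table = 'field_data_body';

// Entity IDs whose body values exist in more than one language
// (or in a language plus LANGUAGE_NONE).
$ids = db_select($table, 'fd')
  ->fields('fd', array('entity_id'))
  ->groupBy('entity_id')
  ->having('COUNT(DISTINCT language) > 1')
  ->execute()
  ->fetchCol();

if ($ids) {
  // Show each stored language for those entities so they can be reviewed.
  $rows = db_select($table, 'fd')
    ->fields('fd', array('entity_type', 'entity_id', 'language', 'delta'))
    ->condition('entity_id', $ids, 'IN')
    ->orderBy('entity_id')
    ->orderBy('language')
    ->execute();
  foreach ($rows as $row) {
    print "$row->entity_type $row->entity_id: language $row->language, delta $row->delta\n";
  }
}
```

Nothing is modified here; deciding which language's value to keep for each entity remains a manual step.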

No one should run the above snippet if you intentionally have some fields set to translatable. This is only applicable to people who had D7 sites prior to 7.7, use the locale module, do not use any field translation module, and want all their old fields changed to non-translatable to be consistent with the new core decision to default all fields that way.

Note that the above also assumes that all field data is stored in a SQL database, so do not run the above if you're using MongoDB or other NoSQL databases.

Apologies if this comment derails this issue. #1 is still the patch to review in terms of anything that can go into core.

tdurocher’s picture

Thanks. My problem did recur when I enabled field translation for my Page content type. If a language was selected (and the client selected English, even though they could have kept it language-neutral), on their next edit the body field would not show up in the body text area. My solution was to set node.translation, field_body_data and field_body_revision to language-none. I think if all of them were set to English it would also be okay. However, I still don't know how to prevent this, so I have disabled content translation (the client is okay with this for now). Hopefully I will find some insight in your comment/code for fixing my problem.

Perhaps setting the fields to non-translatable will solve the problem? Are you saying that the fields will still be translatable? Is non-translatable just a required default?

plach’s picture

Status: Postponed » Closed (won't fix)

Given that this is still postponed to #1239370: Module updates of new modules in core are not executed and almost one year has passed with almost no one complaining about this, I think it's safe to close it. We might want to reopen it if some serious issue is documented.

Honestly I'd avoid messing with the upgrade path if it's not strictly necessary.

plach’s picture

Issue summary: View changes

Corrected the proposed solution to match the patch in #1.