Problem/Motivation

I am migrating wimleers.com from D7 to D8. It contains comments from the D5/D6 days. It looks like this:

Note that the language column is empty ('') for the upper half of the rows displayed, but set to the unspecified language code ('und') for the bottom half. The upper half are comments upgraded from D5/D6. The bottom half are comments created in D7.

When migrating comments with entity validation turned on (see #2745797: Add option to content entity destinations for validation and #3095456: [Plan] Discuss strategy for long-term use of entity validation in the migration system), this causes lots of validation errors:

Also there's cases when comments become "orphaned" and sometimes they using language which disabled or deleted, so this cases also needs "fallback" to some undefined language - ref #3044158: Orphaned comments Error: Call to a member function getOwner()

Proposed resolution

Detect this edge case in \Drupal\comment\Plugin\migrate\source\d7\Comment and automatically fix it.

Remaining tasks

None.

User interface changes

None.

API changes

None.

Data model changes

None.

Release notes snippet

TBD

Comments

Wim Leers created an issue. See original summary.

wim leers’s picture

Status: Active » Needs review
StatusFileSize
new4.24 KB
new5.53 KB

Fix + test coverage

wim leers’s picture

This was triggering 473 validation errors for me, with #2 applied zero 🤓

wim leers’s picture

Title: The "language" column in D7 might be empty after migrating from D6, this is invalid in D8 » The comment "language" column in D7 might be empty after migrating from D6, this is invalid in D8
Component: comment.module » migration system

Been told that I should put this in migration system :)

quietone’s picture

I have only glanced at this patch and this jumped out at me.

+++ b/core/modules/comment/src/Plugin/migrate/source/d7/Comment.php
@@ -44,6 +45,12 @@ public function prepareRow(Row $row) {
+    // If this comment was upgraded from Drupal 6 or older, it's possible that
+    // no language code was specified.
+    if ($row->getSourceProperty('language') === '') {

Instead of passing an empty string in the pipeline we normally use the default_value process plugin to set the value in the pipeline. For example as is done in the node migrations.

  langcode:
    plugin: default_value
    source: language
    default_value: "und"
wim leers’s picture

I was first going to do that, but … I wasn't 100% clear on how to configure \Drupal\migrate\Plugin\migrate\process\DefaultValue to only treat '' like this, and not NULL, 0, '0' et cetera (see https://www.php.net/manual/en/function.empty.php, which is what DefaultValue uses).

The last submitted patch, 2: 3095922-2-test_only_FAIL.patch, failed testing. View results

quietone’s picture

Assigned: Unassigned » quietone
wim leers’s picture

To be clear: I'm happy to change it to that, but I think the current patch allows for less room for error.

quietone’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Well I reckon that NULL, 0 and '0' would be 'bad' data and it would be perfectly reasonable to attempt to migrate the row with a langcode of 'und'.

This will also need a migration test. I mean update the fixture with a new comment and make assertions in d7/MigrateCommentTest.

quietone’s picture

Assigned: quietone » Unassigned
wim leers’s picture

Good point! I was just reasoning from the perspective "this is an actual observed known-to-occur invalid value resulting from prior Drupal upgrades". But I like your reasoning more :)

I looked for the appropriate place to add test coverage, and thought #2's was correct. But I see now that is just sort of a unit test, I hadn't found \Drupal\Tests\comment\Kernel\Migrate\d7\MigrateCommentTest yet. That looks great! Thanks for the guidance! 🙏

andypost’s picture

I think this fallback should be needed not only in case of bad data but for case when no such language exists (I faced with orphaned comments in language which were deleted years ago)

wim leers’s picture

#13: oh, great, can you update the issue summary? 🙏

andypost’s picture

Issue summary: View changes

updated with first found #3044158: Orphaned comments Error: Call to a member function getOwner() but it was very common case for d5/d6 era

huzooka’s picture

Assigned: Unassigned » huzooka
huzooka’s picture

wim leers’s picture

Issue tags: -Needs tests

Yay, that test-only patch failed! Now let's bring back the fix from #2 (and perhaps tweak some test expectations) to make it pass again.

huzooka’s picture

I had to update Drupal\Tests\migrate_drupal_ui\Functional\d7\Upgrade7Test since we have one more comments.
Also removed the use Drupal\Core\Language\LanguageInterface; line from Drupal\comment\Plugin\migrate\source\d7\Commentfor test-only patch.

Addressed #10, but not with the default_value plugin: I still prefer using LanguageInterface::LANGCODE_NOT_SPECIFIED instead of hard-coding und in the migration.

wim leers’s picture

I had to update Drupal\Tests\migrate_drupal_ui\Functional\d7\Upgrade7Test since we have one more comments.

Incremented the expected comment count: makes sense.

I think all feedback has been addressed. I think this is ready.

This is now just blocked on migration system maintainer review. Thanks, @huzooka! 🙏👍

quietone’s picture

Assigned: Unassigned » quietone

I'll review this during the coming week.

wim leers’s picture

Lovely, thank you @quietone!

quietone’s picture

Assigned: quietone » Unassigned
Status: Needs review » Needs work

@huzooka and @Wim Leers, thanks for work here, this was easy to review.

This look really good and there is failing patch as well. The test and changes to the fixture are doing exactly what needs to be done.

  1. +++ b/core/modules/comment/src/Plugin/migrate/source/d7/Comment.php
    @@ -44,6 +45,12 @@ public function prepareRow(Row $row) {
    +    // If this comment was upgraded from Drupal 6 or older, it's possible that
    

    This is just setting a default if the language is not defined and should be done in the process pipeline, not the source plugin. Doing so in the source plugin adds code that requires testing and maintenance. We can avoid that by using the pipeline.

    And I think a comment in the pipeline would be helpful as well.

    #  Comments migrated from Drupal 6 to Drupal 7 may not have a language.
       langcode:
        plugin: default_value
        source: language
        default_value: "und"
  2. +++ b/core/modules/comment/tests/src/Kernel/Migrate/d7/MigrateCommentTest.php
    @@ -105,6 +105,27 @@ public function testMigration() {
    +    // Comment migrated from Drupal 6 to Drupal 7 may have empty language
    +    // property. This is invalid in Drupal 8. Test that these comments will have
    +    // valid language after the migration.
    

    Then with part of this comment moved to the pipeline this can be shorter. Maybe just
    'Tests a comment migrated from Drupal 6 to Drupal 7 that did not have a language.' or something like that

  3. +++ b/core/modules/comment/tests/src/Kernel/Migrate/d7/MigrateCommentTest.php
    index dc99489563..1231a7a377 100644
    --- a/core/modules/comment/tests/src/Kernel/Plugin/migrate/source/d7/CommentTest.php
    

    Changes not needed if the default value is set in the process pipepline.

quietone’s picture

Updating tags

huzooka’s picture

Status: Needs work » Needs review
StatusFileSize
new7.79 KB
new6.87 KB

Addressed everything from #24.

I removed the previous additions of the comment source plugin test Drupal\Tests\comment\Kernel\Plugin\migrate\source\d7\CommentTest, and set the default 'und' value in the process pipeline.

@quietone, @Wim Leers, @andypost: thank you all!

wim leers’s picture

Status: Needs review » Reviewed & tested by the community
+++ b/core/modules/comment/migrations/d7_comment.yml
@@ -40,7 +40,11 @@ process:
-  langcode: language
+  # Comments migrated from Drupal 6 to Drupal 7 may not have a language.
+  langcode:
+    plugin: default_value
+    source: language
+    default_value: und

+++ b/core/modules/comment/src/Plugin/migrate/source/d7/Comment.php
@@ -45,12 +44,6 @@ public function prepareRow(Row $row) {
-    // If this comment was upgraded from Drupal 6 or older, it's possible that
-    // no language code was specified.
-    if (!$row->getSourceProperty('language')) {
-      $row->setSourceProperty('language', LanguageInterface::LANGCODE_NOT_SPECIFIED);
-    }

D'oh. This makes so much sense 😅 Thanks, @quietone!

I think it's okay for me to RTBC since there's basically nothing left from my first patch iteration 🤓

huzooka’s picture

huzooka’s picture

alexpott’s picture

Status: Reviewed & tested by the community » Needs work

Committed and pushed 642b7653d4 to 9.0.x and 17ef9e3e9a to 8.9.x and fc579acaa2 to 8.8.x. Thanks!

Backported to 8.8.x as a migrate bugfix.

  • alexpott committed 642b765 on 9.0.x
    Issue #3095922 by huzooka, Wim Leers, quietone, andypost: The comment "...

  • alexpott committed 17ef9e3 on 8.9.x
    Issue #3095922 by huzooka, Wim Leers, quietone, andypost: The comment "...

  • alexpott committed fc579ac on 8.8.x
    Issue #3095922 by huzooka, Wim Leers, quietone, andypost: The comment "...
wim leers’s picture

Status: Needs work » Fixed

Lovely, thanks! Since it's already been backported all the way to 8.8, I think this can be marked Fixed? :)

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.