Problem/Motivation

We would like to add support for deleting items that are no longer present in the incoming data when a migration is run. This was a feature of Feeds in D7 and seems like a common need.

Proposed resolution

From @mikeryan in #15:

I really think this should be a distinct "purge" operation that gathers the currently-available source IDs from the source plugin, and scans the map table for any which aren't in that list to invoke the destination rollback on them. Now that I think about it, maybe it could even be an option on rollback - drush migrate-rollback --missing-from-source my_migration.

Completed tasks

Two tests added and passing tests.

  • MigrateRollbackMissingTest - is this sufficient? it does not test the source all_row property.
  • MigrateRollbackMissingMessageTest

Remaining tasks

  • Decide whether a simpler approach will work. See the discussion in Comments #133, #135, #136, #138.
  • If we stay with the current approach, then address the points raised in Comment #139.
  • Manual testing. See instructions in the next section.
  • Add documentation.
  • Add a change record.

Manual testing

At this point, testing is a little complicated. The current patch (in Comment #136) adds an interface and a trait. In order for selective rollback to work, your migration source plugin needs to implement the interface. An easy way to do that is by using the trait. For examples, see the code snippet below the following steps.

  1. Install Migrate Plus, at least version 8.x-4.2, for #3045346: Add missing source item event.
  2. Apply the current patch from this issue.
  3. Apply the current patch from #2863426: Implement rollback of items no longer in source data.
  4. Create a source plugin that implements the new SyncableSourceInterface interface.
  5. Update your migrations to use your custom source plugin.
  6. Clear caches.
  7. You can now roll back your migration with the new option provided by #2863426: Implement rollback of items no longer in source data: drush mr my_migration --missing-from-source.

For Step 4, you can follow the example in the SyncableEmbeddedDataSource plugin that is added in this issue. That class is used in the new MigrateRollbackMissingTest test class. Here is a class that @marvil07 wrote for a client project that uses the patch from this issue:


namespace Drupal\my_migrate\Plugin\migrate\source;

use Drupal\migrate\Plugin\MigrationInterface;
use Drupal\migrate_plus\Plugin\migrate\source\Url;
use Drupal\migrate_tools\Plugin\migrate\source\SyncableSourceTrait;
use Drupal\migrate_tools\SyncableSourceInterface;

/**
 * A syncable url source using SyncableSourceTrait.
 *
 * @see Drupal\migrate_plus\Plugin\migrate\source\Url
 *
 * @MigrateSource(
 *   id = "syncable_url",
 *   source_module = "my_migrate"
 * )
 */
class SyncableUrl extends Url implements SyncableSourceInterface {

  use SyncableSourceTrait;

  /**
   * {@inheritdoc}
   */
  public function __construct(array $configuration, $plugin_id, $plugin_definition, MigrationInterface $migration) {
    parent::__construct($configuration, $plugin_id, $plugin_definition, $migration);
    $this->setAllRowsFromConfiguration();
  }

}

User interface changes

N/A

API changes

Data model changes

N/A

CommentFileSizeAuthor
#206 2809433-206.patch20.11 KBnortmas
#188 interdiff_184-188.txt6.45 KBheddn
#188 2809433-188.patch10.29 KBheddn
#187 2809433-187.patch8.16 KBheddn
#184 2809433-184.patch8.07 KBandypost
#184 interdiff.txt2.67 KBandypost
#183 interdiff_182-183.txt837 bytesheddn
#183 2809433-183.patch7.46 KBheddn
#182 2809433-182.patch7.57 KBheddn
#182 interdiff_181-182.txt4.73 KBheddn
#181 2809433-181.patch4.85 KBheddn
#177 2809433-176.patch1.03 KBjohn.oltman
#174 2809433-174.patch1010 bytesjohn.oltman
#170 2809433-170.patch19.92 KBStockticker
#170 interdiff_153-170.txt952 bytesStockticker
#153 interdiff-136-153.txt1.05 KBklidifia
#153 2809433-153.patch18.75 KBklidifia
#136 interdiff-130-136.txt5.66 KBmarvil07
#136 2809433-136.patch19.04 KBmarvil07
#130 interdiff-129-130.txt1.15 KBmarvil07
#130 2809433-130.patch18.25 KBmarvil07
#129 interdiff-126-129.txt5.17 KBmarvil07
#129 2809433-129.patch18.28 KBmarvil07
#126 interdiff-125-126.txt7.51 KBquietone
#126 2809433-126.patch17.82 KBquietone
#125 interdiff-123-125.txt4.75 KBmarvil07
#125 2809433-125.patch18.07 KBmarvil07
#123 interdiff-116-123.txt5.87 KBquietone
#123 2809433-123.patch17.79 KBquietone
#122 2809433-122.patch17.87 KBquietone
#116 interdiff-114-116.txt2.93 KBquietone
#116 2809433-116.patch15.45 KBquietone
#114 interdiff.txt479 bytesquietone
#114 2809433-114.patch14.06 KBquietone
#112 2809433-112.patch14.03 KBquietone
#112 interdiff-110-112.txt1.29 KBquietone
#110 interdiff-106-110.txt9.82 KBmarvil07
#110 2809433-110.patch13.92 KBmarvil07
#106 2809433-106.patch10.08 KBquietone
#106 interdiff-105-106.txt278 bytesquietone
#105 interdiff-104-105.txt966 bytesquietone
#105 2809433-105.patch10.08 KBquietone
#104 interdiff-102-104.txt6.3 KBquietone
#104 2809433-104.patch10.02 KBquietone
#102 interdiff.2809433.100-102.txt934 bytesmikelutz
#102 2809433.102.rollback.patch5.66 KBmikelutz
#100 2809433-100.patch5.81 KBquietone
#100 interdiff-99-100.txt7.32 KBquietone
#99 interdiff-95-97.txt930 bytesquietone
#99 2809433-99.patch9.5 KBquietone
#97 interdiff-95-97.txt930 bytesquietone
#97 2809433-97.patch10.5 KBquietone
#96 2809433-95.patch10.21 KBquietone
#95 2809433-94.patch10.13 KBquietone
#93 2809433-93-new-test.txt3.42 KBquietone
#81 interdiff_78-81.txt2.34 KBmErilainen
#81 2809433-migrate-delete-missing-from-source-81.patch22.88 KBmErilainen
#78 2809433-migrate-delete-missing-from-source-8.6.7.patch23.02 KBachraf.noomane
#72 2809433-migrate-delete-missing-from-source-8.6.4.patch41.46 KBwizonesolutions
#69 Migrate-support-for-deleting-items-no-longer-in-the-incoming-data-2809433-69.patch24.26 KBbiguzis
#61 interdiff.2809433.60-61.txt14.78 KBmikelutz
#61 2809433-61.drupal.Migrate-support-for-deleting-items-no-longer-in-the-incoming-data.patch24.6 KBmikelutz
#60 2809433-60.drupal.Migrate-support-for-deleting-items-no-longer-in-the-incoming-data.patch18.93 KBmikelutz
#59 2809433-59.drupal.Migrate-support-for-deleting-items-no-longer-in-the-incoming-data.patch13.38 KBmikelutz
#56 drupal-2809433-55-56-interdiff.txt430 bytesKarlShea
#56 drupal-2809433-56.patch19.03 KBKarlShea
#55 drupal-2809433-55.patch19.15 KBgeek-merlin
#55 drupal-2809433-53-55-interdiff.txt1.48 KBgeek-merlin
#53 drupal-2809433-53.patch18.62 KBgeek-merlin
#53 drupal-2809433-36-53-interdiff.txt1.87 KBgeek-merlin
#47 migrate_unpublished_node.zip5.94 KBkumkum29
#38 2809433-38.patch17.27 KBjofitz
#38 interdiff-36-38.txt1.05 KBjofitz
#36 2809433-36.patch17.11 KBjofitz
#36 interdiff-35-36.txt2.99 KBjofitz
#35 migrate-support-for-deletion-of-missing-items-2809433-35.patch16.3 KBPeacog
#29 migrate-support-for-deletion-of-missing-items-2809433-29.patch16.67 KBmaxdev
#20 interdiff.txt4.11 KBkevin.dutra
#20 migrate-support-for-deletion-of-missing-items-2809433-20.patch15.31 KBkevin.dutra
#18 migrate-support-for-deletion-of-missing-items-2809433-18.patch13.84 KBkevin.dutra
#17 migrateinterdiff.txt737 bytesPavan B S
#17 migrate--support-for-deletion-of-missing-items--2809433--15--8.2.patch8.08 KBPavan B S
#12 migrate--support-for-deletion-of-missing-items--2809433--12--8.2.patch8.1 KBgeerlingguy
#11 migrate--support-for-deletion-of-missing-items--2809433--11--8.2.patch8.25 KBiMiksu
#7 migrate--support-for-deletion-of-missing-items--2809433--7--8.2.patch8.25 KBdamontgomery
#2 migrate--support-for-deletion-of-missing-items--2809433--2--8.1.patch8.23 KBdamontgomery
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

pandaeskimo created an issue. See original summary.

damontgomery’s picture

damontgomery’s picture

This is the related patch for migrate_tools to show the number of deleted items, https://www.drupal.org/node/2809437

Status: Needs review » Needs work
geerlingguy’s picture

damontgomery’s picture

This applies to 8.1.10. I'll try to look into what's needed to test against 8.3

damontgomery’s picture

Status: Needs work » Needs review
FileSize
8.25 KB

Tried to make a patch for 8.2.x.

mikeryan’s picture

Status: Needs review » Needs work

Although I helped spitball this approach, I'm still not fond of deleting as a side-effect of import - I feel purging "missing source records" should be a distinct operation.

Given a moment's thought (and looking at the concrete patch), one big problem is that it counts on the import process being complete within a single request. If it's not - if it's broken up through a batch UI, restarted in drush because it's running out of memory, or simply run with a --idlist or --limit - every record not processed in that batch will get deleted. Rather hazardous...

damontgomery’s picture

Thanks for the feedback, Mike.

I agree with those concerns. I think the alternative is to create a new process that will re-read the source data and remove all records no longer in it which is maybe your suggestion. I'm not sure exactly how to go about doing that.

For our purpose, our migration always happens in one go and we may use this patch for now. We can hopefully use this issue if someone (maybe me) can work on the alternative approach.

Thanks again for the direction!

geerlingguy’s picture

@mikeryan - A couple more thoughts, too—it's probably more rare to have migrations that would be configured to delete items that aren't in the source, and I know in the times that I've needed to do it, the source always contained all relevant records, and migrations were always run on cron jobs (thinking of three different situations where this was needed), so even if some records were deleted that shouldn't have been, they would've been recreated on the next run...

But it may be good to make sure there are strong caveats anywhere this ability would be documented!

Another option, too—could we make it so you could unpublish rather than delete? I'm not sure if this would be universal across entities, but for nodes at least, that's a safer option than deletion.

iMiksu’s picture

Despite the three comments above, I've still decided just in case to update the patch of #7 that was having some offset while testing if the patch still applies.

Your branch is up-to-date with 'origin/8.2.x'.
$ curl https://www.drupal.org/files/issues/migrate--support-for-deletion-of-missing-items--2809433--7--8.2.patch | patch -p1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  8453  100  8453    0     0  13411      0 --:--:-- --:--:-- --:--:-- 13417
patching file core/modules/migrate/src/MigrateExecutable.php
Hunk #1 succeeded at 296 (offset 5 lines).
patching file core/modules/migrate/src/Plugin/migrate/source/SourcePluginBase.php
Hunk #5 succeeded at 365 (offset 3 lines).
Hunk #6 succeeded at 591 (offset 3 lines).
geerlingguy’s picture

Version: 8.1.x-dev » 8.2.x-dev
Status: Needs work » Needs review
FileSize
8.1 KB

Updated patch for 8.2.x HEAD—there were a few lines failing. I also fixed the comments to use inline (//) style inside functions/methods. No interdiff, it was difficult getting the original patches to apply alongside the 8.2.x updates and generate a reasonable interdiff.

mikeryan’s picture

Assigned: Unassigned » mikeryan

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

mikeryan’s picture

Assigned: mikeryan » Unassigned
Status: Needs review » Needs work

I still don't like the idea of deletion happening as a side-effect of import, and having the source plugin invoking rollbacks on the destination plugin makes me feel all icky inside. I really think this should be a distinct "purge" operation that gathers the currently-available source IDs from the source plugin, and scans the map table for any which aren't in that list to invoke the destination rollback on them. Now that I think about it, maybe it could even be an option on rollback - drush migrate-rollback --missing-from-source my_migration.

I'm thinking also that for scalability, and to avoid problems with batching, the source IDs would be better stored in a temporary database table than in memory.

geerlingguy’s picture

I like that approach... even just having a migrate-purge-missing option or something like that would work for me.

Pavan B S’s picture

+++ b/core/modules/migrate/src/Plugin/migrate/source/SourcePluginBase.php
@@ -111,6 +113,43 @@
+   * This will be compared with $this->previouslyMigratedItems to remove items not present

Line exceeding 80 characters.
There is one line which has more than 80 characters. Applying the patch, please review

kevin.dutra’s picture

Here's a first stab at the approach that @mikeryan suggested, where this is a separate type of operation. I'm not sure I'd call this fully baked yet -- it still keeps the source IDs in memory -- but it does at least work for data sets that aren't huge.

No interdiff for this since it's a pretty radical departure from the previous patches. It doesn't look like a corresponding migrate_tools issue was ever opened to expose this to Drush, so I'll do that shortly.

danmuzyka’s picture

Status: Needs review » Needs work

I have a few minor suggestions for this patch.

  1. +++ b/core/modules/migrate/src/MigrateExecutable.php
    @@ -345,6 +345,104 @@ public function rollback() {
    +    $source = \Drupal::service('plugin.manager.migrate.source')->createInstance($source_config['plugin'], $source_config, $this->migration);
    

    Could the migrate source plugin manager be dependency-injected into the constructor for the MigrateExecutable rather than invoked through the static method here?

  2. +++ b/core/modules/migrate/src/Plugin/migrate/source/SourcePluginBase.php
    @@ -111,6 +111,13 @@
    +   * Flags whether to return all avaialble source rows.
    

    Typo in the word 'avaialble'; should be 'available'.

At the moment, I don't have a great suggestion for avoiding loading all of the source IDs into memory and iterating through them.

Something else to consider is whether there is a way to refactor the code into some helper methods that rollback() and rollbackMissingItems() can invoke, since right now there is a lot of duplicate code.

Overall this approach makes sense to me.

kevin.dutra’s picture

Status: Needs work » Needs review
FileSize
15.31 KB
4.11 KB

This patch addresses the code review feedback from #19.

  1. I think this would end up being a heavier refactor. Currently, this class is not a service, so we don't have easy access to the service container via that mechanism. These objects are directly instantiated in multiple places, so if we try to go the direct injection route, we'd end up touching code in a number of sort of unrelated spots, so I'm hesitant to add that as part of this issue, but this might be something to consider for a followup.
  2. Oops. Fixed. :)

Per the suggestion, I broke one piece of logic out into a helper method. You're right, there is a little bit of duplication still going on, but I had some difficulty thinking of ways to extract more code and still have it be fairly readable.

danmuzyka’s picture

I think this would end up being a heavier refactor. Currently, this class is not a service, so we don't have easy access to the service container via that mechanism. These objects are directly instantiated in multiple places, so if we try to go the direct injection route, we'd end up touching code in a number of sort of unrelated spots, so I'm hesitant to add that as part of this issue, but this might be something to consider for a followup.

OK, makes sense, not a huge issue.

+++ b/core/modules/migrate/src/MigrateExecutable.php
@@ -345,6 +335,118 @@ public function rollback() {
+  protected function rollbackCurrentRow() {

Could some of the variables ($id_map, $destination_key, and $destination) be passed into this method as parameters, instead of re-initialized here, especially since some of them are already set by the context of the calling method?

Nothing else stands out to me at the moment.

quietone’s picture

Status: Needs review » Needs work
Issue tags: +Needs issue summary update, +Needs tests

Glad, I finally had some time to look at this. Quite interesting. I think the following needs to be addressed.

  1. There is still the code cleanup from #21 to do.
  2. "Loading all of the source IDs into memory and iterating through them". This bothered me and I finally remembered why. It is similar to an i18n issue that got all the source data at once and didn't get committed, partly due to scalability. See #39. Maybe there is another way to do this?
  3. Needs tests
  4. Needs Issue summary update. And should include the approach being used, from #15.
ohthehugemanatee’s picture

heads up: @joelpittet says this has a merge conflict with #2844595: SQLBase breaks GROUP BY queries, which is currently RTBC. So that may need to be resolved, too.

vasi’s picture

I really like this approach!

  1. Using a separate method on MigrateExecutable is good.
  2. I love the 'all_rows' functionality, that could be useful for other purposes too. Maybe it should be a public method on SourcePluginBase, though—it doesn't make much sense for someone to set that in migration .yml files.
  3. As was pointed out above, it might be nice to unpublish (or otherwise hide) instead of delete, under some situations. Maybe we could use a new method on destinations for this. We could do it as a follow-up issue.
  4. Running in_array($source_key, $source_items) in a loop will be very slow if there are a lot of $source_items. But that leads to...
  5. Loading the whole source into memory could be problematic. Could you try using a DB table to collect the source ID hashes, and then doing an outer join to look for differences?
heddn’s picture

Issue tags: +BC break

Taking a quick look at this, we do break BC by adding to the migrate executable interface rollbackMissingItems.

cosmicdreams’s picture

Hey checking in here. I ran across this issue because I attempted to write the same functionality. It looks like there's a function here named rollbackMissingItems that is never used.

Is there some kind of drush function that is missing from this patch that would use it?

kevin.dutra’s picture

@cosmicdreams, the corresponding Drush functions for this API live in the migrate_tools module, which is in the contrib space. See referenced issue #2863426: Implement rollback of items no longer in source data.

cosmicdreams’s picture

Hey @Kevin.dutra Thanks a ton. I'll see if can patch this in and kick it's tires.

maxdev’s picture

Since in 8.3.5 there were changes related to migrate. I made patch compatible with latest version.

maxdev’s picture

Status: Needs work » Needs review

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.6 was released on August 2, 2017 and is the final full bugfix release for the Drupal 8.3.x series. Drupal 8.3.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.4.0 on October 4, 2017. (Drupal 8.4.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.4.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

heddn’s picture

Status: Needs review » Needs work
Issue tags: -BC break

This should add a new Interface that is DeletableMissingSource and implement it in core's executable. If we don't we'll break drush, drupal console and anyone who has created executables. Then check if we have an implementation of that interface before calling it. That's the only way I can think of doing this without breaking BC in a really bad way.

Also, needs IS summary update and tests.

rauch’s picture

Is there a patch for Drupal 8.4.x?

mstrelan’s picture

There have been a few mentions of unpublishing content in this issue, I wanted to expand that though to users with the ability to block missing users instead of deleting them.

Peacog’s picture

I've been using this patch for a while. It's working well. Here's a re-roll for 8.4.x (it doesn't address the BC issue raised in #32)

jofitz’s picture

Status: Needs work » Needs review
FileSize
2.99 KB
17.11 KB
  • Correct the mistake in the reroll from #29 to #35.
  • Fix coding standards errors.

Once this is confirmed to pass tests we can take a look at @heddn's comments in #32.

jofitz’s picture

Status: Needs review » Needs work
jofitz’s picture

Version: 8.4.x-dev » 8.5.x-dev
Issue summary: View changes
Status: Needs work » Needs review
Issue tags: -Needs issue summary update
FileSize
1.05 KB
17.27 KB
  • Created DeletableMissingSource interface.
  • Updated IS.

Still needs tests.

jofitz’s picture

Status: Needs review » Needs work

Setting back to NW for tests.

ichionid’s picture

Is it safe to remove the entries from the 2 tables create for every sync file when deleting the entity?
We're on 8.3.7 version for drupal and we don't have the patch available and kinda need urgently the deletion.

mullman’s picture

What are the pro's and con's for simply performing a full rollback then a fresh import for feeds that need to be re-iterated daily? Is there a better approach?

Sutharsan’s picture

@mullmann the one con is the change of IDs. In our case we are importing/syncing users from a CSV file. Changing user ID's is not acceptable, all users will be logged out immediately.

MegaChriz’s picture

For Feeds a similar feature exists. In there though you can also choose to unpublish content that is no longer provided by the source (instead of deleting it). Recently, I ported (and improved) the feature to the D8 version: #2939503: Unpublish/Delete nodes not included in feed. Because for the D7 version I sometimes got requests to be able to take some other action on the content than unpublishing or deleting, the provided patch for the D8 version supports choosing an action plugin to apply on the content (currently limited to non-configurable action plugins to make the initial implementation easier).
The feature had been a quite a big thing for Feeds when it got implemented for the D7 version: #1470530: Unpublish/Delete nodes not included in feed.

Maybe such idea (applying an action plugin to content no longer provided by the source) could work for Migrate as well.
The D8 version of the feature also supports to execute the action in batches (either via the batch API or via a queue worker). Not sure if that would make sense for Migrate too.

kumkum29’s picture

Hello,

Actually, the patch #38 isn't included in the migrate module (in core).
Do you think that this patch could be soon commited? Or do I use another method to unpublish contents if items no longer exists in the source? In migrate, Is there a problem to implement this new class?

This feature is very important for migrate and this feature is present in feeds module.

Thanks.

heddn’s picture

This is set to NW still for tests. The thing is that many of the migrate maintainers haven't needed this issue ourselves. So this is highly dependent on others to step up and contribute.

lquessenberry’s picture

Is there perhaps a way to contain this functionality in a separate module that doesn't patch core?

kumkum29’s picture

As sayed by lquessenberry patch core isn't a good idea with the futur updates. If maintainers don't have need of this feature, this feature is very usefull for other users... And making a separate module is a good thing (without patch).

I have attached to my reply a custom module developped by the MTech team (I payed this team to develop a temporary solution). Maybe, it's a starting point to find a solution, and develop a stable module...

P.S. With this custom module we need to patch the core with #38

geek-merlin’s picture

+100 for this patch.

As it came to light in #42, #43, in some use cases the right thing to do is delete, in some use cases we need some kind of unpublish action. How we might implement this (maybe memo to me):

* Add a MigrateDeletableDestinationInterface extending MigrateDestinationInterface with a method delete($destination_key) (we need this for BC, so add "Cleanup before D9" comment)
* Use the delete() method if available, otherwise rollback()

Of course, given that some implementations of delete() (contratrary to rollback()) will not delete, but unpublish, we can bikeshed on how we name things here. The point here is that we want to let the destination plugin know when a source item has gone, and decide on the right actions. Only having the listener would be really bad DX for this.

mglaman’s picture

I have unfortunately not had time to review this. So, I apologize if this is out of sorts, but I wanted to reply based on #48

As it came to light in #42, #43, in some use cases the right thing to do is delete, in some use cases we need some kind of unpublish action.

I'd rather the default be an event which is thrown "This MIGRATION_ID no longer has SOURCE_VALUE"

Because with the following

Add a MigrateDeletableDestinationInterface extending MigrateDestinationInterface

I'd have to make custom destinations for all of the existing sources and use those. I think it would be much easier to react to an event and say "okay, delete" or "great, unpublish"

geek-merlin’s picture

> I'd have to make custom destinations for all of the existing sources and use those.

Yah, that'd be a pita. An event helps in this case.
OTOH, in the long run we want destination plugins to handle that, maybe configurable.

I want in D9 (or today with a contrib decorator or custom destination) to say:

  destination:
    plugin: entity:node
    action_on_delete: unpublish # any action plugin

So implementing both should give us both benefits with little effort.

geek-merlin’s picture

Re-thinking it, i agree that an event is enough for now. We might ourselves listen to it and trigger the default rollback() action, unless some other listener does stopPropagation().

geek-merlin’s picture

As of #32, @heddn:
> This should add a new Interface that is DeletableMissingSource and implement it in core's executable. If we don't we'll break drush, drupal console and anyone who has created executables.

Some code grepping does not show me any alternate implementations or decorators (the parameter usage below is not affected). So why not better add the method to MigrateExecutableInterface and add a change note?

merlin@thinker ~/PhpstormProjects $ grep -rn MigrateExecutableInterface drupal-console
drupal-console/templates/module/src/Plugin/migrate/process/process.php.twig:13:use Drupal\migrate\MigrateExecutableInterface;
drupal-console/templates/module/src/Plugin/migrate/process/process.php.twig:30:  public function transform($value, MigrateExecutableInterface $migrate_executable, Row $row, $destination_property) {
merlin@thinker ~/PhpstormProjects $ grep -rn MigrateExecutableInterface drush9
drush9/drush/vendor/chi-teck/drupal-code-generator/templates/d8/plugin/migrate/process.twig:8:use Drupal\migrate\MigrateExecutableInterface;
drush9/drush/vendor/chi-teck/drupal-code-generator/templates/d8/plugin/migrate/process.twig:75:  public function transform($value, MigrateExecutableInterface $migrate_executable, Row $row, $destination_property) {

EDIT: Found a decorated instance in \Drupal\migrate_run\MigrateExecutable. It extends Drupal\migrate\MigrateExecutable (which we extend) so would not break too.

EDIT2: (to myself)
> So why not better add the method to MigrateExecutableInterface and add a change note?
Because changing an interface breaks implementations. Simple as that.

geek-merlin’s picture

Version: 8.5.x-dev » 8.7.x-dev
FileSize
1.87 KB
18.62 KB

Patch flying in that implements #49/#51, a MISSING_SOURCE_ITEM event that falls back to rollback().

Also setting target to 8.7.x, rite? Still needs tests so leaving NW.

geek-merlin’s picture

geek-merlin’s picture

FileSize
1.48 KB
19.15 KB

I tested this patch a bit with the drupal_run drush9 patch, and some stuff worked. What it did not detect is when a source item is missing as of a process skip_row result.

Patch flying in that fixes that for me, by copying some lines of processRow().

KarlShea’s picture

FileSize
19.03 KB
430 bytes

If one of your rows throws a message (I saw this with the skip_on_empty plugin) rollbackMissingItems() will still try to save the message as it accumulates source IDs. That will throw an exception because saveMessage() is looking for $this->sourceIdValues which hasn't been set.

kumkum29’s picture

Hello,

I have some questions to better understand how this patch/evolution works.
If the items is no longer in the source data, the content can be unpublished/deleted.

1. In the map table, the line of the item is deleted or always present?
2. If the item is again in the source data, can we re-publish the content and recreate an entry in the map table? (if the entry was deleted with a previously migration).

Thanks for these precisions.

mikelutz’s picture

mikelutz’s picture

Straight up re-roll of #56 against the latest dev changes.

mikelutz’s picture

Bah, missed a couple hunks. THIS is a re-roll pf #56

mikelutz’s picture

Would an approach like this help solve the memory/scalability concerns?

Have the source plugin check whether the id exists. SourcePluginBase can get the id list from a generator, and loop through it. It would take more time, but under the hood in_array loops through all items in an array and jumps out when it finds the item anyway (admittedly in less cycles per loop, but if you don't want to load all the ids in memory, I don't see an alternative)

I don't necessarily like resetting the iterator like I did here, but this function is called in a safe spot on a source plugin instantiated for this purpose, so it should be fine, even if we added some sort of mechanism to make sure we don't have conflicts with other means of iterating through the source.

The other advantage here is that the individual source plugins can create a more efficient means of getting the ids if they want. Sql can have a query for it (if not the base, then any specific sql plugins), and embeddable and csv(if it makes it in) can do better too.

Thoughts?

heddn’s picture

I too am a little worried by the iterator reset. But if we have tests that show this doesn't cause issues, then we should be fine. And having a base implementation should work for most small data sets. And like you allude, if the data set is huge, someone can always build a custom solution.

  1. +++ b/core/modules/migrate/src/DeletableMissingSource.php
    @@ -0,0 +1,12 @@
    +interface DeletableMissingSource {
    

    I'm not sure I like the name here. I would think this should be tacked onto a source plugin but was surprised to see it added on the executable.

  2. +++ b/core/modules/migrate/src/MigrateExecutable.php
    @@ -300,23 +301,13 @@ public function rollback() {
    +        $this->rollbackCurrentRow();
    
    @@ -346,6 +337,100 @@ public function rollback() {
    +  protected function rollbackCurrentRow() {
    

    This seems out of scope. Can we not do this in here and add to a follow-up?

  3. +++ b/core/modules/migrate/src/Plugin/SyncableSourceInterface.php
    @@ -0,0 +1,22 @@
    +   * Generator for the source idlist.
    ...
    +   * @return bool
    +   *   True if the id exists in the current source.
    

    Generator and return bool? The docs don't line up.

  4. +++ b/core/modules/migrate/src/Plugin/SyncableSourceInterface.php
    @@ -0,0 +1,22 @@
    \ No newline at end of file
    

    Nit

  5. +++ b/core/modules/migrate/src/Plugin/migrate/source/SourcePluginBase.php
    @@ -64,7 +65,7 @@
    +abstract class SourcePluginBase extends PluginBase implements MigrateSourceInterface, SyncableSourceInterface, RollbackAwareInterface {
    

    Ah, we do add it here too. I don't like the idea of shared interface here. Let's keep them separate perhaps.

  6. +++ b/core/modules/migrate/src/Plugin/migrate/source/SourcePluginBase.php
    @@ -152,6 +153,13 @@
    +  /**
    +   * Flags whether to return all available source rows.
    +   *
    +   * @var bool
    +   */
    +  protected $allRows = FALSE;
    

    We recently added a method to easily filter the idMap. Can we use that instead of this?

heddn’s picture

This blocks feeds and other uses cases. We talked about this at the migrate maintainers meeting with @maxocub @mikelutz @heddn and thought that an FilterIterator might help here.

quietone’s picture

Just a brief read of the code and wanted to mention two things

  1. +++ b/core/modules/migrate/src/Plugin/migrate/source/SourcePluginBase.php
    @@ -599,4 +608,31 @@ public function getSourceModule() {
    +    $ids = $this->getIds();
    +    $iterator = $this->getIterator();
    

    Can $ids be $source_ids and $iterator be $source? Just want to make it easier to read.

  2. +++ b/core/modules/migrate/src/Plugin/migrate/source/SqlBase.php
    @@ -264,83 +264,85 @@ protected function initializeIterator() {
    +      if (!$this->allRows) {
    

    This makes many comment > 80 characters.

badrange’s picture

This might be a strange question, but I'll ask it anyway: Are anybody using any version of this patch in production?

KarlShea’s picture

I'm using it, or at least will be when the site goes live within the month.

kumkum29’s picture

Hello,

as Badrange said, can we use this patch in a production site? This patch is a good step forward for the module.
For the moment, can we consider using this patch because it would soon be included in the module code?

Thanks.

biguzis’s picture

Have migration with ~8000 items and about 800 items should be removed with this, but after some time i got `MySQL server has gone away` even before something is actually rolled back..

biguzis’s picture

btw - i rerolled patch to make able to apply on 8.6.3

mikelutz’s picture

This is new functionality, It won't be eligible for backporting to 8.6. Submitted patches need to be based off 8.7.

geek-merlin’s picture

> #68 "MySQL server has gone away"

That's unrelated. Look into #2829492: Big file backup breaks with "MySQL server has gone away" and do SET SESSION wait_timeout = [enufftime]

wizonesolutions’s picture

This is a patch that works with 8.6.4. Please ignore it. I just need it for my composer.json :)

kumkum29’s picture

Hello !

we are a little lost with all these posts.
Can someone summarize and explain which patch to use for drupal 8.6 version? And what to write in our config file to use this new functions?
With the patch #38 I get problem to patch the core with Drupal 8.6.4.

Thanks for your help and explanations !

wizonesolutions’s picture

@kumkum29: Core changed in 8.6.4. If you need a patch for that, use mine (#72). Don't know the answer to your other question.

kumkum29’s picture

thanks Wizonesolutions ! it's ok with your patch.

I see in several replies, some settings to active the delete behavior in the migration config: e.g. action_on_delete: unpublish...
is it functional? If yes, how to proceed? Can you gieb me more examples on these new plugins/functions?

Thanks.

wizonesolutions’s picture

I can't, unfortunately. I only updated this patch in an existing system, but I haven't dealt much with what it actually does. You'd probably have to read it or get someone to read it for you to get an idea.

praandy’s picture

Hi,
I have used the patch #38, but the new #72 doesn't work so well (now I use 8.6.5).
The problem is the patch in Drupal\migrate\Plugin\migrate\source\SourcePluginBase, the function "sourceIdsExists" load all records of the source every times that the system checks if a record is present or not.

I changed the function so:

public function sourceIdsExists(array $test_ids) {
    $exists = FALSE;
    foreach ($this->idList() as $source_ids) {
      if ($source_ids == $test_ids) {
        $exists = TRUE;
        break;
      }
    }
    return $exists;
  }

and in Drupal\migrate\MigrateExecutable

$source = \Drupal::service('plugin.manager.migrate.source')->createInstance($source_config['plugin'], $source_config, $this->migration);
+   $source_exists = $source->sourceIdsExists();
    // Rollback any rows that are not part of what we get from the source
    // plugin.
    foreach ($id_map as $map_row) {
      $source_key = $id_map->currentSource();
      $destination_key = $id_map->currentDestination();

      // If this one wasn't imported, or if we're still receiving it from the
      // source plugin, then we don't need to do anything.
-     if (!$destination_key || !$source_key || $source->sourceIdsExists($source_key)) {
+    if (!$destination_key || !$source_key || in_array($source_key, $source_exists)) {
        continue;
      }

So, it works.
Could you find a more elegant solution?

achraf.noomane’s picture

The patch was not applying after Drupal security updates SA-CORE-2019-001. re-rolled for Drupal 8.6.7.

heddn’s picture

#77 needs to be addressed. Loading all source records seems like a really bad idea. Can't we have a base solution in SqlBase that does a source query?

mErilainen’s picture

Could the source plugin provide a sourceIds() public function or such which could be used instead?
I tested locally using the patch from #78 with 966 items and it takes always about 47s check if something should be rolled back because for each item all the source ids will be iterated through to see if it still exists in the source.

I tested the idea from #77 by writing a sourceIds() function and calling it once and checking if the $source_key exists in the array: The check now took 3,4s instead when no items needed an actual rollback.

mErilainen’s picture

Status: Needs work » Needs review
FileSize
22.88 KB
2.34 KB

Here is the patch related to my previous comment, replacing sourceIdsExists() with sourceIds() which is called only once. Interdiff included.

Edit: Patch is against 8.6.x.

Status: Needs review » Needs work
samerali’s picture

@mErilainen Thanks for your patch at #81

This patch does not provide drush command or ui option to run this right? for now we need to use ->rollbackMissingItems(); ?

mikelutz’s picture

This patch does not provide drush command or ui option to run this right? for now we need to use ->rollbackMissingItems(); ?/blockquote>

The core Migrate module itself does not have a UI, it is only an api. This patch will be a new addition to that api.

Migrate Drupal and friends, the core modules that makes use of the migrate api to provide a UI for Drupal to Drupal migrations will not use this particular api enhancement.

Currently the contrib module Migrate Tools provides drush commands and a partial UI around the migrate API. If/when this api is released, any drush commands or UI support for it will need to be added to that module.

biguzis’s picture

@samerali Here is patch for Migrate tools with drush command - https://www.drupal.org/project/migrate_tools/issues/2863426

kumkum29’s picture

Hello,

Is there any progress to use a specific plugin inside the config file for delete/unpublish the content?
Today, there is no 'stable' solution to unpublish content with migrate.

We can define a budget/payment for this job, if a maintainer/developer include this feature in migrate.

Thanks.

samerali’s picture

Thank you @biguzis this worked great! saved tons of time.

phenaproxima’s picture

Project: Drupal core » Migrate Tools
Version: 8.7.x-dev » 8.x-4.x-dev
Component: migration system » Drush commands

Discussed with the other Migrate maintainers. We decided that we're not going to put this in core right now because it would add complexity to the already-way-too-messy Migrate API -- complexity which is not needed for any core migrations. Instead, we can implement this in Migrate Tools (which will be faster and easier than core, anyway). So let's do that.

benjifisher’s picture

Now that this issue is attached to the Migrate Tools module, I hope we can get a patch that combines this with #2863426: Implement rollback of items no longer in source data.

In the mean time, notice that the patch from #81 conflicts with #2999478: Refactor MigrateExecutable::rollback(), so the patch will not work with Drupal 8.7.

quietone’s picture

Assigned: Unassigned » quietone

Was working on rerolling this to be in Migrate Tools and realize I don't know much about how migrate tools does things. And too early to know the right question to ask. :-( Will try again tomorrow.

quietone’s picture

Assigned: quietone » Unassigned

Not making progress here.

vidorado’s picture

For those applying patches with Composer, note that patch in #81 applies to drupal/core, not to drupal/migrate_tools

Besides that, the patch works ok! I have tested it with the drush command migrate:rollback --missing-from-source [migration_name]. That new command option is added by the patch in #2863426: Implement rollback of items no longer in source data

quietone’s picture

FileSize
3.42 KB

A possible test.

edysmp’s picture

nice test.

quietone’s picture

Status: Needs work » Needs review
FileSize
10.13 KB

edysmp, thanks,

OK, now combine the test and the patch. Not working yet. failing to get any rows from the map.

quietone’s picture

FileSize
10.21 KB

Try again.

quietone’s picture

Issue tags: -Needs tests
FileSize
10.5 KB
930 bytes

Just some cleanup. The test is failing because in rollbackMissingItems, sourceIds($source); always returns an empty array because the source is invalid. What needs to happen for it to be valid?

heddn’s picture

Status: Needs review » Needs work
  1. +++ b/src/Event/MigrateEvents.php
    @@ -0,0 +1,23 @@
    +final class MigrateEvents {
    

    Let's put the event over in migrate_plus. I think that is a better location for that type of thing and follows the already established pattern.

  2. +++ b/src/MigrateExecutable.php
    @@ -402,4 +493,44 @@ class MigrateExecutable extends MigrateExecutableBase {
    +      yield($this->sourceIdValues);
    +      $source->next();
    

    I don't think this is reachable code, the next part?

  3. We talked about adding an interface (perhaps in plus as well) that lets folks do the rollback in a more optimized method. Then type check and use the optimized if it exists, otherwise use the non-optimized means. I don't see that in this latest patch. If we want to punt that part to a follow-up I'm probably OK, but the patch here is small enough that adding it to this one issue is probably easier.
quietone’s picture

Status: Needs work » Needs review
FileSize
9.5 KB
930 bytes

1. Fixed
2. Yes, that was copy/paste from the original. I left it broken since the source is always invalid and I was trying to fix that first. Still don't know why it is invalid.
3. Agree. I just don't know how to do it.

quietone’s picture

FileSize
7.32 KB
5.81 KB

98.3. Maybe fixed.
Moved $source to a property and cleanup.

What is still needs to be done here is find out why the source plugin created in migratexecutable return no rows. Anyone know?

quietone’s picture

mikelutz’s picture

heddn’s picture

+++ b/src/MigrateExecutable.php
@@ -333,7 +333,7 @@ class MigrateExecutable extends MigrateExecutableBase implements SyncableSourceI
+      $this->message->display($this->t('Migration @id is busy with another operation: @status', ['@id' => $this->migration->id(), '@status' => $this->t($this->migration->getStatusLabel())->render()]), 'error');

$this->t($this->migration->getStatusLabel()) is what is the issue. It is generally discouraged to pass variables or function calls into t()

About the underlying issue of returning data from embedded_data in #99, I think I'd need to debug it. Data rows should be possible.

quietone’s picture

FileSize
10.02 KB
6.3 KB

Seem to have lost the test along the way here, so restoring that.

The problem I see with the test is how to change the data_rows of embedded source. There doesn't seem to be a way to do that. I tried making a second source plugin with the correct data and changed the map and message table names, but then the source is invalid. When it is invalid then all rows are returned. It probably shouldn't do that. Thought maybe test like HighWaterTest which using a sql source plugin, but that ran into problems with hooks in migrate_tools.

quietone’s picture

FileSize
10.08 KB
966 bytes

Trying to fix the t() .

quietone’s picture

FileSize
278 bytes
10.08 KB

remove extra line

Status: Needs review » Needs work

The last submitted patch, 106: 2809433-106.patch, failed testing. View results

mikelutz’s picture

mikelutz’s picture

  1. +++ b/src/MigrateExecutable.php
    @@ -321,6 +328,106 @@ class MigrateExecutable extends MigrateExecutableBase {
    +    $id_map = $this->migration->getIdMap();
    

    Your source plugin isn't returning rows because the rows are already imported, and source plugin base filters out rows that have already been processed by default. You can run $id_map->prepareUpdate() to get the source to return all rows, but that sets the update flag and because we aren't running a migration, it won't be cleared. It feels like something we should fix in core, but to work around it here in contrib, I think you will have to run prepareUpdate, and then at the end resave all the idmap rows to reset their status to imported.

  2. +++ b/tests/src/Kernel/MigrateRollbackMissingTest.php
    @@ -0,0 +1,117 @@
    +    $vocabulary_migration = $this->createMigration('import', $vocabulary_data_rows);
    ...
    +    $vocabulary_migration = $this->createMigration('rollback', $vocabulary_data_rows);
    

    If these two migrations have different ids, then they will have different id maps, and you won't be able to roll back

marvil07’s picture

Status: Needs work » Needs review
FileSize
13.92 KB
9.82 KB

Here a new version with the following changes:

  • Restrict the use of rollbackMissingItems() only to migrations which contain a source plugin implementing the SyncableSourceInterface interface. This follows the approach on #2809433-81, see more reasoning on this below.
  • Added a new SyncableSourceTrait trait, which is the main helper for migration source classes to implement the SyncableSourceInterface interface.
    I have moved both sourceIds() and idList() methods there.
    An idea from #2809433-81 was missing: the "all rows" source configuration option and the related changes over next(); that is the reason I think we cannot work-around sources not implementing the SyncableSourceInterface interface, which the previous patch was suggesting.
    Also, I have added a helper to use during the constructor, and the modified next() method.
    Now that we are not modifying core, we cannot rely on that change on SourcePluginBase class, and instead any source needing to use the new functionality would need a class implementing SyncableSourceInterface interface.
  • Added a new syncable_embedded_data migrate source plugin, as an example on how to use the trait, and also used during the related test.
  • Changed the second migration id used on the test, so it can use the map created by the import, as suggested by @mikelutz.
  • Used the new syncable_embedded_data migrate source on the test.

I would suggest that we should add the new event directly here, because this functionality is only implemented in migrate_tools, so it will be confusing if that event is implemented by other module; as I suggested on the related ticket.

Some pending items on the fixme at the new trait: Some called methods here assume the class using this trait is a SourcePluginBase child, e.g. getIterator(). Check all, and if possible only use methods from interfaces.

marvil07’s picture

I think the test failure is related with migrate_tools not being patched and exposing the new event constant, locally I where I have the latest migrate_tools related patch with that, I got the test running.

Self-reminder: running this test locally: php ./core/scripts/run-tests.sh --sqlite /tmp/d8test --verbose --class "Drupal\Tests\migrate_tools\Kernel\MigrateRollbackMissingTest"

quietone’s picture

FileSize
1.29 KB
14.03 KB

Thanks marvil07!

This just fixes the coding standard errrors.

Status: Needs review » Needs work

The last submitted patch, 112: 2809433-112.patch, failed testing. View results

quietone’s picture

Status: Needs work » Needs review
FileSize
14.06 KB
479 bytes

Add an @requires for migrate_plus

Status: Needs review » Needs work

The last submitted patch, 114: 2809433-114.patch, failed testing. View results

quietone’s picture

Status: Needs work » Needs review
FileSize
15.45 KB
2.93 KB

Moving the event back to migrate_tools to get the test to pass. I'm leaning to have the event in migrate_tools.

quietone’s picture

Just noting the todos here.

  1. +++ b/src/Plugin/migrate/source/SyncableSourceTrait.php
    @@ -0,0 +1,111 @@
    + * @fixme Some called methods here assume the class using this trait is a
    + * SourcePluginBase child, e.g. getIterator(). Check all, and if possible only
    + * use methods from interfaces.
    
  2. +++ b/src/Plugin/migrate/source/SyncableSourceTrait.php
    @@ -0,0 +1,111 @@
    +   * @todo check used methods, maybe there is an implicit assumption that this
    +   * is a SourcePluginBase child.
    
heddn’s picture

Issue tags: +Needs reroll

I've committed the work over in #3045346: Add missing source item event, so this will need a re-roll.

heddn’s picture

And BTW, the rational for putting it over in plus is that then things like runner, manifest or anything else can use the feature without requiring tools as well.

marvil07’s picture

Issue tags: -Needs reroll

@quietone, thanks for the iterations here!

@heddn the patch from comment 114 is the most updated using migrate_plus event, changing visibility of patches to follow latest updates.

marvil07’s picture

More patches hidden.

quietone’s picture

FileSize
17.87 KB

Reroll and added a test to test the messages returned on failures, but it isn't working, can't access the message. Can someone sort that out?

Sorry, no interdiff as I am running out the door ... the bus does not wait.

quietone’s picture

FileSize
17.79 KB
5.87 KB

@heddn, thanks for the commit in Migrate Plus, and the explanation.

Fix coding standard.

Status: Needs review » Needs work

The last submitted patch, 123: 2809433-123.patch, failed testing. View results

marvil07’s picture

Status: Needs work » Needs review
FileSize
18.07 KB
4.75 KB

@quietone, thanks for the extra tests.

I have been looking at the methods on MigrateTestBase, and it seems like the startCollectingMessages()/stopCollectingMessages() is intended to be used differently, and that is why messages were not being recognized. See MigrateTestBase::display().
I have changed the test to use that other approach.

@quietone, any reason to use RollbackMissingMessageTest instead of MigrateRollbackMissingMessageTest, to follow the other, MigrateRollbackMissingTest?

quietone’s picture

FileSize
17.82 KB
7.51 KB

@marvil07, thanks. I looked at MigrateDefaultLanguageTest for a way to get the messages.

The reason it is RollbackMissingMessageTest is because I am weary of long file names. It is changed back.

The test is modified to not run the import since that it not needed for the test.

Another wee thing is that the use of quotes in the messages in MigrateExecutable is now different than that of it's core counterpart and different than the way quotes are used in Exception messages, https://www.drupal.org/docs/develop/coding-standards/php-exceptions. I'm pointing that out because I prefer consistency.

quietone’s picture

TODOs:

In SyncableSourceTrait

* @fixme Some called methods here assume the class using this trait is a
 * SourcePluginBase child, e.g. getIterator(). Check all, and if possible only
 * use methods from interfaces.

And the tests do not test the new all_rows configuration property on the source. It is false in MigrateRollbackMissingTest

quietone’s picture

Issue summary: View changes
marvil07’s picture

Component: Drush commands » Code
FileSize
18.28 KB
5.17 KB

@quietone, thanks for the extra changes!

On the implicit dependency on SourcePluginBase

I have been trying to remove the implicit dependency on SourcePluginBase, but I could not find a clean way to do so from here.
SourcePluginBase is an opinionated implementation, and therefore its next() implementation, which is the base for the same method in the new trait proposed in this change, also assumes many internal details.

Splitting SourcePluginBase in different traits/interfaces could help us define which source plugins the new SyncableSourceTrait is compatible with, but changing that core class is definitely out of scope of this ticket.
In any case, I ended up splitting a bit of it, but it is just a start, see #3048413: Split iterable part of SourcePluginBase into a trait.

Instead, I have described on the trait docblock the expected requirement for classes using the trait, except for the mapRowAdded data member, which is probably not used anywhere and will probably be removed at #3017456: Remove SourcePluginBase::$mapRowAdded.
As mentioned there, SyncableSourceTrait is mainly intended for SourcePluginBase children, but if someone really wants to use it, it is still possible with the listed requirements.

Considering this new trait will help to easily expand most source plugins, which are SourcePluginBase children, as the added syncable_embedded_data, I think it is a good trade-off to have it.

On the messages

@quietone, I think you are right, we should follow a bit more the standard for php exceptions, since it is the most related standard about migrate messages.
The main difference is that migrate messages have format, and are translatable strings; but even then, using single quotes move the code one more steps towards standard messages.

I have changed the messages to use single quotes, and the related test accordingly.

marvil07’s picture

FileSize
18.25 KB
1.15 KB

Fixing the coding standard error and changing some docblocks on the message related test to be a bit more accurate.

quietone’s picture

@marvil07, thanks this is coming along nicely.

I would still like a test of the all_rows property and maybe change member to property in the SyncableSourceTrait documentation but that should not hold up a review, particularly of the new trait.

Anyone coming here to review please look at the recently added SyncableSourceTrait, thx.

benjifisher’s picture

Status: Needs review » Needs work

I am just starting to look at this. Forgive me if I start by picking some nits. I also plan to do some testing.

  1. +++ b/src/SyncableSourceInterface.php
    @@ -0,0 +1,20 @@
    +<?php
    +
    +namespace Drupal\migrate_tools;
    +
    +/**
    + * Interface SyncableSourceInterface.
    + *
    + * @package Drupal\migrate\Plugin
    + */
        

    Can we add a description to the doc block? Something like "Migrate source plugins can implement this interface in order to synchronize migrated content with the source. That is, migrated content can be added, updated, or removed to match the source." Maybe also refer to the new trait.

  2. +++ b/src/Plugin/migrate/source/SyncableSourceTrait.php
    @@ -0,0 +1,119 @@
    +<?php
    +
    +namespace Drupal\migrate_tools\Plugin\migrate\source;
    +
    +use Drupal\migrate\Row;
    +
    +/**
    + * Helper to implement \Drupal\migrate_tools\SyncableSourceInterface.
    + *
    + * The main use case of this trait is for source plugins that are children of
    + * \Drupal\migrate\Plugin\migrate\source\SourcePluginBase.
    + *
    + * It can also be used with other source plugins, but they would need to at
    + * least:
    + * - Implement aboveHighwater(), fetchNextRow(), getIterator(),
    + *   getHighWaterProperty(), initializeIterator(), rowChanged(), and
    + *   saveHighWater(), in a similar way than SourcePluginBase;
    + * - Use a currentRow data member with a \Drupal\Migrate\Row value;
    + * - Use a currentSourceIds data member with the primary key of the current row;
    + * - Use a configuration data member to store plugin configuration; and
    + * - Use an idMap data member with a
    + *   \Drupal\migrate\Plugin\MigrateIdMapInterface corresponding to the related
    + *   migration.
    + */
        

    Wow, that is a lot! I think that "use case of" should be "use case for". I think that "in a similar way than" should be "in a similar way to" or simply "similar to". I do not see initializeIterator() outside this comment, so I assume we can remove it from the list. Perhaps document the new configuration key: "Available configuration keys: - all_rows: ..."

  3. +  /**
    +   * Adds a property to source plugin to read all rows.
    +   */
    +  public function readAllRowsConfiguration() {
        

    Since this function is public, maybe it should be added to the interface. Or if it is just used from classes that use this trait (which is what I see in EmbedDataSyncableSource) then we can make it protected.
    I do not like the name nor the description. Maybe setAllRows() and "Set the $allRows property based on configuration."
    It seems odd to set it from the configuration like this instead of having a simple setter, but maybe I will understand that as I continue to read the code.

  4. +  /**
    +   * Handles all_rows configuration during \Iterator::next().
    +   *
    +   * Based on \Drupal\migrate\Plugin\migrate\source\SourcePluginBase::next().
    +   */
    +  public function next() {
      ...
    +      // 1. We're supposed to return all rows.
    +      // 2. This row has not been imported yet.
    +      // 3. Explicitly set to update.
    +      // 4. The row is newer than the current highwater mark.
    +      // 5. If no such property exists then try by checking the hash of the row.
    +      if ($this->allRows || !$row->getIdMap() || $row->needsUpdate() || $this->aboveHighwater($row) || $this->rowChanged($row)) {
        

    For the sake of future reviewers: these lines (adding the first condition) are the only difference between next() in this trait and the version in SourcePluginBase.

  5. +  public function sourceIds() {
    +    $ids = [];
    +    foreach ($this->idList() as $source_ids) {
    +      $ids[] = $source_ids;
    +    }
    +    return $ids;
    +  }
        

    Why not just combine this with the idList() function?

  6. +++ b/src/Plugin/migrate/source/EmbedDataSyncableSource.php
    @@ -0,0 +1,31 @@
    ...
    + * @MigrateSource(
    + *   id = "syncable_embedded_data",
    + *   source_module = "migrate_tools"
    + * )
    + */
    +class EmbedDataSyncableSource extends EmbeddedDataSource implements SyncableSourceInterface {
        

    Can we make the word order in the class name match the word order in the plugin ID? That is, either make the plugin ID embedded_data_syncable or else make the class name SyncableEmbedDataSource.

  7. If we always follow the pattern of creating a new class, like EmbedDataSyncableSource, that extends an existing class and then uses the trait, then we could skip the configuration and the allRows property. That would be simpler. But I guess we are only doing that because we cannot change the core classes. When we create our own source plugins, we can always use the trait and then we need a configuration key to determine the behavior.
  8. I guess this question is for @phenaproxima and @heddn, although others may have opinions: why are we proposing to add this code to Migrate Tools rather than Migrate Plus? If that is where the new event belongs, doesn't the same argument apply to the new trait? I guess I am just asking about the new interface and trait, and maybe these were added after this issue was moved to the Migrate Tools issue queue. I can see why the changes to MigrateExecutable are here. Maybe move those over to #2863426: Implement rollback of items no longer in source data and then move this issue, with the new interface and trait, to the Migrate Plus queue?
  9. +++ b/src/MigrateExecutable.php
    @@ -321,6 +328,105 @@ class MigrateExecutable extends MigrateExecutableBase {
    +      $this->message->display($this->t("Migration '@id' does not support rolling back items missing from the source", ['@id' => $this->migration->id()]), 'error');
        

    Can we break up this long line?

  10. +    /** @var \Drupal\migrate\Plugin\MigrateSourceInterface $source */
    +    $this->source = \Drupal::service('plugin.manager.migrate.source')->createInstance($source_config['plugin'], $source_config, $this->migration);
        

    Do we need the type hint here? Can we break up the long line?

  11. On second thought, keep that type hint but make $source a local variable instead of a class property. (Am I missing something?) Maybe give it a more descriptive name, like $syncable_source.
heddn’s picture

Can we get the super simple solution here addressed first? Let's support syncing without needing any optimizations from the source. Meaning, we scan through the entire source and the idmap and sync things. Let's leave interfaces and improving the performance of things using dedicated methods and interfaces for follow-ups.

charginghawk’s picture

We tested a few scenarios:

  1. The #130 patch applied to migrate_tools
  2. The above plus #19 patch from #2863426: Implement rollback of items no longer in source data
  3. The above plus the #81 core patch above
  4. The #81 core patch plus #19 patch from #2863426: Implement rollback of items no longer in source data

Only the 4th worked for us.

Scenario 1

The drush migrate-rollback --missing-from-source my_migration command ends up with error Unknown option: --missing-from-source.

The #130 patch does not contain the "--missing-from-source" option, so we still need a patch adding that option.

Scenario 2

After adding two patches to migrate_tools without a patch applied to core the drush migrate-rollback --missing-from-source my_migration command ends up with error Migration 'my_migration' does not support rolling back items missing from the source.

Scenario 3

Same results as in Scenario 2

Scenario 4

Only this one scenario is working as expected.

quietone’s picture

Re #133

Can we get the super simple solution here addressed first? Let's support syncing without needing any optimizations from the source

What do you have in mind? Back in #84 there was a suggestion to run prepareUpdate(), and then at the end resave all the idmap rows to reset their status to imported. I am not convinced we should set the row status to imported after this rollbackmissing operation since that would also change failed and skip status to imported.

marvil07’s picture

Status: Needs work » Needs review
FileSize
19.04 KB
5.66 KB

@quietone, we can definitely test the new all rows property.

@benjifisher, re 132:
1. added your description
2. thanks for the suggestions, now added. On initializeIterator(), it is strictly not needed; since is used from within the already mentioned getIterator() from SourcePluginBase, so I removed it from the list.
3. Good point, protected visibility should be OK. Not on the interface because it's a helper, to avoid doing it from each source plugin constructor. I also changed the name of the method to setAllRowsFromConfiguration(), it is a similar to a setter, but adding the last part to make it clear it is not a simple setter, since it does not receive a value to set.
4. I have added a note on the description to suggest it should be kept on sync with core, and added the last commit hash from core I have checked it against.
5. It was about keeping the structure from the initial patches in core, I am fine either way, @heddn let us know what you think.
6. Small things are important, thanks for pointing it out, changed the class name, but to SyncableEmbeddedDataSource to follow a bit more core embedded_data source.
7. Yes, sadly it was decided to not include the change in core, and the trait is an attempt to make it as easy to use as possible implemented from contrib. The best case would naturally be do this change directly on core's SourcePluginBase.
8. I was also thinking on keeping the event in the same place, but based on the outcome of #3045346: Add missing source item event, it seems like the idea is to allow other modules to use the event without needing to depend on migrate_tools.
9. Sure.
10-11. now in two lines. Notice the data property is defined on core's Drupal\migrate\MigrateExecutable, I think it is OK to override it here to provide the extra configuration to the source object. Using a local variable can also work, but I think semantically it would be better to override; no change for now, feedback welcome.

@heddn, re 133,135:
The current reason a syncable source needs to be overridden is to be able to retrieve all rows, which is not possible without overriding next().
The point @quietone makes is relevant here too.
Can you elaborate on what approach will you take to avoid skipping some of them as the current next() on SourcePluginBase does?

@charginghawk, re 134:
Scenario 1 works as expected. It is for core, and only the API, no drush commands.
Scenario 2 works as expected. It is for migrate_tools, and API from the patch here, drush command option for the other ticket. The error is the expected, the source needs to implement SyncableSourceInterface, as it is checked in one of the tests.
Scenario 3 works as expected. Both core and migrate_tools patches are not intended to be used together, and the output is about the expected error from the migrate_tools latest patch.
Scenario 4 works as expected. That is a valid solution, but it was decided to not be accepted by core, at least for now, and therefore the migrate_tools patches started.
What it is probably needed for your use case is to extend the source you are using with the provided trait, as SyncableEmbeddedDataSource is doing for embedded_data plugin.

benjifisher’s picture

@charginghawk:

To follow up on what @marvil07 wrote in #136: see #132.7 and the code it references. With the current patches on this issue and #2863426: Implement rollback of items no longer in source data, you have to follow the example of the test in order to get the rollback functionality. Extend whatever source class you would normally use and use the new trait so that you implement the new interface. Then refer to your custom source plugin in your migrations.

@marvil07:

I owe you a re-review. For now, I just want to say that I am glad I asked, in #132.11,

Am I missing something?

You implicitly answered that question: I was missing that the variable is already defined in the core class.

heddn’s picture

@heddn, re 133,135:
The current reason a syncable source needs to be overridden is to be able to retrieve all rows, which is not possible without overriding next().
The point @quietone makes is relevant here too.
Can you elaborate on what approach will you take to avoid skipping some of them as the current next() on SourcePluginBase does?

Put this retrieving all of the rows into the executable. It can iterate over the entire source, destination,id map, etc. Then only trigger the sync logic when syncing is configured for a specific migration. Yes it will be slow, but then for folks that want to sync less than say 1000 records, it will probably still scale. It is only when we are dealing with many thousands (millions) of records that parsing too much data by iterating over the entire data set would get cumbersome. And it would then work for existing migrations with only adding a "sync" flag on the migration.

benjifisher’s picture

@marvil07,

I only have time for a quick re-review based on the interdiff you attached.

3. The renamed setAllRowsFromConfiguration() is still public.
4. I am not sure that we should include a specific commit hash for core. I might just make it @see \Drupal\migrate\Plugin\migrate\source\SourcePluginBase::next()
10. You convinced me that it is right to keep $this->source as a class variable. Since it already has a type hint when declared as such, I do not think we need to repeat the hint when using it later.

Other than that, I like all the changes you made, including the new names.

I do not know enough to contribute to the discussion started in #133.

joelpittet’s picture

What's the quickest way to test this on an d6 to d8 existing migration? I've applied both patches from #2809433-136: Migrate support for deleting items no longer in the incoming data and #2863426-22: Implement rollback of items no longer in source data

And ran into the

[error] Migration 'upgrade_d6_path_redirect' does not support rolling back items missing from the source

So I figure that I need that SyncableSourceTrait trait apparently but where do I need to add it or extend core's source migrations to add it?

Tim Corkerton’s picture

I'm in the same position as Joel, I've got a custom working D6 -> D8 migration running well but can't get this to run correctly.

I made the same assumption that we need to apply both patches outlined in #140

Can someone confirm the steps required to run successfully?
Many thanks

benjifisher’s picture

@joelpittet, @Tim Corkerton:

Please see my reply to @charginghawk in #137.

joelpittet’s picture

Thanks @benjifisher, I’ll give that a go

toncic’s picture

@joelpittet did you make it works, I have the same problem?
I'm going through test example in patch but can't figure out what I'm missing.

toncic’s picture

I was blind and haven't seen that I'm missing interface 'SyncableSourceInterface'.

I got different error now, Undefined class constant 'MISSING_SOURCE_ITEM' in /app/web/modules/contrib/migrate_tools/src/MigrateExecutable.php on line 384 #0 /app/vendor/drush/drush/includes/drush.inc(232): Drupal\migrate_tools\MigrateExecutable->rollbackMissingItems().

Any idea what I'm missing now?

benjifisher’s picture

@toncic:

Try upgrading Migrate Plus to the latest version (8.x-4.2). See #3045346: Add missing source item event.

benjifisher’s picture

I had a few small suggestions in #139 that should still be addressed if we continue with this approach. But I will leave the issue at NR because it would be nice to get some additional testing and because the discussion started in #133 might take this issue in a different direction.

Iain.Madder’s picture

Wondering if anyone could advise on what i'm doing wrong to get the latest patch to work. It's blocking us upgrading to 8.7 and I'm stumped on the best solution.
Currently using #81 with 8.6, but while testing the upgrade to 8.7 composer fails to apply #81.

Testing #136 fails too due to the relative paths changing in the patch. #81 references a/core/modules/migrate/src/ while #136 references a/src/.

Is this an issue with the patch referencing the wrong location and so just needing recreating with the right relative path? if not, how would I correctly apply the patch?
For reference this is how I currently reference the patch in composer.json:

        "patches": {
          "drupal/core" : {
            "Patch core migrate to support deletion of missing items." : "https://www.drupal.org/files/issues/2019-01-30/2809433-migrate-delete-missing-from-source-81.patch"
            }
          }

Thanks in advance for any help
---------------
Post edit:
Solved by naveenvalecha's advice in #149. I was completely blind to the incorrect module reference.

naveenvalecha’s picture

#148,
Change the project in patches section to drupal/migrate_tools to apply this patch using composer.

malaynayak’s picture

I have applied the patches below, with drupal 8.7.1:

https://www.drupal.org/files/issues/2019-04-22/2809433-136.patch

https://www.drupal.org/files/issues/2019-01-22/migrate-tools-rollback-missing-items-from-source-2863426-20.patch

Migration rollback drush migrate-rollback id --missing-from-source giving the below error:

[error] Migration '@id' does not support rolling back items missing from the source.

Thanks in advance for the help.

benjifisher’s picture

Issue summary: View changes

I updated the issue summary. In addition to some smaller change, I added testing instructions, basically summarizing what I have said in previous comments.

charginghawk’s picture

Just to give an update, today because the old core patch failed to apply against 8.7.1, I switched to using these patches:

https://www.drupal.org/files/issues/2019-04-22/2809433-136.patch
(Changed to apply against migrate_tools, not drupal core like the previous version.)

https://www.drupal.org/files/issues/2019-01-22/migrate-tools-rollback-mi...

After updating migrate_plus and migrate_tools, I had to update my source plugin to implement SyncableSourceInterface (note that you can implement multiple interfaces - my plugin was already implementing ContainerFactoryPluginInterface), and I added "use SyncableSourceTrait;" inside the class (per the example in the issue description).

Working hunky dory so far.

klidifia’s picture

FileSize
18.75 KB
1.05 KB

Tests fine for me:
I applied this patch and also https://www.drupal.org/files/issues/2019-01-22/migrate-tools-rollback-mi...

Use the syncable_url data source plugin that you can create based off the syncable_embedded_data one in this patch.

I can then import as normal: drush migrate-import {name} and get rid of things that are no longer in the source with: drush migrate-rollback {name} --missing-from-source flag.

Was wondering if the SyncableUrlSource plugin that the testing instructions refer to could be included instead (or as well as) the SyncableEmbeddedDataSource.

mErilainen’s picture

How can I implement a "SyncableUrl" source plugin which would extend the Url source plugin from migrate_plus?

I tried to copy the example SyncableEmbeddedDataSource but I get an error
"PHP Fatal error: Trait 'Drupal\my_module\Plugin\migrate\source\SyncableSourceTrait' not found" because there is just
use SyncableSourceTrait;
and that points to the same directory.
I tried also to do:
use Drupal\migrate_tools\Plugin\migrate\source\SyncableSourceTrait;
but then the error is
"PHP Fatal error: Trait 'Drupal\my_module\Plugin\migrate\source\Drupal\migrate_tools\Plugin\migrate\source\SyncableSourceTrait"

Edit:
Apparently I had to include it twice, once at the top of the file and then inside the class. Here is the source plugin code:

<?php

namespace Drupal\my_module\Plugin\migrate\source;

use Drupal\migrate\Plugin\MigrationInterface;
use Drupal\migrate_plus\Plugin\migrate\source\Url;
use Drupal\migrate_tools\SyncableSourceInterface;
use Drupal\migrate_tools\Plugin\migrate\source\SyncableSourceTrait;

/**
 * A syncable Url source using SyncableSourceTrait.
 *
 * @MigrateSource(
 *   id = "syncable_url",
 *   source_module = "my_module"
 * )
 */
class SyncableUrl extends Url implements SyncableSourceInterface {

  use SyncableSourceTrait;

  /**
   * {@inheritdoc}
   */
  public function __construct(array $configuration, $plugin_id, $plugin_definition, MigrationInterface $migration) {
    parent::__construct($configuration, $plugin_id, $plugin_definition, $migration);
    $this->setAllRowsFromConfiguration();
  }

}
benjifisher’s picture

@mErilainen:

When you have finished testing, I hope you will share the results (positive or negative) on this issue.

I think there are only minor differences between your code snippet and the one in the issue description.

Perhaps you are not familiar with two different uses of the keyword use in PHP:

klidifia’s picture

Updated code in #154 is how I have implemented my syncable_url and works well.
One other thing I note: in the UI for migrate you can import / rollback etc, but no way of including this flag, would be a nice inclusion! :)

ronchica’s picture

There seems to me to be a contradiction in the manual steps to use the --missing flag.

Step #3 says to apply a patch to migrate_run, which implies that migrate_run is installed and is being used to run the drush migrate commands. But in order to use migrate_run, migrate_tools has to be uninstalled. This won't work. Or am I missing something?

benjifisher’s picture

Issue summary: View changes

@ronchica:

Sorry, I referenced the wrong related issue. I meant to link to #2863426: Implement rollback of items no longer in source data. I will update the testing instructions in the issue summary.

Thanks for pointing out the mistake.

benjifisher’s picture

@klidifia:

I looked at the interdiff you attached to #153, and it looks as though you addressed two of the points I raised in #139. Thanks for that. There is still an unnecessary type hint on $this->source.

I think that #152 and #153 satisfy the request for manual testing.

heddn’s picture

There's a lot of moving piece here. I haven't had the attention to really focus on and create an example patch for what I said in #138. But having to add an interface to make something syncable seems too heavy handed. It would be much nicer if here in tools there were a flag to drush mim foo_migration to trigger syncing. When this flag is set, it triggers the executable to loop over each source and destination and when it finds an item missing, it triggers MISSING_SOURCE_ITEM event. Then we have an implementation of the event that removes the item and removes the ID Map entry.

Using the event lets folks change the default response of delete by registering their event before this modules and stop propagation. Then they can do whatever they feel like if the default of delete is wrong.

Yes, I know that without an interface we have to do an inefficient loop over all the records. But it doesn't require changes to any existing migration yamls or interfaces. Later, we can tweak things and add efficiencies. For example, if a Sync interface exists on the source, then use its optimized method and bypass the inefficient loop.

agileadam’s picture

For what it's worth, the following patches are playing well together after going from Drupal 8.5 to 8.7.3.
It took me some time to find a working combination.

"drupal/core": {
  "Correct migration lookup to work with multiple values": "https://www.drupal.org/files/issues/2018-12-31/drupal-migrate_lookup_multiple_ids-2890844-43.patch",
},
"drupal/migrate_plus": {
  "Handle XML child nodes as XML in xml-based migrations": "https://www.drupal.org/files/issues/migrate_plus_xml_return_asxml.patch"
},
"drupal/migrate_tools": {
  "Introduce drush switch for rolling back missing migration items": "https://www.drupal.org/files/issues/2019-01-22/migrate-tools-rollback-missing-items-from-source-2863426-20.patch",
  "Migrate support for deleting items no longer in the incoming data": "https://www.drupal.org/files/issues/2019-04-22/2809433-136.patch"
}

I have updated all of my migration YML files to use the "syncable_url" source plugin, which I implemented as shown in #154 above.

/modules/custom/mymodule_migrate/src/Plugin/migrate/source/SyncableUrl.php


namespace Drupal\mymodule_migrate\Plugin\migrate\source;

use Drupal\migrate\Plugin\MigrationInterface;
use Drupal\migrate_plus\Plugin\migrate\source\Url;
use Drupal\migrate_tools\SyncableSourceInterface;
use Drupal\migrate_tools\Plugin\migrate\source\SyncableSourceTrait;

/**
 * A syncable Url source using SyncableSourceTrait.
 *
 * @MigrateSource(
 *   id = "syncable_url",
 *   source_module = "mymodule_migrate"
 * )
 */
class SyncableUrl extends Url implements SyncableSourceInterface {

  use SyncableSourceTrait;

  /**
   * {@inheritdoc}
   */
  public function __construct(array $configuration, $plugin_id, $plugin_definition, MigrationInterface $migration) {
    parent::__construct($configuration, $plugin_id, $plugin_definition, $migration);
    $this->setAllRowsFromConfiguration();
  }

}
damontgomery’s picture

I have tested the patches (using #20 for the drush bit though) as described in the issue description (thanks, Benji!) and they work as described. These many years later, thanks for implementing a more sustainable approach.

We are using

#153 here
#20 for https://www.drupal.org/project/migrate_tools/issues/2863426 since we are using Migrate Tools 8.x-4.1.

We added the implements SyncableSourceInterface to our source class (already had one). We also added the use SyncableSourceTrait; and in the __construct() method, we added $this->setAllRowsFromConfiguration();.

Our class has the following use statements as well if your IDE doesn't add them,

use Drupal\migrate_tools\Plugin\migrate\source\SyncableSourceTrait;
use Drupal\migrate_tools\SyncableSourceInterface;

Module versions:

 Migration            Migrate (migrate)                                                  Module  Enabled        8.7.3
 Migration            Migrate Plus (migrate_plus)                                        Module  Enabled        8.x-4.2
 Migration            Migrate Tools (migrate_tools)                                      Module  Enabled        8.x-4.1

Thanks again.

benjifisher’s picture

We discussed this issue at the #3067311: [meeting] Migrate Meeting 2019-07-11.

@heddn clarified his suggestion from #138 and #160. The idea is to keep the changes localized in the MigrateExecutable class from the Migrate Tools module. In order to loop through all source items for a migration, without affecting the state of the source plugin, we can clone the object representing that plugin. We might also have to clone the destination plugin.

A few quotes from @heddn in that discussion:

From my review though, it seemed very possible to loop over the entire source and destination, find things that aren't in either and delete them.
...
we could do that in migrate tools executable in a post import step. somewhere... anywhere

Neither I nor @marvil07 have time to follow up on these suggestions in the near future.

kevinc_’s picture

Really appreciate the work and thinking that has gone into this issue development.

We had this implemented and working on a client project but the rollback stopped working after we applied 8.7.3. Previously, we've used the url plugin and the patches in #161.

So, based on the error ('does not support rolling back items missing from the source'), I can see that's because my source plugin is not an instance of SyncableSourceInterface.

I've used the SyncableUrl from #154 for my plugin and updated my migration config to look like this.

source:
  plugin: syncable_url
  data_fetcher_plugin: file
  data_parser_plugin: xml
  urls: private://data/objects.xml

I'm still not able to get this working again even with the patches and the move to the new plugin. I'm also confused why my version was working with the url plugin previously.

Any thoughts gratefully received!

mErilainen’s picture

I have pretty similar setup, and everything works with core 8.7.4 and these patches:
https://www.drupal.org/files/issues/2019-05-31/2809433-153.patch
https://www.drupal.org/files/issues/2019-01-22/migrate-tools-rollback-mi...

Config:

source:
    plugin: syncable_url
    track_changes: true
    skip_count: true
    data_fetcher_plugin: file
    data_parser_plugin: jsonpath
    # Stage and prod override urls & authentication in settings.php.
    urls:
      - 'private://userdata/mergedUsers.json'
kevinc_’s picture

Thanks @mErilanien - my bad!

I changed the config in my migration module but had forgotten that was the install directory.

Updated the config in my sync directory, imported the config and it works great.

quietone’s picture

I too don't have time for some months to work on this again.

ronchica’s picture

I am confirming that the patches are working for me:
https://www.drupal.org/files/issues/2019-05-31/2809433-153.patch
https://www.drupal.org/files/issues/2019-01-22/migrate-tools-rollback-mi...

I have also created a plugin that works with migrate_source_csv.

Thanks for this effort!

kumkum29’s picture

Hi ronchica, kevinc_, mErilanien,

I need some clarifications to understand how this solution works (#168).
I we use this source plugin (syncable_url), and if we use the Drush command, a RollBack is launched on the items. So, these items are deleted, isn't it ?

How can I proceed If I want unpublish (and not delete) these items?

Thanks.

P.S. After having tested this solution, the items are deleted. During the RollBack, is there a solution to unpublish and not delete items? (alter the delete function) ?

Stockticker’s picture

FileSize
952 bytes
19.92 KB

Here's the small addition to the #153 where the possibility to "Rollback Missing" items has been added to the Drupal UI.

ccarrascal’s picture

Hi,
I can confirm our setup is also working in 8.7.6 with:
#170 from here (https://www.drupal.org/files/issues/2019-08-09/2809433-170.patch)
#29 from #2863426 (https://www.drupal.org/files/issues/2019-07-18/2863426-29.patch)

Thanks a lot for the effort!

joelpittet’s picture

I am trying things out but can't seem to get it to not rollback everything still. Looks promising though.

joelpittet’s picture

This is a bit of a hack but I'm using SqlBase so I had to overwrite the sourceIds

  /**
   * {@inheritdoc}
   */
  public function sourceIds() {
    $query = $this->query();
    $nids = $query->execute()->fetchCol(0);
    $ids = [];
    foreach ($nids as $nid) {
      $ids[] = ['nid' => $nid];
    }
    return $ids;
  }
john.oltman’s picture

FileSize
1010 bytes

This adds the syncable_url plugin as a patch to migrate_tools so you can avoid a custom module if you want.
Failed patch, see below for the fix.

john.oltman’s picture

Status: Needs review » Needs work

The last submitted patch, 174: 2809433-174.patch, failed testing. View results

john.oltman’s picture

Status: Needs work » Needs review
FileSize
1.03 KB

Sorry, fixed the patch, use 176 not 174.

Confirmed working on Drupal 8.7.6 with all 3 of these patches:
#29 from #2863426 (https://www.drupal.org/files/issues/2019-07-18/2863426-29.patch)
#170 from here (https://www.drupal.org/files/issues/2019-08-09/2809433-170.patch)
#176 from here (https://www.drupal.org/files/issues/2019-08-30/2809433-176.patch)

As noted above, patch 176 replaces the need for a custom module and adds the necessary plugin to migrate_tools. If you go the custom module approach for the syncable_url plugin, you do not need patch 176.

Did a rollback missing migration and the items no longer in the source were deleted.

For those adding patches via composer, make sure to use a different issue name in composer.json in the patch definitions for 170 and 176, otherwise only one will be applied.

Running with drush:

drush mim my_migration  # process adds and updates in source
drush mr my_migration --missing-from-source  # process deletes in source

Config:

source:
    plugin: syncable_url
    track_changes: true
    data_fetcher_plugin: http
    data_parser_plugin: xml
    urls: 'private://my-xml-file.xml'
    item_selector: /my/selector
    fields:
      etc etc
ronchica’s picture

@kumkum29 - sorry, just noticed your question on unpublishing.

To unpublish instead of deleting, in a custom module, I extended MigrateExecutableBase and wrote a custom function 'unpublishMissingItems'. That function looks a lot like Migrate Tools rollbackMissingItems, except it unpublishes the missing items instead of deleting them. There is a custom drush command to run it.

kumkum29’s picture

@ronchica,

I have created a custom module with an extended rollbackCurrentRow() function. In this function, before to delete the row, I call a new function (UnpublishEntities) :

 /**
  * Roll back the current row.
  */
 protected function rollbackCurrentRow() {
 ...
        // Unpublish the entities.
        $this->UnpublishEntities($destination_key);
  ...

In this new function I get the missing IDs and unpublish node and medias no present in the source (xml)...

jmaties’s picture

@ronchica,

How can works with migrate_source_csv?

heddn’s picture

FileSize
4.85 KB

This is not completely tested. But it might be what is needed. There's a new '--sync' option added for drush import command. Still needs tests.

No interdiff, as this takes a wildly different approach to the whole thing.

heddn’s picture

Issue tags: +Needs manual testing
FileSize
4.73 KB
7.57 KB

And here's some tests, which demonstrate that this PoC actually can work. Can we get some manual testing too?

heddn’s picture

FileSize
7.46 KB
837 bytes
andypost’s picture

FileSize
2.67 KB
8.07 KB

Here's a patch to isolate dispatching which is different in new Symfony (but core's dispatcher still using old arguments)

  1. +++ b/src/EventSubscriber/MigrationImportSync.php
    @@ -0,0 +1,75 @@
    +      $id_map->prepareUpdate();
    +      $source = $migration->getSourcePlugin();
    +      $source->rewind();
    +      $source_id_values = [];
    +      while ($source->valid()) {
    +        $source_id_values[] = $source->current()->getSourceIdValues();
    +        $source->next();
    

    if there's lots of source records they may not fit into PHP memory limit
    idmap will left in the prepare state after clean-up done?

  2. +++ b/src/EventSubscriber/MigrationImportSync.php
    @@ -0,0 +1,75 @@
    +        if (!in_array($map_source_id, $source_id_values, TRUE)) {
    +          $destination_ids = $id_map->currentDestination();
    

    maybe use kind of hash for source ids?
    also destination could be null, then probably no reason to dispatch events

edysmp’s picture

I did a manual test with #183patch applied.

Migrating 4 items from a json file to a taxonomy.
1. `drush mim guitars` - Shows 4 items created. Terms are created.
2. (Delete one item from the json file)
3. `drush mim guitars --sync` - shows 0 created, 0 updated... The term is deleted.
4. Checking the map table:
The `source_row_status` column has been set to 1 for all rows. Is it the expected?
the deleted item is still shown in the map table. Seems that the idmap isn’t getting updated.
5. `drush mim guitars` -- shows 3 updated. Only the deleted item remains to 1 in the map table.
6. `drush mr guitars` - shows Rolled back 4 items - map table is empty.

heddn’s picture

Status: Needs review » Needs work

Sounds like a little work is needed. Quite likely we need to move the rollbacks of non-existent source items in a pre import event. And we should add an assert to the test that the id map only has 2 entries in it after the rollback. I have to move on to some other things today, but maybe someone else can pick up from here?

heddn’s picture

FileSize
8.16 KB

First a re-roll.

heddn’s picture

Status: Needs work » Needs review
FileSize
10.29 KB
6.45 KB

Here's some fixes and additional test coverage.

heddn’s picture

Yeah! Green again. Now for another round of manual testing from folks.

chriscalip’s picture

Is this incoming feature accommodate an option to unpublish instead of delete? I have datasources that would periodically not include deactivated records but will likely add them in the subsequent runs.

edysmp’s picture

Looking good with the latest patch. Things are as expected now.

I am still curious in #184 comment about possible PHP memory limit issue by creating an array with a bunch of items.

heddn’s picture

re #190, that sounds like a new issue. This does what is suggested in the title and IS and deletes the content. If you want to keep the items and just unpublish, you could review what is going on in this issue and easily build your own subscribers to do such things.

re #184: memory limits, let's handle such memory limits in follow-ups. PHP can handle very large PHP arrays. But if someone is trying to sync more then some finite number of items, the solution put forth here might work. Or it might not. I just depends on so many variables. If it becomes an issue, it almost seems like a custom solution for that specific site might be needed. This by necessity has to be a very generic solution because we don't know the nature of the sources. If we did, then we could probably do some fancy SQL or CSV filtering, etc.

So I say, wait for bug reports on memory limits, figure out what those generally are... then update docs for --sync to mention what those limits are.

  • heddn committed 0adfcf3 on 8.x-4.x
    Issue #2809433 by quietone, marvil07, heddn, mikelutz, geek.merlin aka...
heddn’s picture

Status: Needs review » Fixed

Thank you to everyone who's stuck with this issue over its long and varied life.

TrevorBradley’s picture

This isn't set as closed (fixed), so I thought I'd pop here with a question. I'll start a new ticket if this belongs elsewhere. I'm using 4.x-dev#0adfcf3

How exactly is this supposed to work? From the latest patch, it looks like there's a new "sync" option, but drush rejects this from the command line:

drush mim my_import --sync
Unknown option: --sync.  See `drush help migrate-import` for available options. To suppress this error, add the option --strict=0.

It looks like migrate_tools.drush.inc' 's migrate_tools_drush_command is missing a 'sync' option in migrate-import. If I add it there, the error about sync not being an option goes away, but the code about sync doesn't appear to run.

I poked around at the code a bit, but can't seem to work out what's missing. I have no idea how @edysmp's commands in #183 worked with the current code base.

I'm using migrate_source_csv - is it possible there's something else I need to set up in my migrate configuation?

(continuing to investigate...)

heddn’s picture

This is only added for drush 9 and drush 10. Support for drush 8 (which uses drush.inc files) is best effort. Drush 8 hasn't been supported for quite some time, and this module only backports things that are super easy. This wouldn't be hard, but it isn't in the super easy camp either.

Bite the bullet and upgrade to Drush 9+.

TrevorBradley’s picture

Awesome, I knew I was missing something. I'll get working on that drush upgrade immediately.

Thanks @heddn!

TrevorBradley’s picture

OK, back for one more thing. This is 80% chance of another user config error, but I'm seeing something else that might be a real bug.

I'm now running Drush 10.0.2, and PHP 7.2.14

When I remove lines from my import file, then try to run a sync through drush, it seems to work (removing removed lines), except it throws an error:

drush mim my_import --sync
 [notice] Rolled back 38 items - done with 'my_import'
 [error]  Migration failed with source plugin exception: Cannot rewind a generator that was already run 

As best I can tell, $source->rewind() really is being called twice. The first time from MigrationImportSync::sync() at line 57, and then again in MigrateExecutable::import() at line 188.

Any ideas what might be wrong here?

EDIT: Investigating further - getSourcePlugin returns an instance of Drupal\migrate_source_csv\Plugin\migrate\source\CSV - I'm betting the fault lies there. (It's the only thing I'm really doing differently at this point)
EDIT2: Not sure about this... all the $source->rewind() and $source->next() are entirely within migrate_tools.

heddn’s picture

Interesting delema. Sounds like there the use of a generator (yield) somewhere in the source. Is there any other way to process the source (maybe via an object clone) 2 times?

Let's open a new issue for handling generators more cleanly.

TrevorBradley’s picture

charginghawk’s picture

If anybody switches from the old method (using SyncableSourceTrait) to the new method, please add a comment describing the upgrade path. Thanks!

Tim Corkerton’s picture

Is there any documentation on how to make this work? I just read through all the comments and am unsure how to implement this feature. I have just upgraded to Drupal 8.8. I am using Drush 9.7.1 and have the following contrib versions:
Migrate Plus 8.x-4.2
Migrate Tools 8.x-4.5
Migrate Upgrade 8.x-3.1

Is #201 suggesting that it simply a case of running

drush mim my_migration --sync

when I do this I get

Rolled back 296 items - done with 'my_migration'
[notice] Processed 0 items (0 created, 0 updated, 0 failed, 0 ignored) - done with 'my_migration'

which results in all my content being rollback despite there being no changes in the source

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

sannminwin’s picture

I wasn't able to apply patch #188 on migrate_tools 4.4.0
it failed with error.

Could not apply patch! Skipping. The error was: Cannot apply patch https://www.drupal.org/files/issues/2019-11-07/2809433-188_2.patch

Drupal version : 8.6.16
PHP binary : /usr/bin/php7.1
Drush version : 9.5.2

xurizaemon’s picture

@sannminwin seems this was fixed and is available in 8.x-4.5? If you're able to update to that release it, that'd be my recommendation.

nortmas’s picture

FileSize
20.11 KB

Unfortunately, in some cases, the only Drush command is not enough. This addition to patch #170 fixes the issue when the source is not available the status is getting stuck at "migration rollback" and all further migration are not possible. So this patch adds the fix to just reset the status to "Idle" in case the source is not available.

nadavoid’s picture

@nortmas Could you create a new issue to address the scenario you're describing?

harsh.behl’s picture

I can confirm, after applying https://www.drupal.org/project/migrate_tools/issues/2809433#comment-1322... patches it worked for 8.9.13 but 2nd patch not applied gracefully. So, did some modifications and manual work. Also Do not forget to create source plugin.

NicholasS’s picture

@tim-corkerton So from what I can tell adding the `--sync` flag. Should delete items that were previously migrated but no longer in the source feed and that works. But you are correct this seems to be a "simple" delete everything, then insert everything found in the feed. Resulting in new Node ID's every time you run a migration. Nothing is actually updated.

I had to apply this patch https://www.drupal.org/project/migrate_tools/issues/3104268 and now my migrations with a --sync properly update existing content.

I even tried change the ID from integer to string but without the patch from 3104268 #3104268 --sync just deletes everything then inserts new records from the source.

ids:
    ID:
      type: string

[UPDATE: SOLVED]
You need to use the patch from #3104268 and be on `"drupal/migrate_tools": "^5.0"` see https://www.drupal.org/project/migrate_tools/issues/3110335#comment-1410...

ddavisboxleitner’s picture

In 9.1 the order of the arguments to dispatch() changed, and so the previous patches stop working in those higher version of Drupal. In testing, just re-ordering the arguments in all calls to dispatch() works to restore this feature to functioning.

The signature of Symfony\Component\EventDispatcher\EventDispatcherInterface::dispatch() has changed.

https://www.drupal.org/node/3154407