Problem/Motivation
We would like to add support for deleting items that are no longer present in the incoming data when a migration is run. This was a feature of Feeds in D7 and seems like a common need.
Proposed resolution
From @mikeryan in #15:
I really think this should be a distinct "purge" operation that gathers the currently-available source IDs from the source plugin, and scans the map table for any which aren't in that list to invoke the destination rollback on them. Now that I think about it, maybe it could even be an option on rollback -
drush migrate-rollback --missing-from-source my_migration
.
Completed tasks
Two tests added and passing tests.
- MigrateRollbackMissingTest - is this sufficient? it does not test the source all_row property.
- MigrateRollbackMissingMessageTest
Remaining tasks
- Decide whether a simpler approach will work. See the discussion in Comments #133, #135, #136, #138.
- If we stay with the current approach, then address the points raised in Comment #139.
- Manual testing. See instructions in the next section.
- Add documentation.
- Add a change record.
Manual testing
At this point, testing is a little complicated. The current patch (in Comment #136) adds an interface and a trait. In order for selective rollback to work, your migration source plugin needs to implement the interface. An easy way to do that is by using the trait. For examples, see the code snippet below the following steps.
- Install Migrate Plus, at least version 8.x-4.2, for #3045346: Add missing source item event.
- Apply the current patch from this issue.
- Apply the current patch from #2863426: Implement rollback of items no longer in source data.
- Create a source plugin that implements the new SyncableSourceInterface interface.
- Update your migrations to use your custom source plugin.
- Clear caches.
- You can now roll back your migration with the new option provided by #2863426: Implement rollback of items no longer in source data:
drush mr my_migration --missing-from-source
.
For Step 4, you can follow the example in the SyncableEmbeddedDataSource plugin that is added in this issue. That class is used in the new MigrateRollbackMissingTest test class. Here is a class that @marvil07 wrote for a client project that uses the patch from this issue:
namespace Drupal\my_migrate\Plugin\migrate\source;
use Drupal\migrate\Plugin\MigrationInterface;
use Drupal\migrate_plus\Plugin\migrate\source\Url;
use Drupal\migrate_tools\Plugin\migrate\source\SyncableSourceTrait;
use Drupal\migrate_tools\SyncableSourceInterface;
/**
* A syncable url source using SyncableSourceTrait.
*
* @see Drupal\migrate_plus\Plugin\migrate\source\Url
*
* @MigrateSource(
* id = "syncable_url",
* source_module = "my_migrate"
* )
*/
class SyncableUrl extends Url implements SyncableSourceInterface {
use SyncableSourceTrait;
/**
* {@inheritdoc}
*/
public function __construct(array $configuration, $plugin_id, $plugin_definition, MigrationInterface $migration) {
parent::__construct($configuration, $plugin_id, $plugin_definition, $migration);
$this->setAllRowsFromConfiguration();
}
}
User interface changes
N/A
API changes
Data model changes
N/A
Comment | File | Size | Author |
---|---|---|---|
#206 | 2809433-206.patch | 20.11 KB | nortmas |
#188 | interdiff_184-188.txt | 6.45 KB | heddn |
#188 | 2809433-188.patch | 10.29 KB | heddn |
Comments
Comment #2
damontgomery CreditAttribution: damontgomery at Acquia commentedHere is my patch.
Comment #3
damontgomery CreditAttribution: damontgomery at Acquia commentedThis is the related patch for migrate_tools to show the number of deleted items, https://www.drupal.org/node/2809437
Comment #5
geerlingguy CreditAttribution: geerlingguy commentedAdded related issue #1416672: Optionally delete destination when source data is removed (with tests) for the D7 Migrate module, and #2753365: Support deletion of migrate destination (which might be best marked a dupe of this issue).
Comment #6
damontgomery CreditAttribution: damontgomery at Acquia commentedThis applies to 8.1.10. I'll try to look into what's needed to test against 8.3
Comment #7
damontgomery CreditAttribution: damontgomery at Acquia commentedTried to make a patch for 8.2.x.
Comment #8
mikeryanAlthough I helped spitball this approach, I'm still not fond of deleting as a side-effect of import - I feel purging "missing source records" should be a distinct operation.
Given a moment's thought (and looking at the concrete patch), one big problem is that it counts on the import process being complete within a single request. If it's not - if it's broken up through a batch UI, restarted in drush because it's running out of memory, or simply run with a --idlist or --limit - every record not processed in that batch will get deleted. Rather hazardous...
Comment #9
damontgomery CreditAttribution: damontgomery at Acquia commentedThanks for the feedback, Mike.
I agree with those concerns. I think the alternative is to create a new process that will re-read the source data and remove all records no longer in it which is maybe your suggestion. I'm not sure exactly how to go about doing that.
For our purpose, our migration always happens in one go and we may use this patch for now. We can hopefully use this issue if someone (maybe me) can work on the alternative approach.
Thanks again for the direction!
Comment #10
geerlingguy CreditAttribution: geerlingguy commented@mikeryan - A couple more thoughts, too—it's probably more rare to have migrations that would be configured to delete items that aren't in the source, and I know in the times that I've needed to do it, the source always contained all relevant records, and migrations were always run on cron jobs (thinking of three different situations where this was needed), so even if some records were deleted that shouldn't have been, they would've been recreated on the next run...
But it may be good to make sure there are strong caveats anywhere this ability would be documented!
Another option, too—could we make it so you could unpublish rather than delete? I'm not sure if this would be universal across entities, but for nodes at least, that's a safer option than deletion.
Comment #11
iMiksuDespite the three comments above, I've still decided just in case to update the patch of #7 that was having some offset while testing if the patch still applies.
Comment #12
geerlingguy CreditAttribution: geerlingguy at Acquia commentedUpdated patch for 8.2.x HEAD—there were a few lines failing. I also fixed the comments to use inline (
//
) style inside functions/methods. No interdiff, it was difficult getting the original patches to apply alongside the 8.2.x updates and generate a reasonable interdiff.Comment #13
mikeryanComment #15
mikeryanI still don't like the idea of deletion happening as a side-effect of import, and having the source plugin invoking rollbacks on the destination plugin makes me feel all icky inside. I really think this should be a distinct "purge" operation that gathers the currently-available source IDs from the source plugin, and scans the map table for any which aren't in that list to invoke the destination rollback on them. Now that I think about it, maybe it could even be an option on rollback -
drush migrate-rollback --missing-from-source my_migration
.I'm thinking also that for scalability, and to avoid problems with batching, the source IDs would be better stored in a temporary database table than in memory.
Comment #16
geerlingguy CreditAttribution: geerlingguy at Acquia commentedI like that approach... even just having a migrate-purge-missing option or something like that would work for me.
Comment #17
Pavan B S CreditAttribution: Pavan B S at Valuebound commentedLine exceeding 80 characters.
There is one line which has more than 80 characters. Applying the patch, please review
Comment #18
kevin.dutra CreditAttribution: kevin.dutra at Workday, Inc. commentedHere's a first stab at the approach that @mikeryan suggested, where this is a separate type of operation. I'm not sure I'd call this fully baked yet -- it still keeps the source IDs in memory -- but it does at least work for data sets that aren't huge.
No interdiff for this since it's a pretty radical departure from the previous patches. It doesn't look like a corresponding
migrate_tools
issue was ever opened to expose this to Drush, so I'll do that shortly.Comment #19
danmuzyka CreditAttribution: danmuzyka at Phase2 for Workday, Inc. commentedI have a few minor suggestions for this patch.
Could the migrate source plugin manager be dependency-injected into the constructor for the
MigrateExecutable
rather than invoked through the static method here?Typo in the word 'avaialble'; should be 'available'.
At the moment, I don't have a great suggestion for avoiding loading all of the source IDs into memory and iterating through them.
Something else to consider is whether there is a way to refactor the code into some helper methods that
rollback()
androllbackMissingItems()
can invoke, since right now there is a lot of duplicate code.Overall this approach makes sense to me.
Comment #20
kevin.dutra CreditAttribution: kevin.dutra at Workday, Inc. commentedThis patch addresses the code review feedback from #19.
Per the suggestion, I broke one piece of logic out into a helper method. You're right, there is a little bit of duplication still going on, but I had some difficulty thinking of ways to extract more code and still have it be fairly readable.
Comment #21
danmuzyka CreditAttribution: danmuzyka at Phase2 for Workday, Inc. commentedOK, makes sense, not a huge issue.
Could some of the variables (
$id_map
,$destination_key
, and$destination
) be passed into this method as parameters, instead of re-initialized here, especially since some of them are already set by the context of the calling method?Nothing else stands out to me at the moment.
Comment #22
quietone CreditAttribution: quietone as a volunteer commentedGlad, I finally had some time to look at this. Quite interesting. I think the following needs to be addressed.
Comment #23
ohthehugemanatee CreditAttribution: ohthehugemanatee as a volunteer commentedheads up: @joelpittet says this has a merge conflict with #2844595: SQLBase breaks GROUP BY queries, which is currently RTBC. So that may need to be resolved, too.
Comment #24
vasi CreditAttribution: vasi at Evolving Web commentedI really like this approach!
in_array($source_key, $source_items)
in a loop will be very slow if there are a lot of $source_items. But that leads to...Comment #25
heddnTaking a quick look at this, we do break BC by adding to the migrate executable interface
rollbackMissingItems
.Comment #26
cosmicdreams CreditAttribution: cosmicdreams commentedHey checking in here. I ran across this issue because I attempted to write the same functionality. It looks like there's a function here named rollbackMissingItems that is never used.
Is there some kind of drush function that is missing from this patch that would use it?
Comment #27
kevin.dutra CreditAttribution: kevin.dutra at Workday, Inc. commented@cosmicdreams, the corresponding Drush functions for this API live in the
migrate_tools
module, which is in the contrib space. See referenced issue #2863426: Implement rollback of items no longer in source data.Comment #28
cosmicdreams CreditAttribution: cosmicdreams commentedHey @Kevin.dutra Thanks a ton. I'll see if can patch this in and kick it's tires.
Comment #29
maxdev CreditAttribution: maxdev at Youwe commentedSince in 8.3.5 there were changes related to migrate. I made patch compatible with latest version.
Comment #30
maxdev CreditAttribution: maxdev at Youwe commentedComment #32
heddnThis should add a new Interface that is DeletableMissingSource and implement it in core's executable. If we don't we'll break drush, drupal console and anyone who has created executables. Then check if we have an implementation of that interface before calling it. That's the only way I can think of doing this without breaking BC in a really bad way.
Also, needs IS summary update and tests.
Comment #33
rauch CreditAttribution: rauch commentedIs there a patch for Drupal 8.4.x?
Comment #34
mstrelan CreditAttribution: mstrelan commentedThere have been a few mentions of unpublishing content in this issue, I wanted to expand that though to users with the ability to block missing users instead of deleting them.
Comment #35
Peacog CreditAttribution: Peacog as a volunteer commentedI've been using this patch for a while. It's working well. Here's a re-roll for 8.4.x (it doesn't address the BC issue raised in #32)
Comment #36
jofitz CreditAttribution: jofitz at ComputerMinds commentedOnce this is confirmed to pass tests we can take a look at @heddn's comments in #32.
Comment #37
jofitz CreditAttribution: jofitz at ComputerMinds commentedComment #38
jofitz CreditAttribution: jofitz at ComputerMinds commentedStill needs tests.
Comment #39
jofitz CreditAttribution: jofitz at ComputerMinds commentedSetting back to NW for tests.
Comment #40
ichionid CreditAttribution: ichionid commentedIs it safe to remove the entries from the 2 tables create for every sync file when deleting the entity?
We're on 8.3.7 version for drupal and we don't have the patch available and kinda need urgently the deletion.
Comment #41
mullman CreditAttribution: mullman commentedWhat are the pro's and con's for simply performing a full rollback then a fresh import for feeds that need to be re-iterated daily? Is there a better approach?
Comment #42
Sutharsan CreditAttribution: Sutharsan at LimoenGroen commented@mullmann the one con is the change of IDs. In our case we are importing/syncing users from a CSV file. Changing user ID's is not acceptable, all users will be logged out immediately.
Comment #43
MegaChriz CreditAttribution: MegaChriz as a volunteer commentedFor Feeds a similar feature exists. In there though you can also choose to unpublish content that is no longer provided by the source (instead of deleting it). Recently, I ported (and improved) the feature to the D8 version: #2939503: Unpublish/Delete nodes not included in feed. Because for the D7 version I sometimes got requests to be able to take some other action on the content than unpublishing or deleting, the provided patch for the D8 version supports choosing an action plugin to apply on the content (currently limited to non-configurable action plugins to make the initial implementation easier).
The feature had been a quite a big thing for Feeds when it got implemented for the D7 version: #1470530: Unpublish/Delete nodes not included in feed.
Maybe such idea (applying an action plugin to content no longer provided by the source) could work for Migrate as well.
The D8 version of the feature also supports to execute the action in batches (either via the batch API or via a queue worker). Not sure if that would make sense for Migrate too.
Comment #44
kumkum29 CreditAttribution: kumkum29 commentedHello,
Actually, the patch #38 isn't included in the migrate module (in core).
Do you think that this patch could be soon commited? Or do I use another method to unpublish contents if items no longer exists in the source? In migrate, Is there a problem to implement this new class?
This feature is very important for migrate and this feature is present in feeds module.
Thanks.
Comment #45
heddnThis is set to NW still for tests. The thing is that many of the migrate maintainers haven't needed this issue ourselves. So this is highly dependent on others to step up and contribute.
Comment #46
lquessenberry CreditAttribution: lquessenberry commentedIs there perhaps a way to contain this functionality in a separate module that doesn't patch core?
Comment #47
kumkum29 CreditAttribution: kumkum29 commentedAs sayed by lquessenberry patch core isn't a good idea with the futur updates. If maintainers don't have need of this feature, this feature is very usefull for other users... And making a separate module is a good thing (without patch).
I have attached to my reply a custom module developped by the MTech team (I payed this team to develop a temporary solution). Maybe, it's a starting point to find a solution, and develop a stable module...
P.S. With this custom module we need to patch the core with #38
Comment #48
geek-merlin+100 for this patch.
As it came to light in #42, #43, in some use cases the right thing to do is delete, in some use cases we need some kind of unpublish action. How we might implement this (maybe memo to me):
* Add a MigrateDeletableDestinationInterface extending MigrateDestinationInterface with a method delete($destination_key) (we need this for BC, so add "Cleanup before D9" comment)
* Use the delete() method if available, otherwise rollback()
Of course, given that some implementations of delete() (contratrary to rollback()) will not delete, but unpublish, we can bikeshed on how we name things here. The point here is that we want to let the destination plugin know when a source item has gone, and decide on the right actions. Only having the listener would be really bad DX for this.
Comment #49
mglamanI have unfortunately not had time to review this. So, I apologize if this is out of sorts, but I wanted to reply based on #48
I'd rather the default be an event which is thrown "This MIGRATION_ID no longer has SOURCE_VALUE"
Because with the following
I'd have to make custom destinations for all of the existing sources and use those. I think it would be much easier to react to an event and say "okay, delete" or "great, unpublish"
Comment #50
geek-merlin> I'd have to make custom destinations for all of the existing sources and use those.
Yah, that'd be a pita. An event helps in this case.
OTOH, in the long run we want destination plugins to handle that, maybe configurable.
I want in D9 (or today with a contrib decorator or custom destination) to say:
So implementing both should give us both benefits with little effort.
Comment #51
geek-merlinRe-thinking it, i agree that an event is enough for now. We might ourselves listen to it and trigger the default rollback() action, unless some other listener does stopPropagation().
Comment #52
geek-merlinAs of #32, @heddn:
> This should add a new Interface that is DeletableMissingSource and implement it in core's executable. If we don't we'll break drush, drupal console and anyone who has created executables.
Some code grepping does not show me any alternate implementations or decorators (the parameter usage below is not affected). So why not better add the method to MigrateExecutableInterface and add a change note?
EDIT: Found a decorated instance in \Drupal\migrate_run\MigrateExecutable. It extends Drupal\migrate\MigrateExecutable (which we extend) so would not break too.
EDIT2: (to myself)
> So why not better add the method to MigrateExecutableInterface and add a change note?
Because changing an interface breaks implementations. Simple as that.
Comment #53
geek-merlinPatch flying in that implements #49/#51, a MISSING_SOURCE_ITEM event that falls back to rollback().
Also setting target to 8.7.x, rite? Still needs tests so leaving NW.
Comment #54
geek-merlinComment #55
geek-merlinI tested this patch a bit with the drupal_run drush9 patch, and some stuff worked. What it did not detect is when a source item is missing as of a process skip_row result.
Patch flying in that fixes that for me, by copying some lines of processRow().
Comment #56
KarlSheaIf one of your rows throws a message (I saw this with the skip_on_empty plugin)
rollbackMissingItems()
will still try to save the message as it accumulates source IDs. That will throw an exception becausesaveMessage()
is looking for$this->sourceIdValues
which hasn't been set.Comment #57
kumkum29 CreditAttribution: kumkum29 commentedHello,
I have some questions to better understand how this patch/evolution works.
If the items is no longer in the source data, the content can be unpublished/deleted.
1. In the map table, the line of the item is deleted or always present?
2. If the item is again in the source data, can we re-publish the content and recreate an entry in the map table? (if the entry was deleted with a previously migration).
Thanks for these precisions.
Comment #58
mikelutzComment #59
mikelutzStraight up re-roll of #56 against the latest dev changes.
Comment #60
mikelutzBah, missed a couple hunks. THIS is a re-roll pf #56
Comment #61
mikelutzWould an approach like this help solve the memory/scalability concerns?
Have the source plugin check whether the id exists. SourcePluginBase can get the id list from a generator, and loop through it. It would take more time, but under the hood in_array loops through all items in an array and jumps out when it finds the item anyway (admittedly in less cycles per loop, but if you don't want to load all the ids in memory, I don't see an alternative)
I don't necessarily like resetting the iterator like I did here, but this function is called in a safe spot on a source plugin instantiated for this purpose, so it should be fine, even if we added some sort of mechanism to make sure we don't have conflicts with other means of iterating through the source.
The other advantage here is that the individual source plugins can create a more efficient means of getting the ids if they want. Sql can have a query for it (if not the base, then any specific sql plugins), and embeddable and csv(if it makes it in) can do better too.
Thoughts?
Comment #62
heddnI too am a little worried by the iterator reset. But if we have tests that show this doesn't cause issues, then we should be fine. And having a base implementation should work for most small data sets. And like you allude, if the data set is huge, someone can always build a custom solution.
I'm not sure I like the name here. I would think this should be tacked onto a source plugin but was surprised to see it added on the executable.
This seems out of scope. Can we not do this in here and add to a follow-up?
Generator and return bool? The docs don't line up.
Nit
Ah, we do add it here too. I don't like the idea of shared interface here. Let's keep them separate perhaps.
We recently added a method to easily filter the idMap. Can we use that instead of this?
Comment #63
heddnThis blocks feeds and other uses cases. We talked about this at the migrate maintainers meeting with @maxocub @mikelutz @heddn and thought that an FilterIterator might help here.
Comment #64
quietone CreditAttribution: quietone as a volunteer commentedJust a brief read of the code and wanted to mention two things
Can $ids be $source_ids and $iterator be $source? Just want to make it easier to read.
This makes many comment > 80 characters.
Comment #65
badrange CreditAttribution: badrange commentedThis might be a strange question, but I'll ask it anyway: Are anybody using any version of this patch in production?
Comment #66
KarlSheaI'm using it, or at least will be when the site goes live within the month.
Comment #67
kumkum29 CreditAttribution: kumkum29 commentedHello,
as Badrange said, can we use this patch in a production site? This patch is a good step forward for the module.
For the moment, can we consider using this patch because it would soon be included in the module code?
Thanks.
Comment #68
biguzis CreditAttribution: biguzis at Wunder commentedHave migration with ~8000 items and about 800 items should be removed with this, but after some time i got `MySQL server has gone away` even before something is actually rolled back..
Comment #69
biguzis CreditAttribution: biguzis at Wunder commentedbtw - i rerolled patch to make able to apply on 8.6.3
Comment #70
mikelutzThis is new functionality, It won't be eligible for backporting to 8.6. Submitted patches need to be based off 8.7.
Comment #71
geek-merlin> #68 "MySQL server has gone away"
That's unrelated. Look into #2829492: Big file backup breaks with "MySQL server has gone away" and do
SET SESSION wait_timeout = [enufftime]
Comment #72
wizonesolutionsThis is a patch that works with 8.6.4. Please ignore it. I just need it for my
composer.json
:)Comment #73
kumkum29 CreditAttribution: kumkum29 commentedHello !
we are a little lost with all these posts.
Can someone summarize and explain which patch to use for drupal 8.6 version? And what to write in our config file to use this new functions?
With the patch #38 I get problem to patch the core with Drupal 8.6.4.
Thanks for your help and explanations !
Comment #74
wizonesolutions@kumkum29: Core changed in 8.6.4. If you need a patch for that, use mine (#72). Don't know the answer to your other question.
Comment #75
kumkum29 CreditAttribution: kumkum29 commentedthanks Wizonesolutions ! it's ok with your patch.
I see in several replies, some settings to active the delete behavior in the migration config: e.g. action_on_delete: unpublish...
is it functional? If yes, how to proceed? Can you gieb me more examples on these new plugins/functions?
Thanks.
Comment #76
wizonesolutionsI can't, unfortunately. I only updated this patch in an existing system, but I haven't dealt much with what it actually does. You'd probably have to read it or get someone to read it for you to get an idea.
Comment #77
praandy CreditAttribution: praandy commentedHi,
I have used the patch #38, but the new #72 doesn't work so well (now I use 8.6.5).
The problem is the patch in Drupal\migrate\Plugin\migrate\source\SourcePluginBase, the function "sourceIdsExists" load all records of the source every times that the system checks if a record is present or not.
I changed the function so:
and in Drupal\migrate\MigrateExecutable
So, it works.
Could you find a more elegant solution?
Comment #78
achraf.noomane CreditAttribution: achraf.noomane at Interpersonal Frequency commentedThe patch was not applying after Drupal security updates SA-CORE-2019-001. re-rolled for Drupal 8.6.7.
Comment #79
heddn#77 needs to be addressed. Loading all source records seems like a really bad idea. Can't we have a base solution in SqlBase that does a source query?
Comment #80
mErilainen CreditAttribution: mErilainen at Wunder commentedCould the source plugin provide a sourceIds() public function or such which could be used instead?
I tested locally using the patch from #78 with 966 items and it takes always about 47s check if something should be rolled back because for each item all the source ids will be iterated through to see if it still exists in the source.
I tested the idea from #77 by writing a sourceIds() function and calling it once and checking if the $source_key exists in the array: The check now took 3,4s instead when no items needed an actual rollback.
Comment #81
mErilainen CreditAttribution: mErilainen at Wunder commentedHere is the patch related to my previous comment, replacing sourceIdsExists() with sourceIds() which is called only once. Interdiff included.
Edit: Patch is against 8.6.x.
Comment #83
samerali CreditAttribution: samerali at Therefore commented@mErilainen Thanks for your patch at #81
This patch does not provide drush command or ui option to run this right? for now we need to use
->rollbackMissingItems();
?Comment #84
mikelutzComment #85
biguzis CreditAttribution: biguzis at Wunder commented@samerali Here is patch for Migrate tools with drush command - https://www.drupal.org/project/migrate_tools/issues/2863426
Comment #86
kumkum29 CreditAttribution: kumkum29 commentedHello,
Is there any progress to use a specific plugin inside the config file for delete/unpublish the content?
Today, there is no 'stable' solution to unpublish content with migrate.
We can define a budget/payment for this job, if a maintainer/developer include this feature in migrate.
Thanks.
Comment #87
samerali CreditAttribution: samerali at Therefore commentedThank you @biguzis this worked great! saved tons of time.
Comment #88
phenaproximaDiscussed with the other Migrate maintainers. We decided that we're not going to put this in core right now because it would add complexity to the already-way-too-messy Migrate API -- complexity which is not needed for any core migrations. Instead, we can implement this in Migrate Tools (which will be faster and easier than core, anyway). So let's do that.
Comment #89
benjifisherNow that this issue is attached to the Migrate Tools module, I hope we can get a patch that combines this with #2863426: Implement rollback of items no longer in source data.
In the mean time, notice that the patch from #81 conflicts with #2999478: Refactor MigrateExecutable::rollback(), so the patch will not work with Drupal 8.7.
Comment #90
quietone CreditAttribution: quietone as a volunteer commentedWas working on rerolling this to be in Migrate Tools and realize I don't know much about how migrate tools does things. And too early to know the right question to ask. :-( Will try again tomorrow.
Comment #91
quietone CreditAttribution: quietone as a volunteer commentedNot making progress here.
Comment #92
vidorado CreditAttribution: vidorado at Biko2 commentedFor those applying patches with Composer, note that patch in #81 applies to drupal/core, not to drupal/migrate_tools
Besides that, the patch works ok! I have tested it with the drush command
migrate:rollback --missing-from-source [migration_name]
. That new command option is added by the patch in #2863426: Implement rollback of items no longer in source dataComment #93
quietone CreditAttribution: quietone as a volunteer commentedA possible test.
Comment #94
edysmpnice test.
Comment #95
quietone CreditAttribution: quietone as a volunteer commentededysmp, thanks,
OK, now combine the test and the patch. Not working yet. failing to get any rows from the map.
Comment #96
quietone CreditAttribution: quietone as a volunteer commentedTry again.
Comment #97
quietone CreditAttribution: quietone as a volunteer commentedJust some cleanup. The test is failing because in rollbackMissingItems, sourceIds($source); always returns an empty array because the source is invalid. What needs to happen for it to be valid?
Comment #98
heddnLet's put the event over in migrate_plus. I think that is a better location for that type of thing and follows the already established pattern.
I don't think this is reachable code, the next part?
Comment #99
quietone CreditAttribution: quietone as a volunteer commented1. Fixed
2. Yes, that was copy/paste from the original. I left it broken since the source is always invalid and I was trying to fix that first. Still don't know why it is invalid.
3. Agree. I just don't know how to do it.
Comment #100
quietone CreditAttribution: quietone as a volunteer commented98.3. Maybe fixed.
Moved $source to a property and cleanup.
What is still needs to be done here is find out why the source plugin created in migratexecutable return no rows. Anyone know?
Comment #101
quietone CreditAttribution: quietone as a volunteer commentedAdd related issue #3045346: Add missing source item event
Comment #102
mikelutzComment #103
heddn$this->t($this->migration->getStatusLabel())
is what is the issue. It is generally discouraged to pass variables or function calls intot()
About the underlying issue of returning data from embedded_data in #99, I think I'd need to debug it. Data rows should be possible.
Comment #104
quietone CreditAttribution: quietone as a volunteer commentedSeem to have lost the test along the way here, so restoring that.
The problem I see with the test is how to change the data_rows of embedded source. There doesn't seem to be a way to do that. I tried making a second source plugin with the correct data and changed the map and message table names, but then the source is invalid. When it is invalid then all rows are returned. It probably shouldn't do that. Thought maybe test like HighWaterTest which using a sql source plugin, but that ran into problems with hooks in migrate_tools.
Comment #105
quietone CreditAttribution: quietone at Acro Commerce commentedTrying to fix the t() .
Comment #106
quietone CreditAttribution: quietone at Acro Commerce commentedremove extra line
Comment #108
mikelutzComment #109
mikelutzYour source plugin isn't returning rows because the rows are already imported, and source plugin base filters out rows that have already been processed by default. You can run $id_map->prepareUpdate() to get the source to return all rows, but that sets the update flag and because we aren't running a migration, it won't be cleared. It feels like something we should fix in core, but to work around it here in contrib, I think you will have to run prepareUpdate, and then at the end resave all the idmap rows to reset their status to imported.
If these two migrations have different ids, then they will have different id maps, and you won't be able to roll back
Comment #110
marvil07 CreditAttribution: marvil07 at Isovera commentedHere a new version with the following changes:
rollbackMissingItems()
only to migrations which contain a source plugin implementing theSyncableSourceInterface
interface. This follows the approach on #2809433-81, see more reasoning on this below.SyncableSourceTrait
trait, which is the main helper for migration source classes to implement theSyncableSourceInterface
interface.I have moved both
sourceIds()
andidList()
methods there.An idea from #2809433-81 was missing: the "all rows" source configuration option and the related changes over
next()
; that is the reason I think we cannot work-around sources not implementing theSyncableSourceInterface
interface, which the previous patch was suggesting.Also, I have added a helper to use during the constructor, and the modified
next()
method.Now that we are not modifying core, we cannot rely on that change on
SourcePluginBase
class, and instead any source needing to use the new functionality would need a class implementingSyncableSourceInterface
interface.syncable_embedded_data
migrate source plugin, as an example on how to use the trait, and also used during the related test.syncable_embedded_data
migrate source on the test.I would suggest that we should add the new event directly here, because this functionality is only implemented in migrate_tools, so it will be confusing if that event is implemented by other module; as I suggested on the related ticket.
Some pending items on the fixme at the new trait: Some called methods here assume the class using this trait is a SourcePluginBase child, e.g. getIterator(). Check all, and if possible only use methods from interfaces.
Comment #111
marvil07 CreditAttribution: marvil07 at Isovera commentedI think the test failure is related with migrate_tools not being patched and exposing the new event constant, locally I where I have the latest migrate_tools related patch with that, I got the test running.
Self-reminder: running this test locally:
php ./core/scripts/run-tests.sh --sqlite /tmp/d8test --verbose --class "Drupal\Tests\migrate_tools\Kernel\MigrateRollbackMissingTest"
Comment #112
quietone CreditAttribution: quietone at Acro Commerce commentedThanks marvil07!
This just fixes the coding standard errrors.
Comment #114
quietone CreditAttribution: quietone at Acro Commerce commentedAdd an @requires for migrate_plus
Comment #116
quietone CreditAttribution: quietone at Acro Commerce commentedMoving the event back to migrate_tools to get the test to pass. I'm leaning to have the event in migrate_tools.
Comment #117
quietone CreditAttribution: quietone at Acro Commerce commentedJust noting the todos here.
Comment #118
heddnI've committed the work over in #3045346: Add missing source item event, so this will need a re-roll.
Comment #119
heddnAnd BTW, the rational for putting it over in plus is that then things like runner, manifest or anything else can use the feature without requiring tools as well.
Comment #120
marvil07 CreditAttribution: marvil07 at Isovera commented@quietone, thanks for the iterations here!
@heddn the patch from comment 114 is the most updated using migrate_plus event, changing visibility of patches to follow latest updates.
Comment #121
marvil07 CreditAttribution: marvil07 at Isovera commentedMore patches hidden.
Comment #122
quietone CreditAttribution: quietone at Acro Commerce commentedReroll and added a test to test the messages returned on failures, but it isn't working, can't access the message. Can someone sort that out?
Sorry, no interdiff as I am running out the door ... the bus does not wait.
Comment #123
quietone CreditAttribution: quietone at Acro Commerce commented@heddn, thanks for the commit in Migrate Plus, and the explanation.
Fix coding standard.
Comment #125
marvil07 CreditAttribution: marvil07 at Isovera commented@quietone, thanks for the extra tests.
I have been looking at the methods on
MigrateTestBase
, and it seems like thestartCollectingMessages()
/stopCollectingMessages()
is intended to be used differently, and that is why messages were not being recognized. SeeMigrateTestBase::display()
.I have changed the test to use that other approach.
@quietone, any reason to use
RollbackMissingMessageTest
instead ofMigrateRollbackMissingMessageTest
, to follow the other,MigrateRollbackMissingTest
?Comment #126
quietone CreditAttribution: quietone at Acro Commerce commented@marvil07, thanks. I looked at MigrateDefaultLanguageTest for a way to get the messages.
The reason it is RollbackMissingMessageTest is because I am weary of long file names. It is changed back.
The test is modified to not run the import since that it not needed for the test.
Another wee thing is that the use of quotes in the messages in MigrateExecutable is now different than that of it's core counterpart and different than the way quotes are used in Exception messages, https://www.drupal.org/docs/develop/coding-standards/php-exceptions. I'm pointing that out because I prefer consistency.
Comment #127
quietone CreditAttribution: quietone at Acro Commerce commentedTODOs:
In SyncableSourceTrait
And the tests do not test the new all_rows configuration property on the source. It is false in MigrateRollbackMissingTest
Comment #128
quietone CreditAttribution: quietone at Acro Commerce commentedComment #129
marvil07 CreditAttribution: marvil07 at Isovera commented@quietone, thanks for the extra changes!
On the implicit dependency on SourcePluginBase
I have been trying to remove the implicit dependency on
SourcePluginBase
, but I could not find a clean way to do so from here.SourcePluginBase
is an opinionated implementation, and therefore itsnext()
implementation, which is the base for the same method in the new trait proposed in this change, also assumes many internal details.Splitting
SourcePluginBase
in different traits/interfaces could help us define which source plugins the newSyncableSourceTrait
is compatible with, but changing that core class is definitely out of scope of this ticket.In any case, I ended up splitting a bit of it, but it is just a start, see #3048413: Split iterable part of SourcePluginBase into a trait.
Instead, I have described on the trait docblock the expected requirement for classes using the trait, except for the
mapRowAdded
data member, which is probably not used anywhere and will probably be removed at #3017456: Remove SourcePluginBase::$mapRowAdded.As mentioned there,
SyncableSourceTrait
is mainly intended forSourcePluginBase
children, but if someone really wants to use it, it is still possible with the listed requirements.Considering this new trait will help to easily expand most source plugins, which are
SourcePluginBase
children, as the addedsyncable_embedded_data
, I think it is a good trade-off to have it.On the messages
@quietone, I think you are right, we should follow a bit more the standard for php exceptions, since it is the most related standard about migrate messages.
The main difference is that migrate messages have format, and are translatable strings; but even then, using single quotes move the code one more steps towards standard messages.
I have changed the messages to use single quotes, and the related test accordingly.
Comment #130
marvil07 CreditAttribution: marvil07 at Isovera commentedFixing the coding standard error and changing some docblocks on the message related test to be a bit more accurate.
Comment #131
quietone CreditAttribution: quietone at Acro Commerce commented@marvil07, thanks this is coming along nicely.
I would still like a test of the all_rows property and maybe change member to property in the SyncableSourceTrait documentation but that should not hold up a review, particularly of the new trait.
Anyone coming here to review please look at the recently added SyncableSourceTrait, thx.
Comment #132
benjifisherI am just starting to look at this. Forgive me if I start by picking some nits. I also plan to do some testing.
Can we add a description to the doc block? Something like "Migrate source plugins can implement this interface in order to synchronize migrated content with the source. That is, migrated content can be added, updated, or removed to match the source." Maybe also refer to the new trait.
Wow, that is a lot! I think that "use case of" should be "use case for". I think that "in a similar way than" should be "in a similar way to" or simply "similar to". I do not see
initializeIterator()
outside this comment, so I assume we can remove it from the list. Perhaps document the new configuration key: "Available configuration keys: -all_rows
: ..."Since this function is public, maybe it should be added to the interface. Or if it is just used from classes that
use
this trait (which is what I see in EmbedDataSyncableSource) then we can make it protected.I do not like the name nor the description. Maybe
setAllRows()
and "Set the $allRows property based on configuration."It seems odd to set it from the configuration like this instead of having a simple setter, but maybe I will understand that as I continue to read the code.
For the sake of future reviewers: these lines (adding the first condition) are the only difference between
next()
in this trait and the version in SourcePluginBase.Why not just combine this with the
idList()
function?Can we make the word order in the class name match the word order in the plugin ID? That is, either make the plugin ID
embedded_data_syncable
or else make the class name SyncableEmbedDataSource.use
s the trait, then we could skip the configuration and theallRows
property. That would be simpler. But I guess we are only doing that because we cannot change the core classes. When we create our own source plugins, we can alwaysuse
the trait and then we need a configuration key to determine the behavior.Can we break up this long line?
Do we need the type hint here? Can we break up the long line?
$source
a local variable instead of a class property. (Am I missing something?) Maybe give it a more descriptive name, like$syncable_source
.Comment #133
heddnCan we get the super simple solution here addressed first? Let's support syncing without needing any optimizations from the source. Meaning, we scan through the entire source and the idmap and sync things. Let's leave interfaces and improving the performance of things using dedicated methods and interfaces for follow-ups.
Comment #134
charginghawk CreditAttribution: charginghawk as a volunteer commentedWe tested a few scenarios:
Only the 4th worked for us.
Scenario 1
The
drush migrate-rollback --missing-from-source my_migration
command ends up with errorUnknown option: --missing-from-source.
The #130 patch does not contain the "--missing-from-source" option, so we still need a patch adding that option.
Scenario 2
After adding two patches to migrate_tools without a patch applied to core the
drush migrate-rollback --missing-from-source my_migration
command ends up with errorMigration 'my_migration' does not support rolling back items missing from the source
.Scenario 3
Same results as in Scenario 2
Scenario 4
Only this one scenario is working as expected.
Comment #135
quietone CreditAttribution: quietone at Acro Commerce commentedRe #133
What do you have in mind? Back in #84 there was a suggestion to run prepareUpdate(), and then at the end resave all the idmap rows to reset their status to imported. I am not convinced we should set the row status to imported after this rollbackmissing operation since that would also change failed and skip status to imported.
Comment #136
marvil07 CreditAttribution: marvil07 at Isovera commented@quietone, we can definitely test the new all rows property.
@benjifisher, re 132:
1. added your description
2. thanks for the suggestions, now added. On
initializeIterator()
, it is strictly not needed; since is used from within the already mentionedgetIterator()
from SourcePluginBase, so I removed it from the list.3. Good point,
protected
visibility should be OK. Not on the interface because it's a helper, to avoid doing it from each source plugin constructor. I also changed the name of the method tosetAllRowsFromConfiguration()
, it is a similar to a setter, but adding the last part to make it clear it is not a simple setter, since it does not receive a value to set.4. I have added a note on the description to suggest it should be kept on sync with core, and added the last commit hash from core I have checked it against.
5. It was about keeping the structure from the initial patches in core, I am fine either way, @heddn let us know what you think.
6. Small things are important, thanks for pointing it out, changed the class name, but to
SyncableEmbeddedDataSource
to follow a bit more core embedded_data source.7. Yes, sadly it was decided to not include the change in core, and the trait is an attempt to make it as easy to use as possible implemented from contrib. The best case would naturally be do this change directly on core's
SourcePluginBase
.8. I was also thinking on keeping the event in the same place, but based on the outcome of #3045346: Add missing source item event, it seems like the idea is to allow other modules to use the event without needing to depend on
migrate_tools
.9. Sure.
10-11. now in two lines. Notice the data property is defined on core's
Drupal\migrate\MigrateExecutable
, I think it is OK to override it here to provide the extra configuration to the source object. Using a local variable can also work, but I think semantically it would be better to override; no change for now, feedback welcome.@heddn, re 133,135:
The current reason a syncable source needs to be overridden is to be able to retrieve all rows, which is not possible without overriding
next()
.The point @quietone makes is relevant here too.
Can you elaborate on what approach will you take to avoid skipping some of them as the current
next()
on SourcePluginBase does?@charginghawk, re 134:
Scenario 1 works as expected. It is for core, and only the API, no drush commands.
Scenario 2 works as expected. It is for migrate_tools, and API from the patch here, drush command option for the other ticket. The error is the expected, the source needs to implement
SyncableSourceInterface
, as it is checked in one of the tests.Scenario 3 works as expected. Both core and migrate_tools patches are not intended to be used together, and the output is about the expected error from the migrate_tools latest patch.
Scenario 4 works as expected. That is a valid solution, but it was decided to not be accepted by core, at least for now, and therefore the migrate_tools patches started.
What it is probably needed for your use case is to extend the source you are using with the provided trait, as SyncableEmbeddedDataSource is doing for
embedded_data
plugin.Comment #137
benjifisher@charginghawk:
To follow up on what @marvil07 wrote in #136: see #132.7 and the code it references. With the current patches on this issue and #2863426: Implement rollback of items no longer in source data, you have to follow the example of the test in order to get the rollback functionality. Extend whatever source class you would normally use and
use
the new trait so that you implement the new interface. Then refer to your custom source plugin in your migrations.@marvil07:
I owe you a re-review. For now, I just want to say that I am glad I asked, in #132.11,
You implicitly answered that question: I was missing that the variable is already defined in the core class.
Comment #138
heddnPut this retrieving all of the rows into the executable. It can iterate over the entire source, destination,id map, etc. Then only trigger the sync logic when syncing is configured for a specific migration. Yes it will be slow, but then for folks that want to sync less than say 1000 records, it will probably still scale. It is only when we are dealing with many thousands (millions) of records that parsing too much data by iterating over the entire data set would get cumbersome. And it would then work for existing migrations with only adding a "sync" flag on the migration.
Comment #139
benjifisher@marvil07,
I only have time for a quick re-review based on the interdiff you attached.
3. The renamed
setAllRowsFromConfiguration()
is stillpublic
.4. I am not sure that we should include a specific commit hash for core. I might just make it
@see \Drupal\migrate\Plugin\migrate\source\SourcePluginBase::next()
10. You convinced me that it is right to keep
$this->source
as a class variable. Since it already has a type hint when declared as such, I do not think we need to repeat the hint when using it later.Other than that, I like all the changes you made, including the new names.
I do not know enough to contribute to the discussion started in #133.
Comment #140
joelpittetWhat's the quickest way to test this on an d6 to d8 existing migration? I've applied both patches from #2809433-136: Migrate support for deleting items no longer in the incoming data and #2863426-22: Implement rollback of items no longer in source data
And ran into the
So I figure that I need that
SyncableSourceTrait
trait apparently but where do I need to add it or extend core's source migrations to add it?Comment #141
Tim Corkerton CreditAttribution: Tim Corkerton as a volunteer commentedI'm in the same position as Joel, I've got a custom working D6 -> D8 migration running well but can't get this to run correctly.
I made the same assumption that we need to apply both patches outlined in #140
Can someone confirm the steps required to run successfully?
Many thanks
Comment #142
benjifisher@joelpittet, @Tim Corkerton:
Please see my reply to @charginghawk in #137.
Comment #143
joelpittetThanks @benjifisher, I’ll give that a go
Comment #144
toncic CreditAttribution: toncic commented@joelpittet did you make it works, I have the same problem?
I'm going through test example in patch but can't figure out what I'm missing.
Comment #145
toncic CreditAttribution: toncic commentedI was blind and haven't seen that I'm missing interface 'SyncableSourceInterface'.
I got different error now,
Undefined class constant 'MISSING_SOURCE_ITEM' in /app/web/modules/contrib/migrate_tools/src/MigrateExecutable.php on line 384 #0 /app/vendor/drush/drush/includes/drush.inc(232): Drupal\migrate_tools\MigrateExecutable->rollbackMissingItems()
.Any idea what I'm missing now?
Comment #146
benjifisher@toncic:
Try upgrading Migrate Plus to the latest version (8.x-4.2). See #3045346: Add missing source item event.
Comment #147
benjifisherI had a few small suggestions in #139 that should still be addressed if we continue with this approach. But I will leave the issue at NR because it would be nice to get some additional testing and because the discussion started in #133 might take this issue in a different direction.
Comment #148
Iain.Madder CreditAttribution: Iain.Madder commentedWondering if anyone could advise on what i'm doing wrong to get the latest patch to work. It's blocking us upgrading to 8.7 and I'm stumped on the best solution.
Currently using #81 with 8.6, but while testing the upgrade to 8.7 composer fails to apply #81.
Testing #136 fails too due to the relative paths changing in the patch. #81 references a/core/modules/migrate/src/ while #136 references a/src/.
Is this an issue with the patch referencing the wrong location and so just needing recreating with the right relative path? if not, how would I correctly apply the patch?
For reference this is how I currently reference the patch in composer.json:
Thanks in advance for any help
---------------
Post edit:
Solved by naveenvalecha's advice in #149. I was completely blind to the incorrect module reference.
Comment #149
naveenvalecha#148,
Change the project in patches section to drupal/migrate_tools to apply this patch using composer.
Comment #150
malaynayak CreditAttribution: malaynayak as a volunteer and at TA Digital commentedI have applied the patches below, with drupal 8.7.1:
https://www.drupal.org/files/issues/2019-04-22/2809433-136.patch
https://www.drupal.org/files/issues/2019-01-22/migrate-tools-rollback-missing-items-from-source-2863426-20.patch
Migration rollback
drush migrate-rollback id --missing-from-source
giving the below error:[error] Migration '@id' does not support rolling back items missing from the source
.Thanks in advance for the help.
Comment #151
benjifisherI updated the issue summary. In addition to some smaller change, I added testing instructions, basically summarizing what I have said in previous comments.
Comment #152
charginghawk CreditAttribution: charginghawk as a volunteer commentedJust to give an update, today because the old core patch failed to apply against 8.7.1, I switched to using these patches:
https://www.drupal.org/files/issues/2019-04-22/2809433-136.patch
(Changed to apply against migrate_tools, not drupal core like the previous version.)
https://www.drupal.org/files/issues/2019-01-22/migrate-tools-rollback-mi...
After updating migrate_plus and migrate_tools, I had to update my source plugin to implement SyncableSourceInterface (note that you can implement multiple interfaces - my plugin was already implementing ContainerFactoryPluginInterface), and I added "use SyncableSourceTrait;" inside the class (per the example in the issue description).
Working hunky dory so far.
Comment #153
klidifia CreditAttribution: klidifia commentedTests fine for me:
I applied this patch and also https://www.drupal.org/files/issues/2019-01-22/migrate-tools-rollback-mi...
Use the syncable_url data source plugin that you can create based off the syncable_embedded_data one in this patch.
I can then import as normal: drush migrate-import {name} and get rid of things that are no longer in the source with: drush migrate-rollback {name} --missing-from-source flag.
Was wondering if the SyncableUrlSource plugin that the testing instructions refer to could be included instead (or as well as) the SyncableEmbeddedDataSource.
Comment #154
mErilainen CreditAttribution: mErilainen at Wunder commentedHow can I implement a "SyncableUrl" source plugin which would extend the Url source plugin from migrate_plus?
I tried to copy the example SyncableEmbeddedDataSource but I get an error
"PHP Fatal error: Trait 'Drupal\my_module\Plugin\migrate\source\SyncableSourceTrait' not found" because there is just
use SyncableSourceTrait;
and that points to the same directory.
I tried also to do:
use Drupal\migrate_tools\Plugin\migrate\source\SyncableSourceTrait;
but then the error is
"PHP Fatal error: Trait 'Drupal\my_module\Plugin\migrate\source\Drupal\migrate_tools\Plugin\migrate\source\SyncableSourceTrait"
Edit:
Apparently I had to include it twice, once at the top of the file and then inside the class. Here is the source plugin code:
Comment #155
benjifisher@mErilainen:
When you have finished testing, I hope you will share the results (positive or negative) on this issue.
I think there are only minor differences between your code snippet and the one in the issue description.
Perhaps you are not familiar with two different uses of the keyword
use
in PHP:Comment #156
klidifia CreditAttribution: klidifia commentedUpdated code in #154 is how I have implemented my syncable_url and works well.
One other thing I note: in the UI for migrate you can import / rollback etc, but no way of including this flag, would be a nice inclusion! :)
Comment #157
ronchica CreditAttribution: ronchica commentedThere seems to me to be a contradiction in the manual steps to use the --missing flag.
Step #3 says to apply a patch to migrate_run, which implies that migrate_run is installed and is being used to run the drush migrate commands. But in order to use migrate_run, migrate_tools has to be uninstalled. This won't work. Or am I missing something?
Comment #158
benjifisher@ronchica:
Sorry, I referenced the wrong related issue. I meant to link to #2863426: Implement rollback of items no longer in source data. I will update the testing instructions in the issue summary.
Thanks for pointing out the mistake.
Comment #159
benjifisher@klidifia:
I looked at the interdiff you attached to #153, and it looks as though you addressed two of the points I raised in #139. Thanks for that. There is still an unnecessary type hint on
$this->source
.I think that #152 and #153 satisfy the request for manual testing.
Comment #160
heddnThere's a lot of moving piece here. I haven't had the attention to really focus on and create an example patch for what I said in #138. But having to add an interface to make something syncable seems too heavy handed. It would be much nicer if here in tools there were a flag to
drush mim foo_migration
to trigger syncing. When this flag is set, it triggers the executable to loop over each source and destination and when it finds an item missing, it triggers MISSING_SOURCE_ITEM event. Then we have an implementation of the event that removes the item and removes the ID Map entry.Using the event lets folks change the default response of delete by registering their event before this modules and stop propagation. Then they can do whatever they feel like if the default of delete is wrong.
Yes, I know that without an interface we have to do an inefficient loop over all the records. But it doesn't require changes to any existing migration yamls or interfaces. Later, we can tweak things and add efficiencies. For example, if a Sync interface exists on the source, then use its optimized method and bypass the inefficient loop.
Comment #161
agileadamFor what it's worth, the following patches are playing well together after going from Drupal 8.5 to 8.7.3.
It took me some time to find a working combination.
I have updated all of my migration YML files to use the "syncable_url" source plugin, which I implemented as shown in #154 above.
/modules/custom/mymodule_migrate/src/Plugin/migrate/source/SyncableUrl.php
Comment #162
damontgomery CreditAttribution: damontgomery commentedI have tested the patches (using #20 for the drush bit though) as described in the issue description (thanks, Benji!) and they work as described. These many years later, thanks for implementing a more sustainable approach.
We are using
#153 here
#20 for https://www.drupal.org/project/migrate_tools/issues/2863426 since we are using Migrate Tools 8.x-4.1.
We added the
implements SyncableSourceInterface
to our source class (already had one). We also added theuse SyncableSourceTrait;
and in the__construct()
method, we added$this->setAllRowsFromConfiguration();
.Our class has the following use statements as well if your IDE doesn't add them,
Module versions:
Thanks again.
Comment #163
benjifisherWe discussed this issue at the #3067311: [meeting] Migrate Meeting 2019-07-11.
@heddn clarified his suggestion from #138 and #160. The idea is to keep the changes localized in the
MigrateExecutable
class from the Migrate Tools module. In order to loop through all source items for a migration, without affecting the state of the source plugin, we can clone the object representing that plugin. We might also have to clone the destination plugin.A few quotes from @heddn in that discussion:
Neither I nor @marvil07 have time to follow up on these suggestions in the near future.
Comment #164
kevinc_ CreditAttribution: kevinc_ commentedReally appreciate the work and thinking that has gone into this issue development.
We had this implemented and working on a client project but the rollback stopped working after we applied 8.7.3. Previously, we've used the url plugin and the patches in #161.
So, based on the error ('does not support rolling back items missing from the source'), I can see that's because my source plugin is not an instance of SyncableSourceInterface.
I've used the SyncableUrl from #154 for my plugin and updated my migration config to look like this.
I'm still not able to get this working again even with the patches and the move to the new plugin. I'm also confused why my version was working with the url plugin previously.
Any thoughts gratefully received!
Comment #165
mErilainen CreditAttribution: mErilainen at Wunder commentedI have pretty similar setup, and everything works with core 8.7.4 and these patches:
https://www.drupal.org/files/issues/2019-05-31/2809433-153.patch
https://www.drupal.org/files/issues/2019-01-22/migrate-tools-rollback-mi...
Config:
Comment #166
kevinc_ CreditAttribution: kevinc_ commentedThanks @mErilanien - my bad!
I changed the config in my migration module but had forgotten that was the install directory.
Updated the config in my sync directory, imported the config and it works great.
Comment #167
quietone CreditAttribution: quietone at Acro Commerce commentedI too don't have time for some months to work on this again.
Comment #168
ronchica CreditAttribution: ronchica commentedI am confirming that the patches are working for me:
https://www.drupal.org/files/issues/2019-05-31/2809433-153.patch
https://www.drupal.org/files/issues/2019-01-22/migrate-tools-rollback-mi...
I have also created a plugin that works with migrate_source_csv.
Thanks for this effort!
Comment #169
kumkum29 CreditAttribution: kumkum29 commentedHi ronchica, kevinc_, mErilanien,
I need some clarifications to understand how this solution works (#168).
I we use this source plugin (syncable_url), and if we use the Drush command, a RollBack is launched on the items. So, these items are deleted, isn't it ?
How can I proceed If I want unpublish (and not delete) these items?
Thanks.
P.S. After having tested this solution, the items are deleted. During the RollBack, is there a solution to unpublish and not delete items? (alter the delete function) ?
Comment #170
Stockticker CreditAttribution: Stockticker as a volunteer and at Internetdevels, Drupal Ukraine Community commentedHere's the small addition to the #153 where the possibility to "Rollback Missing" items has been added to the Drupal UI.
Comment #171
ccarrascal CreditAttribution: ccarrascal at Johnson & Johnson commentedHi,
I can confirm our setup is also working in 8.7.6 with:
#170 from here (https://www.drupal.org/files/issues/2019-08-09/2809433-170.patch)
#29 from #2863426 (https://www.drupal.org/files/issues/2019-07-18/2863426-29.patch)
Thanks a lot for the effort!
Comment #172
joelpittetI am trying things out but can't seem to get it to not rollback everything still. Looks promising though.
Comment #173
joelpittetThis is a bit of a hack but I'm using SqlBase so I had to overwrite the sourceIds
Comment #174
john.oltman CreditAttribution: john.oltman commentedThis adds the syncable_url plugin as a patch to migrate_tools so you can avoid a custom module if you want.Failed patch, see below for the fix.
Comment #175
john.oltman CreditAttribution: john.oltman commentedComment #177
john.oltman CreditAttribution: john.oltman commentedSorry, fixed the patch, use 176 not 174.
Confirmed working on Drupal 8.7.6 with all 3 of these patches:
#29 from #2863426 (https://www.drupal.org/files/issues/2019-07-18/2863426-29.patch)
#170 from here (https://www.drupal.org/files/issues/2019-08-09/2809433-170.patch)
#176 from here (https://www.drupal.org/files/issues/2019-08-30/2809433-176.patch)
As noted above, patch 176 replaces the need for a custom module and adds the necessary plugin to migrate_tools. If you go the custom module approach for the syncable_url plugin, you do not need patch 176.
Did a rollback missing migration and the items no longer in the source were deleted.
For those adding patches via composer, make sure to use a different issue name in composer.json in the patch definitions for 170 and 176, otherwise only one will be applied.
Running with drush:
Config:
Comment #178
ronchica CreditAttribution: ronchica commented@kumkum29 - sorry, just noticed your question on unpublishing.
To unpublish instead of deleting, in a custom module, I extended MigrateExecutableBase and wrote a custom function 'unpublishMissingItems'. That function looks a lot like Migrate Tools rollbackMissingItems, except it unpublishes the missing items instead of deleting them. There is a custom drush command to run it.
Comment #179
kumkum29 CreditAttribution: kumkum29 commented@ronchica,
I have created a custom module with an extended rollbackCurrentRow() function. In this function, before to delete the row, I call a new function (UnpublishEntities) :
In this new function I get the missing IDs and unpublish node and medias no present in the source (xml)...
Comment #180
jmaties CreditAttribution: jmaties at Geekia commented@ronchica,
How can works with migrate_source_csv?
Comment #181
heddnThis is not completely tested. But it might be what is needed. There's a new '--sync' option added for drush import command. Still needs tests.
No interdiff, as this takes a wildly different approach to the whole thing.
Comment #182
heddnAnd here's some tests, which demonstrate that this PoC actually can work. Can we get some manual testing too?
Comment #183
heddnComment #184
andypostHere's a patch to isolate dispatching which is different in new Symfony (but core's dispatcher still using old arguments)
if there's lots of source records they may not fit into PHP memory limit
idmap will left in the prepare state after clean-up done?
maybe use kind of hash for source ids?
also destination could be null, then probably no reason to dispatch events
Comment #185
edysmpI did a manual test with #183patch applied.
Migrating 4 items from a json file to a taxonomy.
1. `drush mim guitars` - Shows 4 items created. Terms are created.
2. (Delete one item from the json file)
3. `drush mim guitars --sync` - shows 0 created, 0 updated... The term is deleted.
4. Checking the map table:
The `source_row_status` column has been set to 1 for all rows. Is it the expected?
the deleted item is still shown in the map table. Seems that the idmap isn’t getting updated.
5. `drush mim guitars` -- shows 3 updated. Only the deleted item remains to 1 in the map table.
6. `drush mr guitars` - shows Rolled back 4 items - map table is empty.
Comment #186
heddnSounds like a little work is needed. Quite likely we need to move the rollbacks of non-existent source items in a pre import event. And we should add an assert to the test that the id map only has 2 entries in it after the rollback. I have to move on to some other things today, but maybe someone else can pick up from here?
Comment #187
heddnFirst a re-roll.
Comment #188
heddnHere's some fixes and additional test coverage.
Comment #189
heddnYeah! Green again. Now for another round of manual testing from folks.
Comment #190
chriscalip CreditAttribution: chriscalip commentedIs this incoming feature accommodate an option to unpublish instead of delete? I have datasources that would periodically not include deactivated records but will likely add them in the subsequent runs.
Comment #191
edysmpLooking good with the latest patch. Things are as expected now.
I am still curious in #184 comment about possible PHP memory limit issue by creating an array with a bunch of items.
Comment #192
heddnre #190, that sounds like a new issue. This does what is suggested in the title and IS and deletes the content. If you want to keep the items and just unpublish, you could review what is going on in this issue and easily build your own subscribers to do such things.
re #184: memory limits, let's handle such memory limits in follow-ups. PHP can handle very large PHP arrays. But if someone is trying to sync more then some finite number of items, the solution put forth here might work. Or it might not. I just depends on so many variables. If it becomes an issue, it almost seems like a custom solution for that specific site might be needed. This by necessity has to be a very generic solution because we don't know the nature of the sources. If we did, then we could probably do some fancy SQL or CSV filtering, etc.
So I say, wait for bug reports on memory limits, figure out what those generally are... then update docs for
--sync
to mention what those limits are.Comment #194
heddnThank you to everyone who's stuck with this issue over its long and varied life.
Comment #195
TrevorBradley CreditAttribution: TrevorBradley commentedThis isn't set as closed (fixed), so I thought I'd pop here with a question. I'll start a new ticket if this belongs elsewhere. I'm using 4.x-dev#0adfcf3
How exactly is this supposed to work? From the latest patch, it looks like there's a new "sync" option, but drush rejects this from the command line:
It looks like migrate_tools.drush.inc' 's migrate_tools_drush_command is missing a 'sync' option in migrate-import. If I add it there, the error about sync not being an option goes away, but the code about sync doesn't appear to run.
I poked around at the code a bit, but can't seem to work out what's missing. I have no idea how @edysmp's commands in #183 worked with the current code base.
I'm using migrate_source_csv - is it possible there's something else I need to set up in my migrate configuation?
(continuing to investigate...)
Comment #196
heddnThis is only added for drush 9 and drush 10. Support for drush 8 (which uses drush.inc files) is best effort. Drush 8 hasn't been supported for quite some time, and this module only backports things that are super easy. This wouldn't be hard, but it isn't in the super easy camp either.
Bite the bullet and upgrade to Drush 9+.
Comment #197
TrevorBradley CreditAttribution: TrevorBradley commentedAwesome, I knew I was missing something. I'll get working on that drush upgrade immediately.
Thanks @heddn!
Comment #198
TrevorBradley CreditAttribution: TrevorBradley commentedOK, back for one more thing. This is 80% chance of another user config error, but I'm seeing something else that might be a real bug.
I'm now running Drush 10.0.2, and PHP 7.2.14
When I remove lines from my import file, then try to run a sync through drush, it seems to work (removing removed lines), except it throws an error:
As best I can tell, $source->rewind() really is being called twice. The first time from MigrationImportSync::sync() at line 57, and then again in MigrateExecutable::import() at line 188.
Any ideas what might be wrong here?
EDIT: Investigating further - getSourcePlugin returns an instance of Drupal\migrate_source_csv\Plugin\migrate\source\CSV - I'm betting the fault lies there. (It's the only thing I'm really doing differently at this point)
EDIT2: Not sure about this... all the $source->rewind() and $source->next() are entirely within migrate_tools.
Comment #199
heddnInteresting delema. Sounds like there the use of a generator (yield) somewhere in the source. Is there any other way to process the source (maybe via an object clone) 2 times?
Let's open a new issue for handling generators more cleanly.
Comment #200
TrevorBradley CreditAttribution: TrevorBradley commentedOK, created #3096252: Cannot use Source Plugin that uses Generators with import sync.
Comment #201
charginghawk CreditAttribution: charginghawk as a volunteer commentedIf anybody switches from the old method (using SyncableSourceTrait) to the new method, please add a comment describing the upgrade path. Thanks!
Comment #202
Tim Corkerton CreditAttribution: Tim Corkerton as a volunteer commentedIs there any documentation on how to make this work? I just read through all the comments and am unsure how to implement this feature. I have just upgraded to Drupal 8.8. I am using Drush 9.7.1 and have the following contrib versions:
Migrate Plus 8.x-4.2
Migrate Tools 8.x-4.5
Migrate Upgrade 8.x-3.1
Is #201 suggesting that it simply a case of running
drush mim my_migration --sync
when I do this I get
Rolled back 296 items - done with 'my_migration'
[notice] Processed 0 items (0 created, 0 updated, 0 failed, 0 ignored) - done with 'my_migration'
which results in all my content being rollback despite there being no changes in the source
Comment #204
sannminwin CreditAttribution: sannminwin commentedI wasn't able to apply patch #188 on migrate_tools 4.4.0
it failed with error.
Could not apply patch! Skipping. The error was: Cannot apply patch https://www.drupal.org/files/issues/2019-11-07/2809433-188_2.patch
Drupal version : 8.6.16
PHP binary : /usr/bin/php7.1
Drush version : 9.5.2
Comment #205
xurizaemon@sannminwin seems this was fixed and is available in 8.x-4.5? If you're able to update to that release it, that'd be my recommendation.
Comment #206
nortmas CreditAttribution: nortmas commentedUnfortunately, in some cases, the only Drush command is not enough. This addition to patch #170 fixes the issue when the source is not available the status is getting stuck at "migration rollback" and all further migration are not possible. So this patch adds the fix to just reset the status to "Idle" in case the source is not available.
Comment #207
nadavoid CreditAttribution: nadavoid commented@nortmas Could you create a new issue to address the scenario you're describing?
Comment #208
harsh.behl CreditAttribution: harsh.behl commentedI can confirm, after applying https://www.drupal.org/project/migrate_tools/issues/2809433#comment-1322... patches it worked for 8.9.13 but 2nd patch not applied gracefully. So, did some modifications and manual work. Also Do not forget to create source plugin.
Comment #209
NicholasS@tim-corkerton So from what I can tell adding the `--sync` flag. Should delete items that were previously migrated but no longer in the source feed and that works. But you are correct this seems to be a "simple" delete everything, then insert everything found in the feed. Resulting in new Node ID's every time you run a migration. Nothing is actually updated.
I had to apply this patch https://www.drupal.org/project/migrate_tools/issues/3104268 and now my migrations with a --sync properly update existing content.
I even tried change the ID from integer to string but without the patch from 3104268 #3104268 --sync just deletes everything then inserts new records from the source.
[UPDATE: SOLVED]
You need to use the patch from #3104268 and be on `"drupal/migrate_tools": "^5.0"` see https://www.drupal.org/project/migrate_tools/issues/3110335#comment-1410...
Comment #210
ddavisboxleitner CreditAttribution: ddavisboxleitner at Palantir.net commentedIn 9.1 the order of the arguments to dispatch() changed, and so the previous patches stop working in those higher version of Drupal. In testing, just re-ordering the arguments in all calls to dispatch() works to restore this feature to functioning.
https://www.drupal.org/node/3154407