Suggested commit message:
Issue #2213451 by andypost, bdone, benjy, penyaskito, chx, claudiu.cristea, damiankloip, gregboggs, InternetDevels, jessehs, jhedstrom, marvil07, mikeryan, pcambra, Xano, YesCT: Update Migrate API in core
Problem/Motivation
This patch brings everything in the migrate module up-to-date and incudes any changes and dependent files from migrate_drupal. Our plan is to have this API committed and then create separate issues for Drupal 6/7 to Drupal 8 migrations which are in the migrate_drupal module.
The patch was generated from the IMP sandbox with the following commands:
git diff upstream/8.x core/modules/migrate > 2208061_16.patch
git diff --diff-filter=M upstream/8.x core/modules/migrate_drupal >> filename.patch
git diff upstream/8.x -- `cat d6_files` >> filename.patch
d6_files.txt is attached, these are new/deleted files and they are necessary to get the changed files in migrate_drupal to pass the tests.
Remaining tasks
- Get this in.
- Submit a patch including all the D6 to D8 migrations. We are finished with this one (sneak peek) and that gives us confidence in the changes submitted here.
- Submit a patch including all the D7 to D8 migrations, just begun work. The data storage changed little compared Drupal 6+CCK and the completed Drupal 6 work includes CCK (and Profile) which gives us confidence that we will be able to tackle Field API easily.
- In parallel, create an UI in migrate_drupal. It is ~80% complete. Expected to be completed by the time this gets in. Another UI is expected in contrib.
User interface changes
This patch does not provide any UI changes. The main UI will be provided by migrate_drupal. It is being worked on.
API changes
The following API additions/changes are included:
- A dependency system supporting both soft and hard dependencies.
- A mechanism for terminating a process pipeline early
- A mechanism for skipping a row during process
- New entity + entity revision destination plugins and derivatives
- Every other destination plugin necessary (I can think of UrlAlias off head)
- A proxying password service for migrating MD5'd user passwords. This is borderline migrate_drupal material but we thought it was generic enough to be mainlined.
- The source and destination IDs are now in the plugins and not the migrations.
- New annotations.
- Countless cleanups, minor fixes, doxygen fixes etc
Comment | File | Size | Author |
---|---|---|---|
#27 | 2213451_27.patch | 395.25 KB | benjy |
#9 | 2213451_9.patch | 395.28 KB | benjy |
Comments
Comment #1
dawehnerFirst bit: Note: Creating such big patches is a pain for everyone using dreditor. It is nearly impossible to use, sorry.
Is there a special reason to not put that into the Plugin base annotation? Each plugin type potentially has derivatives
I wonder whether we can improve the variable name to figure out this difference.
Let's use at least \Drupal::entityManager now. There is simply no reason to not do that now but require another followup.
Do we really want to catch all exceptions? This could be problematic
If we update the docs we should do it properly. No idea what $highwater should be, and what the returned int could be.
I do like that this is moved away from the migration class but we maybe should remove the documentation that you can replace it now.
Wait: Time exists in multiple dimensions? ;)
Interesting usage!!
This sould really break in case someone decided to write their own totally different password service. Additional I don't know why this is part of migrate not migrate_drupal.
We could switch those two lines to have the docs in a bit better context place.
Can't really imagine that there are usecases for themes, but yeah you probably never know.
Nihilistic documentation
What is the reason we do that here? This could/should be explained.
Given that this is the storm default text i wonder whether this was intended.
These kind of documentation is really helpful. Thank you!
Comment #2
chx CreditAttribution: chx commented> Is there a special reason to not put that into the Plugin base annotation? Each plugin type potentially has derivatives
we wanted to establish the requirements check in the annotation and the class
Drupal\Component\Annotation\Plugin
does not have $derivative on it.> What is the reason we do that here? This could/should be explained.
The original intent was... to avoid re-instantiating the same class over and over again, plugins are usually stored in migrate entity properties. However, because of the iterator plugin and stubbing the getProcessPlugin takes a $process array argument. Either we could hash that array and store the resulting process plugins in the migrate entity or we can just do that we did here and have the process plugin manager keep them. Probably storing on the migrate entity by hash is better because then we can avoid re-normalizing as well.
Comment #3
chx CreditAttribution: chx commented> This sould really break in case someone decided to write their own totally different password service. Additional I don't know why this is part of migrate not migrate_drupal.
Because it's not uncommon for legacy systems to have MD5 passwords (for eg WordPress uses phpass since March 2008 (version 2.5) but it still supports plain MD5 passwords). The service is not doing anything just proxies through unless
enableMd5Prefixing
is called beforeuser_save
so there's no harm in just sitting there. Finally, breaking in case of a different password service: if it was defined before us, we save the service definition and call it so that case is covered. If someone overrides the service after migrate then we can't migrate passwords so the user migration giving up is also correct.Comment #4
dawehnerAny reason why many methods in this class aren't implemented?
Can't we just use $this->getDerivativeId() instead? This should result in the same bits.
Please ensure to have absolute namespace path.
Some one-liners in there would be cool to explain what all this code is doing.
Lovely exception!
As written above this is kinda problematic. Isn't there another way to do it, like some temporary override during the migration process.
I really wonder whether we maybe should do something better than just truncating, the problem is that this might lead to non-unique table names. Do you have an opinion about that? How did you dealt with that
in Drupal6/7?
Nitpick alarm: two spaces
WOW!
We usually don't use camelCase for local variables.
Should we maybe use strict mode here?
There are so many TODOs in there, we should at least document in the issue summary why we do that.
It would be great to @endcode as well. It is great to document this here!
There are a couple of places where we use serialize and a couple of places where we don't. Is there a reason behind that?
This seems to be a copy and paste error from some earlier patches. This just appeared in the diff due to the introduction of the setup method.
Comment #5
dawehnerI prefer the measurement-based performance approach but sure, you can just do whatever you like here.
Comment #6
benjy CreditAttribution: benjy commentedEdit: Thanks for the review dawehner.
#1
#4
Comment #7
benjy CreditAttribution: benjy commentedThis patch removes
MigrateProcessPluginManager
as mentioned in #6 to address the point raised in #1.13Comment #8
benjy CreditAttribution: benjy commentedFixed some comments and formatting errors picked up by PHPStorms "Inspect Code" feature.
Comment #9
benjy CreditAttribution: benjy commentedA few changes here, mainly:
system.site
CMI file and check the value ofslogan
but testing whether the slogan ends up in the theme where it should be would require essentially doubling the entire test suite which I believe noone wants. This is mitigated for entities by using the Entity API but raw CMI objects are not covered by any API and they are problematic. b) more, whether the source database structure and even more importantly our little dump contents match a "typical" site in the wild is anyone's guess. In this particular case, we ran afoul of b).Config
destination when we want to store NULL instead of the default. Yes, it's in the class doxygen.custom_block: false
describing a soft dependency) but thed6_block_plugin_id
process plugin still unconditionally requiredcustom_block
. Opsie.Comment #10
chx CreditAttribution: chx commentedComment #11
chx CreditAttribution: chx commentedComment #12
andypostSuppose this needs change notice because
1) API changed: plugin-types, new interfaces, format of yml-files
2) handbook pages are outdated
Comment #13
dawehnerThe changes in the interdiffs seems pretty fine. Yeah we could iterate until the end here but this looks like a good step forward now.
I do agree with andypost but last time I was proven to be wrong on the change record side of things.
Comment #14
webchickEscalating to major, since it's a Migrate unblocker, but due to how many changes are crammed into this patch, it would take me at least 4 hours to do final review, which means it won't happen for at least a week. :\ Hoping one of the other maintainers can get to it before then, or else that the patch is split up functional lines to make each chunk more like a 15-30 min review process, which I could pick away at over the next few days when I'm not traveling.
Comment #15
chx CreditAttribution: chx commentedI need policy clarified. Are we writing 8-8 change notices or not? berdir told me we don't now andypost + dawehner says we do.
Comment #16
dawehnerMy main goal is to be pragmatic. Noone will write migrations for contrib modules already, so why should we. On the other hand you could always argue that it is wrong to special case everything.
Comment #17
xjmIn general, as dawehner says, we're pragmatic.
Comment #18
chx CreditAttribution: chx commentedI have updated https://drupal.org/node/2141805 . So we need another...??
Comment #19
xjmNo, IMO that's plenty, so long as the handbook docs get updated. Thanks chx.
It's really unfortunate that this is such a huge patch. I can't imagine it was reviewed completely in 13 comments given its size and impact. @dawehner's review effort is awesome, but shouldn't something of this scale be reviewed by more than one developer?
Comment #20
chx CreditAttribution: chx commentedThe patch is definitely teamwork and is already reviewed by either benjy or me (and each other). I seriously doubt there's a single line of code that was seen by only one developer. I definitely did re-review, for example, all the tests very recently when I reworked them to become MigrateDrupal6Test, penyaskito yesterday was working a lot with the entity destination -- you get the picture.
Comment #21
dawehnerOne central problems of such mega-patches is that you never know whether changes are in scope or not, which makes it hard to figure out
the important parts of the patch. It would be just great if you could avoid such patches in the future ( I know you have something even bigger planned)
Comment #22
chx CreditAttribution: chx commentedI dearly badly hope we never need to do an API patch of this size again -- the API really should be done-ish as it migrates D6 just fine.
As for the next patch, we can split out the plugins first and then it's going to be YAML-Test-dump times 82 or so in one 600kb patch without any concepts or out-of-scope, just the boring parts. Plus MigrateDrupal6Test which is a real fun metatest. I am so proud of that one.
Comment #23
dawehnerThis is good to hear, thank you!
Comment #24
chx CreditAttribution: chx commentedNote that the commit message only includes that worked on these parts. The Drupal 6 migration was also worked on by axoplasm, kgoel, dclavain, jeckman, alvar0jurtad0, Drupali, fastangel gloob, jhedstrom, joshtaylor and mpgeek. People were already noticing the lack of attribution -- I would never ever do that.
Comment #25
xjmAlright, thanks! Assigning to Dries for review.
Comment #27
benjy CreditAttribution: benjy commentedCore removed the need for empty .module files and migrate_drupal.module used to be empty before this patch so the patch conflicted.
Comment #28
alexpottOne of the aims of beta is to have the API as complete as possible - therefore it makes sense for this to be a beta target.
Comment #29
benjy CreditAttribution: benjy commentedTo add to @alexpott's comments in #28, I feel that getting this in and then the D6 to D8 migrations before beta will give us invaluable feedback from developers who will be much more likely to start testing the migrate API.
Also, back to RTBC.
Comment #30
int CreditAttribution: int commentedWhy assigning to Dries for review? He is review at this moment? He asked to? Only he can do it?
This issue are RTBC for 6 days...
I do not want to be misunderstood, but I think assigning to dries, made everyone stop working/reviewing on this.
Comment #31
benjy CreditAttribution: benjy commented@int if you have a review or comments to add please put them in this issue.
The issue was set to RTBC and has been assigned to Dries for final review/committing but that doesn't mean you can't still chip in if you have something to add.
Comment #32
Dries CreditAttribution: Dries commentedGreat work! Glad to see progress on the migrate path. I just committed it to 8.x.
Comment #33
xjm@int, yes, the issue was assigned to Dries for review following discussion with him about the issue in a meeting. It was already RTBC, so there was no work to interrupt. :)