Hello all, it’s time for the weekly migration initiative meeting. The meeting will take place in Slack, in various threads.

This meeting:
➤ Is for core migrate maintainers and developers and anybody else in the community with an interest in migrations
➤ Usually happens every Thursday and alternates between 1400 and 2100 UTC.
➤ Is done over chat.
➤ Happens in threads, which you can follow to be notified of new replies even if you don’t comment in the thread. You may also join the meeting later and participate asynchronously!
➤ Has a public agenda anyone can add to here.
➤ *Transcript will be exported and posted* to the agenda issue. For anonymous comments, start with a :bust_in_silhouette: emoji. To take a comment or thread off the record, start with a :no_entry_sign: emoji.

0️⃣ Who is here today? Anything interesting to share over the new year?

heddn heddn here!
Nick Wilde (he/him) - DC 2020 DevOps Session Team Nick, he/him, contrib, frequent migrate user.
webchick Here, though distracted. I went to the gym once in 2020 so feel like my fitness goals are now met. :wink:
wimleers (he/him) :wave:, from :european_castle: :flag-be:. I ran my fastest 5K ever on the last day of 2019, looking to improve it in 2020 :smile: (edited)
gabesullice :wave: from Colorado :flag-us:

1️⃣ What do we need to talk about, other than the NR issues?

gabesullice 1. I'm curious about the purpose of the migrate_upgrade module. @heddn said: "All things related to upgrading from a previous version of Drupal, without regard to if that means drush or web UI or whatever, should really land in migrate_upgrade. So while drush is the only mechanism in migrate_upgrade, that doesn't mean someone couldn't add a web UI." But the project page specifically says: "This module provides drush support for upgrading from Drupal 6 or 7 to Drupal 8. The plan is for this drush support to be added to drush itself at some point, at which point this module will become obsolete." Source of heddn's quote: #3097336: Improve the DX for migrating content entities: view modes, fields, formatters, widgets etc should be migrated per entity type + bundle#comment-13390769 (edited)
gabesullice I suppose that the project page is simply outdated
gabesullice 2. I've heard a few anecdotal comments that out-of-memory issues are a frequent problem with migrations and this is confirmed by some foggy memories of past migrations I've written. I'd like to dig into that. I'm wondering if y'all know of any important issues related to memory problems that I could put on my radar. (edited)
mikelutz migrate_upgrade serves to convert core migration plugins into migrate_plus migration config entities.
mikelutz Throwing all the outdated documentation away, that’s really what it’s there for.

2️⃣ If no Drupal 8 equivalent filter is found during migration, drop transformation-only filters, so that body fields show up

mikelutz #3061571: If no Drupal 8 equivalent filter is found during migration, drop transformation-only filters, so that body fields show up
wimleers (he/him) WDYT about my proposal? :slightly_smiling_face:
mikelutz I think the concept is a fine way to go, at least as an interim step. Still wish we could do something with php_code.
mikelutz But the fact that we can’t shouldn’t stop this from going forward.
wimleers (he/him) The best thing I can think of with php_code is to replace the `<?php` and `?>` (and everything between them) with PHP-CODE-REMOVED-STORED-IN-KEY-VALUE-AT-%KEY%. With the value for %KEY% of course being everything in those PHP tags. That’d allow us to safely remove that filter, while not losing the data. (edited)
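wimleers' placeholder idea could look roughly like the sketch below. This is a hedged illustration, not core code: the key scheme, marker text, and function name are all invented here, and core would presumably write into the real key/value store rather than an array.

```php
<?php
// Strip each <?php ... ?> block out of a body field, stash the original
// snippet under a generated key, and leave a recoverable marker behind.
function strip_php_to_store(string $body, array &$store): string {
  return preg_replace_callback('/<\?php.*?\?>/s', function (array $m) use (&$store) {
    $key = 'php_code:' . count($store);
    $store[$key] = $m[0];
    return "PHP-CODE-REMOVED-STORED-IN-KEY-VALUE-AT-$key";
  }, $body);
}

$store = [];
$body = 'Hello <?php print time(); ?> world';
$clean = strip_php_to_store($body, $store);
// $clean: 'Hello PHP-CODE-REMOVED-STORED-IN-KEY-VALUE-AT-php_code:0 world'
// $store still holds the original snippet, so nothing is lost.
```

A developer could later reverse the substitution by looking each key back up, which is the "enables a developer in the future" half of the proposal.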
mikelutz I wouldn’t mind stripping it out by default if the contrib php_code module isn’t installed. I see no real reason to try to save it in kv store though. At that point you need a developer to make it useful anyway, so you might as well have the developer do a custom migration if you want the data for something.
mikelutz Remember, not migrating something isn’t data loss. You can’t lose the data if it’s still in the legacy database.
mikelutz Could also do a core filter that strips the php code out without executing it. We would have to deal with the contrib php_code filter with the same name though. Maybe have a core remove_php filter that does that. Then, we map php_code to that IF the php_code module isn’t installed. Then the data is there, but we don’t execute the php. Not sure how the sec team would feel about that though.
wimleers (he/him) "At that point you need a developer to make it useful anyway, so you might as well have the developer do a custom migration if you want the data for something." My proposal allows a non-developer to proceed now and enables a developer in the future to bring back this functionality.
wimleers (he/him) "Could also do a core filter that strips the php code out without executing it. We would have to deal with the contrib php_code filter with the same name though." No, we could make core do a hook_migration_plugins_alter() that checks if the contrib module is installed. If it’s installed, it returns early. If it’s not installed, it would map D7's `php_code` filter to a D8 php_code_no_op filter.
wimleers (he/him) That would enable sites that want to keep using PHP filter to keep using it, and gets everyone else on Sanity Island.
mikelutz Right, basically what I meant.
wimleers (he/him) Oh ok :slightly_smiling_face:
wimleers (he/him) Great!
mikelutz I just meant we couldn’t call it php_code, we would have to call it something else and manage the use case that the contrib module was installed.
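A minimal sketch of the hook_migration_plugins_alter() approach being discussed. The array shape, the `filter_map` key, and the hard-coded module check are illustrative assumptions (core's real plugin definitions route filter IDs through the FilterID process plugin); `filter_null` stands in for the no-op target filter:

```php
<?php
// Hedged sketch, not core code: if the contrib php module is absent, remap
// D7's php_code filter to a no-op filter so body text still renders.
function example_migration_plugins_alter(array &$migrations): void {
  // In Drupal this check would be \Drupal::moduleHandler()->moduleExists('php');
  // hard-coded here so the sketch runs standalone.
  $php_contrib_installed = false;
  if ($php_contrib_installed) {
    // Contrib provides php_code itself; leave the mapping alone.
    return;
  }
  $migrations['d7_filter_format']['filter_map']['php_code'] = 'filter_null';
}

// Simulated, simplified plugin definitions.
$migrations = [
  'd7_filter_format' => ['filter_map' => ['php_code' => 'php_code']],
];
example_migration_plugins_alter($migrations);
// php_code now maps to the no-op filter instead of executing (or vanishing).
```

The early return is what lets sites that want to keep PHP filter keep it, while everyone else gets the safe mapping.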
wimleers (he/him) But let’s leave php_code out of the picture for a moment
wimleers (he/him) That’s one very clear scenario.
mikelutz Yeah, that’s totally a separate issue from this one.
wimleers (he/him) The “I used a D7 contrib filter” scenario is much more widespread.
mikelutz The only question here is really what other filters need to be on the list.
wimleers (he/him) I will volunteer to go over ALL D7 modules providing filters.
mikelutz @heddn any objection to the approach here?
wimleers (he/him) I have two main questions: (1) would you RTBC this approach? (2) would you accept having a list of contrib module things in core? Because that’s what this needs.
mikelutz I think I would prefer if the list could live somewhere in migrate_drupal as opposed to filter.
mikelutz That seems like a more appropriate place to keep contrib related data lists in general.
wimleers (he/him) WFM. So you’re saying that migrate_drupal should override the FilterID plugin implementation, to add contrib filter types to D7 filters. Right?
heddn @mikelutz @wimleers (he/him) I'm down with the approach :slightly_smiling_face:
wimleers (he/him) :partyfrog:
heddn I really don't like alter hooks though with migrations.
heddn Any way we can do this more natively would be nice.
wimleers (he/him) Which alter hooks? There’s no alter hooks in my proposal?
mikelutz I don’t know exactly. I see why it makes sense in filter, and why putting it in migrate_drupal is tricky. Maybe it’s not that important to move it out of filter; it just feels better to me to have static data relating to D7 contrib modules live in migrate_drupal.
heddn @wimleers (he/him) hook_migration_plugins_alter is what I'm referring to. If it isn't currently there, fine. But in the scrollback, it was mentioned :slightly_smiling_face:
wimleers (he/him) @heddn Oh, but that was only when talking about php_code, which is not what #3061571: If no Drupal 8 equivalent filter is found during migration, drop transformation-only filters, so that body fields show up is trying to solve :slightly_smiling_face:
wimleers (he/him) @mikelutz Aight, I’m looking forward to pushing this further then :smile:

3️⃣ Cannot update some file properties during migrate update

mikelutz #2822545: Cannot update some file properties during migrate update

5️⃣ MigrateExecutableMemoryExceededTest has mismatched argument type mock expectations (and fails in PHPUnit8)

mikelutz #3102903: MigrateExecutableMemoryExceededTest has mismatched argument type mock expectations (and fails in PHPUnit8)

6️⃣ Create a way to declare a core migration yml file as deprecated

mikelutz #3039240: Create a way to declare a plugin as deprecated
wimleers (he/him) I pinged Tim Plunkett a few times. Don’t want to bother him again :stuck_out_tongue: Other strategies to move this forward?
heddn @wimleers (he/him) did you see #23 and #25?
wimleers (he/him) I did not!
wimleers (he/him) Sorry!
wimleers (he/him) Commented :slightly_smiling_face:
webchick Tim is also going to be specifically focused on D9 stuff work-wise starting Monday, so that should hopefully help.
wimleers (he/him) Right!

7️⃣ system_maintenance migrations uses incorrect maintenance mode variable in Drupal 7 migrations

mikelutz #3096676: system_maintenance migrations uses incorrect maintenance mode variable in Drupal 7 migrations

8️⃣ Migrate D6 and D7 node revision translations to D8

mikelutz #2746541: Migrate D6 and D7 node revision translations to D8
mikelutz #3076447: Migrate D7 entity translation revision translations

9️⃣ getHighestId() should be able to use any integer destination id

mikelutz #3091004: getHighestId() should be able to use any integer destination id

🔟 Migration memory management thoughts and issues. (edited) 

gabesullice "thoughts and issues" is a perfect heading :slightly_smiling_face: I'm sure that y'all have lots of thoughts about them and could give some really valuable high-level pointers
mikelutz #3006750: Move memory management from MigrateExecutable to an event subscriber
mikelutz #2688297: File migration slows down and eats more and more memory, eventually stops
mikelutz Generally, given the amount of data it can process, the migration system does a really good job managing memory. The file issue is one anecdotal issue I’ve seen pop up that we haven’t been able to reliably reproduce.
gabesullice yeah, reproducibility has got to be the biggest issue whenever memory issues arise
mikelutz There are stories of people that hit a wall with 6-7 figure numbers of rows, but in my experience the memory usage stays constant no matter how many rows are being processed. I’ve always figured the culprit was likely poorly written custom plugins and hooks.
gabesullice I bet you're right about the source of the issues. Do you have a concrete example of a common mistake made in that custom code? I'm thinking that maybe an API change, documentation, or some magic code (or a combination of all 3) might make it easier to detect and fix those problems. (edited)
gabesullice For example, maybe we get total memory usage before a plugin is invoked, then get it after, and do some kind of analysis to identify which plugin is most responsible for memory usage increases
heddn @gabesullice it has never been reproducible enough to even start reacting. And no one with enough know-how has seen this problem (probably those in the know aren't using the web UI and are using drush instead)
mikelutz Yeah, I don’t have a concrete example, but something like a runaway static cache for some data set could potentially do it.
mikelutz But I can’t say I’ve ever seen someone do that specifically.
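The two ideas above can be combined in one hedged sketch: wrap each process step as gabesullice suggests and record how much memory it retains, using a deliberately bad "runaway static cache" of the kind mikelutz hypothesizes as the leaky step. None of this is real Migrate API; every name here is invented for illustration.

```php
<?php
// Measure how much memory a callable retains after it returns.
function measure_retained(callable $step, $value): array {
  $before = memory_get_usage();
  $result = $step($value);
  return [$result, memory_get_usage() - $before];
}

// A process callback with an unbounded static cache: every row adds an entry
// that is never evicted, so retained memory grows with the row count.
function leaky_lookup(int $id): string {
  static $cache = [];
  $cache[$id] = str_repeat('x', 1024); // pretend this is a loaded entity
  return $cache[$id];
}

$total = 0;
foreach (range(1, 1000) as $id) {
  [$result, $delta] = measure_retained('leaky_lookup', $id);
  $total += max(0, $delta);
}
// $total keeps climbing with the row count; a bounded cache (e.g. evicting
// entries once count($cache) passes a limit) would keep it roughly flat.
```

Instrumentation like this, run per plugin, is one way "some magic code" could point a zero-experience developer at the plugin most responsible for memory growth.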
chriscalip Also part of the mix is whether the target bundle has many attributes, and the complexity of the field types of those attributes. E.g. my node bundle, xxx, has many fields: not just text fields but address, geofields, etc.
heddn we did a two d6 and one d7 => one consolidated d8 site migration with 800+ fields and loads of data. all via drush and there were no memory issues. :shrug: (edited)
chriscalip how many items in play? less than 10k? what types of fields? text fields?
gabesullice "no one with enough know-how has seen this problem (probably those in the know aren't using the web UI and are using drush instead)" I'm trying to find ways to make it easier for developers with zero prior migrate experience to write their migrations.
heddn exactly, really hard to pin down isn't it :slightly_smiling_face:
gabesullice I feel like this is probably a sort of chicken and egg thing. If you're a dev who wrote code that leaks memory, you might not be a dev that's able to isolate the problem and create a reproducible scenario (edited)
heddn how often does memory management become an issue? It seems rare.
chriscalip above 10k of items
gabesullice I've heard a few anecdotal comments that out-of-memory issues are a frequent problem with migrations and this is confirmed by some foggy memories of past migrations I've written.
heddn @chriscalip if you have a db dump that we could play with that seems to systematically reproduce the issue, let's see if we can get an NDA in place and share that thing :slightly_smiling_face:
gabesullice Maybe a Twitter poll could help us take a pulse on the problem (if one exists at all)? (edited)
chriscalip @heddn i'll follow-up with you on that; my bet is given a big enough csv source with a diverse mix of fields on the target bundle I can reproduce troubles without actual db. just generated csv source and a target bundle config.
heddn @chriscalip but that isn't the same thing. CSVs are notoriously bad at memory management as a source. Drupal-to-Drupal is SQL and much better with memory
chriscalip @heddn I see, for the sake of limiting scope. I bet a Drupal source would also show a similar experience.
gabesullice Proposed tweet: "If you've run any migrations into Drupal 8 with the Migrate module, have you encountered an error like 'Fatal error: Allowed memory size of NNNNN bytes exhausted'?" Poll options: Yes / No / N/A; show the numbers (edited)
heddn @gabesullice can we word it to capture whether it was via the web UI?
heddn @gabesullice and please paste the link in here and I'll retweet for us
chriscalip haha exhaustion even happens on cli
chriscalip well remote cli.. i believe pantheon has 250mb limit on most of their plans. (edited)
gabesullice Yeah, I'm not sure that the web vs drush distinction is important.
gabesullice Maybe I'm missing something.
heddn @chriscalip but CLI uses a different executable. there's enough moving parts I want to be sure we aren't finding a bug in contrib here
chriscalip ahh limiting scope, thats fair.
gabesullice hmm
heddn @gabesullice it is important. there's core's executable, then migrate_tools has a batch and a non-batch version.
heddn And the wording should ask for folks to reach out with anecdotal stories so we can see if there is a trend
gabesullice Good point about the stories.
gabesullice the reason I'm hesitant to limit scope and you're not is because I think we're each trying to answer a different question.
gabesullice I'm trying to test whether "memory issues are a hard problem in the migrate ecosystem"
gabesullice And I think you're approaching it as "does the migrate module have a memory problem"
heddn k
gabesullice If the answer to my question is "yes" then my feeling is that we could do things to help people in the ecosystem find their mistakes more easily (edited)
gabesullice Updated tweet: "If you've run migrations into Drupal 8 with the migrate module, have you encountered an error like 'Fatal error: Allowed memory size of NNNNN bytes exhausted'? If yes, please reply with anecdotes :slightly_smiling_face: i.e. what happened? what did you change? were you using drush or the UI? etc." Poll options: Yes / No / N/A; show the numbers
gabesullice @heddn how's that?
heddn @gabesullice nice
gabesullice :thumbsup:
gabesullice https://twitter.com/GabeSullice/status/1212860036428357632?s=20
kevinquillen Using drush only, I would get it on Pantheon a lot with 30,000 records, especially if they dealt with files and paragraphs. I had to use jq to break the source (a JSON file) into a dozen files to process them within Pantheon's memory limits.
kevinquillen Ones that dealt with fields that are neither file nor paragraphs? Totally fine
heddn @kevinquillen I'd expect that w/ a large JSON source. I doubt there is much that can automagically be done to shrink that source down is there?
gabesullice https://github.com/salsify/jsonstreamingparser
gabesullice quick google, just out of curiosity
gabesullice https://github.com/salsify/jsonstreamingparser/blob/master/src/Listener/...
kevinquillen Hm would have to give that a go, came up with the file chunking in a time crunch
gabesullice yeah, if jq worked, by all means go for it! I think I saw a couple of JSON-related OOM issues mentioned on the Twitter poll above. If that's a frequent pain point, maybe migrate_plus could have a source that handles large JSON files by streaming them.
gabesullice just thinking out loud
gabesullice @kevinquillen, was your JSON basically a huge array of objects?
gabesullice I imagine that's usually the case
kevinquillen Yeah it was about 30,000 items in two files for two content types
gabesullice so like:[ {itemId: 1}, {itemId: 2}, ... {itemId: 29123}, ...]
gabesullice ?
kevinquillen Yep, maybe 12-15 fields each
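The chunking workaround kevinquillen describes (he used jq) can be sketched in plain PHP: split one huge JSON array of items into smaller part files so each migration run stays under the memory limit. The function and file-naming scheme here are illustrative, not from any module.

```php
<?php
// Split a large JSON array file into part files of at most $size items each.
function chunk_json_source(string $path, int $size): array {
  $items = json_decode(file_get_contents($path), true);
  $written = [];
  foreach (array_chunk($items, $size) as $i => $chunk) {
    $out = sprintf('%s.part-%03d.json', $path, $i);
    file_put_contents($out, json_encode($chunk));
    $written[] = $out;
  }
  return $written;
}

// Example: 12 items like [{"itemId": 1}, ...] in chunks of 5 -> 3 part files.
$src = tempnam(sys_get_temp_dir(), 'mig') . '.json';
file_put_contents($src, json_encode(array_map(
  fn ($n) => ['itemId' => $n], range(1, 12)
)));
$parts = chunk_json_source($src, 5);
```

Note the limitation: this still decodes the whole file once up front, so it only moves the memory spike to a one-off preprocessing step; a streaming parser like the one linked above avoids even that.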

Participants:

heddn, Nick Wilde (he/him) - DC 2020 DevOps Session Team, webchick, wimleers (he/him), gabesullice, mikelutz, chriscalip, kevinquillen

Comments

mikelutz created an issue. See original summary.

mikelutz credited heddn.

mikelutz credited webchick.

Status: Active » Fixed

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.