Problem/Motivation

The content element describes how and the order in which fields are displayed. This array is sorted by key and a weight is stored. This is a problem because moving fields in field UI will change weights and if the export of two such changes are merged then fields otherwise unchanged will conflict.

Proposed resolution

Do not store the weights. Rather, sort before saving, remove them and add them back on runtime. This makes for a very simple update, too.

This reverts some of #2388867: Notifying user of config changes when config has never been synched makes no sense / e6cd1fa5b7 however this approach also results in a stable config file so the problems fixed there won't reoccur.

Remaining tasks

Write tests, review, commit, party :)

User interface changes

none

API changes

none

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

webflo’s picture

Status: Active » Needs review
FileSize
643 bytes
dawehner’s picture

I wonder whether we can have some test coverage for that.

In general I totally agree with the proposed solution. The common change is to change the priority not the machine name of a thing.

webflo’s picture

Issue tags: +Needs tests

Sure thats not a hard thing to do.

swentel’s picture

Yeah makes sense - I'm not sure if this has implications on Field UI - make sure we check this.

swentel’s picture

Also, I'm not sure it's useful to sort the hidden keys

dawehner’s picture

Also, I'm not sure it's useful to sort the hidden keys

Well they could also suffer from the reordering problem as well.

yched’s picture

Looking at the actual content of the yaml file, seeing components ordered by actual display order is much more readable.

Even when it comes to diffs, I'd say - if you move a component, then you see it removed somewhere (-) and added somewhere else (+), looks clear and reasonable ?

In fact, since Field UI's drag-n-drop renumbers weights anyway, the actual weights themselves are not relevant, and *that* leads to irrelevant diff hunks (moving one element around produces diff hunks for 'weight' changing in all the components that are after either its initial or final location).

If we stored the components in the right order but without a numeric weight, then the diffs would make more sense IMO.

Gábor Hojtsy’s picture

@yched: but what happens when you reorder AND change things? How can you tell that from the diff without looking very closely?

Gábor Hojtsy’s picture

yched’s picture

@Gabor:

what happens when you reorder AND change things? How can you tell that from the diff without looking very closely?

Like for code, yes.
Also, it's mentally closer to what you would see if you had a visual diff of a rendered node before/after the EntityDisplay change. Some parts have moved, some have changed, some did both. The parts that did neither do not appear in the diff.

I still think I'd prefer that, be for diffs or when looking at the content of one file, but well, i won't die if others feels strongly against it :-)

yched’s picture

Well, looks like @alexpott made "do not order by weight in exported YAML files" the official recommendation in http://chapterthree.com/blog/principles-configuration-management-part-one.

Still not really convinced personally, but I'll shut up now :-)

Berdir’s picture

I need this in #2248795: Entity displays fetch the bundle field definitions for deleted bundles, added it there as I also need to add it in a second place.

katzilla’s picture

FileSize
1.74 KB

Added the test for sorting

alexpott’s picture

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 13: issue_2350537_comment_13.patch, failed testing.

alexpott’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
1.74 KB

No commit credit please - reuploading patch to preserve test result because it is unrelated and odd.

Berdir’s picture

Note that we have a slightly different patch in https://www.drupal.org/node/2248795

alexpott’s picture

@Berdir do you have a preference for doing the sort in the toArray() or preSave()?

yched’s picture

preSave() and toArray() are slightly different. preSave() changes the runtime object before it gets serialized to storage. toArray() is where the serialization happens. If we want to do stuff purely on the storage format, I'd say toArray() is the correct place.

Plus, doing the ksort in preSave() caused weird fails in #2248795: Entity displays fetch the bundle field definitions for deleted bundles that forced us to to the sort also in init(), where it doesn't belong. toArray() worked fine over there. See #37 / #38 / #43 over there.

alexpott’s picture

Status: Reviewed & tested by the community » Needs work

As per #18 and #20 then.

yched’s picture

Berdir’s picture

Sounds good to me. On mobile, so keeping it short.

alexpott’s picture

The thing is the tests break if you move the sort to toArray() because the when you save only what is written out is sorted so you have to reload the configuration entity from disk to get the new order.

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.6 was released on August 2, 2017 and is the final full bugfix release for the Drupal 8.3.x series. Drupal 8.3.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.4.0 on October 4, 2017. (Drupal 8.4.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.4.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.4 was released on January 3, 2018 and is the final full bugfix release for the Drupal 8.4.x series. Drupal 8.4.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.5.0 on March 7, 2018. (Drupal 8.5.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.5.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.6 was released on August 1, 2018 and is the final bugfix release for the Drupal 8.5.x series. Drupal 8.5.x will not receive any further development aside from security fixes. Sites should prepare to update to 8.6.0 on September 5, 2018. (Drupal 8.6.0-rc1 is available for testing.)

Bug reports should be targeted against the 8.6.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.8.x-dev

Drupal 8.6.x will not receive any further development aside from security fixes. Bug reports should be targeted against the 8.8.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.9.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Ghost of Drupal Past’s picture

Assigned: Unassigned » Ghost of Drupal Past
Ghost of Drupal Past’s picture

Issue summary: View changes
Status: Needs work » Needs review
FileSize
2.96 KB

Here's a patch, ground up approach, I will attempt reusing the test tomorrow.

I noted in the IS I am aware I revert the relevant changes from #2388867: Notifying user of config changes when config has never been synched makes no sense -- but that's not a problem, this patch achieves the same goal but without storing weights.

daffie’s picture

Straight reroll.

Status: Needs review » Needs work

The last submitted patch, 34: 2350537-34.patch, failed testing. View results

Ghost of Drupal Past’s picture

Issue summary: View changes
alexpott’s picture

So at the moment all the weights are changing but at least everything stays in the same place because everything is sorted by the key and that's the fieldname. We did that to fix hard to solve conflicts and incorrect merges when a field is intentionally moved. I'm not convinced that changing this to not store weight and not sort by display is a good idea. I think a better solution is to make the javascript weight tool only change weights when it has too. This would benefit all config that have weights that can be set by the javascript table row drag thingy.

Notes on the patch - if we do remove weight then we need to remove it from the schema too.

Ghost of Drupal Past’s picture

We are at an impasse. It seems no matter what we do, it confuses git greatly. Even if JS keeps the weights, if you have a field with weight 4 in one branch and a new field in the other branch has weight 4 , git knows nothing about our schema and will anchor on the weight: 4 line and make the mother of all messes.

But what about blasting the file apart ?

If the display settings of each field were a separate file, well, life is easy. Even if there is a weight conflict arising from reordering on both branches, there's no mess of different field display configs melting into each other, it's just one line which is trivial to resolve, easily scriptable, even. One could use the ours/theirs merge strategies per field as adequate.

Even better, we could do this only on export/import and keep the active storage the same.

  1. Add a new annotation key to config entities much like config_export say config_shard and have it contain a list of config key paths. This would not only handle entity displays but filter format filters, paragraphs could shard settings.handler_settings.target_bundles_drag_drop, field_group could shard third_party_settings.field_group etc.
  2. Shard file names are like core.entity_view_display.node.book.tile.yml.shard.content.body.
  3. Exporting. We could change the config.storage.sync service to return a sharding capable storage. Much like the new config.storage.export it could be "smarter" storage, one that talks to the entity type manager and collects a list of config entity prefixes and shard lists and then do the sharding in FileStorage::write. The latter just needs to match the name with a prefix, iterate the paths to shard, then iterate the results of a NestedArray::getValue and write. It sounds more complex than it is. :)
  4. Importing. Simply have the file storage look at filename.shard.suffix files read them and if there is nothing in the original then set them there.
alexpott’s picture

@Charlie ChX Negyesi that feels super super complex. Is this really worth it? Field changes are not that common and re-ordering the fields is even less common. And when they are conflicts between prod and dev you have a mess to sort out but it is a mess you've got into by allowing production and dev change at the same time. When this type of thing occurs on projects I've worked on I recommend taking the production changes back to dev and then re-building the dev changes on top.

Ghost of Drupal Past’s picture

Version: 8.8.x-dev » 8.9.x-dev
FileSize
6.72 KB

Now that I realized (and edited in) that reads do not require any communication to the entity manager, while certainly there's some complexity, it's not that bad. Look at this. Also? I can just take this to contrib if core doesn't want to deal with it. As always, I am probably working on more complex problems than most. Such is life. We just had a team of ten people doing config changes on production and various branches for weeks. I was at my wits end and this week I broke config merge so craftily it took us several hours of intense debugging to figure out what happened.

This patch is proof-of-concept level.

alexpott’s picture

Well on some projects the sharding strategy is going to result in hundreds of files which comes with a cost too. I still think that pursing a table sort that doesn't always change all the weights would help minimise this.

alexpott’s picture

@Charlie ChX Negyesi. also - yeah different projects with different dev / prod workflows have different pain points. And I do recognise the pain and am up for trying to find a core solution if we can do that without breaking the 1 config object to 1 config file relationship. Breaking that feels very disruptive.

Ghost of Drupal Past’s picture

Assigned: Ghost of Drupal Past » Unassigned

I will continue in contrib. I have a very good idea now -- shard into directories like configdir/.core.entity_view_display.node.test.teaser (collections starting with dots do not work anyways so that's not a problem) and have files like content.body.yml in there. But any sharding will break potx so it's not tenable for core. And fixing JavaScript is not something I can do.

Ghost of Drupal Past’s picture

https://www.drupal.org/project/pax has been released, I am unfollowing this issue.

andypost’s picture

Version: 8.9.x-dev » 9.2.x-dev
nikitagupta’s picture

Status: Needs work » Needs review
FileSize
6.75 KB

Rerolled patch #40.

andypost’s picture

Status: Needs review » Needs work

Patch is not working

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

andypost’s picture

karishmaamin’s picture

Re-rolled patch for 9.3.x

andypost’s picture

alexpott’s picture

Thinking about this issue some more - maybe there's another way. We could change how the javascript that changes the weights for a drag and drop table works and we could prioritise not changing existing weights. That way we'd benefit all configuration that using a drag and drop table.

andypost’s picture

donquixote’s picture

Coming from #3251174: Don't use weights to store fields order in Field UI.
I don't see test cases in the recent patches, so not sure how this is meant to work.

Anyway, I think I have a proposal that will solve the problems without creating additional files.

Problems with weights..

@alexpott:

So at the moment all the weights are changing but at least everything stays in the same place because everything is sorted by the key and that's the fieldname. We did that to fix hard to solve conflicts and incorrect merges when a field is intentionally moved. I'm not convinced that changing this to not store weight and not sort by display is a good idea.

This is good, but..

I think a better solution is to make the javascript weight tool only change weights when it has too. This would benefit all config that have weights that can be set by the javascript table row drag thingy.

Even if we store a "canonical" weight always, this will still cause considerable noise when inserting or removing.

From the linked issue:

Traditionally, Drupal uses weights to define the order of items (e.g. fields) in a list (e.g. entity display settings or form settings).
The benefit is that in config items can remain in alphabetic order, making it easier to keep track of e.g. display settings for a specific field in the git history of a project.

However, using weights in this way has two problems that often lead to git merge conflicts:

  • The same order can be represented as different combination of weights. E.g. the order ['b', 'c', 'a'] can be represented as '[a: 2, b: 0, c: 1] or as [a: 99, b: 15, c: -9].
  • Changing the position of a single item can cause weight changes in an unlimited number of other items. This can cause plenty of merge conflicts.

Even if we could solve the first problem by introducing a "canonical" weight, the second problem would still persist. E.g. when inserting a new item, all subsequent item weights need to be incremented.

Proposal: List of keys

Instead of storing weights, store the order as a list of machine names.
The actual field config entries remain in alphabetic order.
This works both for view displays and form displays.

content:
  body: {..}
  field_event_date: {..}
  title: {..}
order:
  - title
  - field_event_date
  - body

What about field groups?

Fields within field groups don't need to be in this list, because field groups have their own order.
However, field groups now also need to be in the order list.
We might need to prevent name clashes if field group and field have the same machine name (should not happen, but still..).
I noticed field_group is still in contrib, I did not check if core has any group support. Seems this would be a clear BC break, if a contrib module needs to change its implementation.

third_party_settings:
  field_group:
    group_event_info:
      children:
        - field_event_date
        - field_event_location
content:
  body: {..}
  field_event_date: {..}
  field_event_location: {..}
  title: {..}
order:
  - title
  - field_group:group_event_info
  - body
donquixote’s picture

Btw

Well, looks like @alexpott made "do not order by weight in exported YAML files" the official recommendation in http://chapterthree.com/blog/principles-configuration-management-part-one.

Url has changed:
https://www.chapterthree.com/blog/principles-of-configuration-management...
(please chapterthree, enable redirect module! or was the link in #11 always wrong?)

I no longer see weights mentioned there. Instead I now see this:

Sort configuration keys in a predictable order that does not change often

Luckily, with the solution I am proposing, we can have our cake and eat it too :)

alexpott’s picture

Even if we store a "canonical" weight always, this will still cause considerable noise when inserting or removing.

This is not necessarily true. The biggest annoyance here is when you make a small change and you end up with lots of change. In my opinion we could design an algorithm that minimised the amount of weight change by the JS that meant that inserting a single field rarely means adjusting any other weight. Same with moving a field. I do think it would be a big win if the change we make here was not specific to field configuration but specific to tabledrag as doing this will improve more things.

For example the default behaviour could be always to have a gap of ten when you add a field. And when you move the JS it would maintain this gap and it should be aware of which weights are already saved and therefore not adjusted. And if it has to choose between maintaining the gap and changing a weight that is saved it should always adjust the gap.

Berdir’s picture

That is interesting, one problem is that the default weight form element goes from -50 to 50 I think, so we'd need to adjust that to considerably higher numbers, which also means a lot more markup.

bircher’s picture

RE #54: This is a neat idea but unfortunately only solves some problems while also creating other ones.
In particular the new order sequence is inherently unsortable itself because the information it contains is the order.
So if you diff and merge it is not easy to get back the original. (I know git can do that by saving the lines above and below, so in non trivial changes you will just get a merge conflict, but in for example Config Split 2.x we just work with arrays and the array merge just adds it to the end. For that a way to sort the result is important in order to avoid unnecessary diffs.)

#56 is an interesting idea. instead of bigger numbers we could also switch from int to float and make the algorithm prefer whole numbers.

alexpott’s picture

@bircher - oohh using floats... that is an awesome idea.

Berdir’s picture

float would require config schema changes though and I guess we would need to replace the select with a number field? might be ok, but again, sounds very much non-trivial to change the generic tabledrag API like this.

Pasqualle’s picture

In Drupal 7 it was possible to use float values as weight. I was using it a lot.
In D8+ it works also, just need to change the schema, but the config export is quite ugly with lots of !!float

Float sort also discussed in #2757207: Introduce "stable sort by weight" functions to sort arrays (and objects)

donquixote’s picture

@alexpott #56

Even if we store a "canonical" weight always, this will still cause considerable noise when inserting or removing.

This is not necessarily true. The biggest annoyance here is when you make a small change and you end up with lots of change. In my opinion we could design an algorithm that minimised the amount of weight change by the JS that meant that inserting a single field rarely means adjusting any other weight. [...]

We can do this. But then it is not "canonical" anymore.
To me, "canonical" means that the same order always results in the same file.

In #54 I mention two "problems":

  1. The same order can be represented as different combination of weights. E.g. the order ['b', 'c', 'a'] can be represented as '[a: 2, b: 0, c: 1] or as [a: 99, b: 15, c: -9].
  2. Changing the position of a single item can cause weight changes in an unlimited number of other items. This can cause plenty of merge conflicts.

The algorithm you (and others below) propose would solve the second problem (up to a limit, I guess), but not the first problem. Over time it can lead to a creeping infection (spicy language chosen to add flavor), depending how it is implemented.

E.g. if we use floats, "infected" config could look like this:
- 0
- 0.25
- 0.375
- 0.4375
- 0.46875
- 0.5
- 0.7109375
- 1
- 2

This can lead to ever more digits, and the non-canonicalness means that two developers that reorder fields in the same way, or almost the same way, will see a bigger diff than they should. And who wants to resolve a conflict with such fractal weights?
Is this not something we should try to avoid?

The solution I propose, with the extra 'order' entry solves both problems.
Or that's what I think - discussion below.

-------

@bircher (#58)

RE #54: This is a neat idea but unfortunately only solves some problems while also creating other ones.
In particular the new order sequence is inherently unsortable itself because the information it contains is the order.
So if you diff and merge it is not easy to get back the original. (I know git can do that by saving the lines above and below, so in non trivial changes you will just get a merge conflict, but in for example Config Split 2.x we just work with arrays and the array merge just adds it to the end. For that a way to sort the result is important in order to avoid unnecessary diffs.)

I don't quite get it.
What is "get back the original"?
And what happens in config split that cannot be implemented in a different way?

The solution I propose has a small version of the same problem as the very original idea of sorting all of the field config entries.
But on a much smaller scale, and much less confusing to git.
The problem with the original proposal is that all the field config is being reordered, whereas in my proposal only the machine name is being reordered. This way, git won't rip apart the config entries and build Frankensteins out of them. Or at least much less so.

Btw the same conflicts that would occur with the proposed solution already do occur with field group, where the sorted list of entries within a field group is part of config (somewhat redundant, because we already store weights for those fields).

-----

This said: A solution with non-canonical weights and creeping infection would be less of a BC break than the new "order" key.

bircher’s picture

Ok I get that switching to floats would be a schema change and would probably be a lot more work than it is worth.

But if we were to just add a factor of 10 and keep things integers there would still be plenty of gaps for a smart algorithm to find numbers that would change less frequently. This way there wouldn't be anything creeping and in most cases one would still look at integers in a reasonable range. Of course then when there is no space, the algorithm has to decide to reassign weights of elements you didn't touch, but that is still better than the current situation. (I could get nerd-sniped finding such an algorithm, that sounds very interesting)
But yes this would result in different weights for BA => AB depending on whether you sort B after A or A before B.

And a bit a longer explanation of why I think #54 is not something we should do.
The new key is inherently unsortable because the order of the elements is the information kept in the list. As such the information is stored in the array list keys. That is just not a very practical thing to deal with. It is equivalent to what you get if you array_flip it now you have the names as keys and the weight as values, except with the constraint that the weights must go from 0 to len()-1. Of course for a human to review the yaml not seeing the weights and reordering the fieldnames the way they should be when resolving a merge conflict is easier. But for a algorithm it is not so easy either way.
A practical example with Config Split (But this affects all the modules using the config transformation API which do more than adding or removing whole config files.) Config Split 2.x has a feature to "split modules as if they had been uninstalled" (Something core currently doesn't do for the excluded modules #3230825) It does so by taking the active config and asking core what would change if a module would be uninstalled and then taking that second config and save that to the sync storage and in order to do the reverse on import it saves a "config patch" in the split storage. The patch is created essentially with the recursive array_diff utility, and it is merged back with the array_merge utility. It is a bit more complicated than that because it doesn't work great with things that are array_is_list (from php 8.1) But essentially when you do array_diff and array_merge you lose the information about the position in the array of the thing you removed and then added back.
If there is a way to sort the array then you get the order back and there is no diff when importing the config. If there is no way to sort it then we would have to find a way to record that position the element is in based on the other elements, but they could change independently too and in git this would be a merge conflict, but in a config transformation you can not stop and ask a user to solve a merge conflict, so we would have to just chose one way to solve it and hope that it was what it should be. We have yet to see how often that is the case (we currently cheat with the sorting so it is not ideal).
I am considering looking into how difficult that would be since config sequences are only rarely sorted even in core unfortunately. But being able to just sort the config would definitely be much easier. So I am trying to make all config sortable and thus solving the problem for everybody that tries to do something similar as config split. That is why I try to avoid inherently unsortable config sequences.

donquixote’s picture

@bircher (#63)

And a bit a longer explanation of why I think #54 is not something we should do.

Wow, somehow I missed this recent reply. Sorry for that.
Btw, more paragraph breaks would be nice :)

Problems when splitting a list

I think it boils down to the problem of merging two pieces of such a list, while restoring the original order of the full list.
This would indeed be difficult..

The new key is inherently unsortable

From the rest of the post, I understand you want to "merge" two separate parts of config, and then sort everything into a canonical order.
This way you don't have to think about the position of where to insert the entries from the other part of config.

Unfortunately I listened to too many people who say "sort" when they mean "manually order", which is the opposite of what you and me mean by "sort".
I think what we mean is we treat the data not as a list, but as a collection/set, meaning there is no explicit order. To guarantee a deterministic storage process, we can either store it in a "canonical" order (sorted by machine name or some other criterion), or in separate files.

Conceptually, anything we manually order with tabledrag sort is a list (of distinct items, typically), not a collection/set. The non-canonical weights are merely a technical artifact. The storage model proposed in #54 attempts to reflect this 1:1.

A fragment of a list, which has to be combined with other fragments later to restore the original order, is no longer a "pure" list, but a "list with gaps". For the original use case, these gaps should be considered redundant information, but they are no longer redundant if we plan to recombine it with other pieces.

Our current storage model for these cases is a sorted map + weights. The sorted map is how we store a collection/set, and with the integer weights this makes a list + gaps. With float weights it would be like custom-placed points on a number scale.
But is sorted list + weights the best way to store a list + gaps? See below..

A practical example with Config Split (But this affects all the modules using the config transformation API which do more than adding or removing whole config files.) Config Split 2.x has a feature to "split modules as if they had been uninstalled"

First, I want to be clear that we would not use the storage method from #54 for the module list itself (core.extension.yml). Module weight is not comparable to the usual tabledrag weight lists. Not something to argue about, just a point for completeness.

in order to do the reverse on import it saves a "config patch" in the split storage. The patch is created essentially with the recursive array_diff utility, and it is merged back with the array_merge utility.

This already sounds limited, e.g. the patch can only add things, never remove. But probably this is ok 99% of the time, because most modules don't remove stuff when they are installed.

It is a bit more complicated than that because it doesn't work great with things that are array_is_list (from php 8.1)

array_merge() behaves in undesirable ways for any array with any integer keys, not just for pure sequential arrays: https://3v4l.org/Q9NqI
If we want to treat all keys as associative, we can use "+" operator or array_replace().

But essentially when you do array_diff and array_merge you lose the information about the position in the array of the thing you removed and then added back.

True.

If there is a way to sort the array then you get the order back and there is no diff when importing the config.

True.
All of this because we are really merging two sets, rather than two lists, if the weights are treated as properties of each item.

Weights getting out of sync?

I see one limitation: If somebody operates the tabledrag while the split-out modules are uninstalled, then merging the split-out entries back in will insert these entries in unexpected positions, because the weights are no longer in sync.

I assume the idea is that site builders should always work with all modules installed.

How relevant is this in practice?

Before we look into alternatives, I wonder:
Are there real-world scenarios where a config split would remove items from a list with custom (tabledrag) order?
E.g. would the uninstalling of modules in a config split lead to field instances being removed? Image effects? Text filters? Language selection methods?
It seems technically the answer is yes.
In practice, my wild guess is that for most sites where this happens, only a small number of lists would be split up in this way - but still more than zero, so it is a problem to be dealt with.

V2 of #54: List + gaps

As mentioned above, currently we are using a sorted list + weights to store a list + gaps.
The gaps are needed if we later want to combine this with another fragment to restore a complete list with the original order.

An alternative way to store such a list + gaps would be to store spaces or blank lines between machine names.
E.g. like so.

Original:

order:
- one
- two
- three
- four
- five

Split, main part:

order:
- one
- " "
- three
- four
- " "

Split, patch:

order:
- " "
- two
- "  "  # Long gaps are combined as a single entry for readability.
- five

Note that this is equivalent to the following weights-based storage with sorted map:
(for this example, weights start at 1, not 0).

Split, main part:

four: {weight: 4}
one: {weight: 1}
three: {weight: 3}

Split, patch:

five: {weight: 5}
two: {weight: 2}

We can modify this model so that the main reduced list won't have any gaps, and only the "patches" have gaps.

Split, main part:

order:
- one
- three
- four

Split, patch:

order:
- " "
- two
- "  "  # 2x gap.
- five

To merge this, we can start with the patch version, and fill up the gaps with entries from the main version.

There are some questions what to do if we have multiple patches, and how to deal with duplicates, but all of this can be solved.

How would the recursive merge algorithm determine which nested array to treat in this special list-with-gap merge, instead of the usual array_merge() / "+" / array_replace()?
-> This can be determined either by a config schema, or by a heuristic.
(I remember doing this in custom code in the past, with a schema-like structure to determine the array type)

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.