Problem/Motivation

Based on a rather frustrating IRC discussion, we apparently aren't all on the same page about the use of UUIDs in config entities. (See #1969698-9: ConfigEntity::save() should disallow saving ID/UUID conflicts (Field UUID changes can badly corrupt field data).)

Proposed resolution

Discuss. Details to follow.

Remaining tasks

Identify the usecases for UUIDs, the problems with using them, and alternatives.

#2077435: Use a yml file to create the forum vocabulary
#2106243: Use yml files to create the forum module's comment and taxonomy term fields

Files: 
CommentFileSizeAuthor
#58 1969800_58.patch34.21 KBchx
PASSED: [[SimpleTest]]: [MySQL] 59,223 pass(es).
[ View ]
#58 interdiff.txt957 byteschx
#52 1969800_51.patch32.88 KBchx
PASSED: [[SimpleTest]]: [MySQL] 59,230 pass(es).
[ View ]
#48 1969800-48.patch30.84 KBtayzlor
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 1969800-48.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#46 modified-configs.txt2.88 KBherom
#44 1969800-uuid-44.patch24.38 KBmtift
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 1969800-uuid-44.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#44 interdiff.txt2.29 KBmtift
#42 1969800-uuid-42.patch22.1 KBmtift
PASSED: [[SimpleTest]]: [MySQL] 58,313 pass(es).
[ View ]
#42 interdiff.txt5.99 KBmtift
#32 1969800-uuid.patch16.51 KBtayzlor
PASSED: [[SimpleTest]]: [MySQL] 57,957 pass(es).
[ View ]
#32 interdiff.txt1.61 KBtayzlor
#25 1969800-25.patch14.9 KBswentel
PASSED: [[SimpleTest]]: [MySQL] 55,478 pass(es).
[ View ]
#25 interdiff.txt1.38 KBswentel
#23 1969800-23.patch16.29 KBswentel
FAILED: [[SimpleTest]]: [MySQL] 55,488 pass(es), 0 fail(s), and 1 exception(s).
[ View ]
#23 interdiff.txt3.33 KBswentel
#21 uuid-1969800-21.patch12.95 KBxjm
FAILED: [[SimpleTest]]: [MySQL] 55,367 pass(es), 3 fail(s), and 0 exception(s).
[ View ]

Comments

One usecase for UUIDs (of many) is helping a site owner understand when a piece of configuration is something they already have configured that they want to update, versus something that is a "different" entity that they may or may not want to replace their existing data with.

@D34dMan posted these scenarios for our consideration:

April 14th 2013 by D34dman { xjm, webchick, Crell, beejebus, merlinofchaos, et all }

There are two main resources with same name Resource-A and Resource-B

Case 1:
Resource-A: Module 'pants' written by author 'Tailor' has UUID 'clothes-000'
Resource-B: Module 'pants' written by author 'Sportsman' has UUID 'Actions-000'

Resource-A is already installed and now when trying to install Resource-B

Action Needed
1. CORE should not delete Resource-A.
2. Do something else than "aeeiiiiiiiiii....." Give chance to
Deffer/Overwrite/Update!

Case 2:
Resource-A: Module 'pants' written by author 'Tailor' has UUID 'clothes-000'
Resource-B: Module 'pants' written by author 'Fashion Designer' has UUID 'apparels-000'

Resource-A is already installed and now when trying to install Resource-B

Action Needed
1. CORE should delete Resource-A.
2. Install Resource-B

Case 3:
Resource-A: Module 'pants' written by author 'Tailor' has UUID 'clothes-000'
Resource-B: Module 'pants' written by author 'Tailor' has UUID 'clothes-000'

Resource-A is already installed and now when trying to install Resource-B

Action Needed
1. Update Resource-A to Resource-B

Case 4:
Resource-A: Module 'pants' written by author 'Tailor' has UUID 'clothes-000'
Resource-B: Module 'pants' written by author 'Tailor's Junior' has UUID 'clothes-111'

Resource-A is already installed and now when trying to install Resource-B

Action Needed
1. Update Resource-A to Resource-B

My perspective, and what led me to "discovering" #1969698: ConfigEntity::save() should disallow saving ID/UUID conflicts (Field UUID changes can badly corrupt field data), is that part of the purpose of CMI is to allow us to create truly robust installation profiles and distributions. IMO, out of the box, an installation profile should initially be exactly the same everywhere I install it.

Additionally, I think that it makes a ton of sense for there to be a canonical UUID that identifies "the view.view.frontpage that ships with the node module in 8.0" or "the my_foo field instance that mymodule provides by default" or "the Magic Pony content type that Super Awesome Distro provides". This was also @heyrocker's suggestion to resolve #1969698: ConfigEntity::save() should disallow saving ID/UUID conflicts (Field UUID changes can badly corrupt field data) when I described the problem to him last week.

short version: we need UUIDs in default config.

longer version.

during install, i think it makes sense to say "the config system will not override an existing config thingie file". during module install, we will only create new config thingies, we will not overwrite existing ones.

if a module ships thingie.foo.yml, and thingie.foo.yml exists in the active config, we can do one of:

- block the module install, with an appropriate message like 'can't install, because config thingie.foo.yml already exists'
- proceed with the install, but don't change thingie.foo.yml in the active store, and show a message like 'i didn't install thingie.foo.yml, because it already exists'

either way, from the config system's point of view, module install is about creating new thingies only. (and obviously, we don't support delete/rename etc during module install.)

contrast this to import.

during import, updates/deletes/renames of existing config thingies must be supported. at the end of an import, we *must* have thingie.foo.yml updated to the version from the staging dir, end of story.

for the simple update case, unique machine names are all we need. we don't need UUIDs to just update from staging.

so, why do we need UUIDs? because we are assuming that for some sub-systems, there is a big difference between an update and a 'delete-create', and a big difference between a 'rename' and a 'delete-create'. so we need to build a system that allows such sub-systems to know the difference, and do the right thing.

but wait, there's more! :-)

we want to handle module enable/disable events during import. during a module install, we copy stuff from a module's config dir into the active config dir. we absolutely want the UUIDs for that default config to match across environments, right? so, if we don't ship UUIDs in the default config, how do we achieve that?

so yeah, UUIDs in default config. let's do that.

OK, so if we ship with UUIDs, then does it matter what format they are in? The UUID generator in core already supports 2 implementations and there can be more. The UUID that gets written to contrib files could very well depend on whatever UUID implementation the maintainer is using.

i'm not sure i understand the concern in #4.

are UUIDs created by one implementation not compatible with others?

I think the issue is that they might not look the same, thats all. Pretty minor.

Restoring tags.

Note that we recently fixed #1964254: Configuration schemas missing langcode and uuid at places, so any removal of uuids should take care of config schemas. Also any additions should check they are also in the schema for the given entity.

UUIDs are crucial to the config import process, they are the only way to tell between "update" and "delete + create new with same name".

Field API might be the only ConfigEntity in core today for which the difference is crucial, mostly because other current systems do not really care between update & delete/create since they are basically just "definitions of some runtime behavior" while fields have associated persistent data to maintain. But core could see other similar cases, and contrib most definitely will.

The current shape of the import process is :
- config system interprets the set of imported config into a series of create, update, delete
- it then calls each configEntity storage controller: "here, this item is new, this item is to be updated, this item is to be deleted, deal with it"
So it's not the task of each subsystem to sort out whether an update is really a delete/create, this has to be handled upstream, and UUIDs are the only way.

re "So it's not the task of each subsystem to sort out whether an update is really a delete/create, this has to be handled upstream, and UUIDs are the only way." from #9 - yes, i agree completely.

the config system should figure out delete/create vs update.

fwiw custom_block module uses custom block uuids in its block plugin ids, instead of using the serial id.
That way the config entities it creates reference the record in {custom_block} table by uuid instead of serial id.
Note also #1757452: Support config entities in entity reference fields

WRT to #12, I think that's actually different because those are UUIDs referencing content entities stored in configuration entities, which is already something we do in a number of other places (like shortcut) without handling it properly, and fail to do in other places (like taxonomy). This issue though is about UUIDs for the configuration entities themselves, and when and whether we provide them in default config.

Just adding some things to this issue:

#1892634: Notice and error when trying to import block: another example why UUID's are so important and why I still think the patch #1969698: ConfigEntity::save() should disallow saving ID/UUID conflicts (Field UUID changes can badly corrupt field data) is not our problem, and not only related to Field API. Note, the issue is relatively old so I'm not sure if it's still the case. It's just that it's so easy to ship something with the same machine name, but different settings and 'break' your site, where 'break' can be a simple string change to complete breakdown.

Another related issue: http://drupal.org/node/1609418#comment-6459520 - pointing directly to the comment. I've been mulling about a 'human friendly solution + uuid in config file names' this evening and came to the same idea as in that comment. IMHO it's not as scary and still readably, and almost a 100% guarantee we can give for D8 to make configuration (whether it's create, update or delete) work perfect :)

- edit - so the comment also points to a direct call in PHP code, I think we can skip that, somehow ..

so yeah, I'm with yched and beejeebus, let's discuss this more in Portland :)

For sake of full disclosure, here are the concerns @webchick outlined in the other issue:

Summary of IRC discussion...

For those who are not as familiar with CMI and don't "grok" the problem from the issue summary / what's in the video...

Basically, what's happening is a module or profile that defines a ConfigEntity defines will create default config such as:

profiles/foo/config/user.role.administrator.yml:
id: administrator
label: Administrator
weight: 2
langcode: en

When this gets written out during installation / module enable / whatever, it'll end up in the site's active directory as:

sites/default/files/config_XXX/active/user.role.administrator.yml:

id: administrator
uuid: b8474e77-6317-4e8b-9922-344eea1358d1 # Some auto-generated UUID.
label: Administrator
weight: 2
langcode: en

The UUID is unique (obviously), and so if the dev / prod sites are both two separate installations, they will end up with two different UUIDs. I was not aware of this auto-magic UUID generation thing before, so I guess that went in in some patch I forgot about. :P~ I might even have committed it. ;)

The "Synchronize configuration" page on prod therefore gets confused, and thinks every single one of the ConfigEntity files between dev and prod changed, and tries to "helpfullly" replace all of them when you sync. I'm not sure exactly which one of them caused the fatal error/exception that blew up the site in the video, but basically this is Un-Good, because if this happens with a field, it will cause data loss. There was also a concern expressed about what happens if two modules each define a View or something called "myevents"... xjm was saying that ideally, it would inform you you had two and let you decide which one to delete.

So the proposal is to add UUIDs to default config so they can match between environments.

As a module author, for anything less trivial than say a View or something similar with 50,000 entries in it, I would be inclined to just make these default YAML files by hand like I do in other applications where I've used YAML. I mean, we picked YAML as the format in large part because it's a nice human-editable format. Having to make up a UUID and add it to the file is a pretty non-standard thing and annoying thing to have to do. (Apparently, if I called $entity->save() somewhere in my module, this would auto-generate one for me, which is great, but I wouldn't have known that trick had it not been told to me.)

And my worry is that distro authors (who are generally not as sophisticated as module authors) aren't going to understand this concept and are merely going to copy/paste files from core/standard to their own distros, change all of the other values in that file (id, name, etc.) but leave UUID alone because they don't know what it is and it looks scary, and then end up with real problems because the UUIDs match for totally different things, and possibly 3-4 of them. :P

So the question I posed on IRC is whether or not this could just be handled with just machine names, which are already unique, are easy to define by module authors, and will already match between dev and prod, and what the consequences would be if we did that. The end.

Thanks for linking #1609418: Configuration objects are not unique; somehow I wasn't following that one.

Just wanted to poke in here, since I was just doing a site demo and ran into this issue. I am basically 100% in agreement with the above observations that default config must contain a UUID. Webchick's point about machine names is only valid if we remove the ability to edit machine names after creation (that situation is the whole reason we need UUIDs in the first place.) I would actually be OK with this solution to be honest, but it would require a lot more discussion obv.

My situation was similar to the one xjm ran into in the original issue. I installed Drupal, made some changes, then took a fresh install and copied the entire source active config into the dest staging config directory. When I went to synch I had a hundred changes that were nothing but UUIDs, and the import failed on the field issue.

Whatever the solution, if people can't wholesale copy their entire config directory from one site to another without encountering this situation then I kind of think the entire thing is pointless.

We had a long meeting around this at DrupalCon Portland consisting of myself, alexpott, xjm, swentel, yched, effulgentsia, beejeebus and several others. Long story short, everyone came to the agreement that default config needs to have UUIDs. The reasons and implications have been expanded on in great detail above.

So what we need to come out of this is a patch to all default config that adds a UUID, and I believe we also need some code in the upgrade path that detects if the config being upgraded is actually an implementation of default (based on machine name) and adds the existing UUID into it if so. Non-default config with the same machine name as default config get the default config's UUID, I think this is a rare edge case anyways.

Assigned:Unassigned» xjm

Thanks @heyrocker!

And it just so happens I have the beginnings of such a patch from #1969698: ConfigEntity::save() should disallow saving ID/UUID conflicts (Field UUID changes can badly corrupt field data). :)

Title:[META] Discuss how and whether the configuration system, config entities, or their implementations should or shouldn't use UUIDsAdd UUIDs to default configuration
Status:Active» Needs review
StatusFileSize
new12.95 KB
FAILED: [[SimpleTest]]: [MySQL] 55,367 pass(es), 3 fail(s), and 0 exception(s).
[ View ]

Here's some.

Status:Needs review» Needs work

The last submitted patch, uuid-1969800-21.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new3.33 KB
new16.29 KB
FAILED: [[SimpleTest]]: [MySQL] 55,488 pass(es), 0 fail(s), and 1 exception(s).
[ View ]

Do we also add them to the default config in test modules ?

Added some more and removed them from the field test files just to be consistent with the rest in case we don't add them to test files.

Also, those failures pass locally for me.

Status:Needs review» Needs work

The last submitted patch, 1969800-23.patch, failed testing.

StatusFileSize
new1.38 KB
new14.9 KB
PASSED: [[SimpleTest]]: [MySQL] 55,478 pass(es).
[ View ]

Oh sweet irony

Status:Needs work» Needs review

Status:Needs review» Reviewed & tested by the community

I think this issue is ready. Or not?

Status:Reviewed & tested by the community» Needs review

Hoops upgrade path code is missing, see @heyrocker

In Portland we discussed that if any piece of config on the D7 site has a machine name that matches the machine name of a piece of default config in D8, then it should be considered that config and have the D8 default config's UUID added to it rather than having a new one generated. Yes this leaves an edge case where someone could have possibly created a new field with the same machine name as a D8 default field, but I'm honestly not overly concerned.

Assigned:xjm» Unassigned
Status:Needs review» Needs work
Issue tags:+Needs upgrade path, +Needs upgrade path tests

StatusFileSize
new1.61 KB
new16.51 KB
PASSED: [[SimpleTest]]: [MySQL] 57,957 pass(es).
[ View ]

Updated patch which adds in some more uuids that are required.

Status:Needs work» Needs review

Status:Needs review» Needs work

Didnt notice the tags for upgrade path + tests, setting back to needs work.

Status:Needs work» Needs review
Issue tags:-Needs upgrade path, -Needs upgrade path tests

@alexpott and I just discussed this, and we decided that it actually doesn't make sense to add an upgrade path to make the UUIDs the same for D7 data. In that situation, the canonical source for those fields is not the default config directory that ships with Drupal 8.

So, let's add these as they are. I think the next step is then to fail validation on installation if the UUID is not already present in the config entity, to ensure that we don't inadvertently add more default entities without UUIDs in the future.

Status:Needs review» Needs work

See https://drupal.org/node/1969698#comment-7640335

We need default config files for article and page body as well.

suppose same should be done for forum module. body field must be added only for content types created via ui. probably that needs tests

Issue tags:+sprint

Tagging

I think the files that need UUIDs contain the text "id:" (including tests and install profiles with config), end in ".yml", and do not currently contain "uuid". Therefore, I will try and find the files that need updating using:

grep -rl "id: " core/ | grep .yml | xargs grep -L uuid

#40 might help, but this won't give you all of them
- they don't necessarily contain 'id:' - using 'id' as 'the name if the id property" is only a convention, I don't think all of core respects it
- they might already contain 'uuid' (field.instance.*.yml files have a 'field_uuid' property). You probably want to look for '^uuid'

StatusFileSize
new5.99 KB
new22.1 KB
PASSED: [[SimpleTest]]: [MySQL] 58,313 pass(es).
[ View ]

This script is not perfect, and the results required a fair amount of cleanup, but it helped me find more files in need of UUIDs and essentially explains the reasons for the updated patch:

#!/bin/bash
# Start with a clean Drupal installation
sudo rm -rf ~/Sites/d8/sites/default/files/*
drush sql-drop d8 --database=d8 --yes
drush -y si --account-pass=admin --db-url="mysql://root:root@localhost/d8"
# devel is required to use drush dd below
drush en devel -y
# Figure out the configuration directory
CONF_DIR=$(ls -d sites/default/files/config_*);
# This grep craziness:
#  1. gets all modules that contain config_prefix
#  2. pulls the text between the quotes
#  3. removes the quotes
#  4. removes everything after the dot
#  5. removes duplicates
MODULES=$(grep -r "config_prefix = "" ./core | grep -o "".*"" | tr -d '"' | sed 's/\.[a-z_]*//g' | uniq)
# Enable all of the modules that contain configurables
drush en -y $MODULES
for m in $MODULES
do
  # "config_query_test" & "display" are not modules that can be enabled
  if [ $m != "config_query_test" ] && [ $m != "display" ]; then
    DIR=$(drush dd $m)
    if [ -d "$DIR/config/" ]; then
      FILES=$(ls $DIR/config/)
      for f in $FILES
      do
        # Check the the file is not a schema file, and that is exists in both
        # the active directory and the default configuration directory
        if [ -f $CONF_DIR/active/$f ] && [ -f core/modules/$m/config/$f ] && [ $f != "schema" ]; then
          sudo cp $CONF_DIR/active/$f core/modules/$m/config/$f
        fi
      done
    fi
  fi
done

Status:Needs work» Needs review

Status:Needs review» Needs work
Issue tags:+Novice
StatusFileSize
new2.29 KB
new24.38 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 1969800-uuid-44.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

I found more that need default UUIDs using this method:

  1. Install two instances of Drupal 8 (origin and destination) and enable all modules
  2. Apply the patch to both sites (git apply --index 1969800-uuid-44.patch)
  3. Export the origin configuration at admin/config/development/export to config.tar.gz
  4. In the destination site, navigate to admin/config/development/import and upload config.tar.gz
  5. You should see something like "58 changed" even though you didn't change anything.

This will provide you with a list of files that need UUIDs. You can get a UUID to add to the attached patch by clicking "View Differences."

Also, these (and others, probably) do not provide default config and will need special treatment:

action.action.user_add_role_action.administrator
action.action.user_remove_role_action.administrator

Assigned:Unassigned» herom

I'm doing this.

Assigned:herom» Unassigned
StatusFileSize
new2.88 KB

unassigning myself.
couldn't figure out how to deal with the ones without default config.
but for anyone who can, here's a list of configs that were modified, after following #44's method.

the first two words is the config prefix (e.g. breakpoint.breakpoint). the rest is the config id.

The patch #44 is failing (on HEAD revision of 8.x) due some refactoring on issue #2082499.

Actually other files seems to be renamed as well, like menu config files for example.

StatusFileSize
new30.84 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 1969800-48.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Here's a re-rolled patch that should apply that also includes a couple more UUIDs that were missing. We still need to think about the ones without default config.

Not trying to derail this issue, but this issue is related #2100043: Dynamically generated default YML use randomized UUIDs.. (Cross posted on recommendation Beejeebus).

Priority:Major» Critical
Status:Needs work» Needs review

Status:Needs review» Needs work

The last submitted patch, 1969800-48.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new32.88 KB
PASSED: [[SimpleTest]]: [MySQL] 59,230 pass(es).
[ View ]

Added the and fixed a couple casts while in there.

Status:Needs review» Reviewed & tested by the community

Perfect!!

Issue tags:+8.x-alpha4

An alpha4 without working CMI would be a very sad alpha4.

Status:Reviewed & tested by the community» Needs work

The last submitted patch, 1969800_51.patch, failed testing.

That's a bogus fail. Re-testing.

Status:Needs work» Needs review

#52: 1969800_51.patch queued for re-testing.

Status:Needs review» Reviewed & tested by the community
StatusFileSize
new957 bytes
new34.21 KB
PASSED: [[SimpleTest]]: [MySQL] 59,223 pass(es).
[ View ]

Moar UUIDs needed. Also, no reason this is not RTBC.

I agree. RTBC.

Issue summary:View changes

Updated issue summary.

would love to see this go in, it's holding up #2106171: Write tests for simple configuration deployment scenario.

Status:Reviewed & tested by the community» Fixed

Sorry, meant to do this before babysitting duty but didn't quite make it. :)

Committed and pushed to 8.x. Thanks!

There was also a small follow-up commit to fix a "rue" to "true".

Title:Add UUIDs to default configurationChange notice: Add UUIDs to default configuration
Priority:Critical» Major
Status:Fixed» Active
Issue tags:+Needs change record

Oh, and this'll need a change notice.

Do we have a way to force module developers to put in a UUID into their default entities?

We could make config importer refuse to run. We should also add $uuid to ConfigEntityBase

#64 sounds like it should be a separate issue.

This needs to be rolled back. #2110615: Do not ship with default UUIDs

Issue summary:View changes

Updated issue summary.

Title:Change notice: Add UUIDs to default configurationAdd UUIDs to default configuration
Issue summary:View changes
Status:Active» Fixed
Issue tags:-Needs change record, -sprint

In the interest of being able to close this down, I added https://drupal.org/node/2153709 as a change notice. Note the title: "Add default UUIDs to shipped configuration entities (for now)". Continue debating in #2110615: Do not ship with default UUIDs and #2133325: Create a site UUID on install and only allow config sync between sites with the same UUID.

Status:Fixed» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.