Coming from #934050: Change format into string:

Relatively complex problem space; the below list of problems are depending on each other, so it does not make sense to split them into separate issues. The resolution needs to take all of them into account.

Problem

  • A field schema is set in stone after initial creation and cannot be updated.

Goal

  • Field type modules have to be able to programmatically update their field schema.

Required functionality
For field schema:

  • Add new column (e.g. new setting)
  • Change column settings (e.g. different size)
  • Remove column (e.g. removed setting)
  • Rename column (e.g. a field name was too long, had a typo or needs rename for other reasons)

For field(s):

Details

  • As with any other module, the field schema of field type modules will change over time. Currently, Field API tries to prevent any schema updates on the assumption that the schema update has been triggered via Field UI. While limiting the possibilities of the user interface makes sense to some extent, there must be a way for field type modules to perform intended, safe, and wanted field schema updates, just like any other module has to be able to update its schema. Field type modules are no exception to that rule.

    AFAICS, Field UI already implements the necessary logic to prevent users from updating a field schema through the UI. The current patch does not change the runtime API of field_update_field(), so it still disallows field schema changes. Instead, an update helper function is introduced, which basically is the same as field_update_field(), but works within the update.php environment and does not disallow schema changes.

EDIT: Splitted the other issues with field_update_field() into #937554: Field storage config entities 'indexes' key

Issue fork drupal-937442

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

LaurentAjdnik’s picture

Tagging

alex_b’s picture

chx’s picture

Version: 7.x-dev » 8.x-dev
Category: bug » feature
Priority: Critical » Normal
Status: Needs review » Needs work

There is a whole lot of problems when you try to change the schema of a field and it's not something you can solve in a generic way. While adding / dropping indexes is possible (it does not touch data), changing and / or columns is not something that is possible most of the time. The lack of bulk field update API bites us here but that can't be helped. The two are inseparable. For example, if you want to change cardinality from 1 to 2 or from 2 to 3, will your widget and formatter be able to handle that? When you add a new column, but then how are you going to ensure that it contains the correct value? For PostgreSQL we might need to support USING clauses of ALTER TABLE.

You will throw another tantrum here that storage engines are bad, unusable, etc -- sorry, I can't help that either. The field storage engine concept is extremely strong -- it's not just that I am a rabid MongoDB fan -- when we designed the storage engine layer, MongoDB did not even exist. The motivation was an unsure hope in what will be called NoSQL databases (note that the field storage engine layer predates the wildspread use of this word) because SQL just plain sucks at storing this kind of data and the definite need for remote fields.

There is only so much that can be done in one release. We got extremely far this release but -- some work is left for contrib. Already entity module helps with singular updates, and batching them won't be hard, once that's done writing a module that copies the old field values into a new field and then drops the old will be possible.

If there will be a need to actually do a schema change we can write an update helper that changes the SQL tables directly, that's what the current update does. That does not mean we need to change the field_sql_storage.module.

sun’s picture

Version: 8.x-dev » 7.x-dev
Category: feature » bug
Priority: Normal » Critical

@chx, If you do not want to maintain your stored data, then that is your problem.

chx’s picture

Version: 7.x-dev » 8.x-dev
Category: bug » feature
Priority: Critical » Normal

http://drupal.org/node/361683#comment-1234964

Everyone agrees we do *not* want field migration on the main data-storage

Users scream about stuff. So it goes, we're pretty used to it. People who don't understand the complexities of schema changes aren't qualified to have their screaming heard.

You had almost two years to write this code. It did not happen. Drupal 8.

Damien Tournoud’s picture

Priority: Normal » Major

Not supporting schema changes was one of the initial design decisions of the Field API. Not just to make the lives of everyone difficult, but because it is a *very complex* matter.

The initial Field API patch has been first marked as RTBC more then one year and a half ago. Field update is not something we can or should handle so late in the Drupal 7 cycle.

This problem can be investigated and figured out in contrib. Two possible directions:

  • Supporting updates of field storage outside of the current Field API (and maybe only for the SQL Storage)
  • Migrating data from one version of the field to the next via import/export
alex_b’s picture

#3 / #5 / #6:

Does this mean that it would be acceptable for the Drupal 7 cycle that a schema update helper like _update_7000_field_update_field() in the patch at hand does not invoke 'hook_field_update_forbid'? (It does right now.)

Or in a more general way: what is the workaround for the limitation that field schema's can't be changed? The intention of this thread is not to find a general solution for schema changes, but to allow module maintainers to do a controlled schema change on the hook_update_N() level.

sun’s picture

Version: 8.x-dev » 7.x-dev
Category: feature » bug
Priority: Major » Critical

What Alex said. (Field type) modules need a way to update fields during module updates. In any way. Not doing so violates Drupal's #1 principle: Stored user data is retained and updated.

webchick’s picture

Here's what I don't understand.

I used to maintain a module called OG Block Visibility. In it, there's a table that effectively foreign keys off to block.delta in core. Let's say at some point we decide it's a good idea to make block.delta a varchar(128) field. Here's how we would do that:

1. In core, write a block_update_XXXX() that does a one-line db_change_field() call.
2. File a bug with OG Block Visibility saying "Hey, core changed its schema to varchar(128). You should too, or your users are going to get their data truncated."
3. Write a og_block_visibility_update_XXXX() with a one-line db_change_field() call.
4. There is no step 4.

What we don't do is make it Drupal core's responsibility to be both omniscient and psychic, and derive meaning from database schemas to infer that such-and-such a column is actually a block.delta, and make that change accordingly on behalf of the module author. Which is good, since core is not both omniscient and psychic, and so will probably get it wrong.

I fail to see how this situation is any different.

IMO, core should be handling its own format columns, which are filter.format, filter_format.format, users.signature_format, and any fields that are type text_with_summary, text_long, whatever (which can be determined from rows in the field_config table). That's where core's responsibility begins and ends: core's modules' own data.

Will this method work with PBS module or Fancy Format module? Probably not! Because those are contributed modules, they need to come up with its own mechanism to handle core's schema change. Just as OG Block Visibility does.

Now, a generic API for performing field updates sounds like a wonderful feature. For Drupal 8. In the meantime, KarenS and joachim are both trying to solve this problem in contrib, and have been for the past year or more. We can't hold 7.0's release up on this.

What am I missing?

chx’s picture

webchick posted in the wrong issue based on an IRC discussion. Please disregard. We will get back later.

LaurentAjdnik’s picture

I think that, for now, if modules want to update the fields they created, they should do it their very own way. At least, they have the complete knowledge of what's in them, therefore on how to make them evolve.

It would be a pity to block the release of Drupal 7 until we manage to find a generic way do it.

I do not feel like it's a violation of the "saving data principle" since none is lost. And it would be the choice of the module coders to change something in their code that would require changes in fields. Not Drupal's fault :c)

Note: I will skip my turn in the "change-the-priority-everytime-a-comment-is-posted" game ;c)

chx’s picture

Version: 7.x-dev » 8.x-dev
Category: bug » feature
Priority: Critical » Normal

Enough of this. It wastes everyone's time and we do not go ahead. It's not feasible to fix this in Drupal 7.

In Drupal 7, neither field type modules nor fields can change their schemas. The root cause of this is not storage (which might or might be fixable) but widgets and formatters that expect certain data structures and might or might not break when they get something different. This is simply not fixable within this release cycle. Only move back if you have something to say about this. The demand to change anyways is just that: a demand that is not supported. Not everything can be done in one release.

What webchick wanted to post into this issue is "Core handling generic field schema changes sounds like a 18-month long problem to solve. I don't see us solving it by 7.0. KarenS and joachim have been working on it for months and afaik neither has achieved success." but she needed to leave and posted in hurry to the wrong issue. Edit: oh I see she actually posted that too but mixed it up with the two approaches that were discussed for the filter table update.

sun’s picture

Version: 8.x-dev » 7.x-dev
Category: feature » bug
Priority: Normal » Critical
Status: Needs work » Needs review

I have troubles to follow the logic.

1) No one suggested to do a "perfect" solution. Nothing is perfect. And nothing will ever be perfect.

2) While the problem space may not be new and may have been discussed during initial Field API discussions ~2 years ago, no one despite the developers being involved was actually aware of that. Don't tell me I wouldn't have worked on core. That's plain ridiculous.

3) I perfectly understand that more complex and more troublesome updates are a possible scenario, too. However, the use-case at hand does not involve any configurable field logic. Text module *knows* that 'format' is an integer. If that conversion from integer to string would be a problem, then all the other module updates throughout core would fail equally. This simple use-case is 100% solvable with the tools at hand (resp. in the patch). No one ever suggested a solution for doing more complex field updates.

4) By performing manual queries within the Text module update, any other field storage engines will not notice that a field update happened at all. We can surely do that, but it's slightly beyond me that exactly those people that are actively using other storage engines are objecting a patch that makes other storage engines aware of a field update (attempt).

To summarize:

5) All that is being suggested is a solution for *simple* field schema updates. I'm perfectly aware of the fact that there can be more complex scenarios. Scope creep.

6) The actual update at hand is safe, because it does not involve configurable field schema settings. Text fields are not going to be the only fields that will have to do such simple updates.

7) The proposed changes allow other field storage engines than SQL to react to the field update, if they need to, or simply skip the update.

8) We can easily add some comments to explain that the functionality is unsuitable and must not be used for more complex field updates. That's an entirely different can of worms anyway.

Conclusion:

What is the point of disallowing easy cases, just because there are more troublesome cases? The proposed changes are merely a small step forward. And nothing else.

Damien Tournoud’s picture

Category: bug » feature
Priority: Critical » Major

Drupal 7 or Drupal 8, this is a feature request. It doesn't prevent anything from working, so it cannot possibly be critical.

We have no declarative API for database schema updates. We don't have it because it's a difficult problem where the start and end points are not sufficient because the path taken can influence the end result.

This patch introduce a half-baked API where:

- you can add / remove / change columns
- you can add / remove / change indexes
- you cannot rename columns
- you cannot migrate data from one column to the other

We need to be aware of the limitations but, all things considered, this is not that bad.

chx was worried about the data (ie. schema) / formatter / widget dependencies. My guess is that this is something the modules that provide field types, formatters or widgets will have to deal with. I can expect that most schema modifications required by the fields in contrib will be backward compatible (for example: adding a column).

sun’s picture

Yes, that's exactly what I mean. Also, slight correction: HEAD already supports indexes, but only indexes. This patch adds columns, because we happen to need it.

gordon’s picture

subscribe

an.droid’s picture

+1 to sun.
subscribe.

Crell’s picture

Without commenting on the particular implementation at the moment, if you have a field that, after the initial release, you decide needs to have one property out of a serialized options column split out to its own column, what do you do?

This exact scenario happened multiple times to CCK field modules in Drupal 6. If there's no way to do that in Drupal 7, then that's a clear regression. I agree it's a complicated problem, but it's still a regression if field modules have no upgrade path available to them.

adrian’s picture

The address field for d6 also had a couple of schema changes to be able to more accurately record addresses.

Damien Tournoud’s picture

If you have a field that, after the initial release, you decide needs to have one property out of a serialized options column split out to its own column, what do you do?

@Crell, in that case, you create the new column first, and the migrate the data over in a batched update.

alex_b’s picture

#20 - so how does that work in concrete terms? That basically means you can't keep your field type name when you change your schema, correct? So field type 'date' would need to turn into 'date2'? Am I missing something here?

LaurentAjdnik’s picture

@alex_b:

Crell's example is a bit different. He intends to extract some serialized info from an existing column into a new one.

In that case, Damien's solution works perfectly: create the new column, and then run an update on the two columns.

The restriction is about changing the type of an existing column (from text to number, or from date to text, for instance) or the cardinalities.

This involves many possibilities and complex behaviors, especially since it has to run on any storage type and take care of widgets/formatters, that won't be solved in Drupal 7's Field API.

Damien Tournoud’s picture

@alex_b: if we decide to go ahead with this patch, you can keep your field type name, but you can only do "simple" schema modifications. Every thing else (like migrating data from one column to the other) will need to be done by an expensive batched migration:

$data = $query
  ->fieldCondition($field)
  ->execute();

foreach ($data as $entity_type => $entities) {
  $entities = entity_load($entity_type, array_keys($entities));
  foreach ($entities as $entity_id => $entity) {
    // Update the field.
    [...]
    field_attach_update($entity_type, $entity);
  }
}

As a (probably better) alternative, the field can also "overlay" the previous data structure: on load, convert the old structure to the new one and leverage the natural lifecycle of the entities to progressively migrate the data.

yched’s picture

wow, subs for now.

alex_b’s picture

So after talking to some people (among them chx and sun) and thinking more on this, I really don't know what's the way forward here. IMO, swappable storage engines for fields with significantly limited possibilities for managing schema changes is broken. That said, I don't know the solution.

(I'd like to emphasize that this issue is about modules managing changes to their field schemas - schema changes induced by site building tasks is a different topic.)

Our options for schema maintenance seem to be these:

a) No changes allowed: Allow schema changes to field instances without data. Once there is data in field instances, assume that only index changes are necessary (this is the status quo).
b) Migration: Create a new field of a new type, migrate data from field instances of the old type to field instances of the new type. This seems to be the somewhat agreed upon solution for upcoming Drupal major releases. I have serious doubts around how practical this is (create a new field type every time the field's schema changes?). At the very least it will require a much more robust batch API and hence is not fit for a recommended course of action for Drupal 7 module maintainers.
c) Comparison based delegation: A change API based on hook_field_storage_update_field($field, $prior_field) that detects changes between the field specs passed in and changes the current schema. This is the path we started down in D7 and this patch expands on it. Many have commented that this is not a realistic approach for all kinds of problems with reliably detecting changes between schema specs and adjusting them. This is also the reason why we have hook_update_N() in Drupal. That said, Damien hinted in #14 that within limitations, this could be a workable solution. I'd like to learn more about this because frankly, while knowing that this approach isn't a general solution (trust me, I've been burned) - I have a hard time fathoming the actual limitations of this approach.
d) Granular Change API: Come up with an abstraction layer where every storage engine would have to implement the equivalents of db_change_field(), db_remove_index(), db_add_index() etc. This is really just and idea at this point and I don't know how feasible this is.
e) Schema version based delegation: A hook_field_update_[storage_engine]_N() style approach where core provides update hooks for the sql storage engine and storage engine maintainers need to provide equivalents. hook_update_N() may be even sufficient for this approach. This way our community winds up writing [storage engines] x [field modules] x [update hooks] update hooks. Not ideal. But maybe a compromise that carries us through 7?

Note that it is entirely possible to just change a field's schema with hook_update_N() not heeding the conceptual limitations of the storage layer abstraction. Nothing stops contrib modules from changing their schemas the way they used to in CCK times. I have a strong hunch that this is what we will see in D7 contrib.

For Drupal 7 I think at the very least we need clear documentation on this limitation. Module maintainers need to understand what the implications of changing a field's schema are.

I know this issue is branching off a pretty heated debate around making the filter format field a string #934050: Change format into string - I really don't care about filter formats here. This is larger than making filter formats exportable. By all the debate that managing schema changes had during the past couple of years, I think it's worth giving this problem another very hard thought before the Drupal 7 beta cycle comes to an end.

alex_b’s picture

Improved on #2. I summarize:

- This is a #25 c) style approach.
- Introduces a _update_7000_field_update_field($prior_field, $field) update helper that invokes hook_field_storage_update_field()
- Expands field_sql_storage_field_storage_update_field() to allow for adding, changing and dropping fields using db_add_field(), db_drop_field() and db_change_field().

Changes to #2:

- Keep field_sql_storage_field_update_forbid() in place.
- Don't invoke hook_field_update_forbid() from _update_7000_field_update_field($prior_field, $field).
- Adjust tests accordingly.

Patch requires fixes from #937554: Field storage config entities 'indexes' key. I am attaching one patch with and without #937554 applied.

I am submitting this for review, I am not convinced that this is the recommendable course of action for Drupal 7.

Status: Needs review » Needs work

The last submitted patch, 937442-26_maintain_field_schema.patch, failed testing.

chx’s picture

- Introduces a _update_7000_field_update_field($prior_field, $field) update helper that invokes hook_field_storage_update_field()

i stopped reading here. never call hooks in updates. I thought I was clear on IRC and hopefully in issues like this: we never upgrade field schemas. Write a bulk update API in contrib, due to the lack of entity CRUD you can't really in core, copy to a new field (quite possibly massaging the data as needed), drop the old field.

As for the filter change: gather the core SQL tables that have filter columns and update their schema and disregard everything else. End of both stories.

Damien Tournoud’s picture

b) Migration: Create a new field of a new type, migrate data from field instances of the old type to field instances of the new type. This seems to be the somewhat agreed upon solution for upcoming Drupal major releases. I have serious doubts around how practical this is (create a new field type every time the field's schema changes?). At the very least it will require a much more robust batch API and hence is not fit for a recommended course of action for Drupal 7 module maintainers.

Ok, so for Drupal 7, the only possible course of action is this. Your two concerns here are partially bogus. First, field's schema should not change that often, so it doesn't really matter if the operation is costly or not. Second, the whole update process relies on the Batch API *anyway*.

We need to make sure that it is and remains possible to migrate data from one field to the other, meaning:

  • Create a new field field_name_new
  • Migrate data in batch from field_name to field_name_new
  • Delete field_name
  • Rename field_name_new to field_name

As far as I know, the current API provides everything except a way to rename a field.

Crell’s picture

chx: To make sure I understand #28, you're saying that for the format int/string conversion we should ignore abstraction and rely on the fact that we know that will be a local DB table in practice, and for any existing D7 sites that have that field in a non-SQL storage make it their problem?

I am not objecting to that approach, I just want to make sure I'm understanding it correctly.

alex_b’s picture

Component: field system » documentation

#28

never call hooks in updates

Right, this is just proof of concept for #25 c). Look at the code that winds up getting invoked (field_sql_storage_field_storage_update_field()) this pattern could be done in an update safe way. Doesn't change anything about the larger problems of this approach.

As for the filter change: gather the core SQL tables that have filter columns and update their schema and disregard everything else.

We're on the same page here.

#29

the whole update process relies on the Batch API *anyway*

Sure, there's just a speed difference between db level operations and entity API level load/save operations.

#28 / #29

So looks like there's no way around field migration. Where's the best place to document this?

yched’s picture

Component: documentation » field system

"As far as I know, the current API provides everything except a way to rename a field."

Would the mongo backend be able to rename a nested property in all its node records ?

yched’s picture

Component: field system » documentation

did not mean to change component

sun.core’s picture

Status: Needs work » Needs review
amitaibu’s picture

ouch, I've been bitten by this as-well, when wanting to drop some columns form OG tables.

Alan D.’s picture

Component: documentation » field system
Issue tags: +Needs documentation

Subscribing, this is going to be a pain for multicomponent fields like the name field.

I need three new columns added to achive all of the new features requested, so if now 1 by 1, I'll end up with 4 field types and the need to write update scripts to convert one field type to the next field type for every step. Then to clean up the mess latter, the old field types would need to be removed, risking users being caught up in the inactive field issues.... This is way to messy and time consuming, so I will have to stop features that require schema changes until a work around is found.

During the course of Drupal 7 development, I became aware of the policy not to do any changes that may lose user data, but I would not have imagined that adding a new column was going to be an issue. I had to read through the code last night to see that the core system only does indexes.

While this is very late in the development cycle, I would love to see the DB maintainers to have another look at this. I have to agree with sun about most developers would not have known about the way core has implemented ccks' handling of the schema changes.

Cheers
Alan

Alan D.’s picture

Pinging some life into this. Two feature requests in a week, each requiring a change (new column) to the schema....

We need to make sure that it is and remains possible to migrate data from one field to the other, meaning:

Create a new field field_name_new
Migrate data in batch from field_name to field_name_new
Delete field_name
Rename field_name_new to field_name

As far as I know, the current API provides everything except a way to rename a field.

So any pointers for contrib maintainers on a safe way to do this?

Create a new field field_tmp
Migrate data in batch from field_name to field_tmp
Delete field_name
Create a new field field_name
Migrate data in batch from field_tmp to field_name
Delete field_tmp

Cheers

n-tuple’s picture

Issue tags: -Needs documentation

Hi

Just want to throw my 10 cents in.

In many cases changes in application business logic imply changes in field schema definition.

In my case i have developed a n-tuple family of modules (see my profile) that heavily relies on cck's ability to adopt field schema api at runtime.

catch’s picture

Version: 7.x-dev » 8.x-dev
barraponto’s picture

This blocks OG #1277842: Data type of gid column in og_field_schema is float instead of int, we need some way to deal with this in contrib.

thedavidmeister’s picture

Status: Needs review » Needs work

There's no patch here that needs review or applies to the current 8.x HEAD.

jhedstrom’s picture

Adding a link to a related issue/patch that is often used in D7 for modules such as geocluster.

Is this still an issue in 8.x, or can field schemas be altered? If not, is it too late for 8.0?

Alan D.’s picture

I think both are dead threads with core maintainers voting these ideas down. And also would probably be 9.x feature requests now :(

chx’s picture

Status: Needs work » Closed (won't fix)

> A field schema is set in stone after initial creation and cannot be updated.

Correct.

> Field type modules have to be able to programmatically update their field schema.

Nope, just migrate over. We will get to D8 sources ... eventually. Help is welcome.

jhedstrom’s picture

Status: Closed (won't fix) » Active

This comes up very frequently with the (contrib) Date module. It seems a bit harsh to assume that site builders will perform a migration of an old date field to a new one if, for example, they want to collect timezone, make the field repeating, or add/remove an end date from the field. None of those are currently possible once a field has data without a non-trivial update hook that circumvents the logic core has in place to prevent this.

Dave Reid’s picture

This would also make the Multifield module possible in D8. Right now there's even a bug that the field cannot maintain it's field indexes with field_update_field() either: #2003746: field_read_fields() omits indexes from hook_field_schema

Alan D.’s picture

A field schema is set in stone after initial creation and cannot be updated.

Note that it is possible in 7.x, but you have to check the storage engine, do the updates etc manually. I done this with partial date alpha-3 to alpha-4 upgrade, details can be found here: #1345514: Upgrade path Alpha3 to Alpha4. Since non-sql would not be upgraded and it was only an alpha, the code was never committed.

@jhedstrom
Yep, it sucks. The best way I found was to define everything in the field schema irrespective if it is used or not. :(

Lots of unused columns, but I guess that is only really a byte of wasted storage per record; it just looks sloppy when you look under the hood.

plach’s picture

Issue tags: +entity storage
jhedstrom’s picture

Version: 8.0.x-dev » 8.1.x-dev

Feature request => 8.1.x.

(This seems like more of a task than a feature, but moving for now.)

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.0-beta1 was released on March 2, 2016, which means new developments and disruptive changes should now be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

jibran’s picture

I think @chx will disagree with #44 now after writing dynamic_entity_reference_update_8001 in #2555027: Support non-numeric entity ID's.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.0-beta1 was released on August 3, 2016, which means new developments and disruptive changes should now be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Anybody’s picture

I just ran into the same situation where we had to add a new column to our schema (field type extending Drupal\Core\Field\FieldItemBase and defining its schema there).

I couldn't and still can't believe that there is no automatism or even helper function to update the schema accordingly for field columns. Having a look at modules like

  • field_collection
  • dynamic_entity_reference (#51)
  • address
  • viewsreference
  • ...

or other modules which provide more or less complex field type columns I saw that alle module maintainers wrote > 150 lines of code for each update hook which is nearly the same.

It includes:
Defining the new / updated / deleted fields
Identifying the entities which use the field type
Checking if the entity type(s) is/are revisionable
adding / updating / removing columns in the database schema

From my point of view there are several ways this could be handled cleanly with more or less automatism / automatic comparison and detection of schema updates in the long term.

But for now what is really needed first are at least helper functions to use in HOOK_update and a field schema change API documentation to stop writing the same code again and again in every module, which should not be the modules task at all. This is being reflected by the major priority of this task already.

Perhaps module maintainers of mdoules like the modules listed above could help with their experience? How can we proceed and can we get a plan / general agreement on this?
A first best practice implementation would also help as first step for field type module maintainers.

Thank you all very much.

jibran’s picture

Anybody’s picture

Title: Field type modules cannot maintain their field schema » Field type modules cannot maintain their field schema (field type schema change C(R)UD is needed)
Anybody’s picture

I created a gist with two helper functions for module maintainers as a first "solution"...

I needed them myself to add new columns to a field type and change a column configuration: https://gist.github.com/JPustkuchen/ce53d40303a51ca5f17ce7f48c363b9b Please feel free to fork / add

And of course we still need a better solution in core.

robbin.zhao’s picture

I posted my worked code, and hope it works for others.


/**
 * Add a new column for fieldType
 * @param string $field_type
 * @param sring $new_property
 */
function field_definition_add_helper($field_type, $new_property) {
  
  $manager   = \Drupal::entityDefinitionUpdateManager();
  $field_map = \Drupal::service('entity_field.manager')->getFieldMapByFieldType($field_type);
  
  foreach ($field_map as $entity_type_id => $fields) {
    
    foreach (array_keys($fields) as $field_name) {
      $field_storage_definition = $manager->getFieldStorageDefinition($field_name, $entity_type_id);
      $storage = \Drupal::entityTypeManager()->getStorage($entity_type_id);
      
      if ($storage instanceof \Drupal\Core\Entity\Sql\SqlContentEntityStorage) {
        $table_mapping = $storage->getTableMapping([
          $field_name => $field_storage_definition,
        ]);
        $table_names = $table_mapping->getDedicatedTableNames();
        $columns = $table_mapping->getColumnNames($field_name);
        
        foreach ($table_names as $table_name) {
          $field_schema = $field_storage_definition->getSchema();
          $schema = \Drupal::database()->schema();
          $field_exists = $schema->fieldExists($table_name, $columns[$new_property]);
          $table_exists = $schema->tableExists($table_name);
          
          if (!$field_exists && $table_exists) {
            $schema->addField($table_name, $columns[$new_property], $field_schema['columns'][$new_property]);
          }
        }
      }
      $manager->updateFieldStorageDefinition($field_storage_definition);
    }
  }
  
}

/**
 * Remove a column of field_type
 * @param string $field_type FieldTypeId in your definition
 * @param string $property column name
 */
function field_definition_delete_helper($field_type, $property) {
  $field_map = \Drupal::service('entity_field.manager')->getFieldMapByFieldType($field_type);
  foreach ($field_map as $entity_type_id => $fields) {
    foreach (array_keys($fields) as $field_name) {
      $storage = \Drupal::entityTypeManager()->getStorage($entity_type_id);
      _field_property_definition_delete($entity_type_id, $field_name, $property);
    }
  }
  
}

/**
 * Inner function, called by field_definition_delete_helper
 * @param string $entity_type_id
 * @param string $field_name
 * @param string $property
 */
function _field_property_definition_delete($entity_type_id, $field_name, $property) {
  $entity_type_manager  = \Drupal::entityTypeManager();
  $entity_update_manager = \Drupal::entityDefinitionUpdateManager();
  $entity_storage_schema_sql    = \Drupal::keyValue('entity.storage_schema.sql');
  $entity_definitions_installed = \Drupal::keyValue('entity.definitions.installed');
  
  $entity_type = $entity_type_manager->getDefinition($entity_type_id);
  //$field_storage_definitions = $entity_field_manager->getFieldStorageDefinitions($entity_type_id);
  $field_storage_definition = $entity_update_manager->getFieldStorageDefinition($field_name, $entity_type_id);
  $entity_storage = \Drupal::entityTypeManager()->getStorage($entity_type_id);
  /** @var Drupal\Core\Entity\Sql\DefaultTableMapping $table_mapping */
  $table_mapping = $entity_storage->getTableMapping([
    $field_name => $field_storage_definition,
  ]);
  
  // Load the installed field schema so that it can be updated.
  $schema_key = "$entity_type_id.field_schema_data.$field_name";
  $field_schema_data = $entity_storage_schema_sql->get($schema_key);
  
  //get table name and revision table name, getFieldTableName NOT WORK, so use getDedicatedDataTableName
  $table = $table_mapping->getDedicatedDataTableName($field_storage_definition);
  //try/catch
  $revision_table = NULL;
  if ($entity_type->isRevisionable() && $field_storage_definition->isRevisionable()) {
    if ($table_mapping->requiresDedicatedTableStorage($field_storage_definition)) {
      $revision_table = $table_mapping->getDedicatedRevisionTableName($field_storage_definition);
    }
    elseif ($table_mapping->allowsSharedTableStorage($field_storage_definition)) {
      $revision_table = $entity_type->getRevisionDataTable() ?: $entity_type->getRevisionTable();
    }
  }
  
  // Save changes to the installed field schema.
  if ($field_schema_data) {
    $_column = $table_mapping->getFieldColumnName($field_storage_definition, $property);
    //Update schema definition in database
    unset($field_schema_data[$table]['fields'][$_column]);
    if ($revision_table) {
      unset($field_schema_data[$revision_table]['fields'][$_column]);
    }
    $entity_storage_schema_sql->set($schema_key, $field_schema_data);
    //Try to drop field data
    \Drupal::database()->schema()->dropField($table, $_column);
  }
}

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Ghost of Drupal Past’s picture

#51 is correct. I have a better idea. https://gitlab.com/chx_/druidfire not a full solution but it's useful.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

rudolfbyker’s picture

#40 works like a charm for adding and removing columns from existing fields!

I did some code cleanup and made it into a utility class:

<?php

namespace Drupal\my_utilities;

use Drupal;
use Drupal\Core\Entity\Sql\SqlContentEntityStorage;

/**
 * Utilities for updating field type definitions.
 *
 * Based on https://www.drupal.org/project/drupal/issues/937442#comment-12586376
 */
class FieldTypeUpdateUtil {

  /**
   * Add a new column for fieldType.
   *
   * @param string $field_type
   *   The ID of the field type definition.
   * @param string $property
   *   The name of the property whose column to add.
   *
   * @throws \Drupal\Component\Plugin\Exception\InvalidPluginDefinitionException
   * @throws \Drupal\Component\Plugin\Exception\PluginNotFoundException
   * @throws \Drupal\Core\Database\SchemaObjectDoesNotExistException
   * @throws \Drupal\Core\Database\SchemaObjectExistsException
   */
  public static function addProperty($field_type, $property) {

    $manager = Drupal::entityDefinitionUpdateManager();
    $field_map = Drupal::service('entity_field.manager')
      ->getFieldMapByFieldType($field_type);

    foreach ($field_map as $entity_type_id => $fields) {

      foreach (array_keys($fields) as $field_name) {
        $field_storage_definition = $manager->getFieldStorageDefinition($field_name, $entity_type_id);
        $storage = Drupal::entityTypeManager()->getStorage($entity_type_id);

        if ($storage instanceof SqlContentEntityStorage) {
          $table_mapping = $storage->getTableMapping([
            $field_name => $field_storage_definition,
          ]);
          $table_names = $table_mapping->getDedicatedTableNames();
          $columns = $table_mapping->getColumnNames($field_name);

          foreach ($table_names as $table_name) {
            $field_schema = $field_storage_definition->getSchema();
            $schema = Drupal::database()->schema();
            $field_exists = $schema->fieldExists($table_name, $columns[$property]);
            $table_exists = $schema->tableExists($table_name);

            if (!$field_exists && $table_exists) {
              $schema->addField($table_name, $columns[$property], $field_schema['columns'][$property]);
            }
          }
        }
        $manager->updateFieldStorageDefinition($field_storage_definition);
      }
    }

  }

  /**
   * Remove a property and column from field_type.
   *
   * @param string $field_type
   *   The ID of the field type definition.
   * @param string $property
   *   The name of the property whose column to remove.
   *
   * @throws \Drupal\Component\Plugin\Exception\InvalidPluginDefinitionException
   * @throws \Drupal\Component\Plugin\Exception\PluginNotFoundException
   * @throws \Drupal\Core\Entity\Sql\SqlContentEntityStorageException
   */
  public static function removeProperty($field_type, $property) {
    $field_map = Drupal::service('entity_field.manager')
      ->getFieldMapByFieldType($field_type);
    foreach ($field_map as $entity_type_id => $fields) {
      foreach (array_keys($fields) as $field_name) {
        self::removePropertyFromEntityType($entity_type_id, $field_name, $property);
      }
    }

  }

  /**
   * Inner function, called by removeProperty.
   *
   * @param string $entity_type_id
   *   The ID of the entity type.
   * @param string $field_name
   *   The ID of the field type definition.
   * @param string $property
   *   The name of the property whose column to remove.
   *
   * @throws \Drupal\Component\Plugin\Exception\PluginNotFoundException
   * @throws \Drupal\Component\Plugin\Exception\InvalidPluginDefinitionException
   * @throws \Drupal\Core\Entity\Sql\SqlContentEntityStorageException
   */
  private static function removePropertyFromEntityType($entity_type_id, $field_name, $property) {
    $entity_type_manager = Drupal::entityTypeManager();
    $entity_update_manager = Drupal::entityDefinitionUpdateManager();
    $entity_storage_schema_sql = Drupal::keyValue('entity.storage_schema.sql');

    $entity_type = $entity_type_manager->getDefinition($entity_type_id);
    $field_storage_definition = $entity_update_manager->getFieldStorageDefinition($field_name, $entity_type_id);
    $entity_storage = Drupal::entityTypeManager()->getStorage($entity_type_id);
    /** @var \Drupal\Core\Entity\Sql\DefaultTableMapping $table_mapping */
    $table_mapping = $entity_storage->getTableMapping([
      $field_name => $field_storage_definition,
    ]);

    // Load the installed field schema so that it can be updated.
    $schema_key = "$entity_type_id.field_schema_data.$field_name";
    $field_schema_data = $entity_storage_schema_sql->get($schema_key);

    // Get table name and revision table name, getFieldTableName NOT WORK,
    // so use getDedicatedDataTableName.
    $table = $table_mapping->getDedicatedDataTableName($field_storage_definition);
    // try/catch.
    $revision_table = NULL;
    if ($entity_type->isRevisionable() && $field_storage_definition->isRevisionable()) {
      if ($table_mapping->requiresDedicatedTableStorage($field_storage_definition)) {
        $revision_table = $table_mapping->getDedicatedRevisionTableName($field_storage_definition);
      }
      elseif ($table_mapping->allowsSharedTableStorage($field_storage_definition)) {
        $revision_table = $entity_type->getRevisionDataTable() ?: $entity_type->getRevisionTable();
      }
    }

    // Save changes to the installed field schema.
    if ($field_schema_data) {
      $_column = $table_mapping->getFieldColumnName($field_storage_definition, $property);
      // Update schema definition in database.
      unset($field_schema_data[$table]['fields'][$_column]);
      if ($revision_table) {
        unset($field_schema_data[$revision_table]['fields'][$_column]);
      }
      $entity_storage_schema_sql->set($schema_key, $field_schema_data);
      // Try to drop field data.
      Drupal::database()->schema()->dropField($table, $_column);
    }
  }

}

Should we make this into a contrib module? Maybe including some of the utilities from https://gitlab.com/chx_/druidfire ?

Anybody’s picture

Assigned: sun » Unassigned

@sun has been inactive here for a long period of time. Removing assignment. Feel free to reassign, but this problem is huge for module developers.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Anybody’s picture

@Core Maintainers here, is there already any (future) plan to add a helper function to Core / Field System to update field properties like in #66?

It's not cool to add that boilerplate code to several modules to have a compact and inutitive way to update field schema definitions. Shouldn't this be a core job?

I think we need a general decision to proceed.

tim.plunkett’s picture

There hasn't been a patch since the D7 days. Can someone turn #66 into a patch (or MR!) and get this issue moving again?
Anyone can do this, there are no core maintainers blocking this effort.

Anybody’s picture

@tim.plunkett, thank you for your superfast and very helpful reply! :) You're totally right, with your response this got the expert feedback needed to proceed. I just wanted to ensure in #70 that this is going into the right direction and there isn't already any parallelly evolving or even already existing functionality for this in core.

I'll turn #66 into a first MR and hope to put it in the right place, at least to discuss. I was just frightened I'm missing the required core knowlegde here, sorry.

tim.plunkett’s picture

I definitely wouldn't call myself an expert in the field system! But I'm happy that my comment gave you some confidence :)

catch’s picture

This could use an issue summary update.

One of the bigger changes to the field API was the addition of more granular field definition update functions, see https://www.drupal.org/node/2554097

There's also some code examples here:
https://www.drupal.org/docs/drupal-apis/update-api/updating-entities-and...

Gives an idea where things are at the moment, but don't think anything changes scope here.

Turning #66 into a patch or MR is a good first step, and after that adding some test coverage.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

efpapado’s picture

dww made their first commit to this issue’s fork.

dww’s picture

Status: Active » Needs work
Issue tags: +Needs tests

MR 1910 is #66 from @rudolfbyker. I put it at core/lib/Drupal/Core/Field/FieldTypeUpdateUtil.php for now. Tagging for tests. I haven't reviewed this at all, just copy/paste #66 exactly. I probably should have at least run core commit check on it. ;)

Anybody’s picture

@dpi: You once closed #2843108: How to change type of fields with existing data as duplicate of this issue. I now once again ran into a case where I had to "change" the type of an existing field (in this case fontawesome_iconpicker to micon) in an update hook, which was a really crazy task even though the schema of both fields is exactly the same.

Technically, it would have been enough to simply change the "type" value in the field storage and the type in the entity_form_display.

I totally understand, why core doesn't simply allow a change of an existing field "type", even if it's that simple in some cases from a birds-eye view.

Cannot change the field type for an existing field storage.

So the current solution is:
https://www.drupal.org/docs/drupal-apis/update-api/updating-entities-and...

Which works, but putting this bunch of code into update hooks of multiple module which need to perform such a conversion, for example because fields are deprecated, field types are merged, ... is scary.

TLDR:

Beside the important field schema change helpers from this post, we should decide if #2843108 was really a duplicate, and so we should also add a functionality for changing a field type with existing data to be able to do such a conversion smarter in update hooks or create a separate feature request for that. I'd vote for separation here, as there's a lot to discuss in detail, like limiting to same schema, allowing to write migrations, define formatters, add callbacks etc. But also think a helper for that is very important for DX and prevention of code duplication.

Until that's solved, I suggested putting something like that in contrib: https://www.drupal.org/project/change_field_type/issues/3271790
But for update hooks of other modules, such a further contrib solution wouldn't be a good final solution.

Where should we put that feature request for such a core helper? Thanks!

Anybody’s picture

Issue summary: View changes

Updated summary to reflect the required functionalities.

Might this be a topic for the https://www.drupal.org/community-initiatives/bug-smash-initiative or is it too general / abstract?

Anybody’s picture

While @dpi's response to #80 is still open, I added three functions (without implementation yet), which were missing, but are required in cases where compatible updates have to be made to the field schema preserving the field data:

/**
   * Update a field schema to match the latest definition,
   * while preserving its data.
   *
   * Updates the whole property schema to match the latest definition
   * from the fields schema().
   * Note: The property schema change has to be compatible
   * with the existing data.
   *
   * @param string $field_type
   *   The ID of the field type definition.
   */
  public static function updateToCurrentSchema($field_type) {
    // @todo: Implement:
    // 1. Get all entity types using this $field_type.
    // 2. Optionally: Ensure the existing data is generally compatible with the new schema.
    // 3. Copy the existing data (as the tables have to be recreated).
    // 4. Update the field storage definition to current for all field instances
    // 5. Restore (insert) the existing data
  }
/**
   * Updates a property schema definition preserving its data.
   *
   * Use this, if you need to make a change to an existing
   * fields property schema, preserving its values.
   * Note: The property schema change has to be compatible
   * with the existing data and supported by database engines,
   * e.g. change "description", ""not null", "unsigned" or "default".
   * This can't do magic like changing incompatible types!
   *
   * @param string $field_type
   *   The ID of the field type definition.
   * @param string $property
   *   The name of the property whose property schema to update.
   * @param array $new_property_schema
   *   The new schema definition of the single property.
   */
  public static function updatePropertySchema($field_type, $property, array $new_property_schema) {
    // @todo: Implement:
    // 1. Get all entity types using this $field_type.
    // 2. Optionally: Ensure the existing data is generally compatible with the new schema.
    // 3. Copy the existing data (as the tables have to be recreated).
    // 4. Update the field storage definition to current for all field instances
    // 5. Restore (insert) the existing data
  }

not yet sure if we need both or which one is smarter. I think the first one is what you want in most cases to be up-to-date with the defined field schema again, but may be to implicit.
The second one would need to copy over the schema definition for the changed property once again and for example can't add / remove indexes. This should be discussed.

The third one comes from the discussion in #80 so I won't repeat what I wrote there:

/**
   * Changes the field type (name) to $new_field_type.
   *
   * Use this (only in the rare case) if $field_type is obsolete but
   * the existing field and its data should be kept as $new_field_type.
   * Note: The property schema of $field_type and $new_field_type have to
   * be the same!
   *
   * @param string $field_type
   *   The ID of the field type definition to rename.
   * @param string $field_type
   *   The new ID of the field type definition.
   */
  public static function changeFieldTypeName($field_type, $new_field_type) {
    // @todo: Implement:
    // See https://www.drupal.org/docs/drupal-apis/update-api/updating-entities-and-fields-in-drupal-8#s-example-updating-a-field-from-an-obsolete-type-to-a-new-type
    // And safe us from implementing the boilerplate code more than once :).
    // Also see https://www.drupal.org/project/drupal/issues/2843108
    // and https://www.drupal.org/project/drupal/issues/937442#comment-14458722
  }

The main reason for adding these functions without implementation to the MR is to ensure they're not forgotten, as they're also highly relevant here, but the "how" definitely has to be discussed!

Examples for use-cases are:

  • Make columns (NOT) NULL'ABLE
  • Add Index
  • ...

yogeshmpawar made their first commit to this issue’s fork.

GaëlG’s picture

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

amaria’s picture

I am still having a problem with this. After updating my field schema with a new column, and use code from #60/#66 in an update hook, it doesn't work.

When I get the columns from the table mapping...
$columns = $table_mapping->getColumnNames($field_name);
It does not contain the new property yet so it fails at this...
$field_exists = $schema->fieldExists($table_name, $columns[$property]);

Am I missing something before I run this code?

Anybody’s picture

Once again ran into this, now in #3363573: SQLSTATE[23000]: Integrity constraint violation: 1048 Column 'field_access_role_id' cannot be null. I'll ask @Grevil if he can help to push things forward here #74 + #79.

Could we get some core maintainer feedback on #80 and the implementation in general perhaps?
To ensure this goes into the right direction and is best practice? We shouldn't waste time, this is really essential to maintain field schema changes in contrib.

apmsooner’s picture

Same as #88 for me. The code examples the patch work is based on don't work exactly right for me. Tested with drupal 10 php 8.1.

  1. For adding new column, I think this $field_exists = $schema->fieldExists($table_name, $columns[$property]);
    .... should be this: $field_exists = $schema->fieldExists($table_name, $property);
    Otherwise it fails with php error...
  2. This $manager->updateFieldStorageDefinition($field_storage_definition); wipes out tables with existing data so I scrapped it and did my own thing. Through trial an error I got something working in a branch of a very complex module scenario. I don't claim it to be perfect but its working. The key part i needed was restoring the existing data after modification which isn't documented well anywhere. https://git.drupalcode.org/project/custom_field/-/blob/updater_service/s...

Open to thoughts and suggestions on my approach. It is indeed more complex than I anticipated. Anyhow... i set my implementation up as a service that can be used in a custom module .install file like example below. I feel like something in core could be similar way of doing it?

/**
 * Add a new column to custom_field.
 */
function my_module_update_9000(): void {
  /** @var \Drupal\custom_field\CustomFieldUpdateManagerInterface $update_manager */
  $update_manager = Drupal::service('custom_field.update_manager');
  $update_manager->addColumn('node', 'field_custom', 'test_decimal', 'decimal', ['scale' => 3, 'precision' => 6]);
}

/**
 * Remove a column from custom_field.
 */
function my_module_update_9001(): void {
  /** @var \Drupal\custom_field\CustomFieldUpdateManagerInterface $update_manager */
  $update_manager = Drupal::service('custom_field.update_manager');
  $update_manager->removeColumn('node', 'page', 'field_custom', 'test_value');
}
Anybody’s picture

Issue summary: View changes