Updated: Comment #176

NOTE TO COMMITTERS: please make sure @amateescu gets commit credits, he didn't post to this thread, but fixed a ton of fails in the helper issue. I think all other contributers have posted patches in the main issue.

Proposed commit message:

Issue #1497374 by yched, chx, Damien Tournoud, fago, plach, swentel, amateescu: Switch from Field-based storage to Entity-based storage.

This is part of #1949932: [META] Unify entity fields and field API

Problem/Motivation

Currently the entity system and the field API implement two different storage systems, making it possible to use two different storage engines with a single entity type (e.g. entity in db, fields in mongo). However, it turns out that this creates loads of problems when querying the entity - that's why the most popular different storage engines like mongodb implement a new field *and* entity storage - storing the whole entity in one storage back-end and making queries feasible.

This issue does not cover "pluggable entity storage" - it's about unifying storage systems. Supporting real pluggability is difficult and deserves its own issue.

Proposed resolution

Move storage completely to the entity system, i.e. move field API storage to the entity storage controller. For that entity storage controllers can opt-in to support "extensible storage" - what is the field API storage mechanism re-named and moved to the entity system.

Advantages

  • Much simplified code flow as we have single point of responsibility -> API/code cleanup
  • Eases writing new or custom storage engines a lot
  • The existing Field API storage would not be tightly coupled with field API any more, instead it could be re-used without having to use field API. For example, a module implementing custom logic for entity types could re-use "extensible entity storage" for its data storage without providing a configurable field in the UI.
    That patch gets us closer, but the storage logic cannot currently be reused for "base" fields (see #154)

Official support for having multiple field(/entity) storage engines in a single entity types is dropped. However, if really necessary it still can be done by implementing a new storage engine which just plugs together multiple, different storage engines.

Impact on Field API structures

As was agreed with @chx,@Crell & @alexpott in NYCCamp mid-July 2013, a direct consequence of "the storage controller is in charge of storing the whole entity" is that Field API "fields" are now specific to an entity type, and can only be shared across bundles of a given entity type, but not across entity types. Different entity types means different controller classes, means different "data buckets". We never supported cross-entity-type queries anyway, so this has no functional impact.
If anything, this gives site builders more liberty to name their fields, not stumbling on the same name being already used in a different entity type.

See #181 and below (notably #193) for more discussion about this change.

To allow for a D7 upgrade path, fields are now identified by [entity_type, field_name], instead of just field name previously. There can be two fields with the same name in different entity types, they're seen as completely different fields.
The SQL storage (managed by DatabaseStorageController now) remains "per field", meaning per [entity_type, field_name]. table names are:
[entity_type]__[field_name]
[entity_type]_revision__[field_name]
The storage principles are otherwise the same than with D7's field_sql_storage.

Patch summary

Storage handling code:
- Removes the notion of "field storage backends", and their hooks (that were actually callbacks)
hook_field_storage_info() / _alter()
hook_field_storage_load(), hook_field_storage_write() ...
- Removes field_attach_*() CRUD-related functions:
field_attach_[load|insert|update|presave|delete|delete_revision]()
From the outside, loading/saving field values is not separated from loading/saving the entity, so there is no public API for that.
- Removes the associated hooks:
hook_field_attach_[insert|update|presave|delete|delete_revision]()
They duplicated the regular hook_entity_OP() hooks anyway.
- Adds FieldableEntityStorageControllerInterface
Methods are mostly about reacting to field & bundle CRUD events: onFieldCreate(), onFieldInstanceDelete(), onBundleRename()...
The actual reading / writing of field values is *not* in the interface: EntityStorage classes only provide external facing API to act on an entity as a whole, not on its fields separately.
- Adds abstract FieldableEntityStorageControllerBase
Provides protected methods for reading / writing field values: loadFieldItems(), saveFieldItems()...
Those are wrappers that contain the generic handling of the "field data cache". Actual storage processing is delegated to doLoadFieldItems(), doSaveFieldItems() methods that are up to each actual storage class to provide.
- DatabaseStorageController::doLoadFieldItems(), doSaveFieldItems()... receive the storage logic that was previously in field_sql_storage.module

See #212 & #218 for rationale behind method names.

Field API structures:
(The corresponding changes account for ~ 2/3 of the patch size)

$field ConfigEntity changes:
- Removed: $field->storage (contained the storage settings specific to each field)
- Removed: $field->entity_types (optional list of entity types the field was restricted to)
- Added: $field->enity_type (required in entity_create('field_entity'))
- Added: $field->name (required in place of 'id' in entity_create('field_entity'); this comes back to something more similar to D7).
- Changed: $field->id() is now "$entity_type.$field_name" instead of the field name.
This means changes and renames in the exising field CMI config files.

API changes:
- field_info_fields() returns a list keyed by field UUID instead of field names (to avoid clashes)
- FieldInfo::getField() / field_info_field() / field_read_field() require an additional $entity_type param
- FieldInfo::getFieldMap() / field_info_field_map() returns a list keyed by entity_type, then field name
- $field->getBundles() returns a list of entity bundle names (was keyed by entity type previously)

Other:
- views integration is updated accordingly
- field_update_8003() (field API -> CMI) is updated to account for the changes in $field config entities
- field_update_8006() splits D7 storage tables into separate tables per entity type

Note for reviewers:
Two patches are posted: "REVIEW" and "FULL", to keep the review job under 600k :-).
"REVIEW" is the patch to review the actual code changes above.
"FULL" is the patch for commit. It is equal to "REVIEW" plus:
- old storage hook documentations removed from field.api.php
- tests that still make sense from field_sql_storage adapted and moved to Drupal\system\Tests\Entity\FieldSqlStorageTest
- field_sql_storage.module removed
Look for the latest REVIEW-to-FULL.interdiff.txt in the issue.

User interface changes

None

Remaining tasks

None

Follow-up Issues

Related issues

Original report by Damien Tournoud

Note: this issue is not about changing how entities and fields are stored in the database. It is about code cleanup and architecture.

Since Field API is older than Entity API, it has its own concept of storage, and fields exist almost independently of entities. This no longer matches our current way of thinking, which is entity-first.

The proposal is to remove the concept of storage from the individual field level, and move it to per-entity-type.
This will allow easier storage of entities in different backends, since there will be no need to override both the controller and the field storage.
It will also allow for easier and more reliable querying (EFQ v1 can't query across different storage types).

A related issue is the EFQ rewrite: #1801726: EntityFieldQuery v2 which makes queries per-entity-type, matching the thinking in this issue.

CommentFileSizeAuthor
#248 field_storage-1497374-248.interdiff.do_not_test.patch1.26 KBplach
#244 field_storage-1497374-244-FULL.patch588.33 KByched
#242 field_storage-1497374-242-FULL.patch588.36 KByched
#241 field_storage-1497374-241-FULL.patch600.64 KByched
#231 field_storage-1497374-231-REVIEW-do-not-test.patch512.54 KByched
#231 field_storage-1497374-231-FULL.patch588.32 KByched
#231 interdiff.txt3.33 KByched
#230 field_storage-1497374-230-REVIEW-do-not-test.patch515.35 KByched
#230 field_storage-1497374-230-FULL.patch603.4 KByched
#230 interdiff.txt2.55 KByched
#229 field_storage-1497374-228-REVIEW-do-not-test.patch515.49 KByched
#229 field_storage-1497374-228-FULL.patch603.54 KByched
#229 interdiff.txt5.26 KByched
#224 field_storage-1497374-224-REVIEW-do-not-test.patch515.67 KByched
#224 field_storage-1497374-224-FULL.patch603.71 KByched
#218 interdiff-method_renames.txt36.39 KByched
#219 interdiff-review.txt6.86 KByched
#220 field_storage-1497374-220-REVIEW-do-not-test.patch515.85 KByched
#220 field_storage-1497374-220-FULL.patch603.89 KByched
#220 REVIEW-to-FULL.interdiff.txt89.67 KByched
#215 field_storage-1497374-215-REVIEW-do-not-test.patch515.25 KByched
#215 field_storage-1497374-215-FULL.patch603.3 KByched
#215 interdiff.txt10.13 KByched
#214 field_storage-1497374-214-REVIEW-do-not-test.patch511.71 KByched
#214 field_storage-1497374-214-FULL.patch599.75 KByched
#214 interdiff.txt3.21 KByched
#211 field_storage-1497374-211-REVIEW-do-not-test.patch513.71 KByched
#211 field_storage-1497374-211-FULL.patch601.75 KByched
#211 interdiff.txt32.29 KByched
#209 entity-field-storage-methods.png174.91 KBBerdir
#209 entity-field-storage-public-methods.png123.17 KBBerdir
#209 entity-field-storage.png16.29 KBBerdir
#210 field_storage-1497374-210-REVIEW-do-not-test.patch486.21 KByched
#210 field_storage-1497374-210-FULL.patch574.25 KByched
#207 field_storage-1497374-207-REVIEW-do-not-test.patch499.99 KByched
#207 field_storage-1497374-207-FULL.patch588.04 KByched
#207 interdiff.txt1.05 KByched
#203 field_storage-1497374-203-REVIEW-do-not-test.patch499.99 KByched
#203 field_storage-1497374-203-FULL.patch588.08 KByched
#200 field_storage-1497374-200-REVIEW-do-not-test.patch500.1 KByched
#200 field_storage-1497374-200-FULL.patch588.18 KByched
#196 field_storage-1497374-196-REVIEW-do-not-test.patch500.18 KByched
#196 field_storage-1497374-196-FULL.patch588.3 KByched
#193 field_storage-1497374-193-REVIEW-do-not-test.patch500.47 KByched
#193 field_storage-1497374-193-FULL.patch588.59 KByched
#193 interdiff.txt1.54 KByched
#191 1497374_191.patch501.02 KBchx
#191 interdiff.txt1.59 KBchx
#188 field_storage-1497374-188-REVIEW-do-not-test.patch498.95 KByched
#188 field_storage-1497374-188-FULL.patch587.07 KByched
#183 field_storage-1497374-183-REVIEW-do-not-test.patch498.91 KByched
#183 field_storage-1497374-183-FULL.patch587.02 KByched
#183 REVIEW-to-FULL.interdiff.txt89.74 KByched
#183 interdiff.txt3 KByched
#177 field_storage-1497374-177-REVIEW-do-not-test.patch499.47 KByched
#177 field_storage-1497374-177-FULL.patch586.24 KByched
#177 REVIEW-to-FULL.interdiff.txt88.38 KByched
#174 field_storage-1497374-174.patch499.47 KByched
#174 interdiff.txt8.07 KByched
#172 field_storage-1497374-172.patch497.69 KByched
#172 interdiff.txt18.25 KByched
#171 field_storage-1497374-171.patch490.7 KByched
#171 interdiff.txt1.64 KByched
#169 1497374_169.patch492.62 KBchx
#165 interdiff.txt5.19 KByched
#164 field_storage-1497374-164.patch496.15 KBswentel
#164 interdiff.txt15.97 KBswentel
#161 field_storage-1497374-161-no-not-test.patch485.08 KByched
#161 interdiff.txt26.63 KByched
#157 field_storage-1497374-156.patch475.33 KByched
#157 interdiff_storage_pre_hooks.txt6.96 KByched
#157 interdiff_table_names.txt5.87 KByched
#157 interdiff_test_reverts.txt27.91 KByched
#155 field_storage-1497374-155.patch486 KByched
#155 interdiff.txt5.02 KByched
#153 field_storage-1497374-153.patch486 KByched
#153 interdiff.txt7.13 KByched
#151 field_storage-1497374-151.patch485.66 KByched
#151 interdiff.txt13.17 KByched
#145 14973744_145.patch494.6 KBchx
#129 field_storage-1497374_129-do-not-test.patch411.71 KByched
#109 1497374_107.interdiff.do_not_test.patch2.27 KBplach
#109 1497374_107.patch193.61 KBplach
#103 1497374_103.patch193.36 KBchx
#103 interdiff.txt37.16 KBchx
#98 1497374_98.patch192.14 KBchx
#98 interdiff.txt15.56 KBchx
#93 1497374_92.patch192.02 KBchx
#93 interdiff.txt14.42 KBchx
#88 1497374_88.patch192.23 KBchx
#86 1497374_86.patch192.21 KBchx
#84 1497374_84.patch191.9 KBnot_chx
#84 interdiff.txt22 KBnot_chx
#80 1497374_80.patch217.1 KBchx
#79 1497374_79.patch216.76 KBchx
#78 1497374_78.patch216.07 KBchx
#69 1497374_69.patch140.07 KBchx
#58 d8_field_storage.patch83.03 KBfago
#37 Unified field API hanghout.png43.26 KBfago
#5 1497374-entity-storage.patch219.5 KBDamien Tournoud
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

yched’s picture

Didn't watch any video / slides from the Core Conv session yet, just browsed through the git branch.

Is this removing the ability to have different fields on the same entity / entity type use different storage backends ?

Damien Tournoud’s picture

Is this removing the ability to have different fields on the same entity / entity type use different storage backends ?

Yes, the idea is to merge entity and field storage and thus move from a per-field to a per-entity type storage.

Damien Tournoud’s picture

Status: Active » Needs work

There is a first shot at moving the code around in the 8.x-1497374 branch.

(I lost hook_field_storage_query() somewhere in the process, which obviously will have to be restored)

Damien Tournoud’s picture

Project: Entity API Sandbox » Drupal core
Version: » 8.x-dev
Component: Code » entity system
Damien Tournoud’s picture

Status: Needs work » Needs review
FileSize
219.5 KB

Patch for testbot review (extracted from the 8.x-1497374 branch of the sandbox). Far from ready for human-review yet (the code has been moved around, but the API has not been cleaned-up).

Test link

Damien Tournoud’s picture

Project: Drupal core » Entity API Sandbox
Version: 8.x-dev »
Component: entity system » Code
Status: Needs review » Needs work

Back to the sandbox.

yched’s picture

Project: Entity API Sandbox » Drupal core
Version: » 8.x-dev
Component: Code » entity system

OK, so I guess this should be discussed :-).

This is not a minor change in the entity / field architecture.

- Allowing fields on a given entity bundle to come from different storage backends (possibly "remote") was one of the key design concepts from the very first "data api / fields in core" sprints 4 years ago. Moving back to "entity types are monolithic storage-wise" is a 180° turn.

Granted, I couldn't say for sure that he concept has lived up to its promises - at least in the visible contrib land; we haven't seen that many alternate field storage backends being published, and AFAIK the ones I'm aware of (mongo, riak ?) are document (entity-as-a-whole) oriented. But at least mixed storage is currently possible in D7, and it's hard to tell whether existing sites out there rely on that feature.

I understand that this design change is probably at the core of the "Entity as document / PHPCR" proposal (although is it that clear-cut ?), and that the promises there possibly outweight the flexibility loss, but this needs to be a conscious move.

This being said, I do need to watch that core conv video now...
[EDIT: aw, http://denver2012.drupal.org/content/open-standards-and-document-oriente... has the wrong video :-(]

- In the current architecture, a Field exists independently of any entity type. It knows where it is stored (in fact a Field is almost only that : "a data bucket somewhere"), and therefore you can create Instances of it in any entity type & bundle.
Stating that "a field is stored where the entity type is stored" means that a Field is created for a given entity storage backend, and can only have instances on entity types that use the same storage. Why not I guess, but right now I'm not exactly sure how this bubbles up.

yched’s picture

crosspost

fago’s picture

Project: Drupal core » Entity API Sandbox
Version: 8.x-dev »
Component: entity system » Code

I understand that this design change is probably at the core of the "Entity as document / PHPCR" proposal (although is it that clear-cut ?), and that the promises there possibly outweight the flexibility loss, but this needs to be a conscious move.

Exactly. I mentioned that issue during the core-conversations. As at least all known field-storage implementations are working on the complete entity anyway no one had a problem with that. As it's going to make everyone's life easier when implementing remote-storage engines I think that's a good trade-off.
Personally, I like the "one entity type - one storage backend" approach as it conceptually does not make much sense to me to have an entity split between multiple storage backends. Also, it ensures that entities stay query-able as far as possible. Then, if you still want more storage backends, split your entity up - e.g. by using field-collection.

We also discussed the needs for cross-entity-type queries (joins?) which won't work with different storage backends. However, we can still support them by having multiple entity-types using the same storage. Whether we want or need to do that in core is another issue we'll have to deal with on the way.

chx’s picture

Yeah things get extremely hairy and ugly when you have an entity with fields in multiple storages. You can't query, caching becomes tricky to say the least. I would say you want two entities in that case and relate.

Crell’s picture

The canonical use case we used back when we were originally designing what became Field API is a museum. You have Artwork entities in a legacy 3rd party data store of some kind, and wnat to extend it with comments and fivestar ratings in Drupal. While I have not done that recently, I DID build that site in Drupal 5 (sans the formal entity part, of course; we used a rather ugly bridge node design that I would rather not have to repeat).

So if we switch to per-entity storage, how would one implement that use case in a non-ugly way? If we can figure out a way to do that then I'm OK with this simplification.

fago’s picture

So if we switch to per-entity storage, how would one implement that use case in a non-ugly way? If we can figure out a way to do that then I'm OK with this simplification.

I'd just go with a single "local" entity that also contains a reference to the "remote artwork entity". Then, the local entity could be used to display the data of both the local + the remote entity.
That way, modules would automatically treat the storage as what they are: split. So, loading, queries and caching would be separated - what makes much sense imo.

Crell’s picture

The problem is then creating the "local" entity on the fly and associating it. That's non-trivial right now, and the last time I did it (the aforementioned Drupal 5 site) I had to do a ton of should-be-unnecessary redirects.

For background, this is where I'm coming from: http://www.palantir.net/blog/remote-data-drupal-museums-and-web-2009 I in particular want to ensure that we still have a better approach for method 2 than what I describe in that article. Right now, per-field storage is that. If we switch to per-entity storage, I want to make sure that I don't have to fall back to what I describe in that article since a lot of work went into not having to do that anymore.

Damien Tournoud’s picture

The canonical use case we used back when we were originally designing what became Field API is a museum. You have Artwork entities in a legacy 3rd party data store of some kind, and wnat to extend it with comments and fivestar ratings in Drupal. While I have not done that recently, I DID build that site in Drupal 5 (sans the formal entity part, of course; we used a rather ugly bridge node design that I would rather not have to repeat).

If you want to implement this kind "enriching an external object with some local data" scenario using Drupal 7, you would have to:

  • Implement an Entity Controller for your remote entity... and alter Entity Field Query of that entity to query the remote API directly. Doing so, you are basically going to implement something similar to the Entity Storage Interface that is suggested here.
  • Next, you are going to need to implement a Field Storage for your remote fields. It would probably need to be a NULL field storage, because the data has already been retrieved, right?

After the suggested refactoring, you would need to:

  • Implement an Entity Storage Interface for your remote API. It would be *way* more simple to do because the API is going to be cleaner and make a lot more sense.
  • Then you can build an Entity Storage Interface implementation that proxies two or more storages. Precisely, you are going to be able to decide to use the Entity SQL Storage implementation to store some of your fields, using any business logic that you can think about. Most likely, this is going to be as easy as storing the override only on save, loading the override on load and preventing queries across storages.

This suggested refactoring doesn't reduce flexibility, it just moves it from fields to entity types.

Crell’s picture

Hm. Damien, if I'm understanding you correctly then you're suggesting that per-field storage doesn't technically go away, it just moves to an "even more custom" entity storage driver, correct? And such driver could be written in such a way to allow runtime addition of fields that use local storage, and just does its own separate queries as needed?

Since I expect any such hybrid case to be fairly custom anyway I don't see that as a significant problem. If my understanding above is correct, then I have no objection to the refactoring suggested here.

plach’s picture

Not strictly related but I guess most people here might be interested in giving their feedback in #1498634-7: [meta] Define revision/translation SQL schema for core entities (no patch) too.

fago’s picture

The problem is then creating the "local" entity on the fly and associating it. That's non-trivial right now, and the last time I did it (the aforementioned Drupal 5 site) I had to do a ton of should-be-unnecessary redirects.

If that's desired, you can of course go with the remote entity by default and add in the data from the local entity if available only. No redirects or on the fly associations involved.

This suggested refactoring doesn't reduce flexibility, it just moves it from fields to entity types.

True - of course one can still implement multiple-storage backends in the entity controller, so the flexibility is not gone.

Implement an Entity Controller for your remote entity... and alter Entity Field Query of that entity to query the remote API directly. Doing so, you are basically going to implement something similar to the Entity Storage Interface that is suggested here.
Next, you are going to need to implement a Field Storage for your remote fields. It would probably need to be a NULL field storage, because the data has already been retrieved, right?

Yep, and in addition you'd have to alter the efq field-query support just as for the entity properties part. Yuck!

The point is we'll have single interface/controller for dealing with an entity's storage. So it will be easier to change (one thing to implement) and easier to use (one backend to talk to). Thus, this will make remote entity use-cases much simpler to do - not worse! :-)

sun’s picture

  1. No issue summary?
  2. I wonder about the bird-level storage engine case with this. If entity (type) A and all of its fields, including an image field X, is stored in mongo, and entity B and all of its fields, including a/the same image field XY, is stored in SQL, then we're saying that XY can no longer be an instance of the same field as X, even though it has been before?

    How do we explain that differentiation to users? With regard to the Re-use existing field functionality? ("I want to re-use the image field I already have there, but... why doesn't it appear?) To be honest, the UX for users sounds most troublesome to me here.

    And how do we explain that differentiation to site/db admins? ("I want to find all entities that use image Z, but... oink?")

pounard’s picture

2. is indeed a revelant question, if the two image fields are not stored on the same backend, they are not the same field anymore because you cannot cross query entities from both storage at the same time (which we can with the actual field storage mecanism).

yched’s picture

As I wrote in #7, this approach indeed means that a field is created for a given entity storage backend, and can only have instances on entity types that use that storage. Field UI would take that in consideration when presenting the list of "existing fields" that can be added, and field_create_instance() would raise an exception.

As sun points, this could mean some level of WTF ("why can I add this field from entity type B but not this one from entity type C ?"). Although, note we already have something along those lines currently : field definitions can (programmatically) state that they are only eligible for a given list of entity types - was added in D7 so that users don't have the confusing option to add the 'comment_body' field in nodes.

More radically, I think we could live with the limitation that fields can only be shared within a given entity type.

catch’s picture

The inability to share fields between entity types with different storage sounds like quite small fallout compared to the other benefits this has. It's also something I can't see affecting very many existing sites.

Limiting fields within an entity type sounds potentially quite painful to me though. I'd think entity reference, relation and similar modules encourage cross-entity-type shared fields and I wouldn't fancy trying to write the upgrade path for it.

Crell’s picture

If we can figure out a good UI that makes it clear why certain fields are available only on certain entity types, I'd rather do that, although within-type fields is a fallback.

That said, let's also keep in mind that only a very small minority of sites are likely to have mixed-storage in practice. So we may be able to have extra UI information that only appears if more than one storage backend is in use, and just hides itself for the majority case where it wouldn't be relevant anyway.

effulgentsia’s picture

Far from ready for human-review yet (the code has been moved around, but the API has not been cleaned-up).

Is this still the case, and if so, what if any help/feedback are you wanting at this time? Also, do you think this is close enough that we should try to finish it before continuing with #1498634: [meta] Define revision/translation SQL schema for core entities (no patch), or would you suggest work on that issue proceed independently?

bojanz’s picture

Project: Entity API Sandbox » Drupal core
Version: » 8.x-dev
Component: Code » entity system

This needs to be in the core issue queue.
It is frequently discussed and referenced, but hard to find.

Also needs an issue summary.

A big chunk of this work is the EFQ rewrite that that is almost ready, at #1801726: EntityFieldQuery v2.

bojanz’s picture

Status: Needs work » Active

Added a very brief (and potentially inaccurate) summary.

Resetting status since I'm guessing that the code will need to be redone from scratch (taking into account EFQ v2, the new entity controllers, etc)

chx’s picture

About fields-are-only-shareable-within-entity-type:

> I'd think entity reference, relation and similar modules encourage cross-entity-type shared fields and I wouldn't fancy trying to write the upgrade path for it.

I can certainly speak for relation module: it defines an endpoint field type which is attached to the relation entity type and absolutely nothing else. Regardless, I think this issue should not introduce this change, it's radical enough that we say fields-are-only-shareable-within-storage.

yched’s picture

fago’s picture

I didn't have a closer look at the code yet, but couldn't we just

  • define an EntityStorageExtendableInterface and move existing field-sql storage code to it
  • move field-sql storage module functions into a ExtendableDatabaseStorageExtender class
  • wire together extendable storage implementations for our fieldable entities

?

Damien Tournoud’s picture

@fago: the hard part is that our current storage controller is mixing up entity-type specific lifecycle code (for example: Node wants to manage ->created and ->updated) and the actual storage specific lifecycle operation. This in essence makes the storage totally non pluggeable right now.

We need to figure out a clear way for entity-type specific operations to be implemented outside of the storage controller.

fago’s picture

@fago: the hard part is that our current storage controller is mixing up entity-type specific lifecycle code (for example: Node wants to manage ->created and ->updated) and the actual storage specific lifecycle operation. This in essence makes the storage totally non pluggeable right now.

True. Still, once we've everything running through the storage controller, it's possible to do new entity types with any storage backend (including fields). Pluggability is more difficult, yep. I think it's something we should look at in a second step (if time permits...).

I can see entity-type specifc customizations like node created and changed to be moved to property/field classes. However, I see more the problem in stuff like applying indexes and storage updates in an storage-acgnostic fashion. For that to be solved it might require introducing another storage engine layer as you suggested already in some core conversation.

tim.plunkett’s picture

I'm finding it hard to follow what the current proposal/line of thought is, can someone update the issue summary?

SameerJ’s picture

Currently we create fields from field-types, and a field can be attached to different entity-types / bundles.

One possible approach: impose the restriction that a bundle can be made up of only those fields that rely on the same storage back-end.

That way all components of an entity are stored in the same back-end, which should simplify querying across all entities that are instances of a particular bundle.

I'm not sure how that would fit in with the philosophy of document-oriented storage back-ends such as Mongo.

SameerJ’s picture

Issue summary: View changes

Add basic summary.

fago’s picture

Issue summary: View changes

Updated issue summary.

fago’s picture

Issue summary: View changes

Updated issue summary.

fago’s picture

I'm finding it hard to follow what the current proposal/line of thought is, can someone update the issue summary?

I've updated the issue summary based on my pov.

The question remaining is whether we should get started with this now or postpone it on entityNG supporting all fields. Once that has happend every field API field would be exposed as entity field allowing us to rely on entity field definitions for storage - instead of depending on field API $field arrays what would create a weird interdependency of field and entity systems.

Thus, I think we should postpone this on having every field API field exposed, what means we need to have all fieldable entites and all field types converted to EntityNG.

fago’s picture

Crell’s picture

I am very wary about postponing this issue, given how late we are in the development cycle. Since Field API currently (AFAIK) just does its own thing via hooks, why can't we just go ahead with this implementation and say "yo, if you're not using the new API yet you don't get this, sucks to be you, time to convert!"? We know we want everything to convert to EntityNG sooner or later, don't we?

fago’s picture

>yo, if you're not using the new API yet you don't get this, sucks to be you, time to convert!

hm, true. Well, we could leave the "old", existing field storage API in place while moving it over to converted-entities for. We can keep both in place until the "old" one is unneeded and can be removed. I like.

fago’s picture

Issue summary: View changes

add lists

fago’s picture

As posted on http://drupal.org/node/1949932#comment-7206860 we discussed this a bit on a hangout yesterday. I try to provide short summary of the discussion here.

This is a diagram of the architecture we discussed:

Unified field API hanghout.png

So the idea is basically to move the field-storage "plugin" interface to an ExensibleStorageInterface which is optional for an entity storage controller. The implementation could be moved to a separate helper class, which the entity storage controllers can re-use. There was an agreement that this idea is sound, however we miss someone who is in to take care of it.

Problem is, that this does not provide us with a crucial new feature as both field and entity storage is already pluggable - it's more a refactoring that makes things more sane and most notable storage engines easier to implement.

After the call I figured that real advantage would be that modules could re-use field storage without providing a configurable field (aka field API field) though, as already noted in the issue summary.

plach’s picture

chx’s picture

I would need a bit more instructions to start coding.

chx’s picture

First the issue was stalled for two months now it is for ten days because while I would like to help, I have no idea how.

What do you think the API should look like? Some specs? Skeleton code? Something so that I can start coding.

chx’s picture

Work is in http://drupal.org/sandbox/chx/1857558 branch entitystorage. The fundamental approach is that hooks are out, events are in and field storage is merely an event subscriber. When I am writing this, revisions and purge needs work, otherwise it seems to work.

chx’s picture

There's not a lot of new code mostly it's ripping out the code from field.attach.inc supporting cross storage fields which are unqueriable anyways and then moving the code from the field sql module into a class so that the class can be a proper little subscriber.

Every entity storage will need to fire the entity.* events but not the entity.sql.* events. That's storage backend specific. If it's leaky abstraction, sorry about that. We can add different approaches, let's discuss.

xjm’s picture

Priority: Normal » Major
Crell’s picture

Standardizing the events across storage engines makes sense. I don't have a fundamental objection to an individual implementation also firing additional events if it makes sense to, as long as it still fires all of the standard events in the right places. (I'm unclear on what an entity.sql.* event would do for me, but I don't at the moment have a conceptual objection to it existing.)

andypost’s picture

Event-driven approach to notify field-storage has fundamental difference because event subscriber able to stopPropagation() of the event that hooks don't

In context of "attach-api" this probably makes sense but could lead to state when entity get partial updates.

chx’s picture

We can create an unstoppable eventdispatcher if that's desired. I am using the eventdispatcher because it allows me to create dependency injected classes. It's not yet 100% complete but the subscriber classes are getting close to pure unit testable -- I think Drupal\aggregator\StorageSubscriber already is (despite it's not yet finished). The node subscriber will be after #1921426: [Followup] Move node access storage to DIC. The menu link subscriber likely won't be in D8. User uses session and mail so that's also D9... but, you can write mock versions of these functions, for there are very, very few.

The controller classes will be unit testable the moment plugins can become dependency injected. There are no functions call there.

yched’s picture

The controller classes will be unit testable the moment plugins can become dependency injected.

Controllers are not plugins :-), and it's still not clear whether they will be in D8. Making them dependency injected would be a different task story.

chx’s picture

Oh right sorry. Anyways, they are not unit testable because we use "controller class" and not "controller service" but that's prolly a different issue, this issue is big enough...

fago’s picture

Priority: Major » Normal

While I'm not opposed to events in general, I'm not sure it's a good idea to do this now. So here are some questions/thoughts:

a) Why do we add in field storage in a modular way? I mean, that makes us support having multiple field storages? Why that? Couldn't we just off-load calls to $this->fieldStorage ?

b) Generally, this issues events for all CRUD operations what makes storage work modular. E.g. node storage additions are added in a separate class - what's per se nice - but we expose an API now what makes this possible for everyone.
But handling storage in a modular way is what makes things difficult and what the issue was supposed to do away with, such we've a single point that's in charge of storage. By exposing events and allowing anyone to take part, we lose control.

c) Similar to b), but another point. Imho we should move towards #1729812: Separate storage operations from reactions, but this unifies storage-insert with the possibility to react upon inserts. I'd rather do not expose "insert" for storage, so we can use it for reactions only. You can still add in a Field that has storage insert logic, if you really need to.

d) This aims for more that what the issue is about. While I'm totally not opposed to improving our storage logic, I think it would be better to try to separate tasks to get this issue done faster. I fear it becomes to much distracted with lots of general-storage related discussion else.

fago’s picture

Priority: Normal » Major

I don't meant to change priority.

chx’s picture

The current branch has four things:

  1. Storage controllers. As small as we can and needs to have interfaces. Like user needs to store roles and that's it (of course, in addition to normal entity storage duties).
  2. Storage subscribers -- like calculating threads do not need to be in a storage controller, the storage controller needs to be able to read the maximum thread necessary but that's all.
  3. Entity subscribers -- these were entity hooks before, hook_node_load, hook_entity_presave etc. You can inject stuff here.
  4. Field storage as a subscriber.

We are changing 4. into a direct call from 1. per fago's request and we are agreed on the architecture of the rest.

fago’s picture

ad 1) - yeah, so storage controllers are that part which are really in charge of storing things only. As discussed I'm not sure about separing role-storage methods, but I agree we want to keep the current logic/flow in this patch and do further cleanups elsewhere.

ad 2) Yes, separating this is a plus as discussed in #1893772: Move entity-type specific storage logic into entity classes. I think should try to move as much as possible from this logic to field classes though - like comment threading stuff. That's stuff for other issue also though and probably depends on #1969728: Implement Field API "field types" as TypedData Plugins to give is more flexibility here.

ad 3) Sounds and looks great. I'd love to see this become notifications only as by #1729812: Separate storage operations from reactions, but that's unrelated on events vs hooks.

ad 4) Yeah, let's let the entity storage decide which field storage gets used (and only one ..).

So this patch does a lot, and it cleans up a lot! Impressive work!

chx’s picture

Now I remember why I wanted field storage to be a subscriber: ordering. If we move field storage out of the event system, where should the event fire before or after field storage is called? I recommend keeping it as an event. Currently there's no weight defined for it so it's 0 , you can weight your subscriber positive to fire before it, negative to fire after.

fago’s picture

If we move field storage out of the event system, where should the event fire before or after field storage is called?

The event shouldn't be about storage, but about notifying. So neither before or after, field storage should be detached from events. E.g. we do not have preSave for storage. We've pre/post field storage hooks - but that's something different. Imho one storage event for maintaining any denormalized storage like the forum table should be enough, whether that's pre or post the other storage won't matter. $skip_fields I think we should drop - if it's a field, it get's stored.

As said, I'm not in favour of any storage events - but the possibility to maintain de-normalized stuff like forum does we need to keep. But we should document that this is for additional, thus de-normalized storage and not for supporting modular storage.

I'd see it that way:

save($entity):
 - notify: pre-save
 - (transaction start)
 - storage operation (ordering below is arbitrary)
      - entity base fields insert/update
      - just as other entity base fields (not necessarily in the base table), like user roles, term parents, ..
      - extended fields insert/update
      - notify: storage event
 - (transaction end)
 - notify: insert|update
yched’s picture

IIRC $skip_fields was introduced to support sharding / denormalizing field storage.
I.e. let 3rd party modules store the data themselves and "steal" it from the regular storage engine.

The main use case was something à la http://drupal.org/project/pbs / "CCK D6 storage model", back when many people freaked out about the "each field in its own table" approach in field_sql_storage.

Pbs remained at the "proof of concept" stage, and to my knowledge no other notable contrib code leverages that, I'd think this can be removed / simplified.

chx’s picture

Status: Active » Closed (duplicate)
fago’s picture

Status: Closed (duplicate) » Active

I don't think that's necessarily duplicated. #1982880: Refactor the entity system to be evented could incorporate this one - (not sure if that's still the goal?), but doing this issue does not require us to do #1982880: Refactor the entity system to be evented. So let's keep this issue for tracking joint-storage success.

fago’s picture

FileSize
83.03 KB

So here a short start on #37, so we have something more concrete to discuss.

- For invoking field storage we need to know the entity type now, so either we need to make fields per entity type or move calls over to instance CRUD.
- @todo: Figure out what to do with reamining hook implemenations of field-sql-storage.
- @todo: Fix moved field API attachers to use methods for invoking storage.

So what about this first step, move everything but keep it working as is. Then continue with refactorings later?

chx’s picture

While remote entities which can not store configurable fields provide the use case for the interface, there is no use case for the separate class. Indeed, by providing a separate class we achieve little or nothing: we wanted to make sure that the whole of an entity is stored in one place and there's a single point of API entry point for every interaction (load, save etc) with it. I very strongly suggest to merge the field sql storage into the DatabaseStorageController and be done with it. If someone wants to write an entity controller which does something fundamentally different with the nonconfigurable properties then it's still possible to extend the DatabaseStorageController and override the load/save/etc methods while keeping the fieldSqlWrite etc methods in place. So there's nothing lost by merging in to one class while there is no win at all if that's not done.

fago’s picture

we wanted to make sure that the whole of an entity is stored in one place and there's a single point of API entry point for every interaction (load, save etc) with it.

That's the goal of this issue, yes! I think we have already agreed on the most important part here.

While remote entities which can not store configurable fields provide the use case for the interface, there is no use case for the separate class.

ok, it's great to have agreement for the interface also, as discussed I still think separating the ExtensibleStorage code *internally* is useful. It helps to keep the code better separated and makes it easier to re-use the ExtensibleStorage from within your storage controller. Yes, we do not support having split storage, e.g. remote entity + local fields - but if you want to so you can still go for it and wire it together in your storage controller yourself. And yes, you are in charge of providing a suitable query class also. For that, it's also useful to have ExtensibleStorage logic implemented in a separated class.

Howsoever, I think that's an implementation detail and we've tougher issues to solve (see #58), so to move on I'd suggest to postpone this discussion for now, go with either approach and re-visit it once we've solved everything else. It's just a question of moving around code anyway.

chx’s picture

> and makes it easier to re-use the ExtensibleStorage from within your storage controller.

And your storage controller does what? Well, I will frreely presume it stores the content somewhere which is not SQL. And then

> the whole of an entity is stored in one place

is violated. Decide.

fago’s picture

> the whole of an entity is stored in one place

is violated. Decide.

the whole of an entity is stored via a single controller -> the entity controller. What this controller decides to implement itself is up to it - but sure, usually it's single storage.

chx’s picture

> the whole of an entity is stored via a single controller -> the entity controller

Hey, that's what I suggested! You are suggesting using an entity controller and a separate ExtensibleDatabaseStorage class.

fago’s picture

You are suggesting using an entity controller and a separate ExtensibleDatabaseStorage class.

That's how it is for the system, yes. How it implements stuff *internally* does not matter to the entity system though, so we decide to structure the code internally as it's useful for us while having the single-controller architecture in place.

chx’s picture

We are running in circles. If you provide an ExtensibleDatabaseStorage it's no longer an internal matter. That is the problem itself: you insist it's an internal matter and I maintain that in order to make it an internal matter you need to move it in-class and make them protected methods. Otherwise another, non-SQL class could just use that to store things. We win nothing.

Crell’s picture

If I understand the problem space directly, it can be considered internal to the storage engine and still be a separate class as long as the storage engine is fronting for it as far as any users of the storage engine are concerned. That is, as long as users of the storage engine don't know the difference between it being internal or being a separate utility class, it's still "internal".

Whether or not it SHOULD be a separate class, I don't know enough about the details here to say. But a utility object does not necessarily break encapsulation here.

Crell’s picture

Issue summary: View changes

Updated issue summary.

chx’s picture

In Portland we agreed on internal methods, one class. Progress is in https://drupal.org/sandbox/chx/1857558 branch 1497374. Tests need to be done, everything else should be. I hope a patch tomorrw.

Dries’s picture

Priority: Major » Critical

This feels very important to me for a number of reasons so I suggest promoting this to critical.

chx’s picture

Status: Active » Needs review
FileSize
140.07 KB

This won't pass yet but fundamentally only the tests are left and so it's ready to review. The following files are material; the rest is not:

  • core/lib/Drupal/Core/Entity/EntityStorageControllerInterface.php
  • core/lib/Drupal/Core/Entity/EntityStorageControllerBase.php
  • core/lib/Drupal/Core/Entity/DatabaseStorageController.php
  • core/modules/field/lib/Drupal/field/Plugin/Core/Entity/FieldInstance.php
  • core/modules/field/lib/Drupal/field/Plugin/Core/Entity/Field.php

The actual removal of field_sql_storage code can happen in a followup; it'd be just patch bloat.

andypost’s picture

+++ b/core/lib/Drupal/Core/Entity/EntityStorageControllerBase.phpundefined
@@ -234,4 +237,311 @@ public function invokeFieldItemPrepareCache(EntityInterface $entity) {
+    // Invoke the hook.
+    module_invoke_all($this->entityType . '_' . $hook, $entity);
+    // Invoke the respective entity-level hook.
+    module_invoke_all('entity_' . $hook, $entity, $this->entityType);

deprecated $this->moduleHandler->invokeAll()

+++ b/core/lib/Drupal/Core/Entity/EntityStorageControllerBase.phpundefined
@@ -234,4 +237,311 @@ public function invokeFieldItemPrepareCache(EntityInterface $entity) {
+      $cache = cache('field')->getMultiple($cids);
...
+      foreach (module_implements('field_storage_pre_load') as $module) {
...
+        \Drupal::entityManager()
...
+      module_invoke_all('field_attach_load', $entity_type, $queried_entities, $age);
...
+          $instances = field_info_instances($entity_type, $entity->bundle());
...
+          cache('field')->set($cid, $data);
...
+    foreach (module_implements('field_storage_pre_insert') as $module) {
...
+    foreach (module_implements('field_storage_pre_update') as $module) {
...
+      cache('field')->delete('field:' . $entity->entityType() . ':' . $entity->id());

deprecated and needs injection

+++ b/core/lib/Drupal/Core/Entity/EntityStorageControllerBase.phpundefined
@@ -234,4 +237,311 @@ public function invokeFieldItemPrepareCache(EntityInterface $entity) {
+      cache('field')->delete('field:' . $entity->entityType() . ':' . $entity->id());

cache backed should be injected

+++ b/core/lib/Drupal/Core/Entity/EntityStorageControllerBase.phpundefined
@@ -234,4 +237,311 @@ public function invokeFieldItemPrepareCache(EntityInterface $entity) {
+    $values = $this->fieldValues($entity_id, $field, $instance);
+    foreach ($values as $value) {
+      $definition = _field_generate_entity_field_definition($field, $instance);
+      $items = \Drupal::typedData()->create($definition, $value, $field->id());
+      $items->delete();

move $definition out of loop

chx’s picture

thanks for the quick review. I have converted file info to injection; doing more is way too much work and not necessary at all for a passing patch. Followup material. I will get $definition out of the loop.

andypost’s picture

+++ b/core/lib/Drupal/Core/Config/Entity/ConfigEntityInterface.phpundefined
@@ -75,4 +75,14 @@ public function setStatus($status);
+  public function getExportProperties();

+++ b/core/lib/Drupal/Core/Config/Entity/ConfigStorageController.phpundefined
@@ -529,4 +529,11 @@ public function importDelete($name, Config $new_config, Config $old_config) {
+  public function storageType() {
+    return 'config';

Looks like awesome stub!!!

amateescu’s picture

+++ b/core/lib/Drupal/Core/Database/Connection.php
index b647009..73469e1 100644
--- a/core/lib/Drupal/Core/Entity/DatabaseStorageController.php

--- a/core/lib/Drupal/Core/Entity/DatabaseStorageController.php
+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.php

+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.php
@@ -558,4 +546,637 @@ public function baseFieldDefinitions() {
+          $entities[$row->entity_id]->{$field_name}[$row->langcode][] = $item;

Since most entity types have been moved to EntityNG, wouldn't it be wiser to move field storage code to NG in here? I know it's not easier, so just asking.. :)

+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.php
@@ -558,4 +546,637 @@ public function baseFieldDefinitions() {
+    $load_current = $age == FIELD_LOAD_CURRENT;
...
+    $this->doFieldWrite($entity, FIELD_STORAGE_INSERT);
...
+    $this->doFieldWrite($entity, FIELD_STORAGE_UPDATE);

I guess these constants should be moved to EntityStorageControllerInterface.

+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.php
@@ -558,4 +546,637 @@ public function baseFieldDefinitions() {
+  public function handleUpdateField(Field $field, Field $original) {

Missing docblock.

+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.php
@@ -558,4 +546,637 @@ public function baseFieldDefinitions() {
+        throw new FieldUpdateForbiddenException("field_sql_storage cannot change the schema for an existing field with data.");

Stale comment.

+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.php
@@ -558,4 +546,637 @@ public function baseFieldDefinitions() {
+   * @param $field
+   *   The field structure.
+   *
+   * @return
+   *   A string containing the generated name for the database table.
+   *
+   */
+  static public function fieldTableName($field) {

Missing data types and type hint.

Same for a couple other methods below.

+++ b/core/modules/field/lib/Drupal/field/Plugin/Core/Entity/Field.php
@@ -194,6 +175,14 @@ class Field extends ConfigEntityBase implements FieldInterface {
+   * The storage type this field is in. Typically 'sql'.

Not sure "Typically 'sql'" adds any value here :P

+++ b/core/modules/field/tests/modules/field_test/field_test.module
index e5172b5..0000000
--- a/core/modules/field_sql_storage/field_sql_storage.info.yml

--- a/core/modules/field_sql_storage/field_sql_storage.info.yml
+++ /dev/null

I guess the gist of my comment is... Awesome!!

chx’s picture

re @andypost: getExportProperties existed before but it was not clear at all that it was used as the actual list of properties being saved so I documented it.

re @yamateescu thanks for the review. I will see to the doxygen fixes. I am not sure on moving the methods to NG yet. We can do that later. Constants moving, sure, they need to. Not sure "Typically 'sql'" adds any value here :P -- I think it does because it's not quite explained wtf is a "storage type" so it's a good example.

amateescu’s picture

Actually, disregard my comment about doing it the 'NG way' in this patch, we can't do that until *all* entity types are converted.

amateescu’s picture

Not sure "Typically 'sql'" adds any value here :P -- I think it does because it's not quite explained wtf is a "storage type" so it's a good example.

If the purpose of it is to explain what's the deal with "storage type", I guess we can say something like: "Drupal core provides a basic 'sql' storage type by default, but contrib or custom modules can provide alternative storage types." .. or something along those lines.

chx’s picture

FileSize
216.07 KB

To further the issue I am uploading a hopefully passing version -- note that it's made to pass by killing the FilledStandardUpgradePathTest -- but I wanted to solicit more reviews. I sincerely doubt any material changes will be necessary to get that test to pass. Documentation fixes are not yet in.

chx’s picture

FileSize
216.76 KB

This should be passing all including FilledStandardUpgradePathTest -- keeping up with HEAD, it's hard.

chx’s picture

FileSize
217.1 KB

Keeping up with HEAD.

dawehner’s picture

Status: Needs review » Needs work
+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.phpundefined
@@ -566,4 +551,637 @@ public function baseFieldDefinitions() {
+        ->condition($load_current ? 'entity_id' : 'revision_id', $ids, 'IN')
+        ->condition('langcode', field_available_languages($entity_type, $field), 'IN')

I like that 'IN' is specified even it is not needed as far as I remember, so it is easier to read.

+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.phpundefined
@@ -566,4 +551,637 @@ public function baseFieldDefinitions() {
+  /**
+   * {@inheritdoc}
+   */
+  public function storageType() {
+    return 'sql';
+  }

I am wondering whether it would make sense to have this as static method.

+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.phpundefined
@@ -566,4 +551,637 @@ public function baseFieldDefinitions() {
+        // If the database supports transactional DDL, we can go ahead and rely

it would be cool to not as an abbreviation here as it would really help people to understand what DDL could mean. (needed to ask wikipedia).

+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.phpundefined
@@ -566,4 +551,637 @@ public function baseFieldDefinitions() {
+   * @return array
+   */
+  public static function fieldSqlSchema(Field $field) {

Would be nice to have a short description about the result of the method. Maybe you want to avoid that to not help people to use it.

+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.phpundefined
@@ -566,4 +551,637 @@ public function baseFieldDefinitions() {
+   * @param $field
...
+   * @return
...
+   * @param $name
...
+   * @return
...
+   * @param $name
...
+   * @param $column
...
+   * @return
...
+   * @param $name
...
+   * @param $column
...
+   * @return

Just a couple of places where a typehinting would be cool

+++ b/core/lib/Drupal/Core/Entity/EntityManager.phpundefined
@@ -104,21 +113,38 @@ class EntityManager extends PluginManagerBase {
+  /**
+   * Add more namespaces to the entity manager.
+   *
+   * @param \Traversable $namespaces

We should document why we need the addNamespaces method. For example: the uninstall methods for comment and taxonomy.

+++ b/core/lib/Drupal/Core/Entity/EntityManager.phpundefined
@@ -104,21 +113,38 @@ class EntityManager extends PluginManagerBase {
+    $iterator = new \AppendIterator;
+    $iterator->append(new \IteratorIterator($this->namespaces));
+    $iterator->append($namespaces);

Nice!

+++ b/core/lib/Drupal/Core/Entity/EntityManager.phpundefined
@@ -153,6 +179,9 @@ public function hasController($entity_type, $controller_type) {
+    if (!$definition) {
+      throw new \InvalidArgumentException(sprintf('The %s entity type does not exist.', $entity_type));
+    }

+1 for the way better error message!

+++ b/core/lib/Drupal/Core/Entity/EntityStorageControllerBase.phpundefined
@@ -234,6 +237,26 @@ public function invokeFieldItemPrepareCache(EntityInterface $entity) {
+    }
+    // Invoke the hook.
+    module_invoke_all($this->entityType . '_' . $hook, $entity);
+    // Invoke the respective entity-level hook.
+    module_invoke_all('entity_' . $hook, $entity, $this->entityType);

@@ -255,4 +278,291 @@ protected function invokeTranslationHooks(EntityInterface $entity) {
+    $info = entity_get_info($entity_type);
...
+        \Drupal::entityManager()
...
+      module_invoke_all('field_attach_load', $entity_type, $queried_entities, $age);

As follow up we should inject the module handler in here to be able to run it properly, load entity info etc.

chx’s picture

Status: Needs work » Needs review
FileSize
22 KB
191.9 KB

Hopefully this passes, I have added lots of doc fixes based on the review / discussion of amateescu and dawehner.

I have reverted the api.php removals, we can fix that later -- we will do that with field_sql_storage too, it just bloats the patch.

chx’s picture

Issue tags: +Avoid commit conflicts
FileSize
192.21 KB

The patch is close enough that I think an Avoid commit conflicts tag is in order. It breaks all the time and very hard to fix. Not just the 2 fails above but since then I have *another* commit conflict (in less than half a day).

Status: Needs review » Needs work
Issue tags: -Avoid commit conflicts

The last submitted patch, 1497374_86.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
192.23 KB

Urgh, $entity->getType doesn't return the entity type but $entity->entityType does? That's much fun!

chx’s picture

Issue tags: +Avoid commit conflicts

tagging

dawehner’s picture

Issue tags: -Avoid commit conflicts
+++ b/core/lib/Drupal/Core/Entity/EntityStorageControllerBase.phpundefined
@@ -234,6 +237,26 @@ public function invokeFieldItemPrepareCache(EntityInterface $entity) {
+    module_invoke_all($this->entityType . '_' . $hook, $entity);
...
+    module_invoke_all('entity_' . $hook, $entity, $this->entityType);

@@ -255,4 +278,291 @@ protected function invokeTranslationHooks(EntityInterface $entity) {
+      foreach (module_implements('field_storage_pre_load') as $module) {
...
+        \Drupal::entityManager()

Novice follow up (to be filled): Inject the dependencies into the base storage controller.

+++ b/core/lib/Drupal/Core/Entity/EntityStorageControllerBase.phpundefined
@@ -255,4 +278,291 @@ protected function invokeTranslationHooks(EntityInterface $entity) {
+  protected function fieldValues(EntityInterface $entity, Field $field, FieldInstance $instance) {
...
+  public function fieldPurge(Field $field) {

Another place which should use the FieldInterfaces (see basically everywhere in the patch)

+++ b/core/lib/Drupal/Core/Entity/EntityStorageControllerInterface.phpundefined
@@ -173,4 +175,86 @@ public function invokeFieldMethod($method, EntityInterface $entity);
+  public function handleFirstInstance(FieldInstance $instance);
...
+  public function handleUpdateField(Field $field, Field $original);
...
+  public function handleDeleteField(Field $field);
...
+  public function handleInstanceDelete(FieldInstance $instance);

This should use the FieldInterface and FieldInstanceInterface instead.

Novice followup (to be filled): Need docs for the parameters

+++ b/core/lib/Drupal/Core/Entity/EntityStorageControllerInterface.phpundefined
@@ -173,4 +175,86 @@ public function invokeFieldMethod($method, EntityInterface $entity);
+   * @param $field
...
+   * @param $instance

Novice follow up (to be filled): Typehint entities.

+++ b/core/modules/field/field.attach.incundefined
@@ -324,8 +285,7 @@ function field_invoke_method_multiple($method, $target_function, array $entities
-        if ($grouped_items[$instance_id][$langcode][$id] !== array() || isset($entity->{$field_name}[$langcode])) {
-          $entity->{$field_name}[$langcode] = $grouped_items[$instance_id][$langcode][$id];
+        if ($grouped_items[$instance_id][$langcode][$id] !== array() || isset($entity->{$field_name}[$langcode])) {     $entity->{$field_name}[$langcode] = $grouped_items[$instance_id][$langcode][$id];
         }

You seemed to have killed the linebreak.

+++ b/core/modules/field/field.installundefined
@@ -90,6 +80,66 @@ function _update_8003_field_create_instance(array $field_config, array &$instanc
+      db_create_table($name, $table);

Novice follow up (to be filled):Why do we still use the wrappers even there are tons of issues everywhere to remove such stuff.

+++ b/core/modules/field/field.installundefined
@@ -90,6 +80,66 @@ function _update_8003_field_create_instance(array $field_config, array &$instanc
+function _update_8000_field_sql_storage_write($entity_type, $bundle, $entity_id, $revision_id, $field_name, $data) {

It is odd to have revision_id required here. For example block.install is calling it but it probably actually has no clue about a revision ID of a block. What about make revision_id optional and just remove all previous revisions if it is not set before?

+++ b/core/modules/field/field.installundefined
@@ -293,16 +338,14 @@ function field_update_8003() {
+      $field = new Field($config);
...
+        "field_deleted_data_{$record['id']}" => DatabaseStorageController::fieldTableName($field),
+        "field_deleted_revision_{$record['id']}" => DatabaseStorageController::fieldRevisionTableName($field),

I got confused by the fact that $field is never saved, but all we want is the schema for the update.

dawehner’s picture

Issue tags: +Avoid commit conflicts

Readd the tags

dawehner’s picture

Issue summary: View changes

adding followup

effulgentsia’s picture

The actual removal of field_sql_storage code can happen in a followup; it'd be just patch bloat.

I'm unclear on this. Do you mean it just adds a bunch of "-" lines to the .patch file? If so, that's ok, let's do it. Or do you mean there's still code making calls to field_sql_storage.module? If that, then please update the issue summary or open a follow up issue explaining what that is and what needs to be done to fully remove it.

chx’s picture

FileSize
14.42 KB
192.02 KB

Fixed FieldInterface and FieldInstanceInterface in 33 cases. Commented field_update_8003, fixed the linebreak in field.attach.inc. _update_8000_field_sql_storage_write is copied verbatim and I do not think it should be refactored in this issue?

chx’s picture

Re #92 please observe that field_sql_storage.info.yml is nuked and so there is absolutely no way anything making a call to field_sql_storage module would work because it is no longer recognized as a module. So yes, it would just add a lot of - lines.

chx’s picture

FileSize
15.56 KB
192.14 KB

This should pass. The interdiff is against the last passing / reviewed patch in #88.

effulgentsia’s picture

Sorry for starting with trivial reviews so far rather than anything fundamental, but I'm still only slowly getting up to speed with the patch. I'll have very limited time for the next week, so please don't hold back an RTBC on my account, but I'll post small reviews as I'm able to.

+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.php
+   * @private Calling this function circumvents the entity system and is
+   * strongly discouraged. This function is not considered part of the public
+   * API and modules relying on it might break even in minor releases. Only
+   * call this function to write a query that \Drupal::entityQuery() does not
+   * support. Always call entity_load() before using the data found in the
+   * table.
+  static public function fieldTableName(FieldInterface $field) {

We don't yet have an @private annotation for Drupal core. I like the idea of introducing it and using it more broadly, not just the few places in this patch. However,
1. Should we rename it to @internal instead? That would match Symfony's convention.
2. Should we prefix the method name with a leading underscore, so it's easier to see when a caller is doing something questionable?
3. file.views.inc, image.views.inc, and taxonomy.views.inc call this private/internal method. Do we need a follow up issue to replace those use cases with something that's not strongly discouraged?
4. file.install and taxonomy.install have update functions that change the schema of their field type, and call this private/internal method to update existing data accordingly. Do we need a follow up issue to provide a proper API to do that?

msonnabaum’s picture

Concerning @private (since that was my suggestion), @internal might work, i didnt know about that. The docs seem to be a bit different than what I've seen @private mean though:

http://manual.phpdoc.org/HTMLSmartyConverter/PHP/phpDocumentor/tutorial_...

vs

http://rubydoc.info/docs/yard/file/docs/Tags.md#private
http://usejsdoc.org/tags-private.html

chx’s picture

Reading the links by msonnabaum in #100 it seems @private is more appropriate.

As for leading underscore, I have no preference. I guess yes, though.

Views integration is problematic! It is SQL bound and needs to reach this information but we can't put Views functionality on DatabaseStorageController, can we?

As for updates, the pox on updates. They do whatever needs to be done, there is no clean code requirement for them.

jibran’s picture

Some more doc and typehints nip and tucks. I hope this is not out of line for a critical issue.

+++ b/core/lib/Drupal/Core/Entity/EntityStorageControllerBase.phpundefined
@@ -255,4 +278,291 @@ protected function invokeTranslationHooks(EntityInterface $entity) {
+  protected function doFieldInsert(EntityInterface $entity) {
...
+  protected function doFieldUpdate(EntityInterface $entity) {
...
+  protected function doFieldDelete(EntityInterface $entity) {
...
+  protected function doFieldRevisionDelete(EntityInterface $entity) {
...
+  protected function doFieldLoad($entity_type, $queried_entities, $age) {

docs missing.

+++ b/core/lib/Drupal/Core/Entity/EntityStorageControllerBase.phpundefined
@@ -255,4 +278,291 @@ protected function invokeTranslationHooks(EntityInterface $entity) {
+

Extra space.

+++ b/core/modules/field/field.installundefined
@@ -65,10 +53,12 @@ function _update_8003_field_create_field(array &$field_config) {
+ *   TRUE ff this is the first instance of the field. Defaults to TRUE.

Typo it should be if.

+++ b/core/modules/field/field.installundefined
@@ -90,6 +80,66 @@ function _update_8003_field_create_instance(array $field_config, array &$instanc
+function _update_8000_field_sql_storage_write($entity_type, $bundle, $entity_id, $revision_id, $field_name, $data) {

Some typehints and docs missing.

chx’s picture

FileSize
37.16 KB
193.36 KB

Renamed the @private methods to _methods, addressed #102. Thanks for the reviews!

dawehner’s picture

Status: Needs review » Reviewed & tested by the community

That is a quite important patch and opens the door for more improvements, like generate the tables automatically.

There might be some more docs issues, like using EntityInterface instead of \Drupal\Core\Entity\EntityInterface (see last interdiff.txt) but there will be more and more issues
which will deal with these points.

plach’s picture

I'll have a look to this tonight, but if it is ready go on without me :)

Berdir’s picture

We need a critical follow-up to convert the save logic to no longer use the BC decorator, but I agree that this should get in asap and then we'll *change* the code when we can get a useful interdiff out of it that can actually be reviewed.

yched’s picture

FWIW, I'm back home and started reviewing this.

yched’s picture

As a first reaction, as (I think) @eff above, I'd also vote to actually remove the code that's intended for deletion:
- storage hook docs in field.api.php - especially as some storage hooks are apparently preserved : hook_field_storage_pre_[load|insert|update]()
[edit: and then I guess strictly speaking they should move out of field.api.php, into - hmm, not sure where...]
- field_sql_storage code
- the former implementation of the field_test_storage in field_test module's field_test.storage.inc

plach’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
193.61 KB
2.27 KB

Overall this looks very good to me, I found some minor stuff that it would be better to fix before commit. Patch attached. Below some questions.

+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.php
@@ -7,13 +7,15 @@
+use Drupal\field\FieldInfo;
+use Drupal\field\FieldUpdateForbiddenException;
+use Drupal\field\FieldInterface;
+use Drupal\field\FieldInstanceInterface;
 use Drupal\Core\Entity\Query\QueryInterface;
 use Drupal\Component\Uuid\Uuid;
 use Drupal\Component\Utility\NestedArray;
 use Drupal\Core\Database\Connection;
+use Drupal\field\Plugin\Core\Entity\Field;

I am really concerned by the amount of dependencies we are introducing between the entity system and the Field API module. I realize this might be needed atm, but I am wondering whether there is a way to exploit field definitions and the new field definition interfaces instead of directly referencing Field API classes.

Since our ultimate goal is not special-casing fields provided by the Field API module, I am wondering whether it makes sense to keep the distinction between configurable and non-configurable fields at storage level. This might be follow-up material, but I'd feel better if we at least had an answer to this concern before commit. And since we are past API freeze the less API changes are required to get this refactoring done, the better, hence maybe we should try to address this here if at all feasible in the current state.

+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.php
@@ -566,4 +552,648 @@ public function baseFieldDefinitions() {
+    $ids = $load_current ? array_keys($entities) : array();
+    foreach ($entities as $entity) {
+      $bundles[$entity->bundle()] = TRUE;
+      $ids[] = $entity->getRevisionId();
+    }

I've read these lines multiple times and I still don't get why they are working. An inline comment would be welcome.

--- a/core/lib/Drupal/Core/Entity/EntityStorageControllerBase.php
+++ b/core/lib/Drupal/Core/Entity/EntityStorageControllerBase.php

I am wondering wether it would be better to introduce a FieldableEntityStorageController class to avoid "bloating" the base storage controller with methods that are used only by content entities. We have a similar issue for the entity objects (see #2004244: Move entity revision, content translation, validation and field methods to ContentEntityInterface), this solution would be consistent with the one proposed there.

Related to this, what about moving the Field CRUD code to a dedicated helper class? Right now we are loading in memory at every request code that will be used very few times. I know this is already happening in D7 but I see the potential for some optimization here.

+++ b/core/lib/Drupal/Core/Entity/EntityStorageControllerBase.php
@@ -255,4 +278,330 @@ protected function invokeTranslationHooks(EntityInterface $entity) {
+      module_invoke_all('field_attach_load', $entity_type, $queried_entities, $age);

Do we still need this, given that the other storage-related field attach hooks are gone?

Gábor Hojtsy’s picture

Tagging as per explicit approval from @Dries in #68 above.

chx’s picture

In New York, yched and me have agreed on me to remove the ability to share a field between entity types (Damien has also indicated this would be good quite some time ago) and we are working through the failures in the sandbox.

tim.plunkett’s picture

remove the ability to share a field between entity types

That will have a rather large impact on site builders.

For example, Views won't be able to filter or sort by a single field anymore.

effulgentsia’s picture

For example, Views won't be able to filter or sort by a single field anymore.

#111 says "between entity types", not "between bundles". Does Views ever allow cross-entity-type queries?

effulgentsia’s picture

Tag removal in #113 was unintentional; restoring them.

tim.plunkett’s picture

I know I've done it. I can't remember the use case, or if it was just academic...
Yes, across bundles is obviously essential.

catch’s picture

What's the upgrade path for sites already using a field across two entity types?

plach’s picture

What about duplicating fields and prefixing them by entity type?

Damien Tournoud’s picture

I have been arguing that fields should be local to an entity type. In most websites build that I participated recently we had a lot of stupid prefixing of field names by their entity types.

plach’s picture

Am I wrong or in Denver the conclusion was we would allow for cross-entity-type fields if they shared the same storage?

chx’s picture

> What's the upgrade path for sites already using a field across two entity types?

Instead of having a single field attached to user and node they will have two fields of the same field name. Once you loaded your user or node, there is no change. If they use SQL storage: the table name changes -- it now includes the entity type but the upgrade path is very easy, it can be done in a single SQL INSERT-SELECT command which already is written.

plach’s picture

@chx:

So we have fields with the same machine name and different ids?

chx’s picture

> So we have fields with the same machine name and different ids?

$field->id() is now "$entity_type.$field_name" if that's what you mean.

yched’s picture

Yes, basically a $field is now identified by [entity type, field name] (field_info_field() signature is changed accordingly)
Two fields with the same name on nodes and users are just like two different fields - storage location is different, we don't support cross entity type queries, and never have.
This is just a direct consequence of the "storage controller is in charge of storing the whole entity". Different entity types means different controller classes, means different "data buckets".

yched’s picture

As @chx pointed in #122, this means
- the ConfigEntiy id() of a field is now "$entity_type.$field_name" (instead of "id() = field name" previously)
- we get back to a $field->name property that is just the field name.

That's an unfortunate back and forth dance after #1735118: Convert Field API to CMI that stated "$field['name'] in D7 is $field->id() in D8", but I actually think this is for the best. Base (non-config) fields have a 'name', and all the code that deals with config or base fields indifferently, adresses them by "name". So, consistency ++.

$field->id(), just like $instance->id(), is now just a CMI id that should only matter to very little code.

Damien Tournoud’s picture

I am totally in line with #124.

plach’s picture

#124 makes perfectly sense to me too, nice solution :)

fago’s picture

I totally second #123 + #124 as well!

Amazing this one is green. Here a review:

- storage hook docs in field.api.php - especially as some storage hooks are apparently preserved : hook_field_storage_pre_[load|insert|update]()

Looks like the hook is still around, but that does not make sense to me. If there is no field storage, there should be no pre-field-storage [load|insert|update] hooks. Can we remove them as we've done with other pre-field-API hooks? If we need a pre-storage hook it should be on the entity level.

* The type of storage.
*
* Drupal core provides a basic 'sql' storage type by default which stores
* every field in a separate table. Contrib or custom modules can provide
* alternative storage types.
*
* @return string
*/
public function storageType();

Oh, that's a useful addition/port!

+ * The storage type this field is in. Typically 'sql'.

I'm not sure why this still needs to be stored with the field - it has no choice anyway. I guess we could replace it with a method?

   * Allows reaction to the deletion of a configurable field.
   */
  public function handleDeleteField(FieldInterface $field);

  /**
   * Allows reaction to the deletion of a configurable field instance.
   */
  public function handleInstanceDelete(FieldInstanceInterface $instance);

I think it would be a great goal if we could manage to get field-API interfaces out here, but stick to entity-api interfaces only - so both APIs would be better decoupled/separated. That should be doable if we would move entity field definitions to a class implementing the field definition interface as well. What remains then is some usages of field instances, but those do not seem to be used a lot besides getting the field object from it anyway. Howsoever, I'll open a separate issue for the class based field definition point so we can re-consider that once we know more there. (updated: see #2047229: Make use of classes for entity field and data definitions)

yched’s picture

Note: latest patch here is stale and doesn't reflect the new approach yet.
Notably, 'storage_type' is no more. Instead, $field gains an 'entity_type' property.

yched’s picture

Current state of the patch with the new approach.
Not green yet, but should be reviewable.

Too many changes & rerolls for an interdiff ...

Will try to answer @plach #109 and @fago #107 shortly.

Side note: opened #2047777: Helper issue for "Switch from Field-based storage to Entity-based storage" to go wild with testbot runs.

jibran’s picture

Let's put the new approach in issue summary.

/me runs

rlmumford’s picture

+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.phpundefined
@@ -528,42 +536,633 @@ protected function saveRevision(EntityInterface $entity) {
+      $query = $this->database->insert($table_name)->fields($columns);
+      $revision_query = $this->database->insert($revision_name)->fields($columns);

I may be missing something, but are we assuming that fields are always revision-ed even when the entity may not be? What about a SQL storage method that's only difference is to not use revisions, would they have to completely replace this method?

+++ b/core/lib/Drupal/Core/Entity/Query/Sql/Tables.phpundefined
@@ -157,7 +158,7 @@ function addField($field, $type, $langcode) {
+        $sql_column = DatabaseStorageController::_fieldColumnName($field, $column);

@@ -242,12 +243,12 @@ protected function ensureEntityTable($index_prefix, $property, $type, $langcode,
+      $table = $this->sqlQuery->getMetaData('age') == FIELD_LOAD_CURRENT ? DatabaseStorageController::_fieldTableName($field) : DatabaseStorageController::_fieldRevisionTableName($field);

In the storage controllers above we use late static binding to make sure we get the correct implementation of _fieldColumnName(). Is called DatabaseStorageController directly here going to lead to an inconsistency?

+++ b/core/modules/field/field.views.incundefined
@@ -89,18 +113,22 @@ function field_views_field_label($field_name) {
+  $current_table = DatabaseStorageController::_fieldTableName($field);
+  $revision_table = DatabaseStorageController::_fieldRevisionTableName($field);

As above.

+++ b/core/modules/field/lib/Drupal/field/Plugin/views/field/Field.phpundefined
@@ -286,7 +287,8 @@ public function clickSort($order) {
+    $field = field_info_field($this->definition['entity_type'], $this->definition['field_name']);

As above.

rlmumford’s picture

Didn't man to remove tag, sorry!

rlmumford’s picture

I also think this issue has the potential to solve #2038707: Entity query sql backend limits storage controllers changes in contrib - we need to make sure efq's play nicely.

plach’s picture

@rlmumford:

I think the issue summary could still use a refresh :)

plach’s picture

Crosspost

giorgio79’s picture

#131

I may be missing something, but are we assuming that fields are always revision-ed even when the entity may not be?

I believe this is related to the D7 issue, where Field API always creates revision db field tables even though the nodes do not have revisioning turned on. This almost doubles SQL writes for nodes and decreases performance. More info at: #1279440: Deeply decouple revisions from Entity API and Node module
Another issue with a start on decoupling nodes and revisions: #120967: Separate revisions from node.module
Here is a case study that shows node write performance doubles using https://drupal.org/project/field_sql_norevisions Been a while since I've seen a patch that increases Drupal node write performance by 100%, but maybe this can be it :P

chx’s picture

skipping revision writes: followup. We are changing enough here. We are changing what we must. That's still a 400kb+ patch...

> Is called DatabaseStorageController directly here going to lead to an inconsistency?

We could change static:: to self:: I do not care at all. Note that the Views integration will treat everything extending DatabaseStorageStorage as having the same structure, anyways.

fago’s picture

Note that the Views integration will treat everything extending DatabaseStorageStorage as having the same structure, anyways.

This is where I think the previous storage-type, or a similar solution using an additional interface was better. Views should not depend on a certain implementation class to think this is sql-based storage, there should be an implementation in-dependent way to specify that. If it relies on additional methods also, then this should be an additional interface imho.

rlmumford’s picture

> Is called DatabaseStorageController directly here going to lead to an inconsistency?

We could change static:: to self:: I do not care at all. Note that the Views integration will treat everything extending DatabaseStorageStorage as having the same structure, anyways.

From a contrib point of view, to be able to "almost-do" something is often more unhelpful than not being able to do it at all. To write a new storage engine to find that some parts of drupal don't respect it is going to be a big DX issue surely?

It can't be that difficult to get hold of the storage class for the entity type your dealing with in these instances can it?

Sylvain Lecoy’s picture

LOL #118: I have been arguing that fields should be local to an entity type. In most websites build that I participated recently we had a lot of stupid prefixing of field names by their entity types.

Yeah I've opened an issue about this #1945278: Allows fields to be stored by entity type instead of name..

Sylvain Lecoy’s picture

And while speaking at stupid things we should also remove EntityInterface; successful frameworks i've seen do not dictact how to define an entity by extending base class or implementing interfaces.

Plain Old PHP Object are simple and effective!

yched’s picture

Can we avoid pure and unrelated trolling in a complex issue already at 140 comments ? thanks.

Sylvain Lecoy’s picture

I am completely sorry I thought we were speaking about API changes / improvements of the Field API... Let me bring some Ctools and Views fluff talk in my post so I will be not considered as a troll.

I noted some people saying

remove the ability to share a field between entity types

or Damien in #118

I have been arguing that fields should be local to an entity type. In most websites build that I participated recently we had a lot of stupid prefixing of field names by their entity types.

I am more than happy if this would be fixed and I proposed a solution in #1945278: Allows fields to be stored by entity type instead of name..

What do you think of this issue ? It is separate and does not require overhead in this discussion.

chx’s picture

Sylvain, you are one of those people who have the excellent capability to derail / talk to an issue to death. Enough. Thanks.

chx’s picture

Issue summary: View changes

add new follow ups

chx’s picture

Issue summary: View changes

Updated issue summary.

chx’s picture

FileSize
494.6 KB

This patch here implements the non-shareable fields. It includes two databases issues I linked from the issue summary which needs to get in first to make field_update_8006 doable -- it copies the data table structure and then copies the data with an INSERT ... SELECT *. Much thanks to amateescu and yched for getting this to green -- I only did a small part. We will clean uninstall up in the next few days -- either by the module disable removal issue getting in or storing the storage class and then doing something with entityManager to get that in there.

yched’s picture

Now that we're green :-), I'll try to update the issue summary with the new approach and the API changes. Probably won't be before monday, though, so anyone welcome to beat me to it.
I'll also try to answer the comments received since #109 above.

One issue that we still need to resolve: table name length.
In the new code, a field is identified by [$entity_type, $field_name] rather than just a field name. So table names, that previously only included $field name as their "variable part", now need to include both information (current patch does "field_{$entity_type}__{$field_name}" / "field_rev_{$entity_type}__{$field_name}").
Problem:
- we hit "table name too long" much sooner
- that's two varying parts, so we can't solve this by just putting a harder limit on field names.

The current patch reaches green by patching a couple tests to just use shorter field names (simpletest makes things worse by eating 10+ chars just with the db prefix). We need to solve that more reliably though.

Optionally, we could consider switching to just "{$entity_type}__{$field_name}" / "{$entity_type}_rev__{$field_name}" pattern for table names, this saves up a couple more chars, and has a nice side effect of kicking phpMyAdmin's autofolding in the "list of tables" left column, + laying field tables next to each entity type base table. Whether we do this or not, it's obviously not good enough though.

Possible approaches I've been thinking about:

[edit: snip approaches 1) & 2), they involved more restrictive name lengths, which, as @chx points below, forbids an upgrade path]

Other than that, we need a way to generate a unique, (relatively) human readable string out of an arbitrary [$entity_type, $field_name] pair:

3) truncate and add numeric suffixes
Drawback: it makes the table names depend on CRUD history on a specific site instance (e.g "create field A, create field B, delete field A" & "create field B" result in different table names for field B.
So table names might end up being different on dev & prod, which feels bad - and would actually break since views CMI definitions currently include field table names (even though this specific point feels wrong from Views).

4) Truncate and add a short hash of the Field UUID
We already do something similar for "data tables for deleted fields" in current HEAD.
Could be something like:

$table_name = "{$entity_type}__${table_name}";
if (strlen($table_name) > 50) { // Keeps a 14 chars margin for db prefixes
  $table_name = substr($table_name, 0, 40) . '_' . substr(hash('sha256', $field->uuid), 0, 10);
}
yched’s picture

Side note: opened #2057269: Exported CMI views definitions contain actual table names ? for the Views remark above.

chx’s picture

Limited entity or field name lengths makes me fear that angry mobs in Austin will beat me over the head with "640k is ought to be enough for anybody" signs (although Gates denied ever saying that). Aside from small personal issues like that, the update will fail if we try to enforce new limits. And it's much more plausible that someone has a long entity type (just commerce_line_item is 18 characters). So obviously 4) is the only way forward with the only detail waiting for some delicious bikeshedding is whether we want a field_ (or even field_data_ just to make it more confusing!) prefix. Not employing that prefix, of course, has the chance of making the meaningful table length longer so I really like that and unless someone has a real compelling reason not to go with 4) as it is, I would like to close this discussion right here.

yched’s picture

Doh, silly me. Stricter limits means no upgrade is possible, of course. Another novel I could have saved writing...
So yep, 4) it is.

Regarding "{$entity_type}__${table_name}" / ditching the "field_" prefix, I think I like that more and more as well. It does break the usual understanding of "prefix your tables with your module name", but this just means we use dynamic prefixes that other modules define... and the double underscore should safely avoid clashes. So IMO we can just go with it.
We can always reconsider in a followup, that would be a non API-breaking change.

plach’s picture

4) sounds very good to me, with the new data/revision tables and the refactroing planned in #1998366: [meta] SQLite is broken, we might end up with the following:

node
node_revision
node_field_data
node_field_revision
node__body
node_revision__body

We are going to get inconsistent data/revision table couples in some situations, I guess:

node__very_long_field_name
node_revision__very_long_field_1234abcdef
yched’s picture

- Reroll,
- fix code style remarks by @jibran in #2047777-65: Helper issue for "Switch from Field-based storage to Entity-based storage"
- rename Field::ID_MAX_LENGTH, the current limit is about field names again (and we still need one even if we truncate / hash table names),

yched’s picture

Missed a change from #2054699: Ensure config entity id is set for computed composite ids when merging.
+ more code style remarks from @jibran in the helper issue

yched’s picture

re @plach #109:

- generalizing to "non configurable fields" & FieldDefinitionInterface:
I wish we could go there, but honestly this feels both too much for this patch and really premature at this point.
The methods added to EntityStorageInterface are very much about reacting to the CRUD cycle of configurable fields - new $field created, $field updated, $instance deleted, $field deleted... Base fields have no observable CRUD cycle to react to, it's very unclear how housekeeping for fields defined in code would work.
+ base fields expose no schema for now, we don't have a unified mechanism to list all (base + configurable) fields of an entity bundle... Way too many unknowns for now.

- Code about $ids[] in doFieldLoad() :
Looks weird indeed... Will look into this

- FieldableEntityStorageController: yup, latest patch iterations have been doing that already

- hook_field_attach_load() being still around while field_attach_load() itself is gone.
Maybe @chx will have more insights on this.
The problem as I see it is that we have nothing in the current entity cycle hooks to replace this. It currently runs after values for configurable fields are fetched from storage, and before they are cached - and thus don't run again on next cache hit.
I guess that is more related to "what happens with field cache / entity cache", and so for now it is still needed.

re @fago #127

- hook_field_storage_pre_[load|insert|update]() still around:
Agreed, those do look weird now. They were kept because forum module uses them (insert & update).
I'll try to remove them.

yched’s picture

Updated patch:
- Fixes for _update_8000_field_sql_storage_write().
- Reverts a needless change in FieldUpgradePathTest
- as per #154, cleans up code about $ids in doFieldLoad()

yched’s picture

- Removes hook_field_storage_pre_*() as per #154
- Generates safe table names as per #146 option 4)
- Reverts tests that were changed to not generate long tablenames.
- Removes leftover 'storage' entries in field CMI files (tests, default config...)

yched’s picture

More todos though :-(

- views integration needs more work, it's still based on the fact that a field can appear in several entity types, and the $sql_entity_types array we introduced earlier makes no sense anymore ("The array of SQL-based entity types where the field appears in" - a field appears in one single entity type, and that entity type uses DatabaseStorageController or not - hook_field_views*() need to run only if yes, and they operate on a single entity type).

- FieldInfo::getFieldMap() / field_info_field_map() still returns a list modeled after the old model:

field_foo:
  bundles:
    node: article, page
    other_entity_type: bundle_1, bundle_2
  type: image
field_bar:
  bundles: ...
  type: ...

That is very wrong, since field_foo as it appears in 'node' and 'other_entity_type' entity types are totally different fields with possibly different field types.

The field map needs to be keyed by entity_type first now:

node:
  field_foo:
    bundles: article, page
    type: image
  field_bar:
    bundles: ...
    type: ...
other_entity_type: 
  field_foo:
    bundles: bundle_1, bundle_2
    type: text

- $field->getBundles() still returns an array keyed by entity types, then bundles. The 1st level is not needed, will only ever be one entry. This change is not 100% strictly needed, but leaving it that way really gives the wrong idea about the field model now, so since we're already changing field_info_field() & friends, it's only consistent to change that too.
Also the implementation currently relies on FieldInfo::getFieldMap() above, so...
All code calling $field->getBundles() (or $field['bundles'] through the BC ArrayAccess layer) will then need to be updated accordingly.

I'm working on the first item (views integration) in the helper issue.
Help welcome for the other two :-)

swentel’s picture

I'm on number two - and quite possible on 3 as it's logical. (note, deleted the other comment first, should have edited, but oh well)

yched’s picture

Cleaned up views integration.
Saving a testbot run here, it's green in the helper issue and swentel is going to post a patch for FieldInfo::getFieldMap() + $field->getBundles() soonish.

yched’s picture

right, "no-no-test.patch"...
Manually cancelled testing then :-)

swentel’s picture

Refactoring of FieldInfo::getFieldMap() + $field->getBundles().

yched’s picture

FileSize
5.19 KB

field_info_fields() cannot return a list fields keyed by field name anymore, there can be several fields with the same name in different entity types. Key by UUID like FieldInfo::getFields() instead.

yched’s picture

Doh, forgot to upload the patch in #165.
The change is pushed though, and the patch will need a reroll after #2056133: Add db_copy_table_schema anyway - need to run, will do that later if noone beats me to it.

chx’s picture

FileSize
492.62 KB
yched’s picture

Thanks @chx

Missed one test assert in #165.
+ filter arrays 5.3 style, goddammit...

yched’s picture

- Use table aliases instead of the real (truncated/hashed) table names in views data entries, so that they do not end up in views CMI files.
Lets us refine naming scheme later if needed without breaking saved views.
- This required some reorg in field_views_field_default_views_data(). Cleaned up while I was in there, that function has been through a lot of different hands since CCK D6 and is an awful mess in HEAD. All the lines in there are currently modified by the patch anyway, so this doesn't impact the patch size.
- Fixed broken join info (field type tables have no 'entity_type' column anymore, and do not require a join condition on it)

yched’s picture

OK, I think we're almost there.

- I want to give a second look at the purge part - fieldPurgeData() / fieldValues() / fieldPurge. Something bugs me in there but I couldn't spend time on really nailing what so far. Will try to do that later today.
- Patch removes one stale test method in Drupal\field_sql_storage\FieldSqlStorageTest ("storage details"), but the rest of the class is still valid as tests for the code in FieldableEntityStorageControllerBase, so it needs to be moved to an active test class somewhere.
I'd be fine with pushing this to a followup to avoid patch size explosion if that's accepted, but it needs to happen, we don't want to lose that test coverage. Actually it would be good to do the job now to avoid bad surprises and make sure that the current code still passes, even if the actual moving around happens in an immediate followup.
- Fine with leaving dead field_sql_storage in place for now for patch size, but for the patch to be self consistent we need to at least remove the old hooks docs from field.api.php. Leaving those in there is really misleading for reviewers & for people using HEAD meanwhile.
- The issue summary still needs to be updated with a complete list of the changes (classes, interfaces, storage reorg, API changes...). Will try to do it later today too.

yched’s picture

Reworked the purge area a bit:
- fieldPurge() was never called, the old hook_field_storage_*() hooks were called instead
- loading of data to purge produced invalid $items
- renamed a couple methods to be inline with the rest of the API (handleXXX(), doYYY()...)

yched’s picture

Issue summary: View changes

Updated issue summary.

yched’s picture

Updated the issue summary with the current approach and list of changes.

yched’s picture

Issue summary: View changes

Updated for the current state of the patch

yched’s picture

Issue summary: View changes

Oops, pasted too much stuff...

yched’s picture

Issue summary: View changes

Reorg a bit

yched’s picture

Issue summary: View changes

Adjustments

yched’s picture

updating tags

yched’s picture

Issue summary: View changes

typo

yched’s picture

As per #173, final cleanup:
- Removed hook documentations
- Adapted tests that still made sense from field_sql_storage, moved them to Drupal\system\Tests\Entity\FieldSqlStorageTest
- Remove field_sql_storage.module.

In order to not further complicate reviews, two patches are provided:
- REVIEW-do-not-test.patch: the patch for review, without the cleanups above (i.e. same as #174)
- FULL.patch: the candidate patch for commit, including the cleanups above
- plus a REVIEW-to-FULL.interdiff.txt to show the difference between the two.

smiletrl’s picture

This is really great work, especially for updated issue summary!

Considering

Field API "fields" are now specific to an entity type, and can only be shared across bundles of a given entity type, but not across entity types.

I start to think, property -- 'entity_type' becomes unnecessary to field instance. It would be a duplicate property, since this property is the same shared with the field this instance built on.

+++ b/core/modules/system/lib/Drupal/system/Tests/Entity/FieldSqlStorageTest.phpundefined
@@ -0,0 +1,426 @@
+    // Create a decimal 5.2 field and add some data.
+    $field = entity_create('field_entity', array(
+      'name' => 'decimal52',
+      'entity_type' => $entity_type,
+      'type' => 'number_decimal',
+      'settings' => array('precision' => 5, 'scale' => 2),
+    ));
+    $field->save();
+    $instance = entity_create('field_instance', array(
+      'field_name' => 'decimal52',
+      'entity_type' => $entity_type,
+      'bundle' => $entity_type,
+    ));

'entity_type' => $entity_type, seems duplicate for new instance imho:)

yched’s picture

@smiletrl: no, entity_type is still needed in field instances, because both 'field_name' and 'bundle' are ambiguous by themselves.
There can be different fields with the same name in different entity types; there can be different bundles with the same name in different entity types.

What could be done here is to accept a 'field' value (with a $field object) in entity_create('field_instance').
Then you can simply specify a 'field' and 'bundle' and you're good. That's not completely trivial though, and is outside the scope of this patch.

smiletrl’s picture

Yeah, right, right. Sorry about this -- 'field_name' and 'bundle' are still ambiguous for an instance.

David_Rothstein’s picture

From the issue summary:

As was agreed with @chx,@Crell & @alexpott in NYCCamp mid-July 2013, a direct consequence of "the storage controller is in charge of storing the whole entity" is that Field API "fields" are now specific to an entity type, and can only be shared across bundles of a given entity type, but not across entity types. Different entity types means different controller classes, means different "data buckets".

What are the proposed benefits of this change? I'm asking because the patch in this issue appears to have been passing tests and even RTBC somewhere around comment #110, before this restriction was ever added.

We never supported cross-entity-type queries anyway, so this has no functional impact.
If anything, this gives site builders more liberty to name their fields, not stumbling on the same name being already used in a different entity type.
....
There can be two fields with the same name in different entity types, they're seen as completely different fields.

I disagree; I think there is substantial loss of functionality here.

In practice, fields can have many things "attached" to them, including theming (CSS based on the field name), business logic (code), and configuration set via the UI (e.g., think of Field Permissions as an example of mission-critical field-level configuration).

Before the patch, these all work very naturally; if you choose to reuse an existing field on a different entity type, all those things automatically come along with it.

After the patch, you can't reuse the field across entity types, but if you create a new field on a different entity type that has the same field name, you may inherit some of the attached behavior but not all of it. CSS/theming will be inherited. Business logic may or may not (depending on how the code was written). Configuration set via the UI presumably won't be.

So if you want the new field to behave the same as the equivalent field on the previous entity type, some of it might happen automatically (like it does now), but you might also have to rewrite some code or redo configuration by hand for the rest of it - a big loss in flexibility. And this is also very confusing; whereas the "reuse existing field" concept is very clear in the user interface, this would be more like "create a field with a particular string as the field name so that it can magically inherit some particular behaviors". Furthermore, since default machine names are based on the field label and there are a number of common field labels that sites often use ("Body", "Tags", "Image", etc.) there is a high chance of this happening by accident, when the site builder did not actually intend the fields to be related.

yched’s picture

What are the proposed benefits of this change? I'm asking because the patch in this issue appears to have been passing tests and even RTBC somewhere around comment #110, before this restriction was ever added

That previous approach relied on brittle / unsustainable logic around "do those two entity types use the "same" field storage backend ?"
This just cannot fly, there's no "same field storage backend", there are just entity types and their - separate - storage controllers.
It was green because in practice our test suite didn't test the cases that would have been nasty.

This relied on storing "what 'family' of entity storage can you be attached to" (for some brittle definition of "family": 'sql', 'mongo'...) in the field CMI files - including some convoluted code flow that wrote this info back in the $field when the 1st $instance gets created, because you can only know that at the time you attach the first instance to an actual entity type, not at the time you create the field.
But we ship those in default configs. And the whole point of this issue is to make entity storage swappable (use mongo for nodes, for example). This directly conflicts: the storage used for an entity type is a runtime property, default config cannot assume / hardcode it.

This also resulted in much more complex and hard to predict logic around "which entity types can share fields, which cannot".

In short, this made the whole architecture really shaky - while there is no actual use case for "two fields in different entity types being the *same* storage bucket".

In practice, fields can have many things "attached" to them

Yes, they now need to be attached to [entity_type, field_name] separately, not just field_name. There should be no "magically sharing behavior because of same field name".
And true, CSS classes and field template overrides are an omission in this patch. They should now include the entity_type name.

yched’s picture

#177 had a @todo left about testing "long table names".
Here it is (+ some type fixes).

This only affects the "FULL" patch, "REVIEW" patch is the same as in #177.

David_Rothstein’s picture

Thanks for that explanation. I guess I'd need to study this more to have a better perspective, but the way I'd have expected this issue to work is that the storage configuration would be entirely at the entity type level rather than the field level, and if the field is attached to more than one entity type, there's just more than one location in which the field data can be dumped into. So for a field shared between nodes and users (and if both of those entity types use the database for storage), instead of a single {field_data_some_field} table there would be two tables, {field_data_node_some_field} and {field_data_user_some_field}. Or if nodes and users have different storage then it would be in two different storage systems instead. But that doesn't mean that anything else about the field has to be separated out, so there would be no need to have restrictions on which types of entities a field can be shared with. It's not clear to me why that wouldn't work, but perhaps there is some reason I'm not seeing right now.

Yes, they now need to be attached to [entity_type, field_name] separately, not just field_name. There should be no "magically sharing behavior because of same field name".
And true, CSS classes and field template overrides are an omission in this patch. They should now include the entity_type name.

If the CSS/template issue is fixed and the "don't use field name alone in your code" rule is followed I suppose it's possible the magic sharing behavior would go away. But that does still mean the opposite, which is a big loss of functionality. Instead of just reusing a field when you want it to work the same way, you'd have to rebuild a duplicate field and possibly change your code. Given the explosion of entity types in Drupal 7 including some which can have fuzzy boundaries distinguishing them (e.g., nodes vs. fieldable panels panes) the ability to easily share fields between entity types is a major part of the architecture of many sites, and I think it would be a big loss.

chx’s picture

If you do {field_data_node_some_field} and {field_data_user_some_field} then you already do not support sharing the field between entity types because it is not possible (OK, very hard but if one of them is not SQL then impossible) to retrieve all the field data in one query and so then why bother with pretending it's the same field? Indeed, if they are separate tables why they should be the same field at all?

tstoeckler’s picture

I totally agree with @David_Rothstein that sharing fields is a valuable feature in D7. But as he points out, it's not so much the data sharing that is valuable (which is the part this issue makes impossible) but more the configuration sharing.

I think this problem could be solved in contrib, however, by providing something like 'field templates' that you can configure once in the void and then attach to different entities with the exact same configuration.

In fact, I've thought many times about writing such a module in D7 already because the same thing already happens for field instances. Often times a field will come with a myriad of instance settings which you often want to configure in the exact same way multiple times. A way to share (the configuration of (!)) instances is equally valuable sometimes, but completely impossible in D7.

So while it's true that this is a loss of functionality with just core, I think such a tool in contrib could fill this void quite well.

tstoeckler’s picture

Issue summary: View changes

update remaining

yched’s picture

Being the same, shared "field" has always been about being the same data set, and I'm really uncomfortable with the idea of reshuffling this at this point (one $field meaning different datasets depending on which entity type uses it)

And yes, sharing a field "just" to reuse configuration has always been a wrong case for field sharing IMO. This talks more about the lack of a "clone" functionality in the UI, similar to the one offered by views: create a *separate* copy, saving the hassle of filling the configuration forms again. In the absence of "clone", yes, sharing a field instead is very tempting, but this doesn't make it right.

Also note that the D7 behavior (the "namespace" of field names is "cross entity types") has also been noted as a pain point (see #118 above - "oh I can't name this field "field_the_name_that_fits', it's already taken by some other field in some other entity type"). As a direct example in core, the body field on comments had to be named 'comment_body' instead of 'body' because 'body' was already taken (and had a different definition) for nodes. Then we had to add an optional $field['entity_types'] property to avoid 'comment_body' showing up in the "reuse existing field" UI for nodes and vice versa...

yched’s picture

Reroll.
Diff between REVIEW & FULL is the same as in #183.

David_Rothstein’s picture

Status: Needs review » Needs work

"Cloning" wouldn't meet the same needs as field sharing since it doesn't address what happens when you want to change configuration after the field is already cloned... Nor does it go into your codebase and magically rewrite your code so that any behavior attached to the first field is applied to the cloned field as well :)

I think @tstoeckler's idea of field templates is interesting, though, and it could replace all the functionality that would be lost here. I'm still a little wary of removing a useful feature from core in favor of something that might theoretically exist in contrib (but doesn't yet), and also shared fields would still be around within an entity type so it might confuse people to have these two concepts "competing" with each other. Nonetheless, it could probably work.

chx’s picture

FileSize
1.59 KB
501.02 KB

The test coverage for field deletion must be... spotty :) I am not sure how yched rolls his review only patches but the changes are minimal. Thanks for removing getType, that was truly confusing, it's a fantastic API improvement.

chx’s picture

Status: Needs work » Needs review
yched’s picture

Added entity_type in field CSS classes & theme hook suggestions.

More thoughts on "field name has a global meaning over all entity types" vs "field names are namespaced by entity types":
I really think that the latter makes more sense in terms of data modelling. Imagine building an SQL schema where a column definition has to be the same in every table where the column name appears - that's pretty much what the first option is. Then you need to invent prefixing strategies to make sure the column you want to use doesn't clash with a column a completely unrelated table (existing, or added by a module you'll enable in six months).

Field names should carry a fixed meaning only within one business domain - you can't predict how a contrib module you'll install in 6 months will name the fields on its entity type.
Modules attaching "behavior" just to a field name are not a use case IMO. The trend, already in D7, is to allow flexibility by letting the behavior vary at least by bundle. RDF module attaches semantics to field names by entity type & bundle, and it would be severely crippled if it didn't.
Even within Field API $field & $instance structures, the trend and feature requests since CCK 4.7 have always been "put most things at the $instance level, so that they can vary for each bundle". So that in the end, all that's left in $field is strictly what defines a database schema, since by definition this is the only thing that *can't* vary.
This is also why I'm not buying "sharing fields just to use the same $field definition" as a use case that should drive our model and APIs: there's really not much in $field, most things are in $instance / by bundle, and this is what really takes effort if for some reason you want to keep it synchronized across entity types or bundles. This is where a "clone", or a "default / override" model à la Views, could potentially be useful, bot not really at the $field level IMO.

yched’s picture

@chx: "REVIEW-do-not-test" patches are just the sandbox, "FULL" patches are a local branch with the additional "cleanup" commits I manually rebase over the sandbox for now.

yched’s picture

Reroll, plus should hopefully fix a couple fails from the previous reroll.

Status: Needs review » Needs work
Issue tags: -API change, -Field API, -Entity Field API, -Approved API change

The last submitted patch, field_storage-1497374-196-FULL.patch, failed testing.

yched’s picture

Status: Needs work » Needs review
Issue tags: +API change, +Field API, +Entity Field API, +Approved API change

Bot fluke, passes locally.

#196: field_storage-1497374-196-FULL.patch queued for re-testing.

plach’s picture

yched’s picture

Reroll - & #200 comments mark...

So, how do we get this thing reviewed and committed ? :-)

Wim Leers’s picture

yched’s picture

Status: Needs work » Needs review
FileSize
588.08 KB
499.99 KB

Yup, it got merged, but needed additional changes in the new code :-/

yched’s picture

"No, this is *not* a 500k patch".

Status: Needs review » Needs work
Issue tags: -API change, -Field API, -Entity Field API, -Approved API change

The last submitted patch, field_storage-1497374-203-FULL.patch, failed testing.

yched’s picture

Sigh. Missed one.

msonnabaum’s picture

Most field methods (fieldLoad, fieldInsert, fieldUpdate, etc) read very strangely to me since the noun comes before the action. I see no reason why they shouldnt be reversed (loadField, insertField, updateField, etc).

+      // Invoke the field type's prepareCache() method.
+      foreach ($queried_entities as $entity) {
+        \Drupal::entityManager()
+          ->getStorageController($this->entityType)
+          ->invokeFieldItemPrepareCache($entity);
+      }
+
+      // Invoke hook_field_attach_load(): let other modules act on loading the
+      // entity.
+      \Drupal::moduleHandler()->invokeAll('field_attach_load', array($this->entityType, $queried_entities, $age));

If we're introducing additional calls to \Drupal in classes, we should at least wrap them in local protected methods (i.e. $this->entityManager()->getStorageController). This makes the classes more refactorable while making them significantly easier to unit test as well.

protected function fieldUpdate(EntityInterface $entity)
public function handleFieldUpdate(FieldInterface $field, FieldInterface $original) {

It's not clear from the names of these methods how they are different. The former seems to update a field on an entity whereas the latter updates a field's schema. If I'm understanding that right, the word "handle" doesn't quite tell this story. The docs imply that they are "reacting" to an event, when what they are actually doing is more significant.

FieldableEntityStorageControllerInterface has all of the do/handle methods, but doesn't contain the prefix-less methods on FieldableEntityStorageControllerBase. I can't think of a reason I'd leave those out while the do/handle methods are there.

If some of these can be addressed that would be great, but I dont want to hold up this patch any longer so I'm fine if all this is pushed to a followup. Even the API breaking changes are really only breaking APIs for chx.

Berdir’s picture

Status: Needs review » Needs work
FileSize
16.29 KB
123.17 KB
174.91 KB

Diagrams proudly presented by PHPStorm ;)

Embedding the one that doesn't have methods, see the other one for all the methods and where they are defined. One with only public methods and one that has all.

entity-field-storage.png

Agree that the naming isn't great. I understand that we call them dynamically but many read backwards. And the difference between fieldSomething, handleFieldSomething and doFieldSomething really isn't clear.

How hard would it be to do something like this?

- handleFieldCreate() => This actively does something, so createFieldStorage() and so on ?
- handleInstanceCreate() => Same, so createInstanceStorage() and so on?
- handleBundleCreate() = This doesn't actually create the bundle (so using same pattern is even more confusing). Maybe this could stay like this, or we could slightly change it to handleBundleCreate*d(), to different that this is essentially an hook/event?
- fieldPurgeData() => This also actively does something, so put the verb first, purgeFieldData()? Actually, it purges instances data, as it receives an instance, so purgeInstanceData()? Not sure...
- handleFieldPurged() => This already follows my suggestion for the bundle methods with Purge*d(), so those would be consistent

For the class/interface name, with Entity Field API, everything is a field, including previously called properties like node title. So Fieldable doesn't really say much. One thing we discussed was to use *extendable* instead of *fieldable* for this. So ExtendableEntityStorageControllerInterface.. (And the Controller part will probably go away together with the approved controller => something else issue, so just ExtendableEntityStorageInterface)
Edit: I think that's out of scope for now, as long as we still have fieldable = true on the entity type definition, that's consistent with that.

As for the protected doFieldSomething() and fieldSomething() methods, if we can avoid using handle() for to active field storage methods, then I guess we could live with them, especially as they're not part of the public interface and you don't end up calling them accidentally. That said, while having consistency is helpful, having empty methods like fieldInsert() that just forward to doFieldInsert() really seem unnecessary? Should we define them as abstract on the Base class so that child classes are enforced to implement them? Maybe it doesn't need to something but by making them abstract, we force them to think about it and don't accidently forget to e.g. implement doFieldDelete().

More later :)

yched’s picture

Simple reroll for now, haven't processed #208 / #209 yet.

yched’s picture

Status: Needs work » Needs review
FileSize
32.29 KB
601.75 KB
513.71 KB

- Rerolled
- The call to invokeFieldItemPrepareCache() (@msonnabaum #208) doesn't need to fetch a storage controller, we are on it :-p (copy / paste effect)
- Removed hook_field_attach_load(), I actually can't think of a use case for this. This is a matter for "entity cache".
- (Re)removes the field_attach_*() functions (sneaked in again in the reroll after #2067127: Reorganize the contents of field.* files)
- Removes mentions of the old field_attach_*() functions

yched’s picture

Re @msonnabaum #208 - \Drupal::foo() calls in FieldableEntityStorageControllerBase;
So after #211, there is now just a \Drupal::typedData() call. Given that EntityStorageControllerBase has quite a few of those itself, I'd tend to leave this one here for now.

Re @msonnabaum #208 / @Berdir #209 - method names:
So yeah, I agree the current method names are confusing. To clarify:

- handleFieldCreate(), handleBundleRename()...
Those are the FieldableEntityStorageControllerInterface. Those are the only methods external code will call.

the word "handle" doesn't quite tell this story. The docs imply that they are "reacting" to an event, when what they are actually doing is more significant.

It's an interface, so we precisely can't assume "what" the code will do. This really is about reacting to "the field has been deleted - as you are a FieldableEntityStorageControllerInterface, it probably interests you, deal with it in whatever way makes sense to you".
A Mongo storage controller will most likely do nothing in handleBundleRename(), for example, because bundle names are not relevant for the way it organizes its storage.

--> Would it make more sense if the methods were named onFieldUpdate(), onBundleRename()... ?

- fieldLoad(), fieldUpdate()...
Those are about storing field values of an entity. So yeah, same old ambiguity about "field" = field definition" or "field value"
They are protected methods, not part of the interface (we only provide an external facing API to act on an entity as a whole, not on its fields separately).
They are wrappers that take care of maintaining the field cache, and delegate the actual storage to the sister doFieldOp() methods, to be implemented by the actual storage controllers (e.g DatabaseStorageController)
A "Field cache" limited to configurable fields makes little sense in D8, and those wrapper methods would typically go away if we had an "entity cache", but until then removing the field cache would be a big performance regression...

- doFieldLoad(), do FieldUpdate()...
They hold the actual storage logic. They would typically be left empty on a "document-oriented" storage such as Mongo, since they would store the entity as a whole, not the fields separately from the rest. Hence, they are not marked abstract currently, but I guess they could for clarity.
Then again, *if* we had an entity cache, then the wrappers would go away, most of FieldableEntityStorageControllerBase disappears, and the question of "abstract or not" vanishes, since the storage controller is then entitrely free to have field[op]() methods and call them if it sees fit.

--> for now, what about loadFieldItems(), deleteFieldItems()... / doLoadFieldItems(), doDeleteFieldItems() ?

- fieldPurgeData()
Ok, this is an exception - this is called from the outside, and is the only method in the interface that's not "handleX()".
This method is in charge of loading values to purge for a given deleted field on a given entity, call $item->delete() on them to let the field type properly do its cleanup, and then remove the values from storage definitively.
There is a handleFieldPurged() method on the interface, called when there is no data left to purge and the storage location for the field can be removed altogether (DROP TABLE...)

--> rename those to onFieldPurgeItems(), onFieldPurged() respectively ?

Note: In addition to the proposals above, I intend to do the following changes:

- have the storage controllers explicitly call fieldLoad(), fieldDelete()... (or loadFieldItems(), deleteFieldItems() if we rename) where appropriate in their CRUD cycle.
Right now, field[Op]() is called automatically inside StorageController::invokeHook(), because that is where HEAD currently wires the calls to field_attach_[op](). But this really makes no sense now, and is really confusing when following code execution ("so, at what point does the storage controller take care of field values ? oh, inside "invoke 3rd party hooks", sure").

- doing so, merge insertFieldItems() and updateFieldItems() into saveFieldItems()

yched’s picture

Issue summary: View changes

Updated issue summary.

Status: Needs review » Needs work

The last submitted patch, field_storage-1497374-211-FULL.patch, failed testing.

yched’s picture

Status: Needs work » Needs review
FileSize
3.21 KB
599.75 KB
511.71 KB

Doh, #211 removed one function too many.

yched’s picture

Rerolled, + as per the last paragraph in #212:
- call fieldOp() methods explicitly instead of within invokeHook()
- merge fieldUpdate() & fieldInsert() into fieldSave()

Bump @msonnabaum / @Berdir on the method rename proposal in #212 :-). Once we resolve this, I think we should be good to go.

Berdir’s picture

- onFieldSomething() is certainly better than handleFieldSomething(). The difference betwen on and do still isn't completely obvious but it's better. on reacting vs. actually doing something, that can be argued both ways round I think, telling the storage controller to "create whatever field storage you need to work" would be valid too :)

- Using FieldItems for the other methods work as well for me, wondering if loadFieldValues() would also be an option.

- The suggested renamed around fieldPurgeData() work as well for me. Although I'm stilling missing a pattern with the ending *d*, why on onFieldPurge*d*() but not onFieldCreate*d*(), if both are re-acting, then the field has already been created?

- The interdiff in #215 looks great :)

Berdir’s picture

Looked through most of the patch, noticed a few things, nothing big.

  1. +++ b/core/lib/Drupal/Core/Entity/FieldableEntityStorageControllerBase.php
    @@ -0,0 +1,280 @@
    +  public function handleFieldCreate(FieldInterface $field) { }
    

    As suggested above, I think we should not provide default implementations for those but force implementations to think about how and if they need to implement them..

  2. +++ b/core/lib/Drupal/Core/Entity/FieldableEntityStorageControllerBase.php
    @@ -0,0 +1,280 @@
    +      $definition = _field_generate_entity_field_definition($field, $instance);
    

    This is the only call to that function, should we inline it? A few more lines isn't going to matter that much in a 500kb patch ;) But I'm also fine to do that later.

  3. +++ b/core/lib/Drupal/Core/Entity/FieldableEntityStorageControllerInterface.php
    @@ -0,0 +1,116 @@
    + */
    +
    +
    +namespace Drupal\Core\Entity;
    +
    +
    +use Drupal\field\FieldInterface;
    

    Too many empty lines here.

  4. +++ b/core/modules/contact/lib/Drupal/contact/Tests/Views/ContactFieldsTest.php
    @@ -58,16 +60,11 @@ protected function setUp() {
    +    $this->assertTrue(empty($data), 'The field is not exposed to Views.');
    

    Nitpicky, but this could be a simple assertFalse($data), doesn't need the notice-safe empty() check.

  5. +++ b/core/modules/content_translation/lib/Drupal/content_translation/Form/TranslatableForm.php
    @@ -8,7 +8,8 @@
    +use Drupal\field\field as FieldInfo;
    

    field should be Field here I think, suprised hat this works...

  6. +++ b/core/modules/field/field.install
    @@ -98,6 +94,67 @@ function _update_8003_field_create_instance(array $field_config, array &$instanc
    + * Writes field data directly to SQL storage.
    + *
    + * @param string $entity_type
    + * @param string $bundle
    + * @param int $entity_id
    + * @param int $revision_id
    + * @param int $field_name
    + * @param array $data
    + *
    + * @ingroup update_api
    

    api doc stubs that should be documented.

  7. +++ b/core/modules/field/field.views.inc
    @@ -88,111 +106,105 @@ function field_views_field_label($field_name) {
    +  // Override Node to Content.
    +  $group_name = ($entity_info['label'] == t('Node')) ? t('Content') : $entity_info['label'];
    

    Hardly related but the node entity label already is Content, so this can be dropped?

  8. +++ b/core/modules/field/lib/Drupal/field/Entity/Field.php
    @@ -237,22 +220,18 @@ class Field extends ConfigEntityBase implements FieldInterface {
       public function __construct(array $values, $entity_type = 'field_entity') {
         // Check required properties.
    -    if (empty($values['type'])) {
    -      throw new FieldException('Attempt to create a field with no type.');
    

    Those kind of checks are usually in preCreate() but that's something for later.

  9. +++ b/core/modules/field/lib/Drupal/field/Entity/Field.php
    @@ -319,18 +305,20 @@ public function save() {
    -    if (drupal_strlen($this->id) > static::ID_MAX_LENGTH) {
    +    if (drupal_strlen($this->name) > static::NAME_MAX_LENGTH) {
    

    As you change the line anyway, switch to Unicode::strlen() ? Feel free to ignore ;)

    Just trying to prove that I'm actually reading through most of this ;)

  10. +++ b/core/modules/field/lib/Drupal/field/Entity/Field.php
    @@ -402,7 +375,8 @@ protected function saveNew() {
    -    $storage_controller = \Drupal::entityManager()->getStorageController($this->entityType);
    +    $entity_manager = \Drupal::entityManager();
    +    $storage_controller = $entity_manager->getStorageController($this->entityType);
    

    This looks a bit strange, we save $entity_manager into a separate variable only to get the storage controller from it a second time?

    Also, that saveNew() and saveUpdated() stuff really needs to be moved pre/postSave(), a fair amount of it unecessary like loading the original entity as discussed in the other issue ;)

  11. +++ b/core/modules/field/lib/Drupal/field/Tests/BulkDeleteTest.php
    @@ -198,19 +199,25 @@ function testDeleteFieldInstance() {
    +   * Test that the actual stored content didn't change during delete.
    +   *
    +   * @param FieldInterface $field
    

    missing namespace and description.

    TestS

  12. +++ b/core/modules/field/lib/Drupal/field/Tests/FieldUnitTestBase.php
    @@ -92,6 +97,16 @@ function createFieldWithInstance($suffix = '', $entity_type = 'entity_test', $bu
    +   * Save and reload an entity.
    +   */
    +  protected function entitySaveReload(EntityInterface $entity) {
    

    Missing @param and @return. Don't want anyone to block it on this in the end. Not that anyone is crazy enough to read through a 500kb patch ;)

  13. +++ b/core/modules/forum/forum.module
    @@ -394,6 +422,23 @@ function forum_node_insert(EntityInterface $node) {
    +          'tid' => $translation->taxonomy_forums->target_id,
    

    Keeps reminding me about the missing foreach loop here, not your problem ;)

  14. +++ b/core/modules/node/lib/Drupal/node/NodeStorageController.php
    @@ -9,6 +9,7 @@
     use Drupal\Core\Entity\EntityInterface;
    +use Symfony\Component\DependencyInjection\Container;
    

    Where's the use for the Container coming from here?

yched’s picture

FileSize
36.39 KB

Chasing HEAD was hell yesterday, but hopefully we're there.

So, regarding method names - I went with:

FieldableEntityStorageControllerInterface:
  onFieldCreate()
  onFieldUpdate()
  onFieldDelete()
  onInstanceCreate()
  onInstanceUpdate()
  onInstanceDelete()
  onBundleCreate()
  onBundleRename()
  onBundleDelete()
  onFieldItemsPurge()
  onFieldPurge()

FieldableEntityStorageControllerBase:
  loadFieldItems()
  doLoadFieldItems() (abstract)
  saveFieldItems()
  doSaveFieldItems() (abstract)
  deleteFieldItems()
  doDeleteFieldItems() (abstract)
  deleteFieldItemsRevision()
  doDeleteFieldItemsRevision() (abstract)
  readFieldItemsToPurge() (abstract)
  purgeFieldItems() (abstract)
  + empty implementations of the onFooBar() methods from the interface.

DatabaseStorageController:
  onFieldCreate()
  onFieldUpdate()
  onFieldDelete()
  onInstanceDelete()
  onBundleRename()
  onFieldPurge()
  doLoadFieldItems()
  doSaveFieldItems()
  doDeleteFieldItems()
  doDeleteFieldItemsRevision()
  readFieldItemsToPurge()
  purgeFieldItems()

Rationale:
- There are two naming trends in HEAD:
onFooBar() ("event-like") methods tend to be named onThingAction(),
methods in storage controllers tend to be named verbComplement(),
so the above sticks to that
- The verb() / doVerb() pattern for "external-facing wrapper method" / "implementation logic in subclass" is used in other places in core
- in FieldableEntityStorageControllerBase,
the doActOnFieldItems() methods are abstract because subclasses *have* to provide implementations or they're broken.
the onEvent() methods have a default empty implementation, because a given subclass only needs to implement what makes sense for them. Typically a document-based storage will leave most of those empty (and I expect most alternate storages to be document based...)

More generally, if the above is deemed somewhat acceptable, I'd advocate we try to restrict the bikeshedding here, I'd expect like at most a handful of alternate storage classes in D8, so nailing the absolute perfect DX is less critical than for, say, blocks or field formatters.

Attached is the interdiff for those changes. Next comments will answer the review points from #217 and include the actual patch.

yched’s picture

FileSize
6.86 KB

re @Berdir's review in #217 (thanks for that!):

1. See above
2. No, _field_generate_entity_field_definition() is used in other places in HEAD. It's our only way build a FieldItem object with a "made up" $instance. Hopefully we can do this without hacks at some point.
3. 4. 5. 6. Fixed
7. OK + lol, but let's fix this in a separate issue, other places in core, not touched by the patch, do this too. [edit; #2076141: Unneeded dance around 'Node'/'Content' entity label in views integration code]
8. interesting, will need some discussion, but out of scope here
9. Fixed
10. $entity_manager is used in other places in saveNew(). Moving to pre/postSave() is being worked on in #2020895: Move save() / delete() logic in Field / FieldInstance to [pre|post]Save(), [pre|post]Delete().
11. Fixed (inlined that helper in the only test that was using it)
12. Fixed
13. Didn't get that, but apparently that was not for me :-)
14. Indeed, fixed.

yched’s picture

Issue summary: View changes

Removes hook_field_attach_load()

yched’s picture

And the resulting patches.

As earlier: FULL is the patch for commit,
FULL = REVIEW +
- old storage hook documentations removed from field.api.php
- tests that still make sense from field_sql_storage adapted and moved to Drupal\system\Tests\Entity\FieldSqlStorageTest
- field_sql_storage.module removed
(REVIEW-to-FULL interdiff is provided)

Fingers crossed on this still being green...

yched’s picture

yched’s picture

Added to the OP:

NOTE TO COMMITTERS: please make sure @amateescu gets commit credits, he didn't post to this thread, but fixed a ton of fails in the helper issue. I think all other contributers have posted patches in the main issue.

webchick’s picture

Issue tags: +beta blocker

Because this would be nearly impossible to write a beta-to-beta upgrade path for, the core maintainers discussed making this one a beta blocker. Tagging.

yched’s picture

Issue summary: View changes

commit mentions

yched’s picture

yched’s picture

Updated/refined the OP with new method names & links to discussions that happened since the last update (field by entity type, method renames)

yched’s picture

Issue summary: View changes

Updated the OP

yched’s picture

Issue summary: View changes

explain REVIEW / FULL patches

yched’s picture

Issue summary: View changes

clarify table names

Berdir’s picture

Status: Needs review » Needs work
+++ b/core/lib/Drupal/Core/Entity/FieldableEntityStorageControllerInterface.php
@@ -41,9 +37,9 @@ public function handleFieldUpdate(FieldInterface $field, FieldInterface $origina
-   * @see fieldPurgeData()
+   * @see fieldPurgeData(onFieldItemsPurge

That looks like a copy & paste mistake.

Looks like the patch doesn't apply anymore, so can be fixed on re-roll. (Very weird conflict core/modules/file/lib/Drupal/file/Tests/FileFieldWidgetTest.php, ndde?!)

Doing a more detailed review of the

Berdir’s picture

  1. +++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.php
    @@ -528,34 +540,649 @@ protected function saveRevision(EntityInterface $entity) {
    +   * Implements \Drupal\Core\Entity\EntityStorageControllerInterface::getQueryServiceName().
    

    @inheritdoc.

  2. +++ b/core/lib/Drupal/Core/Entity/DatabaseStorageController.php
    @@ -528,34 +540,649 @@ protected function saveRevision(EntityInterface $entity) {
    +    if (!isset($vid)) {
    +      $vid = $id;
    +    }
    ...
    +          if (isset($vid)) {
    +            $revision_query->values($record);
    +          }
    
    +++ b/core/modules/field/field.install
    @@ -98,6 +94,73 @@ function _update_8003_field_create_instance(array $field_config, array &$instanc
    +      if ($revision_id) {
    +        $revision_query->values($record);
    +      }
    

    The behavior of the revisio n_id/vid is not consistent between the update helper and the actual implementation. The actual implementation uses isset(), the update helper does not, which would mean that it wouldn't write the records if the revision id is 0. Additionally, the actual implementation enforces that vid defaults to the id, so that isset() check there is never FALSE. I guess both should default revision id to id and remove the unecessary check. Because this would actually not just result in missing rows, it would result in a broken query, as it would try to execute it without values.

Currently testing the upgrade path. Seems to be working so far, but worried about the table names.
commerce_customer_profile_revision__fi11b4b8701c

so, an entity type with 26 characters leaves us with exactly 13 characters for the field name. Which means that it will explode completely for entity types >= 39 characters and has an increased of conflicts the closer we get.

What's the official limit for entity types now? And what are we going to do with the upgrade path of those that are longer?

chx’s picture

There is _truncateFieldTableName to protect but for some odd reason it hashes the field_uuid instead of the full table name. I believe that's a mistake. We need the full table name in the hash. Because otherwise an entity type with 38 characters would cause the revision and data table to clash. To avoid that, we should make _truncateFieldTableName compile the table name: potentially truncate the entity type, add a $suffix (which would be set to "revision__$field_name" by the caller) and then the hash of the whole thing if there was truncation. I pushed a version of this to the sandbox.

yched’s picture

Status: Needs work » Needs review
FileSize
5.26 KB
603.54 KB
515.49 KB

- Rerolled
- #226: fixed
- #227.1: we're actually not touching that method, it being part of the patch is only a diff effect :-), but sure, fixed (there are a lot of other "should be {@inheritdoc}" in the class)
- #227.2: Good catch - not strictly related though, this code (update helper & content of doSaveFieldItems()) is just copied from existing code in field_sql_storage.module, so I guess HEAD has this same issue. OK, fixed.
- #227.3: @chx pushed a change in the 'truncation algorithm": In case of "[entity_type]__[field_name]" being too long, truncate the [entity_type] part in addition to the previous behavior using a 10 chars hash of the field UUID for the [field_name] part. Updates will be fine, the same algorithm runs.

yched’s picture

#229 crossposted with #228.
I didn't realize that chx's code was hashing the table name instead of the field UUID, so the comments & var names I added were off.
Minor, but I'd rather stick with a hash of the field UUID, since that's what we do for "data tables for deleted fields". There can be no clash, since we truncate the entity type, add a different separator for "current" vs "revision" table, and field UUID cannot appear in different entity types.

Attached cumulative interdiff for this "table name generation" code since #224.

yched’s picture

More fixes after @berdir's remarks on IRC:
- unresolved commit conflict sneaked in the middle of comments - no runtime impact, but was re-adding a method that is now gone in HEAD.
- unneeded field_cache_clear() was added in a test
- generating the FULL patch with "git diff -M25" properly recognizes former field_sql_storage tests as a move / rename.

Berdir’s picture

Status: Needs review » Reviewed & tested by the community

Ok...

I haven't looked at every single detail, but I've looked through it, with a more detailed final review, so I think I checked most of it. This might not be perfect, but I couldn't find anything that's worse than before. Still, additional checks would be very welcome, this a considerable change.

There are a few things where we need follow-ups, some exist already:
- Clean up the code in Field.php, correctly use ->original instead of re-inventing the wheel. There already is an issue and it was like that before.
- Move the mentioned constants to the Interface. Maybe we can also move some classes or interfaces (e.g. for Field entities, FieldInfo) to core/lib to avoid hard dependencies on field.module.
- We need a follow-up to stop relying on the BC decorator in the save/load/delete storage code.
- We should have unit test coverage for _generateFieldTableName() (and other things, when possible), as this issue has shown, it's not trivial code, so it would be good to hit it with all kinds of combinations and long fields and entity type names to verify that there can be no clashes or other bugs.

Let's get this in, assuming that the testbot is happy.

plach’s picture

I will have a look later tonight if you want to wait for me.

Berdir’s picture

Oh, forgot to mention, I also tested the 7.x upgrade path manually on a site with ~100 different fields and 270 instances, on different entity types. Couldn't find any problems related to this (various others, of course..), apart from the possible conflict in table names with very long entity type names, that has ben addressed.

xjm’s picture

Assigned: Unassigned » catch

Discussed with @webchick, @effulgentsia, and @Dries -- we agreed it would be great if @catch could look at this one.

catch’s picture

OK I didn't make it through every line of the patch but the new interface and abstract class look fine and the move from field_sql_storage looks OK as well.

What I didn't see in the patch (and excuse me if it's there and I somehow missed it), was tests for the case where a field is shared between entity types in 7.x, to ensure the data ends up in the right place for both entity types in 8.x.

If those tests are missing then I guess it could be a critical task follow-up to add them but that seems like the likeliest place this could go horribly wrong. As Mark said the API for field/entity storage backends is only implemented by one module so we've got some flexibility to fix niggles there if something came up.

I'm also wondering a bit why we bother migrating deleted fields - could we not instead require that sites purge any deleted fields prior to upgrade and have a warning in update requirements if any are found? That would also save querying the state system in the upgrade path.

yched’s picture

tests for the case where a field is shared between entity types in 7.x

True, that is not in the test suite currently.
- I think @Berdir manually tested that it works (#234)
- I'd be +1 on adding those tests in a followup to stop patch size bleed + reroll hell :-)

re: migrating deleted fields - sounds like a discussion for a separate issue, supporting those is already part of the current upgrade path in HEAD (including the "querying state" part)

catch’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +Avoid commit conflicts

I wasn't clear from #234 whether Berdir's site includes fields shared between entity types.

I've opened #2078505: Don't migrate deleted fields in the upgrade path and #2078507: Add explicit upgrade path tests for fields shared between entities. However this patch doesn't apply any more so I couldn't commit/push it.

catch’s picture

Status: Needs work » Needs review
Issue tags: -API change, -beta blocker, -Field API, -Avoid commit conflicts, -Entity Field API, -Approved API change

Status: Needs review » Needs work
Issue tags: +API change, +beta blocker, +Field API, +Avoid commit conflicts, +Entity Field API, +Approved API change

The last submitted patch, field_storage-1497374-231-FULL.patch, failed testing.

yched’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
600.64 KB

Reroll - there seemed to be no conflicts since the last merge, but possibly some fuzz, yes.

yched’s picture

Sorry, here it is rolled with diff -M25 like last time, so that we don't freak out on patch size difference.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, field_storage-1497374-242-FULL.patch, failed testing.

yched’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
588.33 KB

phpstorm, why u tell me no conflict markers when I search for them :-( ?

chx’s picture

Now that we are so, so close: please make sure @amateescu gets commit credits, he didn't post to this thread, but fixed a ton of fails in the helper issue.

yched’s picture

Yup - I mentioned that in #222 and added it to the top of the OP, but it never hurts to remind it:
Do not forget @amateescu in the credits :-)

yched’s picture

Issue summary: View changes

minor

dawehner’s picture

Manually testing the basic views bit worked 100% fine. Congratulations for this amazing piece of work.

plach’s picture

I was finally able to review the whole patch, overall it's truly awesome I must say :)

I found almost nothing to complain about, just some very minor nits you can find in the attached interdiff. They can be incorporated if a new patch is rolled (or just ignored ;) I manually tested the code switching field translatability, which is not test covered since it's going away soon, and everything worked flawlessly.

My final impressions are mostly the same of my earlier review: those use statements introducing an explict dependency between the entity system and the Field module IMHO deserve a critical follow-up to be cleaned-up, unless they are already covered by existing issues.

There's only one thing that I'd really wish to see changed: currently the views integration code of the Field module treats every class extending the DatabaseStorageController as a SQL implementation. IMHO this is an unneeded limitation, as already pointed out above by @fago: should someone need to write an alternative SQL storage controller, she would be forced to extend
the core one with all its assumptions about the underlying implementation. I'd suggest instead to provide an interface, e.g. SqlStorageControllerInterface and make DatabaseStorageController implement it. The current pseudo-private static methods could be refactored and moved onto that interface, becoming regular methods, and then the views integration code could just check whether the storage controller implements SqlStorageControllerInterface.

I realize that those methods were not made part of the public API intentionally, but I think that since there are legal use cases for needing to retrieve some details about the storage implementation (Views integration being the most obvious), we should provide an actual API that code needing to deal with SQL can properly rely on. Doing the opposite IMHO will just "authorize" people to find the most sneaky ways to circumvent the Entity Storage API. I think a SqlStorageControllerInterface marker interface would be enough for this issue, as I guess that a proper API allowing to describe any SQL table layout would not use the static methods in the current form. Refactoring those would totally be a follow-up.

Finally a reminder for committers:

Do not forget @amateescu in the credits :-)

catch’s picture

Title: Switch from Field-based storage to Entity-based storage » Change notice; Switch from Field-based storage to Entity-based storage
Assigned: catch » Unassigned
Status: Reviewed & tested by the community » Active
Issue tags: +Needs change record

Committed/pushed to 8.x!

I included plach's interdiff in the commit, and credited amateescu ;)

chx’s picture

Status: Active » Needs review

The change notice is based on the issue summary and is here https://drupal.org/node/2078765

Berdir’s picture

Title: Change notice; Switch from Field-based storage to Entity-based storage » Needs follow-up; Switch from Field-based storage to Entity-based storage
Status: Needs review » Active
Issue tags: -Needs change record, -Avoid commit conflicts +Needs followup

Change notice looks good to me.

Tagging as needs followup to create the discussed follow-ups.

Berdir’s picture

Issue summary: View changes

clarified commit message instructions

plach’s picture

Issue summary: View changes

Updated issue summary.

Gábor Hojtsy’s picture

Woot, thanks all involved!

catch’s picture

@berdir - is this the comment with the follow-up issues we need?

https://drupal.org/node/1497374#comment-7810795

Berdir’s picture

Title: Needs follow-up; Switch from Field-based storage to Entity-based storage » Switch from Field-based storage to Entity-based storage
Status: Active » Fixed
Issue tags: -Needs followup
Berdir’s picture

Issue summary: View changes

added comment http://drupal.org/node/1497374#comment-7680855 to followup issues

amateescu’s picture

LOL! thanks guys :)

plach’s picture

@Berdir:

Awesome, thanks!

sebas5384’s picture

hey this is great! I was following the issue a time ago and I thank you guys for this amazing work.

andypost’s picture

This patch introduces changes that seriously affects #731724: Convert comment settings into a field to make them work with CMI and non-node entities
So we started a new branch #2079093: Patch testing issue for Comment-Field
The primary trouble that we used $field_name as bundle for comment entity but now field name is not unique but there's no ability to to use 2 fields for bundle - if there's any suggestions please comment in #2079093: Patch testing issue for Comment-Field

yched’s picture

@plach #248:

those use statements introducing an explict dependency between the entity system and the Field module IMHO deserve a critical follow-up to be cleaned-up, unless they are already covered by existing issues

Created #2079427: Core/Entity depends on classes / functions from field.module about this topic - in short: yes, but nothing new :-/

plach’s picture

yes, but nothing new :-/

I just wanted to be sure there was an issue for that: a meta is even better, thanks :)

plach’s picture

Issue summary: View changes

Added follow-ups.

yched’s picture

yched’s picture

Issue summary: View changes

one forgotten followup

yched’s picture

yched’s picture

Followup: #2086095: Remove remaining references to field_sql_storage (including a broken behavior on bundle renames)

yched’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

yched’s picture

fago’s picture

drumm’s picture

Issue summary: View changes

followup

chriscalip’s picture

I am following up on this issue, 4 years after this got closed and committed to core. This commit took out field_attach api, and this has been a pain to us doing drupal 8 backend.

We no longer have the ability to do partial loads of field/fields on multiple nodes.
http://timonweb.com/posts/loading-only-one-field-from-an-entity-or-node-...
No longer possible :
http://drupal.stackexchange.com/questions/199268/how-load-only-some-node...

We no longer have the ability to partial updates of field/fields data on nodes.
https://www.urbaninsight.com/2011/10/24/saving-nodes-fields-without-savi...
http://drupal.stackexchange.com/questions/64833/fast-saving-single-field...

Try updating field values on > 1k nodes on a drush script.
I am comparing drupal 7 scripts using field_attach_load and field_attach_update, versus, drupal 8 $node = node_load($nid), $node->set('field_name', $field_values), $node->save() . Against real world data address fields, geocoder events for geofields, etc.. Drupal 8 workflow is magnitudes slower and memory intensive than a comparable drupal 7 workflow with field_attach api. This is a mess!!

-1, on this commit because taking out field_api is just a pain.