I've been reading through all the issues and some other information (e.g. the N-ary document) in order to get an idea about the concept you're heading for. But there are some things which are not yet completely clear for me. First of all, the data storage.

A few concept choices are made, I guess. A relation is an fieldable entity. No discussion anymore, right?
Nodes and users are also fieldable entities. They are configured with the base tables 'node' and 'user'. These tables contain the basic information. Extra fields are stored in separate tables of the form 'field_data_field_*'. When I look into the code, I find a base table 'relation' in relation_entity_info, however not used/implemented. Currently the relations are stored in fields. Every 'relation type' get its own 'field_data_field_*' table. So I get a little bit confused.

I tried to follow the reasoning behind this choice in #981364: Re-use field storage for relation data and more specifically in http://drupal.org/node/981364#comment-3778106.
As far as I understand, I find 3 reasons to store relation data in fields:

  • Query performance, in case we need to query all entries for a specific relation type.
  • The field schema isn't to different from the relation schema because it already has entity_id and entity_type.
  • Field values already have a built-in concept of language.

In my thinking I tried to take the concept which is used for nodes towards relations.
About query performance. What if we need to query all the relations? Do we then need to query all relations in all tables? We query all nodes too, right? So, why not storing the relations in the relation table?
About the field schema. Of course the field schema has the entity_id and entity_type. This is inherent to fields. They are attached to an entity (node, user, ...). And these values can be used for relations, but only for the extra fields of relations. And then the entity_id and entity_type will refer to the relation entity, which needs to be in the general table. It is not because the fields have the same name, the usage is the same.
About the language. The field API provides language possibilities, the Drupal entity API not. But isn't this something which can be done via the entity module (http://drupal.org/project/entity)? Further, it is done fore nodes, so it is possible to do for relations.

So, I would suggest to follow the concept of the other entity implementations (nodes, users, ...) as much as possible. This would be good towards interaction with other modules.
This means basically 3 column types: relation, relation_revision & field_data_field_*.
The relation table could then also contain fields like 'relation_type', an optional 'title', a 'created' and 'changed' date, ... fields for which I see possibilities in future applications. In the future, a table 'relation_type' can make sense as other modules are going to create specific relations.

But perhaps I'm still missing something. I have some ideas about use cases and user interface too, but I would like to get the data storage concept right first. So, can you point me the direction that you're heading for?

Regards,
Frank

Comments

scroogie’s picture

I thought its implemented this way. You have relation as the primary table, and the specific data is saved by the FieldAPI in the field_data_field_X table. It says that FieldAPI should already handle revisions. So actually relation follows the core model. It's just not ready yet.

Firewolf’s picture

That is just my point. The specific data which is now saved in field_data_field_X should actually be saved in the primary table relation. If there is data to be saved in a field_data_field_X table, then it should be data from a field attached to the relation (fieldable entity), e.g. a weight, a priority or any other information value which is a kind of attribute of the relation.

naught101’s picture

I've been absent for more time that I intended, so sorry that this stuff hasn't been getting much attention.

The relation data was added to field data partly so that we wouldn't need another table, and partly so that we wouldn't have to deal with revisions. The relation_table is partly a hangover from the previous implementation, and partly left there incase we decide that relations are entities after all, and should contain more information (which is, I think, what the original issue discusses).

ginc’s picture

To make it easy to understand lets figure out: What are our options for data storage?

then we will figure out their pros and cons

Shadlington’s picture

Issue tags: +1.0 blocker

Tagging

joe.murray’s picture

I'd like to come at this from a database perspective. Supporting foreign key relations in core got punted from D7 to D8. Currently, the foreign keys exist in a schema as documentation though apparently Ctools will look at that information. Is this project aiming to provide support for a foreign key field (that makes sense to me) or to support creating a foreign key to a new table that has a foreign key (this is not the use case I am interested in). Making a relation fieldable, so far as my limited understanding of D7 goes, means that you are doing something more along the lines of the second approach. If you are not really focused on providing relation support to schema in DBTNG, can you point me in the right direction of efforts to make that happen?

(My interest is in enabling high volume data import and export with databases of complex high volume transactional applications that are integrated with Drupal but likely reside in a different database supported by D7's multiple DB connection functionality. Very roughly, I'd like to build out in D7 something equivalent to D6 Table Wizard and Views Migrate for bulk operations).

Use case: There are entities for persons, organizations, parties and donations; people and organizations can make one or more donations to one or more parties. Each donation comes from one and only one person or organization and is directed to one and only one party. Initially, we just need to be able to determine for any donation, which party it went to, and for each person or organization, which donation(s) they made. This requires the following relations, where a relation is understood as equivalent at the database level to a single foreign key:
person FK to donation
organization FK to donation
donation FK to party
So far so good from my perspective. I want to work on a project that implements this kind of support for schema relations.

For functional or performance reasons it might be that we also want it to be convenient to find the single (person or organization) that made the donation. This could be implemented in several ways. A simple way is just to create two relations in donation:
donation FK to person (nullable)
donation FK to organization (nullable)
This doesn't capture the notion that one and only one of them can be non-null at a time. Code could be used rather than the DBMS to maintain this constraint. For some DBMSes, this code could also be pushed into stored procedures/triggers, etc.

A second way would be to have
donor table name (non-null, restricted to either person or organization)
donor table id
The difficulty here is that MySQL and probably other DBMS can't maintain foreign key constraints at this point.

naught101’s picture

Big long discussion condensed: will be moving away from field storage
See for propposal, comments welcome there: http://groups.drupal.org/node/135804

naught101’s picture

Status: Active » Closed (duplicate)