The current relation_data schema is

relation_id    entity_type   entity_id

and I wondered why it's not

relation_id    source_type   source_id   target_type   target_id

...until naught101 mentioned the possibility of n-ary relations; i.e., not a 1:1 relationship, but two to X entities in a single relationship.
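To make the n-ary point concrete, here is a minimal sketch in Python with an in-memory SQLite table. It is only a model of the three-column layout quoted above, not the module's actual schema code; the helper names and the sample rows are made up for illustration.

```python
import sqlite3

def build_relation_data(rows):
    """Create an in-memory relation_data table and load (relation_id, entity_type, entity_id) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE relation_data ("
        " relation_id INTEGER NOT NULL,"
        " entity_type TEXT NOT NULL,"
        " entity_id INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO relation_data VALUES (?, ?, ?)", rows)
    return conn

def members(conn, relation_id):
    """Return every (entity_type, entity_id) that takes part in one relation."""
    cur = conn.execute(
        "SELECT entity_type, entity_id FROM relation_data"
        " WHERE relation_id = ? ORDER BY entity_type, entity_id",
        (relation_id,),
    )
    return cur.fetchall()

# Hypothetical sample data: relation 1 is binary, relation 2 has three members.
conn = build_relation_data([
    (1, "node", 1), (1, "node", 2),
    (2, "user", 5), (2, "node", 7), (2, "node", 9),
])
print(members(conn, 2))
```

With one row per participant, a relation can have any number of members. A source/target layout would need extra columns or extra rows to hold the third member of relation 2.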

That sounds worth documenting.


Comments

naught101’s picture

The current schema also allows simple descriptors (one to none relationships), eg. the_earth:is_round

What I don't understand is that the rows don't appear to describe how each entity relates to the relationship, e.g. which is the subject and which is the object. Or is that in a different table?

chx’s picture

Because such a schema would be directional, whereas the current one is unidirectional. Same for #1. We do not have any directionality right now. We can add it if we want, but note that it's much easier to add a flag saying which one is which than to create a unidirectional module once you have a directional one (aka nodereference and backreference).

naught101’s picture

Flags make sense in the relation_data table. I can think of four possible flags: source, target, auxiliary (n-ary relations), and none/self (descriptors eg. node1:isPublished).
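A rough sketch of what such a flag could look like, written in Python rather than as a schema change. The `Role` enum, the `RelationRow` class and the `by_role` helper are hypothetical names for illustration; only the four flag values come from the comment above.

```python
from dataclasses import dataclass
from enum import Enum

class Role(Enum):
    """The four flags suggested above."""
    SOURCE = "source"
    TARGET = "target"
    AUXILIARY = "auxiliary"   # extra members of an n-ary relation
    NONE = "none"             # descriptors such as node1:isPublished

@dataclass(frozen=True)
class RelationRow:
    """One relation_data row plus the proposed role flag."""
    relation_id: int
    entity_type: str
    entity_id: int
    role: Role

def by_role(rows, relation_id, role):
    """Return the (entity_type, entity_id) pairs that play a given role in a relation."""
    return [
        (r.entity_type, r.entity_id)
        for r in rows
        if r.relation_id == relation_id and r.role is role
    ]

# Hypothetical rows: relation 1 is directional with one auxiliary member,
# relation 2 is a descriptor with a single member.
rows = [
    RelationRow(1, "node", 1, Role.SOURCE),
    RelationRow(1, "node", 2, Role.TARGET),
    RelationRow(1, "user", 7, Role.AUXILIARY),
    RelationRow(2, "node", 1, Role.NONE),
]
print(by_role(rows, 1, Role.SOURCE))
```

Rows without a meaningful direction can all carry the same flag, so the table stays usable for non-directional relations.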

Maybe the latter is out of the scope of the project. There is also another possibility: auxiliary values that aren't entities - eg. scalars.

I'm not sure if the latter few should even be in this table, since they're gonna be fields. Is that gonna be a separate table? if not, this table would probably have to get much more complex, right?

naught101’s picture

I don't understand why the relation table exists? If relation defines bi-directional relationships, and a directional predicate always has a logical opposite, would it make more sense to have the predicates in the relation_data table? That way you could have :

relation_id   entity_type   entity_id   predicate
1             node          1           hasFather
1             node          2           hasSon

Then again, maybe it's best to leave the predicate in the relationship_type table, or whatever it turns out to be called, and just have the reverse predicate as a field or something. See #981398: Bundle creation and UI

naught101’s picture

As sun pointed out in skype, the current schema has no way of supporting revisions. I don't think we can just let field API deal with the storage, because that makes the unidirectionality difficult to deal with, right? Maybe we can add a revision id to this table as well? And that would require a revision id in the relationship table as well, right?

sun’s picture

The lack of revisions worries me. Indeed, Field API already has a built-in concept for handling field revisions, but since we are actually not storing any field value (yet), the field's tables are basically empty (except for Field API's internal values).

Of course, one option would be to simply duplicate (i.e., store an unused copy of) the relation data in the field's values.

However, in an ideal world, we'd directly use the field storage as primary storage for the relation data, but I'm not sure whether that is possible to do. Technically, the schema isn't too different -- entity_id already exists, entity_type basically exists via etid (entity type ID) already, but could as well be added as-is to the field's schema, and relation_id would definitely be a relation field's schema column.

Since relation types will be (bound to) fields anyway, this really sounds worth investigating.

It's not only about revisions, but also about query performance. If the data for a specific relation type is contained in a dedicated table (which exists anyway), then lookup queries are going to be faster. Furthermore, field values already have a built-in concept of language; i.e., users would additionally get the possibility of different relations per language for free.

chx’s picture

  1. One column, a relation id
  2. You need to attach the relation field to all bundles it relates to
  3. When relating to another entity we load the entity, change the field and save it back (entity API)
  4. There is an instance setting for whether it's "many" or not, indicating whether to load (if you have a million nodes on a term, don't load)
  5. If you need to figure out whether entity A and B are related, and both can be loaded, then load entity A and entity B and diff their field values. If one of them cannot be loaded, then fire a storage engine hook that we will implement for SQL and that will do the current join.
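The "load both and diff their field values" part of step 5 could be sketched like this. It is a Python model, not Drupal code: entities are plain dicts, and `field_relation`, `relation_ids` and `shared_relations` are hypothetical names chosen for the example.

```python
def relation_ids(entity, field_name="field_relation"):
    """Collect the relation ids stored in one entity's relation field items."""
    return {item["relation_id"] for item in entity.get(field_name, [])}

def shared_relations(entity_a, entity_b, field_name="field_relation"):
    """Two loaded entities are related if their field values share a relation id."""
    return relation_ids(entity_a, field_name) & relation_ids(entity_b, field_name)

# Hypothetical loaded entities: each field item holds just a relation id.
john = {"field_relation": [{"relation_id": 1}, {"relation_id": 2}, {"relation_id": 3}]}
mary = {"field_relation": [{"relation_id": 1}]}
pietr = {"field_relation": [{"relation_id": 3}]}

print(shared_relations(john, mary))    # relation ids that john and mary have in common
print(shared_relations(mary, pietr))   # an empty set means "not related"
```

The fallback for entities that should not be loaded (the "many" case) is not modelled here; that would be the storage-level join mentioned in step 5.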
chx’s picture

We can remove the entity-ness and piggyback on the combofield/multigroup/whatnot http://drupal.org/node/939836

Oh and this solves the bundle discovery problem we had -- you can only relate to bundles the user already attached the relation to.

naught101’s picture

FileSize
2.09 KB

baby steps.

naught101’s picture

FileSize
4.65 KB

hook_field_presave() implementation

naught101’s picture

Status: Active » Needs work
FileSize
5.92 KB

This just stops the previous patch from breaking on page loads (there's some empty item in there that I don't understand..). Too braindead right now. will pick it up again tomorrow.

chx’s picture

a) i think relation_field_is_empty should check for relation_id b) the widget should use #type value not #type hidden

sun’s picture

Status: Needs work » Needs review
FileSize
10.18 KB

Quite mind-blowing. Not much better, but at least some progress.

sun’s picture

Sleeping over this, I realized that we still don't have proper field data revisions with this, but we can fix that easily (and possibly simplify even more code).

Problem space:

  1. Add a donation field to entity types company and party.
  2. Add a relation to a company entity, referencing a party.
  3. The stored field data for the company does not store any info about the party. We just have a relation_id.
  4. If I edit the company, and change the relation to reference a different party, that change is not contained at all in the company.

The field data does not contain any information about what has been referenced. Likewise, the field data revisions don't contain that either.

AFAICS, this is easily resolvable by adding entity_type and entity_id to the field schema of the relation field type.

Thus, each field $item in an entity fully describes the relation.

Actually, it even looks like that would additionally simplify our field loading -- potentially even removing the need for that crazy double-self-join.

sun’s picture

When ignoring n-ary relationships, then there wouldn't be a change in storage:

       etid     entity_id       field_relation_id
john:  1        1               1
john:  1        1               2
john:  1        1               3
mary:  1        2               1
house: 1        3               2
pietr: 1        4               3

       etid     entity_id       field_relation_id     field_entity_type   field_entity_id
john:  1        1               1                     user                2
john:  1        1               2                     user                3
john:  1        1               3                     user                4
mary:  1        2               1                     user                1
house: 1        3               2                     user                1
pietr: 1        4               3                     user                1

As visible, each field item would fully store the entered information. Thus, every field revision would be self-contained, allowing admins to figure out (diff) the actual change between revisions.
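A small Python sketch of why self-contained items make revisions diffable. The item layout mirrors the columns in the second table above; `item_key`, `id_only` and `diff_revisions` are hypothetical helper names, and the two revisions are made-up sample data.

```python
def item_key(item):
    """Identify a field item by everything it stores about the relation."""
    return (item["relation_id"], item["entity_type"], item["entity_id"])

def id_only(item):
    """Identify a field item by relation id alone, as when only relation_id is stored."""
    return (item["relation_id"],)

def diff_revisions(old_items, new_items, key=item_key):
    """Return the items removed and added between two field revisions."""
    old = {key(i) for i in old_items}
    new = {key(i) for i in new_items}
    return {"removed": sorted(old - new), "added": sorted(new - old)}

# Hypothetical revisions: relation 1 is edited to point at user 3 instead of user 2.
old_rev = [{"relation_id": 1, "entity_type": "user", "entity_id": 2}]
new_rev = [{"relation_id": 1, "entity_type": "user", "entity_id": 3}]

print(diff_revisions(old_rev, new_rev))               # the change is visible
print(diff_revisions(old_rev, new_rev, key=id_only))  # the change is invisible
```

When only the relation id is compared, the two revisions look identical, which is the problem described in the "Problem space" list above.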

I'm aware that this imposes problems. Food for thought, and probably, a conf call.

naught101’s picture

Other possibility, with fielded relations.

            |entity_type  |entity_id  |field_relation_id  |field_entity_type  |field_entity_id
john:       | user        | 13        | 1                 | user              | 22
mary:       | user        | 22        | 1                 | user              | 13
relation1:  | relation    | 1         | 2                 | node              | 10
house:      | node        | 10        | 2                 | relation          | 1
relation1:  | relation    | 1         | 3                 | user              | 31
pietr:      | user        | 31        | 3                 | relation          | 1

This obviously fucks up the relation_id concept a bit, but it wouldn't be hard to add a top level relation id, parent_id perhaps (which would just be = 1 in this example). This shouldn't even be too hard for more than one level of relation hierarchy (which I think should never, ever happen. At least not without some kind of punishment).

naught101’s picture

FileSize
9.61 KB

with stuff added to schema. Works without hook_load(), but the presave isn't working..

naught101’s picture

FileSize
10.02 KB

This fixed some stupid mistakes, and definitely puts the field data into the target entity correctly, but it's not saved to {field_data_[field_name]}.

Do we need to do something more than just save it? Trigger field storage somehow?

naught101’s picture

FileSize
9.64 KB

This, combined with the workaround at #988780: Merge both modules into one causes an infinite loop
[entity1]_save() -> hook_field_presave() -> [entity2] -> hook_field_presave() -> [entity1] -> etc.

my loop check doesn't work, I guess because it never actually saves either entity. So we might need to pass a recursion depth flag, but I'm not sure where.

naught101’s picture

Huh. A recursion-depth flag won't work, because if you have other relations attached to the target, they will recurse (separately) on save, so there's still a potential for very large loops. possibly infinite, not sure.

naught101’s picture

FileSize
9.68 KB

ok, this works. set $target->recursion, and then just don't do anything if that's set in $entity next time around.
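The guard can be modelled roughly as follows. This is a Python sketch, not the patch itself: `Entity`, `save` and `presave` are stand-ins for the entity save functions and hook_field_presave(), and clearing the flag after it has been seen is an assumption of this sketch, not something stated above.

```python
class Entity:
    """Minimal stand-in for an entity that carries a relation field."""
    def __init__(self, entity_type, entity_id):
        self.entity_type = entity_type
        self.entity_id = entity_id
        self.relations = []     # the other entities this one is related to
        self.recursion = False  # set by the other side before it saves us
        self.save_count = 0

def save(entity):
    """Stand-in for [entity]_save(): run presave, then 'write' the entity."""
    presave(entity)
    entity.save_count += 1

def presave(entity):
    """Stand-in for the presave hook: mirror each relation onto the other entity."""
    if entity.recursion:
        # This save was triggered by the other side of the relation: stop here.
        entity.recursion = False  # assumption: reset so later saves behave normally
        return
    for other in entity.relations:
        if entity not in other.relations:
            other.relations.append(entity)
        other.recursion = True
        save(other)

# Hypothetical pair: saving the node must save the term once and then stop.
node = Entity("node", 1)
term = Entity("taxonomy_term", 2)
node.relations.append(term)
save(node)
print(node.save_count, term.save_count)
```

Because the flagged entity returns before looping over its own relations, its other relations are not re-saved either, which avoids the wider loops a depth counter would still allow.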

works for nodes, but apparently not for users... hrm..

naught101’s picture

FileSize
9.8 KB

fix a couple of mistakes, make code more self documenting.

Pretty sure user_save() doesn't accept $user objects with programmatically added fields. this may be a problem.

works perfectly with taxonomy terms and nodes.

sun’s picture

So here we go! :)

Note: db_next_id() still has to be moved from the field widget code into the field presave hook. We're needlessly incrementing IDs whenever the widget is displayed currently ;)
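The difference can be shown with a toy counter. This is a Python model under stated assumptions: `IdAllocator` stands in for db_next_id(), and `render_widget` and `presave` are hypothetical stand-ins for the widget code and the presave hook.

```python
import itertools

class IdAllocator:
    """Toy stand-in for a database-backed id sequence."""
    def __init__(self):
        self._counter = itertools.count(1)
        self.last_issued = 0

    def next_id(self):
        self.last_issued = next(self._counter)
        return self.last_issued

def render_widget(item):
    """Displaying the form only reads the item; it never allocates an id."""
    return {"relation_id": item.get("relation_id")}

def presave(item, allocator):
    """Allocate an id only when the item is saved and has none yet."""
    if not item.get("relation_id"):
        item["relation_id"] = allocator.next_id()
    return item

allocator = IdAllocator()
item = {"other_entity_id": 2}
for _ in range(3):
    render_widget(item)        # three page views, no ids consumed
presave(item, allocator)       # first save allocates an id
presave(item, allocator)       # saving again keeps the same id
print(item["relation_id"], allocator.last_issued)
```

Allocating at save time means ids are consumed once per stored relation rather than once per form display.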

sun’s picture

Fixed db_next_id() issue, which directly fixes a data integrity problem.

sun’s picture

Slightly fixed field schema column definitions.

sun’s picture

Killed relation_get_possible_targets(), since field instance information is fully available in $field already.

sun’s picture

Title: relation_data schema reasoning and documentation » Re-use field storage for relation data
Status: Needs review » Fixed

Intensively reviewed and discussed with naught101, and we think that this is a big milestone in the progress already, so...

Thanks for reporting, reviewing, and testing! Committed to HEAD.

A new development snapshot will be available within the next 12 hours. This improvement will be available in the next official release.

aidanlis’s picture

Awesome! Well done guys!

sun’s picture

Status: Fixed » Needs review
FileSize
1.45 KB
naught101’s picture

Status: Needs review » Reviewed & tested by the community

yep, fine for now. we need to discuss #989264: field columns are VERY confusing as well though

sun’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
5.74 KB

Untested.

sun’s picture

Other FTW! ;)

sun’s picture

s/target/other/

naught101’s picture

FileSize
8.09 KB

change empty check back to other_entity_id because relation_id is empty, doesn't get filled until presave.

also, add relation_id to $item by reference, otherwise SQL error.

naught101’s picture

FileSize
7.82 KB

once more without debugging stuff..

naught101’s picture

Status: Needs review » Fixed
naught101’s picture

Status: Fixed » Needs review
FileSize
932 bytes

Damn. last patch introduced a bug - the last &$item in the for loop stays set, and so gets garbled later. Changed the name of the later variable to $other_item, but perhaps the reference should be unset/not used in the first place?

naught101’s picture

Status: Needs review » Fixed
sun’s picture

Status: Fixed » Reviewed & tested by the community
FileSize
1.05 KB

The committed code differed from the patch here, and contained a coding style flaw.

naught101’s picture

Status: Reviewed & tested by the community » Fixed

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.