I'm a bit unclear on exactly how getter/setter/export/import functions are supposed to work with non-scalar (i.e. object and array) values. The only field type as of 6.x-1.0-beta2 to have all four of these is $node->uid, so I've not got a large sample size to extrapolate rules from.

Going by the uid example, where the getter function returns a full user object, the intention seems to be for getter functions to output as much relevant data as possible. This makes sense in terms of usefulness, but could have performance issues when dealing with exporting squillions of entities. Personally I think flexibility is more important than performance, so I'm happy with this.

The uid setter function will only accept a user object. That also makes sense, as you should be able to take the output of a getter function and give it to a setter function. However, there are lots of instances in Drupal where the content of a field can be either an array of properties or a unique identifier. Moreover, these often refer to entities which are commonly expressed as objects (e.g. when retrieved from *_get() functions). I don't think it's too much to expect a setter function to accept a scalar value (assuming it to be a unique identifier), array, or object.
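To make that last point concrete, here is a minimal sketch of the normalization a forgiving setter could do before storing anything. It is not Field Tool code: example_normalize_uid() is a hypothetical helper, and it assumes the identifier lives under a 'uid' key or property.

```php
<?php
// Hypothetical helper (not part of Field Tool): reduce whatever the caller
// supplied to a numeric uid, or NULL if no identifier can be found.
function example_normalize_uid($value) {
  // A scalar is assumed to be the unique identifier itself.
  if (is_numeric($value)) {
    return (int) $value;
  }
  // Objects (such as a loaded user) are handled the same way as arrays.
  if (is_object($value)) {
    $value = (array) $value;
  }
  if (is_array($value) && isset($value['uid']) && is_numeric($value['uid'])) {
    return (int) $value['uid'];
  }
  return NULL;
}

// Usage: all three forms end up as the same integer.
$account = new stdClass();
$account->uid = 7;
$from_scalar = example_normalize_uid('7');
$from_array = example_normalize_uid(array('uid' => 7, 'name' => 'bob'));
$from_object = example_normalize_uid($account);
```

A setter that wants the full user object could then load it from the normalized uid, so the round trip from getter to setter still works.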

The description of fieldtool_export() (comment in fieldtool.module) says "Extract the value of a field for exporting to a serialized variable." To me that says it gives you all the data that the getter function would give you (perhaps structured differently) as a serialized object/array. However the uid export function just spits out the numeric uid. This doesn't make sense to me, as in cases where data is being exported to systems other than Drupal, Drupal uids don't mean anything.

Import callbacks "are supposed to accept the same type of data that was returned by fieldtool_export() (although they might accept other kinds of data as well)." Personally, I think they should be as flexible and forgiving as possible. For instance, in the case of taxonomy terms (which I'm working on) an import function should:

- if the value supplied is numeric, look for a term with that tid. If none is found, try to find a term with that name in the vocabularies that apply to the current entity (clutching at straws).
- if the value is an object/array, look for a 'tid' value and put the data from taxonomy_get_term() into the field.
- if the value is a non-numeric string, try to find a term with that name in the vocabularies that apply to the current entity, and import the first term found (assuming whoever's importing the data has adopted a policy of enforcing unique term names to enable importing like this to work sensibly).
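The three rules above could be sketched roughly as follows. This is only an illustration under stated assumptions: the $example_terms array stands in for the taxonomy tables, and example_term_by_tid(), example_term_by_name() and example_term_import() are hypothetical names. A real plugin would use taxonomy_get_term() and restrict the name search to the vocabularies that apply to the entity.

```php
<?php
// Stand-in for the taxonomy tables (assumption): tid => term name.
$example_terms = array(3 => 'Fruit', 5 => 'Vegetables', 9 => '1984');

// Hypothetical lookup by tid; returns a term array or NULL.
function example_term_by_tid($tid, array $terms) {
  $tid = (int) $tid;
  return isset($terms[$tid]) ? array('tid' => $tid, 'name' => $terms[$tid]) : NULL;
}

// Hypothetical lookup by name (case-insensitive); the first match wins.
function example_term_by_name($name, array $terms) {
  foreach ($terms as $tid => $term_name) {
    if (strcasecmp($term_name, (string) $name) === 0) {
      return array('tid' => $tid, 'name' => $term_name);
    }
  }
  return NULL;
}

// Forgiving import: accepts a tid, a term name, or an array/object with 'tid'.
function example_term_import($value, array $terms) {
  if (is_object($value)) {
    $value = (array) $value;
  }
  if (is_array($value)) {
    $has_tid = isset($value['tid']) && is_numeric($value['tid']);
    return $has_tid ? example_term_by_tid($value['tid'], $terms) : NULL;
  }
  if (is_numeric($value)) {
    // Try the value as a tid first, then fall back to treating it as a name.
    $term = example_term_by_tid($value, $terms);
    return $term !== NULL ? $term : example_term_by_name($value, $terms);
  }
  if (is_string($value)) {
    return example_term_by_name($value, $terms);
  }
  return NULL;
}

$by_tid = example_term_import(3, $example_terms);
$by_name = example_term_import('fruit', $example_terms);
$numeric_name = example_term_import('1984', $example_terms);
```

In this sketch '1984' is not a known tid, so the numeric branch falls through to the name lookup and finds term 9, which is the "clutching at straws" case.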

What have I completely misunderstood or overlooked?

Comments

Matthew Davidson

Ah, actually now I think of it, CCK nodereference and userreference getter/setter callbacks only work with nids/uids, not the complete node/user object as in $node->uid. This is a good thing, as you can get into a recursive loop with nodereferences, but it is an inconsistency. In D7 even returning the full user object as the node author will be a problem, as users could have userreference fields.

Maybe the getter callback should return just a unique identifier, but the exporter spit out as much as possible with some mechanism for dealing with recursion (static variable storing seen entities?).
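A rough sketch of the recursion guard, assuming nodes that reference each other through a single 'ref' value. The $example_nodes array and example_export_node() are hypothetical. I've passed the "seen" list by reference rather than using a static variable, only so that each top-level export starts with a clean slate.

```php
<?php
// Stand-in for node storage (assumption); nodes 1 and 2 reference each other.
$example_nodes = array(
  1 => array('nid' => 1, 'title' => 'First', 'ref' => 2),
  2 => array('nid' => 2, 'title' => 'Second', 'ref' => 1),
);

// Hypothetical exporter: expands references into full data, but falls back to
// the bare identifier for anything already seen (or unknown) to stop loops.
function example_export_node($nid, array $nodes, array &$seen = array()) {
  if (isset($seen[$nid]) || !isset($nodes[$nid])) {
    return array('nid' => $nid);
  }
  $seen[$nid] = TRUE;
  $export = $nodes[$nid];
  if (isset($export['ref'])) {
    $export['ref'] = example_export_node($export['ref'], $nodes, $seen);
  }
  return $export;
}

$export = example_export_node(1, $example_nodes);
```

Here node 1 is exported in full, node 2 is expanded inside it, and the reference from node 2 back to node 1 is cut down to just the nid.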

Matthew Davidson

Version: 6.x-1.0-beta2 » 6.x-1.x-dev

Thinking about this some more, in the light of recent developments, and working from the outside in and back out again...

Firstly, we need a new field property: "multiple" in order to define whether the setter callback can accept an array of values or just a single value. I think @jpetso might have intended the same with "non-array value" on some CCK field types, but as our use of the word "value" here can include arrays, "multiple" is less ambiguous, and fits the terminology of CCK, Taxonomy, Location, etc.

Import callbacks do their best to work out whether the data imported from an external source is a single value or multiple values. A really generic way would be to test for the existence of $value[0]. In some circumstances you can be a bit more idiot-proof; e.g. the Taxonomy plugin can test for the existence of $value['tid'], which indicates a single term rather than a list of terms.
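As a sketch of that detection step (example_as_list() is a hypothetical helper, and the optional $item_key argument is my own assumption about how a plugin might pass in a key such as 'tid'):

```php
<?php
// Hypothetical helper: always return a numerically-keyed list of items,
// whether the import data held one value or several.
function example_as_list($value, $item_key = NULL) {
  // Scalars and objects can only ever be a single item.
  if (!is_array($value)) {
    return array($value);
  }
  // An empty array is an empty list.
  if ($value === array()) {
    return array();
  }
  // Plugin-specific test: the array itself is one item (e.g. it has 'tid').
  if ($item_key !== NULL && array_key_exists($item_key, $value)) {
    return array($value);
  }
  // Generic test: a key 0 suggests the array is already a list of items.
  if (array_key_exists(0, $value)) {
    return $value;
  }
  // Anything else is assumed to be a single structured item.
  return array($value);
}

$single = example_as_list(array('tid' => 3), 'tid');
$several = example_as_list(array(array('tid' => 3), array('tid' => 4)), 'tid');
```

Each item in the returned list can then be handed to whatever per-item processing the import callback does.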

The import callback then (optionally, but highly recommended, depending on the complexity of the field) passes each value in turn (or just the one value if not multiple) to an itemization callback, which tries to take whatever it's given and turn it into a value fit to send to the setter callback. For example the setter callback for a CCK Link field will want a $value like:

$value = array(
  'title' => 'Some text',
  'href' => 'http://example.com',
);

The itemization callback can do something like:

  if (is_string($value)) {
    if (valid_url($value)) {
      return array('href' => $value);
    }
    else {
      return array('title' => $value);
    }
  }

You can think of the itemization callback as "the clever callback".
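A slightly fuller, self-contained sketch of such an itemization callback. The name example_link_itemize() is hypothetical, and PHP's filter_var() is used here as a stand-in for Drupal's valid_url() so the snippet runs outside Drupal; the two do not accept exactly the same set of URLs.

```php
<?php
// Hypothetical itemization callback for a Link field: turn whatever we are
// given into an array with 'title' and/or 'href' keys.
function example_link_itemize($value) {
  if (is_string($value)) {
    // Stand-in for valid_url($value) (assumption, see the note above).
    if (filter_var($value, FILTER_VALIDATE_URL) !== FALSE) {
      return array('href' => $value);
    }
    return array('title' => $value);
  }
  if (is_object($value)) {
    $value = (array) $value;
  }
  if (is_array($value)) {
    // Keep only the keys the setter callback understands.
    return array_intersect_key($value, array('title' => TRUE, 'href' => TRUE));
  }
  // Nothing usable was supplied.
  return array();
}

$from_url = example_link_itemize('http://example.com');
$from_text = example_link_itemize('Some text');
```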

The setter callback accepts a value (or array of values in the case of multiple) identical to the values that would be output by fieldtool_get() afterwards. This is the minimum necessary to be able to do what you need to within a Drupal environment. For example, for node reference fields all you need is nids, for taxonomy all you need is tids, etc., because dependent modules can always call node_load(), or taxonomy_get_term(), etc.

The export callback takes the output of the getter callback and adds as much additional information as an external program might find useful. Again, highly recommended to pass each value from the getter callback off to an extraction callback to do the work on each value.
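Roughly, the split between export and extraction callbacks might look like this for taxonomy. The $example_term_data array stands in for what taxonomy_get_term() would return, and both function names are hypothetical.

```php
<?php
// Stand-in for term data (assumption): tid => extra properties.
$example_term_data = array(
  3 => array('name' => 'Fruit', 'vid' => 1),
  5 => array('name' => 'Vegetables', 'vid' => 1),
);

// Hypothetical extraction callback: enrich one getter value (a tid) with
// whatever an external program might find useful.
function example_term_extract($tid, array $term_data) {
  $item = array('tid' => (int) $tid);
  if (isset($term_data[$item['tid']])) {
    $item += $term_data[$item['tid']];
  }
  return $item;
}

// Hypothetical export callback: takes the getter output (a list of tids)
// and hands each value to the extraction callback.
function example_term_export(array $tids, array $term_data) {
  $export = array();
  foreach ($tids as $tid) {
    $export[] = example_term_extract($tid, $term_data);
  }
  return $export;
}

$exported = example_term_export(array(3, 99), $example_term_data);
```

An unknown tid (99 here) is exported as the bare identifier rather than dropped, which seems the least surprising behaviour, though that is a design choice rather than anything Field Tool prescribes.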

You may notice that there's no real need for itemization and extraction callbacks, so why bother? Well, now that we support field hierarchies (#696270: Support field hierarchies), we can use these to automatically construct getter/setter/import/export functionality for fields higher up the field tree. In the absence of a callback, we can just look for child fields and call the callbacks for those. For example, say we have a CCK Link field called 'field_website':

$value = fieldtool_get($fields['field_website'], $node);

could do the work of

$value = array(
  'title' => fieldtool_get($fields['field_website_title'], $node),
  'href' => fieldtool_get($fields['field_website_href'], $node),
);

without us having to write a getter function for the top level of the CCK Link field.
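A minimal sketch of that fallback, with loud caveats: the field definition structure below ('getter' and 'children' keys) and every example_* name are assumptions made for illustration, not Field Tool's actual data structures.

```php
<?php
// Hypothetical leaf getters for the two parts of a Link field.
function example_get_website_title($node) {
  return $node->field_website[0]['title'];
}
function example_get_website_href($node) {
  return $node->field_website[0]['href'];
}

// Hypothetical field tree: the parent has no getter, only child fields.
$example_fields = array(
  'field_website' => array(
    'children' => array(
      'title' => 'field_website_title',
      'href' => 'field_website_href',
    ),
  ),
  'field_website_title' => array('getter' => 'example_get_website_title'),
  'field_website_href' => array('getter' => 'example_get_website_href'),
);

// Generic getter: use the declared callback if there is one, otherwise
// build the value from the child fields, recursing down the tree.
function example_get(array $fields, $name, $entity) {
  $field = $fields[$name];
  if (isset($field['getter'])) {
    return call_user_func($field['getter'], $entity);
  }
  if (!isset($field['children'])) {
    return NULL;
  }
  $value = array();
  foreach ($field['children'] as $key => $child_name) {
    $value[$key] = example_get($fields, $child_name, $entity);
  }
  return $value;
}

$example_node = new stdClass();
$example_node->field_website = array(
  array('title' => 'Example', 'href' => 'http://example.com'),
);
$value = example_get($example_fields, 'field_website', $example_node);
```

The same walk-the-tree approach should work for setters, importers and exporters.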

In fact, in most cases we can just describe the field hierarchy and get something that works well enough. We can write generic callbacks that work down the tree and, unless they find an explicitly declared callback, fall back on the 'verbatim' ones which already exist. This not only makes writing plugins easier (plugin authors just describe the field tree and provide custom callbacks for tricky fields); it means that you can get/set/import/export a CCK fieldgroup with one function call even though there's no way that the author of the fieldgroup plugin can know what types of fields the fieldgroup contains. You could even (although this is not on the plan for Field Tool 1.0) apply the same principle to importing/exporting entire nodes or users.

@jpetso used the nomenclature of 'cck extraction callback' and 'cck itemization callback', and because an aim of 1.0 is not to break compatibility with 1.0-beta2, we'll follow this convention ('taxonomy extraction callback', 'location itemization callback', etc.) but I'm not entirely sure it's necessary to handle field hierarchies from different modules differently.

Note that this is the plan for beta3. The current dev version does not work like this yet.

Matthew Davidson

Title: Policy for getter/setter/export/import functions » Generic callback functions
Attachment: status new, file size 24.5 KB

Have more or less implemented as above. This should not break any existing plugins. Have left #713820: Recursive callback fallbacks as a separate issue, as there are other issues I want to address more urgently.

Have rewritten the Location plugin to leverage this functionality, thereby reducing the amount of code in the plugin to about 1/3 of what it was originally.

Documentation on how to make use of this to follow by the time of #701892: Beta3 release.

Also moved fieldtool_cck_verbatim_value_extract() and fieldtool_cck_verbatim_value_itemize() to the node_cck.inc plugin, as it seems to make more sense there.

Have introduced the concept of "virtual" fields. These are fields which don't correspond to anything that exists in the entity, but do exist elsewhere; eg. on a node edit form - CCK fieldgroups, or Taxonomy vocabulary select elements, etc.

Also have "multiple" fields: fields which consist of a numerically-keyed array of items. Note that this does not, as envisaged above, correspond to the CCK or Taxonomy meaning of "multiple". We're just saying that the values exist in a list, even if it's only a list of one item. We do not enforce a maximum or minimum number of values, as that would be validation, and Field Tool does not do validation.

Matthew Davidson

Assigned: Unassigned » Matthew Davidson
Category: support » feature
Status: Active » Fixed

Whoops. Committed.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.