At the suggestion of Catch, I've broken up the task #412518: Convert taxonomy_node_* related code to use field API + upgrade path into pieces that may be easier to evaluate, test and commit.

Term.module is basically a copy of list.module that uses taxonomy functions to gather allowed values. No other magic happens.

I could have used list module with its allowed values function, but I couldn't see a way to extend list.

Options.module provides the widgets. It also requires a patch.

I tried writing tests, but it was a disaster. I'll post them separately tomorrow so the patch isn't overlooked on the basis that bad tests were included. I do want to get it right, though! If only list.module had some tests I could rip off...

CommentFileSizeAuthor
#90 term_field_changelog-491190-90.patch728 bytesAnonymous (not verified)
#87 d7-term-field_491190_86.patch12.84 KBAnonymous (not verified)
#80 d7-term-field.patch12.41 KBAnonymous (not verified)
#67 d7-term-field.patch12.19 KBAnonymous (not verified)
#66 d7-term-field.patch12.34 KBAnonymous (not verified)
#64 d7-term-field.patch12.53 KBAnonymous (not verified)
#62 d7-term-field.patch12.21 KBAnonymous (not verified)
#61 d7-term-field.patch12.21 KBAnonymous (not verified)
#57 d7-term-field.patch12.17 KBAnonymous (not verified)
#55 d7-term-field.patch12.15 KBAnonymous (not verified)
#52 d7-term-field.patch12.25 KBAnonymous (not verified)
#48 d7-term-field.patch11.44 KBAnonymous (not verified)
#44 d7-term-field.patch11.61 KBAnonymous (not verified)
#34 d7-term-field.patch12.68 KBAnonymous (not verified)
#31 d7-term-field.patch12.65 KBAnonymous (not verified)
#29 d7-term-field.patch14.13 KBAnonymous (not verified)
#28 d7-term-field.patch9.51 KBAnonymous (not verified)
#25 d7-term-field.patch8.05 KBAnonymous (not verified)
#24 d7-term-field.patch8.04 KBAnonymous (not verified)
#18 d7-term-field.patch7.61 KBAnonymous (not verified)
#14 d7-term-field.patch8.16 KBAnonymous (not verified)
#7 d7-term-field.patch3.83 KBAnonymous (not verified)
#3 d7-term-field.patch4.06 KBAnonymous (not verified)
d7-term-field.patch6.28 KBAnonymous (not verified)
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Anonymous’s picture

Category: feature » task

There's a patch in the CCK issue queue to allow term.module fields to be configured in CCK. #491196: CCK Field UI support for term fields. This also requires #491130: cck_field_update_field does not generate the 'data' field before calling drupal_write_record to be fixed, and there's a patch for that too.

Dries’s picture

Anonymous’s picture

FileSize
4.06 KB

It is not an intentional duplicate. That issue describes the scope of the problem and it has a few patches that are still in heavy development.

THIS issue is a patch that needs review.

And here's a new one: previously I was creating a module for term fields. This creates NO NEW MODULES. It only patches taxonomy.module and list.module.

To use it, you must create a field along this pattern:

$field = array(
  'type' => 'list_number',
  'settings' => array(
    'allowed_values_function' => 'taxonomy_field_allowed_values',
    'vid' => YOUR VOCAB ID
  ),
);

You must also use these formatters for your instances: term_link OR term_plain.

The aforementioned CCK patches will let you easily create a term field for nodes.

This version of the patch is much much better.

yched’s picture

Great to tee you embrace this, bangpound !

Sticking this on the generic 'integer list' field type seems odd, though. Just like we wouldn't cram 'node reference' field type on the 'list' field type, I don't think we can spare a dedicated 'term reference' field type.

Anonymous’s picture

yched: You don't think we can spare a dedicated term reference field type?

I also created a term reference field the first time around, but my term reference field was 100% identical to the list module except for the allowed values function, which is pluggable in list module. As far as I can tell, if we introduce a special term field type, we are essentially forking list module and we'll lose the ability to re-use list module widgets with this field.

A taxonomy vocabulary is just a list of names, no?

webchick’s picture

The problem I have with this hunk:

-      'settings' => array('allowed_values_function' => ''),
+      'settings' => array('allowed_values_function' => '', 'vid' => ''),

Is that if Organic Groups module wants to expose "Group selector as a field" (which is entirely possible), they have to provide a patch to list.module in core. Big ol' -1 to that.

I understand the concerns about code duplication, but we ultimately need figure out a way to make whatever works here work for both core and contrib.

Anonymous’s picture

FileSize
3.83 KB

Webchick:

The patch to list.module actually isn't necessary. I thought it was, but... nope!

This can be accomplished by only patching taxonomy.module.

And I've now implemented taxonomy_field_attach_load so that the whole set of terms can be loaded at the same time.

I am not going to work on this patch more today. I would appreciate some help from others about the question of reviving term.module or using an unmodified list.module.

Anonymous’s picture

Some thoughts about using list.module vs. a term-specific module.

What is not possible with list.module is the creation of new values. If we want all of our term fields to support widgets who let users optionally add new terms, we'll need a new field module devoted to terms. I could add support for adding new allowed values to list.module, too.

We still need an autocomplete widget. I see no reason why an autocomplete widget wouldn't also be useful for list fields.

I briefly discussed an idea of separating tags into a separate module. It would still rely on taxonomy. It would still be a field. However, tags would be a one-off field with an autocomplete widget that also supports adding new tags.

I can see creating a term field module before D7 is frozen, but I know for a few more weeks, that module would be identical to list and that list may continue to develop and change.

All of that said, I'm eager for feedback and advice. I'm happy to be working on this task!

yched’s picture

I see the arguments about 'term_ref field almost equals list field'. Right now in D6, node_ref and user_ref are maybe like 70% code duplicates, but the differences are not easily mergeable within the current field concepts. An OO model where a field type can subclass another would possibly alleviate some of this, but that's not likely to happen in D7.

More specifically on 'taxo as field':

- The problematic hunk + 'settings' => array('allowed_values_function' => '', 'vid' => ''), is gone, but taxonomy_field_allowed_values() still refers to $field['settings']['vid'].
'vid' becomes an 'unofficial setting'? Works, I guess, but feels odd, and could easily break in the future.
Could possibly be worked around with a generic 'allowed_values_args' setting.

- the point raised in #8 about specific widget needs also applies to formatters : the current approach makes the 'Term: link', 'Term: plain text' formatters available for all 'list' fields, while they don't make sense for 'regular' list fields. This approach thus cannot be extended to other 'almost list field' field types without clobbering the list of display options with irrelevant choices. If the approach doesn't scale to similar use cases, I'm not sure it's a good approach.

- In #111127-207: Cache node_load we defined a generic pattern for 'reference' field types. For 'taxo as field', it translates to:
For performance, actual term data needs to be multiple-loaded at field_attach_load() time and get saved in the field cache, instead of having separate taxonomy_term_load() calls in the formatters (note : it seems to be the role of taxonomy_field_attach_load(), but I'm not sure what the function actually does).
Then when a term definition is changed or a term is deleted (hook_taxonomy_term_[update|delete]), the field cache needs to be cleared, using field_attach_query() : "find all objects that have value $tid on a 'term_ref' field, and clear their field cache"
With the current approach the latter needs to become "find all objects that have value $tid on a 'list' field with 'allowed_values_function' == 'taxonomy_field_allowed_values' ". Ugh.

- Usability: how do you explain to end-users "you want to add this vocab to this node type ? sure, add a 'list' field, and make sure to provide the right allowed_values_function" ?

I'm pretty much convinced this deserves a dedicated field type.

catch’s picture

So taxonomy_field_attach_load() currently just pre-loads the terms into memory - then taxonomy_term_load() is pulling them from the static instead of the database individually. But agreed it'd be better to have these in the actual field cache instead. I think yched's making good arguments in favour of a term reference field, there's not a lot of code duplication really between the two patches, and I can see this extending a lot.

Just a side note, I'd like to be able to use autocomplete for non-tagging vocabularies - for both interface and performance reasons with huge vocabularies - so it'd be good to have the autocomplete widget and actual term creation separated (similarly if hierarchical select becomes a field formatter and lets you create terms from select lists).

This is looking really nice though :)

Anonymous’s picture

thanks for the feedback yched. i'll post a new patch with term.module separated tonight.

i was also trying to demonstrate and experiment with list.module... and i was trying to avoid writing tests. ;-) does anything in core use list.module yet?

given the lack of a UI for adding fields to bundles in Drupal 7 right now, i am not sure how i can tell you what end users should do.

yched’s picture

@bangpound:
- No, nothing in core uses list fields yet. 'Node body' is the only core field, right now...
- CCK HEAD is updated to work with D7 Field API (but I guess you know that, you posted a patch over there)

chx’s picture

Title: Let taxonomy vocabularies be fields. » Let taxonomy terms be fields

No! Dont let them! Benchmarks! What is this craziness of putting vocabularies in titles when the issue is about terms and not just vocabularies?? That's like trying to make aggregator items nodes when the issue title says make aggregator feeds nodes.

Anonymous’s picture

FileSize
8.16 KB

Here's a new patch.

Term.module is now a separate thing. I'm no longer piggy-backing on list.module but it is more or less a straight copy right now. Options module is patched so we have some widgets for term fields.

I did my best at implementing the cache invalidation and field value integrity functions as described by yched in #9. It's still a little bit mysterious, but I think I manage to do it without breaking the multiple loading feature on fields.

I chatted with chx in IRC briefly about his concern about the titling of this issue, and I'm going to be maintaining a daily presence in IRC while I work on this and related patches. You can also tweet @bangpound to get my attention if I've got my head down.

catch’s picture

Title: Let taxonomy terms be fields » Provide a taxonomy term field type

Rather than just unsetting the ID in term_field_load() we should also stuff the term object in there - hook_field_load() is cached - and won't be run each request, so we actually have to do this if we want to pre-load terms at all, I forgot this when talking about it in irc :(. text_field_load() does similar. This probably means iterating over each delta rather than the array_search, but at least all the work will be cached. Not sure where to put the term object though - can this be in 'value'? Or we need a separate array key for the full term object?

I'm a bit concerned about the formatters:

+  $field = field_info_field($element['#field_name']);
+  if (($allowed_values = term_allowed_values($field)) && isset($allowed_values[$element['#item']['value']])) {

That means running term_allowed_values() for every term shown on every page - since we're removing invalid tids in hook_field_load() that should be enough no? Even if it's not I think the formatters should be dumb and do the minimum amount of work possible - allowed values should validate input, can't be responsible for already saved data if admins change it later.

The cache clearing looks good. If/when #333171: Optimize the field cache goes in we should be able to make that an array of cache IDs to clear instead of one by one deletes.

chx’s picture

function term_field_schema is weird with a default only switch. I think we want tests too.

yched’s picture

Re catch #15: The additional data loaded in hook_field_load() should be put in additional keys. D6 filefield merges data from the {file} table into a flat array with keys fid (the original, raw field value), filename, filepath, etc...

Didn't think about this earlier: with terms now being fieldable, calling taxonomy_term_load() in term_field_load() triggers field_attach_load() on terms and introduces potential infinite loops. More general issue with any 'foo reference' field if 'foo' entity is fieldable. Think "term with a term field" (odd but not forbidden) or "node references term which references node".
I'm still convinced that loading referenced data in hook_field_load() is the right approach, but I'm not sure of the best way to avoid infinite loops.
Maybe hook_field_load() should only load base properties: hardcoded, predictable set of properties, which are also the only ones needed by most common formatters, so this is good enough for us.
But:
- partial loads should be multi-loadable too
- they can not pollute the static cache for 'regular loads'
- yet having a static cache for partial loads would be handy too :-(

Agreed with catch's remark about invalid values and formatters - this should be changed also in list.module (not in this patch, but would be cool if someone created a separate task for that)

Anonymous’s picture

FileSize
7.61 KB

I asked bjaspan today in IRC about adding more stuff to the field item's array. He said that hook_field_load gets to set the agenda, and all the hook_field_attach_load implementations have to respect it. As such, we can set our index in the array to be anything we want. I chose term.

Terms are now loaded with the field and cached.

I can see the elegance of adding the whole term object to the value. What do others think?

Do I need to be unsetting the value in the item now? Shouldn't I just ignore it now?

I'm still playing with the settings array on the field. I've made the 'vid' value into an array. This opens the possibility of using multiple vocabularies in one field, which is a desirable feature that's easy to support. However it does raise some UI questions, which I'm posting on the original issue for this task.

The structure of hook_field_schema can be changed. I was seeing warnings about missing 'indexes' index for in field sql handler module, so I cut and paste the schema from list just to prove that my eyes weren't deceiving me.

Tests will come soon. There aren't many examples of field modules with tests that I can find, and my first attempt wasn't so good.

Frando’s picture

Regarding the infinite loop problem. One way to somehow solve this is build modes IMO. Say you do a field_attach_load on a node which loads the terms with a build mode of "minimal" or "as field" or "no attach" or something, which doesn't even trigger field_attach_load on the terms, while on a term view page, a term is loaded with a build mode of "full", which then triggers field_attach_load. This of course implies that a) the build mode overhaul patch gets in, b) build modes are properly supported not only by nodes but also by other entities and c) build modes are already passed to the _load_multiple functions (which currently call field_attach_load). Especially the latter could be a great improvement by its own (making loading and _load hooks build mode aware).

Anonymous’s picture

yched: i think it's a safe assumption that the initial offering of reference fields will be fields that reference bundles. this is a hard one, for sure. i'm thinking about it...

yched’s picture

re Frando #19: build modes are currently of no help here because they're a *display time* property ('build') - and I think it should stay that way: node_load() loads full raw data, there's no variants. Once loaded, you might want to do different things with the object, including displaying it in several variants, but having several levels or flavors of 'load' would add much confusion IMO.

#409750: Overhaul and extend node build modes takes this a step further by removing the $node->build_mode property and extending/renaming the current (bool) $teaser param of node_view() and friends into (string) $build_mode param.

catch’s picture

Is there somewhere in field api, after field_attach_load() where we can still have access to $objects - then we could have hook_field_attach_pre_render() or hook_field_attach_after_load() or something - do taxonomy_term_load_multiple() in there (not cached) - and this wouldn't call field_attach_load() on the referenced terms.

yched’s picture

@catch: right now we don't.

Anonymous’s picture

FileSize
8.04 KB

this might be a naive approach to the problem, but it does stop circular references from crashing the request. The function keeps a static tally of which object types and IDs it is currently loading fields for. If the function is called again to load fields for an object which is already loading, it simply doesn't try to load more terms.

i was able to deliberately create a circular reference like this: node X -> term Y -> term Z -> term Y. I cleared the cache on node X, and node X turned out fine. I could link to each of the terms including the circular reference. Then I cleared the cache while viewing term Y and then term Z.

I wish I could write tests...

Anonymous’s picture

FileSize
8.05 KB

the last patch did it wrong for sure. it was unsetting the flag in the static loading array if the bundle's fields were loaded a second time, which meant the third circular reference would cause a problem.

yched’s picture

Well, if I'm not mistaken, this makes it impossible to predict what can be found in $obj->taxo_field[0] : 'tid' ? 'tid' + 'term' ? And thus sprinkles uncertainty on the rest of the field lifetime.
+ I don't think that approach works if you try to separately load objects referencing the same terms in the same request (think : in a test). Static as hysteresis factor.

My proposal to 'Load only base properies' in #17 still stands, IMO.

catch’s picture

If we only load base properties, while that's better than doing it in formatters, I can see a lot of formatters still having to do node_load() individually to get missing pieces. Opened #493314: Add multiple hook for formatters to discuss whether it's possible to have our cake and eat it too. Until that's resolved, I agree with yched that just a direct query to get $term->name and similar is the safest bet.

Anonymous’s picture

FileSize
9.51 KB

Catch is right to point out that this is going to be more severe for nodes than it is for terms in terms of the awkward constraints on caching and the need to load the referents if you want to use non-base-table data in your formatter.

Here's a new patch. The "term load multiple" cut-n-paste job can probably be made even smaller, but here's a start.

Anonymous’s picture

FileSize
14.13 KB

Now with 25 tests that check the validator, widget and formatter.

I reused text.test. My main dilemma was to understand what I needed to test. Field API tests its own code with made-up bundles and made-up fields that don't exist outside the test framework. By now I've figured out that I need to test the code in the module, and I can see how that's done with text.test.

Someone should check how I'm testing the select widget. Something is making me doubt the absence of [0] in the element name.

I'm testing the validator on the most basic settings. A term from the field's vocabulary passes validation. A term from another vocabulary triggers an error.

The options select widget appears.

The output of the formatter matches the term's name.

I can see where more tests could be written: link vs plain text formatter, multiple values, fields with multiple vocabularies, different widgets, etc.

Where should I be looking for code examples for Field API tests? field.test?

catch’s picture

Because term_field_load() is going to be cached in the field cache (and in any fieldable entity which maintains a static cache during the request) I think we can drop the static and query building and just do a straight SELECT * FROM taxonomy_term_data WHERE tids IN (:tids); The 'multiple formatter hook' discussion suggests we'll need to keep that hunk of code even if it's implemented in a different hook eventually, which ought to work out pretty well. Will try to have a look at the tests later - IMO as long as we have roughly equivalent test coverage to the other field modules we're alright - a lot of taxonomy tests will need to be rewritten when converting the nodeapi stuff to fields in that issue - so no point doing the same work twice.

Anonymous’s picture

FileSize
12.65 KB

This patch removes the cut-n-paste of taxonom_term_load_multiple into term_field_load. Now I'm just doing what Catch suggests in #30.

moshe weitzman’s picture

subscribe

+ * TODO access control?: I think this should be removed. access control for terms is not implemented in core. let contrib modules deal with it using form_alter and node_build_alter.

+ * Theme function for 'default' term field formatter. : this text is repeated even for the plain formatter.

Anonymous’s picture

Moshe:

You mean the comment text! Aha! Yes I will fix that.

TODO Access control: It's not just about viewing and changing term field values, actually. Tagging widgets will allow users to create new terms. I remember a slide at Drupalcon DC where Catch showed us that vocabularies will have more granular permissions, but now I don't see that in core. Either way, creating a term is one permission I wanted to consider, but I agree that the full range of possibilities just need to be opened up for contrib and not actually implemented. Would we be doing that with node_build_alter and form_alter even with field api?

Anonymous’s picture

FileSize
12.68 KB

New patch incorporates Moshe's change to the "plain" formatter's comment.

The TODO in "term_allowed_values" function has been changed.

For folks who are not very familiar with the various field and widget modules in D7 core, FIELDMODULE_allowed_values is called by options.module widgets. It's implemented only by list.module so far. In D6, new terms can only be created using the "tagging" autocomplete widget. A straight port would mean that I (or someone else) will write an autocomplete/add widget for term.module only.

If people want to use different widgets and still be able to add a new term, it means that I'm going to shake up options.module a little bit or I'm going to basically copy it to be a term-specific widget module that supports one new feature. So this is one of the next questions I need to deal with when I start dealing with tagging. #495774: Modularize tagging is where you can get your pies lifted off the ground around this problem.

Status: Needs review » Needs work

The last submitted patch failed testing.

Wim Leers’s picture

Subscribing.

catch’s picture

Status: Needs work » Needs review
catch’s picture

I'll do an in-depth/nit-pick review later, couple of quick (but big) things.

1. Why do we have to change options module? Is there a way we could add an alter hook there or something if it's going to be a general problem?

2. I still think this should be taxonomy.module providing the fields instead of a new module - once field conversion is done on taxonomy_nodeapi, taxonomy.module would be pretty much useless without this for 99% of Drupal sites - so it's just extra stuff to add to the modules page.

Anonymous’s picture

Catch:

We have to change options.module (as far as I know) because the widgets need to define which fields they can work for. Is there another way?

My reasoning for keeping term.module outside taxonomy.module is this:

  1. It follows the pattern set up where fields are provided by individual modules. Text, number and list are not in field module but they're all now core required modules. The only thing that prevents them from appearing on the module page is that now the module page hides required modules. (Which has its own unresolved issues, by the way: #490400: modules with dependencies whose .info file sets required=TRUE cannot be enabled.)
  2. I see Taxonomy.module becoming very focused on administering and developing vocabularies in the way content_types.inc is for managing node types or CCK Field UI's "allowed values" is for managing lists. Today it's not possible to use term.module without taxonomy_get_tree and other taxonomy DB functions, but will that necessarily be the case if we can support external standard vocabularies like the IPTC press topic catalog?
  3. For new users of Drupal, the standard install profile could enable term.module the way it enables taxonomy.module now.

With all that said, I'm not intransigent on this issue. It's been easier to work with a separate term.module for this patch because my fork of taxonomy.module has some of this nodeapi to field work started but not ready to submit. I'm only trying to follow patterns I read in Drupal 7's code.

catch’s picture

OK fields provided by individual modules - I guess the question is - would we add nodereference to node.module, probably we wouldn't. So I can live with a separate module.

I think we probably need a hook_field_widget_info_alter() - but that needs barry or yched to chime in - let's leave it as modifying options module for now.

yched’s picture

We indeed need a hook_field_widget_info_alter() - more generally, we need a way to alter field_info, widget_info, formatter_info, and to alter corresponding settings form. 3 use cases in the latest 24 hours. I'll create an issue shortly.
Until then, we're left with changing options.module, but that's clearly not future proof. Contrib field types won't have that option.

I'd like to take a step back on the current approach:
a taxo field picks terms in one (or more ?) vocabularies, and several fields can possibly use the same vocabulary.
This means that the information previously stored in {term_node} is now scattered in several tables - possibly even several tables for a given vocabulary.

Similarly, semantics change from "node A has terms a, b, c" to "node A has term a in taxo field foo and terms a, b in taxo field bar"
Dumb example: vocab 'Rate', with terms 'good', 'average', 'poor'. Node type 'restaurant', with three taxo fields 'Food', 'Service', 'Price' using the same vocab.
The 'rating' flavor this example is intended: it's a semantic shift quite like the one from 'fivestar as a nodeapi module' to 'fivestar as CCK field', with terms instead of notation.

I have no real objection myself, just making sure everyone goes there willingly :-)

This raises a couple questions on taxonomy listing pages though:
- do we list all nodes that are tagged with the term in *any* field ? That's one query per field using the vocab - problem: with the field storage abstraction, sort and merge results have to be done in PHP and I'm not even sure pagination is solvable.
do we list all nodes that are tagged with the term in a given field, specified in the URL ?
- not just nodes can have taxo fields; but then how do we display taxo field values on a user object ? as a link to what ?
Looks like taxonomy/term/[tid] needs to become taxonomy/term/[field_name]/[obj_type]/[tid]

More generally: IMO it would be worth re-reading the discussion in http://groups.drupal.org/node/8793 - speculative 'taxo as field' discussion after the data API design sprint in winter 2008. I don't remember what the conclusion was back then (if any), and large parts of it are possibly stale now, but there might be good ideas or remarks.

catch’s picture

@yched I agree with all these questions (not sure on the answers yet) - but I think they're more for #412518: Convert taxonomy_node_* related code to use field API + upgrade path. IMO we should focus on this issue as offering a 'term reference' field type which can be used orthogonally to the current taxonomy_node_*() integration. That lets you use terms with other entity types, and do stuff with views with the field, link the formatters to taxonomy/term/type/tid and stuff like that. Then we need to look at replacing taxonomy_node_*() integration in core to use that reference field, which is where all the really nasty, sticky issues are.

Anonymous’s picture

FileSize
11.61 KB

This new patch avoids changing options.module and now relies on the hook_field_widget_alter provieded in #502522: Allow drupal_alter() on the various Field API declarative hooks.

RobLoach’s picture

I really like where this patch is going. It demonstrates how powerful the Fields API is and is the direction we should go with how the Taxonomy is used on the node edit form. One concern I have, however, is that its an entirely separate module. I understand the argument that this is a CCK/FieldsAPI addition, but wouldn't it make sense to add it to the Taxonomy module itself since the Fields API is part of core now anyway? Maybe in a taxonomy.field.inc file?

joshmiller’s picture

Status: Needs review » Needs work

This is a review for the patch at #44. Tests look nice. Just found some doxygen formatting issues. No spelling errors!

  1. + * Implement hook_theme().
    

    Doxygen standards say it should be "Implements" -- found throughout patch.

  2. Index: modules/taxonomy/term.test
    ===================================================================
    RCS file: modules/taxonomy/term.test
    diff -N modules/taxonomy/term.test
    --- /dev/null	1 Jan 1970 00:00:00 -0000
    +++ modules/taxonomy/term.test	3 Jul 2009 04:23:50 -0000
    @@ -0,0 +1,135 @@
    +<?php
    +// $Id$
    

    Missing Doxygen @file declaration

  3. +    field_create_instance($this->instance);
    +    // Test valid and invalid values with field_attach_validate().
    +    $entity = field_test_create_stub_entity(0, 0, FIELD_TEST_BUNDLE);
    

    Readability: perhaps a whitespace line between comment and code?

Like I said, just a quick check of the spelling and code syntax.

Josh

Dave Reid’s picture

@joshmiller: The current Drupal 7 coding standard for hook docblocks is "Implement hookname()." until we can decide on the best format (another patch out there).
See example http://api.drupal.org/api/search/7/_block

Anonymous’s picture

Status: Needs work » Needs review
FileSize
11.44 KB

I've abolished term.module. Catch has advised to keep all the field api hook implementations in taxonomy.module, which hopefully is going on a diet very soon when we trim away nodeapi.

Please check how I've added _taxonomy_clean_field_cache() to the save and delete operations for terms.

Your reviews are always helpful. Thank you!

catch’s picture

Status: Needs review » Needs work

Here's a nitpick pass:

if (count($allowed_values) && !array_key_exists($item['value'], $allowed_values))
Seems like this could just be !isset($allowed_values[$item['value']]); ?

(string)$item['value']
I think the casting should be string $item - with the space, but have a feeling this is inconsistent in core as it currently is anyway.

+ *  TODO deal with excluded tids?
+ *  TODO support scope limiting to a particular subtree of the vocabulary
+ *  TODO support multiple subtrees of the same or different vocabularies in one field
+ *  TODO allow values that aren't in a vocabulary if the field/widget/vocabulary (which?) is for
+ *  tagging.
+ *  TODO the field settings could also enforce some constraints on the user's choosing behavior.
+ *  e.g. force user to choose a term with no children, etc.

While these are all potentially useful - they're not @TODO - this is broken, they're @TODO this might be nice, so I think we can remove them (but good to document in followup issues maybe).

+ * This preloads all taxonomy terms for a given object at once using taxonomy_term_load_multiple
+ * and unsets values for invalid terms which don't exist.

No longer users taxonomy_term_load_multiple() so that just needs removing from the docs.

I think the query should have
$query->addTag('term_access');
added to it - then it'll respect modules like tac lite.

+function _taxonomy_clean_field_cache($term) {

Needs some phpdoc - I defer to yched on field_attach_query() implementation but that looks good.

However

+        foreach ($ids as $id) {
+          cache_clear_all("field:$obj_type:$id", 'cache_field');
+        }

We can now pass an array of cache ids to cache_clear_all, so this could be

foreach ($ids as $id) {
  $cids[] = "field:$obj_type:id";
}
cache_clear_all($cids, 'cache_field');

then cache.inc will do DELETE FROM 'cache_field' WHERE cids IN ($cids); internally rather than a one-by-one delete.

+  // Test fields.

Not needed.

Term $term->tid doesn't
You can use t() for the assertions and therefore put $term->tid into a placeholder. Or you probably don't need to put the actual $term->tid into the assertion text - just "Term does not cause validation error" is usually enough.

Everything else looks great to me.

yched’s picture

+function taxonomy_field_schema($field) {
+  switch ($field['type']) {
+    default:

Then we probably don't need a switch ;-)

function taxonomy_allowed_values($field) {

minor : it's related to the 'taxo field' part of the module, so maybe it should be named taxonomy_field_allowed_values() ?

+ * @return
+ *   array with 'term' object at that index
+ */
+function taxonomy_field_load($obj_type, $objects, $field, $instances, &$items, $age) {

Capitalization, trailing dot + the function doesn't return anything, actually ;-)

_taxonomy_clean_field_cache() is fine (once it uses the new 'multiple clear' mentioned by catch), but any other 'reference' field type will need similar code. Would be cool to abstract this into a generic function in field.module, as outlined in #111127-207: Cache node_load. At worst, could go in a followup patch, but we'll need it anyway at some point.
Also, note that there's a discussion pending in #367753: Bulk deletion to refactor field_attach_query() a bit, because currently it has the potential to completely blow PHP memory on large sites. Not something this specific patch should worry about for now, though.

The current code just lacks a check for 'is object type cacheable' before trying to clean the cache. Something like:

foreach ($objects as $obj_type => $ids) {
  $info = field_info_fieldable_types($obj_type);
  if ($info['cacheable']) {
  foreach ($ids as $id) {
    ...
  }

Actually, it might even be better to parse all object types for non-cacheable ones (in most cases there won't be any), then run field_attach_query() with a 'object_type NOT IN (non cacheable types)' condition.
The NOT IN operator in f_a_query() is currently not documented, but actually supported (I just created #511486: field_attach_query also supports 'NOT IN' to document it).

Other than that, as I wrote in #491196: CCK Field UI support for term fields: is it really a good idea to support 'taxo fields' spanning over several vocabs ? On 'object edit form', they will show up as one unique select widget merging terms from different vocabs, I'm not sure about this.

catch’s picture

I'd like to be able to have an autocomplete on the node form, which allows assigning terms from combinations of different vocabularies. That way users think they're just tagging, but you get the convenience of multiple vocabularies for everything else (management, views, theming, etc.). However that doesn't need to be dealt with by this patch, at all.

Anonymous’s picture

FileSize
12.25 KB

yched:

Are you concerned about the UI and sensible defaults for this feature? Do you think it's fundamentaly a bad idea? I would be happy to keep this feature out of the headlines and "marginalize" it in the UI (as an advanced setting? custom module field creation only?). For most users, 1 field per vocabulary, 1 vocabulary per term field is the design we'll want to use in examples and defaults. In fact, good speedy taxonomy/term/[tid] pages might depend on this kind of design.

However, there are a few use cases for allowing term fields to rely on multiple vocabularies for options. Catch has already mentioned one related to combining controlled and tagging vocabularies in one field. In addition to that one, I think the proliferation of auto-tagging services and modules could this case even more compelling to consider. For me, I need to be able to combine topic, toponym, and chronology vocabularies into a single field called "coverage" (the way Dublin Core uses that word).

The alternative is to force $settings['vid'] to be a single value and see field modules appear in contrib that introduce this one difference. Overloading field settings with undocumented or unoriginal values has already been mentioned as risky and bad, so the only sanctioned path is either contrib or let it be an array that usually has a single value. I see this as low-hanging fruit that feeds use cases already identified by contrib (see Content Taxonomy module for example).

Some problems are bubbling up though. For tags where the user could add new terms, which vocabulary should get the new term if 4 different ones are giving options to the field? If there are use cases for multiple vocabularies, are there also use cases for subtrees? Multiple subtrees? Multiple subtrees from different vocabularies?

Here's an alternative in this new patch: each field has required 'vid' and 'parent' settings. If the term field's widget is configured for tagging (not real yet), the new terms will be added in that vocabulary, under that parent term.

Additional allowed values are defined as repeated arrays with 'vid' and 'parent' keys. These are simple unions. There's no complex set mathematics going on here.

$settings = array(
  // this 'vid' and 'parent' are required. if 'parent' is 0, it's the whole vocabulary
  'vid' => '1',
  'parent' => '0',
  'allowed_values' => array(
    // this 'vid' and 'parent' are optional.
    array(
      'vid' => '2',
      'parent' => '10',
    ),
  ),
);

taxonomy_allowed_values would combine the top 'vid' with the ones under 'allowed_values,' but new terms would be added only following the primary settings.

It's also important to state that allowing fields to pull options from multiple vocabularies does not "break" the ... theoretical? informatic? underpinnings of taxonomy.module. The vocabularies themselves are allowed to remain ontologically "pure" and term fields provide the flexibility of expanding the field's domain across mutiple vocabs. Nobody has to bend their controlled vocabularies to fit their site's content model and UI goals.

yched suggested renaming 'taxonomy_allowed_values' to 'taxonomy_term_allowed_values', but this would require that the field's module be named 'taxonomy_term.' Options.module calls the field's MODULE_allowed_values. Can I set a different module in the field info?

_taxonomy_clean_field_cache.. umm.. yeah. I really can't talk about it now. I'm dizzy from all the cache clearing. I think it's extra loopy because of the way the available options subtrees are defined. However, I documented it quite a bit with comments.

Anonymous’s picture

Status: Needs work » Needs review
catch’s picture

The allowed values stuff seems to be commented out? If bangpound's approach with the extra array for allowed values works that seems like a good compromise to me - default to a field per vocabulary but leave things open for other uses.

+ * and unsets values for invalid terms which don't exist.
+ *

I keep getting told not to use abbreviations in code comments - also extra space here which isn't needed.

(count($tids)) 

!empty is enough there I think.

Anonymous’s picture

FileSize
12.15 KB
/**
 * Implement hook_field_info().
 */
function taxonomy_field_info() {
  return array(
    'term' => array(
      'label' => t('Term'),
      'description' => t('This field represents a taxonomy term reference.'),
      'default_widget' => 'options_select',
      'default_formatter' => 'term_default',
      'settings' => array(
        'vid' => '0',
        'parent' => '0',
        'allowed_values' => array(
          // array(
          //  'vid' => '0',
          //  'parent' => '0',
          // ),
        ),
      ),
    ),
  );
}

The code is commented because the 'allowed_values' should by default be an empty array. Normal term fields will only define settings[vid] and settings[parent] will default to 0. Additional options for the field are additional arrays of 'vid' and 'parent' in that 'allowed_values' array. I've removed the comment. Where/how do I document the field's settings?

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
12.17 KB

Test bot, I'm on to you...

yched’s picture

Multiple vids / parent terms in a field: the consequences are a bit over my head, but the approach looks sound.

_taxonomy_clean_field_cache(): OK, hard to abstract to a generic function because you're using extra logic specific to "taxonomy field allowed values" to avoid querying on fields that won't contain the tid anyway. Which is cool, I guess. We'll see about a generic function in a separate issue (when adressing 'text input format changed, clear text fields cache')

#512236: Design flaws in field_attach_query() got (partially) committed, and you updated the code to use FIELD_QUERY_NO_LIMIT. This won't fly in the long run (PHP memory explosion), but we can keep it until that issue settles.

if (count($cids)) could be just if ($cids), since $cids is initialized as an empty array anyway.

Other case for cache clear: when a vocab gets deleted :-(. We'll want to clear field cache for any object that uses a field referencing that vocab. No need to check for individual tids, it's ok if we delete 'too many' cache entries. But what we want to avoid is 'run _taxonomy_clean_field_cache($tid) on each $tid in the vocabulary' (PHP timeout pretty fast). We'd need to ensure that taxonomy's workflow on vocab deletion doesn't lead to this case.

yched’s picture

Also re #55 "where do we document the field settings ?".
This is actually #464548: Document field types specifics (settings, columns...) - in short: there should be a PHPdoc block at the beginning of each field-type module that documents field, widget and formatters settings. I'm not familiar enough with PHPdoc conventions and how they translate to api.drupal.org to know precisely how it should be structured...

Anonymous’s picture

yched:

All of the sudden I've become dissatisfied with the method. I think it's a design flaw to impose this constraint (options come from any vocab and parent but only one vocab and parent can receive a new term) on widgets. There's no reason a widget couldn't support more versatile "add a term" features. I need a few hours to think about this, because I don't want to break the reliability of the options.module widgets.

When a vocabulary is deleted, fields and field instances also persist. This problem is a little bit deeper than you've realized. ;-)

I understand your points about FIELD_QUERY_NO_LIMIT.

I'm not sure what to do about documentation. Of course I want to document, but I also don't know if I'd be setting the standard or if I'd be evaluated against one that I haven't found... does anyone have some specific formatting hints or a template so I can write it well the first time?

Anonymous’s picture

FileSize
12.21 KB

Ok. I'm happy with this now. The field settings include an 'allowed_values' array which is an array of taxonomy trees (vid and parent term). For most of us, a single tree scope definition will be fine, and usually that tree scope will be just the vocabulary ID and parent term 0.

I will be posting the documentation for this new field at #464548: Document field types specifics (settings, columns...) so that this patch doesn't get held up because we're discussing the way fields should be documented.

Anonymous’s picture

FileSize
12.21 KB

I'm glad I looked at that again. That would have been messy ... to keep adding conditions to the field_attach_query.

yched’s picture

@bangpound: I think you can post the doc about the field settings in this patch's PHPdoc for taxonomy_field_info().
At least it will be somewhere where it can be picked up later :-)

Anonymous’s picture

FileSize
12.53 KB

I hope you're right yched. Here's a new patch with settings documentation on the implementation of hook_field_info().

catch’s picture

This is looking great, minor stuff:

The new docs in taxonomy_field_info() look double indented?

+ *
+ * TODO figure out how to reverse the order of cacheable object type checking
+ * and the loading of the objects see #50

#50 where?

There's phpdoc for formatter tests, but no tests underneath, best to hide that away.

Nice work!

Anonymous’s picture

FileSize
12.34 KB

No wonder my to do list only ever grows. #50 was referring to yched's comment #50.

Comment about formatter tests removed. We're actually testing a formatter already. After code freeze, I'd like to make some boilerplate field tests that do a more thorough job of testing all facets of fields and widgets. We don't really have great examples in core yet.

Indentation could be still bad. I'm trying to indicate that the value of allowed_values is an array of arrays.

Anonymous’s picture

FileSize
12.19 KB

This new patch fixes some code style and phpdoc format issues. It also merges the _testTermfieldWidgets() function into testTermfieldWidgets().

webchick’s picture

I looked at this and only found pretty minor stuff in my first pass. A few comments not conforming to standards, an extra /** and a missing /** here and there, and had a couple field API-related questions for bangpound and yched. Overall this looks really good. I'd like to play with it a bit to see if there are more substantiative comments, but overall I found the code pretty easy to follow.

Anonymous’s picture

I rolled patch #67 after chatting with webchick in IRC, and it no longer has the issues she describes. Let me know if there are new questions after you've had a chance to actually use it.

Dries’s picture

I spent 10 minutes skimming this patch while at the airport. In short: I'm cool with it. It could use a more in-depth review, but it sounds like people have scrubbed it quite a bit already. It would be _really_ helpful if we could focus on the CCK UI -- that would make it easier to play around with patche like this. I think we should give that some priority (not in this patch).

Anonymous’s picture

The current HEAD of CCK Field UI contains code to create fields and instances of this type.

yched’s picture

Dries: see #516138: Move CCK HEAD in core - I'll be rolling an initial patch asap.

catch’s picture

Status: Needs review » Reviewed & tested by the community

OK I read through this and couldn't find anything else to complain about. Additionally I applied this patch, #516138: Move CCK HEAD in core and c&ped http://github.com/benjamin-agaric/drupal-taxonomy-as-fields/raw/11ebd8b2... into taxonomy.module and was able to attach a term field to a user, which is wonderful.

There's going to be followup work required - this only gives us the 'term reference' field, it doesn't replace taxonomy_node_*() in any way (that's #412518: Convert taxonomy_node_* related code to use field API + upgrade path), but IMO this exactly what we need for the initial patch - any more would be too much to review properly - and it opens up a lot of other things as well (related terms and synonyms as a field? replacing profile autocomplete and keyword listings with a taxonomy field?) which don't rely on 412518.

Assuming #67 fixes everything webchick mentioned in #68, I think we're ready to go.

webchick’s picture

I actually sat down with CCK module and this patch to dink around with it rather than doing a straight-up code review, which I already did earlier. A couple of things were a little bit funny, but overall this patch is completely awesome!

My first problem was I had a bit of difficulty locating the new field type in the list:
Where's the field?

We could chalk this up to me searching for the keyword "Taxonomy" and not finding it, or me expecting "Text" to be the short "T" word at the bottom and thus ignoring it, etc. But it tripped me up, and I figured I'd mention it.

Additionally, although with "Node" we mask everywhere as "Content", we do not (at least so far) do that with "Taxonomy." And while "Taxonomy" is visible in the admin panel, and "Vocabularies" can be visible if there are more than one taxonomy vocabulary attached to a content type, I don't think "Term" is something commonly associated with taxonomy to the average person (more apt would be "Category" which we don't use), and it's a "normal" English word that means something relating to definitions, so I'm not sure the mental model quite meshes.

It's also possible that I'm just really exhausted and this would not trip up anyone else, including me on a day where I had more than 4 hours of sleep. ;)

The second thing to trip me up is that autocomplete was not in the selection of widgets to choose from. This was particularly weird given that core *does* ship with a Tags vocabulary for just this purpose:

No autocomplete

I talked to bangpound about this and he informed me that this was next on the list of things to do, but the initial pass was just for normal categories. Fair enough.

The interim settings screen further scrambled my brains on the naming:

ambiguity

And finally, it'd be nice if there were some smarts passed along from the vocabulary chosen to the field settings:

Read my mind

I also discovered you cannot delete fields from CCK Field UI. Whee!

Talking over with bangpound, most of these issues identified here are CCK things. Really the only thing that affects this patch are the names. Posting this review so we can talk about it more.

webchick’s picture

In talking with bangpound a bit more, it probably makes sense to go ahead with the names/labels as-is for now and see how the follow-up work shakes out and whether it makes sense to rename at that point.

I found a few more minor things reading through (for example testTermfieldWidgets -> testTermFieldWidgets), but I can fix them prior to commit.

I would love to commit this tonight, but my brains are about to leak out all over the floor courtesy of a splitting migraine. So I'll look at this mid-day tomorrow and, assuming no one else has freaked out over anything, do final touch-ups and commit it then. Thanks for your patience. This week is a little rough. :\

Anonymous’s picture

This patch is really supposed to be the first of a few that break taxonomy.module's relationship to nodes through node API and replace it with Field API. I'm happy to see that webchick expected to see things that aren't there yet.

Tags and tagging are the next problem I'm going to work on. The most basic requirement to bring tagging through Term Fields is to create an autocomplete widget that also adds new terms to the tag vocabulary. I've also solicited input from contrib module maintainers and the wider community about what tagging looks like in Drupal 7. #495774: Modularize tagging Much of the discussion there is a little free-wheeling, so rest assured that I know the priorities despite opening the issue like a brainstorming session. As a free gift from our sponsors, any autocomplete widget for tagging will also work with list module fields, and adding new terms will be optional per term field instance (list module won't support new allowed values).

Vocabulary settings such as nodes, multiple, required and tags are all eligible for being cut. 'Tags' could be an exception; see #495774: Modularize tagging discussion about UI for a tagging vocabulary versus a controlled vocabulary. 'Multiple' and 'required' are field or field instance settings. They're nothing to do with what a vocabulary really is.

Each new vocabulary will automatically create a new field. Although the Field UI is not settled, I believe that site builders will probably be adding existing fields to their "bundles" that will have a clear descriptive label: "Taxonomy: Topic vocabulary terms" or something like that. I defer to others to decide what's best here, but I just want to make sure you know that most use cases will not require setting the vocabulary for a new term field.

I'm unable to give you perfect answers for what this field should be called and how it should be described, though I'm happy to provide my reaction to other ideas! For me, this field is for making relational links to names of things in a schema or vocabulary. That's not a friendly definition, but it's a very powerful concept for organizing information on the web.

The name of the field is 'term' simply because it's what made sense to me. I think 'taxonomy' is not great as the name for the field because (to me) it's a very specific meaning that isn't valid for the values of a field.

The points about 'term reference' are taken. I think I was just making a search and replace there. That's part of the CCK Fields UI patch, so it was made primarily to ease the testing of the patch. It still needs some work!

catch’s picture

On naming, I think this'll be easier to do with CCK UI in core, and in followup patches, but I'd probably go for 'Taxonomy term' as the field name - it's provided by taxonomy module, and it's not a vocabulary - so it gives you a bit of a clue if you managed to find term fields before ever going to admin/build/taxonomy.

Completely agreed with bangpound on vocabulary settings - we want to cut that out of vocabulary creation and storage entirely, so they become just loose containers for terms - and all the help text/tags/multiple etc. is handled via the field UI. But all that happens in the other patch.

Anonymous’s picture

I have editor windows open for another patch on this issue.

So far I have fixed some funky test code that was a hornets nest of curly and square brackets. Chx pointed out how unreadable the code was.

Please find me in IRC to direct me to make any other changes.

Anonymous’s picture

On second thought: 'no let's fix it in a follow up patch.' (per Catch)

Anonymous’s picture

FileSize
12.41 KB

Field changes:

Term becomes Taxonomy term.

Description is now "This field stores a reference to a taxonomy term."

Field info changes:

term becomes taxonomy_term

Formatter info changes

term_default becomes taxonomy_term_default
term_plain becomes taxonomy_term_plain

Test changes:

TermFieldTestCase becomes TaxonomyTermFieldTestCase
testTermFieldValidation becomes testTaxonomyTermFieldValidation
testTermfieldWidgets becomes testTaxonomyTermFieldWidgets
monstrosity $entity->{$this->field['field_name']}[0]['value'] gets a little less monstrous as $entity->{$this->field_name}[0]['value']

RobLoach’s picture

Subscribing........ Umm, nevermind, wrong issue.

drewish’s picture

sub

webchick’s picture

Just a note that I discussed this issue with Dries over dinner the other night, but he said he wanted another crack at it before committing.

catch’s picture

Anonymous’s picture

Hooray! Dries' laptop is fixed! (cough)

BTW: This patch will effectively fix #275352: Taxonomy throwing odd errors when previewing node and all kinds of other issues related to taxonomy's legacy and exceptional behaviors.

Dries’s picture

Status: Reviewed & tested by the community » Needs work

My laptop isn't fixed yet, but I had a chance to look into this:

1. Why is field_formatter_taxonomy_term_default called field_formatter_taxonomy_term_default and not field_formatter_taxonomy_term_link?

2. The documentation of taxonomy_field_load mentions there being one object but it actually takes multiple objects.

3. taxonomy_field_load needs some code comments, IMO.

4. I expected taxonomy_field_widget_info_alter to be unnecessary and these things to be adjusted automatically but maybe I'm not 100% speed on that part yet. I'd need to dig a bit deeper into the code, or the documentation could be a bit more explicit/verbose about what is going on.

5. t("Test the creation of term fields.") should use single quotes.

Anonymous’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
12.84 KB

Here's a new patch.

Thanks to tha_sun for helping me add useful comments in taxonomy_field_load. He also pointed out an unnecessary call to drupal_schema_fields_sql.

The hook_field_widget_info_alter implementation exists because the core field and widget modules explicitly mention support for each other but not for optional core fields, which admittedly don't exist yet. The first few comments on the issue were related to the need to patch options.module, and hook_field_widget_info_alter was developed as a result.

This was marked RTBC before. I apologize if I'm breaking the rules about re-setting that status. LMK.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD.

This needs a CHANGELOG.txt entry, IMO.

I'd also like to see a list of follow-up issues.

Anonymous’s picture

Status: Fixed » Needs review
FileSize
728 bytes

This patch adds a changelog entry.

webchick’s picture

Status: Needs review » Fixed

Committed #90 to HEAD. Thanks!

Status: Fixed » Closed (fixed)
Issue tags: -Favorite-of-Dries

Automatically closed -- issue fixed for 2 weeks with no activity.