While trying to port a module to D7, I noticed that the node object no longer contains $node->taxonomy. The function taxonomy_node_get_terms() has also been removed. When looking at a node object (as returned by node_load), there is no way to tell which fields contain taxonomy data unless you know the precise field names. Those field names can of course have any value; there isn't a pattern like field_taxonomy_[vocabulary_id]. All of this means that it has become much harder to get the terms assigned to a node. I believe the $node->taxonomy array was used by many modules and themes, so I consider this a serious regression.

The upgrade guide that addresses the removal of taxonomy_node_get_terms() (http://drupal.org/update/modules/6/7#taxonomy_node) tells us to loop over the array returned by field_info_fields(). Once you know the term reference field names for the content type you're interested in, you can read the taxonomy terms from the node object. However, that feels like a lot of work to get a couple of term IDs.

In IRC, benjamin-agaric mentioned the existence of the taxonomy_index table. In some situations it may help to query that table directly, and you could even join the tid on the taxonomy_term_data table to filter the results by vocabulary. Still, I think that the API should have some kind of function for this.

I can see that Field API is great and that the level of abstraction in D7 offers a lot of possibilities, but I'm disappointed that taxonomy_node_get_terms() was simply removed instead of being replaced by something like taxonomy_entity_get_terms(). I know and respect that API changes are a no-go at this moment, but can API additions be considered (if not now, perhaps for 7.1)?

Comments

dman’s picture

I'd agree that this sounds a bit painful.
No $node->taxonomy? No API to easily retrieve "fields of type taxonomy"?
:-(

I think I'd have to create an API helper that retrieves "fields of type x from entity" pretty quickly if I'm to port some taxonomy tools to D7.

Damien Tournoud’s picture

Category: bug » feature

This is the textbook definition of a feature request :)

I'm not sure what could ease your pain here. What you describe in the original post seems to correspond to wildly different use cases.

Nick_vh’s picture

After researching this topic together with marcvangend and asking some questions on IRC, I can confirm that it is not as easy as it was before to get something this simple and straightforward in the context of a custom module.

Although it might be a bit late, I do agree that at least a helper function should exist that takes over the functionality of the taxonomy. Drupal is built upon easy development standards. If all of this becomes too abstract, the wall keeps getting higher and higher for developers, which is a regression as stated by marcvangend.

I would love to see some pointers to topics where this was discussed (mainly the indirect ability to get terms as a complete concept from an entity), because now that D7 will get used in production I suppose more of these issues will pop up and some trade-offs will have to be made?

Damien Tournoud’s picture

Although it might be a bit late, I do agree that at least a helper function should exist that takes over the functionality of the taxonomy.

This is all still very unclear to me. A helper function that does what? What is the use case that all those contributed modules presumably out there want to achieve?

From the top of my head, I cannot see a reason a theme or a module would want to pull all the taxonomy terms that are attached to a node. Could you describe the use case here, please?

dman’s picture

Um, a quick search on a dev site of mine throws up a large number of instances where current D6 modules make use of either $node->taxonomy or taxonomy_node_get_terms() on a regular basis.

like token or pathauto - which uses it to make the term names available in patterns
apachesolr adds all taxonomy terms to search indexes
biblio, Feeds, each use the set of all terms when mapping data to it.

A few, like forum, will indeed find the new specific already-restricted-to-vocab method easier, as before they had to filter all of $node->taxonomy to find the one with the right vid. That's handy, although they already had taxonomy_node_get_terms_by_vocabulary() anyway.

But this is a large swing in the opposite direction.
I'm sure we can find ways around it and deal with it. It's more common that an individual module should specify what type of tag it cares about anyway. But it's a regression where it makes things that used to be easy a bit harder now.

marcvangend’s picture

Good to see some activity here after a night's sleep :-)

Damien, there is one important thing that differentiates this issue from a stereotypical feature request: the regression aspect. But let's not get into the feature-vs-bug meta discussion here. There are indeed wildly different use cases for $node->taxonomy and taxonomy_node_get_terms(), but I think that only demonstrates the impact of this regression. Looking at my own use case and at the theme snippets at http://drupal.org/node/46012, I see plenty of use cases for taxonomy_node_get_terms() and taxonomy_node_get_terms_by_vocabulary().

My use case is this: I'm trying to get the smartqueue module (part of nodequeue) working on D7. Smartqueue can automatically build a queue of nodes that are tagged with a term in a specific vocab. When a node is saved, smartqueue loops over $node->taxonomy and takes out the terms for the desired vocab. In D7 that is much harder; I know which vocabulary I'm looking for, but the $node object doesn't reveal which field holds that data. With $node->taxonomy not being available here, taxonomy_node_get_terms_by_vocabulary() would be a great help.

dman, you're talking about a "new specific already-restricted-to-vocab method"... which method is that? If you're referring to the one-field-per-vocab that Field API offers - that doesn't help in my use case. It would help if the field names would follow a pattern like field_taxonomy_[vocabulary_id], but they don't and it's way too late to change that now.

In conclusion: IMO we need helper functions that offer the equivalent of taxonomy_node_get_terms() and taxonomy_node_get_terms_by_vocabulary().

JohnAlbin’s picture

My use case: I have a module that checks the $node object to see what its taxo terms are. If one of the taxonomy terms matches a pre-defined term, then I set up some breadcrumbs and menu tree trails. See #963856: Desperately seeking taxonomy

I haven't had time to research the Field API just to get the node's list of taxonomy terms. Code snippet anyone? Then maybe we could roll a patch or at the very least write some docs on how to do it.

marcvangend’s picture

JohnAlbin: The only code snippet I have seen so far is in the upgrade documentation at http://drupal.org/update/modules/6/7#taxonomy_node. Copy-paste:

The following snippet iterates over field_info_fields to collect machine names of all the vocabularies associated with node type "story" into a $story_vocabs array:

// field_info_fields() returns information about all fields
$fields = field_info_fields();
$story_vocabs = array();
foreach ($fields as $field_name => $field) {
  // $field['bundles'] contains names of bundles and entities associated with this field.
  // keys are entity types, values are arrays of bundle names.
  if ($field['type'] == 'taxonomy_term_reference' && !empty($field['bundles']['node']) && in_array('story', $field['bundles']['node'])) {
    // Collect all vocabularies allowed for the field.
    foreach ($field['settings']['allowed_values'] as $allowed_values) {
      $story_vocabs[] = $allowed_values['vocabulary'];
    }
  }
}

I haven't tested this yet, but I guess it's a start.

greg.harvey’s picture

I thought all regressions were bugs? Why is this not a regression? And if it is, why does it not qualify as a 'bug report'?

Anyway, dman says it all in #5. I just needed $node->taxonomy last week for some category-based ad serving. I often get asked to do weird custom things with categories. I need it all the time. If contrib modules are obliged to all come up with their own way of getting terms, because core no longer does it for them, I can imagine horrible performance issues and hacks a-plenty, as a bunch of different modules try to achieve the same thing in a different way, whereas before taxonomy.module just did it so no one else had to - just call the core function.

I could understand it being dropped from the node object, maybe, in preference for contrib modules loading it in via the core function if required, but the API function vanishing too? Odd decision, IMHO, but I don't know what's involved in making this work in D7. Perhaps it's an oversight born out of architectural decisions and now it's difficult to implement?

joachim’s picture

Category: feature » bug

> From the top of my head, I cannot see a reason a theme or a module would want to pull all the taxonomy terms that are attached to a node. Could you describe the use case here, please?

As has been said above, this is used all the time, by lots of contrib modules and in custom theming.

This is a regression bug. API functionality has been lost.

marcvangend’s picture

Tonight, I tried to port taxonomy_node_get_terms() to D7, just to see if I can. It's quite a change to write dynamic queries (couldn't use the static type because it needs a hook_query_alter() on 'term_access'), but fortunately I don't mind learning something new :-)

However I quit (at least for today) when I found out the following:

It seemed that the {taxonomy_index} table is the drop-in replacement for {term_node} in D6, but it's not. While {term_node} used the $node->vid as foreign key, {taxonomy_index} uses $node->nid, so it can only provide information about the current revision of a node. That means that we should A) jump through hoops (like the snippet in #8) to allow getting the terms of a previous node revision, or B) simply stop supporting revision IDs in taxonomy_node_get_terms().

The problem with option B) is that it would not be a full port of the D6 function, so we would still be left with a bit of regression.

The problem with option A) may be even bigger - I'm not sure yet. The snippet from #8 assumes that node-term relationships are always stored in a field of type 'taxonomy_term_reference'. I'm wondering if that assumption is safe. Can we be certain that other taxonomy input methods (like Hierarchical Select) will also store data in a taxonomy_reference field, or can they use their own field type?

Damien Tournoud’s picture

Title: regression: really hard to get a node's taxonomy terms » It's really hard to get a node's "taxonomy terms"
Category: bug » feature

This is not a regression. Drupal 7 just radically changed the meaning of "node's taxonomy terms".

In Drupal 6, we used to have only one flat list of taxonomy terms attached to a node via the {term_node} table.

In Drupal 7, each node can be put in relationship with terms by way of one or more "taxonomy reference fields". Each of these fields can have its own set of allowed vocabularies, and as a consequence, you could even have the same taxonomy term referenced twice for the same node by way of two different taxonomy reference fields.

Note that there is no 1-1 mapping between "taxonomy reference field" and "vocabulary". Those are two different concepts.

The notion of the "node's taxonomy terms" is pretty close to meaningless now in Drupal 7. You can easily get "the terms referenced by the node in a given taxonomy reference field", and this has full meaning. But getting "all" the terms just cannot make sense in this framework. All the modules / themes that used to do that need to migrate to be field based.
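To make the "terms per reference field" idea concrete, here is a minimal sketch in plain PHP. It is not a core API: the helper name example_entity_tids_by_field() and the field names field_tags and field_section are hypothetical, and the mock object only imitates the storage layout of a loaded D7 node ($node->{$field_name}[$langcode][$delta]['tid']). Inside Drupal you would more likely read one field at a time with field_get_items().

```php
<?php
/**
 * Hypothetical helper (not part of core): collect term IDs per field.
 *
 * Assumes the Drupal 7 layout of field values on a loaded entity:
 * $entity->{$field_name}[$langcode][$delta]['tid'].
 */
function example_entity_tids_by_field($entity, array $field_names) {
  $tids = array();
  foreach ($field_names as $field_name) {
    $tids[$field_name] = array();
    // Skip fields that are absent or empty on this entity.
    if (empty($entity->{$field_name}) || !is_array($entity->{$field_name})) {
      continue;
    }
    // Walk every language, then every delta, and keep the tid of each item.
    foreach ($entity->{$field_name} as $langcode => $items) {
      foreach ($items as $item) {
        if (isset($item['tid'])) {
          $tids[$field_name][] = (int) $item['tid'];
        }
      }
    }
  }
  return $tids;
}

// Mock node standing in for the result of node_load(); 'und' is the value of
// LANGUAGE_NONE in Drupal 7. The field names are made up for illustration.
$node = new stdClass();
$node->field_tags = array('und' => array(array('tid' => 5), array('tid' => 8)));
$node->field_section = array('und' => array(array('tid' => 5)));

$by_field = example_entity_tids_by_field($node, array('field_tags', 'field_section'));
```

Term 5 shows up under both fields here, which is the situation described above: a flat, de-duplicated list would lose the information about which field made the reference.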

Let's see the use case:

dman: Um, a quick search on a dev site of mine throws up a large number of instances where current D6 modules make use of either $node->taxonomy or taxonomy_node_get_terms() on a regular basis.

like token or pathauto - which uses it to make the term names available in patterns
apachesolr adds all taxonomy terms to search indexes
biblio, Feeds, each use the set of all terms when mapping data to it.

Token and Pathauto will probably migrate to field-based taxonomy terms. Something like [field_name:term:name] or similar.

Apachesolr needs to index each taxonomy reference field separately.

Feeds already does a mapping and needs to support mapping incoming terms to taxonomy reference fields.

Not sure about Biblio (not sure I really care either).

marcvangend: My use case is this: I'm trying to get the smartqueue module (part of nodequeue) working on D7. Smartqueue can automatically build a queue of nodes that are tagged with a term in a specific vocab. When a node is saved, smartqueue loops over $node->taxonomy and takes out the terms for the desired vocab. In D7 that is much harder; I know which vocabulary I'm looking for, but the $node object doesn't reveal which field holds that data. With $node->taxonomy not being available here, taxonomy_node_get_terms_by_vocabulary() would be a great help.

You need to move from "Smartqueue can automatically build a queue of nodes that are tagged with a term in a specific vocab" to "Smartqueue can automatically build a queue of nodes that are tagged with a term in a taxonomy reference field".

John Albin: My use case: I have a module that checks the $node object to see what its taxo terms are. If one of the taxonomy matches a pre-defined term, then I set up some breadcrumbs and menu tree trails.

If the configuration is folded into the term entity, it would make sense to iterate over all the fields.

greg.harvey: Anyway, dman says it all in #5. I just needed $node->taxonomy last week for some category-based ad serving. I often get asked to do weird custom things with categories.

For this type of custom functionality, just use a special taxonomy reference field and hardcode its name in your module.

--

Let's study this as a feature request. I'm still not exactly sure which feature that is, given the widely different use cases that we see here.

greg.harvey’s picture

@Damien, thanks for the detailed explanation - I'll bookmark this and have a play with your suggested solution ASAP, but I think I see what you mean. It was a 'wtf?' at the time, but once explained I think this makes sense. Effectively, the table for your field becomes the replacement for term_node (kind of).

joachim’s picture

I have to say, the flat array of $node->taxonomy was not always useful. If you wanted the terms for a given vocab (to make changes in the theme based on them, for instance), you had to muck about with looping and checking their vids.

marcvangend’s picture

Thanks Damien.

You need to move from "Smartqueue can automatically build a queue of nodes that are tagged with a term in a specific vocab" to "Smartqueue can automatically build a queue of nodes that are tagged with a term in a taxonomy reference field".

That's exactly what I was thinking of, last night after writing #11. It's good to see that you consider that the right solution, not just a possible workaround.

I asked "Can we be certain that other taxonomy input methods (like Hierarchical Select) will also store data in a taxonomy_reference field?" I believe that your explanation in #12 implies that the answer is "yes" - is that correct?

Having read #12, I agree with you that this is probably not a bug. It only is a bug if we can not be certain that all node-taxonomy relationships are stored in taxonomy_reference fields, because then we wouldn't have a way to find out which fields store taxonomy data.
Maybe this issue should actually be a task with the title "Document the paradigm shift in programmatically handling node-term relationships".

dman’s picture

In broad terms, I totally agree that the specific per-vocab (or per-taxonomy-reference-field) approach is better and more useful. To use the $node->taxonomy array usefully, I also had to filter on vid like joachim says. This is all a win, and a better way to go.
We can adapt to that, on a case-by-case basis.

The only discomfort comes from being used to the D4, 5, 6 ways, where a question that used to be easy (what terms is this node tagged with) has become much more difficult to answer now. We'll get through it.

Nick_vh’s picture

I don't completely agree with this - making things that used to be easy harder is not a good way forward.
But I can follow the architectural choices, so I suppose that a helper module could be made to ease people into the transition, and in this module they could see exactly how it works with the fields and the iterations.

Thanks for the good explanations all!

joachim’s picture

> that a helper module could be made to ease people into the transition, and in this module they could see exactly how it works with the fields and the iterations.

Yes, I'm wondering if a helper API-ish module might be useful, just so the wheel doesn't get reinvented in all the various taxonomy-based contrib modules...

marcvangend’s picture

joachim, at first I was thinking along the same lines, but I think we have to be careful not to write modules that offer nothing more than 'legacy support'. It has always been a deliberate choice to break backwards compatibility between major versions instead of providing legacy support. If the new approach is able to provide the same functionality (still looking for an answer to my field type question!) then we just need to document and explain it for everyone who has a WTF-moment just like I did.

Damien Tournoud’s picture

I asked "Can we be certain that other taxonomy input methods (like Hierarchical Select) will also store data in a taxonomy_reference field?" I believe that your explanation in #12 implies that the answer is "yes" - is that correct?

Yes. Hierarchical Select, Active Tags and the like will be special widgets of the taxonomy reference field.

marcvangend’s picture

Thank you, Damien.

Does anyone still feel that there is a bug to fix, or a feature to add? AFAIC, we can now turn this into a task for better documentation, using parts of #12 as a starting point.

greg.harvey’s picture

Title: It's really hard to get a node's "taxonomy terms" » Document the changes to taxonomy.module properly to avoid confusion about missing $node->taxonomy property
Category: feature » task
Priority: Major » Normal

Good plan. Let's do it, people can change it back if they disagree.

Scott J’s picture

I was excited to find this conversation, but I'm still not clear on the best way to do this.

I found one idea at http://drupal.org/node/959984#comment-4028200 and I see that Token module is just getting around to this now in #741914: Add a [node:term].

seanbfuller’s picture

I was doing some work on the tac_lite D7 upgrade and wanted to see if I could get at all of the taxonomy information I needed with just the information available to me in hook_form_alter(). I ended up with something like this:

    // Go through all of the fields in the system to find the ones we want to use.
    $term_fields = array();
    $info = field_info_fields();
    foreach($info as $key => $value){
      // First check for taxonomy_term_reference fields.
      if ($value['type'] == 'taxonomy_term_reference') {
        // Then check to see if they are associated with this node type (entity bundle).
        if (isset($value['bundles']['node'])) {
          if (in_array($form['#bundle'], $value['bundles']['node'])) {
            // Add an entry to our term_fields in the form of field name => vocabulary machine name
            $term_fields[$key] = $value['settings']['allowed_values'][0]['vocabulary'];
          }
        }
      }
    }
    
    // Now that we have the names of the fields, go through each one in the form.
    foreach($term_fields as $field_name => $vocab_name) {
      // Get the language key so we can find the correct element in the field.
      $field_language = $form[$field_name]['#language'];
      // Get the vocabulary info so we can get the vid.
      $v = taxonomy_vocabulary_machine_name_load($vocab_name);

      // Now act on $form[$field_name][$field_language] based on the $v->vid or other conditions...
    }

The first chunk seems a bit fragile to have in a contrib module, but seemed cleaner than trying to go through the entire $form structure. Ideally this has a smaller memory footprint than doing a direct query against the taxonomy tables. Not sure if it would make sense to add a helper function down the road that would return a list of field names based on optional entity, bundle or field type parameters. Hopefully this (or something like it) is useful and also in line with the concepts discussed above. Any feedback is definitely welcome.

dman’s picture

What you are doing there is clear code.
It does feel like there should be a more elegant short-cut - some more filtering that could be done by field_info_fields() ?

I think your suggestion along those lines is sensible.

scott.whittaker’s picture

I could see this being the basis of a helper function along the lines of taxonomy_get_fields($entity_type = 'node') that caches the assembled index in a static variable for repeated requests.
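A rough sketch of what such a helper might look like, written as plain PHP so it can be run outside Drupal. The function name example_term_reference_fields() is hypothetical, and the $sample array is hand-written to imitate the shape of field_info_fields() output as used in the snippets above ('type', 'bundles', 'settings'); in a real module you would pass in field_info_fields() itself and could consider caching the result, for instance with drupal_static().

```php
<?php
/**
 * Hypothetical helper: list term reference fields for an entity type/bundle.
 *
 * $fields_info is expected to have the shape returned by field_info_fields().
 * Returns an array of field name => allowed vocabulary machine names.
 */
function example_term_reference_fields(array $fields_info, $entity_type = 'node', $bundle = NULL) {
  $result = array();
  foreach ($fields_info as $field_name => $field) {
    // Keep only term reference fields attached to the given entity type.
    if ($field['type'] != 'taxonomy_term_reference' || empty($field['bundles'][$entity_type])) {
      continue;
    }
    // Optionally narrow down to a single bundle (for nodes: the content type).
    if ($bundle !== NULL && !in_array($bundle, $field['bundles'][$entity_type])) {
      continue;
    }
    // A field may allow more than one vocabulary, so collect all of them.
    $vocabularies = array();
    foreach ($field['settings']['allowed_values'] as $allowed) {
      $vocabularies[] = $allowed['vocabulary'];
    }
    $result[$field_name] = $vocabularies;
  }
  return $result;
}

// Hand-written sample data; field and vocabulary names are made up.
$sample = array(
  'field_tags' => array(
    'type' => 'taxonomy_term_reference',
    'bundles' => array('node' => array('story', 'page')),
    'settings' => array('allowed_values' => array(array('vocabulary' => 'tags', 'parent' => 0))),
  ),
  'body' => array(
    'type' => 'text_with_summary',
    'bundles' => array('node' => array('story')),
    'settings' => array(),
  ),
  'field_region' => array(
    'type' => 'taxonomy_term_reference',
    'bundles' => array('node' => array('event')),
    'settings' => array('allowed_values' => array(array('vocabulary' => 'regions', 'parent' => 0))),
  ),
);

$story_fields = example_term_reference_fields($sample, 'node', 'story');
$all_node_fields = example_term_reference_fields($sample);
```

Returning every allowed vocabulary per field, rather than only the first one, keeps the helper in line with the point made earlier in the thread that fields and vocabularies do not map 1-1.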

Summit’s picture

Hi, this is great architectural information in this thread!
+1 for helper module, but I really like the new architecture going forward!
greetings, Martjin