Is there a reason hook_field_update doesn't update the taxonomy index for unpublished entities? I have a use case where editors can manipulate the taxonomy terms of unpublished nodes but these updates aren't inserted into the taxonomy index table. This makes it a challenge to create a view that searches unpublished nodes by taxonomy term name (as opposed to tid, which can be search by taxonomy_field_name_instance)

This would be trivial to patch as it is just an if statement in line 501 of field.api.php.

Files: 
CommentFileSizeAuthor
#30 add-unpublished-nodes-to-taxonomy-index-962664-30.patch8.34 KBklaasvw
FAILED: [[SimpleTest]]: [MySQL] 34,438 pass(es), 4 fail(s), and 0 exception(s).
[ View ]
#28 add-unpublished-nodes-to-taxonomy-index-962664-28.patch8.29 KBklaasvw
FAILED: [[SimpleTest]]: [MySQL] 34,296 pass(es), 4 fail(s), and 2 exception(s).
[ View ]
#20 taxonomy-index-unpublished-20.patch1.54 KBxjm
PASSED: [[SimpleTest]]: [MySQL] 33,544 pass(es).
[ View ]
#18 taxonomy-index-unpublished-18.patch1.54 KBxjm
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in modules/taxonomy/taxonomy.module.
[ View ]
#4 taxonomy.module.patch1.53 KBngmaloney
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch taxonomy.module_65.patch.
[ View ]

Comments

Component:field system» taxonomy.module

recategorizing

The code for this indexing happens in /modules/field/field.api.php so one could argue it is belongs there. Regardless of where it lives, it would be helpful to give developers the ability to index non-published fields, particularly in views. The patch could be something as simple as:

(field.api.php:501)

if ($entity->status) {...

to something like...

if($entity->status || variable_get('index_non_published',false)) {...

field.api.php is just documentation code that is never executed.
The behavior happens in taxonomy_field_update() in taxonomy.module.

StatusFileSize
new1.53 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch taxonomy.module_65.patch.
[ View ]

Thanks for the info yched. Attached is a patch that would allow developers to override the default behavior (allowing non-published entities to be indexed in taxonomy_index). It is similar to how taxonomy currently handles "taxonomy_maintain_index_table".

Status:Active» Needs review

catch and bangpound designed the behaviour of the taxonomy index table.
Changes around this would need their seal of approval.

Status:Needs review» Needs work

The last submitted patch, taxonomy.module.patch, failed testing.

patch needs to be rolled with git diff --no-prefix :-)

I'm going to need a little orientation with this issue, so please be patient with me.

The taxonomy_index table is used to reproduce the "taxonomy/term/%term" pages similar to how they were done in Drupal 6. Is Views also relying on it for some reason? Why can't it use the field data tables directly? The taxonomy_index table is mostly a bridge back to old functionality. You should still be able to create a view that can show unpublished nodes. These term vallues are stored in the field data tables.

bangpound, you are correct, It is possible to query by field value in views. One of the drawbacks is, by default, you can only query by TID, not Term Name. Having the values in the taxonomy_index table gets around this issue. The other way is to write a php argument handler. The patched method seems like a "cleaner" implementation when doing look ups by Term Name.

I made the patch merely as a demo to spark some dialog. While it would be helpful I understand if it creates new issues and isn't worth going any further with.

This is also necessary for #741914: Add a [node:term] to work reliably, which is a token that I'm getting a *ton* of requests for.

EDIT: Wrong issue link...fixed.

The patch in #4 seems to fix this issue on nodes that are created, but upon saving a new (unpublished) node there still seems to be a problem. Any ideas on how to solve this?

Subscribed.

Is there documentation for module maintainers on how to get taxonomy data for unpublished nodes? That is if this issue is not being followed through.

Subbing

Has this issue been abandoned? It is holding back #741914: Add a [node:term] and it would be really great to be able to have forums which have a proper naming structure again. If there is a work around could someone please post it?

Version:7.0-beta2» 8.x-dev
Issue tags:+needs backport to D7

Subscribing, re-versioning, tagging, and linking to relevant Views issue: #1182924: Unpublished entities do not appear in views when using a taxonomy relationship

I have written http://drupal.org/project/taxonomy_entity_index in the meantime to index all entities regardless of published or not. I've opened an issue to add views integration for it.

This is the same sort of trap that D6 core had in node_access(): if a node was not published, all access grants were ignored. Game over.

There are two checks for $entity->status in the Taxonomy module, one when inserting entities and one when updating. In my opinion neither should be there.

It should not be up to the Taxonomy module to do nothing simply because a node/entity is not published.

It's up to the (node) access modules to decide whether (fields of) a node/entity should be displayed to the logged-in user or not.

To give my D7.4 users Views that display taxonomy terms correctly whether the nodes in question are published or not, I ended up implementing in my module hook_node_insert() and hook_node_update() inserting in their bodies calls to a function that basically does what taxonomy_field_update() does, but unconditionally, regardless of publication status.

Status:Needs work» Needs review
StatusFileSize
new1.54 KB
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in modules/taxonomy/taxonomy.module.
[ View ]

Rerolled against 8.x, with the following changes:

  1. Renamed the variable to taxonomy_index_unpublished, which is more standard terminology
  2. Changed the default value to FALSE so the default core behavior is maintained when the variable is not set.

Status:Needs review» Needs work

The last submitted patch, taxonomy-index-unpublished-18.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new1.54 KB
PASSED: [[SimpleTest]]: [MySQL] 33,544 pass(es).
[ View ]

Erm. Let's try a valid patch.

Status:Needs review» Needs work

taxonomy_select_nodes() (which creates the term page and feed listings) selects all nodes by term from the {taxonomy_index} table. This patch would introduce unpublished nodes into those listings. So, at a minimum, the patch needs to include a fix there as well.

We would also need to identify any other instances where core might be making this assumption. There's no accounting for contrib. The upshot is that the setting would only ever be used by contrib or custom module developers who knew it existed, so for anyone else, no change has been made.

From catch on IRC:

I could live with adding unpublished nodes, if we add the status column too and fix the query.
adding a column is not out of the question, the table is built dynamically.

So, to add a status column to the table, the following changes would need to be made:

  1. Schema updated with the additional column.
  2. hook_update_N() added to:
    • add the column to the existing table.
    • set the status for existing records to 1.
    • insert records for unpublished nodes, with status 0.
  3. Add a WHERE status = 1 condition to existing selects against this table where relevant.
  4. Remove the if ($entity->status) checks in taxonomy_field_insert() and taxonomy_field_update().
  5. Add the status to the db_insert() queries in taxonomy_field_insert() and taxonomy_field_update().

Shameless subscribe - been bitten hard by this one for creating adminisatration pages a client today.

Subscribing

subscribing, would love to see this in D7 asap :x thanks!

PS. Agreeing to what http://drupal.org/node/962664#comment-4692122 said, should this not be marked as a bug report? IMO it really is one as it makes it impossible to fetch unpublished nodes through their taxonomies :/

Well, I think leaving out unpublished nodes was a design feature to reduce overhead, since this table can get crazy-big on a large site. Technically you could use the EFQ to look up unpublished nodes by taxonomy--fetch a list of all fields, look for taxonomy fields, look up which bundles have these fields, and loop to query against values in all these fields. But that's kinda crazy.

#962664-23: Taxonomy Index for unpublished entities has a pattern for a solution that one of the core taxonomy maintainers has tentatively okayed, so that's the best bet for fixing this. I can try a patch at some point when I have time, but I won't have time this week, so if someone else does have time this week... :)

Status:Needs work» Needs review
StatusFileSize
new8.29 KB
FAILED: [[SimpleTest]]: [MySQL] 34,296 pass(es), 4 fail(s), and 2 exception(s).
[ View ]

Here's a patch that does the following:

  1. Add a status field to the taxonomy_index table and set it to 1 for existing entries.
  2. Add the status field to term_node index in the taxonomy_index table.
  3. On update, execute a batch job to add all unpublished nodes to the taxonomy_index table.
  4. Index both published and unpublished nodes.
  5. Set the status field in taxonomy_index to that of the node on insert and update.
  6. Make sure taxonomy_select_nodes only returns results for published nodes.
  7. Update documentation and field.api.inc code to reflect all the above changes.

Status:Needs review» Needs work

The last submitted patch, add-unpublished-nodes-to-taxonomy-index-962664-28.patch, failed testing.

StatusFileSize
new8.34 KB
FAILED: [[SimpleTest]]: [MySQL] 34,438 pass(es), 4 fail(s), and 0 exception(s).
[ View ]

Second try! This patch uses $status instead of $node->status and fixes some newline and whitespace issues.

I have no idea why the hook_update is failing. Looks like testbot sets up a site with the new schema and then executes the update, instead of setting up a site with the old schema.

Does someone have any idea how to fix this?

Status:Needs work» Needs review

Setting status.

Status:Needs review» Needs work

The last submitted patch, add-unpublished-nodes-to-taxonomy-index-962664-30.patch, failed testing.

Has anyone tried #30, really needing it for a project.

@arcane: In the interim, you might consider:
http://drupal.org/project/taxonomy_entity_index

Thanks, I did get taxonomy_entity_index working

@xjm - re: taxonomy_select_nodes() update: the current patch includes that change.

...friendly bump ;)

Kick to D9? :/

Any reason it's a feature request and not a bug? I mean, really, index gets out of sync with nodes indexed in it... why? Was that some historic, pre-views design from times when adding filter->published for taxonomy listings was difficult? I was sure Drupal matured way past that.

It's a performance optimization for a common query.

I always believed that performance data was supposed to be kept in cache_* tables. Current approach simplify one query, but breaks other, particularly in views. And given views are going to be in core now, it's certainly a core bug that setting view filters to show unpublished nodes of taxonomy term always returns empty set, isn't it? Question is, is it a bug in views or in taxonomy?

Issue tags:+Views in Core

Adding "Views in Core" tag as this is quite significant incompatibility between views, new in core, and taxonomy. This incompatibility was forgiveable when views was a contrib module not meant to fully support everything out there. And please, consider if it's a views problem (they use table not meant for it in their filters) or taxonomy one (design flaw in table's routines).

Issue tags:-Views in Core

With Views in core, it is possible that the taxonomy-module page that relies on this table will be removed entirely and replaced with a view. Currently, you should use one or the other.

It's not possible to build sane views with taxonomy filters now, as views are using this very table to join nodes to taxonomy, and it contains only part of the data. That's the very point. Do you suggest that using taxonomy_index in views code is views bug now?

Just ran into this issue on a new feature to a site I am working on.

Wanted to avoid installing another module onto the site which includes it's own tables, so comment #30 was the key. I will not even be using the taxonomy index table for page displays so the aspects of status does not apply to my use case but I understand there would be further development after the status row is added.

I applied the patch in #30 pretty cleanly to a Drupal 7 install (minus the sandbox aspects of Drupal 8 upgrade scripts) and it is working exactly as intended. No adverse affects to other parts of the site and the beauty is that views continues to work flawlessly with this patch and is non the wiser.

Issue summary:View changes

coolestdude1 -- I'm having trouble applying the patch on my D7 site -- hunk #2 of the patch to taxonomy.install fails for me (everything else applies fine). It looks like it isn't working because the patch is trying to act on a different version of taxonomy.install than what I have, which is from Drupal 7.26. Did you manually apply this patch, is that why it worked for you, or? I realize there's some D8 "stuff" going on, but the other parts of the patch applied so cleanly, I feel like I'm missing something...

You can manually apply most patches if that works better for you. However module and core maintainers generally require all patches to be against the latest development copies, the process is typically referred to as 'rebase' around here. As mentioned in my previous post I did not require the status field so I left that whole portion of the patch out of my changes and picked which portions would apply here. My thought process for this change was to drop the node status as I do not really care if an entry is made on the taxonomy table for an unpublished node (distinguishing them was not something I found useful in my use cases).

I would say the most useful part of the patch is the changing to the way in which the node saves it's taxonomy information. Typically that information is stored alongside the nodes field tables and is later aggregated into the taxonomy_index table upon node publish. So the best part of the previous patch was getting past the if statement for the node status.

If you have any pages which directly expose taxonomies and nodes to non privileged users I would advise against this change as of now or at least hold off until the status flag is in use or a better solution can be found.

I have also begun to experiment with using this patch and the view_unpublished module to beef up security but have yet to have that particular solution working. Hope that helps.