Problem/Motivation

We need a view for our content authors which displays content revisions that are filterable by moderation_state.

The (current) problem with this is that, per node, there could be several revisions with the same state (i.e "Needs Review"). We need a way to filter a view of content revisions to only the latest revision.

Proposed resolution

TBC, Either:
* A views_data hook
* Or a Filter plugin

Remaining tasks

Decide on approach.
Implement.
Test.

User interface changes

None.

API changes

TBC.

Data model changes

TBC.

Original report by acbramley

As we are beginning to implement workbench_moderation in our project we are coming across small things that are missing.

It would be good to be able to filter a content revision view by the current (i.e latest) revision.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

acbramley created an issue. See original summary.

acbramley’s picture

Issue summary: View changes
dawehner’s picture

Afaik you can totally achieve this:

uuid: 97fef943-3c2b-4739-be07-7de0750d738f
langcode: en
status: true
dependencies:
  module:
    - node
    - user
id: revs
label: revs
module: views
description: ''
tag: ''
base_table: node_field_revision
base_field: vid
core: 8.x
display:
  default:
    display_plugin: default
    id: default
    display_title: Master
    position: 0
    display_options:
      access:
        type: perm
        options:
          perm: 'view all revisions'
      cache:
        type: tag
        options: {  }
      query:
        type: views_query
        options:
          disable_sql_rewrite: false
          distinct: false
          replica: false
          query_comment: ''
          query_tags: {  }
      exposed_form:
        type: basic
        options:
          submit_button: Apply
          reset_button: false
          reset_button_label: Reset
          exposed_sorts_label: 'Sort by'
          expose_sort_order: true
          sort_asc_label: Asc
          sort_desc_label: Desc
      pager:
        type: full
        options:
          items_per_page: 10
          offset: 0
          id: 0
          total_pages: null
          expose:
            items_per_page: false
            items_per_page_label: 'Items per page'
            items_per_page_options: '5, 10, 25, 50'
            items_per_page_options_all: false
            items_per_page_options_all_label: '- All -'
            offset: false
            offset_label: Offset
          tags:
            previous: '‹ Previous'
            next: 'Next ›'
            first: '« First'
            last: 'Last »'
          quantity: 9
      style:
        type: default
        options:
          grouping: {  }
          row_class: ''
          default_row_class: true
          uses_fields: false
      row:
        type: fields
        options:
          inline: {  }
          separator: ''
          hide_empty: false
          default_field_elements: true
      fields:
        changed:
          id: changed
          table: node_field_revision
          field: changed
          entity_type: node
          entity_field: changed
          alter:
            alter_text: false
            make_link: false
            absolute: false
            trim: false
            word_boundary: false
            ellipsis: false
            strip_tags: false
            html: false
          hide_empty: false
          empty_zero: false
          plugin_id: field
          type: timestamp
          settings:
            date_format: medium
            custom_date_format: ''
            timezone: ''
          relationship: none
          group_type: group
          admin_label: ''
          label: ''
          exclude: false
          element_type: ''
          element_class: ''
          element_label_type: ''
          element_label_class: ''
          element_label_colon: true
          element_wrapper_type: ''
          element_wrapper_class: ''
          element_default_classes: true
          empty: ''
          hide_alter_empty: true
          click_sort_column: value
          group_column: value
          group_columns: {  }
          group_rows: true
          delta_limit: 0
          delta_offset: 0
          delta_reversed: false
          delta_first_last: false
          multi_type: separator
          separator: ', '
          field_api_classes: false
        title:
          id: title
          table: node_field_revision
          field: title
          entity_type: node
          entity_field: title
          label: ''
          alter:
            alter_text: false
            make_link: false
            absolute: false
            trim: false
            word_boundary: false
            ellipsis: false
            strip_tags: false
            html: false
          hide_empty: false
          empty_zero: false
          settings:
            link_to_entity: false
          plugin_id: field
          relationship: none
          group_type: group
          admin_label: ''
          exclude: false
          element_type: ''
          element_class: ''
          element_label_type: ''
          element_label_class: ''
          element_label_colon: true
          element_wrapper_type: ''
          element_wrapper_class: ''
          element_default_classes: true
          empty: ''
          hide_alter_empty: true
          click_sort_column: value
          type: string
          group_column: value
          group_columns: {  }
          group_rows: true
          delta_limit: 0
          delta_offset: 0
          delta_reversed: false
          delta_first_last: false
          multi_type: separator
          separator: ', '
          field_api_classes: false
      filters:
        status:
          value: true
          table: node_field_revision
          field: status
          plugin_id: boolean
          entity_type: node
          entity_field: status
          id: status
          expose:
            operator: ''
          group: 1
      sorts: {  }
      header: {  }
      footer: {  }
      empty: {  }
      relationships:
        vid:
          id: vid
          table: node_field_revision
          field: vid
          relationship: none
          group_type: group
          admin_label: 'Get the actual content from a content revision.'
          required: true
          entity_type: node
          entity_field: vid
          plugin_id: standard
      arguments: {  }
      display_extenders: {  }
    cache_metadata:
      max-age: -1
      contexts:
        - 'languages:language_content'
        - 'languages:language_interface'
        - url.query_args
        - 'user.node_grants:view'
        - user.permissions
      tags: {  }
acbramley’s picture

The latest revision is not necessarily published when using workbench_moderation so that filter isn't going to work.

Also adding that relationship and those fields adds a join to node_field_data so you are getting the revision that's referenced in that table.

dawehner’s picture

Well, then move this issue to workbench_moderation, it is a concept they should solve.

acbramley’s picture

Project: Drupal core » Workbench Moderation
Version: 8.0.x-dev » 8.x-1.x-dev
Component: views.module » Code

Moving.

EDIT: after thinking about it a bit I'm not entirely sure this is a workbench_moderation thing since revisions are core, but we'll see what the WBM people think.

Crell’s picture

Here's a first stab at a latest-revision filter. It's not working yet, though.

Background: In Drupal 7, WBM maintained its own history table to track various and sundry things, which included a flag to indicate what revision was the latest revision. In D8, we don't have such a table so we have nowhere to store that flag. Sad panda.

In IRC, dawehner suggested using max(vid) and GROUP BY to produce the same result. Amazingly enough, that seems to work in my testing. The SQL query we want to generate is:

select nid, max(vid) as max_vid, max(revision_timestamp) as max_timestamp from node_revision group by nid

(Or equivalent.)

I do NOT know if this will scale performance-wise. I've only run it on a table with a dozen revisions, so anything will be fast. Further testing there is needed.

Trying to replicate that with Views, I got to the attached patch. (Hat tip to larwolan along the way.) However, although I am calling addGroupBy() on the query the GROUP BY clause is not showing up in the resulting query, and thus the query is invalid. I am unclear why this is, but my suspicion is that it relates to this excerpt from \Drupal\views\Plugin\views\query\SQL::query():

    // Assemble the groupby clause, if any.
    $this->hasAggregate = FALSE;
    $non_aggregates = $this->getNonAggregates();
    if (count($this->having)) {
      $this->hasAggregate = TRUE;
    }
    elseif (!$this->hasAggregate) {
      // Allow 'GROUP BY' even no aggregation function has been set.
      $this->hasAggregate = $this->view->display_handler->getOption('group_by');
    }
    $groupby = array();
    if ($this->hasAggregate && (!empty($this->groupby) || !empty($non_aggregates))) {
      $groupby = array_unique(array_merge($this->groupby, $non_aggregates));
    }

Is that perhaps a bug in Views? It does seem like hasAggregate should be true if there is a group by field. (The getOption() call is also returning false.) I will at this point ask for help from dawehner. Halp???

lokapujya’s picture

I didn't have time to test this ( I will later) , but it at least produces a valid query. I also turned aggregation on via the UI:

<?php
    $query->addField(NULL, "{$table}.vid",NULL,['aggregate'=>TRUE]);
    $query->addField(NULL, "{$table}.revision_timestamp",NULL,['aggregate'=>TRUE]);
?>
dawehner’s picture

Yeah, no idea, without having properly time to look at it. This might be related with https://stackoverflow.com/questions/1225144/why-does-mysql-allow-group-b...

Crell’s picture

lokapujya: That doesn't work for me either, whether I check "Aggregate" in the UI or not. The GROUP BY clause is still not added to the query string.

If I hack the Views code above to set hasAggregate if $this->groupby has a value, too, I get this error:

SQLSTATE[HY000]: General error: 1111 Invalid use of group function: SELECT COUNT(*) AS expression FROM (SELECT max(node_revision.vid) AS maxnode_revisionvid, max(node_revision.revision_timestamp) AS maxnode_revisionrevision_timestamp, 1 AS expression FROM {node_field_revision} node_field_revision INNER JOIN {node_revision} node_revision ON node_field_revision.vid = node_revision.vid GROUP BY node_revision.vid, max(node_revision.vid), max(node_revision.revision_timestamp)) subquery; Array ( )

Which appears to be on the count query version of the View, which I think is only running to render the pager in the preview. The resulting query is

SELECT max(node_revision.vid) AS maxnode_revisionvid, max(node_revision.revision_timestamp) AS maxnode_revisionrevision_timestamp, MIN(node_field_revision.vid) AS vid
FROM 
{node_field_revision} node_field_revision
INNER JOIN {node_revision} node_revision ON node_field_revision.vid = node_revision.vid
GROUP BY node_revision.vid, max(node_revision.vid), max(node_revision.revision_timestamp)

It seems to be grouping by everything, which seems odd. Also, what the heck is the node_field_revision table? :-)

dawehner: I'm pretty sure this is an issue on the Views end, as it's simply not adding a GROUP BY clause at all, which is guaranteed wrong anyway.

Crell’s picture

Priority: Normal » Major

Let's be honest, this is kind of a big deal...

lokapujya’s picture

Crell, did you make the changes to your addField() calls that I posted in #8? I don't think the max() function should be in the "group by". After doing that, I see the group by is added to the query and don't get the group by errors.
I am on mySQL 5.7 though. mySQL 5.7 became more strict so that you basically are not allowed to select a field that is not either in an aggregate function or in the group by.

After doing #8, from Views preview:

SELECT node_field_revision.changed AS node_field_revision_changed, node_field_revision.title AS node_field_revision_title, node_revision.vid AS node_revisionvid, node_revision.revision_timestamp AS node_revisionrevision_timestamp, MIN(node_field_revision.vid) AS vid
FROM 
{node_field_revision} node_field_revision
INNER JOIN {node_revision} node_revision ON node_field_revision.vid = node_revision.vid
WHERE (( (node_field_revision.status = '1') ))
GROUP BY node_revision.vid, node_revision.revision_timestamp, node_field_revision_changed, node_field_revision_title
LIMIT 10 OFFSET 0
acbramley’s picture

Thanks for picking this up @Crell!
node_field_revision seems to be where all the extra data about the revision is stored, I find it odd that revision_log is still in the node_revision table. I've opened #2674518: Add revision log field to views as well, it would be a whole lot easier if that field was in the node_field_revision table!

dawehner’s picture

@Crell
The symptoms you describe in #7 basically read like #2674480: Views does not properly ignore the query cache during preview.

EclipseGc’s picture

Status: Needs review » Needs work

So I worked on this a bit this morning to see if I could help move it along, interesting points of interest (which could be repetitively redundant).

1.) The query currently generated is essentially: https://gist.github.com/EclipseGc/f8d61f8eeaf043db9131

This query has max() calls in the group by which do not appear to work.

2.) Everything else in the query is mostly a unique identifier.

Title could be unique, vid definitely is, date almost certainly is, so all the results are going to appear, we need to group by nid

3.) The title and changed appear to be added based upon the fields in the view.

I'm not certain why this is. I suspected NodeRevision wizard to be the culprit but haven't found any evidence to support that suspicion. Whatever the case, the fields we add appear to be affecting the group by logic.

4.) I only know all this because I got the group to actually apply.

Deep in the bowels of \Drupal\views\Plugin\views\query\Sql::query() you'll find this line:


$this->hasAggregate = $this->view->display_handler->getOption('group_by');

Unfortunately the query plugin's addGroupBy method does not update the display's options to reflect that we want to do a group_by, so I added a line to the latest revision handler provided by this patch that after running $query->addGroupBy("{$table}.nid"); we do a:


$this->view->display_handler->options['group_by'] = TRUE;

This passes the buck to the pager though. I simply forced a no-pager scenario to debug further, but we'll have to update the pager's query as well.

Hope this is helpful in some way.

Eclipse

Crell’s picture

Thanks, Kris.

After walking through this this morning with EclipseGc, and brainstorming the performance implications further, I decided to abandon the GROUP BY approach. I don't want to fight Views that badly, plus we decided the likely workarounds would run into performance issues at scale.

I am therefore now working on the backup plan, which is to maintain our own tracking table of what the latest revision is and expose that to Views, allowing us to just join against it like a normal person. Completely new patch coming when I've written it...

arknoll’s picture

++ @crell

lahoosascoots’s picture

Hey! We really need this!

We are trying to add the Moderation State of the latest revision to the Content overview page. This functionality would give us the exact relationship we need to get that done.

Crell’s picture

Work in progress. Still doesn't actually work. At the moment I'm hoping to avoid having to manually map the new table to every base table that exists and use direct query modification in the filter itself, but I may not be able to avoid that. (Anyone with more Views-fu, your recommendations are welcome.)

Status: Needs review » Needs work

The last submitted patch, 19: 2674520-latest-revision-table.patch, failed testing.

jibran’s picture

+++ b/workbench_moderation.module
@@ -158,3 +172,303 @@ function workbench_moderation_theme($existing, $type, $theme, $path) {
+function hook_views_data() {

Thanks for the docs. :P

Crell’s picture

Now you get a glimpse into how Crell develops for complex hooks... :-P

jibran’s picture

Just for the record hook_views_data should go to *.views.inc.

Crell’s picture

I expect to push it to a service anyway, so it shouldn't really matter at that point, should it?

jibran’s picture

I expect to push it to a service anyway, so it shouldn't really matter at that point, should it?

Moving it to service is fine but having a *.views.inc file in the modules just shows that there is a views integration by looking at file system that's why I'd prefer having it.

Crell’s picture

OK, so as seen in the above snapshot of not-workingness, I'm unsure which of two annoying approaches to take. I'd love some input from someone who knows Views better than I do on the best way to do this.

Situation: I have a table that consists of entity type, entity id, revision id, language, and an int-boolean flag (latest). I want to add a filter to a "revision" view of any revisionable content type, such that when making a view of revisions only those with the "Latest" bit set are included. There's a single table for all entity types, since there seems to be no value-add to making a separate table. Now I need to build that filter. Options:

1) Do everything through hook_views_data(), and create "latest" as a virtual field in views. This will require setting up joins against every revision base table dynamically. Because the keys in every entity are different, this involves a ton of indirection and lookup up values from the entity definition in entity type manager, and I'm not entirely clear how to also build a join clause for "and this field is this literal value", because for, eg, nodes I want to join where entity_type="node" and all the other fields match up with the record in the base table. (And it's debatable if the "latest" column should be hard coded join, or if that should be exposed as a checkbox option for latest/is not latest. I cannot think of a case where you'd want is-not-latest, but that doesn't prove it doesn't exist.) Problem: Fugly massive array iteration and dynamic assignment, and I don't know how to do static value compound joins.

2) Skip the table definition and just modify the query directly in the plugin. This feels like it would be lighter weight and less code, and make the static parts easier. Or rather, would if I were dealing with vanilla DBTNG Select queries. As is, the best I can do is a long list of WHERE expressions that I hope come out right. Problem: Again, I'm not sure if my knowledge of the Views API is even remotely correct here, or if this kind of direct query manipulation is one of the things parents warn their children about when sending them off to school for the first time, along with not taking candy from strangers.

Someone from the Views world: which approach is less-bad, and any pointers on how to take that approach in the least-bad way? What API calls do I not know exist that I should be using? (The inline documentation is unfortunately thinner than I'd like.)

larowlan’s picture

Comment module has a similar conundrum, and does option 1

dawehner’s picture

  1. +++ b/src/Plugin/views/filter/LatestRevision.php
    @@ -0,0 +1,80 @@
    +
    +    $query->addWhere($query->options['group'], 'entity_type', $this->getEntityType());
    +    $query->addWhereExpression($query->options['group'], "{$table}.{$keys['id']}={$tracker_table}.entity_id", []);
    +    $query->addWhereExpression($query->options['group'], "{$table}.{$keys['langcode']}={$tracker_table}.langcode", []);
    +    $query->addWhereExpression($query->options['group'], "{$table}.{$keys['revision']}={$tracker_table}.revision_id", []);
    +  }
    

    Regarding the question of code vs. hook_views_data() I would use the following decision tree:

    a) If its sort of easy, use hook_views_data(), note I added #2699721: Add a domain object to easy defining hook_views_data which at least removes the argument of array handling.
    b) Use a domain specific join plugin
    c) Fallback onto filter plugin, if still needed.

  2. +++ b/src/RevisionTracker.php
    @@ -0,0 +1,129 @@
    +  protected function ensureTableExists() {
    +    try {
    +      if (!$this->connection->schema()->tableExists($this->tableName)) {
    +        $this->connection->schema()->createTable($this->tableName, $this->schemaDefinition());
    +        return TRUE;
    +      }
    +    }
    +    catch (SchemaObjectExistsException $e) {
    +      // If another process has already created the table, attempting to
    +      // recreate it will throw an exception. In this case just catch the
    +      // exception and do nothing.
    +      return TRUE;
    +    }
    +    return FALSE;
    +  }
    +
    

    This is a bad code design, rather you should do something what core does: lazy create the table when the actual query fails.

  3. +++ b/src/RevisionTrackerInterface.php
    @@ -0,0 +1,27 @@
    +  public function setLatestRevision($entity_type, $entity_id, $langcode, $revision_id);
    

    What about using updateLatestRevision which maybe makes it clearer that this does some DB operation as well.

  4. +++ b/workbench_moderation.module
    @@ -158,3 +172,303 @@ function workbench_moderation_theme($existing, $type, $theme, $path) {
    +function hook_views_data() {
    +  // This example describes how to write hook_views_data() for a table defined
    +  // like this:
    +  // CREATE TABLE workbench_revision_tracker (
    +  //   nid INT(11) NOT NULL         COMMENT 'Primary key: {node}.nid.',
    +  //   plain_text_field VARCHAR(32) COMMENT 'Just a plain text field.',
    +  //   numeric_field INT(11)        COMMENT 'Just a numeric field.',
    +  //   boolean_field INT(1)         COMMENT 'Just an on/off field.',
    +  //   timestamp_field INT(8)       COMMENT 'Just a timestamp field.',
    +  //   langcode VARCHAR(12)         COMMENT 'Language code field.',
    +  //   PRIMARY KEY(nid)
    +  // );
    +
    +  // Define the return array.
    +  $data = array();
    +
    +  // The outermost keys of $data are Views table names, which should usually
    +  // be the same as the hook_schema() table names.
    +  $data['workbench_revision_tracker'] = [];
    +
    +  // The value corresponding to key 'table' gives properties of the table
    +  // itself.
    +  $data['workbench_revision_tracker']['table'] = [];
    +
    +  // Within 'table', the value of 'group' (translated string) is used as a
    +  // prefix in Views UI for this table's fields, filters, etc. When adding
    +  // a field, filter, etc. you can also filter by the group.
    

    lol

Crell’s picture

Marked as a duplicate of this issue: #2626166: Create a view for content revisions

jibran’s picture

I think scope of #2626166: Create a view for content revisions and this issue are different. That one is about adding a view it is about adding a view handler.

acbramley’s picture

I agree with @jibran, #2626166: Create a view for content revisions by the looks of it is targetted at creating a view for the revisions page of a node, entirely separate issue to what I created this issue for. I am going to update the issue summary now since we have some traction.

acbramley’s picture

Issue summary: View changes
Crell’s picture

OK, with much help from dawehner and EclipseGc, here's what I think is a tested and working approach! :-)

It adds a filter for "is latest revision". There's no configuration or negation because I don't know why you'd want to do that, and it's harder to make that work. Add that filter to a node_revision or block_content_revision view and poof, no more older revisions.

Can I get an Amen, RTBC, or turkey sandwich?

jibran’s picture

  1. +++ b/src/Plugin/views/filter/LatestRevision.php
    @@ -0,0 +1,82 @@
    +  public function __construct(array $configuration, $plugin_id, $plugin_definition, EntityTypeManagerInterface $entity_type_manager) {
    ...
    +  public static function create(ContainerInterface $container, array $configuration, $plugin_id, $plugin_definition) {
    ...
    +  public function adminSummary() { }
    ...
    +  protected function operatorForm(&$form, FormStateInterface $form_state) { }
    ...
    +  public function canExpose() { return FALSE; }
    ...
    +  public function query() {
    

    Docs missing.

  2. +++ b/src/Plugin/views/filter/LatestRevision.php
    +  public function adminSummary() { }
    

    It should be good if we are able to add something helpful here.

  3. +++ b/src/Plugin/views/filter/LatestRevision.php
    @@ -0,0 +1,82 @@
    +  public function getCacheContexts() {
    +    $contexts = parent::getCacheContexts();
    +
    +    return $contexts;
    +  }
    

    Is this really needed?

  4. +++ b/src/RevisionTrackerInterface.php
    @@ -0,0 +1,27 @@
    +<?php
    +namespace Drupal\workbench_moderation;
    

    Space missing.

  5. +++ b/src/RevisionTrackerInterface.php
    @@ -0,0 +1,27 @@
    +
    +
    

    One enter two many.

  6. +++ b/tests/modules/workbench_moderation_test_views/config/install/views.view.latest.yml
    @@ -0,0 +1,399 @@
    +base_table: node_field_revision
    

    I think entity_revision_test would be a better candidate for this view.

jibran’s picture

  1. +++ b/src/RevisionTracker.php
    @@ -0,0 +1,154 @@
    +        'revision_id' => $revision_id,
    

    Shouldn't this be 'entity_id'? An entity can only have one latest revision.

  2. +++ b/src/RevisionTracker.php
    @@ -0,0 +1,154 @@
    +      'primary key' => ['entity_type', 'entity_id', 'langcode'],
    

    Why not revision id?

Crell’s picture

1) Added
2) I don't think there is anything to add there; the title itself is sufficient to explain what its presence means in the UI.
3) Nope.
4) Fixed.
5) Fixed.
6) Eh, maybe. But the rest of the module's tests are done against nodes, and I never quite figured out what entity test entities were appropriate in what situation. That's maybe something to tweak another time.

#35: The table is really a simple key-value lookup. entity type, entity id, and langcode form the key, and never change. (Eg, "node:5:en".) The value is the revision ID that is the current latest revision. However, that latest revision is contextually local to the entity type and langcode, hence why all 3 keys are needed. This also allows the primary key index to be optimized, as it's those 3 keys that we will be looking up against. Also, having every column in a table be part of the primary key is, while legal, weird feeling. :-) (I don't know how else to describe that.)

acbramley’s picture

Awesome work @Crell and co, I'll give this a test and report back soon.

jibran’s picture

My issue with the table is that it can return more then one revision id for "node:5:en".

acbramley’s picture

Very happy to report the filter works well for me. Thank you so much for all your hard work everyone!

Crell’s picture

jibran: How? node:5:en is the PKey, so there's no way for there to be more than one of that record.

jibran’s picture

Status: Needs review » Reviewed & tested by the community

Thanks @Crell for clearing my confusion.

+++ b/src/RevisionTracker.php
@@ -0,0 +1,154 @@
+    return $this->connection->merge($this->tableName)
+      ->keys([
+        'entity_type' => $entity_type,
+        'entity_id' => $entity_id,
+        'langcode' => $langcode,
+      ])
+      ->fields([
+        'revision_id' => $revision_id,
+      ])
+      ->execute();

Oh we are just updating the revision_id here. Sorry I read it wrong last time.

Crell’s picture

Yay!

I had to rebase (because I merged some other stuff this morning), so to be safe let's run this by testbot one last time. Will merge if it passes.

  • Crell committed 9c1b25c on 8.x-1.x
    Issue #2674520 by Crell, dawehner, EclipseGc: Add current revision...
Crell’s picture

Status: Reviewed & tested by the community » Fixed

And done. Go team!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

joseph.olstad’s picture

Nice work above, this is needed for Drupal 8 core 'content_moderation'

if anyone has done this, please private message me, or add a link below with a link to the patch or to an open issue for this.

otherwise I may be forced to create one myself.