Duplicate nodes in views - how to eliminate them

garywiz - September 17, 2009 - 23:13
Project:Domain Access Advanced
Version:6.x-1.0-rc1
Component:Code
Category:bug report
Priority:normal
Assigned:Unassigned
Status:active
Description

I am using domain 2.0-rc9 as well as domain_adv 1.0-rc1. On many occasions I am getting duplicate nodes. It is so prevalent that I won't try to describe each situation. But, for example, if I create nodes which have access on multiple sites, or site wide access, the problem shows up.

The reason (I believe) is obvious. The LEFT JOIN's present in the access queries are causing multiple rows to turn up whenever multiple affiliates are assigned to nodes. This also happens on other modules (such as og) which use JOINs to detect access situations.

One solution is to replace the JOIN with a subselect. For example, if you replace domain_adv_db_rewrite_sql with the following, the access logic is the same, but additional rows will never be introduced into the result query:

function domain_adv_db_rewrite_sql($query, $primary_table, $primary_field, $args) {
  switch ($primary_field) {
    case 'nid':
      $global_or = '';
      if ($GLOBALS['_domain']['domain_id']) {
        $global_or = ' OR (dasub.realm = "domain_site" AND dasub.gid = 0)';
      }
      $return = array();
      $return['where'] = '((SELECT COUNT(*) FROM domain_access dasub WHERE ('.
        $primary_table . '.nid = dasub.nid AND ((dasub.realm = "domain_id" AND dasub.gid = '.
        $GLOBALS['_domain']['domain_id'] .')' . $global_or . '))) > 0)';
      return $return;
      break;
  }
}

I believe this is pretty foolproof. However, it depends upon using a database which supports subselects, as MySQL 5+ does. So, I'm not making any claim that this is a perfect solution for the module.

Many people have tried to solve these problems by introducing DISTINCT queries, or turning on "Distinct" features in the views they create. But, when there are a lot of nodes, this can really create egregiously long query times. For example, in some of my views, the presence of duplicates creates 8 times more records than there should be (the combination of joins for og + domain are multiplicative).

I'd be curious about your assessment about this.

#1

cardentey - October 16, 2009 - 21:39

me too.

#2

Skirr - October 19, 2009 - 21:19

Have same problem. Is it possible to make a patch?

#3

garywiz - October 21, 2009 - 00:57

I stopped using domain_adv shortly after I wrote that post. It turned out that the combination of organic groups, domains, etc. was causing many problems of this nature elsewhere and so many duplicates because of multiple entries in the node access table. I hear this is being fixed in D7 core, but am not sure. This is really a fundamental flaw in the way node access queries are formulated and used.

In the mean time, I made a simple change to node.module (GASP, I know), that completely fixed the problem in every single case on our site. It's not good to hack core, so I would be hesitant to produce such a patch. But, the change is really simple and converts the code in node_db_rewrite_sql() so that it uses subselects instead of joins. I was desperate and needed a solution and just had to be pragmatic.

From my modified node.module...

/**
* Implementation of hook_db_rewrite_sql
*/
function node_db_rewrite_sql($query, $primary_table, $primary_field) {
  if ($primary_field == 'nid' && !node_access_view_all_nodes()) {
    $wsql = _node_access_where_sql();
    if ($wsql) {
      $return['where'] = '((SELECT COUNT(*) FROM node_access na WHERE '.$primary_table.
      '.nid = na.nid AND ('. $wsql .')) > 0)';
    }
    return $return;
  }
}

#4

t.lan - December 4, 2009 - 19:39

Hi garywiz,

I'm running in the same problems now.

Do you have any update on this topic?

Is your modification from #3 still succesfully running?

Regards, Thomas

#5

garywiz - December 5, 2009 - 08:06

Yes, this has been working fine for us after many weeks of development and testing and we have a very complex site involving multiple domains, og, domain_taxonomy. Again, the patch is a core change, so be warned.

#6

t.lan - December 5, 2009 - 10:09

Seeing no other chance to fix our problems I will have to use this. And would like to thank you for making your fix public!

I added an Issue in Drupal core with the hope to have some feedback from the core maintainers. http://drupal.org/node/651526

Regards, Thomas

 
 

Drupal is a registered trademark of Dries Buytaert.