pager_query: the actual nodes counts 44, 5 pages (10 nodes per page) , BUT the pager_query get 88 , SO the pager display 9 pages!

SELECT COUNT(*) FROM bbk_node n INNER JOIN bbk_node_access na ON na.nid = n.nid WHERE (na.grant_view >= 1 AND ((na.realm = 'all' AND na.gid = 0) OR (na.realm = 'og_public' AND na.gid = 0) OR (na.realm = 'og_admin' AND na.gid = 206) OR (na.realm = 'og_subscriber' AND na.gid = 170)) AND na.nid IN (SELECT na.nid FROM bbk_node_access na WHERE ((na.realm = 'domain_site' AND na.gid = 0) OR (na.realm = 'domain_id' AND na.gid = 2))) ) AND ( n.promote = 1 AND n.status = 1 ) 

focus: "OR (na.realm = 'og_admin' AND na.gid = 206) OR (na.realm = 'og_subscriber' AND na.gid = 170)" .

alter to "SELECT COUNT(distinct n.nid) ....", get the correct result..

BUT pager_query in the include/pager.inc, SO ..HOW.. ?

CommentFileSizeAuthor
#23 pager_distinct.patch818 bytesagentrickard

Comments

olet’s picture

more information:
I use OG 5.x-7.2 and DA 1.4,
I set a subdomain site, in this site I created two OG, then creat 42 nodes in OG, all nodes are only in the same subdomain site (domain_id = 2) ;
ON the site home page (drupal default home page path: /node), the pager is wrong, like above issue description..

olet’s picture

more info, every node access like this:

(devel.module) node_access entries for nodes shown on this page

node realm gid view update delete explained

Don’t Look Back domain_id 2 1 0 0
Don’t Look Back og_admin 170 1 1 1
Don’t Look Back og_public 0 1 0 0 All users may view this node.
Don’t Look Back og_subscriber 170 1 0 0 Members of baby may view this node.

4 access records per node !

agentrickard’s picture

Title: multiple_node_access: the pager count(*) is error » multiple_node_access: pager count(*) error

4 access records per node is normal under your configuration. I suspect that adding the DISTINCT is necessary, based on how I read the query.

What is odd is that pager_query runs two queries. The first gets the page count and does not use DISTINCT. The second gets the nodes per page and _does_ use DISTINCT.

The above query. by the way, is constructed correctly, and unpacks to:

SELECT COUNT(*) 
FROM bbk_node n 
INNER JOIN bbk_node_access na 
ON na.nid = n.nid 
WHERE 
(na.grant_view >= 1 AND 
  (
    (na.realm = 'all' AND na.gid = 0) OR 
    (na.realm = 'og_public' AND na.gid = 0) OR 
    (na.realm = 'og_admin' AND na.gid = 206) OR 
    (na.realm = 'og_subscriber' AND na.gid = 170)
  ) 
  AND 
  na.nid IN 
  (
    SELECT na.nid FROM bbk_node_access na 
    WHERE 
    (
      (na.realm = 'domain_site' AND na.gid = 0) OR '
      (na.realm = 'domain_id' AND na.gid = 2)
    )
  )
) 
AND 
(
  n.promote = 1 AND n.status = 1 
)

But that first line probably needs to be:

SELECT COUNT(DISTINCT(nid))

This may be a core bug. I'd like to get some other tests and opinions here.

agentrickard’s picture

It seems -- though I haven't tested it -- that the COUNT(*) is not handling the subselect as expected.

agentrickard’s picture

Testing with the following query. The subselect _does_ double the COUNT(*).

SELECT COUNT(*) FROM node n INNER JOIN node_access na ON n.nid = na.nid WHERE na.grant_view > 0 AND n.nid IN (SELECT nid FROM node_access WHERE realm = 'domain_site');
agentrickard’s picture

The issue here is that we cannot fix this in node.module, since other modules make pager queries. Fixing inside pager_query() may not work either, since it is building the count query on the fly without knowing which column would be distinct.

This is the relevant code from pager.inc, lines 63-66:

  // Construct a count query if none was given.
  if (!isset($count_query)) {
    $count_query = preg_replace(array('/SELECT.*?FROM /As', '/ORDER BY .*/'), array('SELECT COUNT(*) FROM ', ''), $query);
  }

This note should also be referenced over at http://drupal.org/node/196922

somebodysysop’s picture

In my case:

SELECT count(*) FROM node n INNER JOIN node_access na ON n.nid = na.nid WHERE na.grant_view > 0 AND n.nid IN (SELECT distinct nid FROM node_access WHERE realm = 'og_public')

returns 1638 records.

However,

SELECT count(distinct n.nid) FROM node n INNER JOIN node_access na ON n.nid = na.nid WHERE na.grant_view > 0 AND n.nid IN (SELECT distinct nid FROM node_access WHERE realm = 'og_public')

returns 249 records.

SELECT count(*) FROM node n INNER JOIN node_access na ON n.nid = na.nid WHERE na.grant_view > 0

3067 records.

SELECT count(distinct n.nid) FROM node n INNER JOIN node_access na ON n.nid = na.nid WHERE na.grant_view > 0

383 records.

What I'm trying to figure out is: Why is this an issue with our code as opposed to the default code? Remove our subselect code, and you're still going to have a different count between "count(*)" and "count(distinct n.nid)". So, what gives?

agentrickard’s picture

Well, if that is the case, then this _is_ a core bug, as mentioned in #3.

But does this happen without the patch applied?

somebodysysop’s picture

Perhaps we should find out where the offending query is located. And, what patch has been applied. This is what I have as of the last patch I submitted here: http://drupal.org/node/196922

+  if ($op != 'create' && $node->nid) {
+    $grants_sql = node_access_grants_sql($op, NULL, $user->uid, $node->status);
+    // If the return value is FALSE, then the node status is unpublished and
+    // none of the grants requested an access check be run.  In which case,
+    // we should fall through to the final 'if' statement.
+    if ($grants_sql !== FALSE) {
+      if (!empty($grants_sql)) {
+        $grants_sql .= ' AND';
+      }  
+      $sql = "SELECT COUNT(*) FROM {node_access} WHERE (nid = 0 OR nid = %d) $grants_sql grant_$op >= 1";
+      $result = db_query($sql, $node->nid);
+      return (db_result($result));
     }
-
-    $sql = "SELECT COUNT(*) FROM {node_access} WHERE (nid = 0 OR nid = %d) $grants_sql AND grant_$op >= 1";
-    $result = db_query($sql, $node->nid);
-    return (db_result($result));

So, my point is that if the pager_query sql is the db_rewrite_sql of the above, the only difference between the original core code and the patch would be the subselects that we add. The sub-selects cannot increase the query count! Only reduce it or keep it the same.

Look at your own example:

SELECT COUNT(*) FROM node n INNER JOIN node_access na ON n.nid = na.nid WHERE na.grant_view > 0 AND n.nid IN (SELECT nid FROM node_access WHERE realm = 'domain_site');

As opposed to:

SELECT COUNT(distinct n.nid) FROM node n INNER JOIN node_access na ON n.nid = na.nid WHERE na.grant_view > 0 AND n.nid IN (SELECT nid FROM node_access WHERE realm = 'domain_site');

Now, try both without the sub-select statements. Same difference. But, at best, the counts will be the same or less. If they are more, then we have a sql bug of monumental proportions.

So, what I'm getting at is: How is our patch causing a problem that doesn't already exist? It very well might be, but I don't understand how from what I know so far.

agentrickard’s picture

Look at your own example:
SELECT COUNT(*) FROM node n INNER JOIN node_access na ON n.nid = na.nid WHERE na.grant_view > 0 AND n.nid IN (SELECT nid FROM node_access WHERE realm = 'domain_site');

As opposed to:
SELECT COUNT(distinct n.nid) FROM node n INNER JOIN node_access na ON n.nid = na.nid WHERE na.grant_view > 0 AND n.nid IN (SELECT nid FROM node_access WHERE realm = 'domain_site');

Now, try both without the sub-select statements. Same difference. But, at best, the counts will be the same or less. If they are more, then we have a sql bug of monumental proportions.

Not true. The test query should return 50 records on my system, and does so with the subselect removed. With the subselect, it returns 100. Hence the note: the subselect does not work as expected.

somebodysysop’s picture

The test query should return 50 records on my system, and does so with the subselect removed. With the subselect, it returns 100. Hence the note: the subselect does not work as expected.

Well, then. That's a horse of a different color. What version of MySQL are you running? So far, I have yet to run a query where using the subselect causes more records to be returned than are in the main query. Logically, that doesn't make sense. I'm running 4.1. What version of Mysql is Olet running?

Also, can you duplicate the problem using an OG term (instead of a DA term)? If so, what is that sql so I can see if I can duplicate it here as well.

olet’s picture

Thanks! I use views.module to display frontpages, instead of default nodes frontpage.....
Views.module set DISTINCT, lucky!

olet’s picture

pager_query , distinct count(), many access records (If I subscribed 100 OGs, or more)........... maybe a performance killer~
SO, what can we do?

agentrickard’s picture

I'm on MySQL 5.0.45

I suggest testing these queries with simple JOINs and SUBSELECTS rather than actual module testing. It's faster and easier.

somebodysysop’s picture

Here's what I get:

SELECT COUNT(*) FROM node n INNER JOIN node_access na ON n.nid = na.nid WHERE na.grant_view > 0 AND n.nid IN (SELECT nid FROM node_access WHERE realm = 'ogr_access')
COUNT(*)
81
SELECT COUNT(*) FROM node n INNER JOIN node_access na ON n.nid = na.nid WHERE na.grant_view > 0
COUNT(*)
1472
SELECT COUNT(distinct n.nid) FROM node n INNER JOIN node_access na ON n.nid = na.nid WHERE na.grant_view > 0 AND n.nid IN (SELECT nid FROM node_access WHERE realm = 'ogr_access')
COUNT(distinct n.nid)
7
SELECT COUNT(distinct n.nid) FROM node n INNER JOIN node_access na ON n.nid = na.nid WHERE na.grant_view > 0 AND n.nid
COUNT(distinct n.nid)
151

So, my question is: If this is a core bug, what do we do about it?

agentrickard’s picture

What numbers should you get -- hard to tell from the above.

I think this issue is solved if the query runs COUNT(DISTINCT(nid)), but that is not how most of the queries are structured. They are generayy COUNT(*).

This is something for a bjaspan or a crell to look at, since htey are doing much of the database work -- though Dave Cohen is good, too.

somebodysysop’s picture

I have no reason to believe that the numbers I'm receiving aren't the numbers I should be getting. I am in total agreement that this issue appears to be resolved by changing the core query to DISTINCT.

How do we get that going?

agentrickard’s picture

Post to the development list or get on IRC #drupal

somebodysysop’s picture

I recommend someone else do it. When I suggested a patch to clear the cache, the moderator snipped that Dries had already told me that he doesn't support "changing roles on the fly". Of course, the patch had nothing to do with that, but because I am associated with OGR, I supposed my idea wasn't taken seriously. And, my request was promptly denied. However, Dave Cohen (I think it was him), made the same exact request, and it was immediately added to Version 6.x.

agentrickard’s picture

You will see better results if you figure out how to work productively with the community. I simply don't have time for this right now. See http://ken.therickards.com/ for the reason.

somebodysysop’s picture

You will see better results if you figure out how to work productively with the community.

My participation in this project *is* an effort to work productively with the community.

I just gave you an example of a recent effort. Perfectly legitimate and useful feature. If the "community" says bug-off or, more commonly, simply ignores me, then what?

I simply don't have time for this right now. See http://ken.therickards.com/ for the reason.

Understood. Good luck.

I'll do what I can do.

agentrickard’s picture

Then try a different approach -- like recruiting folks like Dave and myself to support you. As I said, ask bjaspan or crell.

agentrickard’s picture

Status: Active » Needs review
StatusFileSize
new818 bytes

Here is an experimental patch for 5.x, it tests the pager_query for a DISTINCT value.

agentrickard’s picture

Title: multiple_node_access: pager count(*) error » pager count(*) error
Version: 5.x-1.4 » 6.x-2.0-rc4
Status: Needs review » Needs work

This apparently affects any pager query where multiple grants are passed. See #324070: SELECT * FROM {node} violates db_rewrite_sql()

agentrickard’s picture

Status: Needs work » Closed (won't fix)

Bumping to core.