http://drupal.org/node/35909 reports a performance bottleneck. Part of the problem appears to be two bits of SQL in node.module. The problem would only be visible when node_access is large. I suggest a fix in 35909 but do not have a large node_access table for testing the change. Perhaps someone could test the change functions correctly and then work with the original creator of 35909 to test performance improvements.

The essense of the change is to remove field concatenations from where clauses to avoid table scans. The relevant fields can be indexed so that column values can be found fast. The creator of 35909 is not a programmer or SQL expert. I am happy to help but do not have a system with a high volume of data in the relevant table.

http://petermoulding.com/web_architect

Comments

peterx’s picture

I changed my code as shown here and tried a quick test. Nothing exploded. My node_access table contains only one row.

At about line 1898:

$grants = array();
foreach (node_access_grants($op, $uid) as $realm => $gids)
	{
	foreach ($gids as $gid)
		{
		$grants[] = "(gid = " . $gid . " and realm = '" . $realm ."')";
		}
	}
$sql = 'SELECT COUNT(*) FROM {node_access} WHERE (nid = 0 OR nid = %d) AND ('
	. implode(' or ', $grants) . ') AND grant_' . $op . ' = 1';

At about line 1948:

$grants = array();
foreach (node_access_grants($op, $uid) as $realm => $gids)
	{
	foreach ($gids as $gid)
		{
		$grants[] = "(gid = " . $gid . " and realm = '" . $realm ."')";
		}
	}
$sql = $node_access_alias . '.grant_' . $op . ' = 1 AND (' . implode(' or ', $grants) . ')';

At about line 1995:

$grants = array();
foreach (node_access_grants('view') as $realm => $gids)
	{
	foreach ($gids as $gid)
		{
		$grants[] = "(gid = " . $gid . " and realm = '" . $realm . "')";
		}
	}
$sql = 'SELECT COUNT(*) FROM {node_access} WHERE nid = 0 AND ('
	. implode(' or ', $grants) .') AND grant_view = 1';

You will need an index for grant_view:
alter table node_access add index grant_view_index (grant_view)

You might need separate indexes for gid and realm. Because you are selecting nid first, MySQL might be able to use the primary index to find gid. After MySQL finds gid, MySQL might be able to search the primary index for realm.

alter table node_access add index gid_index (gid)
alter table node_access add index realm_index (realm)

petermoulding.com/web_architect
petermoulding.com/technology/content_management_systems/

peterx’s picture

The SQL at 1948 should be changed to the following. The code is in function _node_access_where_sql and lets the user provide a table name prefix.

$grants = array();
foreach (node_access_grants($op, $uid) as $realm => $gids)
	{
	foreach ($gids as $gid)
		{
		$grants[] = "(" . $node_access_alias . ".gid = " . $gid
			. " and " . $node_access_alias . ".realm = '" . $realm ."')";
		}
	}
$sql = $node_access_alias . '.grant_' . $op . ' = 1 AND (' . implode(' or ', $grants) . ')';
boris mann’s picture

Status: Active » Needs review

Peter, it would be good if you could roll a patch. I'm still setting this to "code needs review".

There is a reference thread here that this fixes scalability issues for large sites.

peterx’s picture

If dumell reports success next week in http://drupal.org/node/35909 then this change is worth doing. I started a big project yesterday and may not have time to look at this change until late March 2006.

moshe weitzman’s picture

Title: Performance bottleneck could be avoided. » Remove CONCAT() and speed up node_access query
Version: 4.6.3 »
Assigned: Unassigned » moshe weitzman

I've been consulting with dumell and encouraged him to try the edits described here. He did so and saw huge benefits. We should definately incorporporate these changes. I will roll a patch if noone else does.

killes@www.drop.org’s picture

I think we should fix 4.6 too. Ill be glad when that fishy concat stuff is gone.

killes@www.drop.org’s picture

Assigned: moshe weitzman » killes@www.drop.org
StatusFileSize
new2.67 KB

ok, here's a patch sans the indices and associated db changes. Anybody care to test?

killes@www.drop.org’s picture

I also changed two instances of = 1 to >= 1 to match with that change in other places.

killes@www.drop.org’s picture

StatusFileSize
new2.69 KB

Improved patch, added () around or conditions, thanks Cvbge.

dries’s picture

Any chance Dumell could test this exact patch? Let me know when it has been tested by promoting the patch to the 'ready to be committed' queue.

killes@www.drop.org’s picture

http://drupal.org/node/35909#comment-68622

Will add the db upgrade later today.

dumell’s picture

This patch can not be used for node.module 1.485.2.13 2005/08/11 (4.6) can it?

We are using Drupal 4.6.3 so I can not test this patch if it requires a never node.module.

killes@www.drop.org’s picture

Dumell, can you simply attach the node.module and any other files that you changed? I will roll patches form them.

dumell’s picture

We just need a patch for a plane node.module 1.485.2.13 2005/08/11 or (or any newer node.module that is 4.6.3 compatible).

http://cvs.drupal.org/viewcvs/drupal/drupal/modules/node.module?rev=1.48...

Right now we are using a node.module 1.485.2.13 that I updated by hand acording to #1 and #2 in this node and I would replace this module with a fresh node.module updated with a patch from you, killes.

We have made the changes to the database according to the advice in

http://drupal.org/node/35909#comment-66106

No other changes have been made.

killes@www.drop.org’s picture

StatusFileSize
new2.75 KB

Ok, here is a patch for 4.6.

dumell’s picture

I treid using this patch for 4.6 but Drupal started writing error messages to the log for each page view:

Unknown column 'na.0' in 'where clause' query: SELECT DISTINCT(n.nid), n.title, n.created FROM node n INNER JOIN node_access na ON na.nid = n.nid WHERE (na.grant_view >= 1 AND ((gid = na.0 AND na.realm = 'all') OR (gid = na.4961 AND na.realm = 'og_uid') OR (gid = na.0 AND na.realm = 'og_group'))) AND n.type = 'blog' AND n.status = 1 ORDER BY n.created DESC LIMIT 0, 10 in /drupal/drupal-4.6.0/includes/database.mysql.inc on line 66.

In the patch, a few lines are missing witch I think should be there, like line 1863:

$sql = 'SELECT COUNT(*) FROM {node_access} WHERE (nid = 0 OR nid = %d) AND CONCAT(realm, gid) IN (';

When I patched by hand acording to http://drupal.org/node/36429#comment-53102, I left that line in and started changing from the next line.

killes@www.drop.org’s picture

StatusFileSize
new2.75 KB

The error was due to a silly mistake by me. The removed line was just placed a few lines lower. Here is the updated 4.6 patch.

killes@www.drop.org’s picture

StatusFileSize
new2.71 KB

updated patch for cvs.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community
StatusFileSize
new2.64 KB

I just exercised this patch and all looks good. I rerolled to get rid of fuzz. I think it is ready to commit.

As a next step. there may be indices that are missing. Here are 2 EXPLAIN outputs and queries. DB experts - please help. I chose a single node view query and a node list query (i.e. the home page)

VIEWING A NODE - EXPLAIN
+-------+-------+-----------------+---------+---------+--------+------+-----------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+-------+-------+-----------------+---------+---------+--------+------+-----------------+
| n | const | PRIMARY,uid,vid | PRIMARY | 4 | const | 1 | Using temporary |
| u | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| na | ref | PRIMARY | PRIMARY | 4 | const | 1 | where used |
| r | ALL | [NULL] | [NULL] | [NULL] | [NULL] | 166 | where used |
+-------+-------+-----------------+---------+---------+--------+------+-----------------+

VIEWING A NODE - SQL
EXPLAIN SELECT DISTINCT(n.nid), n.vid, n.type, n.status, n.created, n.changed, n.comment, n.promote, n.moderate, n.sticky, r.timestamp AS revision_timestamp, r.title, r.body, r.teaser, r.log, r.format, u.uid, u.name, u.picture, u.data FROM node n INNER JOIN node_access na ON na.nid = n.nid INNER JOIN users u ON u.uid = n.uid INNER JOIN node_revisions r ON r.vid = n.vid WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all') OR (na.gid = 2 AND na.realm = 'og_uid') OR (na.gid = 0 AND na.realm = 'og_group') OR (na.gid = 56 AND na.realm = 'og_group') OR (na.gid = 57 AND na.realm = 'og_group'))) AND n.nid = 61

VIEWING HOME PAGE - EXPLAIN
+-------+------+-----------------------------------------------------+---------------------+---------+-------------+------+---------------------------------------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+-------+------+-----------------------------------------------------+---------------------+---------+-------------+------+---------------------------------------------+
| n | ref | PRIMARY,status,node_promote_status,node_status_type | node_promote_status | 8 | const,const | 34 | where used; Using temporary; Using filesort |
| na | ref | PRIMARY | PRIMARY | 4 | n.nid | 1 | where used; Distinct |
+-------+------+-----------------------------------------------------+---------------------+---------+-------------+------+---------------------------------------------+

VIEWING HOME PAGE - SQL
SELECT DISTINCT(n.nid), n.sticky, n.created FROM node n INNER JOIN node_access na ON na.nid = n.nid WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all') OR (na.gid = 2 AND na.realm = 'og_uid') OR (na.gid = 0 AND na.realm = 'og_group') OR (na.gid = 56 AND na.realm = 'og_group') OR (na.gid = 57 AND na.realm = 'og_group'))) AND n.promote = 1 AND n.status = 1 ORDER BY n.sticky DESC, n.created DESC LIMIT 0, 10

dries’s picture

Status: Reviewed & tested by the community » Active

Committed to HEAD. Marking this active so db people can have another look at it.

magico’s picture

Status: Active » Fixed

Is there any need to still take a look? Perhaps to apply the 4.6 patch? Don't think so.

Anonymous’s picture

Status: Fixed » Closed (fixed)