This patch should make notify run faster, espeically when many users are subscribed.
In the current version, the queries to fetch the nodes and comments are run for each user, although they don't depend on the user id. The access check is separate. This patch moves the queries and node loading outside the while ($user...) loop, so that they are only executed once.
The comment query is also simplified to eliminate the unnecessary inner join.
| Comment | File | Size | Author |
|---|---|---|---|
| #5 | notify-efficiency.r1.patch.txt | 2.67 KB | kthagen |
| notify-efficiency.patch.txt | 2.71 KB | kthagen |
Comments
Comment #1
RobRoy commentedMarking this by design, as db_rewrite_sql will invoke any hook_db_rewrite_sql module hooks which will restrict access to nodes/comments/taxonomies according to the current user. And since we are changing the user for each loop, this patch would subvert that access control.
Did that make sense? I'm not saying there aren't improvements to be made, so keep them coming kthagen! :)
Comment #2
RobRoy commentedAnother note, it looks like you were focusing on if (!node_access('view', $nodes[$nid], $user->uid)) { for the only access check when actually there are two: the db_rewrite_sql() and node_access() calls. These are probably redundant, but I was having reports of Notify sending unauthorized content so I put the node_access() call in there after taking over maintership of this module. That shouldn't be needed in hook_db_rewrite_sql() is being used properly, so this whole section could use some thought.
For Drupal 5.x, I'm going to try to refactor this module to clean some of these sections up.
Comment #3
kthagen commentedI take your point, and I wasn't thinking about the hooks when I wrote this patch. But thinking it over, the way that comments are stored means that node_access check is really the only defense against returning non-authorized content. In other words, when we do the query for the comments, any hooks that restrict access clearly aren't checking for whether or not the user has access permission to the underlying node. Given that we have to use node_access anyway, why not load all the nodes once, and simply check the access permissions ourselves? Is there ever a case where node_access will return true when the hooked query would have filtered out that node?
If not, is more efficient, and should give the same results, although you have to remove the db_rewrite_sql() functions.
BTW, the check for access_node when processing comments still isn't in the 4.7.x branch. That's probably the source of most (all?) of the unauthorized-content complaints, since I assume most people aren't using HEAD.
Comment #4
RobRoy commentedYeah, when I mentioned redundancy, going with a pure node_access() solution is what I was implying. I'd like to think this out a bit more, but you bring up a good point about optimizing the node/comment selection queries.
Comment #5
kthagen commentedHere's some more info to mull on while you're thinking it over:
I just took a look at the db_rewrite_sql function in tac_lite, which is what I use on my sites. Notice the following switch:
In other words, tac_lite only alters the query if the primary field is 'tid', and the default field, used by both invocations in notify, is 'nid'. 'tid' isn't even a field in either node or comment. The moral is, we can never trust the sql_rewrite hooks to do the proper filtering job for us, so I think node_access is our only sure option.
Here's a revised version of the patch without db_sql_rewrite.
?>
Comment #6
kthagen commentedChanging the status myself, because I just thought of a good reason to keep it as is.
Trying to add functionality like that requested here doesn't really strike me as appropriate to notify itself, since it would introduce a dependency on CCK, but leaving the query as is does allow the possibility of future companion modules to hook in and narrow down the query on a per-user basis.