With a module such as Domain Access, access to a node depends on the current domain: an anonymous user may have access to a node on Site 1 but not to the same node on Site 2. At the moment, the module checks whether or not an anonymous user has access to a given node. That check does not work when access rules for anonymous users are dynamic, as they are with something like Domain Access. Proposed solution:

diff -r af33ae5c1628 sites/all/modules/contrib/apachesolr/apachesolr_access/apachesolr_access.module
--- a/sites/all/modules/contrib/apachesolr/apachesolr_access/apachesolr_access.module	Mon Aug 15 09:21:43 2011 +0200
+++ b/sites/all/modules/contrib/apachesolr/apachesolr_access/apachesolr_access.module	Mon Aug 15 10:29:20 2011 +0200
@@ -11,9 +11,9 @@
     $account = drupal_anonymous_user();
   }
 
-  if (!node_access('view', $node, $account)) {
-    // Get node access grants.
-    $result = db_query('SELECT * FROM {node_access} WHERE (nid = 0 OR nid = :nid) AND grant_view = 1', array(':nid' => $node->nid));
+  // Get node access grants.
+  $result = db_query('SELECT * FROM {node_access} WHERE (nid = 0 OR nid = :nid) AND grant_view = 1', array(':nid' => $node->nid));
+  if ($result->rowCount() > 0) {
     foreach ($result as $grant) {
       $key = 'access_node_' . apachesolr_site_hash() . '_' . $grant->realm;
       $document->setMultiValue($key, $grant->gid);

Comments

pwolanin’s picture

So, I see how that might work for domain access, but not clear it will work generally. Also, I think that may fail in the case of a site with no nodeaccess module, since I think there is always one node access record.

pwolanin’s picture

Title: Apachesolr access not respecting node access ruls » Apachesolr access not respecting node access rules
Status: Needs review » Needs work
agentrickard’s picture

I think the problem in the Domain Access world is that the 'realm' of the query must be specified.

Core uses a realm of 'all'. Domain access removes that and has a 'domain_all' realm that is used (legitimately) to control advanced behaviors (such as searching across multiple domains).

The proper query should probably be:

$result = db_query("SELECT * FROM {node_access} WHERE ((nid = 0 AND realm = 'all') OR nid = :nid) AND grant_view = 1", array(':nid' => $node->nid));

But I am looking at this problem out of context ATM.
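To make the difference between the two WHERE clauses concrete, here is a minimal sketch in plain PHP that filters an in-memory array instead of querying {node_access}. The rows, the helper names and the presence of a nid = 0 'domain_all' row are assumptions for illustration only, not data from a real site.

```php
<?php
// Hypothetical rows standing in for the {node_access} table on a site
// where the core 'all' row has been replaced by a 'domain_all' row.
$rows = array(
  array('nid' => 0, 'gid' => 0, 'realm' => 'domain_all', 'grant_view' => 1),
  array('nid' => 5, 'gid' => 2, 'realm' => 'domain_id', 'grant_view' => 1),
  array('nid' => 6, 'gid' => 3, 'realm' => 'domain_id', 'grant_view' => 1),
);

// Mirrors: WHERE (nid = 0 OR nid = :nid) AND grant_view = 1
function select_original(array $rows, $nid) {
  return array_values(array_filter($rows, function ($r) use ($nid) {
    return ($r['nid'] == 0 || $r['nid'] == $nid) && $r['grant_view'] == 1;
  }));
}

// Mirrors: WHERE ((nid = 0 AND realm = 'all') OR nid = :nid) AND grant_view = 1
function select_with_realm(array $rows, $nid) {
  return array_values(array_filter($rows, function ($r) use ($nid) {
    $is_default = ($r['nid'] == 0 && $r['realm'] === 'all');
    return ($is_default || $r['nid'] == $nid) && $r['grant_view'] == 1;
  }));
}

// Helper (hypothetical): list the realms of the selected rows.
function realms(array $rows) {
  return array_map(function ($r) { return $r['realm']; }, $rows);
}

$original = realms(select_original($rows, 5));
$restricted = realms(select_with_realm($rows, 5));
```

With this sample data the first selection picks up the 'domain_all' row for node 5 in addition to its 'domain_id' row, while the realm-restricted selection returns only the 'domain_id' row.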

Nick_vh’s picture

Version: 7.x-1.0-beta8 » 7.x-1.x-dev
agentrickard’s picture

I tested Acquia (SOLR) Search with Domain Access (7.x-3.1) recently with no issues.

Nick_vh’s picture

Status: Needs work » Closed (works as designed)
drupalusering’s picture

Will this ever become part of the module? I was running the latest stable release, 7.x-1.0-rc2 (June 21st, 2012), and ran into the issue of subdomains not filtering out results posted from the primary domain. I checked the dev version and the code was the same (without the patch above, which works great!). I use Domain Access, and when viewing the content for a particular domain the filtering works properly. However, without the patch, Solr displays primary-domain results on subdomains.

Nick_vh’s picture

Status: Closed (works as designed) » Needs work

Could you please post more details and tell us how we can replicate the problem?

codemuncher’s picture

When using Apache Solr Attachments, Apachesolr access prevents anonymous users from seeing files in search results even when they are in public folders. After disabling Apachesolr access, all is well.

pwolanin’s picture

@agentrickard - is it not a generally correct use of the node access API to have nid=0 for a default rule?

If any row contains the node ID in question (or 0, which stands for "all nodes"), one of the grant IDs returned, and a value of TRUE for the operation in question, then access is granted.

http://api.drupal.org/api/drupal/modules%21node%21node.module/group/node...
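A minimal sketch of the rule quoted above, in plain PHP without Drupal. The function name and the sample rows are hypothetical; the check assumes, as core does, that a grant ID only counts within the realm it was returned for.

```php
<?php
// $rows: records shaped like {node_access} rows.
// $user_grants: realm => list of grant IDs, as hook_node_grants() would return.
function grants_allow_view(array $rows, array $user_grants, $nid) {
  foreach ($rows as $row) {
    // The row applies to this node, or to "all nodes" when nid is 0.
    $nid_matches = ($row['nid'] === 0 || $row['nid'] === $nid);
    // The user holds this grant ID in the row's realm.
    $gid_matches = isset($user_grants[$row['realm']])
      && in_array($row['gid'], $user_grants[$row['realm']], TRUE);
    if ($nid_matches && $gid_matches && !empty($row['grant_view'])) {
      return TRUE;
    }
  }
  return FALSE;
}

// Hypothetical data: two nodes, each assigned to one domain.
$rows = array(
  array('nid' => 1, 'gid' => 1, 'realm' => 'domain_id', 'grant_view' => 1),
  array('nid' => 2, 'gid' => 3, 'realm' => 'domain_id', 'grant_view' => 1),
);
$user_grants = array('domain_id' => array(3));

$node1_visible = grants_allow_view($rows, $user_grants, 1);
$node2_visible = grants_allow_view($rows, $user_grants, 2);

// Adding a default nid = 0 row in the 'all' realm opens every node to
// anyone holding the matching grant.
$rows_with_default = $rows;
$rows_with_default[] = array('nid' => 0, 'gid' => 0, 'realm' => 'all', 'grant_view' => 1);
$grants_with_all = array('all' => array(0), 'domain_id' => array(3));
$node1_visible_with_default = grants_allow_view($rows_with_default, $grants_with_all, 1);
```

In this sketch node 1 is hidden and node 2 is visible until the nid = 0 row is added, at which point node 1 becomes visible as well.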

agentrickard’s picture

@pwolanin

You can pass nid = 0, yes, but it must have a specific realm attached to it, normally "realm = 'all'". nid = 0 by itself can cause issues.

Tim Jones Toronto’s picture

I am testing this on multiple sites using the Domain Access module 7.x-3.4dev and Apache Solr Access 7.x-1.0-rc4

I have: domain-1, domain-2 and domain-3. There seems to be a logic problem in the Solr search hash access, where:

1. Node content that is set ONLY for domain-1 is also appearing on domain-2 and domain-3 searches.
2. Node content set for domain-2 only is working fine, and appears on domain-2 search only.
3. Node content set for domain-3 only is working fine, and appears on domain-3 search only.

Therefore, the *first* domain doesn’t appear to be obeying the access rules / filter logic.

this is part of the query for domain-1: access_node_b6zt08_all:0+OR+access_node_b6zt08_domain_site:0+OR+access_node_b6zt08_domain_id:2+OR+access__all:0

(domain-1 is actually referenced as domain_id:2 since a domain was deleted when setting up Domain Access and follows the next index.)

Thanks!

agentrickard’s picture

That query string is not readable. Please use SQL.

Tim Jones Toronto’s picture

>That query string is not readable.

Not readable? It is the console output of the Solr query as initiated by the underlying Apache Solr Access code. It has been posted to assist the maintainer with that non-SQL type query (if they need it) since they are dealing with the access logic at that language level.

agentrickard’s picture

Yes, and I'm the Node Access subsystem maintainer, and I can't understand that string, so I can't help you.

Tim Jones Toronto’s picture

> I can't understand that string

Never mind. I'm not sure how you want SQL supplying on a Solr generated query. :/

What do you need other than my example #12 showing conditions where it is breaking access rules?

agentrickard’s picture

I think my confusion is in the query string elements. The _b6zt08_ part of the string is gibberish to me. Here's what I read:

access_node_b6zt08_all = 0
OR
access_node_b6zt08_domain_site = 0
OR
access_node_b6zt08_domain_id = 2
OR
access__all = 0

It looks like the domain parts of the query are correct (e.g. is this node assigned to domain_id 2 or set to "all affiliates" via domain_site = 0). I don't know what the other two parts of that query do.

In the node access world, domain_site = 0 and domain_id = 2 are a realm and a grant id.

Is there anything in the data that would match all = 0? That can happen if using multiple access modules or if the default Drupal node access grant were still indexed.
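The breakdown above can be reproduced mechanically. This is a rough sketch in plain PHP that splits the filter string exactly as it was logged (with `+OR+` separators) into field/value pairs and strips the site-hash prefix to recover the realm. Both function names are hypothetical.

```php
<?php
// Split a logged access filter into field/value clauses.
function parse_access_filter($fq) {
  $clauses = array();
  foreach (explode('+OR+', trim($fq, '()')) as $clause) {
    list($field, $value) = explode(':', $clause, 2);
    $clauses[] = array('field' => $field, 'value' => $value);
  }
  return $clauses;
}

// Return the realm encoded in a per-site field name, or NULL for fields
// such as access__all that carry no site hash.
function realm_of($field, $site_hash) {
  $prefix = 'access_node_' . $site_hash . '_';
  return strpos($field, $prefix) === 0 ? substr($field, strlen($prefix)) : NULL;
}

$fq = 'access_node_b6zt08_all:0+OR+access_node_b6zt08_domain_site:0'
  . '+OR+access_node_b6zt08_domain_id:2+OR+access__all:0';

$clauses = parse_access_filter($fq);
foreach ($clauses as $clause) {
  $realm = realm_of($clause['field'], 'b6zt08');
  echo ($realm === NULL ? $clause['field'] : 'realm ' . $realm), ' = ', $clause['value'], "\n";
}
```

Read this way, three clauses are ordinary realm and grant ID pairs, and the fourth, access__all, is the only one that is not tied to a site hash or a realm.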

Tim Jones Toronto’s picture

Thanks - I was including the Solr query in case it 'rang any immediate bells' since this is passed by the module e.g.:

The ‘gibberish’ _b6zt08_ is (hash) key data as formed by apachesolr_site_hash() in apachesolr_access.module.

My confusion was providing SQL in this instance.

To re-check things, I will load another test version on a fresh system *without* any extraneous modules and compare the example results again to see what is going on.

Cheers.

Tim Jones Toronto’s picture

Hi, I have re-installed a completely fresh install of Drupal 7.15 and enabled:

i) Domain Access - 7.x-3.4+24-dev
ii) Apache Solr Access - 7.x-1.0-rc4

To keep things simple (so far) I have created a similar example as #12, only creating TWO domains and 2 nodes. Node 1 content is set only for 'domain1.com', and node 2 content is set only for 'domain2.com'. There is no 'send to all affiliates'. This is represented in the database as the following tables:

'domain'
========
domain_id : subdomain
-------------------
1         : www.domain1.com
3         : www.domain2.com

and

'domain_access'
===============
nid : gid : realm     : my comment
------------------------------------------
1   : 1   : domain_id : < this is domain1.com
2   : 3   : domain_id : < this is domain2.com

PROBLEM:

When the site is on domain2.com, the search content for domain1.com is showing up (same as the example I gave above in #12). When on domain1.com, the search works fine, and does not show any data from domain2.com.

So, by examining the Solr 'fq' filter query constructed to limit the dataset, it returns the following on the domain2.com site:

fq=(access_node_nl9yt0_all:0+OR+access_node_nl9yt0_domain_site:0+OR+access_node_nl9yt0_domain_id:1+OR+access__all:0)&rows=10} hits=1 status=0 QTime=2

To break this down we have:

access_node_nl9yt0_all:0
OR
access_node_nl9yt0_domain_site:0
OR
access_node_nl9yt0_domain_id:3
OR
access__all:0

(ignore nl9yt0; it is the unique site hash, as explained above)

From what I can see so far using the 'OR' logic:

.._domain_id - this is fine
.._domain_site - this is fine (as it's not being set)

This leaves 'access__all:0' and 'access_node_nl9yt0_all:0' to inspect as being '0'? There are no other modules on the system except core and Devel.

pwolanin’s picture

This module as written sets access__all:0 if anonymous users can view the content. This is the logic for constructing the fq:

    // Get node access grants.
    $grants = node_access_grants('view', $account);
    foreach ($grants as $realm => $gids) {
      $realm = apachesolr_access_clean_realm_name($realm);
      foreach ($gids as $gid) {
        $node_access_query->addFilter('access_node_' . apachesolr_site_hash() . '_' . $realm, $gid);
      }
    }
    $node_access_query->addFilter('access__all', 0);

I think you need to look at what's in the Solr index, but probably you need a version of this module with different logic to support domain access. I don't consider domain access to be in the realm of "normal" node access modules that this module supports. Part of the logic of what's indexed is geared to multi-site search indexes, which doesn't align with domain access needs.
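To see why the access__all clause matters, here is a simplified model of the logic above in plain PHP, with Solr matching reduced to an array lookup. The constant, the function names and both sample documents are assumptions for illustration; in particular, the first document assumes node 1 was indexed with only access__all:0 because an anonymous user could view it at index time.

```php
<?php
// Hypothetical stand-in for the value returned by apachesolr_site_hash().
const SITE_HASH = 'nl9yt0';

// Mirrors the loop above: one filter per realm/gid, plus access__all:0.
function build_access_filters(array $grants) {
  $filters = array();
  foreach ($grants as $realm => $gids) {
    foreach ($gids as $gid) {
      $filters[] = array('access_node_' . SITE_HASH . '_' . $realm, $gid);
    }
  }
  $filters[] = array('access__all', 0);
  return $filters;
}

// The filters are OR'ed: one matching field/value pair is enough.
function document_matches(array $doc, array $filters) {
  foreach ($filters as $filter) {
    list($field, $value) = $filter;
    if (isset($doc[$field]) && in_array($value, $doc[$field], TRUE)) {
      return TRUE;
    }
  }
  return FALSE;
}

// Grants for a search on domain2.com, taken from the fq breakdown above.
$filters = build_access_filters(array(
  'all' => array(0),
  'domain_site' => array(0),
  'domain_id' => array(3),
));

// Node 1 (domain1.com only), indexed two different ways.
$indexed_as_public = array('access__all' => array(0));
$indexed_with_grants = array('access_node_nl9yt0_domain_id' => array(1));

$leaks = document_matches($indexed_as_public, $filters);
$filtered = document_matches($indexed_with_grants, $filters);
```

Under these assumptions the document indexed with access__all:0 matches a search from any domain, while the same node indexed with its domain_id grant is excluded on domain2.com.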

Tim Jones Toronto’s picture

Thanks pwolanin. Yes I see.

So without the logic change, and in its current version, this does not work with multiple sites* using Domain Access. Thanks for confirming.

(*Edit: 'multiple sites' that require unique content being assigned to each unique domain).

pwolanin’s picture

When you say "multiple sites" you mean one site with multiple domains?

Tim Jones Toronto’s picture

Yes. i.e. a single codebase system running multiple domains and content giving:

More than one website "multiple sites" each with a different domain, serving unique content to each website using the Domain Access module functionality.

Thanks again.

agentrickard’s picture

Another way to restate what pwolanin is saying is that it looks like the Apache Solr search index is trying to provide the type of per-site separation that DA already provides.

I would disagree, however, that DA is not "normal", since it follows the Node Access API. The only difference is that DA doesn't distinguish between users the way that a module like OG does. This is, IMO, a flawed assumption by Apache Solr, which assumes that anonymous users are different from logged in users with regard to access control.

So the issue here is the 'access__all' grant that is supplied for anon users.

If that logic were alterable, I'd be happy to put code into DA to disable it.

agentrickard’s picture

Status: Needs work » Needs review
FileSize
650 bytes

Simple approach sets a variable that can be overridden by other modules when the node is indexed.

pwolanin’s picture

Status: Needs review » Needs work

@agentrickard - it doesn't seem complete, since this now breaks multisite search? I'm also not sure the correct data gets into the index. I think this needs a module specifically for domain access, though you'd have to work to make it compatible with other non-DA sites using the access module.

I don't know how things like cron runs and indexing work enough to suggest a complete solution, and I think there are modules that provide per-domain variables which can complicate the situation more.

agentrickard’s picture

When you say "multisite" search above, you mean multiple Drupal dbs pointing to a single instance?

From my perspective, the correct data gets into the database but the 'access__all' grant is entirely extraneous.

pwolanin’s picture

Multisite search is, e.g., drupal.org and groups.drupal.org sharing the same search index (as they do).

agentrickard’s picture

Status: Needs work » Needs review
FileSize
646 bytes

If we're worried about domain-specific settings, that actually opens up the option to disable the access__all grant at runtime on a per-domain basis, which would be very cool.

Like so. This would, in the case of Domain Access setting the variable to FALSE, remove the access__all:0 from the query OR statement.

pwolanin’s picture

FileSize
4.8 KB

The OP had roughly the right approach - #29 is not correct, I think.

Here's what I think a roughly correct/complete patch should contain. Please give it a try. Note the use of a per-environment variable.

pwolanin’s picture

FileSize
4.86 KB

Variant - adds some doxygen and displays a warning rather than actually forcing the re-index.

pwolanin’s picture

Title: Apachesolr access not respecting node access rules » Apachesolr access makes assumptions that don't apply to modules like Domain Access
agentrickard’s picture

Yeah, I misread the OP thinking that was a filter query not a storage query.

jcfiala’s picture

I applied the patch and it works fine, but I have one quibble - the patch adds "$form['#environment'] = $environment;" into apachesolr.admin.inc - why? It doesn't seem to be doing anything there.

Otherwise, it seems to do a fine job.

pwolanin’s picture

It's used here:

+    '#default_value' => empty($form['#environment']['conf']['apachesolr_access_always_add_grants']) ? 0 : 1,
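
The patch itself is not reproduced in this thread, so the following is only a guess at how a per-environment setting like this could steer indexing, written in plain PHP without Drupal. The setting name comes from the line above; the function names, the parameters and the branching are assumptions.

```php
<?php
// Same test as the '#default_value' expression above: 1 if the setting is
// present and truthy in the environment's conf array, otherwise 0.
function always_add_grants(array $environment) {
  return empty($environment['conf']['apachesolr_access_always_add_grants']) ? 0 : 1;
}

// Assumed index-time behaviour: use the access__all shortcut only when the
// setting is off and an anonymous user can view the node; otherwise index
// every grant row for the node.
function access_fields_for_node(array $environment, $anonymous_can_view, array $grant_rows, $site_hash) {
  $fields = array();
  if ($anonymous_can_view && !always_add_grants($environment)) {
    $fields['access__all'][] = 0;
    return $fields;
  }
  foreach ($grant_rows as $row) {
    $fields['access_node_' . $site_hash . '_' . $row['realm']][] = $row['gid'];
  }
  return $fields;
}

// Hypothetical node assigned to domain_id 1 and viewable anonymously there.
$grant_rows = array(array('realm' => 'domain_id', 'gid' => 1));
$setting_off = access_fields_for_node(array('conf' => array()), TRUE, $grant_rows, 'nl9yt0');
$setting_on = access_fields_for_node(
  array('conf' => array('apachesolr_access_always_add_grants' => 1)),
  TRUE,
  $grant_rows,
  'nl9yt0'
);
```

With the setting off the node would be indexed as access__all:0; with it on, the node would carry its domain_id grant instead, which is why a re-index is needed after changing the setting.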
kirkkala’s picture

We have a site set up with multiple domains defined through Domain Access, and nodes published only to the default domain appeared in searches performed on subdomains.

Applying the patch in #31 and checking the new setting fixed our search issue perfectly (after rebuilding the Solr index).

I would definitely like to see this patch in the next release. Thanks!

agentrickard’s picture

Status: Needs review » Reviewed & tested by the community

Sounds like a good real-world test.

Nick_vh’s picture

Thanks for testing, let's commit this in the next committing round!

Nick_vh’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
4.61 KB

Patch did not apply, reroll

Nick_vh’s picture

Version: 7.x-1.x-dev » 6.x-3.x-dev
Status: Needs review » Patch (to be ported)

Committed. Thanks! Need backport

pwolanin’s picture

Status: Patch (to be ported) » Needs review
FileSize
4.44 KB

only partially tested backport

amccune’s picture

Nick_vh and agentrickard,

Can you tell me what version of the module the patch in #29 applies to? I have tried various versions with no success. Would it be possible for someone who has Domain Access being properly respected to send me a zip of their apachesolr folder?

I'm under pressure to get this working ASAP and am not getting anywhere fast :(

Any help would be greatly appreciated.

Thanks a lot.

Adam