I was implementing this module on a site where I wanted to have a search that searched nodes, pages, files and users all at the same time and ran accross an interesting issue.

When the search_by_page.module calls the sbp_query_modify to build the search query BOTH sbp_attach_sbp_query_modify() and sbp_nodes_sbp_query_modify() add a join to node_access in when required (when not using the user with uid = 1). This creates an sql error as {node_access} na appears twice in the FROM clause in the select.

In order to fix this I tried a couple of things - e.g. re-aliasing one of the node_access tables as na2 (like you had done for node in sbp_attach_sbp_query_modify() ) but this didn't work.

What worked in the end was moving the node_access check into a sub-query for both node and attachment searches.

I've attached a patch.

CommentFileSizeAuthor
#4 541958.patch4.58 KBjhodgdon
node_access_issue.patch1.73 KBalexpott

Comments

jhodgdon’s picture

Assigned: Unassigned » jhodgdon

Thanks for reporting and patching! I'll look into this. I'm not sure sub-queries are supported in all versions of MySQL, so I might see if I can get it to work without the sub-queries somehow.

jhodgdon’s picture

Which module(s) are you using to control content access, or what special permissions do you have set?

The reason I am asking is that I am not getting this problem on my test site, even when searching with the anonymous user. But I have "access content" granted to anonymous, and no other special node access permissions, so that's probably the reason our sites are behaving differently.

alexpott’s picture

Just using content_access... but any module that will cause _db_rewrite_sql to return a join to the node_access table should cause the issue if both sbp_nodes and sbp_attach are active.

I think just enabling the content_access module will be enough to create the issue. The important thing is to make it add a join to {node_access} twice in the $join variable in _search_by_page_do_search() function in search_by_page.module.

My MSQL version is 5.0.45

jhodgdon’s picture

Status: Active » Needs review
StatusFileSize
new4.58 KB

Thanks. With Content Access module, I was able to set up a test case and reproduce your issue.

I also found a solution involving search and replace, which I think is preferable to using sub-queries (supposed to be faster). I put it into an API function, so that it can be reused in future sub-modules.

Can you try the attached patch and see if it works for you? Thanks in advance...

jhodgdon’s picture

I have also committed this change to the development version (6.x-1.x-dev) of Search by Page, so if you wait until the next snapshot is generated, you can also download the (hopefully) fixed-up module later today from the Search by Page main project page.

alexpott’s picture

Version: 6.x-1.1 » 6.x-1.x-dev

Unfortunately the new patch in 6.x-1.x-dev doesn't work for me as it doesn't return any results when it should do...

If I disable either the attachments or the node module I get results but when both enabled and I search as a non-admin user I get no results.

Investigating further, the query the new patch generates when all search_by_page modules are enabled is:

SELECT SUM(i.score * t.count) AS score 
FROM search_index i 
INNER JOIN search_total t ON i.word = t.word 
INNER JOIN sbp_path sp ON i.sid = sp.pid 
LEFT JOIN sbpa_files saf ON sp.modid = saf.fid 
LEFT JOIN node sbpa_n ON sbpa_n.nid = saf.nid AND sbpa_n.vid = saf.vid 
INNER JOIN node_access sbpa_na ON sbpa_na.nid = sbpa_n.nid 
LEFT JOIN node sbpn_n ON sbpn_n.nid = sp.modid 
INNER JOIN node_access sbpn_na ON sbpn_na.nid = sbpn_n.nid 
LEFT JOIN sbptt1959937329 sbpp ON sbpp.pid = sp.modid 
WHERE (0 OR 
( sp.from_module = 'sbp_attach' AND 1 AND (0 OR ( saf.source = 'cck' AND (1))) AND sbpa_n.status = 1 AND (sbpa_na.grant_view >= 1 AND ((sbpa_na.gid = 0 AND sbpa_na.realm = 'all') OR (sbpa_na.gid = 33 AND sbpa_na.realm = 'content_access_author') OR (sbpa_na.gid = 2 AND sbpa_na.realm = 'content_access_rid') OR (sbpa_na.gid = 3 AND sbpa_na.realm = 'content_access_rid')))) 
OR ( sp.from_module = 'sbp_nodes' AND sbpn_n.status = 1 AND (sbpn_na.grant_view >= 1 AND ((sbpn_na.gid = 0 AND sbpn_na.realm = 'all') OR (sbpn_na.gid = 33 AND sbpn_na.realm = 'content_access_author') OR (sbpn_na.gid = 2 AND sbpn_na.realm = 'content_access_rid') OR (sbpn_na.gid = 3 AND sbpn_na.realm = 'content_access_rid')))) 
OR ( sp.from_module = 'sbp_paths' AND sbpp.perm = 1) 
OR ( sp.from_module = 'sbp_users' AND 0)) 
AND (i.word = 'test') AND i.type = 'search_by_page' 
GROUP BY i.type, i.sid HAVING COUNT(*) >= 1 ORDER BY score DESC LIMIT 0, 1

I think the issue is with the inner joins on node_access... if I change these to left joins the queries return the same result as for my sub-select patch.

If I remove the part of the query for the nodes or the files I get a result so the issue is not with the node access records - also, as stated above if I disable either nodes or attachments I get a result - so almost certainly not a node_access issue.

Unfortunately changing the node_access join to a left join is not right as this would then give access to nodes the user is not supposed to have. As far as I know, the only way to solve this issue is with the badly-performing sub-selects or with a dreaded union select. The latter solution would require a re-write of do_search and therefore probably is not practical.

Any ideas?

jhodgdon’s picture

In my testing, that new version works correctly.

My testing set-up: I enabled Content Access, enabled per-item permissions, and created one page that was public and one that was private. I was able to replicate the behavior you had originally reported, when searching as the anonymous user. (First step in fixing a bug: replicate it!)

Then with the patch above (or the current dev snapshot), I got correct results: In the case where the search is on a word that is on a restricted page/attachment only, I get no results when searching as the anonymous user. If I search for a word that is in a public page, I get that page as a result.

So I'm a bit confused as to why it's working for me and not for you... maybe the reason it is working differently for you is that you have something more complex set up. Can you provide me with a description of your permission setup so I can try to replicate your problem (and hopefully resolve it) on my system? Thanks!

alexpott’s picture

I'm confused too... :)

Maybe this is a mysql version issue - I'm on 5.0.45... stuck on RHEL5 unfortunately.

My permissions set up is a bit different... I have public and private node types... but the node_access records would be pretty similar.

However the following tests suggest this is not a node_access issue.

If I run this select I get 0 rows... (both nodes and attachments)

 SELECT SUM( i.score * t.count ) AS score
FROM search_index i
INNER JOIN search_total t ON i.word = t.word
INNER JOIN sbp_path sp ON i.sid = sp.pid
LEFT JOIN sbpa_files saf ON sp.modid = saf.fid
LEFT JOIN node sbpa_n ON sbpa_n.nid = saf.nid
AND sbpa_n.vid = saf.vid
INNER JOIN node_access sbpa_na ON sbpa_na.nid = sbpa_n.nid
LEFT JOIN node sbpn_n ON sbpn_n.nid = sp.modid
INNER JOIN node_access sbpn_na ON sbpn_na.nid = sbpn_n.nid
WHERE (
i.word = 'week'
)
AND i.type = 'search_by_page'
GROUP BY i.type, i.sid
HAVING COUNT( * ) >=1
ORDER BY score DESC
LIMIT 0 , 1 

if I run this select... I get 1 row (attachment only)

 SELECT SUM( i.score * t.count ) AS score
FROM search_index i
INNER JOIN search_total t ON i.word = t.word
INNER JOIN sbp_path sp ON i.sid = sp.pid
LEFT JOIN sbpa_files saf ON sp.modid = saf.fid
LEFT JOIN node sbpa_n ON sbpa_n.nid = saf.nid
AND sbpa_n.vid = saf.vid
INNER JOIN node_access sbpa_na ON sbpa_na.nid = sbpa_n.nid
WHERE (
i.word = 'week'
)
AND i.type = 'search_by_page'
GROUP BY i.type, i.sid
HAVING COUNT( * ) >=1
ORDER BY score DESC
LIMIT 0 , 1 

And this returns 1 row too (attachments only)

 SELECT SUM( i.score * t.count ) AS score
FROM search_index i
INNER JOIN search_total t ON i.word = t.word
INNER JOIN sbp_path sp ON i.sid = sp.pid
LEFT JOIN sbpa_files saf ON sp.modid = saf.fid
LEFT JOIN node sbpa_n ON sbpa_n.nid = saf.nid
AND sbpa_n.vid = saf.vid
INNER JOIN node_access sbpa_na ON sbpa_na.nid = sbpa_n.nid
WHERE (
i.word = 'week'
)
AND i.type = 'search_by_page'
GROUP BY i.type, i.sid
HAVING COUNT( * ) >=1
ORDER BY score DESC
LIMIT 0 , 1 
jhodgdon’s picture

I think the problem is that the Search by Page queries need some parentheses inserted.

Can you try this query and see if you get both nodes and attachments on your system (assuming the word 'week' appears in both)?

SELECT SUM( i.score * t.count ) AS score, sp.page_path, sp.from_module
FROM search_index i
INNER JOIN search_total t ON i.word = t.word
INNER JOIN sbp_path sp ON i.sid = sp.pid
LEFT JOIN (
sbpa_files saf
LEFT JOIN node sbpa_n ON sbpa_n.nid = saf.nid
AND sbpa_n.vid = saf.vid
INNER JOIN node_access sbpa_na ON sbpa_na.nid = sbpa_n.nid
) ON sp.modid = saf.fid
LEFT JOIN (
node sbpn_n
INNER JOIN node_access sbpn_na ON sbpn_na.nid = sbpn_n.nid
) ON sbpn_n.nid = sp.modid
WHERE (
i.word = 'week'
)
AND i.type = 'search_by_page'
GROUP BY i.type, i.sid
HAVING COUNT( * ) >=1
ORDER BY score DESC
LIMIT 0 , 30
alexpott’s picture

The parentheses work great... never knew that they worked in joins...not too obvious from the MYSQL pages either... however having a quick look at MYSQL's doc we seem to have introduced a dependency on mysql versions...

In versions of MySQL prior to 5.0.1, parentheses in table_references were just omitted and all join operations were grouped to the left. In general, parentheses can be ignored in join expressions containing only inner join operations.

jhodgdon’s picture

Excellent, and thanks for finding that MySQL doc, which explains why things are working differently for you than for me.

I have just checked in a fix to the 6.1-dev branch that introduces parentheses into the Attach and Nodes joins (the others are simple one-table joins and don't need parens). Hopefully the new code will work equally well on my test site and your site. Let me know what you find out.

alexpott’s picture

All's working well... however through testing I did come across another issue... :) (will raise new issue once confirmed)

I can confirm that the module respects node_access permissions, the upload permission and cck field permissions.

jhodgdon’s picture

Status: Needs review » Fixed

I'll go ahead and mark this fixed then, since the fix is checked in to the 6.x-1.x-dev branch.

jhodgdon’s picture

I decided there were enough urgent bug fixes to make an official release, so this is now fixed in version 6.x-1.2 of Search by Page.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.