Spellcheck should respect permissions

Project: Apache Solr Search Integration
Version: 7.x-1.x-dev
Component: schema.xml
Category: feature request
Priority: normal
Assigned: Unassigned
Status: postponed
Issue tags: spellchecker

Issue Summary

As far as I understand Solr's spellcheck feature, there is no way to keep it from suggesting words that exist only in permissioned content. For example, if you use node access to make node/5 unavailable because it has the word "sekr3t", an unprivileged user might still get "sekr3t" as a spelling suggestion, even though clicking that suggestion will bring up no results.

Likewise, if a user has restricted field level privileges due to content_permissions, he or she will still be shown spelling suggestions from the content in those fields.

I don't see any good way to fix this given the tools that Solr makes available. Any ideas?

One way around this would be file-based spellchecking with dictionaries. You'd have to provide default language dictionaries and add words to them as needed (e.g. "Drupal" would have to be added manually on Drupal.org if we were using a file-based dictionary here).

Comments

#1

Here's a comment by Peter when the apachesolr_autocomplete module was born: http://drupal.org/node/394076#comment-1679980

"A good start, but I'm not sure that is suitable for general use since AFAIK, you can never have access controls work with this."

In the comments thereunder are some suggestions that might allow for nodeaccess-aware suggestions, although with a performance hit.

#2

I just ran into this with spellcheck - had forgotten our earlier discussion.

Is there something in the Solr API that lets us control which spellcheck index(es) a given document goes into at index time?

#3

We are now telling Solr to use the title and body fields to populate the "spell" field:

   <field name="spell" type="textSpell" indexed="true" stored="true" multiValued="true"/>
   [...]
   <copyField source="title" dest="spell"/>
   <copyField source="body" dest="spell"/>

So, we could "just" change this.

Hmm, first (crazy?) idea: create a title_anonymous and body_anonymous field that are only populated on the Drupal side at indexing time, IF the node is visible to anonymous users. Then we can specify copyField source to be title_anonymous and body_anonymous. Result: spellcheck corpus comes from only anon-visible information.
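A minimal sketch of that idea, written in Python rather than the module's PHP and meant only as an illustration. The field names title_anonymous and body_anonymous come from the comment above; the function name, the shape of `node`, and the `anon_can_view` flag are assumptions, and the actual access check is left out.

```python
def build_solr_document(node, anon_can_view):
    """Return the field dict to send to Solr for one node.

    `node` is a plain dict with "nid", "title" and "body" keys (an assumed
    shape, for illustration only). `anon_can_view` is the result of whatever
    access check the indexing side performs for anonymous users.
    """
    doc = {
        "nid": node["nid"],
        "title": node["title"],
        "body": node["body"],
    }
    # Only anonymous-visible content feeds the fields that schema.xml
    # would copy into "spell".
    if anon_can_view:
        doc["title_anonymous"] = node["title"]
        doc["body_anonymous"] = node["body"]
    return doc


public = build_solr_document({"nid": 1, "title": "Hello", "body": "World"}, True)
secret = build_solr_document({"nid": 5, "title": "Plan", "body": "sekr3t"}, False)
```

With this shape, the restricted node is still indexed and searchable for users who have access, but none of its text reaches the spellcheck corpus.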

#4

If we can't have a spellcheck that respects access permissions of all roles then I'm all for a spellcheck that only uses content available to anonymous users.

#5

So, it looks like an option would be to build multiple spellcheck dictionaries, e.g. one for anonymous and one for authenticated users.

Then we could set this parameter in the request based on the current user.

spellcheck.dictionary

The name of the spellchecker to use. This defaults to "default". Can be used to invoke a specific spellchecker on a per request basis.

This would be more useful if there were a way to have a conditional copyField.
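As a rough sketch of the per-request switch (Python for illustration only, not the module's code): spellcheck.dictionary is the Solr parameter quoted above, while the dictionary names spell_anonymous and spell_authenticated and the function name are hypothetical and would have to match spellcheckers actually defined in solrconfig.xml.

```python
from urllib.parse import urlencode


def spellcheck_params(query, is_authenticated):
    """Build Solr request parameters with a per-user spellcheck dictionary."""
    # Hypothetical dictionary names; they would have to be defined as
    # separate spellcheckers in solrconfig.xml.
    dictionary = "spell_authenticated" if is_authenticated else "spell_anonymous"
    return {
        "q": query,
        "spellcheck": "true",
        "spellcheck.dictionary": dictionary,
    }


anon_query = urlencode(spellcheck_params("sekret", is_authenticated=False))
```

This only separates two classes of user; it does not address finer-grained node access or field-level permissions.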

#6

Title: Spellcheck: no way to make it respect permissions » Spellcheck should respect permissions
Version: 6.x-2.x-dev » 7.x-1.x-dev

#7

subscribing

#8

Category: bug report » feature request
Status: active » postponed

Postponed because of a lack of activity. Hoping that someone will pick this up. One possibility is to tell Solr to return multiple suggestions and to process them on the Drupal side. See #875716: Support collation for Did You Mean spelling suggestions, and support multiple suggestions.
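One way the client-side processing could look, sketched in Python for illustration only. It assumes the suggested words have already been pulled out of the Solr response, and that some callable (here the hypothetical `count_visible_results`) can run the access-filtered search for a word and report how many hits the current user would see.

```python
def filter_suggestions(suggestions, count_visible_results):
    """Keep only suggestions that return results the current user may see.

    `suggestions` is a list of suggested words already extracted from the
    Solr response. `count_visible_results` is a callable (hypothetical) that
    runs the access-filtered search for a word and returns the hit count.
    """
    return [word for word in suggestions if count_visible_results(word) > 0]


# Stand-in for access-filtered searches: hit counts for an unprivileged user.
visible_hits = {"secret": 12, "sekr3t": 0, "secrete": 3}
allowed = filter_suggestions(
    ["sekr3t", "secret", "secrete"],
    lambda word: visible_hits.get(word, 0),
)
```

The cost is one extra access-filtered query per suggestion, which is the kind of performance hit mentioned in #1.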
