I'm having a problem with some confidential files coming up on Google and I wondered how you guys get round this.

We're using the nodeaccess module to protect certain sections of the site, but have allowed files to be public (in our site settings) - as we want most attachments to be accessible to anonymous users. However, I'd assumed that files attached to private nodes would also be private.

Yet, Google finds a way round and the "view as HTML" option in their search results displays the restricted content just fine. Has anyone else had this problem? And has anyone found a way round it?

Comments

vm’s picture

Use a robots.txt file to disallow the Google bot from the relevant path. Google pays attention to robots.txt (not all search bots do), so if you tell Google not to crawl a specific path, it won't. More on robots.txt files can be found on both drupal.org and Google.
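To make that concrete, here is a rough sketch of what such a rule might look like, checked with Python's standard `urllib.robotparser`. The `/files/` path and the helper name `googlebot_may_fetch` are assumptions for illustration only; use whatever path your site actually stores uploads under.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules. The /files/ path is an assumption;
# substitute the directory your site really uses for uploads.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /files/
"""


def googlebot_may_fetch(url_path: str) -> bool:
    """Return True if the rules above allow Googlebot to crawl url_path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch("Googlebot", url_path)


if __name__ == "__main__":
    print(googlebot_may_fetch("/files/report.pdf"))  # path under the disallowed folder
    print(googlebot_may_fetch("/node/1"))            # ordinary page, not covered by the rule
```

Keep in mind this only tells well-behaved crawlers what to skip; it is a request to the bot, not access control on the files themselves.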

divrom’s picture

Thanks, I hadn't thought about protecting the files folder in such a way that Google would still find the post.

snav’s picture

Hi,

I'm fairly new to Drupal, so I need some clarification on what you're saying. I have a feeling that the files would still be accessible. My understanding is that the robots.txt file will prevent Google from accessing a portion of the site.

However, in this instance, because the file storage method is public (and I'm guessing the directory must be accessible over the web, otherwise Google wouldn't have nabbed it), the confidential files themselves would still be accessible if someone were to type in the URL directly?

I'd really appreciate your input. Thanks.

vm’s picture

However, in this instance, because the file storage method is public (and I'm guessing the directory must be accessible over the web, otherwise Google wouldn't have nabbed it), the confidential files themselves would still be accessible if someone were to type in the URL directly?

All files located in any folder within the public root of a server can be accessed. The only time files are private and cannot be accessed is when the folder they are stored in is above the public root. Thus, even when you set the file system to private in Drupal, if you're still storing the files in a folder inside the public root, they are still accessible.

For files to be completely private, the folder they are stored in must sit above the public root, and the file system must be set to private and pointed at that private folder using a relative path.
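As a quick sanity check on the "above the public root" point, a minimal Python sketch along these lines can tell you whether a configured files folder still resolves to somewhere inside the web root. The paths (`/var/www/html`, `../private_files`) and both function names are made up for illustration, and the check assumes POSIX-style paths with an absolute public root.

```python
import posixpath
from pathlib import PurePosixPath


def resolve_from_root(public_root: str, relative_path: str) -> str:
    """Resolve a relative files path (e.g. '../private_files') against the public root."""
    return posixpath.normpath(posixpath.join(public_root, relative_path))


def is_inside_public_root(files_dir: str, public_root: str) -> bool:
    """Return True if files_dir is the public root itself or a folder beneath it."""
    files = PurePosixPath(files_dir)
    root = PurePosixPath(public_root)
    return files == root or root in files.parents


if __name__ == "__main__":
    root = "/var/www/html"  # hypothetical public root
    for rel in ("files", "../private_files"):
        target = resolve_from_root(root, rel)
        print(rel, "->", target, "inside public root:", is_inside_public_root(target, root))
```

A folder that comes back as inside the public root can still be requested by URL, whatever the file system setting says; one that resolves outside it can only be served through Drupal.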

snav’s picture

That clears it all up, thanks again.