I'm seeing duplicate title reports in Google Webmaster Tools for my site, which uses Content Locking. Under HTML Suggestions, Google is listing 54 pages that have duplicate titles because it's somehow finding the /node/xxxx/canceledit page, which was added by Content Locking. I have no idea why this URL should be visible to Google, as only logged-in users can even edit content, but this should be fixed.

Thanks,
Augie

Comments

eugenmayer’s picture

Category: bug » feature

Sorry, I can't see how this is a bug; it's most probably an optimization.

As I am not very interested in SEO, perhaps someone else can file a patch here?

mdlueck’s picture

If these /node/xxxx/canceledit pages should not be picked up by search engines, perhaps the module should set a meta tag in the header that tells search engines not to index the page but still follow the links on it.

<meta name="robots" content="noindex,follow,noarchive,nosnippet" />
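
One way to confirm whether a given page actually carries such a tag is to parse its HTML and read the robots directives. Below is a minimal Python sketch, assuming the page markup has already been fetched into a string. `RobotsMetaParser` and `robots_directives` are hypothetical names used for illustration only; they are not part of Content Locking or Drupal.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                part = part.strip().lower()
                if part:
                    self.directives.add(part)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Sample markup using the tag suggested above.
sample = (
    '<html><head>'
    '<meta name="robots" content="noindex,follow,noarchive,nosnippet" />'
    '</head><body></body></html>'
)
print(sorted(robots_directives(sample)))
```

If `noindex` is missing from the result for a /node/xxxx/canceledit page, the tag is not being emitted there.
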
eugenmayer’s picture

I am just curious: why should Google ever find the /canceledit URL? That would mean Google followed the node/%/edit URL from the local task.

That must be forbidden anyway, and it is the only way to reach the canceledit URL. Locking down most local tasks for SEO is quite a normal thing to do.

If there is no good argument against it, I will mark this as won't fix / works as designed.

@2: Drupal has no core API for this, and I won't use a contrib module to add those headers.

In general, the cancel button should not be a link but rather a form submit, which Google would ignore. But that's a totally different story and has different reasons.

mdlueck’s picture

@Eugen: Do you mean that for GoogleBot (anonymous) the site was actually editable!?!?

eugenmayer’s picture

@4: That depends on your setup; I have no idea.

augiem’s picture

There's no way Googlebot could be picking up the link from the Edit page, as only authenticated users have access to edit any kind of node. I've checked in PathAuto and there are no aliases to canceledit anywhere. XML sitemap also has no references to this, so I can't imagine where else it would come from if not directly from this module. The only other connections to Google through the site are Google Analytics and Ping. GA is not active for authenticated users, only anonymous.

Simply doing a search for site:www.owningpink.com canceledit in Google turns up hundreds of results. Anyone else with this module care to test their site on Google?

UPDATE: Found the culprit. The CANCELEDIT button is being added to the Node Comment form. Why is that? Content locking on comments? And for anonymous users... hmm... my content_lock permissions for anonymous are

administer checked out documents : NO
check out documents : NO
keep documents checked out : NO

eugenmayer’s picture

Ah, I see. Well, yes, it's the normal cancel implementation. I guess a simple nofollow will do the job.

summit’s picture

Subscribing. Couldn't it be settled by changing the robots.txt?
Greetings, Martijn
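
A robots.txt rule for this would need a wildcard, because the node ID sits in the middle of the path. The Python sketch below mimics the wildcard convention Googlebot documents (`*` matches any run of characters, a trailing `$` anchors the end of the path), so a candidate rule can be tried before it is deployed. The rule `/node/*/canceledit` and the helper `rule_matches` are assumptions for illustration, not something taken from the module, and the sketch assumes clean URLs are enabled.

```python
import re


def rule_matches(pattern, path):
    """Return True if a robots.txt path pattern matches the URL path.

    '*' matches any sequence of characters; a trailing '$' anchors the
    end of the path. Without '$', the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only anchors at the start, which gives prefix semantics.
    return re.match(regex, path) is not None


# Hypothetical Disallow value for this issue.
DISALLOW = "/node/*/canceledit"

print(rule_matches(DISALLOW, "/node/1234/canceledit"))
print(rule_matches(DISALLOW, "/node/1234"))
```

Note that a Disallow rule only stops crawling; URLs that are already indexed can remain in the results for some time.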

eugenmayer’s picture

Status: Active » Fixed

Fixed; will be included in 2.3.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.