Although this has been discussed before, I feel a few changes should be made to the Canonical URL handling:

  • There should be an option to pass the Canonical URL value through the url() function to obtain the URL alias; this would default to FALSE, so the system path would be used.
  • There should be an option to make the Canonical URL an absolute value; this would default to FALSE, so that only the local path is used (combined with Drupal's normal base path value, this is sufficient).
  • Ensure all in-module instructions are accurate for both how the module works and Google's supported use.
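To make the first two options concrete, here is a minimal sketch of the proposed behaviour. It is a language-neutral model written in Python, not Drupal code: `canonical_url`, `use_alias`, `absolute`, the `aliases` dictionary and the `base_url` default are all hypothetical names chosen for illustration, and the alias lookup only stands in for what url() would do.

```python
def canonical_url(system_path, aliases, use_alias=False, absolute=False,
                  base_url="http://example.com"):
    """Build a canonical URL value from a system path such as 'node/12'.

    All names here are assumptions for illustration; `aliases` maps
    system paths to their aliases and stands in for the alias lookup.
    """
    # Option 1: with use_alias off (the proposed default), keep the system path.
    path = aliases.get(system_path, system_path) if use_alias else system_path
    # The relative form carries no initial slash.
    path = path.lstrip("/")
    # Option 2: with absolute off (the proposed default), return the local path only.
    if absolute:
        return base_url.rstrip("/") + "/" + path
    return path
```

With `aliases = {"node/12": "myalias"}`, the defaults would yield `node/12`, enabling only the alias option would yield `myalias`, and enabling both would yield `http://example.com/myalias`.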

Comments

jwilson3:

As it currently works, if globalredirect/pathauto/redirect/etc. is installed, we have a fundamental conundrum that could confuse search engines...

Example: a page with the alias "/myalias" has a canonical tag that points to node/X, which automatically redirects to /myalias, which points to node/X, which redirects to...

What if we run the value through url() by default, or calculate the default for this option based on the presence of these other modules? That way, people who are unsure of the configuration get sensible defaults that won't penalize them out of the box.
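A rough sketch of how such a calculated default might look, again modelled in Python rather than Drupal PHP. The module names come from the list above; `default_use_alias`, `canonical_is_redirected` and the `redirects` dictionary are hypothetical and only illustrate the idea.

```python
# Modules named in this comment whose presence suggests aliases are in use.
# The set itself is an assumption for illustration, not an exhaustive list.
ALIAS_RELATED_MODULES = {"globalredirect", "pathauto", "redirect"}


def default_use_alias(enabled_modules):
    """Default the alias option to on when any alias-related module is enabled."""
    return bool(ALIAS_RELATED_MODULES & set(enabled_modules))


def canonical_is_redirected(canonical_path, redirects):
    """Return True when the canonical target is itself redirected elsewhere.

    `redirects` maps a requested path to the path it redirects to,
    for example {"node/12": "myalias"}.
    """
    return canonical_path in redirects
```

In the example above, a canonical value of `node/12` would be flagged because it redirects to `myalias`, while a canonical value of `myalias` would not.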

jwilson3:

To clarify the third bullet point in the OP: the description text for the canonical field on per-node edit forms says the following (emphasis added by me):

Canonical URLs are used from the search engines, and allow them to not report duplicate titles for HTML pages that are accessible from different URLs. Use a relative URL without the initial slash; canonical URLs that point to a different domain are normally not accepted.

The last phrase is technically incorrect, as Nodewords does support absolute URLs that point to other domains. Pointing to another domain is also a valid use of rel="canonical", so the description text should be updated to remove this statement.

damienmckenna:

Status: Active » Fixed

The form field's description has been updated to the following:

Canonical URLs are used by search engines to identify the primary location that specific content is available from; this is useful when content is accessible from multiple URLs, either within the same site or across multiple sites that are sharing content. Use a relative URL without the initial slash.

I believe that is a more technically correct message.

jwilson3:

kewl, #3 sounds good.

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous:

Issue summary: View changes

Added a third item to cover updating the instructions.