The major web crawlers have published an additional way to communicate which URL should be indexed: inserting a <link rel="canonical"> element into the page's HEAD.
See http://googlewebmastercentral.blogspot.com/2009/02/canonical-link-elemen... for example.
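
For readers who have not seen the element, here is a minimal sketch of what ends up in the page HEAD. The helper name canonical_link_tag and the example.com URL are placeholders for illustration only; this is not code from GlobalRedirect or XML Sitemap.

```python
from html import escape


def canonical_link_tag(url: str) -> str:
    """Return a <link rel="canonical"> element for an absolute URL."""
    # Escape &, <, > and quotes so the URL is safe inside an HTML attribute.
    return '<link rel="canonical" href="{}" />'.format(escape(url, quote=True))


# Hypothetical URL, used only to show the shape of the output.
print(canonical_link_tag("http://example.com/node/12?page=1&sort=asc"))
# prints: <link rel="canonical" href="http://example.com/node/12?page=1&amp;sort=asc" />
```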

The http://drupal.org/project/globalredirect project is implementing it, but it is essential that XMLSiteMap and GlobalRedirect stay consistent, meaning that they push exactly the same URL to web crawlers.

It is therefore probably important that both teams communicate and cross-validate their strategies for determining a unique URL for each node, view, user, and every other object type eligible for these features (sitemap and canonical tag).
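
To make the consistency requirement concrete, here is a rough sketch of a check that could be run against a site from the outside. It assumes the canonical URLs have already been collected into a dict keyed by internal path; the function names, that dict, and the example.com URLs are all hypothetical, and neither module exposes such an API as far as this issue states.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> set:
    """Collect every <loc> value listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip()
            for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
            if loc.text}


def inconsistent_urls(sitemap_xml: str, canonical_by_path: dict):
    """Return (sitemap URLs that are not canonical,
               canonical URLs missing from the sitemap)."""
    listed = sitemap_urls(sitemap_xml)
    canonical = set(canonical_by_path.values())
    return sorted(listed - canonical), sorted(canonical - listed)


# Hypothetical data: node/12 has the alias blog/hello, but the sitemap
# still lists the unaliased URL.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/about</loc></url>
  <url><loc>http://example.com/node/12</loc></url>
</urlset>"""
canonical = {
    "node/7": "http://example.com/about",
    "node/12": "http://example.com/blog/hello",
}
not_canonical, missing = inconsistent_urls(sample, canonical)
print(not_canonical)  # sitemap entries that no page declares as canonical
print(missing)        # canonical URLs the sitemap does not list
```

Both lists being empty would mean the two modules push the same set of URLs.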

Comments

avpaderno’s picture

Title: Publish consistent URLs in XMLSITEMAP and Canonical URLs » Publish consistent URLs with Canonical URLs
avpaderno’s picture

Status: Active » Closed (works as designed)

The problem of Google Webmaster Tools flagging some pages as having duplicate titles or meta tags is not specific to XML Sitemap; it affects Drupal-powered sites with or without XML Sitemap.

The problem should be resolved by a third-party module, not by XML Sitemap, which has a completely different purpose.

open-keywords’s picture

Status: Closed (works as designed) » Active

@kiam

I have never talked about any of the issues you raise:
- nothing about WMT complaints
- nothing about titles
- nothing about meta tags

I'm only saying that the URLs set as "canonical" by globalredirect should be the ones pushed in the sitemap, and vice versa: be consistent.
If a situation has no obvious canonical URL, it is safer not to set one.

Regards
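
The "safer not to set it" rule above could be sketched like this. The helper pick_canonical and the idea of gathering candidate URLs per page are assumptions made for illustration, not something either module is known to do.

```python
from typing import Iterable, Optional


def pick_canonical(candidates: Iterable[str]) -> Optional[str]:
    """Return the canonical URL only when exactly one distinct candidate exists."""
    unique = set(candidates)
    if len(unique) == 1:
        return next(iter(unique))
    # Zero candidates or several competing ones: leave the canonical unset.
    return None


# Hypothetical candidates for two pages.
print(pick_canonical(["http://example.com/about", "http://example.com/about"]))
print(pick_canonical(["http://example.com/blog/hello", "http://example.com/news/hello"]))
```

When the function returns None, the page would get no canonical tag, and the sitemap side would have to decide separately whether to list it.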

avpaderno’s picture

Status: Active » Closed (works as designed)

The purpose of setting a canonical URL is to keep Google Webmaster Tools from reporting that two pages have duplicate titles or duplicate meta tags (this is explained in the link you referred to).

The purpose of the project's modules is to create a list of links that is then exposed to search engines as an XML file. Whether the exposed links match the canonical URLs that Google Webmaster Tools expects is not a concern of XML Sitemap, which should simply list the links and report the change frequency (and the priority) of each.
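
As an illustration of the list described here, the following sketch builds a sitemap in the sitemaps.org format with a change frequency and priority per link. The function build_sitemap and the sample entry are hypothetical; this is not how the XML Sitemap module itself is implemented.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries) -> str:
    """Build a sitemap from (loc, changefreq, priority) tuples.

    changefreq is one of the protocol values (always, hourly, daily,
    weekly, monthly, yearly, never); priority ranges from 0.0 to 1.0.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = "{:.1f}".format(priority)
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical entry, for illustration only.
print(build_sitemap([("http://example.com/about", "weekly", 0.5)]))
```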