Hello,
to start, thanks for maintaining such a great module for drupal, but I've found, what i think to be an issue...
lets say you made a node with the title "Hello World", when using the path auto you get: "mydrupalwebsite.com/hello-world" but the canonical url uses the default clean drupal path as:
<link rel="canonical" href="/node/2616" />
and i think that the most path auto users block the /node/** path in there robots.txt to avoid double content.
Also i think its best to use the default url, and not the relative part, so it becomes:
<link rel="canonical" href="http://mydrupalwebsite.com/hello-world" />
this because not everybody has there base url defined..
i hope it's clear enough :)
Comment | File | Size | Author |
---|---|---|---|
#39 | nodewords-n650362-39.patch | 728 bytes | DamienMcKenna |
#36 | nodewords-n650362-36.patch | 1.56 KB | DamienMcKenna |
#31 | nodewords-canonical.patch | 1.4 KB | jcisio |
#25 | nodewords-n650362-2.patch | 1.68 KB | DamienMcKenna |
#23 | nodewords-n650362.patch | 1.66 KB | DamienMcKenna |
Comments
Comment #1
apadernoIt does look for the path alias, but it does it in the wrong place.
The attached patch changes the development snapshot code to fix this problem.
Comment #2
apadernoThe code has been changed, and committed in CVS for the development snapshot.
Comment #3
ao2 CreditAttribution: ao2 commentedHi,
I think that the fix is not complete with regard to the Original Poster requests (the absolute/relative URL issue).
If a base_url is not defined we should still use the global one to get an absolute URL; currently, passing an empty string in
$options['base_url']
tourl()
makes it ignore the 'absolute' setting.See this patch (also attached), it shows the problem, and the difference between a NULL value and an empty string on a check based on
isset()
(seeurl()
implementation).An alternative could be to use a snippet like the following, functionally equivalent to the one-liner.
Regards,
Antonio
Comment #4
apadernoThe code has been changed, and committed in CVS.
Thanks for the help, and the patch.
Comment #6
Sputnik CreditAttribution: Sputnik commentedComment #7
Sputnik CreditAttribution: Sputnik commentedHello,
i reopend the issue because in the actual developer snapshot (2010-Jan-27) and also in the 6.x-1.12-beta2 the canonical url doesnt work for me.
thats what it produces:
<link rel="canonical" href="http://www.example.com/en/node/15864" />
thats what i expected :
<link rel="canonical" href="http://www.example.com/en/very-nice-content-here" />
i think this how it should work, or I am wrong ?
[Edited by kiamlaluno] to show the tags that were not show by the current input format
Comment #8
apadernoI explained in #658630: Generated canonical URL is not correct why the code is not using the path alias as canonical URL.
Comment #9
Sputnik CreditAttribution: Sputnik commented@kiamlaluno so works as designed, do you think it will be helpfull to have a feature request for this ? and thanks for the advise
Comment #10
apaderno@Sputnik: I would reply in the same way I replied here.
The canonical URL is a URL that doesn't change with the time, as it needs to uniquely identify a resource. This means that in Drupal a canonical URL is http//example.com/node/<nid> for nodes, http//example.com/user/<uid> for user profiles, etc.
Comment #11
Sputnik CreditAttribution: Sputnik commented@kiamlaluno From the technical point off view I agree with you.
On the other hand , from the SEO point off view, I prefer a URL like
http//example.com/<titel>
the alias of the Node URL as canonical URL, because the CTR will be significant higher than with the "Node" Version.That's one of the reasons because we use modules like Pathauto , Global Redirect ... . And we try to suppress the access to duplicate content, like the author of this issue demon326 explained.
I think to have an option like "use alias URL instead of node URL for canonical URL" will be a great advantage for this module.
Or is there an error in my reasoning ?
TIA
Comment #12
pavlos CreditAttribution: pavlos commented@kiamlaluno fully agree with Sputnik on this. The canonical tag should be controlled by the webmaster and not by drupal core's default url structure, as this is a basic SEO principle and this tag is supported by search engines for this exact purpose.
Comment #13
apadernoIn fact, it is possible to change the Canonical URL to the one you prefer. As there is nothing wrong using the Drupal canonical URL as canonical URL (considering also the purpose of that meta tag), I am changing back the status of this issue.
Comment #14
ao2 CreditAttribution: ao2 commented+1
I also think that the webmaster should be able to decide about the Canonical URL.
Comment #15
apadernoSee my previous comment.
Comment #16
apadernoI have marked #707876: Canonical URLs get not Pathauto as duplicate of this report.
Comment #17
pavlos CreditAttribution: pavlos commented@kiamlaluno my problem is that I have generated vocabularies using taxonomy_csv module and I need it to automatically enter the canonical url using the path for the term and not with paths like taxonomy/term/xx
Need the SEO-friendly one which is /category-name/sub-category-name/
Comment #18
eule CreditAttribution: eule commentedhello
i switch today from 6.x-1.12-beta2 to Nodewords 6.x-1.x-dev (2010-Feb-09)
the problem with the Canonical URL seems not fixed for me...all my canonicals are ./node/** or / and ./taxonomy/**
cache cleaned!
Comment #19
apaderno@eule: The status of this report is fixed because of comment #3; the code has been changed to solve what reported about the canonical URL not being an absolute URL when the base URL is an empty string.
As already explained, in Drupal the canonical URL has the form node/<nid> for nodes; if you want to use a path alias as canonical URL (which I would not suggest), then you set the canonical URL to the one you prefer. Branch 6.x-3 will allow you to use a token to globally set the canonical URL to the path alias, if you really want to do so.
Comment #20
locomo CreditAttribution: locomo commentedsubscribing
+1 comment #11
@kiamlaluno - yes it's true that we can manually control the canonical url. but now this means i have to go though and touch hundreds of nodes.. i'm sure for many this is thousands of nodes.
Comment #21
eule CreditAttribution: eule commentedsubscribing
+1 comment #11
Comment #22
DamienMcKennaHow about a simple option (e.g. variable_get('nodewords_canonical_alias', FALSE)) that decides if nodewords_basic_canonical_prepare() runs the path via drupal_get_path_alias() first?
Comment #23
DamienMcKennaHere's a patch that adds a new option "Canonical URL use alias" to the main settings form that will control whether internal paths or url aliases are used. By default the variable is FALSE so it will use the internal path until the variable is set.
Comment #24
apadernoThe proposed patch doesn't actually change the way the canonical URL is generated. Using the path alias will be possible once that the code is changed as per #508496: Support for tokens, and #706630: Implement the hooks necessary to add more tokens.
Comment #25
DamienMcKennaApologies, I hadn't fully merged the full change. Please check this new attachment.
While I laud the concern to provide a proper fix for this, Drupal via the core Path module (with PathAuto, Path_Redirect and GlobalRedirect) already has an existing structure to cleanly handle paths so it doesn't IMHO make sense to provide a redundant method of configuring the same thing.
Comment #26
ao2 CreditAttribution: ao2 commented@DamienMcKenna: thanks.
@kiamlalauno: tokens are indeed more flexible, but maybe you can accept this change for the time being. It is fairly trivial and most users will be happy NOW :)
In the meantime we can discuss (in another issue if you like) if using tokens for paths _directly_ in nodewords is sane, or if relying on tokens already used in Pathauto (which this patch should cover already) is better.
Just my 2 cents.
Regards,
Antonio
Comment #27
Duplika CreditAttribution: Duplika commentedAnother issue that came to my mind is that if I enable canonical URL's and I only specify one on a certain node, I see that all the other nodes that doesn't have a custom canonical URL's are nevertheless, all nodes have it's canonical URL pointed to themselves.
Is there any way to output the canonical URL's only on the nodes that indeed have a canonical URL and not on the other ones?
Comment #28
Dave ReidThis is simple and gets the job done. +1 for being off by default, but configurable. I tested this locally and its working great.
Comment #29
DamienMcKennaFor anyone who wants to try, I've made a version of v6.x-1.11 available with the patch applied, please let us know here how it works for you.
Comment #30
Anonymous (not verified) CreditAttribution: Anonymous commentedAs Nodewords now supports tokens, and it implements two tokens though to be used for the canonical URL, I am marking this report as fixed.
As usual, read what reported in the project page, before to update the installed modules.
Comment #31
jcisio CreditAttribution: jcisio commentedThanks, looking into the code, I see the token is [metatags-url-alias], right?
However it is not listed on q=admin/content/nodewords/default. The parameter is not saved as reported in another issue (1.13beta4). So I reroll the patch for a temporary fix (sorry, I don't know about the token thing, so I can't help).
Comment #32
Anonymous (not verified) CreditAttribution: Anonymous commentedThe issue hs been already fixed. The hook to list the tokens were not returning the result as expected from token.module, and that caused the custom tokens to not appear.
Comment #34
firebus CreditAttribution: firebus commentedafaict (and maybe i'm missing something)
1. the patch in #25 no longer applies to 6.x-1.x-dev, nor does it apply to 6.x-1.11
it seems to affect a non-existent nodewords_basic/includes/nodewords_basic.nodewords.tags.inc file.
maybe it's meant to apply to nodewords_basic.module, or i'm doing something really dumb?
2. the token feature referred to by kiam is only in the 6.x-2.x branch.
i'm not sure if i should update to 6.x-2.x or not - would it be worth rerolling #25 to apply to the current 6.x-1.x-dev?
Comment #35
3dloco CreditAttribution: 3dloco commented@firebus, I applied #25 to nodewords_basic.module for 6.x-1.11 and it is working for me.
Comment #36
DamienMcKennaI've re-rolled & slightly simplified the patch from #25 to work on 6.x-1.x-dev.
Comment #37
DamienMcKennaI've committed the patch from #36 to 6.x-1.x-dev, it'll be available in the next release.
Comment #38
DamienMcKennaThis should have been tagged.
Comment #39
DamienMcKennaPer feedback from jwilson3, I added an extra comment to help explain what appears as backwards logic in nodewords_basic_canonical_prepare() for deciding whether to use an alias or not.
Comment #41
Z2222 CreditAttribution: Z2222 commentedSorry to post in an old thread, but I just noticed that a couple of my Drupal sites have spontaneously switched from having the node alias as the canonical URL to having the
node/123
internal paths as the canonical URLs. I wanted to respond to the comment above:The
node/123
URLs are internal paths and are not the canonical URLs if there are URL aliases present. The canonical URL meta tag is a feature that was created by Google to help them index pages correctly. Its purpose is to tell Google what the correct URL is, especially in cases where a site is doing things like using Google Analytics campaign tracking as parameters in the URLs likehttp://example.com/my-webpage?utm_source=drupal-issue&utm_medium=issue-post&utm_campaign=drupalorg
.If you have a path alias and then set the canonical URL as the
node/123
URL, it is very bad for SEO, because Google will start displaying the wrong URLs in the search engine results pages.URLs like node/123 are just internal paths and should not be exposed to search engines or users when there is a URL alias present.
The ability to change this is in the module, but I just want to post a caution about changing the default settings of the module. The checkbox that sets the module to use URL aliases as the canonical URLs should be checked by default, otherwise a lot of websites will get messed up.
Comment #42
DamienMcKennaThe latest version does, in fact, use the alias by default.
Comment #43
WeRockYourWeb.com CreditAttribution: WeRockYourWeb.com commentedJ Cohen explains how this should work perfectly. Relieved that this is the default setting now, we had one of our sites completely screwed (pages dropped from Google index) by having the canonical URL set to /node/x for each page during a former update.
Comment #44
Anonymous (not verified) CreditAttribution: Anonymous commentedMy team was also burned by this issue. We updated to a new version at some point and it set the internal URL to be canonical and of course it broke our SEO practices.
I see no reason why we should have been reverting already live sites that have aliases to use node/[nid]. It goes against the very purpose of this module.
I realize that being the technical lead on my project puts the blame on me for rolling this out, however, we are in the days of Ægir and other drush-based automated deployment tools where catching a little detail like this is difficult to catch until it is too late. Giving the developer control is good, but our defaults should be a bit more in line with what the intended goal of a module is.
Comment #45
DamienMcKenna@Ryan: I am sorry you went through that. I had a disagreement with the module's last maintainer back in 2009 about what the default should be, he argued for using the internal URL while I (and some SEO analysts I worked with at the time) argued for the alias. Since v1.14-beta1 the default has been the alias.
Comment #46
Anonymous (not verified) CreditAttribution: Anonymous commented@DamienMcKenna I think what happened was that we upgraded at some point, and then later (before the revision) someone on our team had made an update to that settings form, thus preserving the bad setting, so it persisted after the subsequent updates. I really appreciate the work people have been doing on this module, it certainly beats doing this by hand for each site!