Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
The site I'm working on have 7 different languages and I'm facing an issue where I only want the node translations with an actual translation available to be indexed.
Example of what I'm getting now:
<xhtml:link rel="alternate" hreflang="en" href="http://site.dev/some-page"/>
<xhtml:link rel="alternate" hreflang="sv" href="http://site.dev/sv/node/366"/>
<xhtml:link rel="alternate" hreflang="fr" href="http://site.dev/fr/node/366"/>
<xhtml:link rel="alternate" hreflang="de" href="http://site.dev/de/node/366"/>
As the content is only available in english it should look like this:
<xhtml:link rel="alternate" hreflang="en" href="http://site.dev/some-page"/>
Is there a way to achieve this?
Thanks in advance.
Comments
Comment #2
gbyte CreditAttribution: gbyte commentedIn #2707557: Sitemap contains untranslated nodes I have argued that providing language specific links to content that is not translated would not harm SEO. I am not sure about this however; if you can find someone stating that sitemap hreflang links pointing to non-translated content are harmful, I will gladly take a closer look at this.
The implementation is not difficult at all, we just need a check if an entity has a translation:
Where I see a problem with your approach is that a page is not made up exclusively of the entity (eg node) in question. It may also have blocks/views around it (sidebar, footer) with content that may be translated and this content would not be indexed if we exclude entity pages which do not have a translation.
Another problem is posed by mixed pages like the homepage and panel pages which do not have that 'main entity' which could be checked for translations. Maybe a configuration option to choose one approach or the other is needed?
I'm changing this to 'feature request' and invite everyone to share their thoughts on this issue.
Comment #3
matsbla CreditAttribution: matsbla commentedI don't have any data to support that including hreflang links to pages without translation is bad for SEO, but I made a website with 300 languages where obviously most content did not have translations in all languages. In Google webmaster I got very many error messages about dead hreflang links.
I guess it can make sense for a website with only some few languages and where translations are frequently added and updated, to have it though.
Maybe we could make it a configuration option and then website builders can choose to turn it on or off depending on their own need?
Comment #4
gbyte CreditAttribution: gbyte commentedAs a temporary solution you guys can implement hook_simple_sitemap_links_alter in a custom module to exclude some languages completely, something along these lines:
Obviously you could do some more magic by loading the entity from the path, checking if it is translated and so on, but this will cost some performance.
Comment #5
matsbla CreditAttribution: matsbla commentedI don't think that is a good solution. Different nodes have different translations. I don't want to hard-code languages to exclude them for hreflang liks for the whole site! I simply want to remove the hreflang links when there doesn't exist a translation
Comment #6
goodwillhinton CreditAttribution: goodwillhinton commentedI agree that it is not a good solution to hard code the exclusion of individual languages. And it is a problem to submit a sitemap with inaccurate links to non-translated nodes.
You might want to take a look at this core patch that provides similar functionality to include hreflang tags only for those nodes that are published and translated: https://www.drupal.org/node/2521782
Comment #7
gbyte CreditAttribution: gbyte commentedThanks @goodwillhinton
I'm going to look into it soon, but would be happy to review any patches (the change should be implemented in
\Drupal\simple_sitemap\Batch::generateBundleUrls
). If you guys can help out this may make it into the next stable version.Comment #8
guilopes CreditAttribution: guilopes commentedHello, I did an patch for it, could you guys review that?
It add an checkbox on content type to user select if he want or not check the translation before generate the link
Comment #9
matsbla CreditAttribution: matsbla commentedThanks for helping to push for this, I tested it and here are my feedbacks:
- Why should this be configured for each content type? For me it makes more sense to have a global setting for this at the sitemap configuration page (path: admin/config/search/simplesitemap). I think to make different settings for this for each content type is confusing, and also we run the risk of many people getting inconsistent xml sitemaps as they have inconsistent configurations for each content type. However would be interesting to hear if there are any specific use case for this configuration by content-type?
- When I tested it doesn't seem to work. It now seems like the frontpage is always indexed with hreflang links (no matter if it has translation or not), but content doesn't have the hreflang links no matter if I check the box "Check Translation" is used or not.
- The text is a little bit confusing: "It will check if exists translation of this content type before index it". It should express more clear if it will include these translations or not, like "Only index published translation of this content"
Comment #10
gbyte CreditAttribution: gbyte commentedHi guilopes, After taking a brief look at your code, I agree with matsbla's review. A global setting on the setting page seeems to be the right place for this feature. If you guys can come up on a clean implementation, I will be happy to push this.
Comment #11
guilopes CreditAttribution: guilopes commentedhey, It's just for Content Types that's why not work on front page. I think all the languages have the front page so it should show all the hreflang.
I used the configuration on Content Type because it's more flexible, I can select the bundle that I want to select not for all bundles, it makes sense when I have Blog Posts for example and I don't wan't to show hreflang because it's not translated in the other hang I have a basic page, just with file attach that is the same for all the languages and I want to have the hreflang, make sense? I don't think that it's confuse because we already have some configurations in each bundle type. Do you guys agree with that?
I agree with the message and I can change it, just let me know if you guys agree with the rest of configuration and I do these changes
Comment #12
guilopes CreditAttribution: guilopes commentedTo hreflang work with different languages, we need to translate the content and Check "Check Translation" on content type, it should work
Comment #13
guilopes CreditAttribution: guilopes commentedI found one issue in the patch, when the language is not on default language it's not generating the properly link.
I fixed it and also the message on the following patch
Comment #14
guilopes CreditAttribution: guilopes commentedI forgot to change the text on check translation.
Again:
Comment #15
guilopes CreditAttribution: guilopes commentedComment #16
Fidelix CreditAttribution: Fidelix as a volunteer commentedIt seems @guilopes made per-content type settings to address the scenario described in #2.
That is: If a content type doesn't need to be translatable but still needs to be displayed in different languages, there should be an option for that.
ie. A user page with blocks that are translatable, the home page (if it were a non-translatable entity), file entities (or a similar node content type), etc.
Yes, it is slightly less straightforward for the user. The extra flexibility may be worth it, though. It covers a lot more use cases IMO.
We can decide on some sane defaults to reduce friction for the user. I think everything should be "language-checked" by default so as not to generate incorrect links.
Comment #17
guilopes CreditAttribution: guilopes commentedI forgot to sent one thing on batch.php to pass the default langcode. I'm sending it in attach
Comment #18
sgurlt CreditAttribution: sgurlt at Bright Solutions GmbH commentedAwesome, I was hoping for a patch like this and their it is. Works perfect, thanks :)
Comment #19
gbyte CreditAttribution: gbyte commentedHi guilopes, appreciate your effort and I'm also looking forward for it to be pushed.
I have tested #17 and found a few problems:
Notice: Undefined index: und in Drupal\simple_sitemap\SitemapGenerator::generateSitemapChunk() (line 169 of modules/contrib-dev/simple_sitemap/src/SitemapGenerator.php).
I have English set as default language and German.admin/config/search/simplesitemap/entities
The 'check translation' checkbox value for non-bundle (atomic) entity types (like user or file) does not get saved.
admin/config/search/simplesitemap/entities
The JS inside atomic entity type fieldsets e.g (like user or file) is not hiding the 'check translation' fields dynamically.
simple_sitemap.schema.yml
The check_translation key has the wrong label.
admin/structure/types/manage/page
The summary inside the tab doesn't reflect the bundle's 'check translation' setting (JS).
Beside that, I think I know how we can vastly improve user friendliness in this feature:
a) either implement a global setting 'check translation' on the module setting for all types and another setting 'Override setting on a per-bundle basis' which will then show the 'check translation' checkboxes on entity bundle edit pages as well as on
admin/config/search/simplesitemap/entities
for atomic (non-bundle) entity types.b) If the above is too complex, we could start by simply implementing one global setting 'Show translation settings on entity type pages' which will expose the checkbox you implemented on the entity bundle edit pages as well as on
admin/config/search/simplesitemap/entities
for atomic entity types. In other words, this checkbox would enable the feature you implemented and would be unchecked by default.Any thoughts? I'm looking forward to review the next iteration. Please do not forget to pull the new dev version first, as there has been lots of code refactoring to improve testability.
Comment #20
meramo CreditAttribution: meramo at Bright Solutions GmbH commentedDoesn't apply against the latest dev (f07ee24)
Comment #21
guilopes CreditAttribution: guilopes commentedit's based on b4ba757c138301ac86a471029e3b8520be2be458
Comment #22
sic CreditAttribution: sic commentedI'd love to apply the patch but how can I do that? It doesnt work with the latest dev or stable version.
Untranslated nodes/ entites really should be excluded. I tend to have a global setting for that, and not per content-type.
Comment #24
gbyte CreditAttribution: gbyte commentedAlright I just pushed code adding a global 'Skip non-existent translations' setting. This does not allow to set the setting on a per-bundle basis, but all the problems from #19 should be removed plus the setting also works on custom links which point to entities.
@guilopes Sorry I couldn't use your code, but feel free to take what I wrote and create a further patch allowing setting this on a per-entity type (on non-bundle entities) and per-bundle basis. I'll be happy to review and push.
Please test and reopen issue if you find problems.
Comment #26
gbyte CreditAttribution: gbyte commentedComment #27
guilopes CreditAttribution: guilopes commentedhey, I created a new patch. I hope it can help :)
Comment #28
shadcn CreditAttribution: shadcn at Chapter Three commentedHi Pawel, thanks for the fix.
We were having the same issue on one of our sites. I just tested the
8.x.2.6
version with the fix but I'm still seeing a few issues.Here's some more information about our setup. We have 2 languages on the site: English (default) and German.
Test A: Success
hreflang="en"
is insitemap.xml
Test B: Failure
hreflang="en"
andhreflang="ge"
insitemap.xml
.Can you help us? Thanks.
Comment #29
guilopes CreditAttribution: guilopes commentedhave you test the #27 patch ?
Comment #30
gbyte CreditAttribution: gbyte commented@arshadcn
It appears English is your default language.
Drupal\simple_sitemap\Batch\BatchUrlGenerator:
The condition
$language->isDefault()
adds the link to the sitemap for the default language regardless whether the node exists in this language. This was added because the condition$entity->hasTranslation($langcode)
returned TRUE only for translations of the default language.The question is, is it incorrect for the sitemap to always return a link to the default language; if not, the above code is what needs to be tweaked?
Comment #31
guilopes CreditAttribution: guilopes commentedwith the patch #27 the issue about default languages always have the href is solved. it's make sense because we need to check only if the current entity have the language
Comment #32
shadcn CreditAttribution: shadcn at Chapter Three commented@gbyte.co, yes this is what we expected: if we create a node in one language, regardless if the language is default or not, only 1 entry for the node in that language should be in the sitemap. Is this the correct behavior?
Comment #33
guilopes CreditAttribution: guilopes commented@arshadcn same bahavior we have here. the #27 should work but you should apply based on c8633a94226de25e72cd324338e19999b7d1688b something like this:
Comment #34
guilopes CreditAttribution: guilopes commented@gbyte.co any chance of my patch to be merged?
Comment #35
guilopes CreditAttribution: guilopes commentedComment #36
shadcn CreditAttribution: shadcn at Chapter Three commentedThanks @guilopes. I'll take a look. The patch didn't apply to the latest 8.x-2.x.
Comment #37
guilopes CreditAttribution: guilopes commented@arshadcn yeah, I know. I don't know if the mantainer of the module will apply my patch so I can't update it every new dev commit. You need to apply on top of this git hash c8633a94226de25e72cd324338e19999b7d1688b
Comment #38
shadcn CreditAttribution: shadcn at Chapter Three commentedThanks @guilopes. I'll test it.
Comment #39
gbyte CreditAttribution: gbyte commented@guilopes Thanks a lot for your effort! The problem with your patch is that non-translated entities (where language is set to 'not specified'), are not indexed at all. That would often be most pages, as most editors leave the content language on 'not specified' until a translation is made, so that the content can be reached regardless of the current website language setting.
Also, the 'regenerate sitemap' checkbox on setting pages does not appear dynamically when you toggle the 'skip non-existing translations' checkbox.
The third thing I found is that your code does not work with custom links (
admin/config/search/simplesitemap/custom
).@arshadcn I've modified my approach to prevent the default language to always appear in URLs. Could you please test for me the most recent dev which should be up in a few minutes? Please close the issue if all is working well.
Comment #41
shadcn CreditAttribution: shadcn at Chapter Three commented@gbyte.co this is perfect. Thanks.
Comment #43
guilopes CreditAttribution: guilopes at Dexa commentedComment #44
gbyte CreditAttribution: gbyte commentedRevisiting this issue.
Any of you guys think the here implemented skip_untranslated setting should be ticked by default for new installations of the module? Thanks for your feedback everyone.
Comment #45
guilopes CreditAttribution: guilopes commented@gbyte.co yes, I think it should be the default. Make more sense since in most of the cases you want to expose just the nodes that you have translated.
Comment #46
mbovan CreditAttribution: mbovan at MD Systems GmbH commentedI just found this issue when I was looking for a way to exclude non-existent translations.
+1 for making this default behavior (#44).
I quickly created a follow-up #2989456: Skip non-existent translations by default.
Comment #47
mbovan CreditAttribution: mbovan at MD Systems GmbH commentedOops #46 was already fixed in #2961233: Allow for multiple sitemap types on one Drupal instance and comitted to 3.x branch only. Closed #2989456: Skip non-existent translations by default as duplicate.