There are some scenarios in Drupal 6's content translation system where you can be looking at (for example) the Hungarian version of a page inside the English version of the site. This confuses users and also causes SEO issues (duplicate content).

It would be great if global redirect could handle these scenarios (note they are not entirely simple - see http://drupal.org/comment/reply/146084#comment-661963 ).

This is basically a question about whether that seems like it belongs in this module or not. If so I can provide more concrete steps to repeat and desired actions on the part of global redirect module.


Comments

nicholasThompson’s picture

Greggles - thanks for the request. I don't know if you're aware, but there is a known incompatibility between GR and i18n. I assume the issues with D6 are going to be along the same lines.

To be honest, I don't really know where to start on this - or, indeed, how the systems handle the query. Does it take the URL (eg, 'de/node/1' for a German node 1) and simply pop arg(0) (ie, 'de') out of it and keep a note to render the German translation?

Global Redirect's initial spec was a lightweight module which can handle the redirection from non-aliased to aliased pages (ie, node/1 to node_with_alias.html).

This problem isn't really covered by that project goal as the translation pages are technically DIFFERENT pages... although they are the same content, they are translated and therefore targeting a different audience.

I'm MORE than open to ideas and (even better) solutions though! Languages (as in English, French & German, etc) are not really my strong point - I can just about handle English! In fact I might be better at PHP than English... Hmmm...

greggles’s picture

1. install Drupal 6
2. add a second language
3. on admin/settings/language/configure use path prefix only
4. enable the content translation module
5. in workflow on admin/content/types/page enable multilingual support with translation
6. create a page in English (node 1), click the translate tab, create it in your second language (node 2)
7. visit example.com/en/node/2 and you see the content in the second language and all the menu stuff in the first language

This is where it would be nice if global redirect could forward the user to example.com/hu/node/2

Here is some very weak code that does this:

  global $language;
  $node = node_load(arg(1));
  if ($node->language != $language->language) {
    drupal_goto(drupal_get_path_alias($node->language .'/node/'. $node->nid));
  }

It assumes that you are using a path prefixing mode and it assumes that you are on a node/NID page.

Does something like this even have a chance of working if we can fix those two assumptions?

greggles’s picture

Also, if there were a version of the module compatible with 6.x I'd be interested in working on a patch to do this. I checked CVS and HEAD is missing a .info file and there is no DRUPAL-6--1 branch.

I'd rather see this get fixed/committed for HEAD/Drupal6 and then see about backporting it. In Drupal6 we have the benefit of i18n being in core where it is more stable/more of a known entity.

nicholasThompson’s picture

Head is a little out of date, I tend to work on the dev-versions for each branch (being a bit of a CVS n00b)...

I'll get a DRUPAL-6--1 branch sorted out early this afternoon which will basically be a DRUPAL-5--1 copy.

Your suggestion above sounds sensible - although I'd make one suggestion... Instead of redirecting straight away to {language}/node/{nid}, make that the new source path and THEN do a lookup to make sure you 301 to the ALIAS first, instead of redirecting from SrcA to SrcB and then to AliasB.
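
A minimal sketch of that single-hop idea, in plain PHP with no Drupal calls. The function name `gr_sketch_single_hop_target()` and the `$aliases` array (language code, then system path, then alias) are hypothetical stand-ins for a real alias lookup; the alias "rolunk" is invented for the example.

```php
<?php
/**
 * Work out the final redirect target in one step.
 *
 * Hypothetical helper for illustration only. $aliases stands in for an
 * alias lookup and maps language code => (system path => alias).
 */
function gr_sketch_single_hop_target($node_language, $nid, $aliases) {
  // Treat the node's own system path as the new source path.
  $source = 'node/' . $nid;
  // Look the alias up first, so a single 301 can go straight to it.
  if (isset($aliases[$node_language][$source])) {
    $path = $aliases[$node_language][$source];
  }
  else {
    $path = $source;
  }
  // Add the node's language prefix only at the very end.
  return $node_language . '/' . $path;
}

// Example data (invented): Hungarian node 2 has an alias, node 3 has none.
$aliases = array('hu' => array('node/2' => 'rolunk'));
echo gr_sketch_single_hop_target('hu', 2, $aliases), "\n";
echo gr_sketch_single_hop_target('hu', 3, $aliases), "\n";
```

The point of the ordering is that the visitor (and a crawler) sees one redirect rather than a chain of two.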

dvaernewijck’s picture

On motogp.com, when you click on another language you get redirected to the home page in that language; e.g. when you click on "FR" you get redirected to http://www.motogp.com/fr instead of to the same page in the other language.

is there a special way to set this up?

brgds

nicholasThompson’s picture

Status: Active » Postponed

I believe this overlaps STRONGLY with the i18n issue (#153950: Endless loop with i18n), even though the i18n one is for D5 and not 6. I'm marking this as postponed until the D5 issue is solved. Once the D5 one is solved we can port the fix into D6.

Andy Inman’s picture

Re. #5, that I think, is a matter of which Languages Menu (block) you set up - there are two, one comes from i18n and the other comes from Locale (again, I think, not sure.)

Please see my comment here - http://drupal.org/node/153950#comment-937525 - about what should happen when the url specifies a language but points to a node written in a different language. To expand on that, I can see some situations in which you might want to automatically redirect, but other situations in which you definitely wouldn't want to. It would have to be configurable, probably at a user preference level.

Example:
User tries to goto www.site.com/en/page_in_french
... the big question is WHY? As a Site Admin, or Site Translator, they may want to check the French version of something, but not have all other content (menus, blocks, etc) switched into French - that's what would happen if you redirected them to www.site.com/fr/page_in_french. Or, you could redirect them to www.site.com/en/english_version_of_page - Again probably not what Site Admin or Translators want, but possibly best for normal users and Search Engines. But, what will you do if the translation doesn't exist? Display a "not found"? (in English I suppose) or to go to the French page (which url? the en/ one or the fr/ one?), or display another page offering the user a choice of the various options?

I think it's important to be clear about what the address www.site.com/en/page_in_french actually means. To my mind it means "Show me the page_in_french page regardless of what language it may be in, and keep showing me everything else in English." I think this definition makes most sense when dealing with a human visitor - they have used that address for some reason - if it is due to an incorrect link elsewhere, that issue should be fixed at source. If it was from a Search Engine, then that should be fixed by making sure they don't index these "mixed language" urls.

So, for Search Engines, the address www.site.com/en/page_in_french is one you do not really want them to see nor index. I haven't got my head around whether such links would become visible or not via simple crawling - I don't think they are for a properly constructed site, but if they are then something needs to be put in place to stop them from being visible or followed.

Another point www.site.com/en/page_in_french should probably show a link at the top of the text to allow the user to get to the English translation if it exists. This can be added manually using my Language Sections module - http://drupal.org/project/language_sections . Maybe it would be better if fully automatic.

All of this seems to me to be way beyond the scope of Global Redirect, unless the author wants to take it on!?

nicholasThompson’s picture

Version: 7.x-1.x-dev » 6.x-1.x-dev

netgenius, thanks for that explanation. The issue is actually more complex than it sounds, isn't it. You are right to bring up the issue of what should actually be done... eg
www.example.com/en/french-page

Should this redirect to /en/english-page or /fr/french-page?

Following on from Greggles' post in #2, here is a slightly expanded theory...

    // If Content Translation module is enabled then check the path is correct
    if (module_exists('translation') && (arg(0) == 'node') && is_numeric(arg(1)) && (arg(2) == '')) {
      switch(variable_get('language_negotiation', LANGUAGE_NEGOTIATION_NONE)) {
        case LANGUAGE_NEGOTIATION_PATH_DEFAULT:
        case LANGUAGE_NEGOTIATION_PATH:
          $node = node_load(arg(1));
          if ($node->language != $language->language) {
            drupal_goto(drupal_get_path_alias($node->language .'/node/'. $node->nid));
          }
          break;
      }
    }

I added this just before the clean URL test towards the end of the module. It checks that translation is enabled, that arg0 is node, arg1 is numeric and that arg2 is empty (ie, we're not on an edit page for example).
It then checks that the language negotiation is on path or path/language default (rather than NONE or DOMAIN).

Freso’s picture

Is this still "postponed"?

Anyway, regarding what should be done, here's my take:

Case 1.
1) You have en/node/1 with no translation.
2) You go to da/node/1.
3) GR should redirect to en/node/1.

Case 2.
1) You have en/node/1 with translation da/node/2.
2) You go to da/node/1.
3) GR should redirect to da/node/2.

I think those are pretty much the options as it stands. I'll go poke at the proposed code now. :)
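
The two cases above can be sketched as a small decision function. This is only an illustration in plain PHP, not the attached patch: `gr_sketch_case_target()`, the `$node` array and the `$translations` map (language code to nid) are hypothetical names.

```php
<?php
/**
 * Decide where a request for {requested_lang}/node/{nid} should go.
 *
 * Hypothetical helper. $node is array('nid' => ..., 'language' => ...);
 * $translations maps language code => nid for the node's translation set.
 * Returns a target path, or NULL when no redirect is needed.
 */
function gr_sketch_case_target($requested_lang, $node, $translations) {
  // The node is already in the requested language: nothing to do.
  if ($node['language'] === $requested_lang) {
    return NULL;
  }
  // Case 2: a translation exists in the requested language, so go there.
  if (isset($translations[$requested_lang])) {
    return $requested_lang . '/node/' . $translations[$requested_lang];
  }
  // Case 1: no translation, so fall back to the node's own language.
  return $node['language'] . '/node/' . $node['nid'];
}

$node = array('nid' => 1, 'language' => 'en');
// Case 1: da/node/1 requested, no Danish translation.
echo gr_sketch_case_target('da', $node, array('en' => 1)), "\n";
// Case 2: da/node/1 requested, Danish translation is node 2.
echo gr_sketch_case_target('da', $node, array('en' => 1, 'da' => 2)), "\n";
```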

Freso’s picture

Title: redirect to version in native language » Redirect to version in native language
Status: Postponed » Needs review

This patch is based on the code in comment #8, but has been expanded somewhat upon. It should work as described in my comment #9. Thanks to agentrickard for trying to help me find stuff! :D

Freso’s picture

Also, the patch is live (applied to DRUPAL-6--1-0) on freso.dk if you feel like seeing how it works. :)

Andy Inman’s picture

IMHO Case 2 is ok, but not Case 1 for reasons described in my longer posting above. Forcing a whole-page language change could be a real problem if the user doesn't understand the menu, can't see where to login or logout, etc. My golden rule would be: never redirect from a url which specifies a language to a url which specifies a different language - how content gets displayed is a separate topic.

The only thing that I think you can safely redirect is mysite.com/something to mysite.com/en/something (where en is the site default or user preference already set.) I'm currently using my own custom version of Global Redirect which does this. So this way, Google sees only one copy of a page rather than two. I also thought that maybe it would make more sense to do the reverse, i.e. redirect mysite.com/en/something to mysite.com/something (same content, different url). The only advantage I can think of is that the url looks shorter/neater. Problem is that if you ever changed your site default language (highly unlikely of course) then Google's view of things would get messed up.

I guess different sites and different users have different needs, so any generic solution is going to need a good level of configurability, and cater for some individual user preferences too.

Freso’s picture

@netgenius:
First, my patch in #10 will use the prefix specified by the user. By default, English's prefix is (still, see #146084: Default path prefix for English (and DBTNG it)) '' (ie., an empty string). This means that English content would be available at "foo", while localised content would be at "lang/foo".

Second, you say "So this way, Google sees only one copy of a page rather than two." while opposing my case 1. If you don't do case 1, you'll end up having the same content in multiple places: en/node/1, da/node/1, fr/node/1, ... - depending on how many languages you've set up. This is what I want to avoid ("at all costs").

I do agree it's not a perfect solution (one has to navigate to a different page to change the language), but it's one that lets Google and other search engines see content at one address, and one address only. This could possibly be toggled by a site variable/admin configuration though, so each site can set it as they want it. This shouldn't be too difficult to add to the patch, if Nicholas agrees it is needed.

greggles’s picture

Rather than discussing what it shouldn't do, let's discuss what it should do.

1) You have en/node/1 with no translation.
2) You go to da/node/1.
3) GR should...

Option A) Redirect to en/node/1
Option B) Redirect to example.com/da home page with a message "content not available in your language, do you want to see it in english?" (where "english" is linked to en/node/1)
Option C) Display a 404
Option D) something else

Personally, I'd be happy with any of these. I agree with Freso's comment that the most important thing (both for SEO and usability) is that content is available at only one URL.

greggles’s picture

One more thing:

"I guess different sites and different users have different needs, so any generic solution is going to need a good level of configurability, and cater for some individual user preferences too."

We have to be careful about this in GR since it runs on every page load. Adding too much complexity to this module will slow down sites. I guess that if people have translation enabled then they are prepared for some slowness already, but it's worth considering.

nicholasThompson’s picture

Greggles - very good points all round...

I would opt for.. err.. option B. This does a number of things...
1) It definitely stops the likes of Google indexing that URL
2) It stops the user wondering why they are getting an English site when they requested a Danish page.
3) It gives control to the USER about what they do next

I also would prefer to keep this module lightweight... However bear in mind that:
a) the configuration could be put in an include which the new D6 menu system could include when needed
b) the locale stuff (ie all this issue is talking about) only applies when module_exists('locale')

Freso’s picture

And since the patch in #10 pulls in a list of all translations, they could(/should) all be listed.

Hm. Is there a way to do something akin to drupal_set_message() that will always be shown? (Ie., that won't be hidden if the user is anonymous or similar.) If so, that might be another alternative. Setting a message that the user has been redirected to another language. At least, I'd rather that than option B. But then Greg raised the point of the possibility of the content being indexed with this notice. Hohum. Is it possible to make a new page available, without first defining it outside the if {}?

In the meantime, here's an update of the #10 patch. I realised there was a small bug when there wasn't a $language->language translation available when looking up $node_translations[$language->language]. Also, it's now changing the entire $language variable instead of only the prefix. Just in case.

Freso’s picture

And here's a patch which includes a drupal_set_message() before it changes the language. Just in case you want to try and play around with it. =)

Andy Inman’s picture

Ok, agree, "B" is best, so I'll update my "golden rule" to include "without the user's prior agreement" :)

Freso, I was not in any way saying you're *wrong*, only that my needs are different!

Performance - keeping it lightweight.... perhaps the answer is caching... what I mean is, given a url ... do lots of processing to figure out where to redirect it to, then cache the result so next time it's very lightweight - url A redirects to url B, that's all. Ok, there are issues: the cache needs clearing under some circumstances - probably on node update, maybe other cases.
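
A minimal sketch of the caching idea, assuming an in-memory array in place of a real cache table. The class `GrSketchRedirectCache` and its lookup table are hypothetical; `resolve()` stands in for the expensive work (node load, translation lookup, alias lookup), and `clear()` is what would run on node update.

```php
<?php
/**
 * Hypothetical redirect cache: request path => resolved target (or FALSE).
 */
class GrSketchRedirectCache {
  public static $targets = array();
  public static $lookups = 0;

  // Stand-in for the heavy processing; counts how often it really runs.
  public static function resolve($path) {
    self::$lookups++;
    $table = array('da/node/1' => 'da/node/2');
    return isset($table[$path]) ? $table[$path] : FALSE;
  }

  // Return the cached target, resolving it only on the first request.
  // "No redirect" (FALSE) results are cached as well.
  public static function get($path) {
    if (!array_key_exists($path, self::$targets)) {
      self::$targets[$path] = self::resolve($path);
    }
    return self::$targets[$path];
  }

  // Invalidate everything, e.g. when a node is updated.
  public static function clear() {
    self::$targets = array();
  }
}

GrSketchRedirectCache::get('da/node/1');
GrSketchRedirectCache::get('da/node/1');
// Number of times the expensive lookup actually ran.
echo GrSketchRedirectCache::$lookups, "\n";
```

Clearing the whole cache on every node update is the bluntest possible invalidation; a real implementation would probably want something narrower.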

In summary, I don't think GR can be a "one size fits all" solution without offering configurable options, at least at the system level and possibly at the user level.

Andy Inman’s picture

Keeping it lightweight yet flexible, how would this be? ...

1) You have en/node/1 with no translation.
2) You go to da/node/1.
3) GR should...

-> Redirect to da/special-page?url=da/node/1

... where special-page is either a hard-coded url or configurable. Either way, it's up to each site admin to put something there (probably some PHP) that handles the rest. So, there you could provide whatever message you want to display to the user (in the appropriate language), a link to the English version of the originally requested page, or some complex routing.

This to me would seem to meet the needs of "lightweight" and "fully-configurable", and moves the complexity away from this module. I'm probably missing something?
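
Building that hand-off URL could look roughly like this. It is a sketch under assumptions: `gr_sketch_special_page_url()` is a hypothetical name, "special-page" is the placeholder from the comment above, and URL-encoding the original path is an addition not spelled out there.

```php
<?php
/**
 * Build the URL of the site-defined "special page", passing along the
 * originally requested path so that page can decide what to do with it.
 *
 * Hypothetical helper for illustration only.
 */
function gr_sketch_special_page_url($requested_lang, $requested_path, $special_page = 'special-page') {
  // Encode the original path so it survives as a single query value.
  $original = rawurlencode($requested_lang . '/' . $requested_path);
  return $requested_lang . '/' . $special_page . '?url=' . $original;
}

echo gr_sketch_special_page_url('da', 'node/1'), "\n";
```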

nicholasThompson’s picture

I can't see what's wrong with having a settings page? The menu system in D6 can include files based on callback path. We could use this to include "globalredirect.admin.inc" or something like that...

Freso’s picture

Status: Needs review » Needs work

@ Nicholas:
If you're referring to my comment about a page, I'm not referring to an /admin page, but the page where one can choose (to either go back or) to go to a different language to see the content. But... that might very well be possible to add in a menu callback as well.

sparkey85’s picture

This patch is really nice if I use the direct links to the nodes, but if I call the node via an alias such as /en/Germansite, it does not go to /en/Englishsite. Is it possible to extend the patch with alias handling?

Freso’s picture

@sparkey85:
I'm assuming you use Pathauto here, but it might apply even if you're just using core's Path module: When you create a node and select a language for it, Pathauto (and/or possibly just plain Path) will save the accompanying alias with the language code – this is to prevent aliases for e.g. English "foobar" and German "foobar" from conflicting with each other, as they would have the same alias! (See #269877: path_set_alias() doesn't account for same alias in different languages for a conflict due to Path not using this language information.)

So, in short: to solve this, either the aliases need to be saved without language information, so they apply "universally", or Global Redirect has to hit the database and get information about aliases in all languages.

The first version is obviously problematic, as I started out by explaining, and the second approach is problematic in a similar way. Say we have a node with a Danish alias "foobar" and another node with German alias "foobar" (for the purpose of this example, no English node with this alias exists, and the Danish and German nodes aren't translations of the same content), you then go to "en/foobar"... should it redirect to the Danish or German node?

sparkey85’s picture

@Freso

I have Pathauto, and I use aliases without a language prefix. It's possible to give the same alias to the translations of a node, because the localisation module in core (6.3) handles it with its prefixing, and it works fine, just as I need. Only translated aliases make my life difficult.

In your example in #24 (en/foobar) it should not redirect, because there are no definite translations for this item.

Example settings:
Home › Administer › Site building › URL aliases:

Alias     System   Language
start     node/1   Deutsch
start     node/2   Englisch

inhalt    node/3   Deutsch
content   node/4   Englisch
Some independent aliases (no translation relation):
foobar    node/5   Deutsch
foobar    node/6   Englisch
foobar2   node/7   Deutsch

Let's look at the use cases with the current module:
Browser/selected language: English
Expected:
start -> en/start (node/2) (ok)
de/start -> de/start (node/1) (ok)
content -> en/content (node/4) (ok)
foobar -> en/foobar (ok)
de/foobar -> de/foobar (ok)
foobar2 -> Not exists (ok)
Translate en/start to de/start (ok)
Translate en/content to de/inhalt: (ok)
Translate de/foobar to en/foobar and back: ok (no translation)

What's not working:
1. inhalt -> en/content (not ok)
2. de/content -> de/inhalt (not ok)

My imagined algorithm:
1. Check the nodes referenced by the alias.
2. Are they translations of each other?
3. If yes, select the node in the requested language (prefix); if no node exists in the selected language, select the node in the default language.

I think it should work, but supposably i have error in reasoning :)
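
For reference, the imagined algorithm above can be sketched in plain PHP against the alias table from this comment. Everything here is hypothetical: the function names, the hard-coded rows, and in particular the `tnid` values (a translation-set id, 0 meaning "not translated"), which are assumed so that nodes 1/2 and 3/4 form translation pairs.

```php
<?php
/**
 * Hypothetical rows: alias, nid, language, tnid (translation set, 0 = none).
 */
function gr_sketch_alias_rows() {
  return array(
    array('alias' => 'start',   'nid' => 1, 'language' => 'de', 'tnid' => 1),
    array('alias' => 'start',   'nid' => 2, 'language' => 'en', 'tnid' => 1),
    array('alias' => 'inhalt',  'nid' => 3, 'language' => 'de', 'tnid' => 3),
    array('alias' => 'content', 'nid' => 4, 'language' => 'en', 'tnid' => 3),
    array('alias' => 'foobar',  'nid' => 5, 'language' => 'de', 'tnid' => 0),
    array('alias' => 'foobar',  'nid' => 6, 'language' => 'en', 'tnid' => 0),
    array('alias' => 'foobar2', 'nid' => 7, 'language' => 'de', 'tnid' => 0),
  );
}

/**
 * Return "lang/alias" to redirect to, or NULL for "leave the request alone".
 */
function gr_sketch_alias_target($requested_lang, $alias, $default_lang = 'en') {
  $rows = gr_sketch_alias_rows();
  // Step 1: collect the nodes referenced by this alias, in any language.
  $matches = array();
  foreach ($rows as $row) {
    if ($row['alias'] === $alias) {
      $matches[] = $row;
    }
  }
  if (!$matches) {
    return NULL;
  }
  // The alias already exists in the requested language: no redirect.
  foreach ($matches as $row) {
    if ($row['language'] === $requested_lang) {
      return NULL;
    }
  }
  // Step 2: the matches must all belong to one real translation set.
  $tnid = $matches[0]['tnid'];
  if ($tnid === 0) {
    return NULL;
  }
  foreach ($matches as $row) {
    if ($row['tnid'] !== $tnid) {
      return NULL;
    }
  }
  // Step 3: prefer the requested language, then the default language.
  $by_lang = array();
  foreach ($rows as $row) {
    if ($row['tnid'] === $tnid) {
      $by_lang[$row['language']] = $row['alias'];
    }
  }
  foreach (array($requested_lang, $default_lang) as $lang) {
    if (isset($by_lang[$lang])) {
      return $lang . '/' . $by_lang[$lang];
    }
  }
  return NULL;
}

// The two failing cases from the list above.
echo gr_sketch_alias_target('en', 'inhalt'), "\n";
echo gr_sketch_alias_target('de', 'content'), "\n";
```

The unrelated "foobar" aliases fall through to NULL, which matches the "no redirect" expectation stated earlier in this comment.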

Freso’s picture

@sparkey85: "supposably i have error in reasoning" - Yeah. The language prefix isn't part of the alias. If you go to http://freso.dk/en/2008/08/13/9_years_on_the_web the alias you're accessing is "2008/08/13/9_years_on_the_web", not "en/2008/08/13/9_years_on_the_web".

Anyway, apart from this, since we're all pretty much agreed on my case 2, I figured I could do a patch just with this for now, while we agree on the proper course of action for case 1. No? A hand-edited (and untested, but I can't see why it shouldn't work) version of the patch from #17 is attached, which only does the case 2 scenario and does nothing (ie., the same as the current behaviour) in case 1 cases.

fletchgqc’s picture

This might be against the feelings of most commenting here, but regarding Greggles' comment #14 options, I am strongly in favour of Option C) Display a 404.

If you stop, step back, and think about it logically for a second, then this is the answer that makes most sense. There is no Danish content called "node 1". It really is as simple as that. There is no reason that anyone should ever visit that URL, and that URL should never return anything. If you want to make it possible for your users to switch the language of what they are reading, then leave the default language switcher node links on (or whatever) - don't rely on them attempting to hack the path!

The core aim of GR is to avoid duplicate content - to ensure that URL paths don't exist on your site that could somehow be found out and have a negative bearing on SEO. Let's stick to that aim. GR's aim is not to attempt to figure out what users were trying to do when they entered a weird URL path, nor to make all sorts of weird URL path combinations available to redirect to a sensible page. The simplest and most correct way to solve this i18n problem is to issue 404s for anything which is not a valid URL.

Freso’s picture

Status: Needs work » Needs review

@ fletchgqc: Try and take a peek at http://freso.dk/. I'm guessing you will land at the English version, so the first listed/most recent entry is node/25 - which is Danish content. (And using a patch I've uploaded earlier, going there will redirect to the Danish version.) Also note that once you're on the Danish version, it will no longer be called "node/25" but, eh... "2008/08/23/spam_anti_nigeria_scam". I do agree though, that getting a listing of translations might be preferable to being ruthlessly redirected to another language.

Also, it would be great if people would review my patch at #26 to have at least part of the solution (on which no one seems to disagree with the approach) committed.

fletchgqc’s picture

Hmmm... I wasn't aware that on the English front page, a link to Danish content will link with the path /en/node/25 rather than linking directly to /da/node/25. That seems to me like core handles things the wrong way, but I don't know enough about i18n to debate that. I don't understand all the complexities yet. In any case I can definitely understand why you would not want to send 404s.

The bottom line for me is that we don't have duplicate content. If you guys want to put in tricky solutions to redirect certain pages to other places with 301s, you obviously know why you are doing it so go for it and don't let me stand in the way. I.e. I'm therefore OK with all this redirecting discussed above.

Freso, I will test your patch (the one that you are running on freso.dk) if you expand it to deal with domain language negotiation. According to the comments above, domain language negotiation has been ignored ("It then checks that the language negotiation is on path or path/language default (rather than NONE or DOMAIN)."). This definitely needs addressing (and I can test it) - i.e. the problem that: de.ex.com/node/1 = en.ex.com/node/1. Is there any chance that your patch could be expanded to address this?

wrwrwr’s picture

@Freso: I don't understand why wouldn't you use "Language neutral" setting for this content that you'd like to be accessible for both English and Danish users. The way as it is now you can choose whether you want a "404" or a redirection to another language version, with such a patch you can't.

fletchgqc’s picture

Status: Needs review » Needs work

@Freso are you still on the case and able to do this? The idea is definitely valid.

Freso’s picture

Status: Needs work » Needs review

I'm not "assigned", but yeah, I'm still on the case. I just got back from Turkey a few days ago, bringing back home with me a messed up stomach... so I'm currently trying to deal with that, meaning that I haven't spent a lot of time on the computer(s).

Anyway, there's still the patch in #26 which I believe should be reviewed and possibly checked in, while we're discussing the other use case. Nicholas?

Also:
@fletchgqc: The reason for the language specific linking is most likely that it doesn't look for alias using language, and yes, this might be considered a bug, as it is doing this in other places. (And if you file or find a bug on this, please do send me a link. :))
I also think it should be doable to add support for multiple domains, though I have no experience with this... so I'll definitely need feedback from testing such things.
Also, I don't like the 404 idea. :) I really like the "Multiple Choices" page idea though - perhaps with HTTP codes 303 and 300, depending on whether there is one or multiple options? If not, then at least 307 instead of 301, as the content may become translated in the meantime. I should implement this in the current patch...
@wrwrwr: Ideally, all my content would be available in both English and Danish. I just don't have the time/energy to both write the entries and translate them. But then, I might one day. Making them language neutral would mean more work once I'm translating the content. It would also not allow Drupal to make HTML that tells the browser (and search engines) the language of the content in question (e.g., Danish content appearing on the English section, would be misinterpreted as English).
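
The status-code idea in the reply to fletchgqc above could be sketched like this. `gr_sketch_status_code()` and its parameters are hypothetical; the codes themselves are standard HTTP (300 Multiple Choices, 303 See Other, 307 Temporary Redirect). What to do when there is no translation at all is left open here, as it is in the discussion.

```php
<?php
/**
 * Pick an HTTP status for a cross-language request.
 *
 * Hypothetical helper. $translation_count is the number of available
 * language versions; $choices_page says whether a "Multiple Choices"
 * style page is offered, or a plain temporary redirect is used instead.
 */
function gr_sketch_status_code($translation_count, $choices_page = TRUE) {
  // Nothing to point at: leave the decision to the caller.
  if ($translation_count < 1) {
    return NULL;
  }
  // Without a choices page, prefer a temporary redirect over a 301,
  // since the content may be translated later.
  if (!$choices_page) {
    return 307;
  }
  // One option: See Other. Several options: Multiple Choices.
  return $translation_count === 1 ? 303 : 300;
}

echo gr_sketch_status_code(1), "\n";
echo gr_sketch_status_code(3), "\n";
echo gr_sketch_status_code(2, FALSE), "\n";
```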

fletchgqc’s picture

Status: Needs review » Needs work

@Freso:
I tested patch #26 and it has no noticeable effect when domain language negotiation is being used. Therefore I don't think that it should be applied. On the other hand, domain language negotiation is not a core feature AFAIK, only an i18n module feature so based on that argument perhaps you could get away with it. But personally I think the patch should address both cases... last I heard they want to put domain negotiation in core 7.x anyway. So I'm marking as "code needs work".

Freso, I don't really agree with your rationale for issuing patch #26. The way I see it, lots of people will have a say but very few will do the actual work of writing and testing code. That's fine - everyone's opinion is appreciated and there's no obligation to contribute code to Drupal; however I think the opinion of people that do actually write code should have a lot more weight. Otherwise momentum to get things done will get lost amongst heaps of feature suggestions that no-one has the motivation to write.

Therefore, since you are the only one that has managed to write any code to address this issue yet, I think we should push ahead with your patch #17. The behaviour which you say it creates sounds excellent and does solve the problem. I think we should get it committed, and any other ideas discussed here (some of which may indeed be very good) should be addressed as future feature requests.

I applied patch #17, but the result for domain language negotiation was exactly the same as the above patch - no effect. I think we should try and fix this and then get it committed (forget #26).

Anyway, that's just my opinion. Am of course open to hear what you or others think.

gagarine’s picture

track

ar-jan’s picture

Has this ever been continued elsewhere, or has someone found a (partial) solution (for D6)? I'm still struggling with this problem...

mrfelton’s picture

Patch in 17 updated to apply to 6.x-1.2. Not done extensive testing yet, but it seems to be working. This will go a long way to fixing our SEO duplicate content woes.

dboulet’s picture

Also interested in getting this feature in.

Yas375’s picture

Status: Needs work » Needs review

hi guys!

I had the same problem.

2 languages: Russian and English, with two domains: russian.com and english.com
node/1 - Russian page with alias 'o_nas.html'
node/2 - English version of node/1 with alias 'about.html'

And I would like to have a 301 redirect from russian.com/node/2 to english.com/about.html

I tried to apply your patches but this does not help. Then I wrote my own patch.

I patched 6.x-1.2 and it's working for me.

I hope that my patch will help you too.
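
The scenario described in this comment can be sketched without Drupal as follows. This is not the attached patch: `gr_sketch_domain_redirect()`, the `$nodes` array and the `$domains` map (language code to host) are hypothetical, and the sketch only handles plain node/NID paths.

```php
<?php
/**
 * Return the canonical URL for a node/NID request under domain-based
 * language negotiation, or NULL if no redirect applies.
 *
 * Hypothetical helper. $nodes maps nid => array('language', 'alias');
 * $domains maps language code => host name.
 */
function gr_sketch_domain_redirect($host, $path, $nodes, $domains) {
  // Only plain node/NID paths are handled in this sketch.
  if (!preg_match('@^node/(\d+)$@', $path, $m) || !isset($nodes[$m[1]])) {
    return NULL;
  }
  $node = $nodes[$m[1]];
  // Unknown or empty language: leave the request alone.
  if (!isset($domains[$node['language']])) {
    return NULL;
  }
  $target_host = $domains[$node['language']];
  $target_path = $node['alias'] !== '' ? $node['alias'] : $path;
  // Already at the canonical address: avoid a redirect loop.
  if ($target_host === $host && $target_path === $path) {
    return NULL;
  }
  return 'http://' . $target_host . '/' . $target_path;
}

$domains = array('ru' => 'russian.com', 'en' => 'english.com');
$nodes = array(
  1 => array('language' => 'ru', 'alias' => 'o_nas.html'),
  2 => array('language' => 'en', 'alias' => 'about.html'),
);
echo gr_sketch_domain_redirect('russian.com', 'node/2', $nodes, $domains), "\n";
```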

mrfelton’s picture

This version of the patch fixes a bug which was causing an infinite redirect on nodes with no language preference. It also combines the code from the patch at #38 to include support for LANGUAGE_NEGOTIATION_DOMAIN (I have not tested this, as I don't use that kind of language negotiation - Yas375, please could you test).

Let's get some testers please... I'd love to see this get fixed as the SEO benefit would be huge.

Dave Reid’s picture

iamer’s picture

The patch in #39 is working well for most scenarios.
I have a site which has Arabic only and English only content (not translated). The main site language is Arabic.
Using both alias or node/nid url the expected behaviour is :

* Arabic content types accessed from /en should be directed to /ar ( WORKS )
* English content types accessed from /ar or / should be directed to /en ( NOT WORKING )

Where should I start investigating this ?

Thanks for all the incredible work.

TripleEmcoder’s picture

Will this also work for node/<nid>/edit? Or any other tab that might be defined for nodes? It would be very beneficial for the Edit tab, as editing node in language X under site language Y can cause severe problems with menus and pathauto.

j0nathan’s picture

subscribing

hilrap’s picture

@mrfelton great patch! Works 100% for me!!

THANX A LOT !!! :-)

okokokok’s picture

Issue tags: +i18n

I've been doing some work on #422742: Needs redirect for cross language - which is a duplicate of this one...

I uploaded a patch there that works for me. I'll go compare it to mrfelton's patch now.
http://drupal.org/files/issues/language-prefix2.patch (I guess it's better to add the issue number to my patches :)

okokokok’s picture

@mrfelton's patch looks good and it's working very well for me.
Thanks!

klavs’s picture

I'll be trying @mrfelton's patch.

Just wanted to note that if you use Ubercart, it can't handle translations as separate nodes per language (because then stock levels are duplicated etc. and VAT screws up). I solved that by using the Language Sections module (where the node is set to language neutral), but this means that the same node alias (e.g. /product1) should also be shown if you go to /da/product1, since the content on the same node can be different.

Andy Inman’s picture

A request if anyone has time to do some testing: I've tested MultiLink Redirect in combination with the current release of Global Redirect, but not with any of the proposed language patches. MR will redirect to whatever language+path it thinks is the "right" translation, so if that's not the same as what GR thinks, I suspect the result will be a redirect loop. Exact behaviour may depend on which order the modules were installed (or weight in system table, if set). I tested the combination, also with Secure Pages active, in different orders and all seemed to work. Reading through the patches here I'm not sure whether things will break. I've raised the issue of how multiple redirection modules might collaborate and avoid redirect loops here: #775748: Define a hook for other modules

FiNeX’s picture

I've applied the patch #39 and it works perfectly.

I really hope it will be added on the next release of the global redirect module.

Many thanks!!!!

FiNeX’s picture

Update: there are some issues with patch #39: after browsing the website for some time, some pages no longer open, and there is an error message indicating that the redirect did not complete successfully even though the requested page was a valid path. It happens with a lot of pages which have a translation.

This is the firefox error message:


Firefox has detected that the server is redirecting the request for this address in a way that will never complete.

Moreover the You have been redirected to a page in another language (%lang1) as the [...] warning is printed multiple times on other pages.

FiNeX’s picture

Update 2: After some tests, I found a problem in the node table: the nodes that gave me the error had nothing in the language field. So I updated the DB and now everything is OK.

Maybe the patch should be updated with a check for nodes without a language.

For example, I have a website with some translatable content types and some others that aren't.

Bye

YK85’s picture

subscribing

okokokok’s picture

Patch #39 works most of the time, though I also still encountered looping issues with it - might be related to #52. (Tried this a while ago, and I wrote something but I must have forgotten to save it.)

However, it seems to do the reverse of what I want. For some of the sites I work on, Google has already indexed content under the wrong domain name, and I don't want these already indexed pages to suddenly contain content in another language. E.g. if node/123 is a French page on example.com, I want it to redirect to example.fr/titre-du-truc and not to example.com/title-of-the-thing. example.com/node/123 might already be indexed (in French), and a 301 to example.com/title-of-the-thing would be worse for the overall ranking in Google.

Ideally this would be an option in global redirect:
o redirect to proper URL of the node
o redirect to translation of the node

okokokok’s picture

With the redirect to the translation of the node there's also a problem at node/123/translate.
If node/357 is a translation of node/123, node/123/translate will simply link to node/357, which will be redirected to node/123. Not good.

So maybe "redirect to translation of the node" shouldn't even be an option, since there will probably be similar issues as well.

donquixote’s picture

netgenius said a lot of important things in #7, which seem to have been lost along the way.

We are all talking about mixed urls of the type http://example.com/fr/article-in-english.

What does this page look like?

  • Navigation will be in French, content will be in English.
  • When the visitor has finished reading, he can continue surfing in French.

Who should see this page, and when? Where would the page be linked?

  • English content as a fallback for the missing French translation. The visitor wants to continue surfing in French when he has finished reading. Thus, the navigation needs to be in French. The link to fr/article-in-english can be in the French menu.
  • If there is a translation in French, then we don't have to link to fr/article-in-english. Curious people can visit (en/)article-in-english, to see the original version. But, if someone explicitly wants to visit fr/article-in-english by typing the url, then why stop him?

OK, so what can we do to avoid a Google penalty and user confusion?

  • We should avoid links to mixed language urls, unless it is for a language fallback, or we have another good reason.
  • We need to avoid having these pages indexed in Google. What about <meta name="robots" content="noindex" /> ?
  • There could be a note saying "You are seeing content from the English version of example.com, which has not yet been translated into French. Click here to see the article on the English site".

If everything is done right, these pages will be nearly irrelevant for SEO. Unless some external site plays a game of posting hundreds of links to our mixed language urls.

Furthermore, it is very unlikely that anyone would visit fr/article-in-english without having a good reason for it. The only two reasons that I can imagine at the moment are (a) language fallback and (b) curiosity or unhappiness with the translated version. I guess the visitor in case (b) would not mind switching the entire site to English, but there could be exceptions.

Bilmar’s picture

subscribing

Andy Inman’s picture


We need to avoid having these pages indexed in Google. What about meta name="robots" content="noindex" ?

That seems like a good idea to me (though probably outside the scope of GR). I think the ruling needs to be: If current url is mixed language AND a suitable translation exists. So, if that condition is true then add the meta tag to the page HEAD.

Then, for www.example.com/fr/english-content - check to see if a French version exists and if it does then add the meta tag. Whether or not to also redirect to the French version is another question.
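
The rule described above ("mixed language AND a suitable translation exists") can be sketched as a small predicate. This is only an illustration, not code from Global Redirect or any patch in this thread: the function name and the shape of the `$translations` array (language code => node ID) are assumptions made for the example.

```php
<?php
// Hypothetical helper: decides whether a page should carry
// <meta name="robots" content="noindex" />.
// $url_language:     language implied by the URL prefix or domain.
// $content_language: language of the node being shown ('' = neutral).
// $translations:     assumed map of language code => nid for the node's
//                    translation set.
function example_needs_noindex($url_language, $content_language, $translations) {
  // Language-neutral content is valid under every language.
  if ($content_language === '') {
    return FALSE;
  }
  $mixed = ($url_language !== $content_language);
  $has_translation = isset($translations[$url_language]);
  // Only flag the page when a better target exists for this language.
  return $mixed && $has_translation;
}
```

With this shape, fr/english-content is flagged only once a French translation exists; a pure fallback page (no French version yet) stays indexable.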

donquixote’s picture

An all-inclusive patch (heavy refactoring + mixed language redirects + hook for other modules) can be found here:
http://drupal.org/node/803830#comment-2991412
("A bit of refactoring / code readability")

okokokok’s picture

We need to avoid having these pages indexed in Google. What about meta name="robots" content="noindex" ?

If links are pointing to noindex pages you lose link juice. If you redirect them to the canonical page you don't.
So I wouldn't spend any time or money to achieve this.

robby.smith’s picture

subscribing - should this be closed as duplicate of #803830: A bit of refactoring + mixed language redirects?

Pedro Lozano’s picture

Patch #39 works great. Google had indexed some of my Spanish nodes without the alias :-(

donquixote’s picture

#60:
for my taste, yes, make it a duplicate.
What do the maintainers say?

Iztok’s picture

I used the patch #39 and it works great!

Tnx guys, u saved the day! Please get in touch if you would need some design help or sth...

nicholasThompson’s picture

Hi guys.

I have committed the patch in #39 to DRUPAL-6--1, along with an option in the admin section to disable it. I still need to write a SimpleTest for it, but my "manual" testing proved promising.

donquixote: I will apply your "refactor" code at some point - I've been thinking about it and you're right, functions will improve readability and debugging.

netgenius: This patch will take the user to the page for the current or default language... So /en/french-node will redirect to /en/english-node (and vice versa). It also handles some of the "untidy" URLs Drupal generates (I noticed in my manual testing that if you select a language (/fr) then all "foreign" URLs seem to lose their alias! Could have been me though...)

YK85’s picture

Should the status be set to 'fixed' as #39 was committed? Or 'needs work' for the SimpleTest?

donquixote - as the code at #39 was committed, would you be able to kindly re-roll your 'refactor code' patch for review?

Sorry - was trying to better understand what to look for here. Awesome module!!

Thanks!

donquixote’s picture

The point of my patch was to get a more manageable code to work with, before introducing new features. It would have been better to do my refactoring patch first, and then re-roll the native language thing, if necessary. My own patch contained its own native-language thing, btw, but it would have been easy to only take the refactoring, and continue the discussion from there.

I don't feel very motivated to do this again.

donquixote’s picture

I think a reasonable thing to do would be to start a 2.x branch with the refactored code.

robby.smith’s picture

i hope a new stable 6.x version will be released for this awesome module!

mkalkbrenner’s picture

Category: feature » bug
Priority: Normal » Critical
FileSize
1.72 KB

Due to the fact that the patch from #39 already has been committed to CVS I turn this issue into a bug report.

If you use LANGUAGE_NEGOTIATION_DOMAIN on your site you end up in infinite redirect loops for language-neutral aliases. I consider this as critical.

I have attached a patch to solve this issue by skipping the redirect to a different domain for language-neutral aliases which are valid for every language. The code then falls back to the "standard" global redirect rules.
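
As a rough sketch of the idea (not the attached patch itself), the decision to skip the cross-domain redirect could look like the following. The function name is made up for illustration, and it assumes that an empty language value marks an alias as valid for every language, as in Drupal 6's url_alias table.

```php
<?php
// Hypothetical helper: should the request be redirected to another
// language domain?
// $alias_language:   language stored with the alias ('' = all languages).
// $current_language: language negotiated from the current domain.
function example_should_switch_domain($alias_language, $current_language) {
  // Language-neutral aliases are valid on every domain, so redirecting
  // them elsewhere is what produces the loop; skip and let the standard
  // Global Redirect rules apply.
  if ($alias_language === '') {
    return FALSE;
  }
  return $alias_language !== $current_language;
}
```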

FiNeX’s picture

About comment #51: patch #39 should be modified by adding a condition. The switch fragment should be executed only if the node's "language" field is not empty; otherwise the procedure will loop.

// Only run the language redirect for nodes that have a language set.
$nodesrc = node_load(arg(1));
if (!empty($nodesrc->language)) {
  switch(...){
  ...
  }
}

wizonesolutions’s picture

Subscribing to this. Seems like it's really close to being able to solve the issue of the default language not having a prefix and thus some paths returning page not found when language fallback tries to fall back to the user's default browser language instead of the site's default language.

If this is not what this hopes to solve...can someone let me know and maybe give me a hint? :) The issue I'm having is that when I go to www.mysite.com/books with my French-default browser, I get a page not found because I don't have an alias of "books" in French. Now, if I went to /fr/books, this would make sense, but "no prefix" should mean "English" so I expect to be taken to the English version. We have lots of incoming links without a path prefix, so getting this to work is reasonably important...that, or just finding a way to redirect the incoming links (which are intended to go to English pages) to the /en pages...in other words for /books either to go to the English /books or even just /en/books. Main thing is that we don't wind up with broken links or links that redirect people to the "language neutral" versions of the aliases, because lots exist...

Something along the lines of that. Am I in the right place?

heyyo’s picture

1) With the latest dev I still have a bug with taxonomy terms in a vocabulary in "Per language term" mode.
a) When accessing an aliased term from this vocabulary in the wrong language interface, no 301 redirect or 404 error is triggered.
http://fr.example.com/term-aliased-in-english will load the English term in the French interface.

b) When accessing a term in taxonomy/term/id form from this vocabulary in the wrong language interface, no 301 redirect or 404 error is triggered.
http://fr.example.com/taxonomy/term/ID-English-term will redirect to http://fr.example.com/term-aliased-in-english
(English term in the French interface).

2) With a vocabulary in "Localize terms" mode
When accessing an aliased term from this vocabulary in the wrong language interface:
http://fr.example.com/term-aliased-in-english I get a 404 error for the same URL.

Andrew Answer’s picture

For me, this piece of code works fine: it detects any "/node/*" URL, compares the node language with the current site language and returns a 404 if they differ. Put the code in settings.php and use DOMAIN language negotiation (Drupal 6.19 used):

function custom_url_rewrite_inbound(&$result, $path, $path_language) {
  global $language;
  if (preg_match('/^node\/(.*)/', $path, $matches)) {
    $nid = $matches[1];
    $node_language = db_result(db_query("SELECT language FROM {node} WHERE nid=%d", $nid));
    if ($language->language != $node_language) {
      //this will cause a page not found
      $result = '';
    }
  }
}
radiobuzzer’s picture

Nope, Andrew's code in #73 worked fine for displaying nodes, but caused a 404 not found error when I tried to add content at /node/add/story, obviously because /node/add doesn't have a node ID in this language (nor in any language whatsoever).
I tried to change preg_match('/^node\/(.*)/' to preg_match('/^node\/([0-9]*)/', which should be the right solution, but I couldn't make it work properly either.
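
One likely reason the '/^node\/([0-9]*)/' pattern does not help: `*` allows zero digits and the pattern is not anchored at the end, so "node/add" still matches with an empty capture. A minimal sketch of a stricter match follows; the helper name is hypothetical and it deliberately accepts only plain node view paths.

```php
<?php
// Hypothetical helper: returns the node ID for a plain node view path,
// or NULL for anything else (node/add, node/add/story, node/12/edit).
function example_node_view_nid($path) {
  // \d+ requires at least one digit; $ rejects trailing path segments.
  if (preg_match('#^node/(\d+)$#', $path, $matches)) {
    return (int) $matches[1];
  }
  return NULL;
}
```

A caller would run the language comparison only when the helper returns a node ID, leaving every other path untouched.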

radiobuzzer’s picture

Hi again. Does anybody have an update on this? I just downloaded the latest dev version and I don't get an option on the administrative page, or any function like that de-bugged in this page.

Andrew Answer’s picture

The updated version does not filter URLs like node/add:

function custom_url_rewrite_inbound(&$result, $path, $path_language) {
  global $language;
  if (preg_match('/^node\/(.*)/', $path, $matches)) {
    $nid = $matches[1];
    if (!is_numeric($nid)) return;
    $node_language = db_result(db_query("SELECT language FROM {node} WHERE nid=%d", $nid));
    if ($language->language != $node_language) {
      //this will cause a page not found
      $result = '';
    }
  }
}
colan’s picture

Re-rolled the patch from #69 with the additional check from #70, fixed comments, added whitespace and brought the whole section up to Drupal's coding standards.

I was only able to test this with language domains, so if someone could take this for a spin with prefixes, I'd appreciate it.

splash112’s picture

Patch from #77 works for me. No loops so far and sometimes I get decent redirects.

splash112’s picture

FileSize
4.47 KB

Extended the patch above (which has been working flawlessly for some time now) with a bit of my own.

The extra lines of code added will redirect per language taxonomy terms to their translated counterpart for the current language/domain if needed/possible with the help of i18ntaxonomy if enabled.

dutch.nl/category/english-term - will redirect to
dutch.nl/category/dutch-term

Will make patch later, for now just the module file.

splash112’s picture

Patch as described above.

For users of Taxonomy per language terms, because all terms will show up in all languages. This patch redirects the terms in the wrong language to the local term if available.

dutch.nl/category/english-term - will redirect to
dutch.nl/category/dutch-term

colan’s picture

Status: Needs review » Needs work

Looks like it's missing the stuff from #77? Let's keep everything together. Adding terms is a great idea! :)

splash112’s picture

Hi Colan,

Could do that, no problem. Code in #77 is RTBC as far as I can see.
Hopefully we get some feedback from the project maintainers so it can be committed.

willeaton’s picture

Hi, I needed the following scenario to work...

1. Spanish page visited without prefix redirected to add prefix
2. Spanish page visited on drupal path without prefix redirected to alias with prefix added
3. English page visited with Spanish prefix redirected to remove prefix
4. English page visited on drupal path with Spanish prefix redirected to alias without prefix

I accomplished this with the following code added to globalredirect.module around line 151...

// Compare the request to the alias. This also works as a 'deslashing' agent. If we have a language prefix then prefix the alias
    if ($_REQUEST['q'] != $prefix . $alias) {
      // If it's not just a slash or user has deslash on, redirect
      if (str_replace($prefix . $alias, '', $_REQUEST['q']) != '/' || $redirect_slash) {
        drupal_goto($alias, $query_string, NULL, 301);
      }
    }
    // START NEW CODE BY WILLIAM EATON
    if (module_exists('translation') && (arg(0) == 'node') && is_numeric(arg(1)) && (arg(2) == '')) {
      switch (variable_get('language_negotiation', LANGUAGE_NEGOTIATION_NONE)) {
        case LANGUAGE_NEGOTIATION_PATH_DEFAULT:
        case LANGUAGE_NEGOTIATION_PATH:
          $node = node_load(arg(1));
          if ($node->language != $language->language) {
            $path = 'node/' . $node->nid;
            $alias = drupal_get_path_alias($path, $node->language);
            $languages = language_list('enabled');
            $languages = $languages[1];
            $prefix = $languages[$node->language]->prefix;
            $language = $node->language;

            if ($prefix != '') {
              $alias = $prefix . '/' . $alias;
            }
            else {
              $alias = $path;
            }

            drupal_goto($alias);
          }
          break;
      }
    }
    // END NEW CODE BY WILLIAM EATON

    // If no alias was returned, the final check is to direct non-clean to clean - if clean is enabled
    if ((variable_get('globalredirect_nonclean2clean', GLOBALREDIRECT_NONCLEAN2CLEAN_ENABLED) == GLOBALREDIRECT_NONCLEAN2CLEAN_ENABLED) && ((bool)variable_get('clean_url', 0)) && strpos(request_uri(), '?q=')) {
      drupal_goto($request, $query_string, NULL, 301);
    }
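
The four scenarios listed at the top of this comment can also be written as a small pure function, which makes them easy to check without a Drupal install. This is a sketch only: the function names and the `$prefixes` map (language code => path prefix, '' for the default language) are assumptions for illustration and are not part of globalredirect.module.

```php
<?php
// Hypothetical helper: builds the canonical path for a node from its
// language, its alias in that language, and the configured prefixes.
function example_canonical_path($node_language, $alias, $prefixes) {
  $prefix = isset($prefixes[$node_language]) ? $prefixes[$node_language] : '';
  return $prefix === '' ? $alias : $prefix . '/' . $alias;
}

// Returns the path to redirect to, or NULL when the request is already
// canonical and no redirect is needed.
function example_redirect_target($requested, $node_language, $alias, $prefixes) {
  $canonical = example_canonical_path($node_language, $alias, $prefixes);
  return $requested === $canonical ? NULL : $canonical;
}

// Assumed setup: English is the default language with no prefix.
$prefixes = array('en' => '', 'es' => 'es');
```

Under these assumptions, a Spanish node aliased "pagina" requested as "pagina" or "node/5" resolves to "es/pagina", and an English node aliased "page" requested as "es/page" or "es/node/4" resolves to "page".
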
Anybody’s picture

Any news regarding this issue? I think this is still quite important, so it would be nice to have it running soon.

Is the current blocker the outstanding feedback from the project maintainers?
Thank you :)

feuillet’s picture

Should have a solution for this too. Any help appreciated.

colan’s picture

Version: 6.x-1.x-dev » 7.x-1.x-dev
Category: bug » feature
Priority: Critical » Major
Status: Needs work » Active

In the interests of keeping things simple for the maintainers, @splash112, could you create a new issue for your patch (#80)? Let's mark this one as RTBC and get it committed.

colan’s picture

Version: 7.x-1.x-dev » 6.x-1.x-dev
Status: Active » Reviewed & tested by the community

I don't see why we can't port this to D7 after the D6 version (#77) is committed.

rjmackay’s picture

Just testing #77 on a client site and it's looking good.
Will this be committed soon?

colan’s picture

I just sent a note to the maintainer.

UPDATE: See #1244396: Offering to co-maintain Global Redirect for details. I'm following the process over at Dealing with abandoned projects.

p.brouwers’s picture

looking forward to this as well!

nicholasThompson’s picture

The module is not abandoned - I apologise for the lack of reply to this. I have personally been in the middle of probably the busiest part of my life to date (changed jobs and moved home in the space of a week).

I've reviewed the patch in #77 by eye and it looks sensible enough to me.

splash112's post from #80 looks like a sensible idea too. I will try my best to allocate some time to this asap.

elsteff1385’s picture

I'm having trouble with patch #77.

"fatal: git diff header lacks filename information when removing 1 leading pathname components (line 5)"

I'm a little confused how to fix this. Any hints?

colan’s picture

We shouldn't need to port this for the D7 version. i18n's "Translation redirect" module should handle this, but not if you're logged in. The reasoning behind this makes sense. See #1196784: Why does i18n_redirect (Translation Redirect) only work for anonymous users? for details.

However, as per #1188718: Translation redirect not working with language domains, I couldn't get it to work for me.

Andy Inman’s picture

Please excuse the plug :) but may I suggest that anyone who needs language-based redirection in D7 might consider taking a look at my MultiLink Redirect, part of MultiLink. I will be testing (as with the D6 version) that it works with GlobalRedirect also active. Unlike Translation Redirect, it can handle both logged-in and anonymous users, and respects user's language preference (from account, browser, etc.) It also provides a redirect-override permission to aid with testing (intended redirect is reported as a message, not actually performed.)

gundara’s picture

subscribing

mkalkbrenner’s picture

Patch #77 (including my fix from #69) adjusted for globalredirect 6.x-1.4

the_g_bomb’s picture

Issue summary: View changes

I have another scenario to consider for this.

You install Drupal.
You create several thousand nodes.
Something changes internally and you suddenly need to add some nodes with translations.
Switch on content translation.
The default language is left as English, but its prefix is not set, so all language-neutral and English nodes get no prefix.
The prefix for French is set to fr.
The prefix for German is set to de.
Add node/2001 with language set to English.
Add fr/node/2002 in French.
Add de/node/2002 in German.

Go to /fr/node/2002
Go to /fr/node/1999 - English content shows up under the fr prefix, because no language is set for this node.

If some of the content is left as language neutral, the language prefix is not removed, which can produce a duplicate content scenario.

To solve this we need to also check:

<?php
if (empty($node->language) && language_default('language') != $language->language)
?>
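
To try the condition above against the scenario without Drupal globals, it can be wrapped in a plain function. The name and parameters here are hypothetical; in the module the values would come from `$node->language`, `$language->language` and `language_default('language')`.

```php
<?php
// Hypothetical helper mirroring the check above: is a language-neutral
// node being viewed under a non-default language?
// $node_language: '' for a language-neutral node.
function example_neutral_under_foreign_prefix($node_language, $current_language, $default_language) {
  return $node_language === '' && $default_language !== $current_language;
}
```

For /fr/node/1999 (neutral node, current language fr, default en) this is TRUE, so the prefix could be dropped; for /fr/node/2002 (a French node) it is FALSE.
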
the_g_bomb’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
5.33 KB

Here is a patch with #77 + #80 and a potential fix for my scenario above.