Closed (duplicate)
Project:
Pathauto
Version:
5.x-1.x-dev
Component:
Code
Priority:
Critical
Category:
Bug report
Assigned:
Unassigned
Reporter:
Created:
4 Jan 2007 at 15:06 UTC
Updated:
16 Jan 2007 at 16:50 UTC
Hello, the Czech letter "š" is not converted to "s" in 4.7. Instead it is converted to "-". This causes problems when editing an existing node, because Pathauto regenerates the node URL and again drops the letter "š" (which I then have to fix manually under the URL Alias menu option). The effect is that after editing an existing node the URL changes inadvertently and the old address leads to "Page not found".
Example:
Node title:
Objednávka starších čísel
Pathauto URL:
objednavka-star-ich-cisel
Corrected URL:
objednavka-starsich-cisel
After editing an existing node:
objednavka-star-ich-cisel
Thanks for any help. Roman
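The behaviour reported above can be reproduced with a small sketch. It is written in Python for illustration only and is not Pathauto's PHP code; the function name `make_alias` and both lookup tables are hypothetical. It assumes the alias is built by a character lookup followed by replacing every remaining character outside a-z and 0-9 with the separator, so a letter missing from the table ends up as "-".

```python
import re

# Hypothetical, deliberately incomplete lookup table: "š" is missing.
INCOMPLETE_TABLE = {"á": "a", "č": "c", "í": "i"}

# The same table with the missing letter added.
FIXED_TABLE = {**INCOMPLETE_TABLE, "š": "s"}


def make_alias(title, table, separator="-"):
    """Build a URL alias from a title using a character lookup table."""
    # Lowercase first, then replace every character the table knows about.
    mapped = "".join(table.get(ch, ch) for ch in title.lower())
    # Anything that is still not a-z or 0-9 collapses into the separator.
    alias = re.sub(r"[^a-z0-9]+", separator, mapped)
    return alias.strip(separator)


print(make_alias("Objednávka starších čísel", INCOMPLETE_TABLE))
# objednavka-star-ich-cisel
print(make_alias("Objednávka starších čísel", FIXED_TABLE))
# objednavka-starsich-cisel
```

Under these assumptions the two calls give the "Pathauto URL" and the "Corrected URL" from the example, which suggests the missing table entry alone could explain the report.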
Comments
Comment #1
intu.cz commented
Is this the part of the code that converts letters with accents to plain a-z?
Because if it is, a lot of characters are missing, including "š", which has been a major pain on a website here. I'd like to help solve the problem, but it would be better if I could cooperate with somebody more knowledgeable.
Thanks
Roman
Comment #2
intu.cz commented
The same problem applies as in http://drupal.org/node/106817
Roman
Comment #3
intu.cz commented
The same problem applies to 4.6 as well. I haven't found a better way of expressing this (the form has no option for selecting more than one version of the module).
My suggestion would be twofold:
In the short run:
1) The missing characters could be added manually. What I did was to change the lines so that I have:
which covers what our website in Czech needs. Some other languages might still miss characters...
2) Some kind of transcription mechanism might be found to work in general for all languages. Why is the following conversion mechanism in pathauto.module commented out?
In the long run:
1) Change the approach to i18n and Drupal Translations to encompass:
Translation of strings works, but could be less problematic on multilingual websites (though I admit I last looked at the i18n module a long time ago).
Typography: nobody seems to pay attention to it, besides Smartypants, but that is English-centric. Here is what happens if you ignore it:
http://www.ahmadinejad.ir/en/merry-christmas-to-everyone/
Transliteration tables could be another package in a system covering all language aspects. Looking at the code above: where are Cyrillic letters, Arabic, Armenian, Klingon, etc. ?
Roman
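One general mechanism of the kind asked about above is Unicode decomposition. The sketch below is in Python for illustration only; it is neither the commented-out code in pathauto.module nor the patch from any other issue, and `strip_accents` is a hypothetical name. It assumes that splitting each letter into a base letter plus combining marks and then dropping the marks is acceptable as a transliteration.

```python
import unicodedata


def strip_accents(text):
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Objednávka starších čísel"))
# Objednavka starsich cisel
```

This covers the accented Czech letters without a hand-maintained table, but it is not a complete answer: letters with no decomposition, such as "ł", pass through unchanged, and non-Latin scripts such as Cyrillic are not turned into Latin letters, so a transliteration table is still needed for those.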
Comment #4
Láďa commented
I have to agree with this in 5.0-rc2. There should be another two lines
each one after their uppercase counterparts. Is this enough or should I make the patch?
All other Czech characters are OK, tested with "příliš žluťoučký kůň úpěl ďábelské ódy" (uppercased and lowercased), which is the shortest known Czech sentence including all accented characters.
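The test described above can be automated along these lines. This is a Python sketch for illustration only; the table below is a hypothetical stand-in for the module's array, built so that "š" and "Š" are absent, and `missing_from_table` is a made-up helper name.

```python
# Hypothetical table covering the Czech accented letters except "š"/"Š".
LOWER = dict(zip("áčďéěíňóřťúůýž", "acdeeinortuuyz"))
TABLE = {**LOWER, **{k.upper(): v.upper() for k, v in LOWER.items()}}

SENTENCE = "příliš žluťoučký kůň úpěl ďábelské ódy"


def missing_from_table(sample, table):
    """Return the non-ASCII letters in `sample` that `table` does not map."""
    return sorted(
        {ch for ch in sample if ch.isalpha() and ord(ch) > 127 and ch not in table}
    )


# Check the lowercase and the uppercase form of the sentence in one pass.
print(missing_from_table(SENTENCE + SENTENCE.upper(), TABLE))
# ['Š', 'š']
```

An empty list would mean every accented letter in the sample has a mapping; any other language can be checked the same way with its own sample sentence.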
Comment #5
intu.cz commented
Not only those two, but a few more should be included in the patch. I tried to suggest the additions in http://drupal.org/node/106817#comment-175985 under 1).
Comment #6
greggles
First, I'm not making changes to the 4.6 branch, so I changed the version.
Second, I'm not making changes to the transliteration array as proposed. See the reasoning for that elsewhere in the issue queue.
Third, there is a solution posted now in another issue which needs testers (see http://drupal.org/node/61815) and I believe that is the best long-term sustainable solution.
I would appreciate your help in testing the patch in 61815 so that we can get it committed and included in future releases. It is hard for me to test this because I only use ASCII-96...