if t() string has no translation or fallback language, text should have lang attribute [#1165476]

Comment	File	Size	Author
#64	patch_commit_3310b19aeaa3.patch	2.12 KB	hanno
#62	patch_commit_3adb7b5aa35b.patch	3.1 KB	hanno
#42	patchtitle.png	33.62 KB	hanno
#40	NoTranslation2.png	108.83 KB	mgifford
#40	no-translation-1165476-40.patch	943 bytes	mgifford
#38	patch_commit_b16fa6744012.patch	1014 bytes	hanno
#30	patch_commit_663ae58b2fba.patch	871 bytes	hanno
#22	patch_commit_6f8639cd6e82.patch	813 bytes	hanno
#22	punctuation_before.png	68.72 KB	hanno
#22	punctuation_after.png	70.1 KB	hanno
#16	punctuation_example.png	80.02 KB	hanno
#14	quotation_mark.png	90.26 KB	hanno
#8	t-markup.patch	623 bytes	gábor hojtsy
#8	tSpan.png	170.72 KB	gábor hojtsy

Comment #1

mgifford commented 22 May 2011 at 22:45

Issue tags:

+WCAG

Should it be in English or in the default language chosen by the user?

Log in or register to post comments

It assumes that text strings of Drupal modules are in English, as that is a Drupal localization rule (http://drupal.org/node/322729).
If the site is, for example, in Dutch, and there is no translation available for the original string 'Post new comment', the string should be output with a language attribute. Something like

<span lang="en">Post new comment</span>

Note that best practice for accessibility is of course to translate all strings.

Comment #3

gábor hojtsy commented 16 June 2011 at 12:48

The fundamental problem with trying to implement this is that when t() or format_plural() is invoked, we don't know if HTML is generated. We cannot just output HTML-wrapped text if what's generated is for an email or an XML format like RSS. We don't have that information, so we'd risk breaking all emails and RSS feeds generated by the site if we did. The localization client module has the same issue over at #218021: Make it possible to translate by clicking on elements.

Comment #4

hanno commented 28 June 2011 at 21:01

Hmm, that's a fundamental problem indeed. The only (somewhat radical) solution to implement this is to add a parameter to t()?

Comment #5

hanno commented 28 June 2011 at 21:03

Hmmm, a fundamental problem indeed. The only, somewhat radical, solution for this is adding a parameter to t()?

Comment #6

gábor hojtsy commented 30 June 2011 at 06:45

Yes, to implement this, we'd either need to return a structure from t() instead of a string, so people can use just the string or the whole markup, or alternatively pass some information to t() saying that adding markup is OK (or rather that it is not OK, since that is less common).

Comment #7

gábor hojtsy commented 4 November 2011 at 08:10

Consider this as well:

$form['item'] = array(
  '#title' => t('Pick a number'),
  '#type' => 'select',
  '#options' => array(t('One'), t('Two'), t('Three')),
);

Now which of the four t()'s would be safe to return something like:

<span class="untranslated" lang="en">....</span>

To me it seems like only the first. The other three are option values, that are escaped for inclusion in the select box, so the select box would literally include the span tag. It seems to me like although we can add an extra argument to t(), this is not something I'd consider easy developer experience as they need to track how the strings are used in obscure places like this...

$form['item'] = array(
  '#title' => t('Pick a number'),
  '#type' => 'select',
  '#options' => array(
    t('One', array(), array('wrapper' => FALSE)),
    t('Two', array(), array('wrapper' => FALSE)),
    t('Three', array(), array('wrapper' => FALSE)),
  ),
);

Well, not nice. I don't really have better ideas, since as you can see the kind of ways we use the output of t() can be very different from line to line.
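
To make the escaping problem concrete, here is a minimal sketch in plain PHP (7.0 or later), not Drupal code. wrap_untranslated() is a hypothetical stand-in for a t() that always adds markup, and htmlspecialchars() stands in for the escaping that option labels go through before output.

```php
<?php
// Hypothetical helper: what t() might return for an untranslated string
// if it always wrapped its output in markup.
function wrap_untranslated(string $text, string $langcode = 'en'): string {
  return '<span class="untranslated" lang="' . $langcode . '">' . $text . '</span>';
}

$label = wrap_untranslated('One');

// Option labels are escaped before they are printed, so the wrapper
// markup is turned into literal text instead of being interpreted.
$escaped = htmlspecialchars($label, ENT_QUOTES, 'UTF-8');

echo '<option>' . $escaped . '</option>', "\n";
```

The browser would show the span tag as visible text inside the select box, which is the breakage described above.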

Comment #8

gábor hojtsy commented 4 November 2011 at 08:36

Status	File	Size
new	tSpan.png	170.72 KB
new	t-markup.patch	623 bytes

You can try this for yourself easily. Just apply this simple patch and see how your Drupal site looks:

Comment #9

mgifford commented 4 November 2011 at 12:45

Ok, so it would be adding an $option to the array passed to t():
t($string, array $args = array(), array $options = array())

Currently this supports the following elements:
- 'langcode' (defaults to the current language): The language code to translate to a language other than what is used to display the page.
- 'context' (defaults to the empty context): The context the source string belongs to.

The best case described so far is to add a 'nomarkup' element to this which would check to see that no SPAN tags have been added or run the output through PHP's strip_tags() to yank out the HTML.
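
A rough sketch of how such an option could behave, in plain PHP (7.0 or later). t_sketch() and its $translations argument are hypothetical and only imitate the shape of t(); the 'nomarkup' key is the option proposed here, not something t() supports.

```php
<?php
// Hypothetical stand-in for t(); NOT Drupal's real implementation.
// $translations maps source strings to translations for the current language.
function t_sketch(string $string, array $translations, array $options = []): string {
  if (isset($translations[$string])) {
    // A translation exists, so no language wrapper is needed.
    return $translations[$string];
  }
  // Untranslated: mark the source string as English, left-to-right.
  $wrapped = '<span lang="en" dir="ltr">' . $string . '</span>';
  // With 'nomarkup' set, strip_tags() removes the wrapper again.
  return !empty($options['nomarkup']) ? strip_tags($wrapped) : $wrapped;
}

echo t_sketch('Post new comment', []), "\n";
echo t_sketch('Post new comment', [], ['nomarkup' => TRUE]), "\n";
```

One caveat of this sketch: strip_tags() would also remove any markup that is part of the source string itself, so simply skipping the wrapper may be the safer variant.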

Comment #10

gábor hojtsy commented 4 November 2011 at 14:37

Well, while that sounds technically true I doubt this would generally be accepted as a trade-off. It puts lots of burden on the developer to figure out where the output of t() will be used eventually. Sometimes it is pretty impossible to tell if you are in a generic API function.

Comment #11

mgifford commented 4 November 2011 at 16:57

Is there a way that this could be perhaps handled by a contrib module? I'm just trying to see if there is any way to address this.

Comment #12

gábor hojtsy commented 4 November 2011 at 17:13

@mgifford: I don't think there is a contrib way. I'd love to be able to solve this issue, and properly mark up all items on the page that are in a separate language, but we need to consider the developer experience closely. Let's consider you have an API function somewhere like this:

function mymodule_status_of_service() {
  if (some_logic_to_do()) {
    return t('Enabled');
  }
  return t('Disabled');
}

Since we don't know whether this data is going to be used in a form item (escaped), text email (looks ugly with markup) or straight on a webpage (would be ok with markup), we don't know if that t() is allowed to add markup or not. It is really an API responsibility question. We should not spice up all of our API functions with arguments to pass to t() for markup, so I can tell this function that the output should or should not contain markup, right?

I don't really see a good way to do this, even if we add a nomarkup option to t() which I think people would be freaked out about, that would still not give us the tools we need to solve this, since often we don't even know what to pass in that option.

I do agree it would be great to solve this, but I'm at a complete loss of ideas. If I'd have had an idea in the past four years (since the issue at #218021: Make it possible to translate by clicking on elements was submitted), I'd have done it for my l10n_client contrib module honestly. It would be a very powerful thing.

Comment #13

hanno commented 6 November 2011 at 00:28

@Gabor thanks for all this testing. There is indeed no easy solution. In the long term, we could probably work with html texts by default?
Mark this as postponed for now?

Comment #14

hanno commented 8 November 2011 at 23:37

Status	File	Size
new	quotation_mark.png	90.26 KB

I have an idea. It is a kind of a dirty workaround, maybe more for #218021: Make it possible to translate by clicking on elements:
If untranslated, return the text enclosed with single quotation marks (‘’). So, if a text is untranslated you will get ‘Post new comment’ instead of Post new comment.

We can then do a find and replace in the HTML to substitute ‘ and ’ with special tags.
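
A minimal sketch of that find and replace, in plain PHP (7.0 or later). quotes_to_span() is a hypothetical name, and the sketch assumes the marked text itself never contains curly single quotes.

```php
<?php
// Hypothetical post-processing step: turn ‘...’ pairs in rendered HTML
// into a span carrying the language attribute.
// Assumption: the text between the marks contains no ‘ or ’ itself.
function quotes_to_span(string $html): string {
  return preg_replace('/‘([^‘’]*)’/u', '<span lang="en">$1</span>', $html);
}

echo quotes_to_span('<a href="#">‘Post new comment’</a>'), "\n";
```

The weak point is visible in the pattern: any content that legitimately uses ‘ and ’ would be wrapped as well, which is why this is only a workaround.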

Comment #15

mgifford commented 9 November 2011 at 15:30

Interesting option.. Could possibly play into modules like http://drupal.org/project/l10n_client

Comment #16

hanno commented 10 November 2011 at 15:05

Title:

if string is untranslated, text should have english language attribute

» if string is untranslated, text should have english language and LTR attribute

Status	File	Size
new	punctuation_example.png	80.02 KB

Changing title: while testing, found out that this issue is also relevant for RTL based websites. As there is no LTR attribute, Drupal prints the text with the punctuations (.!?()) in the wrong direction. Maybe minor issue, but good to take note.

Comment #17

gábor hojtsy commented 11 November 2011 at 10:22

Well, the punctuation would need to be there if the text is RTL, which is what the markup tells the browser. I agree it would be good to have these wrappers to mark up RTL/LTR differences too, that unfortunately does not get us any closer to a solution :|

Comment #18

mgifford commented 16 June 2012 at 13:56

Title:	if string is untranslated, text should have english language and LTR attribute	» if t() string has no translation, text should have lang="en" & dir="ltr"
Priority:	Normal	» Major

As discussed at A11ySprint. So much of the world doesn't use Drupal in English.

Comment #19

gábor hojtsy commented 16 June 2012 at 14:04

Any good ideas?

Comment #20

mgifford commented 18 June 2012 at 14:35

Guess we need to write up an issue to help assess whether or not it's returning HTML or not. Alternatively, do some tag sniffing as part of processing the string.

Comment #22

hanno commented 5 October 2012 at 23:33

Status	File	Size
new	punctuation after	70.1 KB
new	punctuation before	68.72 KB
new	patch for ltr-correction of untranslated texts	813 bytes

This issue kept me thinking. As we can't differentiate between html, plain and xml output, I went looking for a solution to add an invisible symbol in the text that we could replace while rendering.
The good news is that unicode has special invisible characters for text direction information: http://www.w3.org/International/questions/qa-bidi-unicode-controls

I did a test in Drupal, and using these invisible characters fixes the right-to-left punctuation problem of untranslated strings. The patch is attached.
- I used json_decode to add the special characters; there is probably a better method, but I couldn't find it.
- We could alternatively use the code U+202D to start LTR. Didn't test that code.
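
For illustration, a small sketch of the wrapping in plain PHP. It assumes PHP 7.0 or later, where the "\u{...}" escape is available, so json_decode is not needed; mark_ltr() and the constant names are hypothetical.

```php
<?php
// Unicode bidi control characters, written with PHP 7 code point escapes.
const LRE = "\u{202A}"; // LEFT-TO-RIGHT EMBEDDING
const PDF = "\u{202C}"; // POP DIRECTIONAL FORMATTING

// Hypothetical helper: wrap an untranslated string so that it is laid out
// left-to-right even inside right-to-left text.
function mark_ltr(string $string): string {
  return LRE . $string . PDF;
}

$marked = mark_ltr('Post new comment!');

// The controls are invisible, but they are part of the string:
// each one adds three bytes in UTF-8.
echo strlen('Post new comment!'), ' -> ', strlen($marked), "\n";
```

Because the result is still plain text, it can be used in attribute values, emails and other places where a span tag would not be acceptable.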

Comment #23

hanno commented 5 October 2012 at 23:34

Status:

Active

» Needs review

Comment #24

6 October 2012 at 00:05

Status:

Needs review

» Needs work

The last submitted patch, patch_commit_6f8639cd6e82.patch, failed testing.

Comment #25

Everett Zufelt commented 6 October 2012 at 07:13

I think the problem is when this markup should be added (if required).

Perhaps an API function should return untranslated, but translatable, strings, and then at the point that these strings are being output, or added to a render array, they should be translated. This would require separation of translation into two calls. This is more easily understood, but still an extra burden on developers.

// Indicate a string is translatable.
translatable('Disabled');

// Translate a string.
translate($string);

Edit: Perhaps the second function could mean translatable and translate. The first function would declare a string translatable only, the second would declare a string translatable and perform the translation, so that literals can still be used and simple strings need only be passed into one function before output.

Obviously the function names used are for demonstration. In this case API functions would declare a string to be translatable, but the t() function could be used to translate non literal strings.
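
A minimal sketch of the two-call idea in plain PHP (7.0 or later). The class, the $translations array and the $html flag are all hypothetical; the point is only that the decision about markup moves to the place where the output format is known.

```php
<?php
// Hypothetical value object: marks a string as translatable without
// translating it yet.
final class TranslatableString {
  public $source;

  public function __construct(string $source) {
    $this->source = $source;
  }
}

// Step 1: API code only declares the string translatable.
function translatable(string $source): TranslatableString {
  return new TranslatableString($source);
}

// Step 2: output code translates, and it knows whether markup is allowed.
function translate(TranslatableString $string, array $translations, bool $html): string {
  if (isset($translations[$string->source])) {
    return $translations[$string->source];
  }
  return $html
    ? '<span lang="en" dir="ltr">' . $string->source . '</span>'
    : $string->source;
}

$status = translatable('Disabled');
echo translate($status, [], TRUE), "\n";  // HTML page
echo translate($status, [], FALSE), "\n"; // plain-text email
```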

Another problem, which is exposed by Gábor above, is the need to set the lang and dir on specific render elements

$element['#lang'] = 'en';
$element['#dir'] = 'ltr';

What if all fields on my form are in english, except one? Or, a more complicated case:

$element['#type'] = 'radios';
$element['#options'] = array(
array(0, 'English', array('#lang' => 'en', '#dir' => 'ltr')),
array(1, 'French', array('#lang' => 'fr', '#dir' => 'ltr')),
);

Comment #26

mgifford commented 6 October 2012 at 13:50

Great to see progress. Would be so neat if this patch could be brought in to address this on a more basic level.

@Hanno, you've added directionality via unicode, but not the language definition. I was quite hopeful to see this example here where they equated:
<Q lang="he" dir="rtl">...a Hebrew quotation...</Q>
and
‫״...a Hebrew quotation...״‬

Unfortunately I couldn't find a way to do that in English.

Having the untranslated string display in left to right I think would be an improvement for sighted users.

Ultimately we need to insert LTR & English around strings that haven't been translated. The unicode solution also won't work for XML I don't think.

Think we might have to go back to API function changes as per @Everett's note above.

Comment #27

hanno commented 7 October 2012 at 22:21

@Everett Well, there are different directions for a solution
1. Handle every untranslated string as an error. Especially for anonymous users, every string should be translated. So post an error message in the watchdog table that a translation string could not be found (similar as file not found).
2. Make a smarter t()-function, that can accept html=true/false and/or that return an array (translated yes/no, plain/html, language)
3. add invisible information to the untranslated string and replace that while html rendering with language code

Ad 1: Not that sure if every administrator is happy with that
Ad 2: This was proposed by Mike in #9; Gábor answered in #10 that it is hard or impossible for module developers to tell. I prefer this solution as well, but it needs patches and testing in core, and I doubt it can make it into Drupal 8
Ad 3: using unicode could be a solution that can eventually work, that doesn't break anything. Important issue is whether it is possible to replace unicode by 'lang=' and 'dir=' for html-rendering.

@Mike unicode text direction is possible in XML, but not preferred. It depends on the receiver if this is used or ignored and if markup is preferred.

Comment #28

hanno commented 7 October 2012 at 22:19

@Mike, there are indeed language tags in unicode, but they are deprecated. So I suppose no browser or text-to-speech software will handle these unicode characters. However, we could investigate whether it is possible to use these unicode characters as a helper and replace them with real language tags during rendering.

The English language tag in Unicode characters would be: U+E0001, U+E0065, U+E006E
And the end tag would be: U+E007F

During html rendering we could replace them with lang="en". Can we do this DOM manipulation at the end of building a page?

For detailed reading about the language tag: The Unicode Standard - Chapter 16 (p565)

Comment #29

mgifford commented 8 October 2012 at 20:57

Thanks @Hanno for your persistence on this issue. Language of parts is an important part of WCAG 2.0AA and one that isn't well supported anywhere.
http://www.w3.org/WAI/WCAG20/quickref/#qr-meaning-other-lang-id

It's too bad that the language tags in unicode have been deprecated. But yes, it's unlikely that AT will support them if they don't already.

The option of inserting the deprecated unicode and then swapping it with JS is interesting. It shouldn't cause a problem, I would assume.

I like the idea of handling every untranslated string seen by an anonymous user as an error. I think that many sites would see it this way anyway.

I do wonder what the performance implications might be for any of these solutions.

Comment #30

hanno commented 11 October 2012 at 22:41

Status:

Needs work

» Needs review

Status	File	Size
new	patch_commit_663ae58b2fba.patch	871 bytes

Well, I included a patch that, besides the official direction characters, also adds the deprecated language unicode characters. These characters can optionally be rendered in the right places as the html attributes lang and dir. For sure by javascript, but probably also somewhere around drupal_render().

The str_replace below works in a theme:

print str_replace( drupal_json_decode('"\udb40\udc01\udb40\udc65\udb40\udc6e\u202a"'),'<span lang="en" dir="ltr">' , str_replace( drupal_json_decode('"\u202c\udb40\udc7f"'), '</span>' , $text));

Three questions about this solution:
- Is this a valid solution?
- Is json encode the preferred way to include these unicode characters?
- Is there a way to manipulate the html dom, or do a smart replace somewhere before output, to add the lang attribute?
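
On the second question, a sketch of the same replacement without JSON, in plain PHP. It assumes PHP 7.0 or later for the "\u{...}" escape; markers_to_span() and the constant names are hypothetical. The marker sequences are the ones used in the str_replace above.

```php
<?php
// Start marker: language tag "en" (U+E0001 U+E0065 U+E006E) followed by
// LEFT-TO-RIGHT EMBEDDING (U+202A).
const MARK_START = "\u{E0001}\u{E0065}\u{E006E}\u{202A}";
// End marker: POP DIRECTIONAL FORMATTING (U+202C) followed by
// CANCEL TAG (U+E007F).
const MARK_END = "\u{202C}\u{E007F}";

// Hypothetical HTML-only post-processing step: swap the invisible
// markers for real markup.
function markers_to_span(string $html): string {
  return str_replace(
    [MARK_START, MARK_END],
    ['<span lang="en" dir="ltr">', '</span>'],
    $html
  );
}

$text = '<p>' . MARK_START . 'Post new comment' . MARK_END . '</p>';
echo markers_to_span($text), "\n";
```

Note that this is only safe for text nodes. A marked string inside an attribute value, such as a title, would end up with a span tag inside the attribute.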

Comment #31

11 October 2012 at 22:50

Status:

Needs review

» Needs work

The last submitted patch, patch_commit_663ae58b2fba.patch, failed testing.

Comment #32

mgifford commented 11 October 2012 at 23:03

Issue tags:

+i18n, +language of parts

adding tags.

Comment #33

gábor hojtsy commented 12 October 2012 at 07:29

Why are you using json wrappers when the same chars can be represented in PHP strings just as well (and in much less CPU time :)? Also, are you envisioning all strings in themes be wrapped in this decoding code? How do you think this could be feasible to implement?

Comment #34

hanno commented 13 October 2012 at 22:41

I tried to add these unicode characters directly with an escape code, but couldn't get it to work in php and followed this advice: http://stackoverflow.com/questions/6058394/unicode-character-in-php-string Would also prefer to have the characters there directly. Also tried to copy paste them directly, but that didn't work either. Using the htmlencoded variant worked, but had the problem that on several places this code was printed.
The str_replace in a theme is indeed not a workable solution; I added it as a proof of concept to see if the conversion was possible. I am hoping for a way to do a general find and replace in the DOM structure or when rendering HTML, added somewhere in a general theme output function. Is there a place where we can loop through the Drupal html output?

Comment #36

pancho commented 14 October 2012 at 23:22

Category:

feature

» task

Awesome that you're making some progress!
This is at least a task, so the feature freeze shouldn't stop you IMHO.

Comment #37

hanno commented 23 October 2012 at 09:23

@Pancho @Gabor, is it an idea to add a hook here for untranslated strings?
Or should we include the unicode characters and work on a conversion later?

If we have a hook for untranslated strings, we can create contributed modules for different solutions:
- to help the translation client module with inline translations
- to add a language negotiation fallback mechanism for t strings.
- to write errors to the log when a string appears untranslated
- to add unicode and rewrite the dom by javascript
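
To show what such a hook could make possible, here is a sketch in plain PHP (7.1 or later). The UntranslatedStrings class is a hypothetical callback registry and does not use Drupal's real hook system; it only illustrates two of the ideas above (logging and adding unicode) being plugged in independently.

```php
<?php
// Hypothetical registry of handlers that are called for every string
// that has no translation. Drupal's real hook system works differently.
final class UntranslatedStrings {
  private static $handlers = [];

  public static function register(callable $handler): void {
    self::$handlers[] = $handler;
  }

  // Pass the string through every handler in registration order.
  public static function process(string $string, string $langcode): string {
    foreach (self::$handlers as $handler) {
      $string = $handler($string, $langcode);
    }
    return $string;
  }
}

// Handler 1: record the missing translation and leave the string unchanged.
$log = [];
UntranslatedStrings::register(function (string $string, string $langcode) use (&$log): string {
  $log[] = $langcode . ': ' . $string;
  return $string;
});

// Handler 2: add the invisible left-to-right marks.
UntranslatedStrings::register(function (string $string, string $langcode): string {
  return "\u{202A}" . $string . "\u{202C}";
});

$out = UntranslatedStrings::process('Post new comment', 'ar');
```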

Comment #38

hanno commented 23 October 2012 at 21:14

Status:

Needs work

» Needs review

Status	File	Size
new	patch_commit_b16fa6744012.patch	1014 bytes

Patch with the unicode characters directly encoded in valid utf-8 format.
Tested and resolves the direction issue for RTL languages.

if ($locale_t[$langcode][$context][$string] === TRUE) {
  return "\xF3\xA0\x80\x81\xF3\xA0\x81\xA5\xF3\xA0\x81\xAE\xE2\x80\xAA" . $string . "\xE2\x80\xAC\xF3\xA0\x81\xBF";
}

Comment #39

23 October 2012 at 21:30

Status:

Needs review

» Needs work

The last submitted patch, patch_commit_b16fa6744012.patch, failed testing.

Comment #40

mgifford commented 24 October 2012 at 14:03

Status:

Needs work

» Needs review

Status	File	Size
new	no-translation-1165476-40.patch	943 bytes
new	NoTranslation2.png	108.83 KB

I had some problems applying the patch so just refreshed it.

In looking at the code though I want to confirm that from the source perspective like:

Actually, it seems I can't paste in the code here as the first character after the title="

It seems that this is being stripped out along with everything following it by the filters we're using.

Comment #41

24 October 2012 at 14:42

Status:

Needs review

» Needs work

The last submitted patch, no-translation-1165476-40.patch, failed testing.

Comment #42

hanno commented 25 October 2012 at 08:02

Status	File	Size
new	patchtitle.png	33.62 KB

Thanks for this refreshed patch. I don't understand why it fails the tests. Any clue?
Here is a screenshot of the title attribute. These characters aren't printed by web browsers, but they are sent to the browser and, as far as I know, not filtered by Drupal functions such as check_plain. The attached screenshot shows the title attribute of the menu, correct with this patch and incorrect without it (RTL language chosen, without translated strings).

Comment #43

hanno commented 25 October 2012 at 08:51

Probably the tests fail because SimpleXML cannot parse these unicode characters?

Comment #44

mgifford commented 25 October 2012 at 13:03

I think the problem comes because the tests are looking for absolute strings, but the strings are different. Your patch appends characters which we can't see but that are there. I think the tests just need to be modified to allow for "\xF3\xA0\x80\x81\xF3\xA0\x81\xA5\xF3\xA0\x81\xAE\xE2\x80\xAA" . $string . "\xE2\x80\xAC\xF3\xA0\x81\xBF" as right now it's looking for $string in most cases.

It's a brilliant approach I think. It's only going to be wrapped around untranslated strings when in another language so for most sites it isn't something anyone is going to know about.

It will make some things harder to troubleshoot, but really like the approach you've developed here.

Comment #45

mgifford commented 25 October 2012 at 13:07

EDIT: There was a duplicate.

But thought I'd add here that I do believe that non-English sites seeking to meet WCAG 2.0 AA will need this. Will be useful to not have to hack core to be able to implement this. Not sure if there's any way to do this.

Comment #46

yesct commented 8 January 2013 at 20:43

Is the next step here still:
more discussion to settle on an implementation?

Is there anyone we can ping that has not had a look at this yet that might be interested or have some expertise?

Comment #47

mgifford commented 9 January 2013 at 14:43

Status:

Needs work

» Needs review

#40: no-translation-1165476-40.patch queued for re-testing.

Comment #48

mgifford commented 9 January 2013 at 14:46

Partly I think it's a matter of getting the patch to pass the bot. I've just re-queued it so perhaps it will work.

There are also open concerns by Gábor Hojtsy that haven't been addressed. Updating the issue with a summary of the concerns and responses would be useful.

Comment #49

9 January 2013 at 15:43

Status:

Needs review

» Needs work

The last submitted patch, no-translation-1165476-40.patch, failed testing.

Comment #50

mgifford commented 29 January 2013 at 19:54

Issue tags:

+Needs tests

Comment #51

hanno commented 29 January 2013 at 23:08

Not sure why this isn't working. Is it the unicode characters that don't parse in the testbot? If so, does that mean this is too experimental to implement?

If we can't find the reason before the code freeze we can instead create a hook on the t-function. We could use that hook for a contrib module.

Comment #52

mgifford commented 30 January 2013 at 15:26

I like that approach. It would probably be easier to get into core anyway. Any thoughts on how you'd do that?

Comment #53

hanno commented 26 February 2013 at 08:38

Well, same as the hook with the issue for the date_formats to support other calendars:
#1178342: Allow contributed modules to alter the format_date() function result.
so, patch needed. Contributors could also use that hook to look up a word in another preferred fall back language when it's not available in the requested language.

Comment #54

hanno commented 11 May 2013 at 20:57

I researched why the patch failed.
It happens because the testbot fails to write to the 'message' column of the simpletest table. (General error: 1366 Incorrect string value: '\xF3\xA0\x80\x81\xF3\xA0...' )

It fails because the supplied UTF-8 characters for the language change are four bytes long. MySQL doesn't support the 4-byte section of UTF-8. As this is also an issue for other characters, it is all explained in this issue #1314214: MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols)

The solution for the language change in unicode can't be implemented before that issue is solved.
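
The difference between the two kinds of characters can be checked with a small sketch in plain PHP (7.0 or later). has_four_byte_chars() is a hypothetical helper; it only tests for code points above U+FFFF, which are the ones UTF-8 encodes in four bytes.

```php
<?php
// Returns TRUE if the UTF-8 string contains a code point above U+FFFF,
// i.e. a character that takes four bytes in UTF-8.
function has_four_byte_chars(string $utf8): bool {
  return preg_match('/[\x{10000}-\x{10FFFF}]/u', $utf8) === 1;
}

$language_tag = "\u{E0001}\u{E0065}\u{E006E}"; // language tag characters
$direction = "\u{202A}";                       // LEFT-TO-RIGHT EMBEDDING

var_dump(has_four_byte_chars($language_tag));  // four-byte characters present
var_dump(has_four_byte_chars($direction));     // three bytes only
```

So only the language tag characters hit the database limitation; the direction characters are three bytes each and are not affected by it.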

Comment #55

pancho commented 12 May 2013 at 10:32

Status:

Needs work

» Postponed

So postponing on #1314214: MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols)

Comment #56

pancho commented 12 May 2013 at 11:06

Category:	task	» bug
Priority:	Major	» Normal

Also recategorizing as a bug. The screenshots in #16 should be pretty convincing.
Even in a fully RTL translated scenario, it's practically impossible not to have a few untranslated strings left, if it is desired at all.

Comment #57

pancho commented 13 May 2013 at 12:19

Status:

Postponed

» Needs work

Not sure if this really needed to be postponed.
The question is: How far can we get without using the 4-byte characters?
It might be worth getting that part fixed and do a followup, whenever #1314214: MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols) lands, which is quite unclear atm.

Comment #58

heine commented 13 May 2013 at 12:47

Unicode control characters are forbidden in phrasing content per the HTML 5 candidate:

Text nodes and attribute values must consist of Unicode characters, must not contain U+0000 characters, must not contain permanently undefined Unicode characters (noncharacters), and must not contain control characters other than space characters. […]

Comment #59

hanno commented 13 May 2013 at 13:24

That's bad news. In a W3C article about bidi and html5 it seemed possible (http://www.w3.org/International/articles/inline-bidi-markup/#nomarkup):

"There are some situations where you may not be able to use the markup described in the previous section. In HTML these include the title element and any attribute value. In these situations you have to use the invisible Unicode characters that produce the same results."

Comment #60

heine commented 13 May 2013 at 18:20

The term "control characters" is used somewhat ambiguously in normal speech, but the spec perhaps refers just to the "Control Codes". The Unicode glossary defines those as the ranges U+0000..U+001F and U+007F..U+009F, which would make this approach at least non-forbidden.
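
Under that reading, the check is simple. A sketch in plain PHP (7.0 or later), with is_control_code() as a hypothetical helper that tests only the two ranges quoted above:

```php
<?php
// TRUE if the code point falls in the "Control Codes" ranges quoted above:
// U+0000..U+001F and U+007F..U+009F.
function is_control_code(int $codepoint): bool {
  return ($codepoint >= 0x0000 && $codepoint <= 0x001F)
    || ($codepoint >= 0x007F && $codepoint <= 0x009F);
}

// The characters used in the patch fall outside both ranges.
foreach ([0x202A, 0x202C, 0xE0001, 0xE007F] as $codepoint) {
  printf("U+%04X: %s\n", $codepoint, is_control_code($codepoint) ? 'control code' : 'not a control code');
}
```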

Comment #61

hanno commented 13 May 2013 at 20:00

@Heine it's indeed really ambiguously worded in the W3C document, so a good catch, and it contrasts with, for example, http://www.w3.org/International/docs/bp-html-bidi/ Probably these draft documents need some improvement, as these characters seem OK.

Comment #62

hanno commented 15 May 2013 at 21:17

Status:

Needs work

» Needs review

Status	File	Size
new	patch_commit_3adb7b5aa35b.patch	3.1 KB

Ok, here the patch that should solve the bug shown on #16.
- It adds the unicode characters for LTR and dir pop.
- unicode characters as a const
- Added complexity to give untranslated plural strings properly LTR and dir pop
- Changed the test for plural strings as this test expects an untranslated string

Comment #63

15 May 2013 at 21:19

Status:

Needs review

» Needs work

The last submitted patch, patch_commit_3adb7b5aa35b.patch, failed testing.

Comment #64

hanno commented 15 May 2013 at 21:54

Status:

Needs work

» Needs review

Status	File	Size
new	patch_commit_3310b19aeaa3.patch	2.12 KB

Comment #65

hanno commented 15 May 2013 at 21:54

Comment #66

15 May 2013 at 22:55

The last submitted patch, patch_commit_3310b19aeaa3.patch, failed testing.

Comment #66.0

(not verified) commented 15 May 2013 at 22:55

Issue summary:

View changes

Adding quick summary of the main roadblock from Gabor.

Comment #67

mgifford commented 16 June 2014 at 12:15

Issue summary:	View changes
Related issues:		+#1314214: MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols)

Nice to see progress in #1314214: MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols)

Comment #68

matsbla commented 8 September 2014 at 00:24

Strings can fallback to other languages than English, e.g. Arabic (Egypt) can fallback to Arabic, thus it should have lang="ar" & dir="rtl" (so the writing direction need to respect the writing direction of the string language used)
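
As an illustration, a sketch in plain PHP (7.0 or later) where the wrapper follows the language that was actually used. The direction table and wrap_fallback() are hypothetical; a real site would read the direction from its language configuration.

```php
<?php
// Hypothetical direction table for a few language codes.
const LANGUAGE_DIRECTIONS = [
  'en' => 'ltr',
  'nl' => 'ltr',
  'ar' => 'rtl',
  'he' => 'rtl',
];

// Wrap a string with the language it is actually written in.
// Unknown language codes are assumed to be left-to-right.
function wrap_fallback(string $text, string $langcode): string {
  $dir = LANGUAGE_DIRECTIONS[$langcode] ?? 'ltr';
  return sprintf('<span lang="%s" dir="%s">%s</span>', $langcode, $dir, $text);
}

// A string that fell back from Arabic (Egypt) to Arabic.
echo wrap_fallback('مرحبا', 'ar'), "\n";
// A string that fell back to the English source text.
echo wrap_fallback('Post new comment', 'en'), "\n";
```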

Comment #69

hanno commented 9 September 2014 at 22:06

@matsbla, thanks for bringing this to our attention. Now that we have language fallback in t(), the writing direction and language attribute become even more important for meeting WCAG.
This issue still hangs on the 4-byte issue for the language code, but for writing direction it is already doable with the special Unicode characters.
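
A small Python sketch of why the two cases differ, assuming the "4-byte issue" refers to Unicode language tag characters (U+E0001 followed by tag characters in the U+E0020 to U+E007E range), which need four bytes each in UTF-8. The bidi control characters need only three, so they fit in a 3-byte-per-character column. The helper names are hypothetical.

```python
LRE = "\u202a"               # LEFT-TO-RIGHT EMBEDDING
LANGUAGE_TAG = "\U000E0001"  # LANGUAGE TAG


def language_tag(langcode: str) -> str:
    """Spell an ASCII language code using tag characters.

    Each tag character is the ASCII code point shifted by 0xE0000.
    """
    return LANGUAGE_TAG + "".join(chr(0xE0000 + ord(c)) for c in langcode)


def utf8_width(char: str) -> int:
    """Number of bytes the character occupies in UTF-8."""
    return len(char.encode("utf-8"))


print(utf8_width(LRE))                               # bidi control character
print([utf8_width(c) for c in language_tag("en")])   # tag characters
```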

Comment #70

hanno commented 10 September 2014 at 22:15

I think it's better to split RTL into a separate issue, as that one is easier to fix than the language code.
#2336491: if t() string has fallback language in another text direction, bidi should be added

Comment #71

hanno commented 10 September 2014 at 21:58

Title:	if t() string has no translation, text should have lang="en" & dir="ltr"	» if t() string has no translation or fallback language, text should have lang attribute
Issue summary:	View changes

Comment #72

hanno commented 10 September 2014 at 22:18

Comment #73

mgifford

he/him

English

commented 28 September 2014 at 13:00

Issue tags:

+dcamsa11y

Comment #74

gábor hojtsy

he/him

Hungarian

Hungary

commented 28 September 2014 at 13:11

Issue tags:

+Amsterdam2014

Amsterdam2014 is the master sprint tag for AMS :)

Comment #75

hanno commented 29 September 2014 at 07:26

Shouldn't we first focus in Amsterdam on #2336491: if t() string has fallback language in another text direction, bidi should be added, as that is much easier to implement? If that works, we can work with the same solution on language tags.

Comment #76

yesct commented 12 November 2014 at 20:55

Issue tags:

+D8MI

Comment #77

hanno commented 22 June 2015 at 07:04

#1314214: MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols) has been fixed, so with the latest version it should be possible to use 4-byte language characters in Drupal.

Comment #78

mgifford

he/him

English

commented 22 June 2015 at 17:20

@Hanno, can you re-roll your patch in #64? It would be great if this could be brought into D8.

Comment #79

hanno commented 22 June 2015 at 20:38

Yes, back in time :) Patch #64 was for bidi only; #34 involved the language changes.

Comment #80

mgifford

he/him

English

commented 17 July 2015 at 02:44

Version:	8.0.x-dev	» 8.1.x-dev
Status:	Needs work	» Postponed

I'd like to get this in, but it might need to wait until 8.1.

Comment #81

mgifford

he/him

English

commented 5 January 2016 at 22:36

Status:

Postponed

» Needs review

Comment #82

5 January 2016 at 22:36

Version:

8.1.x-dev

» 8.2.x-dev

Drupal 8.1.0-beta1 was released on March 2, 2016, which means new developments and disruptive changes should now be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #83

5 January 2016 at 22:36

Version:

8.2.x-dev

» 8.3.x-dev

Drupal 8.2.0-beta1 was released on August 3, 2016, which means new developments and disruptive changes should now be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #84

mgifford

he/him

English

commented 3 January 2017 at 22:54

Status:

Needs review

» Needs work

That should be Needs Work. Hanno's patch is from 4 years ago...

Comment #85

3 January 2017 at 22:54

Version:

8.3.x-dev

» 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #86

3 January 2017 at 22:54

Version:

8.4.x-dev

» 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #87

3 January 2017 at 22:54

Version:

8.5.x-dev

» 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #88

3 January 2017 at 22:54

Version:

8.6.x-dev

» 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #89

3 January 2017 at 22:54

Version:

8.7.x-dev

» 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Comment #90

3 January 2017 at 22:54

Version:

8.8.x-dev

» 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Comment #91

3 January 2017 at 22:54

Version:

8.9.x-dev

» 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Comment #92

3 January 2017 at 22:54

Version:

9.1.x-dev

» 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Comment #93

3 January 2017 at 22:54

Version:

9.2.x-dev

» 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #94

3 January 2017 at 22:54

Version:

9.3.x-dev

» 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #95

3 January 2017 at 22:54

Version:

9.4.x-dev

» 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #96

3 January 2017 at 22:54

Version:

9.5.x-dev

» 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #97

dww

we/he/they

commented 6 April 2023 at 01:29

Issue tags:

+Bug Smash Initiative, +Needs reroll

This came up as a random triage target for the #bugsmash initiative. Seems like this is still a bug, and it's still a tricky problem. At the bare minimum, we need to re-roll this for modern core (probably target 10.1.x for now, worry about backports later). I haven't exhaustively read the whole history in here to know if there's more needed, although I also see it's tagged "Needs tests", and the existing test coverage in here is both very sparse (only changes 2 lines) and is changing existing coverage, not adding new coverage.

Comment #98

penyaskito

he/him

Spanish

🧑🏽‍🌾 Seville 💃 Andalusia, UTC+2 🇪🇺

commented 6 April 2023 at 02:04

Read all comments. I was also concerned about what Gábor described in the first 12 comments, as t() lacks the context for when (and where) something is rendered in HTML or in other formats. But maybe now, with delayed translation via TranslatableMarkup, it is worth a revisit. I still think this would be a DX issue and a big change only doable in a major release.
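
As a rough illustration of the idea above, the Python sketch below defers the decision about wrapping until render time, when the caller states the output format. `DeferredString` and its `render()` parameters are hypothetical and do not reflect how TranslatableMarkup works; in particular, it assumes the renderer knows the output format, which is exactly the information t() lacks.

```python
from html import escape


class DeferredString:
    """Hypothetical lazily rendered string; not Drupal's API."""

    def __init__(self, source: str, langcode: str = "en") -> None:
        self.source = source      # untranslated source text
        self.langcode = langcode  # language the source text is written in

    def render(self, output_format: str, page_langcode: str) -> str:
        """Add a lang attribute only when producing HTML in another language."""
        if output_format != "html":
            # Email, RSS, and other formats get the bare text.
            return self.source
        if self.langcode == page_langcode:
            return escape(self.source)
        return '<span lang="{}">{}</span>'.format(
            escape(self.langcode), escape(self.source)
        )


label = DeferredString("Post new comment")
print(label.render("html", "nl"))
print(label.render("text", "nl"))
```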

The currently attached patch looks like it covers only what is described in #2336491: if t() string has fallback language in another text direction, bidi should be added, so it should probably be moved there.

Comment #99

penyaskito

he/him

Spanish

🧑🏽‍🌾 Seville 💃 Andalusia, UTC+2 🇪🇺

commented 6 April 2023 at 02:05

Also a concern: #1314214: MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols) was closed, but I'm not sure about the rest of the database engines we support (core and contrib).

Comment #100

penyaskito

he/him

Spanish

🧑🏽‍🌾 Seville 💃 Andalusia, UTC+2 🇪🇺

commented 7 April 2023 at 02:25

Status:

Needs work

» Closed (won't fix)

After discussion in Slack with @dww and @Gábor Hojtsy, it looks like there's agreement that this is a legitimate and important request, but that it cannot realistically be fixed unless a major re-architecture happens.
