Translatable entries not HTML encoded [#218516]

Comment	File	Size	Author
#19	l10n_client-node-218516-2.patch	3.32 KB	zroger
#17	l10n_client_ui_before.jpg	79.58 KB	gábor hojtsy
#17	l10n_client_ui_after_1.jpg	141.16 KB	gábor hojtsy
#17	l10n_client_ui_after_2.jpg	36 KB	gábor hojtsy
#6	l10n_client_encode_D5_2008080101.patch	1.85 KB	hass
#6	l10n_client_encode_D6_2008080101.patch	1.35 KB	hass
#2	l10n_client_encode_D5.patch	1.85 KB	hass
#2	l10n_client_encode_D6.patch	1.35 KB	hass

Comment #1

gábor hojtsy commented 11 February 2008 at 12:22

Status:	Active	» Postponed (maintainer needs more info)

If t() is not used in some module, then that is a problem with that module, not with l10n_client. Also, there are strings showing up from your theme (like region names) which are not shown on the page. The Drupal 5 version also shows menu item names. I don't see a bug here, as you did not describe any specific buggy behavior of the module. Do you have specifics to add?

Comment #2

hass commented 26 April 2008 at 20:29

Version:	5.x-1.0	» 6.x-1.x-dev
Status:	Postponed (maintainer needs more info)	» Needs review

Status	File	Size
new	l10n_client_encode_D6.patch	1.35 KB
new	l10n_client_encode_D5.patch	1.85 KB

This could be an encoding issue, which I found too. These patches should fix it. The D5 patch is a little bit different, as it syncs the code with D6 and cleans up a small part.

Comment #3

hass commented 26 April 2008 at 20:30

Title:	Strange list of translatable entries on a page	» Translatable entries not HTML encoded

Aside from that, these patches also fix XHTML validity bugs...

Comment #4

EllECTRONC commented 1 August 2008 at 17:32

I get an error when trying to apply the patch l10n_client_encode_D5.patch:

patching file `l10n_client.module'
Hunk #1 FAILED at 223.
Hunk #2 FAILED at 299.
2 out of 2 hunks FAILED -- saving rejects to l10n_client.module.rej

Comment #5

hass commented 1 August 2008 at 20:54

Status:	Needs review	» Needs work

Comment #6

hass commented 1 August 2008 at 20:53

Status	File	Size
new	l10n_client_encode_D6_2008080101.patch	1.35 KB
new	l10n_client_encode_D5_2008080101.patch	1.85 KB

I'm not able to reproduce these failed hunks. The lines have shifted a little, but that should not be an issue. Nevertheless, updated patches are attached.

Comment #7

EllECTRONC commented 3 August 2008 at 01:55

Status:	Needs work	» Needs review

patch -p0 < l10n_client_encode_D5_2008080101.patch
patching file `l10n_client.module'
Hunk #1 FAILED at 204.
Hunk #2 FAILED at 307.
2 out of 2 hunks FAILED -- saving rejects to l10n_client.module.rej

Comment #8

hass commented 3 August 2008 at 02:04

Are you applying it to the CVS version?

Comment #9

EllECTRONC commented 3 August 2008 at 19:39

No! I applied it to the version from the package l10n_client--5.x-1.0.tar.gz.

Comment #10

hass commented 7 August 2008 at 19:45

Patches are always built against the latest CVS version, not against an old release.

Comment #11

EllECTRONC commented 7 August 2008 at 20:56

OK, now I see the new version.

Comment #12

pasqualle commented 13 October 2008 at 17:30

The patch solved issue #281721: shift in the string list.

But there is still a problem with the patch. After applying it, instead of

@count hét

I see

@count hÃ©t

Can it be fixed somehow?

Comment #13

hass commented 13 October 2008 at 18:18

This is a UTF-8 problem... the patch only uses htmlentities(), and your "e" is not changed. Maybe your server configuration has issues... Drupal is UTF-8...

Comment #14

hass commented 13 October 2008 at 18:22

See http://www.php.net/manual/en/function.htmlspecialchars.php for the chars that this patch fixes.

Comment #15

gábor hojtsy commented 13 October 2008 at 18:25

Any good example page to test this with so I can reproduce the bug and see that this is indeed fixing it?

Comment #16

pasqualle commented 13 October 2008 at 19:15

I am not sure, but the first string on admin/build/modules:

(<span class="admin-disabled">disabled</span>)

is one which is not encoded, and displayed differently after the patch.

2. Using htmlentities($something, ENT_COMPAT, 'UTF-8') fixed my UTF-8 problem.
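
For illustration, the "hÃ©t" output reported earlier in this thread is what appears when UTF-8 bytes are escaped as if they were ISO-8859-1, which was the default charset of htmlentities() before PHP 5.4. The following is a minimal Python sketch of that mechanism, not the module's PHP code; both helper names are made up for this example.

```python
import html


def escape_assuming_latin1(utf8_bytes: bytes) -> str:
    """Hypothetical helper: escape input while wrongly treating it as ISO-8859-1.

    Every byte of a multi-byte UTF-8 character becomes its own Latin-1
    character, which is what produces the garbled display.
    """
    return html.escape(utf8_bytes.decode("latin-1"), quote=True)


def escape_as_utf8(utf8_bytes: bytes) -> str:
    """Hypothetical helper: escape input after decoding it as UTF-8."""
    return html.escape(utf8_bytes.decode("utf-8"), quote=True)


source = "@count hét".encode("utf-8")

# "é" is the two bytes C3 A9 in UTF-8; read as Latin-1 they are "Ã" and "©".
garbled = escape_assuming_latin1(source)
correct = escape_as_utf8(source)
```

Passing the charset explicitly, as in the htmlentities() call above, corresponds to the second helper.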

Comment #17

gábor hojtsy commented 13 October 2008 at 20:23

Status:	Needs review	» Needs work

Status	File	Size
new	l10n_client_ui_after_2.jpg	36 KB
new	l10n_client_ui_after_1.jpg	141.16 KB
new	l10n_client_ui_before.jpg	79.58 KB

I've also tested the patch. A couple of issues:

1. UTF-8 problem as reported above. I did reproduce it. See after_2.jpg.
2. There is a strip_tags() in the "selection list" builder so that you can see and search actual strings, instead of useless HTML markup taking up space there. Now if you add an escaping code before it, that will obviously not remove the HTML bloat. See after_1.jpg.
3. The escaped tags appear in the source display as escaped and also get copied to the editor textarea escaped. This is clearly not desired.
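
To make point 2 concrete, here is a rough Python sketch of why the order matters. The regex-based strip_tags_approx() is a hypothetical and much cruder stand-in for PHP's strip_tags(), and html.escape() stands in for the escaping call in the patch; neither is the module's actual code.

```python
import html
import re

# Crude approximation of tag stripping: removes anything shaped like <...>.
TAG_RE = re.compile(r"<[^>]*>")


def strip_tags_approx(text: str) -> str:
    """Hypothetical stand-in for PHP's strip_tags()."""
    return TAG_RE.sub("", text)


source = '(<span class="admin-disabled">disabled</span>)'

# Escaping first turns "<" into "&lt;", so the stripper finds no tags to remove.
escaped_first = strip_tags_approx(html.escape(source))

# Stripping first removes the markup; escaping then has nothing left to mangle.
stripped_first = html.escape(strip_tags_approx(source))
```

With escaping applied first, the selection list keeps the full escaped markup; with stripping applied first, only the readable text remains.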

Looks like this patch needs some work, please keep up the improvements!

Comment #18

hass commented 13 October 2008 at 20:58

Oops... this is really not desired.

Comment #19

zroger commented 30 January 2009 at 21:48

Status	File	Size
new	l10n_client-node-218516-2.patch	3.32 KB

After much testing, this patch fixes several issues I had. They all result in the same symptoms as above, so I've rolled them all into a single patch. Maybe they should be broken into separate issues/patches, but since they all seemed to be causing the same problem, this is how I did it.

1. A php string from a views header got into the locale table, so I ended up with really ugly half-php code in the table. I fixed this in l10n_client_footer, by simply not including anything with <?php in it. This part is definitely up for debate. I am currently discussing the bigger issue involving views with merlinofchaos and nedjo, but this seems like a reasonable fail-safe for this module.

2. HTML strings were being injected as-is into the DOM lookup table. So if there was a mismatched tag or something along those lines, the entire lookup went bad. This is what was causing the mismatched source and target panes. In my case, the offending strings were < and <=, which I believe are translated in views. This is remedied by doing an htmlspecialchars() on the output, then using $().text() rather than $().html() in the JavaScript.

3. Along the same lines as #2, strip_tags() aggressively strips < and >. So in the above instance, the < string in the string selector pane was not displaying. There was also a string like < None > which also displayed as an empty selection. This along with the mis-matched panes caused quite an odd user-experience. The way I fixed this was to check for an empty string after doing the strip_tags() and in the case of an empty string, resort to doing an htmlspecialchars() on the original string. This way we get nice looking strings for everything except these edge cases.

4. This last fix also brought to my attention that the ... for truncated strings was being added manually, instead of letting truncate_utf8() handle it for us.
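
The fallback described in point 3 can be sketched roughly as follows. This is Python rather than the patch's PHP, and both function names are hypothetical. The regex is deliberately aggressive so that it reproduces the reported behavior (a bare < or a string like < None > stripping down to nothing); it is not a model of how PHP's strip_tags() really parses input.

```python
import html
import re

# Deliberately aggressive: "<" plus everything up to ">" or the end of the string.
AGGRESSIVE_TAG_RE = re.compile(r"<[^>]*>?")


def strip_tags_approx(text: str) -> str:
    """Hypothetical stand-in for the tag stripping used in the selection list."""
    return AGGRESSIVE_TAG_RE.sub("", text)


def selection_label(source: str) -> str:
    """Return the stripped text, or the escaped original if stripping left nothing."""
    stripped = strip_tags_approx(source).strip()
    if stripped:
        # Normal case: readable text survived. Output escaping of this
        # text is left out of the sketch.
        return stripped
    # Edge case: the whole string looked like markup, so show it escaped
    # instead of displaying an empty entry.
    return html.escape(source)
```

Under these assumptions, selection_label('<a href="x">Home</a>') gives Home, while selection_label("< None >") gives the escaped original instead of an empty entry.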

Comment #20

zroger commented 6 February 2009 at 20:57

Status:	Needs work	» Needs review

Comment #21

gábor hojtsy commented 2 April 2009 at 11:32

Status:	Needs review	» Needs work

1. I am not sure the PHP check belongs here. PHP code should not be added for t().

2. Using .text() would not carry over the HTML tags from the source strings, so users would not translate things like links and other inline elements. Using htmlspecialchars() and combining it with decoding the .html() output sounds like a better approach, so that the HTML will contain things like &lt; and the copied data will be <.

3. If you htmlspecialchars() before truncate_utf8(), you can end up with HTML entities cut in half, since as far as I see http://api.drupal.org/api/function/truncate_utf8/6 does not have any protection for HTML entities at all. So you can end up with a string like "foo bar baz &l", with the "t;" cut off.

4. Right, no objections :)
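
Points 2 and 3 can be illustrated with a small Python sketch. Here html.escape() and html.unescape() stand in for htmlspecialchars() and the decoding step, and the truncate() helper is a hypothetical plain character cut standing in for truncate_utf8(); none of this is the module's code.

```python
import html


def truncate(text: str, limit: int) -> str:
    """Hypothetical stand-in for truncate_utf8(): a plain character cut."""
    return text[:limit]


source = "foo bar baz <"

# Escaping first and cutting afterwards can split an entity in half.
escaped_then_cut = truncate(html.escape(source), 14)

# Cutting the raw string first keeps every entity whole.
cut_then_escaped = html.escape(truncate(source, 14))

# Point 2: escaping for the HTML and decoding on the way back restores the
# original markup, so inline tags are still available to the translator.
markup = '<a href="x">link</a>'
round_trip = html.unescape(html.escape(markup))
```

The first result is the broken "foo bar baz &l" string from point 3; the second keeps the entity intact.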

Thanks for your work. Let's fix these remaining problems and get the fixes in!

Comment #22

gábor hojtsy commented 15 April 2009 at 20:10

Status:	Needs work	» Fixed

I should have noticed earlier that this patch has security implications. Anyway, it was reported independently as a security issue, so this issue was unpublished for a while. The escaping part was fixed via http://drupal.org/node/434682 (SA-CONTRIB-2009-019).

Committed the special-casing code for strings which are HTML-only. Expanded it to also happen in the JS, and made the string limitation and tag stripping work there as well. Committed that before the SA in http://drupal.org/cvs?commit=194594

The PHP code checking is left uncommitted, and I am not at all convinced that it should happen in this module. I'd be more comfortable with moving to drupal_alter() and letting other modules alter our strings by whatever criteria they choose. I don't consider it sane for Views to export PHP code for translation, and that is not something we should work around in this module.

All this said, I consider this issue fixed. Anything else should have a new open issue.

Comment #23

29 April 2009 at 20:20

Status:	Fixed	» Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.
