Download & Extend

D7: Create RFC compliant HTML safe JSON

Project:Drupal core
Version:7.x-dev
Component:base system
Category:bug report
Priority:major
Assigned:Unassigned
Status:closed (fixed)

Issue Summary

Problem/Motivation

(From #1086098: D6 JSON is total garbage - Create RFC 4627 compliant HTML safe JSON by @Heine)

Background:

We are dealing with a number of circumstances:

  • At least two different parsers may process the JSON:
    • HTML parser
    • JSON parser
  • Clients we have no control over need to interpret the JSON: jquery versions, iphone apps, .NET classes, world+dog

This means we need to produce RFC4627 compliant JSON that, at the same time, will not have certain 'special characters' such as ', ", <, > and & be interpreted by an HTML parser.

As RFC4627-2.5 clearly states that "Any character may be escaped", we can avoid special treatment of characters ', ", <, > and & by an HTML parser through simple substitution with a Unicode escape sequence (\uXXXX).

A number of characters MUST be escaped for the JSON parser. These are:

  • "
  • \
  • U+0000 - U+001F

For a number of characters (eg "), it is possible to choose the escape form and either precede them with a backslash (JSON Escape Sequence, eg \"), or use the \uXXXX (Unicode escape sequence, eg \u0027) form. HOWEVER, because we also need to deal with the HTML parser, _it_ may interpret the quote before the the JSON parser even has the chance to run. Because of this, it is advisable to use the Unicode escape sequence (\uXXXX) here as well.

Proposed resolution

In PHP 5.3, compliant JSON that properly encodes HTML special characters can be generated with:

<?php
json_encode
($data, JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP);
?>

However, in PHP 5.2, json_encode() does not support these options. So in 5.3 we use the provided PHP function, but in 5.2 we create our own helper to return properly encoded JSON according to the spec.

Remaining tasks

User interface changes

None.

API changes

None.

Original report by @Steve McKenzie

I discovered this when trying to debug when an objective-c parser was not parsing my data.

return '"'. str_replace(array("\r", "\n", "<", ">", "&"),
array('\r', '\n', '\x3c', '\x3e', '\x26'),
addslashes($var)) .'"';

in drupal_to_js is the issue.

im not someone that knows much about encoding but after showing this to people in freenode #iphonedev, they blamed that line.

so i switched and used json_encode and what do you know, it works fine now.

im too tired to give more details right now. this seems like enough anyways.

on another note, its probably about time drupal changes that name to drupal_to_json OR just drop the damn function entirely in favor of requiring json_encode / json_decode in php 5.2.x but obviously we're not discussing that here.

Comments

#1

os x character palette:

Character: <
unicode: 003C
UTF8: 3C

so why the \x and lower case? i dont know much about encoding like i said.

#2

We need to escape <>& to prevent HTML parsing issues.

Relevant RFC: http://www.ietf.org/rfc/rfc4627.txt?number=4627

json should use \uXXXX, not \xYY to escape characters.

#3

An simple note, to respect the procedure: this doesn't apply to D7, which uses json_encode() already. 6.x is hence the correct version.

#4

@DamZ

<?php
function drupal_to_js($var) {
 
// json_encode() does not escape <, > and &, so we do it with str_replace()
 
return str_replace(array("<", ">", "&"), array('\x3c', '\x3e', '\x26'), json_encode($var));
}
?>

Still uses \x instead of \u

#5

Version:6.x-dev» 7.x-dev

Indeed.

#6

Status:active» needs review

I'm having a hard time finding an authorative text on JS string encoding. Is it possible to mix \uXXX and UTF-8 in a string literal?

AttachmentSizeStatusTest resultOperations
do-479368-u-escape-sequences.patch688 bytesIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch do-479368-u-escape-sequences.patch.View details

#7

The \uXXX is of course transferred in UTF-8 (or whatever encoding used) and the parser deals with the rest.

#8

Here's a patch against D6 for the OP to test.

AttachmentSizeStatusTest resultOperations
do-479368-u-escape-sequences_6.patch790 bytesIdleFailed: Failed to apply patch.View details

#9

Status:needs review» needs work

The last submitted patch failed testing.

#10

It's time for some level tests about that.

#11

This is what I had to change for my iPhone app to load the JSON properly:
common.inc, line 2046 (Drupal 6):

return '"'. str_replace(array("\r", "\n", "<", ">", "&", "\'"),
    // array('\r', '\n', '\x3c', '\x3e', '\x26'), JSON Unicode fix http://drupal.org/node/479368
    array('\r', '\n', '\u003c', '\u003e', '\u0026', '\''),
    addslashes($var)) .'"';

*Note I also added a new replacement for single quotes \'

http://dev.enekoalonso.com

#12

Status:needs work» needs review

Looks like this is needed for #653580: Upgrade to jQuery 1.4

Rerolling

AttachmentSizeStatusTest resultOperations
479368.patch696 bytesIdlePassed on all environments.View details

#13

Priority:normal» critical

Marking this as critical since it causing issues with jQuery 1.4 testing. We have a short time table to get jQuery 1.4 in before the first alpha. The plan is to release jQuery 1.4 the day before the D7 alpha so we are doing testing early in preparation.

#14

Status:needs review» reviewed & tested by the community

This makes total sense.

#15

Status:reviewed & tested by the community» needs work

Does make sense. But can we get some tests for this?

#16

Status:needs work» needs review

json_decode() appears to unescape the HTML-safe characters itself (on PHP 5.2.12), so I've removed the older unescaping from drupal_json_decode(). This probably needs testing on other PHP versions.

The only test I could think of to add to the suite was to test for the escaped characters in the JSON encoded string. All the other unit tests for the JSON functions appear to pick up everything else.

Any more suggestions for the tests?

AttachmentSizeStatusTest resultOperations
479368_1.patch2.56 KBIdlePassed on all environments.View details

#17

Status:needs review» reviewed & tested by the community

Agreed this is the only additional test required.

#18

Version:7.x-dev» 6.x-dev
Status:reviewed & tested by the community» patch (to be ported)

Great. Thanks, folks! :D

Committed to HEAD.

Moving down to 6.x.

#19

Removing needs tests tag

#20

Rolled a patch for D6 based on #11. It adds some characters to correctly escape in JSON in drupal_to_js(). I'd found out that JSON Server was returning un-parseable JSON because it was escaping apostrophe's, which is incorrect.

There's a handy JSON tester online to check correct formatting, which might be helpful in this thread: http://www.jsonlint.com/

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js.patch597 bytesIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js.patch.View details

#21

Status:patch (to be ported)» needs work

       return '"'. str_replace(array("\r", "\n", "<", ">", "&"),
-                              array('\r', '\n', '\x3c', '\x3e', '\x26'),
+                              array('\r', '\n', '\x3c', '\x3e', '\x26', '\u003c', '\u003e', '\u0026', '\''),
                               addslashes($var)) .'"';

This stil escapes < etc with \x3c etc. What is \' supposed to replace?

#22

Status:needs work» needs review

I'm assuming we're sticking with drupal_to_js() in D6, and not simply changing it to:

function drupal_to_js($var) {
  // json_encode() does not escape <, > and &, so we do it with str_replace().
  return str_replace(array('<', '>', '&'), array('\u003c', '\u003e', '\u0026'), json_encode($var));
}

??

#23

@Heine: \' is supposed the replace the backslash and leave the apostrophe in place.

#24

Status:needs review» needs work

#25

We cannot switch to json_encode as D6 should support pre 5.2 PHP versions.

Your single quote in the replace array has no corresponding entry in the search array of str_replace.

#26

Thanks Heine. Re-rolled.

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js.patch663 bytesIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_1.patch.View details

#27

Excellent!

I think we should drop the use of addslashes, and use addcslashes (or expand the str_replace) with the defined escape list of RFC 4627 :

  %x22 /          ; "    quotation mark  U+0022
  %x5C /          ; \    reverse solidus U+005C
  %x2F /          ; /    solidus         U+002F
  %x62 /          ; b    backspace       U+0008
  %x66 /          ; f    form feed       U+000C
  %x6E /          ; n    line feed       U+000A
  %x72 /          ; r    carriage return U+000D
  %x74 /          ; t    tab             U+0009

#28

I vote for str_replace() as it's pretty easy to see what's going with it. Patch attached.

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch765 bytesIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped.patch.View details

#29

Wait, this one is slightly better.

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch765 bytesIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped_0.patch.View details

#30

Status:needs work» needs review

#31

Status:needs review» needs work

+      return '"'. str_replace(array('<', '>', '&', "\'", '"', '\\', '/', '\b', '\f', '\n', '\r', '\t'),
+                              array('\u003C', '\u003E', '\u0026', "'", '\u0022', '\u005C', '\u002F', '\u0008', '\u000C', '\n000A', '\u000D', '\u0009'),
                               addslashes($var)) .'"';
You need to write \b, \f, etc. in double quotes – otherwise you are searching for the literal string (e.g. \t) instead of the character it represents (a tab character).

When called with the string \, the output is "\u005C\u005C". It should be "\u005C" or simply "\\".

When called with \0 (i.e. a string containing one character with the ASCII value 0), the output is "\u005C0". It should be simply \0 enclosed in quotes (not the literal string \0, but the character with the ASCII value 0), or escaped as "\u0000".

I think addcslashes() is better suited for this task.

#32

\x00 is not valid in JSON; The quotation mark, reverse solidus, and the control characters (U+0000 through U+001F: \x00 through \x1f) must be escaped.

As \uXXXX escaping is always allowed, I'd vote to ditch add(c)slashes and use only one escape method.

#33

Status:needs work» needs review

Ok, I've removed addslashes and now handle everything with str_replace. The function now handles all characters from \x00 to \x1f.

I've also written a way to test that it's all working, by testing the output of drupal_to_js() against json_encode() and printing an error if the outputs don't match. Like so:

<?php

$var
= array();
//previous characters which were replaced in drupal_to_js() but don't need to be any more
$var[] = "<";
$var[] = ">";
$var[] = "&";

//characters which are usually handled by addslashes() http://php.net/manual/en/function.addslashes.php
$var[] = "'";
$var[] = '"';
$var[] = '\\';

//other special characters
$var[] = "\'";
$var[] = "\\";
$var[] = "/";
$var[] = "\b";
$var[] = "\f";
$var[] = "\n";
$var[] = "\r";
$var[] = "\t";

//unicode characters
$var[] = "\x00";
$var[] = "\x01";
$var[] = "\x02";
$var[] = "\x03";
$var[] = "\x04";
$var[] = "\x05";
$var[] = "\x06";
$var[] = "\x07";
$var[] = "\x08";
$var[] = "\x0a";
$var[] = "\x0b";
$var[] = "\x0c";
$var[] = "\x0d";
$var[] = "\x0e";
$var[] = "\x0f";
$var[] = "\x10";
$var[] = "\x11";
$var[] = "\x12";
$var[] = "\x13";
$var[] = "\x14";
$var[] = "\x15";
$var[] = "\x16";
$var[] = "\x17";
$var[] = "\x18";
$var[] = "\x1a";
$var[] = "\x1b";
$var[] = "\x1c";
$var[] = "\x1d";
$var[] = "\x1e";
$var[] = "\x1f";

//everything
$var[] = "<>&'\" \'\\ / \b \f \n \r \t \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x1a \x1b \x1c \x1d \x1e \x1f";

//test that drupal_to_js() matches the output of json_encode(). If it doesn't, output an error.
foreach ($var as $x) {
  if (
drupal_to_js($x) != json_encode($x)) {
    print
"<b>Error!</b> $x<br>";
    print
'drupal_to_js: '. drupal_to_js($x) .'<br>';
    print
'json_encode: '. json_encode($x) .'<p>';
  }
}

?>

Running these all through my version of drupal_to_js() doesn't generate any errors. Patch attached.

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch1.35 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped_0_0.patch.View details

#34

Damn - patched against my already patched function. Here's the correct patch.

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch1.33 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped_2.patch.View details

#35

From #2, need to escape <>& to prevent HTML parsing issues.

This now gives output from the testing function at #33 like so:

Error! <
drupal_to_js: "\u003c"
json_encode: "<"

Error! >
drupal_to_js: "\u003e"
json_encode: ">"

Error! &
drupal_to_js: "\u0026"
json_encode: "&"

Error! <>&'" \'\ / \b
AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch1.38 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped_3.patch.View details

#36

drupal_to_js failed on my site due to line Separator (U+2028). Is this related? Should we remove similar UTF-8 separators as well?

#37

What do you mean, "failed"? As in the function failed with an error, or that drupal_to_js produced incorrect JSON? I added $var[] = "\u2028"; to my testing function from #33 and drupal_to_js (the patched version) produces the same output as json_encode.

#38

I've modified my drupal_to_js testing code from #33 to use the same escaping that D7 is using with json_encode. It generates no errors now, including the \u2028 unicode character.

<?php

$var
= array("<", ">", "&", "'", '"', '\\', "\'", "\\", "/", "\b", "\f", "\n", "\r", "\t", "\x00", "\x01", "\x02", "\x03", "\x04", "\x05", "\x06", "\x07", "\x08", "\x0a", "\x0b", "\x0c", "\x0d", "\x0e", "\x0f", "\x10", "\x11", "\x12", "\x13", "\x14", "\x15", "\x16", "\x17", "\x18", "\x1a", "\x1b", "\x1c", "\x1d", "\x1e", "\x1f", "\u2028", "
"
, "<>&'\" \'\\ / \b \f \n \r \t \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x1a \x1b \x1c \x1d \x1e \x1f");

//test that drupal_to_js() matches the output of json_encode(). If it doesn't, output an error.
foreach ($var as $x) {
 
$drupal = drupal_to_js($x);
 
$json = str_replace(array('<', '>', '&'), array('\u003c', '\u003e', '\u0026'), json_encode($x));
  if (
$drupal != $json) {
    print
"<b>Error!</b> $x<br>";
    print
'drupal_to_js: '. $drupal .'<br>';
    print
'json_encode: '. $json .'<p>';
  }
}
?>

#39

\b - \t are already included in the \x00 to \x1f control characters.

There's no need to replace "\'", as we just want to turn the single quote into the proper \u escape sequence, not the reverse solidus followed by a quote.

#40

Roger #37 Here is what I mean by fail: I use a JQuery ajax call and return data via drupal_to_js. The data is an array that includes several HTML regions (right-sidebar and content). Everything worked fine, until part of the data (a teaser of a post) included the UTF-8 line separator. It caused the ajax call to fail with parsererror error.

#41

In my case the following one line addition to drupal_to_js works:

$var = preg_replace( '/\p{Z}/u', ' ', $var );  // Remove separator characters
     
return '"'. str_replace(array("\r", "\n", "<", ">", "&"),
                              array('\r', '\n', '\x3c', '\x3e', '\x26'),
                              addslashes($var)) .'"';

Taken from http://nadeausoftware.com/articles/2007/9/php_tip_how_strip_punctuation_...

#42

Thanks for the feedback, Heine - the single quote is now replaced with \u0027 and I've taken out replacements for \f - \t. Patch attached.

bluetegu, you're saying that in your case drupal_to_js returns a \u2028 character which ajax then couldn't parse? I think that problem lies in the ajax, rather than in drupal_to_js. See the JSON rfc at http://www.ietf.org/rfc/rfc4627.txt - particularly the bit which says, "Insignificant whitespace is allowed before or after any of the six structural characters" and then it does on to list allowed whitespace characters, including Line feed/New line/Carriage return.

Try the attached patch and see if this solves your problem?

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch1.31 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped_4.patch.View details

#43

Roger, the patch added multiple unreadable characters to the HTML rendering through ajax I'm doing. So something is wrong here. I guess it replaced some white spaces I had to UTF-8. It didn't help with the problem I had either - but this should be expected as the patch doesn't remove the U2028 characters. I'm too tired to think straight so I'll look again on this tomorrow.

#44

Try running it through this function instead and see if it helps (this is how D7 does it):

<?php
function drupal_json_encode($var) {
 
// json_encode() does not escape <, > and &, so we do it with str_replace().
 
return str_replace(array('<', '>', '&'), array('\u003c', '\u003e', '\u0026'), json_encode($var));
}
?>

#45

Hi Roger. #44 works for me. I'll use this instead of my current patch #41. Many thanks.

#46

That's great :) Does running it through the patched drupal_to_js (from #42) work too?

#47

No, unfortunately it doesn't. See #43.
Looking again at the patch at #42 I think I spotted a typo: You replace "\x0a" with '\u000e' instead of '\u000a'. Once I fixed it, I no longer have the behavior I describe at #43.

#48

Ah, thanks - I've changed that character. Try attached patch again, please.

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch1.31 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped_5.patch.View details

#49

Status:needs review» needs work

You missed \x09 and \x19.

In addition to the characters encoded by json_encode(), the function now also encodes < > & " / ', but the Doxygen comment only mentions the three former characters. I don't think we should encode more characters in D6 than in drupal_json_encode() D7. If the three latter characters are considered “unsafe”, they should be encoded in D7 as well.

The last replacement pair is "\'" => 'u0027'. Shouldn't this be "'" => '\u0027', i.e. skip the unnecessary backslash in the first array, and add a missing backslash in the second?

#50

Roger, it works ok for me, unless I have the separator u2028 (and possibly similar separators). For this to work, you can either use my #41 or the drupal_json_encode code per #44.

#51

Status:needs work» needs review

"You missed \x09 and \x19."
Actually, they don't exist - they use octal notation (I think?) which means that after 8 the next 'digit' is "a".

"I don't think we should encode more characters in D6 than in drupal_json_encode() D7."
I'm encoding them here at the suggestion of Heine in #27 and #32, who knows far more about this stuff than I do.

"The last replacement pair is "\'" => 'u0027'. Shouldn't this be "'" => '\u0027', i.e. skip the unnecessary backslash in the first array, and add a missing backslash in the second?"
It's weird, I know, but if the replacement is '\u0027' then running "\'" through the script returns "\\u0027". Keeping the backslash out of the replacement returns \u0027

I've added in a comment to Doxygen to detail the other replaced characters, and I've put the "\'" into a more logical place in the array.

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch1.58 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped_6.patch.View details

#52

re #50: when I run the following, I get \\u2028 as an output:
print json_encode("\u2028");

I get the same output when I do:
print drupal_to_js("\u2028");

Isn't this expected?

#53

\x[0-9A-Fa-f]{1,2} are hexadecimal :)

I don't really care about what the PHP json_encode is doing, but we need to follow the RFC, and then escape & <> to have HTML compatibility.

#54

Ah yes, hexadecimal. My brain wasn't supplying me with that word - too tired.

I've re-read the RFC to see if I've missed anything. Page 4 is the relevant page for our purposes; especially the line which tells us which characters don't need to be escaped:

    unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

This means we need to escape \x00 - \x1F, \x22 (a quotation mark "), \x5C (reverse solidus), \x2F (solidus), \x62 (backspace), \x66 (form feed), \x6E (line feed), and \x72 (carriage return). I'd missed some of those - I've added them in now (and added a line in the function comments to mention these). Hopefully I've got everything?

@bluetegu, this might catch your error?

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch1.75 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped_7.patch.View details

#55

And...just when I thought the function was performing properly, I find that putting in this:

print drupal_to_js("The quick brown fox jumped over the lazy dog");

gives this:

"The quick \u0008\u000Dow\u000A \u000Cox jumped ove\u000D \u0009he lazy dog"

It's late again and I'm too tired to figure out why this is messed up - help?

#56

From #54, I'd thought the following characters needed to be escaped: backspace, form feed, line feed and carriage return. That's correct - they do - and as Heine pointed out in #39, all of these are already handled in \x00 - \x1f so we can just take them out here (and, I'd done the character representation incorrectly - a backspace is \b or \x5C\x62 - I'd represented it as just \x62 - which is why "The quick brown fox jumped over the lazy dog" got garbled).

So, the new lines should look like this:

<?php
     
return '"'. str_replace(array("\\", "<", ">", "&", '"', '/', "\'", "\x00", "\x01", "\x02", "\x03", "\x04", "\x05", "\x06", "\x07", "\x08", "\x0A", "\x0B", "\x0C", "\x0D", "\x0E", "\x0F", "\x10", "\x11", "\x12", "\x13", "\x14", "\x15", "\x16", "\x17", "\x18", "\x1A", "\x1B", "\x1C", "\x1D", "\x1E", "\x1F"),
                              array(
'\\\\', '\u003c', '\u003e', '\u0026', '\u0022', '\u002F', 'u0027', '\u0000', '\u0001', '\u0002', '\u0003', '\u0004', '\u0005', '\u0006', '\u0007', '\u0008', '\u000A', '\u000B', '\u000C', '\u000D', '\u000E', '\u000F', '\u0010', '\u0011', '\u0012', '\u0013', '\u0014', '\u0015', '\u0016', '\u0017', '\u0018', '\u001A', '\u001B', '\u001C', '\u001D', '\u001E', '\u001F'),
                             
$var) .'"';
?>
AttachmentSizeStatusTest resultOperations
479368_drupal_to_js.patch680 bytesIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_2.patch.View details

#57

"\'" should be "'"

The replacement '\\\\' will work, but '\u005C' would be more consistent.

You forgot \x09, \x10 and \x19 and the corresponding replacements.

#58

I've added in \x09 and \x19 (\x10 was already there ;)

Have changed the replacement '\\\\' to '\u005C' for consistency.

As for "\'"...that's what led me to this thread in the first place - I had a node title which wasn't having correct JSON return by drupal_to_json: "This isn't a node title" was returning "This isn\'t a node title" because of the addslashes function. Since we've taken out addslashes() there's no need to replace "\'" with anything.

So ' stays ' and \' becomes \u005C' - which is what we want.

<?php
return '"'. str_replace(array("\\", "<", ">", "&", '"', '/', "\x00", "\x01", "\x02", "\x03", "\x04", "\x05", "\x06", "\x07", "\x08", "\x09", "\x0A", "\x0B", "\x0C", "\x0D", "\x0E", "\x0F", "\x10", "\x11", "\x12", "\x13", "\x14", "\x15", "\x16", "\x17", "\x18", "\x19", "\x1A", "\x1B", "\x1C", "\x1D", "\x1E", "\x1F"),
            array(
'\u005C', '\u003c', '\u003e', '\u0026', '\u0022', '\u002F', '\u0000', '\u0001', '\u0002', '\u0003', '\u0004', '\u0005', '\u0006', '\u0007', '\u0008', '\u0009', '\u000A', '\u000B', '\u000C', '\u000D', '\u000E', '\u000F', '\u0010', '\u0011', '\u0012', '\u0013', '\u0014', '\u0015', '\u0016', '\u0017', '\u0018', '\u0019', '\u001A', '\u001B', '\u001C', '\u001D', '\u001E', '\u001F'),
           
$var) .'"';
?>

Yes?

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch1.63 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped_8.patch.View details

#59

The test status for the last patch is "postponed" - how can I get the testing server to test it?

#60

We definitely need to escape:

Next Line (\u0085)
Line Separator (\u2028)
Paragraph Separator (\u2029)

Note that this wasn't caught by your tests above because you can't test for these by adding "\u2028" -- you have to use "\xE2\x80\xA8".

I found this issue because I copied and pasted a Paragraph Separator into a CCK label unknowingly, which broke AJAX in views (see #682778: Error: http://mysite.org/admin/build/views/ajax/add-item/comm_sites/default/field.).

I've rerolled the patch with these additions.

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch1.63 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped_9.patch.View details

#61

Re-submitted without the extra line I added by mistake.

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch1.63 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped_10.patch.View details

#62

Quite ugly + unmaintainable. Can we use strtr() with array notion instead of str_replace() ?

#63

Assigned to:Anonymous» mcarbone
Status:needs review» needs work

#64

Status:needs work» needs review

Here's a hopefully cleaner version.

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch1.4 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped_11.patch.View details

#65

Used patch from #64 on Drupal 6.17, seems to have cleared up the issues I was having (I believe with ampersand character in the content being converted to json.)

#66

Status:needs review» reviewed & tested by the community

Hi,

The patch from #64 also works for me on a server running D6.17 and PHP 5.1.6. Marking this as RTBC as we now have two reports of the patch working.

The bug is a bit of a show-stopper for anybody not able to roll their own function using the PHP json_encode function in 5.2, so it would be great to see this patch committed soon (unless anybody spots a problem with it, of course) - running sites on patched versions of core makes me nervous!

#67

Status:reviewed & tested by the community» needs review

Needs RTBC by Heine or c960657.

#68

Status:needs review» needs work

We definitely need to escape:

Next Line (\u0085)

Why? AFAICT this character is not mentioned in ECMA-262, contrary to the two other that you mention.

Isn't html_entity_decode('&#'. $i .';', ENT_NOQUOTES, 'UTF-8') equivalent to just chr($i) when $i < 127 ?

+ $replace_pairs

+      // Remaining unicode replacements.
+      for ($i = 0; $i < 32; $i++) {
How about saving storing $replace_pairs in a static variable to prevent this for loop from running in each call?

#69

Status:needs work» needs review

Weird -- it's been six weeks since I wrote the patch, but I seem to remember trying chr(). Well, in any case, it's there now and it seems to be working fine.

Removed the Next Line character and added static variable usage.

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch1.42 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped_12.patch.View details

#70

Status:needs review» needs work

+      static $replace_pairs = array();
+
+      if (empty($replace_pairs)) {
The default is never used. The convention is to not specify a default and then check using if (!isset($foo)) {.

Single quote is not escaped.

+          "\\" => '\u005c',
+          "<" => '\u003c',
+          ">" => '\u003e',
+          "&" => '\u0026',
+          '"' => '\u0022',
+          '/' => '\u002f',
+          "\xe2\x80\xa8" => '\u2028', // Line Separator
+          "\xe2\x80\xa9" => '\u2029', // Paragraph Separator
I suggest you put all except the last two keys (and the missing key for the single quote) in single quotes.

BTW I compared the performance of str_replace() and strtr(), and in this case strtr() seems to be much faster (i.e. it should not be changed to str_replace() in #561422: Replace strtr() with str_replace() for db prefixing):

#71

Status:needs work» needs review

"The default is never used."

Are you sure about that? I was following the example of drupal_get_schema in the same file (http://api.drupal.org/api/function/drupal_get_schema/6). But a default is not needed here, so I made the change anyway.

Single quote added.

AttachmentSizeStatusTest resultOperations
479368_drupal_to_js_all_escaped.patch1.44 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 479368_drupal_to_js_all_escaped_13.patch.View details

#72

Status:needs review» reviewed & tested by the community

This works as I would expect.

#73

Looking for more testing feedback!

#74

Gábor, are you looking for more feedback from me.. or feedback from more testers?

#75

More testing from others, since it looks like a pretty substantial change.

#76

Additionally, when providing testing results, something more substantial than "This works as I would expect." is generally helpful for building confidence in core committers regarding the level of testing. What, specifically, were you seeing your site do before that it's not doing now? How did you explicitly go out of your way to break it that failed? What sort of kooky/unusual set-up did you try the patch on where it did not break? etc.

#77

Sorry webchick, here is an expanded report.

I applied the patch then installed jQuery Update with http://drupal.org/node/775924#comment-3274706 but made it always use jQuery 1.4.2

Tested draggable sort on manage fields section of edit content types, tested views and panels edit screens by adding sorting and removing fields etc.

I also tested views and panels with Uniform enabled http://drupal.org/project/uniform

I also did the above tests with vanilla D6 jQuery version and eveything worked as I would expect

#78

On a server running D6.19 and PHP 5.1.6. (so without the PHP json_encode function in PHP 5.2) supplying a JSON feed to an iphone app, the results go something like this:

Before: Plain vanilla D6.19
Example JSON output:
{ "umm": { "item": { "id": "10092", "language": "fr", "url": "http://example.com/fr/node/10092", "title": "\x3cp\x3eUn miraculé de Lourdes\x3c/p\x3e\n", etc
Result: iphone app developer sends emails along the lines of 'what's all this garbage in the feed, I can't work with this!' and storms off in a huff!

After: Patched D6.19 with the patch in #71
Example JSON output:
{ "umm": { "item": { "id": "10092", "language": "fr", "url": "http:\u002f\u002fexample.com\u002ffr\u002fnode\u002f10092", "title": "\u003cp\u003eUn miraculé de Lourdes\u003c\u002fp\u003e\u000a", etc
Result: Less disgruntled iphone app developer!

As well as encoding tags in \uXXX format as opposed to the problematic \x format, the slashes in the url item are encoded, which didn't happen before applying the patch (but that's just an observation, not a problem, as far as I can see).

#82

Looking to test this and want to be clear:

1) Apply patch at #71
2) Replace stock jquery with 1.4 (or use jquery update module with 1.4)
3) Try to break
4) record results

That about right?

#83

#6: do-479368-u-escape-sequences.patch queued for re-testing.

#84

I have installed the #71 patch and updated to jQuery 1.4.2 using the patched linked to at http://drupal.org/node/775924#comment-3274706. After several days of testing and use, I have yet to encounter any issues caused by this patch.

I have also written of this at http://echodittolabs.org/blog/2010/08/drupal-6x-jquery-142-new-possibili... and look forward to its inclusion in the core.

#85

@rlnorthcutt

Thats a good start

#86

Patch in #71 corrected the incompatibility for me in 6.19. Is this ready to be included in the next release, or can I do anything to help make it ready?

#88

Priority:critical» major
Status:reviewed & tested by the community» needs review

We are dealing with a number of circumstances:

  • At least two different parsers may process the JSON (drupal_add_js setting):
    • HTML parser
    • JSON parser
  • Clients we have no control over need to interpret the JSON: jquery versions, iphone apps, .NET classes, world+dog

This means we need to produce RFC4627 compliant JSON that, at the same time, will not have certain 'special characters' such as ', ", <, > and & be interpreted by an HTML parser.

As RFC4627-2.5 clearly states that "Any character may be escaped", we can avoid special treatment of characters ', ", <, > and & by an HTML parser through simple substitution with a Unicode escape sequence (\uXXXX).

A number of characters MUST be escaped for the JSON parser. These are:

  • "
  • \
  • U+0000 - U+001F

For a number of characters (eg "), it is possible to choose the escape form and either precede them with a backslash (JSON Escape Sequence, eg \"), or use the \uXXXX (Unicode escape sequence, eg \u0027) form. HOWEVER, because we also need to deal with the HTML parser, _it_ may interpret the quote before the the JSON parser even has the chance to run. Because of this, it is advisable to use the Unicode escape sequence (\uXXXX) here as well.

Right now, drupal_to_js is not RFC4627 compliant. This causes problems with a number of non-core JSON consumers.

Violations:

  • Control characters such as U+001F are not escaped
  • It uses the undefined escape sequence \xXX, which only works on certain JSON implementations (likely, only JS eval)
  • It improperly uses the non-existing JSON escape sequence \0 for the 0 byte instead of the Unicode escape sequence \u0000

Additional problem:

  • It risks interpretation of certain characters as special by the HTML parser due to its use of addslashes (\")

Now, enough about the problems of the current implementation, on to the patch.

It correctly escapes U+0000 - U+001F (Ascii 0 - 31) and the potential special HTML characters.

AFAIK we do not have to escape /, U+2028 and U+2029 according to the specs, but naive JSON parsers (eval) may have trouble with it.

As there's no harm in escaping these characters, I've kept them in the revised patch. All this reroll did was to expand the U+0000 - U+001f range and add comments.

AttachmentSizeStatusTest resultOperations
do479468-drupal_to_js-expanded-comments.patch2.68 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch do479468-drupal_to_js-expanded-comments.patch.View details

#89

Now without the static.

AttachmentSizeStatusTest resultOperations
do479468-drupal_to_js-expanded-comments-nostatic.patch2.51 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch do479468-drupal_to_js-expanded-comments-nostatic.patch.View details

#90

Version:6.x-dev» 7.x-dev
Status:needs review» patch (to be ported)

Back to D7 for \, ", ' and / that we don't escape there.

#91

Status:patch (to be ported)» needs review
AttachmentSizeStatusTest resultOperations
do479468-drupal_json_encode-D7.patch3.57 KBIdlePASSED: [[SimpleTest]]: [MySQL] 23,402 pass(es).View details

#92

json_encode uses JSON escape sequences (\") instead of our preferred Unicode escape sequences (\uXXXX). As json_encode in PHP 5.2.x doesn't have support for the $options array, we need to ditch the function and implement json_encoding ourselves again.

#93

Hm, any reason not to replace '"' and '\'' manually as we were previously doing?

#94

Because we do not want to escape all quotes.

"key" : "value"

Quotes in key and value need to be escaped, but we have to leave the significant quots wrapping the key and value alone.

We can't use check_plain either, because the result of drupal_json_encode has to be consumed by pure JSON parsers.

#95

*sigh*

Ok, so we can probably have two versions:

  • On PHP >= 5.3.0, use json_encode() with JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP, postprocessing '\\\\' into '\u005c' and '\\/' into '\u002f'
  • Otherwise use our implementation

Are the postprocessing of \ and / really necessary, or can we get away without it?

#96

If the postprocessing of \ and / is really necessary, we should open a feature request against PHP to add support for that in later versions.

#98

Assigned to:mcarbone» Anonymous

#99

This seems to have stalled.. What still needs to be done?

#100

Well, can patch #71 be applied to drupal 6 core? I've had to patch core with this, and I'd love to not have to repatch at the next update, even though this doesn't affect user-facing interfaces in my case and wouldn't take my site down.

It sounds like there's a pretty serious interpreter issue that needs to be addressed, but I don't understand the impact of the various JS interpreters. It also seems like the conversation has moved from v6 to v7. Should this thread be split off so the issue can be addressed separately for both versions?

#101

I think we can get away with not escaping \ and / as they are not special in HTML. (/ at least not when not in in the </ or /> combo, and < and > are escaped).

(patch contains a BOM, so it's also a test for the test infrastructure)

AttachmentSizeStatusTest resultOperations
do479368-drupal_json_encode.patch4.22 KBIdleFAILED: [[SimpleTest]]: [MySQL] 24,802 pass(es), 2 fail(s), and 0 exception(es).View details

#102

Status:needs review» needs work

The last submitted patch, do479368-drupal_json_encode.patch, failed testing.

#103

Status:needs work» needs review

Lovely, json_encode options produce \uXXXX vs. \uxxxx. Adjusting both test and drupal_encode_json_helper to test and produce uppercase hexadecimals.

AttachmentSizeStatusTest resultOperations
do479368-drupal_json_encode.patch4.89 KBIdlePASSED: [[SimpleTest]]: [MySQL] 24,811 pass(es).View details

#104

That's for d7. Right?

#105

Tested #71 with jQuery 1.4.2.

Now I got:

jQuery.extend(Drupal.settings, { "basePath": "\u002f",

instead of:

jQuery.extend(Drupal.settings, { "basePath": "/",

Subscribing. Thanks

#106

The latest D6 patch for review is on #89.
The latest D7 patch for review is on #103.

#107

Thanx Heine for letting us know.

btw, patches in #88 and #89 got accidentally named 479468 instead of 479368

#108

Status:needs review» needs work

#89 works with d6's default jquery library, but draggable handles don't appear when managing fields of content types using jQuery 1.4.2.

#109

Just for reference, there is a working patch for tabledrag.js at http://drupal.org/node/893538

#110

Status:needs work» needs review

jQuery updates are NOT SUPPORTED. Random breakage due to swapping out core components is in fact to be expected.

Please focus on the JSON produced.

#111

Status:needs review» needs work

On instigation by chx; needs a test for drupal_json_helper as well or we lose coverage on PHP 5.3 testservers.

#119

Status:needs work» needs review

Patch adds the additional test for the helper to keep coverage on PHP 5.3+

See the following comments for background.

#479368-88: D7: Create RFC compliant HTML safe JSON
#479368-92: D7: Create RFC compliant HTML safe JSON

Any followup not furthering this issue will be unpublished.

AttachmentSizeStatusTest resultOperations
do479368-drupal_json_encode.patch7.64 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch do479368-drupal_json_encode_1.patch.View details

#120

#71: 479368_drupal_to_js_all_escaped.patch queued for re-testing.

#121

Title:drupal_to_js provides bad unicode conversions» drupal_json_encode produces invalid JSON

#122

Title:drupal_json_encode produces invalid JSON» Create RFC compliant HTML safe JSON

Better title.

This issue is most urgent for D6 and interaction with external JSON consumers:

drupal_json_encode produces JSON that risks being interpreted by HTML parsers (\" escape sequences).
drupal_to_js (D6) produces total garbage for RFC compliant parsers (iPhone, random APIs, later jQuery versions).

#123

Title:Create RFC compliant HTML safe JSON » drupal_to_js provides bad unicode conversions
Status:needs review» reviewed & tested by the community

Works and looks great.

#124

Title:drupal_to_js provides bad unicode conversions» Create RFC compliant HTML safe JSON

Crosspost

#125

Status:reviewed & tested by the community» needs review

Needs another review IMO.

#127

Drupal 6 backport with a testbot safe name.

AttachmentSizeStatusTest resultOperations
drupal_to_js_proper_json-D6.patch2.6 KBIgnored: Check issue status.NoneNone

#128

+1 for the D6 patch, much needed as commonly used features are exposed to bugs that are hard to track (such as Ajax-enabled Views). I've seen U+2028 characters in text fields if copy-paste is used with some editors.

#129

that's just the problem i was trying to solve, a U+2028 character breaking Views Ajax filtering in a D6 site. i applied the last patch from #127 and it just shifted the issue - instead of getting a failure on the Ajax call, there's a syntax error later on when it tries to use the response result. i had to change these lines:

        // While these are allowed unescaped according to ECMA-262, section
        // 15.12.2, they cause problems in some JSON parsers.
        "\xe2\x80\xa8" => '\u2028', // U+2028, Line Separator.
        "\xe2\x80\xa9" => '\u2029', // U+2029, Paragraph Separator.

to set those to blanks instead of the escaped values.

i think the gist of this may be outside the scope of this issue, but i can try to make a simpler test case if anyone would want to see it.

#130

...well if this translates to potential issues in the future, I'd be one of them people that'd care to see a test case (and a solution eventually).

#131

#71: 479368_drupal_to_js_all_escaped.patch queued for re-testing.

#132

I've created #1086098: D6 JSON is total garbage - Create RFC 4627 compliant HTML safe JSON as the issue is most problematic in D6 (per #122). Maybe it can get some traction there.

#133

brad, do you know the exact location of the syntax error? This might be a problem that should be solved by not offering e2 80 a8 and friend in _this_ response in the first place.

#134

it's the eval() at line 155 of views/js/ajax_view.js if that's of use.

#135

Yes, then the problem IMO is in the entity calling drupal_json. callback should be a valid JS callback.

#136

#71: 479368_drupal_to_js_all_escaped.patch queued for re-testing.

#137

Do the d.o infrastructure have PHP 5.3 slaves? Anyway this good to go

#138

#119: do479368-drupal_json_encode.patch queued for re-testing.

#139

Status:needs review» needs work

The last submitted patch, do479368-drupal_json_encode.patch, failed testing.

#140

Version:7.x-dev» 8.x-dev

#141

Assigned to:Anonymous» keichee
Status:needs work» needs review

my patch :)

AttachmentSizeStatusTest resultOperations
drupal_json_encode_0[1].patch590 bytesIdleFAILED: [[SimpleTest]]: [MySQL] 32,789 pass(es), 2 fail(s), and 0 exception(es).View details

#142

Status:needs review» needs work

The last submitted patch, drupal_json_encode_0[1].patch, failed testing.

#143

okay, as per requested by the testing module, patch will just escape <, > and & only.

AttachmentSizeStatusTest resultOperations
drupal_json_encode.patch595 bytesIdleFAILED: [[SimpleTest]]: [MySQL] 32,773 pass(es), 2 fail(s), and 0 exception(es).View details

#144

Status:needs work» needs review

#145

Status:needs review» needs work

The last submitted patch, drupal_json_encode.patch, failed testing.

#146

Status:needs work» needs review

added <, >, ', " and & to simpletest test module to comply to drupal_json_encode escape function

please test attached.

AttachmentSizeStatusTest resultOperations
drupal_json_encode.patch1.41 KBIdleFAILED: [[SimpleTest]]: [MySQL] 32,777 pass(es), 1 fail(s), and 0 exception(es).View details

#147

Status:needs review» needs work

The last submitted patch, drupal_json_encode.patch, failed testing.

#148

fine, remove that backslash for double quote

AttachmentSizeStatusTest resultOperations
drupal_json_encode.patch1.41 KBIdleFAILED: [[SimpleTest]]: [MySQL] 32,779 pass(es), 1 fail(s), and 0 exception(es).View details

#149

Status:needs work» needs review

#150

Status:needs review» needs work

The last submitted patch, drupal_json_encode.patch, failed testing.

#151

Status:needs work» needs review

#146: drupal_json_encode.patch queued for re-testing.

#152

Status:needs review» needs work

hmm tweaked simple test a little bit. hopefully to see green

AttachmentSizeStatusTest resultOperations
drupal_json_encode.patch1.39 KBIdleFAILED: [[SimpleTest]]: [MySQL] 33,575 pass(es), 2 fail(s), and 0 exception(es).View details

#153

Status:needs work» needs review

#154

So #152 looks great but that requires 5.3 - however D8 now has a PHP 5.3 requirement so it should be fine for that.

#119 needs a re-roll for D7, no longer applies. Also looks like it needs the test changes from #152 added to it (adding '\').

#155

Here's that reroll of #119 for D7. (Edit: missed the changes from #152.)

AttachmentSizeStatusTest resultOperations
drupal-json-encode-479368-155-d7.patch7.65 KBIgnored: Check issue status.NoneNone

#156

Ah, the character that needs to be added is a single quote, not a slash.

#157

Setting to 7.x temporarily to test the D7 patch with the single quote added.

AttachmentSizeStatusTest resultOperations
drupal-json-encode-479368-157.patch7.73 KBIdleFAILED: [[SimpleTest]]: [MySQL] 35,782 pass(es), 2 fail(s), and 0 exception(es).View details

#158

Version:8.x-dev» 7.x-dev

#159

Version:7.x-dev» 8.x-dev

#157 passes, so moving back to 8.x.

#160

Status:needs review» reviewed & tested by the community

Oh I see it's an escaped single quote, that'll teach me to review patches from my phone.

A double quoted single quote by itself is no more readable, at least I spotted something was missing I guess...

#157 I haven't properly reviewed but it looks like Heine and Damien have. One thing we should do is at least sync the json_encode arguments with D8 so they're more easily scannable:

+    return json_encode($var, JSON_HEX_QUOT | JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS);

vs.

+  return json_encode($var, JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP);

So #152 looks RTBC for D8 to me now that we required 5.3 there, so let's get that in then it looks like at most minor changes to the D7 patch to get that ready.

#161

In #152, I don't understand why the double quote is excluded from the patch. I mean, I see that earlier tests broke with it, but is there a logical reason it's excluded?

Looking over it again, I think #157 needs the single quote added to the helper test's $html_unsafe in addition to the change catch suggests. Also, the same question about the double quote applies there. The 5.3 call to json_encode() includes the double quote as an argument, but the 5.2 switch statement does not include it in the same section as < > ' &.

#162

Status:reviewed & tested by the community» needs review

Tentatively back to NR until I can wrangle an answer to #161 out of someone.

#163

i just copied the code from php.net documentation

i dont know why it fails when i include it in simpletests, thats why i removed it.. i guess its a simpletest bug or something?

#164

Status:needs review» needs work

Looking at the test results for the last one that failed, the test that failed was:

A JSON encoded string does not contain ". Other common.test 2343 DrupalJSONTest->testJSON()

This means that the double quote was not properly escaped. Removing it from the test fixes that test error, but there is an underlying problem escaping it. So, I think this is NW, then.

#165

Assigned to:keichee» xjm

#166

I see now why it is failing. The JSON-encoded string is delimited with double quotes, e.g.:

<?php
print json_encode('This is my string with " & \' < > characters', JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP);
?>

prints:
"This is my string with \u0022 \u0026 \u0027 \u003C \u003E characters"

with the double quotes at the beginning and end.

So rather than asserting that the string does not contain a double quote, it should assert that the double quote only appears at the beginning and end of the string?

#167

Status:needs work» needs review

Let's try this.

AttachmentSizeStatusTest resultOperations
json-encode-d8-479368-167.patch2.99 KBIdleFAILED: [[SimpleTest]]: [MySQL] 33,588 pass(es), 5 fail(s), and 0 exception(es).View details

#168

Status:needs review» needs work

The last submitted patch, json-encode-d8-479368-167.patch, failed testing.

#169

Oh, $json is getting re-defined as an array and not just the string. We need to test the string delimiters only before this. Reusing the same var seems not-best-practice, but I guess that'd be a separate issue.

#170

AttachmentSizeStatusTest resultOperations
json-encode-d8-479368-170.patch2.27 KBIdleFAILED: [[SimpleTest]]: [MySQL] 33,586 pass(es), 2 fail(s), and 0 exception(es).View details

#171

Status:needs work» needs review

#172

Status:needs review» needs work

The last submitted patch, json-encode-d8-479368-170.patch, failed testing.

#173

Status:needs work» needs review

#152: drupal_json_encode.patch queued for re-testing.

#174

So, the solution in #152 is now causing taxonomy autocomplete fields to fail, when it didn't as of July 3.

I'll see if #157 has the same problem now.

#175

Version:8.x-dev» 7.x-dev

#176

#157: drupal-json-encode-479368-157.patch queued for re-testing.

#177

A taxonomy patch just went in, switching to drupal_implode_tags().

#178

#179

Status:needs review» needs work

So I guess it is maybe failing because the str_replace() in drupal_implode_tags() matches double quotes and not the encoded variant? Do we need to patch that function as well? I don't think not encoding double quotes should be an option.

#180

When I actually test this in a browser, the patch does not seem to alter the behavior. E.g., if I have a term foo, bar, it is returned as an autocomplete suggestion when I type foo regardless of whether or not the patch is applied. So I think the conflict is with the new test for #1000736: Term reference autocomplete widget having problems handling terms with commas:

<?php
    $this
->assertRaw('{"\"' . $term2->name . '\"":"' . $term2->name . '"}', t('Autocomplete returns term %term_name after typing the first 3 letters.', array('%term_name' => $term2->name)));
?>

Without the patch, this is the text when I run the test:
{"\"xc0wcH6G, dfzKJQXl\"":"xc0wcH6G, dfzKJQXl"}

With the patch applied, this is the result:
{"\u0022vuzR52oT, jcy3G4Sp\u0022":"vuzR52oT, jcy3G4Sp"}

So we just need to change the assertion to match \u0022 instead of \".

#181

Attached is the same as #170, but adjusts the taxonomy test to match encoded rather than escaped double quotes wrapping the term.

AttachmentSizeStatusTest resultOperations
json-encode-d8-479368-181.patch3.16 KBIdlePASSED: [[SimpleTest]]: [MySQL] 33,585 pass(es).View details

#182

Version:7.x-dev» 8.x-dev
Status:needs work» needs review

#183

Just a comment fix to match #119.

AttachmentSizeStatusTest resultOperations
json-encode-d8-479368-183.patch3.38 KBIdlePASSED: [[SimpleTest]]: [MySQL] 33,606 pass(es).View details

#184

Version:8.x-dev» 7.x-dev

And, the D7 version with support for 5.2, based on Heine's patch in #119.

I ran JSON and taxonomy form tests for this locally on both 5.2 and 5.3; the tests passed. I also tested the functionality of autocomplete fields and they autocompleted properly with and without commas in the term names.

However, assuming this passes testbot, it would be good to have some people testing it in a situation where compliant JSON is required, e.g. iPhone. A review from Heine would also be good.

For the record, the D6 version of this is in #1086098: D6 JSON is total garbage - Create RFC 4627 compliant HTML safe JSON.

AttachmentSizeStatusTest resultOperations
json-encode-d7-479368-184.patch6.91 KBIdlePASSED: [[SimpleTest]]: [MySQL] 35,765 pass(es).View details

#185

Thanx Jess for your hard work on this ;)

#186

Tagging issues not yet using summary template.

#187

Summary added.

#188

Status:needs review» needs work

+++ b/includes/common.inc
@@ -4797,14 +4797,120 @@ function drupal_clear_js_cache() {
+  // We do not want to use drupal_static() since the PHP version will never
+  // change during a request.

Shorten to: "PHP version cannot change within a request."

+++ b/includes/common.inc
@@ -4797,14 +4797,120 @@ function drupal_clear_js_cache() {
+  // json_encode on PHP prior to PHP 5.3.0 doesn't support options.
...
+    // json_encode() escapes <, >, ', &, and " using its options parameter.
...
+  // In PHP versions less than 5.3.0, use our helper.
+  return drupal_json_encode_helper($var);

Let's replace the last comment with the first, and also move the second to the last, which clarifies what is actually different below 5.3, and why on earth we need that helper.

+++ b/includes/common.inc
@@ -4797,14 +4797,120 @@ function drupal_clear_js_cache() {
+      return $var ? 'true' : 'false'; // Lowercase necessary!
+    case 'integer':

(and elsewhere) There should be a blank line between switch case statements that do not fall-through to the next. See http://drupal.org/coding-standards

+++ b/includes/common.inc
@@ -4797,14 +4797,120 @@ function drupal_clear_js_cache() {
+        // 15.12.2, they cause problems in some JSON parser.

parser*s*

+++ b/includes/common.inc
@@ -4797,14 +4797,120 @@ function drupal_clear_js_cache() {
+      return '"'. strtr($var, $replace_pairs) .'"';
...
+        return '[ '. implode(', ', $output) .' ]';
...
+      return '{'. implode(', ', $output) .'}';

Bad string concatenation style.

+++ b/includes/common.inc
@@ -4797,14 +4797,120 @@ function drupal_clear_js_cache() {
+      if (empty ($var) || array_keys($var) === range(0, sizeof($var) - 1)) {

No space after "empty".

-3 days to next Drupal core point release.

#189

Version:7.x-dev» 8.x-dev
Status:needs work» needs review

Comment fix to D8 to bring it in line with @sun's comment change recommendation, plus an additional comment clarifying the use of \u0022 in the taxonomy autocomplete test.

AttachmentSizeStatusTest resultOperations
json-encode-d8-479368-189.patch3.74 KBIdleFAILED: [[SimpleTest]]: [MySQL] Repository checkout: failed to checkout from [git://git.drupal.org/project/drupal.git].View details
interdiff-183-189.txt1.46 KBIgnored: Check issue status.NoneNone

#190

And, finally, the d7 reroll, incorporating both the comment changes from #189 and sun's comment and code style fixes from #188. Phew!

AttachmentSizeStatusTest resultOperations
json-encode-479368-190-d7.patch7.26 KBIgnored: Check issue status.NoneNone
interdiff-184-190.txt4.04 KBIgnored: Check issue status.NoneNone

#191

#189: json-encode-d8-479368-189.patch queued for re-testing.

#192

Status:needs review» needs work

The last submitted patch, json-encode-d8-479368-189.patch, failed testing.

#193

Status:needs work» needs review

Rerolled for core/.

AttachmentSizeStatusTest resultOperations
json-encode-d8-479368-193.patch4.55 KBIdlePASSED: [[SimpleTest]]: [MySQL] 33,989 pass(es).View details

#194

Assigned to:xjm» Anonymous

#195

+++ b/sites/default/default.settings.php
@@ -316,7 +316,7 @@ ini_set('session.cookie_lifetime', 2000000);
- * theme. It is located inside 'core/modules/system/maintenance-page.tpl.php'.
+ * theme. It is located inside 'modules/system/maintenance-page.tpl.php'.

Is this correct for D8?

#196

Status:needs review» needs work

Nope. And... that's weird, and has nothing to do with the patch.

Edit: I bet default.settings.php was write-protected and so didn't get changed properly when I rebased. Oops! Thanks for the catch.

#197

Status:needs work» needs review

Fixed.

AttachmentSizeStatusTest resultOperations
json-encode-d8-479368-197.patch3.8 KBIdlePASSED: [[SimpleTest]]: [MySQL] 33,991 pass(es).View details

#198

Status:needs review» needs work

In #94 Heine explains why we dont want a dumb string replace in any version so the D7 version is good to go functionality but I would love to banish that abomination into a helper include.

The D8 version is OK because json_encode does the right thing for the Ux0000-Ux0019 characters and the few other special chars JSON-wise and with the new options it even becomes HTML safe.

#199

Version:8.x-dev» 7.x-dev
Status:needs work» needs review

Attached creates a new file, json-encode.inc. This is for D7 only.

The JSON and taxonomy autocomplete tests pass locally with PHP 5.2. Setting to D7 temporarily to make sure this patch passes the full test suite for D7.

The D8 patch is still in #197.

AttachmentSizeStatusTest resultOperations
json-encode-d7-479368-199.patch7.61 KBIdlePASSED: [[SimpleTest]]: [MySQL] 37,314 pass(es).View details

#200

Version:7.x-dev» 8.x-dev

#201

This is good to go.

#202

Status:needs review» reviewed & tested by the community

I mean...

#203

Status:reviewed & tested by the community» needs work

I reviewed this patch and it looks good. However, it needs a quick re-roll for the 'core'-directory changes.

#204

Status:needs work» reviewed & tested by the community

See #197. The D8 patch is available there.

#205

Version:8.x-dev» 7.x-dev

Oops, you're right xjm. Committed to 8.x and moving to 7.x for @webchick.

#206

Status:reviewed & tested by the community» fixed

Hm. Seems a bit ick to make a whole separate include file for just this one function. But I guess if it was good enough for Dries, it's good enough for me. ;) Thanks for the expanded test coverage.

Committed and pushed to 7.x.

#207

Version:7.x-dev» 6.x-dev
Status:fixed» patch (to be ported)
Issue tags:-needs backport to D7+needs backport to D6

Oopsie. Marking down to D6.

#208

Version:6.x-dev» 7.x-dev
Status:patch (to be ported)» fixed

D6 is in #1086098: D6 JSON is total garbage - Create RFC 4627 compliant HTML safe JSON due to having to support PHP 4.

#209

Title:Create RFC compliant HTML safe JSON» D7: Create RFC compliant HTML safe JSON

...minor title edit to make this clear(er) + removing the backport tag.

#210

Status:fixed» closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

#211

But after upgrading to D7.12, I have some problems with my ajax interface, which after ajax-reload returns html with russian textes. I use ajax_command_html for ajax output with php 5.2.17.
I've tried to use old version of the drupal_json_encode() function and it looks good...
and drupal_json_encode_helper() returns abracadabra instead of the normal readable text.
I think the problem is in drupal_json_encode_helper() function, lines 23-77...
But I'm not sure...

#212

@alweb: I'd suggest opening a new issue, with detailed steps to reproduce the problem you're having. Thanks!