In various places, titles are double-escaped, such as with check_plain(). For example, if the title is "Liam's Test", it gets escaped to Liam's Test (which is what should happen before display) then to Liam's Test.

Steps to recreate:
- Create a new web page with a title contained a single quote.
- View the source of the node: the span element inside the node markup with property="schema:name" will have double-escaped @content value.

Use Google's RDFa Parser to test.

There are reports of this problem in various places and I suspect they are all caused by the same core bug:
#1189550: RSS Style plugin escapes feed title twice
#1302760: ampersand from Title gets escaped (or double escaped?)
#1363358: Shortcut set titles are double-escaped with check_plain()
#1468014: Double Escapes the Title? Escape characters showing up in the node title.
#1955072: Node/feed item titles are double escaped

Files: 
CommentFileSizeAuthor
#27 core_2011918_title_double_escape_27.patch1.91 KBjesse.d
PASSED: [[SimpleTest]]: [MySQL] 40,251 pass(es).
[ View ]
#27 interdiff.txt1.48 KBjesse.d
#26 core_2011918_title_double_escape.patch446 bytesLiam Morland
PASSED: [[SimpleTest]]: [MySQL] 40,390 pass(es).
[ View ]
#23 2011918_23_title_double_escaped_D8.patch1.55 KBjesse.d
PASSED: [[SimpleTest]]: [MySQL] 57,886 pass(es).
[ View ]
#21 2011918_20_title_double_escaped_D8.patch1.33 KBjesse.d
PASSED: [[SimpleTest]]: [MySQL] 57,303 pass(es).
[ View ]
#11 2011918_11_title_double_escape_D8.patch1.33 KBscor
PASSED: [[SimpleTest]]: [MySQL] 56,726 pass(es).
[ View ]
#10 core_2011918_title_double_escape_D8.patch466 bytesLiam Morland
FAILED: [[SimpleTest]]: [MySQL] 56,549 pass(es), 1 fail(s), and 0 exception(s).
[ View ]
#6 core_2011918_title_double_escape.patch446 bytesLiam Morland
PASSED: [[SimpleTest]]: [MySQL] 40,474 pass(es).
[ View ]

Comments

Title:Titles are often doubled-escaptedTitles are often double-escaped

I'm not sure there's a central place that would be doing the double escaping ... seems more likely that maybe there are different causes for each? A lot of times people use @-style placeholders in t() somewhat blindly (which is usually exactly what they should be doing) but if the thing they're inserting has already been escaped previously, then doing that blindly is where the double-escaping would come from.

It looks to me like Drupal is not consistent about when a string is plaintext and when it is HTML. At any given moment in the page-build process it needs to be clear which it is.

It certainly could be different causes for each. These issues are related in that it is the same essential problem in each case.

I'm pretty sure this should not be solved by un-escaping the strings to un-do excessive escaping.

In any case, there is a core bug here since the issue is observed in the dc:title meta element.

Not un-escaping, just using t('!string') rather than t('@string') to avoid double-escaping if you're in a situation where you know the string is already pre-sanitized.

In any case, there is a core bug here since the issue is observed in the dc:title meta element.

That part should probably be moved to the RDF core issue queue component, but in any case my guess is it's because rdf_preprocess_node() is doing this:

<?php
    $element
= array(
     
'#tag' => 'meta',
     
'#attributes' => array(
       
'content' => $variables['title'],
       
'about' => $variables['node_url'],
      ),
    );
?>

But $variable['title'] is presanitized by template_preprocess_node(), and Drupal sanitizes HTML attributes too when they're printed, so I think that's where the double-escaping comes from in this case. Probably the RDF module should use $variables['node']->title here instead?

Yes, the difference between ! and @ is subtle and it doesn't help that people may think it is OK to sanitize too much.

Your solution sounds reasonable to me, but I am really not familiar enough with that part of the code to know.

Component:node system» rdf.module
Status:Active» Needs review
StatusFileSize
new446 bytes
PASSED: [[SimpleTest]]: [MySQL] 40,474 pass(es).
[ View ]

@#4: That fixes it for the meta element. Patch attached.

Status:Needs review» Needs work

The last submitted patch, core_2011918_title_double_escape.patch, failed testing.

Status:Needs work» Needs review

Title:Titles are often double-escapedTitles are often double-escaped (including in the content attribute of the dc:title meta element for nodes)

That looks right to me.

Does this exist in Drupal 8 too? I think it probably would...

Version:7.x-dev» 8.x-dev
StatusFileSize
new466 bytes
FAILED: [[SimpleTest]]: [MySQL] 56,549 pass(es), 1 fail(s), and 0 exception(s).
[ View ]

D8 version.

StatusFileSize
new1.33 KB
PASSED: [[SimpleTest]]: [MySQL] 56,726 pass(es).
[ View ]

Good catch. Verified that this patch works on Google RDFa parser, before: Liam&#039;s Test, and after: Liam's Test

I also checked the user and taxonomy term code in rdf.module, and we use the raw value so we're fine ($account->name and $term->label()).

I'm uploading a patch that uses the label() method on the node, and which updates the node RDF test to include a single quote in the node title for the regular test. I'm curious to see if others would prefer to have it as a separate chunk of code at the end of the regular node test (create a new node for the purpose of testing the single quote).

I tested the patch, and it works on the property="dc:title" example.

I feel like including the single quotation mark in the same test is efficient, but am curious if combining two tests is in general not as good of a practice for regression testing? (I'm just now learning more about testing, so this is a genuine question.)

I may be missing something, but does the test modification need to add an assertion somewhere that tests if the double-escaping is happening?

@chrisshattuck not as it is... because the test would fail without the fix, it did fail on my machine. you are welcome to upload a patch with just the test hunk (without the rdf.module change) to illustrate that the test is effectively working (and failing).

Hi all!

There is the same problem in titles of RSS feed when using fields in Views. The &#039; is modified to &amp;#039;.

Any help?

Sorry for the trouble. Thank you.

This looks good to me.

NodeAttributesTest passes locally.

In a manual test, dc:title content before: Jesse&amp;#039;s Test and after:Jesse&#039;s Test.
And in Google RDFa parser: before: Jesse&#039;s Test and after: Jesse's Test

Status:Needs review» Needs work

Patch failed to apply locally. Rerolling.

Status:Needs work» Needs review
StatusFileSize
new1.33 KB
PASSED: [[SimpleTest]]: [MySQL] 57,303 pass(es).
[ View ]

The updated patch is attached.

Status:Needs review» Needs work

needs reroll after https://drupal.org/node/1323830#comment-7713521 was committed.

Status:Needs work» Needs review
StatusFileSize
new1.55 KB
PASSED: [[SimpleTest]]: [MySQL] 57,886 pass(es).
[ View ]

Rerolled after https://drupal.org/node/1323830#comment-7713521 was committed.

It looked good to me locally testing in google's rdfa parser.

Issue summary:View changes

Updated issue summary.

Status:Needs review» Reviewed & tested by the community

The patch fixes the problem. Checks out both in the browser and in Google's RDFa Parser.

Status:Reviewed & tested by the community» Fixed

I've run the changed test without the patch and it caught the error. With the fix the tests pass. Nice work!

Committed 5778af9 and pushed to 8.x. Thanks!

Version:8.x-dev» 7.x-dev
Status:Fixed» Needs review
StatusFileSize
new446 bytes
PASSED: [[SimpleTest]]: [MySQL] 40,390 pass(es).
[ View ]

D7 version (same as in #6). It does not have a test because the tests in the RDF module are completely different in D7 and I don't know how to port the test.

StatusFileSize
new1.48 KB
new1.91 KB
PASSED: [[SimpleTest]]: [MySQL] 40,251 pass(es).
[ View ]

D7 version including an updated test.

Looks good to me. We should probably check that the new test fails when the fix isn't applied.

Status:Needs review» Reviewed & tested by the community

confirmed on fresh Drupal 7 site - all good to go, so marking RTBC.

I've been using the patch from #26 on production for some weeks now.
+1 to the RTBC.

Issue summary:View changes

update with regard to previous commit and schema.org mappings

Issue summary:View changes
Status:Reviewed & tested by the community» Fixed

I verified locally that the test fails without the rest of the patch, so good to go.

I fixed a code style issue on commit (missing spaces before and after the string-concatenation operator) and also added the code comment from the Drupal 8 patch (seemed like an equally useful comment for Drupal 7)...

Committed to 7.x - thanks! http://drupalcode.org/project/drupal.git/commit/6331211

Status:Fixed» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.