In various places, titles are double-escaped, such as with check_plain(). For example, if the title is "Liam's Test", it gets escaped to Liam's Test
(which is what should happen before display) then to Liam's Test
.
Steps to recreate:
- Create a new web page with a title contained a single quote.
- View the source of the node: the span element inside the node markup with property="schema:name"
will have double-escaped @content value.
Use Google's RDFa Parser to test.
There are reports of this problem in various places and I suspect they are all caused by the same core bug:
#1189550: RSS Style plugin escapes feed title twice
#1302760: ampersand from Title gets escaped (or double escaped?)
#1363358: Shortcut set titles are double-escaped with check_plain()
#1468014: Double Escapes the Title? Escape characters showing up in the node title.
#1955072: Node/feed item titles are double escaped
Comment | File | Size | Author |
---|---|---|---|
#27 | core_2011918_title_double_escape_27.patch | 1.91 KB | jesse.d |
#27 | interdiff.txt | 1.48 KB | jesse.d |
#26 | core_2011918_title_double_escape.patch | 446 bytes | Liam Morland |
#23 | 2011918_23_title_double_escaped_D8.patch | 1.55 KB | jesse.d |
#21 | 2011918_20_title_double_escaped_D8.patch | 1.33 KB | jesse.d |
Comments
Comment #1
Liam MorlandComment #2
David_Rothstein CreditAttribution: David_Rothstein commentedI'm not sure there's a central place that would be doing the double escaping ... seems more likely that maybe there are different causes for each? A lot of times people use @-style placeholders in t() somewhat blindly (which is usually exactly what they should be doing) but if the thing they're inserting has already been escaped previously, then doing that blindly is where the double-escaping would come from.
Comment #3
Liam MorlandIt looks to me like Drupal is not consistent about when a string is plaintext and when it is HTML. At any given moment in the page-build process it needs to be clear which it is.
It certainly could be different causes for each. These issues are related in that it is the same essential problem in each case.
I'm pretty sure this should not be solved by un-escaping the strings to un-do excessive escaping.
In any case, there is a core bug here since the issue is observed in the dc:title meta element.
Comment #4
David_Rothstein CreditAttribution: David_Rothstein commentedNot un-escaping, just using t('!string') rather than t('@string') to avoid double-escaping if you're in a situation where you know the string is already pre-sanitized.
That part should probably be moved to the RDF core issue queue component, but in any case my guess is it's because rdf_preprocess_node() is doing this:
But $variable['title'] is presanitized by template_preprocess_node(), and Drupal sanitizes HTML attributes too when they're printed, so I think that's where the double-escaping comes from in this case. Probably the RDF module should use $variables['node']->title here instead?
Comment #5
Liam MorlandYes, the difference between ! and @ is subtle and it doesn't help that people may think it is OK to sanitize too much.
Your solution sounds reasonable to me, but I am really not familiar enough with that part of the code to know.
Comment #6
Liam Morland@#4: That fixes it for the meta element. Patch attached.
Comment #8
David_Rothstein CreditAttribution: David_Rothstein commented#6: core_2011918_title_double_escape.patch queued for re-testing.
Comment #9
David_Rothstein CreditAttribution: David_Rothstein commentedThat looks right to me.
Does this exist in Drupal 8 too? I think it probably would...
Comment #10
Liam MorlandD8 version.
Comment #11
scor CreditAttribution: scor commentedGood catch. Verified that this patch works on Google RDFa parser, before:
Liam's Test
, and after:Liam's Test
I also checked the user and taxonomy term code in rdf.module, and we use the raw value so we're fine (
$account->name
and$term->label()
).I'm uploading a patch that uses the label() method on the node, and which updates the node RDF test to include a single quote in the node title for the regular test. I'm curious to see if others would prefer to have it as a separate chunk of code at the end of the regular node test (create a new node for the purpose of testing the single quote).
Comment #12
chrisshattuck CreditAttribution: chrisshattuck commentedI tested the patch, and it works on the
property="dc:title"
example.I feel like including the single quotation mark in the same test is efficient, but am curious if combining two tests is in general not as good of a practice for regression testing? (I'm just now learning more about testing, so this is a genuine question.)
Comment #13
chrisshattuck CreditAttribution: chrisshattuck commentedI may be missing something, but does the test modification need to add an assertion somewhere that tests if the double-escaping is happening?
Comment #14
scor CreditAttribution: scor commented@chrisshattuck not as it is... because the test would fail without the fix, it did fail on my machine. you are welcome to upload a patch with just the test hunk (without the rdf.module change) to illustrate that the test is effectively working (and failing).
Comment #15
Pycouz CreditAttribution: Pycouz commentedHi all!
There is the same problem in titles of RSS feed when using fields in Views. The
'
is modified to'
.Any help?
Comment #16
Liam MorlandHere is the Views issue: #1189550: RSS Style plugin escapes feed title twice
Comment #17
Pycouz CreditAttribution: Pycouz commentedSorry for the trouble. Thank you.
Comment #18
jesse.d CreditAttribution: jesse.d commentedThis looks good to me.
NodeAttributesTest passes locally.
In a manual test, dc:title content before:
Jesse's Test
and after:Jesse's Test
.And in Google RDFa parser: before:
Jesse's Test
and after:Jesse's Test
Comment #19
jesse.d CreditAttribution: jesse.d commented#11: 2011918_11_title_double_escape_D8.patch queued for re-testing.
Comment #20
jesse.d CreditAttribution: jesse.d commentedPatch failed to apply locally. Rerolling.
Comment #21
jesse.d CreditAttribution: jesse.d commentedThe updated patch is attached.
Comment #22
scor CreditAttribution: scor commentedneeds reroll after https://drupal.org/node/1323830#comment-7713521 was committed.
Comment #23
jesse.d CreditAttribution: jesse.d commentedRerolled after https://drupal.org/node/1323830#comment-7713521 was committed.
It looked good to me locally testing in google's rdfa parser.
Comment #23.0
jesse.d CreditAttribution: jesse.d commentedUpdated issue summary.
Comment #24
kay_v CreditAttribution: kay_v commentedThe patch fixes the problem. Checks out both in the browser and in Google's RDFa Parser.
Comment #25
alexpottI've run the changed test without the patch and it caught the error. With the fix the tests pass. Nice work!
Committed 5778af9 and pushed to 8.x. Thanks!
Comment #26
Liam MorlandD7 version (same as in #6). It does not have a test because the tests in the RDF module are completely different in D7 and I don't know how to port the test.
Comment #27
jesse.d CreditAttribution: jesse.d commentedD7 version including an updated test.
Comment #28
scor CreditAttribution: scor commentedLooks good to me. We should probably check that the new test fails when the fix isn't applied.
Comment #29
kay_v CreditAttribution: kay_v commentedconfirmed on fresh Drupal 7 site - all good to go, so marking RTBC.
Comment #30
Tor Arne Thune CreditAttribution: Tor Arne Thune commentedI've been using the patch from #26 on production for some weeks now.
+1 to the RTBC.
Comment #30.0
Tor Arne Thune CreditAttribution: Tor Arne Thune commentedupdate with regard to previous commit and schema.org mappings
Comment #31
David_Rothstein CreditAttribution: David_Rothstein commentedI verified locally that the test fails without the rest of the patch, so good to go.
I fixed a code style issue on commit (missing spaces before and after the string-concatenation operator) and also added the code comment from the Drupal 8 patch (seemed like an equally useful comment for Drupal 7)...
Committed to 7.x - thanks! http://drupalcode.org/project/drupal.git/commit/6331211