Part of meta issue

follow-up from #1696786: Integrate Twig into core: Implementation issue

Twig as it stands introduces a fair bit of overhead into the theme system. Fabianx indicated that a lot of this is from marking $variables as secure so they're not double escaped later.

Ideally, if Twig autoescape is going to be enabled, then we should just pass raw variables to it and let it do the work. This way, if a template doesn't print the date, or a link, or whatever might currently be check_plain()ed in preprocess, we're not spending all this time creating it for it to be never used. In general, we should be able to remove a large chunk of preprocess work, and just let Twig sort out variables on demand.

Doing this means that a PHPTemplate engine in contrib is going to have to add back a way to securely format variables, but I don't see a way around this if we don't want a serious performance regression.

Related

#1818266: [meta] A secure theme system (with twig)
#1382350: [meta] Theme/render system problems

Files: 
CommentFileSizeAuthor
#43 interdiff.txt2.16 KBjoelpittet
#43 1825952-43-twig-autoescape.patch7.96 KBjoelpittet
FAILED: [[SimpleTest]]: [PHP 5.4 MySQL] 50,650 pass(es), 6,247 fail(s), and 1,419 exception(s).
[ View ]
#41 1825952-twig-autoescape-41.patch6.68 KBmgifford
FAILED: [[SimpleTest]]: [PHP 5.4 MySQL] 40,356 pass(es), 12,690 fail(s), and 1,164 exception(s).
[ View ]
#39 1825952-twig-autoescape-39.patch6.69 KBjoelpittet
FAILED: [[SimpleTest]]: [PHP 5.4 MySQL] Drupal installation failed.
[ View ]
#30 1825952-twig-autoescape-30.patch39.89 KBjoelpittet
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed to login to test site.
[ View ]
#23 autoescape-on.patch1.72 KBjoelpittet
FAILED: [[SimpleTest]]: [MySQL] Drupal installation failed.
[ View ]
#13 interdiff.txt7.48 KBCottser
#12 1825952-12.patch9.32 KBCottser
FAILED: [[SimpleTest]]: [MySQL] Setup environment: Test cancelled by admin prior to completion.
[ View ]
#12 interdiff.txt5.37 KBCottser
#11 1825952-11-do-not-test.patch11.63 KBCottser
#11 interdiff.txt606 bytesCottser
#9 1825952-9-do-not-test.patch11.43 KBCottser

Comments

Issue tags:+security, +Perfomance, +Twig, +Twig engine

tagging

Issue summary:View changes

Updated issue summary.

Issue summary:View changes

Updated issue summary.

Issue tags:-Twig engine

Twig tag seems enough.

Status:Active» Postponed

Postponing this until #1757550: [META-63] Convert core theme functions to Twig templates is resolved.

Once the above is done, we will be able to create a single patch that removes all instances of check_plain, check_markup, etc, from content passed to Twig. We can't do that until Twig is actually rendering all of our output or it would be a major security regression.

Oh, but I did add this as one of the major steps in our roadmap so it won't get overlooked or forgotten :)

Priority:Normal» Major

Also raising to at least major, since this was a big win we were counting on with Twig, so makes sense to make use of it. :)

Isn't this dependent on #1818266: [meta] A secure theme system (with twig) too? Turning on auto-escape is not going to be simple, as far as I know, but there was some progress in that issue.

Priority:Major» Normal

Oops. I think I meant to mark the other issue as major. Thanks, David!

Some of the drupal_render() stuff elsewhere is handling the "don't prepare wasted variables in preprocess". I need to try to talk to Fabian about what other savings we could make with preprocess for things like format_date() which is a perennial killer for content listings.

Issue summary:View changes

Phrase capitalization.

Issue summary:View changes

Fabian little x. Silly case sensitive usernames.

Assigned:Unassigned» Cottser
Status:Postponed» Active

I'm going to try and pull a patch out based on the work done in @Fabianx's sandbox: https://drupal.org/sandbox/Fabianx/1819382

The patch will be based on a diff of the d8-core-1712444 and d8-core-1712444-v1 branches.

Status:Active» Needs work
StatusFileSize
new11.43 KB

Here is a rough initial version, brought over almost everything except for one unrelated inline comment and one change that seemed unrelated/unnecessary:

diff --git a/core/includes/form.inc b/core/includes/form.inc
index 5d6b32e..e84ff49 100644
--- a/core/includes/form.inc
+++ b/core/includes/form.inc
@@ -3426,7 +3426,7 @@ function theme_tableselect($variables) {
           // A header can span over multiple cells and in this case the cells
           // are passed in an array. The order of this array determines the
           // order in which they are added.
-          if (!isset($element['#options'][$key][$fieldname]['data']) && is_array($element['#options'][$key][$fieldname])) {
+          if (is_array($element['#options'][$key][$fieldname]) && !isset($element['#options'][$key][$fieldname]['data'])) {
             foreach ($element['#options'][$key][$fieldname] as $cell) {
               $row['data'][] = $cell;
             }

The only things that I changed from the sandbox were some tiny little coding standards things that I spotted along the way.

This also changes twig_escape_filter in vendor which we can't do obviously.

Putting -do-not-test.patch because this is quite broken - for example #type => html_tag gets escaped. Going to see if I can push this a bit further though.

+++ b/core/includes/bootstrap.incundefined
@@ -1406,6 +1406,17 @@ function check_plain($text) {
+function drupal_mark_safe($string) {
+  global $safe_strings;
+  $safe_strings[$string] = TRUE;
+  return $string;
+}
+
+function drupal_is_safe($string) {
+  global $safe_strings;
+  return isset($safe_strings[$string]);
+}

I had inlined those in the optimized version and we probably need to do that for the final and to be profiled version.

I wish for a PHP-preprocessor here :-D

+++ b/core/includes/common.incundefined
@@ -1365,7 +1365,9 @@ function l($text, $path, array $options = array()) {
+  $string = ('<a href="' . $url . '"' . $attributes . '>' . $text . '</a>');
+  $GLOBALS['safe_strings'][$string] = TRUE;

Yes, here I already inlined it.

+++ b/core/includes/common.incundefined
@@ -3873,7 +3875,7 @@ function drupal_render(&$elements) {
-  $output = $prefix . $elements['#children'] . $suffix;
+  $output = ($prefix) . $elements['#children'] . ($suffix);

This change is left-over cruft.

+++ b/core/includes/theme.incundefined
@@ -1122,7 +1122,8 @@ function theme($hook, $variables = array()) {
-  return $output;
+  $GLOBALS['safe_strings'][$output] = TRUE;
+  return ($output);

Again - inlined for speed.

+++ b/core/lib/Drupal/Component/Utility/String.phpundefined
@@ -29,7 +29,9 @@ class String {
+    $GLOBALS['safe_strings'][$string] = TRUE;
+    return $string;

Probably should inline them all, it is quite easy to see.

Could be wrapped in:

if $GLOBALS['use_safe_strings']

and use the same pattern everywhere if we decide autoescape is optional to keep same performance as before.

+++ b/core/lib/Drupal/Component/Utility/String.phpundefined
@@ -105,7 +107,7 @@ public static function format($string, array $args = array()) {
-    return strtr($string, $args);
+    return drupal_mark_safe(strtr($string, $args));

Should probably inline here.

+++ b/core/lib/Drupal/Component/Utility/Xss.phpundefined
@@ -71,7 +71,7 @@ public static function filter($string, $allowed_tags = array('a', 'em', 'strong'
-    return preg_replace_callback('%
+    return drupal_mark_safe(preg_replace_callback('%

Inline, use just one pattern, Fabianx!

+++ b/core/lib/Drupal/Core/Template/Attribute.phpundefined
@@ -30,7 +30,7 @@
-class Attribute implements \ArrayAccess, \IteratorAggregate {
+class Attribute extends \Twig_Markup implements \ArrayAccess, \IteratorAggregate {

I am not totally sure this is still needed - I think it was left-over.

It should be removed.

The check should be like for RenderWrapper in the twig_render_template.

+++ b/core/lib/Drupal/Core/Template/Attribute.phpundefined
@@ -73,6 +73,9 @@ public function offsetSet($name, $value) {
+    elseif ($value instanceof \Twig_Markup) {
+      $value = new AttributeString($name, (string)$value);

This should be removed.

Drupal should not rely on twig as engine (even if its the default one).

+++ b/core/lib/Drupal/Core/Template/AttributeValueBase.phpundefined
@@ -12,7 +12,7 @@
-abstract class AttributeValueBase {
+abstract class AttributeValueBase extends \Twig_Markup {

Please remove.

+++ b/core/lib/Drupal/Core/Template/AttributeValueBase.phpundefined
@@ -69,6 +69,8 @@ public function printed() {
-  abstract function __toString();
+  public function __toString() {
+    return parent::__toString();
+  }

No longer needed.

+++ b/core/lib/Drupal/Core/Template/TwigExtension.phpundefined
@@ -40,6 +40,8 @@ public function getFilters() {
       'passthrough' => new \Twig_Filter_Function('twig_raw_filter'),
       'placeholder' => new \Twig_Filter_Function('twig_raw_filter'),
+      // Helper filter used to replace twig's original raw() filter.
+      'twig_raw' => 'twig_raw',

This should be:

'twig_raw' => new \Twig_Filter_Function('twig_raw'),

Probably _very_ confusing now.

+++ b/core/modules/filter/filter.moduleundefined
@@ -711,7 +711,8 @@ function check_markup($text, $format_id = NULL, $langcode = '', $cache = FALSE,
-      return $cached->data;
+      // @todo: The caller is responsible that this is really safe.
+      return drupal_mark_safe($cached->data);

Inline it!

+++ b/core/modules/filter/filter.moduleundefined
@@ -752,7 +753,8 @@ function check_markup($text, $format_id = NULL, $langcode = '', $cache = FALSE,
-  return $text;
+  // @todo: The caller is responsible that this is really safe.
+  return drupal_mark_safe($text);

And another one to inline ...

That is all :-D.

Thanks for working on this!

StatusFileSize
new606 bytes
new11.63 KB

Thanks @Fabianx, that's great :)

First the fix for output coming from RenderWrapper, cleanup from #10 coming in the next patch. Blocks are still busted (encoded when they shouldn't be) currently.

StatusFileSize
new5.37 KB
new9.32 KB
FAILED: [[SimpleTest]]: [MySQL] Setup environment: Test cancelled by admin prior to completion.
[ View ]

Blocks are still broken but cleaned up as per #10 (other than the if $GLOBALS suggestion) and things are still working as they were before from what I can see. Smaller patch file too :)

@Fabianx or anyone else - I'm not clear why some strings are wrapped in parentheses, would love to know why that is. Should we be doing this for all of them or is there a rule/reason to this? Examples:

+++ b/core/includes/theme.inc
@@ -1122,7 +1122,8 @@ function theme($hook, $variables = array()) {
-  return $output;
+  $GLOBALS['safe_strings'][$output] = TRUE;
+  return ($output);
+++ b/core/lib/Drupal/Component/Utility/String.php
@@ -29,7 +29,9 @@ class String {
-    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
+    $string = (htmlspecialchars($text, ENT_QUOTES, 'UTF-8'));
+    $GLOBALS['safe_strings'][$string] = TRUE;
+    return $string;

Status:Needs work» Needs review
StatusFileSize
new7.48 KB

Correct interdiff. Since I forgot to name the last patch -do-not-test.patch, temporarily setting to needs review so I can cancel that test :)

Status:Needs review» Needs work

The last submitted patch, 1825952-12.patch, failed testing.

@Fabianx Would there be possible name collisions with the $GLOBALS suggestion?

#12: No, the parentheses are a left-over from drupal_mark_safe($output) -> inlining conversion and yes I worked on it in a hurry ...

#15: Yes, indeed, should probably use $_GLOBALS['drupal_safe_strings'] instead.

Adding a new global to track rendering state is absolutely not OK. We've been working hard to remove globals from Drupal. Adding in new ones is not cool.

Should this patch be moved to #1818266: [meta] A secure theme system (with twig)? Otherwise we are starting to repeat discussion that already happened there.

(That's why this issue was marked postponed on that one. The fact that #1818266: [meta] A secure theme system (with twig) has "meta" in the issue title is arguably a bit misleading...)

As I said already, it would be good to have the same "pattern" (including variable name) to be able to do MASS Search+Replace.

Globals are really only used for performance, we could also use the advanced drupal_static_fast pattern, which should be similar to the Globals.
( Just [n] function calls for [n] functions using this technique.)

Assigned:Cottser» Unassigned

I won't have time to look at this for a little while, unassigning for now.

Assigned:Unassigned» Fabianx

Its DrupalCon ...

Issue summary:View changes

Add related meta

A bit unclear on why we have to mark strings as 'safe'? Can't we escape everything and {{ safe_string|raw }}? I'm probably being naive but thought I'd ask.

Issue summary:View changes
Status:Needs work» Needs review
StatusFileSize
new1.72 KB
FAILED: [[SimpleTest]]: [MySQL] Drupal installation failed.
[ View ]

Ok in a crazy attempt to move this forward a bit I started writing some regular expressions... instead of a big patch.

This is what I came up with so far:

Tools: curl, ack, perl and git co = git checkout

ack check_plain -l --print0 | xargs -0 perl -pi -e 's#check_plain\(((?:\((?-1)\)|[^\(\)]++)+)\)#\1#g'
ack String::checkPlain -l --print0 | xargs -0 perl -pi -e 's#String::checkPlain\(((?:\((?-1)\)|[^\(\)]++)+)\)#\1#g'
ack "array_map\('check_plain'" -l --print0 | xargs -0 perl -pi -e "s|array_map\('check_plain', (.+?)\)|\1|g"
// Exclude everything in includes (except preprocess...)
git co -- core/includes/
// Exclude attributes
git co -- core/lib/Drupal/Core/Template/
// Do remove from preprocess in theme.inc!
@todo
// Turn on Auto-Escape
curl https://drupal.org/files/issues/autoescape-on.patch | git apply

I'm trying to ignore the idea of "safe" strings... though it will probably come up again.

Wondering if anybody can push this a bit further... the obvious catch is regions are getting escaped
    {{ page_top }}

FYI, the nasty looking perl regular expression just does some nested brace matching.

Status:Needs review» Needs work

The last submitted patch, 23: autoescape-on.patch, failed testing.

The last submitted patch, 23: autoescape-on.patch, failed testing.

Hmmm, the more I think about this issue the more like it seems we are trying to rob peter to pay paul.

We turn this auto-escape on and get:

  • A performance regression because all variables being printed will be escaped by default.
  • Removal of all checkPlain's with addition of all the mark_safe's.

All renderable arrays normally producing markup when printed would need to be "safe"/raw. This leaves all it's variables raw unless there is a twig template that they are being rendered into.

Sorry if this sounds :-(, trying to build a case in my head to make this work... is there any case where a Renderable Array would need to be escaped or can we handle that it it's template/theme function/post and render hooks etc? Maybe we can mark renderable arrays as unsafe in their hook_element_info/hook_theme definitions and assume the rest as safe?

Any direct call to theme() will generate a string that is unsafe. We've been trying to remove those through the conversions but there are still a number of them. So we'd need a way to mark those.

So far I've found the following need to be marked as safe:

  • Any variable that has been rendered in preprocess with theme(), render() or drupal_render().
  • Any object with a __toString(), DONE already in the patch. For example Attribute, RenderWraper, etc.
  • Any generated URL.
  • Any helper function that generates markup, for example String::placeholder.
  • Any variable that is already checkPlain'd explicitly.
  • And then, in general, any variable that contains markup explicitly.

Those are all the targets to be 'safe'/raw that I can think of.

Even if performance is a wash, IMO relying on Twig to handle escaping is a DX win as then Twig is behaving more like it does in every other system that uses it, which is a growing number. That makes Drupal theme adoption easier for developers used to Twig from some other CMS or framework.

@Crell, probably not a discussion for this thread on which frameworks do and don't have auto escaping on as a cursory search showed two that have it off due to reasons similar to Drupal and one has it off. Though if you see me on IRC ping me with a few examples of CMS's that do that because I'd like to read through how they go about it.

What I'd love to know from you and others, does that $GLOBALS['safe_strings'] look like a viable solution? And if not, maybe some other bright ideas? I imagine if strings were objects we could tag them as safe, but PHP doesn't do that...

One big blocker on this issue is the direct calls to theme(), render() and drupal_render() that we've being trying in twig conversion issues to convert to render arrays as much as possible but there are still cases where those happen in core. That would be an solveable problem though the other edge cases may need something like this $GLOBALS['safe_strings'] idea to get this issue moving, so if that's kosher I'll keep moving with that?

Irrelevant cursory search:

ProcessWire recommends auto-escape off because they too do the processing before:
http://modules.processwire.com/modules/template-twig-replace/

Kohana has auto-escape on because they leave everything to the developer:
https://github.com/jonathangeiger/kohana-twig/blob/master/config/twig.php

Contao auto-escape off:
https://contao.org/en/extension-list/view/twig.10020019.de.html

Status:Needs work» Needs review
StatusFileSize
new39.89 KB
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed to login to test site.
[ View ]

So here's a drastic approach. If I'm lucky it will pass install.

The plan here:
Completely remove TwigReference and it's cohorts so that |raw works.
Kill show and change hide to a filter help with that using #2114563: Remove TwigReference with the help of 'without' filter replacing 'show'/'hide' functions.
Rewrite TwigNodeVisitor to just for Arrays that look like renderable arrays to pass through twig_render_var.

Ideally, I'd just wrap all arrays in a RenderableArray object with a __toString() and drop the twig_render_var function too...

May have gone crazy... but at least I found what was hurting |raw, the node visitor converting too many things to Twig_Node_Print as referenced.

Status:Needs review» Needs work

The last submitted patch, 30: 1825952-twig-autoescape-30.patch, failed testing.

Status:Needs work» Needs review

30: 1825952-twig-autoescape-30.patch queued for re-testing.

Status:Needs review» Needs work

The last submitted patch, 30: 1825952-twig-autoescape-30.patch, failed testing.

If i'm reading that correctly, that means I win:) It installed just couldn't login which is progress:)

joel: Huh. I'm a bit surprised at that. Given what a big deal Twig and Fabien make about auto-escaping in Twig I wouldn't have expected many projects to bypass it. Although admittedly I don't know how many projects have the Russian-doll templating model we do.

After a couple more tests, seems that TwigReference is preventing {{ dump(_context|keys) }} from producing any output. So the patch in #30 with that print statement will return results in any twig file that gets rendered but without that patch it will print an empty array()

That along with {{ var|raw }} not working with TwigReference make that wrapper a blocker for this issue and I'll continue to keep it out. I may even postpone this issue on #2114563: Remove TwigReference with the help of 'without' filter replacing 'show'/'hide' functions. as it helps remove TwigReference.

Status:Postponed» Needs review

Seems the node visitor coming before auto-escape is causing raw filter not to work still. Not exactly sure why, but may be how the Auto-escape node visitor visits the nodes and looks for the raw filter. Could be a bug in the way it goes about finding the raw filter. Though if you set the priority from -1 to 1, raw seems to work as expected but all you get in the compile is this difference:

        // line 39
        echo "<div";
        echo twig_render_var(twig_escape_filter($this->env, (isset($context["attributes"]) ? $context["attributes"] : null), "html", null, true));
        echo ">
  ";

vs

        // line 39
        echo "<div";
        echo twig_escape_filter($this->env, twig_render_var((isset($context["attributes"]) ? $context["attributes"] : null)), "html", null, true);
        echo ">
  ";

Which indicates to me the raw filter is just a flag dictates when to escape or not during the 'node visitor' phase.

Would be nice to combine those two functions, though they do specific jobs so that would likely shoot ourselves in the foot.

StatusFileSize
new6.69 KB
FAILED: [[SimpleTest]]: [PHP 5.4 MySQL] Drupal installation failed.
[ View ]

I'm always keeping in the back of my mind, that maybe auto-escape on will not be a huge win.
Any markup we send to a variable in the template will need to be either explicitly marked as Safe or automaticcly done so. The automatic way may open up potential for security holes, the manual/explicit will be a PITA.

That being said, here's some automagic that does some of the variables. Also note, #38 is not resolved or tackled.
I made a subclass of Twig_Markup so I didn't have to type in the charset each time.

Status:Needs review» Needs work

The last submitted patch, 39: 1825952-twig-autoescape-39.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new6.68 KB
FAILED: [[SimpleTest]]: [PHP 5.4 MySQL] 40,356 pass(es), 12,690 fail(s), and 1,164 exception(s).
[ View ]

reroll.

Status:Needs review» Needs work

The last submitted patch, 41: 1825952-twig-autoescape-41.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new7.96 KB
FAILED: [[SimpleTest]]: [PHP 5.4 MySQL] 50,650 pass(es), 6,247 fail(s), and 1,419 exception(s).
[ View ]
new2.16 KB

@mgifford thanks for the re-roll. Here's some help that will remove a few fails, I like that it passes install that's a good sign:)

Status:Needs review» Needs work

The last submitted patch, 43: 1825952-43-twig-autoescape.patch, failed testing.