Follow-up to #1825952: Turn on twig autoescape by default
Problem/Motivation
In #1825952: Turn on twig autoescape by default, we add a known list of "safe" strings that is populated from everything processed by String::checkPlain()
(and therfore every @
or %
placeholder passed to t()
and String::format()
), Xss::filter()
, various Views and other rendering functions, etc. This means in many cases we may have nested lists of safe strings, i.e. one safe string that is composed of several other safe strings that in turn is composed of several others... all in the static list. We have no idea how big this list might get at present.
Additionally, for form and batch processing, we are storing the known list of safe strings in the form/batch state to transfer it between requests. This has the potential to exacerbate existing form cache issues, especially if the list is very large.
Proposed resolution
Ensure that the list of strings does not become dangerously bloated for cache size or the memory footprint.
Remaining tasks
Do some profiling, particularly once other critical issues related to the form cache are resolved. Suggestions for profiling:
- The permissions page with more than 10 roles and standard profile.
- A large view, especially one with exposed filters.
- admin/people with more than 10 roles and 100+ users
- admin/content with 100+ nodes
- The block placement page.
- The menu page for a large menu
User interface changes
N/A
API changes
N/A unless something is broken.
Postponed until
Comment | File | Size | Author |
---|---|---|---|
#17 | 2295823.17.patch | 62.54 KB | alexpott |
#13 | 2295823.13.patch | 22.85 KB | alexpott |
Comments
Comment #1
yched CreditAttribution: yched commentedDo we really need to persist the list across form rebuilds ? Was performance really bad without it or is that some just-in-case / possibly premature optimization ?
Also not fully convinced of the interest of persisting across batch steps. Batches are mostly about processing rather than outputing strings than need to be sanitized, so the gain seems minor. OTOH, on a batch with thousands of steps that stores messages, serializing, unserializing the growing cache of safe strings could easily slow down processing as the batch progress.
Comment #2
xjmComment #3
xjmJust making sure this one doesn't slip through the cracks forever.
Comment #4
xjmComment #5
xjm#2488538: Add SafeMarkup::remove() to free memory from marked strings when they're printed is (possibly) showing problems from this.
Comment #6
xjmAdding some other/more specific scenarios.
Comment #9
xjmComment #10
xjmFiled: #2295823: Ensure that we don't store excessive lists of safe strings
Comment #11
xjmComment #12
star-szrLooks like that comment was meaning to reference #2506035: Do not add placeholders from SafeMarkup::format() to the safe list :)
Comment #13
alexpottThe patch attached combines:
On my test scenario of 11 roles and rendering
admin/people/permissions
it has the following affect:SafeMarkup::set()
is called 11,903 times less![edit removed wall time because it is misleading]
Comment #15
xjm@alexpott, awesome work.
Comment #16
Wim Leers\o/ alexpott \o/
Comment #17
alexpottHere's the combination of patches #13 not that they are all green.
Diff%
Diff
(microsec)
Diff%
Diff
(microsec)
Diff%
MemUse
Diff
(bytes)
Diff%
MemUse
Diff
(bytes)
Diff%
PeakMemUse
Diff
(bytes)
Diff%
PeakMemUse
Diff
(bytes)
Diff%
Comment #19
BerdirHow exactly are we *ensuring* this here?
I noticed that our form_state entries on a big page are *huge* (3-6MB), tracked it down to this list for a form that's rendered in post_render_cache/callback, so safe markup is full of viewed entities, views and so on. I guess #2506581: Remove SafeMarkup::set() from Renderer::doRender is going to help specifically with that.
Comment #20
xjmThis is currently covered in the scope of #2549943: [plan] Remove as much of the SafeMarkup class's methods as possible, so untagging.
Comment #21
joelpittetWe're no longer storing lists of strings. They are all passed around as Value Objects. This was done as a barrage of criticals just before RC.