This is a forked off issue of Steven's comment on the installer localization patch. Here he suggests we should extend t() to support escaping and escaping + placeholder display of text without calling the actual functions. Much like db_query() does the escaping magic when needed.
My approach to the issue is to extend on what we have now, instead of redefining the prefixes as Steven indicated. %value would be kept for simple substitution, but %@value is introduced for placeholder substitution (so that check_plain() is called and em tags are added around the value by default). Pros of this approach is that a lot of code becomes more readable, and that the installer support would become easier (by using this syntax there, not by using the t() function itself). The con is that t() always performs a foreach on arguments, which takes a small amount of time.
The attached patch is only a small fraction of t() calls changed (only include files). Since this needs to be done manually, I would not go into modules and core themes before a method is agreed upon, so my work is not thrown out. I have also been unable to find only check_plain()-d arguments in include files, so not implemented support for them (yet?).
| Comment | File | Size | Author |
|---|---|---|---|
| #35 | fix.patch | 482 bytes | eaton |
| #28 | t.st_1.patch | 294.46 KB | Steven |
| #27 | t.st_0.patch | 294.46 KB | Steven |
| #25 | t.st.patch | 36.01 KB | Steven |
| #23 | Drupal.t.refactor.20060814.patch.txt | 270.43 KB | gábor hojtsy |
Comments
Comment #1
Steven commentedActually I already have a finished conversion of core sitting here. I just got busy with eaton's install patch :/.
For those who just tuned in:
This patch tries to address the fact that nearly every call to t() uses check_plain() or theme('placeholder') on its arguments (or at least that's what people should be doing). Putting theme('placeholder') calls inside a t()-argument array means a lot of nesting, a lot of long lines and code that is hard to read.
So, why not let the conversion be done by t() itself, and make st() identical? We already implemented a similar technique in install.inc's st() function (rather on a whim), which is hardcoded to pass its arguments through theme_placeholder(). This has the benefit of reducing code bloat, but forces everyone to use the emphasized placeholder style. You might have noticed that the installer tends to have lots of italics.
So, here's an even better solution: we let the first character of the substitution key define the type of value:
Note that contrary to Goba's solution, I chose to change all t() calls around by changing the meaning of "%value". Why? Because too many contributed modules (and some parts of core) still don't use placeholders properly. Using this convention, we can fix a bunch of possible XSS exploit vectors at once. Of course, those modules that already work will need small changes, but those authors should already be familiar with the needed steps. They can remove any theme('placeholder') or check_plain() calls of your own, or use the !-prefix.
I also picked ! as the prefix for as-is insertion because it denotes "warning" and "danger": you're not doing any escaping here.
Examples:
The @ prefix is very useful for st() / install.php in fact where we don't want to emphasize the name of the install profile (e.g. Drupal or Civicspace).
The only downside of this scheme is that it affects tons of translatable strings.
Comment #2
Steven commentedThat last example should of course be:
Comment #3
gábor hojtsyI intended to keep backwards compatibility but if it is for the sake of XSS fixing then so be it :) Translations will need to look out for a huge amount of small string changes. Maybe we can cook a simple script to help them update their translations? Hm. Anyway, I am fine with the changes. I have reviewed the patch, and the array_walk() approach with the new _t() and _st() seems to be fine. IMHO RTBC.
Comment #4
moshe weitzman commentedseems like a good idea to me. this will cause loads of modules to review/change their escaping, but i guess that could be considered a good thing.
Comment #5
Jaza commentedAre there any precedents from other applications, programming languages, or environments, that we should be considering when making our choice of placeholder prefixes? Although I think that Steven's logic behind the choices is sound, I'd still like to see some evidence that research has done into existing use of prefixes in the wider world, before we go ahead and commit a patch that will potentially break with already-established conventions.
One precedent that comes to my mind: the '@' symbol is the 'shut-up operator' (a.k.a. the 'error control operator') in PHP. I think that our proposed use of the '@' symbol for placeholders fits reasonably well with the way that PHP uses it (i.e. overriding the default 'safe behaviour'). Can't think of a precedent for the '!' character - except that some developers might confuse it with the PHP/C/Java/etc '!' (negation, or 'bang') operator. But the argument that '!' indicates 'warning' sounds like good enough justification to me.
Comment #6
drumm+1, except:
- No longer applies.
- There is line-wrapping (search for 'gator/opml').
Comment #7
chx commentedRerolled.
Comment #8
Steven commentedUpdated with HEAD (and fixed wrapping...).
Proposed update instructions:
Comment #9
dries commented1. Before this can be committed, I'd like to understand the performance implications of this change. After all,
t()is called a lot.2. Can someone summarize the advantages of this patch? (Or the disadvantages of the current approach.) At the end of the day,
st()looks like a small function and the code duplication might not that bad as the additional performance cost?Comment #10
dries commented(All in all I'm OK with this patch. I just want us to benchmark it properly, and weigh the advantages and disadvantages. Rushing his sounds like a bad idea.)
Comment #11
gábor hojtsy@Dries, the advantage of this patch is that escaping and placeholder marking will be a lot easier. We know that XSS is so common because it needs unnecessarily extra work to prevent it. Drupal has a nice SQL injection prevention layer with escaping SQL parameters as needed. This patch introduces the same concept to help prevent XSS.
Quoting from the link above:
Unfortunaltely I am going to be missing in action until tuesday, so I cannot help in performance testing of this patch, but the development benefits are big IMHO.
Comment #12
dries commentedI agree that the benefits are big, and I can see this patch go in if there are no significant disadvantages. Unfortunately, I won't be able to do performance testing myself this weekend. Maybe early next week.
Comment #13
Steven commentedAdvantages:
This is needed in code that runs both at installation time as well as run-time (think install-time/run-time requirements checking patch). The alternative would be for every string to make an
if elseor? :construct. That's not handy.Re performance: I don't have a good environment to benchmark at the moment.
Comment #14
dries commentedI don't have a good environment to benchmark at the moment.
We don't need a super-duper setup for this. The following might give us some gut feeling:
Comment #15
heine commentedPrepatch: 5923.72 ms
Postpatch: 22853.34 ms
Comment #16
heine commentedIgnore the above. Will redo with proper prepatch checks.
Comment #17
heine commentedNow done with proper prepatch checks:
Prepatch: 11363.03 ms
Postpatch: 22853.34 ms
Comment #18
Steven commentedHeine: Summing the values you pasted in your last followup, gives me the following totals:
before: 20182ms
after: 22863ms
This is quite a different picture. Also, looking at the benchmark from #15, where the xss checks were absent (and the total time was only 5900ms), shows that most of the work is spent in actually performing the xss checks, not in the surrounding code. The performance impact of the changes to t() are minimal enough IMO.
Comment #19
heine commentedSteven: How embarrasing; that's indeed an incorrect before value.
I've some averages & spreads from 10x 10.000 x iteration
time in ms
pcn = placeholder, check_plain, no check
Mostly small differences as seen by
+1 for this patch
Comment #20
gábor hojtsyReady to get committed then.
Comment #21
dries commentedWanted to benchmark this patch but it no longer applies. :/
Comment #22
gábor hojtsyDries, at your service! I checked out a CVS copy of Drupal for the date when the last patch was generated, applied the patch and then took an hour to fix all CVS conflicts. Updated patch attached.
Would be nice to get this reviewed and comitted soon. It is not easy to keep this up to date...
Comment #23
gábor hojtsyAlthough the previous patch would probably apply with some offsets (due to several changes in Drupal core lately), here is an up-to-date version. I would like to reiterate that this is an important security improvement, and also note that other patches are on hold awaiting the commit of this patch: Installer translations and proper watchdog translations are mine awaiting for a final decision on this.
Comment #24
drummI'll wait for Dries to do the promised benchmarks or until I hear otherwise.
Comment #25
Steven commentedUpdating with latest HEAD.
Comment #26
dries commentedDid you upload the proper patch? I was going to benchmark it, but the last version doesn't apply at all. Got 20+ failures or so ...
Comment #27
Steven commentedNot sure what happened there. I think my CVS tree might have been a bit messed up.
Trying again... I verified that it applies cleanly to a fresh CVS checkout.
Comment #28
Steven commentedProperly wrapping doxygen comments...
Comment #29
Steven commentedDries' benchmarks showed a slowdown of less than 1% on the front page. Committed to HEAD.
Comment #30
eaton commentedThis patch seems to have hosed things. with a fresh HEAD, I see weird errors on hitting install.php:
Warning: array_walk() [function.array-walk]: Unable to call <em>Database name() - function does not exist in /home/bonsai/public_html/test/includes/common.inc on line 601form element titles show the check_plain'd output of their HTML rather than the properly rendered text, and so on. I'm not even able to load the registration form to create the first user.
When I reverse this patch against HEAD, everything starts working again.
Comment #31
Anonymous (not verified) commentedi can't confirm the install problems - it worked fine for me, but i'm also seeing the form title issues eaton mentioned.
Comment #32
eaton commentedalso, the bug italicized this issue!
Comment #33
webchickBetter..?
Comment #34
eaton commentedYes, better, thanks webchick :)
Having dug around, part of the problem appears to be that the case for handling "!" prefixes was completely removed from the patch when the last version of it was rolled. Is this, er, intentional?
Comment #35
eaton commentedNo, I'm betting it wasn't.
Comment #36
webchick+1.. confirmed that HEAD is totally broken without this patch (install.php is just a blank screen for me), and applying it fixes stuff again.
Comment #37
Steven commentedCommitted to HEAD, thanks.
Comment #38
(not verified) commented