This issue was brought up by Damien Tournoud at http://drupal.org/node/276111#comment-900679 . Basically, the current implementation of importing translations in Drupal requires the administrator to trust that the translation author included no malicious HTML tags. What is worse, the plural formula has to be trusted as well, since it is converted to PHP and eval()-ed.
As I've said on that issue:
We can certainly improve on validation of the plural formula, since we know the few characters which are allowed. For example, no letter other than 'n' is allowed, and comparisons, the ternary operator and parentheses are the basic building blocks. So we can quite clearly define the possible parts of a plural formula and hopefully lock it down enough. That should be another issue.
This is that other issue. I suggest we develop a regular expression which defines the allowed values in plural formulas and validate against that. This should remove the need to trust translators on the plural formula.
http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/l10n_server... should provide a decent overview of the existing plural formulas used in Drupal translation projects. That leads us to some allowed stuff:
- the letter n
- ( and ) parentheses
- numbers
- ? and : (the ternary operator)
- % (modulo division)
- != <= >= < > == comparisons
- && and || logical operators
- whitespace
Anything else I might have missed? Does this set of stuff allow for any kind of PHP injection?
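To make the proposal concrete, here is a minimal sketch of such a whitelist check, written in Python purely for illustration (Drupal itself is PHP, and this is not Drupal code). The names `ALLOWED_FORMULA` and `is_allowed_plural_formula` are hypothetical. Note that it only checks that every character belongs to an allowed token; it does not check that the formula is a well-formed expression.

```python
import re

# Hypothetical whitelist built from the list above. In verbose mode,
# whitespace outside character classes is ignored by the regex engine.
ALLOWED_FORMULA = re.compile(r"""
    (?:
        [ \t]          # whitespace
      | n              # the letter n
      | [0-9]          # digits of a number
      | [()?:%]        # parentheses, ternary operator, modulo
      | !=|<=|>=|==    # two-character comparisons
      | [<>]           # one-character comparisons
      | &&|\|\|        # logical operators
    )+
""", re.VERBOSE)


def is_allowed_plural_formula(formula: str) -> bool:
    """Return True only if the whole formula is made of whitelisted tokens."""
    return ALLOWED_FORMULA.fullmatch(formula) is not None
```

With this sketch, `n != 1` passes, while anything containing a semicolon, a quote, a lone `=`, or any letter other than `n` is rejected.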
Comments
Comment #1
meba commented
subscribe
Comment #2
gábor hojtsy commented
Ha, this was an entirely false alarm, sorry. If you look at http://api.drupal.org/api/function/_locale_import_parse_arithmetic/6 you'll clearly see that only the defined tokens, numeric values and 'n' are allowed in the formula, or the converter bails out and refuses to accept the formula altogether. Which is what it is supposed to do. Damien, please get back with concrete examples of how this can be tricked if you have them.
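For readers who want to see the bail-out behaviour in miniature, here is a rough sketch in Python. It is not the code of `_locale_import_parse_arithmetic` (which is PHP); the function name `tokenize_plural_formula` and the `OPERATORS` tuple are hypothetical, and the token set is assumed from the list in the issue description.

```python
from typing import List, Optional

# Hypothetical operator set, assumed from the issue description.
# Two-character operators are listed first so they win over "<" and ">".
OPERATORS = ("!=", "<=", ">=", "==", "&&", "||",
             "(", ")", "?", ":", "%", "<", ">")


def tokenize_plural_formula(formula: str) -> Optional[List[str]]:
    """Split a formula into tokens, or return None on the first unknown input."""
    tokens: List[str] = []
    i = 0
    while i < len(formula):
        ch = formula[i]
        if ch in " \t":
            i += 1  # skip whitespace
        elif ch == "n":
            tokens.append("n")
            i += 1
        elif ch in "0123456789":
            # Collect a run of digits as one numeric token.
            j = i
            while j < len(formula) and formula[j] in "0123456789":
                j += 1
            tokens.append(formula[i:j])
            i = j
        else:
            op = next((o for o in OPERATORS if formula.startswith(o, i)), None)
            if op is None:
                return None  # bail out: the formula is rejected as a whole
            tokens.append(op)
            i += len(op)
    return tokens
```

Anything outside the defined tokens, numbers and `n` makes the function return `None`, so nothing is left over that could later be passed to eval().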