UI for filter_xss allowed tags default
| Project: | Drupal |
| Version: | 7.x-dev |
| Component: | filter.module |
| Category: | feature request |
| Priority: | normal |
| Assigned: | NancyDru |
| Status: | won't fix |
In the wake of numerous SAs of late, I have been more liberally using filter_xss. One big problem with this is that the standard default allowed tags is quite limited and I keep having to add a long string of tags that I want to all my modules. This is just plain code bloat.
I feel that a much better solution is to allow the admins to specify which tags they want to allow (similar to "Filtered HTML"). Then module maintainers won't have to keep including those long strings.
This patch adds a new tab to the "Input formats" page to define the default allowed tags.
.
A marginally acceptable alternative would be to turn "_filter_html" into a public function. But that doesn't give the admin as much flexibility as this path does.
| Attachment | Size | Status | Test result | Operations |
|---|---|---|---|---|
| xss.patch | 3.86 KB | Ignored | None | None |

#1
There is a patch so changing status.
I'm against mentioning a function name in user-facing text:
+ '#description' => t('These HTML tags will not be removed by the filter_xss function. Be sure you trust all possible combinations and your users before including new tags.'),#2
This is an admin function. One would need "administer filters" permission to see it. This is also specific to that function, so I think that is important.
#3
Since when does http://drupal.org/node/28984 tell you to use filter_xss() ?
#4
Check_plain renders non-Latin characters as HTML entities, leaving non-English speakers with junk to look at when they enter their alphabet. You can even see this in other places like titles that end up with ' instead of single quotes.
#5
And, as a followup, the sentence "Be sure you trust all possible combinations and your users before including new tags." is also confusing to me. Does that mean that you should trust all combinations and interactions between added tags? Or all combinations of new tags with your users, which wouldn't make sense.
Seems like something like "Be sure that you trust your users to make proper use of all allowed tags." would be simpler and possibly clearer, unless I completely misunderstand the meaning of the original.
#6
Updated
#7
If you need to mix user-supplied plaintext with module supplied HTML, use check_plain.
This simply indicates that a string containing a single quote has been escaped twice. Which is a coder error, not a problem of check_plain. If you suspect a bug in check_plain the natural response is to file a bugreport, not work around it in some weird, error prone way.
In what context do you use filter_xss? If you need to display user-supplied rich text, why not associate a format with the input and use check_markup instead?
#8
I am a module developer, not the site admin. I cannot create an input format for them. It's not just single quotes, that was just an example. And I see this problem in core implementations as well, such as titles, block names, and links. Check_plain simply does not allow rich text. In addition to showing plain HTML, it also turns many non-English characters into entities. My module users simply won't accept this (look at some of the past issues in Glossary, for example). Further, when displaying things like taxonomy term descriptions, there is no input format associated with the text.
The only way I have found so far that makes my international users happy is
decode_entities(filter_xss($term->description, $allowed))(in PHP 5, one may use htmlspecialchars_decode).#9
"Further, when displaying things like taxonomy term descriptions, there is no input format associated with the text."
Then we should add an input format to that field.
#10
I would accept that, but considering the imminent freeze, who's going to write that code? And I shudder to think of someone being allowed to insert PHP code into a term or vocabulary description.
#11
The freeze isn't imminent, it's unspecified. Could be any time within the next 1-5 months.
PHP in a term or vocabulary description is down to the PHP module being installed and appropriate permission - it's no worse than putting php into a block IMO.
#12
I think sufficient alternatives have been proposed.
#13
Which would they be?
Input formats for taxonomy terms and vocabularies? That doesn't exist and I have doubts that Dries would buy it.
A bug report on check_plain? I don't think there is a bug - it works as designed and does what is needed in some cases. It simply is inadequate for non-English and rich-text as my users tell me over and over again.
Create another input format? That is pretty much what I am doing with this patch, but without the chances of PHP. [And I agree that it shouldn't be available.]
Turn "_filter_html" into a public function? As I suggested this is a marginally acceptable solution since it is largely check_markup($text, DEFAULT_FILTER_FORMAT) with slightly less overhead (but, IMHO, a performance bug of its own).