Using <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd"> and whitelisting attributes: @[class|title|property|about|xmlns*]

The attributes in the following XHTML+RDFa get stripped out:

<div xmlns:dc="http://purl.org/dc/elements/1.1/">
   <div about="/alice/posts/trouble_with_bob">
      <h2 property="dc:title">The trouble with Bob</h2>
      <h3 property="dc:creator">Alice</h3>
   </div>
</div>

(RDFa Example taken from http://www.w3.org/TR/xhtml-rdfa-primer/)

How can I configure WYSIWYG Filter so that xmlns*, about and property can be allowed?

Thanks!

Comments

markus_petrux

Priority: Critical » Normal
Status: Active » Fixed

Sorry, but RDFa is not supported. This filter supports only the options that you can see in the filter settings form. Extending this list is not as easy as it seems, because each HTML element, attribute or style property needs to be validated against certain rules that are hardcoded in the module.

I do not know of any other filter that supports RDFa, so I'm afraid you'll have to use the Full HTML input format, or use custom macros or something similar that get expanded after the WYSIWYG Filter has already processed the text.
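To illustrate the macro idea, here is a minimal sketch in Python (not Drupal code). The `[rdfa property=...]...[/rdfa]` syntax and the function name `expand_rdfa_macros` are hypothetical and are not part of WYSIWYG Filter or any Drupal module. It assumes the macro text passes through the earlier filter untouched because it contains no HTML.

```python
import re

# Hypothetical macro syntax, invented for this sketch:
#   [rdfa property=dc:title]The trouble with Bob[/rdfa]
# Only a simple prefix:name token is accepted as the property value.
MACRO = re.compile(
    r"\[rdfa property=([A-Za-z][\w-]*:[A-Za-z][\w-]*)\](.*?)\[/rdfa\]",
    re.DOTALL,
)


def expand_rdfa_macros(filtered_html: str) -> str:
    """Expand macros in text that an earlier filter has already sanitized."""

    def to_span(match):
        curie, body = match.group(1), match.group(2)
        # The regex already restricts curie to safe characters,
        # so it can be placed in the attribute value as-is.
        return '<span property="{}">{}</span>'.format(curie, body)

    return MACRO.sub(to_span, filtered_html)
```

Anything that does not match the strict pattern is left as plain text, so a malformed macro cannot inject extra attributes.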

Anonymous

Thanks for the quick response. I'm a little confused, though: if you offer the option to whitelist attributes, why does hard-coded validation then strip out entries that are in the whitelist? Surely the whitelist should override any hard-coded validation rules, i.e. if an attribute is in the whitelist it is considered safe, and no further validation rules are run on it. I'm just a little frustrated that all of the filters out there are stripping out RDFa. Drupal's built-in HTML filter appears to allow at least the property attribute if the doctype is correct, but it then rather annoyingly strips the colon out of the value. Using the Full HTML input format is highly undesirable. I will look into the options you suggested, thanks. Do you have any plans to support RDFa in the near future?

markus_petrux

An input format filter should not only validate the attribute names, but also the contents of those attributes, for security reasons. So the code behind that should be able to understand all the possibilities the particular syntax offers, and try to implement something that is able to balance user capabilities with security concerns.

So it is not just a matter of being able to whitelist stuff; the code also has to validate the rules that apply wherever this stuff is used. In WYSIWYG Filter, the goal was to cover enough HTML to be able to process the common markup generated by WYSIWYG editors. You cannot even whitelist embed or object here, because these are too complex to validate, and people can use additional macros to publish videos, Flash, etc.
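As a rough illustration of why a name whitelist alone is not enough, here is a minimal Python sketch (not the module's actual code). All names in it (`VALIDATORS`, `valid_property`, `valid_about`, `filter_attributes`) are made up for the example, and the rules are deliberately simplified: real RDFa allows more than this in property and about (for instance safe CURIEs in about).

```python
import re
from urllib.parse import urlsplit

# Simplified CURIE pattern (prefix:reference); an assumption for this sketch.
CURIE = re.compile(r"^[A-Za-z_][\w.-]*:[A-Za-z_][\w.-]*$")


def valid_property(value):
    """Accept one or more whitespace-separated CURIEs, e.g. 'dc:title'."""
    tokens = value.split()
    return bool(tokens) and all(CURIE.match(t) for t in tokens)


def valid_about(value):
    """Accept relative URLs and http/https URLs only."""
    try:
        scheme = urlsplit(value.strip()).scheme
    except ValueError:
        return False
    return scheme in ("", "http", "https")


# An attribute survives only if its name is listed here AND its value
# passes the validator registered for that name.
VALIDATORS = {"property": valid_property, "about": valid_about}


def filter_attributes(attrs):
    return {
        name: value
        for name, value in attrs.items()
        if name in VALIDATORS and VALIDATORS[name](value)
    }
```

With this shape, whitelisting a new attribute means writing a validator for its value syntax first. For example, `about="/alice/posts/trouble_with_bob"` passes, while `about="javascript:alert(1)"` is dropped even though the attribute name is allowed.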

RDFa is complex enough that it probably deserves a specialized input format filter, if you want to let users enter RDFa stuff in textareas.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.