I've created a node type with a single image and, using tokens, I've defined the alt and title text to be [title] (the node title). However, if a user includes an ampersand (&) in the node title, it becomes &amp;amp; in the HTML alt and title text. Therefore, I assume I should use the unfiltered node title rather than the regular node title, but of course, I noticed the warning:

Unfiltered node title. WARNING - raw user input.

Can someone explain what this warning means (to a not-extremely-technical person) and what risk I would be running if I set the alt and title text to [title-raw]? Thank you in advance.

Comments

bryancasler

I too would like a better explanation.

narayanis

I can't find a definitive answer in the docs, but here is my experience.

title-raw will dump out exactly what you type in
title will take your input, run it through a filter to remove anything potentially dangerous and convert things like & to &amp;

If you fully trust all users who will author content where title-raw is going to be used, you should be fine to use it.

StormDruper

It makes sense to me that & would be converted to &amp; but, unfortunately, there appears to be a bug. When I use the regular (filtered) node title token, & is being converted to &amp;amp; in the HTML alt and title text for the image. It appears that it's run through the conversion filter twice?

In any event, what types of things can a user enter into a title that could potentially cause problems or inflict damage if left unfiltered? (I presume this is a pretty stupid question ... and if so, I apologize, but I don't yet understand the possible ramifications.) Thank you.

narayanis

A clever enough person could enter some JavaScript in the title that grabs cookies and sends their data to his/her server. Or throw up an IFRAME that looks like a form on your site but is actually capturing and sending data to his/her server.
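To make that risk concrete, here is a minimal sketch in plain PHP, run outside Drupal. `escape_plain()` is a hypothetical stand-in for the filter applied to [title] (assumed here to be a thin wrapper around PHP's `htmlspecialchars()`), and the title string is an invented example of hostile input.

```php
<?php
// Hypothetical stand-in for the filter behind [title]; assumed to be
// a thin wrapper around PHP's built-in htmlspecialchars().
function escape_plain($text) {
  return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Invented example of a hostile node title.
$title = '<script>alert(document.cookie)</script>';

// Unfiltered: the markup reaches the page intact, so a browser would
// treat it as a real <script> element and run it.
$raw_html = '<h2>' . $title . '</h2>';

// Filtered: the angle brackets become entities, so a browser only
// displays the text instead of executing it.
$safe_html = '<h2>' . escape_plain($title) . '</h2>';

echo $raw_html, "\n";
echo $safe_html, "\n";
```

The second line printed contains `&lt;script&gt;` rather than `<script>`, which is the whole point of the filter: the title is still shown to the visitor, but it can no longer act as code.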

WillHall

It looks like the default behavior of the title field is to sanitize the user input by converting it to plain text.

jhl.verona

...and the answer is - look in the tokens module: sites/all/modules/token/token_node.inc:

      $values['title']            = check_plain($node->title);
      $values['title-raw']        = $node->title;

Which means that [title] is just [title-raw] that's been passed through the HTML sanitiser.
Now I don't quite know what you mean by image - I presume it's an imagefield. The alt and title attribute values get passed to drupal_attributes which gives each one a good wash (check_plain again). So use [title-raw] for attributes, otherwise they get sanitised twice (by token, then drupal_attributes) which gives you the &amp;amp; result.
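The double wash can be reproduced outside Drupal with a rough sketch. `sanitise()` and `build_attributes()` below are hypothetical stand-ins for check_plain and drupal_attributes (assumed to escape with `htmlspecialchars()` and to escape every attribute value, respectively), and the node title is an invented example.

```php
<?php
// Hypothetical stand-in for check_plain(); assumed to wrap htmlspecialchars().
function sanitise($text) {
  return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Hypothetical stand-in for drupal_attributes(): builds an attribute
// string and escapes every value on the way out.
function build_attributes(array $attributes) {
  $out = '';
  foreach ($attributes as $name => $value) {
    $out .= ' ' . $name . '="' . sanitise($value) . '"';
  }
  return $out;
}

$node_title = 'Fish & Chips';             // invented example title
$token_title = sanitise($node_title);     // what [title] would hold
$token_title_raw = $node_title;           // what [title-raw] would hold

// [title] is escaped by the token, then again by the attribute builder.
$with_title = build_attributes(array('alt' => $token_title));

// [title-raw] is escaped exactly once, by the attribute builder.
$with_raw = build_attributes(array('alt' => $token_title_raw));

echo $with_title, "\n";  // alt="Fish &amp;amp; Chips"
echo $with_raw, "\n";    // alt="Fish &amp; Chips"
```

In this sketch the raw value is still safe in the attribute because the attribute builder escapes it; the warning matters when nothing downstream does that for you.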

The [title-raw] warning is just that - if you decide to use it in your HTML output (node.tpl.php perhaps), it's up to you to put it through the sanitiser first. Otherwise, yes, you can be open to sophisticated embedded code attacks.

StormDruper

Yes, I was referring to an imagefield. Thank you all for your answers.