Hi, I've been working with the Views RSS module, and I've come across something a bit frustrating. If quotes or apostrophes are used in a node title (example: Stanley Kubrick's "2001"), the RSS output comes out like this:

Stanley Kubrick&#039;s "2001"

Is there a fix for this?

Comments

Status: Active » Closed (works as designed)

Well, not really - this is an issue with Drupal's format_xml_elements() function, which Views RSS is using, rather than with Views RSS module - as this function calls check_plain() on all XML element values. You can see exactly the same thing happening to core Drupal feeds. (Nota bene, format_rss_item() is doing exactly the same thing.)
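To illustrate the kind of escaping involved, here is a minimal standalone sketch. It assumes check_plain() behaves like htmlspecialchars() with ENT_QUOTES, as it does in Drupal 7; the function name check_plain_sketch() is hypothetical and is only there so the snippet runs outside Drupal. Whether a value really gets escaped twice on a given site is an assumption, not something confirmed here.

```php
<?php
// Hypothetical stand-in for Drupal 7's check_plain(), so this runs without Drupal.
function check_plain_sketch($text) {
  return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

$title = 'Stanley Kubrick\'s "2001"';

// One pass: quotes and apostrophes become entities.
$once = check_plain_sketch($title);

// A second pass escapes the ampersands of those entities as well.
$twice = check_plain_sketch($once);

echo $once, "\n";   // Stanley Kubrick&#039;s &quot;2001&quot;
echo $twice, "\n";  // Stanley Kubrick&amp;#039;s &amp;quot;2001&amp;quot;
```

A feed reader that decodes entities once would show the first string correctly, but would show the second one with a literal &#039; in the title.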

Also, even with HTML-encoded quotes in the <title> elements your feed is still valid (W3C Feed Validator does not return an error, although it suggests not using HTML quotes there).

I am contemplating switching Views RSS to use SimpleXML at some point in the future. Until then, you could consider submitting an issue against Drupal core.

I'm guessing this may be related to why I am seeing <h3> tags in my podcast episode titles in iTunes, even though I am instructing the feed to strip HTML tags?

Edit: Never mind... stupid too-inclusive field theming I'd apparently forgotten about!

A rough workaround I found is to use a PHP field.

Requires views_php module

Could you expand on what you do?

Here's a fix that appears to work, so if people test it, that'd be great.
It's based on a piece of code a guy wrote to solve the problem in D6, and I used it on a lot of sites back then.
It's never caused me trouble.
For D7 things changed but here we go:

Go to views_rss/theme and open theme.inc.

Copy out the entire 'template_preprocess_views_view_views_rss' function, and put it in your theme's template.php.
Change the function name to: function yourthemename_preprocess_views_view_views_rss

Then, at around line 200 of the original theme.inc, or where it reads '// Add XML element(s) to the item array', insert the following just above that comment.
I've included the preceding if statement to help you find the spot.

        if (empty($rss_elements)) continue;
        // Insert here -- clean up special characters.
        $rss_elements[0]['value'] = htmlspecialchars_decode(trim(strip_tags(decode_entities($rss_elements[0]['value'])), "\n\t\r\v\0\x0B\xC2\xA0 "));
        $rss_elements[0]['value'] = htmlspecialchars($rss_elements[0]['value'], ENT_COMPAT);
        // End of cleaning.
        // Add XML element(s) to the item array.
        $rss_item['value'] = array_merge($rss_item['value'], $rss_elements);
      }

Check your RSS.... you might have to flush the cache a few times.
You can always check it works by hacking the theme.inc file, as I had a bit of trouble getting the theme_hook to work from the template file.
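For anyone who wants to see what the cleaning lines do before touching theme.inc, here is a rough standalone sketch of the same steps. It is an approximation, not the module's code: clean_rss_value_sketch() is a hypothetical name, html_entity_decode() stands in for Drupal's decode_entities(), and plain trim() replaces the custom character list.

```php
<?php
// Hypothetical helper mirroring the cleaning steps from the snippet above.
function clean_rss_value_sketch($value) {
  // Stand-in for decode_entities(): turn entities back into characters.
  $value = html_entity_decode($value, ENT_QUOTES, 'UTF-8');
  // Drop any markup and surrounding whitespace.
  $value = trim(strip_tags($value));
  // Undo any remaining special-character entities.
  $value = htmlspecialchars_decode($value);
  // Re-encode once. ENT_COMPAT escapes double quotes but leaves apostrophes alone.
  return htmlspecialchars($value, ENT_COMPAT, 'UTF-8');
}

$raw = '<h3>Stanley Kubrick&#039;s &quot;2001&quot;</h3> ';
$clean = clean_rss_value_sketch($raw);

echo $clean, "\n";  // Stanley Kubrick's &quot;2001&quot;
```

The idea is to get the value back to plain text first and then encode it exactly once, so the apostrophe ends up as a literal character rather than an entity.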

I'm testing to see how it works, as Twitter and Facebook use the feed to post socially.

I use the views_php module (which I do anyway to construct a language-independent pubDate value on my multilingual site) for the RSS title field and just use str_replace() to solve the problem.

The value code for the PHP field is below (make sure that you load the node title field, even if you don't use it directly):

return str_replace("&amp;#", "&#", $data->node_title);
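Here is a small standalone sketch of what that replacement does, run on a sample string rather than on $data->node_title. The helper name undo_numeric_double_encoding() and the sample value are made up for illustration, and it is an assumption that the title reaches the PHP field in this double-encoded form.

```php
<?php
// Hypothetical helper wrapping the same str_replace() call as above.
function undo_numeric_double_encoding($title) {
  return str_replace('&amp;#', '&#', $title);
}

// Sample double-encoded title (an assumed input, for illustration only).
$double = 'Stanley Kubrick&amp;#039;s &amp;quot;2001&amp;quot;';
$fixed = undo_numeric_double_encoding($double);

echo $fixed, "\n";  // Stanley Kubrick&#039;s &amp;quot;2001&amp;quot;
```

Note that this only repairs numeric entities such as &#039;. Named entities such as &amp;quot; are left as they are, so titles with double quotes may need an extra replacement.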

Is this issue related to #779760: check_plain runs twice on title?

Also, I cannot see the same behavior on Drupal's default feeds. When I look at the source code:

Drupal's default feed has &amp;

Views RSS's feed has &amp;amp;


So there's this thing in PHP that's designed to encode whacked characters.

http://us1.php.net/htmlspecialchars

I am not a developer and I don't play one on television, but it seems to me that the output of the Views RSS fields could use this to force encoding on quotes, apostrophes, and ampersands.

Perhaps the issue is related to the formatting of the fields before they are inserted into the Views RSS display type.

My title fields contain quotes and apostrophes. When I assigned a title field as the title value of the RSS item, the feed displayed poorly encoded quotes and apostrophes. I double-checked the field configuration of the title field and saw that it was set to output as "default". When I changed the formatter of the title field to "plain text", the characters were output in a way that RSS readers could display properly.

I also noticed the automatic preview generated by Views would show bad characters, but the RSS reader I was using parsed the characters correctly.

If this does not resolve the problem then reopen.