I just built a feed view that used the Views "strip tags" option to remove HTML from a node body (I didn't want images in the feed). However, it didn't work, because the resulting string contained HTML &nbsp; entities (these should be &#160; in valid RSS). I made a kludge fix by changing:

    $item['description'] = $row[$key];

to

    $item['description'] = htmlentities($row[$key]);

This is in views_plugin_style_rss_fields.inc. I'm not sure this is a valid long-term fix, because it might have side effects on fields other than node bodies that are mapped to the XML description field.
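One possible way to limit such side effects, sketched here under assumptions, is to decode named HTML entities into real characters first and then escape only the characters that are significant in XML. The helper name `rss_plain_text()` and the sample `$body` are hypothetical and not part of views_rss; only core PHP functions are used.

```php
<?php
// Hypothetical helper (not part of views_rss): prepare tag-stripped text
// for use inside an RSS element.
function rss_plain_text($text) {
  // Remove markup, much as the Views "strip tags" option would.
  $text = strip_tags($text);
  // Turn named entities such as &nbsp; into real UTF-8 characters,
  // because XML only predefines &amp; &lt; &gt; &quot; and &apos;.
  $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
  // Re-escape only the characters that are significant in XML.
  return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Sample body, assumed for illustration.
$body = '<p>Fish&nbsp;&amp;&nbsp;chips <img src="a.png" /></p>';
echo rss_plain_text($body), "\n";
```

With this approach no `&nbsp;` survives in the output, and a literal ampersand is still emitted as `&amp;`.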

Attached file: views_rss_2.patch (10.47 KB), comment #5, by David Goode

Comments

nightowl77

Status: Active » Reviewed & tested by the community

I'm no expert on RSS feeds, but shouldn't all data be passed through htmlentities anyway? I looked at some official feeds, and everything in the "description" part always has all HTML entities converted to the "escaped" format.

For example: (this is a cut and paste of one of the entries on the planet ubuntu rss feed)

<description>
&lt;img class=&quot;face&quot; src=&quot;http://planet.ubuntu.com/heads/jriddell.png&quot; alt=&quot;&quot;&gt;
&lt;p&gt;All hands on deck for the release candidate CD (and DVD and USB and upgrades) testing day.  See the &lt;a href=&quot;http://iso.qa.ubuntu.com/qatracker/build/kubuntu/all&quot;&gt;ISO tracker&lt;/a&gt; for what needs tested (duplicates tests always welcome) and join us in #kubuntu-devel to coordinate.&lt;/p&gt;	</description>

I ask this because my feed doesn't work in LiFeRea (a Linux feed reader) or Opera. I mapped the feed description to the node:body field, and since it contains a "paragraph" marker, neither RSS reader shows the feed's body at all.

If I change the code to $item['description'] = htmlentities($row[$key]); as @jpp did, the feeds show correctly in both LiFeRea and Opera's internal RSS reader.

Therefore I don't think htmlentities should be applied only when Views strips tags; it seems to me it should always be applied to the description field. Please correct me if I'm wrong.
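To illustrate the escaped form seen in the Planet Ubuntu sample, here is a minimal sketch in plain PHP. The sample `$body` value and the use of `htmlspecialchars()` instead of `htmlentities()` are assumptions made for illustration: `htmlentities()` also emits named entities such as `&eacute;`, which XML does not predefine, whereas `htmlspecialchars()` touches only the XML-significant characters.

```php
<?php
// Sample node body containing a paragraph marker (assumed for illustration).
$body = '<p>Café menu: fish & chips</p>';

// Escape only &, <, >, " and ' so the markup travels as text inside the element.
$escaped = htmlspecialchars($body, ENT_QUOTES, 'UTF-8');
$element = '<description>' . $escaped . '</description>';
echo $element, "\n";

// For comparison: htmlentities() also converts "é" to a named entity,
// which a strict XML parser would reject.
echo htmlentities('é', ENT_QUOTES, 'UTF-8'), "\n";
```

A feed reader that unescapes the element content gets the original HTML back, which is why the escaped description renders while the raw one does not.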

Hope this helps

tdos20

I've managed to get properly formed RSS out of Managing News by adding htmlentities to all of the $item values in the above file.

loze

I was able to handle html characters in feeds using CDATA in a theme override.

    function phptemplate_preprocess_views_rss_fields_item(&$vars) {
      $item = $vars['item'];
      // Start with an empty string so the .= below never hits an undefined variable.
      $row = '';

      // GeoRSS: merge latitude and longitude into a single point element.
      if (!empty($item['lat']) && !empty($item['lon'])) {
        $item['georss:point'] = check_plain($item['lat'] . ' ' . $item['lon']);
        unset($item['lat'], $item['lon']);
      }

      // Loop through key => value pairs and build one XML element per field.
      foreach ($item as $key => $value) {
        if ($value) {
          if ($key == 'description' || $key == 'title') {
            // This is what I added: wrap title and description in CDATA.
            $row .= '<' . $key . '><![CDATA[' . filter_xss_admin($value) . ']]></' . $key . '>' . "\n";
          }
          else {
            $row .= '<' . $key . '>' . filter_xss_admin($value) . '</' . $key . '>' . "\n";
          }
        }
      }

      $vars['row'] = $row;
    }
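
One caveat with the CDATA approach is that a value containing the sequence `]]>` would end the section early. The following is a minimal sketch of a common workaround, which splits that sequence across two CDATA sections. The helper name `rss_cdata()` is hypothetical and is not part of views_rss or Drupal.

```php
<?php
// Hypothetical helper: wrap a value in CDATA, splitting any "]]>" it contains
// across two sections so the first one cannot be terminated early.
function rss_cdata($value) {
  $safe = str_replace(']]>', ']]]]><![CDATA[>', $value);
  return '<![CDATA[' . $safe . ']]>';
}

echo '<description>' . rss_cdata('<p>a ]]> b</p>') . '</description>', "\n";
```

In the override above, such a helper could stand in for the literal `<![CDATA[` and `]]>` strings, assuming the value has already been filtered.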

David Goode

File: views_rss_2.patch (10.47 KB)

Here is a patch to views_rss that cleans up code and adds this filter. Tested in new MN release beta.

alex_b

Status: Reviewed & tested by the community » Fixed

Committed #5, thank you.

tdos20

I'm still having problems with views_rss. When I navigate to the RSS page in my browser, it offers me an XML file for download rather than rendering it in the browser. The file is improperly formed and has <span id="thmr_78" class="thmr_call"> instead of <xml> as the first element. I had a look into what might be causing this but was out of my depth: the .tpl file in views_rss/views/ seems OK, and the Views UI says that it is the styling template, so I'm at a loss. Can anyone help?

BenK

Subscribing...

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

hexblot

@tdos20: In my experience, the culprit behind extra spans is the Themer module (Theme Developer), since it adds a whole lot of such elements with ids like "thmr_XXX", where XXX is an incrementing number. If you have it enabled, try disabling the developer modules for your tests. If you have the admin_menu module, you can easily do so from the leftmost menu item (your site favicon) -> Disable Developer Modules.

NOTE: Using this option from the Admin Menu will also disable Views UI and Imagecache UI, since these are also considered developer modules (it took me ages to figure out why those two were getting disabled "on their own").

Hope this helps.

tdos20

You were quite correct: the problem was the Themer module. A way around disabling it is to open the RSS feed as an anonymous user rather than as an admin; this should solve the mystery of the extra markup before the proper RSS.

idealdesigns

I bumped into this problem when I wanted to remove HTML tags and HTML special characters, so I wrote a small article: Views Rss html special characters problem