Escaping special characters in generated XML document

drupaluser99 - October 18, 2006 - 17:26
Project: Playlist (toolkit, modules)
Version: 4.7.x-1.x-dev
Component: Code
Category: bug report
Priority: normal
Assigned: Unassigned
Status: active
Description

When certain fields, such as the title of an audio track, contain special characters (e.g. "&"), they are not properly escaped in the generated XML document, which results in validation errors.
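To illustrate the report, here is a minimal sketch of how an unescaped "&" breaks the feed. The track title is a made-up example, and the element is built by plain string concatenation on the assumption that the feed code works that way; SimpleXML (PHP 5 and later) is used only as a convenient well-formedness check.

```php
<?php
// Hypothetical track title containing an ampersand.
$title = 'Rock & Roll';

// Raw value dropped straight into the element (assumed unpatched behaviour).
$raw = '<title>' . $title . '</title>';

// Same element with the value escaped first.
$escaped = '<title>' . htmlspecialchars($title) . '</title>';

// Collect parser errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

var_dump(simplexml_load_string($raw) !== false);     // bool(false): not well-formed
var_dump(simplexml_load_string($escaped) !== false); // bool(true)
```

The parser rejects the first string because a bare "&" must start an entity reference; once escaped to "&amp;amp;" the element parses normally.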

#1

p_alexander - January 2, 2007 - 07:41

I changed the following in modules/playlist/audio_playlist/feeds.inc, and it seems to be working for me now, even when a metadata field contains an "&" or another special character. Before, such characters produced invalid XML, which prevented iTunes and other feed readers from picking up the feed at all. Copy and paste the following over the same function in feeds.inc for a (hopefully temporary) fix.

/**
* Return XML for podcast feed
*/
function audio_playlist_podcast_feed($items = array(), $metadata = array())
{
// Metadata about this feed
$output  = '<?xml version="1.0" encoding="UTF-8"?>' . " \n";
$output .= '<rss
xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">' . " \n";
$output .= "<channel> \n";
$output .= "<ttl>60</ttl> \n";
$output .= "<title>". htmlspecialchars($metadata['title']) ."</title> \n";
$output .= "<link>". htmlspecialchars($metadata['link']) ."</link> \n";
$output .= "<generator>". htmlspecialchars($metadata['generator']) ."</generator>";
$output .= "<managingEditor>". htmlspecialchars($metadata['owner_email']) ." (".htmlspecialchars($metadata['owner']) .")</managingEditor>";
$output .= "<pubDate>" . htmlspecialchars($metadata['created']) ."</pubDate> \n";
$output .= "<language>". htmlspecialchars($metadata['language']) ."</language>";
$output .= "<copyright>". htmlspecialchars($metadata['copyright']) ."</copyright>";
$output .= "<itunes:explicit>". htmlspecialchars($metadata['explicit']) ."</itunes:explicit>";
$output .= "<itunes:subtitle>". htmlspecialchars($metadata['subtitle']) ."</itunes:subtitle>";
$output .= "<itunes:author>". htmlspecialchars($metadata['author']) ."</itunes:author>";
$output .= "<itunes:summary>" . htmlspecialchars($metadata['summary']) ."</itunes:summary>";
$output .= "<description>" . htmlspecialchars($metadata['description']) ."</description>";
$output .= "<itunes:owner>";
$output .= "<itunes:name>". htmlspecialchars($metadata['owner']) ."</itunes:name>";
$output .= "<itunes:email>". htmlspecialchars($metadata['owner_email']) ."</itunes:email>";
$output .= "</itunes:owner>";
$output .= "<itunes:image href=\"".htmlspecialchars($metadata['image']['url']) ."\" />";
$output .= "<image><url>". htmlspecialchars($metadata['image']['url']) ."</url><width>". htmlspecialchars($metadata['image']['width']) ."</width><height>". htmlspecialchars($metadata['image']['height']) ."</height></image>";

if (is_array($metadata['categories'])) {
   foreach ($metadata['categories'] as $category) {
     // iTunes category: the value lands inside an attribute, so it must be escaped too.
     // A closing </itunes:category> is only needed for subcategories.
     $output .= "<itunes:category text=\"". htmlspecialchars($category) ."\"/>";
     // RSS category
     $output .= "<category>". htmlspecialchars($category) ."</category>";
   }
}

// Cycle through all the items in this feed
foreach ($items as $item) {
     $output .= "<item> \n";
     $output .= "  <title>". htmlspecialchars($item['title']) ."</title> \n";
     $output .= "  <guid>". htmlspecialchars($item['guid']) ."</guid> \n";
     $output .= "  <link>". htmlspecialchars($item['link']) ."</link> \n";
     $output .= "  <itunes:author>". htmlspecialchars($item['author']) ."</itunes:author> \n";
     $output .= "  <itunes:subtitle>". htmlspecialchars($item['subtitle']) ."</itunes:subtitle> \n";
     $output .= "  <dc:creator>". htmlspecialchars($item['creator']) ."</dc:creator> \n";
     $output .= "  <content:encoded>". htmlspecialchars($item['content']) ."</content:encoded> \n";
     $output .= "  <description>". htmlspecialchars($item['description']) ."</description> \n";
     $output .= "  <comments>". htmlspecialchars($item['comments']) ."</comments>";
     $output .= "  <itunes:summary>". htmlspecialchars($item['summary']) ."</itunes:summary> \n";
     $output .= "  <enclosure url=\"". htmlspecialchars($item['enclosure']['url']) ."\" length=\"". htmlspecialchars($item['enclosure']['filesize']) ."\" type=\"". htmlspecialchars($item['enclosure']['filemime']) ."\" /> \n";
     $keywords = implode(", ", $item['keywords']);
     $output .= "<category>". htmlspecialchars($keywords) ."</category>";
     $output .= "  <itunes:keywords>". htmlspecialchars($keywords) ."</itunes:keywords> \n";
     $output .= "  <itunes:duration>". htmlspecialchars($item['duration']) ."</itunes:duration> \n";
     $output .= "  <pubDate>" . htmlspecialchars($item['created']) ."</pubDate> \n";
     $output .= "</item> \n";
}
$output .= "</channel> \n";
$output .= "</rss> \n";
drupal_set_header('Content-Type: text/xml; charset=UTF-8');
print $output;
}

#2

zirafa - January 2, 2007 - 10:24

Hmm, I see. It might make sense to perform the htmlspecialchars() on the metadata array once at the beginning instead of multiple times. That'll make it much cleaner I think.
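A rough sketch of what escaping once up front could look like. The helper name playlist_escape_recursive() is hypothetical and not part of the module; it recurses because the metadata array shown in #1 contains a nested 'image' array, so a flat array_map() would not be enough.

```php
<?php
/**
 * Recursively apply htmlspecialchars() to every scalar in an array.
 * Hypothetical helper for illustration; not part of the playlist module.
 */
function playlist_escape_recursive($value) {
  if (is_array($value)) {
    foreach ($value as $key => $child) {
      $value[$key] = playlist_escape_recursive($child);
    }
    return $value;
  }
  // Cast so numeric values such as an image width are handled as strings.
  return htmlspecialchars((string) $value);
}

// Made-up sample data shaped like the $metadata array used in #1.
$metadata = array(
  'title' => 'Rock & Roll',
  'image' => array('url' => 'http://example.com/cover.png?a=1&b=2', 'width' => 144),
);

$escaped = playlist_escape_recursive($metadata);
print $escaped['title'] . "\n";        // Rock &amp; Roll
print $escaped['image']['url'] . "\n"; // http://example.com/cover.png?a=1&amp;b=2
```

If something like this were adopted, the per-line htmlspecialchars() calls in the feed function would have to be removed, otherwise values would be escaped twice ("&amp;amp;amp;").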

#3

zirafa - January 2, 2007 - 10:28

Also Drupal's check_plain function should be used instead of htmlspecialchars(), I think. I don't think it takes an array so maybe we'll have to settle for line-per-line.
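One possible middle ground, sketched under assumptions: check_plain() takes a single string, but PHP 5's array_walk_recursive() can apply it to every leaf of the array. The callback name below is hypothetical, and the check_plain() stand-in is defined only so the snippet runs outside Drupal; it approximates the real function (htmlspecialchars() with ENT_QUOTES) and omits Drupal's UTF-8 validation.

```php
<?php
// Stand-in so the sketch runs outside Drupal; inside Drupal the real
// check_plain() already exists and this block is skipped.
if (!function_exists('check_plain')) {
  function check_plain($text) {
    return htmlspecialchars((string) $text, ENT_QUOTES);
  }
}

/**
 * Callback for array_walk_recursive(): escape one leaf value in place.
 * Hypothetical name; not part of the playlist module.
 */
function playlist_check_plain_leaf(&$value, $key) {
  $value = check_plain($value);
}

// Made-up sample data with a nested array, like $metadata['image'] in #1.
$metadata = array(
  'title' => 'Tom & "Jerry"',
  'image' => array('url' => 'http://example.com/a.png?x=1&y=2'),
);

// Walks every non-array element, including nested ones, and escapes it.
array_walk_recursive($metadata, 'playlist_check_plain_leaf');

print $metadata['title'] . "\n"; // Tom &amp; &quot;Jerry&quot;
```

This requires PHP 5 for array_walk_recursive(); on PHP 4 a hand-written recursive loop or the line-per-line approach would still be needed.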

#4

Mehrad - August 6, 2008 - 21:00

Hi there,
Have you ever tried this with Drupal 5, applying your specific metadata, XML and schema to a content type?

 
 
