XML export produces invalid XML

Aren Cambre - October 13, 2009 - 16:26
Project:Views Bonus Pack
Version:6.x-1.x-dev
Component:Views Export
Category:bug report
Priority:normal
Assigned:Unassigned
Status:needs review
Description

In views_bonus_export.theme.inc are these lines, starting at line 63:

<?php
foreach ($row as $field => $content) {
  $vars['themed_rows'][$num][$field] = str_replace(
    array('&', '<', '>'),
    array('&amp;', '&lt;', '&gt;'),
    $content);
}
?>

A few problems.

First, there is no need to escape >. Section 2.4 of the W3C's XML spec only requires it where > appears in the sequence ]]> (where it could be mistaken for the end of a CDATA section), and CDATA sections were removed per #494848: CDATA is pointless.

Second, & and < should be escaped only when they are not used as "markup delimiters, or within a comment." For example, if you come across <b>, the opening < should not be altered. Likewise, if you come across &amp;, it should not be altered. Unfortunately, the current code converts the tag to &lt;b&gt; and the entity to &amp;amp;. And if either character is within a comment, it should be skipped entirely.
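
To make the second problem concrete, here is a minimal sketch that wraps the replacement quoted above in a standalone function and feeds it a string containing both a tag and an existing entity. The function name vbp_escape_current() is made up for this illustration and does not exist in the module.

```php
<?php
// Hypothetical wrapper around the str_replace() call quoted in the
// description; the name is an assumption made for this example only.
function vbp_escape_current($content) {
  return str_replace(
    array('&', '<', '>'),
    array('&amp;', '&lt;', '&gt;'),
    $content);
}

// Input already contains markup and an entity.
echo vbp_escape_current('<b>AT&amp;T</b>'), "\n";
// Prints: &lt;b&gt;AT&amp;amp;T&lt;/b&gt;
```

The tag comes out as literal text and the entity is escaped a second time, which is the behaviour described above.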

#1

Steven Jones - November 20, 2009 - 10:16
Version:6.x-1.0-beta4» 6.x-1.x-dev
Status:active» needs review

Although trying to use Drupal's theme rendering system to build the XML document is a laudable goal, it is fraught with difficulties. Rather than re-inventing the wheel I propose the following strategy:

  1. Use SimpleXML if it's around to do the escaping for us.
  2. Fall back to the current theme implementation if SimpleXML isn't around.
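
The two-step strategy above could look roughly like the following. This is only a sketch and not the code in the attached patch: the function name vbp_rows_to_xml(), its parameters, and the NULL return used to signal "fall back to theming" are all assumptions made for illustration. It also assumes the field names are already valid XML element names.

```php
<?php
// Hypothetical helper (not from the patch): render rows as XML with
// SimpleXML when the extension is available.
function vbp_rows_to_xml(array $rows, $root = 'nodes', $item = 'node') {
  if (!class_exists('SimpleXMLElement')) {
    // SimpleXML isn't around: the caller falls back to the themed output.
    return NULL;
  }
  $xml = new SimpleXMLElement("<$root/>");
  foreach ($rows as $row) {
    $child = $xml->addChild($item);
    foreach ($row as $field => $content) {
      // Property assignment lets SimpleXML escape the text for us.
      // (addChild() with a value would not escape a bare '&'.)
      $child->$field = $content;
    }
  }
  return $xml->asXML();
}

// Example: one row whose title contains characters that need escaping.
echo vbp_rows_to_xml(array(array('title' => 'AT&T <b>')), 'comments', 'comment');
```

Because the library does the escaping, the content is treated as plain text and round-trips unchanged when the output is parsed again.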

I doubt many people want to theme the XML export (but they probably have to at the moment).

I've attached a patch that cleans up the XML export of VBP. It does the following:

  1. The XML root node is the name of the view's base table, rather than blindly assuming that this will be 'node'.
  2. Each row's XML name is the root node name minus the last 's' if there is one, i.e. you get output like:
    <comments><comment>...</comment></comments>
  3. XML tag names are now validated, so users don't have to care about their field labels (why should they?).
  4. We use SimpleXML if it's around to do the escaping and rendering. Lovely.
Attachment: views_bonus-603420.patch (4.81 KB)
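
For readers who don't want to open the patch, points 2 and 3 of the list above can be sketched as below. These helpers are illustrative only: the names vbp_row_name() and vbp_xml_tag_name() and the exact sanitising rules are assumptions, and the patch may well do this differently. The tag-name rule here is deliberately stricter than the XML spec (ASCII letters, digits, '_', '.', and '-' only).

```php
<?php
// Hypothetical helpers; names and rules are assumptions, not the patch.

// Row element name: the root name minus one trailing 's', if present.
function vbp_row_name($root) {
  if (strlen($root) > 1 && substr($root, -1) === 's') {
    return substr($root, 0, -1);
  }
  return $root;
}

// Reduce an arbitrary field label to a conservative XML element name.
function vbp_xml_tag_name($label) {
  // Replace each run of disallowed characters with a single underscore.
  $name = preg_replace('/[^A-Za-z0-9_.-]+/', '_', trim($label));
  // Element names may not start with a digit, '.', or '-'.
  if ($name === '' || !preg_match('/^[A-Za-z_]/', $name)) {
    $name = '_' . $name;
  }
  return $name;
}

echo vbp_row_name('comments'), "\n";       // comment
echo vbp_xml_tag_name('Post date'), "\n";  // Post_date
echo vbp_xml_tag_name('2nd field'), "\n";  // _2nd_field
```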

#2

Steven Jones - November 20, 2009 - 10:16
Title:XML export mangles <, >, and &» XML export produces invalid XML

#3

gavranha - November 27, 2009 - 00:51

@Steven - thanks a lot for the help. The exports are "lovely" :)

I'm testing the module for integration with another app and I'll post more about this, for information.

Great idea, views_bonus!

 
 
