I admit to not having much experience with XML parsing - or a good overview of all the issues with this module. I just wanted something to parse an XML feed that a client gave me.
... and feeds_xpathparser seemed better than rolling my own parser.
... and since my document is simple (see below), I used the plain 'XML' method, without installing QueryPath.

My document looks roughly like:

<xml>
  <stuff>
    <item>
      <field1>value</field1>
      <field2>value</field2>
      <field3>value</field3>
      ...fields...
      <field30>value</field30>
    </item>
    <item>
      ...fields...
    </item>
    ...items...
  </stuff>
</xml>

...so, one encompassing 'stuff' element. (Since I didn't even know all the fields present, and there were many, I've been fiddling with "defining mapping on import" - but I guess that's beside the point here.)

The point: I just wanted to define one query for the 'Context', as Feeds XPath Parser names it: "//stuff"
And I don't want to define anything else. No per-field queries. Just have it return all (mapped) fields.

Thing is: for the XML parser, if you leave the per-field queries empty, nothing is returned -- because $this->sources[] isn't populated. (The XPath parser may have different defaults...)

So does it make sense to 'by default' (when queries are empty), return all field content?
(This is equivalent to running $xml->xpath('//fieldNN') in your code - so that's how I hacked stuff.)

Or can we not assume such a default for all situations?
Or can we only do that for class FeedsXPathParserXML, not for class FeedsXPathParserHTML?
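
To illustrate the default I'm proposing, here is a minimal standalone sketch. It is not the module's code: the sample document, the helper name default_query() and the $configured array are made up for illustration. It only shows that an empty per-field query falls back to '//' plus the source name, and that SimpleXML's xpath() then returns every matching element.

```php
<?php
// Hypothetical sample document, shaped like the feed described above.
$document = <<<XML
<xml>
  <stuff>
    <item>
      <field1>a</field1>
      <field2>b</field2>
    </item>
    <item>
      <field1>c</field1>
      <field2>d</field2>
    </item>
  </stuff>
</xml>
XML;

/**
 * Hypothetical helper: returns the configured query for $source, or
 * '//<source>' when no (or only a whitespace) query was configured.
 */
function default_query($source, array $configured) {
  $query = isset($configured[$source]) ? trim($configured[$source]) : '';
  return $query !== '' ? $query : '//' . $source;
}

$xml = simplexml_load_string($document);

// Stand-in for the per-field queries left empty in the importer settings.
$configured = array('field1' => '', 'field2' => '  ');

$values = array();
foreach (array('field1', 'field2') as $source) {
  $values[$source] = array();
  foreach ($xml->xpath(default_query($source, $configured)) as $node) {
    $values[$source][] = (string) $node;
  }
}
```

Note that '//' matches the element at any depth in the document, so this default is only safe when the field names are not reused elsewhere in the feed.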

--- FeedsXPathParserHTML.inc	2010-08-16 22:17:20 +0000
+++ FeedsXPathParserHTML.inc	2010-09-18 20:37:14 +0000
@@ -32,10 +32,13 @@
 
     foreach ($mappings as $mapping) {
       $source = $mapping['source'];
-      if ($query = trim($this->config['sources'][$source])) {
-       $this->sources[] = $source;
-       $this->queries[] = $query;
+      $this->sources[] = $source;
+      if (!($query = trim($this->config['sources'][$source]))) {
+        // No query defined: assume that every item contains an element named
+        // $source.
+        $query = '//' . $source;
       }
+      $this->queries[] = $query;
     }
   }

Comments

twistor:

I think this is out of the scope of this project, but I'm curious, how do you plan on setting the target fields?

roderik:

Target fields are provided by whatever Feeds Processor is selected. (In my case, the standard Node Processor + feeds_location).
They are 'set' (sources are mapped to targets) during the first import: the XML document is first parsed to obtain the source field names, which are then used to do the mapping.

(Which in my case was very convenient, because I had an XML document with > 30 source fields, and no documentation. So I could just define a Feeds Importer with Feeds XPath Parser + Node Processor, start it, and look at a list of the source fields on my import screen. ...at least, after patching enough code :))
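
As a rough idea of what "obtaining the source field names" can look like, here is a small sketch with plain SimpleXML. This is an assumption for illustration only, not the actual code of the 'Mapping on import' patch or of my modifications; the sample document and variable names are made up.

```php
<?php
// Hypothetical sample document; the element names are placeholders.
$document = '<xml><stuff>'
  . '<item><field1>a</field1><field2>b</field2></item>'
  . '<item><field1>c</field1><field3>d</field3></item>'
  . '</stuff></xml>';

$xml = simplexml_load_string($document);

// Collect the distinct child element names across all items. Using the names
// as array keys removes duplicates while keeping first-seen order.
$names = array();
foreach ($xml->xpath('//item/*') as $child) {
  $names[$child->getName()] = TRUE;
}
$source_fields = array_keys($names);
```

Looking at all items rather than only the first one also catches fields that are missing from some items.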

This needs the 'Mapping on import' patch for Feeds plus my modifications for Feeds XPath Parser - of which the code in this issue is a part.

It works like a charm. I'm confident that the patch for Feeds will be incorporated, since people are pretty much dying for it.
For the Feeds XPath Parser part... I'm figuring out what to do with it, here :)

Does that answer the question? :)

twistor:

Priority: Normal » Minor
Status: Needs review » Postponed

Putting this off.

twistor:

Status: Postponed » Closed (won't fix)

I don't think this is going to happen.