Extensible XML parser (mapping more sources) [#631104]

Comment	File	Size	Author
#135	xml_parser.png	120.11 KB	mokko
#108	example.xml_.txt	23.46 KB	sagar ramgade
#75	feeds_xmlparser.zip	1.63 KB	Monkey Master
#68	feeds_xmlparser(with_xpath_mapping).zip	1.56 KB	Monkey Master
#60	feeds_xmlparser.zip	1.52 KB	Monkey Master
#58	feeds_xmlparser.zip	1.46 KB	Monkey Master
#40	feeds_xmlparser.zip	7.41 KB	funkmasterjones
#40	import_preview.patch	4.4 KB	funkmasterjones
#40	parser_settings.png	20.36 KB	funkmasterjones
#40	mapping.png	38.83 KB	funkmasterjones
#28	Untitled.png	238.42 KB	gregag
#27	Untitled.jpg	231.21 KB	gregag
#21	cleandata.zip	5.88 KB	velosol
#15	customparser.zip	6.91 KB	JayCally

Comment #1

chrism2671 commented 12 November 2009 at 18:17

Could use this!
http://www.php.net/manual/en/simplexml.examples-basic.php

Comment #2

velosol commented 13 November 2009 at 17:58

chrism2671 : I agree this would be a great feature, however... After writing my own parser for the custom XML I needed to consume, I do not relish the idea of trying to make it extensible in such a way that non-developer types could use. I've posted a generalized version of my code below in case anyone else wants to use it as a starting point. It is far from clean and includes lots of old debugging code.

Some starting ideas:
Make the xpath queries variables that can be set in a configuration form,
Make the various parts (item, title, description) equal to variables of the form parent_object->child,
Make the whole set of code a bit more OOP so attribute elements and other elements can be added without bound.

This code is mostly lifted from other places around the d.o issue queue (or Feeds) and the php SimpleXML examples - original credit goes to them which are more than I can remember at this point.

function generic_parser_parse($raw) {
  //libxml_use_internal_errors(true);
  if (!defined('LIBXML_VERSION') || (version_compare(phpversion(), '5.1.0', '<'))) {
    @ $xml = simplexml_load_string($raw, NULL);
  }
  else {
    @ $xml = simplexml_load_string($raw, NULL, LIBXML_NOCDATA | LIBXML_NOERROR | LIBXML_NOWARNING);
  }

  // $errors = libxml_get_errors();
  // foreach ($errors as $error) {
    // $message = (string)$error->message;
    // trigger_error("Error is: $message");
  // }
  
  // Got a malformed XML.
  if ($xml === FALSE || is_null($xml)) {
    trigger_error("generic_parser_parse received bad XML, beginning:" . substr($raw, 0, 20));
    return FALSE;
  }
  $parsed_source = array();
  
  // trigger_error("Parsing with generic_parser_parse helper");
  
  //@todo: Add formatted time string to title.
  $parsed_source['title'] =  'Hardcoded value (bad) or pulled from XML (good)';
  $parsed_source['description'] = 'Hardcoded value (bad) or pulled from XML (good)';
  $parsed_source['items'] = array();
  
  // @todo: (alex_b) Make xpath case insensitive.
  $genericItems = $xml->xpath('//items');
  foreach ($genericItems as $genericItem) {
    $item = array();
    $item['guid'] = (string)$genericItem->id;

    $timeStampIs = strtotime($genericItem['ItemTime']);
    if ($timeStampIs !== FALSE && $timeStampIs != -1) {
      //If able to parse timestamp, use it - checks false for both strtotime vers.
      $item['timestamp'] = (int)$timeStampIs;
    }
    else {
      $item['timestamp'] = time();
    }
    //Short version:
    //$item['timestamp'] = ($timeStampIs !== FALSE && $timeStampIs != -1) ? (int)$timeStampIs : time();
    
    $item['title'] = (string)$genericItem['title'];
    // Set location
    $item['location']['lat'] = (double)$genericItem->geocode->lat;
    $item['location']['lon'] = (double)$genericItem->geocode->lon;
    
    //Pull attributes from an element to create body
    //<ElementWithAttributes attribute1="XX">XX</E...>
    //Write as "{attribute1} is: {elementValue}"
    $elementsWithAttributes = $genericItem->xpath('anElement/ElementsWithAttributes');
    $attribText = '';
    foreach ($elementsWithAttributes as $attrib) {
      $attribText .= "<br/>\n" . (string)$attrib['attribute1'] . " is: " . (string)$attrib;
    }
    //The 'description' is the 'body' for a node.
    $item['description'] = $attribText;
    //Clean output array for the next trip through in case diff. set of attribs.
    unset($attribText);
    unset($timeStampIs);
    $parsed_source['items'][] = $item;
  }
  return $parsed_source;
}

Comment #3

chrism2671 commented 15 November 2009 at 22:35

It's unfortunate that this is quite fiddly! Will have a go and report back!

Comment #4

alex_b commented 17 December 2009 at 14:44

Title:	Extensible XML parser	» Extensible XML parser (mapping more sources)
Version:	6.x-1.0-alpha7	» 6.x-1.x-dev

With mapping on import implemented, we could rely (as in FeedAPI Mapper) on analyzing the feed before offering mapping sources #651478: Mapping on import.

Comment #5

alex_b commented 18 December 2009 at 23:33

For a demo project I have written a rule based parser called RParser.

In its documentation you can see how RParser allows for a definition of parsing rules with XPaths:

function hook_rparser_xpath($feed) {
  return array(
    'title' => '//default:Document/default:name[1]',
    'items' => array(
      '#xpath' => '//default:Placemark',
      '#children' => array(
        'title' => '//default:name',
        'description' => '//default:description',
        'picture' => '//default:pic',
        'url' => '//default:url',
        'lat' => '//default:LookAt/default:latitude',
        'lon' => '//default:LookAt/default:longitude',
      ),
    ),
  );
}

What's missing are namespace declarations and some simple fallbacks (what if the title does not yield a result?). This approach could get us very far. People who are looking for parsing XML documents with uncommon namespaces could easily add these with a simple hook implementation.

Further, if we were to write a 'magic parser' that allows a site builder to enter xpaths through a site UI, this is the infrastructure we'd want to have running underneath it.
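
A minimal sketch of how a rule array of this shape could be evaluated with SimpleXML, including the two missing pieces mentioned above (namespace declarations and a fallback when an XPath yields nothing). This is not RParser's actual code: the function names, the $namespaces argument and the 'Untitled' fallback are all assumptions made for illustration. Child paths are assumed to be written relative to the item (e.g. 'default:name' rather than '//default:name'), because a path starting with // is evaluated against the whole document.

```php
/**
 * Hypothetical helper: registers prefix => URI pairs on a SimpleXML node.
 */
function example_register_namespaces(SimpleXMLElement $node, array $namespaces) {
  foreach ($namespaces as $prefix => $uri) {
    $node->registerXPathNamespace($prefix, $uri);
  }
}

/**
 * Hypothetical helper: returns the first XPath match as a string, or
 * $fallback when the query yields no result.
 */
function example_xpath_first(SimpleXMLElement $node, $xpath, $fallback = '') {
  $matches = $node->xpath($xpath);
  if (is_array($matches) && isset($matches[0])) {
    return trim((string) $matches[0]);
  }
  return $fallback;
}

/**
 * Hypothetical helper: applies a rule array with 'title' and 'items' keys.
 */
function example_apply_rules($raw, array $rules, array $namespaces = array()) {
  $xml = simplexml_load_string($raw);
  if ($xml === FALSE) {
    return FALSE;
  }
  example_register_namespaces($xml, $namespaces);
  $result = array(
    'title' => example_xpath_first($xml, $rules['title'], 'Untitled'),
    'items' => array(),
  );
  $nodes = $xml->xpath($rules['items']['#xpath']);
  foreach (is_array($nodes) ? $nodes : array() as $node) {
    // Register again on each item node: a prefix registered on the
    // document object should not be assumed to carry over.
    example_register_namespaces($node, $namespaces);
    $item = array();
    foreach ($rules['items']['#children'] as $key => $xpath) {
      $item[$key] = example_xpath_first($node, $xpath);
    }
    $result['items'][] = $item;
  }
  return $result;
}
```

A site UI for entering XPaths would then only need to store the rule array and the prefix-to-URI map; the evaluation code would stay the same.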

Comment #6

mottolini commented 18 December 2009 at 09:32

Links to RParser and its documentation are not working.
I'm really interested in RParser and willing to add a useful UI to it.

Comment #7

alex_b commented 18 December 2009 at 23:34

#6: fixed.

Comment #8

whatdoesitwant commented 23 December 2009 at 15:53

I forgot to say that rparser looks clean and smart, as does the new feeds: amazing work. As a sitebuilder/themer I welcome the idea of opening this functionality up to the drupal ui.

For my 2 cts I think it better to work towards a general add-on that does open up the (feed)source specific (xml)data - as an extra array entry within the parsed php-array, along with the default fields - to the drupal ui.
This would mean that a setup as in your primary feeds tutorial vid would have to serve as a base on which more specific mapping can take place when the (feed)source's (xml)data is available.

If extensible parser does just that, i am sorry, but as a non-developer i never got it (which is my ish with that entire module). I'd much rather use something like exhaustive parser.

This approach is not as bad as it sounds because a three step configuration for specific (feed)sources is already necessary on the processing side (You have to create target fields based on the (feed)source's (xml)data, after creating a preliminary feed ct instance.)

Comment #9

SeanBannister commented 29 December 2009 at 03:19

Might be worth looking at phpQuery and QueryPath, which also has a module. I'm currently evaluating both for easy screen scraping. They allow you to use jQuery-style syntax to select XHTML elements.

Comment #10

netentropy commented 4 January 2010 at 14:50

how does one actually use Rparser?

Comment #11

JayCally commented 5 January 2010 at 14:21

I need to import a custom XML file and have been trying to come up with a way to do it using Feeds. I downloaded RParser but it is dependent on FeedAPI, not Feeds. Can this be used with Feeds?

Comment #12

Thoor commented 5 January 2010 at 16:35

Sorry - this is maybe not the right place to ask, but can anybody tell me ... is it possible to handle an XML file like the one shown under:
http://wiki.zanox.com/en/Product_Data_Download#Example_XML_file and create nodes from it? And if yes - do I have to use the OPML IMPORT, or FEED for it?

Beg your pardon - I read the documentation, watched the three videos and asked already in the regular forum without any answers ... THX

Comment #13

JayCally commented 5 January 2010 at 16:53

You would need a custom XML parser.

Comment #14

Thoor commented 5 January 2010 at 17:02

@ JayCally

THX a lot for your answer ... so I don't have to "try and error" anymore :) ... so I will use the CSV Import ... this is working so far.

Comment #15

JayCally commented 5 January 2010 at 17:03

Status	File	Size
new	customparser.zip	6.91 KB

I'm creating a custom parser to pull in my custom XML file and used common_syndication_parser as a starting point. I have it set up and can select it in the parser section of feed importers. I can map the source to the target but am getting an error when importing. The error is: Invalid argument supplied for foreach() in FeedsNodeProcessor.inc line 22. I'm not sure what's wrong. I've attached a zip file with the custom parser and example XML file. Could someone take a look and see what I may have forgotten or screwed up?

Comment #16

netentropy commented 5 January 2010 at 20:31

how do you use the custom parser once you write it?

Comment #17

JayCally commented 5 January 2010 at 21:11

Add the code below to feeds.plugins.inc so you can select it as the parser to use in feed importers. You can also create a plugin for it to set up the mapping options.

$info['YourCustomParser'] = array(
  'name' => 'Custom parser',
  'description' => 'Parse custom XML files.',
  'handler' => array(
    'parent' => 'FeedsParser',
    'class' => 'FeedsCustomParser',
    'file' => 'FeedsCustomParser.inc',
    'path' => $path,
  ),
);

I have it all working except the parser. I used common_syndication_parser as a base but I think I left some things out. So I'm redoing it and hopefully I can get it working soon.
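
For anyone who would rather not edit Feeds' own files, the same registration can be sketched as a hook_feeds_plugins() implementation in a separate module. "mymodule" is a placeholder module name, and the sketch assumes FeedsCustomParser.inc sits in that module's directory; it only runs inside Drupal, since drupal_get_path() is a Drupal core function. Depending on the Feeds version, additional hooks may be needed before the plugin shows up.

```php
/**
 * Implementation of hook_feeds_plugins().
 *
 * "mymodule" is a placeholder; replace it with the real module name.
 */
function mymodule_feeds_plugins() {
  $info = array();
  // The plugin key matches the class name here, which is the convention
  // Feeds uses for its own parsers.
  $info['FeedsCustomParser'] = array(
    'name' => 'Custom parser',
    'description' => 'Parse custom XML files.',
    'handler' => array(
      'parent' => 'FeedsParser',
      'class' => 'FeedsCustomParser',
      'file' => 'FeedsCustomParser.inc',
      // Assumes the .inc file lives in the module's own directory.
      'path' => drupal_get_path('module', 'mymodule'),
    ),
  );
  return $info;
}
```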

Comment #18

JayCally commented 6 January 2010 at 19:19

OK. Having trouble getting the custom parser to work. I used common_syndication_parser as a starting point and removed anything that I didn't need. I tested it on a normal feed and it works. But when I go in to add the custom fields I get an error message "feeds/plugins/FeedsNodeProcessor.inc on line 22" and I can't figure it out. :( Just as an example, my custom feed doesn't have "title" for an item but "ADID". When I change title to this in common_syndication_parser I get the error message.

Can anyone help me out with this?

Comment #19

JayCally commented 8 January 2010 at 15:03

Was able to get the custom parser to work. I can now import my custom feeds. Wish this was a standard feature of feeds. News companies and other sites that need to import data from their in-house databases get their content exported as a custom XML file. I work for a small newspaper publisher and all of our vendors have changed their product to export content as XML. This would be a very helpful feature to have standard with Feeds.

Comment #20

netentropy commented 8 January 2010 at 18:35

Exhaustive parser works nicely for FeedAPI, i will talk to the maintainer about porting it

Comment #21

velosol commented 8 January 2010 at 18:52

Status	File	Size
new	cleandata.zip	5.88 KB

JayCally, I'm sorry I didn't get back to you sooner. I'm glad you were able to get the custom parser working. Since I posted the first code blocks, I've had a chance to refine what I used for my custom parser; unfortunately my work has moved away from Feeds for the time being and I haven't been trawling the issue queue recently.

I'm posting the files I used (which is a custom module that uses Feeds hooks so that I don't have to modify Feeds' core files) in such a way that if you, or another developer needs a custom parser, you'll have a fairly good starting point/skeleton.

I would like to note that this is not in line with #4/#5 and should really only be a stopgap until rparser and mapping on import come to fruition. Also, it doesn't support namespaces as written, but in customizing for your files it should be easy to add. It could also use some additional OOP updates if you're planning on consuming really complex files.

The module says that it requires Feeds - for full functionality as written it also needs location_cck - this support is easily taken out by commenting the block relating to it (as mentioned in the files); especially useful if you don't have location data.

I hope this helps you, or someone else until a better solution is found.

Comment #22

JayCally commented 11 January 2010 at 21:18

Thanks velosol. I was able to get my custom parser working and added a patch to Feeds to allow mapping to image fields. It took me a while but was a good learning experience. I am now looking for a way to map location CCK fields, an email field and a website field. I can probably just use text fields for email/web and make them linkable in the content template, but location will be the big one to get done.

Comment #23

velosol commented 12 January 2010 at 22:27

JayCally - I have lat/lon location CCK support in the module I posted in #21. It's definitely not robust, but could serve as a starting point for you.

Comment #24

JayCally commented 13 January 2010 at 15:05

Thanks velosol! I pulled the section of code I need from your module and used that as a base. I had to make some changes and additions but I have it working now.

Comment #25

rjbrown99 commented 16 January 2010 at 05:30

velosol - for what it's worth, your module was fantastically helpful in allowing me to create a custom XML parser. Great stuff and well commented. Thanks!

Comment #26

gregag commented 17 January 2010 at 13:57

I get:
* user notice: customxml_parser_parse received bad XML, beginning: in C:\SERVER\htdocs\sites\testnadrupal.com\modules\cleandata\customxml_parser.inc on line 40.
* warning: Invalid argument supplied for foreach() in C:\SERVER\htdocs\sites\testnadrupal.com\modules\feeds\plugins\FeedsNodeProcessor.inc on line 22.

from code I inserted:
// Got a malformed XML.
if ($xml === FALSE || is_null($xml)) {
trigger_error("customxml_parser_parse received bad XML, beginning:" . substr($string, 0, 20));
return FALSE;
}
$parsed_source = array();

//@todo: Add formatted time string to title.
$parsed_source['title'] = '//html/body/form/table[2]/tbody/tr/td[1]/table'; //(string)current($xml->xpath('//Some/appropriate/path')); ////
$parsed_source['description'] = 'Webpart'; ////
$parsed_source['items'] = array();

// @todo: (alex_b) Make xpath case insensitive.
$customItems = $xml->xpath('//http://www.kjekupim.si/si/vroca_ponudba.wlgt'); ////
foreach ($customItems as $cItem) { ////
$item = array();
//Set the guid of the
$item['guid'] = (string)$cItem->Level1->Level2; ////

/**
* Skip those items without a guid - if you want this functionality
if( $item['guid'] == NULL) {
//We don't want it if it doesn't have an id.
continue;
}
*/

What did I do wrong?
help me please...

Comment #27

gregag commented 17 January 2010 at 14:01

Status	File	Size
new	Untitled.jpg	231.21 KB

I would like to feed from table. Attached picture for easier understanding what I want...

Comment #28

gregag commented 17 January 2010 at 14:05

Status	File	Size
new	Untitled.png	238.42 KB

Sorry for previous post, here is example

Comment #29

gregag commented 19 January 2010 at 15:46

Some help with how to write the appropriate path in the $todo line would earn a seriously big THANKS from me.

Comment #30

geerlingguy commented 4 February 2010 at 16:43

Subscribe... would love an in-the-gui option, as I have a few news sites where we change the mappings from time to time, and maintaining a plugin/module is always a burden.

Should this be marked as a duplicate? Or vice-versa?
#651478: Mapping on import

Here's the feed I'm trying to import (a few custom xml siblings have been removed for brevity).

<entry>
  <title type="text"><![CDATA[Title goes here.]]></title>
  <id>9713</id>
  <updated>2010-02-04T10:20:39</updated>
  <created>2010-02-04T06:00:00</created>
  <summary type="html"><![CDATA[By Author Name]]></summary>

  <content type="html"><![CDATA[
    <p>Content goes here.</p>
]]></content>

  <author><![CDATA[]]></author>
  <published>2010-02-04T10:20:39</published>
  <category term="sunday scripture" start="2010-02-04T11:13:13" end="2010-05-15"><![CDATA[sunday scripture]]></category>        
  <keywords><![CDATA[]]></keywords>
</entry>

I can get everything to move to the site except for the "summary" - I want to move that into a CCK 'Byline' field...
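
A minimal sketch of pulling the summary (and the other fields) out of an entry shaped like the one above with SimpleXML, so that a custom parser could expose it as a mapping source. The function name and the 'byline' key are assumptions for illustration, and the sketch assumes the entry carries no XML namespace, as in the excerpt. Mapping the value onto the CCK field is a separate step that is not shown.

```php
/**
 * Hypothetical helper: turns one <entry> string into a flat item array.
 */
function example_parse_entry($raw_entry) {
  // LIBXML_NOCDATA merges CDATA sections into plain text nodes.
  $entry = simplexml_load_string($raw_entry, 'SimpleXMLElement', LIBXML_NOCDATA);
  if ($entry === FALSE) {
    return FALSE;
  }
  return array(
    'guid' => (string) $entry->id,
    'title' => trim((string) $entry->title),
    // The summary holds the byline text, e.g. "By Author Name".
    'byline' => trim((string) $entry->summary),
    // strtotime() returns FALSE if the date cannot be parsed.
    'timestamp' => strtotime((string) $entry->published),
    // Attributes are read with array syntax.
    'category' => (string) $entry->category['term'],
  );
}
```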

Comment #31

paganwinter commented 5 February 2010 at 11:16

Subscribing...

Comment #32

jay-dee-ess commented 5 February 2010 at 16:37

subscribing

Comment #33

jay-dee-ess commented 5 February 2010 at 21:08

I'm trying to use velosol's cleandata module and am getting the following error when I try to use the Custom XML Parser:

Fatal error: Declaration of FeedsCustomXMLParser::parse() must be compatible with that of FeedsParser::parse()

I haven't made any changes to the code and am a noob...any help would be much appreciated.

Comment #34

sutch commented 6 February 2010 at 14:03

subscribing

Comment #35

drewish commented 8 February 2010 at 17:21

subscribing

Comment #36

TimG1 commented 10 February 2010 at 23:57

subscribing.

And if there are any developers for hire out there who can write a custom parser for my feed, please get in touch via my contact form.

Thanks!
-Tim

Comment #37

reglogge commented 12 February 2010 at 20:42

subscribing

Comment #38

pvhee commented 13 February 2010 at 13:17

It would be great to have out-of-the box support for XML import in Feeds. What are the blockers to get this done?

Comment #39

pvhee commented 13 February 2010 at 19:39

Building on the efforts on XML Parser, I created a Feeds plugin that can deal with (simple) XML, much like CSV is being dealt with at the moment.

You can find the module at http://github.com/pvhee/feeds_xmlparser. Note that this is only tested in my sandbox.

I'd be glad to receive any feedback on this, and hope we can tackle the problem of XML import using Feeds.

Comment #40

funkmasterjones commented 15 February 2010 at 04:08

Status	File	Size
new	mapping.png	38.83 KB
new	parser_settings.png	20.36 KB
new	import_preview.patch	4.4 KB
new	feeds_xmlparser.zip	7.41 KB

Install the feeds_xmlparser module.
The patch is mostly there to give a parsing example in the mapping section. I can't recall if feeds_xmlparser needs it to run, but I would recommend it anyway.

Features:
- parse any type of xml file finding items specified by XPath (see parser_settings.png)
- preview the parsed data
- gets mapping source targets dynamically (see mapping.png)
- groups alike single values and array structures together for easier data access (Advanced)

Cons:
- must specify an xml template file (normally just the file you want to import) in the importer settings
(this is the only way you can generate source targets)

Note: in the screenshots I use a relative url, this is a different patch of mine, Feeds only supports absolute urls

Comment #41

pvhee commented 15 February 2010 at 08:28

Would there be a possibility to merge both projects (your Feeds XML Parser and the one I posted in the previous comment)? Note that I created a temporary Drupal project for this at http://drupal.org/project/feeds_xmlparser. However, I'd be more than happy to have it included into the Feeds module once the code is more or less finalized.

Comment #42

CheezItMan commented 1 March 2010 at 14:52

Interesting, I'm trying to write a parser for an Apple Podcast Producer ATOM Feed. Seems like this could do the job perfectly.

However with Alpha12, when I try to Feed Imports-->Edit the feed of my choice and then select the parser, I get a whitescreen of death. Pretty much the same error I get on a parser selection.

Comment #43

serbanghita commented 1 March 2010 at 18:24

Priority:	Normal	» Critical

OMG i'm subscribing to this and testing #40

Comment #44

minus commented 2 March 2010 at 18:05

tested #40 with version 6.x-1.0-alpha12 and got a whitescreen of death as well - i'm trying to write a custom parser but have a hard time doing it :-/

Comment #45

pvhee commented 2 March 2010 at 18:08

@serbanghita @minus: did you try #41?

The module can be downloaded directly from github, e.g. as a zip file: http://github.com/pvhee/feeds_xmlparser/zipball/master. The XML parser is fully functional and does not require any Feeds patches.

Comment #46

minus commented 2 March 2010 at 18:21

thank you pvhee! didn't know it was updated, i had a version from feb.13 :-) will try this version right now :-)

Comment #47

minus commented 2 March 2010 at 18:48

hmm, i'm trying to import the data from an xml test file which contains two fields, a description and a url. When i add these two fields using the XML Parser it gives me an error message saying

    * warning: copy(u) [function.copy]: failed to open stream: No such file or directory in /Sites/acquia-drupal-site/acquia-drupal/sites/all/modules/feeds/plugins/FeedsParser.inc on line 148.
    * Cannot write content to /tmp/FeedsEnclosure-u

I'm really new to this and i guess this has something to do with either the person behind the computer or the xml file he uses :-/ If you have the time i could send you the url to the feed i'm using.

Comment #48

webwriter commented 2 March 2010 at 18:49

Subscribing

Comment #49

serbanghita commented 3 March 2010 at 21:44

@pvhee actually #41 inspired me to create a custom module that extends Feeds and parses and maps the Twitter search feed (e.g. http://search.twitter.com/search.atom?q=starcraft2). The Twitter search feed is not like a standard feed, so it needs custom mapping.

I haven't tested the class inside #41. Will do in the next days.

Meanwhile i finally understand the approach of #40. Please correct me if i'm wrong:

We need a parser that "knows" the <item> fields we are about to parse, so we can select them from the dropdown and map them to our already created CCK fields. The approach of #40, with a sample file, seems to be the only logical solution. Will test #40 and post comments.

Comment #50

alex_b commented 3 March 2010 at 21:59

FYI - this may be interesting for some of you here: #706984: Allow extension of FeedsSimplePieParser parsing

Comment #51

gaele commented 8 March 2010 at 10:42

subscribing

Comment #52

reglogge commented 11 March 2010 at 18:22

I am using the module and patch from #40 and get a very curious behavior: importing works just fine, but only the last item from an xml file is ever imported. So on each import I get just one new node, even if there are many more in the feed.

Any ideas?

Comment #53

Monkey Master commented 26 March 2010 at 14:21

My compact version of FeedsXMLParser (#39) with no need of external module (XML Parser) and with xpath setting (#40):

class FeedsXMLParser extends FeedsParser {

  public function parse(FeedsImportBatch $batch, FeedsSource $source) {
    $file = realpath($batch->getFilePath());
    if (!is_file($file)) {
      throw new Exception(t('File %name not found.', array('%name' => $batch->getFilePath())));
    }
    $xml = simplexml_load_file($file);
    if (!$xml) {
      throw new Exception(t('Could not parse XML file: %file', array('%file' => $batch->getFilePath())));
    }
    if (!empty($this->config['xpath'])) {
        $xml = $xml->xpath($this->config['xpath']);
    }
    $batch->setItems(unserialize_xml($xml));
  }

  public function configDefaults() {
    return array(
      'xpath' => '',
    );
  }

  public function configForm(&$form_state) {
    $form = array();
    $form['xpath'] = array(
      '#type' => 'textfield',
      '#title' => t('XPath'),
      '#default_value' => $this->config['xpath'],
      '#description' => t('XPath to locate items.'),
    );
    return $form;
  }
}

function unserialize_xml($data) {
  if ($data instanceof SimpleXMLElement)
    $data = (array)$data;
  if (is_array($data))
    foreach ($data as &$item)
      $item = unserialize_xml($item);
  return $data;
}

Comment #54

minus commented 26 March 2010 at 19:30

how can this be used :$
Best regards
minus aka Noob

Comment #55

Monkey Master commented 26 March 2010 at 19:47

This is an alternative FeedsXMLParser.inc for the feeds_xmlparser module (post #39) which uses the SimpleXML PHP extension instead of the external xml_parser module.

Comment #56

alex_b commented 26 March 2010 at 20:03

#53: You should be able to use xpath on the mapping UI easily.

Implement a getSourceElement($item, $element_key) method:

On execution, $item will be one of the simple XML items yielded by $this->config['xpath']. $element_key will be a xpath entered by the site builder on the mapping UI.

Comment #57

AntiNSA commented 27 March 2010 at 14:53

subscribe

Comment #58

Monkey Master commented 27 March 2010 at 20:43

Status	File	Size
new	feeds_xmlparser.zip	1.46 KB

I implemented getSourceElement():

  public function getSourceElement($item, $element_key) {
    $xml = $item->xpath($element_key);
    if (is_array($xml) && isset($xml[0])) {
      foreach ((array)$xml[0] as $value) {
        if (is_string($value)) return $value;
      }
    }
    return '';
  }

It returns the first string found at the specified xpath, since arrays are not accepted (lots of warnings, I tried).

Working module is attached.

Comment #59

Monkey Master commented 27 March 2010 at 21:41

#58 fails if the number of items is greater than 50.
SimpleXMLElement has big problems with unserialize(), and FeedsSource->load() cannot restore the batch object after the first iteration.

Comment #60

Monkey Master commented 28 March 2010 at 13:24

Status	File	Size
new	feeds_xmlparser.zip	1.52 KB

I don't see a simple way to fix SimpleXMLElement serialization...
So here's a version without getSourceElement(), with good old arrays.

Successfully tested on importing 9500 ubercart products with cck and filefield.

Comment #61

AntiNSA commented 28 March 2010 at 00:59

Is that the best final solution?

Comment #62

AntiNSA commented 28 March 2010 at 03:28

When I install and select this as the parser, only the bottom right has a drop-down field to select where to map things to. The left side field is a text entry field, not a drop-down list with options to select mapping sources from...

Is this how it is intended to work?

Comment #63

Monkey Master commented 28 March 2010 at 07:58

Yes, like csv-parser - just open xml file and copy field names to mapper settings.
When we configure mappings no xml file is loaded yet - nowhere to get field names from.

Comment #64

alex_b commented 29 March 2010 at 03:20

Priority:	Critical	» Normal
Status:	Active	» Needs review

Very interesting - the limitation is that it can only detect elements one hierarchy level deep in what's returned by config['xpath'] - correct?

Setting this to non-critical as we're keeping track of changes to upcoming release alpha 13 with 'critical'.

Comment #65

Monkey Master commented 29 March 2010 at 06:26

Yes, one level deep. That fits both of the projects I have at the moment - importing uc_products from an external program.

The version with getSourceElement() doesn't have this limitation.
Does anyone have a solution for the SimpleXMLElement serialization problem?

Comment #66

srobert72 commented 29 March 2010 at 08:46

Subscribing

Comment #67

blasthaus commented 29 March 2010 at 10:19

subscribing

Comment #68

Monkey Master commented 29 March 2010 at 15:40

Status	File	Size
new	feeds_xmlparser(with_xpath_mapping).zip	1.56 KB

New version of #58 with a workaround for the SimpleXMLElement serialization problem: items are added to the batch as XML strings and then parsed again individually in getSourceElement().

It supports xpath on the mapping UI.
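
The idea behind this workaround can be sketched roughly as follows. This is not the code from the attached zip: both function names are hypothetical and the Feeds batch object is left out entirely. Items are stored as plain XML strings, which survive serialize()/unserialize(), and each string is parsed again only when a source value is requested.

```php
/**
 * Hypothetical helper: returns the matched items as XML strings instead
 * of SimpleXMLElement objects, so the result can be serialized safely.
 */
function example_pack_items(SimpleXMLElement $xml, $xpath) {
  $items = array();
  $nodes = $xml->xpath($xpath);
  foreach (is_array($nodes) ? $nodes : array() as $node) {
    $items[] = $node->asXML();
  }
  return $items;
}

/**
 * Hypothetical helper: re-parses one stored item and returns the first
 * match of $element_key (an XPath relative to the item) as a string.
 */
function example_get_source_element($item_xml, $element_key) {
  $node = simplexml_load_string($item_xml);
  if ($node === FALSE) {
    return '';
  }
  $matches = $node->xpath($element_key);
  if (is_array($matches) && isset($matches[0])) {
    return (string) $matches[0];
  }
  return '';
}
```

The cost is one extra parse per item, and namespace prefixes declared on an ancestor element may not survive the round trip through asXML().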

Comment #69

xcusso commented 9 April 2010 at 16:12

A fresh install of #68 fails with this Drupal error: "error HTTP 500 /batch?id=53&op=do". In the apache server log: "PHP Fatal error: Call to a member function asXML() on a non-object in /sites/my_site/modules/feeds_xmlparser/FeedsXMLParser.inc on line 24". Any ideas?
Thanks.

Comment #70

Monkey Master commented 10 April 2010 at 09:05

@xcusso: did you set correct xpath in XML parser settings?
An example of XML that causes this error would be helpful

Comment #71

hampshire commented 10 April 2010 at 21:10

What else needs to be installed along with the files in #68 as I have tried every combination on this page it seems and I still have been unable to get this parser to show up on the select a parser page.

Thank you and sorry for the dumb question.

Comment #72

milesw commented 11 April 2010 at 17:02

subscribing

Comment #73

reglogge commented 11 April 2010 at 21:05

subscribing

Comment #74

Monkey Master commented 12 April 2010 at 11:04

@hampshire: you just need to enable module - Feeds XML Parser

Comment #75

Monkey Master commented 12 April 2010 at 12:33

Status	File	Size
new	feeds_xmlparser.zip	1.63 KB

New version
added support of multiple value fields and more checks (e.g. for #69)
xpath in XML parser settings is obligatory, defaults to "item"

Comment #76

gdud commented 12 April 2010 at 13:47

subscribing


Comment #77

AntiNSA commented 13 April 2010 at 09:06

I am trying to use this with this feed :
http://feeds.pheedo.com/OcwWeb/rss/new/mit-newarchivedcourses

I have mapped:

title Title Remove
dc:date Published date Remove
dc:subject Taxonomy: User Keyword Remove
URL URL Remove
GUID GUID Remove
Body Body Remove

HTTP Fetcher
Download content from a URL.

XML parser
Parse data in XML format.

Node processor
Create nodes from parsed content.

XPath:
item
XPath to locate items.---------------------- whats this?

I keep getting the message that no update is available every time I try to import. If you could tell me what I am doing wrong, I would really appreciate it!

Thanks for all your hard work-
Robert


Comment #78

AntiNSA commented 13 April 2010 at 09:49

I should say that everything works with the SimplePie parser. I really would like to use your custom XML parser though! Thanks again.


Comment #79

Monkey Master commented 13 April 2010 at 13:12

I am trying to use this with this feed :
http://feeds.pheedo.com/OcwWeb/rss/new/mit-newarchivedcourses

This XML uses namespaces - not supported at the moment.

XPath to locate items.---------------------- whats this?

It is the XPath query that is executed on the loaded XML file to get an array of XML nodes.
The XPath queries from the mapping UI are then executed on those XML nodes to get the source values.
(i.e. "Name of source field" in the mapping UI is actually an XPath expression for this parser)

So it's better to learn XPath before working with this module...
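
The two-stage lookup described above can be sketched roughly as follows. This is only an illustration using plain SimpleXML, not the module's actual code; the sample feed and the variable names are made up.

```php
<?php
// Hypothetical feed, invented for illustration only.
$raw = <<<XML
<rss>
  <channel>
    <title>Channel title</title>
    <item><title>First</title><guid>a1</guid></item>
    <item><title>Second</title><guid>a2</guid></item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($raw);

// Stage 1: the parser-level XPath ("XPath to locate items") selects the items.
$items = $xml->xpath('//item');

// Stage 2: each mapping source is an XPath evaluated relative to one item.
$rows = array();
foreach ($items as $item) {
  $title = $item->xpath('title');
  $guid = $item->xpath('guid');
  $rows[] = array(
    'title' => (string) $title[0],
    'guid' => (string) $guid[0],
  );
}

print_r($rows);
```

Note that the channel's own title is not picked up, because "title" is evaluated relative to each item.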


Comment #80

dvbii commented 13 April 2010 at 20:58

I have been trying the XPath expressions to map the fields, but can't get them to work.
Using http://eventful.com/atom/richmond/events as my feed, and I set the XML parser Xpath to: entry
Then trying to map start time from the feed as "//gd:when/@startTime". I get no entries..

Any advice?

dvbii


Comment #81

Monkey Master commented 14 April 2010 at 04:38

@dvbii: This XML also uses namespaces (as in #77), which are not supported. Any advice is welcome.
BTW Feeds module already has RSS/Atom parser.

Here's what I found:

SimpleXMLElement->xpath() function doesn’t support default namespaces and will not be able to perform an XPath search on those types of documents.
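
For anyone who wants to see the limitation in isolation, here is a small sketch. It assumes an Atom-like document (the namespace URI is the real Atom one, the content is invented) and shows one commonly used workaround: binding the default namespace to an arbitrary prefix with registerXPathNamespace(). Whether that fits this module is a separate question.

```php
<?php
// Invented sample document with a default namespace on the root element.
$raw = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>Event one</title></entry>
  <entry><title>Event two</title></entry>
</feed>
XML;

$xml = simplexml_load_string($raw);

// Unprefixed names in XPath 1.0 only match elements in no namespace,
// so this finds nothing.
$plain = $xml->xpath('//entry');

// Bind the default namespace to a prefix of our choosing ("a" is arbitrary).
$xml->registerXPathNamespace('a', 'http://www.w3.org/2005/Atom');
$prefixed = $xml->xpath('//a:entry');

echo count($plain), " unprefixed, ", count($prefixed), " prefixed\n";
```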


Comment #82

dvbii commented 16 April 2010 at 15:33

thanks for the reply. If I use the atom parser, how do i map other fields?


Comment #83

clevername commented 18 April 2010 at 04:23

What else needs to be installed along with the files in #68 as I have tried every combination on this page it seems and I still have been unable to get this parser to show up on the select a parser page.

Thank you and sorry for the dumb question.

Not showing up for me either despite module being enabled. Subscribing in hopes for an answer.


Comment #84

gordns commented 21 April 2010 at 10:55

I had the same problem with the parser not showing. Worked after using the "flush all caches" menu in the admin_menu module.


Comment #85

hampshire commented 21 April 2010 at 14:34

I deleted the database and started from scratch, but mine was a test site and not important. I never tried flushing all caches; did that work for you?


Comment #86

Dmxr100 commented 21 April 2010 at 16:10

Hi, this is exactly what I'm looking for, and is working really well apart from one bit that I can't work out.

I have an XML feed from a News service which is importing fine, apart from categories -> Taxonomy.

Here is how a news article appears in the XML:

<Article Created="17:20:54" ID="19393452">
    
    <Heading>Advertising "essential tool"</Heading>
    <Date>05/10/2009</Date>
    <Contents><![CDATA[One charity has discussed how advertising is an &quot;essential tool&quot; that enables it to get its message out...]]></Contents>
    
    <Categories>
          <Category ID="438033218">Customer loyalty</Category>
          <Category ID="438033219">Customer retention</Category>
    </Categories>
	
</Article>

I'm not sure how to map the "Category" values to taxonomy, because they are wrapped within "Categories". Is this possible? If so I would greatly appreciate some instruction.

If I remove the "Categories" wrap in a test XML upload, the mapping works fine, however as it is a feed provided by a third party, they will not change the structure.

Thanks in advance


Comment #87

tinem commented 21 April 2010 at 16:26

subscribing


Comment #88

clemens.tolboom commented 22 April 2010 at 09:08

Status:	Needs review	» Needs work

Could you please add a patch file here :)


Comment #89

Monkey Master commented 23 April 2010 at 07:24

I'm not sure how to map the "Category" values to taxonomy, because they are wrapped within "Categories". Is this possible?

Yes, just use xpath in mapping UI - try "category"

Could you please add a patch file here :)

It's a standalone module, nothing is changed in feeds.


Comment #90

Dmxr100 commented 23 April 2010 at 12:55

Status:	Needs work	» Needs review

Hi Monkey Master

Thanks for your help with this. Turns out I wasn't using the correct XPath syntax. So having "Article" defined in the XPath UI imports the node correctly, and using "Categories//Category" in the mapping to taxonomy pulls in the nested "Category" element.
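
For reference, the working settings above behave like this in plain SimpleXML. This is a sketch, not the module's code; the wrapping Articles root element is an assumption, since the original post only shows a single Article.

```php
<?php
// The <Articles> wrapper is hypothetical; the Article content is abridged
// from the sample posted earlier in this thread.
$raw = <<<XML
<Articles>
  <Article Created="17:20:54" ID="19393452">
    <Heading>Advertising "essential tool"</Heading>
    <Categories>
      <Category ID="438033218">Customer loyalty</Category>
      <Category ID="438033219">Customer retention</Category>
    </Categories>
  </Article>
</Articles>
XML;

$xml = simplexml_load_string($raw);

// Parser setting: locate the items.
$articles = $xml->xpath('//Article');

// Mapping source: evaluated relative to one Article.
$terms = array();
foreach ($articles[0]->xpath('Categories//Category') as $category) {
  $terms[] = (string) $category;
}
print_r($terms);

// XPath names are case-sensitive and Category is not a direct child of
// Article, so a lowercase "category" matches nothing here.
$lower = $articles[0]->xpath('category');
```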

Success :-)


Comment #91

robbertnl commented 27 April 2010 at 12:03

Subscribing.
I am also using #75 now, and am working out how to map nested XML.


Comment #92

Exploratus commented 13 May 2010 at 02:03

This module works great. I was able to import using the XML Parser. My only question now is how do we import the images referenced in the XML feed so we do not have to download manually...

Cheers

Edit: Never mind, got it working...


Comment #93

alex_b commented 28 April 2010 at 18:59

Monkey Master: Very nice work. You are clearly very actively developing feeds_xmlparser. I don't want to stand in the way as gatekeeper of Feeds module - why don't you break out feeds_xmlparser into its own project on d. o. or on github? I see that pvhee's already got a version on github:

http://github.com/pvhee/feeds_xmlparser


Comment #94

chrisirhc commented 12 May 2010 at 21:02

Hi there, just to add on, I've incorporated the code from Monkey Master into my fork of the github (at http://github.com/chrisirhc/feeds_xmlparser) as well as made an amendment myself.

I have a question about Monkey Master's code and my amendment, though: why was this used (1):

    if (is_array($xml)) {
      foreach ((array)$xml[0] as $value) {
        if (is_scalar($value)) return $value;
      }
    }

Instead of just (2):

    if (is_array($xml)) return (string) $xml[0];

If there are issues with what I am doing, please let me know, because I used (2) in my code. I made the change because the current code (1) does not support queries for attributes (results will be blank) e.g. /post/@title
Also, I noticed in my testing of XPath to my knowledge, (1) seems to behave unexpectedly in certain cases. If needed, I'll be happy to post some cases here if anyone needs an example. Just ask! :)
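
A minimal sketch of variant (2) applied to an attribute query, for anyone who wants to try it outside the module. The sample document is invented, and the behaviour of variant (1) with attributes is taken from the report above rather than re-checked here.

```php
<?php
// Invented sample document.
$xml = simplexml_load_string('<posts><post title="Hello" id="7">Body text</post></posts>');

// An attribute query returns an array of SimpleXMLElement objects,
// just like an element query.
$matches = $xml->xpath('/posts/post/@title');

// Variant (2): cast the first match to a string.
$value = (is_array($matches) && isset($matches[0])) ? (string) $matches[0] : NULL;

echo $value, "\n";
```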


Comment #95

Monkey Master commented 13 May 2010 at 08:38

chrisirhc: you've incorporated old version, latest is in post #75


Comment #96

mokko commented 13 May 2010 at 10:17

I am just checking back to see what's the status of this very interesting importer. I am interested in namespace support,too. To me http://www.php.net/manual/en/intro.simplexml.php looks like it supports namespace. Where am I wrong?


Comment #97

chrisirhc commented 13 May 2010 at 11:08

Sorry about that. Incorporated your latest changes. :)

@mokko Regarding namespaces, what kind of support are we looking at? Providing another field so that you can specify a namespace URI? or? I'm interested in getting it to work too.

I'm currently using Feeds with YQL (Yahoo Query Language) to do some imports. Also looking into doing up a fetcher that respects paging and 'staggering' requests (not sure if that's what it's called). I'm not sure if someone else is working on that now.


Comment #98

mokko commented 13 May 2010 at 17:30

In the meantime I found out that you are using drupal.org/project/xml_parser and I am reading through that code. I haven't found out yet how it all works. I am new to feeds etc., so I wasn't able yet to test if it works. I just have xml files that I would like to digest (import to nodes) and they have namespaces. I don't need this urgently, so I guess I will play with it a little more.


Comment #99

chrisirhc commented 13 May 2010 at 20:06

The latest code I used from Monkey Master does not use xml_parser . It uses the native SimpleXML support in PHP.
I believe in order to retrieve the namespaced elements, you need to specify the namespace before retrieving it. This can be done by listing the namespaces in the Feeds XML Parser settings page. I might consider adding this. It should be just a few lines of code I think.

I dug up an example of SimpleXML with namespaced elements. You need PHP 5.2 and above to get it to work.
http://www.ibm.com/developerworks/library/x-simplexml.html

http://sg2.php.net/manual/en/simplexmlelement.registerXPathNamespace.php

It looks rather expensive but the following could be a simple fix to allow namespaced xpaths:

//Fetch all namespaces
$namespaces = $sxe->getNamespaces(true);

//Register them with their prefixes
foreach ($namespaces as $prefix => $ns) {
    $sxe->registerXPathNamespace($prefix, $ns);
}

I'm committing and pushing to the git.

I can already think of a possible problem that might occur. While you might be able to get the namespaced elements out, after that, during mapping, there might still be other challenges.. If the namespace does not persist till the getSourceElement method...
(Night, I'm off to sleep.)
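
On the persistence concern: as far as I understand it, a registration made with registerXPathNamespace() belongs to the element it was called on, so elements returned by xpath() may need the prefix registered again before a per-item query. The sketch below simply re-registers on every item, which is safe either way. The sample document and the prefix "a" are made up.

```php
<?php
// Invented sample with a default namespace.
$raw = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>Event one</title></entry>
</feed>
XML;

$ns = 'http://www.w3.org/2005/Atom';
$xml = simplexml_load_string($raw);
$xml->registerXPathNamespace('a', $ns);

$titles = array();
foreach ($xml->xpath('//a:entry') as $entry) {
  // Register again on the item itself before running the mapping XPath.
  $entry->registerXPathNamespace('a', $ns);
  foreach ($entry->xpath('a:title') as $title) {
    $titles[] = (string) $title;
  }
}

print_r($titles);
```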


Comment #100

Exploratus commented 13 May 2010 at 23:20

Hi everyone.

Quick question. Is there a way to get the value inside the category, not just within the brackets.

this is what i mean:

i can get the value

whatever

but how do I get the value from within the brackets, like this:

How do I pull the venue ID "106543".

Thank you so much.


Comment #101

chrisirhc commented 14 May 2010 at 05:16

Please put everything that is code within the <code></code> tags when giving code examples here.
Anyways, I think you're talking about attributes (<el someattribute="blah"/>) right?

You have to use the @ symbol to refer to attributes.

Take a look at the following links for some information on XPath.
http://www.w3schools.com/xpath/xpath_syntax.asp
http://www.quackit.com/xml/tutorial/xpath_attributes.cfm

Queries on XPath should be somewhere else.. This is a thread on the issues regarding the parser.


Comment #102

Exploratus commented 14 May 2010 at 18:41

Sorry about that, that is exactly what I wanted. I didn't realize that the Drupal Forum took the codes out...


Comment #103

mokko commented 16 May 2010 at 13:18

I have problems installing the feeds_xmlparser. A few more details here: http://drupal.org/node/800256. Any help appreciated.


Comment #104

ChaosD commented 17 May 2010 at 22:36

subscribed

this functionality would add a lot of value to the feeds module. how usable is it so far? do you need help with testing?


Comment #105

mkalisz commented 18 May 2010 at 12:11

subscribing


Comment #106

chrisirhc commented 19 May 2010 at 12:39

Help needed to test whether it works with namespaced XML documents (or any other documents).
You can download it from:
http://github.com/chrisirhc/feeds_xmlparser


Comment #107

Exploratus commented 19 May 2010 at 14:07

I am using namespaces and it works fine...


Comment #108

sagar ramgade commented 27 May 2010 at 12:53

Issue tags:	+xml parser

Status	File	Size
new	example.xml_.txt	23.46 KB

Hi,
I am using feeds_xmlparser with xml_parser. I am not able to import data inside the subtags of the XML file; however, if I save the file and remove the parent tags such as media and mediaItem, it is able to fetch the data.
I want to fetch the data inside caption, mediaUrl, datePrefix, etc.
The XML file I am trying to import is attached; rename it to .xml.


Comment #109

ChaosD commented 27 May 2010 at 12:57

read #90 - that helped for me


Comment #110

sagar ramgade commented 27 May 2010 at 14:07

Here's my first item :

<articleListing>
<article id="6174899">
<articleType id="1">News Story</articleType>
<incomingBasket id="2">soccer</incomingBasket>
<title>Benfica won't hook Huntelaar - Milan</title>
<abstract>
Sporting director Ariedo Braida insists Klaas Jan Huntelaar will not be leaving AC Milan this summer despite reported interest from Benfica.
</abstract>
<body>
The 26-year-old arrived in Milan from Real Madrid last summer and is under contract for a further three years with the Rossoneri.
Braida said: "Huntelaar to Benfica? I don't think that will be possible.
"He is a player that is not for sale and that represents the future of Milan.
"Huntelaar has a contract for a further three seasons with us."
The former Ajax forward scored seven goals in 21 appearances for Milan during the 2009/10 campaign.
</body>
<media>
<mediaItem>
<caption>Huntelaar: Unlikely to join Benfica</caption>
<datePrefix>10/01</datePrefix>
<mediaURL>Klaas-Jan-Huntelaar-AC-Milan-celebrate_2404785.jpg</mediaURL>
<url>
http://clientimages.teamtalk.com/10/01/800x600/Klaas-Jan-Huntelaar-AC-Milan-celebrate_2404785.jpg
</url>
</mediaItem>
</media>
<last-updated>Thu, 27 May 2010 13:26:00</last-updated>
</article>

I tried //media//mediaItem//caption and //caption
Both of them didn't work.
Could you help please ?


Comment #111

ChaosD commented 27 May 2010 at 16:10

Don't forget to set your XPath in /settings/FeedsXMLParser to your "node identifier" (that's what I called it). In your case it would be "article". For the mapping you should use "media//mediaItem//caption".


Comment #112

sagar ramgade commented 27 May 2010 at 16:33

Thank you for your reply, but I couldn't see that XPath setting anywhere. I am using the HTTP fetcher and a custom content type for importing.
Where do I find this setting? I might be acting dumb, sorry for that, but I couldn't find it anywhere.


Comment #113

sagar ramgade commented 27 May 2010 at 18:13

Hi,

Actually i was being dumb i didn't read the thread carefully, i was using this http://github.com/pvhee/feeds_xmlparser
instead of http://github.com/chrisirhc/feeds_xmlparser.
I could see that xpath setting now, however need to test it with the settings.
Will update here if successful.


Comment #114

sagar ramgade commented 28 May 2010 at 10:38

Hi,

It worked like a charm, I am using Feeds Image grabber module to fetch images too...
Thank you all who worked on this.
Cheers


Comment #115

tfranz commented 28 May 2010 at 14:17

I got the example.xml of chrisirhc/feeds_xmlparser to work, but I can't get my own XML file to import:

<?xml version="1.0" encoding="ISO-8859-1"?>
<master>
    <customer>
        <product>
            <object>
                <objecttyp>
                    <value valuetyp="special"/>
                </objecttyp>
            </object>
            <tech>
                <name>franz</name>
                <id>111</id>
            </tech>
        </product>
		
        <product></product>			
    </customer>
</master>

I tried master//customer//product and only product – as XPATH, but none of them worked ("There is no new content") ...

I need the following mapping:

tech//id => GUID
tech//name => Title
object//objecttyp//value@valuetyp => CCK-Textfield

Thank you for any help or hint,

Tobias


Comment #116

sagar ramgade commented 29 May 2010 at 05:24

I think you need to set Xpath as //customer and then
product//object//objecttyp//value@valuetyp => CCK field
product//tech//id => GUID
product//tech//name => Title

Hope this helps.


Comment #117

Thoor commented 30 May 2010 at 09:41

First of all - THX to all who worked on this Issue! Great Job!

The example.xml is working fine for me and my example from january in http://drupal.org/node/631104#comment-2437960 is almost working now, when I use the http://github.com/chrisirhc/feeds_xmlparser in addition to the FEEDS Module.

But I still have a problem I can't solve. As you can see, there is a so-called XML schema in the feed I want to use.

<products xmlns="http://zanox.com/productdata/exportservice/v1" xmlns:xsi=http://www.w3.org /2001/XMLSchema-instance 
xsi:schemaLocation="http://zanox.com/productdata/exportservice/v1 http://productdata.zanox.com/exportservice/schema/export-1.0.xsd">

With this line in the feed i receive the message ("There is no new content") ... while importing.

When I change the line manually to <products>, everything works fine and I can parse and map the XML feed!

Does anyone have a solution, how I can successfully handle the additional XML Schema in my Feed with FEEDS and the FEEDS_XMLPARSER?


Comment #118

srobert72 commented 30 May 2010 at 10:21

@thoor
Not a solution to your problem, but just a workaround.
With Zanox you could also use CSV export instead of XML.


Comment #119

chrisirhc commented 30 May 2010 at 16:39

@Thoor

Is that line proper XML? I notice there's a space in the URL and there are no quotes.

Should it be:

<products xmlns="http://zanox.com/productdata/exportservice/v1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://zanox.com/productdata/exportservice/v1 http://productdata.zanox.com/exportservice/schema/export-1.0.xsd">

Will have to look into this issue if it persists even if it's proper validated XML.


Comment #120

Thoor commented 31 May 2010 at 23:32

@chrisirhc
The line is how it is in the feed. An example feed can be seen under: http://wiki.zanox.com/en/Product_Data_Download#Example_XML_file

I don't know if this is proper XML.

@srobert72
THX for your reply! I knew already, that CSV import works well with FEEDS. Also the "Classic XML Feed" from Zanox is working good! Because there it is possible to turn the XML Schema "off"! Without it, FEEDS with FEEDS_XMLPARSER is working without Problems!


Comment #121

TimG1 commented 1 June 2010 at 01:28

Hi Chris,

Huge thanks for your work on this!

I've been messing around testing this for a bit today. I have a feed full of namespace prefixes. Importing works fine for elements with no prefix, but for the elements with a prefix I get a page full of errors like this...

warning: SimpleXMLElement::__construct() [simplexmlelement.--construct]: namespace error : Namespace prefix dataField on caseId is not defined in /sites/all/modules/feeds_xmlparser/FeedsXMLParser.inc on line 51.

warning: SimpleXMLElement::__construct() [simplexmlelement.--construct]: ^ in /sites/all/modules/feeds_xmlparser/FeedsXMLParser.inc on line 51.

I have a bunch of elements such as

  <dataField:caseId></dataField:caseId> 
  <dataField:lastUpdateDate></dataField:lastUpdateDate> 
  <dataField:categoryName></dataField:categoryName> 
  <dataField:isFeatured></dataField:isFeatured>

I can send you the feed I'm working with via your contact form if you would like to mess around with it.

Thanks again!
-Tim


Comment #122

tfranz commented 1 June 2010 at 11:23

Hi Sagar Ramgade, thanks for your reply!

I tried:
XPath: //customer
product//tech//id => GUID
product//tech//name => Title

At least it created one empty node, and i got the following error (3x):

* warning: mysql_real_escape_string() expects parameter 1 to be string, array given in /mnt/web6/21/84/51321584/htdocs/drupal/includes/database.mysql.inc on line 321.

Line 321 is return mysql_real_escape_string($text, $active_db);

/**
 * Prepare user input for use in a database query, preventing SQL injection attacks.
 */
function db_escape_string($text) {
  global $active_db;
  return mysql_real_escape_string($text, $active_db);
}

Any idea?!


Comment #123

mokko commented 1 June 2010 at 14:15

Using chrisirhc's feeds_xmlparser I ran into the "mysql_real_escape_string()"-warning too. I looked into it a little bit.

In my case it comes up when my source xml has two title-elements while drupal has only one title. I can create a cck field title which allows multiple values, use xmlparser to map to the CCK field (instead of drupal's title) and then this warning disappears. Not sure this work around is what you want. At least then you should be able to import all your nodes.
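
To illustrate the multiple-match case described above: an XPath query returns an array of every matching element, so a single-value target such as the node title may end up being handed an array. The sketch below is plain SimpleXML with invented data (the second product is hypothetical) and is not the module's code.

```php
<?php
// Invented sample: one item that contains two matching name elements.
$raw = <<<XML
<customer>
  <product><tech><name>franz</name><id>111</id></tech></product>
  <product><tech><name>hans</name><id>112</id></tech></product>
</customer>
XML;

$customer = simplexml_load_string($raw);

// With the item XPath pointing at <customer>, this mapping matches twice.
$names = $customer->xpath('product//tech//name');

// Multi-value target (e.g. a CCK field that allows several values).
$values = array_map('strval', $names);

// Single-value target (e.g. the node title): pick one value explicitly.
$single = count($values) ? $values[0] : '';

print_r($values);
echo $single, "\n";
```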


Comment #124

mokko commented 1 June 2010 at 14:44

@Thoor: I also looked into using namespace with chrisirhc's feeds_xmlparser. My understanding is that currently default namespaces are a bit problematic with the module. I ended up naming my namespace. Something like:

<products 
   xmlns:default="http://zanox.com/productdata/exportservice/v1" 
   xmlns="http://zanox.com/productdata/exportservice/v1" 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="http://zanox.com/productdata/exportservice/v1 http://productdata.zanox.com/exportservice/schema/export-1.0.xsd">

In the xpath setting I use
//default:products

I compare this functionality to my xml editor (oxygen). If I remember correctly it has an option where it replaces no namespace with default namespace. If feeds_xmlparser would do this, you could use your original xml with your original xpath setting
//products

Hope it helps
mokko


Comment #125

tfranz commented 2 June 2010 at 09:30

I tried the following:

Xpath: //customer//product
//tech//name => Title
//tech//id => GUID

... and it works! Thanks for your help!
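
For the remaining attribute mapping from #115, note that "value@valuetyp" is not valid XPath; an attribute is its own step, written "value/@valuetyp". A rough sketch with plain SimpleXML follows. It uses relative paths from each product, and skipping the empty product is my own choice for the illustration, not something the module is known to do.

```php
<?php
// Sample adapted from #115 (XML declaration omitted).
$raw = <<<XML
<master>
  <customer>
    <product>
      <object>
        <objecttyp>
          <value valuetyp="special"/>
        </objecttyp>
      </object>
      <tech>
        <name>franz</name>
        <id>111</id>
      </tech>
    </product>
    <product></product>
  </customer>
</master>
XML;

$master = simplexml_load_string($raw);

$rows = array();
foreach ($master->xpath('//customer//product') as $product) {
  $id = $product->xpath('tech/id');
  $name = $product->xpath('tech/name');
  // The attribute is addressed as a separate step: /@valuetyp.
  $type = $product->xpath('object/objecttyp/value/@valuetyp');
  if (!$id) {
    // Skip the empty <product> element.
    continue;
  }
  $rows[] = array(
    'guid' => (string) $id[0],
    'title' => (string) $name[0],
    'type' => $type ? (string) $type[0] : '',
  );
}

print_r($rows);
```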


Comment #126

alex_b commented 6 June 2010 at 20:11

Repeating what I said in #93: Does anyone want to break out this module into its own Drupal project? Seems like there is a large enough user base to justify it - just the issue queue that comes with it would be a big help for maintaining it :-)

While I welcome feeds_xmlparser to the Feeds ecosystem, I am not planning on committing it (or a similar extension) to Feeds at the moment - it is an extension that requires high involvement with its user base and I am personally not dealing with its use cases.


Comment #127

steven jones commented 8 June 2010 at 13:20

Project:	Feeds	» Feeds XML Parser
Version:	6.x-1.x-dev	»

I think it's been done: http://drupal.org/project/feeds_xmlparser


Comment #128

mokko commented 8 June 2010 at 16:02

I am probably repeating what everyone knows here already :
- it seems that pvhee created http://drupal.org/project/feeds_xmlparser and
- we were last talking about chrisirhc's fork of that code.

Both have hosted their modules on GitHub (rather than on d.o), see e.g. http://github.com/chrisirhc/feeds_xmlparser.

Both have been quiet for at least a few days.

I am not very familiar with github. It seems for hosting the code it's fine. I miss drupal's issue queue and would prefer to use the http://drupal.org/project/feeds_xmlparser issue queue for chrisirhc's fork, but currently I think this would risk confusion. This would be somehow along the lines suggested by alex_b.

To me it seems the easiest would be if chrisirhc and or pvhee would say something. Any other suggestions?


Comment #129

podox commented 9 June 2010 at 13:20

Very excited by this, especially the namespaces development. I'm having trouble importing the following feed:

http://rss.oucs.ox.ac.uk/oxitems/generatersstwo2.php?channel_name=classi...

The format is as follows

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>The Beazley Archive - Classical Art Research Centre</title>

    <item>
       <title>Introduction to Art of the Ancient World</title>
       <itunes:author xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">John Boardman, Donna Kurtz</itunes:author>
       <link>http://media.podcasts.ox.ac.uk/oucs/classics/boardman-kurtz-oxonian.mp3?CAMEFROM=podcastsRSS</link>
    </item>

    <item>
      etc etc.
    </item>
  </channel>
</rss>

I've tried a combination of channel//item, item, //item //channel and many more as the Xpath setting, with //title as the sole mapping setting, but I get the error "Could not retrieve title from feed" on import.


Comment #130

mokko commented 9 June 2010 at 14:58

Warning: I don't know much about RSS, so my wording might be strange.

Of course, you have two different titles: the channel title and the item title. I assume you want to map the item information. Correct xpath should be one of two

/rss/channel/item

or

//item

Then you should be able to map the title by entering "title" in the mapping. This should work definitely. With itunes namespace I am less sure. Try separately.

In the namespace tests I made successfully only my root element has a namespace.

You should be able to enter itunes:author in the mapping, but I am not at all sure this does work.

Of course you have to be using http://github.com/chrisirhc/feeds_xmlparser

What do the others think?
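
For testing the itunes part outside the module, here is a sketch in plain SimpleXML. In the sample feed the itunes prefix is declared on the itunes:author element itself rather than on an ancestor, so the sketch registers the prefix on the item before querying; it also shows the children() route, which avoids XPath altogether. Whether the module exposes either route through the mapping UI is not something this example establishes.

```php
<?php
// Abridged from the feed sample posted in #129.
$raw = <<<XML
<rss version="2.0">
  <channel>
    <title>The Beazley Archive - Classical Art Research Centre</title>
    <item>
      <title>Introduction to Art of the Ancient World</title>
      <itunes:author xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">John Boardman, Donna Kurtz</itunes:author>
    </item>
  </channel>
</rss>
XML;

$itunes = 'http://www.itunes.com/dtds/podcast-1.0.dtd';
$xml = simplexml_load_string($raw);

$items = $xml->xpath('/rss/channel/item');
$item = $items[0];

// Plain element: a relative XPath is enough.
$title = $item->xpath('title');

// Namespaced element, route 1: register the prefix on the item, then query.
$item->registerXPathNamespace('itunes', $itunes);
$author = $item->xpath('itunes:author');

// Route 2: select children by namespace URI, no XPath involved.
$authorViaChildren = (string) $item->children($itunes)->author;

echo (string) $title[0], "\n", (string) $author[0], "\n";
```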


Comment #131

chrisirhc commented 9 June 2010 at 15:04

Hi there everyone, apologies for the disappearance. I have been watching this thread but did not realise that it was awaiting my or pvhee's input.

pvhee has contacted me about co-maintaining the module with him. I replied that I'm interested, but he has not responded since. I believe he is busy at the moment. He has plans to bring the module from git back into the CVS system so that it is downloadable from the module page here.


Comment #132

podox commented 9 July 2010 at 10:02

Thanks mokko. I do want to map the item information to the feed item node, and /rss/channel/item works (for the item title, description, media file URL, etc.). Not sure how to map the itunes info (itunes:author doesn't work but I'll keep trying).

The error message comes about from not entering a title in the *feed node* - you have to enter it manually with the XML parser, whereas with the common syndication parser, the feed node title is correctly generated automatically from the source feed.

Thanks for your help - looking forward to seeing this develop.

EDIT: See http://drupal.org/node/838172#comment-3134636 for working mapping settings, including a way to map itunes:author


Comment #133

arski commented 16 June 2010 at 18:16

sub


Comment #134

arski commented 16 June 2010 at 18:26

Hey,

I just installed feeds_xmlparser on my site but when I create a new feed there seems to be no new "XML Parser" option or anything like that in the Parser section. Any ideas what's going on?

Thanks,
Martin

PS. I agree with the post above - please please move the code to d.o. CVS asap - that will make it so much easier for everyone to test/submit issues/comments.. you don't have to make a release until you're ready, but at least the code/issues will be all in one place :)


Comment #135

mokko commented 16 June 2010 at 20:22

Status	File	Size
new	xml_parser.png	120.11 KB

I think I had that problem at first, too, but I don't remember exactly what was the problem. Did you check the rights of feeds_xmlparser directory? I attach a screenshot of how it looks for me. Hope it helps.


Comment #136

arski commented 16 June 2010 at 20:30

Umm yea that's what I expected to see too.. the rights seem to be the same as for any other directory, don't think that should be a problem.. you really don't remember what you did to fix this? :)


Comment #137

arski commented 16 June 2010 at 20:45

Gah.. got it.. I clicked on the "Download Source" button at the top of that github page and apparently I downloaded an old version of the module with basically empty code files.. great stuff :/

After I got the latest individual files manually - you still need to flush your cache for the XML Parser option to appear.

Just writing this down so that anyone else having the same trouble can read up :)

Other than that, let's try this out! :)


Comment #138

TamboWeb commented 16 June 2010 at 21:11

@arski

Have you tried clearing your cache? Go to administer>site configuration > performance at the bottom, clear all cache.

I am currently doing testing with namespaces I will be reporting my findings sometime tonight.

Sandro.


Comment #139

arski commented 16 June 2010 at 21:22

Hey, well yea, as my 2nd post (#137) says - there were 2 issues, including a cache one.

Having tested the thing out - hats off, works like a charm! :)


Comment #140

TamboWeb commented 16 June 2010 at 21:47

I just imported 65 nodes from an xml file. I had to do a small change to the code to be able to import xml node elements within a namespace without prefixes. Other than that, it works great.


Comment #141

TamboWeb commented 17 June 2010 at 04:45

Issue parsing xml file with namespaces.

Setup: Using chrisirhc code from http://github.com/chrisirhc/feeds_xmlparser,

This is just an observation, as I am not much of a coder. I am not implying that this is a bug, nor that what follows is a fix. Somewhere in the posts above there is mention that namespaces are not supported. On the other hand, some posts indicate successful results. For my particular application this is a workaround until I find a robust solution compatible with the module. I do not intend to hack the module.

It seems that the module is unable to import content if the source XML file contains elements that are in a namespace but carry no namespace prefix (i.e. a default namespace). The module responds with "no content found" or with an error, depending on the parser XPath setting.

Sample xml (truncated for simplicity sake):

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<RetrieveListingDataResponse xmlns="http://www.example.com/ExampleNetServices">
<RetrieveListingDataResult>
<Listings xmlns="http://www.example.com/Schemas/Standard/StandardXML1_1.xsd">

<Residential><LN>34193</LN><PTYP>RESI</PTYP><LAG>15649</LAG></Residential>

</Listings>
</RetrieveListingDataResult>
</RetrieveListingDataResponse>
</soap:Body>
</soap:Envelope>

When trying to parse the above code with xpath, the module responds with no new content found.

If I remove the namespace from the Listings element, then the module parses the code correctly. This however is not ideal, since the XML file will be updated daily with cron. I am not sure I want to manipulate the XML file with a search and replace, nor do I want to do it manually prior to parsing.

After reviewing the code in FeedsXMLParser.inc Line 29 through 35,

      //Fetch all namespaces
      $namespaces = $xml->getNamespaces(true);

      //Register them with their prefixes
      foreach ($namespaces as $prefix => $ns) {
        $xml->registerXPathNamespace($prefix, $ns);
      }

I realized that this namespace has no prefix, so nothing is registered that an XPath query could use to reference the Listings element. Hence no content is found.

I proceeded to experiment a little, by hardcoding the namespace and the prefix as follows:

      //Fetch all namespaces
      $namespaces = $xml->getNamespaces(true);

      //Register them with their prefixes
      //foreach ($namespaces as $prefix => $ns) {
        //$xml->registerXPathNamespace($prefix, $ns);
      $xml->registerXPathNamespace('default', "http://www.example.com/Schemas/Standard/StandardXML1_1.xsd"); //hardcoded prefix + namespace, and commented out original code.
      //}

and changed the “Settings for XML parser” xpath settings to //default:Residential.

Now all nodes are imported without errors. It seems that the "Fetch all namespaces" step needs some revision to accommodate namespaces without prefixes, since a blank prefix cannot be registered.
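
A less hardcoded variant of the same workaround might look like the sketch below. It is only an illustration: register_namespaces() and the 'default' alias are names I made up, not part of the module, and if a document carries more than one prefix-less namespace only one of them ends up under the alias.

```php
<?php
// Sketch only: register_namespaces() and the 'default' alias are hypothetical.
function register_namespaces(SimpleXMLElement $xml, $alias = 'default') {
  // getNamespaces(TRUE) keys a prefix-less (default) namespace by the empty string.
  foreach ($xml->getNamespaces(TRUE) as $prefix => $ns) {
    // An empty prefix cannot be used in an XPath query, so substitute the alias.
    $xml->registerXPathNamespace($prefix === '' ? $alias : $prefix, $ns);
  }
  return $xml;
}

$raw = '<Listings xmlns="http://www.example.com/Schemas/Standard/StandardXML1_1.xsd">'
  . '<Residential><LN>34193</LN><PTYP>RESI</PTYP><LAG>15649</LAG></Residential>'
  . '</Listings>';

$xml = register_namespaces(simplexml_load_string($raw));
$rows = $xml->xpath('//default:Residential');
echo count($rows), ' row(s), first LN: ', (string) $rows[0]->LN, "\n";
```

With this in place the parser XPath setting would still be //default:Residential, as in the hardcoded version.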

Sandro.

Comment #142

mokko commented 17 June 2010 at 08:24

Yes, Sandro! This is similar to what I meant in #124. I didn't see that this has consequences for the import (updates). I don't get why it is not necessary to register the usual namespaces (via $xml->getNamespaces(true)). Shouldn't we just add "default" as the prefix for the default namespace?

Anyway, I guess we now need a generic way to access a namespace without a prefix. Apparently ->getNamespaces does not work. Then ->getDocNamespaces will not work either, will it?

At http://de3.php.net/manual/en/simplexmlelement.getDocNamespaces.php it says:

If there is no prefix (a default namespace), the empty string will be used as a key in the array referencing that namespace value. 
The earliest ancestor will be used for overwriting any identical  prefixes (or lack thereof).

This doesn't sound good. It probably means we can access only one default namespace. That would be an improvement though. When I have some time, I can look into it tonight or so.
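
One way around the single-default limitation could be to skip the prefix arrays altogether, walk the elements with DOM, and give every distinct namespace URI a generated prefix. This is only a sketch: collect_namespace_uris() and the ns0, ns1, ... prefixes are invented for the example, and the urn:example:* URIs are placeholders.

```php
<?php
// Sketch only: collect_namespace_uris() and the generated prefixes are hypothetical.
function collect_namespace_uris($raw) {
  $dom = new DOMDocument();
  $dom->loadXML($raw);
  $uris = array();
  // Every element reports the namespace URI it lives in, prefixed or not.
  foreach ($dom->getElementsByTagName('*') as $element) {
    $uri = $element->namespaceURI;
    if ($uri !== NULL && !in_array($uri, $uris, TRUE)) {
      $uris[] = $uri;
    }
  }
  return $uris;
}

// Two nested default namespaces, neither with a prefix.
$raw = '<a xmlns="urn:example:outer"><b xmlns="urn:example:inner"><c>1</c></b></a>';

$xml = simplexml_load_string($raw);
$uris = collect_namespace_uris($raw);
foreach ($uris as $i => $uri) {
  // ns0 => urn:example:outer, ns1 => urn:example:inner (document order).
  $xml->registerXPathNamespace('ns' . $i, $uri);
}
$found = $xml->xpath('//ns1:c');
```

The drawback is that the prefixes depend on document order, so a settings form would have to show the user which prefix was assigned to which URI.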

Comment #143

robbertnl commented 17 June 2010 at 14:01

What about performance? I am using this to import an 8 MB XML file. It works, but it eats a lot of memory (about 1.5 GB) and takes several hours to complete.
I am doing some processing (creating users, content profiles, groups and importing small images), but I don't think it should take that long. I think it's an issue with XML Parser, but maybe also with this Extensible XML parser?

Comment #144

TamboWeb commented 17 June 2010 at 16:59

Hi mokko,

Thanks for the input and the link. I will take a look at that information as well. Currently I have to pull away for some time to take care of another project but I will get back to this as soon as I can.

Comment #145

TamboWeb commented 17 June 2010 at 17:09

robbertnl

Regarding #143,

I am also concerned about performance. I have read somewhere that XPath is slower than using the native PHP XML functions (I think). Eventually, my application will be doing some processing as well. So far this is a good start and proof of concept.
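
On the memory side, one option that might help is to stream the file with XMLReader and only expand one item at a time into SimpleXML. A rough sketch, assuming every record is an <item> element and no query needs to look outside its own item (stream_items() and the sample feed are made up for illustration, they are not part of Feeds):

```php
<?php
// Sketch only: stream_items() is a hypothetical helper, not part of Feeds.
function stream_items($raw, $element_name, $callback) {
  $reader = new XMLReader();
  $reader->XML($raw); // for a file on disk, $reader->open($path) would be used instead
  while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === $element_name) {
      // Only this one element is expanded into a SimpleXML tree at a time.
      call_user_func($callback, simplexml_load_string($reader->readOuterXml()));
    }
  }
  $reader->close();
}

$raw = '<rss><channel>'
  . '<item><title>A</title></item>'
  . '<item><title>B</title></item>'
  . '</channel></rss>';

$titles = array();
stream_items($raw, 'item', function (SimpleXMLElement $item) use (&$titles) {
  $titles[] = (string) $item->title;
});
```

Whether this helps in practice depends on where the time really goes; the node saving and image handling may well cost more than the parsing.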

One feature that I need is the ability to delete nodes referenced by another XML file. For example, import new content from XML file #1, then later delete the content specified in XML file #2. So far the Feeds functionality allows you to delete content from the same feed only.

Another feature would be the ability to schedule a feed import via cron. For instance, run the import of XML file #1 daily, and run the node deletion as per XML file #2 every other day.

If anyone knows how to do this, I would be interested in hearing some pointers.

Thanks.

Comment #146

sagar ramgade commented 18 June 2010 at 13:49

Hi,

The xml feed which i am trying to fetch contains date as shown below

I tried to map with the my cck date field (date or datetime tried both ), it gives me an error message :
warning: DateTime::__construct() [datetime.--construct]: Failed to parse time string (18/06/2010) at position 0 (1): Unexpected character in /../sites/all/modules/feeds/plugins/FeedsParser.inc on line 388.

In my CCK date field I have set the custom input format to d/m/Y, which should match the date format in the XML file. However, nothing is imported; the import stops with the error message above.
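
For reference, the DateTime constructor reads slash-separated dates as American month/day/year, so 18/06/2010 fails because there is no month 18. Parsing with an explicit format avoids the guess. A small sketch of what a conversion step before mapping could look like (parse_feed_date() is a hypothetical helper, not something Feeds provides):

```php
<?php
// Sketch only: parse_feed_date() is a hypothetical helper.
function parse_feed_date($value, $format = 'd/m/Y') {
  // The leading "!" resets every field the format does not mention (time of day) to zero.
  $date = DateTime::createFromFormat('!' . $format, $value, new DateTimeZone('UTC'));
  // createFromFormat() returns FALSE when the string does not fit the format.
  return $date ? $date->format('Y-m-d') : FALSE;
}

echo parse_feed_date('18/06/2010'), "\n"; // 2010-06-18
```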

Comment #147

rjbrown99 commented 18 June 2010 at 17:19

It seems like this issue has started to become a catch-all for any problems related to XML parsing. Since the issue was first created, we now have a Feeds XML Parser module and a dedicated issue queue for it.

Would it be too much to ask that we take any new issues, problems, or questions and break them out into separate items?

Comment #148

michellezeedru commented 23 June 2010 at 21:58

Plowing through this, knowing next to nothing about feeds and RSS, I was able to follow leads from this thread and set up a successful importer for an XML feed. Thought I'd subscribe to keep apprised of progress and also share my scenario, in case it's helpful to anyone else.

Setup: Using chrisirhc code from http://github.com/chrisirhc/feeds_xmlparser

The feed I'm trying to import looks like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version="2.0" xmlns:ra="http://www.radioactivity.fm/inc/namespace">
   <channel>
      <title>Last twenty four hours of plays on KALX </title>
      <link>http://kalx.radioactivity.fm/feeds/last24hours.xml</link>
      <description>The last twenty hour of plays on KALX</description>
      <language>en-us</language>
      <pubDate>Wed, 23 Jun 2010 15:48:44 EST</pubDate>	
	
      <item>
         <title>Play # 2 </title>
         <link>http://kalx.radioactivity.fm/</link>
         <ra:time>06/23/2010 11:54 am</ra:time>
         <ra:track>Alien Heart</ra:track>
         <ra:artist>Elodie Lauten</ra:artist>
         <ra:album>Piano Works Revisited</ra:album>
         <ra:label>Unseen Worlds</ra:label>
         <ra:genre></ra:genre>
      </item>
</channel>
</rss>

In my feed importer>XML Processor settings, I set X-Path to //item
In mappings, I set up the following sources to map to CCK fields:

Title
//track
//artist
//album
//time

And it works -- for the most part! The feed apparently contains no type of unique identifier, so updating is not working. I guess my next step is to work with the producer of this feed to include GUID in the feed?
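
Until the feed carries a real GUID, one stopgap might be to build one from fields that together identify a play. The sketch below is only an idea: item_guid() is a made-up helper, and the choice of time + track + artist is an assumption about what is unique in this feed.

```php
<?php
// Sketch only: item_guid() is hypothetical; the field choice is an assumption.
function item_guid(SimpleXMLElement $item, $ns) {
  // children($ns) exposes the elements that live in the ra: namespace.
  $ra = $item->children($ns);
  return md5($ra->time . '|' . $ra->track . '|' . $ra->artist);
}

$ns = 'http://www.radioactivity.fm/inc/namespace';
$raw = '<rss version="2.0" xmlns:ra="' . $ns . '"><channel><item>'
  . '<title>Play # 2 </title>'
  . '<ra:time>06/23/2010 11:54 am</ra:time>'
  . '<ra:track>Alien Heart</ra:track>'
  . '<ra:artist>Elodie Lauten</ra:artist>'
  . '</item></channel></rss>';

$xml = simplexml_load_string($raw);
$guids = array();
foreach ($xml->channel->item as $item) {
  $guids[] = item_guid($item, $ns);
}
```

This would have to run somewhere before the mapping step, so it is more of a note for whoever talks to the feed producer than a drop-in setting.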

Thanks for the work on this everyone.

Comment #149

ChaosD commented 24 June 2010 at 09:47

You could use the <link> as GUID because it's most likely unique and won't give duplicates (assuming that each item has its own link on the original site) ... title might also work, but personally I would not rely on that.

Comment #150

meatbag commented 25 June 2010 at 02:48

When i test chrisirhc's xml parser on the following feed, i always get the "no new content" message.

<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://purl.org/rss/1.0/"
 xmlns:content="http://purl.org/rss/1.0/modules/content/"
 xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:syn="http://purl.org/rss/1.0/modules/syndication/"
 xmlns:image="http://purl.org/rss/1.0/modules/image/"
 xmlns:admin="http://webns.net/mvcb/"
 xmlns:atom="http://www.w3.org/2005/Atom"
>
feed contents......
</rdf:RDF>

So I tried removing the namespace declarations from the root element while keeping the other settings untouched.

<rdf:RDF
>
feed contents......
</rdf:RDF>

And it works this time! But I just can't figure out why it failed to parse the first one.
Any help would be appreciated.

Comment #151

meatbag commented 25 June 2010 at 05:48

OK, I figured out that the namespace without a prefix is probably the reason...

Comment #152

sagar ramgade commented 25 June 2010 at 09:36

Hi All,

I want to fetch the home team score, the away team score and the MATCHDETAIL attendance.
I am able to fetch the match ID, home team name and away team name, but not the home team score, away team score or attendance.
Can anyone help?
I have set the XPath to: //XML//DETAILS

//HOMETEAMNAME[@SNAME] => Title

//HOMETEAMNAME[@SNAME] => Home team term

//AWAYTEAMNAME[@SNAME] => Away team term

//MATCHSUMMARY[@ID] => GUID

//MATCHSUMMARY[@ID] => matchID

//MATCHSUMMARY//HOMETEAMSCORE//SCORE => home team score

//MATCHSUMMARY//AWAYTEAMSCORE//SCORE => away team score

//MATCHDETAIL[@ATTENDANCE] => attendance

<XML>
<DETAILS>
<MATCHSUMMARY SPORTID="1" ID="301096" PAID="3222305" CPID="1627" SBID="12345058" TVCHANNEL="" FXDATE="21/06/2010" CURRENTDATE="" ENDDATE="" TIME="12:30" LONGCOMPETITIONNAME="FIFA World Cup" SHORTCOMPETITIONNAME="World Cup" LONGROUNDNAME="Group G" SHORTROUNDNAME="Group G" DETSLASTUPDATED="1277145082" COMMSLASTUPDATED="1277130900" STATSLASTUPDATED="1" TABLELASTUPDATED="1277140093" COMMENTARY="/football/commentary/301096_1277130900.xml" GALLERYLASTUPDATED="1277132280" GALLERY="/football/editorial/gallery_match_301096_1277132280.xml" PLAYERSTATS="Y">
<HOMETEAMNAME ID="195" SNAME="Portugal">Portugal</HOMETEAMNAME>
<HOMETEAMSCORE>
<SCORE AGG="">7</SCORE>
</HOMETEAMSCORE>
<AWAYTEAMNAME ID="9742" SNAME="Korea DPR">Korea DPR</AWAYTEAMNAME>
<AWAYTEAMSCORE>
<SCORE AGG="">0</SCORE>
</AWAYTEAMSCORE>
<MATCHSTATUS STATUS="5"/>
</MATCHSUMMARY>
<MATCHDETAIL VENUENAME="Green Point Stadium" VENUEIMAGE="http://img.skysports.com/10/06/68x93/Green-Point-Stadium-South-Africa_2464128.jpg" ATTENDANCE="63644"/>
<LAST6>
<HOMETEAM ID="195">
<MATCH RESULT="win" HOMETEAMNAME="Portugal" AWAYTEAMNAME="Korea DPR" HOMETEAMSCORE="7" AWAYTEAMSCORE="0"/>
<MATCH RESULT="draw" HOMETEAMNAME="Ivory Coast" AWAYTEAMNAME="Portugal" HOMETEAMSCORE="0" AWAYTEAMSCORE="0"/>
<MATCH RESULT="win" HOMETEAMNAME="Portugal" AWAYTEAMNAME="Mozambique" HOMETEAMSCORE="3" AWAYTEAMSCORE="0"/>
<MATCH RESULT="win" HOMETEAMNAME="Portugal" AWAYTEAMNAME="Cameroon" HOMETEAMSCORE="3" AWAYTEAMSCORE="1"/>
<MATCH RESULT="draw" HOMETEAMNAME="Portugal" AWAYTEAMNAME="Cape Verde Islands" HOMETEAMSCORE="0" AWAYTEAMSCORE="0"/>
<MATCH RESULT="win" HOMETEAMNAME="Portugal" AWAYTEAMNAME="China PR" HOMETEAMSCORE="2" AWAYTEAMSCORE="0"/>
</HOMETEAM>
<AWAYTEAM ID="9742">
<MATCH RESULT="lose" HOMETEAMNAME="Portugal" AWAYTEAMNAME="Korea DPR" HOMETEAMSCORE="7" AWAYTEAMSCORE="0"/>
<MATCH RESULT="lose" HOMETEAMNAME="Brazil" AWAYTEAMNAME="Korea DPR" HOMETEAMSCORE="2" AWAYTEAMSCORE="1"/>
<MATCH RESULT="lose" HOMETEAMNAME="Nigeria" AWAYTEAMNAME="Korea DPR" HOMETEAMSCORE="3" AWAYTEAMSCORE="1"/>
<MATCH RESULT="draw" HOMETEAMNAME="Greece" AWAYTEAMNAME="Korea DPR" HOMETEAMSCORE="2" AWAYTEAMSCORE="2"/>
<MATCH RESULT="lose" HOMETEAMNAME="Paraguay" AWAYTEAMNAME="Korea DPR" HOMETEAMSCORE="1" AWAYTEAMSCORE="0"/>
<MATCH RESULT="draw" HOMETEAMNAME="South Africa" AWAYTEAMNAME="Korea DPR" HOMETEAMSCORE="0" AWAYTEAMSCORE="0"/>
</AWAYTEAM>
</LAST6>
<SQUADS>
<SQUAD TEAM="195" FORMATION="433" HOMEFLAG="1">
<PLAYER PLFORN="" PLSURN="Eduardo" PLID="131544" SQUADNO="1" RANK="1" CAPTAIN="No" SUB="No">Eduardo </PLAYER>
<PLAYER PLFORN="" PLSURN="Miguel" PLID="94251" SQUADNO="13" RANK="2" CAPTAIN="No" SUB="No">Miguel </PLAYER>
<PLAYER PLFORN="Ricardo" PLSURN="Carvalho" PLID="99309" SQUADNO="6" RANK="3" CAPTAIN="No" SUB="No">Ricardo Carvalho </PLAYER>
<PLAYER PLFORN="Bruno" PLSURN="Alves" PLID="168852" SQUADNO="2" RANK="4" CAPTAIN="No" SUB="No">Bruno Alves </PLAYER>
<PLAYER PLFORN="Alexandre" PLSURN="Fabio Coentrao" PLID="190752" SQUADNO="23" RANK="5" CAPTAIN="No" SUB="No">Alexandre Fabio Coentrao </PLAYER>
<PLAYER PLFORN="" PLSURN="Tiago" PLID="96219" SQUADNO="19" RANK="6" CAPTAIN="No" SUB="No">Tiago </PLAYER>
<PLAYER PLFORN="Pedro" PLSURN="Mendes" PLID="99312" SQUADNO="8" RANK="7" CAPTAIN="No" SUB="No">Pedro Mendes </PLAYER>
<PLAYER PLFORN="Raul" PLSURN="Meireles" PLID="126502" SQUADNO="16" RANK="8" CAPTAIN="No" SUB="No">Raul Meireles </PLAYER>
<PLAYER PLFORN="Cristiano" PLSURN="Ronaldo" PLID="94592" SQUADNO="7" RANK="9" CAPTAIN="No" SUB="No">Cristiano Ronaldo </PLAYER>
<PLAYER PLFORN="Hugo" PLSURN="Almeida" PLID="163879" SQUADNO="18" RANK="10" CAPTAIN="No" SUB="No">Hugo Almeida </PLAYER>
<PLAYER PLFORN="" PLSURN="Simao" PLID="94259" SQUADNO="11" RANK="11" CAPTAIN="No" SUB="No">Simao </PLAYER>
<PLAYER PLFORN="" PLSURN="Beto" PLID="154641" SQUADNO="12" RANK="12" CAPTAIN="No" SUB="Yes">Beto </PLAYER>
<PLAYER PLFORN="Paulo" PLSURN="Ferreira" PLID="99310" SQUADNO="3" RANK="13" CAPTAIN="No" SUB="Yes">Paulo Ferreira </PLAYER>
<PLAYER PLFORN="" PLSURN="Rolando" PLID="125249" SQUADNO="4" RANK="14" CAPTAIN="No" SUB="Yes">Rolando </PLAYER>
<PLAYER PLFORN="" PLSURN="Duda" PLID="97005" SQUADNO="5" RANK="15" CAPTAIN="No" SUB="Yes">Duda </PLAYER>
<PLAYER PLFORN="Liedson" PLSURN="da Silva Muniz" PLID="100957" SQUADNO="9" RANK="16" CAPTAIN="No" SUB="Yes">Liedson da Silva Muniz </PLAYER>
<PLAYER PLFORN="" PLSURN="Danny" PLID="126966" SQUADNO="10" RANK="17" CAPTAIN="No" SUB="Yes">Danny </PLAYER>
<PLAYER PLFORN="Miguel" PLSURN="Veloso" PLID="132102" SQUADNO="14" RANK="18" CAPTAIN="No" SUB="Yes">Miguel Veloso </PLAYER>
<PLAYER PLFORN="" PLSURN="Pepe" PLID="126237" SQUADNO="15" RANK="19" CAPTAIN="No" SUB="Yes">Pepe </PLAYER>
<PLAYER PLFORN="Ricardo" PLSURN="Costa" PLID="101563" SQUADNO="21" RANK="20" CAPTAIN="No" SUB="Yes">Ricardo Costa </PLAYER>
<PLAYER PLFORN="Ruben" PLSURN="Amorim" PLID="127221" SQUADNO="17" RANK="21" CAPTAIN="No" SUB="Yes">Ruben Amorim </PLAYER>
<PLAYER PLFORN="" PLSURN="Deco" PLID="99315" SQUADNO="20" RANK="22" CAPTAIN="No" SUB="Yes">Deco </PLAYER>
<PLAYER PLFORN="Daniel" PLSURN="Fernandes" PLID="138805" SQUADNO="22" RANK="23" CAPTAIN="No" SUB="Yes">Daniel Fernandes </PLAYER>
</SQUAD>
<SQUAD TEAM="9742" FORMATION="541" HOMEFLAG="0">
<PLAYER PLFORN="Ri Myong" PLSURN="Guk" PLID="187701" SQUADNO="1" RANK="1" CAPTAIN="No" SUB="No">Ri Myong Guk </PLAYER>
<PLAYER PLFORN="Jong-Hyok" PLSURN="Cha" PLID="167876" SQUADNO="2" RANK="2" CAPTAIN="No" SUB="No">Jong-Hyok Cha </PLAYER>
<PLAYER PLFORN="Chol-Jin" PLSURN="Pak" PLID="187707" SQUADNO="13" RANK="3" CAPTAIN="No" SUB="No">Chol-Jin Pak </PLAYER>
<PLAYER PLFORN="Jun-Il" PLSURN="Ri" PLID="167875" SQUADNO="3" RANK="4" CAPTAIN="No" SUB="No">Jun-Il Ri </PLAYER>
<PLAYER PLFORN="Yun-Nam" PLSURN="Ji" PLID="167866" SQUADNO="8" RANK="5" CAPTAIN="No" SUB="No">Yun-Nam Ji </PLAYER>
<PLAYER PLFORN="Kwang-Chon" PLSURN="Ri" PLID="167874" SQUADNO="5" RANK="6" CAPTAIN="No" SUB="No">Kwang-Chon Ri </PLAYER>
<PLAYER PLFORN="Yong-Hak" PLSURN="Ahn" PLID="167868" SQUADNO="17" RANK="7" CAPTAIN="No" SUB="No">Yong-Hak Ahn </PLAYER>
<PLAYER PLFORN="In-Guk" PLSURN="Mun" PLID="167872" SQUADNO="11" RANK="8" CAPTAIN="No" SUB="No">In-Guk Mun </PLAYER>
<PLAYER PLFORN="Nam-Chol" PLSURN="Pak" PLID="167484" SQUADNO="4" RANK="9" CAPTAIN="No" SUB="No">Nam-Chol Pak </PLAYER>
<PLAYER PLFORN="Yong-Jo" PLSURN="Hong" PLID="168073" SQUADNO="10" RANK="10" CAPTAIN="No" SUB="No">Yong-Jo Hong </PLAYER>
<PLAYER PLFORN="Tae-Se" PLSURN="Jong" PLID="190754" SQUADNO="9" RANK="11" CAPTAIN="No" SUB="No">Tae-Se Jong </PLAYER>
<PLAYER PLFORN="Myong-Gil" PLSURN="Kim" PLID="167865" SQUADNO="18" RANK="12" CAPTAIN="No" SUB="Yes">Myong-Gil Kim </PLAYER>
<PLAYER PLFORN="Kum-il" PLSURN="Kim" PLID="167873" SQUADNO="6" RANK="13" CAPTAIN="No" SUB="Yes">Kum-il Kim </PLAYER>
<PLAYER PLFORN="Chol-Hyok" PLSURN="An" PLID="167482" SQUADNO="7" RANK="14" CAPTAIN="No" SUB="Yes">Chol-Hyok An </PLAYER>
<PLAYER PLFORN="Kum-Chol" PLSURN="Choe" PLID="173853" SQUADNO="12" RANK="15" CAPTAIN="No" SUB="Yes">Kum-Chol Choe </PLAYER>
<PLAYER PLFORN="Nam-Chol" PLSURN="Pak" PLID="189952" SQUADNO="14" RANK="16" CAPTAIN="No" SUB="Yes">Nam-Chol Pak </PLAYER>
<PLAYER PLFORN="Yong-Jun" PLSURN="Kim" PLID="167485" SQUADNO="15" RANK="17" CAPTAIN="No" SUB="Yes">Yong-Jun Kim </PLAYER>
<PLAYER PLFORN="Song-Chol" PLSURN="Nam" PLID="167867" SQUADNO="16" RANK="18" CAPTAIN="No" SUB="Yes">Song-Chol Nam </PLAYER>
<PLAYER PLFORN="Chol-Myong" PLSURN="Ri" PLID="176320" SQUADNO="19" RANK="19" CAPTAIN="No" SUB="Yes">Chol-Myong Ri </PLAYER>
<PLAYER PLFORN="Kwang-Hyok" PLSURN="Ri" PLID="175920" SQUADNO="21" RANK="20" CAPTAIN="No" SUB="Yes">Kwang-Hyok Ri </PLAYER>
<PLAYER PLFORN="Kyong-Il" PLSURN="Kim" PLID="175921" SQUADNO="22" RANK="21" CAPTAIN="No" SUB="Yes">Kyong-Il Kim </PLAYER>
<PLAYER PLFORN="Sung Hyok" PLSURN="Pak" PLID="189953" SQUADNO="23" RANK="22" CAPTAIN="No" SUB="Yes">Sung Hyok Pak </PLAYER>
<PLAYER PLFORN="Myong-Won" PLSURN="Kim" PLID="167864" SQUADNO="20" RANK="23" CAPTAIN="No" SUB="Yes">Myong-Won Kim </PLAYER>
</SQUAD>
</SQUADS>
<EVENTS>
<EVENT TIME="90" TYPE="7" PLAYER=""/>
<EVENT TIME="89" TYPE="1" TEAMFLAG="0" PLID="96219" PLAYER="Tiago"/>
<EVENT TIME="87" TYPE="1" TEAMFLAG="0" PLID="94592" PLAYER="Ronaldo"/>
<EVENT TIME="81" TYPE="1" TEAMFLAG="0" PLID="100957" PLAYER="da Silva Muniz"/>
<EVENT TIME="77" TYPE="9" TEAMFLAG="0" PLID="163879" SUBON="100957" PLAYER="Almeida"/>
<EVENT TIME="75" TYPE="9" TEAMFLAG="1" PLID="167876" SUBON="167867" PLAYER="Cha"/>
<EVENT TIME="74" TYPE="9" TEAMFLAG="0" PLID="94259" SUBON="97005" PLAYER="Simao"/>
<EVENT TIME="70" TYPE="9" TEAMFLAG="0" PLID="126502" SUBON="132102" PLAYER="Meireles"/>
<EVENT TIME="70" TYPE="4" TEAMFLAG="0" PLID="163879" INFO="Dissent" PLAYER="Almeida"/>
<EVENT TIME="60" TYPE="1" TEAMFLAG="0" PLID="96219" PLAYER="Tiago"/>
<EVENT TIME="58" TYPE="9" TEAMFLAG="1" PLID="167872" SUBON="167485" PLAYER="Mun"/>
<EVENT TIME="58" TYPE="9" TEAMFLAG="1" PLID="167484" SUBON="167873" PLAYER="Pak"/>
<EVENT TIME="56" TYPE="1" TEAMFLAG="0" PLID="163879" PLAYER="Almeida"/>
<EVENT TIME="53" TYPE="1" TEAMFLAG="0" PLID="94259" PLAYER="Simao"/>
<EVENT TIME="48" TYPE="4" TEAMFLAG="1" PLID="168073" INFO="Dissent" PLAYER="Hong"/>
<EVENT TIME="45" TYPE="14" PLAYER=""/>
<EVENT TIME="45" TYPE="6" PLAYER=""/>
<EVENT TIME="38" TYPE="4" TEAMFLAG="0" PLID="99312" INFO="Unsporting behaviour" PLAYER="Mendes"/>
<EVENT TIME="33" TYPE="4" TEAMFLAG="1" PLID="187707" INFO="Unsporting behaviour" PLAYER="Jin"/>
<EVENT TIME="29" TYPE="1" TEAMFLAG="0" PLID="126502" PLAYER="Meireles"/>
<EVENT TIME="0" TYPE="10" PLAYER=""/>
</EVENTS>
<STATS>
<TEAM ID="195">
<STAT TYPE="1">7</STAT>
<STAT TYPE="2">11</STAT>
<STAT TYPE="3">15</STAT>
<STAT TYPE="4">5</STAT>
<STAT TYPE="5">3</STAT>
<STAT TYPE="6">58.6</STAT>
<STAT TYPE="7">2</STAT>
<STAT TYPE="8"/>
<STAT TYPE="9">4</STAT>
<STAT TYPE="10">17</STAT>
<STAT TYPE="11">82.00</STAT>
</TEAM>
<TEAM ID="9742">
<STAT TYPE="1">0</STAT>
<STAT TYPE="2">3</STAT>
<STAT TYPE="3">11</STAT>
<STAT TYPE="4">1</STAT>
<STAT TYPE="5">3</STAT>
<STAT TYPE="6">41.4</STAT>
<STAT TYPE="7">2</STAT>
<STAT TYPE="8"/>
<STAT TYPE="9">17</STAT>
<STAT TYPE="10">4</STAT>
<STAT TYPE="11">74.49</STAT>
</TEAM>
</STATS>
<VALID>1</VALID>
</DETAILS>
<DETAILS>
.....
</DETAILS>
<DETAILS>
....
</DETAILS>
</XML>
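
A note on the mappings above, in case it helps: in plain XPath, a predicate such as MATCHDETAIL[@ATTENDANCE] selects the element that has the attribute (whose text is empty here), while MATCHDETAIL/@ATTENDANCE selects the attribute value itself. The sketch below shows the difference with SimpleXML on a cut-down copy of the feed. first_value() is a made-up helper, and I can't say for sure how the module itself evaluates these paths, so treat it as something to try rather than a confirmed fix.

```php
<?php
// Sketch only: first_value() is a hypothetical helper.
function first_value(SimpleXMLElement $context, $path) {
  $hits = $context->xpath($path);
  return $hits ? (string) $hits[0] : NULL;
}

$raw = '<XML><DETAILS>'
  . '<MATCHSUMMARY ID="301096">'
  . '<HOMETEAMNAME ID="195" SNAME="Portugal">Portugal</HOMETEAMNAME>'
  . '<HOMETEAMSCORE><SCORE AGG="">7</SCORE></HOMETEAMSCORE>'
  . '<AWAYTEAMSCORE><SCORE AGG="">0</SCORE></AWAYTEAMSCORE>'
  . '</MATCHSUMMARY>'
  . '<MATCHDETAIL VENUENAME="Green Point Stadium" ATTENDANCE="63644"/>'
  . '</DETAILS></XML>';

$xml = simplexml_load_string($raw);
$items = $xml->xpath('/XML/DETAILS');
$details = $items[0];

// Paths are relative to the DETAILS item, without a leading //.
$element_text = first_value($details, 'MATCHDETAIL[@ATTENDANCE]');  // the element, empty text
$attendance   = first_value($details, 'MATCHDETAIL/@ATTENDANCE');   // the attribute value
$match_id     = first_value($details, 'MATCHSUMMARY/@ID');
$home_score   = first_value($details, 'MATCHSUMMARY/HOMETEAMSCORE/SCORE');
$away_score   = first_value($details, 'MATCHSUMMARY/AWAYTEAMSCORE/SCORE');
```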

Comment #153

mason@thecodingdesigner.com commented 26 June 2010 at 17:03

#75 works for me. Thanks Monkey Master!

Comment #154

ChaosD commented 28 June 2010 at 10:13

Comment #155

twistor commented 3 July 2010 at 14:57

It is related to the problem at hand. Currently (if you check out from cvs) XPath, Regex, and QueryPath are supported. I think I solved the namespace issues that exist with SimpleXML. I was unaware of this thread when I started this project, but it seems that the amount of duplicated work is minimal.

Comment #156

ChaosD commented 3 July 2010 at 16:32

maybe you should consider merging your projects as an extensible parser for feeds

Comment #157

twistor commented 3 July 2010 at 17:44

I'm not opposed to that. I've been thinking about breaking up the functionality of Feeds XPath Parser into different modules, however, I'd like to enable per-field query types. Such that you can use XPath for one field, regex for the second, and QueryPath for the third. The name is not so great for the functionality I do admit. The only thing about merging is that Feeds XML Parser configures the queries in the mapper and Feeds XPath Parser configures them at the endpoint. One allows for greater flexibility, one allows for more control of end users. I'm open to ideas.

Comment #158

ChaosD commented 3 July 2010 at 21:41

Maybe you could call it Feeds Flexible Parser, as a collection of all those methods you mentioned. I am currently happy with the configuration in the mapper, but I have to admit that I am not too familiar with XPath parser.

Comment #159

meatbag commented 5 July 2010 at 12:29

The "extensible" XML parser supports only XPath,
while the XML "XPath" parser supports XPath, QueryPath and regex...

Really confusing... These two modules should swap their names...

Comment #160

blasthaus commented 15 July 2010 at 13:14

are there any immediate plans to finally integrate namespace support? or any tips to get several different namespaces registered or even just hardcoded for now? lookin' fwd.

wil

Comment #161

mokko commented 15 July 2010 at 13:40

As mentioned frequently in this thread, chrisirhc's version has basic namespace support: http://github.com/chrisirhc/feeds_xmlparser

Did you already check out if this version meets your requirements? If I remember correctly it does handle multiple namespaces well, but has problems with default namespaces. Please be more specific.

Also: I haven't had a chance to look at the new http://drupal.org/project/feeds_xpathparser if this might be an alternative.

Comment #162

fereira commented 16 July 2010 at 18:44

I am having some problems with namespaces as well.

The XML that I'm using defines several namespaces. Here is a small snippet:

<ags:resources
xmlns:ags="http://purl.org/agmes/1.1/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:agls="http://www.naa.gov.au/recordkeeping/gov_online/agls/1.2"
xmlns:dcterms="http://purl.org/dc/terms/">
<ags:resource ags:ARN="KE2009400905">
....
</ags:resource>

First of all, when the $namespaces = $xml->getNamespaces(true); line gets executed, it silently ignores the "agls" namespace. I poked around a bit and found that the URI for that namespace doesn't appear to be valid anymore. If I change it to http://www.naa.gov.au/agls/ then the prefix and namespace show up in the $namespaces array.
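
One possible explanation, which I have not verified against the Agris file: per the PHP manual, getNamespaces() returns the namespaces used in the document, while getDocNamespaces() returns the ones declared in it. A prefix that is declared but never used by any element or attribute would then be missing from the first list. The sketch below shows the difference on a toy document (unused_prefixes() and the urn:example:* URIs are made up for the example):

```php
<?php
// Sketch only: unused_prefixes() is a hypothetical helper.
function unused_prefixes(SimpleXMLElement $xml) {
  $used = $xml->getNamespaces(TRUE);        // namespaces some node actually uses
  $declared = $xml->getDocNamespaces(TRUE); // namespaces declared via xmlns attributes
  return array_keys(array_diff_key($declared, $used));
}

// "u" is declared on the root but no element or attribute uses it.
$raw = '<r:root xmlns:r="urn:example:used" xmlns:u="urn:example:unused"><r:child/></r:root>';
$xml = simplexml_load_string($raw);

$used = $xml->getNamespaces(TRUE);
$declared = $xml->getDocNamespaces(TRUE);
$unused = unused_prefixes($xml);
```

If that is what is happening, registering from getDocNamespaces(TRUE) instead might be enough to get every declared prefix into the XPath context.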

Secondly, I haven't been able to come up with an XPath expression that will parse the attribute of each of the ags:resource elements. Even when I hardcode the registerXPathNamespace function calls with the correct list of namespaces, I get the following error:

warning: SimpleXMLElement::__construct() [simplexmlelement.--construct]: namespace error : Namespace prefix ags for ARN on resource is not defined in /www/html/feedsdev/sites/all/modules/feeds_xmlparser/FeedsXMLParser.inc on line 57.

That's failing when the $xml = new SimpleXMLElement($item); line is executed in the getSourceElement function.
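
My guess is that the item string handed to the constructor no longer carries the xmlns declarations from the root element, so the ags prefix is unknown when the fragment is parsed again. Reading the attribute from the element while it is still part of the original document sidesteps that. A minimal sketch, outside the module (resource_guids() is a hypothetical helper and the dc:title content is invented for the example):

```php
<?php
// Sketch only: resource_guids() is a hypothetical helper, not module code.
function resource_guids(SimpleXMLElement $xml, $ns) {
  $guids = array();
  // children($ns) walks the ags:resource elements; attributes($ns) reads ags:ARN.
  foreach ($xml->children($ns)->resource as $resource) {
    $guids[] = (string) $resource->attributes($ns)->ARN;
  }
  return $guids;
}

$ags = 'http://purl.org/agmes/1.1/';
$raw = '<ags:resources xmlns:ags="' . $ags . '" xmlns:dc="http://purl.org/dc/elements/1.1/">'
  . '<ags:resource ags:ARN="KE2009400905"><dc:title>Example</dc:title></ags:resource>'
  . '</ags:resources>';

$guids = resource_guids(simplexml_load_string($raw), $ags);
```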

The problem for me is that this attribute is what I need to use for the unique GUID mapping.

I wonder if doing something like the evoc module is doing might work. It keeps a couple of database tables for rdf name spaces and uses a simple form element for adding the prefix and namespaces that are used by the rdfcck module for assigning rdf classes and properties.

Comment #163

mokko commented 17 July 2010 at 09:04

From the XML snippet you posted I would assume that you want to set the XPath to ags:resources and

resource/@ARN as GUID (unique target).

(assuming that you really do have resource tags inside resource tags as in the snippet).

I haven't used xmlparser with namespaced attributes.

I don't know about your concrete problem with the agls namespace. Can you find something here http://www.php.net/manual/en/book.simplexml.php?

Comment #164

fereira commented 17 July 2010 at 11:57

Yes, each ags:resource element is a unique resource identified by the ags:ARN attribute in that tag and they are nested within an ags:resources element (that's resources, not resource). Sorry, I didn't provide the closing ags:resources tag. Here's a real example:

http://mayfly.mannlib.cornell.edu/agrisdata/agris.xml

That content uses a schema developed by the FAO of the UN for identifying resources held in agricultural information systems, and it is in use by dozens of institutions worldwide.

There are other issues that seem to be causing problems with the module as well. I noticed that in the feeds_xpathparser module one can specify, for each field, whether or not to show the raw content. When parsing this XML using that module I had to tell it *not* to show raw XML for any element that had content wrapped in a CDATA section.

The dc:description element for at least one of the resources contains HTML entities (‘, ’) and I'm seeing errors on those elements as well.

In any case, I think a handbook with lots of examples of XML content with more complex structures, along with examples of XPath for parsing them, would go a long way toward demonstrating just how flexible the module can be. It may also reveal code changes needed to handle all the possible *valid* XML constructs.

Comment #165

fereira commented 17 July 2010 at 12:35

Forgot to comment on your suggestion about the simplexml book. I had looked at the site but it didn't really answer all my questions.

I'm not so concerned about the agls namespace, as that can be corrected in the Agris documentation, though it would seem to me that the getNamespaces function should produce an error (and return false) if one of the namespace URIs doesn't resolve correctly, rather than just excluding it from the list.

More concerning, to me, is that instantiating the new SimpleXMLElement class generates a warning about the ags namespace. I didn't see anything in the SimpleXML book, but can a new SimpleXMLElement be instantiated from a string that contains namespaces?

Comment #166

blasthaus commented 22 July 2010 at 00:04

I am using chrisirhc's version and it's not recognizing my media namespace, alas

<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/" xmlns:sc="http://www.screencast.com/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
   <channel>
      <title>Title here</title>
      <link>http://www.screencast.com/users/username/rss</link>
      <atom:link rel="next" href="http://www.screencast.com/username/rss/skip=50" />
      <atom:link href="http://www.screencast.com/users/username/rss" rel="self" type="application/rss+xml" />
      <sc:totalAvailable>62</sc:totalAvailable>

      <description>Video tutorials on ...</description>
      <pubDate>Tue, 13 Jul 2010 14:33:31 GMT</pubDate>
      <item>
         <title>Title here</title>
         <description />
         <pubDate>Tue, 13 Jul 2010 14:33:31 GMT</pubDate>
         <media:thumbnail width="70" height="70" url="http://thumb.screencast.com/thumb.gif" />
         <media:content url="http://content.screencast.com/users/RegressionBands2.swf" width="1008" height="658" duration="267" fileSize="6913835" type="application/x-shockwave-flash" />
         <enclosure url="http://content.screencast.com/users/RegressionBands2.swf" type="application/x-shockwave-flash" length="6913835" />
         <link>http://www.screencast.com/users/media/f6cb7c7f-c98f-4eb9-bd2d-196b5a0a5068</link>
         <guid>http://www.screencast.com/users/media/f6cb7c7f-c98f-4eb9-bd2d-196b5a0a5068/yes-its-unique</guid>
      </item>

maybe i'm not doing something right?!

thx-wil

Comment #167

meatbag commented 22 July 2010 at 00:58

#166
The URLs are set as attributes of the tag, so they can't be parsed out.
You need to write a customized version to do that.

Comment #168

podox commented 22 July 2010 at 06:06

#166 Try...

*[name()='media:content']/@url
*[name()='media:content']/@duration
*[name()='media:content']/@fileSize

...etc. Use /rss/channel/item as the Xpath setting
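
To see what these expressions return outside the module, here is a small standalone sketch with SimpleXML on a cut-down copy of the feed from #166. It queries each item while it is still part of the original document; the module may behave differently if it re-parses each item on its own, so this only shows that the XPath itself is sound.

```php
<?php
// Standalone sketch; the feed is a trimmed copy of the one in #166.
$raw = '<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><item>'
  . '<title>Title here</title>'
  . '<media:content url="http://content.screencast.com/users/RegressionBands2.swf"'
  . ' duration="267" fileSize="6913835"/>'
  . '</item></channel></rss>';

$xml = simplexml_load_string($raw);
$results = array();
foreach ($xml->xpath('/rss/channel/item') as $item) {
  // name() compares against the prefixed name as written in the document,
  // so no namespace has to be registered for this query.
  $url = $item->xpath("*[name()='media:content']/@url");
  $duration = $item->xpath("*[name()='media:content']/@duration");
  $results[] = array((string) $url[0], (string) $duration[0]);
}
```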

Comment #169

vaene commented 22 July 2010 at 19:56

Having same issues as #166, tried


[name()='media:thumbnail']/@url    =>  Thumbnail  (cck text field)

but got this error for each attempted import:

warning: SimpleXMLElement::xpath() [simplexmlelement.xpath]: xmlXPathEval: evaluation failed in /var/www/vhosts/mysite.com/httpdocs/sites/all/modules/feeds_xmlparser/FeedsXMLParser.inc on line 52.

it imported other non-namespaced tags in the item tag so my original Xpath setting is correct I am assuming.

Sorry if I missed this before in the chain but could you give us a hint as to how to go about writing a custom parser that would specifically work for capturing attributes of media namespaced tags?

<item>
         <title>Title here</title>
         <description />
         <pubDate>Tue, 13 Jul 2010 14:33:31 GMT</pubDate>
         <media:thumbnail width="70" height="70" url="http://thumb.screencast.com/thumb.gif" />
         <media:content url="http://content.screencast.com/users/RegressionBands2.swf" width="1008" height="658" duration="267" fileSize="6913835" type="application/x-shockwave-flash" />
         <enclosure url="http://content.screencast.com/users/RegressionBands2.swf" type="application/x-shockwave-flash" length="6913835" />
         <link>http://www.screencast.com/users/media/f6cb7c7f-c98f-4eb9-bd2d-196b5a0a5068</link>
         <guid>http://www.screencast.com/users/media/f6cb7c7f-c98f-4eb9-bd2d-196b5a0a5068/yes-its-unique</guid>
      </item>

Comment #170

Fidelix commented 22 July 2010 at 20:26

OMG i want this badly. Is it working ATM ?

Comment #171

blasthaus commented 22 July 2010 at 21:56

thx podox, i tried that and get this error

warning: SimpleXMLElement::__construct() [simplexmlelement.--construct]: namespace error : Namespace prefix media on thumbnail is not defined in /Users/me/Sites/this_site/sites/default/modules/feeds_xmlparser/FeedsXMLParser.inc on line 52.

i assume this is where we need to be looking around line 28 of FeedsXMLParser.inc


    if (version_compare(PHP_VERSION, '5.2.0', '>=')) {
      //Fetch all namespaces
      $namespaces = $xml->getNamespaces(true);

      //Register them with their prefixes
      foreach ($namespaces as $prefix => $ns) {
/*        $xml->registerXPathNamespace($prefix, $ns); */
        $xml->registerXPathNamespace($prefix, $ns);
      }

question is how to get url attributes within a namespace tag, however it doesn't even look as if the namespace is getting recognized here.

thx-w

Comment #172

meatbag commented 23 July 2010 at 04:20

To deal with attributes and namespace better, i suggest using QueryPath as the parsing interface.

Here's an article by the QueryPath author which helps a lot.
http://www.ibm.com/developerworks/opensource/library/os-php-querypath/

Comment #173

podox commented 23 July 2010 at 08:18

#169 There is an asterisk before [name()='media:thumbnail']/@url - could you try again?
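
A minimal sketch of that expression in plain SimpleXML, in case it helps to try it outside the module. The sample document and variable names are made up for illustration; the name() test compares against the prefixed name as written in the document, so the query itself needs no namespace registration, although the document still has to declare the prefix to be loadable.

```php
<?php
// Hypothetical sample feed with the media prefix declared on the root.
$raw = <<<'XML'
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>First</title>
      <media:thumbnail url="http://thumb.screencast.com/thumb.gif" />
    </item>
    <item>
      <title>Second, without a thumbnail</title>
    </item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($raw);

// Run the relative expression against each item, as a per-item mapping would.
$thumbs = array();
foreach ($xml->xpath('//item') as $item) {
  // "*" matches any child element; the predicate keeps only media:thumbnail.
  foreach ($item->xpath("*[name()='media:thumbnail']/@url") as $attr) {
    $thumbs[] = (string) $attr;
  }
}
print_r($thumbs);
```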

Comment #174

SeanBannister commented 24 July 2010 at 14:54

@meatbag QueryPath would be awesome: it's amazingly powerful and, surprisingly, much simpler.

Comment #175

twistor commented 24 July 2010 at 15:18

Not to toot my own horn, but the feeds_xpathparser module has tentative support for QueryPath in dev. I need people to test it out. Not all the features are implemented yet, but anything you can do with XPath should be achievable at this point. I agree that the syntax is much simpler, plus it avoids having to learn yet another syntax.

Comment #176

arski commented 2 August 2010 at 14:48

Hmm, any chance of actually having this module on d.o. anytime soon? You don't have to make a stable release straight away, but it would be nice to have the code in a proper place.

Also, as for the issues: it would be really great if you could take a look at the other issues reported for this project in here, and maybe we should start splitting them up a bit more instead of continuing in this 175-reply thread :o

Looking forward to this module!

Cheers

Comment #177

meatbag commented 3 August 2010 at 06:57

I modified the module to use QueryPath as the parsing interface. It works perfectly for me.
Here's the code.
Note: This just serves as an example for those who want to know how QueryPath works.
You may need further modification before using it on your own server.

// $Id$
require_once 'QueryPath.php';
require_once 'QPXML.php';
/**
 * Parses a given file as an XML document.
 */
class FeedsXMLParser extends FeedsParser {
  /**
   * Implementation of FeedsParser::parse().
   */
  public function parse(FeedsImportBatch $batch, FeedsSource $source) {
    if (empty($this->config['xpath'])) {
      throw new Exception(t('Please set xpath for items in XML parser settings.'));
    }
    $temp = realpath($batch->getFilePath());
    if (!is_file($temp)) {
      throw new Exception(t('File %name not found.', array('%name' => $batch->getFilePath())));
    }
    $file = QueryPathEntities::replaceAllEntities(file_get_contents($temp));
    // Start from an empty array so that a query with no matches is caught
    // below instead of leaving $result undefined.
    $result = array();
    foreach (qp($file, $this->config['xpath']) as $qpitem) {
      $result[] = $qpitem;
    }
    if (empty($result)) {
      throw new Exception(t('Xpath %xpath in XML file %file failed.', array('%xpath' => $this->config['xpath'], '%file' => $batch->getFilePath())));
    }
    $batch->setItems($result);
  }
  function getSourceElement($item, $element_key) {
    // Start from an empty array so an element with no matches returns ''.
    $result = array();
    switch ($element_key) {
      case 'img':
        foreach (qp($item, 'content|encoded') as $qpitem) {
          // Only read the src when an <img> tag was actually found.
          if (preg_match('/<img.*?src="(.*?)"/', strtolower($qpitem->cdata()), $itemmatch)) {
            $ext = substr(trim($itemmatch[1]), -4);
            if ($ext == '.jpg' or $ext == '.png') {
              $result[] = $itemmatch[1];
            }
          }
        }
        break;
      case 'dc|date':
        foreach (qp($item, 'dc|date') as $qpitem) {
          $result[] = strtotime($qpitem->text());
        }
        break;
      default:
        foreach (qp($item, $element_key) as $qpitem) {
          $result[] = $qpitem->text();
        }
    }
    if (is_array($result)) {
      $values = array();
      foreach ($result as $obj) {
        $values[] = (string)$obj;
      }
      if (count($values) > 1) {
        return $values;
      }
      elseif (count($values) == 1) {
        return $values[0];
      }
    }
    return '';
  }
  public function configDefaults() {
    return array(
      'xpath' => 'item',
    );
  }
  public function configForm(&$form_state) {
    $form = array();
    $form['xpath'] = array(
      '#type' => 'textfield',
      '#title' => t('XPath'),
      '#default_value' => $this->config['xpath'],
      '#description' => t('XPath to locate items.'),
    );
    return $form;
  }
}

Comment #178

robbertnl commented 20 August 2010 at 07:07

Does this work with batch support as well (see http://drupal.org/node/744660)?

Comment #179

fereira commented 23 September 2010 at 11:14

I'll toot your horn for you. The latest version of feeds_xpathparser works really well. It would probably be worth going through this issue to see if there is anything that could be added to the xpath parser, but as I see it, there isn't any reason to keep this (feeds_xmlparser) around, as feeds_xpathparser does about everything one would need, it's easy to use, and it's even fairly well documented.

Comment #180

twistor commented 2 April 2013 at 19:39

Status: Needs review » Closed (fixed)

Cleaning old issues.
