I've spent a few hours today researching exactly how to get a data feed with a non-standard XML schema into Drupal 6. From what I can tell, I'm certainly not alone in that respect.
See:
http://drupal.org/node/313910
(I've seen other similar cries for help out in the wild, but unfortunately I didn't note the URLs. Trust me, though: this is a generic EAI kind of issue that many people are trying, or will try, to tackle. So my thinking is: let's consolidate efforts and deal with this the Drupal way, as a community, by tapping into the collective wisdom here.)
Originally, I was thinking of writing a one-off PHP script that would use some combination of cURL, SimpleXML, node_save and cron to accomplish this goal, but then I came across FeedAPI, and now my suspicion is that there's a better, more Drupal-friendly, more robust way to solve this problem that will be reusable.
So, in that light, I'm trying to understand exactly what is involved in making this happen. I'm going to barf (probably a very suitable word in this case) out my understanding to this point and lay out a rough plan of attack, and hopefully someone 'in the know' can chime in and tell me (us) if I'm on track here (or WAY off base; either way will be constructive, I think ;) ).
Apologies in advance for any and all implied or overt ignorance on my part :)
Assumptions:
- XML file containing data to import is available in some accessible folder, or at a URI
- Schema of XML file is known a priori, but will vary depending on each person's specific data feed
e.g. generically, something like
<mydata>
  <item1>
    <field1>value1</field1>
    <field2>Value2</field2>
  </item1>
  <item2>
    <field1>value1</field1>
    <field2>Value2</field2>
  </item2>
</mydata>
Parsing
OK, so from what I can tell, there is no generic XML parser available with FeedAPI, so task 1 would be to create a parser (perhaps using one of the existing parsers' source code as a starting point; this would be useful, no? I couldn't find any directly applicable examples in the docs). I guess one could use SimpleXML to grab and parse the XML file; the result would be a structured array that gets passed on to FeedAPI for subsequent processing and field mapping.
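To make the parsing step concrete, here is a minimal sketch of the "XML in, structured array out" idea. It is written in Python purely for illustration (a Drupal parser would do the equivalent in PHP with SimpleXML), and it is not FeedAPI's parser interface. The function name parse_items and the "_tag" key are made up for this example, and it assumes the flat item/field layout shown in the sample above.

```python
import xml.etree.ElementTree as ET

# Sample data in the same shape as the example feed above.
SAMPLE_XML = """
<mydata>
  <item1>
    <field1>value1</field1>
    <field2>Value2</field2>
  </item1>
  <item2>
    <field1>value1</field1>
    <field2>Value2</field2>
  </item2>
</mydata>
"""


def parse_items(xml_text):
    """Turn each child of the root element into a flat dict of field -> text.

    Hypothetical helper: assumes one level of items, each holding simple
    text fields, with no attributes or nested elements.
    """
    root = ET.fromstring(xml_text.strip())
    items = []
    for item in root:
        # Remember which element the record came from (illustrative key).
        record = {"_tag": item.tag}
        for field in item:
            record[field.tag] = (field.text or "").strip()
        items.append(record)
    return items


items = parse_items(SAMPLE_XML)
for record in items:
    print(record)
```

A real feed would probably need extra handling for attributes, nested elements and namespaces, which this sketch ignores.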
Processing & Mapping to Custom Content Type fields (CCK etc.)
I'm assuming that the specific task at hand does not require a custom processor, since the existing one (plus Feed Element Mapper) will do the job. Accurate assumption, or way off here?
Field Data Types
I watched the great screencasts put together by Sean at DrupalTherapy.com, and I'm assuming that because this has been worked out for media field types, we're good to go here for standard text fields, date fields and the like. A quick read of the Feed Element Mapper module page suggests that this is true. Oh, and although it would be nice to deal with more complex things like node-reference fields or some of the less standard CCK field types, I'm setting that aside for now to try and grok the basics.
Given the steps so far
All one would have to do now is map the fields in the Drupal admin UI (assuming the Feed Element Mapper module is installed too). Yes?
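Conceptually, the mapping step amounts to a lookup table from XML element names to content-type field names. The sketch below (Python, illustration only) shows that idea. It does not reflect how Feed Element Mapper works internally, since there the mapping is configured through the admin UI, and the target names "title" and "field_description" are hypothetical.

```python
# Hypothetical mapping: XML element name -> field on the custom content type.
FIELD_MAP = {
    "field1": "title",
    "field2": "field_description",
}


def map_item(item, field_map):
    """Return a node-like dict holding only the mapped fields.

    Source fields that have no entry in field_map are dropped.
    """
    return {
        target: item[source]
        for source, target in field_map.items()
        if source in item
    }


parsed_item = {"field1": "value1", "field2": "Value2", "extra": "ignored"}
node = map_item(parsed_item, FIELD_MAP)
print(node)
```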
Summary of Steps
- Write a custom parser module
- Map fields to a custom content type you create using the Admin GUI
- Sit back relaxed knowing that everyone else's hard work has made your job relatively easy ;)
How would you handle incremental updates to the source XML file? Seems to me this question comes in two flavours:
- assuming additional feed items have been added to the source XML: a new node will just be created the next time cron runs (e.g. if <item3> is added to the XML source, the result is that a new node will be created for item 3). True? Of course, right, unless I'm WAY out to lunch ....
- assuming an existing item has changed, e.g. for one of the items, <field2>'s value changed from Value2 to NewValue2: in this case, will FeedAPI etc. simply deal with this change without any intervention or special handling? I.e. will the existing node for that item and its corresponding field get updated with NewValue2 when cron runs?
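To pin down what these two flavours involve, here is a rough sketch of change detection between two runs (Python, illustration only). Whether FeedAPI does something like this on its own is exactly the open question above; this is not its implementation. The sketch assumes every item carries a stable identifier (a hypothetical "id" key), and the names fingerprint and plan_changes are made up.

```python
import hashlib
import json


def fingerprint(item):
    """Stable hash of an item's field names and values."""
    payload = json.dumps(item, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def plan_changes(seen, items, key="id"):
    """Classify incoming items as create, update or skip.

    `seen` maps item id -> fingerprint from the previous run and is
    updated in place so the next run compares against this one.
    """
    plan = {"create": [], "update": [], "skip": []}
    for item in items:
        item_id = item[key]
        digest = fingerprint(item)
        if item_id not in seen:
            plan["create"].append(item_id)   # flavour 1: item is new
        elif seen[item_id] != digest:
            plan["update"].append(item_id)   # flavour 2: item has changed
        else:
            plan["skip"].append(item_id)     # nothing to do
        seen[item_id] = digest
    return plan


seen = {}
first_run = plan_changes(seen, [
    {"id": "item1", "field2": "Value2"},
    {"id": "item2", "field2": "Value2"},
])
second_run = plan_changes(seen, [
    {"id": "item1", "field2": "Value2"},
    {"id": "item2", "field2": "NewValue2"},
    {"id": "item3", "field2": "Value2"},
])
print(first_run)
print(second_run)
```

The key point is the stable identifier: without one, a changed item cannot be told apart from a brand-new one, and every edit to the source would produce a duplicate node.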
I'm sure the answers to all my questions are "out there somewhere", but I've already spent a number of hours getting to this point, which has really only got me to a level where I can articulate my questions a little more clearly than I could have this morning.
If someone with FeedAPI know-how could chime in, I think this would help both me and others struggling with the same or similar requirements. I'd be happy to help update the existing docs once I can get my head around this, so as to help others who come along later.
Thoughts/suggestions/answers/comments?
Comments
Comment #1
benansell commented
http://www.webinit.org/category/drupal/page/2/
This seems to be another relevant source of info
Comment #2
benansell commented
I'm closing this item .... I clearly didn't do enough homework before posting. I am still struggling with how to pull this off, but there's still more info/docs out there that I should have tried to read before posting.
Don't want to contribute to clutter, or waste anyone's time.
Comment #3
summit commented
Hi Ben,
I have a sort of XML parser working, using a sort of feeds.php that an IT friend built for me. The only problem I have now is with the location_cck mapper, while FeedAPI has become tighter in supporting location. Maybe a joint effort could lead to a sort of XML parser?
Greetings, Martijn
Comment #4
justinchev commented
I'm trying to figure out a way to do this sort of thing too. I was thinking it would be good if you could pull an XML feed into the aggregator and then use the 'Aggregator Item' view type in 'Views' to render the available fields in your desired format.