hook_feedapi_item_save($feed_item, $fid) seems to be called with $feed_item only having a few settings (title, description, etc).
How is an item processor supposed to get at all the other fields that were in the original feed? (and available in the parser). Without those, we can't populate the node with all the information available in the feed (eg: itunes tags, enclosures, etc).
Duplicating the parser, and adding your extra code would be an option of course, but that's cut'n'paste reuse :) Maybe the parsers could have a hook to allow modules to embellish the default behaviour implementation?
Comments
Comment #1
aron novak
As you may have seen, FeedAPI has a primary/secondary parser setup. The primary parser is responsible for filling all the "must-have" fields (title, description), and secondary parsers can add fields to $feed->options and $feed->items[n]->options. So when you write a parser that is only used as a secondary parser, you don't have to write the code that takes care of the core fields. You may also have noticed that downloading the feed is not part of the parser either. You're right that there is a little duplication (XML processing), but the main idea was that the parsers are fully interchangeable. I'll consider putting hooks into the default parser so that external modules can add fields.
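A rough sketch of the primary/secondary split described above, written as plain PHP so it runs on its own. The function names and the enclosure_url option are made up for illustration and are not FeedAPI's actual parser API; only the $feed->options and $feed->items[n]->options layout comes from the comment.

```php
<?php
// Hypothetical sketch: function names are illustrative, not FeedAPI's real API.

// The primary parser fills only the "must-have" fields of each item.
function example_primary_parse($xml) {
  $feed = new stdClass();
  $feed->options = new stdClass();
  $feed->items = array();
  foreach ($xml->channel->item as $xml_item) {
    $item = new stdClass();
    $item->title = (string) $xml_item->title;
    $item->description = (string) $xml_item->description;
    $item->options = new stdClass();
    $feed->items[] = $item;
  }
  return $feed;
}

// A secondary parser walks the same XML again (the duplicated XML
// processing mentioned above) and only adds extra fields to the options.
function example_secondary_parse($xml, $feed) {
  $n = 0;
  foreach ($xml->channel->item as $xml_item) {
    if (isset($xml_item->enclosure)) {
      // enclosure_url is an assumed option name.
      $feed->items[$n]->options->enclosure_url = (string) $xml_item->enclosure['url'];
    }
    $n++;
  }
  return $feed;
}

$xml = simplexml_load_string(
  '<rss><channel><item><title>Episode 1</title>'
  . '<description>First show</description>'
  . '<enclosure url="http://example.com/a.mp3" type="audio/mpeg"/>'
  . '</item></channel></rss>'
);
$feed = example_secondary_parse($xml, example_primary_parse($xml));
```

An item processor would then read the extra data from $feed->items[n]->options instead of parsing the feed itself.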
Comment #2
lyricnz commented
How's this? (see patch)
Here's what an implementation of the hooks might look like:
Comment #3
lyricnz commented
Updated title, category and status.
Comment #4
alex_b commented
Hi lyricnz,
The problem with your suggestion is that it requires add-on modules to implement a hook API and call those hooks. We should keep this functionality in FeedAPI itself, so that add-on modules stay simple.
I see the problem that you are describing, though.
Does the SimplePie parser not pass its parsed result on to secondary parsers?
Alex
Comment #5
alex_b commented
Comment #6
lyricnz commented
No, the parsers are independent.
I had to patch simplepie because it doesn't keep track of any fields in the feed or item that it doesn't know about (eg: enclosures! and some iTunes-specific stuff). This patch allows custom modules to save whatever they want, and use it later when creating the nodes.
FWIW, I also patched feedapi_item to allow modules to mess with the node being created, using the data saved in the parser. Both these changes were required so that I could perform a reasonably simple task: consume a podcast RSS feed, and create audio nodes from it:
- I used the first parser hook to call set_time_limit(0), since downloading the MP3s takes quite a while. IIRC, I also saved a couple of attributes, so I could use them later when creating nodes.
- I used the second parser hook to add the enclosure information from the $simplepie_item into the $feed_item (so we can download it in the item processor)
- I used the extra hook mentioned above to customize $node after feedapi_item had created a default. In my case, I created an audio node using the audio API, then copied the pertinent fields from $mynode into $node.
I think if I was doing it again now, I'd also pass $feed into the item processor, which may reduce the need for the first hook.
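To make the idea of a per-item parser hook concrete, here is a minimal standalone sketch. Everything in it is hypothetical: the dispatcher, the hook name example_parser_item, the mypodcast module and the array shape of the source item are stand-ins, not the names used in the attached patch or in FeedAPI. Inside Drupal, the module list would come from module_implements() rather than a hard-coded array.

```php
<?php
// Stand-in for Drupal's list of enabled modules so the sketch runs alone.
function example_enabled_modules() {
  return array('mypodcast');
}

// Hypothetical dispatcher: calls {module}_{hook}() for every module that
// defines it, handing over the parser's raw item and the feed item.
function example_invoke_item_hook($hook, $source_item, $feed_item) {
  foreach (example_enabled_modules() as $module) {
    $function = $module . '_' . $hook;
    if (function_exists($function)) {
      $function($source_item, $feed_item);
    }
  }
}

// Hypothetical hook implementation in a custom module: copy the enclosure
// from the parser's item into the feed item's options, so the item
// processor can download it later.
function mypodcast_example_parser_item($source_item, $feed_item) {
  if (isset($source_item['enclosure'])) {
    $feed_item->options->enclosure = $source_item['enclosure'];
  }
}

// The parser would call the dispatcher once per item it has parsed.
$feed_item = new stdClass();
$feed_item->title = 'Episode 1';
$feed_item->options = new stdClass();
$source_item = array('enclosure' => 'http://example.com/a.mp3');
example_invoke_item_hook('example_parser_item', $source_item, $feed_item);
```

Because PHP 5 passes objects by handle, the hook can modify $feed_item in place without a return value.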
Comment #7
alex_b commented
Could you post the patch as a cvs diff -u patch?
Comment #8
lyricnz commented
The patch in this issue still applies correctly to DRUPAL-5 feedapi (with an offset). Or are you talking about the patch to feedapi_item? It's trivial, attached.
PS: I'm not especially attached to the hook names, or even the exact parameters, I just wanted to open a discussion about the ability to extend existing parsers/processors without duplicating them entirely.
Comment #9
lyricnz commented
FWIW, the podcast example I talked about above used roughly the following hooks in my module:
(simplepie item hook) This particular bit of code saves the enclosure from the item into the feed item options, and actually fiddles with the title/description (by pulling the first line of the RSS item description into the title, rather than using the title that was in the <item>).
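The title/description part of that step could look roughly like the following. This is only a sketch under assumptions: the function name is made up, and it assumes the first line is removed from the description once it has become the title, which the comment above does not spell out.

```php
<?php
// Hypothetical helper: use the first line of the description as the title.
function example_title_from_description($feed_item) {
  // Split into the first line and the rest, whatever the newline style.
  $lines = preg_split('/\r\n|\r|\n/', trim($feed_item->description), 2);
  $first = trim(strip_tags($lines[0]));
  if ($first !== '') {
    $feed_item->title = $first;
    // Assumption: the remaining text becomes the new description.
    $feed_item->description = isset($lines[1]) ? trim($lines[1]) : '';
  }
  return $feed_item;
}

$feed_item = new stdClass();
$feed_item->title = 'Untitled';
$feed_item->description = "Episode 12: Hooks\nWe talk about parsers.";
$feed_item = example_title_from_description($feed_item);
```

If the description is empty, the original title is left untouched.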
(feedapi_item save hook) This bit of code uses the information saved above to download the MP3 enclosure, and create an audio node using Audio API. It then copies the information from the node it just created into $node, so when the caller calls node_save, it's really an update, not a create-new. Some error checking/etc removed for clarity.
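The "copy the created node into $node" step can be sketched as below. The helper name is hypothetical, and it copies every property, whereas the original code copied only the pertinent fields. The sketch relies on one thing that does hold in Drupal: node_save() updates an existing node when $node->nid is set, and inserts a new one otherwise.

```php
<?php
// Hypothetical helper: copy all properties of an already-saved node onto
// the node object that feedapi_item is about to save.
function example_copy_node_fields($created, $node) {
  foreach (get_object_vars($created) as $key => $value) {
    $node->$key = $value;
  }
  return $node;
}

// Stand-ins for the audio node created earlier and feedapi_item's default.
$mynode = new stdClass();
$mynode->nid = 42;
$mynode->type = 'audio';
$mynode->title = 'Episode 12: Hooks';

$node = new stdClass();
$node->type = 'story';
$node->title = 'Episode 12: Hooks';

$node = example_copy_node_fields($mynode, $node);
```

With nid copied over, the later node_save($node) call updates the audio node instead of creating a second node.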
To be honest, I haven't updated this client to newer versions of FeedAPI, because I did their site before Aron established an upgrade path :)
Comment #10
aron novak
FeedAPI was designed differently.
See this:
http://drupal.org/project/new_aggregator
There the processors have access to each other's results, so in most cases there is no need to hook into them.