Create way to exclude feed items from processing - filter approach? [#236739]

I'd like to keep feed items that are older than a particular time from being downloaded.

My first try was to implement hook_feedapi_item('unique'). That doesn't work, because 'unique' does not AND the results returned by the implementing hooks, so a single implementation cannot veto an item.

Right now, the only way to skip certain feed items is to patch feedapi or feedapi_node.

E.g. I added these three lines to the top of _feedapi_node_unique():

function _feedapi_node_unique($feed_item, $feed_nid, $settings) {
  // The following three lines are a managing news specific patch.
  if ($feed_item->options->timestamp && $feed_item->options->timestamp < e0r_get_max_timeframe()) {
    return FALSE;
  }
  // ... the rest of the original function follows unchanged ...
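
For illustration, here is a rough sketch of what AND-ing the verdicts would mean: an item is kept only if every callback accepts it. This is not FeedAPI code. All function names are hypothetical, and apart from options->timestamp (taken from the patch above) the item properties are assumptions.

```php
<?php
// Hypothetical sketch, not part of FeedAPI: each callback may veto an item,
// and the item is kept only if ALL callbacks accept it (logical AND).

// Hypothetical veto callback: reject items older than a cutoff timestamp.
// Items without a timestamp are accepted, as in the patch above.
function example_max_age_accept($feed_item, $cutoff) {
  if (!empty($feed_item->options->timestamp) && $feed_item->options->timestamp < $cutoff) {
    return FALSE;
  }
  return TRUE;
}

// Hypothetical veto callback: reject items that have no title.
// The "title" property is an assumption made for this example.
function example_has_title_accept($feed_item, $cutoff) {
  return !empty($feed_item->title);
}

// AND the verdicts: stop at the first callback that rejects the item.
function example_item_accepted($feed_item, $cutoff, $callbacks) {
  foreach ($callbacks as $callback) {
    if (!call_user_func($callback, $feed_item, $cutoff)) {
      return FALSE;
    }
  }
  return TRUE;
}
```

With this shape, one module returning FALSE is enough to skip the item, no matter what the other modules return.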

This problem goes to the core of FeedAPI's architecture:

Currently, FeedAPI exposes the feed item object coming from the parser to every processor configured for the feed content type. But it doesn't allow processors to actually modify the feed item object - e.g. to attach the result of their work so that it is available to the processors that follow, or - as in my case - to remove the item entirely.

I would call the current architecture "one way", because add-on modules don't hand their results back during the download process, and the alternative "filters", because add-on modules would work together like filters applied one after another in a pipe.

I discussed this with Aron at one point and his concern was that the filter architecture might make the downloading process heavier. I agree. I wonder to what extent.

The possibilities of a filter-like architecture seem promising: we could build processors that

1) modify the feed item's content, e.g. add RDF (semantic) markup - INDEPENDENT of the creation processor used (node or lightweight DB record)
2) remove feed items before they are processed any further (think of my feature request here, BUT ALSO: think of spam!)
3) add content to the feed item, such as scraped comments - again, independent of the creation processor used
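
To make the three cases more concrete, here is a minimal sketch of such a filter pipe. It only illustrates the idea and is not a proposal for the actual API: every function name, the $context array, and the description and comments properties are assumptions. Each filter gets the feed item and returns either the (possibly modified) item or FALSE to drop it.

```php
<?php
// Hypothetical sketch of a filter-style pipe; none of these functions
// exist in FeedAPI. A filter returns the item (possibly modified) or
// FALSE to remove it from further processing.

// Case 1: modify the item's content (a stand-in for adding semantic markup).
function example_filter_markup($feed_item, $context) {
  $feed_item->description = '<div class="feed-item">' . $feed_item->description . '</div>';
  return $feed_item;
}

// Case 2: remove items older than the cutoff passed in via $context.
function example_filter_max_age($feed_item, $context) {
  if (!empty($feed_item->options->timestamp) && $feed_item->options->timestamp < $context['cutoff']) {
    return FALSE;
  }
  return $feed_item;
}

// Case 3: attach extra content (a real filter would scrape the comments).
function example_filter_comments($feed_item, $context) {
  $feed_item->comments = array();
  return $feed_item;
}

// Run every item through the filters in order and collect the survivors.
function example_run_filters($feed_items, $filters, $context) {
  $kept = array();
  foreach ($feed_items as $feed_item) {
    foreach ($filters as $filter) {
      $feed_item = call_user_func($filter, $feed_item, $context);
      if ($feed_item === FALSE) {
        // Item was dropped: skip the remaining filters and this item.
        continue 2;
      }
    }
    $kept[] = $feed_item;
  }
  return $kept;
}
```

Only the items returned by example_run_filters() would then be handed to the creation processor, whichever one is configured. The extra cost Aron was worried about would be roughly one function call per filter per item.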

I'm mainly posting these thoughts to get people to start thinking about this issue - I am still trying to figure out what the smartest architecture is here myself. Ideas on how to address the 3 use cases described above with the existing architecture would be helpful, too.

Alex