I wrote a custom fetcher, parser and processor in Alpha 9 and this is broken in Alpha 10. Please can you document what has changed in terms of the data object which should be passed from the fetcher to the processor via the parser? It looks like this should now be an object of type FeedsImportBatch. Please can you explain the reasoning behind this change / what the change does and whether we should expect this to change again?

It would also be good to know how close you are to a beta / release candidate in which the implementation / class hierarchy will not be changed.

Many thanks in advance (and for your work on the Feeds module so far!). We're using it to import a large library of images and associated XML metadata.

Comments

alex_b’s picture

Related issue #641522: Consolidate import stage results and provide generalized enclosure class. I should have been more verbose in CHANGELOG.txt; I will make a point of doing so in the future.

The changes are very simple; take a look at the existing fetchers, parsers and processors.

Beta is still a couple of weeks away. I will post a roadmap soon.

BenK’s picture

Subscribing...

fereira’s picture

Unfortunately those simple changes also broke an importer I created that I got working for the alpha9 release. In my case, fixing it isn't going to be easy.

I've written an importer that, rather than fetching content from a URL or file, gets it from the sparql module. Using the sparql module I can create SPARQL queries against an RDF triple store, use a SparqlResultsFetcher (an implementation of FeedsFetcher) to get the results, parse them with a custom implementation of FeedsParser, and then create nodes with a custom implementation of the NodeProcessor class (I can also use the out-of-the-box TermProcessor to update a taxonomy vocabulary from an RDF triple store). Because the basic function signatures of the core Feeds classes have changed, I'll need to rewrite both my fetcher and parser implementations. Any thoughts on how I can create a FeedsImportBatch that will work without the assumption that the content is coming from a URL or file?

alex_b’s picture

Any thoughts on how I can create a FeedsImportBatch that will work without the assumption that the content is coming from a URL or file?

Extend FeedsImportBatch in the same file as your fetcher class and use it instead of the FeedsImportBatch class itself.
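
For anyone unsure what that might look like, here is a minimal sketch, not code from the module. It assumes getRaw() (mentioned later in this thread) is the method worth overriding. SparqlResultsBatch, its constructor argument and example_sparql_fetch() are hypothetical names, and the FeedsImportBatch class at the top is only a simplified stand-in so the snippet runs outside Drupal.

```php
<?php
// Simplified stand-in for the Feeds class so this sketch runs on its own.
// The real FeedsImportBatch has more members; drop this stub inside Drupal.
class FeedsImportBatch {
  protected $items = array();

  public function getRaw() {
    return '';
  }

  public function setItems(array $items) {
    $this->items = $items;
  }

  public function shiftItem() {
    return array_shift($this->items);
  }
}

// Hypothetical batch whose raw data comes from a SPARQL query instead of a
// URL or file. The class name and constructor argument are illustrative.
class SparqlResultsBatch extends FeedsImportBatch {
  protected $rows;

  public function __construct(array $rows) {
    $this->rows = $rows;
  }

  // Override getRaw() so the parser receives the query results.
  public function getRaw() {
    return $this->rows;
  }
}

// Hypothetical fetcher body: wrap already-fetched rows in the custom batch.
function example_sparql_fetch(array $rows) {
  return new SparqlResultsBatch($rows);
}

$batch = example_sparql_fetch(array(array('title' => 'First result')));
$raw = $batch->getRaw();
print $raw[0]['title'] . "\n";
```

Because the subclass is still a FeedsImportBatch, it should satisfy a type hint such as the one in parse(FeedsImportBatch $batch, FeedsSource $source); whether the real parent constructor needs to be called depends on the Feeds version you are running.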

fereira’s picture

Thanks. That'll probably tax my fledgling PHP skills (I'm primarily a java programmer) but I'll give it a shot. I have a pretty good idea what I need to do. BTW, will the issue titled "Break FeedsImportBatch into two classes" (http://drupal.org/node/708228) influence how I approach this? I see that the patch posted there is marked "Needs Review", and I wonder whether I should try to extend a class that may soon change.

CODECOWBOY-1’s picture

I got this working by making the following changes:

Fetcher:

    $batch = new FeedsImportBatch();
    // $xml_array is my custom array of SimpleXML objects.
    $batch->setItems($xml_array);
    return $batch;

Parser:

    public function parse(FeedsImportBatch $batch, FeedsSource $source) {
      // ....
      while ($xml_obj = $batch->shiftItem()) {
        // ...
      }
      // Replace the fetched items with the parsed results.
      $batch->setItems($parsed_results['items']);
      return $batch;
    }

Processor:

Use the while loop above to iterate over the batch items returned and pass each one to your process method.

The items property of the batch is protected, so it can't be accessed directly, hence the need for the shiftItem() call.

Hope this helps.

alex_b’s picture

That'll probably tax my fledgling PHP skills (I'm primarily a java programmer)

You should hit this out of the park then! :)

Seriously, I'm sorry for the inconvenience. But API changes like these are exactly the reason why we're still in the Alpha phase.

On a general note: While I *can't* guarantee that the API will not change at all while we're in Alpha mode, I *can* guarantee that any changes will have an upgrade path. In short: we won't remove any existing features. I will make a point of being more verbose in future release notes so that changes don't come as a cold shower.

I appreciate your patience -

Alex

fereira’s picture

I got it working again without extending the FeedsImportBatch class. My code is loosely based on the code that the_real_codecowboy posted (once I found my typo, things went better). Since the FeedsImportBatch class, including its getRaw() function, assumes that the feed comes from either a URL or a file, I just instantiate and return a plain FeedsImportBatch in my fetcher implementation. The parser implementation invokes a method in the sparql module, then iterates through the results and adds them to the FeedsImportBatch object using $batch->addItem().

I *may* look at refactoring things if the code in this issue (http://drupal.org/node/708228) gets committed.
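
A minimal sketch of that approach, under stated assumptions: the fetcher hands back an empty batch and the parser fills it with addItem(). The FeedsImportBatch class here is a simplified stand-in modelling only addItem() and shiftItem(), and example_sparql_query(), example_fetch() and example_parse() are hypothetical names, not functions from the sparql or Feeds modules.

```php
<?php
// Simplified stand-in for the Feeds class (assumption: only the item
// methods mentioned in this thread are modelled).
class FeedsImportBatch {
  protected $items = array();

  public function addItem($item) {
    $this->items[] = $item;
  }

  public function shiftItem() {
    return array_shift($this->items);
  }
}

// Hypothetical stand-in for whatever call the sparql module provides.
function example_sparql_query() {
  return array(
    array('subject' => 'http://example.org/a', 'label' => 'A'),
    array('subject' => 'http://example.org/b', 'label' => 'B'),
  );
}

// Fetcher step: nothing to download, so hand back an empty batch.
function example_fetch() {
  return new FeedsImportBatch();
}

// Parser step: run the query and add one item per result row.
function example_parse(FeedsImportBatch $batch) {
  foreach (example_sparql_query() as $row) {
    $batch->addItem(array('guid' => $row['subject'], 'title' => $row['label']));
  }
  return $batch;
}

// Drain the batch the way a processor would.
$batch = example_parse(example_fetch());
$titles = array();
while ($item = $batch->shiftItem()) {
  $titles[] = $item['title'];
}
print implode(', ', $titles) . "\n";
```

In a real plugin these two functions would be the fetch() and parse() methods of the fetcher and parser classes; the 'guid' and 'title' keys are only illustrative item fields.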

alex_b’s picture

Heads up guys: there are also minor but important API changes in alpha12; see the release notes: http://drupal.org/node/723216

CODECOWBOY-1’s picture

Title: Changes in Alpha10 » Changes in Alpha10 (and now 12)
Version: 6.x-1.0-alpha10 » 6.x-1.0-alpha12

Hi,

For anyone else following this, the required changes for your processor can be seen in plugins/FeedsNodeProcessor.inc

If you don't at least return FEEDS_BATCH_COMPLETE from your process() method and you are importing manually rather than using cron, the new progress bar displaying import progress will hang even when the import is complete.
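
A rough sketch of the shape this takes, not the code from plugins/FeedsNodeProcessor.inc. The FeedsImportBatch class is a simplified stand-in, ExampleProcessor is a hypothetical class, its process() signature is trimmed down for illustration, and the value given to FEEDS_BATCH_COMPLETE is a placeholder so the snippet runs outside Drupal (inside Drupal, Feeds defines the constant itself).

```php
<?php
// Assumption: Feeds defines this constant; the value here is only a
// placeholder for running the sketch standalone.
if (!defined('FEEDS_BATCH_COMPLETE')) {
  define('FEEDS_BATCH_COMPLETE', 1.0);
}

// Simplified stand-in for the Feeds batch class.
class FeedsImportBatch {
  protected $items = array();

  public function setItems(array $items) {
    $this->items = $items;
  }

  public function shiftItem() {
    return array_shift($this->items);
  }
}

// Hypothetical processor. A real one would extend a Feeds processor class
// and take the arguments that class declares.
class ExampleProcessor {
  public $saved = array();

  public function process(FeedsImportBatch $batch) {
    // Consume every item in the batch.
    while ($item = $batch->shiftItem()) {
      $this->saved[] = $item['title'];
    }
    // Signal that the import is finished so the progress bar can complete.
    return FEEDS_BATCH_COMPLETE;
  }
}

$batch = new FeedsImportBatch();
$batch->setItems(array(array('title' => 'One'), array('title' => 'Two')));
$processor = new ExampleProcessor();
$result = $processor->process($batch);
print count($processor->saved) . " items processed\n";
```

The point is only the final return statement: whatever else process() does, it ends by returning FEEDS_BATCH_COMPLETE once no items are left.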

My fetcher 'just worked'.

luke.

alex_b’s picture

Status: Active » Closed (fixed)

Not relevant anymore.