Processing large amounts of data

Last updated on 31 May 2017

Drupal 7 will no longer be supported after January 5, 2025.

Feeds processes data using the Batch API. You can avoid hitting a page time-out when processing large amounts of data by keeping count of the processed elements and setting the batch progress.

The simplest example of an implementation, for the Drupal 6 version, can be found in feeds/includes/FeedsBatch.inc:


class YourFeedsProcessor extends FeedsProcessor {
  // Total elements to process per page
  const MAX_PER_PAGE = 50;
  /**
   * Update your process() method within your custom Feeds processor with the
   * following logic.
   */
  public function process(FeedsImportBatch $batch, FeedsSource $source) {
    // Set counter of processed elements for this page load.
    $processed = 0;
    // Set counter of all processed items across page loads.
    if (!isset($batch->processed)) {
      $batch->processed = 0;
    }
    // Set total elements
    if (!$batch->getTotal(FEEDS_PROCESSING)) {
      $batch->setTotal(FEEDS_PROCESSING, count($batch->items));
    }
    // Loop items
    while ($item = $batch->shiftItem()) {
      // You can replace the following two lines by your custom logic to process nodes.
      $object = $this->map($item);
      $object->save();
      $processed++; // Processed in this page load.
      $batch->processed++; // Processed total.
      if ($processed >= self::MAX_PER_PAGE) {
        $batch->setProgress(FEEDS_PROCESSING, $batch->processed);
        return;
      }
    }
    $batch->setProgress(FEEDS_PROCESSING, FEEDS_BATCH_COMPLETE);
  }
}

The above code makes the page reload every 50 items, which advances the progress bar and avoids page time-outs.
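To see the paging behaviour in isolation, here is a minimal standalone sketch that runs without Drupal or Feeds. The names SketchBatch, sketch_process() and sketch_run() are hypothetical stand-ins invented for this illustration; they are not part of the Feeds or Batch API. The loop in sketch_run() only imitates what the Batch API does when it calls the processor again on each new page load.

```php
<?php
// Hypothetical stand-in for a batch object; NOT the real Feeds class.
class SketchBatch {
  public $items;
  public $processed = 0;
  public $complete = FALSE;

  public function __construct(array $items) {
    $this->items = $items;
  }

  // Removes and returns the next item, or NULL when no items are left.
  public function shiftItem() {
    return array_shift($this->items);
  }
}

// Processes at most $maxPerPage items, then returns so that the caller
// can start a new "page load".
function sketch_process(SketchBatch $batch, $maxPerPage) {
  // Items processed in this page load.
  $processed = 0;
  while (($item = $batch->shiftItem()) !== NULL) {
    // Real item handling (mapping, saving) would go here.
    $processed++;
    $batch->processed++;
    if ($processed >= $maxPerPage) {
      // Page limit reached: stop and wait for the next page load.
      return;
    }
  }
  // No items left: the whole batch is done.
  $batch->complete = TRUE;
}

// Imitates repeated page loads until the batch reports completion.
function sketch_run(array $items, $maxPerPage) {
  $batch = new SketchBatch($items);
  $pageLoads = 0;
  while (!$batch->complete) {
    sketch_process($batch, $maxPerPage);
    $pageLoads++;
  }
  return array($pageLoads, $batch->processed);
}

list($pageLoads, $processed) = sketch_run(range(1, 120), 50);
echo "$pageLoads page loads, $processed items processed\n";
```

With 120 items and a limit of 50 per page, the sketch prints "3 page loads, 120 items processed": two full pages of 50 items and a final page of 20.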

Have a look at how this is implemented by the generic FeedsProcessor in Drupal 7 or by FeedsNodeProcessor in Drupal 6.

You may also have a look at Steven Jones's sandbox: Feeds XPath Parser + XMLReader.

As of 13 October 2013, the latest official releases of Feeds XPath Parser for D6 and D7 include a patch designed to overcome batching problems such as stalling, time-outs, and memory exhaustion. You can examine that part of the module for further ideas on improving batch handling.
