The developer's guide to Feeds

Last updated on
11 March 2021

A quick introduction to improving Feeds and to writing plugins or mappers for it.

Architecture

Feeds consists of three modules:

  • Feeds module (feeds.module) contains entry points to the Feeds API (documented in feeds.api.php) and the production UI (e.g. import forms).
  • Feeds Admin UI module (feeds_ui/feeds_ui.module) contains the UI elements necessary for configuring and managing Feeds during a site build. It can optionally be disabled once a site is built out, although doing so yields little security or performance gain.
  • Feeds Defaults (feeds_import/feeds_import.module via feeds_import/feeds_import.feeds_importer_default.inc) contains a couple of default importer configurations for getting started with Feeds.

The Feeds module itself provides the most important hook implementations, plugin functions and wrapper functions for retrieving objects of Feeds' API classes.

The most important API classes are:

  • class FeedsImporter: contains a FeedsFetcher, a FeedsParser and a FeedsProcessor plugin object and the configuration of the importer and its plugins.
  • abstract class FeedsPlugin: base class for all plugins
  • abstract class FeedsFetcher, abstract class FeedsParser, abstract class FeedsProcessor: Base classes for the three possible types of plugins. Derived from FeedsPlugin class.
  • class FeedsSource: holds a source (i.e. a URL or a file path). A FeedsSource object is passed into FeedsImporter when importing from that source. A FeedsSource can be tied to a specific node or not.
  • abstract class FeedsConfigurable: base class for configurable and persistent entities (FeedsImporter, FeedsPlugin and FeedsSource)
  • class FeedsScheduler: this class is responsible for periodic import (= aggregation).

See Feeds glossary for more information on Feeds terminology.

Plugins API

Feeds has a CTools-based plugins API that works similarly to the one in Views. Examples can be found in Feeds' own feeds.plugins.inc or, as a standalone example module, in the Extractor module.

SUMMARY:
A module that wants to provide its own Fetcher, Parser or Processor plugin must declare a plugin and implement it.

Declaration of a Feeds plugin

Implement hook_feeds_plugins() and return an array of plugin definitions.

A plugin definition must contain a name, a description and a handler. The handler must contain a parent plugin name, the class name of the plugin, and the file and path identifying the location of the plugin's class definition. Feeds distinguishes between the key of the plugin definition (the key of the plugin in the $info array) and the class name of the plugin handler. Most of the time, however, the class name and the key are the same.

Every plugin must be derived either directly or indirectly from either FeedsFetcher, FeedsParser or FeedsProcessor. The value of the parent property refers to the plugin key of another plugin, not its class name.

Example

In file mymodule.module:

/**
 * Implements hook_feeds_plugins().
 */
function mymodule_feeds_plugins() {
  $info = array();
  $info['MyParser'] = array(
    'name' => 'My parser',
    'description' => 'Parses custom data.',
    'handler' => array(
      'parent' => 'FeedsParser', // A plugin needs to derive either directly or indirectly from FeedsFetcher, FeedsParser or FeedsProcessor.
      'class' => 'MyParser',
      'file' => 'MyParser.inc',
      'path' => drupal_get_path('module', 'mymodule'),
    ),
  );
  return $info; 
}

In file mymodule.install:

/**
 * Implements hook_enable().
 */
function mymodule_enable() {
  // Clear the plugin cache so the new plugin shows up in Feeds.
  cache_clear_all('plugins:feeds:plugins', 'cache');
}

Note: If you intend to extend the SimplePie parser, as described below, you will need to declare your plugin handler as follows: 'parent' => 'FeedsSimplePieParser' or else you will receive an error message about a missing class when you try to enable your parser.

Implementation of a Feeds plugin

The implementation of the plugin needs to reside in the file indicated in hook_feeds_plugins(). When the plugin is requested (for instance when Feeds displays a list of available plugins to the user, or when a plugin is loaded to import a source), it is dynamically loaded from this file.

Depending on which plugin type (Fetcher, Parser or Processor) is implemented, a varying set of methods must or can be defined or overridden. The classes used to implement Fetcher, Parser and Processor plugins do the main part of their work through methods named fetch(), parse() and process() respectively. For details, refer to the FeedsFetcher, FeedsParser or FeedsProcessor class.

There are some differences between the Drupal 6 and Drupal 7 versions of the methods used to implement Feeds plugins, as noted below.

Drupal 6 plugins

In Feeds 6.x-1.0, the Fetcher, Parser and Processor plugins rely on an object of class FeedsBatch and its subclass, FeedsImportBatch, to keep track of the state of a feed and its content while it is being fetched, parsed and processed. Documentation for the FeedsBatch and FeedsImportBatch classes can be found in the comments in file includes/FeedsBatch.inc.

The fetch() method for a Fetcher in Feeds 6.x-1.0 has the following function signature:

FeedsFetcher::fetch(FeedsSource $source);

The fetch() method returns an object of class FeedsImportBatch or a FeedsImportBatch subclass.

The parse() and process() methods for Parsers and Processors in Feeds 6.x-1.0 then receive and modify an object of class FeedsImportBatch. They have the following function signatures:

FeedsParser::parse(FeedsImportBatch $batch, FeedsSource $source);
FeedsProcessor::process(FeedsImportBatch $batch, FeedsSource $source);

The FeedsImportBatch object is therefore a single object which changes "states" as it is handed from Fetcher to Parser to Processor.

Drupal 7 plugins

In Feeds 7.x-2.0, the FeedsBatch and FeedsImportBatch classes are gone, replaced by separate "result" classes that are returned by Fetchers and Parsers. Instead of returning an object of class FeedsImportBatch, the Fetcher returns an object of type FeedsFetcherResult.

The parse() and process() methods for Parsers and Processors in Feeds 7.x-2.0 have the following function signatures:

FeedsParser::parse(FeedsSource $source, FeedsFetcherResult $fetcher_result);
FeedsProcessor::process(FeedsSource $source, FeedsParserResult $parser_result);

Documentation for the FeedsFetcherResult and FeedsParserResult classes can be found in the comments in files plugins/FeedsFetcher.inc and plugins/FeedsParser.inc.

In other words, Feeds handling still flows from Fetcher->Parser->Processor, but the objects used to pass information between each stage have changed. In Feeds 7.x-2.0, the process is as follows:

  • The fetch() method of the Fetcher returns an object of class FeedsFetcherResult.
  • The parse() method of the Parser receives an object of class FeedsFetcherResult and returns an object of class FeedsParserResult.
  • The process() method of the Processor receives an object of class FeedsParserResult.
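
The Drupal 7 flow listed above can be sketched as a parser class. This is a minimal sketch under stated assumptions rather than code from the module: MyParser, its "guid|title" line format and the sample labels are hypothetical, and the stand-in classes at the top exist only so that the snippet runs outside Drupal. On a real site Feeds provides those classes, the stand-ins are skipped, and labels would normally be wrapped in t().

```php
<?php
// Stand-ins so this sketch runs outside Drupal. They mimic only the parts of
// the Feeds 7.x-2.x classes used below; on a real site Feeds defines them.
if (!class_exists('FeedsParser')) {
  class FeedsSource {}
  abstract class FeedsParser {}
  class FeedsFetcherResult {
    protected $raw;
    public function __construct($raw) { $this->raw = $raw; }
    public function getRaw() { return $this->raw; }
  }
  class FeedsParserResult {
    public $title;
    public $items;
    public function __construct($items = array()) { $this->items = $items; }
  }
}

/**
 * Hypothetical parser: one item per non-empty line, formatted "guid|title".
 */
class MyParser extends FeedsParser {

  /**
   * Receives the fetcher result and returns a parser result.
   */
  public function parse(FeedsSource $source, FeedsFetcherResult $fetcher_result) {
    $items = array();
    foreach (explode("\n", $fetcher_result->getRaw()) as $line) {
      $line = trim($line);
      // Skip blank lines and lines without a separator.
      if ($line === '' || strpos($line, '|') === FALSE) {
        continue;
      }
      list($guid, $title) = explode('|', $line, 2);
      $items[] = array('guid' => $guid, 'title' => $title);
    }
    $result = new FeedsParserResult($items);
    $result->title = 'Example title';
    return $result;
  }

  /**
   * Declares the fields each parsed item contains.
   */
  public function getMappingSources() {
    return array(
      'guid' => array('name' => 'GUID', 'description' => 'Unique ID.'),
      'title' => array('name' => 'Title', 'description' => 'Item title.'),
    );
  }

}
```

Unlike the Drupal 6 version, nothing is modified in place here: the parser builds a new FeedsParserResult and returns it, and the Processor then receives that object.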

Example of a Parser Plugin for Feeds 6.x-1.0

This example builds on the one in "Declaration of a Feeds plugin". It defines a MyParser that extends FeedsParser. NOTE: This example is based on Feeds version 6.x-1.0. The parse() method takes different parameters in Feeds version 7.x-2.0.

  • parse() parses the source document and populates a FeedsImportBatch.
  • getMappingSources() declares the fields the parser returns. This information is used for mapping (see the Mapping API section).

/**
 * Parses My Feed
 */
class MyParser extends FeedsParser {

  /**
  * Parses a raw string and populates FeedsImportBatch object from it.
  */
  public function parse(FeedsImportBatch $batch, FeedsSource $source) {
    // Get the file's content.
    $string = $batch->getRaw();

    // Parse it...

    // The parsed result should be an array of arrays of field name => value.
    // This is an example of such an array:
    $items = array();
    $items[] = array(
      'guid' => 'MyGuid1',
      'title' => 'My Title',
    );
    $items[] = array(
      'guid' => 'MyGuid2',
      'title' => 'My Other Title',
    );

    // Populate the FeedsImportBatch object with the parsed results.
    $batch->title = 'Example title'; 
    $batch->items = $items;
  }

  public function getMappingSources() {
    return array(
      'guid' => array(
        'name' => t('GUID'),
        'description' => t('Unique ID.'),
      ),
      'title' => array(
        'name' => t('Title'),
        'description' => t('My Title.'),
      ),
    );
  }

} 

Extending the SimplePie parser integration

To enable SimplePie:

  1. Download the most recent package from http://simplepie.org/ to get the simplepie.inc file needed in the next step. (You don't need the source.)
  2. Copy simplepie.inc into feeds/libraries, as mentioned in the readme.
  3. Clear the cache (admin/settings/performance) to see the option.

Alternatively, you can also use the Libraries API module, and put simplepie.inc inside sites/all/libraries/simplepie/ (so the end path would be sites/all/libraries/simplepie/simplepie.inc).

For customization purposes, Feeds' SimplePie parser integration can be extended by overriding parseExtensions() and getMappingSources().

Here is an example:

/**
 * A simple parser that extends FeedsSimplePieParser by adding support for a
 * couple of iTunes tags.
 */
class MyParser extends FeedsSimplePieParser {
  /**
   * Add the extra mapping sources provided by this parser.
   */
  public function getMappingSources() {
    return parent::getMappingSources() + array(
      'itunes_keywords' => array(
        'name' => t('iTunes:Keywords'),
        'description' => t('iTunes Keywords.'),
      ),
      'itunes_duration' => array(
        'name' => t('iTunes:Duration'),
        'description' => t('iTunes Duration.'),
      ),
    );
  }

  /**
   * Parse the extra mapping sources provided by this parser.
   */
  protected function parseExtensions(&$item, $simplepie_item) {
    $itunes_namespace = 'http://www.itunes.com/dtds/podcast-1.0.dtd';
    if ($value = $simplepie_item->get_item_tags($itunes_namespace, 'keywords')) {
      $item['itunes_keywords'] = $value[0]['data'];
    }
    if ($value = $simplepie_item->get_item_tags($itunes_namespace, 'duration')) {
      $item['itunes_duration'] = $value[0]['data'];
    }
  }
}

Mapping API

Feeds allows other modules to define additional mapping targets by way of alter hooks. The hooks are specific to the processor in use, but they all work in much the same way: the implementer adds targets to the array passed to the alter hook, and each target contains a name, a description and a callback. Feeds invokes the callback when a user has picked the mapping target for their importer.

See also list of implemented mappers.

Here is a step by step guide showing how to implement a new mapping target.

1) Implement target alter hook

First, the additional mapping target needs to be declared to Feeds. This is done by implementing an alter hook that modifies the array of available mapping targets. Additional mapping targets must have a name, a description and a callback. The name and description are used on the mapping form of a processor; the callback is invoked when a site builder has added the mapping target to a processor configuration and a feed item is being imported.

There are different target alter hooks for node and term processors; they are documented in feeds.api.php. This guide focuses on extending the node processor.

Drupal 6:

/**
 * Implements hook_feeds_node_processor_targets_alter().
 */
function mymodule_feeds_node_processor_targets_alter(&$targets, $content_type) {
  if ($content_type == 'my_content_type') {
    $targets['my_target_field'] = array(
      'name' => t('My Target Field'),
      'description' => t('Shows up in legend on mapping form.'),
      'callback' => 'mymodule_set_target', // See 2)
    );
  }
}

Drupal 7:

/**
 * Implements hook_feeds_processor_targets_alter().
 */
function mymodule_feeds_processor_targets_alter(&$targets, $type, $bundle) {
  if ($type == 'node' && $bundle == 'my_content_type') {
    $targets['my_target_field'] = array(
      'name' => t('My Target Field'),
      'description' => t('Shows up in legend on mapping form.'),
      'callback' => 'mymodule_set_target', // See 2)
    );
  }
}

2) Implement callback

Now all that is left is to implement the callback.

Drupal 6:

Pay attention to the parameters of a mapping callback: $node is the node object presently being assembled for storage. $target is the key of the target field; in this example it can only be 'my_target_field', because the hook declared only this one target key. $value can be a number, a string or a FeedsElement object, or an array of values of these types. A FeedsElement that is used like a string converts to a string automatically.

This example assumes that any non-array value is acceptable as target value:

/**
 * Mapping callback.
 */
function mymodule_set_target($node, $target, $value) {
  if (!is_array($value)) {
    $node->$target = $value;
  }
}
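
The Drupal 6 callback above silently ignores array values. A variant that accepts every value type described above might first reduce the value to a list of strings. The following is only a sketch under stated assumptions: mymodule_flatten_value() and mymodule_set_target_multiple() are hypothetical names, joining with a comma is an arbitrary choice, and the FeedsElement stand-in exists only so the snippet runs outside Drupal (Feeds ships the real class).

```php
<?php
// Stand-in so the sketch runs outside Drupal; Feeds ships the real class.
if (!class_exists('FeedsElement')) {
  class FeedsElement {
    protected $value;
    public function __construct($value) { $this->value = $value; }
    public function __toString() { return (string) $this->value; }
  }
}

/**
 * Hypothetical helper: reduces any accepted $value to a list of strings.
 */
function mymodule_flatten_value($value) {
  $strings = array();
  foreach (is_array($value) ? $value : array($value) as $v) {
    // Scalars cast directly; FeedsElement converts through __toString().
    if (is_scalar($v) || $v instanceof FeedsElement) {
      $strings[] = (string) $v;
    }
  }
  return $strings;
}

/**
 * Mapping callback (Drupal 6 signature) that joins multiple values.
 */
function mymodule_set_target_multiple($node, $target, $value) {
  $node->$target = implode(', ', mymodule_flatten_value($value));
}
```

Whether joining values into one string is appropriate depends on the target field; a multi-value field would more likely store each string separately.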

Drupal 7:

/**
 * Mapping callback.
 */
function mymodule_set_target($source, $entity, $target, $value, $mapping) {
  $entity->{$target}[$entity->language][0]['value'] = $value;
  if (isset($source->importer->processor->config['input_format'])) {
    $entity->{$target}[$entity->language][0]['format'] =
      $source->importer->processor->config['input_format'];
  }
}

Frequently asked questions on Mapping

Where can I learn more than what's documented here?

For further information, look at the code. A good place to start reading is mappers/content.inc or, for a more advanced example, mappers/filefield.inc. To learn more about the inner workings of mapping in Feeds, refer to the module section of the Doxygen documentation.

Should I submit my mapper as a patch to Feeds?

Generally speaking, the most preferable location for a mapper integration is the module that is home to the mapping target.

If this is not possible (e.g. for core modules like taxonomy), a mapper can be implemented as a patch to Feeds. Another reason to include a mapper with Feeds is that the integrated module is very popular (e.g. the CCK, Filefield and Date modules).

To ease maintenance, mappers that are included in Feeds must have tests verifying their functionality.

I can't find a mapper for targets like the node title

Simple mapping targets on the subject of the processor - for example the node title or the node body field in FeedsNodeProcessor - are implemented natively in the getMappingTargets() and setTargetElement() methods of the processors themselves. These appear in the Feeds UI under Mapping. For example:

  • FeedsNodeProcessor.inc: Defines the Base Node Fields, such as Node ID, User ID, User Name, User Mail, Published Status, etc.
  • FeedsEntityProcessor.inc: Custom entities get their target fields from the properties declared in hook_entity_property_info(). For Feeds to pick up a property, the property must define a setter callback, for example entity_property_verbatim_set:

      $properties['content_type'] = array(
        'label' => t("Content Type"),
        'type' => 'text',
        'description' => t("Content Type."),
        'schema field' => 'content_type',
        // Required for Feeds mapping; the target won't show up otherwise.
        'setter callback' => 'entity_property_verbatim_set',
      );

Defaults hooks

Any module can define a Feeds Importer configuration by way of a hook_feeds_importer_default() implementation. See The site builder's guide to Feeds for more information.
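
As a rough illustration of the hook mentioned above, a module might ship one importer as follows. This is a sketch, not an authoritative export: mymodule and my_importer_id are hypothetical names, and the config keys are assumptions modelled on a typical importer export. Exporting an importer from your own site is the reliable way to get the exact array for your Feeds version. The hook_ctools_plugin_api() implementation is included because CTools needs it to discover the defaults.

```php
<?php
/**
 * Implements hook_ctools_plugin_api().
 *
 * Tells CTools that this module ships default Feeds importers.
 */
function mymodule_ctools_plugin_api($module = '', $api = '') {
  if ($module == 'feeds' && $api == 'feeds_importer_default') {
    return array('version' => 1);
  }
}

/**
 * Implements hook_feeds_importer_default().
 */
function mymodule_feeds_importer_default() {
  $export = array();

  $feeds_importer = new stdClass();
  $feeds_importer->disabled = FALSE;
  $feeds_importer->api_version = 1;
  // 'my_importer_id' is a hypothetical machine name.
  $feeds_importer->id = 'my_importer_id';
  $feeds_importer->config = array(
    'name' => 'My importer',
    'description' => 'Imports RSS items as nodes.',
    'fetcher' => array('plugin_key' => 'FeedsHTTPFetcher', 'config' => array()),
    'parser' => array('plugin_key' => 'FeedsSyndicationParser', 'config' => array()),
    'processor' => array(
      'plugin_key' => 'FeedsNodeProcessor',
      'config' => array(
        // Assumed mapping layout: source field, target field, unique flag.
        'mappings' => array(
          array('source' => 'guid', 'target' => 'guid', 'unique' => TRUE),
          array('source' => 'title', 'target' => 'title', 'unique' => FALSE),
        ),
      ),
    ),
    // Attach the import form to the 'feed' content type.
    'content_type' => 'feed',
  );

  $export['my_importer_id'] = $feeds_importer;
  return $export;
}
```

Settings left out of the arrays are expected to fall back to each plugin's defaults, which is why the fetcher and parser configs are empty here.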

Programmatic operations

Create a feed node programmatically

This example demonstrates how to create a Feed node programmatically. It assumes that there is an Importer configured and attached to the content type 'feed'. The Importer uses an HTTP fetcher.

$node = new stdClass();
$node->type = 'feed';
$node->title = 'My feed';
$node->feeds['FeedsHTTPFetcher']['source'] = 'http://example.com/rss.xml';
node_save($node);

Using rules

To use this from Rules, enable the PHP filter module and create a new PHP action component with two or three arguments ($type, $title and $uri) containing this snippet:

$node = new stdClass();
$node->type = $type;
$node->title = $title;
$node->feeds['FeedsHTTPFetcher']['source'] = $uri;
node_save($node);

Now you can use this action component for your rules to trigger imports.

Auto node title

To take the node title from the feed itself, fetch and parse the feed first, as in the snippet below.

feeds_include_library('common_syndication_parser.inc', 'common_syndication_parser');
feeds_include_library('http_request.inc', 'http_request');
$url = 'http://example.com/rss.xml';
$node = new stdClass();
$node->type = 'feed';
$node->title = '';
// Download the feed and use its title, if it has one, as the node title.
$feed = http_request_get($url);
if (!empty($feed->data)) {
  $feed = common_syndication_parser_parse($feed->data);
  if (!empty($feed['title'])) {
    $node->title = $feed['title'];
  }
}
$node->feeds['FeedsHTTPFetcher']['source'] = $url;
node_save($node);

Trigger import programmatically

This example shows how to import a feed programmatically. It can be read in continuation of the previous example.

// Using Batch API (user will see a progress bar).
feeds_batch_set(t('Importing'), 'import', 'my_importer_id', $node->nid);
// Not using Batch API (complete import within current page load)
while (FEEDS_BATCH_COMPLETE != feeds_source('my_importer_id', $node->nid)->import());

* Note: To do this with a standalone Feeds importer (i.e. when the import form is not attached to a node), pass 0 in place of $node->nid.

Create a feed node programmatically using services

Similar to the above, you can also create a feed node programmatically using Services 3. Here is an example in Python that creates a feed node whose FeedsFileFetcher source points to an already uploaded file. It needs the requests module in addition to the standard json module:

import json
import requests

# Set the headers; a Session keeps the login cookie for the second request.
headers = {'content-type': 'application/json'}
session = requests.Session()
# Set the endpoints and payloads.
login = 'https://example.com/endpoint/user/login'
url = 'https://example.com/endpoint/node'
creds = {'username': 'USERNAME', 'password': 'ABRACADABRA'}
payload = {
    'title': 'WHATEVER',
    'type': 'YOUR_FEED_NODES_TYPE',
    'name': 'Admin',
    'language': 'und',
    'feeds': {'FeedsFileFetcher': {'source': 'private://path_to_file'}},
}
# First we log in using our credentials.
# verify=False skips TLS certificate checks; drop it if the certificate is valid.
session.post(login, data=json.dumps(creds), headers=headers, verify=False)
# Then we create a new feed node.
session.post(url, data=json.dumps(payload), headers=headers, verify=False)

Testing

Use Feeds Test Site for testing. Follow the instructions in its README.txt file.
