Introduction

Last updated on 11 March 2021

Feeds is designed to address import and aggregation use cases. It provides a UI for creating and managing multiple import and aggregation configurations side by side.

A single import configuration is called an importer. You can create as many importers as you need. Each importer consists of a Fetcher, which downloads a feed; a Parser, which parses it; and a Processor, which acts on the parsed result, usually by storing it.
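
The division of labour between the three parts can be pictured with a minimal sketch. It is only an illustration: the closures $fetcher, $parser and $processor and the sample data are hypothetical stand-ins, not the Feeds plugin API, and the "parser" is a naive comma split rather than a real CSV parser.

```php
<?php
// Illustrative sketch only: these closures are hypothetical stand-ins,
// not the Feeds plugin API.

// Fetcher: returns the raw feed. A real fetcher would download it.
$fetcher = function () {
  return "title,body\nFirst post,Hello\nSecond post,World\n";
};

// Parser: turns the raw text into an array of items keyed by column name.
// Naive comma split; it does not handle quoted fields.
$parser = function ($raw) {
  $lines = array_values(array_filter(explode("\n", $raw), 'strlen'));
  $header = explode(',', array_shift($lines));
  $items = array();
  foreach ($lines as $line) {
    $items[] = array_combine($header, explode(',', $line));
  }
  return $items;
};

// Processor: acts on the parsed items. Here it only collects them in an
// array where a real processor would typically save nodes or users.
$processor = function (array $items) {
  $stored = array();
  foreach ($items as $item) {
    $stored[] = $item;
  }
  return $stored;
};

// An importer chains the three stages in this order.
$stored = $processor($parser($fetcher()));
print count($stored) . " items stored\n";
```

Because each stage only depends on the output of the previous one, any stage can be swapped independently, which is what choosing a different fetcher, parser or processor in the UI amounts to.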

Default configurations

Don't forget to enable the "Feeds Admin UI" module!

After you install Feeds and its submodules, go to Administration > Structure > Feed importers (admin/structure/feeds). You will find four default importer configurations:

  • Feed
    Provided by the submodule "Feeds News".
    Aggregation importer. Aggregates RSS/Atom feeds to nodes. Provides a node type Feed and a node type Feed item. Create one or more "Feed" nodes to add RSS/Atom feeds to your site. On cron, these feeds will continuously produce "Feed item" nodes.
  • OPML import
    Provided by the submodule "Feeds News".
    Import an OPML file and create Feed nodes from its entries. This configuration should be used together with the "Feed" configuration. To use this importer go to http://www.example.com/import.
  • Node import
    Provided by the submodule "Feeds Import".
    Import nodes from a CSV file (http://drupal.org/node/622710#csv). To use this importer go to http://www.example.com/import.
  • User import
    Provided by the submodule "Feeds Import".
    Import users from a CSV file. To use this importer go to http://www.example.com/import.

Creating an importer configuration

If the default importers don't fit your use case, you can modify them (click "override"), copy them (click "clone"), or start from scratch (click "New importer").

Here is a short rundown of how to create your own importer. Copying or modifying an existing one is very similar.

  1. Go to admin/structure/feeds and click "New importer".
  2. Add a name and a description.
  3. Click "create". You are taken to the importer's configuration page. From here on, modifying or copying an existing importer and configuring your new importer work essentially the same way.
  4. Go to "Basic settings". Decide whether the importer should be used on a standalone form or by creating a node ("Attached to content type"); decide whether the importer should periodically refresh the feed and in what time interval it should do that ("Periodic import").
  5. Click "Change" next to "Fetcher" and pick a suitable fetcher for your job. Do the same for "Parser" and "Processor".
  6. Review the settings of each fetcher, parser and processor and adjust them to your job's requirements.
  7. On "Processor" click on "Mapping": define which elements of the feed ("Sources", for example the published date of a feed item) should be mapped to which elements of the Drupal entities ("Targets", for example a node type's fields). A legend at the bottom of the mapping page explains the available mapping sources and targets. This step is mandatory; if you omit it, the import produces empty entities.
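
The effect of the mapping step can be sketched in a few lines. This is an illustration under assumptions, not the Feeds mapping API: the function apply_mapping() and the source and target names are hypothetical.

```php
<?php
// Hypothetical mapping: parser source name => entity target name.
$mapping = array(
  'title' => 'title',
  'pubDate' => 'created',
  'description' => 'body',
);

// One parsed feed item, as a parser might produce it (sample data).
$item = array(
  'title' => 'Example item',
  'pubDate' => 1300000000,
  'description' => 'Example text',
  'guid' => 'abc-123',
);

// Copies each mapped source value of the item onto its target.
// Sources without a mapping (here: guid) are ignored.
function apply_mapping(array $mapping, array $item) {
  $entity = array();
  foreach ($mapping as $source => $target) {
    if (isset($item[$source])) {
      $entity[$target] = $item[$source];
    }
  }
  return $entity;
}

$entity = apply_mapping($mapping, $item);

// With no mapping defined, nothing is copied and the entity stays empty.
$empty = apply_mapping(array(), $item);
```

The second call shows why the step cannot be skipped: without any source-to-target pairs, the processor has nothing to write into the entity.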

Read more in Creating/editing importers

Using your importer

If you have set the importer to run periodically under "Basic settings", cron and the fetcher take care of running the importer.

For one-off imports, run the importer manually by going to example.com/import.

Use the glossary

Confused by the terminology? Take a look at the Feeds glossary to get an overview of the terminology in Feeds.

Requirements and Installation

Install Feeds like any other Drupal module. If you are installing it for the first time, make sure you enable the Feeds, Feeds Admin UI and Feeds Defaults modules, which are all included in the download. Don't forget to configure cron! Feeds also requires the cURL library for PHP (http://drupal.org/node/731918), for example the php5-curl package.

For required modules and other details on requirements and installation, consult the README.txt file included in the module.

Exportables and default hook

Every importer configuration can be exported. Go to admin/structure/feeds and click on "export". Copy the exported code and paste it into an implementation of hook_feeds_importer_default() in your module.

The export code will populate a variable called $feeds_importer. At the end of the hook, copy $feeds_importer into an export array and return it.

Here is an example:

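/**
 * Implements hook_feeds_importer_default().
 *
 * Default definition of 'myimporter'.
 */
function mymodule_feeds_importer_default() {
  $export = array();

  // Paste the exported code here; it populates $feeds_importer.
  $feeds_importer = new stdClass();
  // TRUE makes this default importer disabled initially.
  $feeds_importer->disabled = TRUE;
  $feeds_importer->api_version = 1;
  $feeds_importer->id = 'myimporter';
  $feeds_importer->config = array(
    // ...
  );

  // Key the export array by the importer id.
  $export['myimporter'] = $feeds_importer;
  return $export;
}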

For this hook to be found, your module must also declare that it supports the Feeds importer API by implementing hook_ctools_plugin_api():

function mymodule_ctools_plugin_api($module = '', $api = '') {
  if ($module == "feeds" && $api == "feeds_importer_default") {
    // The current API version is 1.
    return array("version" => 1);
  }
}

Alternatively, you can use Features to export Feeds configuration.

Performance

How many feeds can the Feeds module download, and how frequently?

Unfortunately, this question has no general answer. Overall aggregation performance depends on:

  • Your server's CPU and storage I/O performance.
  • Your server's network connection.
  • The content type being created (complex content type? simple Data record?).
  • The activity of your feeds being processed (many new items per run?).
  • The number of feeds being processed.
  • The parser being used (not as critical as other factors).

Usually, as performance degrades, feeds will appear stale: no new items show up even though the original feed was updated a while ago.

Staleness increases with the number of feeds you add. A good measure of overall aggregation performance is the time difference between the most recently updated feed and the least recently updated feed:

-- my_importer_id is the id of the importer to be examined (can be looked up in the feeds_importer table).
SELECT MAX(last) - MIN(last) FROM job_schedule WHERE id = 'my_importer_id';

The result of this query is a time span in seconds. For instance, a result of 3600 would mean that there is 1 hour between the feed that has just been updated and the feed that has not been updated for the longest time of all feeds.
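
The same arithmetic can be worked through by hand. The sketch below uses three made-up "last" timestamps (Unix seconds); they are hypothetical sample values, not output from a real job_schedule table.

```php
<?php
// Hypothetical "last" timestamps (Unix seconds) for three feeds that
// belong to the same importer.
$last = array(1300000000, 1300001800, 1300003600);

// Same arithmetic as MAX(last) - MIN(last) in the query above.
$span = max($last) - min($last);

// Express the span in hours for readability.
$hours = $span / 3600;

print $span . " seconds, or " . $hours . " hour(s)\n";
```

With these sample values the span is 3600 seconds, which matches the one-hour case described above.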

To make sure that results are sane, also compare against current time:

-- Watch out: UNIX_TIMESTAMP() returns the DB's time, which may or may not be the same as in PHP. Use date_part('epoch',now()) if you're on pgsql.
SELECT UNIX_TIMESTAMP() - MIN(last) FROM job_schedule WHERE id = 'my_importer_id';
SELECT UNIX_TIMESTAMP() - MAX(last) FROM job_schedule WHERE id = 'my_importer_id';

Performance: tuning

I experience performance problems: feeds are not updating as often as they should

Here are some options if you experience performance issues with Feeds:

  1. Make sure cron runs often enough, for example every 6 minutes.
  2. Run cron with drush.
  3. Alternatively, use Superfeedr (http://superfeedr.com) as a dedicated PubSubHubbub hub (see the Feeds README file).
  4. Improve system resources: analyze bottlenecks. Chances are your storage I/O maxes out, as heavy aggregation involves a lot of writes. The exact remedies will depend on your findings but could be one or more of these: tune database settings, move the DB to a separate server, add RAM to the DB server, or rearchitect to use a lighter storage model such as Data.
  5. If you are using MySQL, be aware that by nature most of what Feeds does is update data in the database, so these entries will be captured in your binary log. If you are importing large feeds, this means LOTS of log entries in the binary log file(s). Make sure that you have enough disk space for these logs and don't keep them for longer than you need. See the MySQL Binary Log page for more. If you run out of space on your logging drive, your Drupal site will stop working until you fix it.
