Edit feed-related settings such as entity name, delete protection, processor, or hash manager.

Feed

  • Feed name - any human readable name
  • Entity name - the type of entity to import. You cannot change this if fields are already attached
  • Import at cron - enable/disable import when cron runs
  • Delete protection - useful when you run regular imports and want to prevent expired entities from being deleted.
    When protection is triggered, all feed items are rescheduled
    • Source file or network is unavailable - protect when the source is not reachable (for example, on network errors)
    • Total number of items is less than - protect when the source provides fewer items than you expect.
      You can also use a percentage relative to the last import.

      If you use a percentage and protection is triggered, the stored number of items from the last import is not updated
  • Pre-process callback - a function called before the feed is imported, which you can use to make changes to the configuration
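
The item count rule of delete protection can be pictured as a small check. This is only a sketch of the rule described above, not the module's code: the function name, its parameters and the "80%" notation for percentages are assumptions made for illustration.

```php
/**
 * Hypothetical helper: tells if delete protection should be triggered.
 *
 * @param int $current_count
 *   Number of items provided by the source in this import
 * @param int|string $threshold
 *   Minimum expected items, or a percentage string like "80%"
 *   (assumed notation)
 * @param int|null $last_count
 *   Number of items in last import, needed only for percentages
 *
 * @return bool
 *   TRUE if expired entities should be protected from deletion
 */
function example_needs_delete_protection($current_count, $threshold, $last_count = NULL) {
  if (is_string($threshold) && substr($threshold, -1) === '%') {
    if ($last_count === NULL) {
      // Nothing to compare with (first import), so no protection.
      return FALSE;
    }
    // Percentage is relative to the number of items in last import.
    $minimum = $last_count * (float) rtrim($threshold, '%') / 100;
  }
  else {
    $minimum = (int) $threshold;
  }
  return $current_count < $minimum;
}
```

For example, with 100 items in the last import and a threshold of "80%", a source that now provides only 70 items would trigger protection.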

Processor

A processor handles the whole import process.

These settings are for the default processor: Feed Import Processor. Modules can provide alternative processors.

  • After how many created entities to save them - when 0, all items are imported first and saved/updated afterwards.
    If greater than 0, a save/update occurs after each chunk of imported items, freeing memory. If you have a huge source, consider a number like 200-300, which generally works for most configurations
  • Skip already imported items - when active and items are monitored (have a unique id), already imported items will not be updated.
    You can use this if you want to import only new items and don't want updates. However, if the expire time is not 0 (see Hash Manager) and the item is not protected, expired entities will still be deleted
  • Only update already imported items - with this option no new entity will be created. This is possible only when items are monitored
  • Reset entity static cache - when entities are loaded they can be cached in memory. If you have a lot of items this can result in high memory usage. Use this setting to clear the static cache, or use 0 to disable clearing
  • Throw exception on error - if active and an error occurs, the import will stop. If you are a developer, use a try-catch statement to prevent the import from stopping
  • Maximum number of errors to log - max number of errors to show in log, if reports are enabled. Do not use a big value (>200) since it will eat memory
  • Stop import if a filter function is not declared - as it says, the import will not start if a filter function is missing. Recommended to be active, otherwise the missing filter function will not be used which can result in a broken import
  • Skip creating already declared dynamic functions - this is useful if you try to import multiple feeds that declare the same dynamic function names. However, it is recommended to put those functions in a PHP filter file rather than creating them dynamically
  • Unique id alter callback - Function to call before the hash is computed using unique id value.
  • Entity after save/update callback - a function name to be called after an entity was saved or updated. Details in the section of the same name below
  • Entity before combine callback - a function name to be called when an imported item already exists as an entity, before the two are merged. You can alter both the item and the current entity, or you can skip (the merging or the import of the entity), protect or reschedule the entity. Details in the section of the same name below
  • Entity after combine callback - a function name to be called when an imported item already exists as an entity, after the item and the entity were merged. You can alter the current entity, or you can skip, protect or reschedule the entity. Details in the section of the same name below
  • Entity before create callback - a function name to be called before a new entity is created. You can skip the entity import, or create or save the entity yourself. Details in the section of the same name below
  • Entity before save callback - a function name to be called before a new entity is saved, after it was created. You can skip the entity import or save the entity yourself. Details in the section of the same name below
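
The effect of "After how many created entities to save them" can be pictured with the sketch below. It is not the processor's code: the function name and the $save callable are hypothetical and only show how a chunk size of 0 differs from a positive one.

```php
/**
 * Hypothetical sketch: saves created items in chunks.
 *
 * @param array $items
 *   Created items waiting to be saved
 * @param int $chunk_size
 *   Save after this many items, 0 means save everything at the end
 * @param callable $save
 *   Function receiving an array of items to save (assumption)
 *
 * @return int
 *   Number of save operations performed
 */
function example_import_in_chunks(array $items, $chunk_size, $save) {
  $saves = 0;
  $pending = array();
  foreach ($items as $item) {
    $pending[] = $item;
    if ($chunk_size > 0 && count($pending) >= $chunk_size) {
      // Chunk is full: save it and free the memory.
      call_user_func($save, $pending);
      $pending = array();
      $saves++;
    }
  }
  if ($pending) {
    // Save what is left (everything, when chunk size is 0).
    call_user_func($save, $pending);
    $saves++;
  }
  return $saves;
}
```

With a chunk size of 0 all items stay in memory until the single final save, which is why a positive value is suggested for huge sources.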

Entity after save/update callback

Function name to call after entity was saved (a new one) or updated.

This callback receives two params: the entity and a flag telling if the entity is new. Example:

/**
 * My after save or update callback
 *
 * @param object $entity
 *   The saved/updated entity
 * @param boolean $is_new
 *   True if entity is new or false if is updated
 */
function my_after_save_or_update_callback($entity, $is_new) {
  // code ...
}

Entity before combine callback

Function to call before merging source item with an existing entity.

This callback receives three params: item, current entity and changed status (in this case status will always be FALSE). Values should be passed by reference. Example:

  /**
   * My before combine callback description
   *
   * @param array &$item
   *   Current feed item
   * @param object &$entity
   *   Entity matched for item
   * @param bool &$changed
   *   Merge status will always be false
   *   because merge was not called yet, but
   *   you can alter entity and this param
   *   if you want to skip combine
   *
   * @return int
   *   A FeedImportProcessor::ENTITY_* constant
   */
  function my_before_combine_callback(array &$item, &$entity, &$changed) {
    // code ...
  }

Return value can be one of the following constants from FeedImportProcessor class:

  • ENTITY_CONTINUE - continue the import process normally
  • ENTITY_NO_COMBINE - skip merging this item with the current entity and go to update.
    You can also implement your own merging mechanism by using this, but remember to set the changed status
  • ENTITY_RESCHEDULE - reschedule entity for deletion, skipping merge and update
  • ENTITY_SKIP - skip merge, update and reschedule, and continue to next item
  • ENTITY_MARK_PROTECTED - mark item as protected (see Protected items) and act like ENTITY_SKIP
  • any other returned value acts like ENTITY_CONTINUE

See Operations order when updating an entity.

Example of marking an item as protected

  function my_entity_protect_before_combine_callback(array &$item, &$entity, &$changed) {
    if ($entity->user_id == 1) {
      // This item was edited by admin so it won't accept other changes
      return FeedImportProcessor::ENTITY_MARK_PROTECTED;
    }
    // Normally continue import
    // You can also use: return 0; instead of FeedImportProcessor::ENTITY_CONTINUE
    return FeedImportProcessor::ENTITY_CONTINUE;
  }
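
Example of a custom merge using ENTITY_NO_COMBINE. This is a minimal sketch: the function name is hypothetical, and it assumes the item carries the new title under a 'title' key and the entity has a title property, which depends on your feed configuration.

```php
function example_custom_merge_before_combine_callback(array &$item, &$entity, &$changed) {
  // Assumption: only the title is taken from source, other values are kept.
  if (isset($item['title']) && (!isset($entity->title) || $entity->title !== $item['title'])) {
    $entity->title = $item['title'];
    // Mark as changed, otherwise the entity will not be updated!
    $changed = TRUE;
  }
  // Default merging is skipped because it was done above
  return FeedImportProcessor::ENTITY_NO_COMBINE;
}
```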

Entity after combine callback

Function to call after source item was merged with entity.

This callback receives three params: item, current entity and changed status. Values should be passed by reference. Example:

  /**
   * My after combine callback description
   *
   * @param array &$item
   *   Current feed item
   * @param object &$entity
   *   Entity matched for item
   * @param bool &$changed
   *   Merge status, indicates if
   *   the merging operation changed
   *   the entity
   *
   * @return int
   *   A FeedImportProcessor::ENTITY_* constant
   */
  function my_after_combine_callback(array &$item, &$entity, &$changed) {
    // code ...
  }

Return value is the same as Entity before combine callback except that ENTITY_NO_COMBINE will be treated as ENTITY_CONTINUE
since the merging operation was already made.

Example of altering an entity

  function my_entity_alter_after_combine_callback(array &$item, &$entity, &$changed) {
    // Check if entity already has user_id
    if (!isset($entity->user_id)) {
      // Add user 0
      $entity->user_id = 0;
      // Set some log message in field_log
      $entity->field_log[$entity->language][0]['value'] =
          t('Added default user on @date', array('@date' => date('Y-m-d H:i:s')));
      // Mark as changed!
      $changed = TRUE;
    }
    // Continue import
    return FeedImportProcessor::ENTITY_CONTINUE;
  }

Operations order when updating an entity

If Processor » Skip already imported items is active, entities will not be updated! Otherwise:

  1. if item is marked as protected => goto 8
  2. load entity using item's hash
  3. call before combine callback (if exists). If returns:
    • ENTITY_CONTINUE => goto 4
    • ENTITY_NO_COMBINE => goto 6
    • ENTITY_RESCHEDULE => goto 7
    • ENTITY_SKIP => goto 8
    • ENTITY_MARK_PROTECTED => mark as protected and goto 8
  4. merge feed item with entity
  5. call after combine callback (if exists). If returns:
    • ENTITY_CONTINUE => goto 6
    • ENTITY_RESCHEDULE => goto 7
    • ENTITY_SKIP => goto 8
    • ENTITY_MARK_PROTECTED => mark as protected and goto 8
  6. if entity was changed update it and call after save callback (if exists)
  7. reschedule entity if needed
  8. process next item
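
The steps above can be traced with a small function. This is only a hypothetical walk-through (it does not call the module): callback results are given as plain strings so the sketch does not depend on the real constant values, and the result is the list of step numbers reached.

```php
/**
 * Hypothetical walk-through of the update steps.
 *
 * @param bool $protected
 *   TRUE if item is already marked as protected
 * @param string $before
 *   Name of the constant returned by before combine callback
 * @param string $after
 *   Name of the constant returned by after combine callback
 *
 * @return array
 *   Step numbers reached, in order
 */
function example_update_steps($protected, $before = 'ENTITY_CONTINUE', $after = 'ENTITY_CONTINUE') {
  if ($protected) {
    return array(1, 8);
  }
  $steps = array(1, 2, 3);
  switch ($before) {
    case 'ENTITY_SKIP':
    case 'ENTITY_MARK_PROTECTED':
      $steps[] = 8;
      return $steps;

    case 'ENTITY_RESCHEDULE':
      array_push($steps, 7, 8);
      return $steps;

    case 'ENTITY_NO_COMBINE':
      array_push($steps, 6, 7, 8);
      return $steps;
  }
  // ENTITY_CONTINUE or other value: merge and call after combine callback.
  array_push($steps, 4, 5);
  switch ($after) {
    case 'ENTITY_SKIP':
    case 'ENTITY_MARK_PROTECTED':
      $steps[] = 8;
      return $steps;

    case 'ENTITY_RESCHEDULE':
      array_push($steps, 7, 8);
      return $steps;
  }
  // ENTITY_CONTINUE, ENTITY_NO_COMBINE or other value.
  array_push($steps, 6, 7, 8);
  return $steps;
}
```

For instance, when the before combine callback returns ENTITY_NO_COMBINE the merge (step 4) and the after combine callback (step 5) are never reached.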

Entity before create callback

Function name to call before creating a new entity.

This callback receives one param: the entity as reference. Example:

/**
 * My before create callback
 *
 * @param array &$entity
 *   The entity data
 *
 * @return int
 *   A FeedImportProcessor::ENTITY_* constant
 */
function my_before_create_callback(array &$entity) {
  // code ...
}

Return value can be one of the following constants from FeedImportProcessor class:

  • ENTITY_CONTINUE - continue the import process normally
  • ENTITY_SKIP - skip create and save and continue to next item
  • ENTITY_SKIP_CREATE - skip create and go to save (meaning you created the entity in the callback)
  • ENTITY_SKIP_SAVE - skip save (meaning you created and saved the entity in the callback)

See Operations order when creating a new entity.
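
Example of skipping an item before the entity is created. This is a minimal sketch with a hypothetical function name; it assumes the entity data holds the title under a 'title' key, which depends on your entity type and field mapping.

```php
function example_skip_untitled_before_create_callback(array &$entity) {
  // Assumption: entity data has a 'title' key
  if (empty($entity['title'])) {
    // No title, so do not create nor save this entity
    return FeedImportProcessor::ENTITY_SKIP;
  }
  // Normally continue import
  return FeedImportProcessor::ENTITY_CONTINUE;
}
```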

Entity before save callback

Function name to call before saving a new entity.

This callback receives one param: the created entity object. Example:

/**
 * My before save callback
 *
 * @param object $entity
 *   The entity object
 *
 * @return int
 *   A FeedImportProcessor::ENTITY_* constant
 */
function my_before_save_callback($entity) {
  // code ...
}

Return value is the same as Entity before create callback except that ENTITY_SKIP_CREATE will be treated as ENTITY_CONTINUE
since the entity was already created.
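
Example of altering a new entity before it is saved. This is a minimal sketch with a hypothetical function name; it assumes the created entity is a node-like object with a status property.

```php
function example_unpublish_before_save_callback($entity) {
  // Assumption: entity has a status property (like nodes).
  // New imported entities are saved as unpublished to be reviewed later.
  $entity->status = 0;
  // Continue with save
  return FeedImportProcessor::ENTITY_CONTINUE;
}
```

The entity is an object, so changes made in the callback are kept even though the param is not declared as reference.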

Operations order when creating a new entity

  1. call before create callback (if exists). If returns:
    • ENTITY_CONTINUE => goto 2
    • ENTITY_SKIP_CREATE => goto 3
    • ENTITY_SKIP_SAVE => goto 5
    • ENTITY_SKIP => goto 7
  2. create entity
  3. call before save callback (if exists). If returns:
    • ENTITY_CONTINUE => goto 4
    • ENTITY_SKIP_SAVE => goto 5
    • ENTITY_SKIP => goto 7
  4. save entity
  5. call after save callback (if exists)
  6. save hash if items are monitored
  7. process next item
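
The steps above can be traced the same way. Again, this is only a hypothetical walk-through that does not call the module: callback results are given as plain strings and the result is the list of step numbers reached.

```php
/**
 * Hypothetical walk-through of the create steps.
 *
 * @param string $before_create
 *   Name of the constant returned by before create callback
 * @param string $before_save
 *   Name of the constant returned by before save callback
 *
 * @return array
 *   Step numbers reached, in order
 */
function example_create_steps($before_create = 'ENTITY_CONTINUE', $before_save = 'ENTITY_CONTINUE') {
  $steps = array(1);
  switch ($before_create) {
    case 'ENTITY_SKIP':
      $steps[] = 7;
      return $steps;

    case 'ENTITY_SKIP_SAVE':
      array_push($steps, 5, 6, 7);
      return $steps;

    case 'ENTITY_SKIP_CREATE':
      // Entity was created in callback, step 2 is not needed.
      break;

    default:
      $steps[] = 2;
  }
  $steps[] = 3;
  switch ($before_save) {
    case 'ENTITY_SKIP':
      $steps[] = 7;
      return $steps;

    case 'ENTITY_SKIP_SAVE':
      array_push($steps, 5, 6, 7);
      return $steps;
  }
  // ENTITY_CONTINUE, ENTITY_SKIP_CREATE or other value.
  array_push($steps, 4, 5, 6, 7);
  return $steps;
}
```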

Hash Manager

Hash Manager handles the hashes used to monitor items. Hashes are created using the value of the unique id (if the unique id is missing, the hash manager does nothing because it is a one-time import).

These settings are for the default hash manager: SQL Hash Manager. Modules can provide alternative hash managers.

As its name says, the hashes are stored in an SQL database.

  • Group - the group for this feed. Multiple feeds can update the same entities if they belong to the same group. See Group - what is it?
  • Keep imported items - after how many seconds the items expire and should be deleted.
    If you use 0 the items never expire but will still receive updates (unless Processor » Skip already imported items is active).
  • Minimum number of hashes to commit update - when this number of updated entities is reached, hash manager will update expire time (reschedule or protect entities)
  • Minimum number of hashes to commit insert - when this number of created entities is reached, hash manager will insert entity info, hash and expire time (to monitor entities)

Group - what is it?

In version 2.x the group was the feed's machine name (you could not change it), which was ok in most cases. Consider the following scenario in 2.x:

I regularly import books from 2 feed sources with different structures (one is xml and the other one is csv).
I'll need two feed configurations: one for xml and one for csv, both using the ISBN value as unique id.
Done! Wait, why do I have duplicates?

The answer is simple: the feeds have different machine names, resulting in different hashes for the same unique id.
By using the Group setting in 3.x you can say that both xml and csv feeds belong to the same group (let's say books) and
there will be no more duplicates (this also reduces the number of hashes in the database).

Remember! The hash is composed using three items: entity type, group and provided unique id!
So, if you have two feeds with the same group but handling different entity types, there will be duplicates
(you cannot say that a user is a node nor that a taxonomy term is a commerce product).
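
The effect of the group can be seen in a small sketch. The function below is hypothetical and the way the three parts are joined and hashed (md5) is an assumption; the real hash manager may compute it differently. Only the idea matters: same entity type, group and unique id give the same hash.

```php
/**
 * Hypothetical hash built from entity type, group and unique id.
 */
function example_item_hash($entity_type, $group, $unique_id) {
  // Assumption: md5 over the joined parts, real algorithm may differ.
  return md5($entity_type . '/' . $group . '/' . $unique_id);
}

$isbn = '9780131103627';

// 2.x behavior: group is the machine name, so hashes differ => duplicates
$different = example_item_hash('node', 'books_xml', $isbn) !== example_item_hash('node', 'books_csv', $isbn);

// 3.x with same group: both feeds get the same hash => same entity
$same = example_item_hash('node', 'books', $isbn) === example_item_hash('node', 'books', $isbn);
```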

Protected items

Protected items are entities that can no longer be updated or deleted by Feed Import.

Consider the following scenario where protected items are useful:

I'm importing articles into my site from external sources.
As long as imported articles are not manually edited by a user I want to update them from source.
Also, I don't want manually edited articles to expire (to be deleted).

You can mark an entity as protected using Processor » Entity before/after combine callback.
Because there are a lot of possible scenarios, you'll have to build the logic of "when an entity becomes protected" in a PHP function.


Comments

TangMonk’s picture

I just want to import a country city list, my god..

Sorin Sarca’s picture

All settings have defaults, they are just explained. Please open an issue if you have problems, I'll be happy to help you.

rahu231086’s picture

Hello there,

Hope all is well. Your module seems to be great but very complex i have been using feeds module for importing of Drupal Commerce products and its nodes and it's easy to use feeds. So could your module can handle commerce products import.

thanks

Be COOL

phponwebsites’s picture

How do I get the skipped items while importing from a csv file? You can skip the current line and move to the next line using $entity->feeds_item->skip = TRUE.
Ok, fine. But then how can I get the skipped item values?

Sorin Sarca’s picture

Hi, please post an issue if you need any help.

phponwebsites’s picture

Actually i explained my issue on my previous comment. Once again i explained for you.
I can import question using feed_import and qq_import module in drupal 7.
I want to skip some lines while importing.
I can also skipped paritcular line while importing csv file using

$entity->feeds_item->skip = TRUE;

Now i want to get all skipped line data into new csv file. So i need help to solve this problem. How can do this?

Sorin Sarca’s picture

Is your comment a question or request for support?
Take it to the forums. Questions are not answered here, and comments not providing useful information may be removed at any time.

If you need any help, once again, please post an issue because this isn't the right place to solve it.

hughworm’s picture

"Only update already imported items", "Skip creating already declared dynamic functions" and "Unique id alter callback" were missing from this documentation. I've added them as displayed on the form, but more info may be useful.

If at first you don’t succeed, call it version 1.0.