Feed Import is a module that allows you to import content into entities from various file types (like XML, HTML, CSV).
This page describes the general way to use Feed Import and gives examples of how to build a feed configuration to import an XML file, an HTML page or a CSV file.
For the list of filters provided by Feed Import, check this page.
First, navigate to admin/config/services/feed_import. You can change global settings using the "Settings" link. To add a new feed, click the "Add new feed" link and fill in the form.
After saving the feed, click the "Edit" link in the operations column. You can change the process function and its settings in the PROCESS FUNCTION SETTINGS fieldset. At the bottom of the page is a fieldset with XPATH settings. Add the XPATH for the required item parent and for the unique id (you can save the feed at this point). To add a new field, choose one from the "Add new field" select and click the "Add selected field" button. A fieldset with the field settings appears, where you can enter the xpath(s) and the default action/value. Add more fields if you wish, and when you are done click the "Save feed" submit button.
Check that all fields are OK. If you want to (pre)filter values, select the "Edit (pre)filter" tab. It shows a fieldset for each selected field. Click the "Add new filter" button of the desired field to add a new filter. Enter a filter name that is unique per field (this can be anything that lets you quickly identify the filter), enter the function name (any PHP function, including static methods written as ClassName::functionName) and enter the parameters for the function, one per line. To pass the field value as a parameter, enter [field] on its own line. There are some static filter functions in the FeedImportFilter class (file feed_import_filter.inc.php) that you can use. Please take a look. I'll add more soon.
If you want to replace [field] with a different placeholder, go to Settings. You can add or remove as many filters as you want, but don't forget to click the "Save filters" submit button to save them all.
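To make the parameter mechanism concrete, here is a minimal Python sketch of how a filter chain of this kind could work. It is not the module's code: the names apply_filter and apply_filters are hypothetical, and the only behaviour taken from the text above is that parameters are listed one per line and that the [field] placeholder stands for the current field value.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Placeholder described in the text; the module lets you change it in Settings.
PLACEHOLDER = "[field]"


def apply_filter(func: Callable[..., Any], params: Sequence[str], value: Any) -> Any:
    """Call func with the listed params, putting the field value where [field] is."""
    args = [value if param == PLACEHOLDER else param for param in params]
    return func(*args)


def apply_filters(filters: List[Tuple[Callable[..., Any], Sequence[str]]], value: Any) -> Any:
    """Run several filters in order; each one receives the previous result."""
    for func, params in filters:
        value = apply_filter(func, params, value)
    return value


# Two filters on the same field: trim whitespace, then upper-case.
shout = apply_filters(
    [(str.strip, ["[field]"]), (str.upper, ["[field]"])],
    "  empire burlesque ",
)
print(shout)  # EMPIRE BURLESQUE
```

A filter with extra literal parameters works the same way: apply_filter(str.replace, ["[field]", "$", ""], "$10.00") passes the field value first and the two literals after it.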
Now you can enable the feed and test it.
As the XML file we will use an example from w3schools: http://www.w3schools.com/xml/cd_catalog.xml.
1- First we create a new content type (admin/structure/types/add) named "Music collection" (with machine name: music_collection) and add some fields to it:
- field_artist (text)
- field_country (text)
- field_year (integer)
- field_price (float)
- Save this content type.
2- Go to Feed Import (admin/config/services/feed_import).
2.1- Add a new feed named "Music collection feed".
2.2- Select entity name "node".
2.3- Set xml url to "http://www.w3schools.com/xml/cd_catalog.xml" (without quotes).
2.4- Select "processXML" for process function and click on "Add feed" button to save the new feed.
3- Click the Edit link to edit the new feed.
3.1- Scroll down to the bottom of the page at XPATH SETTINGS and type "//CD" (without quotes) to set the "ENTER PARENT ITEM XPATH" field.
3.2- Type "TITLE" (without quotes) to set the "Enter XPATH TO UNIQUE IDENTIFIER OF ITEM" field.
3.3- From the "SELECT DEFINED FIELD" drop-down menu, choose "type" and click the "Add field" button.
3.4- Leave the XPATH textarea blank and enter "music_collection" (without quotes) as the default value; this sets the node type that content is imported into. Note 1: filling in the 'type' field is mandatory! Note 2: If you are using the "Rubik" theme as your admin theme, you won't see the "default value" field. Switching back to the core "Seven" admin theme is recommended.
3.5- Add more fields with xpaths (fieldname: xpath_value - replace "fieldname" with the machine-name of your Content Type field, and "xpath_value" with the element value in your XML source):
- title: TITLE
- field_artist: ARTIST
- field_country: COUNTRY
- field_year: YEAR
- field_price: PRICE
You can also add a default value for each field (for title you can add "Unknown song", for artist "Unknown artist" and so on).
If you can't find some fields in the select, clear the cache and refresh the page (don't forget to save).
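The relationship between the parent xpath (//CD) and the per-field xpaths (TITLE, ARTIST and so on) can be illustrated outside Drupal. The sketch below uses Python's xml.etree.ElementTree on an inline sample shaped like the catalog file; extract_items and the two dictionaries are hypothetical names, and the default values are the ones suggested above.

```python
import xml.etree.ElementTree as ET

# Inline sample shaped like cd_catalog.xml (only two items, for brevity).
SAMPLE = """<CATALOG>
  <CD><TITLE>Empire Burlesque</TITLE><ARTIST>Bob Dylan</ARTIST>
      <COUNTRY>USA</COUNTRY><PRICE>10.90</PRICE><YEAR>1985</YEAR></CD>
  <CD><TITLE>Hide your heart</TITLE><ARTIST>Bonnie Tyler</ARTIST>
      <COUNTRY>UK</COUNTRY><PRICE>9.90</PRICE><YEAR>1988</YEAR></CD>
</CATALOG>"""

# fieldname -> xpath relative to the parent item, as in step 3.5.
FIELD_XPATHS = {
    "title": "TITLE",
    "field_artist": "ARTIST",
    "field_country": "COUNTRY",
    "field_year": "YEAR",
    "field_price": "PRICE",
}

# Default values used when an xpath matches nothing.
DEFAULTS = {"title": "Unknown song", "field_artist": "Unknown artist"}


def extract_items(xml_text):
    """Return one dict per <CD> element, reading each field relative to it."""
    root = ET.fromstring(xml_text)
    items = []
    for parent in root.findall(".//CD"):  # the parent xpath "//CD"
        item = {"type": "music_collection"}  # no xpath, default value only
        for field, xpath in FIELD_XPATHS.items():
            item[field] = parent.findtext(xpath, default=DEFAULTS.get(field))
        items.append(item)
    return items


items = extract_items(SAMPLE)
print(items[0]["title"], items[0]["field_price"])
```

Each field xpath is evaluated relative to one parent item, which is why the field xpaths are plain element names rather than full paths.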
Note: For this example we don't use pre-filters, only a filter.
- Go to the EDIT FILTERS tab and click on FIELD_PRICE.
- Now click "Add new filter ... ". As filter name enter "Round price" (without quotes), as function name enter "round" (without quotes), and in function params enter "[field]" (without quotes) on the first line.
If you take a look at PHP's round() function, you will see that it accepts a second parameter for precision. Don't enter a second parameter for now; save the filters by pressing the "Save filters" button.
The feed is ready to be processed. Go back to admin/config/services/feed_import and click the "Process" operation for this feed. A success status should appear. To view the imported content, go to admin/content and filter by type "Music collection". Check the content to confirm that every item has a rounded price (compare with the XML file).
Go back to the filters of the price field and add a second parameter (on the second line) for the round function, with the value 1. Save and process the import again. You will notice that the items were updated rather than duplicated, and that the prices have changed.
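The "updated, not duplicated" behaviour follows from the unique identifier xpath (TITLE) set in step 3.2. The Python sketch below models that idea with a dictionary keyed by the unique id. It is an illustration only: process_feed and the in-memory store are hypothetical, and Python's round() stands in for PHP's (the two differ on exact halves, which the sample prices avoid).

```python
def process_feed(store, rows, precision=None):
    """Import rows into store, keyed by the unique id; return (created, updated)."""
    created = updated = 0
    for row in rows:
        unique_id = row["TITLE"]  # value of the unique identifier xpath
        price = float(row["PRICE"])
        # The "round" filter with one parameter, or with a precision as second one.
        price = round(price) if precision is None else round(price, precision)
        if unique_id in store:
            updated += 1
        else:
            created += 1
        store[unique_id] = {"title": row["TITLE"], "field_price": price}
    return created, updated


rows = [
    {"TITLE": "Empire Burlesque", "PRICE": "10.90"},
    {"TITLE": "Hide your heart", "PRICE": "9.90"},
]
store = {}
first_run = process_feed(store, rows)                # round(price)
first_price = store["Empire Burlesque"]["field_price"]
second_run = process_feed(store, rows, precision=1)  # round(price, 1)
second_price = store["Empire Burlesque"]["field_price"]
print(first_run, first_price, second_run, second_price)
```

The second run finds both titles already present, so it overwrites the stored prices instead of adding new entries.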
But this isn't enough... Now we want to tag these nodes with Country and Artist so we can filter nodes.
To do that, first go to "manage fields" for our content type and add the existing field field_tags. Then go back to Feed Import, edit our feed configuration and add the field_tags field. Set its XPATH to COUNTRY|ARTIST (this selects both country and artist) and, for the action when the result is empty, select "Ignore this field". Save the feed and go to the EDIT FILTERS tab. Add a new filter for field_tags with name "Tag with country and artist" and function "FeedImportFilter::setTaxonomyTerms"; the first parameter (on the first line) is "[field]" and the second (on the second line) is "Tags" (all without quotes). Save the filter and process the feed. All nodes are now tagged, and you can filter them by country or artist.
Note: If you don't want to add new terms to the vocabulary, use FeedImportFilter::getTaxonomyIdByName instead of FeedImportFilter::setTaxonomyTerms.
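The difference between the two filter functions can be pictured with a small Python model. This is a sketch under assumptions, not the module's implementation: the vocabulary is a plain dict of term name to id, set_terms and get_term_ids are hypothetical stand-ins for the behaviour described above (create missing terms versus look up existing ones only), and select_union emulates the COUNTRY|ARTIST union because ElementTree's limited XPath has no union operator.

```python
import xml.etree.ElementTree as ET


def select_union(parent, tags):
    """Emulate the xpath COUNTRY|ARTIST: matching children in document order."""
    return [child.text for child in parent if child.tag in tags and child.text]


def set_terms(vocabulary, names):
    """Return term ids, creating any term that is missing from the vocabulary."""
    ids = []
    for name in names:
        if name not in vocabulary:
            vocabulary[name] = len(vocabulary) + 1  # hypothetical id scheme
        ids.append(vocabulary[name])
    return ids


def get_term_ids(vocabulary, names):
    """Return ids of existing terms only; unknown names are left out."""
    return [vocabulary[name] for name in names if name in vocabulary]


cd = ET.fromstring(
    "<CD><TITLE>Empire Burlesque</TITLE><ARTIST>Bob Dylan</ARTIST>"
    "<COUNTRY>USA</COUNTRY></CD>"
)
names = select_union(cd, {"COUNTRY", "ARTIST"})

tags_vocabulary = {"USA": 1}                            # "Tags" already has one term
lookup_only = get_term_ids(tags_vocabulary, names)      # nothing is created
create_missing = set_terms(tags_vocabulary, names)      # "Bob Dylan" is added
print(names, lookup_only, create_missing)
```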
For this example we will use the Drupal Commerce demo site http://demo.commerceguys.com/dc/ as the HTML page and extract the products from its first page (hope they don't mind the "stolen" products).
First we have to create a new content type to store imported products. Create one named "Commerce products" with machine name: commerce_products. Add the following fields:
- field_image (Image)
- field_price (Float)
Now go to admin/config/services/feed_import and add a new feed named "Commerce products", with entity name: node, url: http://demo.commerceguys.com/dc/ and process function: processHTMLPage.
After saving this feed, edit it. Scroll down to XPATH SETTINGS and for parent xpath use:
Do not enter anything for the unique xpath, because this will be a one-time import.
Now add the following fields:
do not use any xpath for this but enter a default value: commerce_products
use for xpath
header/h1/a
and for action if empty "Skip importing this item" because we don't want to import content without title.
use for xpath
div[@class='article-content']/div//p
and for action if empty "Skip importing this item" because we don't want to import content without body.
use for xpath
div[@class='article-content']/div/section/div[@class='field-items']/div
and for action if empty "Skip importing this item" because we want only complete products.
use for xpath
div[@class='article-content']/div//img/@src
and for action if empty "Ignore this field".
don't use any xpath for this, just enter the default value 1 (the node will then appear as added by admin; you can also leave this field out)
don't use any xpath for this, just enter the default value 1 so the node is promoted to the front page (can be omitted)
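The three ways a field can behave when its xpath returns nothing (take a default value, skip the whole item, or ignore just that field) can be summarised in a short Python sketch. The function build_item, the constant names and the field keys in RULES are hypothetical and chosen only for illustration; the action labels are the ones quoted above.

```python
USE_DEFAULT = "Provide default value"    # hypothetical label for the default case
SKIP_ITEM = "Skip importing this item"
IGNORE_FIELD = "Ignore this field"

# field -> (action when the extracted value is empty, default value)
RULES = {
    "type": (USE_DEFAULT, "commerce_products"),  # no xpath, default only
    "title": (SKIP_ITEM, None),
    "body": (SKIP_ITEM, None),
    "field_image": (IGNORE_FIELD, None),
}


def build_item(extracted, rules):
    """Apply the empty-value actions; return the item, or None if it is skipped."""
    item = {}
    for field, (action, default) in rules.items():
        value = extracted.get(field)
        if value:                      # xpath returned something
            item[field] = value
        elif action == SKIP_ITEM:      # incomplete item: import nothing
            return None
        elif action == IGNORE_FIELD:   # import the item without this field
            continue
        else:                          # fall back to the default value
            item[field] = default
    return item


complete = build_item({"title": "Bag", "body": "A nice bag."}, RULES)
skipped = build_item({"body": "No title here."}, RULES)
print(complete, skipped)
```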
OK, now save the feed, because we have to add filters.
Go to the filters tab: we have to extract the product image in order to add it to field_image. We can do this with filters from the FeedImportFilter class.
Because the xpath selects a property (the src attribute), we have to extract its value.
Add a filter named "Get src attribute" with function name "FeedImportFilter::getProperty" and two params: [field] and src.
Now we have url to image, but the image is a thumbnail and is useless. We need a large image and for this we will use a simple hack.
Add new filter named "Get large image not thumbnail" with function name "FeedImportFilter::replace" and three params: [field], /thumbnail/ and /large/
Now we have turned a url like
which contains a thumbnail image into one like
which contains a large image, and that's what we wanted to do.
Finally we have to save image to field.
Add a new filter named "Save image" with function name "FeedImportFilter::saveImage" and three params: [field], public://field/image/ and 1 (the last param controls replacing an existing file, available since v2.6).
For field_price we have to turn data like $10.00 into a float like 10.00. A simple replace should be enough.
Add a filter for field_price named "Remove dollar sign" with function name "FeedImportFilter::replace" and two params: [field] and $
For body, add a filter with function "FeedImportFilter::join" and one param: [field], because some descriptions have more than one paragraph.
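The effect of the replace and join filters can be tried out in plain Python. The sketch below is only an approximation: replace_text and join_values are hypothetical stand-ins for the FeedImportFilter methods, the search strings are treated as literal text (an assumption), the separator used for joining is an assumption, and the image URL is a made-up example.com address.

```python
def replace_text(value, search, replacement=""):
    """Literal search-and-replace; with two params the match is simply removed."""
    return value.replace(search, replacement)


def join_values(values, glue="\n"):
    """Join several extracted paragraphs into one string (glue is an assumption)."""
    if isinstance(values, str):
        return values
    return glue.join(values)


# Image: swap the thumbnail path segment for the large one (hypothetical URL).
thumb_url = "http://example.com/files/styles/thumbnail/product.jpg"
large_url = replace_text(thumb_url, "/thumbnail/", "/large/")

# Price: drop the dollar sign so the value can be stored as a float.
price = float(replace_text("$10.00", "$"))

# Body: a description made of two paragraphs becomes a single value.
body = join_values(["First paragraph.", "Second paragraph."])

print(large_url)
print(price)
print(body)
```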
Don't forget to save the filters; we are now ready for the import process.
Process the import and go to admin/content, where you should see all imported products. Open a few of them to check that they have all fields, including images.
For this example we will create a CSV file containing three columns: User, Mail and Pass.
Save this file on localhost (or elsewhere) so it can be accessed from an url.
These users will be imported as user entities.
Go to admin/config/services/feed_import and create a new feed named "Users from CSV", entity name "user", process function "processCSV" and url to your CSV file.
Now edit the feed. In the PROCESS FUNCTION SETTINGS fieldset, set the "Use column names" setting to 1.
Go to XPATH SETTINGS, use //row for the parent xpath and leave the unique xpath field empty.
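One way to picture why a CSV feed uses //row as parent xpath and column names as field xpaths is to convert the CSV into a small XML tree. The Python sketch below does that with the standard csv and xml.etree.ElementTree modules. It is an assumption about the general idea, not a description of what processCSV does internally; csv_to_xml, the root element name and the sample rows are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample with the three columns named in this example.
SAMPLE_CSV = (
    "User,Mail,Pass\n"
    "alice,alice@example.com,secret1\n"
    "bob,bob@example.com,secret2\n"
)


def csv_to_xml(text):
    """Turn each CSV record into a <row> element with one child per column name."""
    root = ET.Element("rows")
    for record in csv.DictReader(io.StringIO(text)):  # first line = column names
        row = ET.SubElement(root, "row")
        for column, value in record.items():
            ET.SubElement(row, column).text = value
    return root


root = csv_to_xml(SAMPLE_CSV)
rows = root.findall(".//row")           # the parent xpath "//row"
first_mail = rows[0].findtext("Mail")   # a field xpath equal to the column name
print(len(rows), first_mail)
```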
Add the following fields:
mail and init
with no xpath and default value set to 1.
Then save the feed. Because user passwords are stored hashed, the imported password has to be hashed too.
So go to filters tab and add a filter for pass named "Hash password" with function name "FeedImportFilter::userHashPassword" and one param: [field]
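Why the plain-text Pass column cannot be stored directly is easier to see with a tiny example of salted password hashing. The Python sketch below uses PBKDF2 from the standard hashlib module as a stand-in; it is not the algorithm Drupal or FeedImportFilter::userHashPassword uses, and hash_password and check_password are hypothetical names.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100_000):
    """Return 'iterations$salt$digest' using PBKDF2-HMAC-SHA512 (stand-in scheme)."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def check_password(password, stored):
    """Re-hash the candidate with the stored salt and compare in constant time."""
    iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha512", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)


stored = hash_password("secret1")         # what ends up in the user record
login_ok = check_password("secret1", stored)
login_bad = check_password("wrong", stored)
print(login_ok, login_bad)
```

At login the site hashes the typed password and compares it with the stored hash, so a user imported with an unhashed password could never log in.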
Save the filters and process the import. Check the new users at admin/people. If they don't appear, check the reports for possible errors.
If the reports show errors like "Undefined index: mail" in user_save(), you must update the User module.
You can now log in with an imported user and password.