Feed Element Mapper has nice, easy-to-understand code. Great job!

Feature request: some feeds embed structured data inside a single XML field, and that data needs to be extracted into separate fields. For example, I need to parse a feed that looks like this:

<rss version="2.0">
...
<item> 
 <title>Element Title Here</title> 
 <link>http://www.example.com/data.php?show=9</link> 
 <description> 
	 Date: Tuesday, June 9, 2009<br>
	 Time: 9:00 AM to 5:00 PM<br>
	 <br>
	 Host: Acme Corp<br>
	 Location: North Pole<br>
	 <br>
	Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum....<br>
 </description> 
 <pubDate></pubDate> 
 </item> 

Note the colon-delimited "Label: value" data inside the "description" field, such as Date, Time, Host and Location, with each entry terminated by <br>. I need those broken up into separate CCK fields.
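To illustrate the kind of extraction being asked for, here is a minimal standalone PHP sketch. It is not FeedAPI or Feed Element Mapper code: extract_labeled_fields() is a hypothetical helper, and it assumes every entry ends with a <br> tag and that the list of labels is known in advance.

```php
<?php
/**
 * Splits a feed description into labeled sub-fields and the remaining body.
 * extract_labeled_fields() is a hypothetical helper, not part of FeedAPI.
 */
function extract_labeled_fields($description, array $labels) {
  $fields = array();
  $body = array();
  // Assumption: each logical line of the description ends with a <br> tag.
  foreach (preg_split('/<br\s*\/?>/i', $description) as $line) {
    $line = trim($line);
    if ($line === '') {
      continue;
    }
    // Split on the first colon only, so "9:00 AM to 5:00 PM" stays intact.
    $parts = explode(':', $line, 2);
    if (count($parts) === 2 && in_array(trim($parts[0]), $labels, TRUE)) {
      $fields[trim($parts[0])] = trim($parts[1]);
    }
    else {
      // Anything that is not a known "Label: value" pair is body text.
      $body[] = $line;
    }
  }
  return array('fields' => $fields, 'body' => implode("\n", $body));
}

$description = "Date: Tuesday, June 9, 2009<br>\nTime: 9:00 AM to 5:00 PM<br>\n<br>\n"
  . "Host: Acme Corp<br>\nLocation: North Pole<br>\n<br>\nLorem ipsum dolor sit amet.<br>";
$result = extract_labeled_fields($description, array('Date', 'Time', 'Host', 'Location'));
print_r($result['fields']);
```

Each key of $result['fields'] could then be assigned to its own CCK field, while $result['body'] keeps the free-text remainder.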

A general way to assign such data to CCK fields would be very useful. I looked at other modules (including feedapi_eparser), and Feed Element Mapper appears to be the closest fit for this functionality.

Comments

iva2k

While looking at implementation options, I found that the current mapper uses many-to-one mappings: several feed elements can point to the same CCK field, but only one of those mappings takes effect. That limitation can be turned around as follows:

- The mapping would use a one-to-many principle, listing target node fields (instead of feed fields) in the map form
- Each node field would then have a drop-down list of the feed field items made available by the installed mappers

That way a single input feed field can be used in one-to-many mappings to any number of target CCK fields.

In my mind, this is the easiest way to provide functionality for parsing feed sub-fields into any number of CCK fields. The only missing piece would then be a custom mapper, implemented in a separate module using hook_feedapi_mapper(). That custom mapper could be selected in the new mapper form to extract each sub-field from a single feed field.

By the way, this proposal will not break the current implementation of existing mappers.
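To make the one-to-many idea concrete, here is a minimal standalone PHP sketch. It does not use the real FeedAPI or hook_feedapi_mapper() API: apply_field_map(), the array layout, the regex patterns and the field_event_* names are all assumptions made for illustration. The map is keyed by target node field, so several targets can draw on the same feed element.

```php
<?php
/**
 * Applies a one-to-many mapping: the map is keyed by target node field, and
 * each entry names the source feed element plus an optional regex that
 * extracts a sub-field. All names here are hypothetical, not FeedAPI's.
 */
function apply_field_map(array $feed_item, array $field_map) {
  $node_fields = array();
  foreach ($field_map as $field_name => $source) {
    $element = $source['element'];
    if (!isset($feed_item[$element])) {
      continue;
    }
    $value = $feed_item[$element];
    if (isset($source['pattern'])) {
      // Keep only the first capture group; skip the field when nothing matches.
      if (!preg_match($source['pattern'], $value, $matches)) {
        continue;
      }
      $value = trim($matches[1]);
    }
    $node_fields[$field_name] = $value;
  }
  return $node_fields;
}

$feed_item = array(
  'title' => 'Element Title Here',
  'description' => "Date: Tuesday, June 9, 2009<br>\nTime: 9:00 AM to 5:00 PM<br>\nHost: Acme Corp<br>",
);

// Several target fields draw on the same 'description' feed element.
$field_map = array(
  'title' => array('element' => 'title'),
  'field_event_date' => array('element' => 'description', 'pattern' => '/Date:\s*([^<]+)/'),
  'field_event_time' => array('element' => 'description', 'pattern' => '/Time:\s*([^<]+)/'),
  'field_event_host' => array('element' => 'description', 'pattern' => '/Host:\s*([^<]+)/'),
);

$node_fields = apply_field_map($feed_item, $field_map);
print_r($node_fields);
```

Keying the map by target field is what removes the "only one mapping wins" limitation: each target has exactly one source, while a source may appear under any number of targets.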

The feedapi_mapper module code will change to something like this:

function feedapi_mapper_form($form_state, $node) {
  ...
  // Old:
  // $field_mappers = _feedapi_mapper_get_field_mappers($feed_item_type);
  // New: returns an array of all feed mappers that are applicable to this
  // node type, listed by feed elements.
  $field_mappers = _feedapi_mapper_get_field_mappers_by_node($feed_item_type, $elements);
  ...
  // Old:
  // foreach ($elements as $element_name => $path) {
  //   ...
  // }
  // New: iterate over the target node fields instead of the feed elements.
  foreach ($node_fields as $field_name => $path) {
    ...
  }
  ...
}

iva2k

Title: Feature needed: Break up feed fields into sub-fields » Documentation update needed: Mention "Feed Scraper" in FeedAPI & Element Mapper documentation/project page
Component: Code » Documentation
Category: feature » task

Looking further, I found a very new module, Feed Scraper, which does just that: it lets you create regex subfields and then use them in the mapper. After some time spent configuring it, I've solved my problem. Issue closed for me.

Given that it took me a very long time to stumble upon this solution by filtering through ALL Drupal modules, I would like the mapper and FeedAPI documentation to mention that module. Extracting sub-fields is a very common task, and Feed Scraper is a very solid solution.

Changing this issue to that.

Summit

Hi, I think #1 is still needed to make it possible to map one RSS item element to more than one Drupal field, right?
Greetings Martijn