Well done!
As per #1033202: [Meta] Generic entity processor, we now have entity processors with the latest dev version of Feeds and the patch applied.
But there are still a number of problems with mappers and targets.
So I wrote a few functions that solve these problems, and I think they should be placed in the data_entity module or in a separate module.

Let's look at the code.

First, we need to implement hook_entity_property_info(); otherwise Feeds can't access our data table through the entity abstraction:

function data_entity_entity_property_info() {
  $info = array();
  $tables = data_entity_get_entity_tables();
  foreach ($tables as $table) {
    foreach ($table->table_schema['fields'] as $field_name => $field) {
      $info['data_' . $table->name]['properties'][$field_name] = array(
        'getter callback' => 'entity_metadata_field_verbatim_get',
        'setter callback' => 'entity_metadata_field_verbatim_set',
        'field' => TRUE,
      );
    }
  }
  // Hooks must return their data, or the property info is discarded.
  return $info;
}

Second, we need to implement hook_feeds_processor_targets_alter(), which tells Feeds which targets are available:

function data_entity_feeds_processor_targets_alter(&$targets, $entity_type, $bundle_name) {
  $tables = data_entity_get_entity_tables();
  foreach ($tables as $table) {
    if ($entity_type == 'data_' . $table->name) {
      foreach ($table->table_schema['fields'] as $field_name => $field) {
        $targets[$field_name] = array(
          'name' => !empty($table->meta['fields'][$field_name]['label']) ? $table->meta['fields'][$field_name]['label'] : $field_name,
          'description' => 'Field of type ' . $field['type'] . '.',
          'callback' => 'data_entity_set_target',
        );
      }
    }
  }
}

Finally, we need to define the target callback named above (data_entity_set_target), which tells Feeds what value to set on each target:

function data_entity_set_target($source, $entity, $target, $value) {
  $entity->{$target} = $value;
}

Use it well! :)

Comments

joachim’s picture

Status: Needs work » Active

Looks good!

Any chance you could post these as a patch, along with documentation blocks for each function and formatted to Drupal coding standards (eg 2 spaces for indentation)?

joachim’s picture

Status: Active » Needs work

Setting to needs work since there's code here even if not a patch yet :)

Daniel A. Beilinson’s picture

Status: Active » Needs work
Attached file (status: new, size: 1.83 KB)

Hello! Of course!

js’s picture

Thank you for this code and help.

I have cross commented here
http://drupal.org/node/1033202#comment-5940244

I don't have the option to select certain fields as a "unique target" (after selecting the data processor).

js’s picture

I haven't sorted out the logic yet, but the problem is that I am not setting up fields correctly when I create the data table structure.

Daniel A. Beilinson’s picture

@js, yes, there's no such option yet, and there's no option for expiring records either. I hope to add these settings later.

joachim’s picture

+++ b/data_entity/data_entity.module
@@ -223,3 +223,48 @@ function data_entity_views_api() {
+  $tables = data_entity_get_entity_tables();
+  foreach ($tables as $table) {
+    if ($entity_type == 'data_' . $table->name) {

Surely only one thing in $tables will match here?

So the thing to do is:

$tables = data_entity_get_entity_tables();
// Check substr($entity_type) to see if it looks like 'data_' . $table_name
// check isset($tables[$table_name]) since the array is keyed by table name

Though granted data_get_all_tables() needs better documentation.

js’s picture

Thanks, @Daniel, does that mean I can't use this for a news feed because any records from a previous "pull" that are still in the feed will duplicate?

If so, I guess I could write my own check in hook_feeds_after_parse().

Or, use a node content type for now and migrate later. I have over 200,000 records now and was looking for a way to lighten the load.
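
The check js mentions could be sketched roughly as below. This is only an illustration, not code from any patch in this issue: the helper name example_filter_new_items() is hypothetical, and it assumes each parsed item is an array carrying a 'guid' key and that the GUIDs of earlier imports have already been loaded into a plain array. In Feeds 7.x the parsed items should be available as $result->items inside hook_feeds_after_parse(), so a helper like this could be applied there.

```php
<?php
/**
 * Hypothetical helper: drop parsed items whose GUID was already imported.
 *
 * @param array $items
 *   Parsed feed items, each an array that may contain a 'guid' key.
 * @param array $known_guids
 *   GUIDs (strings) stored by earlier imports.
 *
 * @return array
 *   Items not seen before, re-indexed from 0.
 */
function example_filter_new_items(array $items, array $known_guids) {
  // Flip the list so that each GUID lookup is a cheap isset() call.
  $seen = array_flip($known_guids);
  $fresh = array();
  foreach ($items as $item) {
    // Items without a GUID cannot be matched, so they are kept.
    if (!isset($item['guid']) || !isset($seen[$item['guid']])) {
      $fresh[] = $item;
      if (isset($item['guid'])) {
        // Also guard against duplicates inside the same pull.
        $seen[$item['guid']] = TRUE;
      }
    }
  }
  return $fresh;
}
```

With 200,000 records, loading every known GUID into memory may be too heavy; querying only for the GUIDs present in the current pull would be the obvious refinement.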

Daniel A. Beilinson’s picture

Attached file (status: new, size: 1.87 KB)

@js, you can use this option now.
@joachim, I can't see any problems :(

js’s picture

It seems to work, @Daniel, I appreciate the code and quick response. I am importing over 200,000 records now and expect significantly more. "Data" is much lighter than using nodes. I didn't realize until very recently that D7 nodes store duplicate data in field revisions tables.

This is a very useful line of code you added. Thanks.

joachim’s picture

+++ b/data_entity/data_entity.module
@@ -223,3 +223,49 @@ function data_entity_views_api() {
+function data_entity_feeds_processor_targets_alter(&$targets, $entity_type, $bundle_name) {
+  $tables = data_entity_get_entity_tables();
+  foreach ($tables as $table) {
+    if ($entity_type == 'data_' . $table->name) {

It's not a problem, just a cleaner and more understandable algorithm.

Here we have a hook that acts on one entity type that is given as a parameter. Only one of our data tables will correspond to that entity type (or none at all).

Thus we don't need to iterate over all the tables in a foreach loop -- rather, we can look in the list of table names with isset().

Thus:

function data_entity_feeds_processor_targets_alter(&$targets, $entity_type, $bundle_name) {
  // Only act on entities that are from a data table.
  if (substr($entity_type, 0, 5) == 'data_') {
    $table_name = substr($entity_type, 5);
    $tables = data_entity_get_entity_tables();
    if (isset($tables[$table_name])) {
      // Now do stuff
    }
  }
}

See what I mean?

js’s picture

Does it end up looking like this?
(edited after fixing the brackets)

/**
 * Implements hook_feeds_processor_targets_alter().
 * 
 * Alter mapping targets for entities. Use this hook to add additional target
 * options to the mapping form of Node processors.
 */
function data_entity_feeds_processor_targets_alter(&$targets, $entity_type, $bundle_name) {
  if (substr($entity_type, 0, 5) == 'data_') {
    $table_name = substr($entity_type, 5);
    $tables = data_entity_get_entity_tables();
    if (isset($tables[$table_name])) {
      foreach ($tables as $table) {
        if ($entity_type == 'data_' . $table->name) {
          foreach ($table->table_schema['fields'] as $field_name => $field) {
            $targets[$field_name] = array(
              'name' => !empty($table->meta['fields'][$field_name]['label']) ? $table->meta['fields'][$field_name]['label'] : $field_name,
              'description' => 'Field of type ' . $field['type'] . '.',
              'callback' => 'data_entity_set_target',
              'optional_unique' => TRUE,
            );
          }
        }
      }
    }
  }
}

joachim’s picture

Nearly there.

You don't need this part:

      foreach ($tables as $table) {
        if ($entity_type == 'data_' . $table->name) {

The point -- assuming I am understanding this correctly of course -- is to supply data based on the data table's field, IF the entity in question is a data table entity. There will only be one table that matches the entity, if any.

js’s picture

@joachim, this seems to work and make good sense, thank you.

Can you help me with another issue? When the "feeds_item" table is populated, the "url" and "guid" fields are empty. I am confused by the
plugins/FeedsEntityProcessor.inc
which seems to be very different than
plugins/FeedsNodeProcessor.inc
which works fine.

Should I be setting these somehow?

The entitySave function is saving $entity where
$entity->feeds_item->url and guid are empty.


/**
 * Implements hook_feeds_processor_targets_alter().
 *
 * Alter mapping targets for entities. Use this hook to add additional target
 * options to the mapping form of Node processors.
 */
function data_entity_feeds_processor_targets_alter(&$targets, $entity_type, $bundle_name) {
  if (substr($entity_type, 0, 5) == 'data_') {
    $table_name = substr($entity_type, 5);
    $tables = data_entity_get_entity_tables();
    if (isset($tables[$table_name])) {
      $table = $tables[$table_name];
      foreach ($table->table_schema['fields'] as $field_name => $field) {
        $targets[$field_name] = array(
          'name' => !empty($table->meta['fields'][$field_name]['label']) ? $table->meta['fields'][$field_name]['label'] : $field_name,
          'description' => 'Field of type ' . $field['type'] . '.',
          'callback' => 'data_entity_set_target',
          'optional_unique' => TRUE,
        );
      }
    }
  }
}

joachim’s picture

That's looking right.

I don't know about how the Feeds internals work though, I'm afraid.

Can you roll the above as a patch? I'll commit it, and it can be refined further in a follow-on issue.

josephcheek’s picture

Attached file (status: new, size: 1.92 KB)

Rerolled with the changes from #14.

js’s picture

I appreciate that. The way I make patches is so clumsy. By the time I have the script working I have lost the original. I end up downloading another original, running diff between them and editing the result to make it work as a patch. There must be an easier and better way.

So, thanks for making the patch I was asked to make.

joachim’s picture

> There must be an easier and better way.

Use git. See http://drupal.org/node/707484

mikemadison’s picture

I've tried several of the patches here. I have the entity processor setup from http://drupal.org/node/1033202, but I don't see the Data Tables processor setup anywhere (but all of the other entities are showing up fine). Am I missing something?

Thanks!

Daniel A. Beilinson’s picture

Clear your cache, please!

imclean’s picture

@lalweil, use the patch from #69 in the feeds issue and the patch in #16 in this issue.

- Create a new Data table (or edit an existing one), and make sure you have set a field to be a primary key. This is important.
- In an existing or new feed importer, go to the list of Processors (change)
- Clear all caches, even if you've done it before. I had to clear the cache again after setting a primary key.

The table should now appear in the list of processors.

imclean’s picture

Title: Feeds integration » Data Feeds integration

The data table shows up but importing fails. The primary key is unique, (e.g. first_value, second_value) but I'm seeing the following errors:

SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'first_value' for key 'PRIMARY'
SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'data_table-0' for key 'PRIMARY'
SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'data_table-0' for key 'PRIMARY'
SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'data_table-0' for key 'PRIMARY'
SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'data_table-0' for key 'PRIMARY'

mikemadison’s picture

Thanks @imclean for the info in #21. I can now see the Data Table as a processor.

I am also running into the issues posted in #22.

I also get "Notice: Undefined index: data_node in data_node_settings_form() (line 31 of drupal/sites/all/modules/data/data_node/data_node.admin.inc)."

I also saw some errors about the Feeds Entity Processor missing, but am unable to reproduce those right now.

imclean’s picture

Clearing the cache produces the FeedsEntityProcessor Missing errors for me. That's more related to the other issue I think.

imclean’s picture

Version: 7.x-1.x-dev » 7.x-1.0-alpha3
Status: Needs review » Needs work
Attached file (status: new, size: 2.09 KB)

In the patches above, neither data_entity_entity_property_info() nor data_entity_set_target() actually returns anything. I've addressed this and added the "label" property to hook_entity_property_info().

This may help with #1538588: Add Entity Label field (Was: allow entity ref fields to point to data items)

This is using the latest git of data and feeds with the patch from #116.
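
To illustrate the two changes described in this comment (returning the array and adding a "label"), here is a minimal sketch. It is not the attached patch: the helper name example_build_property_info() is hypothetical, it takes the table objects as an argument instead of calling data_entity_get_entity_tables() so that it can run on its own, and it assumes the table objects have the ->name, ->table_schema and ->meta shape used earlier in this issue.

```php
<?php
/**
 * Hypothetical helper: build entity property info for data tables.
 *
 * @param array $tables
 *   Table objects keyed by name, each with ->name,
 *   ->table_schema['fields'] and (optionally) ->meta['fields'].
 *
 * @return array
 *   Property info keyed by entity type ('data_' . table name).
 */
function example_build_property_info(array $tables) {
  $info = array();
  foreach ($tables as $table) {
    $entity_type = 'data_' . $table->name;
    foreach ($table->table_schema['fields'] as $field_name => $field) {
      // Prefer the human-readable label from the table meta, if any.
      $label = !empty($table->meta['fields'][$field_name]['label'])
        ? $table->meta['fields'][$field_name]['label']
        : $field_name;
      $info[$entity_type]['properties'][$field_name] = array(
        'label' => $label,
        'description' => 'Field of type ' . $field['type'] . '.',
      );
    }
  }
  // The step missing from the earlier patches: hand the array back.
  return $info;
}
```

An actual hook_entity_property_info() implementation would presumably pass the result of data_entity_get_entity_tables() to such a helper and return what it produces.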

imclean’s picture

Attached file (status: new, size: 2.09 KB)

Using empty() instead.

imclean’s picture

Attached file (status: new, size: 2.09 KB)

Misuse of t().

imclean’s picture

Ok this won't work without a data save function. Looking into it.

imclean’s picture

Attached file (status: new, size: 3.28 KB)

This adds a basic save function. It works with existing tables only and doesn't support the imported timestamp or feeds_nid.

imclean’s picture

Simplified the save method. This is using the patch from #116 with the drupal_alter() line uncommented. This changes the structure of the object to be saved.

imclean’s picture

Status: Needs work » Needs review

Need to get some update/replace code in there as well but this seems to work ok. To view the results, create a data table view rather than entity view, which doesn't seem to work.

imclean’s picture

Version: 7.x-1.0-alpha3 » 7.x-1.x-dev

Version change.

imclean’s picture

Status: Needs review » Needs work

This only works when mapping to a field of the same name.

imclean’s picture

Status: Needs work » Needs review
Attached file (status: new, size: 3.72 KB)

- Mappings are now respected
- Added a start time timestamp field as a source field, currently called "feeds_item_imported". This can be mapped to a db field in the usual way.

imclean’s picture

Attached file (status: new, size: 4.02 KB)

Of course Feeds takes care of mappings. This just adds the import start time field as above.

bdawg8569’s picture

imclean,
Have you tried importing anything with a varchar datatype? I was using Feeds + Data on a D6 project and I'm trying to upgrade it. I have a few tables that were using the data processor and I'm trying to recreate those. It appears your patches enable that behavior, but when I ran a test import on a new table that uses a varchar as the primary key, I got "SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'Array' for key 'PRIMARY'" for every row except the first, which actually inserts "Array" into the entry. It seems the string is being passed in as an array for some reason. Have you encountered this problem at all, or do you know what I might be doing wrong?

Here are the steps I took.
I used your patch in #35 for Data as well as the patch in #116 for Feeds.
I created a new data table using the Data UI with 4 fields, 3 of which are varchars, including the primary key.
I created the importer, selecting the entity processor for my new table. I created the mappings and specified that the primary key should be unique.
When I ran it the first time I got 1014 errors on 1015 records. The first record stored "Array" in all of the varchar columns and an incorrect integer in the integer column. The rest threw the primary key constraint violation.

john bickar’s picture

bdawg8569, I got the same error until I uncommented the drupal_alter() line in plugins/FeedsEntityProcessor.inc. See https://drupal.org/node/1033202#comment-6571266

bdawg8569’s picture

Thanks John, that got it working for me.

imclean’s picture

Thinking about it, the custom save method may not be required. I think I introduced it when drupal_alter() was commented out in the other issue. I'm not in a position to test this for a little while but John or bdawg8569 if you could, please try removing it from data_entity.entity.inc. This may also allow the feeds_item table to be populated.

imclean’s picture

Version: 7.x-1.0-alpha3 » 7.x-1.x-dev
Status: Needs work » Needs review

Thinking some more, the point of Feeds using a generic entity processor is so it can populate any entity without special coding. With this in mind, we should be able to remove data_entity_feeds_processor_targets_alter() and other feeds hooks and put something more generic in the DataEntityController class. This would allow more than just feeds to use data_entity.

imclean’s picture

Ok, with the overridden save method the feeds hook data_entity_feeds_processor_targets_alter() (and its callbacks) isn't required.

I'm having problems with the entity ID because my feeds source doesn't have a numeric ID field. Creating an auto-increment field in the database doesn't seem to work, I guess because it isn't in the record to be saved.

imclean’s picture

Attached file (status: new, size: 3.58 KB)

As per #41, data_entity_feeds_processor_targets_alter() and the setter callback have been removed.

- data_entity_feeds_presave() remains to add the import timestamp (could also add importer ID)
- Unique option only available on the ID field. This is a FeedsEntityProcessor limitation, I'd like to find a generic way to add the option to all Primary/Unique database fields. This may not be possible as it's a Feeds config option. This isn't used at the moment anyway.
- data_entity_get_id_field() now selects the first Primary Key which is an integer to be the ID.
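
The selection rule in the last point can be sketched as follows. This is not the code of data_entity_get_id_field() from the patch: the function name example_first_integer_primary_key() is hypothetical, and it assumes the table definition is a Schema API style array with 'fields' and 'primary key' entries, where 'int' and 'serial' are the integer column types.

```php
<?php
/**
 * Hypothetical helper: pick the first integer primary key column.
 *
 * @param array $schema
 *   A Schema API style table definition with 'fields' and
 *   'primary key' entries.
 *
 * @return string|null
 *   The column name, or NULL when no integer primary key exists.
 */
function example_first_integer_primary_key(array $schema) {
  $keys = isset($schema['primary key']) ? $schema['primary key'] : array();
  foreach ($keys as $column) {
    // Prefix-length keys are given as arrays and only apply to string
    // columns, so they can never be the integer ID.
    if (!is_string($column)) {
      continue;
    }
    $type = isset($schema['fields'][$column]['type']) ? $schema['fields'][$column]['type'] : NULL;
    // 'int' and 'serial' are the integer column types in the Schema API.
    if ($type === 'int' || $type === 'serial') {
      return $column;
    }
  }
  return NULL;
}
```

A table whose only primary key is a varchar yields NULL here, which matches the entity ID problem described in the previous comment.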

There's still a bit to do beyond just Feeds integration but this will at least get data into the db.

Currently we're not using the feeds_nid table and I'm not sure we should. The point of the Data module is to be lean and efficient. Each row is an entity which would add a new entry in the feeds_nid table.

imclean’s picture

The errors in #22 occur when trying to add the entity type to the entity_type field of feeds_item. Feeds sets this field as a primary key, which, if there's only one primary key, needs to be unique. As entity_type is unlikely to be unique it fails the insert.

If we could solve this issue then data_entity may not need its own save function.

imclean’s picture

Attached file (status: new, size: 1.44 KB)

Ok, most of the patch in #42 isn't required. The only addition needed for basic feeds integration is hook_entity_property_info().

The feeds errors in #22 were caused by the entity_id field in feeds_item not being set as a PRIMARY key. It is in the schema for feeds but wasn't in my database.

To fix the errors, make sure your database table feeds_item matches the schema in feeds.install. This may have been changed between feeds versions.

This is working well for me for straight imports.

joachim’s picture

Title: Data Feeds integration » implement hook_entity_property_info for Feeds integration
Status: Needs review » Needs work

> Ok, most of the patch in #42 isn't required. The only addition needed for basic feeds integration is hook_entity_property_info().

That's good news!

Patch just needs a little more work though:

+++ b/data_entity/data_entity.module
@@ -184,7 +184,7 @@ function data_entity_permission() {
-      'title' => t('Edit data in the %table_name table', array('%table_name' => $table->title)), 
+      'title' => t('Edit data in the %table_name table', array('%table_name' => $table->title)),

This is an unrelated whitespace fix.

+++ b/data_entity/data_entity.module
@@ -223,3 +223,26 @@ function data_entity_views_api() {
+ * Implements hook_entity_property_info().
+ *
+ * Allow modules to define metadata about entity properties.
+ */
+
+function data_entity_entity_property_info() {

IIRC this should go in a separate .inc file.

imclean’s picture

Attached file (status: new, size: 1.09 KB)

It should indeed.

imclean’s picture

Status: Needs work » Needs review

Status update.

imclean’s picture

Attached file (status: new, size: 1.09 KB)

Fix comment.

joachim’s picture

Status: Needs review » Fixed

Committed.

Thanks everyone.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

esmitex’s picture

Hi,

I've just got the following issue, I applied the patch but it didn't fix anything. I still get the warning.

Notice: Undefined index: module in data_entity_feeds_processor_targets_alter()

jelo’s picture

I am seeing the same notice as esmitex reported in #51 in 7.x-1.0-alpha7