Well done!
As per #1033202: [Meta] Generic entity processor, we now have entity processors when using the latest dev release of Feeds with the patch applied.
But there is still a set of problems with mappers and targets.
So I wrote a few functions that solve this problem, and I think these functions should be placed in the data_entity module or in a separate module.
Let's look at the code.
First, we need to implement hook_entity_property_info(); otherwise Feeds can't access our data tables through the entity abstraction:
function data_entity_entity_property_info() {
  $info = array();
  $tables = data_entity_get_entity_tables();
  foreach ($tables as $table) {
    foreach ($table->table_schema['fields'] as $field_name => $field) {
      $info['data_' . $table->name]['properties'][$field_name] = array(
        'getter callback' => 'entity_metadata_field_verbatim_get',
        'setter callback' => 'entity_metadata_field_verbatim_set',
        'field' => TRUE,
      );
    }
  }
  // The hook must return the collected property info.
  return $info;
}
Second, we need to implement hook_feeds_processor_targets_alter(), which tells Feeds which targets are available:
function data_entity_feeds_processor_targets_alter(&$targets, $entity_type, $bundle_name) {
  $tables = data_entity_get_entity_tables();
  foreach ($tables as $table) {
    if ($entity_type == 'data_' . $table->name) {
      foreach ($table->table_schema['fields'] as $field_name => $field) {
        $targets[$field_name] = array(
          'name' => !empty($table->meta['fields'][$field_name]['label']) ? $table->meta['fields'][$field_name]['label'] : $field_name,
          'description' => 'Field of type ' . $field['type'] . '.',
          'callback' => 'data_entity_set_target',
        );
      }
    }
  }
}
Finally, we need to define the set-target callback named above, which tells Feeds what value to set on each of these targets:
function data_entity_set_target($source, $entity, $target, $value) {
  $entity->{$target} = $value;
}
Use it well! :)
Comments
Comment #1
joachim commented
Looks good!
Any chance you could post these as a patch, along with documentation blocks for each function and formatted to Drupal coding standards (e.g. 2 spaces for indentation)?
Comment #2
joachim commented
Setting to needs work since there's code here even if not a patch yet :)
Comment #3
Daniel A. Beilinson commented
Hello! Of course!
Comment #4
js commented
Thank you for this code and help.
I have cross-commented here:
http://drupal.org/node/1033202#comment-5940244
I don't have the option to select certain fields as a "unique target" (after selecting the data processor).
Comment #5
js commented
I haven't sorted out the logic yet, but the problem is that I am not setting up fields correctly when I create the data table structure.
Comment #6
Daniel A. Beilinson commented
@js, yes, there's no such option. There's also no option for expiring records. I hope to add these settings later.
Comment #7
joachim commented
Surely only one thing in $tables will match here?
So the thing to do is:
Though granted data_get_all_tables() needs better documentation.
Comment #8
js commented
Thanks, @Daniel, does that mean I can't use this for a news feed because any records from a previous "pull" that are still in the feed will duplicate?
If so, I guess I could write my own check in hook_feeds_after_parse().
Or, use a node content type for now and migrate later. I have over 200,000 records now and was looking for a way to lighten the load.
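A minimal sketch of the kind of duplicate check mentioned above, written as a standalone helper. The function name example_filter_new_items(), the 'guid' key, and the list of already-imported GUIDs are assumptions for illustration; in a real module the helper might be called from an implementation of hook_feeds_after_parse() to replace the parsed items before processing.

```php
<?php
/**
 * Returns only the items whose key value has not been seen before.
 *
 * Hypothetical helper: $known_guids would come from the data table itself.
 */
function example_filter_new_items(array $items, array $known_guids, $key = 'guid') {
  // Flip the list so lookups are keyed by GUID.
  $seen = array_flip($known_guids);
  $fresh = array();
  foreach ($items as $item) {
    // Items without a key cannot be compared, so they are kept as-is.
    if (!isset($item[$key])) {
      $fresh[] = $item;
      continue;
    }
    // Skip anything imported earlier or repeated within this batch.
    if (isset($seen[$item[$key]])) {
      continue;
    }
    $seen[$item[$key]] = TRUE;
    $fresh[] = $item;
  }
  return $fresh;
}

$items = array(
  array('guid' => 'a', 'title' => 'Old'),
  array('guid' => 'b', 'title' => 'New'),
  array('guid' => 'b', 'title' => 'New again'),
);
$fresh = example_filter_new_items($items, array('a'));
print count($fresh) . "\n";
```

With 200,000+ records the list of known GUIDs would more likely be checked with a query per batch than loaded into memory; the in-memory array here only keeps the sketch self-contained.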
Comment #9
Daniel A. Beilinson commented
@js, you can use this option now.
@joachim, I can't see any problems :(
Comment #10
js commented
It seems to work, @Daniel, I appreciate the code and quick response. I am importing over 200,000 records now and expect significantly more. "Data" is much lighter than using nodes. I didn't realize until very recently that D7 nodes store duplicate data in field revisions tables.
This is a very useful line of code you added. Thanks.
Comment #11
joachim commented
It's not a problem, just a cleaner and more understandable algorithm.
Here we have a hook that acts on one entity type that is given as a parameter. Only one of our data tables will correspond to that entity type (or none at all).
Thus we don't need to iterate over all the tables in a foreach loop -- rather, we can look in the list of table names with isset().
Thus:
See what I mean?
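A minimal sketch of the keyed lookup described in this comment, not the snippet originally posted. It assumes the table list is keyed by table name; example_get_entity_tables() and the 'prices' table are hypothetical stand-ins so the sketch runs outside Drupal, and the function is prefixed example_ rather than data_entity_ for the same reason.

```php
<?php
// Hypothetical stand-in for the real table lookup; assumed to return table
// definitions keyed by table name.
function example_get_entity_tables() {
  return array(
    'prices' => (object) array(
      'name' => 'prices',
      'table_schema' => array('fields' => array(
        'sku' => array('type' => 'varchar'),
        'amount' => array('type' => 'int'),
      )),
      'meta' => array('fields' => array('sku' => array('label' => 'SKU'))),
    ),
  );
}

function example_feeds_processor_targets_alter(&$targets, $entity_type, $bundle_name) {
  // Data entity types are named 'data_' . $table->name; ignore anything else.
  if (strpos($entity_type, 'data_') !== 0) {
    return;
  }
  $table_name = substr($entity_type, strlen('data_'));
  $tables = example_get_entity_tables();
  // At most one table can match, so a keyed lookup replaces the foreach loop.
  if (!isset($tables[$table_name])) {
    return;
  }
  $table = $tables[$table_name];
  foreach ($table->table_schema['fields'] as $field_name => $field) {
    $label = !empty($table->meta['fields'][$field_name]['label'])
      ? $table->meta['fields'][$field_name]['label']
      : $field_name;
    $targets[$field_name] = array(
      'name' => $label,
      'description' => 'Field of type ' . $field['type'] . '.',
      'callback' => 'data_entity_set_target',
    );
  }
}

$targets = array();
example_feeds_processor_targets_alter($targets, 'data_prices', 'data_prices');
print_r(array_keys($targets));
```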
Comment #12
js commented
Does it end up looking like this?
(edited after fixing the brackets)
Comment #13
joachim commented
Nearly there.
You don't need this part:
The point -- assuming I am understanding this correctly of course -- is to supply data based on the data table's field, IF the entity in question is a data table entity. There will only be one table that matches the entity, if any.
Comment #14
js commented
@joachim, this seems to work and makes good sense, thank you.
Can you help me with another issue? When the "feeds_item" table is populated, the "url" and "guid" fields are empty. I am confused by the
plugins/FeedsEntityProcessor.inc
which seems to be very different than
plugins/FeedsNodeProcessor.inc
which works fine.
Should I be setting these somehow?
The entitySave function is saving $entity where
$entity->feeds_item->url and guid are empty.
Comment #15
joachim commented
That's looking right.
I don't know about how the Feeds internals work though, I'm afraid.
Can you roll the above as a patch, and I'll commit it and it can be refined further in a follow-on issue.
Comment #16
josephcheek commented
Rerolled with the changes from #14.
Comment #17
js commented
I appreciate that. The way I make patches is so clumsy. By the time I have the script working I have lost the original. I end up downloading another original, running diff between them and editing the result to make it work as a patch. There must be an easier and better way.
So, thanks for making the patch I was asked to make.
Comment #18
joachim commented
> There must be an easier and better way.
Use git. See http://drupal.org/node/707484
Comment #19
mikemadison commented
I've tried several of the patches here. I have the entity processor set up from http://drupal.org/node/1033202, but I don't see the Data Tables processor anywhere (all of the other entities are showing up fine). Am I missing something?
Thanks!
Comment #20
Daniel A. Beilinson commented
Clear your cache, please!
Comment #21
imclean commented
@lalweil, use the patch from #69 in the Feeds issue and the patch from #16 in this issue.
- Create a new Data table (or edit an existing one) and make sure you have set a field to be a primary key. This is important.
- In an existing or new feed importer, go to the list of Processors (change).
- Clear all caches, even if you've done it before. I had to clear the cache again after setting a primary key.
The table should now appear in the list of processors.
Comment #22
imclean commented
The data table shows up but importing fails. The primary key is unique (e.g. first_value, second_value), but I'm seeing the following errors:
Comment #23
mikemadison commented
Thanks @imclean for the info in #21. I can now see the Data Table as a processor.
I am also running into the issues posted in #22.
I also get "Notice: Undefined index: data_node in data_node_settings_form() (line 31 of drupal/sites/all/modules/data/data_node/data_node.admin.inc)."
I also saw some errors about the Feeds Entity Processor missing, but am unable to reproduce those right now.
Comment #24
imclean commented
Clearing the cache produces the FeedsEntityProcessor Missing errors for me. That's more related to the other issue, I think.
Comment #25
imclean commented
In the patches above, neither data_entity_entity_property_info() nor data_entity_set_target() actually returns anything. I've addressed this and added the "label" property to hook_entity_property_info().
This may help with #1538588: Add Entity Label field (Was: allow entity ref fields to point to data items)
This is using the latest git of data and feeds with the patch from #116.
Comment #26
imclean commented
Using empty() instead.
Comment #27
imclean commented
Misuse of t().
Comment #28
imclean commented
OK, this won't work without a data save function. Looking into it.
Comment #29
imclean commented
This adds a basic save function. It works with existing tables only and doesn't support the imported timestamp or feeds_nid.
Comment #30
imclean commented
Simplified the save method. This is using the patch from #116 with the drupal_alter() line uncommented. This changes the structure of the object to be saved.
Comment #31
imclean commented
This still needs some update/replace code, but it seems to work OK. To view the results, create a data table view rather than an entity view, which doesn't seem to work.
Comment #32
imclean commented
Version change.
Comment #33
imclean commented
This only works when mapping to a field of the same name.
Comment #34
imclean commented
- Mappings are now respected.
- Added a start time timestamp field as a source field, currently called "feeds_item_imported". This can be mapped to a db field in the usual way.
Comment #35
imclean commented
Of course Feeds takes care of mappings. This just adds the import start time field as above.
Comment #36
bdawg8569 commented
imclean,
Have you tried importing anything with a varchar datatype? I was using Feeds + Data on a D6 project and I'm trying to upgrade it. I have a few tables that were using the data processor and I'm trying to recreate those. It appears your patches are enabling that behavior, but when I tried to run a test import on a new table that uses a varchar as the primary key, I got SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'Array' for key 'PRIMARY' for every row except the first, which actually inserts "Array" into the entry. It seems like the string is being passed in as an array for some reason. Have you encountered this problem at all, or do you know what I might be doing wrong?
Here are the steps I took.
I used your patch in #35 for Data as well as the patch in #116 for Feeds.
I created a new data table using the Data UI with 4 fields, 3 of which are varchars, including the primary key.
I created the importer, selecting the entity processor for my new table. I created the mappings and specified that the primary key should be unique.
When I ran it the first time I got 1014 errors on 1015 records. The first record stored "Array" in all of the varchar columns and an incorrect integer in the integer column. The rest threw the primary key constraint violation.
Comment #37
john bickar commented
bdawg8569, I got the same error until I uncommented the drupal_alter() line in plugins/FeedsEntityProcessor.inc. See https://drupal.org/node/1033202#comment-6571266
Comment #38
bdawg8569 commented
Thanks John, that got it working for me.
Comment #39
imclean commented
Thinking about it, the custom save method may not be required. I think I introduced it when drupal_alter() was commented out in the other issue. I'm not in a position to test this for a little while, but John or bdawg8569, if you could, please try removing it from data_entity.entity.inc. This may also allow the feeds_item table to be populated.
Comment #40
imclean commented
Thinking some more, the point of Feeds using a generic entity processor is so it can populate any entity without special coding. With this in mind, we should be able to remove data_entity_feeds_processor_targets_alter() and other Feeds hooks and put something more generic in the DataEntityController class. This would allow more than just Feeds to use data_entity.
Comment #41
imclean commented
OK, with the overridden save method the Feeds hook data_entity_feeds_processor_targets_alter() (and its callbacks) isn't required.
I'm having problems with the entity ID because my feeds source doesn't have a numeric ID field. Creating an auto-increment field in the database doesn't seem to work, I guess because it isn't in the record to be saved.
Comment #42
imclean commented
As per #41, data_entity_feeds_processor_targets_alter() and the setter callback have been removed.
- data_entity_feeds_presave() remains to add the import timestamp (could also add importer ID)
- Unique option only available on the ID field. This is a FeedsEntityProcessor limitation, I'd like to find a generic way to add the option to all Primary/Unique database fields. This may not be possible as it's a Feeds config option. This isn't used at the moment anyway.
- data_entity_get_id_field() now selects the first Primary Key which is an integer to be the ID.
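A minimal sketch of the ID field selection described in the last item, assuming a Drupal-style schema array with 'fields' and 'primary key' entries. The function example_get_id_field() is a hypothetical illustration, not the code from the patch.

```php
<?php
/**
 * Returns the first primary key field with an integer type, or NULL.
 *
 * Hypothetical illustration; 'int' and 'serial' are the Schema API's
 * integer column types.
 */
function example_get_id_field(array $schema) {
  if (empty($schema['primary key'])) {
    return NULL;
  }
  foreach ($schema['primary key'] as $field_name) {
    // A key part may be array(field, length) for prefix keys.
    if (is_array($field_name)) {
      $field_name = $field_name[0];
    }
    $type = isset($schema['fields'][$field_name]['type'])
      ? $schema['fields'][$field_name]['type']
      : '';
    if (in_array($type, array('int', 'serial'), TRUE)) {
      return $field_name;
    }
  }
  return NULL;
}

$schema = array(
  'fields' => array(
    'code' => array('type' => 'varchar', 'length' => 32),
    'id' => array('type' => 'int'),
  ),
  'primary key' => array('code', 'id'),
);
var_dump(example_get_id_field($schema));
```

Returning NULL when no integer key exists leaves the caller to decide what to do with tables that only have a varchar key, which is the situation described in #41.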
There's still a bit to do beyond just Feeds integration but this will at least get data into the db.
Currently we're not using the feeds_nid table and I'm not sure we should. The point of the Data module is to be lean and efficient. Each row is an entity which would add a new entry in the feeds_nid table.
Comment #43
imclean commented
The errors in #22 occur when trying to add the entity type to the entity_type field of feeds_item. Feeds sets this field as a primary key, which, if there's only one primary key, needs to be unique. As entity_type is unlikely to be unique, it fails the insert.
If we could solve this issue then data_entity may not need its own save function.
Comment #44
imclean commented
OK, most of the patch in #42 isn't required. The only addition needed for basic Feeds integration is hook_entity_property_info().
The feeds errors in #22 were caused by the entity_id field in feeds_item not being set as a PRIMARY key. It is in the schema for feeds but wasn't in my database.
To fix the errors, make sure your database table feeds_item matches the schema in feeds.install. This may have been changed between feeds versions.
This is working well for me for straight imports.
Comment #45
joachim commented
> Ok, most of the patch in #42 isn't required. The only addition needed for basic feeds integration is hook_entity_property_info().
That's good news!
Patch just needs a little more work though:
This is an unrelated whitespace fix.
IIRC this should go in a separate .inc file.
Comment #46
imclean commented
It should indeed.
Comment #47
imclean commented
Status update.
Comment #48
imclean commented
Fix comment.
Comment #49
joachim commented
Committed.
Thanks everyone.
Comment #51
esmitex commented
Hi,
I've just run into the following issue. I applied the patch but it didn't fix anything; I still get this notice:
Notice: Undefined index: module in data_entity_feeds_processor_targets_alter()
Comment #52
jelo commented
I am seeing the same notice that esmitex reported in #51, in 7.x-1.0-alpha7.