I'm importing a relatively simple csv into a custom content type I named "Press."

I'm getting an error, "Missing bundle property on entity of type node", in the Feeds log.

The import file (CSV) has two rows: headers and a single record.

A new node is created, but the data import fails, leaving an orphan node of sorts with no title or any other data.

Here is the error listed in the Feeds log:

Missing bundle property on entity of type node.
Original item

array (
  'id' => '100',
  'title' => 'People on the Move',
  'url' => 'http://www.chicagobusiness.com/section/people-on-the-move',
  'date' => '27-Oct-11',
  'publisher' => 'Crain\'s Chicago',
)

Entity

stdClass::__set_state(array(
   'uid' => '1',
   'status' => 1,
   'promote' => 1,
   'sticky' => 0,
   'created' => 1325712133,
   'revision' => false,
   'comment' => 2,
   'menu' => 
  array (
    'link_title' => '',
    'mlid' => 0,
    'plid' => 0,
    'menu_name' => 'main-menu',
    'weight' => 0,
    'options' => 
    array (
    ),
    'module' => 'menu',
    'expanded' => 0,
    'hidden' => 0,
    'has_children' => 0,
    'customized' => 0,
    'parent_depth_limit' => 8,
  ),
   'log' => 'Replaced by FeedsNodeProcessor',
   'feeds_item' => 
  stdClass::__set_state(array(
     'entity_id' => '100',
     'entity_type' => 'node',
     'id' => 'press_csv',
     'feed_nid' => 0,
     'imported' => 1325712133,
     'hash' => 'a05d52969a6588e09cf28009b2648252',
     'url' => 'http://www.chicagobusiness.com/section/people-on-the-move',
     'guid' => '',
  )),
   'title' => 'People on the Move',
   'field_publication' => 
  array (
    'und' => 
    array (
      0 => 
      array (
        'value' => 'Crain\'s Chicago',
        'format' => 'plain_text',
      ),
    ),
  ),
   'field_published_date' => 
  array (
    'und' => 
    array (
      0 => 
      array (
        'timezone' => 'UTC',
        'offset' => 0,
        'date' => 
        FeedsDateTime::__set_state(array(
           'granularity' => 
          array (
            0 => 'year',
            1 => 'month',
            2 => 'day',
            3 => 'zone',
          ),
           '_serialized_time' => NULL,
           '_serialized_timezone' => NULL,
           'date' => '2011-10-27 00:00:00',
           'timezone_type' => 3,
           'timezone' => 'America/Chicago',
        )),
        'value' => '2011-10-27 00:00:00',
      ),
    ),
  ),
))

Comments

emackn’s picture

Status: Active » Postponed (maintainer needs more info)

If you are creating a custom entity you should check with your entity first.

wrburgess’s picture

Sorry, I don't know what you mean by "check with entity, first".

This is just a custom content type in Drupal 7.

pcambra’s picture

Are you setting a unique field for your importer? Does this happen when importing new data or when updating?

Suggestion: I'd say the error you posted is raised in the entity_extract_ids() function in common.inc. You can place a dpm(debug_backtrace()) there and follow the execution flow to track down this error.
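A raw debug_backtrace() dump can be very large, so it may help to condense it before printing. The following is a minimal sketch in plain PHP; the helper name example_backtrace_summary() is hypothetical and is not part of Drupal, Devel, or Feeds.

```php
<?php
/**
 * Condenses a debug_backtrace() array into one readable line per frame.
 *
 * Hypothetical helper for illustration only.
 */
function example_backtrace_summary(array $trace) {
  $lines = array();
  foreach ($trace as $i => $frame) {
    // Method calls carry 'class' and 'type' ('->' or '::'); functions do not.
    $function = isset($frame['class'])
      ? $frame['class'] . $frame['type'] . $frame['function']
      : $frame['function'];
    // 'file' and 'line' describe the call site and may be missing for
    // frames that were invoked internally by PHP.
    $file = isset($frame['file']) ? basename($frame['file']) : '[internal]';
    $line = isset($frame['line']) ? $frame['line'] : 0;
    $lines[] = sprintf('#%d %s() called at %s:%d', $i, $function, $file, $line);
  }
  return $lines;
}
```

Assuming the Devel module is enabled, this could be called as dpm(example_backtrace_summary(debug_backtrace())) just before the exception is thrown in entity_extract_ids().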

jide’s picture

I'm having the same issue when importing OG nodes. Too tired now to debug, but I'll do this tomorrow.

jide’s picture

Problem solved. The cause was a hook_entity_insert() implementation that took the passed $entity and handed it to field_get_items(), although the entity did not have an ID. D'oh.

emackn’s picture

Status: Postponed (maintainer needs more info) » Closed (works as designed)

jide’s picture

Status: Closed (works as designed) » Postponed (maintainer needs more info)

Please don't close the issue - my issue is not the same as the original poster's (@wrburgess). Let's give him a chance to provide more details.

emackn’s picture

If it's not the same, then please open a new issue and reference this one.

Thanks.

sydneyDK’s picture

I see the same error when importing with the HTTP Fetcher. I have a Bitnami stack on my PC and get the error there. When I run the same fetcher at my hosting provider, I do not. Possibly related to Bitnami's PHP configuration? My error.log does not report anything.

sydneyDK’s picture

Another note: I'm running Drupal 7.10 with Bitnami, but Drupal 7.9 at my hosting provider.

jide’s picture

@emackn: In fact it IS the same issue, but the issue was raised in a precise case for me, which I describe here to help. The issue seems to happen for several people in different scenarios, but I guess describing how I could debug it in my case helps. So this definitely belongs here.

sydneyDK’s picture

Just posting a follow up about my bug report above. I reinstalled the module and reconfigured my feed and that fixed my problem.

realnerd’s picture

#12 fixed the issue for me too.

nguyentran’s picture

Hello, I have the same error. No matter what I changed in the row that caused the error, the error persisted.
In the end I changed the SKU and it worked. If anybody has the same problem, just try this. I hope this helps someone.

Jarviss’s picture

Version: 7.x-2.0-alpha4 » 7.x-2.x-dev

So I don't understand how it was fixed. (Feeds 7.2-dev 2012-May-12)
Post #5 seems good.
I have the fields nodeid with values (600, 601, 602) and guid with values (1, 2, 3, 4). I also tried adding the columns entity_id with values (1, 2, 3) and entity_type with values (node, node, ...).

But I still get: Missing bundle property on entity of type node! Can anyone comment on this?

Jarviss’s picture

Feeds Log:

array (
  'guid' => '1',
  'title' => 'My node title',
  'body' => 'My body text',
  'published' => '1',
  'nodeid' => '600',
  'type' => 'mycontent_type',
  'entity_id' => '600',
  'comment' => '1',
  'entity_type' => 'node',
  'status' => '1',
  'category' => 'category 1',
  'userid' => '1',

fonant’s picture

I'm getting this in a CSV with nearly 9,000 lines, of which just 15 fail to work with a node reference by GUID. I'll see if I can spot what is unusual about those 15 lines.

Hmm... it seems that just one node failed to be created fully, resulting in a broken node with no title. The 15 failure lines were merely repeats of that item, which feeds would have updated instead of creating new nodes for.

For some reason the GUID appearing in the feeds_item table in the database was truncated, and so not the same as the GUID in the CSV file.

I can't see anything unusual or odd about the failing CSV line :(

ben.hamelin’s picture

Running 7.x-2.0-alpha6 and ran into this. Reduced my HTTP Fetched CSV file to one item, and began removing fields.
Got down to 2 fields, "GUID" and "title". My GUID was 163 and when I changed it to 164 (confirming there was no conflict) it worked. I looked for node/163 but it doesn't exist (at least currently). This error was persistent both when I used CRON to run periodic import as well as when I was running it manually.

Maybe this helps shed some light.

casaran’s picture

Had a similar problem. #12 fixed the issue for me too.

dkingofpa’s picture

Title: Missing bundle property on entity of type node when importing CSV » EntityMalformedException: Missing bundle property on entity of type node. in entity_extract_ids()
Status: Postponed (maintainer needs more info) » Active

In a feature module, I'm installing and importing a feed in a hook_install() function. Installing the feed source works. However, I'm getting EntityMalformedException: Missing bundle property on entity of type node. in entity_extract_ids() (line 7650 of /srv/www/includes/common.inc). during the import. If I import the feed via the /import ui, it works without any exception.

I'm using the stand-alone form and adding the feed items to a CType with a machine name of "news". Here's my code:

function my_news_install() {
  // Set feed source url
  // Code from drupal.org/node/1115714#comment-4382624
  $source = feeds_source('news_feed_importer');
  $source->addConfig(array(
    'FeedsHTTPFetcher' => array(
      'source' => '<feed_url>',
    ),
  ));
  $source->save();

  // Import items from feed source
  // Code from "Trigger Import Programmatically" section of drupal.org/node/622700
  while (FEEDS_BATCH_COMPLETE != feeds_source('news_feed_importer', 0)->import());
}

This error occurs for each item in the feed, but this is the Feeds log output from just one item.

Entity
stdClass::__set_state(array(
   'type' => NULL,
   'changed' => 1359635177,
   'created' => 1359635177,
   'language' => 'und',
   'status' => 1,
   'promote' => 1,
   'sticky' => 0,
   'uid' => 0,
   'revision' => false,
   'menu' => 
  array (
    'link_title' => '',
    'mlid' => 0,
    'plid' => 0,
    'menu_name' => 'main-menu',
    'weight' => 0,
    'options' => 
    array (
    ),
    'module' => 'menu',
    'expanded' => 0,
    'hidden' => 0,
    'has_children' => 0,
    'customized' => 0,
    'parent_depth_limit' => 8,
  ),
   'log' => 'Created by FeedsNodeProcessor',
   'feeds_item' => 
  stdClass::__set_state(array(
     'entity_id' => 0,
     'entity_type' => 'node',
     'id' => 'news_feed_importer',
     'feed_nid' => 0,
     'imported' => 1359635177,
     'hash' => '60d516fc6821cf8c7c8238ea2b6f0482',
     'url' => '',
     'guid' => '',
  )),
))

I need to investigate this more. It may be a case of my news CType being created after the hook_install() is run. Does anybody else successfully run an import from hook_install()?

jonathanpglick’s picture

Component: Feeds Import » Code

I was doing something similar to dkingofpa:

$source = feeds_source('{feed_name}');
$source->addConfig(array(
  'FeedsHTTPFetcher' => array(
    'source' => '{feed_url}',
  ),
));
$source->save();
// Try to reload source.
$source = feeds_source('lemelson_center_events');
$source->import();

I found that my features-managed processor wasn't getting populated with the correct config. It was missing the bundle and the mappings.

Turns out that both `FeedsConfigurable::instance()` and `FeedsSource::instance()` use PHP static variables so clearing the drupal_static() cache wasn't emptying them.

My workaround for the time being is to schedule the feed import and then use drush to call cron. Oddly, I have to move back the `job_schedule` next time and invoke cron twice before the nodes are imported:

...
$source->save();
$source->schedule();
db_query("UPDATE {job_schedule} SET next = :next WHERE type = :type", array(
  ':next' => time() - 1000,
  ':type' => $source->id,
));
drush_invoke_process('', 'cron', array(), array());
drush_invoke_process('', 'cron', array(), array());

rcodina’s picture

Unchecking the "Unique" checkbox for GUID solved the problem for me.

minff’s picture

I can confirm the problem with making GUID "unique" – unchecking the checkbox solved it for me.

imclean’s picture

Removing the unique attribute for GUID also worked for me. I'm now using the title attribute as unique and populating it with the unique attribute from the source.

wimpie3’s picture

I'm getting this error as well.

After import: "Missing bundle property on entity of type node."

The log files mention this: EntityMalformedException: Missing bundle property on entity of type node. in entity_extract_ids() (line 7844 of C:\sites\drupal\includes\common.inc).

I get this error for even the most simple import with one line and a few fields...

rcodina’s picture

If what I suggested in #23 doesn't solve the problem for you:

Unchecking the "Unique" checkbox for GUID solved the problem for me.

...then, if your importer is attached to a content type, change that setting to "Use standalone form". The error message will still appear, but at least your nodes will be imported!

P.S. I just wonder whether it makes sense at all to have the option to attach an importer to a content type?

MegaChriz’s picture

@rcodina

I just wonder whether it makes sense at all to have the option to attach an importer to a content type?

As answered in #2603684: Remove attach to content type feature, yes, it still has its uses.

About the EntityMalformedException: I think the following things might cause it:

  • The content type used on the node processor was changed after there was already imported content.
  • The content type of previously imported content was removed while there were still imported nodes of that type.
  • The configured content type on the Node processor doesn't exist on the website. This can happen if the Feeds importer was configured on another Drupal site and then transferred via a feature module, for example. There is already an open issue that lets Feeds check for errors like these: #2320781: Validate feed importer configuration: check for invalid bundle and invalid language

@rcodina
In case the cause of the issue isn't one of above, could you write an automated test that demonstrates the problem?

rcodina’s picture

@MegaChriz

I've been thinking about the possible causes of the EntityMalformedException, and I think that in my case it could be the first one you mentioned:

The content type used on the node processor was changed after there was already imported content.

I think that could be my case because I usually clone Feeds importers once I'm done with the first one ("done" means I already have imported nodes). Once cloned, I change the content type to another one. I do this because some fields are common to many content types: published, updated, nid, tnid, etc.

Do you think this could be the problem? I will try this scenario on a clean Drupal install and let you know.

rcodina’s picture

@MegaChriz

I also wonder why while using standalone form all imports are done and while using an attached content type it fails. Maybe this could be improved by adding a try/catch somewhere?

MegaChriz’s picture

@rcodina
You are right: the fact that this error stops happening as soon as you change the "Attach to" setting indicates that there may be more possible causes than the ones mentioned in #28.
At first I wouldn't think that cloning Feeds importers is involved, as a cloned Feeds importer gets a new id, and there is no content associated with that id yet. Thinking about that, if you remove a Feeds importer with which you had imported content, and then create a Feeds importer with the same id, the previously imported content would still be there and it would get associated with the newly created Feeds importer. That could have the same effect as "The content type used on the node processor was changed after there was already imported content.".

Ideally, you should not be able to change particular processor settings after there is already imported content. Twistor had proposed this before (I don't remember in which issue).

rcodina’s picture

@MegaChriz

I've been thinking that maybe this error is not a cloning issue at all. In fact, my importer was working well until this morning, when I added three new fields to the content type and three field mappers for them. Could modifying the target content type's fields be a problem? I think this scenario, like "cloning", may also fall under your first case:

The content type used on the node processor was changed after there was already imported content.

rcodina’s picture

@MegaChriz

Thinking of that, if you remove a Feeds importer with which you had imported content, and then create a Feeds importer with the same id, the previously imported content would still be there and it would get associated with the new created Feeds importer. That could have the same effect as "The content type used on the node processor was changed after there was already imported content.".

I don't think this is my case. On the site where I get the error, I haven't deleted any importer yet.

Ideally, you should not be able to change particular processor settings after there is already imported content. Twistor had proposed this before (I don't remember in which issue).

This would be great to avoid problems!

rcodina’s picture

@MegaChriz

I just tried to prove my theory and I failed:

1) I created a brand new importer for content type "A"
2) I imported some nodes of type "A"
3) I added a new text field to content type "A"
4) I modified the mapping to import the new field
5) I reimported the nodes and it worked well with no error

So modifying target content type fields is not a problem (or maybe my test was too simple).

MegaChriz’s picture

@rcodina, I have thought about this issue again. Could it possibly have to do with orphaned items in the feeds_item table? The feeds_item table contains references to entities that were imported (created or updated by Feeds). Normally, when such an entity is deleted, the reference in the feeds_item table is removed as well, but I can think of cases where this doesn't happen (for example: in case of errors during deleting or when Feeds was disabled at the time the entity was deleted). And if I remember well, Feeds didn't clean up orphaned items in the past (I believe before the alpha7 version). Each feed item contains a reference to a feed node (if the importer is attached to a content type) and possibly a GUID.
Here is my theory: if you had GUID as the unique target, Feeds will look up the entity to update from the feeds_item table. If the entity no longer exists, this may result in the reported error. If you change the unique target, Feeds will look elsewhere for the entity to update. The same theory applies to changing from attached to a content type to the standalone form. When using GUID as the unique target, Feeds will look for an existing entity by using GUID + feed node ID. When the importer is attached to a content type, the feed node ID will be, for example, "12". But for the standalone importer this is always "0".

Switching the "attached to content type" setting or no longer using GUID as the unique target will cause Feeds to no longer look up a possibly orphaned item from the feeds_item table.
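To check whether a site actually has such orphaned rows, a LEFT JOIN between feeds_item and node can list the items whose node no longer exists. The sketch below is an illustration only: it uses an in-memory SQLite database with simplified stand-ins for the two tables (only the columns needed here), the sample rows are invented, and the function name example_find_orphaned_feeds_items() is hypothetical. On a real Drupal 7 site the same SELECT would be run against the site database instead.

```php
<?php
// In-memory SQLite stands in for the Drupal database (assumption for this demo).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Simplified stand-ins for the real tables.
$db->exec('CREATE TABLE node (nid INTEGER PRIMARY KEY, type TEXT)');
$db->exec('CREATE TABLE feeds_item (entity_type TEXT, entity_id INTEGER, id TEXT, feed_nid INTEGER, guid TEXT)');

// Invented sample data: node 2 exists, node 1 was deleted without clean-up.
$db->exec("INSERT INTO node (nid, type) VALUES (2, 'press')");
$db->exec("INSERT INTO feeds_item VALUES ('node', 1, 'press_csv', 0, 'a')");
$db->exec("INSERT INTO feeds_item VALUES ('node', 2, 'press_csv', 0, 'b')");

/**
 * Returns feeds_item rows of type 'node' that point at a missing node.
 */
function example_find_orphaned_feeds_items(PDO $db) {
  $sql = "SELECT fi.entity_id, fi.id, fi.guid
          FROM feeds_item fi
          LEFT JOIN node n ON n.nid = fi.entity_id
          WHERE fi.entity_type = 'node' AND n.nid IS NULL
          ORDER BY fi.entity_id";
  return $db->query($sql)->fetchAll(PDO::FETCH_ASSOC);
}

$orphans = example_find_orphaned_feeds_items($db);
foreach ($orphans as $row) {
  printf("Orphaned feeds_item: importer=%s guid=%s entity_id=%d\n", $row['id'], $row['guid'], $row['entity_id']);
}
```

Only node-type items are checked here; other entity types would need a join against their own base table.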

rcodina’s picture

@MegaChriz

for example: in case of errors during deleting or when Feeds was disabled at the time the entity was deleted

Thanks for trying to solve this. Your theory may be correct. I think in my case the problem was that I had some feeds importers disabled when I deleted some imported nodes. I also may have done more operations while I had them disabled. Is there a way to force a feeds_item clean up from the module UI?

MegaChriz’s picture

I confirmed my theory with the attached patch!

@rcodina

Is there a way to force a feeds_item clean up from the module UI?

No, there is no way to clean up feeds_item entries that are no longer relevant. It could be a pretty massive operation to perform, especially if the table is large: for every item it would have to be checked whether the specified entity ID still exists. We could minimize the performance impact by loading entity IDs in bulk per entity type, but it could still be a time-consuming operation. Then there is also the possibility that not all entries in the table refer to entities (as a processor can override the functions for entity loading and saving and in theory provide something other than an entity).
Conclusion: it is probably not a good idea to do this during every cron run.

Would it also be good if Feeds would clean up a feeds_item entry during processing, when it finds out that the item refers to a no longer existing entity?

Status: Needs review » Needs work

The last submitted patch, 37: feeds-orphaned-feeds-item-1394320-37-tests-only.patch, failed testing.

rcodina’s picture

@MegaChriz

Would it also be good if Feeds would clean up a feeds_item entry during processing, when it finds out that the item refers to a no longer existing entity?

Yeah, that would be a smart solution.

MegaChriz’s picture

I made a start on this today by checking in the process() method whether entityLoad() returned a non-empty value, but this didn't solve the problem, as that method is overridden by the node processor (and probably other processors too), all of which expect a valid object. It seems this can only be fixed by throwing an exception in FeedsProcessor::entityLoad(). With that thought, I was thinking: should Feeds just import the item and act as if the feeds_item did not exist, or inform the user about it and not import the item? In the second case the feeds_item entry wouldn't be cleaned up.

Below is the code I have (not making a patch of it as it didn't fix the problem):

        $entity = NULL;

        if ($entity_id) {
          // Load an existing entity.
          $entity = $this->entityLoad($source, $entity_id);

          if (!empty($entity)) {
            // The feeds_item table is always updated with the info for the most
            // recently processed entity. The only carryover is the entity_id.
            $this->newItemInfo($entity, $source->feed_nid, $hash);
            $entity->feeds_item->entity_id = $entity_id;
            $entity->feeds_item->is_new = FALSE;
          }
          else {
            // Entity not found. Empty the entity id.
            $entity_id = NULL;
            if ($skip_new) {
              // Go to the next item if items should not be inserted.
              continue;
            }
          }
        }

        if (empty($entity)) {
          // No entity id is provided or the entity could not be loaded.
          // Build a new entity.
          $entity = $this->newEntity($source);
          $this->newItemInfo($entity, $source->feed_nid, $hash);
        }

rcodina’s picture

@MegaChriz

With that thought, I was thinking: should Feeds just import the item and act as if the feeds_item did not exist or inform the user about it and not import the item? In the second case the feeds_item entry wouldn't be cleaned up.

I would just import the item as a new one and act as if the feeds_item didn't exist. Given that this problem is about the feeds_item table having orphaned data, I think this is the way to go.

I think you can try to place the logic you added in process function inside entityLoad function. I wouldn't launch any exception, I would just insert a message into Feeds log telling about the orphaned item.

MegaChriz’s picture

Status: Needs work » Needs review

I think you can try to place the logic you added in process function inside entityLoad function. I wouldn't launch any exception, I would just insert a message into Feeds log telling about the orphaned item.

I don't think entityLoad() should try to create a new entity if an entity could not be loaded. Code that calls this method expects an existing entity in all cases. Also, new entities should not be created if the "insert new" setting is turned off. But thanks for trying to help me.

The attached patch should fix the issue. I had to change quite a lot in the FeedsProcessor::process() method, so I hope I didn't break something else.

One thing that I didn't manage to do is to make sure a feeds_item is cleaned up when not updating existing entities. For this to work, entities should be loaded always, even when not updating existing. This would add a heavy extra load compared to the tiny chance that a feeds_item refers to a no longer existing entity. So this means that items with an orphaned feeds_item are only created when updating existing entities.

These are the changes in the patch:

  • A new method provideEntity() is added that loads or creates an entity depending on the settings:
    • With an existing feeds_item:
      • When inserting but not updating, it will do nothing.
      • When inserting and updating, it will first try to load an existing entity. When that fails, it will create a new entity.
      • When updating but not inserting, it will first try to load an existing entity. When that fails, it will do nothing.
    • With no existing feeds_item:
      • When inserting but not updating, it will create a new entity.
      • When inserting and updating, it will create a new entity.
      • When updating but not inserting, it will do nothing.
  • entityLoad() will now throw a FeedsEntityNotFoundException in case loading the entity returned no result. This had to be done this way because other processors expect a valid entity object when calling FeedsProcessor::entityLoad() (see FeedsNodeProcessor::entityLoad() and FeedsUserProcessor::entityLoad()). This exception will be caught by provideEntity().
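The decision table above can be modelled as a small pure function. This is only a sketch of the behaviour as described in this comment, not the code from the patch: the function name example_provide_entity_action() and its boolean parameters are hypothetical, and the failed load that the patch signals with a FeedsEntityNotFoundException is represented here by $entity_exists being FALSE.

```php
<?php
/**
 * Models which action is taken for one feed item.
 *
 * Hypothetical illustration of the behaviour described for provideEntity().
 *
 * @return string
 *   'load' (update the existing entity), 'create' (build a new entity)
 *   or 'skip' (do nothing).
 */
function example_provide_entity_action($has_feeds_item, $insert_new, $update_existing, $entity_exists) {
  if ($has_feeds_item) {
    if (!$update_existing) {
      // Inserting but not updating: the item is already known, do nothing.
      return 'skip';
    }
    if ($entity_exists) {
      // Updating: the referenced entity could be loaded.
      return 'load';
    }
    // Orphaned feeds_item: fall back to creating only if inserting is allowed.
    return $insert_new ? 'create' : 'skip';
  }
  // No feeds_item yet: create only when inserting new items is enabled.
  return $insert_new ? 'create' : 'skip';
}

// An orphaned item with "insert new" and "update existing" both enabled.
echo example_provide_entity_action(TRUE, TRUE, TRUE, FALSE), "\n";
```

The combination where neither inserting nor updating is enabled is not covered by the list above; this sketch simply treats it as 'skip'.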

An alternative solution to this problem would be skipping the item and mark it as failed.

Let's see first if the patch breaks something else.

Status: Needs review » Needs work

The last submitted patch, 43: feeds-orphaned-feeds-item-1394320-43.patch, failed testing.

MegaChriz’s picture

Status: Needs work » Needs review

Oops. Two errors (syntax problem and forgot a git rebase 7.x-2.x).

New try.

rcodina’s picture

@MegaChriz

Great job! I totally agree with the new behaviour of the module after applying the last patch you submitted (thanks for the detailed description in comment #43). I will try the new patch ASAP and give some feedback.

An alternative solution to this problem would be skipping the item and mark it as failed.

I think that would be a valid option only if you had a manually launchable batch job to erase all orphaned rows in feeds_item.

So here we are:

Option 1: Patch on #45:
-For: Smart solution (user is not aware of orphaned items)
-Against: Too many changes on Feeds source code that may break other things, a bit more time/resource consuming imports

Option 2: Skip orphaned items on imports and mark them + manual batch job for clean up
-For: Small changes on Feeds current source code
-Against: This is a programming issue that the user shouldn't have to know or care about, and the clean-up is a highly time/resource-consuming operation

I personally prefer option 1. What do you think?

Collins405’s picture

This is still happening for me, are there any updates?

rcodina’s picture

@Collins405 Have you tested the patch on #45? Could you give some feedback about it? I haven't had any time to check it out yet.

joelpittet’s picture

Status: Needs review » Reviewed & tested by the community

I've been using the patch in #45 for a while, it's working for me. The code has a bunch of clean-up, still applies.

MegaChriz’s picture

Forgot about this one! Let's make this a target for Feeds 7.x-2.0-beta5.

According to the comments in the patch and the comments on this issue the following two things still need to happen:

  1. Decide how to clean up orphaned items when not updating existing entities, preferably without losing performance.
  2. Write a test for feeds items that reference a non-existing feed node.

For point 1, collecting the entity IDs first might work (as suggested in the @todo below), then passing these to an EntityFieldQuery.

+++ b/plugins/FeedsProcessor.inc
@@ -116,6 +121,9 @@ abstract class FeedsProcessor extends FeedsPlugin {
    * @todo We should be able to batch load these, if we found all of the
    *   existing ids first.

Rudi Teschner’s picture

I get this same error message for nodes that don't even exist yet, but are "temporarily" created to get the relevant language options for bundles. I tracked it down to FeedsNodeProcessor.inc:

  /**
   * Overrides parent::languageOptions().
   */
  public function languageOptions() {
    // Content types can have "extended" language enabled, allowing all
    // available languages, not just enabled. Account for this here.
    if (module_exists('i18n_node')) {
      $node = new stdClass();
      $node->type = $this->bundle();
      $node->is_new = TRUE;
      node_object_prepare($node);
      $languages = array(LANGUAGE_NONE => t('Language neutral')) + i18n_node_language_list($node);
      return $languages;
    }

    // If i18n_node is not enabled, default to enabled languages.
    return parent::languageOptions();
  }

$this->bundle() returns an empty value, and that is a valid return value according to the function. So for me the node bundle itself is not the problem; the problem is that the node Feeds processor does not check that value before it creates nodes with missing bundle information.

I think the best bet would be to add an additional check to the code above and change it to:
if (module_exists('i18n_node') && $this->bundle()) {

and include it in the latest patch.

MegaChriz’s picture

Title: EntityMalformedException: Missing bundle property on entity of type node. in entity_extract_ids() » Orphaned items in feeds_item table can cause an EntityMalformedException
Status: Reviewed & tested by the community » Needs work

@Rudi Teschner
Thanks for reporting. Your issue is a completely different one from what is being handled in this issue (I changed the issue title now to reflect this). Can you create a new issue and - after you have created the issue - provide your suggested change as a patch? Can you also provide the steps to reproduce the issue? I enabled the i18n_node module, created a new importer and went to the node processor settings but did not get an error there. Thanks in advance.

Rudi Teschner’s picture

Ok, will do. Thanks for the fast reply and the clarification. :)

AlfTheCat’s picture

#23 worked for me.

geefin’s picture

@Rudi Teschner

Did you create a new issue for this problem? I'm seeing something similar: while nodes are being created from a feed, they create a temporary, orphaned, malformed bundle, which throws a site-wide error until the creation of the node is complete.