We have a feed set up to use the XML Xpath parser, configured to "Update existing nodes". We use the Node ID as the unique identifier, and all seems to be working fine. However...
Every time we run an import, nodes are being updated. From my understanding, a hash of the feed item is generated and compared to determine which nodes should be updated and which nodes shouldn't.
Per http://drupal.org/node/631962#comment-4053270, I started down a path thinking that I needed to define some type of 'timestamp' in the XML data; upon further research, I've discovered that when using the XML Xpath parser, the feeds_node_item table, which stores the hash, is always empty.
I've confirmed that when using the 'Common syndication parser', entries in the feeds_node_item table are generated properly.
I haven't seen any similar issues reported here, so I'm not sure if this is a bug or I just don't have something configured correctly. There aren't any errors being logged.
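For readers unfamiliar with the mechanism described above, here is a minimal sketch of per-item hash comparison. It is written in Python purely for illustration and is not Feeds' actual code: `item_hash`, `import_items`, and the `stored_hashes` dictionary (standing in for the feeds_node_item table) are hypothetical names, and the choice of MD5 over a JSON serialization is an assumption.

```python
import hashlib
import json


def item_hash(item: dict) -> str:
    """Hash one feed item; keys are sorted so key order does not change the hash."""
    payload = json.dumps(item, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def import_items(items, stored_hashes, id_key="nid"):
    """Return the ids that needed an update; stored_hashes is mutated in place.

    stored_hashes is a hypothetical stand-in for the table that keeps
    one hash per imported item.
    """
    updated = []
    for item in items:
        uid = item[id_key]
        new_hash = item_hash(item)
        if stored_hashes.get(uid) == new_hash:
            continue  # unchanged since the last import: skip the update
        stored_hashes[uid] = new_hash  # remember the hash for the next run
        updated.append(uid)
    return updated


stored = {}
feed = [{"nid": 1, "title": "A"}, {"nid": 2, "title": "B"}]

first = import_items(feed, stored)   # nothing stored yet: both items update
second = import_items(feed, stored)  # hashes match: nothing updates
feed[1]["title"] = "B2"
third = import_items(feed, stored)   # only the changed item updates

print(first, second, third)  # [1, 2] [] [2]
```

In this sketch, if the hash is never written back (the equivalent of an empty table), every run behaves like the first one and all items are updated, which matches the symptom reported here.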
Comments
Comment #1
stacysimpson commented:
No response. Is anyone else seeing the same behavior, or do we have something set up incorrectly? Thanks in advance.
Comment #2
stacysimpson commented:
Well, I originally thought that this issue had to do with using the XML Xpath parser, but it turns out that there was a configuration difference in my testing. (I wasn't trying to update specific nodes while testing RSS feeds...)
My current understanding:
If the 'Node processor' is configured to use Node IDs specified in the feed, no hash is generated; therefore, nodes are always updated whenever the feed triggers. Looks like there is deliberate logic in ./plugins/FeedsNodeProcessor.inc's buildNode() to bypass the hashing mechanism if the Node ID is specified in the feed.
We should be able to work around this issue by configuring nodes that were originally created by Feeds. However, this is odd behavior and should at least be documented somewhere.
Comment #3
twistor commented:
I'm not sure what you're looking at.
The hash is generated and added outside of buildNode().
Did these nodes exist before the first import? Or are they created by Feeds? Using the nid as a unique target is tricky business.
Comment #4
stacysimpson commented:
In my particular scenario, the nodes existed before the first import. However, I don't think that matters.
Basically, what I'm seeing: whenever I choose to get a Node ID from a feed, even if it's not used as a unique target, no entries are generated for the feeds_node_item table.
Comment #5
twistor commented:
Is this still an issue? It's been a long time.
Feeds isn't designed to handle managing existing nodes. It should work though, if you map to the nid. I would expect the behavior to be that the existing nodes are updated once, no matter what, then managed normally.
Comment #6
rudiedirkx commented:
For me this is still an issue. Feeds creates a hash for the full source to see if anything has changed, but not per item. If you import a CSV of 2000 items and only 2 have changed, Feeds will update 2000 nodes. (So the `updated` timestamp will be changed, even though the node hasn't.)
That's how the hash works, isn't it? Maybe I misunderstand.
Comment #7
rudiedirkx commented:
Actually, I might be wrong. Apparently the hash is computed per item, and it works. I don't know why it just updated all nodes after I changed only 2 in the source. Never mind. Sorry.
Comment #8
twistor commented