Feeds actually replaces nodes when updating, because it avoids node_load().
Perhaps that's ok in some scenarios... but what if someone needs to update just a few cck fields or product prices?
Or maybe create a new cck field and then fill it with Feeds (keeping other data intact)? Lots of new opportunities...
Please add an option to enable/disable node_load.
I had to replace

// If updating, populate nid and vid, avoiding an expensive node_load().
$node->nid = $nid;
$node->vid = db_result(db_query('SELECT vid FROM {node} WHERE nid = %d', $nid));

with

$node = node_load($nid, NULL, TRUE);

for the last two projects with Ubercart to update prices/store with Feeds.
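A sketch of what such an on/off option could look like inside FeedsNodeProcessor::process(); the option name 'update_mode' and the surrounding control flow are assumptions for illustration, not Feeds' actual code:

```php
// Hypothetical fragment: choose between a full node_load() and the cheap
// nid/vid-only stub when an existing node is found for this feed item.
if ($nid) {
  if ($this->config['update_mode'] == 'load') {
    // Full load: unmapped data (prices, cck fields, ...) survives the update.
    $node = node_load($nid, NULL, TRUE);
  }
  else {
    // Cheap stub: everything not mapped from the feed is lost on save.
    $node = new stdClass();
    $node->nid = $nid;
    $node->vid = db_result(db_query('SELECT vid FROM {node} WHERE nid = %d', $nid));
  }
}
```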
Comments
Comment #1
alex_b commented:
That's all it needed? Nice.
So we could replace the "update existing" checkbox with a three-radio-button choice:
I'd be happy to accept a patch, ideally with a test.
There *may* be a duplicate issue on the queue. Couldn't find it right now though.
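The three-way choice could be expressed with Form API radios roughly like this; the labels and the 0/1/2 values are assumptions here (constants for them are discussed later in this thread):

```php
// Hypothetical configForm() fragment for FeedsNodeProcessor.
$form['update_existing'] = array(
  '#type' => 'radios',
  '#title' => t('Update existing items'),
  '#options' => array(
    0 => t('Do not update existing items'),
    1 => t('Replace existing items'),
    2 => t('Update existing items (preserve fields that are not mapped)'),
  ),
  '#default_value' => $this->config['update_existing'],
);
```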
Comment #2
Monkey Master commented:
I made a patch (under Windows, hope it's ok) with the radios from #1.
Also changed the resulting message to contain created/replaced/updated counts (with format_plural).
Comment #3
alex_b commented:
At first glance this is looking great. Need some time to do a closer review. Will get back to you over the weekend.
Comment #4
Monkey Master commented:
Also, small fixes are needed for mappers: cleanup before updating.
taxonomy.inc: Remove target vocabulary terms from a node before adding new ones
filefield.inc: Do not keep files already stored in a target filefield field
Otherwise there will be a problem with repeated updates: the same terms and files will be added again and again.
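The taxonomy cleanup described above might look roughly like this inside taxonomy_feeds_set_target(); this is a hedged sketch, and the $vocabulary variable is assumed to hold the target vocabulary:

```php
// Hypothetical fragment: remove the target vocabulary's terms from the
// node before adding freshly parsed ones, so repeated updates do not
// attach the same terms again and again.
if (isset($node->taxonomy) && is_array($node->taxonomy)) {
  foreach ($node->taxonomy as $tid => $term) {
    if (is_object($term) && $term->vid == $vocabulary->vid) {
      unset($node->taxonomy[$tid]);
    }
  }
}
```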
Comment #5
Monkey Master commented:
Just checked content.inc. It also tries to keep old data in a target field; without node_load() that doesn't make sense anyway...
$field = isset($node->$target) ? $node->$target : array();

So, a patch similar to the filefield one: replace the above with

$field = array();

Comment #6
Monkey Master commented:
Sorry, I missed the isset() check in taxonomy.inc; updated patch.
Added a zip archive with all 4 patches.
Comment #7
Monkey Master commented:
I think mappers keep old field values in order to save multiple source fields to a single target field (for example, saving 'pic1' and 'pic2' to 'pics'). In that case my mapper patches break this functionality. So I have a better solution: drop each target field's value only once per node.
1. In FeedsNodeProcessor.inc, add a temporary array to the node object before $this->map($item, $node); and unset it after.
2. In mappers (for example content.inc), if the target field is not yet in that array, add it there and drop its value. That way a target value will be dropped only once per map process.
I use a temporary array on the node ($node->feeds_targets) because only the node is available in hook_feeds_set_target(). Another option is to use a global variable for this purpose.
What do you think?
I can prepare mapper patches if this solution fits.
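The drop-once bookkeeping can be sketched without any Drupal APIs; everything here follows the proposal above ($node->feeds_targets), not committed code:

```php
// Hypothetical helper for a mapper's set-target callback: clear a field's
// old value only the first time the target is touched during one map run,
// so two source fields mapped to the same target can still both be saved.
function _feeds_reset_target_once($node, $target) {
  if (!isset($node->feeds_targets)) {
    $node->feeds_targets = array();
  }
  if (!in_array($target, $node->feeds_targets)) {
    $node->feeds_targets[] = $target;
    // First touch in this map run: drop the stale value.
    $node->$target = array();
  }
}
```

FeedsNodeProcessor would then unset($node->feeds_targets) after $this->map($item, $node); returns, as described in step 1.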
Comment #8
alex_b commented:
Could you be more specific? Where do you see this happening?
Otherwise:
1. I'd love to keep the update user feedback simple; there's no need to track replaces and updates separately in FeedsNodeProcessor::process().
2. Let's use a CONSTANT for possible values 0, 1, 2 for 'update_existing'.
3. We'll need tests that verify the replace behavior.
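The 0/1/2 values might be named with constants such as these (a sketch; the SKIP name is an assumption, while the REPLACE/UPDATE names mirror those used later in this thread):

```php
// Possible values for the 'update_existing' setting.
define('FEEDS_NODE_SKIP_EXISTING', 0);    // Leave existing nodes alone.
define('FEEDS_NODE_REPLACE_EXISTING', 1); // Rebuild the node from feed data only.
define('FEEDS_NODE_UPDATE_EXISTING', 2);  // node_load() first, then map over it.
```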
The attached patch merges all patches of #6 into a single one - was there a specific reason why you attached them separately? Either way, this is the preferred format.
Comment #9
Monkey Master commented:
I didn't see it happening; I was trying to find a reason for the code in hook_content_feeds_set_target(), e.g. in mappers/content.inc:

$field = isset($node->$target) ? $node->$target : array();

instead of simply:

$field = array();

Personally, I'm happy with no tracking.
I'll make a patch today.
I haven't used tests before; maybe somebody else could write them?
Comment #10
Monkey Master commented:
The code in FeedsNodeProcessor.inc overlaps with my other feature request, "Keep batch processing when mapping fails for particular node". I can wait until it is committed and build upon that version, with the stats updated after node_save().
Comment #11
Monkey Master commented:
Added constants for 'update_existing'.
Also changed the messages block to put import results in watchdog:
Comment #12
mweixel commented:
@Monkey Master: Excellent work! I have been trying to find the time to look at this, as I too need to preserve existing additional node data during updates. See #688696: NodeProcessor: Ignore changes to certain fields upon node update for earlier discussions and case examples.
I did experience some problems in the taxonomy mapper. PHP kept blowing up when the mapper's taxonomy_feeds_set_target function tried to assign existing taxonomy ids to the updated node, reporting Cannot use object of type stdClass as array. That didn't start happening until I applied your 753426_update_replace.patch. Has that happened to you?
Will your patches work against alpha 13?
@alex_b: Any chance that this functionality will find its way into the module, or will it remain something that has to be patched in?
Comment #13
alex_b commented:
This patch should go in.
Comment #14
Monkey Master commented:
My patch adds only the following code to the taxonomy mapper:
Please give more details about that error; I don't get it here.
Yes, it works.
Comment #15
mweixel commented:
@Monkey Master: I applied the patch to an alpha 13 upgrade today. One of the hunks failed: the changes to FeedsNodeProcessor that perform a node_load if the config isn't set to FEEDS_NODE_REPLACE_EXISTING. I made the edit manually and then reloaded my feed, which I knew contained updated items. After accessing the feed, Drupal errors out, reporting "An HTTP error 500 occurred. /travelwarnings2/?q=batch&id=44&op=do". Checking the server log, I see that the problem is at line 77 in mappers/taxonomy.inc, which is throwing the "Cannot use object of type stdClass as array" error.
So, if your change to taxonomy.inc isn't involved at that point, could it be that the change to FeedsNodeProcessor is somehow interfering with the batch processing?
Comment #16
Monkey Master commented:
I updated the patch with a tiny fix: the constant in FeedsNodeProcessor::configDefaults().
Do you use a clean alpha13 with my patch only?
Did you try to import without my patch? Or in a different mode, e.g. REPLACE?
Comment #17
alex_b commented:
- Rerolled after the patch did not apply due to recent commits.
- Removed separate tracking of replaced vs updated stories. Simplifies patch.
- All existing tests pass.
Todo:
- Tests.
Comment #18
cruelgamer commented:
Tested this patch using Drupal 6.16 and every version of Feeds. I attempted to populate a new cck field using a CSV import with only a GUID and data for the new field. Each attempt yielded similar results: after import, most fields are blank save for the last CCK field and the taxonomy fields. Part of the patch also fails against alpha15; I can't remember which part exactly, but I believe it was the taxonomy part.
I also tried hard-coding <?php $node = node_load($nid, NULL, TRUE); ?> into FeedsNodeProcessor and got the same results.
If anyone is having success with this, please let me know which builds of Drupal and Feeds you're using. So far I am having no luck.
Comment #19
naxoc commented:
Here is a re-roll to keep up with HEAD. Still no test written, but I will try to find the time.
Comment #20
smartinm commented:
Subscribe.
This issue is related to #774858: Node Processor updates node "created" time when updating
Comment #21
mattman commented:
Just stumbled upon this issue when looking into extending the FeedsNodeProcessor class.
Here is yet another use case to support this feature being implemented in FeedsNodeProcessor.
I'm using the iCal Parser (http://drupal.org/project/parser_ical) to pull an .ics feed from a Google Calendar. Using Google Calendar is great for managing time-based events. (Plus you can use iCal to manage the events on a Mac - very nice.)
However, the only things desired from the calendar update (in my case) are the title, timestamp, and location (if/when parser_ical supports geo data).
I'm not mapping the description (body) field because it is added/enhanced within the Drupal site by an editor. Therefore, the description and any other cck fields need to be preserved. Right now they are being wiped out, per this issue. Location module (location_cck) rows are also being killed, as well as the author, etc.
Here's my vote for this patch making it in as soon as possible. I was actually thinking of how I might implement such a feature and was leaning towards a "lock this field" or "preserve this field" checkbox.
Initially, I was thinking it would appear within the mapping of the fields, but this wouldn't provide the level of control needed because mapping only accounts for a portion of all possible fields.
Forcing an update of ONLY the mapped fields, which I believe is what's suggested here, is a great way to go (to start with), but the most flexible approach would be settings on the content type itself, where you could check which fields should be ignored on any update from FeedsNodeProcessor.
This would add some confusion to the interface, but that could be addressed by putting a note below the mapping listing, mentioning that fields can be locked against updates within the settings for the content type.
Looking forward to the addition and GREAT work on this updated version of how Feeds is working, especially on the interface.
Matt Petrowsky @gotdrupal
Comment #22
andrewlevine commented:
Subscribe. This patch is also a dependency of my feature request #828000: Use multiple data sources/parsers to save into the same nodes.
Comment #23
andrewlevine commented:
Re-roll against the latest HEAD. Still no test. Same as naxoc, I'll try to find time.
Comment #24
andrewlevine commented:
Aghh, sorry. I introduced a syntax error. Here is the valid re-roll.
Comment #25
andrewlevine commented:
Here is the patch from #24 with tests.
Comment #26
andrewlevine commented:
Whoops, meant to mark needs review.
Comment #27
andrewlevine commented:
Wow, another thing. You'll need to add this file to the tests/feeds directory and rename it to nodes_changes_1.csv.
Comment #28
alex_b commented:
Great to see this moving forward. I won't have the time to review until the end of the week though. You can use cvsdo to add the new file.
Comment #29
andrewlevine commented:
Thanks for the tip. Attached is the patch, which now includes the new file.
Comment #30
alex_b commented:
This version fixes a problem where properties set by FeedsNodeProcessor itself (e.g. user ids) got overwritten even in UPDATE_EXISTING mode; see the modifications to buildNode().
The test case in #29 only checks for functioning REPLACE_EXISTING mode, which is already covered in FeedsRSStoNodesTest. What's new and needs tests is UPDATE_EXISTING mode. This patch adds a simple assertion to FeedsRSStoNodesTest covering UPDATE_EXISTING.
andrewlevine: would be great to get your last review here. I think we're close to committing.
All tests passing.
Comment #31
andrewlevine commented:
Just as a preliminary look (I will look in depth later): did you mean to remove all the changes to the mappers? I haven't looked into what they are for, but I kept them from #19.
Comment #32
alex_b commented:
#31: That was accidental. We need to bring that back (see #4).
I am worried, though, that resetting the field when mapping will lead to some regressions: existing deployments of Feeds may assume that mappers act additively, allowing a mapping target to be used more than once in a mapping.
Can't think of a good solution to this problem at the moment.
Comment #33
andrewlevine commented:
I think I found a way around the problems we were having with filefield and mappers in general (where we have no extra metadata storage). If you think about it, the current update_existing implementation has the same problem: it doesn't know when a new file has been added or replaced, because it doesn't store the original filename or file hash. The only reason this problem doesn't surface is that the whole node is replaced and the files are re-uploaded.
So why can't we use this same behavior for the new UPDATE_EXISTING mode (with node_load)? We can! The only change we have to make is to keep some sort of state in the $node variable, or else pass state in as an extra argument to the map() method. Assuming the latter, the code would look something like this:
In FeedsProcessor::map(), instead of
we have something like
Then in filefield.inc and our other mappers, which have no metadata storage and can't be simply ID-based (as taxonomy.inc can be), we have
This should be backwards-compatible with all current mappers using REPLACE_EXISTING (the previous update_existing).
We will usually be able to use the ID approach with mappers like taxonomy.inc (and won't have to worry about this problem). However, we can always resort to this approach for filefield and other additive mappers if we make the minor addition of an optional $state argument.
Marking my approach as 'needs review'. If that sounds OK I will proceed with a patch. This functionality is important for work so I am willing to work on this until it gets in :)
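The code attachments referenced above were not preserved in this thread; here is a hedged sketch of the optional-$state idea, with all names illustrative rather than taken from any patch:

```php
// In FeedsProcessor::map(): build a shared state object and hand it to
// every mapper call so additive mappers can tell a first touch from a
// repeat touch within the same run.
$state = new stdClass();
$state->touched = array();

// In an additive mapper such as filefield.inc: reset the field the first
// time it is touched in this run, append on subsequent calls.
if (empty($state->touched[$target])) {
  $state->touched[$target] = TRUE;
  $node->$target = array();
}
$node->{$target}[] = $new_item;
```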
Comment #34
alex_b commented:
andrewlevine and I just discussed #33 as a viable solution for continuing to support aggregating into fields (additive mapping, the issue brought up in #32).
For the purpose of this patch I propose to stop supporting aggregating into fields and modify behavior as suggested in #4.
The likelihood that somebody is using this undocumented behavior is rather small; I'd like to find out before we actually implement support for it. I have opened a separate issue on it here: #840626: Support using same mapping target multiple times.
This patch brings back the modifications introduced in #4.
Comment #35
andrewlevine commented:
Took a more thorough look at #34 and it looks good. I am going to test it on my application that depends on these features later today.
I agree with your comment in #30 that the tests in patch #29 were broken, but, had I fixed them, I do think they would be a more extensive set than what is included in the last patch (#34). Is there any reason you added those couple of assertions instead of modifying the tests I wrote in #29?
Comment #36
alex_b commented:
The tests in #29 are very verbose when we really only need a quick assertion of whether updating works. I'm anxious to keep the number of assertions low, as running all the tests in Feeds already takes ages.
Comment #37
andrewlevine commented:
I have tested the patch against my application and it works perfectly.
I sympathize with your desire to keep the test run time down; it took about 10 minutes for them to complete on my setup initially. However, IMHO tests should cover all possible cases, because then a human won't have to. If I set concurrency to 6, the run completes in just over 1 minute, which isn't bad anyway. My tests were intended to cover these cases: 1. importing partial data over a currently imported feed; 2. importing where no content should be changed; 3. an initial import from a feed. I can probably remove some assertions, but I feel testing these 3 cases is useful.
That said, I won't push the point any further if you disagree, I will just ask about the tests in #34:
I'm not exactly sure what the assertions of the UPDATE_EXISTING case are proving. Is it that items did not recreate/replace nodes, because they still don't have authors after the UPDATE query? Wouldn't the same assertion hold if running the import did nothing at all? Even if we need to keep to a limited number of tests/assertions, I think we'd benefit from testing "importing partial data over a currently imported feed" with a separate feed file with deleted info.
Comment #38
alex_b commented:
The test checks whether a local change persists when using UPDATE_EXISTING mode. The test sets a feed item's user id to 0, reimports with UPDATE_EXISTING, and verifies that the user id has not been reset.
Point 2 (initial import from a feed) is currently covered in FeedsRSStoNodesTest. Point 3 could be easily added.
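The assertion described here might look like this in a SimpleTest case (a sketch only; the helper names and the value 2 for UPDATE_EXISTING are assumptions, not the committed test):

```php
// Hypothetical fragment of FeedsRSStoNodesTest: a local change must
// survive a reimport in UPDATE_EXISTING mode.
db_query("UPDATE {node} SET uid = 0 WHERE nid = %d", $nid);
$this->drupalPost('node/' . $feed_nid . '/import', array(), 'Import');
$uid = db_result(db_query("SELECT uid FROM {node} WHERE nid = %d", $nid));
$this->assertEqual(0, $uid, t('Local change to node author persisted after update.'));
```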
Comment #39
andrewlevine commented:
Putting aside our disagreement about the ideal extensiveness of testing, in my mind this is RTBC. I don't think it makes sense to add point 3 with the current tests.
Comment #40
alex_b commented:
Link tests are failing because link does in fact aggregate URL and title into the same field. Hammering out solutions under: #840626: Support using same mapping target multiple times
Comment #41
sagar ramgade commented:
Hi all,
I am trying to save/update nodes from multiple sources: from one feed I am saving the nodes, and from the other I am trying to update a few cck fields of the nodes saved by the first feed.
I tested the patch http://drupal.org/node/753426#comment-3145468 with the "update existing items" option; it doesn't work.
I get "updated x nodes", but when I check those node fields they seem blank. Can anyone guide me on how to achieve this?
Thank you in advance.
Comment #42
lyricnz commented:
Without this patch, Feeds is rather broken IMHO. One of the quite common use cases is keeping a set of data in Drupal in sync with an external source, via CSV or RSS etc. Unless Feeds can update nodes without overwriting (non-mapped) changes made by users, it's not terribly useful for this.
PS: can we just apply the patch as-is, and handle any outstanding issues separately?
#840626: Support using same mapping target multiple times
Comment #43
andrewlevine commented:
Here is patch #34 without the changes to the mappers, which should be taken care of by #840626: Support using same mapping target multiple times.
Comment #44
smoothk commented:
@alex_b and andrewlevine
Hi guys,
could you please briefly explain how this is supposed to work?
I got the patch working (both the last one, from #43, and the one from #34); now nodes get updated instead of being created from scratch.
I gathered from other comments that the patch's goal was also to preserve data that has not changed on the feed.
My case: I have pictures uploaded manually (cck imagefield); they get deleted on every feed update, and they shouldn't be.
After countless tests it finally seems to be working (I'm no programmer), so I just wanted to confirm that this is how it is supposed to work:
CCK fields which are not mapped don't get deleted or updated?
By the way, the patch from #43 gives an error about duplicated terms in taxonomy, but I suppose that is part of splitting the issue in two, the other half being #840626: Support using same mapping target multiple times.
Thank you so much for your work, it saved the day for me.
Comment #45
andrewlevine commented:
@smoothk, try applying the patch from #840626: Support using same mapping target multiple times as well.
Comment #46
alex_b commented:
#44: Correct.
After applying #840626, I am giving this another review now.
Comment #47
alex_b commented:
Reviewed, all tests passing, committed #43 - thank you everybody for sticking through this.
http://drupal.org/cvs?commit=389860
Comment #48
andrewlevine commented:
Thanks, all, for the help getting this patch in. Just in case you are interested (and based on this commit), I contributed a module that allows you to import from multiple feeds into the same nodes: http://drupal.org/project/feeds_node_multisource
Comment #49
alex_b commented:
FYI, we forgot to update naming conventions in non-node processors: #853144: Consistent use of "replace" vs "update"