This issue is part of #2027583: [meta] Behavior when importing blank values
Original Report
My csv-import file does not contain all columns that are declared in the Node Feeds importer.
When I feeds the file to the Feeds importer (updating a previously created/imported node), all fields that are not in the import file are cleared from the Node, except for the taxonomy terms, which stay intact.
Is this expected behaviour?
My use case is this: My node type is populated from 3 different sources. I did not combine the 3 files to 1 new file, but I created a Feeds importer with 10 fields. Now I feed the 3 files to the importer. Each file contains 7+2+1 fields.
An alternative would be to create 3 importers for the same node type, but this is not supported, until Attach multiple importers to one content type is committed.
Comments
Comment #1
johnvThis patch only clears/initializes node fields if an appropriate source/column is provided in the feed/file. If a field is not in the source file, the field will be skipped (= not cleared) by Feeds.
Comment #2
johnvNew attempt to upload the patch.
Comment #3
johnvPatches #1 and #2 are invalid. Try this one.
Comment #4
johnvchanged patch to avoid warnings.
Comment #5
johnvNew version to support multiple-value fields.
Comment #6
johnvN.B. If you use Feeds Tamper together with this patch, you need to apply #1145678: Too many calls to callback function in feeds_tamper_feeds_after_parse(), too.
Comment #7
johnvPerhaps this patch should consider the option 'Replace/Update existing nodes'. Replacing nodes does mean clearing ALL fields.
My use case/tests only consider the option Update existing nodes.
Comment #8
jerdavisThis is a great feature and something I've had to hack my way around as well.
Comment #9
ayalon CreditAttribution: ayalon commentedThanks for the patch. Attached you will find an updated patch that works with the latest dev (August 2011)
Comment #10
johnvRecent feeds version requires a new, attached patch.
Comment #11
johnvThis new patch only affects 1 file instead of 2.
Comment #12
ethnovode CreditAttribution: ethnovode commentedThanks, the patch works perfectly for me. Now I can update my nodes from multiple sources.
Comment #13
joaomachado CreditAttribution: joaomachado commentedCan this patch be ported to D6? So badly needed!!!
Comment #14
joaomachado CreditAttribution: joaomachado commentedoops...change back to bug report.. Sorry.
Comment #15
franzIn some cases the replace/clear behavior is desired, so we should try to make this flexible:
#1107522: Framework for expected behavior when importing empty/blank values + text field fix has a good summary on this, and the plan so far is to add a config for each mapper (configs for mappers are almost ready now).
Comment #16
franzI'm re-opening this as a sub-task of #1107522: Framework for expected behavior when importing empty/blank values + text field fix, because each mapper needs to implement it's own options and logics.
Comment #17
franzComment #18
franzComment #19
gaele CreditAttribution: gaele commentedThat was not what this patch is about. Seems like a dupe of #1107522: Framework for expected behavior when importing empty/blank values + text field fix
Comment #20
franzIndeed, I read the initial report, which was addressing the specific file problem, but the patch merged into the problem we face on #1107522.
Comment #21
johnvLet's keep this open.
The last patch from #11 makes it possible to import 'incomplete' files (e.g. not all columns are in the csv file / you DO NOT provide a field in the importer).
This offers the possibility to fill a content/entity type from 2 or more import files.
- Without this patch: missing field in importer is cleared in Entity.
- With this patch: missing field in importer is untouched in Entity.
#1107522: Framework for expected behavior when importing empty/blank values + text field fix considers the case when you DO provide a field, but the value is empty.
Indeed, that is a more general case, but requires a modification in each mapper.
Comment #22
franzMmm, I'm still thinking we need to merge the efforts there. I know the work is a little stalled, I could use some help on the issues needing work on #1107522: Framework for expected behavior when importing empty/blank values + text field fix. I really wish to get a consistent behavior on Feeds on all those edge cases.
Comment #23
johnvFranz, I'll test the other one when I upgrade my Feeds again.
Comment #24
gaele CreditAttribution: gaele commented#11 is working for me. See also #12, so RTBC.
Comment #25
franzOnly now did I understand it properly.
One question. Doesn't this affect the *existing* field with empty value as well?
This comment is confusing, shouldn't it be "provided sources"? Maybe a little more explanation to make it clear we're talking about a missing field and not an empty field. Still, I don't see how this code discerns one from the other.
Comment #26
johnvI'll review the patches soon. The patch from this issue might be obsolete when #860748: Config for mappers? gets in.
It seems the most striking (only?) difference is that
(!empty($value))
is centralized here, whereas other patches do this in each mapper.On a side note: IMO the each mapper must be as clean a possible. For example, the following line could be done by Feeds, removing it from the mappers:
Comment #27
franz@johnv, I completely agree, and it has bother me a lot that mappers have to deal with multiple values, empty values, etc. I think a mapper should be able to do it, but not needed to. I already thought about turning mappers into a class that can be extended, maybe it's time to work on something along that way.
Comment #28
yurgon CreditAttribution: yurgon commentedHi,
#11 work for me, but reference field in field collection dont update :(
Comment #29
twistor CreditAttribution: twistor commentedI haven't read this issue extensively yet, so sorry if I'm repeating things, or missing things. I do, however, have some thoughts on the issue.
The current behavior is like this:
*Replace existing: Currently only for nodes, that's dumb.
What is desired is a new behavior:
From what I can gather, the existing behavior is the default expected behavior. The proposed behavior is useful, but tricky. Each parser has to take very special care to set its items correctly. I currently jump through a lot of hoops with the XPath parser to get this right.
Comment #30
twistor CreditAttribution: twistor commentedRe: set target callbacks repeating themselves,
I totally agree. However, it needs to be tackled differently. Rather than make *any* assumptions about how a target expects its data, we should supply some generic target callbacks that work for, entity properties, single column fields, and multi-column fields. On top of this, we would need a validate callback that a target can implement, (we can supply some generic ones) that filters the values prior to mapping.
Comment #31
johnvUnlike my promises in june/july, I haven't been able to update feeds and test all outstanding patches regarding 'missing fields in feeds sources'.
@Twistor, the (possibly outdated) patch from #11 does just that: "Update existing, but if the field is not present in the feed, do not update the field".
The complete node/entity is read, but only fields in the feeds are updated. I've been using that for a long time, having several feeds updating the same node - it seems to be working fine.
How would that affect the rest of the parser?
Comment #32
twistor CreditAttribution: twistor commentedRight, the problem is expected behavior. #11 makes sense if you remove a column from your CSV file, and re-import. But if you have an RSS feed that you update, and the author tag is removed, I think it would make sense to remove it from the node. I could be wrong, but this is a pretty big change in behavior.
Your patch it seems, would also make it impossible to delete a field wouldn't it? How would you set it to NULL? Or a numeric field to 0? Just a note.
It only affects parsers in that they have to be careful to handle falsy values correctly.
Comment #33
johnvAd "patch it seems, would also make it impossible to delete a field."
You can delete a field by adding an empty column. Then the field should be reset.
Ad "I think it would make sense to remove [Author] it from the node?"
I am not sure about this use case. Why should you remove the author?
I see, with csv you have control over the columns you have/supply. (or, you can choose one of multiple importers for the same csv, each importer containing another subset of field mappers.)
But for RSS feeds, you have no control over the supplied fields, ofcourse.
I haven't seen the new 'config for mappers' yet. Does/Should it contain an option 'behaviour when not in source' ?
Comment #34
twistor CreditAttribution: twistor commentedErr, doesn't the !empty() remove any chance of a NULL value being set in a field?
Let me put it this way, the expected behavior is that the data in your entities reflects the data in the feed. Anything outside of that is unexpected/extra.
Well... I've been leaning against using it for settings that could apply to every field. In this case I think that a processor level setting would suffice. But, I am also not very excited about adding yet another checkbox.
Comment #35
johnv"doesn't the !empty() remove any chance of a NULL value.
Needs explicit testing. You might have a point. An empty column/field should empty the field, NO column should leave it untouched.
Comment #36
TyrelDenison CreditAttribution: TyrelDenison commentedHere is a re-rolled version of the patch from #11 that applies to the latest release of feeds. Not sure where this issue is headed, but I know that I use this functionality on a live site daily. Hopefully I didn't miss something in the conversation of this thread that indicates this patch is superfluous. If so, at least I got my "exercise" today :)
Comment #37
RogerRogers CreditAttribution: RogerRogers commented@TyrelDenison Thank you, this is an essential patch, for my purposes at least! I was just pulling out my hair, after installing feeds without this path, when I remembered this patch and reapplied.
One problem I have with it is that if I set a numeric value to anything other than 0, and then attempt to set the value to 0, the value will not change. I can set to anything else, but not zero. In fact, I assume it would probably not allow me to change the value to null either (I think it should). Any hints on where I can tweak the code to allow zero values?
Comment #38
franzThat's because the patch use empty(), which returns TRUE for 0.
Comment #38.0
a.ross CreditAttribution: a.ross commentedLinking to meta-issue
Comment #39
twistor CreditAttribution: twistor commentedSorry.
All of this is getting a little carried away. This is not expected behavior, and would be very surprising if it were to change.
I have some ideas for a contrib module to handle this use case, but it does not belong in Feeds.
Comment #40
twistor CreditAttribution: twistor commentedSaid contrib module, https://drupal.org/sandbox/twistor/2094551
Comment #40.0
twistor CreditAttribution: twistor commentedUpdated issue summary.
Comment #41
AlsPatch #36 tested with the currently official release 7.x-2.0-alpha8: works for me.
I tried a few patches found in the Feeds support threads. Only this could do both of the following:
- Not mess up values for existing items when the corresponding imported item, matched via GUID, does not have a value for a file field.
- Not mess up values for existing items when the corresponding imported item, matched via GUID, is skipped because of a filter set in Feeds Tamper.
I understand that Twistor is following a different approach, but a solution is not there yet so I believe it's ok to underline successful patches.
Comment #42
MegaChriz CreditAttribution: MegaChriz commentedComment #43
dimapanin CreditAttribution: dimapanin commentedFeeds Version: 7.x-2.0-alpha9.
With path #36 dont work "term autocreate". Why?