This issue is part of #2027583: [meta] Behavior when importing blank values

Original Report

content_taxonomy_feeds_set_target() had a bug whereby empty cells were importing as taxonomy tids with empty names. to fix, change FROM

// Return if there are no or empty terms.
if (!is_array($terms) || empty($terms)) {
return;
}

TO

// Return if there are no or empty terms.
if (!is_array($terms) || (count($terms) == 1 && empty($terms[0]))) {
return;
}

this relates to issues like:

http://drupal.org/node/1047894
http://drupal.org/node/1107522

Files: 
CommentFileSizeAuthor
#23 feeds-empty-term-name-1668186-23.patch2.83 KBdrclaw
PASSED: [[SimpleTest]]: [MySQL] 4,698 pass(es).
[ View ]
#19 feeds-empty-term-name-1668186-19.patch639 bytesdrclaw
PASSED: [[SimpleTest]]: [MySQL] 4,503 pass(es).
[ View ]
#16 feeds-empty-term-name-1668186-16.patch633 bytestwistor
PASSED: [[SimpleTest]]: [MySQL] 4,503 pass(es).
[ View ]
#11 feeds_taxonomy_skip_empty_values-1668186-11.patch340 bytesDouglas Marsh
PASSED: [[SimpleTest]]: [MySQL] 4,505 pass(es).
[ View ]
#8 feeds-taxonomy-empty-values-1668186-8.patch485 bytesdrclaw
PASSED: [[SimpleTest]]: [MySQL] 4,503 pass(es).
[ View ]
#8 feeds-7.x-2.0-alpha8-taxonomy-empty-values-1668186-8-do-not-test.patch485 bytesdrclaw

Comments

Where is this line of code you are referring to? In which file?

right there in content_taxonomy_feeds_set_target(), as i said

I think he is asking in which file...

Version:6.x-1.0-beta12» 7.x-2.x-dev

The problem persists on D7 also.

The patch in that comment (#13) doesn't appear to apply to the current releases (7.x-2.0-alpha8 or 7.x-2.0-dev). It is from 2011, so hardly surprising...

The latest patch (in #81) does seem to apply, but doesn't handle taxonomy fields, only text ones. It sounds like the plan is to create separate issues for each data type. I'll see if they want to use this issue as a base.

Issue summary:View changes

Linking to meta-issue

Issue summary:View changes

Updated issue summary.

I have sucessfully applied patch from #63 to latest dev version (7.x-2.x-dev 2013-Jul-06) and this did solve my problems.

StatusFileSize
new485 bytes
new485 bytes
PASSED: [[SimpleTest]]: [MySQL] 4,503 pass(es).
[ View ]

I think the problem is that the check for Zero needs to use an "identical" operator instead of just an equals operator. Patches attached for dev and alpha8.

Status:Active» Needs review

Oops forgot to change status

Status:Needs review» Patch (to be ported)
StatusFileSize
new340 bytes
PASSED: [[SimpleTest]]: [MySQL] 4,505 pass(es).
[ View ]

I can't believe this is still an issue. Can apply the following patch to current (7.x-2.0-alpha8) or -dev (as of this date):

--- feeds/mappers/taxonomy.inc.orig 2013-09-05 09:10:31.056594222 -0700
+++ feeds/mappers/taxonomy.inc 2013-09-05 09:13:10.158521559 -0700
@@ -124,6 +124,10 @@
   // Iterate over all values.
   foreach ($terms as $term) {
+    if (empty($term)) {
+      continue;
+    }
+
     if ($info['cardinality'] == $delta) {
       break;
     }

Just apply this to current/dev and get this taken care of. Taxonomy terms with zero length string(s) are a pain when importing and this fixes. Is there a plan to fix this someplace else in the code.

Status:Patch (to be ported)» Needs review

@Douglas Marsh
"patch (to be ported)" means there is a patch committed to a branch that needs to modified to commit it to an other branch. I think you meant that you supplied a new patch.

Setting status to "needs review".

Title:empty content taxonomy cells imported as taxonomy terms with empty namesTaxonomy terms auto-created with empty name.

I'm guessing this is what the problem is.

The patch in #8 seems to do the trick.

Patch in #11 worked for me. #8 did not make any difference. Tried with alpha8 only.

Issue tags:+Needs tests
StatusFileSize
new633 bytes
PASSED: [[SimpleTest]]: [MySQL] 4,503 pass(es).
[ View ]

We should avoid using empty() when checking since it will fail for the string '0', and that is a valid term name, or GUID. Even an empty string is a valid GUID.

I'd rather have the check here, since it changes existing behavior the least. Not convinced though.

I believe twistor is correct. I have not looked at the patch in a production environment, but I believe his/her patch (checking string length) is a better direction. Mind you I think a trim() is in order:

+          elseif ($mapping['autocreate'] && strlen(trim($term))) {

I will look into trying this method as I agree my hack didn't account for legit conditions ("0" or number 0 for example).

Thanks for catching that and finding the line. I was looking for a "better" place to catch the autocreate logic and add the test for a (trimmed) zero length string on the term name.

UPDATE: I implemented this patch into production now and I am much happier!

--Doug

Status:Needs review» Needs work

Yes, there should be a trim in there.

Status:Needs work» Needs review
StatusFileSize
new639 bytes
PASSED: [[SimpleTest]]: [MySQL] 4,503 pass(es).
[ View ]

New patch based on comment #17

So... how should we test this? Does it even need it's own test? I don't think the other mappers have tests for empty values (maybe they need them too?)

How should we test this?

How I think this should be tested:

  1. Add a test method to the FeedsMapperTaxonomyTestCase class.
  2. Add two source files (rss or csv), both with content for nodes.
    • In the first source file, add at least fields "guid", "title" and "category". This file will be used to test if terms that only consist of spaces are skipped.
    • In the second source file, add at least fields "guid" and "title" but omit the field "category". This file will be used to test if no terms are created at all.
  3. Fill the source files with a couple of records. In the first source file, at least one record should have a valid category, at least one record should have a category with a single space, and at least one record should have a category that has no characters at all.
  4. In the test, create a node importer with mappers for at least guid, node title and taxonomy reference field.
  5. In the test, import the first source file and assert the number of terms created.
  6. In the test, import the second source file and assert the number of terms created.

I don't think the other mappers have tests for empty values (maybe they need them too?)

Yes, other mappers should have them too, though I think that is out of scope for this issue. Tests for other mappers should be done in, for example, #1107522: Framework for expected behavior when importing empty/blank values + text field fix.

Status:Needs review» Needs work

Cool. Sounds good. I'm not sure we need the second test though. If a field is missing from the source, I think it's ignored entirely.

Otherwise, I think this sounds like a good plan. I'll see what I can do. =)

I agree that the second test is not needed

Now that I think of it, I think you (drclaw) are right the second test isn't really needed in the way I described it, as it is completely covered by the first test. If the first test already proves that no empty terms are created with an empty value, then I guess there is no need to prove that it will happen also with no value at all provided.
I guess I was confused with some other issues (#2093651: Simplify target callbacks. and #1107522: Framework for expected behavior when importing empty/blank values + text field fix) where a change was proposed that when a field is defined as mapping, but does not exists in the source, the same thing should happen as when it is in the source, but with a empty value. The reason behind this is that for some parsers (like the XPath-parser, if I'm not wrong) the difference between empty value and non-existent value can not be made.

A case for the second test, but out of scope for this issue

Anyway, a case where the second test would be useful is to test if values are wiped out in the following two situations:

  • When an empty value is provided.
  • When no value is provided.

But that's beyond the scope of this issue, that is just about making sure that no empty terms are created. It's not about wiping out values.

If a field is missing from the source, I think it's ignored entirely.

That currently depends on the implementation of the parser. This may count for the CSV-parser, but not for the XPath-parser. That inconsistency is subject of change.

New test plan

To keep things clear, here is a new test plan based on the plan in #20 but without the 'second test'. I also added that a record with a term called '0' should be added, as in that case a term should be created.

  1. Add a test method to the FeedsMapperTaxonomyTestCase class.
  2. Add a source file (rss or csv) with content for nodes. Add at least fields "guid", "title" and "category" to this file.
  3. Fill the source file with a couple of records:
    • At least one record with a valid category (should create a term);
    • At least one record with a category that consists of a single space (should NOT create a term);
    • At least one record with a category that has no characters at all (should NOT create a term);
    • At least one record with a category that is 0 (should create a term).
  4. In the test, create a node importer with mappers for at least guid, node title and taxonomy reference field.
  5. In the test, import the source file and assert the number of terms created. It's probably also good to assert which terms were created instead of asserting only the amount of terms.

Status:Needs work» Needs review
StatusFileSize
new2.83 KB
PASSED: [[SimpleTest]]: [MySQL] 4,698 pass(es).
[ View ]

Thanks for the feedback. Here's a re-roll with tests!

Issue summary:View changes

Updated issue summary.

Can confirm that the latest patch is working #23

Note: Delete the empty taxonomy terms before re importing.

#23 worked for me

#23 also worked for me!

Issue summary:View changes
Status:Needs review» Fixed

Thanks everybody!

I screwed up the commit message. Doing too many things at once.
http://drupalcode.org/project/feeds.git/commit/5fe509c

Does this need a backport?

Status:Fixed» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.