Items containing commas are imported as separate terms even though they are properly quoted.

yhager - May 19, 2009 - 06:53
Project: Taxonomy CSV import/export
Version: 6.x-3.1
Component: Code
Category: bug report
Priority: normal
Assigned: Unassigned
Status: active
Description

I tried to import a one-line CSV:

"parent", "child, with, commas"

and actually got 4 terms imported. See the screenshots of the process and results.

Attachment    Size
tc1.png       43.4 KB
tc2.png       29.92 KB
tc3.png       9.01 KB

#1

Daniel_KM - May 20, 2009 - 10:56

Hi,

Thanks for your remark.

You are right: the import process creates four terms where two are expected.

In fact, there are two delimiters in your text:
- the first is the double quote ( " );
- the second is the combination of two double quotes and a comma ( "," ).

Because the terms of a line are extracted with a basic PHP function (explode), which splits on every occurrence of the delimiter regardless of quotes, the import goes wrong.
To avoid this problem, I advise you to choose a delimiter that never occurs in your data, such as the symbol " ¤ " or even a combination such as " A8èV$+# ". Your import text would then be:
Parent¤Childs, with, commas
or
ParentA8èV$+#Childs, with, commas

A smarter term extraction process could be implemented, but it is currently not a priority for me, because the delimiter can be chosen.
The example I give in the advanced help is similar and has the same problem, so I am going to add a clarification there.
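
To illustrate the point about explode, here is a minimal sketch of how a plain split on the comma treats the line from the report, and how an unused delimiter such as ¤ avoids the problem. This is not the module's actual code, which is not quoted in this thread; the variable names are made up for the example.

```php
<?php
// Illustration only, not the module's code.
$line = '"parent", "child, with, commas"';

// explode() splits on every comma and knows nothing about quotes.
$terms = explode(',', $line);
print_r($terms);
// Four pieces: '"parent"', ' "child', ' with', ' commas"'

// With a delimiter that never occurs in the data, the same split is safe.
$safe = explode('¤', 'Parent¤Childs, with, commas');
print_r($safe);
// Two pieces: 'Parent', 'Childs, with, commas'
```

The quotes and leading spaces also stay in the pieces, because explode does no trimming or unquoting.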

Sincerely,

Daniel Berthereau
Knowledge manager

#2

yhager - May 21, 2009 - 09:41

Thanks for the detailed and prompt explanation. In my case, I can re-export the data with any delimiters I choose, so this will not be a problem.

However, I guess I just expected it to work that way, since I believe that enclosing a field in double quotes is rather common when the field contains a comma. I believe this is how MySQL (INTO OUTFILE) behaves (maybe also Excel). Wikipedia mentions this too in http://en.wikipedia.org/wiki/Comma-separated_values.

Testing fgetcsv also shows the same behaviour:

$ cat /tmp/a.csv
Hello, "Hello, world"
Test, "This, is, a, test"
$ php -a
Interactive shell

php > $f = fopen("/tmp/a.csv","r");
php > print_r(fgetcsv($f));
Array
(
    [0] => Hello
    [1] => Hello, world
)
php > print_r(fgetcsv($f));
Array
(
    [0] => Test
    [1] => This, is, a, test
)

I haven't looked into the code of this module, but those are the reasons I expected it to work this way.

Thanks again!

#3

Daniel_KM - August 12, 2009 - 12:38

Hi,

Thank you for your remark.

The new release 6.x-4.1 complies better with CSV standards. It improves the import process and now allows any separator and delimiter. A new "Enclosure" option has been added so that any file can be imported. I hope that helps.
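
As an illustration of what a separator and an enclosure mean, here is a small sketch using PHP's str_getcsv. It is only an assumption that the module behaves in a similar way, since its code is not shown in this thread; the sample line and variable names are hypothetical.

```php
<?php
// Hypothetical sample line: ';' as separator, single quote as enclosure.
$line = "'parent';'child; with; semicolons'";

// Arguments: string, separator, enclosure, escape character.
$terms = str_getcsv($line, ';', "'", '\\');
print_r($terms);
// Two terms: 'parent' and 'child; with; semicolons'
```

The separator inside the enclosed field is kept as part of the term instead of starting a new one.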

Best regards,

Daniel Berthereau

 
 
