This is easy to reproduce.
(database collation: utf8_general_ci; latest 6.x-1.x-dev)

Case 1: When I use a unique target (guid mapped to GUID) and the CSV is NOT UTF-8 encoded and contains Latin characters (e.g. in the title field) and Greek characters (e.g. in the body field), only the Latin characters are imported (the Title field); the Greek characters are not (the Body field is empty). That's expected.

Case 2: When I use a unique target (guid mapped to GUID) and the CSV is now UTF-8 encoded and, as before, contains Latin (e.g. title field) and Greek (e.g. body field) characters, only the LAST entry of the CSV file is imported correctly (all other nodes are missing).

Case 3: When I don't use a unique target and the CSV is again UTF-8 encoded and, as before, contains Latin (e.g. title field) and Greek (e.g. body field) characters, ALL the entries are imported correctly as nodes (but now I cannot simply update them, because there is no unique target).

This can be tested with all the example CSV files included with the dev module. I have been searching for a solution for two days without success. I have a CSV file with thousands of entries and often need to update them. Can anyone help with this problem?
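
For case 1, one workaround is to convert the file to UTF-8 before importing. Below is a minimal Python sketch. It assumes the source file is ISO-8859-7 (a common Greek encoding; this is a guess, since the encoding actually used is not stated here), and `reencode_to_utf8` is a hypothetical helper name.

```python
def reencode_to_utf8(data: bytes, source_encoding: str = "iso-8859-7") -> bytes:
    """Decode bytes from source_encoding and re-encode them as UTF-8.

    The default encoding is an assumption; pass "cp1253" for Windows Greek.
    """
    return data.decode(source_encoding).encode("utf-8")


# Example: a CSV row with a Latin title and a Greek body, stored as ISO-8859-7.
legacy_bytes = "1,Title,Καλημέρα\n".encode("iso-8859-7")
utf8_bytes = reencode_to_utf8(legacy_bytes)
```

If the wrong source encoding is passed, the result may decode without error but contain the wrong characters, so it is worth checking a few rows by eye before importing.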

Comments

alex_b’s picture

Priority: Critical » Normal
Status: Active » Postponed (maintainer needs more info)

Can you post example CSV files for reproducing this error? case 2 sounds very strange. Looks like the comparison check finds false positives. If you post example CSVs, can you point out which columns are used as unique targets?

alex_b’s picture

Title: Node CSV import problem (imports only one node -last cvs entry-) when CVS character set is UTF-8 and unique target GUID is on. » CSV Parser: errors with non-UTF8 encodings and checking uniqueness

Keeping the title a little shorter for better legibility of the queue.

summit’s picture

Subscribing. This may be the same as http://drupal.org/node/704532?
Greetings, Martijn

jdench’s picture

I had the exact same experience as the OP.

What happens in case 2 is that the importer treats every line as the same node: it writes the data from line 2 into the node over the data from line 1, and so on, until at the end the node holds only the data from the last line. (The evidence is that it prints to the screen the whole history of how it kept giving the same node a new alias.)
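
The overwrite behaviour can be reproduced with a toy importer. This is a Python sketch, not the Feeds code: `import_rows` and its arguments are hypothetical, and the trigger used here (a UTF-8 byte order mark glued to the first header name, so the `guid` lookup comes back empty for every row) is only one assumed way a uniqueness check could produce false positives.

```python
import csv
import io


def import_rows(csv_text, unique_column=None):
    """Toy importer: return a dict mapping node id -> imported row."""
    nodes = {}
    seen = {}  # unique value -> node id of the node that holds it
    next_id = 1
    for row in csv.DictReader(io.StringIO(csv_text)):
        # If the header name does not match exactly, .get() returns None.
        key = row.get(unique_column) if unique_column else None
        if unique_column is not None and key in seen:
            node_id = seen[key]  # "existing" node: overwrite it
        else:
            node_id = next_id  # no match: create a new node
            next_id += 1
            if unique_column is not None:
                seen[key] = node_id
        nodes[node_id] = row
    return nodes


# Assumed input: the BOM makes the first header "\ufeffguid", not "guid".
with_bom = "\ufeffguid,title,body\n1,A,άλφα\n2,B,βήτα\n"
overwritten = import_rows(with_bom, unique_column="guid")
separate = import_rows(with_bom)  # no unique target
```

In this sketch, with `guid` as the unique column both rows land on node 1 and only the last row survives, while without a unique column each row gets its own node.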

The whole thing seems pretty weird in that I had the reverse problem when trying to import non-UTF8 data: I didn't have a guid and the data kept rewriting itself on the same node. When I added the guid, it created one node per line as expected.

twistor’s picture

Status: Postponed (maintainer needs more info) » Closed (fixed)

Closing very old issues. Feel free to re-open.