Closed (works as designed)
Project:
User Import Framework
Version:
7.x-1.0-beta8
Component:
Code
Priority:
Normal
Category:
Bug report
Assigned:
Reporter:
Created:
13 Oct 2011 at 09:15 UTC
Updated:
22 Jun 2012 at 09:46 UTC
Comments
Comment #1
xcheng commented: Yes, use http://php.net/manual/en/function.utf8-encode.php
Comment #2
dwightaspinwall commented: Thank you -- I'll have a look.
Comment #3
dwightaspinwall commented: 7.x-1.0-beta8 now has the same preg_replace as above. I'm not sure about using utf8_encode(); I'm not seeing it in any other code.
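A note on why utf8_encode() is a doubtful fit here: PHP documents it as converting ISO-8859-1 to UTF-8, so it only helps when the input really is ISO-8859-1. The Python sketch below imitates that behavior to show what happens when the input is already UTF-8. It is an illustration only, not code from the module, and the function name `php_style_utf8_encode` is made up for this example.

```python
def php_style_utf8_encode(raw: bytes) -> bytes:
    """Imitate PHP's utf8_encode(): treat every input byte as an
    ISO-8859-1 character and re-encode the result as UTF-8.
    (Hypothetical helper, for illustration only.)"""
    return raw.decode("latin-1").encode("utf-8")


def demo() -> dict:
    """Encode the same character from two different source encodings."""
    latin1_input = "å".encode("latin-1")  # one byte: 0xE5
    utf8_input = "å".encode("utf-8")      # two bytes: 0xC3 0xA5
    return {
        # Correct: the input really was ISO-8859-1.
        "from_latin1": php_style_utf8_encode(latin1_input).decode("utf-8"),
        # Double-encoded: each UTF-8 byte is treated as its own character.
        "from_utf8": php_style_utf8_encode(utf8_input).decode("utf-8"),
    }
```

Calling `demo()` gives "å" for the ISO-8859-1 input and "Ã¥" for the input that was already UTF-8, so applying such a conversion unconditionally would corrupt files that are already valid UTF-8. ISO-8859-1 also has no Polish letters such as "ą" or "ł", and no Korean characters.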
Comment #4
DieWaldfee commented: Is there a possibility to import Korean characters? I don't understand the utf8-encode thing :/
Comment #5
dwightaspinwall commented: I need to look into this, as I don't know the proper way to support it. Any information is welcome.
Comment #6
rolkos commented: After updating from beta7 to beta8 I can't import Swedish or Polish characters like "ąężźńćłóś å ö ä". After the import I get the message "Warning on row 2: Illegal characters were removed from first_name column. May require edit." It should not work that way. I also can't import a URL into a url field.
Comment #7
dwightaspinwall commented: Fixed in 7.x-1.0-beta9 and 6.x-1.6.
The import now checks each value for valid UTF-8 and accepts it if it is valid. If it is invalid, the import strips everything but ASCII and raises a warning (but still succeeds).
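The check described above can be sketched roughly as follows. This is a Python illustration of the described behavior, not the module's actual PHP code; the function name `sanitize_field` and its return shape are assumptions made for the example.

```python
from typing import Tuple


def sanitize_field(raw: bytes) -> Tuple[str, bool]:
    """Return (text, warned) for one imported value.

    Valid UTF-8 is accepted unchanged. Anything else is reduced to its
    ASCII bytes and flagged so the caller can show a warning.
    (Hypothetical helper, for illustration only.)
    """
    try:
        return raw.decode("utf-8"), False
    except UnicodeDecodeError:
        # Keep only 7-bit ASCII bytes; drop everything else.
        ascii_only = bytes(b for b in raw if b < 0x80)
        return ascii_only.decode("ascii"), True
```

With this sketch, "Zażółć" encoded as UTF-8 comes back unchanged with no warning, while the same name saved as ISO-8859-2 is not valid UTF-8 and is reduced to "Za" with the warning flag set.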
Comment #8
rolkos commented: Thanks, it's better now, but if the first character of an imported string is non-ASCII, the leading non-ASCII characters are removed without a warning. If the string contains no ASCII character at all, nothing is imported. If it does contain one, only the part before the first ASCII character is removed.
For example, if we import a string like:
Zażółć
it is imported correctly. If we have a string like:
ąężźńćłóśöä
it is not imported at all, and there is no warning.
If we try to import:
ąężźńAćłóśöä
only the part starting with the first ASCII character is imported, in this case:
Aćłóśöä
Thanks
Comment #9
dwightaspinwall commented: @rolkos: I had no problem importing users with the names you provided. No errors. See attached.
UIF expects the input file to be in UTF-8 format. It does not do any conversion. For my test I used a Google Docs spreadsheet. I have seen problems with Microsoft Excel and recommend against using it.
Hope this clears things up.
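Since UIF does no conversion itself, a file saved in another encoding has to be converted to UTF-8 before import. A minimal Python sketch of that step follows. It assumes the source file is in Windows-1250 (`cp1250`), which is only a guess at what a Central European Excel export might produce; the function names and the default encoding are hypothetical, so check the real encoding of your file first.

```python
def is_valid_utf8(path: str) -> bool:
    """Return True if the whole file decodes cleanly as UTF-8."""
    with open(path, "rb") as handle:
        data = handle.read()
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def convert_to_utf8(src_path: str, dst_path: str,
                    src_encoding: str = "cp1250") -> None:
    """Re-encode a text file to UTF-8.

    src_encoding is an assumption: it must match the encoding the file
    was actually saved in, or the output will contain wrong characters.
    """
    # newline="" keeps the CSV line endings exactly as they were.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
```

A possible workflow is to run `is_valid_utf8()` on the export first and call `convert_to_utf8()` only when it returns False.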