By dewolfe001 on
How/where does Drupal convert special characters (umlauts, accents, etc.) that come with Spanish text into something usable and storable in the db? From what I can see, it happens before the node is saved, but I do not know where.
Thanks in advance,
Mike
Comments
They are not special
They are not special characters, just normal Unicode characters. This is what Unicode is about.
What exactly is the problem? Don't they appear right? Or are you trying to import non-Unicode text?
I have an import script that
I have an import script that passes data straight to node_save, so it's bypassing a lot of the form functionality. When I pass in a string like "Jesús Martín-Barbero - Pensamiento", it saves it as "Jes", clipping before the first accented character.
When I submit the same data via a form interface, the string takes fine. So, it's not a bug in Drupal, it's a bug in my import tool, one I wanted to circumvent by putting in whatever function calls are necessary to wrap/convert the strings. I've looked for the likely candidates (htmlentities and those that behave like it) but I have come up cold.
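The clipping described above is consistent with the import data being in a legacy single-byte encoding rather than UTF-8, though the thread does not confirm what the database does with the bad bytes. A minimal sketch for checking this, assuming the source is ISO-8859-1 and the mbstring extension is available; import_to_utf8 is a hypothetical helper name, not a Drupal function:

```php
<?php
// Hypothetical helper (not part of Drupal): convert a string to UTF-8 only
// when it is not already valid UTF-8. Assumes legacy data is ISO-8859-1.
function import_to_utf8($text, $source_encoding = 'ISO-8859-1') {
  if (mb_check_encoding($text, 'UTF-8')) {
    // Already valid UTF-8; converting again would double-encode it.
    return $text;
  }
  return mb_convert_encoding($text, 'UTF-8', $source_encoding);
}

// "Jesús Martín-Barbero" as ISO-8859-1 bytes: the accented letters are the
// single bytes 0xFA and 0xED, which are not valid UTF-8 sequences on their own.
$latin1 = "Jes\xFAs Mart\xEDn-Barbero";
$is_utf8 = mb_check_encoding($latin1, 'UTF-8');
$fixed = import_to_utf8($latin1);
```

If $is_utf8 comes back FALSE for the strings in the import file, the data needs converting before it reaches node_save.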
Thanks,
Mike
Oh, so you have non-Unicode
Oh, so you have non-Unicode data which you want to import. You will probably find something here:
http://api.drupal.org/apis/4.7/utf8
Thank you, CogRusty!
That was very helpful.
drupal_convert_to_utf8 was what I needed. I had to force conversion of the strings with non-Unicode characters by iterating through the node object before it was saved to the db. A call to node_save and the {node}_{update|insert} calls were not doing it.
So far, this function seems to hold water.
can be simplified
Thanks for this post, I had the same problem.
As a "reward", I suggest the following simpler code, which does the same thing:
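function direct_node_import_iterate($term) {
  // Objects: convert their public properties, then cast back to stdClass.
  if (is_object($term)) return (object) direct_node_import_iterate(get_object_vars($term));
  // Arrays: recurse into every element (array_map keeps the keys).
  if (is_array($term)) return array_map('direct_node_import_iterate', $term);
  // Anything else: treat as ISO-8859-1 text and convert it to UTF-8.
  return drupal_convert_to_utf8($term, 'iso-8859-1');
}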
I think in most import scenarios the imported items are plain stdClass objects with no extra structure,
so the code is OK. But if $term had methods ("functions"), they would be lost here,
and one should cast the array back to an object of $term's type.
(I checked that array_map preserves the keys; strangely, php.net/array_map says nothing about that.)
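For the case where $term is not a plain stdClass, here is a rough sketch of a variant that keeps the object's class by cloning it instead of casting an array back. The names sketch_convert_to_utf8, import_iterate_keep_class and ImportedTerm are made up for illustration; the first is only a stand-in for drupal_convert_to_utf8 so the code runs outside Drupal, and it assumes the mbstring extension.

```php
<?php
// Stand-in for Drupal's drupal_convert_to_utf8(), so this runs outside Drupal.
function sketch_convert_to_utf8($data, $encoding) {
  return mb_convert_encoding($data, 'UTF-8', $encoding);
}

// Hypothetical variant of the function above that keeps the object's class.
function import_iterate_keep_class($term) {
  if (is_object($term)) {
    $copy = clone $term; // Same class as $term, so its methods survive.
    // Outside the class, get_object_vars() only returns public properties.
    foreach (get_object_vars($term) as $name => $value) {
      $copy->$name = import_iterate_keep_class($value);
    }
    return $copy;
  }
  if (is_array($term)) {
    // With a single array argument, array_map preserves the keys.
    return array_map('import_iterate_keep_class', $term);
  }
  if (is_string($term)) {
    return sketch_convert_to_utf8($term, 'ISO-8859-1');
  }
  return $term; // Integers, booleans and NULL are left untouched.
}

// Hypothetical class with a method, to show that the method is kept.
class ImportedTerm {
  public $title;
  public $tags = array();
  public function label() { return $this->title; }
}

$term = new ImportedTerm();
$term->title = "Jes\xFAs";                      // ISO-8859-1 bytes
$term->tags = array('a' => "Mart\xEDn", 5 => 7);
$result = import_iterate_keep_class($term);
```

Private and protected properties are not visible to get_object_vars here, so they stay unconverted in the clone. Unlike the shorter version, this one also leaves non-string values alone instead of passing them to the converter.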
Thank you. This post is very
Thank you.
This post is very useful.