Project: Google Search Appliance
Version: 6.x-2.0-beta1
Component: Code
Category: bug report
Priority: normal
Assigned: Unassigned
Status: active
Issue tags: encoding meta-tags utf8

Issue Summary

By default, every Drupal installation uses UTF-8 encoding.

The meta tags inserted by the google_appliance module are not encoded correctly.

For example, when the category type is "Cafés":

Actual output:

   <meta name="category-orts-kategorie" content="Caf&Atilde;&copy;s" />

Expected output:

   <meta name="category-orts-kategorie" content="Caf&eacute;s" />

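Before PHP 5.4, htmlentities() assumes ISO-8859-1 when no charset is given, so the two UTF-8 bytes of "é" are converted separately into &Atilde; and &copy;. The problem can be solved by passing an explicit charset to htmlentities():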

--- google_appliance.orig/google_appliance.module 2009-08-21 11:08:46.000000000 +0200
+++ google_appliance/google_appliance.module 2010-11-30 10:01:45.000000000 +0100
@@ -1189,7 +1189,7 @@
         '<meta name="@name" content="!content" />',
         array(
           '@name' => google_appliance_sgml_id_name($name), //see HTML4, 7.4.4 Meta data
-          '!content' => htmlentities($content, ENT_QUOTES),
+          '!content' => htmlentities($content, ENT_QUOTES, 'UTF-8'),
         )
       ));
     }

If you run Drupal with an ISO-8859-1 encoding, the charset should be made configurable, but standard Drupal installations use UTF-8.
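
The difference can be reproduced outside Drupal. The following is only a minimal sketch, not code from the module: the variable names are made up for illustration, and the ISO-8859-1 charset is passed explicitly to imitate the old default, because current PHP versions already default to UTF-8.

```php
<?php
// "Cafés" as raw UTF-8 bytes: "é" is the two-byte sequence 0xC3 0xA9.
$content = "Caf\xC3\xA9s";

// Imitates the pre-5.4 default: each byte is read as one ISO-8859-1 character.
$broken = htmlentities($content, ENT_QUOTES, 'ISO-8859-1');

// Same call as in the patch: the input is declared as UTF-8.
$fixed = htmlentities($content, ENT_QUOTES, 'UTF-8');

echo $broken, "\n"; // Caf&Atilde;&copy;s
echo $fixed, "\n";  // Caf&eacute;s
```

The first line of output matches the actual output reported above, the second the expected output.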

Attachment: google_appliance.patch (604 bytes)

Comments

#1

??
In general, htmlspecialchars() should be sufficient.
htmlentities() is not necessary, and without it you avoid the encoding problem altogether.
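
A minimal sketch of that suggestion, assuming the page itself is served as UTF-8 (the sample string and variable names are hypothetical, not taken from the module): htmlspecialchars() only escapes the ASCII characters &, <, >, " and ', so the UTF-8 bytes of "é" pass through unchanged whichever of the two charsets is named.

```php
<?php
// Sample meta content as UTF-8 bytes, with characters that must be escaped.
$content = "Caf\xC3\xA9s & \"Bars\"";

// Only &, <, >, " and ' are replaced; the bytes of "é" stay as they are.
$as_utf8 = htmlspecialchars($content, ENT_QUOTES, 'UTF-8');
$as_latin1 = htmlspecialchars($content, ENT_QUOTES, 'ISO-8859-1');

echo $as_utf8, "\n"; // Cafés &amp; &quot;Bars&quot;
var_dump($as_utf8 === $as_latin1); // bool(true)
```

The trade-off is that the meta tag then carries a literal "é" rather than &eacute;, which is only correct if the document declares UTF-8.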