There is a problem using Unicode text with the Drupal PHPTAL engine.

When entering content containing Swedish characters in the default Drupal theme (bluemarine), it displays OK when Drupal renders it. When switching the theme engine to the PHPTAL engine (tal_grey), the page encoding is reported as UTF-8, but the actual text is rendered as ISO-8859-1 (which I can confirm by forcing the character encoding to ISO-8859-1 for that page in Firefox).

The net result is that the page says it is UTF-8 encoded, but the contents will not display properly unless the encoding is manually changed to ISO-8859-1.

Comments

boci’s picture

The problem:
The code produced by PHPTAL consists only partially of HTML entities: it contains both HTML entities (for example &aacute;) and plain characters (for example á). When html_entity_decode parses this text, it gives back a bad result.
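
As an illustration of how such a mismatch can arise (an assumption about the cause, not something confirmed in this thread): the charset argument of html_entity_decode decides which bytes a named entity is turned into, and older PHP versions defaulted to ISO-8859-1. A minimal sketch:

```php
<?php
// Illustration only: the same entity decoded under two charsets.
$entity = '&aacute;';

// Decoded as ISO-8859-1, the entity becomes the single byte E1.
$latin1 = html_entity_decode($entity, ENT_QUOTES, 'ISO-8859-1');

// Decoded as UTF-8, it becomes the two-byte sequence C3 A1.
$utf8 = html_entity_decode($entity, ENT_QUOTES, 'UTF-8');

echo bin2hex($latin1), "\n"; // e1
echo bin2hex($utf8), "\n";   // c3a1
```

If entities are decoded to single Latin-1 bytes while the page header announces UTF-8, the browser shows exactly the symptom described in the report.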

My first solution:

In phptal.engine I changed:

return html_entity_decode($res);

to

  $res=str_replace("&lt;","<",$res);
  $res=str_replace("&gt;",">",$res);
  $res=str_replace("&quot;","\"",$res);
  // Decode &amp; last, otherwise "&amp;quot;" would be decoded twice.
  $res=str_replace("&amp;","&",$res);

return $res;

This converts only the <, >, & and " characters back to their literal form and ignores the other entities (for example &raquo; (»), &aacute; (á) etc.).
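
The same selective decoding can also be sketched as a single pass with strtr, which avoids the risk of chained replacements decoding an already decoded "&". The function name decode_markup_entities is hypothetical and not part of phptal.engine:

```php
<?php
// Hypothetical helper, not part of phptal.engine.
function decode_markup_entities($html)
{
    // strtr() with an array scans the input once and never re-examines
    // text it has already replaced, so "&amp;quot;" becomes "&quot;".
    return strtr($html, array(
        '&lt;'   => '<',
        '&gt;'   => '>',
        '&quot;' => '"',
        '&amp;'  => '&',
    ));
}

echo decode_markup_entities('&lt;b&gt;&aacute; &amp;amp; &raquo;&lt;/b&gt;'), "\n";
// prints: <b>&aacute; &amp; &raquo;</b>
```

Named entities such as &aacute; and &raquo; pass through untouched, so the browser resolves them regardless of the page encoding.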

Second solution:
change

  $phptal[$hook] = &new PHPTAL(basename($file), dirname($file));

to

  $phptal[$hook] = &new PHPTAL(basename($file), dirname($file));
  $phptal[$hook]->setEncoding("ISO8859-1");

However, this does not work correctly. For example, the &raquo; (») character comes out wrong.
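
One possible explanation, assuming the content reaches PHPTAL as UTF-8 (this is a guess, not something verified against the engine): when htmlentities is told the input is ISO-8859-1, it treats each byte of a multi-byte UTF-8 character as a separate Latin-1 character. A minimal sketch:

```php
<?php
// Assumption: the content is stored as UTF-8, so "»" is the two bytes C2 BB.
$raquo = "\xC2\xBB";

// Declaring the wrong charset encodes each byte as its own Latin-1 character.
$asLatin1 = htmlentities($raquo, ENT_COMPAT, 'ISO-8859-1');

// Declaring the real charset yields the single intended entity.
$asUtf8 = htmlentities($raquo, ENT_COMPAT, 'UTF-8');

echo $asLatin1, "\n"; // &Acirc;&raquo;
echo $asUtf8, "\n";   // &raquo;
```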

boci’s picture

Hi again!

I found another solution.

In phptal.engine, change

return html_entity_decode($res);

to

    return $res;

and in libs/PHPTAL/OutputControl.php change

        if (defined("PHPTAL_DIRECT_OUTPUT") && count($this->_buffers) == 0) {
            // echo htmlentities($str);
            // support for cyrillic strings thanks to Igor E. Poteryaev
            echo htmlentities($str, $this->_quoteStyle, $this->_encoding);
        } else {
            // $this->_buffer .= htmlentities($str);
            // support for cyrillic strings thanks to Igor E. Poteryaev
            $this->_buffer .= htmlentities($str, $this->_quoteStyle, $this->_encoding);
        }

to

        if (defined("PHPTAL_DIRECT_OUTPUT") && count($this->_buffers) == 0) {
            echo $str;
        } else {
            $this->_buffer .= $str;
        }

This solution generates correct output. Could somebody please merge this patch into PHPTAL?

olav’s picture

Assigned: Unassigned » olav
Status: Active » Closed (won't fix)

This problem is gone after the rewrite, at least under PHP 5 / PHPTAL 1.0. Ümläuts both in content and in the templates do work now.