From includes/common.inc (Drupal 4.6/CVS):

/**
 * Encode special characters in a string for display as HTML.
 *
 * Note that we'd like to use htmlspecialchars($input, $quotes, 'utf-8')
 * as outlined in the PHP manual, but we can't because there's a bug in
 * PHP < 4.3 that makes it mess up multibyte charsets if we specify the
 * charset. This will be changed later once we make PHP 4.3 a requirement.
 */
function drupal_specialchars($input, $quotes = ENT_NOQUOTES) {
  return htmlspecialchars($input, $quotes);
}

As we now require at least PHP 4.3.3 (see INSTALL.txt), this can now be tackled, I guess.

Comments

Steven commented:

This was addressed as part of the check_plain patch, which is sitting in the queue.

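I simply got rid of that comment. UTF-8 is compatible with 7-bit ASCII (unlike some other multibyte encodings): every byte of a multibyte UTF-8 sequence has its high bit set, so an ASCII byte never appears inside a non-ASCII character. htmlspecialchars() therefore doesn't need to know about the encoding. All it changes are plain ASCII characters (angle brackets, ampersands, quotes, ...).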
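To illustrate the point, here is a minimal sketch that compares an encoding-blind byte replacement with a charset-aware htmlspecialchars() call on a UTF-8 string. The helper bytewise_escape() is hypothetical and only exists for this comparison; it is not part of Drupal or of the check_plain patch.

```php
<?php
// Hypothetical helper (not part of Drupal): escape by plain byte
// replacement, with no knowledge of the string's encoding.
function bytewise_escape($input) {
  return strtr($input, array('&' => '&amp;', '<' => '&lt;', '>' => '&gt;'));
}

// "<b>café & 日本語</b>" written out as explicit UTF-8 bytes.
$utf8 = "<b>caf\xC3\xA9 & \xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E</b>";

$bytewise = bytewise_escape($utf8);
$charset_aware = htmlspecialchars($utf8, ENT_NOQUOTES, 'UTF-8');

// Both approaches touch only the ASCII bytes for &, < and >, so the
// multibyte sequences pass through unchanged and the results match.
var_dump($bytewise === $charset_aware);
```

If the two results are identical, the charset argument makes no difference for well-formed UTF-8 input, which is the argument made above.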

I don't know who added this comment originally, but I'm pretty sure that this parameter is there for more complicated encodings like SJIS, which may use plain ASCII bytes as part of non-ASCII characters, and thus need more than simple string replacements.
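As a rough illustration of why such encodings need more care, the sketch below uses the katakana "ソ", which Shift_JIS encodes as the two bytes 0x83 0x5C. The second byte is the same as an ASCII backslash. The example uses a backslash rather than one of the characters htmlspecialchars() escapes, so it shows the general hazard of byte-wise replacement, not a specific htmlspecialchars() failure.

```php
<?php
// Katakana "ソ" in Shift_JIS: lead byte 0x83, trail byte 0x5C.
$sjis_so = "\x83\x5C";

// The trail byte is indistinguishable from an ASCII backslash.
$trail_is_backslash = (substr($sjis_so, 1, 1) === '\\');
var_dump($trail_is_backslash);

// An encoding-blind replacement that doubles backslashes inserts a byte
// after the character's second byte, changing the encoded text.
$mangled = str_replace('\\', '\\\\', $sjis_so);
var_dump(strlen($sjis_so), strlen($mangled));
```

A charset-aware function would treat 0x83 0x5C as one character and leave it alone, which is presumably the kind of case the charset parameter is meant for.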