The comment URL may include any UTF-8 character
I've successfully integrated Hebrew URLs for this module.
The comment_page_cleanstring function is too aggressive and strips all non-English characters from the URL.
To be truly i18n compatible, there is almost no need for string cleaning.
All we need to do is remove the characters that have special meaning in a URL, and make sure the URL is UTF-8 encoded.
A fuller explanation of how Pathauto handles UTF-8 URLs
Patch (line 273 and above):
// Original behaviour: preserve alphanumerics, everything else becomes a separator
/* AL */
// $pattern = '/[^a-zA-Z0-9]+/ ';
// $output = preg_replace($pattern, $separator, $output);
// New behaviour: strip only the characters that have special meaning in a URL
$special_chars = array ("?",":","&","@","~","+","_","\"","'",";",".");
$output = str_replace($special_chars, "", $string);
/* AL End */
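For anyone who wants to try the idea outside the module, here is a minimal standalone sketch of the same approach. The function name example_clean_comment_subject and the whitespace-to-separator step are my assumptions for illustration; they are not part of the module or of the patch above.

```php
<?php
// Hypothetical helper, not part of the module: removes only the characters
// that carry special meaning in a URL and leaves every other UTF-8
// character (Hebrew, for example) untouched.
function example_clean_comment_subject($string, $separator = '-') {
  // Same character list as the patch.
  $special_chars = array("?", ":", "&", "@", "~", "+", "_", "\"", "'", ";", ".");
  $output = str_replace($special_chars, "", $string);

  // Assumption: runs of whitespace collapse into a single separator.
  // The /u modifier makes PCRE treat the subject as UTF-8.
  $output = preg_replace('/\s+/u', $separator, trim($output));

  return $output;
}

// Prints: whats-new
echo example_clean_comment_subject("what's new?"), "\n";

// Non-ASCII text survives the cleaning; rawurlencode() then percent-encodes
// its UTF-8 bytes when the alias has to appear in a link.
echo rawurlencode(example_clean_comment_subject("שלום עולם?")), "\n";
```

Because every character in the list is plain ASCII, str_replace cannot split a multi-byte UTF-8 sequence, so no mbstring functions are needed here.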
Comments
Comment #1
druvision commented: To reduce duplicate code, it might be wise to use pathauto_cleanstring, the string cleaning function of the pathauto module.
Comment #2
rszrama commented: Hmm... I think the thing to do here is to just disable comment subject support in URLs unless you're using Pathauto. This will keep us from having to maintain it in two places. Fixing this now.
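A rough sketch of that approach, not the committed fix: the wrapper name example_comment_subject_for_url is hypothetical, and checking with PHP's function_exists is just one assumed way to detect that Pathauto is loaded.

```php
<?php
// Hypothetical wrapper: put the comment subject in the URL only when
// Pathauto's cleaning function is available, so the cleaning rules are
// maintained in one place.
function example_comment_subject_for_url($subject) {
  if (function_exists('pathauto_cleanstring')) {
    return pathauto_cleanstring($subject);
  }
  // Pathauto is not loaded: leave the subject out of the URL entirely.
  return '';
}
```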
Comment #3
Anonymous (not verified) commented: Automatically closed -- issue fixed for two weeks with no activity.