I'm getting the dreaded "�" character when someone uses a special character in a hashtag.
At the moment, you can see this problem on http://clients.gertowebservices.com/wfs/node/1 - I also attached a screenshot.
In this case, one tweet mentions "Armenië" (no hashtag), which is displayed correctly, while another with the hashtag "#Armenië" becomes "#Armeni�".
If I remove the following code, the characters display normally again (but hashtags do not have links any longer, obviously) so I do not think this is a character encoding problem with my site:
twitter_block.module, line 287:
// Linkify tags.
$status_text = preg_replace(
'/(^|\s)#([\wåäöÅÄÖ]+)/',
'\1#<a href="http://search.twitter.com/search?q=%23\2">\2</a>',
$status_text
);
I sadly don't know enough about regular expressions to find the problem here. Thanks in advance.
| Comment | File | Size | Author |
|---|---|---|---|
| #1 | 1567154-Hashtag_Encoding.patch | 1.09 KB | ZenDoodles |
| | screenshot.png | 13.55 KB | Gerto |
Comments
Comment #1
ZenDoodles commented:
This does seem to be a character encoding issue. This patch adds more of the special characters the regex would miss and uses the PCRE u and i flags, which will hopefully help avoid � in the hashtags.
Please let me know if it works, and I'll commit it.
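For anyone following along, here is a minimal sketch of why the original pattern can produce � and how the u flag helps. It is not the attached patch (whose contents aren't shown in this thread); the function names are hypothetical, and the use of \p{L} and \p{N} is an assumption on my part rather than what the patch does. It also assumes the file is saved as UTF-8, as the module file presumably is. Without the u flag, PCRE works byte by byte, so the two-byte letters in the character class contribute their shared lead byte 0xC3 on its own, and the match can end in the middle of "ë".

```php
<?php
// Hypothetical helpers for illustration; not code from twitter_block.module
// or from the attached patch.

// Original pattern, byte mode (no "u" flag). In a UTF-8 file each of
// å ä ö Å Ä Ö is two bytes starting with 0xC3, so the class accepts a lone
// 0xC3 byte and the capture can stop halfway through a multibyte character.
function linkify_tags_bytes($text) {
  return preg_replace(
    '/(^|\s)#([\wåäöÅÄÖ]+)/',
    '\1#<a href="http://search.twitter.com/search?q=%23\2">\2</a>',
    $text
  );
}

// Unicode-aware sketch: "u" makes PCRE treat pattern and subject as UTF-8;
// \p{L} matches any letter and \p{N} any number (an assumed choice).
function linkify_tags_unicode($text) {
  return preg_replace(
    '/(^|\s)#([\p{L}\p{N}_]+)/u',
    '\1#<a href="http://search.twitter.com/search?q=%23\2">\2</a>',
    $text
  );
}

// "ë" written as its UTF-8 bytes so the subject is unambiguous.
$tweet = "Visit #Armeni\xC3\xAB";

$broken = linkify_tags_bytes($tweet);
$fixed = linkify_tags_unicode($tweet);

// preg_match('//u', ...) returns 1 only when the subject is valid UTF-8.
var_dump(preg_match('//u', $broken) === 1); // bool(false): "ë" was split in two
var_dump(preg_match('//u', $fixed) === 1);  // bool(true)
echo $fixed, "\n";
```

In this sketch \p{L} already covers upper and lower case, so the i flag isn't needed here; whether it is needed depends on how the character class in the real patch is written.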
Comment #2
ZenDoodles commented:
Need to use the regex constant from core in the last preg_replace instead.
Comment #3
drupalerocant commented:
I have the same problem on my site.
Can we safely use the patch? Thanks.
Comment #4
Devin Carlson commented:
With the Twitter API v1 going away (see #1933164: Twitter API v1 is going away on May 7), Twitter Block 1.x will no longer function after June 11, 2013. A new 2.x version of Twitter Block has been released (see the release notes), which uses Embedded Timelines.
As Twitter Block 2.x uses a different API and shares little code with Twitter Block 1.x, existing issues no longer apply.