I'm getting the dreaded "�" character when someone uses a special character in a hashtag.

At the moment, you can see this problem on http://clients.gertowebservices.com/wfs/node/1 - I also attached a screenshot.

In this case, one tweet mentions "Armenië" (no hashtag), which is displayed correctly, while another with the hashtag "#Armenië" becomes "#Armeni�".

If I remove the following code, the characters display normally again (but hashtags do not have links any longer, obviously) so I do not think this is a character encoding problem with my site:
twitter_block.module, line 287:

  // Linkify tags.
  $status_text = preg_replace(
    '/(^|\s)#([\wåäöÅÄÖ]+)/',
    '\1#<a href="http://search.twitter.com/search?q=%23\2">\2</a>',
    $status_text
  );

I sadly don't know enough about regular expressions to find the problem here. Thanks in advance.

Comment   File                             Size       Author
#1        1567154-Hashtag_Encoding.patch   1.09 KB    ZenDoodles
          screenshot.png                   13.55 KB   Gerto

Comments

ZenDoodles commented:

Version: 7.x-1.0 » 7.x-1.x-dev
Status: Active » Needs review
Attachment: 1567154-Hashtag_Encoding.patch (1.09 KB)

This does seem to be a character encoding issue. This patch adds more of the special characters the regex would miss and uses the PCRE u and i flags, which will hopefully help avoid � in the hashtags.

Please let me know if it works, and I'll commit it.
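
A minimal sketch of why the `u` modifier matters here, assuming the tweet text is UTF-8. Without `u`, PCRE matches byte by byte, so the character class in the original pattern contains the individual bytes of å, ä, ö, and so on. Those all start with the byte 0xC3, which is also the first byte of ë (0xC3 0xAB), so the match ends in the middle of that character and the closing `</a>` lands between its two bytes. The function names below are hypothetical and not part of twitter_block.module, and `[\p{L}\p{N}_]` is a substitute for listing accented letters individually, not the content of the attached patch.

```php
<?php
// Hypothetical helpers for illustration only; not part of twitter_block.module.
// Assumes this file is saved as UTF-8 and that $status_text is UTF-8.

// The pattern as quoted in the issue: no "u" modifier, so PCRE sees bytes.
// The class therefore contains the byte 0xC3 (first byte of å, ä, ö, ...),
// which lets it consume only the first half of a two-byte character like ë.
function linkify_tags_bytewise($status_text) {
  return preg_replace(
    '/(^|\s)#([\wåäöÅÄÖ]+)/',
    '\1#<a href="http://search.twitter.com/search?q=%23\2">\2</a>',
    $status_text
  );
}

// Same replacement, but with the "u" modifier so the pattern and subject are
// treated as UTF-8. \p{L} matches any letter and \p{N} any number, so no
// per-language list of accented characters is needed.
function linkify_tags_utf8($status_text) {
  return preg_replace(
    '/(^|\s)#([\p{L}\p{N}_]+)/u',
    '\1#<a href="http://search.twitter.com/search?q=%23\2">\2</a>',
    $status_text
  );
}

// Returns TRUE when $text is valid UTF-8 (an empty "u" pattern fails to
// match only when the subject contains invalid byte sequences).
function is_valid_utf8($text) {
  return preg_match('//u', $text) === 1;
}

// "\xC3\xAB" is the UTF-8 encoding of ë.
$tweet = "Visit #Armeni\xC3\xAB now";
$bytewise = linkify_tags_bytewise($tweet);  // splits ë, producing invalid UTF-8
$utf8 = linkify_tags_utf8($tweet);          // keeps ë whole inside the link
```

Note that with the `u` modifier, `preg_replace()` returns NULL when the subject is not valid UTF-8, so a caller may want to fall back to the unmodified text in that case.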

ZenDoodles commented:

Status: Needs review » Needs work

This needs to use the regex constant from core in the last preg_replace() call instead.

drupalerocant commented:

I have the same problem on my site.
Can we safely use the patch? Thanks.

Devin Carlson commented:

Status: Needs work » Closed (won't fix)

With the Twitter API v1 going away (see #1933164: Twitter API v1 is going away on May 7), Twitter Block 1.x will no longer function after June 11, 2013. A new 2.x version of Twitter Block has been released (see the release notes), which uses Embedded Timelines.

As Twitter Block 2.x uses a different API and shares little code with Twitter Block 1.x, existing issues no longer apply.