Project: Facebook-style Statuses (Microblog)
Version: 6.x-2.0-rc2
Component: Code - Functionality
Category: bug report
Priority: normal
Assigned: Unassigned
Status: closed (fixed)
Issue tags: hashtags, umlaut

Issue Summary

Hi,

The module doesn't handle Germanic umlauts in tags, so if one tries to tag #gebäck, it won't be recognized as a tag. Is there a solution for that? Drupal taxonomies support umlauts (and so does Twitter).

Regards,
jakewalk

Comments

#1

Title: Germanic umlauts and tags » Special characters in #hashtags

The original way I did it used regex, which is about a zillion times better than the current string-based implementation (and would fix this problem), except that some people had problems with it, probably due to server configuration issues. I'd like to switch back to regex entirely, but I will probably just add an option to use it instead.

#2

Status: active » needs work

I'm having some trouble getting the regex to work with Unicode. I opened a forum topic on a PHP regex board, though, and once I get an answer there, the rest is fairly trivial.
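To illustrate the kind of Unicode problem described here, a minimal sketch follows. It is written in Python rather than the module's PHP, and the pattern names are hypothetical; it is not the module's actual code. It only shows how an ASCII-only character class cuts a tag off at the first umlaut, while a Unicode-aware word class keeps it whole.

```python
import re

# Hypothetical patterns for illustration only.
# ASCII-only class: stops at the first non-ASCII letter.
ASCII_TAG = re.compile(r"#([A-Za-z0-9_]+)")
# In Python 3, \w on str patterns is Unicode-aware by default.
UNICODE_TAG = re.compile(r"#(\w+)")

# "\u00e4" is the precomposed letter "ä", so this reads "#gebäck".
status = "Fresh #geb\u00e4ck today"

ascii_tags = ASCII_TAG.findall(status)      # tag is truncated before the umlaut
unicode_tags = UNICODE_TAG.findall(status)  # tag is captured in full

print(ascii_tags, unicode_tags)
```

In PHP's PCRE functions, Unicode handling is usually switched on with the `u` pattern modifier; whether that was the fix used here is not stated in this thread.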

#3

Thanks for your instant response. I'd be glad if that issue could be fixed.

#4

Status: needs work » fixed

I just committed a fix to CVS that switches completely to regex. I didn't leave an option for the string-based method because it was too restrictive. The new parsing:

  • Allows special UTF-8 characters in hashtags and usernames.
  • Corrects a problem where a status like "#the #therapy" would match "#the" twice and "#therapy" never.
  • Allows surrounding hashtags and usernames with square brackets to support tags with word-break characters. For example, [#hello world] now matches the tag #hello world where it previously would only have matched #hello.
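
The three behaviors listed above can be sketched roughly as follows. This is a Python approximation under stated assumptions, not the code committed to the module (which is PHP): `TAG_PATTERN` and `extract_hashtags` are hypothetical names, and the real pattern may well differ.

```python
import re

# Hypothetical pattern: either a bracketed tag such as [#hello world],
# which may contain spaces, or a bare tag such as #hello.
# \w is Unicode-aware for str patterns in Python 3, so umlauts are accepted.
TAG_PATTERN = re.compile(r"\[#([^\[\]]+)\]|#(\w+)")


def extract_hashtags(status):
    """Return the tag names found in a status, in order of appearance."""
    tags = []
    # finditer scans left to right and consumes each match once, so
    # "#therapy" is matched as a whole rather than as "#the" plus leftovers.
    for match in TAG_PATTERN.finditer(status):
        bracketed, bare = match.groups()
        tags.append(bracketed if bracketed is not None else bare)
    return tags


print(extract_hashtags("#the #therapy"))
print(extract_hashtags("[#hello world]"))
print(extract_hashtags("#hello world"))
```

Because the bare form relies on the word class, a word-break character ends a bare tag, and the bracketed form is what allows such characters inside a tag.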

#5

Hi,

Thanks, that works very well so far. We just noticed that dashes in tags (e.g. #hash-tag) don't work:

http://fbss.icecreamyou.com/statuses/term/hash-tags

I hope you're able to fix that, too.

Regards,
Raphael

#6

[#hash-tag] works, but #hash-tag doesn't. I'm going to keep it that way since a hyphen is a word-break character and Twitter doesn't accept hashtags with hyphens either.

#7

Actually, #hash-tag seems to work too, but I think Views might have a problem with the hyphen character in arguments or something like that. I will look into it.

#8

Yeah, it turns out that Views has a bug with taxonomy term name arguments where the term name has a hyphen in it. There's not much I can do about that, unfortunately.

#9

So what would be the next step? I might post an issue in the Views module queue, but as you seem to have an idea of what the problem might be, maybe you want to?

Thanks so far!
Raphael

#10

I have no idea what the actual problem is; I just asked on IRC. There's probably already an issue open in the Views queue, and if not, opening one is likely the best way to go.

#11

Status: fixed » closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.