At the inner div.
It just looks that way in WordPad. Try Notepad++.
The hexdump of that line is:
grep -A10 theme_user_signature core/modules/user/user.module | grep '<div>' | hexdump -C
00000000  20 20 20 20 24 6f 75 74  70 75 74 20 2e 3d 20 27  |    $output .= '|
00000010  3c 64 69 76 3e e2 80 94  3c 2f 64 69 76 3e 27 3b  |<div>...</div>';|
00000020  0a                                                |.|
00000021
So the characters between the div and close-div are e2 80 94, which is the UTF-8 encoding of — (U+2014, em dash). Would it be better to use the &mdash; entity, or are we OK with Unicode in the output markup?
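To double-check that reading of the hexdump, here is a small Python sketch (Python is used only for illustration; it is not part of the patch). It decodes the three bytes from the hexdump above and compares the result with what the named entity resolves to. The variable names are made up for this example.

```python
import html
import unicodedata

# The three bytes found between <div> and </div> in the hexdump.
raw = bytes.fromhex("e2 80 94")

# Interpret them as UTF-8 and look up the resulting code point.
char = raw.decode("utf-8")
print(f"U+{ord(char):04X}", unicodedata.name(char))

# Resolve the named HTML entity and compare it with the decoded character.
entity_char = html.unescape("&mdash;")
print("same character:", char == entity_char)
```

So the literal bytes and the named entity should render identically in a browser; the difference is only in what is stored in the source file.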
To protect against editors that don't support Unicode mangling that symbol, and against file transfer protocols that modify the non-ASCII contents of files, I think it should be changed to the named HTML character reference.
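As a rough illustration of what that replacement amounts to, the sketch below rewrites every non-ASCII character in a string as a named character reference where Python's standard table has one, and as a numeric reference otherwise. The helper name to_char_refs is hypothetical and the code is Python rather than anything from the Drupal code base.

```python
from html.entities import codepoint2name


def to_char_refs(text: str) -> str:
    """Return text with every non-ASCII character replaced by an HTML reference."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)  # plain ASCII is kept as-is
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # named reference, e.g. &mdash;
        else:
            out.append(f"&#{cp};")  # fall back to a decimal numeric reference
    return "".join(out)


print(to_char_refs("<div>\u2014</div>"))
```

The output is pure ASCII, so it should survive editors and transfers that are not Unicode-aware, at the cost of being slightly less readable in the source.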
Here is a patch which updates this.
Works for me.
Committed to 8.x. Thanks!
Same patch as above but for D7.
Trivial, and already committed to D8.
7: non_ascii_character-421294-7.patch queued for re-testing.
On further thought, should we really do this?
There is unicode all over the Drupal codebase, including in HTML output, and including several instances of this exact same character.
At the very least, we should be consistent.
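For anyone who wants to check consistency, a scan for non-ASCII characters is easy to sketch. The Python below is only an illustration: find_non_ascii is a hypothetical helper, and the sample string is invented for the example rather than taken from core.

```python
import unicodedata


def find_non_ascii(text):
    """Return (line, column, code point, name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((line_no, col, f"U+{ord(ch):04X}", name))
    return hits


# Invented sample input; real use would read each source file as UTF-8.
sample = "$output .= '<div>\u2014</div>';\n$size = '3 \u00d7 4';\n"
for hit in find_non_ascii(sample):
    print(hit)
```

Running something like this over the tree would show how many places already use literal Unicode, which is the information needed to decide which direction "consistent" should go.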
And I am also under the impression that Drupal generally prefers unicode characters to HTML entities, because they're more portable.
If your text editor can't handle unicode, I think you need a better text editor :)
IIRC, at the time the original patch was submitted, this was the only instance (of this character at least).
FWIW, there doesn't seem to be a clear consensus online about Unicode vs Character Entities (both have their advantages/disadvantages) - http://en.wikipedia.org/wiki/Unicode_and_HTML
As far as I understand (and I'm not terribly current on this), the theme_* functions are tied explicitly to the HTML output mechanism (as compared to web-services etc), so the difference between entities and unicode is probably mostly about maintainability rather than anything semantic.
I personally don't really care one way or the other, but I think we should be consistent. This issue was reported against D6 and took 4+ years to get into D8; the least we could do is backport this trivial change to D7. Or revert the D8 change and close the issue. I'm not sure why tiny changes always take more effort than major rewrites.
#657166: use × instead of x is similar, but that issue proposes replacing ASCII with Unicode, which is sort of the opposite of what we're doing here ...
Um. It took one week to get this in 8.x, once a patch was supplied. We can't commit changes to D8 if we have no patch. So let's please be accurate. :)
I personally lean towards reverting the D8 patch, but it's unclear what the actual problem was that was being solved, since the issue summary is a single sentence.
Yes, please roll back. There are unicode characters all over the code base, and that's how it should be.
In my opinion we shouldn't force editors on developers, and having less Unicode in the code base is better. Non-Unicode editors are rare, but I'd dislike breaking a file by using one if I needed to.
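To make the "breaking a file" scenario concrete, here is a minimal Python sketch of one way it can happen. It assumes the editor reads the UTF-8 file as Windows-1252 and then saves it back as UTF-8; other editors may fail differently, so treat this as an illustration only.

```python
original = "<div>\u2014</div>"
utf8_bytes = original.encode("utf-8")

# Assumed failure mode: the editor interprets the UTF-8 bytes as Windows-1252.
misread = utf8_bytes.decode("cp1252")
print(ascii(misread))  # the dash now shows up as three unrelated characters

# Saving that misread text as UTF-8 bakes the damage into the file.
resaved = misread.encode("utf-8")
print(resaved != utf8_bytes)

# The ASCII-only entity form goes through the same round trip unchanged.
entity_form = "<div>&mdash;</div>"
print(entity_form.encode("ascii").decode("cp1252") == entity_form)
```

In this particular scenario the damage is reversible by undoing the wrong decode, but only if someone notices it before further edits are made.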