Topics imported with HTML entities escaped
webchick - January 31, 2007 - 04:06
| Project: | phpBB2Drupal |
| Version: | HEAD |
| Component: | Code |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | closed |
Description
Again, this could be my DragonflyCMS thing being weird. If so, please mark this as won't fix.
Topics are being imported as:
"Smokin&#039; Aces: Theatrical Review"
They are actually stored this way in the database; however, since Drupal does filtering on output, they're being double-escaped.
This patch fixes the problem by decoding the titles before they're put in the Drupal database. Bodies don't have this problem, since they're handled by BBCode.
| Attachment | Size |
|---|---|
| html-entities.patch | 700 bytes |
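
To illustrate the double-escaping described above, here is a minimal sketch. It assumes Drupal's output filtering behaves like htmlspecialchars(); the escape_for_output() helper is a hypothetical stand-in, not code from the patch.

```php
<?php
// Title as stored in the source forum database: the apostrophe is already
// encoded as an HTML entity.
$stored_title = 'Smokin&#039; Aces: Theatrical Review';

// Hypothetical stand-in for output filtering (assumed to work like
// htmlspecialchars(); this is not the module's actual code).
function escape_for_output($text) {
  return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Imported as-is: the "&" of the existing entity is escaped a second time,
// so the browser shows the literal text "&#039;" in the title.
$double_escaped = escape_for_output($stored_title);

// Decoded before import: the title is stored as plain text and escaped
// exactly once on output.
$decoded = html_entity_decode($stored_title, ENT_QUOTES, 'UTF-8');
$escaped_once = escape_for_output($decoded);

echo $double_escaped, "\n";
echo $escaped_once, "\n";
```

The first line printed contains `&amp;#039;`, which a browser renders as the literal entity text; the second contains `&#039;`, which renders as an apostrophe.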

#1
Whoops. Forgot about comment titles as well. And this is probably a better approach.
#2
committed. thanks.
#3
#4
In http://drupal.org/node/67068#comment-688227 I wrote:
and:
Beginner replied:
I think the first approach from the two patches should fix this. I will need to test it later on.
#5
Just tested removing this patch, but no luck. I think it may just be some corrupted data on my end.
#6
Where do you have the problem?
In the node title or within the node body?
In Drupal, HTML entities are not allowed within the title.
#7
Within the body.
I tried changing it so that the HTML entities are decoded only for the titles (like the first patch, but also for comments, polls, etc.), and not for the node body.
I do not get this everywhere, only in one specific post in my test database (of about 110,000 posts, so there may be more). It may just be data corruption.
#8
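The original (committed) patch decoded the characters to ISO-8859-1, which was the default output encoding of html_entity_decode(). ISO-8859-1 cannot represent many special characters, and the bytes it produces for accented characters are not valid UTF-8.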
$text = html_entity_decode($text, ENT_QUOTES);
should be
$text = html_entity_decode($text, ENT_QUOTES, 'utf-8');
(I have also moved the decoding of HTML entities to after the conversion of everything else, as this seems like a safer bet for other encodings.)
This did affect titles too. Fixed in both HEAD and the Drupal 5.x-3.x-dev branch.
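
For anyone wanting to check the difference, here is a small sketch of what the third argument changes. The sample title is made up for illustration, and the charset is passed explicitly in both calls so the result does not depend on the PHP version's default.

```php
<?php
// Hypothetical title containing an entity for a non-ASCII character.
$title = 'Caf&eacute; review';

// Decoding to ISO-8859-1 yields the single byte 0xE9 for "é".
$latin1 = html_entity_decode($title, ENT_QUOTES, 'ISO-8859-1');

// Decoding to UTF-8 yields the two-byte sequence 0xC3 0xA9.
$utf8 = html_entity_decode($title, ENT_QUOTES, 'UTF-8');

// A pattern with the /u modifier fails to match when the subject is not
// valid UTF-8, so it can serve as a simple validity check.
$latin1_is_valid_utf8 = preg_match('//u', $latin1) === 1;
$utf8_is_valid_utf8 = preg_match('//u', $utf8) === 1;

echo bin2hex($latin1), "\n";
echo bin2hex($utf8), "\n";
var_dump($latin1_is_valid_utf8, $utf8_is_valid_utf8);
```

Since Drupal stores its text as UTF-8, the ISO-8859-1 result is the one that would end up as broken characters in a title or body.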
#9
Automatically closed -- issue fixed for two weeks with no activity.