Character problem with nodes that have Greek titles
Chrys - July 16, 2009 - 19:55
| Project: | Drigg |
| Version: | 6.x-1.x-dev |
| Component: | Code |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | needs work |
Description
I have created a Drigg website but I have a problem with Greek characters. Some titles come out scrambled (example). The weird part is that some nodes from the same source are scrambled and some are not. Also note that this feed item is not scrambled on the feed aggregator page.
Useful Information
Drupal version: 6.10
MySQL: 4.1.22
Collation: utf8_general_ci
PHP: 5.2.5
PHP memory limit: 32M

#1
Drigg has issues with non-English characters.
I hope Merc will fix it in the next release.
#2
Well, this website mysuba.ru seems to handle non-English characters just fine. I contacted the owner of the website to see if he faced a similar problem.
It is difficult to debug this because some feed items come through fine and some come through with question marks. Also, all the feed items on the aggregator page are OK ({aggregator_item} table). So I guess the problem is where the title is copied from the {aggregator_item} table to the {node} table.
I will continue searching for this bug. Kudos to merc and the other guys that work on this module. Very well structured and very well commented.
#3
Are the items all coming from the same form? Can you find a reason why some are OK? From a particular user? From a particular form? Grabbed from an RSS feed?
#4
The items come from different feed sources. For example this node is broken and this node is not broken. However they come from the same feed source!
What I discovered so far:
- Aggregator module: All the feed items in the {aggregator_item} table are OK. So the aggregator module is OK.
- Database: aggregator_item table and node table have the same collation (utf8_general_ci).
#5
Still working on this issue. I added log information and the scrambled items appear just after this line in drigg/drigg_rss/drigg_rss.module:
$result = db_query("SELECT * FROM {aggregator_item} WHERE iid > %d", $last_iid);
while ($item = db_fetch_object($result)) {
...
log_debug("TITLE: $item->title");
The strange thing is that the titles in the aggregator_item table are fine when I look them up with phpMyAdmin. So why do db_query() or db_fetch_object() (sometimes) fail to handle Greek characters and return item titles with question marks?
A broken link example is here
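One possible explanation for the question marks, sketched here in Python rather than in Drupal's PHP: if the database connection uses a character set that cannot represent Greek, the server replaces each unrepresentable character with "?" when it sends the row to the client, even though the stored data is intact. The function name `send_to_client` and the sample titles are hypothetical, and the sketch only imitates the conversion; it does not talk to MySQL.

```python
def send_to_client(stored_title: str, connection_charset: str) -> str:
    """Imitate converting a stored title to the connection character set.

    Characters that the target charset cannot represent are replaced
    with "?", which is the lossy behaviour assumed in this sketch.
    """
    wire_bytes = stored_title.encode(connection_charset, errors="replace")
    return wire_bytes.decode(connection_charset)


# Hypothetical sample titles, not taken from the affected site.
greek_title = "νέα"
ascii_title = "Drupal news"

# Over a latin-1 connection the Greek letters are lost; ASCII survives.
broken = send_to_client(greek_title, "latin-1")
intact = send_to_client(ascii_title, "latin-1")

# Over a UTF-8 connection nothing is lost.
preserved = send_to_client(greek_title, "utf-8")
```

Under this assumption, phpMyAdmin would still show correct titles because it opens its own connection with its own character set.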
#6
Fixed. I made two changes, so I am not sure which one did the trick :)
The changes I made were the following:
- Before querying the database I added these two commands:
db_query("SET CHARACTER SET 'utf8'");
db_query("SET NAMES 'utf8'");
So that I am sure that UTF-8 is used.
- Changed the utf8_general_ci collation to the utf8_unicode_ci collation.
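The two statements above have to be sent on an open connection before any rows are fetched. A minimal, language-neutral sketch of that ordering in Python follows; `force_utf8_session` and `RecordingCursor` are hypothetical names, and the recording cursor stands in for a real database cursor so that the example runs without a server.

```python
class RecordingCursor:
    """Stand-in for a database cursor; it only records what it is sent."""

    def __init__(self):
        self.statements = []

    def execute(self, statement):
        self.statements.append(statement)


def force_utf8_session(cursor) -> None:
    """Send the session-level charset statements before any other query."""
    for statement in ("SET CHARACTER SET 'utf8'", "SET NAMES 'utf8'"):
        cursor.execute(statement)


cursor = RecordingCursor()
force_utf8_session(cursor)
# Only after this point would the feed items be selected.
cursor.execute("SELECT * FROM aggregator_item WHERE iid > 0")
```

In the Drupal module itself the equivalent would be the two db_query() calls quoted above, placed ahead of the SELECT on {aggregator_item}.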
#8
I saw the issue marked as fixed, but I see neither a patch nor any code changes committed to CVS.