Transliteration

smk-ka - December 9, 2007 - 11:57

This module provides a central transliteration service for other Drupal modules and sanitizes the names of newly uploaded files.

Generally speaking, it takes Unicode data and tries to represent it in US-ASCII characters (i.e., the universally displayable characters between 0x00 and 0x7F). The representation is almost always an attempt at transliteration, that is, conveying in Roman letters the pronunciation expressed by the text in some other writing system.
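To make the idea concrete, here is a minimal sketch in Python of mapping Unicode text to US-ASCII. It is not this module's code (the module is based on Text::Unidecode's lookup tables); the function name to_ascii, the tiny FALLBACK_MAP table, and the "?" placeholder for unknown characters are all assumptions made for illustration. The sketch also leans on Unicode decomposition to strip accents, which is a simplification of the table-driven approach.

```python
import unicodedata

# Hypothetical, deliberately tiny lookup table. A real transliteration
# table covers many thousands of characters.
FALLBACK_MAP = {
    "ß": "ss",
    "Æ": "AE",
    "æ": "ae",
    "Ø": "O",
    "ø": "o",
}


def to_ascii(text, unknown="?"):
    """Return an approximate US-ASCII representation of text."""
    out = []
    for char in text:
        # Characters in 0x00-0x7F are already ASCII; keep them as-is.
        if ord(char) < 0x80:
            out.append(char)
            continue
        # Use the explicit mapping when one exists.
        if char in FALLBACK_MAP:
            out.append(FALLBACK_MAP[char])
            continue
        # Otherwise decompose (e.g. "é" becomes "e" plus a combining
        # accent) and keep only the ASCII parts of the result.
        decomposed = unicodedata.normalize("NFKD", char)
        stripped = "".join(c for c in decomposed if ord(c) < 0x80)
        out.append(stripped if stripped else unknown)
    return "".join(out)


print(to_ascii("Über café"))
print(to_ascii("Straße"))
```

Characters with neither a table entry nor an ASCII decomposition fall back to the placeholder, which is one reason output quality depends so heavily on how complete the tables are for a given script.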

Another purpose is cleaning the names of newly uploaded files: Drupal currently doesn't handle invalid (according to RFC 2396) and non-ASCII characters in filenames, which can break your site's file attachments, image uploads, etc. This module provides a generic approach to sanitizing file names that needs no customization.
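A rough sketch of what filename sanitizing can look like, again in Python and again not the module's actual implementation. The function name sanitize_filename, the "_" replacement character, the "file" fallback name, and the allowed character set are assumptions; the set used here (letters, digits, ".", "_" and "-") is a conservative subset of the characters RFC 2396 lists as unreserved.

```python
import re
import unicodedata


def sanitize_filename(name, replacement="_"):
    """Return an ASCII-only file name that is safe to use in a URL."""
    # Decompose accented letters and drop whatever is still non-ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace each run of characters outside the safe set.
    ascii_name = re.sub(r"[^A-Za-z0-9._-]+", replacement, ascii_name)
    # Collapse repeated replacement characters into one.
    ascii_name = re.sub(
        "(?:" + re.escape(replacement) + "){2,}", replacement, ascii_name
    )
    # Trim replacements at the ends; never return an empty name.
    return ascii_name.strip(replacement) or "file"


print(sanitize_filename("Résumé 2007.pdf"))
```

Dropping untransliterable characters outright, as this sketch does, can leave very little of the original name; a table-based transliteration step would preserve more of it.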

The transliteration is based on CPAN's Text::Unidecode library and some code from MediaWiki's UtfNormal.php.

According to Unidecode, "[...] the output is not so dirty at all: Russian and Greek seem to work passably; and while Thaana (Divehi, AKA Maldivian) is a definitely non-Western writing system, setting up a mapping from it to Roman letters seems to work pretty well. But sometimes the output is very dirty: [it] does quite badly on Japanese and Thai."

What's new

Version 6.x-1.0 retroactively converts existing filenames on installation or update.

Installation

Authors

This project has been sponsored by UNLEASHED MIND.
We specialize in consulting and development for Drupal-powered sites; our services include installation, development, theming, customization, and hosting to get you started. Visit http://www.unleashedmind.com for more information.

Releases

Official releases  Date         Size      Status
6.x-2.0            2008-Jun-12  96.08 KB  Recommended for 6.x
5.x-2.0            2008-Jun-12  95.95 KB  Recommended for 5.x

Drupal is a registered trademark of Dries Buytaert.