Package: wnpp
Version: N/A; reported 2002-10-24
Severity: wishlist

* Package name    : libtext-unidecode-perl
  Version         : 0.04
  Upstream Author : Sean M. Burke <[EMAIL PROTECTED]>
* URL             : http://search.cpan.org/author/SBURKE/Text-Unidecode-0.04/lib/Text/Unidecode.pm
* License         : Joint GPL1 / Artistic
  Description     : Last-resort ASCII transliterations of Unicode text

 Text::Unidecode is a simple, quick and dirty library for converting
 displayable characters outside the US-ASCII range (U+0000 to U+007F)
 into that range. The method employed is a lossy, simplistic,
 context-insensitive, and usually phonetic transliteration into Roman
 characters, which is passable for Cyrillic and Greek alphabets,
 sometimes okay for non-Western scripts, bad for Mandarin Chinese and
 worse for other uses of the Han characters and the Thai script.
 .
 In other words, if there is a library which directly addresses your
 problem domain, then you should be using that instead.
 .
 On the other hand, this library's output is always better than
 characters being transcribed as empty boxes, "?"s, or
 backslash-references into UTF8 space. The algorithm and its
 capabilities and shortcomings are described in
 <http://www.sysadminmag.com/documents/sam05060002/>.

-- System Information
Debian Release: testing/unstable
Architecture: i386
Kernel: Linux swift 2.4.19via-epia-tiny #1 Fri Oct 11 21:57:33 BST 2002 i686
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8
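
For readers unfamiliar with the approach, the context-insensitive transliteration mentioned in the description can be pictured as a plain per-character table lookup. The sketch below is only an illustration in Python, not the library's code: the function name `to_ascii`, the `unknown` placeholder, and the seven-entry `TABLE` are all hypothetical, and the real module ships its own, far larger tables.

```python
# Illustrative sketch only: NOT the Text::Unidecode implementation.
# TABLE is a tiny hypothetical mapping chosen for this example.
TABLE = {
    "\u0416": "Zh",  # CYRILLIC CAPITAL LETTER ZHE
    "\u0438": "i",   # CYRILLIC SMALL LETTER I
    "\u0432": "v",   # CYRILLIC SMALL LETTER VE
    "\u0430": "a",   # CYRILLIC SMALL LETTER A
    "\u03b1": "a",   # GREEK SMALL LETTER ALPHA
    "\u03b2": "b",   # GREEK SMALL LETTER BETA
    "\u00e9": "e",   # LATIN SMALL LETTER E WITH ACUTE
}


def to_ascii(text, unknown="?"):
    """Map each character on its own, ignoring its neighbours.

    Characters already in U+0000..U+007F pass through unchanged.
    Anything else is looked up in TABLE; characters missing from this
    toy table fall back to the `unknown` placeholder (an assumption of
    this sketch, since the table here is deliberately incomplete).
    """
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)
        else:
            out.append(TABLE.get(ch, unknown))
    return "".join(out)


if __name__ == "__main__":
    print(to_ascii("caf\u00e9"))
    print(to_ascii("\u0416\u0438\u0432\u0430"))
```

Because every character is mapped independently, such a scheme is fast and simple, but it cannot use surrounding context to choose a reading. That is consistent with the description's warning that results are poor for Han characters, whose pronunciation depends on language and context.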