On Fri, Jun 01, 2001 at 01:56:42PM +0200, Marco d'Itri wrote:
> On Jun 01, Josip Rodin <[EMAIL PROTECTED]> wrote:
>
> >Nice things these general tendencies... in my country we still have problems
> >using ISO 8859-2 because Windows 1250 has polluted everything. Adding
> >another one to the pile is likely to screw things up even more. <sigh>
> This is the reason we can't just switch the terminals to UTF-8, there
> are way too many programs which can't correctly recode ISO-8859-* text,
> because they are broken or because the charset is unlabeled.

So we first make them work with ISO-8859-*, then work on making
applications work with UTF-8, then work on making those terminals
display UTF-8? I can see a shortcut here...

> Let's first fix the software, then we'll talk about using UTF-8 by
> default for everybody.

But fix it which way? To support CP1250? ISO-8859-2? CP852? Or KOI8-R?
CP1251? ECMA? CP866?

Better to concentrate on UTF-8; it really does solve many problems.
Better to let the old terminals die... If we had not let the old
Czechoslovak Kamenicky encoding die, but had focused instead on fixing
the software, we would have a much bigger mess than there is now.

All the i18n stuff in glibc is a bit flawed... it assumes you NEVER want
to change the default locale while the program is running, and it
assumes everybody has a correct terminal.

Have you seen the konwert package? It is really nice. glibc's iconv(3)
should have been like this, and there would be one problem less...

Ideal (under the circumstances) would be:

- have glibc work internally in UTF-8, unconditionally
    - Output is transliterated according to the terminal charset
      (ideally UTF-8, so no conversion is necessary). The terminal
      charset can be switched _on the fly_, maybe via SIG-SOMETHING
      sent to glibc.
    - Locales are in UTF-8, unconditionally.
    - isprint(3) returns 1 for UTF-8 characters (fuzzy here... but it
      definitely should not be tied to the locale); actually displaying
      the character is handled by a konwert-like output routine.
    - You want your ISO-8859-1 console with locales? No problem, do
      export OUTPUTCHARSET=ISO-8859-1
      and glibc will transliterate any Russian fortunes into Latin
      script... and strip diacritics from Slovak names.
    - readline, stty & co. are UTF-8 aware.
    - Input can be recoded to UTF-8 if necessary, but ideally it is
      already coming in as UTF-8 (the biggest problem here is text
      editors; maybe use filterm(1) for old applications). The input
      encoding, too, can be changed on the fly.
- have X work in UTF-8
    - xkb sends UTF-8, the default font encoding is UTF-8
- allow UTF-8 in /etc/passwd. Damn.
  I was bitten by this a few days ago. 8-bit chars in GECOS behave
  unpredictably.

-- 
 -----------------------------------------------------------
| Radovan Garabik http://melkor.dnp.fmph.uniba.sk/~garabik/ |
| __..--^^^--..__    garabik @ melkor.dnp.fmph.uniba.sk     |
 -----------------------------------------------------------
Antivirus alert: file .signature infected by signature virus.
Hi! I'm a signature virus! Copy me into your signature file to help me spread!
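
P.S. A rough sketch of the "transliterate on output" idea from the list above, in Python rather than inside glibc, only to show the shape of it. The name `to_terminal` is made up for illustration, and Unicode decomposition is used here as a stand-in for konwert-style transliteration tables: it strips diacritics, but it does not romanize Cyrillic, which would need a real mapping table.

```python
import unicodedata


def to_terminal(text: str, charset: str) -> bytes:
    """Encode text for a terminal, degrading characters the charset lacks.

    Hypothetical helper: not part of glibc, konwert, or any library.
    """
    out = bytearray()
    for ch in text:
        try:
            # Characters the terminal charset can represent pass through.
            out += ch.encode(charset)
            continue
        except UnicodeEncodeError:
            pass
        # Otherwise decompose the character and drop the combining marks,
        # which leaves the base letter (e.g. C with caron becomes C).
        base = "".join(
            c
            for c in unicodedata.normalize("NFKD", ch)
            if not unicodedata.combining(c)
        )
        # Whatever is still unrepresentable is emitted as '?'.
        out += base.encode(charset, errors="replace")
    return bytes(out)
```

With a UTF-8 terminal nothing is lost, since every character encodes directly; with ISO-8859-1 or ASCII only the characters missing from that charset are degraded.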