On Sep 24 16:03, IWAMURO Motonori wrote:
> 2009/9/22 Andy Koppe <andy.ko...@gmail.com>:
> > Let's use the Windows "ANSI" codepage as the character set for the C
> > locale, for both the conversion functions and filenames. This means
> > CP1252 on Western systems, CP1251 on Cyrillic ones, CP932 on Japanese
> > ones, and so on.
>
> I oppose the approach (the ANSI codepage is used at C locale) because
> CP932 (the codepage for Japanese) is hostile to the UNIX-like tools.
>
> The reason is that the CP932 format contains a lot of meta characters
> as follows.
>
> single character of CP932:
> /[\x00-\x7F\xA0-\xDF]|[\x81-\x9F\xE0-\xFC][\x40-\x7E\x80-\xFC]/

I don't understand. Are you saying that a single character in CP932
consists of 12 bytes? As far as I can see, CP932 is S-JIS, which is
just a simple double-byte character set. What am I missing?

> This has a ruined influence to the tools that don't see locale.

Can you please try to explain the problem in a bit more detail for those
of us not fluent in East Asian languages? What do you mean by
"hostile" and "ruined influence"?


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
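
For reference, here is one possible reading of the quoted pattern, sketched in Python. It is not code from this thread. The pattern appears to describe a character as either one byte or a lead byte followed by a trail byte, so at most two bytes. The guess that "meta characters" refers to ASCII punctuation falling inside the trail-byte range \x40-\x7E is an assumption, and the names CP932_CHAR, split_cp932 and META_IN_TRAIL are made up for illustration.

```python
import re

# The quoted pattern as a Python bytes regex: a single-byte character,
# or a lead byte followed by a trail byte (two bytes in total).
CP932_CHAR = re.compile(
    rb"[\x00-\x7F\xA0-\xDF]|[\x81-\x9F\xE0-\xFC][\x40-\x7E\x80-\xFC]"
)


def split_cp932(data: bytes) -> list:
    """Split CP932-encoded bytes into characters using the quoted pattern."""
    return CP932_CHAR.findall(data)


# ASCII punctuation that lies inside the trail-byte range 0x40-0x7E.
# Assumption: these are the "meta characters" the quoted mail refers to.
META_IN_TRAIL = "".join(
    chr(b) for b in range(0x40, 0x7F) if not chr(b).isalnum()
)

# U+8868 encodes in CP932 as the two bytes 0x95 0x5C.
encoded = "\u8868".encode("cp932")

if __name__ == "__main__":
    print(split_cp932(encoded))   # one two-byte character
    print(META_IN_TRAIL)
    # A tool that scans bytes without knowing the encoding would see
    # the second byte as an ASCII backslash.
    print(b"\\" in encoded)
```

Under that reading, a tool that treats its input as plain bytes could mistake the trail byte 0x5C for a backslash, for example as an escape character or a path separator.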