On Sun, Jan 8, 2017 at 1:47 AM, Stephen J. Turnbull <[email protected]> wrote:
> INADA Naoki writes:
>
> > I want UTF-8 mode is enabled by default (opt-out option) even if
> > locale is not POSIX,
> > like `PYTHONLEGACYWINDOWSFSENCODING`.
> >
> > Users depends on locale know what locale is and how to configure it.
> > They can understand difference between locale mode and UTF-8 mode
> > and they can opt-out UTF-8 mode.
> > But many people lives in "UTF-8 everywhere" world, and don't know
> > about locale.
>
> I find all this very strange from someone with what looks like a
> Japanese name. I see mojibake and non-Unicode encodings around me all
> the time. Caveat: I teach at a University that prides itself on being
> the most international of Japanese national universities, so in my
> daily work I see Japanese in 4 different encodings (5 if you count the
> UTF-16 used internally by MS Office), Chinese in 3 different (claimed)
> encodings, and occasionally Russian in at least two encodings, ...,
> uh, I could go on but won't. In any case, the biggest problems are
> legacy email programs and busted websites in Japanese, plus email that
> is labeled "GB2312" but actually conforms to GBK (and this is a reply
> in Japanese to a Chinese applicant writing in Japanese encoded as GBK).

Since I work at a tech company and use Linux almost exclusively for
server-side programs, I don't live in such a situation.

But when I see non-UTF-8 text, I don't change my locale to read it.
(Strictly speaking, changing the locale doesn't fix mojibake, because it
doesn't change my terminal emulator's encoding.)
And I don't change my terminal emulator's settings just to read such text.
What I do is convert it to UTF-8 with a command like
`view text-from-windows.txt ++enc=cp932`.

So there is no problem if Python always uses UTF-8 for fsencoding and
the stdio encoding.

> I agree that people around me mostly know only two encodings: "works
> for me" and "mojibake", but they also use locales configured for them
> by technical staff. On top of that, international students (the most
> likely victims of "UTF-8 by default" because students are the biggest
> Python users) typically have non-Japanese locales set on their
> imported computers.

Hmm, which OS do they use? There is no problem on macOS or Windows.
Do they use Linux with a locale whose encoding is not UTF-8, and a
terminal emulator that uses a non-UTF-8 encoding?

My feeling is that UTF-8 started to dominate about 10 years ago, and
ja_JP.EUC_JP (the most common locale for Japanese before UTF-8) is now
completely legacy.
Among the machines I can ssh into, there is only one with a ja_JP.eucjp
locale (it sits on a LAN, has been running for 10+ years, and its
/usr/bin/python is Python 1.5!).
_______________________________________________
Python-ideas mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
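
A minimal Python 3 sketch of the approach described in the reply above: decode the text with an explicitly named encoding and re-encode it as UTF-8, rather than switching the locale. The helper names `convert_to_utf8`, `current_encodings`, and `demo`, as well as the file names, are hypothetical and chosen only for illustration. They are not part of any proposal in this thread.

```python
import os
import sys
import tempfile
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding):
    """Re-encode the file at src as UTF-8 and write it to dst.

    Hypothetical helper: the source encoding is passed explicitly,
    so the result does not depend on the current locale.
    """
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
    return text


def current_encodings():
    """Report the encodings Python selected for file names and stdout."""
    return {
        "filesystem": sys.getfilesystemencoding(),
        "stdout": getattr(sys.stdout, "encoding", None),
    }


def demo():
    """Write a cp932 file, convert it, and check the round trip."""
    sample = "日本語のテキスト"
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "text-from-windows.txt")
        dst = os.path.join(tmp, "text-utf8.txt")
        Path(src).write_bytes(sample.encode("cp932"))
        text = convert_to_utf8(src, dst, "cp932")
        return text == sample and Path(dst).read_bytes() == sample.encode("utf-8")


if __name__ == "__main__":
    print(current_encodings())
    print("round trip ok:", demo())
```

The values reported by `current_encodings()` depend on the platform, locale, and Python version, so no particular output is assumed here.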
