Gregory Ewing <greg.ew...@canterbury.ac.nz>:

> As a result, most unix programs, most of the time, deal
> with text on stdin and stdout.

Well, OK. But even accepting that premise, that "text" might not be what Python 3 considers "text". For example, if your program reads in XML, JSON or Python, the parser object might prefer to take the input as bytes and not have it predecoded by sys.stdin.

> So, it makes sense for them to be text by default.

I'm not sure. That could lead to nasty surprises. I've experienced analogous consternation when the "sort" utility hasn't worked identically for identical input: it is heavily influenced by the (spit, spit) locale. That's why 99.9% of your scripts should prefix "sort" and "grep" with LC_ALL=C, even when the input really is UTF-8.

Should I now take it further and prefix all Python programs with LC_ALL=C? Probably not, since UTF-8 input might then cause sys.stdin to barf.

> And wherever there's text, there needs to be an encoding.

No problem there. The only question is whether sys.stdin and sys.stdout should carry out the decoding/encoding, or whether that should be left to the program.


Marko
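
To make the bytes-versus-text point concrete, here is a minimal sketch of both options. It assumes Python 3.6 or later (where json.loads accepts bytes), and the helper names load_json_from_binary and text_view are made up for illustration; they are not part of any library.

```python
import io
import json


def load_json_from_binary(stream):
    """Read raw bytes from a binary stream and let the JSON parser decode them."""
    raw = stream.read()      # bytes; no locale-dependent decoding has happened yet
    return json.loads(raw)   # json.loads accepts bytes (UTF-8/16/32) since Python 3.6


def text_view(stream, encoding="utf-8"):
    """Wrap a binary stream in a text layer whose encoding the program chooses."""
    return io.TextIOWrapper(stream, encoding=encoding)
```

In a real script you would pass sys.stdin.buffer, the binary stream underneath the text-mode sys.stdin, to either helper. The first leaves decoding to the parser; the second keeps text I/O but pins the encoding explicitly rather than inheriting it from the locale.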