paul wrote:
> However, this will change in py3k...,
> what's the new rule of thumb?

In py3k, the str type will be what unicode is now, and there will be a new type called bytes for holding binary data -- including text in some external encoding. These two types will not be compatible.

At the lowest level, reading a file will return bytes, which then have to be decoded to produce a (unicode) str, and a str will have to be encoded into bytes before being written to a file. There will be wrappers for text files that perform the decoding and encoding automatically, but they will need to be set up to use a specified encoding if you're dealing with anything other than ASCII. (It may be possible to set up a system-wide default, I'm not sure.)

So you won't be able to get away with ignoring encoding issues in py3k. On the plus side, it should all be handled in a much more consistent and less error-prone way. If you mistakenly try to use encoded data as though it were decoded data or vice versa, you'll get a type error.

--
Greg
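
A minimal sketch of the str/bytes split described above, as it looks in Python 3. It assumes UTF-8 as the external encoding; the file name and sample string are made up for illustration.

```python
import os
import tempfile

text = "caf\u00e9"                 # a (unicode) str
data = text.encode("utf-8")        # str -> bytes, needed before writing

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Lowest level: binary mode deals in bytes only.
with open(path, "wb") as f:
    f.write(data)
with open(path, "rb") as f:
    raw = f.read()                 # bytes
decoded = raw.decode("utf-8")      # bytes -> str

# Text wrapper: decoding is automatic once an encoding is specified.
with open(path, "r", encoding="utf-8") as f:
    wrapped = f.read()             # str

# Mixing encoded and decoded data raises instead of guessing.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

Passing the encoding explicitly, rather than relying on whatever default the platform supplies, keeps the behaviour the same from one machine to the next.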