Hi all! I have a very simple (and probably stupid) question that keeps eluding me: when exactly is the charset information needed?
To make my question clear, consider reading a file. When I read a file, all I get is basically an array of bytes. Now suppose a file has 10 bytes in it (all data, no metadata; forget the BOM and such for a moment). I read it into an array of 10 bytes, replace, say, the 2nd byte, and write all the bytes back to a new file. Do I need the character encoding mumbo jumbo anywhere in this?

Further, does anything except a printing device need to know the encoding of a piece of "text"? I mean, as long as we are not trying to get a symbolic representation of a "text" or get the "i"th character of it, all we need to do is carry the intended encoding as auxiliary information alongside the data stored as a byte array. Right?

--shashank
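
P.S. In case it helps, here is a minimal sketch of the scenario I have in mind. The file names, the helper name replace_second_byte, and the sample contents are all made up for illustration; the point is only that nothing in it mentions an encoding.

```python
import os
import tempfile


def replace_second_byte(src_path, dst_path, new_byte):
    """Copy src_path to dst_path, overwriting the 2nd byte (index 1)."""
    # Binary mode: the bytes are read as-is, no decoding takes place.
    with open(src_path, "rb") as f:
        data = bytearray(f.read())
    data[1] = new_byte  # new_byte is an integer in range(256)
    # Binary mode again: the bytes are written as-is, no encoding takes place.
    with open(dst_path, "wb") as f:
        f.write(data)


# Hypothetical 10-byte input file, created in a scratch directory.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "in.bin")
dst = os.path.join(workdir, "out.bin")
with open(src, "wb") as f:
    f.write(b"0123456789")

replace_second_byte(src, dst, ord("X"))

with open(dst, "rb") as f:
    result = f.read()
print(result)
```

As far as I can tell, the intended encoding would only have to travel alongside these bytes as a label, and would first be used at the point where someone calls something like result.decode(...) to get at characters.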