Re: (Simple?) Unicode Question

Rami Chowdhury Thu, 27 Aug 2009 09:46:12 -0700

Further, does anything, except a printing device need to know the
encoding of a piece of "text"?

I may be wrong, but I believe that's part of the idea behind the separation of string and bytes types in Python 3.x. I believe, if you are using Python 3.x and open the file in binary mode, you get bytes and don't need the character encoding mumbo jumbo at all ;-)
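For what it's worth, here is a rough sketch of that separation in Python 3.x. The sample string and the choice of UTF-8 are just my own assumptions for illustration; the point is only that a str is indexed by character and a bytes object by byte, and an encoding is needed only when converting between the two.

```python
# Python 3.x: str holds characters, bytes holds raw byte values.
text = "na\u00efve"              # 5 characters; the third one is non-ASCII
raw = text.encode("utf-8")       # UTF-8 is an assumed encoding for this example

print(len(text), len(raw))       # character count vs. byte count
print(text[2])                   # the third character
print(raw[2])                    # the third byte, as an integer

# Going back from bytes to text is the step that needs the encoding.
assert raw.decode("utf-8") == text
```

The two lengths differ because the non-ASCII character takes two bytes in UTF-8, which is exactly where "n'th byte" and "n'th character" stop meaning the same thing.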

If you're using Python 2.x, though, I believe if you simply set the file opening mode to binary then the data you read() should still be treated as an array of bytes, although you may encounter issues trying to access the n'th character.
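A minimal sketch of the read/replace/write case from the original question, assuming binary mode throughout. The helper name replace_byte, the file names, and the sample data are made up for the example; no encoding is mentioned anywhere because the bytes are never interpreted as text.

```python
import os
import tempfile


def replace_byte(src_path, dst_path, index, new_value):
    """Copy src_path to dst_path, replacing the byte at `index` with `new_value` (0-255)."""
    with open(src_path, "rb") as src:      # binary mode: no decoding happens
        data = bytearray(src.read())       # mutable array of raw bytes
    data[index] = new_value
    with open(dst_path, "wb") as dst:      # binary mode: no encoding happens
        dst.write(data)


# Demo with a hypothetical 10-byte file in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "in.bin")
    dst = os.path.join(workdir, "out.bin")
    with open(src, "wb") as f:
        f.write(b"0123456789")
    replace_byte(src, dst, 1, ord("X"))    # index 1 is the 2nd byte
    with open(dst, "rb") as f:
        result = f.read()

print(result)
```

Note that this only stays safe while you treat the data as opaque bytes. If the 2nd byte happens to be part of a multi-byte character, overwriting it can leave the file undecodable as text later.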


Please do correct me if I'm wrong, anyone.

On Thu, 27 Aug 2009 09:39:06 -0700, Shashank Singh <[email protected]> wrote:

Hi All!

I have a very simple (and probably stupid) question eluding me.
When exactly is the char-set information needed?

To make my question clear consider reading a file.
While reading a file, all I get is basically an array of bytes.

Now suppose a file has 10 bytes in it (all is data, no metadata,
forget the BOM and stuff for a little while). I read it into an array of 10
bytes, replace, say, the 2nd byte and write all the bytes back to a new
file.

Do I need the character encoding mumbo jumbo anywhere in this?

Further, does anything, except a printing device need to know the
encoding of a piece of "text"? I mean, as long as we are not trying
to get a symbolic representation of a "text" or get "i"th character
of it, all we need to do is to carry the intended encoding as
an auxiliary information to the data stored as byte array.

Right?

--shashank




--
Rami Chowdhury

"Never attribute to malice that which can be attributed to stupidity" --Hanlon's Razor

408-597-7068 (US) / 07875-841-046 (UK) / 0189-245544 (BD)
--
http://mail.python.org/mailman/listinfo/python-list
