Re: Hebrew in idle and eclipse (Windows)

iu2 Tue, 22 Jan 2008 14:01:13 -0800

On Jan 17, 10:35 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> ...
>   print lines[0].decode("<a-encoding>").encode("<c-encoding>")
> ...
> Regards,
> Martin


Ok, I've got the solution, but I still have a question.

Recall:
When I read data using SQL I got a sequence like this:
\x88\x89\x85
But when I entered Hebrew words directly in the print statement (or as
a dictionary key)
I got this:
\xe8\xe9\xe5

Now, scanning the encodings package, I discovered that cp1255 maps
'\u05d9' to \xe9
while cp856 maps '\u05d9' to \x89,
so transforming \x88\x89\x85 to \xe8\xe9\xe5 is done by

s.decode('cp856').encode('cp1255')

ending up with the pattern you suggested.
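
For reference, here is a minimal sketch of that transformation, written for Python 3 (where the byte sequence is a bytes literal rather than a str). The helper name transcode and its default arguments are only illustrative; the byte values are the ones quoted above.

```python
# Bytes as they arrived from the SQL query (values taken from this post).
raw_from_sql = b"\x88\x89\x85"


def transcode(data, source="cp856", target="cp1255"):
    """Decode bytes from `source`, then re-encode the text as `target`."""
    text = data.decode(source)   # bytes -> Unicode text
    return text.encode(target)   # Unicode text -> bytes in the other code page


converted = transcode(raw_from_sql)
print(repr(converted))
```

Both code pages hold the same three Hebrew letters here; only the byte values that represent them differ.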

My question is: is there a way I can deduce cp856 and cp1255 from the
string itself? Is there a function that does it? (That would make the
transformation more robust.)
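
As far as I know, the standard library has no function that identifies an encoding from raw bytes, so any answer has to be a heuristic. The sketch below is one such heuristic and not a reliable detector: it assumes the text is mostly Hebrew, tries each candidate code page, and picks the one that yields the most Hebrew letters (U+05D0 to U+05EA). The function name guess_hebrew_encoding and the candidate list are hypothetical, chosen only for this example.

```python
def count_hebrew_letters(text):
    """Count characters in the Hebrew letter range U+05D0..U+05EA."""
    return sum(1 for ch in text if "\u05d0" <= ch <= "\u05ea")


def guess_hebrew_encoding(data, candidates=("cp856", "cp1255")):
    """Return the candidate that decodes `data` to the most Hebrew letters.

    Returns None when no candidate produces any Hebrew letter at all.
    This is a heuristic under the assumption that the input is Hebrew text.
    """
    best_name, best_score = None, 0
    for name in candidates:
        # errors="replace" keeps undefined bytes from raising an exception;
        # they simply do not count as Hebrew letters.
        text = data.decode(name, errors="replace")
        score = count_hebrew_letters(text)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


print(guess_hebrew_encoding(b"\x88\x89\x85"))
print(guess_hebrew_encoding(b"\xe8\xe9\xe5"))
```

Short strings, or strings that mix Hebrew with other text, can easily fool this kind of scoring, so knowing the encoding of the data source is still the safer route.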

I don't know how IDLE guessed cp856, but it must have done so
(perhaps because it uses Tcl, and maybe Tcl guesses the encoding
automatically?)
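
This does not explain how IDLE arrived at cp856, but it may help to see which encodings Python itself reports in each environment. On Windows a console window commonly uses an OEM code page while GUI programs use the ANSI code page; whether that is what happened here is only a guess. The sketch below is for Python 3, and the helper name report_encodings is hypothetical.

```python
import locale
import sys


def report_encodings():
    """Collect the encodings Python reports for the current environment."""
    return {
        # Encoding of the output stream; may be None if stdout is redirected
        # to an object that has no encoding attribute.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding the locale settings suggest for text data.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used for implicit str/bytes conversions.
        "default": sys.getdefaultencoding(),
    }


for name, value in sorted(report_encodings().items()):
    print(name, "=", value)
```

Running the same script from IDLE, from Eclipse, and from a console window and comparing the output shows whether the environments disagree.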

thanks
iu2



-- 
http://mail.python.org/mailman/listinfo/python-list
