Hi, thanks for the reply.
However, I get strange behavior when I try to feed text that must be unicode
to altavista for translation.
Just before sending, I've got the following on the log using
print "RECV DATA: ", repr(data)
and after entering "então" ("so" in Portuguese)
RECV DATA: 'right: ent\xc3\xa3o?'
Sent Message to Client Nr. 1
CONTENT: ['right', ' ent\xc3\xa3o?']
Just before the CONTENT printout above, there is a data.split(":").
Now, right before sending the data to be translated by altavista, I print
out CONTENT[1], which yields:
Translating: então?
Which I find odd. Obviously, feeding this into babelfish results in a failed
translation. So before sending I try to encode it as you suggested.
    try:
        print "Translating: ", content[1]
        decoded = content[1].encode('utf8')
        print "Decoding Prior to Translating: ", decoded
    except Exception, e:
        print "EXCEPTION ENCODING ", e
    try:
        translated = translate(decoded, src_l, dest_l)
    except Exception, e:
        print "EXCEPTION TRANSLATING ", e
        translated = "translation failed"
The Exception thrown is:
EXCEPTION ENCODING 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)
I was dealing with an ASCII string and was asking for it to be encoded as
UTF-8, whereas Python is telling me it can't encode it as UTF-8?? Makes
little sense to me.
Chrs
j.
From: Kent Johnson <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
CC: tutor@python.org
Subject: Re: [Tutor] i18n on Entry widgets
Date: Wed, 17 Aug 2005 13:27:24 -0400
Jorge Louis de Castro wrote:
Hi,
How do I set the encoding of a string? I'm reading a string on an Entry
widget and it may use accents and other special characters from languages
other than English.
When I send the string read through a socket the socket is automatically
closed. Is there a way to encode any special characters on a string?
First you have to know what the encoding is of the string you get from the
Entry. IIRC a Tkinter widget will give you an ASCII string if possible,
otherwise a Unicode string. You could check this by
print repr(data)
where data is the string you get from the Entry.
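A minimal sketch of that check, written so it runs on both Python 2 and
Python 3. The sample values are made up to stand in for what an Entry might
return, and the helper name describe is hypothetical:

```python
def describe(data):
    """Return 'bytes' for a byte string, 'text' for a Unicode string."""
    # On Python 2, bytes is an alias of str, so a plain str lands here too.
    if isinstance(data, bytes):
        return 'bytes'
    return 'text'

# Made-up samples: plain ASCII bytes, a Unicode string, UTF-8 encoded bytes.
samples = [b'right', u'ent\xe3o?', b'ent\xc3\xa3o?']
for s in samples:
    print(describe(s) + ' ' + repr(s))
```

A repr that shows two escaped bytes such as \xc3\xa3 for a single accented
letter suggests the value is already UTF-8 encoded bytes, not Unicode.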
Next you have to encode the unicode string to the encoding you want on the
socket. If you want utf-8, you would use
socket_data = data.encode('utf-8')
This will work if data is ASCII or Unicode. There are many other supported
encodings; see http://docs.python.org/lib/standard-encodings.html for a
list.
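One caveat, sketched below under the assumption that the data is a byte
string already holding UTF-8 (as a repr like 'ent\xc3\xa3o?' would suggest):
on Python 2, calling encode on a byte string first decodes it implicitly as
ASCII, and that step fails on any non-ASCII byte. Decoding explicitly to
Unicode first avoids it. The sample value and the helper name to_utf8 are
illustrative only, and the code is written to run on Python 2 and 3:

```python
def to_utf8(data, source_encoding='utf-8'):
    """Return data as UTF-8 bytes, whether it starts as bytes or Unicode."""
    if isinstance(data, bytes):
        # Byte string: decode explicitly with a known encoding (assumed
        # UTF-8 here) instead of relying on an implicit ASCII decode.
        data = data.decode(source_encoding)
    return data.encode('utf-8')

raw = b' ent\xc3\xa3o?'  # assumed sample: UTF-8 bytes as read from a socket

# The ASCII codec cannot decode byte 0xc3; this is the step that trips up
# an encode call on a non-ASCII byte string in Python 2.
try:
    raw.decode('ascii')
except UnicodeDecodeError as e:
    print('ASCII decode failed: ' + str(e))

text = raw.decode('utf-8')   # the Unicode string u' ent\xe3o?'
print(repr(to_utf8(raw)))    # bytes in, UTF-8 bytes out
print(repr(to_utf8(text)))   # Unicode in, UTF-8 bytes out
```

If the bytes are not really UTF-8, source_encoding has to be changed to
whatever encoding they were produced with; guessing wrong gives either an
error or garbled text.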
Kent
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor