Re: Python nuube needs Unicode help

Diez B. Roggisch Thu, 11 Jan 2007 15:26:01 -0800

[EMAIL PROTECTED] wrote:
> HELP!
> Guy who was here before me wrote a script to parse files in Python.
> 
> Includes line:
> print u
> where u is a line from a file we are parsing.
> However, we have started receiving data from Brazil. If I open the file to
> parse in vi, it looks like:
> 
> <Utt id="3" transcribe="yes" audioRoot="A1"
> audio="313-20070102144528.wav" grammarSet="G3" rawText="n&#227;o"
> recValue="{data:CHOICE=NO;}" conf="970" rawText2="" conf2="0"
> transcribedText="n&#227;o" parsableText="n&#227;o"/
> 
> Clearly those "n&#227" are some non-Ascii characters, but how do I get
> print to understand that?
> 
> I keep getting:
> "UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in
> position 40:
>  ordinal not in range(128)"
>


Does the error happen at the

print u

line? If so, you are printing a unicode object, which means it has to be 
converted (the correct term is encoded) to a byte string. If you don't do 
that explicitly, it is done implicitly using the default encoding, which 
is ASCII.

ASCII cannot represent non-ASCII characters such as u'\xe3', so you end up 
with the error you see.

What to do? Use something like this:

print u.encode('utf-8')

instead.
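
To make the difference concrete, here is a minimal sketch. It assumes the line holds the text from your example (u'n\xe3o'); the variable names are made up for illustration and are not from your script. It only shows the encoding step, not the printing itself:

```python
# Hypothetical stand-in for one parsed line: the word "nao" with a-tilde,
# written with an escape so the source file itself stays pure ASCII.
line = u"n\xe3o"

# Encoding with ASCII fails, because u'\xe3' is outside range(128).
# This is the same failure that the implicit conversion runs into.
try:
    line.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly as UTF-8 works: the a-tilde becomes two bytes.
utf8_bytes = line.encode("utf-8")

# Decoding the bytes again gives back the original unicode object.
round_trip = utf8_bytes.decode("utf-8")
```

Whether UTF-8 is the right choice depends on what your terminal or the consumer of the output expects, so treat 'utf-8' here as an assumption.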

Diez
-- 
http://mail.python.org/mailman/listinfo/python-list
