In <[EMAIL PROTECTED]>, Chris Mellon
wrote:

> On 11 Jan 2007 13:28:14 -0800, [EMAIL PROTECTED]
> <[EMAIL PROTECTED]> wrote:
>
>> <Utt id="3" transcribe="yes" audioRoot="A1"
>> audio="313-20070102144528.wav" grammarSet="G3" rawText="n&#227;o"
>> recValue="{data:CHOICE=NO;}" conf="970" rawText2="" conf2="0"
>> transcribedText="n&#227;o" parsableText="n&#227;o"/
>>
>> Clearly those "n&#227" are some non-Ascii characters, but how do I get
>> print to understand that?
>>
>> I keep getting:
>> "UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in
>> position 40:
>>  ordinal not in range(128)"
>>
> 
> Find out what encoding the files are in and modify the script to use it.

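The problem is not the encoding of the files: as the traceback shows, the
XML reading part has already decoded them to unicode strings.  What fails
is the step after that, where print tries to encode the unicode string
for output and uses the 'ascii' codec, which cannot represent u'\xe3'.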
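
One way around it is to encode the unicode string yourself before it
reaches the output stream.  What follows is only a rough sketch, not the
original script: the names raw_text, error and encoded are made up for
illustration, and "utf-8" is an assumption about what the terminal or
output file expects.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for the attribute value the XML reader returns.
# It is already decoded, i.e. a unicode string containing u'\xe3'.
raw_text = u"n\xe3o"

# Encoding with the 'ascii' codec reproduces the reported failure.
error = None
try:
    raw_text.encode("ascii")
except UnicodeEncodeError as exc:
    error = exc

# Encoding explicitly yields a byte string that can be written out.
# "utf-8" is an assumption; substitute the encoding your terminal uses.
encoded = raw_text.encode("utf-8")

# If the exact character does not matter, an error handler avoids the
# exception by substituting '?' for anything the codec cannot represent.
lossy = raw_text.encode("ascii", "replace")
```

In the original script that would mean printing the encoded byte string
instead of the unicode string itself.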

Ciao,
        Marc 'BlackJack' Rintsch
-- 
http://mail.python.org/mailman/listinfo/python-list