Yeah, that's what I don't know how to do.

Chris Mellon wrote:
> On 11 Jan 2007 13:28:14 -0800, [EMAIL PROTECTED]
> <[EMAIL PROTECTED]> wrote:
> > HELP!
> > Guy who was here before me wrote a script to parse files in Python.
> >
> > Includes line:
> > print u
> > where u is a line from a file we are parsing.
> > However, we have started receiving data from Brazil. If I open file to
> > parse in VI, looks like:
> >
> > <Utt id="3" transcribe="yes" audioRoot="A1"
> > audio="313-20070102144528.wav" grammarSet="G3" rawText="não"
> > recValue="{data:CHOICE=NO;}" conf="970" rawText2="" conf2="0"
> > transcribedText="não" parsableText="não"/
> >
> > Clearly those "nã" are some non-ASCII characters, but how do I get
> > print to understand that?
> >
> > I keep getting:
> > "UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in
> > position 40:
> > ordinal not in range(128)"
> >
>
> Find out what encoding the files are in and modify the script to use it.
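
For what it's worth, here is a rough sketch of what "find out the encoding and use it" might look like. Everything in it is an assumption for illustration, not the original script: the helper names (guess_encoding, parse_lines, to_output_bytes) are made up, and the candidate list is only a guess at what the Brazilian files might use. Trial decoding cannot prove an encoding, since latin-1 accepts any byte sequence; it only rules out the ones that fail. Note also that the traceback is an *encode* error, which suggests u is already a unicode object and the failure happens when print tries to write it to an ASCII-only stdout, so encoding explicitly on output may be the part that actually matters.

```python
import io

# Assumed candidates, tried in order. utf-8 goes first because it is strict;
# latin-1 goes last because it never fails and would otherwise mask the rest.
CANDIDATES = ("utf-8", "cp1252", "latin-1")


def guess_encoding(path, candidates=CANDIDATES):
    """Return the first candidate that decodes the whole file, else None."""
    with io.open(path, "rb") as f:
        raw = f.read()
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


def parse_lines(path, encoding):
    """Yield each line of the file as unicode text, without the line ending."""
    with io.open(path, "r", encoding=encoding) as f:
        for line in f:
            yield line.rstrip(u"\r\n")


def to_output_bytes(text, out_encoding="utf-8"):
    """Encode text explicitly instead of letting print fall back to ASCII.

    Characters the target encoding cannot represent become '?'.
    """
    return text.encode(out_encoding, "replace")
```

In a Python 2 script like the one described, the `print u` line could then become something like `print to_output_bytes(u)`, with out_encoding set to whatever the terminal or log file expects.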
-- http://mail.python.org/mailman/listinfo/python-list