HELP! The guy who was here before me wrote a Python script to parse files. It includes the line: print u, where u is a line from a file we are parsing. However, we have started receiving data from Brazil. If I open the file to parse in vi, it looks like:
<Utt id="3" transcribe="yes" audioRoot="A1" audio="313-20070102144528.wav" grammarSet="G3" rawText="não" recValue="{data:CHOICE=NO;}" conf="970" rawText2="" conf2="0" transcribedText="não" parsableText="não"/>

Clearly the "ã" in those "não" values is a non-ASCII character, but how do I get print to understand that? I keep getting:

"UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in position 40: ordinal not in range(128)"
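For reference, here is a minimal sketch of one common way around this kind of error: encode the unicode string explicitly before writing it, instead of letting print fall back to the ASCII codec. It assumes the terminal or output file expects UTF-8, which may not hold on every system. The helper names encode_for_output and print_line are hypothetical and are not part of the original script.

```python
import io
import sys


def encode_for_output(text, encoding="utf-8", errors="replace"):
    """Turn a unicode string into bytes using an explicit encoding.

    The UTF-8 default is an assumption; use whatever encoding the
    terminal or destination file actually expects.
    """
    return text.encode(encoding, errors)


def print_line(text, stream=None, encoding="utf-8"):
    """Write one encoded line to a byte stream (hypothetical helper).

    With no stream given, this uses the byte-level stdout: sys.stdout.buffer
    on Python 3, or sys.stdout itself on Python 2.
    """
    if stream is None:
        stream = getattr(sys.stdout, "buffer", sys.stdout)
    stream.write(encode_for_output(text, encoding) + b"\n")


# A shortened stand-in for one parsed line; u'\xe3' is the "ã" in "não".
sample = u'transcribedText="n\xe3o"'

# Demonstrate with an in-memory byte stream rather than the real stdout.
buf = io.BytesIO()
print_line(sample, buf)
```

In the original script, the equivalent change would be to replace the bare print u with something like print_line(u). If the output must stay pure ASCII, passing encoding="ascii" with the default errors="replace" substitutes "?" for each character that cannot be encoded rather than raising an exception.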