Progress! You managed to change the error message.

  File "./acc_test_script_generator.py", line 106, in loadData
    print u.encode('utf-8')
AttributeError: Utterance instance has no attribute 'encode'
I'm missing something really obvious here, but I don't know what it is...

Diez B. Roggisch wrote:
> [EMAIL PROTECTED] schrieb:
> > HELP!
> > Guy who was here before me wrote a script to parse files in Python.
> >
> > Includes line:
> > print u
> > where u is a line from a file we are parsing.
> > However, we have started receiving data from Brazil. If I open file to
> > parse in VI, looks like:
> >
> > <Utt id="3" transcribe="yes" audioRoot="A1"
> > audio="313-20070102144528.wav" grammarSet="G3" rawText="não"
> > recValue="{data:CHOICE=NO;}" conf="970" rawText2="" conf2="0"
> > transcribedText="não" parsableText="não"/
> >
> > Clearly those "nã" are some non-Ascii characters, but how do I get
> > print to understand that?
> >
> > I keep getting:
> > "UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in
> > position 40:
> > ordinal not in range(128)"
> >
>
> Does the error happen at the
>
> print u
>
> line? If yes, what happens is that you try and print a unicode object.
> Which means that it has to be converted (actually the right term is
> encoded) to a byte-string. If you don't do that explicitly, it will be
> done implicitly, using the default encoding - which is ascii.
>
> If you have non-ascii characters, you end up with the error you see.
>
> What to do? Use something like this:
>
> print u.encode('utf-8')
>
> instead.
>
> Diez
--
http://mail.python.org/mailman/listinfo/python-list
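For what it's worth, the AttributeError at the top suggests that u is an Utterance object rather than a unicode string, so encode() would have to be called on whatever text the object holds, not on the object itself. Below is a minimal sketch of that idea. The Utterance class here is only a stand-in, and raw_text is an assumed attribute name; neither is taken from the real script.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


class Utterance(object):
    """Hypothetical stand-in for the script's Utterance class."""

    def __init__(self, raw_text):
        # raw_text is an assumed attribute name, not taken from the real script.
        self.raw_text = raw_text


def to_utf8(text):
    """Encode a unicode string explicitly instead of relying on the ASCII default."""
    return text.encode("utf-8")


utt = Utterance(u"n\xe3o")

# The instance itself has no encode() method, which is what the
# AttributeError reports.
has_encode = hasattr(utt, "encode")

# Encoding the text with the ASCII codec fails on u'\xe3' ...
try:
    utt.raw_text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# ... while UTF-8 can represent it, using two bytes for that character.
encoded = to_utf8(utt.raw_text)
print(has_encode, ascii_ok, repr(encoded))
```

If the real class keeps its text under a different attribute, or builds its output in a string-conversion method, that would be the place to apply the explicit encode.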