2011/9/12 Alec Taylor <alec.tayl...@gmail.com>:
> Good evening,
>
> I have converted ODT to HTML using LibreOffice Writer, because I want
> to convert from HTML to Creole using python-creole. Unfortunately I
> get this error: "File "Convert to Creole.py", line 17
> SyntaxError: Non-ASCII character '\xe2' in file Convert to Creole.py
> on line 18, but no encoding declared; see
> http://www.python.org/peps/pep-0263.html for details".
>
> Unfortunately I can't post my document yet (it's a research paper I'm
> working on), but I'm sure you'll get the same result if you write up a
> document in LibreOffice Writer and add some End Notes.
>
> How do I automate the removal of all non-ascii characters from my code?
>
> Thanks for all suggestions,
>
> Alec Taylor
> --
> http://mail.python.org/mailman/listinfo/python-list

It would obviously help to see the content of the line mentioned in the traceback (and probably its context). However, the byte value '\xe2' corresponds to â in several European encodings, in which case it is probably part of some quoted unicode/string literal. (At least in Python 2; in Python 3 it could also be the name of an object in the code. The traceback seems to be the same for both cases.) Cf.:

    >>> print '\xe2'.decode("iso-8859-1")
    â

and likewise for iso-8859-1, 2, 3, 4, 9, 10, 14, 15, 16, some windows- encodings, etc.

As previously suggested, adding the encoding information, such as iso-8859-1 or windows-1252 (or another encoding, depending on the rest of the data), at the top of the source file might fix the error. That would certainly be preferable to throwing all non-ASCII data away. You would add e.g.

    # -*- coding: iso-8859-1 -*-

as the first or second line of the file.

hth,
vbr
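
To make the suggestion above concrete, here is a minimal sketch that locates the offending bytes and prepends an encoding declaration instead of discarding any data. It is written for Python 3 (unlike the Python 2 one-liner above), and the helper names `find_non_ascii`, `declared_encoding` and `add_declaration` are hypothetical, chosen only for this illustration. The encoding passed in is an assumption you have to verify against how the file was actually saved.

```python
import re

# Pattern for an encoding declaration as described in PEP 263.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def find_non_ascii(source: bytes):
    """Return (line_number, column, byte_value) for every non-ASCII byte."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for col, byte in enumerate(line):
            if byte > 0x7F:
                hits.append((lineno, col, byte))
    return hits


def declared_encoding(source: bytes):
    """Return the encoding named on line 1 or 2, or None if there is none."""
    for line in source.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return None


def add_declaration(source: bytes, encoding: str) -> bytes:
    """Prepend a coding line unless the source already declares one."""
    if declared_encoding(source) is not None:
        return source
    cookie = ("# -*- coding: %s -*-\n" % encoding).encode("ascii")
    lines = source.splitlines(keepends=True)
    # A shebang has to stay on line 1; the declaration may go on line 2.
    if lines and lines[0].startswith(b"#!"):
        first = lines[0]
        if not first.endswith((b"\n", b"\r")):
            first += b"\n"
        return first + cookie + b"".join(lines[1:])
    return cookie + source


if __name__ == "__main__":
    sample = b"x = 1\ns = '\xe2'\n"  # stand-in for the real script
    for lineno, col, byte in find_non_ascii(sample):
        print("line %d, column %d: byte 0x%02x" % (lineno, col, byte))
    print(add_declaration(sample, "iso-8859-1").decode("iso-8859-1"))
```

Note that 0xE2 is also the lead byte of three-byte UTF-8 sequences such as typographic quotes (for example U+2018 is E2 80 98). If the reported byte is followed by two more non-ASCII bytes, the file is probably UTF-8, and `utf-8` would be the value to declare rather than `iso-8859-1`.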