On 12 Sep, 10:17, Gary Herron <gher...@islandtraining.com> wrote:
> On 09/12/2011 12:49 AM, Alec Taylor wrote:
>
> > Good evening,
> >
> > I have converted ODT to HTML using LibreOffice Writer, because I want
> > to convert from HTML to Creole using python-creole. Unfortunately I
> > get this error: "File "Convert to Creole.py", line 17
> > SyntaxError: Non-ASCII character '\xe2' in file Convert to Creole.py
> > on line 18, but no encoding declared; see
> > http://www.python.org/peps/pep-0263.html for details".
> >
> > Unfortunately I can't post my document yet (it's a research paper I'm
> > working on), but I'm sure you'll get the same result if you write up a
> > document in LibreOffice Writer and add some End Notes.
> >
> > How do I automate the removal of all non-ascii characters from my code?
> >
> > Thanks for all suggestions,
> >
> > Alec Taylor

Character encoding is a domain in its own right. It is independent of any OS or application. When working with (plain) text files, you should always be aware of the encoding of the text you are working on.

If you use coding directives, you must make sure the directive matches the real encoding of the text file. A coding directive is only informative; it does not set the encoding. I'm pretty sure your problem comes from this: there is a mismatch somewhere that you are not aware of.

Removing non-ASCII characters is certainly not a worthwhile solution. It must work. If you are working properly, it cannot fail to work.

From a linguistic point of view, the web tells me that Creole (*all the Creoles*) can be written with the iso-8859-1 coding. That means iso-8859-1, cp1252 and all the Unicode coding variants are possible coding directives.

jmf
-- 
http://mail.python.org/mailman/listinfo/python-list
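
P.S. To illustrate the point that a coding directive only informs and does not convert anything, here is a minimal sketch. It assumes Python 3, and the helper names (declared_encoding, directive_matches) and the sample text are my own inventions for illustration, not part of python-creole or of the original script. It reads the PEP 263 directive with tokenize.detect_encoding and then checks whether the bytes really decode under that encoding. The sample uses U+2019 (a curly apostrophe) because its UTF-8 form starts with the '\xe2' byte named in the error message; whether that is the actual character in the original file is only a guess.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding named by the PEP 263 directive (or the default)."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


def directive_matches(source_bytes):
    """Return True if the bytes decode under the encoding the file declares."""
    try:
        source_bytes.decode(declared_encoding(source_bytes))
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical two-line source file containing a curly apostrophe (U+2019).
text = "# -*- coding: utf-8 -*-\ns = 'it\u2019s'\n"

good = text.encode("utf-8")   # bytes agree with the directive
bad = text.encode("cp1252")   # directive still says utf-8, bytes are cp1252

print(declared_encoding(good))   # utf-8
print(directive_matches(good))   # True
print(directive_matches(bad))    # False
```

One caveat on this check: iso-8859-1 assigns a character to every byte value, so a file declared as iso-8859-1 always decodes without error, even when it was really saved in another encoding. A successful decode therefore proves nothing for that directive; you still have to know how your editor saved the file.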