On Sep 12, 4:49 am, Steven D'Aprano <steve +comp.lang.pyt...@pearwood.info> wrote:
> On Mon, 12 Sep 2011 06:43 pm Stefan Behnel wrote:
>
> > I'm not sure what you are trying to say with the above code, but if it's
> > the code that fails for you with the exception you posted, I would guess
> > that the problem is in the "[more stuff here]" part, which likely contains
> > a non-ASCII character. Note that you didn't declare the source file
> > encoding above. Do as Gary told you.
>
> Even with a source code encoding, you will probably have problems with
> source files including \xe2 and other "bad" chars. Unless they happen to
> fall inside a quoted string literal, I would expect to get a SyntaxError.
>
> I have come across this myself. While I haven't really investigated in great
> detail, it appears to happen when copying and pasting code from a document
> (usually HTML) which uses non-breaking spaces instead of \x20 space
> characters. All it takes is just one to screw things up.
>
> --
> Steven

Depending on the load, you can strip everything outside the ASCII range
with something like:

    "".join([x for x in string if ord(x) < 128])

It's worked great for me in cleaning input on webapps where there's a
lot of copy/paste from varied sources.
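
For what it's worth, here is a rough sketch of that one-liner wrapped in functions, assuming Python 3 `str` input. The names `strip_non_ascii` and `normalize_spaces_then_strip` are made up for illustration and are not from this thread. The second function is a guess at what you might want for the non-breaking-space case Steven describes: it swaps U+00A0 for a plain space before filtering, so pasted code does not lose its word separation.

```python
def strip_non_ascii(text):
    """Drop every character whose code point is 128 or higher."""
    return "".join(ch for ch in text if ord(ch) < 128)


def normalize_spaces_then_strip(text):
    """Turn non-breaking spaces (U+00A0) into plain spaces, then drop
    whatever other non-ASCII characters remain."""
    return strip_non_ascii(text.replace("\u00a0", " "))


# Hypothetical pasted snippet that uses non-breaking spaces around "=".
pasted = "x\u00a0=\u00a01"

print(repr(strip_non_ascii(pasted)))              # non-breaking spaces are deleted outright
print(repr(normalize_spaces_then_strip(pasted)))  # non-breaking spaces become plain spaces
```

Keep in mind that either version silently throws away legitimate non-ASCII text (accented names and so on), so it only makes sense where the input really is supposed to be plain ASCII.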