> On Behalf Of Fabian Lopez
> like [mis-encoded text] or ヘアアイロン... The problem is that I get
Just thought I'd point out here that the second string is Japanese, not Chinese.

From your second post, it appears that you've parsed the text without problems -- it's when you go to print it out that you get the error. This is no doubt because your default encoding can't handle Chinese/Japanese characters.

I can imagine several ways to fix this, including encoding the text in utf-8 for printout.

If you really want to strip out Asian characters, here's a way:

    def strip_asian(text):
        """Returns the Unicode string text, minus any Asian characters"""
        # Keeps only code points below U+3000 (the start of the CJK Symbols
        # and Punctuation block); everything at or above it is dropped.
        return u''.join([x for x in text if ord(x) < 0x3000])

Regards,
Ryan Ginstrom
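
For the "encode in utf-8 for printout" option mentioned above, here is a minimal sketch. It assumes Python 3, and the helper names encode_for_output and escape_unencodable are made up for illustration; they are not from the original post or from any library. The first helper encodes the text explicitly so nothing depends on the default encoding, and the second keeps the output ASCII-safe by escaping whatever the target encoding can't represent instead of raising UnicodeEncodeError.

```python
def encode_for_output(text, encoding="utf-8"):
    """Return text as bytes in an encoding that can represent CJK characters.

    Hypothetical helper: the bytes can be written to a file opened in binary
    mode, or to sys.stdout.buffer on Python 3.
    """
    return text.encode(encoding)


def escape_unencodable(text, encoding="ascii"):
    """Return a str that is safe to print with a limited encoding.

    Hypothetical helper: characters the encoding can't represent are replaced
    by backslash escapes rather than triggering UnicodeEncodeError.
    """
    return text.encode(encoding, "backslashreplace").decode(encoding)


# Usage (kept as comments so nothing is printed on import):
#   import sys
#   sys.stdout.buffer.write(encode_for_output(u"ヘアアイロン") + b"\n")
#   print(escape_unencodable(u"iron: ヘアアイロン"))
```

Which of the two fits better depends on whether the terminal or file on the receiving end can actually display utf-8; if it can't, the escaped form at least keeps the information without crashing.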
-- http://mail.python.org/mailman/listinfo/python-list