After looking at the pyparsing results, I think I see the problem with your original code. You are selecting only the characters after the rightmost "-" character, but you really want to select everything to the right of "- -". In some of the titles, the encoded Chinese itself includes a "-" character, so you are chopping off everything before that last "-", including part of the title text.
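To illustrate the difference, here is a small sketch. The title string is made up for the example, and I am only guessing that the original code did something like split("-")[-1]; your actual data and code may look different.

```python
# Hypothetical title; the real ones come from your data.
full_title = "Example Site - - %E4%B8%AD-%E6%96%87"

# Assumed shape of the original approach: keep only what follows the last "-".
after_last_dash = full_title.split("-")[-1]

# Splitting on the "- -" separator keeps the whole right-hand part,
# including any "-" inside the encoded text.
after_separator = full_title.split("- -")[1]

print(repr(after_last_dash))
print(repr(after_separator))
```

Note that the second result keeps the space that follows "- -"; calling .strip() on it would remove that if it matters.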
Try changing your code to:

    title = full_title.split("- -")[1]

I think your original program will then work.

-- Paul