Liu DongMiao added the comment:
I don't think this should be considered a bug.
Since we don't know the encoding of a str, we cannot handle str and
unicode together.
In my example the str is UTF-8 encoded, so I need to convert the unicode
to a UTF-8 encoded str.
I will take bones' suggestion.
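
For illustration, the conversion described above could look roughly like the
sketch below. It is written for Python 3, where the UTF-8 data is a bytes
object, and it is not HTMLParser's own code: the helper name
replace_charrefs_utf8 and the regular expression are assumptions made up for
this example. It only covers numeric character references and does not guard
against invalid code points.

```python
import re

# Hypothetical helper, not part of HTMLParser. It expands numeric character
# references (&#20013; or &#x6587;) inside a byte string that the caller
# knows to be UTF-8 encoded.
_CHARREF = re.compile(rb"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def replace_charrefs_utf8(data):
    def repl(match):
        hex_digits, dec_digits = match.groups()
        if hex_digits is not None:
            codepoint = int(hex_digits.decode("ascii"), 16)
        else:
            codepoint = int(dec_digits.decode("ascii"))
        # Encode the character as UTF-8 so it can be joined with the
        # surrounding bytes. In Python 2 this step would roughly be
        # unichr(codepoint).encode("utf-8") (an assumption, not the
        # patch proposed in this issue).
        return chr(codepoint).encode("utf-8")

    return _CHARREF.sub(repl, data)


# Non-ASCII UTF-8 data mixed with character references.
raw = "caf\u00e9 &#20013;&#x6587;".encode("utf-8")
result = replace_charrefs_utf8(raw)
```

The caller has to know that the data is UTF-8, which is the point made in the
comment above: the parser itself cannot guess the encoding of a str.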
--
status:
New submission from Liu DongMiao:
HTMLParser (Python 2.6.2) cannot handle a mixture of arbitrary data and
character references.
In lines 365-373, replaceEntities(s) returns unichr(charref) as unicode,
which cannot be mixed with arbitrary data held in a str.
A possible fix: replace unichr(c) with