Bugs item #1452246, was opened at 2006-03-17 06:57
Message generated for change (Comment added) made by tim_one
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1452246&group_id=5470

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.4
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Helmut Grohne (gnarfk)
Assigned to: Nobody/Anonymous (nobody)
Summary: htmllib doesn't properly substitute entities

Initial Comment:
I'd like to illustrate and suggest a fix by showing a simple
Python file (which was named htmllib2.py so you can uncomment
the line in the doctest case to see that my fix works). It's
more of a hack than a proper fix, though:

#!/usr/bin/env python2.4
"""
Use this instead of htmllib to have entitydefs substituted in
attributes, too.

Example:
>>> import htmllib
>>> # import htmllib2 as htmllib
>>> import formatter
>>> import StringIO
>>> s = StringIO.StringIO()
>>> p = htmllib.HTMLParser(formatter.AbstractFormatter(formatter.DumbWriter(s)))
>>> p.feed('<img alt="&lt;&gt;&amp;">')
>>> s.getvalue()
'<>&'
"""

__all__ = ("HTMLParser",)

import htmllib
from htmlentitydefs import name2codepoint as entitytable

# Map entity names to single characters; only code points below 256
# fit into a byte string.
entitytable = dict([(k, chr(v)) for k, v in entitytable.items() if v < 256])

def entitysub(s):
    ret = ""
    state = ""  # entity reference collected so far, starting at '&'
    for c in s:
        if state.startswith('&'):
            if c == ';':
                # Unknown names are passed through unchanged.
                ret += entitytable.get(state[1:], '%s;' % state)
                state = ""
            else:
                state += c
        elif c == '&':
            state = c
        else:
            ret += c
    # Keep an unterminated '&...' run instead of dropping it.
    return ret + state

class HTMLParser(htmllib.HTMLParser):
    def handle_starttag(self, tag, method, attrs):
        """Repair attribute values."""
        attrs = [(k, entitysub(v)) for (k, v) in attrs]
        method(attrs)

if __name__ == '__main__':
    import doctest
    doctest.testmod()

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2006-03-31 20:16

Message:
Logged In: YES
user_id=31435

Thanks, Ray! Closing as Fixed.

----------------------------------------------------------------------

Comment By: Rares Vernica (rvernica)
Date: 2006-03-31 20:13

Message:
Logged In: YES
user_id=1491427

This bug has been fixed on patch #1462498.

Ray
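
For readers on Python 3, where htmllib and sgmllib no longer exist, the entity substitution described in the initial comment might be sketched roughly as below. This is only an illustration of the idea, not the code from patch #1462498. It assumes Unicode strings, so the "code point below 256" restriction is dropped, and it handles named references only. The function name entitysub is taken from the initial comment; the variable names are made up for this sketch.

```python
from html.entities import name2codepoint


def entitysub(s):
    """Replace named entity references such as &lt; in s with their characters."""
    out = []
    pending = ""  # entity reference collected so far, starting at '&'
    for c in s:
        if pending:
            if c == ";":
                name = pending[1:]
                if name in name2codepoint:
                    out.append(chr(name2codepoint[name]))
                else:
                    # Unknown name: pass the reference through unchanged.
                    out.append(pending + ";")
                pending = ""
            elif c == "&":
                # A new '&' starts before the previous run was terminated:
                # emit the old run literally and start collecting again.
                out.append(pending)
                pending = "&"
            else:
                pending += c
        elif c == "&":
            pending = c
        else:
            out.append(c)
    # Keep an unterminated '&...' run at the end of the string.
    out.append(pending)
    return "".join(out)


if __name__ == "__main__":
    print(entitysub("&lt;&gt;&amp;"))
```

In current Python 3 code, the standard library function html.unescape is probably the better choice for real use, since it also handles numeric character references.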