New submission from Christopher Allen-Poole <christoph...@allen-poole.com>:
This is encountered when extending html.parser.HTMLParser and running with strict set to False.

Expected behavior:
When '''<div style="" ><b>The <a href="some_url">rain</a> <br /> in <span>Spain</span></b></div>''' is passed to the feed method, div, b, a, br, and span should all be passed to the handle_starttag method.

Actual behavior:
The handle_data method receives the values <div style="" >, <b>, <a href="some_url">, <br />, and <span> in addition to the regular text.

This can be fixed by changing this line (inside the parse_starttag method):

    m = hparse.attrfind_tolerant.search(rawdata, k)

to

    m = hparse.attrfind_tolerant.match(rawdata, k)

----------
components: Library (Lib)
messages: 146479
nosy: Christopher.Allen-Poole
priority: normal
severity: normal
status: open
title: HTMLParser improperly handling open tags when strict is False
type: behavior
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13273>
_______________________________________
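
For readers unfamiliar with the difference between the two regex calls named in the proposed fix, here is a minimal sketch. It assumes a simplified attribute pattern (attr_pattern is a hypothetical stand-in, not the real attrfind_tolerant) and a shortened input string. It only shows how search() can scan past the current tag and pick up an attribute from a later one, while match() is anchored at the given position; it is not a reproduction of the parser's internals.

```python
import re

# Hypothetical, simplified attribute pattern used only for illustration.
# This is NOT the actual html.parser.attrfind_tolerant expression.
attr_pattern = re.compile(r'\s*([a-zA-Z_][-.:a-zA-Z0-9_]*)\s*=\s*"([^"]*)"')

rawdata = '<b>The <a href="some_url">rain</a>'
k = 2  # index just past the tag name in "<b>", where attributes would begin

# search() scans forward from k, so it finds href="some_url",
# which belongs to the later <a> tag rather than to <b>.
found = attr_pattern.search(rawdata, k)

# match() succeeds only if the pattern matches starting exactly at k.
# <b> has no attributes, so there is nothing to match here.
anchored = attr_pattern.match(rawdata, k)

print("search:", found.group(1, 2) if found else None)
print("match: ", anchored)
```

With match(), an attribute is only recognised when it starts at the current position, which is presumably why the reporter suggests it for parse_starttag.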