BeautifulSoup 4 and the html5lib parser (HTML5parser) are known not to play well together. I have a workaround for that. See
https://bugs.launchpad.net/beautifulsoup/+bug/1430633

This isn't a fix; it's a postprocessor to fix broken BS4 trees. This is for use until the BS4 maintainers fix the bug.

John Nagle
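
To give a rough idea of what a tree postprocessor of this kind might look like, here is a minimal sketch. It is not the actual workaround, and it does not claim to describe what the linked bug corrupts. It assumes (purely for illustration) that each node's `contents` list can be trusted and that only the `parent`, `next_sibling` and `previous_sibling` links need rebuilding. Those attribute names match the ones BS4 nodes expose; `relink_tree` and the stand-in `Node` class are hypothetical names made up for this example.

```python
class Node:
    """Hypothetical stand-in for a BS4 node, so the sketch runs without bs4."""

    def __init__(self, name):
        self.name = name
        self.contents = []            # child nodes, assumed to be correct
        self.parent = None
        self.next_sibling = None
        self.previous_sibling = None


def relink_tree(node, parent=None):
    """Rebuild parent and sibling links from each node's `contents` list.

    Assumption: `contents` is authoritative. Returns the number of links
    that had to be corrected, so a second pass over a repaired tree gives 0.
    """
    fixed = 0
    if node.parent is not parent:
        node.parent = parent
        fixed += 1

    # Leaf nodes (text nodes in BS4) have no `contents` attribute.
    children = getattr(node, "contents", None) or []
    previous = None
    for child in children:
        if child.previous_sibling is not previous:
            child.previous_sibling = previous
            fixed += 1
        if previous is not None and previous.next_sibling is not child:
            previous.next_sibling = child
            fixed += 1
        fixed += relink_tree(child, node)   # recurse into the subtree
        previous = child

    # The last child must not point at a following sibling.
    if previous is not None and previous.next_sibling is not None:
        previous.next_sibling = None
        fixed += 1
    return fixed


if __name__ == "__main__":
    # Build a small tree whose links were never set, then repair it.
    root, a, b = Node("root"), Node("a"), Node("b")
    root.contents = [a, b]
    print("links corrected:", relink_tree(root))
    print("second pass:", relink_tree(root))
```

Because the function only relies on attribute names, it could in principle be tried on a tree returned by `BeautifulSoup(markup, "html5lib")`. Note that this sketch leaves the `next_element` / `previous_element` chain alone, and it recurses, so very deep trees may hit Python's recursion limit.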