Ezio Melotti <ezio.melo...@gmail.com> added the comment:

HTMLParser is supposed to follow the HTML5 standard, and never raise an error.

For the example in the first comment ("<![hi world]>"), the steps should be:

* https://html.spec.whatwg.org/multipage/parsing.html#data-state:tag-open-state
* https://html.spec.whatwg.org/multipage/parsing.html#tag-open-state:markup-declaration-open-state
* https://html.spec.whatwg.org/multipage/parsing.html#markup-declaration-open-state:bogus-comment-state
* https://html.spec.whatwg.org/multipage/parsing.html#bogus-comment-state
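
As a rough illustration of where those steps end up, here is a minimal sketch of 
the bogus comment handling for input that starts with "<!" and is neither a 
comment nor a DOCTYPE.  The helper name `scan_bogus_comment` is hypothetical and 
is not part of html.parser or _markupbase; it only models the spec's rule that 
everything up to the next ">" (or end of input) becomes comment data.

```python
def scan_bogus_comment(rawdata, i):
    """Hypothetical helper: model the HTML5 bogus comment state.

    `i` must point at a "<!" that did not open a real comment or DOCTYPE.
    Returns (comment_data, index_after_comment).
    """
    if rawdata[i:i + 2] != "<!":
        raise ValueError("expected '<!' at position %d" % i)
    start = i + 2
    end = rawdata.find(">", start)
    if end == -1:
        # End of input: the comment runs to the end of the data.
        data, next_pos = rawdata[start:], len(rawdata)
    else:
        # ">" closes the comment and is not part of its data.
        data, next_pos = rawdata[start:end], end + 1
    # The spec replaces NULL characters with U+FFFD inside the comment.
    return data.replace("\0", "\ufffd"), next_pos


data, pos = scan_bogus_comment("<![hi world]>", 0)
print(repr(data), pos)
```

Under these assumptions the example from the first comment becomes a comment 
whose data is "[hi world]", and no error is raised.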

I agree that the error should be fixed by setting `match` to None, and a test 
case that triggers the UnboundLocalError (before the fix) should be added as 
well (the one provided by Karthikeyan looks good).

However, it also seems wrong that HTMLParser still ends up calling self.error() 
through ParserBase in Lib/_markupbase.py, after HTMLParser.error() and all the 
calls to it have been removed.  _markupbase.py is internal, so it should be 
safe to remove ParserBase.error() and the code that calls it as suggested in 
#31844 (and possibly to merge _markupbase into html.parser too).  Even if this 
is done and the call to self.error() is removed from 
ParserBase.parse_marked_section(), `match` still needs to be set to None 
(either in the `else` branch or before the `if/elif` block).
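
To make the second option concrete, here is a simplified, standalone sketch of 
the control flow with `match` initialised before the `if/elif` block.  It is not 
the actual code in Lib/_markupbase.py: the function is written as a plain 
function rather than a method, and the regular expressions and the name pattern 
are assumptions made only for this illustration.

```python
import re

# Assumed patterns, for illustration only.
MARKED_SECTION_CLOSE = re.compile(r"\]\s*\]\s*>")   # standard "]]>" ending
MS_MARKED_SECTION_CLOSE = re.compile(r"\]\s*>")     # MS Office "]>" ending
SECTION_NAME = re.compile(r"[a-zA-Z][-_.a-zA-Z0-9]*\s*")


def parse_marked_section(rawdata, i):
    """Return the index just past the marked section, or -1 if none is found."""
    assert rawdata[i:i + 3] == "<![", "unexpected call to parse_marked_section()"
    name = SECTION_NAME.match(rawdata, i + 3)
    if name is None:
        return -1
    sect_name = name.group().strip().lower()

    # Initialise `match` up front so an unknown keyword can never leave it
    # unbound when it is tested below.
    match = None
    if sect_name in {"temp", "cdata", "ignore", "include", "rcdata"}:
        match = MARKED_SECTION_CLOSE.search(rawdata, i + 3)
    elif sect_name in {"if", "else", "endif"}:
        match = MS_MARKED_SECTION_CLOSE.search(rawdata, i + 3)
    # No `else` branch calling self.error(): unknown keywords fall through.

    if not match:
        return -1
    return match.end(0)


print(parse_marked_section("<![hi world]>", 0))
print(parse_marked_section("<![CDATA[x]]>", 0))
```

With this shape, an unknown keyword such as "hi" simply yields -1 instead of 
raising UnboundLocalError; what the caller should do with that result is a 
separate question that this sketch does not try to answer.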

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34480>
_______________________________________
_______________________________________________
Python-bugs-list mailing list