balaji marisetti wrote:

> Hi,
>
> I'm trying to parse a piece of HTML code using `html.parser` in Python3.
> I want to find out the offset of a particular end tag (let's say </p>) and
> then stop processing
> the remaining HTML code immediately. So I wrote something like this.
>
> [code]
> def handle_endtag(self, tag):
>     if tag == mytag:
>         #do something
>         self.reset()
> [code]
>
> I called `reset()` method at the end of `handle_endtag()` method. Now the
> problem is: when I call parser.feed("some html"), it's giving an
> "AssertionError" exception. Isn't the `reset()` method
> supposed to be called inside "handler" methods?

Obviously not ;)

After looking into the code I think there is no controlled way to stop
parsing. I suggest that you raise a custom exception instead:

    import html.parser

    class TagFound(Exception):
        pass

    class MyParser(html.parser.HTMLParser):
        def handle_endtag(self, tag):
            if tag == wanted_tag:
                raise TagFound

    wanted_tag = "a"

    parser = MyParser()
    for data in ["<html><body><a></a></body></html>",
                 "<html><body><b></b></body></html>"]:
        try:
            parser.feed(data)
        except TagFound:
            print("tag {!r} found".format(wanted_tag))
        else:
            print("tag {!r} not found".format(wanted_tag))
        parser.reset()
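
Since the original question also asked for the offset of the end tag, here is a rough sketch of how the same exception trick could carry the position along. It relies on `HTMLParser.getpos()`, which returns the current (line number, offset) pair; the names `OffsetParser` and `find_end_tag_pos` are made up for this illustration, and I am assuming that the position reported inside `handle_endtag()` is where the end tag starts.

```python
from html.parser import HTMLParser


class TagFound(Exception):
    """Raised to abort parsing; carries the position of the end tag."""

    def __init__(self, pos):
        super().__init__(pos)
        self.pos = pos


class OffsetParser(HTMLParser):
    # Hypothetical helper class, not part of the standard library.
    def __init__(self, wanted_tag):
        super().__init__()
        # The parser reports tag names in lower case, so pass "p", not "P".
        self.wanted_tag = wanted_tag

    def handle_endtag(self, tag):
        if tag == self.wanted_tag:
            # getpos() returns (lineno, offset): lineno starts at 1,
            # offset is the 0-based column within that line.
            raise TagFound(self.getpos())


def find_end_tag_pos(data, wanted_tag):
    """Return (lineno, offset) of the first matching end tag, or None."""
    # A fresh parser per call, so no reset() is needed after the exception.
    parser = OffsetParser(wanted_tag)
    try:
        parser.feed(data)
        parser.close()
    except TagFound as exc:
        return exc.pos
    return None


if __name__ == "__main__":
    html = "<html><body><p>hi</p><p>more</p></body></html>"
    print(find_end_tag_pos(html, "p"))
    print(find_end_tag_pos("<b></b>", "p"))
```

Note that the result is a line/column pair, not a single character index into the whole document; if you need the latter you would have to convert it yourself.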