Martin Hosken <martin_hos...@sil.org> added the comment:
Sorry. This test is rather long because it is 3 tests:

```
from __future__ import print_function
import sys
import xml.etree.ElementTree as et
import xml.etree.cElementTree as cet
from io import StringIO

teststr = u"""<?xml version="1"?>
<root>
    <child>
        Hello
        <!-- Greeting -->
        World
    </child>
</root>"""

testf = StringIO(teststr)

# Test 'a': ask iterparse for comment events directly
if len(sys.argv) >= 2 and 'a' in sys.argv[1]:
    testf.seek(0)
    for event, elem in et.iterparse(testf, events=["end", "comment"]):
        if event == 'end':
            print(elem.tag + ": " + str(elem.text))
        elif event == 'comment':
            print("comment: " + elem.text)

# Test 'b': hook the expat CommentHandler and fake a "!--" element
if len(sys.argv) < 2 or 'b' in sys.argv[1]:
    testf.seek(0)
    def doComment(data):
        parser.parser.StartElementHandler("!--", ('text', data))
        parser.parser.EndElementHandler("!--")
    parser = et.XMLParser()
    parser.parser.CommentHandler = doComment
    for event, elem in et.iterparse(testf, parser=parser):
        if hasattr(elem, 'text'):
            print(elem.tag + ": " + str(elem.text))
        else:
            print(elem.tag + ": " + elem.get('text', ""))

# Tests 'c' and 'd': TreeBuilder subclass with a comment() method
if len(sys.argv) < 2 or 'c' in sys.argv[1] or 'd' in sys.argv[1]:
    testf.seek(0)
    useet = et if len(sys.argv) < 2 or 'c' in sys.argv[1] else cet
    class CommentingTb(useet.TreeBuilder):
        def __init__(self):
            self.parser = None
        def comment(self, data):
            self.parser.parser.StartElementHandler("!--", ('text', data))
            self.parser.parser.EndElementHandler("!--")
    tb = CommentingTb()
    parser = useet.XMLParser(target=tb)
    tb.parser = parser
    kw = {'parser': parser} if len(sys.argv) < 2 or 'c' in sys.argv[1] else {}
    for event, elem in useet.iterparse(testf, **kw):
        if hasattr(elem, 'text'):
            print(elem.tag + ": " + str(elem.text))
        else:
            print(elem.tag + ": " + elem.get('text', ""))
```

Test 'a' is how I would like to write the solution to my problem. Not sure why 'comment' isn't supported by iterparse directly, but hey.
Test 'b' is how I solved it in python2.
Test 'c' is how I would have to solve it in python3 if it worked.
Test 'd' is the same as 'c' but uses cElementTree rather than ElementTree.
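For reference, the ElementTree documentation notes that "comment" and "pi" events were added to iterparse in Python 3.8, so on 3.8 or later something along the lines of test 'a' should run. The following is only a sketch under that assumption, not part of the original test script; `TEST_XML` and `collect_events` are names made up for illustration, and the document is a compact stand-in shaped like the one in the test.

```python
import io
import xml.etree.ElementTree as ET

# Compact stand-in for the test document (hypothetical, not the original string).
TEST_XML = "<root><child>Hello<!-- Greeting -->World</child></root>"


def collect_events(xml_text):
    """Return (event, value) pairs from iterparse.

    Assumes Python 3.8+, where iterparse accepts the "comment" event.
    For a comment the value is the comment text; for an end event it
    is the element tag.
    """
    seen = []
    source = io.StringIO(xml_text)
    for event, elem in ET.iterparse(source, events=("end", "comment")):
        if event == "comment":
            seen.append((event, elem.text))
        else:
            seen.append((event, elem.tag))
    return seen


for event, value in collect_events(TEST_XML):
    print(event + ": " + value)
```

On versions before 3.8 this is expected to fail in the same way as test 'a', so it does not help with the Python 3 regression described here.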
Results:

Success output for a test is:

```
!--: None
child: Hello
root:
```

Python2:
  a  Fails (obviously)
  b  Succeeds
  c  Succeeds
  d  Fails: can't inherit from cElementTree.TreeBuilder

Python3:
  a  Fails (obviously)
  b  Fails: XMLParser has no attribute 'parser'
  c  Fails: event handling only supported for ElementTree.TreeBuilder targets
  d  Fails: Gives output but no initial comment component (line 1)

The key failure here is Python3 'c'. This is what stops any hope of comment handling using the et.XMLParser. The only way I could get around it was to use my own copy from the source code.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34600>
_______________________________________