Public bug reported:

Given a minimal parser (below) and a particular input file (attached), iterparse is not returning the `tail` of the last `<span>` tag.
I am listening for the `end` event, which is the default, instead of the `start` event.

Changing the input, for example by deleting unrelated tags such as the `<link>` tag in the `<head>`, causes the missing text to reappear. This makes it hard to produce a minified input! I was able to remove everything /after/ the element with the missing tail, which doesn't affect the bug, so that is what I attached.

I took the silence on the mailing list to mean that I did not have any obvious problems with the way I was using iterparse. :)

https://mailman-mail5.webfaction.com/pipermail/lxml/2017-April/007882.html

---

```python
#!/usr/bin/env python3

import sys

from lxml import etree

for _, element in etree.iterparse(sys.argv[1], html=True):
    print((
        element.tag,
        element.attrib,
        element.text,
        element.tail,
    ))
```

Invoke by:

```sh
$ ./bug.py bug.html | grep "splays their blue cards left"
```

Expected output:

```
('span', {'class': 'age e'}, '4', '.\n... Nnastya splays their blue cards left.\n')
```

Actual output: none, and return code 1.

---

Python              : sys.version_info(major=3, minor=5, micro=2, releaselevel='final', serial=0)
lxml.etree          : (3, 7, 3, 0)
libxml used         : (2, 9, 3)
libxml compiled     : (2, 9, 3)
libxslt used        : (1, 1, 29)
libxslt compiled    : (1, 1, 29)

** Affects: lxml
     Importance: Undecided
         Status: New

** Affects: lxml (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "Somewhat minified input that triggers bug"
   https://bugs.launchpad.net/bugs/1684273/+attachment/4865114/+files/bug.html

** Also affects: lxml (Ubuntu)
   Importance: Undecided
       Status: New

https://bugs.launchpad.net/bugs/1684273

Title:
  Missing tail in iterparse
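
As a diagnostic aid for the symptom described in this report, the sketch below shows one way a `tail` can be absent when an `end` event is delivered: the text after the closing tag has not been attached to the element yet, because the parser has not reached the next tag. It uses the standard library's `xml.etree.ElementTree.XMLPullParser`, not lxml, and the helper name `tail_at_end_and_after` is hypothetical. Whether a similar effect (for example, a read-chunk boundary falling right after the `</span>`) is behind the lxml behaviour reported here is an assumption that this report does not confirm.

```python
import xml.etree.ElementTree as ET


def tail_at_end_and_after(first_chunk, rest, tag):
    """Return (tail seen at the 'end' event, tail once parsing has finished).

    Hypothetical helper for illustration only; it feeds the document in two
    pieces to mimic a chunk boundary that falls inside the tail text.
    """
    parser = ET.XMLPullParser(events=("end",))
    parser.feed(first_chunk)

    # Take the first 'end' event for the requested tag from what was parsed so far.
    element = next(el for _, el in parser.read_events() if el.tag == tag)
    tail_at_end = element.tail

    # Feed the remainder and finish parsing; the same element object is updated.
    parser.feed(rest)
    parser.close()
    return tail_at_end, element.tail


if __name__ == "__main__":
    # The first chunk stops in the middle of the text that follows </span>.
    print(tail_at_end_and_after("<p><span>4</span>. cards left.", "</p>", "span"))
```

If the same applies to the attached input, keeping a reference to each element and reading `element.tail` only after a later event (or after the loop has finished) would be one way to check it against the script in the report.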