New submission from Vojtěch Rylko <vojta.ry...@seznam.cz>:

Hi,
I have a file with 10 000 records of the same item element (always identical):

$ head test.xml
<channel>
<item><section>Twitter</section></item>
<item><section>Twitter</section></item>
<item><section>Twitter</section></item>
<item><section>Twitter</section></item>
<item><section>Twitter</section></item>
<item><section>Twitter</section></item>
<item><section>Twitter</section></item>
<item><section>Twitter</section></item>
<item><section>Twitter</section></item>

I run a simple program that prints the content of the section element:

$ python pulldom.py test.xml | head
Twitter
Twitter
Twitter
Twitter
Twitter
Twitter
Twitter
Twitter
Twitter
Twitter

It seems to work fine:

$ python pulldom.py test.xml | wc -l
10000

But in two cases out of 10 000 it gives me just "Twi" instead of "Twitter":

$ python pulldom.py test.xml | grep -v Twitter
Twi
Twi

Why? This example program demonstrates a big problem in my real application: xml.dom.pulldom is cutting off the content of some elements.

Thanks for any advice,
Vojta Rylko

---------------------------
Python 2.5.4 (r254:67916, Feb 10 2009, 14:58:09)
[GCC 4.2.4] on linux2
---------------------------
pulldom.py:
---------------------------
import sys
from xml.dom import pulldom

file = open(sys.argv[1])
events = pulldom.parse(file)
for event, node in events:
    if event == pulldom.START_ELEMENT:
        if node.tagName == 'item':
            # Pull the whole <item> subtree into a DOM node, then print
            # the first text child of its first <section> element.
            events.expandNode(node)
            print node.getElementsByTagName('section').item(0).firstChild.data

----------
components: XML
messages: 117999
nosy: vojta.rylko
priority: normal
severity: normal
status: open
title: xml.dom.pulldom strange behavior
type: behavior
versions: Python 2.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10026>
_______________________________________
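
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the parser reads the file in fixed-size chunks, and when a chunk boundary falls inside a run of text, that text can arrive as two adjacent text nodes, so firstChild.data holds only the first piece ("Twi"). Below is a minimal sketch under that assumption. It is written for Python 3 rather than the Python 2.5 used in the report, build_sample and section_texts are hypothetical names, and the in-memory document only stands in for test.xml.

```python
import io
from xml.dom import pulldom


def build_sample(count=10000):
    """Build an in-memory stand-in for test.xml (hypothetical helper)."""
    body = b"<item><section>Twitter</section></item>\n" * count
    return io.BytesIO(b"<channel>\n" + body + b"</channel>\n")


def section_texts(stream):
    """Yield the complete text of the first <section> in every <item>."""
    events = pulldom.parse(stream)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "item":
            events.expandNode(node)
            section = node.getElementsByTagName("section").item(0)
            # Join every text child instead of reading only firstChild.data,
            # in case the text was delivered in more than one piece.
            yield "".join(
                child.data
                for child in section.childNodes
                if child.nodeType == child.TEXT_NODE
            )
```

Calling node.normalize() right after events.expandNode(node) should have a similar effect, since it merges adjacent text nodes before firstChild.data is read.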