[issue2892] improve cElementTree iterparse error handling

Hrvoje Nikšić Fri, 11 Jun 2010 01:54:39 -0700

Hrvoje Nikšić <hnik...@gmail.com> added the comment:

Here is a small test case that demonstrates the problem, expected behavior and 
actual behavior:


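{{{
import xml.etree.cElementTree
from StringIO import StringIO

# The document element is followed by junk, so parsing must eventually fail.
for ev in xml.etree.cElementTree.iterparse(StringIO('<x></x>rubbish'),
                                           events=('start', 'end')):
    print ev
}}}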

The above code should first print the two events (start and end) and then 
raise the exception.  In Python 2.7, however, the exception is raised before 
either event is delivered:

{{{
>>> for ev in xml.etree.cElementTree.iterparse(StringIO('<x></x>rubbish'),
...                                            events=('start', 'end')):
...   print ev
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 84, in next
cElementTree.ParseError: junk after document element: line 1, column 7
}}}

Expected behavior, obtained with my patch, is that it runs like this:

{{{
>>> for ev in my_iterparse(StringIO('<x></x>rubbish'), events=('start', 'end')):
...  print ev
... 
('start', <Element 'x' at 0xb771cba8>)
('end', <Element 'x' at 0xb771cba8>)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 26, in __iter__
cElementTree.ParseError: junk after document element: line 1, column 7
}}}
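
The idea behind this behavior can be sketched as follows. This is not the 
patch itself, only a rough illustration under stated assumptions: it is 
written for Python 3 and xml.etree.ElementTree rather than cElementTree, and 
the names deferred_iterparse and _EventRecorder are hypothetical. The 
generator records events while a chunk is parsed, hands out everything 
recorded so far, and only then re-raises a pending ParseError.

```python
import io
import xml.etree.ElementTree as ET


class _EventRecorder:
    """Parser target (hypothetical helper) that records start/end events."""

    def __init__(self):
        self.events = []
        self._builder = ET.TreeBuilder()

    def start(self, tag, attrib):
        elem = self._builder.start(tag, attrib)
        self.events.append(("start", elem))

    def end(self, tag):
        elem = self._builder.end(tag)
        self.events.append(("end", elem))

    def data(self, text):
        self._builder.data(text)

    def close(self):
        return self._builder.close()


def deferred_iterparse(source, chunk_size=16 * 1024):
    """Yield (event, element) pairs; raise a parse error only afterwards."""
    recorder = _EventRecorder()
    parser = ET.XMLParser(target=recorder)
    finished = False
    while not finished:
        chunk = source.read(chunk_size)
        error = None
        try:
            if chunk:
                parser.feed(chunk)
            else:
                parser.close()
                finished = True
        except ET.ParseError as exc:
            # Hold the error back until the recorded events are delivered.
            error = exc
        while recorder.events:
            yield recorder.events.pop(0)
        if error is not None:
            raise error
```

Iterating over deferred_iterparse(io.BytesIO(b'<x></x>rubbish')) is meant to 
deliver the start and end events for x first and raise ParseError on the 
following step.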

The difference is, of course, only visible when events are processed as they 
arrive, for example by printing them.  A side-effect-free operation, such as 
building a list with list(iterparse(...)), behaves exactly the same before and 
after the change: the exception propagates and no list is returned.
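
The two consumption styles can be compared with a small sketch. It assumes 
Python 3 and xml.etree.ElementTree instead of cElementTree, and the helper 
names consume_incrementally and consume_as_list are hypothetical. How many 
events the incremental loop records before the error depends on whether the 
interpreter in use delivers events ahead of the exception; the list-based 
variant yields no result either way.

```python
import io
import xml.etree.ElementTree as ET

DOC = b"<x></x>rubbish"


def consume_incrementally(data):
    """Record each event as it arrives; return (events seen, error)."""
    seen = []
    error = None
    try:
        for event, elem in ET.iterparse(io.BytesIO(data),
                                        events=("start", "end")):
            seen.append((event, elem.tag))  # side effect per event
    except ET.ParseError as exc:
        error = exc
    return seen, error


def consume_as_list(data):
    """Build the whole list at once; return (list or None, error)."""
    try:
        events = list(ET.iterparse(io.BytesIO(data),
                                   events=("start", "end")))
        return events, None
    except ET.ParseError as exc:
        # The partially built list is discarded when the exception propagates.
        return None, exc
```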

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue2892>
_______________________________________