I'm having some trouble inserting elements where I want them using the lxml ElementTree (Python 2.6). I presume I'm making some wrong assumptions about how lxml works and I'm hoping someone can clue me in.
I want to process an xml document as follows: For every occurrence of a particular element, no matter where it appears in the tree, I want to add a sibling to that element with the same name and a different value.

Here's the smallest artificial example I've found so far that demonstrates the problem:

<foo>
  <whatever>
    <something/>
  </whatever>
  <bingo>Add another bingo after this</bingo>
  <bar/>
</foo>

What I'd like to produce is this:

<foo>
  <whatever>
    <something/>
  </whatever>
  <bingo>Add another bingo after this</bingo>
  <bingo>New bingo 1</bingo>
  <bar/>
</foo>

Here's my program:

-------- cut here -----
from lxml import etree as etree

xml = """<?xml version="1.0" ?>
<foo>
  <whatever>
    <something/>
  </whatever>
  <bingo>Add another bingo after this</bingo>
  <bar/>
</foo>
"""

tree = etree.fromstring(xml)

# A list of all "bingo" element objects in the unmodified original xml
# There's only one in this example
elems = tree.xpath("//bingo")

# For each one, insert a sibling after it
bingoCounter = 0
for elem in elems:
    parent = elem.getparent()
    subIter = parent.iter()
    pos = 0
    for subElem in subIter:
        # Is it one we want to create a sibling for?
        if subElem == elem:
            newElem = etree.Element("bingo")
            bingoCounter += 1
            newElem.text = "New bingo %d" % bingoCounter
            newElem.tail = "\n"
            parent.insert(pos, newElem)
            break
        pos += 1

newXml = etree.tostring(tree)
print("")
print(newXml)
-------- cut here -----

The output follows:

-------- output -----
<foo>
  <whatever>
    <something/>
  </whatever>
  <bingo>Add another bingo after this</bingo>
  <bar/>
<bingo>New bingo 1</bingo>
</foo>
-------- output -----

Setting aside the whitespace issues, the bug in the program shows up in the positioning of the insertion. I wanted and expected the new element to appear immediately after the original "bingo" element and before the "bar" element, but it appeared after the "bar" instead.

Everything works if I take the "something" element out of the original input document. The new "bingo" appears before the "bar".
But when I put it back in, the inserted bingo is out of order. Why should that be? What am I misunderstanding? Is there a more intelligent way to do what I'm trying to do?

Thanks.
Alan
--
http://mail.python.org/mailman/listinfo/python-list
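
For what it's worth, a likely explanation is that iter() walks the parent itself plus every descendant in document order, while insert() expects an index into the parent's direct children only. The counter in the program above therefore runs ahead of the real child position as soon as an earlier sibling has children of its own. Below is a minimal sketch of one way around that, not the only one. The helper name add_bingo_siblings and the whitespace-free XML constant are made up for illustration, and the code sticks to the ElementTree API that lxml shares with the standard library, so it looks up parents by walking the tree rather than calling getparent().

```python
try:
    from lxml import etree
except ImportError:  # fall back to the standard library's compatible API
    import xml.etree.ElementTree as etree

# Same structure as the example document, without the whitespace.
XML = (
    "<foo>"
    "<whatever><something/></whatever>"
    "<bingo>Add another bingo after this</bingo>"
    "<bar/>"
    "</foo>"
)


def add_bingo_siblings(root, tag="bingo"):
    """Insert a new <tag> element directly after every existing <tag>.

    Hypothetical helper for illustration only.
    """
    # Collect (parent, child) pairs first so the tree is not modified
    # while it is being walked.
    targets = [
        (parent, child)
        for parent in root.iter()
        for child in parent  # iterating an element yields direct children only
        if child.tag == tag
    ]
    for counter, (parent, child) in enumerate(targets, start=1):
        new_elem = etree.Element(tag)
        new_elem.text = "New %s %d" % (tag, counter)
        # Index among the parent's direct children, which is what
        # insert() expects.
        position = list(parent).index(child)
        parent.insert(position + 1, new_elem)
    return root


root = etree.fromstring(XML)

# Why the original counter was off: iter() yields <foo> itself and every
# descendant, so <bingo> sits at a different index there than it does
# among <foo>'s direct children.
bingo = root.find("bingo")
depth_first_pos = list(root.iter()).index(bingo)
child_pos = list(root).index(bingo)

add_bingo_siblings(root)
child_tags = [child.tag for child in root]
```

With lxml specifically, parent.index(elem) should give the same child position directly, so parent.insert(parent.index(elem) + 1, newElem) would be the shorter form inside the original loop. Removing "something" only appeared to fix things because counting the parent itself happened to offset the position by exactly one.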