Stefan, I'm honored by your response. You are correct about the bad XML. I had shortened the XML for this example because the real document contains other tags unrelated to this issue. Based on your feedback, I was able to write the following fully functional code using some different techniques:

from lxml import etree
from StringIO import StringIO
import random

sourceXml = "\
<theroot>\
    <contents>Stefan's fortune cookie:</contents>\
    <random>\
        <item>\
            <random>\
                <item>\
                    <contents>You will always know love.</contents>\
                </item>\
                <item>\
                    <contents>You will spend it all in one place.</contents>\
                </item>\
            </random>\
        </item>\
        <item>\
            <contents>Your life comes with a lifetime warranty.</contents>\
        </item>\
    </random>\
    <contents>The end.</contents>\
</theroot>"

parser = etree.XMLParser(ns_clean=True, recover=True,
                         remove_blank_text=True, remove_comments=True)
tree = etree.parse(StringIO(sourceXml), parser)
xml = tree.getroot()

def reduceRandoms(xml):
    for elem in xml:
        if elem.tag == "random":
            # Replace the <random> with the first child of a randomly chosen <item>,
            # then rescan, since the replacement may itself be a <random>.
            elem.getparent().replace(elem, random.choice(elem)[0])
            reduceRandoms(xml)

reduceRandoms(xml)
for elem in xml:
    print elem.tag, ":", elem.text

One challenge I face now is that I can only replace a <random> element with a single element. This isn't a problem if an <item> element has only one <contents> element, or just one <random> element (which works above). However, if an <item> element has more than one child, such as a <contents> element followed by a <random> element (like the children of <theroot>), only the first child is used.

Any thoughts on how to replace and then append after the replaced element, or clear and then append multiple elements at the cleared position?

Thanks again :)
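
On the multi-element question, one possible approach is to remove the <random> element and then insert every child of the chosen <item> at the index it occupied. The sketch below is only an illustration under a few assumptions: the helper name splice_randoms and its choose parameter are hypothetical, the sample XML is a shortened variant of the one above with a single <item> per <random> so the result is predictable, and the code falls back to the standard library's xml.etree.ElementTree when lxml is not installed. It walks down from the parent, so it does not need getparent(), and it ignores tail text.

```python
import random

try:
    from lxml import etree  # the library used in this thread
except ImportError:
    # Assumption: the stdlib module is compatible for the few calls used here.
    import xml.etree.ElementTree as etree


def splice_randoms(parent, choose=random.choice):
    """Replace each <random> under parent with ALL children of one chosen <item>.

    Hypothetical helper; `choose` picks one <item> from a list.
    """
    i = 0
    while i < len(parent):
        child = parent[i]
        if child.tag == "random":
            items = [el for el in child if el.tag == "item"]
            replacements = list(choose(items)) if items else []
            parent.remove(child)
            # Insert every child of the chosen <item> where the <random> was.
            for offset, el in enumerate(replacements):
                parent.insert(i + offset, el)
            # Do not advance i: the element now at index i may be a <random> too.
        else:
            splice_randoms(child, choose)
            i += 1


root = etree.fromstring(
    "<theroot>"
    "<contents>Stefan's fortune cookie:</contents>"
    "<random><item>"
    "<contents>You will always know love.</contents>"
    "<random><item>"
    "<contents>You will spend it all in one place.</contents>"
    "</item></random>"
    "</item></random>"
    "<contents>The end.</contents>"
    "</theroot>"
)

splice_randoms(root)
for elem in root:
    print("%s : %s" % (elem.tag, elem.text))
```

Here the <item> holding a <contents> followed by a nested <random> contributes both children, so all four <contents> elements end up as direct children of <theroot>. Passing a different choose function (for example, one that always returns the first item) makes the selection repeatable for testing.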