I have XML which looks like:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE KMART SYSTEM "my.dtd">
    <LEVEL_1>
      <LEVEL_2 ATTR="hello">
        <ATTRIBUTE NAME="Property X" VALUE ="2"/>
      </LEVEL_2>
      <LEVEL_2 ATTR="goodbye">
        <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
        <LEVEL_3 ATTR="aloha">
          <ATTRIBUTE NAME="Property X" VALUE ="3"/>
        </LEVEL_3>
        <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
      </LEVEL_2>
    </LEVEL_1>

The "Property X" string appears twice, and I want to output the "path" that leads to each appearance. In this case the output would be:

    LEVEL_1 {}, LEVEL_2 {"ATTR": "hello"}, ATTRIBUTE {"NAME": "Property X", "VALUE": "2"}
    LEVEL_1 {}, LEVEL_2 {"ATTR": "goodbye"}, LEVEL_3 {"ATTR": "aloha"}, ATTRIBUTE {"NAME": "Property X", "VALUE": "3"}

My actual XML file is 2000 lines long and contains up to 8 levels of nesting.

I have tried this so far (partial code, using the xml.etree.ElementTree module):

    def get_path(data_dictionary, val, path):
        for node in data_dictionary[CHILDREN]:
            if node[CHILDREN]:
                if not path or node[TAG] != path[-1]:
                    path.append(node[TAG])
                print(CR + "recursing ...")
                get_path(node, val, path)
            else:
                for k,v in node[ATTRIB].items():
                    if v == val:
                        print("path- ",path)
                        print("---- " + node[TAG] + " " + str(node[ATTRIB]))

I'm really not even close to getting the output I am looking for. I am using Python 3.2.2.

Thank you.
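
One possible direction, as a minimal sketch rather than a finished solution: walk the ElementTree elements directly and pass the chain of ancestors down the recursion as an immutable tuple, so nothing has to be popped from a shared list on the way back up. The names SAMPLE, describe and find_paths are hypothetical, the sample omits the XML declaration and DOCTYPE for brevity, and json.dumps is used only to get double-quoted attribute output with keys in sorted order.

```python
import json
import xml.etree.ElementTree as ET

# Sample data from the post, without the XML declaration and DOCTYPE.
SAMPLE = """<LEVEL_1>
  <LEVEL_2 ATTR="hello">
    <ATTRIBUTE NAME="Property X" VALUE ="2"/>
  </LEVEL_2>
  <LEVEL_2 ATTR="goodbye">
    <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
    <LEVEL_3 ATTR="aloha">
      <ATTRIBUTE NAME="Property X" VALUE ="3"/>
    </LEVEL_3>
    <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
  </LEVEL_2>
</LEVEL_1>"""


def describe(elem):
    """Format one element as: TAG {"ATTR": "value", ...} (keys sorted)."""
    return elem.tag + " " + json.dumps(elem.attrib, sort_keys=True)


def find_paths(elem, val, ancestors=()):
    """Yield one formatted path for every element having val as an attribute value."""
    # Build a new tuple for this branch; sibling branches never see it.
    path = ancestors + (elem,)
    if val in elem.attrib.values():
        yield ", ".join(describe(e) for e in path)
    for child in elem:
        # Plain loop instead of "yield from", which needs Python 3.3+.
        for found in find_paths(child, val, path):
            yield found


root = ET.fromstring(SAMPLE)
paths = list(find_paths(root, "Property X"))
for line in paths:
    print(line)
```

For a real file, ET.parse(filename).getroot() would take the place of ET.fromstring(SAMPLE). Eight levels of nesting is far below Python's default recursion limit, so plain recursion should be fine here.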