"Full" element tag listing possible with Elementtree?

jaime . dyson Thu, 04 Sep 2008 23:31:22 -0700

Hello all,

I have the unenviable task of turning about 20K strangely formatted
XML documents from different sources into something resembling a
clean, standard, uniform format.  I like ElementTree and have been
using it to step through the documents to get a feel for their
structure.  .getiterator() gives me a depth-first traversal, but it
flattens the tree, so the hierarchy of the elements is lost.  What I'd
like is to traverse the elements while keeping track of ancestors, and
print the full chain of ancestor tags as I arrive at each node.  So,
for example, if I had a document that looked like this:


<a>
  <b att="atttag" content="b"> this is node b </b>
  <c> this is node c
    <d />
    <e> this is node e </e>
  </c>
  <f> this is node f </f>
</a>

I would want to print the following:

<a>
<a> <b>
<a> <b> text: this is node b
<a> <c>
<a> <c> text: this is node c
<a> <c> <d>
<a> <c> <e>
<a> <c> <e> text: this is node e
<a> <f>
<a> <f> text: this is node f


Is there a simple way to do this?  Any help would be appreciated.
Thanks.
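
To make the goal concrete, here is a rough sketch of the kind of
traversal I have in mind: a recursive walk that carries the ancestor
path along with it.  The name tag_listing is made up for illustration
and is not part of ElementTree, and the sketch assumes that only an
element's leading text (.text) matters, ignoring attributes and tail
text.  It is meant to produce a listing like the one above.

```python
import xml.etree.ElementTree as ET

# Sample document from the question above.
SAMPLE = """<a>
  <b att="atttag" content="b"> this is node b </b>
  <c> this is node c
    <d />
    <e> this is node e </e>
  </c>
  <f> this is node f </f>
</a>"""


def tag_listing(elem, ancestors=()):
    """Yield one line per element, prefixed by its ancestor tags.

    Hypothetical helper, not an ElementTree API. A second line is
    yielded for an element whose leading text is non-blank.
    """
    path = ancestors + ("<%s>" % elem.tag,)
    prefix = " ".join(path)
    yield prefix

    # elem.text is None for empty elements such as <d />.
    text = (elem.text or "").strip()
    if text:
        yield "%s text: %s" % (prefix, text)

    # Iterating an Element visits its direct children in document order.
    for child in elem:
        yield from tag_listing(child, path)


root = ET.fromstring(SAMPLE)
lines = list(tag_listing(root))
print("\n".join(lines))
```

Passing the path down as an immutable tuple means each branch of the
recursion sees its own ancestors without any explicit stack
bookkeeping.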

--
http://mail.python.org/mailman/listinfo/python-list