[ Fredrik Lundh ] [ ... ]
> the iterparse/clear approach works best if your XML file has a
> record-like structure. if you have toplevel records with lots of
> schnappi records in them, iterate over the records and use find
> (etc) to locate the subrecords you're interested in:

(...)

The problem is that the file looks like this:

    <data>
      <schnappi>
        <color>green</color>
        <friends>
          <friend>
            <id>Lama</id>
            <color>white</color>
          </friend>
          <friend>
            <id>mother schnappi</id>
            <color>green</color>
          </friend>
        </friends>
        <food>
          <id>human</id>
          <id>rabbit</id>
        </food>
      </schnappi>
      <schnappi>
        <!-- something interesting -->
      </schnappi>
      <!-- 60,000 more schnappis -->
    </data>

... and there is really nothing above <schnappi>. The "something
interesting" part consists of a variety of elements, and calling
findall for each of them, although possible, would probably be
impractical (say, distinguishing a <friend>'s color from the
<schnappi>'s).

Conceptually I need an "XML subtree iterator" rather than an XML
element iterator. The <schnappi> elements are the ones with a complex
internal structure, and I'd like to be able to treat my XML as a
sequence of Python objects representing <schnappi>s and their
internal structure.

[ ... ]

> (I've reorganized the code a bit to cut down on the operations. also
> note the "is" trick; iterparse returns the event strings you pass
> in, so comparing on object identities is safe)

Neat trick.

Thank you for your input,

ivr
-- 
"...but it's HDTV -- it's got a better resolution than the real
world."
                -- Fry, "When aliens attack"
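
The "XML subtree iterator" asked for above could be sketched on top of iterparse roughly as follows. This is only a minimal sketch, not code from the thread: iter_subtrees, summarize and SAMPLE are hypothetical names, the content of the second record is made up for illustration, and the sketch assumes every <schnappi> sits directly under the root element and is never nested inside another <schnappi>.

```python
import io
import xml.etree.ElementTree as ET

# Small stand-in for the real file; the second record's content is invented.
SAMPLE = b"""<data>
  <schnappi>
    <color>green</color>
    <friends>
      <friend><id>Lama</id><color>white</color></friend>
      <friend><id>mother schnappi</id><color>green</color></friend>
    </friends>
    <food><id>human</id><id>rabbit</id></food>
  </schnappi>
  <schnappi>
    <color>blue</color>
  </schnappi>
</data>"""


def iter_subtrees(source, tag):
    """Yield each complete <tag> subtree from source, one at a time."""
    context = iter(ET.iterparse(source, events=("start", "end")))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            # The whole subtree below elem has been parsed at this point.
            yield elem
            # Detach already-processed children from the root so that
            # memory use stays flat across tens of thousands of records.
            root.clear()


def summarize(schnappi):
    """Turn one <schnappi> subtree into a plain Python dict."""
    return {
        # Paths are relative to the <schnappi> element, so the record's own
        # color is not confused with the colors of its friends.
        "color": schnappi.findtext("color"),
        "friends": [
            (f.findtext("id"), f.findtext("color"))
            for f in schnappi.findall("friends/friend")
        ],
        "food": [e.text for e in schnappi.findall("food/id")],
    }


if __name__ == "__main__":
    for schnappi in iter_subtrees(io.BytesIO(SAMPLE), "schnappi"):
        print(summarize(schnappi))
```

Each yielded element is meant to be consumed before the loop advances; converting it to a plain object (as summarize does here) avoids holding on to parsed subtrees.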