Thanks for the comments and thoughts. I must admit that I have an overwhelming feeling of having just stepped into the middle of a complex, heated conversation without having heard the preamble.

(FYI, this reply is only an attempt to help those that come afterwards -- I'm not looking to advocate much of anything here.)

Fredrik's invocation of the "infoset" term led me to a couple of quick searches that clarified the state of play. Here he sets the stage for the .tail behaviour that I originally posted about:

http://effbot.org/zone/element-infoset.htm

And it looks like there have been tussles over other mismatches in expectations before, specifically around how namespaces are handled:

http://groups.google.com/group/comp.lang.python/browse_thread/thread/31b2e9f4a8f7338c
http://nixforums.org/ntopic43901.html

From what I can see, there are more than a few people that have stumbled with ElementTree's API because of their preexisting expectations, which others have probably correctly bucketed as "implementation details". This comes as quite a shock to those who have stumbled (including myself) who have, lo these many years, come to view those details as the only standard that matters (perhaps simply because those details have been so consistent in our experience).

Which, in my view, is just fine -- different strokes for different folks, and all that. When I originally started poking around the python xml world, I was somewhat confused as to why 4suite/Domlette existed, as it seemed pretty clear that ElementTree had crystallized a lot of mindshare, and has a very attractive API to boot. Thankfully, I can now see its appeal, and am very glad it's around, as it seems to have all of those comfortable implementation details that I've been looking for. :-)

As for the infoset vs. "sequence of piggies" nut: if ElementTree's infoset approach is technically correct, then wouldn't it also be correct to use a .head attribute instead of a .tail attribute?
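
For anyone following along, here is a minimal sketch of how the existing .text/.tail model lays out the fragment <a>first<b>middle</b>last</a>. It assumes the standard-library xml.etree.ElementTree module; the variable names are mine and purely illustrative.

```python
import xml.etree.ElementTree as ET

# Parse a fragment with mixed content: text before, inside, and after <b>.
a = ET.fromstring("<a>first<b>middle</b>last</a>")
b = a[0]

# Text before the first child is stored on the parent as .text.
print(repr(a.text))  # 'first'
# The root has nothing following it, so its .tail is None.
print(repr(a.tail))  # None

# Text inside <b> is b.text; text after </b> is stored on b as .tail.
print(repr(b.text))  # 'middle'
print(repr(b.tail))  # 'last'

# Removing b from its parent takes b.tail along with it.
a.remove(b)
remaining = ET.tostring(a, encoding="unicode")
print(remaining)  # <a>first</a>
```

So under the .tail model, 'last' belongs to b rather than to a, and disappears from a when b is removed, which I take to be the behaviour at issue in this thread.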

Example:

<a>first<b>middle</b>last</a>

might be represented as:

<Element a: head='', text='last'>
    <Element b: head='first', text='middle'>

If I'm wrong, just chalk it up to the fact that this is the first time I've ever looked at the Infoset spec, and I'm simply confused.

If that IS a technically-valid way to represent the above xml fragment . . . then I guess I'll make sure to tread more carefully in the future around tools that work in infoset terms. For me, it turns out that sequences of piggies really are important, at least in contexts where XML is merely a means to an end (either because of the attractiveness of the toolsets or because we must cope with what we're provided as input) and where consistency with existing tools (like those that adhere to DOM level 2/3) and expectations are critical. I think this is what Paul was nodding towards with his original response to Stefan's response.

Cheers,

- Chas

On Nov 16, 2006, at 5:11 AM, Fredrik Lundh wrote:

> Paul Boddie wrote:
>
>>> Yes, it is. Just look at the API. It's an attribute of an
>>> Element, isn't it?
>>> What other API do you know where removing an element from a data
>>> structure
>>> leaves part of the element behind?
>>
>> I guess it depends on what you regard an element to be...
>
> Stefan said "Element", not "element".
>
> "Element" is a class in the "ElementTree" module, which can be used to
> *represent* an XML element in an XML infoset, including all the data
> *inside* the XML element, and any data *between* that XML element and
> the next one (which is always character data, of course).
>
> It's not very difficult, really; especially if you, as Stefan said,
> think in infoset terms rather "a sequence of little piggies" terms.
>
> </F>

--
http://mail.python.org/mailman/listinfo/python-list