On Tue, 2010-12-28 at 07:08 +0100, Stefan Behnel wrote:
> Roy Smith, 28.12.2010 00:21:
> > To go back to my earlier example of
> > <Parental-Advisory>FALSE</Parental-Advisory>
> > using 432 bits to store 1 bit of information, stuff like that doesn't
> > happen in marked-up text documents. Most of the file is CDATA (do they
> > still use that term in XML, or was that an SGML-ism only?). The markup
> > is a relatively small fraction of the data. I'm happy to pay a factor
> > of 2 or 3 to get structured text that can be machine processed in useful
> > ways. I'm not willing to pay a factor of 432 to get tabular data when
> > there's plenty of other much more reasonable ways to encode it.
>
> If the above only appears once in a large document, I don't care how much
> space it takes. If it appears all over the place, it will compress down to
> a couple of bits, so I don't care about the space, either.

+1

> It's readability that counts here. Try to reverse engineer a binary format
> that stores the above information in 1 bit.

I think a point many of the arguments against XML miss is the HR cost of custom solutions. Every time you come up with a cool super-efficient solution it has to be weighed against the increase in the tool-stack [whereas XML is, essentially, built-in] and nobody-else-knows-about-your-super-cool-solution [1].

IMO, tool-stack bloat is a *big* problem in shops with an Open Source tendency. They are always tossing the new and shiny thing [it's free!] into the bucket for some theoretical benefit. [This is an unrecognized benefit of expensive software - it creates focus]. Soon the bucket is huge and maintaining it becomes a burden.

[1] The odds that you sufficiently documented your super-cool-solution are probably nil.

So I'm one of those to whom you'd have to make a *really* good argument *not* to use XML. XML is known, the tools are good, and the knotty problems are solved [thanks to the likes of SAX, lxml / ElementTree, and ElementFlow]. If the premise of the argument is "bloat" I'd probably dismiss it out of hand, since removing that bloat will necessitate adding bloat somewhere else; that somewhere else almost certainly being more expensive.
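
For anyone who wants to check the "it will compress down to a couple of bits" and "essentially built-in" points for themselves, here is a minimal sketch using only the standard library. The <tracks> wrapper element and the helper names (build_document, bits_per_flag, read_flags) are made up for illustration and do not come from the thread; the exact figures will vary with the compressor and its settings, so none are quoted here.

```python
import zlib
import xml.etree.ElementTree as ET

# The element from the quoted example.
ELEMENT = "<Parental-Advisory>FALSE</Parental-Advisory>"


def build_document(count):
    """Return a hypothetical XML document repeating the flag `count` times."""
    # The <tracks> root is an assumption, only there to make the text well-formed.
    return "<tracks>" + ELEMENT * count + "</tracks>"


def bits_per_flag(count):
    """Compressed size of the whole document, in bits, divided by `count`."""
    raw = build_document(count).encode("utf-8")
    packed = zlib.compress(raw, 9)
    return len(packed) * 8 / count


def read_flags(xml_text):
    """Parse with the built-in ElementTree and return the flags as booleans."""
    root = ET.fromstring(xml_text)
    return [elem.text == "TRUE" for elem in root.iter("Parental-Advisory")]


if __name__ == "__main__":
    # Amortized cost per flag as the element is repeated more often.
    for n in (1, 100, 10000):
        print(n, round(bits_per_flag(n), 2))
    print(read_flags(build_document(3)))
```

Running it prints the amortized cost per flag for one, a hundred, and ten thousand repetitions, followed by the parsed values, with no third-party tools added to the stack.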