On Tue, 21 Oct 2003, Elizabeth Mattijsen wrote:

> At 08:21 -0400 10/21/03, Dan Sugalski wrote:
> >> I find the notion of an "XML header" a bit confusing, given Dan's
> >> statement to the effect that it was a throw to XML folks.
> >>
> >> I think anything "XML folks" will be interested in will entail
> >> *wrapping* stuff, not *prefixing* it.
> >
> >Nah, I expect what they'll want is for the entire data stream of
> >serialized objects to be in XML format. Which is fine--they can have that.
> >(It's why I mentioned the serialization routines can be overridden)
> >
> >For an XML stream the header might be <xml parrot format='xml'
> >version=1.0> with the rest of the stream in XML. A YAML stream would start
> ><xml parrot format='yaml' version=1.0> with the rest in YAML, and the
> >binary format as <xml parrot format='binary' version=1.0>. Or something
> >like that, modulo actual correct XML.
>
> If you want that to be looking like valid XML, it would have to be different:
>
> error: Specification mandate value for attribute parrot
> <xml parrot/>
>            ^
> Better in my opinion would be something like:
>
> <parrot format="xml" version="1.0"/>data yadda yadda yadda

I'm not an XML guy, and I'm making all this up as I go along. If that's
better, fine with me. :)

> >This way we have a single, fixed-format type/version header, which makes
> >the initial identification easier and less error-prone. (Possibly even
> >fit for file and programs of its ilk to note) The binary format won't
> >care, and the YAML format shouldn't care (as long as the indenting's
> >right) but the XML format would, so it seems to make sense to use the XML
> >stuff for the initial header.
>
> So are we talking about a header or a wrapper? If it is really a
> header, it's not XML and then it's pretty useless from an XML point
> of view.

We're talking about the first thing in a file (or stream, or whatever). I
was under the impression that XML files should be entirely composed of
valid XML, hence the need for the stream type marker being valid XML. YAML
doesn't care as much, so far as I understand, and for our own internal
binary format we can do whatever we want. If that's not true, then we can
go for a more compact header.

Note that the serialized stream will be different depending on the encoder
chosen. If you have the structure:

    $bar = 1;
    @foo[0] = \$bar;
    @foo[1] = "Baz";

The XML stream serializing @foo might look like:

    <XML type=parrot-xml version=1.0>
    <PMC name=foo>
      <type> PerlArray </type>
      <value>
        <PMC>bar</PMC>
        <string>Baz</string>
      </value>
    </PMC>
    <PMC name=bar>
      <type> PerlInt </type>
      <value>
        <integer>1</integer>
      </value>
    </PMC>

Only not inevitably horribly broken, invalid, and poorly done. :) The YAML
form might look like:

    <XML type=parrot-yaml version=1.0>
    PMC: foo
      type: PerlArray
      values:
        pmc: bar
        string: Baz
    PMC: bar
      type: PerlInt
      values:
        integer: 1

Once again, modulo my limited and inevitably incorrect YAML knowledge.

So if the header says it's XML the whole thing is valid XML, while if it
doesn't the rest of the stream doesn't have to be.
(Just enough of the header so that an XML processing program can examine
the stream and decide that the valid XML chunk at the beginning says that
the rest of the stream's not XML)

Basically we want some nice, fixed (mostly) thing at the head of the
stream that doesn't vary regardless of the way the stream is encoded, and
XML seemed to be the most restrictive of the forms I know people will
clamor for. (I know, it means the stream can't be valid Lisp-style sexprs,
but XML's more widespread :)

					Dan
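
As a rough illustration of the fixed-header idea above, here is a minimal
Python sketch of how a reader might sniff the leading header and split it
from the rest of the stream before picking a decoder. It assumes the
<parrot format="..." version="..."/> form suggested earlier in the thread,
which was never settled; the name sniff_stream and the regular expression
are hypothetical and not part of Parrot.

```python
import re

# Hypothetical header syntax, following the
# <parrot format="..." version="..."/> form suggested in the thread.
# The real header format was not settled, so treat this as an assumption.
HEADER_RE = re.compile(
    rb'\A<parrot\s+format="(?P<format>[a-z]+)"'
    rb'\s+version="(?P<version>[0-9.]+)"\s*/>'
)


def sniff_stream(data: bytes):
    """Return (format, version, payload) for a serialized stream.

    Only the fixed header is inspected; the payload is returned untouched
    so that a format-specific decoder (XML, YAML, binary) can take over.
    Raises ValueError if the stream does not begin with the header.
    """
    match = HEADER_RE.match(data)
    if match is None:
        raise ValueError("stream does not start with a parrot header")
    fmt = match.group("format").decode("ascii")
    version = match.group("version").decode("ascii")
    payload = data[match.end():]
    return fmt, version, payload


if __name__ == "__main__":
    stream = b'<parrot format="yaml" version="1.0"/>\nPMC: foo\n'
    fmt, version, payload = sniff_stream(stream)
    print(fmt, version, payload)
```

The point of the sketch is that the sniffing step never looks past the
header, so the same check works whether the payload is XML, YAML, or the
binary format.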