On Tue, 21 Oct 2003, Elizabeth Mattijsen wrote:

> At 08:21 -0400 10/21/03, Dan Sugalski wrote:
> >  > I find the notion of an "XML header" a bit confusing, given Dan's
> >>  statement to the effect that it was a throw to XML folks.
> >>
> >>  I think anything "XML folks" will be interested in will entail
> >>  *wrapping* stuff, not *prefixing* it.
> >
> >Nah, I expect what they'll want is for the entire data stream of
> >serialized objects to be in XML format. Which is fine--they can have that.
> >(It's why I mentioned the serialization routines can be overridden)
> >
> >For an XML stream the header might be <xml parrot format='xml'
> >version=1.0> with the rest of the stream in XML. A YAML stream would start
> ><xml parrot format='yaml' version=1.0> with the rest in YAML, and the
> >binary format as <xml parrot format='binary' version=1.0>. Or something
> >like that, modulo actual correct XML.
>
> If you want that to be looking like valid XML, it would have to be different:
>
> error: Specification mandate value for attribute parrot
> <xml parrot/>
>            ^
> Better in my opinion would be something like:
>
> <parrot format="xml" version="1.0"/>data yadda yadda yadda

I'm not an XML guy, and I'm making all this up as I go along. If that's
better, fine with me. :)

> >This way we have a single, fixed-format type/version header, which makes
> >the initial identification easier and less error-prone. (Possibly even
> >fit for file and programs of its ilk to note) The binary format won't
> >care, and the YAML format shouldn't care (as long as the indenting's
> >right) but the XML format would, so it seems to make sense to use the XML
> >stuff for the initial header.
>
> So are we talking about a header or a wrapper?  If it is really a
> header, it's not XML and then it's pretty useless from an XML point
> of view.

We're talking about the first thing in a file (or stream, or whatever). I
was under the impression that XML files should be entirely composed of
valid XML, hence the need for the stream type marker being valid XML. YAML
doesn't care as much, so far as I understand, and for our own internal
binary format we can do whatever we want. If that's not true, then we can
go for a more compact header.

Note that the serialized stream will be different depending on the encoder
chosen. If you have the structure:

  $bar = 1;
  @foo[0] = \$bar;
  @foo[1] = "Baz";

The XML stream serializing @foo might look like:

  <XML type="parrot-xml" version="1.0">
  <PMC name="foo">
     <type>
       PerlArray
     </type>
     <value>
        <PMC>bar</PMC>
        <string>Baz</string>
     </value>
  </PMC>
  <PMC name="bar">
    <type>
      PerlInt
    </type>
    <value>
      <integer>1</integer>
    </value>
  </PMC>

Only not inevitably horribly broken, invalid, and poorly done. :) The YAML
form might look like

  <XML type="parrot-yaml" version="1.0">
  PMC: foo
    type: PerlArray
    values:
      pmc: bar
      string: Baz
  PMC: bar
    type: PerlInt
    values:
      integer: 1

Once again, modulo my limited and inevitably incorrect YAML knowledge. So
if the header says it's XML the whole thing is valid XML, while if it
doesn't the rest of the stream doesn't have to be. (Just enough of the
header so that an XML processing program can examine the stream and decide
that the valid XML chunk at the beginning says that the rest of the
stream's not XML)
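
As a rough illustration of that check, here is a minimal Python sketch of a
reader that looks only at the start of a stream. It assumes the self-closing
header form suggested earlier in the thread (<parrot format="..."
version="..."/>), which is not settled. The names HEADER_RE, read_header and
is_xml_stream are made up for this example, and a real binary stream would be
bytes rather than text.

```python
import re

# Hypothetical header form, following the <parrot format="xml" version="1.0"/>
# suggestion quoted above; the real header layout is not decided.
HEADER_RE = re.compile(
    r'^<parrot\s+format="(?P<format>[^"]+)"\s+version="(?P<version>[^"]+)"\s*/>'
)

def read_header(stream_text):
    """Return (format, version, rest) for a stream that starts with the header.

    Raises ValueError when the stream does not begin with a recognisable header.
    """
    match = HEADER_RE.match(stream_text)
    if match is None:
        raise ValueError("stream does not start with a parrot header")
    rest = stream_text[match.end():]
    return match.group("format"), match.group("version"), rest

def is_xml_stream(stream_text):
    """True only when the header itself says the rest of the stream is XML."""
    fmt, _version, _rest = read_header(stream_text)
    return fmt == "xml"
```

The point is that the reader never has to guess at the body: it parses one
small, fixed chunk and hands the rest to whichever decoder the header names.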

Basically we want some nice, fixed (mostly) thing at the head of the
stream that doesn't vary regardless of the way the stream is encoded, and
XML seemed to be the most restrictive of the forms I know people will
clamor for. (I know, it means the stream can't be valid Lisp-style sexprs,
but XML's more widespread :)
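
The writing side of the same idea can be sketched as follows. This is only an
illustration under assumptions: the header form is the hypothetical one from
earlier in the thread, the helper names (make_header, encode_xml, encode_yaml,
serialize) are invented here, and the YAML body is a simplified flat mapping
rather than the layout shown above. It handles a single integer PMC only.

```python
import xml.etree.ElementTree as ET

def make_header(fmt, version="1.0"):
    # Hypothetical fixed header; only the format attribute varies per encoder.
    return '<parrot format="%s" version="%s"/>' % (fmt, version)

def encode_xml(name, type_name, value):
    # Build <PMC name="..."><type>..</type><value><integer>..</integer></value></PMC>
    pmc = ET.Element("PMC", {"name": name})
    ET.SubElement(pmc, "type").text = type_name
    val = ET.SubElement(pmc, "value")
    ET.SubElement(val, "integer").text = str(value)
    return ET.tostring(pmc, encoding="unicode")

def encode_yaml(name, type_name, value):
    # Simplified flat mapping, written by hand to avoid a YAML dependency.
    return "name: %s\ntype: %s\nvalue: %d\n" % (name, type_name, value)

ENCODERS = {"xml": encode_xml, "yaml": encode_yaml}

def serialize(fmt, name, type_name, value):
    """Prefix the encoder's output with the same fixed-format header."""
    return make_header(fmt) + "\n" + ENCODERS[fmt](name, type_name, value)
```

Calling serialize("xml", "bar", "PerlInt", 1) and serialize("yaml", "bar",
"PerlInt", 1) gives two streams whose first lines differ only in the format
attribute. Note that a strict XML parser wants a single root element, so a
self-closing header followed by a separate PMC element would not be
well-formed on its own; the XML encoder might have to wrap rather than prefix.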

                                                Dan
