Re: DRAFT RFC: Enhanced Pack/Unpack

Glenn Linderman Thu, 03 Aug 2000 09:28:27 -0700
Bart Lateur wrote:

> On Wed, 02 Aug 2000 12:22:09 -0700, Glenn Linderman wrote:
>
> >  [ 'bar' => 'integer32', 'baz' => 'integer32', 'count' => 'integer32' ]
> >
> >  [ 'var1' => 'int32', 'var2' => 'int16', 'var3' => 'int8' ]
>
> That doesn't reconsider BigEndian vs. LittleEndian, AKA pack/unpack 'N'
> vs. pack/unpack 'V'.

Bart,

It certainly didn't make explicit how to deal with BigEndian vs.
LittleEndian.  I don't think there is anything there that prevents dealing
with BigEndian vs. LittleEndian, either.  Let's give it a whirl....

Whirl #1: pack/unpack treat big/littleEndian as different types, we could
too:

   [ 'var1' => 'bigendian32', 'var2' => 'littleendian32' ]

In doing so, we (like pack/unpack) either limit big/littleEndian issues to
integers, or need to define 3 types of each type (type, bigendiantype,
littleendiantype).  This gets boring, if nothing else.  This also fits my
prior proposal without change.  But not changing that proposal isn't the
goal; getting it right is.
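
As a rough sketch of Whirl #1, the two type names could simply map onto the existing pack templates 'N' and 'V'.  The %template_for table and the pack_fields helper are hypothetical names made up for this illustration; they are not part of the proposal.

```perl
use strict;
use warnings;

# Hypothetical mapping from the type names above to existing pack templates.
my %template_for = (
    bigendian32    => 'N',   # unsigned 32-bit, big-endian ("network") order
    littleendian32 => 'V',   # unsigned 32-bit, little-endian ("VAX") order
);

# Field definition as an ordered name => type list, as in the example.
my @definition = ( 'var1' => 'bigendian32', 'var2' => 'littleendian32' );

# Pack the named values in definition order, one template per field.
sub pack_fields {
    my ($definition, %value) = @_;
    my $bytes = '';
    for (my $i = 0; $i < @$definition; $i += 2) {
        my ($name, $type) = @{$definition}[$i, $i + 1];
        my $template = $template_for{$type}
            or die "unknown type '$type'";
        $bytes .= pack($template, $value{$name});
    }
    return $bytes;
}

my $bytes = pack_fields(\@definition, var1 => 1, var2 => 1);
print unpack('H*', $bytes), "\n";   # prints 0000000101000000
```

Every further type (16-bit, floating point, ...) would need its own pair of entries in the table, which is the "boring" part.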


Whirl #2: Under the assumption that not only integers, but also large
characters, floating point numbers, and all multi-byte scalar types are
affected by the endianness of the machine (this is how most machines I'm
aware of deal with it), and under the assumption that a data structure is
generally endianness consistent within itself (this would seem to handle the
common usage I'm aware of), then endianness could be factored out of the
issue in the Structure::new call:

   $foo = new Structure ( definition, 'endianness' => 'local' )
   $foo = new Structure ( definition, 'endianness' => 'big' )
   $foo = new Structure ( definition, 'endianness' => 'little' )
   $foo = new Structure ( definition, 'endianness' => 'network' )
   $foo = new Structure ( definition, 'endianness' => 'vax' )

Where each type insertion/extraction function would then have to be aware of
endianness (at least have the ability to die if it can't deal with it),
probably via an extra parameter passed to it.
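
A minimal sketch of what one such insertion function might look like, assuming a 32-bit unsigned integer and the existing 'N' and 'V' pack templates.  The name insert_int32 is hypothetical, and treating 'vax' as little-endian is an assumption that holds for integers only.

```perl
use strict;
use warnings;

# Hypothetical insertion function for a 32-bit unsigned integer; the
# endianness arrives as the extra parameter described above.
sub insert_int32 {
    my ($value, $endianness) = @_;
    if ($endianness eq 'local') {
        # Detect the host byte order by comparing native and little-endian packing.
        $endianness = pack('S', 1) eq pack('v', 1) ? 'little' : 'big';
    }
    return pack('N', $value) if $endianness eq 'big'    || $endianness eq 'network';
    return pack('V', $value) if $endianness eq 'little' || $endianness eq 'vax';
    # Die rather than guess when the endianness is not one we can handle.
    die "insert_int32: cannot handle endianness '$endianness'";
}

print unpack('H*', insert_int32(258, 'big')),    "\n";   # prints 00000102
print unpack('H*', insert_int32(258, 'little')), "\n";   # prints 02010000
```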


Whirl #3: Under the assumption that all multi-byte types are affected, but
that data structures might mix the endianness, you could require explicit
endianness specification for every scalar data element in every structure
that doesn't want the default endianness (probably local would be default).
This could be done via:

  [ 'var1' => [ 'int', 32, 'big' ], 'var2' => [ 'int', 16, 'little' ],
         'var3' => 'int32' ]

or

  [ 'var1' => [ 'int32', 'big'], 'var2' => [ 'int16', 'little' ],
         'var3' => 'int32' ]

I show two different ways of factoring type, size, and endianness in these
examples; perhaps both are useful.  Certainly it is convenient for normal
types to have type and size specified with as little punctuation and verbiage
as possible, as shown by 'var3' in both examples.  But if more info is needed
(endianness) perhaps a complete factoring of the concepts is a good idea (the
first of the above two examples)?  This just extends my prior proposal, and
is upward compatible with it.
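
To show that both factorings can coexist, here is a sketch of a helper that reduces any of the spellings above to a (type, size, endianness) triple.  The name normalize_spec and the default of 'local' are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce any of the spellings above to
# (type, size, endianness), defaulting the endianness to 'local'.
sub normalize_spec {
    my ($spec) = @_;
    my @parts = ref($spec) eq 'ARRAY' ? @$spec : ($spec);
    my $type  = shift @parts;
    my $size;
    if ($type =~ /^([a-z]+)(\d+)$/) {
        ($type, $size) = ($1, $2);   # fused form such as 'int32'
    } else {
        $size = shift @parts;        # factored form: size is the next element
    }
    die "missing size in type specification" unless defined $size;
    my $endianness = @parts ? shift @parts : 'local';
    return ($type, $size, $endianness);
}

print join(',', normalize_spec([ 'int', 32, 'big' ])), "\n";    # prints int,32,big
print join(',', normalize_spec([ 'int16', 'little' ])), "\n";   # prints int,16,little
print join(',', normalize_spec('int32')), "\n";                 # prints int,32,local
```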


Whirl #4:  I suppose you could take it as far as

   [ 'var1' => [ 'type' => 'int', 'size' => 32, 'endianness' => 'big' ]]

I think this goes too far, but showing the example is adequate proof of that,
perhaps.


Orthogonal comments:

Note that if this seems too clumsy, you could define a new, terser syntax,
and an interpreter for it, either within Perl or within a module that wraps
this.  With a wrapper, all the keywords could get reduced for direct use in
Perl core.

Note that for commonly used types, this can be factored:

   $bi32 = [ 'integer', 32, 'big' ];
   $li16 = [ 'integer', 16, 'little' ];

  [ 'var1' => $bi32, 'var2' => $li16, 'var3' => 'integer32' ];

--
Glenn
=====
There  are two kinds of people, those
who finish  what they start,  and  so
on...                 -- Robert Byrne


