RFC 248 (v1) enhanced groups in pack/unpack

Perl6 RFC Librarian Sun, 17 Sep 2000 22:34:31 -0700
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

enhanced groups in pack/unpack

=head1 VERSION

  Maintainer: Ilya Zakharevich <[EMAIL PROTECTED]>
  Date: 16 September 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 248
  Version: 1
  Status: Developing

=head1 ABSTRACT

This RFC makes the pack/unpack builtins able to handle non-flat lists.  Four
additional action-types in TEMPLATES cover most paradigms in binary storage
of data: data groups, required group headers, position-specific "named"
data, arbitrary-order fields.

=head1 DESCRIPTION

Currently pack() encodes data supplied in a flat list, and unpack() extracts
data into a flat list.  This proposal introduces I<marked groups> in TEMPLATES
which allow handling of more complicated structured data.

A I<group> is introduced by a type C<'g'> followed by matched parentheses.

=over

=item C<g(TEMPLATE)>

behaves the same as the C<(TEMPLATE)>-group.  (Syntactic sugar, optional);

=item C<g.(TEMPLATE) REST>

"lookahead".  On unpack(): stores a position in the string, extracts some
substring using C<TEMPLATE>, then restores the position and extracts using
C<REST>.  On pack(): the inverse operation.

For example, in C<struct { short count, flags; char data[1];}> the C<count>
is not immediately preceding C<data>, thus one cannot use C<n/a*> to extract
the variable-length C<data>.  But this works:

  ($flags, $data) = unpack <<EOP, $string;
    g.( x[n] n )        # Skip count, extract flags, return back
    n x[n]              # extract count, skip flags
    /a*                 # extract the data using 'count'
  EOP
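
The same extraction can be approximated in current Perl 5, which lacks
C<g.(...)>, by unpacking from the start of the string more than once.  This is
only a sketch: the sample record and the variable names are assumptions made
for illustration, not part of the proposal.

  use strict;
  use warnings;

  # Hypothetical record: count = 5, flags = 0x0102, data = "hello".
  my $string = pack 'n n a*', 5, 0x0102, 'hello';

  # "Lookahead" pass: skip count (2 bytes), extract flags.
  my ($flags) = unpack 'x2 n', $string;

  # Main pass, again from the start: extract count, then skip both
  # 16-bit fields and take exactly $count bytes of data.
  my ($count) = unpack 'n', $string;
  my ($data)  = unpack "x4 a$count", $string;

  printf "flags=0x%04x data=%s\n", $flags, $data;

The proposed C<g.(...)> would fold these separate passes into one template.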

=item C<g[TEMPLATE]>

pack() expects the corresponding argument to be an array reference.
C<TEMPLATE> is applied to the data inside this array.  unpack() extracts
data described by C<TEMPLATE> and puts it into an array reference;
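
For comparison, here is roughly what has to be written by hand in current
Perl 5 to get the effect a template such as C<'C g[ n3 ]'> might have under
this proposal.  The template, the sample data and the variable names are
assumptions chosen for illustration.

  use strict;
  use warnings;

  my @record = (7, [1, 2, 3]);     # one scalar, one nested array

  # pack(): flatten the nested array by hand.
  my $packed = pack 'C n3', $record[0], @{ $record[1] };

  # unpack(): extract a flat list, then rebuild the array reference.
  my ($tag, @inner) = unpack 'C n3', $packed;
  my @restored = ($tag, [@inner]);

  print length($packed), " bytes; inner = @{ $restored[1] }\n";

With C<g[TEMPLATE]> the flattening and the regrouping would both be expressed
in the template rather than in the surrounding code.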

=item C<g{TEMPLATE}>

similar to C<g[TEMPLATE]>, but for hash references.

=item C<g\(TEMPLATE)>

similar to C<g[TEMPLATE]>, but for scalar references.

=item C<K>

"Packed string contains less data than the Perl data."

Should be used with a multiplier $LEN, as in C<K[3]int>.  The following
$LEN characters of the template form the "key" (as in Hollerith code).
For unpack() the "key" is put into the output list.  For pack(), if
inside C<g{TEMPLATE}>: use the value of the given key as the next data to
pack:

  pack 'g{ K[3]int N K[4]char C }', $hashref

acts the same as

  pack 'g{ N C }', $hashref->{int}, $hashref->{char}

For pack() when not in C<g{TEMPLATE}>: skip the following argument (and
ignore the key!).  Optional: a C<K!> variant: it is an error if
the skipped argument does not coincide with the key.

Check that the actions on pack() and unpack() are indeed inverse to each other!
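
As a point of comparison, the hash example can be emulated in current Perl 5
by keeping the keys next to the template and applying them by hand.  This is
a sketch under assumed sample values; the C<@keys> list stands in for the keys
that C<K[3]int> and C<K[4]char> would carry inside the template.

  use strict;
  use warnings;

  my $hashref = { int => 1000, char => 65 };
  my @keys    = qw(int char);      # stand-in for K[3]int and K[4]char

  # pack(): look up each key by hand and pack the values in template order.
  my $packed = pack 'N C', @{$hashref}{@keys};

  # unpack(): pair the extracted values with the same keys again.
  my @values = unpack 'N C', $packed;
  my %restored;
  @restored{@keys} = @values;

  print join(', ', map { "$_=$restored{$_}" } @keys), "\n";

Here the round trip is inverse only because both directions share the same
key list, which is the property C<K> is meant to move into the template.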

=item C<k>

"Packed string contains more data than the Perl data."

Opposite to C<K>: the syntax is the same, but on pack() the key is used as
if it were the next-argument-to-process.  On unpack() the next extracted
value is not put in the output list.  This creates C<"list46789">:

  pack 'k[4]list A* g[ A/(A)* ]', [6..9]

Optional: a C<k!> variant: it is an error if the skipped extracted data
does not coincide with the key.

(This is useful less often than C<K>, but is needed from time to time.)
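
The C<"list46789"> string can be produced in current Perl 5 by passing the key
and the element count as ordinary arguments.  This is a sketch only: it
assumes a single-digit count, and the explicit key check imitates what the
optional C<k!> variant would do.

  use strict;
  use warnings;

  my @list = (6 .. 9);

  # pack(): supply the key "list" explicitly, then the count and the items.
  my $packed = pack 'A* A (A)*', 'list', scalar(@list), @list;

  # unpack(): extract the key, verify it as k! would, and drop it.
  my ($key, $n, @items) = unpack 'A4 A (A)*', $packed;
  die "unexpected key '$key'" unless $key eq 'list';

  print "$packed -> @items\n";

With C<k> the key would live in the template, so the caller would pass and
receive the list data alone.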

=back

=head1 MIGRATION ISSUES

None.

=head1 IMPLEMENTATION

Straightforward.

=head1 REFERENCES

RFC 142: Enhanced Pack/Unpack

RFC 246: pack/unpack uncontrovercial enhancements

RFC 247: pack/unpack C-like enhancements