This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE enhanced groups in pack/unpack =head1 VERSION Maintainer: Ilya Zakharevich <[EMAIL PROTECTED]> Date: 16 September 2000 Mailing List: [EMAIL PROTECTED] Number: 248 Version: 1 Status: Developing =head1 ABSTRACT This RFC makes pack/unpack builtins able to handle non-flat lists. Four additional action-types in TEMPLATES cover most paradigms in binary storage of data: data groups, required group headers, position-specific "named" data, arbitrary-order fields. =head1 DESCRIPTION Currently pack() encodes data supplied in a flat list, and unpack() extracts data into a flat list. This proposal introduces I<marked groups> in TEMPLATES which allow to handle more complicated structured data. A I<group> is introduced by a type C<'g'> followed by matched parentheses. =over =item C<g(TEMPLATE)> behaves the same as the C<(TEMPLATE)>-group. (Syntactic sugar, optional); =item C<g.(TEMPLATE) REST> "lookahead". On unpack(): stores a position in the string, extracts some substring using C<TEMPLATE>, then restores the position and extract using REST. On pack(): the inverse operation. For example, in C<struct { short count, flags; char data[1];}> the C<count> is not immediately preceeding C<data>, thus one cannot use C<n/a*> to extract the variable-length C<data>. But this works: ($flags, $data) = unpack <<EOP, $string g.( x[n] n ) # Skip count, extract flags, return back n x[n] # extract count, skip flags /a* # extract the data using 'count' EOP =item C<g[TEMPLATE]> pack() expects the corresponding argument to be an array reference. C<TEMPLATE> is applied to the data inside this array. unpack() extracts data described by C<TEMPLATE> and puts it into an array reference; =item C<g{TEMPLATE}> similar to C<g[TEMPLATE]>, but for hash references. =item C<g\(TEMPLATE)> similar to C<g[TEMPLATE]>, but for scalar references. =item C<K> "Packed string contains less data than the Perl data." Should be used with a multiplier $LEN, as in C<K[3]int>. The following $LEN characters of the template form the "key" (as in Hollerit (sp?) code). For unpack() the "key" is put into the output list. For pack(), if inside C<g{TEMPLATE}>: use the value of the given key as the next data to pack: pack 'g{ K[3]int N K[4]char C }', $hashref acts the same as pack 'g{ N C }', $hashref->{int}, $hashref->{char} For pack() when not in C<g{TEMPLATE}>: skip the following argument (and ignore the key!). Optional: a C<K!> variant: it is an error if the skipped argument does not coincide with the key. Check that the actions on pack() and unpack() are indeed inverse to each other! =item C<k> "Packed string contains more data than the Perl data." Opposite to C<K>: the syntax is the same, but on pack() the key is used as if it were the next-argument-to-process. On unpack() the next extracted value is not put in the output list. This creates C<"list46789">: pack 'k[4]list A* g[ A/(A)* ]', [6..9] Optional: a C<k!> variant: it is an error if the skipped extracted data does not coincide with the key. (This is useful less often than C<K>, but needed time to time.) =back =head1 MIGRATION ISSUES None. =head1 IMPLEMENTATION Straightforward. =head1 REFERENCES RFC 142: Enhanced Pack/Unpack RFC 246: pack/unpack uncontrovercial enhancements RFC 247: pack/unpack C-like enhancements