This and other RFCs are available on the web at
http://dev.perl.org/rfc/
=head1 TITLE
Distinguish packed binary data from printable strings
=head1 VERSION
Maintainer: Tim Conrow <[EMAIL PROTECTED]>
Date: 18 Sept 2000
Mailing List: [EMAIL PROTECTED]
Number: 258
Version: 1
Status: Developing
=head1 ABSTRACT
Perl should be able to distinguish between printable strings and
packed binary data stored as strings (presumed to not be printable
text) just as it can distinguish between numeric and non-numeric
strings. This would permit greater specificity in programming and thus
better error checking.
=head1 DESCRIPTION
Differentiating between packed strings and printable strings would
permit some useful error checking. A new scalar flag, which I'll call
BOK (for "Binary OK"), would be set in a scalar's flags by any builtin
function or operator that produces a string which is likely to be a
packed data structure of some sort rather than printable text or a
numeric value. These include C<pack>, C<read>, C<sysread>, C<vec>, the
multi-argument form of C<select>, and the string-context form of
operators C<<< >>, <<, |, &, ^, ~ >>>. (Others???)
With packed strings recognizable to the compiler and interpreter, they
would interact with functions, operators, and other relevant scalar
types as follows:
    use warnings 'packed';
    use strict 'packed';

    $a = pack("A*","abc");              # $a(BOK) = 0; printable string
    $a = pack("a*","123");              # $a(BOK) = 1; packed thing
    print $a;                           # Stringify $a, issue warning
    print "$a";                         # Stringify $a; same as perl5

    $b = pack($tmpl8,@more_data);       # $b(BOK) = 1
    $c = $a ^ $b;                       # String context xor. $c(BOK) = 1

    $d = "255";
    $e = "13";
    $c = $d ^ $e;                       # Numeric context xor via promotion to numeric.
    $d = $a ^ 255;                      # Error
    $d = $a ^ "zzz";                    # Promote via pack("a*","zzz"), but issue warning

    $e = vec($a,12,4);                  # OK; $e is numeric; $e(BOK) = 0
    $e = vec(123,1,8);                  # Error
    vec(123,1,8) = 1;                   # Error
    $binary = "\x04\x12";
    $e = vec($binary,1,8);              # Promote $binary, issue warning
    vec($binary,1,8) = 1;               # Promote $binary, no warning
    select(undef,$rin=0,undef,0.25);    # Error
    select(undef,$rin="",undef,0.25);   # Promote $rin, no warning

    $a = pack("a*","123");
    $b = pack("a*","456");
    $c = $a + $b;                       # Error
    $c = "$a" + "$b";                   # Weird, but OK

    $c = <STDIN>;                       # $c(BOK) = 0; $c is a normal string
    sysread FOO,$a,16;                  # $a(BOK) = 1; $a is a packed thing
    syswrite BAR,123;                   # Error
    syswrite BAR,"123";                 # Promote, issue warning
    syswrite BAR,pack "a*","123";       # OK

    if($a) ...                          # Always true
    if($a eq "xxx") ...                 # Stringify, issue warning
    if("$a" eq "xxx") ...               # Stringify, no warning
    if($a == 123) ...                   # Error
The exceptions for C<vec> and C<select> (string arguments auto-promoted
with C<pack("a*",$str)> without warnings) exist for backward compatibility.
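To see why this exception matters, consider the long-standing Perl 5 idiom for building a C<select> bit mask, which starts from an ordinary empty string. The sketch below is plain Perl 5 as it behaves today, not the proposed behavior; the descriptor number 3 is an arbitrary assumption chosen for illustration.

```perl
use strict;
use warnings;

# Build a select() read mask the traditional way, starting from an
# ordinary (non-packed) empty string.
my $fd  = 3;     # hypothetical descriptor number, for illustration only
my $rin = '';
vec($rin, $fd, 1) = 1;          # set bit number $fd in the mask

# Under the proposal, $rin would be promoted to a packed string at the
# vec() assignment without a warning, so this idiom keeps working.
my $bits = unpack("b*", $rin);  # bits listed in ascending order
print "mask bits: $bits\n";
```

If lvalue C<vec> on a plain string warned, nearly every existing C<select> loop would need editing, which is presumably why the exception exists.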
I'm sure I haven't covered all the relevant cases, but I hope the
intent is clear: to cut down on the room for accidental use of
un/packed data in inappropriate circumstances and to increase the
ability of the user to be specific regarding intent. In particular,
accidentally using string context bit ops when meaning to use numeric,
or vice versa, would raise a warning, and mixing numeric and packed
arguments would be an error.
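The kind of accident this is meant to catch already exists in Perl 5, where the bitwise operators silently choose between numeric and string behavior based on their operands. A minimal illustration, assuming default Perl 5 semantics (that is, without the newer C<bitwise> feature enabled):

```perl
use strict;
use warnings;

my $numeric = 255 ^ 13;        # both operands are numbers: integer xor
my $string  = "255" ^ "13";    # both operands are strings: bytewise xor

printf "numeric xor: %d\n", $numeric;
printf "string xor:  %s\n",
    join " ", map { sprintf "%02x", ord } split //, $string;
```

The first line prints 242; the second prints the bytes C<03 06 35>, because the shorter string is treated as if padded with zero bits. Nothing in Perl 5 flags the difference, which is the gap the BOK flag is intended to close.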
If anyone knows of common constructs/idioms which would break under
this scheme and where it's too painful to add C<pack("a*",...)> or
C<"..."> as appropriate ... well I don't have to ask to have them
pointed out, do I? :-) The only cases I've been able to think of are
JAPHs or code samples.
If RFCs 73 and/or 161 end up being adopted, the idea of packed things
being distinct could be extended to allow additional
functionality. E.g.
    $a = pack("a*","\x8f\x01");   # Save the template in the instance data
    if(ref($a) eq "Packed") { ... }
    $b = $a->unpack;              # Use saved template
    $a->STRING = sub { join ",",$_[0]->unpack; };
    print "$a\n";                 # Readable
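Something close to this can be approximated in present-day Perl 5 with an ordinary class and C<overload>. The following is only a rough sketch under that assumption: the C<Packed> class, its constructor, and its C<bytes> accessor are hypothetical names invented here, and it does not reflect how RFCs 73 or 161 would actually provide the feature.

```perl
use strict;
use warnings;

package Packed;

# Stringification yields a readable, comma-separated field list.
use overload '""' => \&to_string, fallback => 1;

# Hypothetical constructor: remember the template next to the bytes.
sub new {
    my ($class, $template, @values) = @_;
    my $self = {
        template => $template,
        bytes    => pack($template, @values),
    };
    return bless $self, $class;
}

# Raw packed bytes, for passing to syswrite and friends.
sub bytes { return $_[0]{bytes} }

# Unpack using the saved template; CORE:: avoids calling ourselves.
sub unpack {
    my ($self) = @_;
    return CORE::unpack($self->{template}, $self->{bytes});
}

sub to_string {
    my ($self) = @_;
    return join ",", $self->unpack;
}

package main;

my $p = Packed->new("C2", 0x8f, 0x01);
print ref($p), "\n";    # class name, as in the ref() test above
print "$p\n";           # readable form instead of raw bytes
```

The saved template is what makes both C<unpack> without arguments and a readable string form possible; a bare scalar flag such as BOK would not carry that information.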
If RFC 89 is adopted, a variable could be forced to hold only packed
things. E.g.
    my packed $thingie : (template=>"lll");
    $thingie->pack 123,456,789;
... or something like that.
How this would interact with RFCs 142 and 246-250 is TBD, but I see no
outright conflicts right off. This might dovetail well with RFC 159 to
allow un/packing on the fly based on context, but I'm not sure.
=head1 IMPLEMENTATION
I know almost nothing about internals, so this is probably wrong, but
see if I convey my meaning anyway.
=over 4

=item *

With exceptions exemplified above, builtin operators and functions
which operate in a bitwise manner on their string arguments would
behave as follows:

    NOK POK BOK   Action
    --- --- ---   ------
     0   0   1    not possible
     0   1   0    promote, warning if use warnings 'packed'
     0   1   1    OK
     1   0   0    error if use strict 'packed', otherwise
                  stringify and promote, warning if
                  use warnings 'packed'
     1   0   1    not possible
     1   1   0    promote, warning if use warnings 'packed'
     1   1   1    OK

=item *

By way of an imperfect analogy, note the similarity between packed
strings having a BOK flag via C<pack> (and others) and regexes having
an ROK flag via C<qr()>. Implementation of BOK is, of course,
considerably simpler.

=item *

When translating code with p526, simply put

    no warnings 'packed';
    no strict 'packed';

at the top.

=back
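The flag table in the first item can be restated as a small decision function. This is only an illustrative sketch in ordinary Perl 5: C<packed_operand_action> and its named arguments are hypothetical, the real check would live in the interpreter, and the all-zero flag combination is left unspecified because the table does not list it.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of any real Perl API: given the flag
# state of one operand, report what the proposal says should happen.
sub packed_operand_action {
    my (%arg) = @_;    # keys: nok, pok, bok, strict, warn

    return 'not possible' if $arg{bok} && !$arg{pok};   # BOK needs POK
    return 'ok'           if $arg{bok};                 # already packed

    if ($arg{pok}) {                                    # plain string
        return $arg{warn} ? 'promote, warn' : 'promote';
    }
    if ($arg{nok}) {                                    # number only
        return 'error' if $arg{strict};
        return $arg{warn} ? 'stringify, promote, warn'
                          : 'stringify, promote';
    }
    return 'unspecified';   # the table has no row for NOK=POK=BOK=0
}

print packed_operand_action(nok => 1, strict => 1), "\n";
print packed_operand_action(pok => 1, warn => 1), "\n";
```

Writing the table this way makes it plain that NOK only matters when POK is clear: any operand with a string value is promoted regardless of whether it also has a numeric value.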
=head1 REFERENCES
RFC 73: All Perl core functions should return objects
RFC 89: Controllable Data Typing
RFC 142: Enhanced Pack/Unpack
RFC 159: True Polymorphic Objects
RFC 161: Everything in Perl becomes an Object
RFC 246-250: Various pack/unpack enhancements