RFC 73 (v1) All Perl core functions should return ob

Perl6 RFC Librarian Tue, 08 Aug 2000 19:28:38 -0700
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

All Perl core functions should return objects

=head1 VERSION

   Maintainer: Nathan Wiger <[EMAIL PROTECTED]>
   Date: 08 Aug 2000
   Version: 1
   Mailing List: [EMAIL PROTECTED]
   Status: Developing
   Number: 73

=head1 ABSTRACT

In the past, Perl has provided only two return options for its builtin
functions: scalars or lists. In a scalar context, only a single selected
value is returned, which greatly limits the usefulness of the function
(unless you like pulling apart long lists).

The reason this was done was to allow easy access to scalar/string data.
However, objects can use the mechanism described in RFC 49 (or something
like it) to become strings on-demand in Perl 6. As such, we should make
all Perl functions return objects in a scalar context in Perl 6.

=head1 DESCRIPTION

When several of the mechanisms proposed in other RFCs combine, we have
the power in Perl 6 to pass I<everything> around as objects, converting
them to strings as they're needed. This is powerful because one common
set of functions can present both objects and "true" scalars to beginners
and advanced Perl users alike. As such, objects are embedded in Perl from
the ground up.

Others have proposed typing objects, and extracting them that way:

   my $uid   = getpwnam $user;    # "true" scalar uid
   my pw $pw = getpwnam $user;    # object of type "pw"

However, this has a couple of problems:

   1. You have to have the correct object class and, at
      the very least, alter your script's syntax.

   2. You can't make $pw look like $uid transparently.

Instead, having objects that walk and talk like scalars on demand is a
more powerful approach. Note that this RFC does not necessarily preclude
being able to type objects and pull out specific classes. The two
approaches could be combined, giving multi-class objects that can be
transparently accessed as "true" scalars.

We'll start out with complex examples, where it's obvious how this is a
benefit, and go down to simpler functions, where it might be less
obvious.

=head2 Complex Functions

Let's choose C<stat>, since it's an easy target. Currently, it only
returns one of two things:

   1. A massive 13-element list (LIST context)
   2. A boolean success/failure (SCALAR context)

Neither of these is particularly useful, unless you like pain. To get a
decent interface, you have to use C<File::stat> or some other module.
Instead, let's put an object-oriented interface in the core:

   $stat = stat $file;        # returns an object
   @stat = stat $file;        # legacy list (like current)
   %stat = stat $file;        # hash (see RFC 37)

   print "$stat";             # calls $stat->STRING, which 
                              # could print the filename or
                              # some other useful piece of info

Note that our legacy calling methods are unaffected, but we now have an
object too. The boolean truth value is still simple to determine; you
just have to say:

   $stat = stat $file or die "Can't stat $file: $!";

The object methods of the C<$stat> object are powerful enough that
kludges like C<_> no longer have to exist. Plus, you can stat multiple
files out of order and still get the benefits of cached C<stat> calls:

   $f1 = stat $file1 or die;
   $f2 = stat $file2 or die;
   
   if ( $f1->size > 0 and $f2->owner == 0 ) {
      print "root-owned $file2 is mode ", $f2->mode & 07777;
      if ( $f1->mtime > time ) {
          # here, "$f1" is $f1->STRING == $f1->filename
          warn "warning: bad mojo, $f1 has an mtime in the future!\n";
      }
   }

Now, we have a full object-oriented C<stat> implementation, wrapped in a
tidy function that can work just like Perl 5's if you want it to.
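Something close to this can already be approximated in Perl 5, which may help make the idea concrete. The following is only a sketch under stated assumptions: the class name C<StatObj> and its constructor are made up for illustration, the field access is borrowed from the core C<File::stat> module, and the "becomes a string on demand" part uses the C<overload> pragma rather than the C<STRING> method this RFC relies on.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

{
    package StatObj;        # hypothetical class name, not part of this RFC
    use File::stat ();      # core module: named access to the stat fields

    # Stringify to the file name, as $stat->STRING might in Perl 6.
    # 'bool' is overloaded too, so a file literally named "0" is still true.
    use overload
        '""'     => sub { $_[0]{filename} },
        'bool'   => sub { 1 },
        fallback => 1;

    sub new {
        my ($class, $file) = @_;
        my $st = File::stat::stat($file) or return;   # false on failure
        return bless { filename => $file, st => $st }, $class;
    }

    sub filename { $_[0]{filename} }
    sub size     { $_[0]{st}->size }
    sub mode     { $_[0]{st}->mode }
    sub mtime    { $_[0]{st}->mtime }
    sub owner    { $_[0]{st}->uid }    # this RFC calls the uid field "owner"
}

# Create a small scratch file so the example has something to stat.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "hello";
close $fh;

my $stat = StatObj->new($file) or die "Can't stat $file: $!";
printf "%s is %d bytes, mode %04o\n", $stat, $stat->size, $stat->mode & 07777;
```

The same variable works as a truth value, as a string in the C<printf>, and as an object with methods, which is the behavior proposed above for the builtin.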

As a second example, let's examine C<getpwnam>, whose return options
are:

   1. A pretty dang big 9-element list
   2. The numeric user id

C<getpwnam> differs from C<stat> in that you have to actually use what
you get in a C<SCALAR> context in many situations. However, this can be
fixed with a mechanism like that in RFC 49:

   $pw = getpwnam $username;  # get the object

   print $pw->gcos, " is uid $pw";   # $pw->STRING could
                                     # reference $pw->uid 

Here, the legacy return value (numeric user id) is preserved. No need to
redo any of the documentation - but advanced users can be told "By the
way, $pw is actually an object..." and so on.
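As a rough Perl 5 illustration of a password entry that "is" its uid when used as a string, here is a minimal sketch. It assumes a hypothetical class C<PwEntry> and uses the C<overload> pragma as a stand-in for the C<STRING> method of RFC 49. The record is made-up data so the example does not depend on the local password file; a real wrapper would pass in the list returned by C<getpwnam>.

```perl
use strict;
use warnings;

{
    package PwEntry;        # hypothetical class, for illustration only

    # In string context the object yields the uid, which is what
    # scalar-context getpwnam returns in Perl 5.
    use overload
        '""'     => sub { $_[0]{uid} },
        'bool'   => sub { 1 },         # uid 0 (root) must still count as found
        fallback => 1;

    sub new {
        my ($class, @fields) = @_;     # fields in getpwnam list order
        return unless @fields;         # no such user: return false
        my %self;
        @self{qw(name passwd uid gid quota comment gcos dir shell)} = @fields;
        return bless \%self, $class;
    }

    sub uid  { $_[0]{uid} }
    sub gcos { $_[0]{gcos} }
}

# Made-up record standing in for: PwEntry->new(getpwnam $username)
my $pw = PwEntry->new('alice', 'x', 1001, 1001, '', '',
                      'Alice Example', '/home/alice', '/bin/sh');

print $pw->gcos, " is uid $pw\n";
```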

=head2 Medium-Level Functions

As a medium-level example, let's take C<fork>. At first, making what's
returned from C<fork> into an object might not seem to have much use.
However, consider how cool it would be if C<fork> maintained stuff like:

    $fork = fork;    # create a new process

    $fork->pid       # current pid
    $fork->ppid      # parent's process id
    $fork->errno     # EAGAIN, for example
    $fork->is_parent # parent?
    $fork->is_child  # child?

Then, you could fork multiple times without having to worry about
tracking which fork is which; it's all right there in the
C<$fork> objects. Furthermore, the default value would be
C<< $fork->pid >>, so C<fork> could still be treated the same way it has
always acted. Adapted from Camel-3 p. 715:

   if ( $fork = fork ) {
      # parent here
      print "fork returned with errno ", $fork->errno;
   }

This example is fairly trite, true. However, I'm sure others who do a
lot of forking can fill the object in with valuable data they wish they
could preserve easily.
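To show that the familiar C<if (fork)> idiom could survive, here is a hedged Perl 5 sketch. The class C<ForkResult> and its C<do_fork> constructor are hypothetical names; only some of the accessors listed above are implemented, and the C<overload> pragma supplies the "acts like the pid" behavior.

```perl
use strict;
use warnings;

{
    package ForkResult;     # hypothetical class; only the accessor names come from this RFC

    # Behave like the value fork returned in string, numeric, and
    # boolean context, so "if ($fork = ...)" keeps its Perl 5 meaning.
    use overload
        '""'     => sub { defined $_[0]{pid} ? $_[0]{pid} : '' },
        '0+'     => sub { $_[0]{pid} || 0 },
        'bool'   => sub { $_[0]{pid} ? 1 : 0 },
        fallback => 1;

    sub do_fork {
        my ($class) = @_;
        my $pid   = fork;                        # undef on failure, 0 in the child
        my $errno = defined $pid ? 0 : $! + 0;   # numeric errno if fork failed
        return bless { pid => $pid, errno => $errno }, $class;
    }

    sub pid       { $_[0]{pid} }
    sub errno     { $_[0]{errno} }
    sub is_child  { defined $_[0]{pid} && $_[0]{pid} == 0 }
    sub is_parent { defined $_[0]{pid} && $_[0]{pid} >  0 }
}

my $fork = ForkResult->do_fork;
if ($fork) {
    # parent: the object is true and stringifies to the child's pid
    waitpid($fork->pid, 0);
    print "parent reaped child $fork\n";
}
elsif ($fork->is_child) {
    exit 0;                 # the child does nothing in this sketch
}
else {
    die "fork failed with errno ", $fork->errno, "\n";
}
```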

As a second example, consider C<system>. Who cares, right? Return
success or failure. But consider:

   $sys = system "some command";   # object
  
   $sys->command     # for record-keeping 
   $sys->errno       # system errno || 0
   $sys->stdout      # standard output from command
   $sys->stderr      # standard error from command
  
There's probably more stuff that could go in there too. Why do this? For
example:

   $sys = system "command";
   if ( $sys->errno ) {
      warn $sys->command, " failed with error ",
           $sys->errno;
      die "Error output: ", $sys->stderr;
   }

For one simple operation, this seems like overkill. And it is. However,
remember that success or failure is still available as the truth value
of the returned object:

   system("command") or die "command failed: $?\n";

That certainly looks familiar. :-)

Again, you get the entire power of objects hidden behind a clean,
familiar looks-like-Perl-5 frontend.
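For the curious, a result object along these lines can be sketched in Perl 5 with the core C<IPC::Open3> module. Everything named here is an assumption made for illustration: the class C<SysResult>, its C<run> constructor, and the C<status> accessor, which holds the command's exit code rather than the C<errno> described above. The object is true when the command exits with status 0, which is the behavior this RFC proposes and not what Perl 5's C<system> returns.

```perl
use strict;
use warnings;

{
    package SysResult;      # hypothetical class, for illustration only
    use IPC::Open3 qw(open3);
    use Symbol qw(gensym);

    # True when the command exited with status 0.
    use overload 'bool' => sub { $_[0]{status} == 0 }, fallback => 1;

    sub run {
        my ($class, @cmd) = @_;
        my $err = gensym;   # open3 needs a real glob for the stderr handle
        my $pid = open3(my $in, my $out, $err, @cmd);
        close $in;
        # Draining stdout before stderr can deadlock if the command writes
        # a great deal to stderr; tolerable in a sketch, not in real code.
        my $stdout = do { local $/; <$out> };
        my $stderr = do { local $/; <$err> };
        waitpid($pid, 0);
        return bless {
            command => "@cmd",
            status  => $? >> 8,     # exit code only; signals are ignored here
            stdout  => defined $stdout ? $stdout : '',
            stderr  => defined $stderr ? $stderr : '',
        }, $class;
    }

    sub command { $_[0]{command} }
    sub status  { $_[0]{status} }
    sub stdout  { $_[0]{stdout} }
    sub stderr  { $_[0]{stderr} }
}

# $^X is the running perl binary, so the example does not depend on any
# particular external command being installed.
my $sys = SysResult->run($^X, '-e', 'print "out"; print STDERR "err"; exit 3');
unless ($sys) {
    print $sys->command, " failed with status ", $sys->status, "\n";
    print "Error output: ", $sys->stderr, "\n";
}
```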

=head2 Simple Functions

Okay, who cares about getting an object back from C<chomp>? Is that
really necessary?

Yes and no. One big advantage is that if B<every> scalar is just an
object in disguise, then this makes implementation way easier. It's just
that some objects might only have one value, rather than a whole gamut
of functions. So, if you went to assign this:

   $the_answer = 42;

"Internally" (take that with a BIG grain of salt), it might do something
like this:

   create_object($the_answer);
   $the_answer->SCALAR = 42;
   $the_answer->STRING = \&SCALAR;

So even though this is a "true" scalar, it's not. It's just an object in
disguise, which only happens to have one value. You'd still be able to
do numeric and string operations on it just like it was a "real" scalar.
All the object stuff would be hidden. Think C<tie>.

However, that's an internals thing, really, and a digression. From a
functional, language standpoint, unless B<everything> is made into an
object, returning objects from some string-related functions might be
overkill. For example, C<chomp>, C<grep>, etc.

Just for kicks, though, consider C<chomp>:

   $newstr = chomp $string;        # new syntax, RFC 58

   print "new string is $newstr";  # calls $newstr->STRING
   print $newstr->numchars, " were chomped\n";

You get the idea. 
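A tiny Perl 5 mock-up of that C<chomp> result might look like the following. The class C<ChompResult> is hypothetical. It works on a copy of the string, unlike Perl 5's in-place C<chomp>, and it simply records the count that Perl 5's C<chomp> already returns.

```perl
use strict;
use warnings;

{
    package ChompResult;    # hypothetical class, for illustration only

    # In string context the object is the chomped string.
    use overload '""' => sub { $_[0]{string} }, fallback => 1;

    sub new {
        my ($class, $string) = @_;
        my $removed = chomp $string;   # chomps the copy, returns chars removed
        return bless { string => $string, numchars => $removed }, $class;
    }

    sub numchars { $_[0]{numchars} }
}

my $newstr = ChompResult->new("hello\n");

print "new string is $newstr\n";
print $newstr->numchars, " were chomped\n";
```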

=head1 IMPLEMENTATION

There are probably about 1,000 ways to implement this, and another 10,000
ways not to. Thank goodness I'm not an internals guy. ;-)

This specification probably needs some bigtime revision. In particular,
there is this problem:

   $uid = getpwnam $user;
   if ( $uid < 100 ) {   # <-- ?
      die "sorry, ", $uid->gcos, ", but your UID is too low\n";
   }

Here, $uid is not being evaluated in a string context, so $uid->STRING
wouldn't be called. If we make B<everything> an object, then this gets
easy because we just define certain operators as calling a C<SCALAR>
function, like C<tie>. But if we don't make everything an object, just
some things, we have to make some key decisions. The above might be
solved because:

   if ( OBJECT=HASH(0xef958) < 100 ) 

doesn't seem like it would ever be useful (to me at least). So we could
make it so that numeric comparisons or other such "true" scalar
operations signal the calling of an object's C<SCALAR> function. But
then we have to consider tough stuff, like C<!> (not the object $uid, or
not $uid->SCALAR?).
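For comparison, Perl 5's C<overload> pragma answers these questions by letting a class supply separate conversions for numeric, string, and boolean context. The sketch below is not a proposal for how Perl 6 should do it; it only shows which conversion each kind of operator asks for. The class C<ScalarLike>, its C<@calls> log, and the sample data are all made up for illustration.

```perl
use strict;
use warnings;

{
    package ScalarLike;     # hypothetical stand-in for a core return object
    our @calls;             # records which conversion Perl asked for

    use overload
        '0+'     => sub { push @calls, 'number'; $_[0]{uid} },
        '""'     => sub { push @calls, 'string'; "$_[0]{uid}" },
        'bool'   => sub { push @calls, 'bool';   1 },
        fallback => 1;      # unlisted operators fall back to these conversions

    sub new  { my ($class, %fields) = @_; return bless {%fields}, $class }
    sub gcos { $_[0]{gcos} }
}

my $uid = ScalarLike->new(uid => 42, gcos => 'Example User');   # made-up data

my $too_low = $uid < 100;     # numeric comparison: uses the '0+' handler
my $text    = "uid $uid";     # interpolation: uses the '""' handler
my $negated = !$uid;          # logical not: uses the 'bool' handler

print "conversions requested: @ScalarLike::calls\n";
print "sorry, ", $uid->gcos, ", but your UID is too low\n" if $too_low;
```

Here C<!> is answered by the boolean conversion, so "not the object" and "not its scalar value" can be decided separately by the class.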

=head1 REFERENCES

Many many discussions by lots of people about something like this.
Most I've read are available on the perl5-porters archive at:
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/

RFC 49: Objects should have builtin stringifying STRING method 
RFC 37: Positional Return Lists Considered Harmful
RFC 21: Replace C<wantarray> with a generic C<want> function 
RFC 58: C<chomp()> changes. 
C<File::stat> and Camel-3 p. 715, 801