RFC 271 (v1) Subroutines : Pre- and post- handlers for subroutines

Perl6 RFC Librarian Thu, 21 Sep 2000 13:19:21 -0700
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Subroutines : Pre- and post- handlers for subroutines

=head1 VERSION

  Maintainer: Damian Conway <[EMAIL PROTECTED]>
  Date: 21 September 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 271
  Version: 1
  Status: Developing

=head1 ABSTRACT

In response to the desiderata set out in RFC 194 and in the
Class::Contract module, this RFC proposes a generic "handler" mechanism
that can install behaviours around (before or after) a subroutine invocation.

=head1 DESCRIPTION

=head2 Overall semantics

It is proposed to provide a mechanism which allows two sequences of
"handlers" to be associated with a specific subroutine or built-in
function (hereafter referred to as the I<primary>).

One sequence of handlers (the I<prefix sequence>) would be called
whenever the primary is invoked, but I<before> the body of the primary
is executed. Each handler would itself be a subroutine, and each would
be called with the same argument list as the primary it prefixes, plus
an extra argument (specifically, $_[-1]) representing a slot for the
eventual return value of the primary. This last argument would normally
have the value C<undef>.

The second sequence of handlers would be called I<after> the body of the
primary has executed, but I<before> the return value is returned to the
invoking scope. Once again, each handler would itself be a subroutine,
and each would be called with the same argument list as the primary it
postfixes, plus the return value(s) argument ($_[-1]). For a postfix
handlers, this extra argument would hold a reference to an array
containing the value(s) actually returned from its primary.

Normally prefix handlers would be prepended to the appropriate prefix
sequence, whilst postfix handlers would be appended to the appropriate
postfix sequence, thereby preserving the symmetry of pre- and post-fix
handlers. The mechanisms for setting up such sequences are described in
L<General Syntax of Handler Installers>.


=head2 Prefix Handler Semantics

A prefix handler may do anything that any other subroutine may do. A
typical action might be to trace or log the invocation of the primary
(the C<pre> syntax is used to set up a handler, and is explained in
L<General Syntax of Handler Installers>).

        package Foo;

        # set up a tracing prefix handler for the subroutine &Foo::bar...

        pre bar { 
                local $" = ','
                my @caller = caller;
                print "Called foo with args (@_[0..$#_-1])\n",
                      "from @caller[1,2]\n",
                      "in contexts: ", join(", ",want);
        }

Note that this (correctly) implies that a handler receives the same information
from C<caller> and C<want> (RFC ???) as its primary would.

Another common usage would be to acquire resources that the primary will use:

        pre read_file {
                flock $_[0], LOCK_EX;
        }

(For the obvious complement of this usage, see L<Postfix Handler
Semantics>).

Note that, in all cases, the return value of the handler is ignored.


=head3 Special semantics: changes to arguments

Each handler receives the same argument list: the one that the primary was
called with.

If a handler changes one of the original arguments through the
aliases in its @_ array, those changes a passed on to subsequent
handlers and to the primary itself. For example:

        pre tax_payable_on {
                $_[0] -= 20.00;         # routinely underquote sales price
        }
                

        sub tax_payable_on {            # now sees prices $20 less than
                                        # specified argument
                printf("Tax: %.2lf", $_[0] * 0.1);
        }

        tax_payable_on(99.95);  # prints 8.00
        tax_payable_on(29.95);  # prints 0.99
        tax_payable_on( 9.95);  # prints -1.01  (a profit!)


=head3 Special semantics: changes to return value slot

If a handler changes the value of the (normally C<undef>) return value slot
(i.e. $_[-1]), then the remaining handlers (prefix and postfix) are still
called, but the primary itself is I<not> called. Instead, the defined
value in $_[-1] is used as the return value for the primary.
This allows techniques such as memoization to be implemented:

        my %sin_cache = (
                          0               =>  0,
                          1.5707963267949 =>  1,
                          3.1415926535898 =>  0,
                          4.7123889803847 => -1,
                          6.2831853071796 =>  0,
                        );

        pre CORE::sin {
                $_[-1] = $sin_cache{$_[0]};     # short-circuit if cache
                                                # value defined
        }


One might also use this feature to short-circuit upon failure to acquire
resources:

        pre read_file {
                flock($_[0], LOCK_EX|LOCK_NB)
                        or $_[-1] = "";         # Can't lock file so...
                                                # "No data for you!"
        }


=head3 Special semantics: exceptions

If a handler throws an exception, that exception is immediately
propagated back to the calling scope, without any other handlers (nor
the primary itself) having been invoked. This is useful for setting 
up preconditions on subroutine and methods (for Design-By-Contact programming).
For example:

        package PriorityList;

        pre pop {
                my ($self) = @_;
                croak "Can't pop empty PriorityList"
                        unless @{$self->{list}} > 0;
        }

An exception might also be an appropriate response
on resource acquisition failure:

        pre read_file {
                flock($_[0], LOCK_EX|LOCK_NB) or
                        croak "Couldn't get immediate lock";
        }



=head2 Postfix Handler Semantics

When called, a postfix handlers may do anything that any other subroutine 
may do. Just like a prefix handler, it may print out tracing information:

        package Foo;

        post bar { 
                local $" = ','
                my @caller = caller;
                print "Finished call to foo from @caller[1,2]\n",
                      "returning: @{$_[-1]}";
        }

or release resources:

        post read_file {
                flock $_[0], LOCK_UN;
        }

or alter the primary's argument list (for example: to effect changes to
the original values after the primary has finished with them):

        post issue_licence {
                $_[0]--;        # decrement licence counter
                carp "You just used your last licence"
                        if $_[0] == 0;
        }

or change the return value:

        post tax_payable_on {
                $_[-1] -= 1.00;         # routinely underquote tax payable
        }
                
        sub tax_payable_on {
                printf("Tax: %.2lf", $_[0] * 0.1);
        }

        tax_payable_on(99.95);  # prints 9.00
        tax_payable_on(29.95);  # prints 2.00
        tax_payable_on( 9.95);  # prints -0.01


or throw an exception (for example, to implement method post-conditions
in a DBC system):

        package PriorityList;

        post pop {
                my ($self, $return_val) = @_;
                croak "pop returned undef (something odd is happening)"
                        unless defined $return_val;
        }

Again, changes to any element of @_ (particularly to the return value:
$_[-1]) propagate to later postfix handlers and -- eventually -- back to
the original calling scope


=head2 General Syntax of Handler Installers

It is proposed that two built-in functions -- C<pre> and C<post> -- be
introduced. These may be used to install prefix or postfix handlers for
any subroutine (or for all subroutines in a single package).
Each would take two arguments:

=over 4

=item 1.

Either a subroutine name (possibly as a bareword),
or a subroutine reference, or a class name (possibly as a bareword).
This argument would be mandatory.

=item 2.

Either a block or subroutine reference implementing the handler,
or a hash reference whose value is a sub reference implementing the handler,
or a string or bareword specifying the logical name of a handler.
This argument would be optional.

=back

Thus, in the notation proposed by RFC 128:

        pre  ( [""&]identifier; [""&\%]option)

        post ( [""&]identifier; [""&\%]option)

with all the parsing niceties that implies (e.g. optional barewords in
either position, optional separating comma and optional
terminating semicolon if the second argument is a block, etc.)


=head2 Installing C<pre> Handlers

A call to C<pre> would normally install the handler implementation
specified by the second argument) as a prefix handler for the
subroutine specified by the first argument. The handler would be
installed by I<prepending> it to an internal list of handlers for the
subroutine (or package).

Note that C<pre> (and later, C<post>) are I<not> compile-time directives,
but are regular built-in functions that are called as the program executes.
This means that they can be used either to set up handlers during compilation
(e.g. by calling them in the body of a module) or during execution (by
calling them from the main package or from some other subroutine).

If the first argument is a string or bareword, the prefix handler is
installed for the subroutine or method of the same name in the current
package. For example:

        package Foo;

        pre bar { print "about to bar()..." };

installs a prefix handler for the subroutine C<&Foo::bar>.

If the specified name is qualified (i.e. contains at least one instance
of "::"), then the handler is installed for the subroutine in the
qualifying package. For example:

        package Foo;

        pre Coola::bar, sub { print "about to bar()..." };

installs a prefix handler for the subroutine C<&Coola::bar>.

If the first argument ends in "::" and is therefore the name of a package --
rather than a subroutine -- then the handler is installed in 
a special I<package prefix sequence>, and is shared by every
existing (and future) subroutine in the package. For example:

        pre Cafe::Bar::, sub { print "calling some method of class Cafe::Bar" };

installs a prefix handler that will be invoked whenever I<any> subroutine 
from package Cafe::Bar is called. 

If the first argument is a subroutine reference, the handler is installed
for the referent. For example:

        package Foo;

        pre \&Zan::Zi::bar, sub { print "about to bar()..." };

installs a prefix handler for the subroutine C<&Zan::Zi::bar>, whilst:

        package Foo;

        my $anon = sub { print "I am the sub with no name\n" };

        pre $anon, sub { print "about to bar()..." };

installs a prefix handler for the anonymous subroutine referred to by $anon.

It should be noted that the following two calls are I<not> 
identical in their effects:

        pre "Cross::bar", sub { print "high jumping..." };
        pre \&Cross::bar, sub { print "high jumping..." };

For an explanation of the difference between them see, L<Inheritance of
Handlers>.


=head2 Installing C<post> Handlers

The semantics of the C<post> function are symmetrical with those
of the C<pre> function described in the previous section.

A call to C<post> would normally install the handler implementation
specified by the second argument) as a postfix handler for the
subroutine specified by the first argument. The handler would be installed
by C<appending> it to the end of an internal list of handlers for the
subroutine (or package).

If the first argument is a string or bareword, the postfix handler is
installed for the subroutine or method of the same name in the current
package. For example:

        package Foo;

        post bar { print "...just finished bar()" };

installs a postfix handler for the subroutine C<&Foo::bar>.

If the specified name is qualified, then the handler is installed for
the subroutine in the qualifying package. For example:

        package Foo;

        post Coola::bar, sub { print "...just finished bar()" };

installs a postfix handler for the subroutine C<&Coola::bar>.

If the first argument ends in "::" and is therefore the name of a package --
rather than a subroutine -- then the handler is installed in 
a special I<package postfix sequence>, and is shared by every
existing (and future) subroutine in the class. For example:

        post Cafe::Bar::, sub { print "called some method of class Cafe::Bar" };

installs a postfix handler that will be invoked whenever I<any> subroutine 
from package Cafe::Bar is called. 

If the first argument is a subroutine reference, the handler is installed
for the referent. For example:

        package Foo;

        post \&Zan::Zi::bar, sub { print "...just finished bar()" };

installs a postfix handler for the subroutine C<&Zan::Zi::bar>, whilst:

        package Foo;

        my $anon = sub { print "I am the sub with no name\n" };

        post $anon, sub { print "...my name was Nobody" };

installs a postfix handler for the anonymous subroutine referred to by $anon.

It should be noted that, as with C<pre>, the following two calls are
I<not> identical in effect:

        post "Cross::bar", sub { print "...and landing" };
        post \&Cross::bar, sub { print "...and landing" };

For an explanation of the difference between them see, L<Inheritance of
Handlers>.

=head2 Logically Identified Handlers

In order to later manipulate or remove prefix and postfix handlers, it
is convenient to given them a logical name. To this end, the second argument to
C<pre> or C<post> may be specified as reference to a one-element hash:

        pre Lum::bar, { PAIN => sub { print "ouch!..." } };
        post Lum::bar, { PAIN => sub { print "...ahhh!" } };

These have exactly the same handler-setting effects as:

        pre Lum::bar, sub { print "ouch!..." };
        post Lum::bar, sub { print "...aaah!" };

except that the specified handlers now have the logical name "PAIN" associated
with them.

This is important, because at a later point in the code it might be desirable
to replace one or both of the handlers (whilst keeping them
in the same sequential position):

        pre Lum::bar, { PAIN => sub { print "OUCH!!!!" } };

That is, when a handler is re-specified with a logical name, if a
handler of that name already exists in the appropriate prefix or postfix
handler sequence, the new handler replaces the existing one in situ,
rather than being prepended or appended to the handler sequence.

The logical name could also be used to I<remove> the handler
completely:

        pre Lum::bar, { PAIN => undef };

which would delete the named handler from the sequence. A subsequent
C<pre> specification of a handler with the logical name "PAIN" would
therefore I<prepend> the new handler to the prefix sequence (since it no
longer appears within the subroutine's prefix sequence). If the aim is to
remove a named handler, but leave its logical name as a placeholder for
future re-instatement, the following should be used instead:

        pre Lum::bar, { PAIN => sub{} };


=head3 Accessing named handlers

It is also possible to retrieve a reference to a named handler, by
passing just its logical name as the second argument. For example:

        $pain_handler = pre Lum::bar, 'PAIN';

        $tree_handler = post \&Coola::bar, 'IS_SHADY';

        $package_handler = pre Cafe::Bar::, 'CHECK_SUGAR';

Finally, it is possible to retrieve the entire prefix or postfix
handler sequence for a particular subroutine, by passing just its
name/reference to C<pre>:

        $pain_handler_array_ref = pre Lum::bar;

        $tree_handler_array_ref = post \&Coola::bar;

        $package_handler_array_ref = pre Cafe::Bar::;

This returns a reference to the actual handler sequence for the
subroutine, in which each element is a reference to a hash whose single
key is the logical name of the handler (or "" if it has no logical name)
and whose single value is a reference to the handler itself. That is, the
structure of the returned value is like this:

        [
            { 'NAME' => sub {...} },
            { 'NAME' => sub {...} },
            { ''     => sub {...} },
            { 'NAME' => sub {...} },
            { ''     => sub {...} },
        ]

Changes to this array change the actual prefix handling of the corresponding
subroutine.

For example, a particular module might require that any prefix handler
that it installs for C<&CORE::open> (the handler mechanism applies to
built-in functions too) is the very last handler that is called. Hence
the module cannot use:

        package LastOpen;

        sub import {
                pre CORE::open, \&LastOpen::handler;
        }

as that would I<prepend> the C<&LastOpen::handler> subroutine to the sequence
of handlers for C<CORE::open>).

Instead, the LastOpen module would use:

        package LastOpen;

        sub import {
                my $open_prefix_sequence = pre Core::open;
                push @{$open_prefix_sequence}, { "" => \&LastOpen::handler };
        }

thereby ensuring the prefix handler is called at the end of C<CORE::open>'s
prefix sequence.



=head2 Inheritance of Handlers

L<Installing C<pre> Handlers> noted that the following two calls to C<pre>
are not identical in effect:

        pre \&Cross::bar, sub { print "high jumping..." };
        pre "Cross::bar", sub { print "high jumping..." };

The difference between them lies in the way they are treated when the
class Cross is inherited by some other class. For example:

        package Irate;
        use base 'Cross';

        sub bar { print "humbug!" }

If a prefix (or postfix) handler for C<&Cross::bar> has been defined via a
direct reference to the subroutine:

        pre  \&Cross::bar, sub { die if $_[1] < $qualifying_height }
        post \&Cross::bar, sub { die unless $_[0]->still_balanced() };

then that handler is I<not> inherited by C<Irate::bar>. Handlers defined
in this way are referred to as being I<non-heritable>.

However, if the handler was defined using a string or bareword representing
C<&Cross::bar>'s name:

        package Cross;

        pre  bar => sub { die if $_[1] < $qualifying_height };
        post 'Cross::bar' { die unless $_[0]->still_balanced() }

then that handler I<is> inherited by C<Irate::bar>. Hence, handlers
defined in this way are referred to as being I<heritable>.

Heritable handlers are associated with the I<identifier> (i.e. name)
of a particular subroutine, not with its I<identity> (i.e. reference).
Thus it is possible to specify prefix or postfix handlers for a subroutine
that exists C<in name only> within a package. Specifically, it is possible
to associate a heritable handler with a subroutine that is not defined in
a package but is, instead, inherited from some ancestral package.


=head3 Inheritance of postfix handlers

Inheritance of postfix handlers is simple. The theory of
Design-By-Contact programming states that heritable postconditions on a
method must be as strong or stronger in a derived class. That is, a
derived-class method may make extra guarantees about its behaviour, but may
never renege on those guarantees already made by its base-class
counterpart.

This implies that any postfix handlers inherited by a derived-class
method act as if they are appended to its heritable postfix sequence, and thus
are called I<after> the derived-class postfix handlers, at which time they
have the opportunity to impose additional constraints on the method.

For example:

        package Cat;

        sub new { 
                my ($class, $name, $weight) = @_;
                bless { name => $name, weight => $weight }, $class;
        }

        post new {
                die "Anti-matter cat detected"
                        if $_[-1]->{weight} <= 0;
        }

        package Tiger;
        use base 'Cat';

        post new {
                die "Tiger died of shame"
                        if $_[-1]->{name} eq 'Fluffy';
        }

Given the above definitions, a Cat object need only have a positive
weight, but a Tiger object must have both a positive weight I<and> an
unembarrassing name.

Note that these same inheritance semantics also apply to package-wide
postcondition handlers (since they are always specified by name, not by
reference). This makes it possible to set up inheritable class invariants.
For example:

        package Cat;

        pre Cat::, sub { $_[0]->is_dry() or die "Wet cat :-(" }


=head3 Inheritance of prefix handlers

The inheritance behaviour of prefix handlers is less straightforward.

DBC theory states that the heritable preconditions defined on a
derived-class method must be no stronger (and may well be weaker) than those
for its base-class counterpart. In other words, a derived-class method is
allowed to be more permissive in its requirements, but not more
demanding.

This implies that any heritable prefix handlers inherited from a base
class are executed I<disjunctively> with those heritable handlers
specified in the derived class. In practice this means that the set of
heritable prefix handlers from the base class(es) is invoked within an
C<eval> and if none of them throws an exception, that success suffices
to satisfy the precondition constraint on the method, in which case
the heritable prefix handlers of the derived class are not called
(although the non-heritable handlers are still invoked). If the eval
terminates because some inherited handler threw an exception, then the
full set of prefix handlers (heritable and non-heritable) for the
derived-class are tried instead.

For example, given:

        package Cat;

        sub new { 
                my ($class, $claw_len, $caged) = @_;
                bless { claw_len => $claw_len, caged => $caged }, $class;
        }

        pre new {
                my ($class, $claw_len) = @_;
                die "Too dangerous"
                        if $claw_len > 0.5;
        }

        package Tiger;
        use base 'Cat';

        post new {
                my ($class, $claw_len, $caged) = @_;
                die "Too dangerous"
                        unless $caged;
        }

then a Cat object must have properly trimmed claws, whereas a Tiger may either
have properly trimmed claws (i.e. the heritable precondition succeeds)
I<or> be safely caged (i.e. the derived precondition succeeds).


=head1 MIGRATION ISSUES

None.

Though various modules might like to take advantage of the standardized
facility.


=head1 IMPLEMENTATION

A prototype (pure Perl) implementation featuring everything except
package-wide handlers will be available (as the Hook::SubHandlers
module) some time next week. It will be archived as:

        http://www.csse.monash.edu.au/~damian/CPAN/Hook-SubHandlers.tar.gz

and subsequently added to the CPAN.

As Jarkko points out in RFC 194, the implementation details of true built-in
C<pre> and C<post> functions will depend very heavily on what becomes of
typeglobs.


=head1 REFERENCES

RFC 21: Subroutines: Replace C<wantarray> with a generic C<want> function

RFC 23: Higher order functions

RFC 128: Subroutines: Extend subroutine contexts to include name parameters and lazy 
arguments

RFC 137: Overview: Perl OO should I<not> be fundamentally changed.

RFC 194: Standardise Function Pre- and Post-Handling

Meyer, B, "Object Oriented Software Construction, 2nd Edition", Prentice Hall.
RFC 271 (v1) Subroutines : Pre- and post- handlers for subroutines

Reply via email to