[PHP-DEV] Extensions to traits

Ben Schmidt Sun, 02 Jan 2011 04:17:44 -0800

Hello, PHP developers,

I'm new to the list, but have been using PHP for a number of years, have
done a little hacking in the source code, and have an interest in the
development of the language.


Recently, in particular, I have been reading with a good deal of excitement
Stefan Marr's RFC on 'horizontal reuse' on the PHP Wiki:
http://wiki.php.net/rfc/horizontalreuse

The trait functionality looks like a great start and an innovative
language development, and I'm looking forward to trying it out soon
(when I can find some more time!), and particularly looking forward to
making good use of it when it makes it into a release.

While it's still in the pre-release stage, though, I would like to put
in a vote for the assignment syntax: I think it is a lot easier to read
and understand than the 'insteadof' syntax.

I would also like to propose some extensions to the functionality as
currently described, which I think could potentially add tremendous
power to the mechanism, with relatively little additional conceptual
complexity and implementation effort. I've written it up as a bit of a
proposal below.

I'd love to hear what you think.

I would be willing to play a part implementing it, too.

Cheers,

Ben.



=============================
Proposed extensions to traits
=============================

Background
==========

Traits in PHP [1] enable improved code reuse. They can be simplistically
viewed as compiler-assisted copy-and-paste. Methods designed to be
reused can be defined in traits and then these traits can be used in
classes. The traits are 'flattened', so it is as if the trait methods
were defined directly in the class in which they are used. Traits can
access other methods and properties of the class, including those of
other traits. They also fit in with the method overriding system:
methods defined directly in a class override those in used traits, which
in turn override those in ancestor classes.
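
As a minimal sketch of that ordering, using the syntax described in the
RFC [1] (the names Base, Greeter and Child are made up purely for
illustration):

```php
<?php
// Ancestor class: lowest precedence of the three definitions.
class Base {
    public function who()   { return "Base"; }
    public function greet() { return "hello from Base"; }
}

// Trait: overrides inherited methods, but is overridden by the class proper.
trait Greeter {
    public function who()   { return "Greeter"; }
    public function greet() { return "hello from Greeter"; }
}

class Child extends Base {
    use Greeter;
    // Defined directly in the class, so it wins over Greeter::who().
    public function who() { return "Child"; }
}

$c = new Child();
echo $c->who(), "\n";   // Child
echo $c->greet(), "\n"; // hello from Greeter
```

Child::who() comes from the class itself, while Child::greet() comes
from the trait rather than from Base.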

There are two limitations of traits in their current implementation for
which I would like to propose extensions. The first limitation is that
traits can very easily break, particularly when methods are omitted from
classes in which the rest of the trait is used, or shadowed by method
definitions in the class proper. The second limitation is that the trait
overriding semantics are impoverished and needlessly restrictive.

Breakability
============

Limitation
----------

There are two main aspects of traits which make them easy to break:
incorrect method calls and unintentionally shared state.

Incorrect method calls spring from the way trait methods can be omitted
from classes where the rest of the trait is used, or shadowed by methods
defined in the class proper. In either of these scenarios, any call in a
trait to such a method may not call the method that was originally
intended--it may fail, or it may call a different method, with
unpredictable results. Of course, sometimes such behaviour is
desirable, for instance when writing a trait which communicates with the
rest of the class by means of method calls, yet provides fallback
methods in case the class author does not wish to provide them. However, when it
is not intended, this could lead to incorrect and difficult-to-pinpoint
behaviour.

The other way traits can break is by unintentionally sharing state.
Traits may make use of the same data members (not recommended, but
possible), or the same accessors, when each should actually have its
own independent state. Again, this could lead to incorrect and
difficult-to-pinpoint behaviour.

Example
-------

trait ErrorReporting {
   public function error($message) {
      $this->print($message);
   }
   private function print($message) {
      fputs($this->output,$message."\n");
   }
}

class Printer {
   use ErrorReporting;
   public $output=null;
   public function print($document) {
      /* Send the document to the printer--$this->output. */
      /* ... */
      if (there_was_an_error()) {
         $this->error("printing failed");
      }
      /* ... */
   }
}

This example is very contrived, and hopefully no programmer would be
silly enough to fall into this exact trap. However, it is easy to
imagine more subtle cases where this kind of thing could happen,
particularly as traits and classes are modified from their original
conception.

The ErrorReporting trait allows the programmer to report errors in a
consistent way by using the trait in many classes. It includes a print()
method that is used to print the error to the screen. However, this
method has been unintentionally shadowed by a print method in the class,
intended to print a document on a printer. No error or warning will be
generated, but the class will not work as intended: most likely it will
recurse infinitely, which is a nasty problem to track down.

Furthermore, even if the incorrect method call didn't occur, there would
be data-sharing problems, as both the ErrorReporting trait and the
class' print() function make use of the $output data member,
unintentionally sharing data.
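
The shadowing half of the problem can be reproduced without the
recursion in a smaller sketch. The method name emit() and the classes
Plain and Shadowing are invented for this illustration; the methods
return strings rather than writing to a stream so the effect is easy to
see.

```php
<?php
trait ErrorReporting {
    public function error($message) {
        // Meant to reach the trait's own emit(), but the call is resolved
        // on $this, so a class-level emit() takes over.
        return $this->emit($message);
    }
    private function emit($message) {
        return "error: " . $message;
    }
}

// Uses the trait as its author intended.
class Plain {
    use ErrorReporting;
}

// Happens to define a method with the same name as the trait's helper.
class Shadowing {
    use ErrorReporting;
    public function emit($document) {
        return "sent to printer: " . $document;
    }
}

echo (new Plain())->error("printing failed"), "\n";
// error: printing failed
echo (new Shadowing())->error("printing failed"), "\n";
// sent to printer: printing failed
```

No error or warning is produced for Shadowing; the trait's error()
simply ends up calling the wrong method.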

Proposal
--------

I suggest these problems should be solved from two angles. Firstly,
additional warnings should be triggered to alert the programmer to the
problems, and secondly, the traits mechanism should be extended to allow
more desirable behaviours to be programmed.

Warnings
- - - -

To avoid silent unintended shadowing, I suggest issuing a warning when a
conflict between trait and class methods occurs. So this would trigger
a warning:

   trait SaySomething {
      public function sayIt() {
         echo "Something\n";
      }
   }
   class Sayer {
      use SaySomething;
      public function sayIt() {
         echo "Hello world!\n";
      }
   }

Something such as this would be required to suppress it:

   use SaySomething {
      sayIt = null;
   }

or perhaps with a more 'insteadof'-like syntax something like:

   use SaySomething {
      unset sayIt;
   }

This indicates that no sayIt() method should be included from any trait
(and thus there is no conflict with any sayIt() method defined in
Sayer). We could also have:

   use SaySomething {
      sayIt = SaySomething::sayIt;
   }

which would also suppress the warning, but indicates that we know what
we are doing--we desire the trait method to be included. This is only
useful in combination with my proposal for dealing with overriding below
('prev').
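
Until such a warning exists in the engine, something similar can be
approximated in userland with reflection. The following is only a rough
sketch: shadowed_trait_methods() is a hypothetical helper, not an
existing function, and it relies on the assumption that a method copied
in from a trait keeps the trait's file name and start line.

```php
<?php
trait SaySomething {
    public function sayIt() {
        return "Something\n";
    }
}
class Sayer {
    use SaySomething;
    public function sayIt() {
        return "Hello world!\n";
    }
}

// Hypothetical helper: lists "Trait::method" for every method of a directly
// used trait whose name resolves to a different definition in the class.
// Caveat: methods excluded or replaced via conflict resolution would be
// reported too, so this is broader than the warning proposed above.
function shadowed_trait_methods($class) {
    $rc = new ReflectionClass($class);
    $shadowed = array();
    foreach ($rc->getTraits() as $trait) {
        foreach ($trait->getMethods() as $tm) {
            $name = $tm->getName();
            if (!$rc->hasMethod($name)) {
                continue; // method was not included under this name
            }
            $cm = $rc->getMethod($name);
            // A method copied from the trait keeps the trait's source position.
            $same = $cm->getFileName() === $tm->getFileName()
                 && $cm->getStartLine() === $tm->getStartLine();
            if (!$same) {
                $shadowed[] = $trait->getName() . "::" . $name;
            }
        }
    }
    return $shadowed;
}

print_r(shadowed_trait_methods('Sayer'));
```

For Sayer this reports SaySomething::sayIt, which is the case the
proposed warning is meant to catch.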

Extension
- - - - -

I suggest these two problems can be simply solved by introducing two
additional uses of the trait keyword: as a scoping keyword and an access
specifier.

As a scoping keyword, it would be used analogously to self. Method calls
such as $this->print() could be replaced with trait::print() when the
programmer desires to ensure that their trait method, and only their
trait method, is called--when there is no intention that overriding
should be possible. It would only be able to be used in a trait, and
could only be used to reference methods or properties defined in the
same trait, using their original name.

As an access specifier, it would be used instead of public, private,
etc. in trait definitions, to mean that the member (data or method) can
and can only be accessed using the mechanism above (trait::).

Implementation could be very simple. When flattening a trait into a
class, every trait method, and every trait property with trait level
access, could be included with a mangled name (e.g. making use of the
reserved __ prefix and/or characters which are illegal in code, e.g.
__trait-TraitName-methodName), and any occurrences of trait:: scoping in
any trait method body could be replaced with a call to the same kind of
mangled name (e.g. trait::print() becomes
$this->__trait-ErrorReporting-print()). Data members could be treated in
exactly the same way (e.g. trait::$output becomes
$this->__trait-ErrorReporting-output). Static members pose no additional
problems. When flattening a trait into another trait, the
mangling/transformation is slightly different, but not much harder.
Perhaps a little demangling code for backtraces and/or error messages
would be nice. This would be sufficient, though. The trait access
specifier is nothing more than an indication that a method should be
omitted with its unmangled name (essentially the same as an insteadof
directive, but without any method taking its place), or that a property
should be included with a mangled name, rather than going through the
existing property conflict checking mechanism.

I realise a proposal for non-breakable traits [2] has already been
declined. However, I believe my proposal here is simpler to understand
and implement. It doesn't introduce any new overriding or
property-sharing semantics, doesn't overload the public and private
keywords with confusing additional meanings, and fits better with PHP's
dynamic nature by not requiring a decision at compile time about whether
or not to mangle a name (most notably in method bodies).

Rewritten example
-----------------

The original example would work as desired if rewritten thus:

trait ErrorReporting {
   trait $output = STDOUT;
   public function setErrorOutput($output) {
      trait::$output = $output;
   }
   public function error($message) {
      trait::print($message);
   }
   trait function print($message) {
      fputs(trait::$output,$message."\n");
   }
}

class Printer {
   use ErrorReporting;
   public $output=null;
   public function print($document) {
      /* Send the document to the printer--$this->output. */
      /* ... */
      if (there_was_an_error()) {
         $this->error("printing failed");
      }
      /* ... */
   }
}
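
To make the mangling idea concrete, here is a hand-written class that
behaves the way I would expect the flattened Printer to behave. It is a
sketch only: underscores stand in for the illegal-in-code separators
suggested earlier so that the file parses, the error stream is set
explicitly instead of defaulting to STDOUT, and print() is reduced to a
stub (a method named print needs PHP 7 or later).

```php
<?php
class Printer {
    // The class's own state, untouched by the trait.
    public $output = null;
    // was: trait $output (stored under a mangled name)
    private $__trait_ErrorReporting_output;

    // was: trait::$output = $output;
    public function setErrorOutput($output) {
        $this->__trait_ErrorReporting_output = $output;
    }
    // was: trait::print($message);
    public function error($message) {
        $this->__trait_ErrorReporting_print($message);
    }
    // was: trait function print($message)
    private function __trait_ErrorReporting_print($message) {
        fputs($this->__trait_ErrorReporting_output, $message . "\n");
    }
    // The class's own print(); it no longer collides with the trait's.
    public function print($document) {
        return "printing " . $document;
    }
}

$log = fopen('php://memory', 'w+');
$p = new Printer();
$p->setErrorOutput($log);
$p->error("printing failed");
rewind($log);
echo stream_get_contents($log); // printing failed
```

Both print() methods and both $output members coexist because the
trait's versions are only reachable through their mangled names.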

Overriding
==========

Limitation
----------

At present, the overriding semantics of traits are that a method defined
in a class proper overrides a method defined in a used trait which in
turn overrides a method defined in an ancestor class.

However, to my knowledge, there is no way for a class method to call a
trait method by the same name.

Furthermore, it is my belief that completely disallowing trait methods
of the same name, rather than allowing an overriding behaviour between
traits, is needlessly restrictive.

Proposal
--------

I would therefore like to propose an extension backwards-compatible with
the current trait implementation. I will, however, extend the assignment
syntax, rather than the 'insteadof' syntax, as I find that clearer, and
more amenable to this extension. Of course, though, other syntaxes could
be found.

There are three aspects to this extension: (1) Introducing a new scoping
keyword. (2) Allowing a method name to be used from multiple traits. (3)
Allowing a trait to be included multiple times.

(1) Introducing a new scoping keyword.
- - - - - - - - - - - - - - - - - - -

I suggest something such as 'prev', to refer to the previous definition
of the method. Similar to 'parent', and the same in the absence of
traits, this refers to the 'next higher definition in the trait
hierarchy'; the 'trait hierarchy' is pictured like the 'class hierarchy'
but including traits. So if 'prev' is used in a class method when a
trait method of the same name exists, it will refer to the trait method,
rather than referring to a method in a parent class. Alternatively, the
meaning of the 'parent' keyword could be changed to this. My
apologies if this is already the case: I have not played with the
implementation in the trunk (though look forward to doing so at some
stage) so am basing my comments purely on the RFC.

(2) Allowing a method name to be used from multiple traits.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

When multiple methods of the same name are defined they simply take
their place in the 'trait hierarchy' and can be accessed by means of
'prev' (see (1) above).

So we could write, for instance:

   trait Hello {
      public function sayIt() {
         echo "Hello ";
      }
   }
   trait World {
      public function sayIt() {
         prev::sayIt();
         echo "world ";
      }
   }
   class HelloWorld {
      use Hello, World {
         sayIt = Hello::sayIt, World::sayIt;
      }
      public function sayIt() {
         prev::sayIt();
         echo "!\n";
      }
   }
   $o = new HelloWorld();
   $o->sayIt();
   // Outputs "Hello world !\n"

sayIt() in the class overrides sayIt() in World, which overrides sayIt()
in Hello, but all are included. The first two make use of 'prev' to
reference those higher up the hierarchy.
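
For comparison, the nearest equivalent I can sketch with the 'insteadof'
and 'as' syntax from the RFC [1] has to name every level by hand and do
the chaining in the class. The aliases sayHello and sayWorld are invented
for this sketch, and the methods return strings instead of echoing.

```php
<?php
trait Hello {
    public function sayIt() {
        return "Hello ";
    }
}
trait World {
    public function sayIt() {
        return "world ";
    }
}
class HelloWorld {
    use Hello, World {
        // Resolve the name clash explicitly, then expose both versions
        // under separate names so the class can reach them.
        World::sayIt insteadof Hello;
        Hello::sayIt as sayHello;
        World::sayIt as sayWorld;
    }
    // The class has to know about, and call, each trait method itself.
    public function sayIt() {
        return $this->sayHello() . $this->sayWorld() . "!\n";
    }
}

$o = new HelloWorld();
echo $o->sayIt(); // Hello world !
```

Here World::sayIt() cannot delegate to Hello::sayIt() on its own, which
is exactly what 'prev' is meant to allow.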

(3) Allowing a trait to be included multiple times.
- - - - - - - - - - - - - - - - - - - - - - - - - -

This is only useful with the extension I proposed above to deal with
breakability, where traits can be somewhat isolated.

To achieve this, the trait is simply given multiple names by which to
refer to it when it is used, and then its methods are referenced using
those names. So, for instance:

   trait PublicQueue {
      trait $arr = array();
      public function add($item) {
         trait::$arr[]=$item;
      }
      protected function remove() {
         return array_shift(trait::$arr);
      }
   }
   class OperationManager {
      use Q1=PublicQueue, Q2=PublicQueue {
         localQueue = Q1::add;
         remoteQueue = Q2::add;
         localUnqueue = Q1::remove;
         remoteUnqueue = Q2::remove;
         add = null;
         remove = null;
      }
      public function process() {
         while ($i=$this->localUnqueue()) echo "local $i\n";
         while ($i=$this->remoteUnqueue()) echo "remote $i\n";
      }
   }
   $o = new OperationManager();
   $o->localQueue("one");
   $o->remoteQueue("two");
   $o->localQueue("three");
   $o->process();
   // Outputs local one\nlocal three\nremote two\n

Implementation
- - - - - - -

I don't think any of this poses particular implementation difficulties.
Everything that is necessary is pretty obvious: name mangling of trait
methods needs to use the aliases (e.g. Q1 in the example) rather than
the original trait name; multiple methods need to be able to be included
in a single class, with an ordering; and 'prev' needs to start a method
search at the current class, but not actually make a call until the
method using 'prev' has been passed in the search (or else, have some
way of starting the method search at the relevant place within a class);
only when prev is used is anything but the first method with a given
name called.
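
As a rough illustration of the alias-based mangling, this is what a
hand-flattened OperationManager might look like. It is a sketch under the
same assumptions as before (underscores instead of illegal characters in
the mangled names), and process() returns its output as a string rather
than echoing it.

```php
<?php
class OperationManager {
    // 'trait $arr' from PublicQueue, once per alias.
    private $__trait_Q1_arr = array();
    private $__trait_Q2_arr = array();

    // localQueue = Q1::add
    public function localQueue($item) {
        $this->__trait_Q1_arr[] = $item;
    }
    // remoteQueue = Q2::add
    public function remoteQueue($item) {
        $this->__trait_Q2_arr[] = $item;
    }
    // localUnqueue = Q1::remove
    protected function localUnqueue() {
        return array_shift($this->__trait_Q1_arr);
    }
    // remoteUnqueue = Q2::remove
    protected function remoteUnqueue() {
        return array_shift($this->__trait_Q2_arr);
    }

    // array_shift() returns null on an empty array, which ends each loop.
    public function process() {
        $out = "";
        while ($i = $this->localUnqueue()) $out .= "local $i\n";
        while ($i = $this->remoteUnqueue()) $out .= "remote $i\n";
        return $out;
    }
}

$o = new OperationManager();
$o->localQueue("one");
$o->remoteQueue("two");
$o->localQueue("three");
echo $o->process();
// local one
// local three
// remote two
```

Each alias gets its own copy of the trait's state, so the two queues
never interfere with each other.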

Fuller example
==============

A more 'real-world' example is definitely in order now. One possible
scenario where all this kind of functionality could be useful is in
defining active record classes, where objects can 'retrieve themselves'
from a database, and 'update themselves' in the database. There are many
different semantics for records which could be added incrementally with
overriding traits, to construct classes with many different behaviours
and combinations of behaviours. With many, many database record types, a
lot of code duplication could be avoided. Just a taste:

   abstract class ActiveRecord {
      protected $new;
      protected $id;
      protected $other_values;
      protected function __construct($id,$values,$new) {
         $this->id=$id;
         $this->other_values=$values;
         $this->new=$new;
      }
      public function save() {
         if ($this->new) {
            if (!create_in_the_database()) return false;
            if ($this->id===null) $this->id=last_insert_id();
         } else {
            if (!update_in_the_database()) return false;
         }
         return true;
      }
      public static function new() {
         return new static(null,static::$default_values,true);
      }
      public static function get($id) {
         return new static($id,get_from_the_database(),false);
      }
   }
   trait LoggingOperations {
      public function save() {
         if ($this->new) {
            log("Creating ".get_called_class());
         } else {
            log("Updating ".get_called_class()." ID ".$this->id);
         }
         if (!prev::save()) {
            log("Failed");
            return false;
         }
         log("Succeeded");
         return true;
      }
   }
   trait EnsuringNoConcurrentChanges {
      trait $original_values = array();
      protected function setOriginalValues($values) {
         trait::$original_values = $values;
      }
      public static function get($id) {
         $record = prev::get($id);
         $record->setOriginalValues($record->other_values);
         return $record;
      }
      public function save() {
         $current_values=select_from_database();
         if ($this->new&&$current_values) return false;
         if (!$this->new&&!$current_values) return false;
         if ($current_values!=trait::$original_values) return false;
         return prev::save();
      }
   }
   trait UsingHashesForIDs {
      public function save() {
         if ($this->id===null) $this->id=random_hash();
         return prev::save();
      }
   }
   class SessionRecord extends ActiveRecord {
      protected static $default_values=array(
         'user'=>'',
         'time'=>''
      );
      use UsingHashesForIDs;
   }
   class Client extends ActiveRecord {
      protected static $default_values=array(
         'user'=>'',
         'name'=>'',
         'address'=>''
      );
      use EnsuringNoConcurrentChanges, LoggingOperations {
         save = EnsuringNoConcurrentChanges::save,
               LoggingOperations::save;
      }
   }

Obviously, other combinations are possible, too. Admittedly, with things
this simple, a single base class with configurable options would
probably be sufficient, but with additional complexity, traits that can
override each other definitely offer a distinct advantage with a great
amount of possible code reuse. There is a huge amount of power available
here, I think, which is unlocked by these proposed extensions.

Final note regarding grafts
===========================

My proposals here are not a replacement for grafts [1]. Grafts can be
viewed as compiler-assisted use of the delegate design pattern, and come
with all the benefits of that pattern. As such, they allow greater
isolation than traits and greater ability to use helper classes by
passing themselves via $this to other methods (which can then rely on
access to all the grafted class' public methods, without knowledge of
the enclosing class or which of the grafted class' methods it forwards
to).

Regarding the problem of maintaining encapsulation when doing such
things as returning $this from a graft method, I suggest that perhaps
the 'PHP way' of solving this would be to provide a get_called_object()
function, so one can return get_called_object() rather than return $this.
It could be problematic when grafts nest, but perhaps with some worked
examples and experience, would be found to be sufficient. In my opinion,
having $this behaving different ways in different contexts isn't just
difficult, but really isn't feasible--the programmer's intentions can't
be guessed adequately to make that kind of thing work.

References
==========

- [1] http://wiki.php.net/rfc/horizontalreuse
- [2] http://wiki.php.net/rfc/nonbreakabletraits




--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
