=head1 TITLE

type inference

=head1 VERSION

  Maintainer: Steve Fink <[EMAIL PROTECTED]>
  Date: 1 Aug 2000
  Version: 1
  Mailing List: [EMAIL PROTECTED]
  Number: 4

=head1 ABSTRACT

Types should be inferred whenever possible, and optional type qualifiers
may be used to gain whatever level of type stricture is desired.

=head1 DESCRIPTION

For large systems, and often for small ones, type checking is extremely
valuable as a way of eliminating bugs at compile time and avoiding errors
while making global changes. I propose that we create a type hierarchy,
such as

    any
        list
            list(T)
        hash
            hash(T -> T)
        scalar
            reference
                ref(T)
            nonref
                number
                    integer
        void

(This is just a sketch; there are many ways of skinning this cat.)

By default, only constants would be assigned a type directly; every other
node in the parse tree would then be assigned a type by inference.
Variables would not have a single type; they would have a possibly
different type after every assignment. So using the default rules

    1  $x = 3;
    2  $x .= "x";
    3  $h{$x} = \$x;
    4  $h{foo} = "bar";
    5  $x = f();

I<$x> would have type C<number> after line 1 and C<nonref> after line 2.
I<%h> would have type C<< hash(nonref -> ref(nonref)) >> after line 3.
Line 4 stores a C<nonref> value, so the value type would be widened to the
nearest common ancestor of C<ref(nonref)> and C<nonref>, resulting in
C<< hash(nonref -> scalar) >>. Line 5's effect depends on whether
C<f()>'s type is known. If not, then I<$x> will have type C<any> after
line 5.

Notice that so far, all existing programs will always typecheck
successfully, so no burden has been placed on the programmer who does not
want types.

Now say we insert C<my $x : number> at the beginning of the example (or
some other syntax). That means that we are asserting that I<$x> will
I<always> be of type C<number>, and we will flag a type error on line 2
and an optional warning on line 5 if the return type of C<f()> in scalar
context is unknown.

Note that error messages are only generated when two things with strong
types collide.
So C<my ($x : integer) = /(\d.*)/> will not complain, but
C<my $x : integer = "string"> will.

(I am leaving out a lot of details, such as what happens to the type of
I<%h> if just after line 3 you say C<$x = [[]]>, or what happens to the
types of all accessible variables on an C<eval "">, or function types, or
a hundred other messy problems. But even if lots of stuff gets promoted to
type C<any>, I still think that types will be very useful within
individual subroutines and other isolated areas.)

=head1 IMPLEMENTATION

I propose not changing runtime behavior at all; in the case of
C<my ($x : integer) = /(\d.*)/>, I<$x> may actually end up containing a
non-integral string with no warning issued. If you want a warning, write
your own RFC. ;-)

Implementation for the most part is straightforward type inference using
unification. The wrinkles come from how complicated the type hierarchy is,
and from where we want to place the balance between false positives and
false negatives. (Type theorists do not allow false negatives, but I'm not
a type theorist, and their motivation for that stance is allowing safe
run-time behavioral differences.)
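
The "nearest ancestor" step used for I<%h> in the DESCRIPTION could be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed implementation: the names C<%parent>, C<ancestors> and C<nearest_ancestor> are hypothetical, and parameterized types such as C<ref(T)> and C<< hash(T -> T) >> are collapsed to their unparameterized parents to keep the sketch short.

```perl
use strict;
use warnings;

# Parent links for the sketched hierarchy (hypothetical table).
# Parameterized types like ref(T) are omitted: an assumption of this
# sketch, not part of the proposal.
my %parent = (
    'list'      => 'any',
    'hash'      => 'any',
    'scalar'    => 'any',
    'void'      => 'any',
    'reference' => 'scalar',
    'nonref'    => 'scalar',
    'number'    => 'nonref',
    'integer'   => 'number',
);

# Chain from a type up to the root, starting with the type itself,
# e.g. integer, number, nonref, scalar, any.
sub ancestors {
    my ($type) = @_;
    my @chain = ($type);
    while (defined(my $up = $parent{ $chain[-1] })) {
        push @chain, $up;
    }
    return @chain;
}

# Nearest common ancestor of two types; falls back to 'any'.
sub nearest_ancestor {
    my ($left, $right) = @_;
    my %seen = map { $_ => 1 } ancestors($left);
    for my $candidate (ancestors($right)) {
        return $candidate if $seen{$candidate};
    }
    return 'any';
}

# A reference value followed by a nonref value widens to scalar,
# mirroring lines 3 and 4 of the example.
print nearest_ancestor('reference', 'nonref'), "\n";   # scalar
print nearest_ancestor('integer',   'number'), "\n";   # number
print nearest_ancestor('list',      'integer'), "\n";  # any
```

Walking parent links keeps the sketch independent of how deep the hierarchy eventually becomes. A real implementation would also have to handle the type parameters, which is where unification comes in.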