This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Additions to 'use strict' to fix syntactic ambiguities

=head1 VERSION

   Maintainer: Nathan Wiger <[EMAIL PROTECTED]>
   Date: 24 Sep 2000
   Last Modified: 30 Sep 2000
   Mailing List: [EMAIL PROTECTED]
   Number: 278
   Version: 2
   Status: Frozen

=head1 ABSTRACT

Several RFCs and many people have voiced concerns with different parts
of Perl's syntax. Most take issue with syntactic ambiguities and the
inability to easily tokenize Perl.

This RFC shows how these all boil down to a three central issues, and
how they can be solved with some simple additions to C<use strict>. By
default, Perl should remain as flexible as possible. By adding these
flags to C<use strict>, those who desire them can have all the benefits
of a stricter syntax, without hurting those that like these features.

=head1 DESCRIPTION

=head2 The Problems

=head3 Indirect Objects

RFC 244 proposes eliminating the bareword indirect object syntax because
this:

   print STDERR @stuff;

Can be parsed as either of these:

   STDERR->print(@stuff);
   print(STDERR(@stuff));

Depending on your usage of C<STDERR> other places in your program.
However, many of us like writing:

   $q = new CGI;

Quite a bit, and consider this DWIMish.

=head3 Barewords vs. Functions

RFC 244 and others mention several problems with barewords such as:

   Class->stuff(@args);   # Class()->stuff or 'Class'->stuff ?

Again, the fact that Perl can figure this out correctly is quite
DWIMish, and this functionality should not be removed by default.

=head3 Special Cases

Many special cases abound, such as the bare C<//> mentioned in RFC 135.
Again, this is stuff that makes Perl fun, and should not be taken out
of the language.

=head2 The Solutions

At first, these may not seem related. However, they very much are, and
in fact all boil down to only three issues which can be resolved with
additions to C<use strict>.

=head3 Function Parens - C<use strict 'words'>

This imposes a very simple restriction: barewords are not allowed. They
must be either quoted or specified with parens to indicate they are
functions. Note this solves the C<%SIG> problem from Camel:

   use strict 'words';
   $SIG{PIPE} = Plumber;    # syntax error
   $SIG{PIPE} = "Plumber";  # use main::Plumber
   $SIG{PIPE} = Plumber();  # call Plumber()

In addition, this also forces users to disambiguate certain functions:

   use strict 'words';

   name->stuff(@args);      # syntax error
   'name'->stuff(@args);    # 'name'->stuff
   name::->stuff(@args);    # ok too, same thing
   name()->stuff(@args);    # name()->stuff

   $result = value + 42;    # syntax error
   $result = value() + 42;  # value() + 42
   $result = value(  + 42); # value(42)
   $result = 'value' + 42;  # ok, if you think this is Java...

It's simple: barewords are not allowed.

=head3 Indirect Objects - C<use strict 'objects'>

Another major problem is ambiguous indirect objects. Under C<use strict
'objects'>, the indirect object I<must> be surrounded by braces:

   use strict 'objects';
   no strict 'words';

   print STDERR @stuff;     # print(STDERR(@stuff))
   print 'STDERR' @stuff;   # syntax error
   print {'STDERR'} @stuff; # 'STDERR'->print(@stuff)

   print $fh @junk;         # syntax error
   print {$fh} @junk;       # $fh->print(@junk)

This eliminates the possibility of ambiguity with indirect objects. When
combined with C<strict 'words'>, code becomes even less ambiguous:

   use strict qw(words objects);

   $q = new 'CGI';          # syntax error
   $q = new {'CGI'};        # 'CGI'->new
   $q = new ('CGI');        # new('CGI')
   $q = new (CGI());        # new(CGI())

   $q = new 'CGI' @args;    # syntax error
   $q = new {'CGI'} (@args);# 'CGI'->new(@args)
   $q = new (CGI (@args));  # new(CGI(@args))

=head3 Syntactic Problems - C<use strict 'syntax'>

There are many other "little ambiguities" throughout Perl. Adding
C<strict 'syntax'> would remove these and require the user to specify
them explicitly. In this category fits the bare C<//> problem mentioned
in RFC 135, as well as several common "bugs" (mistakes).

Under this rule, the following would apply:

   1. No more // by itself, you must use m//

   2. Trailing conditionals would require parens

   3. Precedence other than for basic math and boolean ops would
      not apply

This is designed to force you to write clean, unambiguous code that
borders on being non-Perlish:

   use strict 'syntax';
   next if /^#/ || /^$/;       # syntax error
   next if m/^#/ || m/^$/;     # syntax error
   next if (m/^#/ || m/^$/);   # ok

   use strict 'syntax';
   $data = $a + $b / $c - $d || $default or die;      # no way
   $data = ($a + $b / $c - $d) || $default or die;    # nope
   ($data = ($a + $b / $c - $d) || $default) or die;  # ok

Basically, the idea is to impose a truly unambiguous style so that
people don't get carried away with precedence and special cases.

=head2 Combining all these together

Let's look at an example of how all these would be combined. With the
additions to C<use strict> proposed in this RFC, the existing Perl 5
code snippet:

   $r = Apache->request;

   sub geturl {
       shift->{DATA}->{uri};
   }

   sub seturl {
       shift->{DATA}->{uri} = $_[0];
   }

   seturl $r->uri;
   print STDERR geturl or die "Can't print to STDERR: $!\n";

Would have to be rewritten:

   use strict;

   my $r = Apache::->request();

   sub geturl {
       shift()->{DATA}->{uri};
   }

   sub seturl {
       shift()->{DATA}->{uri} = $_[0];
   }

   seturl($r->uri());
   print {'STDERR'} (geturl()) or die("Can't print to STDERR: $!\n");

The syntax remains consistent, even though it's much more strict. If you
don't like a particular aspect, such as indirect objects requiring
braces, you can always turn that off with C<no strict>.

=head2 Problems with RFC 244 and special-casing ->

While syntactic ambiguities are problems in Perl, going about it in the
way that RFC 244 proposes - by special-casing -> and bareword indirect
objects - is not the solution. In the process it introduces new
syntactic ambiguities, where sometimes a bareword is a function and
other times it's a word, depending on context:

   print STDERR @data;            # error?
   print 'STDERR' @data;          # syntax error
   print {'STDERR'} @data;        # Joe Scripter: "huh?!"

   $u = Apache->request->uri;     # error?
   $u = Apache->request()->uri;   # huh? parens just in the middle?

Particularly annoying is that last one, since the second to last cannot
involve ambiguity because of its context. However, an approach like RFC
244's would require us to explicitly use the last form, even where it
wouldn't be needed.

Instead, we need to attack the real problem: The ambiguity throughout
Perl between unquoted barewords and functions.

Perl 6 should remain as flexible as Perl 5 by default, so that people
don't run away screaming. To tighten up code on important apps, the
above rules can simply be invoked with C<use strict>:

   use strict;

   print {'STDERR'} @data;              # works, is 'STDERR'->print
   my $u = Apache::->request()->uri();  # works, and more consistent

Plus, C<use strict> is lexically scoped, allowing us to have blocks of
loosely-evaluated code mixed with strict code, and vice versa.

=head1 IMPLEMENTATION

New hooks in C<use strict>.

=head1 MIGRATION

Simplest case: For those scripts that have a C<use strict>, add a C<no
strict qw(words objects syntax)> right after it. Then code doesn't have
to be changed.

=head1 REFERENCES

RFC 244: Method calls should not suffer from the action on a distance

RFC 277: Method calls SHOULD suffer from ambiguity by default

perl6storm issues #0003, #0035, #0050

Reply via email to