This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Additions to 'use strict' to fix syntactic ambiguities

=head1 VERSION

  Maintainer: Nathan Wiger <[EMAIL PROTECTED]>
  Date: 24 Sep 2000
  Last Modified: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 278
  Version: 2
  Status: Frozen

=head1 ABSTRACT

Several RFCs and many people have voiced concerns with different parts
of Perl's syntax. Most take issue with syntactic ambiguities and the
inability to easily tokenize Perl.

This RFC shows how these all boil down to three central issues, and how
they can be solved with some simple additions to C<use strict>.

By default, Perl should remain as flexible as possible. By adding these
flags to C<use strict>, those who desire them can have all the benefits
of a stricter syntax, without hurting those that like these features.

=head1 DESCRIPTION

=head2 The Problems

=head3 Indirect Objects

RFC 244 proposes eliminating the bareword indirect object syntax because
this:

    print STDERR @stuff;

Can be parsed as either of these:

    STDERR->print(@stuff);
    print(STDERR(@stuff));

Depending on your usage of C<STDERR> elsewhere in your program. However,
many of us like writing:

    $q = new CGI;

Quite a bit, and consider this DWIMish.

=head3 Barewords vs. Functions

RFC 244 and others mention several problems with barewords such as:

    Class->stuff(@args);    # Class()->stuff or 'Class'->stuff ?

Again, the fact that Perl can figure this out correctly is quite
DWIMish, and this functionality should not be removed by default.

=head3 Special Cases

Many special cases abound, such as the bare C<//> mentioned in RFC 135.
Again, this is stuff that makes Perl fun, and should not be taken out of
the language.

=head2 The Solutions

At first, these may not seem related. However, they very much are, and
in fact all boil down to only three issues which can be resolved with
additions to C<use strict>.
=head3 Function Parens - C<use strict 'words'>

This imposes a very simple restriction: barewords are not allowed. They
must be either quoted or specified with parens to indicate they are
functions. Note this solves the C<%SIG> problem from Camel:

    use strict 'words';
    $SIG{PIPE} = Plumber;       # syntax error
    $SIG{PIPE} = "Plumber";     # use main::Plumber
    $SIG{PIPE} = Plumber();     # call Plumber()

In addition, this also forces users to disambiguate certain functions:

    use strict 'words';
    name->stuff(@args);         # syntax error
    'name'->stuff(@args);       # 'name'->stuff
    name::->stuff(@args);       # ok too, same thing
    name()->stuff(@args);       # name()->stuff

    $result = value + 42;       # syntax error
    $result = value() + 42;     # value() + 42
    $result = value( + 42);     # value(42)
    $result = 'value' + 42;     # ok, if you think this is Java...

It's simple: barewords are not allowed.

=head3 Indirect Objects - C<use strict 'objects'>

Another major problem is ambiguous indirect objects. Under C<use strict
'objects'>, the indirect object I<must> be surrounded by braces:

    use strict 'objects';
    no strict 'words';
    print STDERR @stuff;        # print(STDERR(@stuff))
    print 'STDERR' @stuff;      # syntax error
    print {'STDERR'} @stuff;    # 'STDERR'->print(@stuff)

    print $fh @junk;            # syntax error
    print {$fh} @junk;          # $fh->print(@junk)

This eliminates the possibility of ambiguity with indirect objects. When
combined with C<strict 'words'>, code becomes even less ambiguous:

    use strict qw(words objects);
    $q = new 'CGI';             # syntax error
    $q = new {'CGI'};           # 'CGI'->new
    $q = new ('CGI');           # new('CGI')
    $q = new (CGI());           # new(CGI())

    $q = new 'CGI' @args;       # syntax error
    $q = new {'CGI'} (@args);   # 'CGI'->new(@args)
    $q = new (CGI (@args));     # new(CGI(@args))

=head3 Syntactic Problems - C<use strict 'syntax'>

There are many other "little ambiguities" throughout Perl. Adding
C<strict 'syntax'> would remove these and require the user to specify
them explicitly.
In this category fits the bare C<//> problem mentioned in RFC 135, as
well as several common "bugs" (mistakes). Under this rule, the following
would apply:

   1. No more // by itself, you must use m//

   2. Trailing conditionals would require parens

   3. Precedence other than for basic math and boolean
      ops would not apply

This is designed to force you to write clean, unambiguous code that
borders on being non-Perlish:

    use strict 'syntax';
    next if /^#/ || /^$/;          # syntax error
    next if m/^#/ || m/^$/;        # syntax error
    next if (m/^#/ || m/^$/);      # ok

    use strict 'syntax';
    $data = $a + $b / $c - $d || $default or die;       # no way
    $data = ($a + $b / $c - $d) || $default or die;     # nope
    ($data = ($a + $b / $c - $d) || $default) or die;   # ok

Basically, the idea is to impose a truly unambiguous style so that
people don't get carried away with precedence and special cases.

=head2 Combining all these together

Let's look at an example of how all these would be combined. With the
additions to C<use strict> proposed in this RFC, the existing Perl 5
code snippet:

    $r = Apache->request;
    sub geturl { shift->{DATA}->{uri}; }
    sub seturl { shift->{DATA}->{uri} = $_[0]; }
    seturl $r->uri;
    print STDERR geturl or die "Can't print to STDERR: $!\n";

Would have to be rewritten:

    use strict;
    my $r = Apache::->request();
    sub geturl { shift()->{DATA}->{uri}; }
    sub seturl { shift()->{DATA}->{uri} = $_[0]; }
    seturl($r->uri());
    print {'STDERR'} (geturl()) or die("Can't print to STDERR: $!\n");

The syntax remains consistent, even though it's much more strict. If you
don't like a particular aspect, such as indirect objects requiring
braces, you can always turn that off with C<no strict>.

=head2 Problems with RFC 244 and special-casing ->

While syntactic ambiguities are problems in Perl, going about it in the
way that RFC 244 proposes - by special-casing -> and bareword indirect
objects - is not the solution.
In the process it introduces new syntactic ambiguities, where sometimes
a bareword is a function and other times it's a word, depending on
context:

    print STDERR @data;             # error?
    print 'STDERR' @data;           # syntax error
    print {'STDERR'} @data;         # Joe Scripter: "huh?!"

    $u = Apache->request->uri;      # error?
    $u = Apache->request()->uri;    # huh? parens just in the middle?

Particularly annoying is that last one, since the second to last cannot
involve ambiguity because of its context. However, an approach like RFC
244's would require us to explicitly use the last form, even where it
wouldn't be needed.

Instead, we need to attack the real problem: The ambiguity throughout
Perl between unquoted barewords and functions. Perl 6 should remain as
flexible as Perl 5 by default, so that people don't run away screaming.
To tighten up code on important apps, the above rules can simply be
invoked with C<use strict>:

    use strict;
    print {'STDERR'} @data;                 # works, is 'STDERR'->print
    my $u = Apache::->request()->uri();     # works, and more consistent

Plus, C<use strict> is lexically scoped, allowing us to have blocks of
loosely-evaluated code mixed with strict code, and vice versa.

=head1 IMPLEMENTATION

New hooks in C<use strict>.

=head1 MIGRATION

Simplest case: For those scripts that have a C<use strict>, add a C<no
strict qw(words objects syntax)> right after it. Then code doesn't have
to be changed.

=head1 REFERENCES

RFC 244: Method calls should not suffer from the action on a distance

RFC 277: Method calls SHOULD suffer from ambiguity by default

perl6storm issues #0003, #0035, #0050