This and other RFCs are available on the web at
http://dev.perl.org/rfc/
=head1 TITLE
Single quotes don't interpolate \' and \\
=head1 VERSION
Maintainer: Nicholas Clark <[EMAIL PROTECTED]>
Date: 28 Sep 2000
Mailing List: [EMAIL PROTECTED]
Number: 328
Version: 1
Status: Developing
=head1 ABSTRACT
Remove all interpolation within single quotes and the C<q()> operator, to
make single quotes 100% shell-like. C<\> rather than C<\\> gives a single
backslash; use double quotes or C<q()> if you need a single quote in your
string.
=head1 DESCRIPTION
Camel III (page 7) says "Double quotation marks (double quotes) do
I<variable interpolation> and I<backslash interpolation> while single quotes
suppress interpolation." Page 60 qualifies this with "except for C<\'> and
C<\\>".
In perl single quotes are used to generate strings. Double quotes also generate
strings.
In C single quotes are used to make character constants. Double quotes are
used to make string constants. Backslash interpretation is performed in
single quotes in C. While multi-character constants are allowed by C, they
are strongly discouraged as they are non-portable, and a character constant
in C is a type distinct from a string constant. Hence double quotes and
single quotes signify different things.
In shell, single quotes are used to make strings. Double quotes also make
strings. Within single quotes backslashes are ordinary characters, and do
not quote anything. As one can't quote a C<'> with a C<\>, there is no way to
embed a single quote within a single quoted string, but a workaround
such as C<'don'\''t'>, which relies on the concatenation of C<'don'>, C<\'>
and C<'t'>, achieves the desired result.
Hence perl's single quoted strings are analogous to shell's single quoted
strings, not C's. However, they're not identical, as perl allows C<\\> to
mean an embedded C<\> and C<\'> to mean an embedded C<'>.
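The exception can be seen directly in Perl 5. The following is a small sketch; the variable names are only illustrative.

```perl
use strict;
use warnings;

# Perl 5: inside single quotes only \\ and \' are treated specially.
my $backslash = '\\';   # one character: a backslash
my $quote     = '\'';   # one character: a single quote
my $newline   = '\n';   # two characters: a backslash, then n

printf "%d %d %d\n", length $backslash, length $quote, length $newline;
# prints "1 1 2"
```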
This RFC argues that the exception is confusing and proposes to remove it.
This makes perl more regular in shell terms, and slightly easier to learn
for the shell programmer.
It also makes perl internally simpler and more regular. Currently the
behaviour for C<q()> strings is that C<\(>, C<\)> and C<\\> map to 1
character, while C<\>I<?> for all other I<?> maps to 2 characters. C<qq()>
differs, as C<\>I<?> maps to 1 character both when I<?> is recognised as a
backslash escape B<and> when it is unrecognised. A further irregularity is
that currently single quoted here docs don't interpolate C<\\> or C<\'>.
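A short sketch of these Perl 5 irregularities follows. The warning category C<misc> is switched off only to silence the "Unrecognized escape" warning for C<\w>; the variable names are illustrative.

```perl
use strict;
use warnings;

# q(): \( \) and \\ collapse to 1 character, any other \? stays 2 characters.
my @q_lengths = map { length($_) } q(\(), q(\)), q(\\), q(\w);

# qq(): \? gives 1 character whether the escape is recognised (\t) or not (\w).
my @qq_lengths = do {
    no warnings 'misc';   # silence "Unrecognized escape \w passed through"
    map { length($_) } qq(\t), qq(\w);
};

# Single quoted here doc: backslashes are left completely alone.
my $here = <<'EOT';
\\ \'
EOT
chomp $here;

print "@q_lengths | @qq_lengths | ", length($here), "\n";
# prints "1 1 1 2 | 1 1 | 5"
```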
With this RFC it is proposed that in a single quoted string C<\> is not
special. Hence C<\>I<?> always maps to 2 characters (C<\> then I<?>) unless
I<?> is the closing terminator, in which case the string terminates with
that C<\>. Single quoted strings behave like single quoted here docs, and
like shell single quoted strings.
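The difference between the current and the proposed rule can be sketched as two small scanners. This is an illustration only, not perl's tokeniser: C<scan_perl5> and C<scan_proposed> are hypothetical helpers that take the source text following an opening C<'> and return the value of the literal.

```perl
use strict;
use warnings;

# Hypothetical helper: Perl 5 rule, where \\ and \' are unescaped.
sub scan_perl5 {
    my ($src) = @_;
    my $value = '';
    my $i = 0;
    while ($i < length($src)) {
        my $c = substr($src, $i, 1);
        return $value if $c eq "'";              # unquoted delimiter ends it
        if ($c eq "\\" and $i + 1 < length($src)) {
            my $next = substr($src, $i + 1, 1);
            if ($next eq "\\" or $next eq "'") {
                $value .= $next;                 # drop the quoting backslash
                $i += 2;
                next;
            }
        }
        $value .= $c;
        $i++;
    }
    return;                                      # unterminated literal
}

# Hypothetical helper: proposed rule, where the first ' ends the string.
sub scan_proposed {
    my ($src) = @_;
    my $end = index($src, "'");
    return if $end < 0;                          # unterminated literal
    return substr($src, 0, $end);
}

my $bs = chr(92);                                # a single backslash

# Source text C:\\dir' : Perl 5 yields C:\dir, the proposal yields C:\\dir
my $path = 'C:' . $bs . $bs . "dir'";
# Source text don\'t' : Perl 5 yields don't, the proposal stops at don\
my $dont = 'don' . $bs . "'t'";

print scan_perl5($path), ' ', scan_proposed($path), "\n";
print scan_perl5($dont), ' ', scan_proposed($dont), "\n";
```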
=head1 IMPLEMENTATION
Modify the tokeniser/lexer not to treat C<\> as special, hence the first end
delimiter ends the string.
For 5.7's toke.c this doesn't appear to be that simple. It looks like
modifications would be needed to C<S_tokeq>, C<scan_str> and C<Perl_yylex>
(for a quoted string at the start of curlies). There are probably more; the
code that makes single quoted strings interpolate C<\'> and C<\\> appears to
be deeply ingrained into the core.
The perl5 to perl6 convertor would need to convert single quoted strings and
C<q()> operators containing C<\'> to the shortest (clearest?) equivalent of:
=over 4
=item *
double quoted string with C<\Q>...C<\E> or C<\> escapes
=item *
single quoted string(s) and explicit concatenation of C<"'">
=item *
C<q()> operator with delimiters not in the string
=back
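As an illustration of the three alternatives, the following sketch checks in Perl 5 that each rewrite yields the same string as the original C<\'> form. The variable names are illustrative, and only the C<\> escape form of the double quoted option is shown.

```perl
use strict;
use warnings;

my $perl5_style  = 'don\'t';            # relies on \' inside single quotes
my $double       = "don't";             # double quoted string
my $concatenated = 'don' . "'" . 't';   # single quoted pieces joined with "'"
my $other_delim  = q(don't);            # q() with delimiters not in the string

# A double quoted rewrite must also escape characters such as $ and @.
my $with_sigil         = 'it\'s $5';
my $with_sigil_escaped = "it's \$5";

print "all equal\n"
    if $perl5_style eq $double
    and $double eq $concatenated
    and $concatenated eq $other_delim
    and $with_sigil eq $with_sigil_escaped;
# prints "all equal"
```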
Single quoted strings containing C<\\> but no quoting of delimiters would
need to have C<\\> converted to C<\>.
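A minimal sketch of that C<\\> conversion, assuming a hypothetical C<convert_body> helper that receives the text between the quotes of a Perl 5 single quoted literal. It returns nothing when the body quotes a delimiter, since that case needs one of the rewrites listed above instead.

```perl
use strict;
use warnings;

# Hypothetical converter step, not part of any existing tool.
sub convert_body {
    my ($body) = @_;
    my $out = '';
    # Tokenise left to right so that \\' is read as \\ followed by ',
    # never as \ followed by \'.
    while ($body =~ /\G(\\\\|\\'|.)/gs) {
        my $tok = $1;
        return if $tok eq "\\'";                  # quoted delimiter found
        $out .= $tok eq "\\\\" ? "\\" : $tok;     # \\ becomes \
    }
    return $out;
}

my $bs = chr(92);                                 # a single backslash

# Body C:\\dir\\file becomes C:\dir\file
my $converted = convert_body('C:' . $bs . $bs . 'dir' . $bs . $bs . 'file');
print "$converted\n";
```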
=head1 REFERENCES
Camel III - Programming Perl (3rd Edition)
perlop manpage for interpolation
I believe that Larry Wall once made a comment about \' and \\ in single
quoted strings being a mistake, but I can't find any reference. The idea
certainly isn't mine, but I feel it worthy of consideration.