James Mastros wrote:
Just a few more nits to pick...
On 12/02/2002 6:58 AM, Joseph F. Ryan wrote:
The q() operator allows strings to be made with any non-space, non-letter, non-digit character as the delimiter instead of '. In addition, if the starting delimiter is a part of a paired set, such as (, [, <, or {, then the closing delimiter may be the matching member of the set. In addition, the reverse holds true; delimiters which are the tail end of a pair may use the starting item as the closing delimiter.
We need to decide if this is a user doc or a developer doc/language specification. If it's the latter, we need a rigorous definition of what a pair is.
I'm more inclined towards a user doc; a rigorous definition of pairs in the tests should be good enough for the developers.
There are a few special cases for delimiters; specifically : and #. : is not allowed because it might be used by custom-defined quoting operators to apply a property; # is allowed, but there cannot be a space between the operator and the #. In addition, comments are not allowed within # delimited expressions (for obvious reasons).
Are comments ever allowed within q() constructs? If not, ditch the statement about comments not being allowed in q## constructs.
You're right, they're not. Whoops.
=head3 <<>>; expanding a string as a list.
A set of braces is a special op that evaluates into the list of words contained, using whitespace as the delimiter.
A doubled set of angle brackets (<<text here>>) or a set of double-angle quotation marks (guillemets, «text here»).
It is similar to qw() from perl5, and can be thought of as roughly equivalent to:
C<< "STRING".split(' ') >>
Are we getting rid of qw()? I assumed that we were keeping it as a longhand form of <<>>/guillemets, just like qq() is the longhand form of "".
I'd be more explicit here, and say C<<"STRING".split(/\s+/)>>. (The two are equivalent, but only because of special-casing; the second is more explicit.)
Nope, split(' ', $string) is special; it eats up all leading whitespace before splitting on the space, while with /\s+/ there will be an initial empty element. The example is straight from perl5's perlop anyways :)
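To make the difference concrete, here is a rough analogue in Python rather than Perl (an assumption made so the snippet can be run as-is). Python's argument-less str.split() skips leading whitespace much like perl5's split(' ', $string), while re.split on \s+ keeps the initial empty element. The helper names are hypothetical and not part of any proposal.

```python
import re

def split_awk_style(text):
    # Analogue of perl5's split(' ', $string): leading whitespace is skipped.
    return text.split()

def split_on_whitespace_regex(text):
    # Analogue of split(/\s+/, $string): leading whitespace yields an empty first field.
    return re.split(r"\s+", text)

sample = "  foo bar baz"
print(split_awk_style(sample))
print(split_on_whitespace_regex(sample))
```

Trailing whitespace is deliberately left out of the sample: perl5's split drops trailing empty fields, while Python's re.split keeps them, so the analogy only holds for the leading case.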
Have these defaults been defined somewhere? I'd rather see them be ', ' and '=>' by default...
Well, that's what the RFC suggested, and there didn't seem to be many complaints about the defaults in the Apoc (besides the variable names). Like I said, I just winged it :)
Note that hashes are unordered, and so the output will be unordered. Therefore, the following two expressions are equivalent:
Get rid of the therefore; it seems to refer to the preceding sentence, which has nothing to do with the example.
=item Subroutines and Methods: C<"&sub($a1,$a2)">, C<"$obj.meth($a)">
Subroutines and Methods will interpolate their return value into the string, which will be handled in whichever type the return value is. Same for object methods. Note that parens B<are> required during interpolation so that the parser can disambiguate between object methods and object members.
Has this been vetted? $(...)/etc seem to cover this case, and & being a qq() metachar makes using qq() strings to print HTML/XML difficult.
Well, it was in Apoc 2:
http://www.perl.com/pub/a/2001/05/03/wall.html#rfc 252: interpolation of subroutines
http://www.perl.com/pub/a/2001/05/03/wall.html#rfc 222: interpolation of object method calls
=item Escaped Characters
# Basically the same as Perl5; also, how are locale semantics handled?
    \t          tab
    \n          newline
    \r          return
    \f          form feed
    \b          backspace
    \a          alarm (bell)
    \e          escape
Can we get some rigor here? Also, is \n the same everywhere, or do we play the same tricks we did with it in p5? (I think it should be the same everywhere, a CR char, "\cM". Disciplines, or encodings, or whatever we're calling them, can take care of it on IO.) Oh, and it might be nice for \0 to be NUL. (This used to be implicit with \0 as octal, but since \0 isn't octal anymore...)
As someone who has had to use NT, Mac OS 9, and Solaris with much frequency, I can say I very much appreciated the special tricks that \n did (does).
    \b10        binary char
    \o33        octal char
Numeric Literals, take 3 (http:[EMAIL PROTECTED]/msg00462.html), in the "*** Bin/Hex/Oct shorthands" section, gives 0c123 as the shorthand form of octal numbers, so it doesn't make much sense for octal character constants to be \o123. Do we want to change shorthand octal literal numbers to 0o123 (I don't like this, it's hard to read), change octal chars to \c123 (can't do this without getting rid of, or changing, \c for control-character), get rid of octal chars entirely, or something else? (Barring a good "something else", I vote for killing octal chars.)
This seems to be going back and forth:
$octal_format = ($octal_format_still_exists) ?
sprintf("\\%s%d",$octals_current_letter_of_the_week, $number) :
undef;
That should clear things up.
    \x1b        hex char
Exactly two digits after the \x? Perl5 attempts to do the right thing either way, but this can be confusing too -- "\xA" eq chr(0xA), "\xABar" eq chr(0xAB)."ar", "\xAQux" eq chr(0xA)."Qux".
That was in perl5's perldoc, so I assume it is encouraged. You brought this up before: http:[EMAIL PROTECTED]/msg00485.html I still say to stick with perl5's behavior.
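For anyone who wants to poke at the three cases above, here is a minimal sketch of the "take one or two hex digits after \x" rule, written in Python as a stand-in for the perl5 behaviour described in the thread. The function name is hypothetical, and the sketch ignores the \x{...} form and escaped backslashes.

```python
import re

# One or two hex digits after a literal backslash-x; the regex is greedy,
# so two digits are consumed whenever two are available.
HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{1,2})")

def expand_hex_escapes(text):
    # Replace each \xH or \xHH with the character that has that code point.
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

for raw in (r"\xA", r"\xABar", r"\xAQux"):
    print(repr(expand_hex_escapes(raw)))
```

The greedy match is what makes "\xABar" and "\xAQux" come out differently: "B" is a hex digit and gets swallowed, "Q" is not and ends the escape.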
    \x{263a}    wide hex char
    \c[         control char
Rigor? What is \c~? perl5 thinks it's >, should perl6 agree?
I don't see why it shouldn't.
How about \c\x{1000} (that's invalid, but you get the point), is that equiv to \x{ff9c}?
No, it's "\c\" ~ "x{1000}"
What about \cé, (e+acute accent), does that capitalize, then subtract 64, or just subtract?
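As a reference point for the \c questions, here is a small Python model of what perl5 appears to do for ASCII: uppercase the character, then flip bit 6 (XOR 64). For uppercase letters that is the same as subtracting 64, and it matches the \c~ result quoted above. The function name is hypothetical, and the sketch rejects non-ASCII input instead of guessing an answer to the \cé question.

```python
def control_char(char):
    # Model of perl5-style \cX for a single ASCII character:
    # uppercase it, then XOR the code point with 64.
    if len(char) != 1 or ord(char) > 127:
        raise ValueError("this sketch only covers single ASCII characters")
    return chr(ord(char.upper()) ^ 64)

print(repr(control_char("[")))  # the escape character from the draft's \c[ entry
print(repr(control_char("~")))
```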
    \N{name}    named Unicode character
Reference to charnames pragmata, or however we end up defining the exact semantics of \N. (Since we don't know yet, just put in a FIXME, I suppose.)
Just recycle perl5's, I suppose. Not *everything* needs to be redone from scratch.
Is there any way to give the ordinal in decimal, like "\d192"? (I'm not sure how useful this would be, but it would be nice parallelism. OTOH, you can use chr() easily enough.)
That is a good point; if there is a 0dxxxxx, then there should be a "\dxxxxx".
=item Modifiers: C<\Q{}>, C<\L{}>, C<\U{}>
Modifiers apply a modification to text which they enclose; they can be embedded within interpolated strings.
    \L{}        Lowercase all characters within brackets
    \U{}        Uppercase all characters within brackets
    \Q{}        Escape all characters that need escaping within brackets (except "}")
Rigor: escape all non-alphanumerics. Do we still have the other modifiers that p5 supports, \l and \u?
That's a good question. There was no reference to them in Apoc, however, that doesn't mean that they are gone. I haven't a clue, really.
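For the "escape all non-alphanumerics" reading of \Q{}, a minimal Python sketch might look like the following. It assumes underscore counts as a word character, as in perl5's quotemeta, and it does not model the draft's exception for "}". The function name is hypothetical.

```python
import re

def quote_meta(text):
    # Put a backslash in front of every character outside [A-Za-z0-9_].
    return re.sub(r"([^A-Za-z0-9_])", r"\\\1", text)

print(quote_meta("a.b*c"))
print(quote_meta("abc_123"))
```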
Do we want a new titlecase modifier, \T{james mastros} eq "James Mastros", doing the Right Thing for other languages, where it isn't so simple (there are complicated cases for this, but IIRC Unicode defines a robust algo to do this)? I'll check on the Unicode stuff if anybody thinks it's a good idea... I'm uncertain, myself; I never liked the qq() case-modifiers, so don't use them.
There is ucfirst(), which I'm sure could be updated to handle Unicode; however, I don't know if it is important enough to deserve \T{}. You might want to ask Larry :)
A string which is (possibly) interpolated and then executed as a system command with /bin/sh or its equivalent. Shell wildcards, pipes, and redirections will be honored. The collected standard output of the command is returned; standard error is unaffected. In scalar context, it comes back as a single (potentially multi-line) string, or undef if the command failed. In list context, returns a list of lines split on the standard input separator, or an empty list if the command failed.
This whole section is very unix-centric, but I'm not certain what to do about that -- the functionality is very system-specific. Also, I suspect we're going to want to rewrite it anyway when we hammer out iterators, files, and context.
Why?
A line-oriented form of quoting is based on the shell "here-document" syntax. Following a << you specify a string to terminate the quoted material, and all lines following the current line down to the terminating string are the value of the item. The terminating string may be either an identifier (a word), or some quoted text. If quoted, the type of quotes you use determines the treatment of the text, just as in regular quoting. An unquoted identifier works like double quotes. The terminating string must appear by itself, and any preceding or following whitespace on the terminating line is discarded.
s/shell/Unix Bourne shell/
I could have sworn that Larry recently put something out about the edge cases between << heredoc and << beginning-of-qw. I /think/ he said that qw("Foo" bar) must be written as << "Foo" bar>>, because otherwise it would be interpreted as a here-doc ending with Foo with double-quote interpolation. Can anybody find this, or is Larry watching?
Also note that with single quoted here-docs, backslashes are not special, and are taken for a literal backslash, a behavior that is different from normal single-quoted strings.
Are \qq()s still special, even in <<'noninterpolating's? Either way, it should be explicitly noted.
As far as I know, *nothing* is special in a single quoted heredoc.
V-Strings are formed when 3 or more digits are joined by decimal points, with a possible leading v. The resulting item is then treated like a string, rather than a number.
=over 3
Examples:
    $var = v5.8.0;      # $var = "5.8.0";
    $var = 192.168.0.1; # $var = "192.168.0.1";
=back
Note that the v is non-optional for two-character v-strings.
Good point, because otherwise it's a number. Definitely needs to be added to the test suite.
I'd say something like:
V-strings are actually strings that just happen to look like numbers. Each dot-separated number is transformed into the character with that Unicode ordinal, and the resulting characters are concatenated together.
(The transformation from normal string to v-string looks like C<<$vstring='v' ~ join '.', map {ord} split //, $instring>>; the transformation from v-string to normal string looks like C<<print join '', map {chr} split /\./, $vstring>>, where $vstring cannot begin with a leading 'v', for purposes of illustration.)
Thus, C<<80.101.114.108.32.54.33 eq 'Perl 6!'>>
Also, your examples are misleading at best. v5.8.0 eq "\x05\x08\x00".
192.168.0.1 eq chr(192)~chr(168)~chr(0)~chr(1).
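The two transformations described above can be sketched in Python as follows; this is only a model of the proposed semantics, not Perl 6 code, and both function names are hypothetical. Unlike the one-liners above, the decoder here tolerates an optional leading 'v'.

```python
def vstring_to_str(vstring):
    # Drop an optional leading 'v', then turn each dot-separated
    # ordinal into the character with that code point.
    body = vstring[1:] if vstring.startswith("v") else vstring
    return "".join(chr(int(part)) for part in body.split("."))

def str_to_vstring(text):
    # Inverse direction: the ordinals of each character, dot-joined,
    # with a leading 'v'.
    return "v" + ".".join(str(ord(ch)) for ch in text)

print(vstring_to_str("80.101.114.108.32.54.33"))
print(str_to_vstring("Perl 6!"))
```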
You're right, the vstring section should be totally redone. Thanks for the feedback.
Joseph F. Ryan [EMAIL PROTECTED]