Allison Randal wrote:
> More importantly, whitespace skipping isn't a very significant option in
> grammars in general, so creating two keywords that distinguish between
> skipping and no skipping is linguistically infelicitous. It's like
> creating two different words for "shirts with horizontal stripes" and
> "shirts with vertical stripes". Sure, they're different, but the
> difference isn't particularly significant, so it's better expressed by a
> modifier on "shirt" than by a different word.

This is not only "space" skipping; as we discussed, <ws> skips over
comments as well as spaces, because a language (such as Perl 6) can
define its own <ws> that serves as a valid separator. To wit:
void main () {}
void/* this also works */main () {}
Or, in Perl 6:
say time;
say#( this also works )time;
> From a practical perspective, both the Perl 6 and Punie grammars have
> ended up using 'token' in many places (for things that aren't tokens),
> because :words isn't really the semantics you want for parsing computer
> languages. (Though it is quite useful for parsing natural language and
> other things.) What you want is comment skipping, which isn't the same
> as :words.
Currently it's defined, and used, the same as :words.
I think the confusion arises from <ws> being read as "whitespace"
instead of as "word separator". Maybe an explicit <wordsep> can fix
that, or maybe rename it to something else, but the token/rule
distinction of :words is very useful, because it's more usual for
languages to behave like C and Perl 6, instead of:
ex/* this calls exit */it();
which is rarer, and can be handled with separate "token" rules rather than with <ws>.
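
To make the "word separator" reading concrete, here is a minimal sketch in Python (not Perl 6 rule syntax, and not how any existing grammar is implemented). The names WORD_SEP, IDENT, PUNCT and tokenize are hypothetical, and only C-style /* */ comments are assumed. The separator pattern plays the role of <ws>: spaces and comments both end a word, so they never join two identifiers.

```python
import re

# Hypothetical stand-in for <ws>: any run of whitespace and/or C-style comments.
WORD_SEP = re.compile(r"(?:\s|/\*.*?\*/)+", re.DOTALL)
# Assumed token shapes for this sketch only.
IDENT = re.compile(r"[A-Za-z_]\w*")
PUNCT = re.compile(r"[(){};]")


def tokenize(src):
    """Split src into identifiers and punctuation, treating comments as separators."""
    tokens = []
    pos = 0
    while pos < len(src):
        sep = WORD_SEP.match(src, pos)
        if sep:
            # Separators are skipped, but they still end the previous word.
            pos = sep.end()
            continue
        m = IDENT.match(src, pos) or PUNCT.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}: {src[pos]!r}")
        tokens.append(m.group())
        pos = m.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("void main () {}"))
    print(tokenize("void/* this also works */main () {}"))
    print(tokenize("ex/* this calls exit */it();"))
```

Under these assumptions the first two inputs give the same token list, while the third yields "ex" and "it" as two separate words. A language that wanted the third form to mean exit() would have to handle comments inside its own token-level rules instead.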
Audrey
