> On Sun, Mar 08, 2009 at 09:43:17AM +0100, pugs-comm...@feather.perl6.nl wrote:
> =item * ws
> 
> Match whitespace between tokens.
> 
> =item * space
> 
> Match a single whitespace character. Hence C< <ws> > is equivalent to
> C< <space>+ >.


The definitions of <ws> and <space> above are incorrect, or at least
misleading.  Between a pair of word characters, <ws> requires at least
one whitespace character; anywhere else the whitespace is optional, so
<ws> can match the empty string.  The default definition of <ws> is
something like:

    token ws { <?before \w> <?after \w> <!> || \s* }

It's certainly _not_ the case that <ws> is equivalent to <space>+ .
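
To make the difference concrete, here is a rough emulation in Python's
re module rather than in Perl 6 itself.  It is only a sketch of the
behaviour described above: the names WW, WS, SPACE_PLUS and parses are
made up for the illustration, and a lookbehind/lookahead pair stands in
for "between two word characters".

```python
import re

# Hypothetical stand-in for "between a pair of word characters":
# a word character just behind and a word character just ahead.
WW = r"(?<=\w)(?=\w)"

# Emulated default <ws>: refuse to match between two word characters,
# otherwise accept any amount of whitespace, including none.
WS = rf"(?!{WW})\s*"

# Emulated <space>+ : always requires at least one whitespace character.
SPACE_PLUS = r"\s+"


def parses(sep, pieces, text):
    """Return True if `text` is exactly `pieces` joined by the separator regex `sep`."""
    pattern = sep.join(re.escape(piece) for piece in pieces)
    return re.fullmatch(pattern, text) is not None


print(parses(WS, ["foo", "bar"], "foo bar"))               # whitespace present
print(parses(WS, ["foo", "bar"], "foobar"))                # two words run together
print(parses(WS, ["foo", "+", "bar"], "foo+bar"))          # no whitespace needed around '+'
print(parses(SPACE_PLUS, ["foo", "+", "bar"], "foo+bar"))  # <space>+ insists on whitespace
```

Under this emulation, "foobar" is rejected because the separator would
have to match between two word characters, while "foo+bar" is accepted
with no whitespace at all.  With <space>+ as the separator, "foo+bar"
is rejected, which is exactly where the two rules part ways.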

To make things a bit quicker for people writing custom versions of
<ws> (which may need to include "comment whitespace"), the Parrot
Compiler Toolkit also provides an optimized <ww> rule that matches 
only between a pair of word characters.  Then the default definition 
of <ws> becomes 

    token ws { <!ww> \s* }

Grammars can change this to things like:

    token ws { <!ww> [ \s+ || '#' \N* \n ]* }
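
A similar Python sketch shows what such a comment-aware <ws> buys you.
It assumes the comment branch is meant to consume everything from '#'
to the end of the line; WW, COMMENT_WS, STATEMENT and accepts are
hypothetical names used only for this illustration, and the regex is an
approximation of the rule, not the Parrot Compiler Toolkit
implementation.

```python
import re

# Hypothetical stand-in for <ww>: a position between two word characters.
WW = r"(?<=\w)(?=\w)"

# Emulated custom <ws>: not between two word characters, then any mix of
# whitespace runs and '#' comments that extend to the end of the line.
COMMENT_WS = rf"(?!{WW})(?:\s+|#[^\n]*\n)*"

# A toy "statement": the word foo, emulated <ws>, then the word bar.
STATEMENT = re.compile("foo" + COMMENT_WS + "bar")


def accepts(text):
    """Return True if the whole of `text` is foo, comment-aware whitespace, bar."""
    return STATEMENT.fullmatch(text) is not None


print(accepts("foo bar"))                # plain whitespace
print(accepts("foo # a comment\nbar"))   # comment counts as whitespace
print(accepts("foo#c\nbar"))             # comment directly after the word
print(accepts("foobar"))                 # still refused between word characters
```

The point of the sketch is that a comment can stand in wherever
whitespace is allowed, while the check against two adjacent word
characters is kept, so "foobar" is still not split into two tokens.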

Pm
