On Mon, Mar 09, 2009 at 10:32:02AM -0700, jerry gay wrote:
> > To make things a bit quicker for people writing custom versions of
> > <ws> (which may need to include "comment whitespace"), the Parrot
> > Compiler Toolkit also provides an optimized <ww> rule that matches
> > only between a pair of word characters.  Then the default definition
> > of <ws> becomes
> >
> >    token ws { <!ww> \s* }
>
> if you need a mnemonic to help you remember what 'ww' means, use 'within 
> word'.
> 
> this reminds me that pge's <ww> may be incorrect in its treatment of
> <apostrophe>.  these characters (<['-]> by default) are word
> characters, but i don't think that's been tested, and i don't think
> it's been implemented, either.

A couple of clarifications:

- PGE doesn't implement <ww> by default, because that's not (yet?)
  part of the spec.  It only appears in PCT::Grammar, for people
  using the Parrot Compiler Toolkit to create languages.

- AFAICT, apostrophe and hyphen are not yet "word characters" in
  the sense of being members of \w .  That is, they're considered
  to be valid in identifiers, but only when they are immediately
  preceded by a word character and immediately followed by an 
  alphabetic character.  Otherwise they're not part of the
  identifier.  (At least, that's how the current STD.pm reads.)
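
For illustration only, here is a rough Python approximation of the two
points above. It is not PGE, PCT, or STD.pm code. The names is_ww,
match_ws, and leading_identifier are hypothetical, and Python's \w and
[^\W\d_] are assumed to be close enough stand-ins for "word character"
and "alphabetic character".

```python
import re

# Assumption: a letter or underscore starts an identifier, and an
# apostrophe or hyphen is accepted only when a word character precedes
# it and an alphabetic character follows it.
IDENT = re.compile(r"[^\W\d]\w*(?:['-][^\W\d_]\w*)*")
WORD = re.compile(r"\w")
SPACES = re.compile(r"\s*")


def is_ww(text, pos):
    """Sketch of <ww>: true only between a pair of word characters."""
    if pos <= 0 or pos >= len(text):
        return False
    return bool(WORD.match(text[pos - 1])) and bool(WORD.match(text[pos]))


def match_ws(text, pos):
    """Sketch of `token ws { <!ww> \\s* }`: fail within a word,
    otherwise consume optional whitespace and return the new position."""
    if is_ww(text, pos):
        return None
    return SPACES.match(text, pos).end()


def leading_identifier(text):
    """Return the identifier at the start of text, or '' if none."""
    m = IDENT.match(text)
    return m.group(0) if m else ""
```

Under these assumptions, "foo-bar" and "don't" are each a single
identifier, while "foo-1" and "foo-" yield only "foo", because the
hyphen is not followed by an alphabetic character.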

Pm
