> >It's pretty
> >much functional, including reOneof.  Still, these could be useful
> >internal functions... *ponder*
> 
> I was thinking that the places they could come in really handy for were 
> character classes. \w, \s, and \d are potentially a lot faster this way, 
> 'specially if you throw in Unicode support. (The sets get rather a bit 
> larger...) It also may make some character-set independence easier.

But why would you be generating character classes at runtime? For
ASCII or ISO-8859 or whatever regular ol' bytes are properly called, I
would expect \w \s \d charclasses to be constants. In fact, all
character classes would be constants. And as Dax mentioned, the
constructors for those constants would properly be internal functions.
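
To make the "constants" point concrete, here is a rough sketch of what I have in mind for byte-sized character sets. It is only an illustration, written in Python for brevity. The names make_class, in_class, DIGIT, SPACE and WORD are made up for this example, and the exact membership of each class (plain ASCII here) is an assumption, not a statement of what Perl or Parrot actually does.

```python
# Hypothetical sketch: a byte character class as a constant 256-bit bitmap.
# All names here are invented for illustration; memberships assume plain ASCII.

def make_class(chars):
    """Internal constructor: fold an iterable of byte values into one bitmap."""
    bits = 0
    for c in chars:
        bits |= 1 << c
    return bits

# Built once, then treated as constants by the matcher.
DIGIT = make_class(range(ord("0"), ord("9") + 1))
SPACE = make_class(b" \t\n\r\f")
WORD = (
    DIGIT
    | make_class(b"_")
    | make_class(range(ord("a"), ord("z") + 1))
    | make_class(range(ord("A"), ord("Z") + 1))
)

def in_class(bits, byte):
    """Membership test is a shift and a mask, no matter how the class was built."""
    return (bits >> byte) & 1 == 1
```

The constructor runs when the class is defined, not while matching, so nothing needs to be generated at runtime for the byte case.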

For UTF-32 etc., I don't know. I was thinking we'd have to have
something like a multi-level lookup table for character classes. I see
a character class as a full-blown ADT with operators for
addition/unions, subtraction/intersections, etc.
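
Something along these lines is what I mean by a multi-level table with set operators. Again, this is a sketch under my own assumptions: the CharClass name, the 256-code-point page size, and the choice of a sparse dict for the top level are all invented for illustration, and a real implementation would build ranges far more cheaply than one code point at a time.

```python
# Hypothetical sketch: a two-level character class for wide code points.
# Top level: page number -> bitmap; absent pages mean "no members".

PAGE_BITS = 8                 # assumed page size of 256 code points
PAGE_SIZE = 1 << PAGE_BITS

class CharClass:
    def __init__(self, pages=None):
        self.pages = dict(pages or {})

    @classmethod
    def from_ranges(cls, *ranges):
        """Build a class from inclusive (low, high) code point ranges."""
        cc = cls()
        for lo, hi in ranges:
            for cp in range(lo, hi + 1):
                page, offset = cp >> PAGE_BITS, cp & (PAGE_SIZE - 1)
                cc.pages[page] = cc.pages.get(page, 0) | (1 << offset)
        return cc

    def __contains__(self, cp):
        page, offset = cp >> PAGE_BITS, cp & (PAGE_SIZE - 1)
        return (self.pages.get(page, 0) >> offset) & 1 == 1

    def _combine(self, other, op):
        """Apply a bitwise op page by page, dropping pages that end up empty."""
        out = {}
        for page in self.pages.keys() | other.pages.keys():
            bits = op(self.pages.get(page, 0), other.pages.get(page, 0))
            if bits:
                out[page] = bits
        return CharClass(out)

    def __or__(self, other):   # union
        return self._combine(other, lambda a, b: a | b)

    def __and__(self, other):  # intersection
        return self._combine(other, lambda a, b: a & b)

    def __sub__(self, other):  # difference
        return self._combine(other, lambda a, b: a & ~b)

# Usage: ASCII digits plus the Arabic-Indic digits U+0660..U+0669.
latin_digits = CharClass.from_ranges((0x30, 0x39))
arabic_digits = CharClass.from_ranges((0x660, 0x669))
digits = latin_digits | arabic_digits
```

Lookup stays two steps (find the page, test the bit) however large the set gets, and only pages that actually contain members take up space.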

You aren't thinking that the regular expression _compiler_ needs to be
written in Parrot opcodes, are you? I assumed you'd reach it through
some callout mechanism in the same way that eval"" will be handled.
