At 04:34 PM 11/5/2001 -0800, Steve Fink wrote:
>Quoting Dan Sugalski ([EMAIL PROTECTED]):
> > At 11:54 AM 11/5/2001 -0800, Steve Fink wrote:
> > > > >It's pretty
> > > > >much functional, including reOneof. Still, these could be useful
> > > > >internal functions... *ponder*
> > > >
> > > > I was thinking that the places they could come in really handy for were
> > > > character classes. \w, \s, and \d are potentially a lot faster this way,
> > > > 'specially if you throw in Unicode support. (The sets get rather a bit
> > > > larger...) It also may make some character-set independence easier.
> > >
> > >But why would you be generating character classes at runtime?
> >
> > Because someone does:
> >
> > while (<>) {
> >     next unless /[aeiou]/;
> > }
> >
> > and we want that character class to be reasonably fast?
>
>? So don't generate it at runtime. When you generate the opcode
>sequence for the regex, emit a bit vector into the constant table and
>refer to it by address in the matchCharClass op's arguments. Be fancy
>and check that you haven't already emitted that bit vector. Am I
>missing something?

Just me being rather amazingly dense. No, there's no requirement for there
to be a way to create or change at runtime the contents of one of these bit
vectors, at least not for the regexes. That can be done entirely by the
compiler or loader, depending on where the code's ultimately coming from.

> > Ah, point. A bitmap won't work too well with the full UTF-32 set.
> >
> > Having a good set of set operations would be useful for the core, though.
>
>No argument there.

Which would be a good argument for allowing these things to be created or
modified at runtime, but that's a separate argument entirely.

> > >You aren't thinking that the regular expression _compiler_ needs to be
> > >written in Parrot opcodes, are you? I assumed you'd reach it through
> > >some callout mechanism in the same way that eval"" will be handled.
> >
> > The core of the parser's still a bit up in the air. Larry's leaning towards
> > it being in perl.
>
>When you say "parser", do you mean parser + bytecode generator +
>optimizer + syntax analyzer? (Of which only the bytecode generator is
>relevant to [:classes:], I suppose.)

No, really just the parser bit, which is where I was assuming most of the
work would get done for this. Silly assumption--too much blood in my
caffeine stream.

The bytecode generator and optimizer may well be (probably will be) in C,
though depending on how it works out the bytecode generator itself may well
be doable in perl. In many ways it's just a fancy set of rules to transform
the syntax tree to bytecode, and you could look at it as either a really
beefy template system or a fancy regex machine.

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                      have teddy bears and even
                                      teddy bears get drunk
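
For what it's worth, the compile-time scheme agreed on above (emit a bit
vector into the constant table, reuse an identical one if it is already
there, and have the match op refer to it by index) could be sketched in C
roughly as follows. Every name here (CharClass, ConstTable,
charclass_from_members, const_table_intern, match_char_class) is
hypothetical and not taken from the Parrot source, and the sketch assumes
8-bit character codes only. As noted in the thread, a flat bitmap like this
would not scale to the full UTF-32 range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names below are illustrative assumptions, not Parrot APIs. */
#define CLASS_BYTES   32   /* 256 bits: one per 8-bit character code */
#define MAX_CONSTANTS 64   /* arbitrary cap, just for the sketch */

typedef struct {
    uint8_t bits[CLASS_BYTES];
} CharClass;

typedef struct {
    CharClass entries[MAX_CONSTANTS];
    size_t    count;
} ConstTable;

/* Compile time: build the bit vector for a class such as [aeiou],
 * given its member characters as a NUL-terminated string. */
CharClass charclass_from_members(const char *members)
{
    CharClass cc;
    memset(cc.bits, 0, sizeof cc.bits);
    for (const unsigned char *p = (const unsigned char *)members; *p; p++)
        cc.bits[*p >> 3] |= (uint8_t)(1u << (*p & 7));
    return cc;
}

/* Compile time: return the index of an identical vector if one was
 * already emitted, otherwise append it. Returns -1 if the table is full. */
int const_table_intern(ConstTable *t, const CharClass *cc)
{
    for (size_t i = 0; i < t->count; i++)
        if (memcmp(t->entries[i].bits, cc->bits, CLASS_BYTES) == 0)
            return (int)i;
    if (t->count == MAX_CONSTANTS)
        return -1;
    t->entries[t->count] = *cc;
    return (int)t->count++;
}

/* Run time: the test a matchCharClass-style op would perform.
 * Returns 1 if c is in the class at the given index, else 0. */
int match_char_class(const ConstTable *t, int index, unsigned char c)
{
    assert(index >= 0 && (size_t)index < t->count);
    return (t->entries[index].bits[c >> 3] >> (c & 7)) & 1;
}
```

With a zero-initialised ConstTable, interning the vectors for "aeiou" and
"uoiea" gives back the same index, since the two bit vectors are identical.
Nothing in the table is created or changed once matching starts, which is
the point Steve was making.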