On Wed, Jan 30, 2002 at 08:13:55AM -0800, Ashley Winters wrote:
> I think that's exactly what you should be doing! Neither parrot nor the
> rx engine should try to be a full compiler. The rx engine definitely
> should have opcodes in the virtual machine, but those opcodes should
> simply contain state-machine/backtracking info, not godly unicode info.

So, basically, you just want to push Unicode onto the language that
sits atop Parrot. If that language were Perl, for instance, you'd
advocate that everywhere the user had written /a/ be replaced (by the
Perl compiler) with the big long "given" you described? Have I got
that right?

Excerpt from Apocalypse 2:

    Perl 6 programs are notionally written in Unicode, and assume
    Unicode semantics by default even when they happen to be
    processing other character sets behind the scenes. Note that when
    we say that Perl is written in Unicode, we're speaking of an
    abstract character set, not any particular encoding. (The typical
    program will likely be written in UTF-8 in the West, and in some
    16-bit character set in the East.)

It seems to me that in order for Perl 6 programs to be written in
Unicode, Parrot needs to grok Unicode (everywhere, including regular
expressions).

-Scott
--
Jonathan Scott Duff
[EMAIL PROTECTED]