On Tue, Jul 05, 2005 at 08:42:44AM -0500, Patrick R. Michaud wrote:
: In short, when PGE's
: parser encounters a code block, it needs to hand off control to
: the target language's compiler to parse to the end of the
: code block and receive back from that compiler the length of
: the block parsed.

That is the preferred way.

: Or, we try to do something along the lines of Text::Balanced and
: find the closing brace by doing simple delimiter counting...

Perl 5 would have done it with two passes.  Which is why Perl 6 is
specced to always do one-pass parsing instead.  :-)

One wrinkle of one-pass is that the parser must somehow know where to
quit.  It's fine if you know you want a block and can call a rule that
automatically terminates on '}', or if your rule reliably gets an error
at the point it should stop.  But in the general case you might want
to be able to pass in a set of terminating delimiters that stop the
outermost parse even if it looks like it should otherwise continue.

: ...Perl 6 brings in several new delimiters that have to be taken into
: account in the balancing act.

These should generally be recognizable from Unicode properties, but
that's not a good reason to take the balanced approach.  And as soon
as people start defining their own circumfix operators that violate
the Unicode properties, all bets are off.  Even a user-defined quote
is going to cause grief unless you know the / of xxx/.../ is an opener.

For languages that cannot do one-pass parsing, it would be saner in
the long run for rules to delimit such code with delimiters that
are unlikely to occur in the target language.  Double curlies, or
here docs, or some such.  That's hacky, but counting brackets is also
hacky.  Any time you write two different parsers for the same language,
it's a new set of bugs just waiting to happen.

Larry
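
A minimal sketch of the two approaches discussed in this message, written in Python rather than PGE or Perl 6. Everything in it is an assumption made for illustration: the function names `count_braces`, `toy_compile_block`, and `parse_closure` are hypothetical, and the "target language" is a toy that knows only nested braces and double-quoted strings. It contrasts simple delimiter counting with handing off to a language-aware parser that reports back the length of the block it parsed.

```python
def count_braces(text, start):
    """Naive delimiter counting: return the length of the block that
    begins at text[start], found by balancing '{' against '}' only."""
    assert text[start] == "{"
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i + 1 - start
    raise ValueError("unbalanced block")


def toy_compile_block(text, start):
    """Hypothetical stand-in for the target language's compiler.
    Parses one block in a single pass and returns its length.
    Unlike count_braces, it knows that braces inside a
    double-quoted string do not open or close a block."""
    assert text[start] == "{"
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == '"':
            # Skip the string literal, honoring backslash escapes.
            i += 1
            while i < len(text) and text[i] != '"':
                i += 2 if text[i] == "\\" else 1
            if i >= len(text):
                raise ValueError("unterminated string")
            i += 1
        elif ch == "{":
            # Nested block: recurse and skip what the callee consumed.
            i += toy_compile_block(text, i)
        elif ch == "}":
            return i + 1 - start
        else:
            i += 1
    raise ValueError("unterminated block")


def parse_closure(source, block_start, block_parser):
    """Outer (rule) parser: hand off at block_start, then resume
    right after however many characters the block parser consumed."""
    length = block_parser(source, block_start)
    end = block_start + length
    return source[block_start:end], source[end:]


source = 'a { say "}" } b'
naive_block, naive_rest = parse_closure(source, 2, count_braces)
real_block, real_rest = parse_closure(source, 2, toy_compile_block)
print(repr(naive_block))  # stops at the brace inside the string
print(repr(real_block))   # the whole block
```

The outer parser is the same in both cases; only the function that decides where the block ends differs. The counting version is in effect a second, weaker parser for the same language, which is where the quoted string trips it up.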