I'm writing a parser for a language that treats a double newline as a
statement terminator. It works if I make every rule a 'regex' (to turn
off smart whitespace). But I want spaces and tabs to act as smart
whitespace, and newlines to act as literal whitespace. I've
overloaded <ws> to match only spaces and tabs, but the grammar still
consumes newlines where it shouldn't. For a simple, reproducible
example, take the following grammar:

------

token start { ^<emptyline>*$ }

regex emptyline { ^^ $$ \n }

token ws { [<sp> | \t]* }

------

If I match this against a string of 7 newlines, it returns 7 <emptyline>
matches, and each match is a single newline. This is the behavior I want
for newlines.

I would like to add smart whitespace matching for spaces and tabs. But,
if I change <emptyline> to a 'rule' and match it against the same string
of 7 newlines, it returns a single <emptyline> match and the matched
string is 7 newlines. I've tried several variations on the <ws> rule,
but it seems to boil down to: no matter what the <ws> rule matches, if
:sigspace is on, it treats newlines as ignorable whitespace.
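
To make the symptom concrete outside the grammar engine, here is a rough
model using Python's re module. It is only an analogy, not the engine
itself: it assumes that :sigspace behaves as if a whitespace match were
inserted around each atom of <emptyline>, which is my guess rather than
something I have confirmed. The names emptyline_pattern, literal,
default_ws and custom_ws are made up for this sketch.

```python
import re

SEVEN_NEWLINES = "\n" * 7


def emptyline_pattern(ws: str) -> re.Pattern:
    # Model of `^^ $$ \n`, with the whitespace sub-pattern `ws` placed
    # after each atom (the assumed effect of :sigspace).
    return re.compile(rf"^{ws}${ws}\n{ws}", re.MULTILINE)


# 'regex emptyline': no implicit whitespace at all.
literal = emptyline_pattern("")
# 'rule emptyline' if whitespace still means "any whitespace, newlines included".
default_ws = emptyline_pattern(r"\s*")
# 'rule emptyline' if my overloaded <ws> (spaces and tabs only) were honored.
custom_ws = emptyline_pattern(r"[ \t]*")

for name, pattern in [("literal", literal),
                      ("default_ws", default_ws),
                      ("custom_ws", custom_ws)]:
    matches = pattern.findall(SEVEN_NEWLINES)
    # Print the number of matches and the length of each matched string.
    print(name, len(matches), [len(m) for m in matches])
```

In this model, literal and custom_ws each give 7 one-newline matches,
while default_ws gives a single match covering all 7 newlines. The
second result is what I actually get from the 'rule' version, as if my
<ws> were not being used.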

Is this a bug or a feature?

Thanks,
Allison
