A rule by any other name...

Allison Randal Tue, 09 May 2006 16:52:35 -0700

On Apr 20, 2006, at 1:32 PM, Damian Conway wrote:

     Keyword    Implicit adverbs    Behaviour
      regex     (none)              Ignores whitespace, backtracks
      token     :ratchet            Ignores whitespace, no backtracking
      rule      :ratchet :words     Skips whitespace, no backtracking


[...and following threads...]

I'm comfortable with the semantic distinction between 'rule' as "thingy inside a grammar" and 'regex' as "thingy outside a grammar". But I think we can find a better name than 'regex'. The problem is both the 'regex' vs. 'regexp' battle, and the fact that everyone knows 'regex(p)' means "regular expression" no matter how many times we say it doesn't. (I'm not fond of the idea of spending the next 20 years explaining that over and over again.) Maybe 'match' is a better keyword.

Then again, from a practical perspective, it seems likely that we'll want something like ":ratchet is set by default in all rules" turned on in some grammars and off in other grammars. In which case, the real distinction is that rules inside a grammar pull default attributes from their grammar class, while rules outside a grammar have no default attributes. Which brings us back to a single keyword 'rule' making sense for both.

I'm not comfortable with the semantic distinction between 'rule' and 'token'. Whitespace skipping is not the defining difference between a rule and a token in general use of the terms, so the names are misleading.

More importantly, whitespace skipping isn't a very significant option in grammars in general, so creating two keywords that distinguish between skipping and no skipping is linguistically infelicitous. It's like creating two different words for "shirts with horizontal stripes" and "shirts with vertical stripes". Sure, they're different, but the difference isn't particularly significant, so it's better expressed by a modifier on "shirt" than by a different word.

From a practical perspective, both the Perl 6 and Punie grammars have ended up using 'token' in many places (for things that aren't tokens), because :words isn't really the semantics you want for parsing computer languages. (Though it is quite useful for parsing natural language and other things.) What you want is comment skipping, which isn't the same as :words.

I suggest making whitespace skipping a default setting on the grammar class, so the grammars that need whitespace skipping most of the time can turn it on by default for their rules. That means 'token' and 'rule' collapse into just 'rule'.
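To make the idea of a grammar-level default concrete, here is a minimal sketch in Python rather than Perl 6. It is only an illustration under stated assumptions: the names Grammar, rule and skip_whitespace are hypothetical, rules are reduced to sequences of literal atoms, and "whitespace skipping" is simplified to optional whitespace between atoms, which is looser than the real :words semantics.

```python
import re


class Grammar:
    # Hypothetical class-level default: do rules allow whitespace between atoms?
    skip_whitespace = False

    @classmethod
    def rule(cls, *atoms, skip_whitespace=None):
        """Build a matcher for a sequence of literal atoms.

        Passing skip_whitespace=None means "inherit the grammar's default";
        an explicit True/False acts as a per-rule modifier.
        """
        if skip_whitespace is None:
            skip_whitespace = cls.skip_whitespace
        separator = r"\s*" if skip_whitespace else ""
        pattern = re.compile(separator.join(re.escape(a) for a in atoms))
        return lambda text: pattern.fullmatch(text) is not None


class TightGrammar(Grammar):
    pass  # keeps the default: no whitespace skipping


class LooseGrammar(Grammar):
    skip_whitespace = True  # this grammar turns skipping on for all its rules


tight_sum = TightGrammar.rule("a", "+", "b")
loose_sum = LooseGrammar.rule("a", "+", "b")
strict_in_loose = LooseGrammar.rule("a", "+", "b", skip_whitespace=False)

print(tight_sum("a + b"))        # False
print(loose_sum("a + b"))        # True
print(strict_in_loose("a + b"))  # False
```

With one constructor and a default pulled from the class, the separate 'token' and 'rule' keywords reduce to a single kind of rule plus a modifier, which is the collapse suggested above.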

I also suggest a new modifier for comment skipping (or skipping in general) that's separate from :words, with semantics much closer to Parse::RecDescent's 'skip'.
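The difference between plain whitespace skipping and a general skip pattern can be sketched as follows. This is Python, not Perl 6, and it is not how Parse::RecDescent is implemented; it only borrows the idea of a configurable pattern that is consumed before each terminal. The function match_sequence, the two pattern names, and the choice of '#' line comments are all assumptions made for the example.

```python
import re

# Hypothetical skip patterns, for illustration only.
WHITESPACE_ONLY = re.compile(r"\s*")
WHITESPACE_AND_COMMENTS = re.compile(r"(?:\s+|#[^\n]*)*")


def match_sequence(text, atoms, skip):
    """Match literal atoms in order, consuming `skip` before each one.

    Returns the position just past the last atom, or None on failure.
    """
    pos = 0
    for atom in atoms:
        # Both patterns can match the empty string, so match() never fails.
        pos = skip.match(text, pos).end()
        if not text.startswith(atom, pos):
            return None
        pos += len(atom)
    return pos


source = "x = # assign\n 1"
atoms = ["x", "=", "1"]

print(match_sequence(source, atoms, WHITESPACE_ONLY))          # None
print(match_sequence(source, atoms, WHITESPACE_AND_COMMENTS))  # 15
```

The whitespace-only skip stops at the comment and the match fails, while the comment-aware skip reaches the final atom. Keeping the skip pattern as a separate parameter means a grammar for a programming language and a grammar for natural language can share the same rule machinery.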


Allison
