Order doesn't matter. ANTLR will match the longest possible token.

One case where order does matter is when a rule listed lower in the grammar can never match any token, in spite of the 'longest token' matching mechanism.

Example:

ID   :   ('a'..'z')+ ;

SOME_KEYWORD   :   'key' ;

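In this case ANTLR will report an error because SOME_KEYWORD can never be matched. For the input 'key' both rules match the same three characters, so disambiguating by 'longest token' cannot work here, and the rule listed first (ID) wins.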
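
A minimal sketch of the usual fix, assuming ANTLR 3 syntax and the rule names from the example above: list the keyword rule before the general identifier rule, so that when both match the same text the keyword rule is the one chosen.

    // Keyword first: for the input 'key' both rules match three
    // characters, and the rule listed first is preferred.
    SOME_KEYWORD   :   'key' ;

    // General identifier rule second. Longer input such as 'keys'
    // should still lex as ID, because ID is then the longer match.
    ID   :   ('a'..'z')+ ;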

Cheers, Indhu

Avid Trober wrote:
thanks.
org.antlr.Tool is happy with these two, regardless of which one is 
above/below the other.
But won't the DFAs care about the order?

DQUOTE : '"' ;
DQUOTE_STRING :  DQUOTE ( ~('"') )* DQUOTE ;



----- Original Message ----- 
From: "Gavin Lambert" <an...@mirality.co.nz>
To: "Avid Trober" <avidtro...@gmail.com>; <antlr-inter...@antlr.org>
Sent: Tuesday, April 21, 2009 6:53 AM
Subject: Re: [antlr-interest] Lexing 7-bit ASCII stream


  
At 21:59 21/04/2009, Avid Trober wrote:
    
I'm parsing a 7-bit ASCII stream ... 2 questions

Question 1: can't I just fall through the lexer rules, ordered from 
specific to general, and avoid nondeterminism at run time?
      
[...]
    
... // (AND IF NOTHING ABOVE MATCHES, AT LEAST WE'RE MATCHING HERE ... )

CHAR    : '\u0000'..'\u007F'  // any 7-bit US-ASCII character
             ;
      
You can specify a catch-all match like so:

  CHAR : .;

If this is the last lexer rule, then it will behave as you're expecting.
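
As a rough sketch of that layout in ANTLR 3 syntax (DIGIT and WS are hypothetical stand-ins for whatever specific rules the real grammar has; only CHAR comes from this thread):

    // Specific rules first.
    DIGIT : '0'..'9' ;
    WS    : (' ' | '\t' | '\r' | '\n')+ ;

    // Catch-all last: matches any single character that none of
    // the rules above claimed.
    CHAR  : . ;

The range form '\u0000'..'\u007F' from the question could be used in place of the wildcard if only 7-bit characters should be accepted.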

    
Question 2: I'm at a loss how to match the notation in the spec I'm 
writing a grammar for where binary digits are '0' or '1'  and digits are 
'0'..'9'.  (ABNF-ish)  It is preferred to make the grammar rule names match 
that (whether lexer or parser, it doesn't matter)
      
Generally, it's best to have the lexer match as wide as possible (i.e. have 
DIGIT, not BINARY_DIGIT) and sort it out in the parser, where you can use 
the context to give better error messages if you encounter something 
invalid.

    
Can I write a binary_digit parser rule that works with DIGIT above 
somehow?
      
Yep.  Depending on the context, you may want to either use a 
lookahead-based entry predicate to avoid entering the rule if the DIGITs 
aren't binary-safe, or an exit predicate that raises an error if it turns 
out that the sequence wasn't valid binary.
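
A minimal sketch of both options, assuming ANTLR 3 with the Java target. The rule name binary_digit_checked and the label d are hypothetical; DIGIT and binary_digit are the names used in the question.

    DIGIT : '0'..'9' ;

    // Entry predicate: tested on the lookahead token before DIGIT
    // is consumed.
    binary_digit
        : { input.LT(1).getText().equals("0")
            || input.LT(1).getText().equals("1") }? DIGIT
        ;

    // Exit predicate: DIGIT is consumed first and then validated;
    // a false predicate raises a FailedPredicateException.
    binary_digit_checked
        : d=DIGIT { $d.text.equals("0") || $d.text.equals("1") }?
        ;

If the entry check has to steer the choice between alternatives rather than just validate, the gated form {...}?=> may be the better fit.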

    


List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address

  

