Jesper Larsson wrote:
If you want the longest match, then left factor everything and let it
do that: 

A ( B (C|) |) ;

And set the token type at the appropriate points.
    

Not always so easy, however. My original example was, even more
simplified, something like this:

FOO:    'foo';
BAR:    'bar';
FOOZ:   'foo'* 'z';
  

You're thinking about this the wrong way:

FOO: 'foo' ('foo'* 'z' {$type = FOOZ; }| ) ;

But that means that if you have 'foofoo  bar', you would go awry. So, if you must accept only foofoofooz, then you have to use a predicate (which is what lex is doing, by the way: it sees your rule, tries a match, fails to match, tries the next rule listed, and so on).

So:

FOO: 'foo'
        (
              ('foo'* 'z')=>'foo'* 'z' {$type = FOOZ; }
            | 
        )
 ;

Here you are telling ANTLR exactly what should match a FOOZ, that is all. However, in this case you would normally just have FOO and Z tokens and ask the parser for FOO Z, but I think you are saying that you cannot find a way to do that with your input language.
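
To make the behaviour of the predicated rule concrete, here is a rough Python emulation of it. This is only a sketch and not what ANTLR generates: the function name tokenize and the FOOZ_TAIL pattern are made up for illustration, whitespace is simply skipped, and only the predicated FOO rule and BAR are handled (a bare 'z' is rejected, since the rewritten rule requires a leading 'foo').

```python
import re

# Hypothetical stand-in for the ('foo'* 'z')=> predicate: zero or more
# 'foo' followed by 'z', tried at the position just after the first 'foo'.
FOOZ_TAIL = re.compile(r"(?:foo)*z")


def tokenize(text):
    """Return (type, text) pairs, emulating the predicated FOO rule plus BAR."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # assumption: whitespace is skipped between tokens
            continue
        if text.startswith("foo", pos):
            end = pos + 3
            # Predicate: look ahead without committing to anything yet.
            tail = FOOZ_TAIL.match(text, end)
            if tail:
                # Predicate succeeded: consume the tail, change type to FOOZ.
                tokens.append(("FOOZ", text[pos:tail.end()]))
                pos = tail.end()
            else:
                # Predicate failed: empty alternative, emit a plain FOO.
                tokens.append(("FOO", "foo"))
                pos = end
        elif text.startswith("bar", pos):
            tokens.append(("BAR", "bar"))
            pos += 3
        else:
            raise ValueError(f"no rule matches at offset {pos}")
    return tokens


if __name__ == "__main__":
    print(tokenize("foofoo  bar"))
    print(tokenize("foofoofooz"))
```

With 'foofoo  bar' the lookahead fails twice, so two FOO tokens and a BAR come out; with 'foofoofooz' it succeeds at the first 'foo' and the whole run becomes one FOOZ. Note that a failed lookahead means the same characters get scanned again for the next token, which is the cost of the predicate.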
It might be possible to refactor using emit() or something, I'm not
sure. Difficult, anyway. An alternative would be to force backtracking
using syntactic predicates in the manner Indhu suggested in a previous
reply, but that means the lexer would scan the same input more than
once, and avoiding this is sort of why I use a lexer generator tool
instead of just matching the input with regexps to start with.

By the way, I got around my own problem with the URL/IDENT conflict by
incorporating the URL in the larger context where it appears, getting a
larger token from the lexer which is split up later. This seemed to be
the most bearable inelegancy in my situation.
  
Again, I suggest that if it is very inelegant, there is probably a better way to do it, but sometimes there is not, when the input language is unbearably ambiguous.

Jim


List: http://www.antlr.org/mailman/listinfo/antlr-interest