On Sat, 2009-05-16 at 08:27 +0530, Indhu Bharathi wrote:
> This is because on seeing 'f' of foo lexer has two options - 1. IDENT
> 2. URL. And it takes the second option since that seems to be longer
> than the first alternative. Note that the lexer always tries to match
> the longest token possible.
>
> After having decided to go for URL, it matches the input with URL and
> it fails. Lexer doesn't backtrack and hence throws an exception.

I suppose the key here is "lexer doesn't backtrack". The following
grammar has no problem matching inputs "foo" and "foobar", even though
one is a prefix of the other:

===============================================
grammar Y;
options { output=AST; }
file: (FOO|BAR|FOOBAR)* EOF;
FOO: 'foo';
BAR: 'bar';
FOOBAR: 'foobar';
SPACE: (' ' | '\n')+ { $channel = HIDDEN; };
===============================================

If I remove the FOOBAR token from this grammar, input "foobar" is
matched as two tokens, FOO BAR. But if I instead change the FOOBAR
token by adding another letter, like this:

FOOBAR: 'foobarz';

then the lexer fails to match "foobar" as two tokens and complains
about a missing 'z'. It can still match "foo", however.

So, if I dare to extrapolate from this: one token may be a prefix of
another (obviously, otherwise it would be difficult to create lexers
for most programming languages), but it must not be the case that a
proper prefix of one token can be matched as more than one token. Is
that it?

I can understand the motivation for this restriction, in the interest
of keeping the generated lexer code at a manageable level of
complexity, but I have not seen it stated in the documentation. It
would have been nice if the lexer generator had issued a warning.

J'
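
P.S. To make the observation concrete, here is a toy Python model of a lexer that reads just enough lookahead to pick one rule and then commits to it. This is only a sketch of the behaviour described above, under the assumption that the generated lexer works roughly this way; it is not ANTLR's actual code, and the names `predict`, `tokenize` and `LexError` are made up for the illustration.

```python
class LexError(Exception):
    """Raised when the committed rule does not match the input."""


def predict(literals, text, pos):
    """Choose one literal using only as much lookahead as is needed to
    tell the candidates apart. The choice is final: no backtracking."""
    alive = list(literals)
    depth = 0
    while len(alive) > 1:
        ch = text[pos + depth] if pos + depth < len(text) else None
        # Candidates that are longer than what we have read and agree
        # with the next input character.
        longer = [lit for lit in alive if len(lit) > depth and lit[depth] == ch]
        if longer:
            # Prefer the longer candidates and keep reading.
            alive = longer
            depth += 1
            continue
        # Nothing continues: fall back to a literal that ends exactly here.
        finished = [lit for lit in alive if len(lit) == depth]
        if not finished:
            raise LexError("no viable alternative at offset %d" % pos)
        return finished[0]
    return alive[0]


def tokenize(literals, text):
    """Split text into literals, skipping spaces and newlines."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos] in " \n":
            pos += 1
            continue
        lit = predict(literals, text, pos)
        # Commit step: the predicted literal must now match in full.
        if not text.startswith(lit, pos):
            raise LexError("committed to %r at offset %d, but input is %r"
                           % (lit, pos, text[pos:pos + len(lit)]))
        tokens.append(lit)
        pos += len(lit)
    return tokens
```

In this model, `tokenize(['foo', 'bar', 'foobar'], 'foobar')` yields the single token `foobar`, and dropping `'foobar'` from the list yields `foo` followed by `bar`. With `'foobarz'` in the list, the 'b' after "foo" is already enough to rule out `foo`, so the model commits to `foobarz` and raises `LexError` on input "foobar", while input "foo" still comes out as `foo`.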