Achint, I am a newbie here too. I don't mean to talk from a position of authority.
The trick is whether you have keywords that are reserved. Can you say in your language that Integer is a "type", or is it merely a string that has a meaning only in certain contexts? Yes, it is restrictive to have reserved words, but it makes the parsing much easier.

Joe

Achint Mehta wrote:
> Hi Joe,
>
> Thanks for your response.
>
> You have proposed two solutions:
> 1. Replace ver with SPECIAL_STRING and check in the target code for
> allowed values. This means that if I intend to collect a generic
> unquoted string in an ANTLR parser, then I cannot use any tokens in the
> whole parser. In a big parser, this seems to be a limitation, which
> means that the target language program validates every string where
> a token should have been placed in the parser.
>
> 2. The second option is that all the tokens have to be given as alternate
> rules/tokens with SPECIAL_STRING. Again, in a big/complicated parser,
> all the tokens in the whole parser have to be repeated wherever I
> intend to use the SPECIAL_STRING. This can be simplified if I give the
> tokens in the definition of SPECIAL_STRING itself. But still, in a
> parser which could use tens or hundreds of tokens, it would seem to be
> impractical to repeat all the tokens in the SPECIAL_STRING rule and other
> similar rules (intended for collecting the generic string).
>
> The parser that I have put in the e-mail is a simplified version of
> the issue I am facing. I am writing a SIP protocol message parser. The
> very first line of a SIP message starts as (I am compressing the rules
> for clarity):
>
> Method SPACE Request-URI ... (other rules follow)
> Method: "INVITE" | "ACK" | "OPTIONS" | "BYE" | "CANCEL" | "REGISTER"
> Request-URI boils down to : "sip:" [userinfo "@"] hostport
> url-parameters [headers]
> and userinfo is an unquoted alpha-numeric string.
>
> If the SIP message starts as REGISTER SIP:regis...@...
> the parsing would fail if I write the rules as I mentioned in my
> sample program earlier.
> SIP protocol is filled with rules such as userinfo where unquoted
> alpha-numeric strings have to be collected and there are tens of
> tokens in its grammar. This is a typical scenario for any protocol
> grammar. I am not sure repeating all tokens in rules or treating
> everything as a generic string would be a neat solution.
>
> I admit that I am a noob when it comes to familiarity with other
> lexers/parsers, and the rest of them might require some other work-around
> as well. But the situation seems to be common enough to have a
> straightforward solution (though I might be wrong).
>
> Thanks.
>
> Regards,
> Achint
>
>
> I don't see this as an ambiguity issue but rather a decision of whether
> your grammar uses reserved words or not.
> I'm not an expert by any means, but that doesn't mean I don't have an
> opinion, just that you should take it with a grain of salt.
>
> You can either handle this with a symbol table later in the process or
> rewrite the requestline to something like
> requestline : ver EQUAL (SPECIAL_STRING | ver);
>
> Joe
>
>
> Achint Mehta wrote:
> > Hi All,
> >
> > The section "Ambiguities and Non determinisms" of the book "The
> > definitive ANTLR guide" talks about the ambiguities in lexer rules,
> > but I am not sure how to resolve them.
> >
> > Consider the following grammar, which assigns a value to an ID. The ID
> > can either be VERSION or COUNT while its value can be anything:
> > -----------------------------------------------
> > grammar sample_parser;
> >
> > requestline : ver EQUAL SPECIAL_STRING ;
> >
> > /* Tokens */
> > ver:('VERSION'| 'V') {}
> > | ('COUNT' | 'C') {} ;
> >
> >
> > SPECIAL_STRING:(CHAR)+ ;
> > WHITESPACE: ' ';
> > NEWLINE: ('\r')?
> >     '\n';
> > EQUAL: '=';
> >
> > fragment
> > CHAR: (('a'..'z')|('A'..'Z'));
> > -----------------------------------------------
> >
> > If the input is given as
> > VERSION=FIRST
> > then it works, but if the following input is given
> > VERSION=VERSION
> > then I get an error (MissingTokenException after the "=").
> >
> > How can this ambiguity be resolved?
> >
> > Thanks in advance.
> >
> > Regards,
> > Achint
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
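
P.S. To make the reserved-word trade-off in this thread concrete, here is a rough sketch in plain Python. It is hand-written and is not ANTLR output. The names `KEYWORDS`, `tokenize`, `parse_requestline` and the `keywords_allowed_as_value` flag are all made up for illustration, and I am assuming the lexer always prefers a keyword over a generic string, which is my reading of why VERSION=VERSION fails in the sample grammar.

```python
import re

# Hypothetical keyword set mirroring the 'ver' alternatives in the sample grammar.
KEYWORDS = {"VERSION", "V", "COUNT", "C"}

# A run of letters is either a keyword or a generic string; '=' is its own token.
TOKEN_RE = re.compile(r"(?P<WORD>[A-Za-z]+)|(?P<EQUAL>=)")


def tokenize(text):
    """Return (kind, value) pairs; a keyword always wins over SPECIAL_STRING."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup == "WORD":
            kind = "KEYWORD" if m.group() in KEYWORDS else "SPECIAL_STRING"
        else:
            kind = "EQUAL"
        tokens.append((kind, m.group()))
        pos = m.end()
    return tokens


def parse_requestline(text, keywords_allowed_as_value):
    """requestline : ver EQUAL value ; value may optionally accept keywords."""
    tokens = tokenize(text)
    if len(tokens) != 3:
        raise ValueError("expected: ver '=' value")
    (name_kind, name), (eq_kind, _), (value_kind, value) = tokens
    if name_kind != "KEYWORD":
        raise ValueError(f"{name!r} is not a known ver keyword")
    if eq_kind != "EQUAL":
        raise ValueError("expected '=' after ver")
    # Reserved-word style: only a generic string may follow '='.
    accepted = {"SPECIAL_STRING"}
    if keywords_allowed_as_value:
        # Contextual style, analogous to (SPECIAL_STRING | ver) in the grammar.
        accepted.add("KEYWORD")
    if value_kind not in accepted:
        raise ValueError(f"unexpected {value_kind} token {value!r} after '='")
    return name, value
```

Calling `parse_requestline("VERSION=VERSION", False)` raises an error, which is the reserved-word behaviour. Passing `True` accepts the keyword in the value position, at the cost of listing the keyword tokens in every rule that collects a generic string, which is exactly the repetition Achint is worried about.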