Hello,

I tried to write some rules that match several numbers in a text (i.e. several numbers in a specific format within arbitrary token sequences). My number rules work when I assume one number per line and match them with:
    file: ('\n' number)*

When I change the newline to ".*", the numbers are no longer matched correctly. I tracked the problem down to a very simple rule set that matches things like

    "one" "two" "oneandone" "oneandthree" "oneandoneplusoneandthree" "oneandoneplustwo"

with "and" and "plus" acting as number connectors. The simple rule set is:

    grammar simpleNumbers;

    in   : (.* numB)*;
    numB : numA 'plus' numA | numA 'plus' | 'plus' numA | numA;
    numA : num 'and' num | num;
    num  : 'one' | 'two' | 'three';

I assumed that input of the form

    numA someTokens numA

would match the last alternative of rule numB twice. But in some cases it matches the first alternative of numB and reports a MissingTokenException, as in the following examples.

(1) twoandone xx one

matches
<<inline: parse_1.jpg>>
    numB( numA(num("two"),"and",num("one")), MissingTokenException, numA(num("one")) )

where I would have expected the last alternative of numB to match twice, giving

    numB(numA(num("two"),"and",num("one")))
    numB(numA(num("one")))

(2) plus xx one

matches
<<inline: parse_2.jpg>>
where I would have expected only numA(num("one")), with "plus" skipped.

I would be grateful for any ideas on a better way to skip tokens that do not form a valid number.

Regards,
Toby
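P.S. To make the expected behaviour concrete, here is a minimal sketch outside ANTLR. It is only a Python illustration of the matching I had in mind, not an ANTLR solution. The function names (tokenize, match_numB, find_numbers) and the choice to treat every other non-space character as a single junk token are assumptions made for this sketch.

```python
import re

NUMS = {"one", "two", "three"}


def tokenize(text):
    # Keywords become tokens; any other non-space character is a junk token.
    return re.findall(r"one|two|three|and|plus|\S", text)


def match_num(toks, i):
    # num : 'one' | 'two' | 'three'
    return i + 1 if i < len(toks) and toks[i] in NUMS else None


def match_numA(toks, i):
    # numA : num 'and' num | num
    j = match_num(toks, i)
    if j is None:
        return None
    if j < len(toks) and toks[j] == "and":
        k = match_num(toks, j + 1)
        if k is not None:
            return k
    return j  # fall back to the single-num alternative


def match_numB(toks, i):
    # numB : numA 'plus' numA | numA 'plus' | 'plus' numA | numA
    if i < len(toks) and toks[i] == "plus":
        return match_numA(toks, i + 1)
    j = match_numA(toks, i)
    if j is None:
        return None
    if j < len(toks) and toks[j] == "plus":
        k = match_numA(toks, j + 1)
        return k if k is not None else j + 1
    return j


def find_numbers(text):
    # Try numB at each position; skip one token whenever it does not match.
    toks = tokenize(text)
    found, i = [], 0
    while i < len(toks):
        j = match_numB(toks, i)
        if j is None:
            i += 1
        else:
            found.append(toks[i:j])
            i = j
    return found


print(find_numbers("twoandone xx one"))
print(find_numbers("plus xx one"))
```

For example (1) this yields two separate matches, "two and one" and "one", and for example (2) it yields only "one", with the leading "plus" skipped. The sketch commits to a numB alternative only after the whole alternative has matched, which is the behaviour I was hoping to get from the ".*" loop.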
List: http://www.antlr.org/mailman/listinfo/antlr-interest Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address