[il-antlr-interest: 23998] Re: [antlr-interest] Inconsistent Parse Results

Indhu Bharathi Wed, 03 Jun 2009 08:15:05 -0700

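That is expected behavior. On seeing ' C', the lexer commits to the ' CORP '
literal token instead of matching OTHERCHAR (the space) followed by WORD. When
the next character turns out to be 'h' rather than 'O', the match fails and
the characters consumed so far are lost, which is why "Ch" disappears from
your output. You need to do some left factoring there, or you can restructure
your grammar to avoid the problem. Here is a suggested correction: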


grammar Test ;

test1 : NUMBER CORP data {System.out.println("Data: " + $data.text);} ;

data : ~('\r' | '\n')* ;

NUMBER : '0'..'9'+ ;

CORP:   'CORP' ;

WORD : ('a'..'z' | 'A'..'Z')+ ;

WS      :       (' ' | '\t') {$channel=HIDDEN;}
        ;

OTHERCHAR
        :       .
        ;
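
To make the failure concrete, here is a rough toy model in plain Java of what the original lexer appears to be doing. It is not ANTLR's generated code: the class name CorpLexerSketch and its methods are made up for illustration, the "space followed by 'C' means ' CORP '" decision is hard-coded, and skipping the offending character after an error is an assumption chosen to match the output reported in the original post.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT ANTLR's generated lexer. It hard-codes the single
// decision discussed above (a space followed by 'C' commits to ' CORP ').
class CorpLexerSketch {
    static final String LITERAL = " CORP ";

    // Error messages, shaped like the ones quoted in the original post.
    final List<String> errors = new ArrayList<>();
    // Text of every token that matched successfully, in input order.
    final List<String> tokens = new ArrayList<>();

    static boolean isDigit(char c) { return c >= '0' && c <= '9'; }

    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    void lex(String input) {
        int pos = 0;
        while (pos < input.length()) {
            char c = input.charAt(pos);
            boolean nextIsC = pos + 1 < input.length() && input.charAt(pos + 1) == 'C';
            if (c == ' ' && nextIsC) {
                // Committed to the literal: no fallback to OTHERCHAR + WORD.
                pos = matchLiteral(input, pos);
            } else if (isDigit(c)) {            // NUMBER
                int start = pos;
                while (pos < input.length() && isDigit(input.charAt(pos))) pos++;
                tokens.add(input.substring(start, pos));
            } else if (isLetter(c)) {           // WORD
                int start = pos;
                while (pos < input.length() && isLetter(input.charAt(pos))) pos++;
                tokens.add(input.substring(start, pos));
            } else {                            // single-character token
                tokens.add(String.valueOf(c));
                pos++;
            }
        }
    }

    // Matches " CORP " character by character. On a mismatch, records an
    // error, drops what was consumed, and (assumption) skips the bad char.
    int matchLiteral(String input, int start) {
        for (int i = 0; i < LITERAL.length(); i++) {
            int pos = start + i;
            if (pos >= input.length()) {
                errors.add("line 1:" + pos + " unexpected end of input");
                return input.length();
            }
            if (input.charAt(pos) != LITERAL.charAt(i)) {
                errors.add("line 1:" + pos + " mismatched character '"
                        + input.charAt(pos) + "' expecting '" + LITERAL.charAt(i) + "'");
                return pos + 1;
            }
        }
        tokens.add(LITERAL);
        return start + LITERAL.length();
    }

    // Concatenated text of all tokens after the first " CORP " token
    // (all tokens if the literal never matched).
    String dataText() {
        int i = tokens.indexOf(LITERAL);
        return String.join("", tokens.subList(i + 1, tokens.size()));
    }

    public static void main(String[] args) {
        CorpLexerSketch lexer = new CorpLexerSketch();
        lexer.lex("2 CORP The Church of Jesus Christ of Latter-day Saints");
        for (String error : lexer.errors) System.out.println(error);
        // Prints: Data: Theurch of Jesusrist of Latter-day Saints
        System.out.println("Data: " + lexer.dataText());
    }
}
```

With the sample line from the original post, this sketch reports mismatches at positions 12 and 28 and loses " Ch" both times. Once CORP is a token without the surrounding spaces, a space is never the start of a multi-character literal, so this particular wrong commitment cannot happen.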
        

Cheers, Indhu 


-----Original Message-----
From: antlr-interest-boun...@antlr.org
[mailto:antlr-interest-boun...@antlr.org] On Behalf Of Glen Miller
Sent: Wednesday, June 03, 2009 7:44 PM
To: antlr-inter...@antlr.org
Subject: [antlr-interest] Inconsistent Parse Results

When parsing the following data 
"2 CORP The Church of Jesus Christ of Latter-day Saints"

The parser is choking on Ch? and stripping it out.

line 1:12 mismatched character 'h' expecting 'O'
line 1:28 mismatched character 'h' expecting 'O'
Data: Theurch of Jesusrist of Latter-day Saints

I am new to ANTLR. Is my grammar wrong, or is it a bug?

Grammar -


grammar Test1 ;

test1 : NUMBER ' CORP ' data {System.out.println("Data: " + $data.text);} ;

data : ~('\r' | '\n')* ;

NUMBER : '0'..'9'+ ;

OTHERCHAR : 
        '~' | 
        '!' | 
        '@' | 
        '#' | 
        '$' | 
        '%' | 
        '^' | 
        '&' | 
        '*' | 
        '(' | 
        ')' | 
        '-' | 
        '_' | 
        '+' | 
        '=' | 
        '{' | 
        '}' | 
        '[' | 
        ']' | 
        ':' | 
        ';' | 
        '<' | 
        '>' | 
        '?' | 
        ',' | 
        '.' | 
        '/' | 
        ' ' ;

WORD : ('a'..'z' | 'A'..'Z')+ ;


Test App -




import java.io.IOException;
import org.antlr.runtime.ANTLRFileStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RecognitionException;

public class TestApp
{
        public static void main(String[] inArgList)
        {
                try
                {
                        ANTLRFileStream theFileStream = new ANTLRFileStream("/home/glenmiller/tmp1/output/TestData2");
                        Test1Lexer theLexer = new Test1Lexer(theFileStream);
                        CommonTokenStream theTokenStream = new CommonTokenStream(theLexer);
                        Test1Parser theParser = new Test1Parser(theTokenStream);
                        theParser.test1();

                }
                catch (IOException inException)
                {
                        inException.printStackTrace();
                }
                catch (RecognitionException inException)
                {
                        inException.printStackTrace();
                }
        }
}




List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe:
http://www.antlr.org/mailman/options/antlr-interest/your-email-address

