[il-antlr-interest: 24086] [antlr-interest] Customizing token separators without recompiling

2009-06-07 Thread Dukie Banderjee
Hi everyone, I'm new to the list and new to ANTLR. I have a specific problem I need to solve and I hope ANTLR can help. Our client has several end-customers who all have slightly different document formats used for data interchange. All the documents are basically 'standard' EDI documents, me

[il-antlr-interest: 24089] Re: [antlr-interest] Customizing token separators without recompiling

2009-06-07 Thread Dukie Banderjee
I'm guessing there's more to the problem than just supporting arbitrary field separation tokens, because if that's all there is, just use something like perl and store the separator(s) in a config file...? --S --- On Sun, 6/7/09, Dukie Banderjee wrote: From: Dukie Banderjee Su
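
For what it's worth, the "separator in a config file" idea could look roughly like the sketch below. Python is used here only for illustration; the JSON layout, the `field_separator` key, and the sample line are all assumptions, not anything from the poster's actual setup.

```python
import json


def load_separator(config_text):
    """Read the field separator from a JSON config string.

    The "field_separator" key is a hypothetical name chosen for this sketch.
    """
    config = json.loads(config_text)
    return config["field_separator"]


def split_fields(line, separator):
    """Split one line on a literal separator, keeping empty fields."""
    return line.split(separator)


# Hypothetical config and sample line, for illustration only.
config_text = '{"field_separator": "+"}'
separator = load_separator(config_text)
fields = split_fields("A+B++C", separator)
print(fields)
```

Switching to a different customer's separator then means editing the config, not the code. This only covers plain splitting; it says nothing about escapes or nested structure, which is presumably where a real lexer starts to pay off.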

[il-antlr-interest: 24092] Re: [antlr-interest] Customizing token separators without recompiling

2009-06-07 Thread Dukie Banderjee
"If you simply want to break apart a line of text based on an arbitrary delimiter, it would be much easier to write a program in Perl, Python, Java, etc. that split the text based on a configuration setting." That's basically what I'm doing right now (in C#, by hand). Are you saying that ANTLR

[il-antlr-interest: 24094] Re: [antlr-interest] Customizing token separators without recompiling

2009-06-07 Thread Dukie Banderjee
Thanks, Steve, that looks very promising! Rob > Date: Mon, 8 Jun 2009 01:14:19 +0100 > Subject: Re: [antlr-interest] Customizing token separators without recompiling > From: st...@stevecooper.org > To: dukie_bander...@hotmail.com > CC: jsrs...@yahoo.com;

[il-antlr-interest: 24108] Re: [antlr-interest] Customizing token separators without recompiling

2009-06-08 Thread Dukie Banderjee
nger get SEMI. > > Perhaps it would be best to write a custom lexer. > > EDI is another good idea screwed up by design by committee where none > of the members will give up their proprietary formats :( > > Jim > > > On Jun 7, 2009, at 4:45 PM, Dukie Banderjee > wrote:

[il-antlr-interest: 24111] [antlr-interest] Bug in AntlrWorks debugger

2009-06-08 Thread Dukie Banderjee
Hi, Is this the right place to post AntlrWorks bugs? I looked around but didn't find any other place. It seems that AntlrWorks does not accept Tab characters (or backslashes, for that matter) in the Text field of the Input Text dialog box when you press the Debug button. The result was tha

[il-antlr-interest: 24125] Re: [antlr-interest] Bug in AntlrWorks debugger

2009-06-08 Thread Dukie Banderjee
t, even the '[' char will cause the bug to occur.) Rob > CC: antlr-inter...@antlr.org > From: pa...@cs.usfca.edu > To: dukie_bander...@hotmail.com > Subject: Re: [antlr-interest] Bug in AntlrWorks debugger > Date: Mon, 8 Jun 2009 13:22

[il-antlr-interest: 24185] [antlr-interest] Keywords vs. freeform text

2009-06-12 Thread Dukie Banderjee
Hi, Hope this isn't too much of a newbie question. I need to parse a format (EDI) which is basically delimited fields, but some fields must contain standardized code values whereas other fields can contain freeform text. My question is related to lexing and/or parsing. Do I need to/want to ha

[il-antlr-interest: 24223] [antlr-interest] Basic predicate question re: lexer

2009-06-14 Thread Dukie Banderjee
Hi, I'm working on a parser for a file format that can contain text and delimiters. One of the delimiters is a ':', and you can escape the delimiter by following it with a '?' such as ':?'. I'd like to have the lexer consider the ':?' as part of the TEXT token, and ':' match the SEPARATOR toke
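
A hand-written sketch of the behaviour described, assuming one character of lookahead is enough to decide: a ':' followed by '?' stays inside the current TEXT token, and any other ':' becomes a SEPARATOR. This is plain Python, not ANTLR output, and the `tokenize` function and its token names as tuples are made up for illustration.

```python
def tokenize(text, sep=":", esc="?"):
    """Split text into ("TEXT", ...) and ("SEPARATOR", ...) tokens.

    A separator immediately followed by the escape character is kept,
    unmodified, as part of the surrounding TEXT token.
    """
    tokens = []
    buf = []  # characters of the TEXT token being built
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == sep:
            # One-character lookahead: is this an escaped separator?
            if i + 1 < len(text) and text[i + 1] == esc:
                buf.append(sep + esc)
                i += 2
                continue
            # Real separator: flush any pending TEXT first.
            if buf:
                tokens.append(("TEXT", "".join(buf)))
                buf = []
            tokens.append(("SEPARATOR", sep))
        else:
            buf.append(ch)
        i += 1
    if buf:
        tokens.append(("TEXT", "".join(buf)))
    return tokens


print(tokenize("a:?b:c"))
# [('TEXT', 'a:?b'), ('SEPARATOR', ':'), ('TEXT', 'c')]
```

One caveat with this rule as stated: a real separator followed by a field that starts with '?' cannot be told apart from an escaped separator, so the format would have to forbid that case or escape the '?' as well.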

[il-antlr-interest: 24224] [antlr-interest] CSharp2 code generation bug for ANTLRWorks 1.2.3 with -debug

2009-06-14 Thread Dukie Banderjee
The following grammar produces uncompilable code when generated from ANTLRWorks using -debug in the ANTLR options. I'm not sure which version of ANTLR is being used by ANTLRWorks. If it matters, I have ANTLR 3.1.3 on my machine. grammar EdifactDelfor; options { language = 'CSharp2' ; }

[il-antlr-interest: 24225] [antlr-interest] Matching empty string

2009-06-14 Thread Dukie Banderjee
Hi, My grammar needs to handle the following situation: A line can have multiple fields, separated by a delimiter. A field can have multiple components, separated by another delimiter. If a field or component is blank, it should be counted as a blank field or blank component. For example with f
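
To make the expected result concrete, here is a small sketch of the blank-field and blank-component counting described above, done with plain string splitting instead of a grammar. The '+' and ':' delimiters and the sample line are assumptions for illustration, as is the `parse_line` name.

```python
def parse_line(line, field_sep="+", component_sep=":"):
    """Return a list of fields, each a list of components.

    Blank fields and blank components are preserved as empty strings,
    so positions are never shifted by missing values.
    """
    return [field.split(component_sep) for field in line.split(field_sep)]


# Hypothetical line: second field is blank, last component is blank.
print(parse_line("A++B:C:"))
# [['A'], [''], ['B', 'C', '']]
```

A blank field comes back as a single blank component, which is probably the shape a grammar rule allowing an empty alternative should also produce.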

[il-antlr-interest: 24246] [antlr-interest] CSharp2 -debug generation bug

2009-06-15 Thread Dukie Banderjee
The following grammar compiles fine under ANTLR 3.1.3 except if you use the -debug option, in which case it throws an exception during generation. Exception trace follows. The culprit line is: message: unhSegment bgmSegment segment+ linLoop untSegment -> ^(MESSAGE unhSegment bgmSegment segment

[il-antlr-interest: 24287] [antlr-interest] Multi-phase tree rewriting question

2009-06-19 Thread Dukie Banderjee
Hi, I'm considering how to solve the following problem: The documents I'm parsing (EDI) are actually wrapped in 'envelopes'. Similar to XML, the envelopes have a beginning line and an ending line. The lines in between are the actual documents which are wrapped. However, different wrapped docu