On Nov 16, 12:53 pm, len <[EMAIL PROTECTED]> wrote:
> On Nov 16, 12:40 pm, "Mark Tolonen" <[EMAIL PROTECTED]> wrote:
>
> > You might want to check out the pyparsing library.
>
> > -Mark
>
> Thanks Mark I will check it out right now.
>
> Len

Len -

Here is a rough pyparsing starter for your problem:

    from pyparsing import *

    # keywords and punctuation
    COMP = Optional("USAGE IS") + oneOf("COMP COMPUTATIONAL")
    PIC = oneOf("PIC PICTURE") + Optional("IS")
    PERIOD,LPAREN,RPAREN = map(Suppress,".()")

    ident = Word(alphanums.upper()+"_-")
    integer = Word(nums).setParseAction(lambda t:int(t[0]))
    # leading source line number on each line, parsed but discarded
    lineNum = Suppress(Optional(LineEnd()) + LineStart() + Word(nums))

    # X(30) is expanded to thirty X's, then combined into one string
    rep = LPAREN + integer + RPAREN
    repchars = "X" + rep
    repchars.setParseAction(lambda tokens: ['X']*tokens[1])
    strdecl = Combine(OneOrMore(repchars | "X"))

    # 9(5) is expanded to five 9's in the same way
    SIGN = Optional("S")
    repdigits = "9" + rep
    repdigits.setParseAction(lambda tokens: ['9']*tokens[1])
    intdecl = SIGN("sign") + Combine(OneOrMore(repdigits | "9"))("intpart")
    # repdigits (not "9" + rep) so that 9(n) after the V also expands to n digits
    realdecl = SIGN("sign") + Combine(OneOrMore(repdigits | "9"))("intpart") + "V" + \
        Combine(OneOrMore(repdigits | "9"))("realpart")

    type = Group((strdecl | realdecl | intdecl) + Optional(COMP("COMP")))

    fieldDecl = lineNum + "05" + ident("name") + \
        PIC + type("type") + PERIOD
    structDecl = lineNum + "01" + ident("name") + PERIOD + \
        OneOrMore(Group(fieldDecl))("fields")

Applied to your sample record with searchString, it prints out:

    SALESMEN-RECORD
       SALESMEN-NO ['999']
       SALESMEN-NAME ['XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX']
       SALESMEN-TERRITORY ['XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX']
       SALESMEN-QUOTA ['S', '9999999', 'COMP']
       SALESMEN-1ST-BONUS ['S', '99999', 'V', '99', 'COMP']
       SALESMEN-2ND-BONUS ['S', '99999', 'V', '99', 'COMP']
       SALESMEN-3RD-BONUS ['S', '99999', 'V', '99', 'COMP']
       SALESMEN-4TH-BONUS ['S', '99999', 'V', '99', 'COMP']

I too have some dim, dark memories of COBOL. I seem to recall having to infer from the number of digits in an integer or real what size the number would be.

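As a rough illustration of that inference: binary COMP items are commonly stored in 2, 4, or 8 bytes depending on the total digit count in the PICTURE. The thresholds below follow the usual IBM-style convention and are an assumption here, since the real sizes depend on the compiler and its options. comp_storage_bytes is a hypothetical helper, not part of pyparsing or of the grammar above.

```python
def comp_storage_bytes(int_digits, frac_digits=0):
    """Guess the storage size in bytes of a binary COMP item.

    Assumed convention (compiler-dependent):
      1-4 digits -> 2 bytes, 5-9 digits -> 4 bytes, 10-18 digits -> 8 bytes.
    The implied decimal point (V) takes no storage, so only digits count.
    """
    total = int_digits + frac_digits
    if total < 1 or total > 18:
        raise ValueError("unsupported digit count: %d" % total)
    if total <= 4:
        return 2
    if total <= 9:
        return 4
    return 8


# S9(7) COMP has 7 digits; S9(5)V99 COMP has 5 + 2 = 7 digits
print(comp_storage_bytes(7))
print(comp_storage_bytes(5, 2))
```

The digit counts passed in are the same lengths that a parse action can read off the "intpart" and "realpart" results names.
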
I don't have that logic implemented, but here is an extension to the above program, which shows you where you could put this kind of type inference logic (insert this code before the call to searchString):

    class TypeDefn(object):
        # each static method is a parse action that replaces the matched
        # tokens with a TypeDefn describing the declared type
        @staticmethod
        def intType(tokens):
            self = TypeDefn()
            self.str = "int(%d)" % (len(tokens.intpart),)
            self.isSigned = bool(tokens.sign)
            return self
        @staticmethod
        def realType(tokens):
            self = TypeDefn()
            self.str = "real(%d.%d)" % (len(tokens.intpart),len(tokens.realpart))
            self.isSigned = bool(tokens.sign)
            return self
        @staticmethod
        def charType(tokens):
            self = TypeDefn()
            # tokens holds a single Combine'd string, so measure tokens[0];
            # len(tokens) would always be 1
            self.str = "char(%d)" % len(tokens[0])
            self.isSigned = False
            self.isComp = False
            return self
        def __repr__(self):
            return ("+-" if self.isSigned else "") + self.str

    intdecl.setParseAction(TypeDefn.intType)
    realdecl.setParseAction(TypeDefn.realType)
    strdecl.setParseAction(TypeDefn.charType)

This prints:

    SALESMEN-RECORD
       SALESMEN-NO [int(3)]
       SALESMEN-NAME [char(30)]
       SALESMEN-TERRITORY [char(30)]
       SALESMEN-QUOTA [+-int(7), 'COMP']
       SALESMEN-1ST-BONUS [+-real(5.2), 'COMP']
       SALESMEN-2ND-BONUS [+-real(5.2), 'COMP']
       SALESMEN-3RD-BONUS [+-real(5.2), 'COMP']
       SALESMEN-4TH-BONUS [+-real(5.2), 'COMP']

You can post more questions about pyparsing on the Discussion tab of the pyparsing wiki home page.

Best of luck!
-- Paul