On Dec 9, 11:01 pm, Prabhu Gurumurthy <[EMAIL PROTECTED]> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> All,
>
> I have the following lines that I would like to parse in python using
> pyparsing, but have some problems forming the grammar.
>
> Line in file:
> table <ALINK> const { 207.135.103.128/26, 207.135.112.64/29 }
> table <INTRANET> persist { ! 10.200.2/24, 10.200/22 }
> table <RFC_1918> const { 192.168/16, ! 172.24.1/29, 172.16/12, 169.254/16 }
> table <DIALER> persist { 10.202/22 }
> table <RAVPN> const { 10.206/22 }
> table <KS> const { \
> 10.205.1/24, \
> 169.136.241.68, \
> 169.136.241.70, \
> 169.136.241.71, \
> 169.136.241.72, \
> 169.136.241.75, \
> 169.136.241.76, \
> 169.136.241.77, \
> 169.136.241.78, \
> 169.136.241.79, \
> 169.136.241.81, \
> 169.136.241.82, \
> 169.136.241.85 }
>
> I have the following grammar defn.
>
> tableName = Word(alphanums + "-" + "_")
> leftClose = Suppress("<")
> rightClose = Suppress(">")
> key = Suppress("table")
> tableType = Regex("persist|const")
> ip4Address = OneOrMore(Word(nums + "."))
> ip4Network = Group(ip4Address + Optional(Word("/") +
> OneOrMore(Word(nums))))
> temp = ZeroOrMore("\\" + "\n")
> tableList = OneOrMore(Optional("\\") |
> ip4Network | ip4Address | Suppress(",") | Literal("!"))
> leftParen = Suppress("{")
> rightParen = Suppress("}")
>
> table = key + leftClose + tableName + rightClose + tableType + \
> leftParen + tableList + rightParen
>
> I cannot seem to match sixth line in the file above, i.e table name with
> KS, how do I form the grammar for it, BTW, I still cannot seem to ignore
> comments using table.ignore(Literal("#") + restOfLine), I get a parse error.
>
> Any help appreciated.
> Thanks
> Prabhu

Prabhu -

This is a good start, but here are some suggestions:

1. ip4Address = OneOrMore(Word(nums + "."))

Word(nums + ".") will read any contiguous run of the characters in the
string nums + ".", so OneOrMore is not necessary for reading in an
ip4Address. Just use:

    ip4Address = Word(nums + ".")

2. ip4Network = Group(ip4Address + Optional(Word("/") + OneOrMore(Word(nums))))

Same comment: OneOrMore is not needed around the Word(nums) that follows
the ip4Address:

    ip4Network = Group(ip4Address + Optional(Word("/") + Word(nums)))

3. tableList = OneOrMore(Optional("\\") | ip4Network | ip4Address | Suppress(",") | Literal("!"))

The list of ip4Networks is just a comma-delimited list, with some
entries preceded by a '!' character. It is simpler to use pyparsing's
built-in helper, delimitedList, as in:

    tableList = Group(delimitedList(Group("!" + ip4Network) | ip4Network))

Yes, I know, you are saying, "but what about all those backslashes?"
The backslashes look like they are just there as line continuations.
We can define an ignore expression, so that the table expression, and
all of its contained expressions, will ignore a '\' at the end of a
line as a line continuation:

    table.ignore(Literal("\\") + LineEnd())

And I'm not sure why you had trouble with ignoring '#' + restOfLine;
it works fine in the program below.

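Before assembling the full grammar, points 1 and 3 can be checked on
their own. The following is only a small sketch: the names entry,
entries, and sample are made up for illustration, and the sample text
is a shortened, invented variant of your input that mixes a backslash
continuation, a '!' entry, and a '#' comment.

```python
from pyparsing import (Word, nums, Group, Optional, Literal, LineEnd,
                       delimitedList, restOfLine)

# Word matches a maximal run of the given characters, so a single Word
# reads a whole dotted address; no OneOrMore wrapper is needed.
ip4Address = Word(nums + ".")
ip4Network = Group(ip4Address + Optional(Word("/") + Word(nums)))

# Hypothetical names for this sketch: one list entry, optionally negated.
entry = Group("!" + ip4Network) | ip4Network
entries = delimitedList(entry)

# Skip backslash line continuations and '#' comments wherever they occur.
entries.ignore(Literal("\\") + LineEnd())
entries.ignore(Literal("#") + restOfLine)

# Invented sample: continuation after the first entry, comment after the second.
sample = "10.205.1/24, \\\n ! 169.136.241.68, # a comment\n 10.200/22"
parsed = entries.parseString(sample).asList()
print(parsed)
```

The commas are suppressed by delimitedList, so only the address groups
remain in the results, and the negated entry keeps its '!' as the first
element of its own group.
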
If you make these changes, your program will look something like this:

    from pprint import pprint
    from pyparsing import (Word, Suppress, Regex, Group, Optional, Literal,
                           LineEnd, OneOrMore, delimitedList, restOfLine,
                           alphanums, nums)

    tableName = Word(alphanums + "-" + "_")
    leftClose = Suppress("<")
    rightClose = Suppress(">")
    key = Suppress("table")
    tableType = Regex("persist|const")
    ip4Address = Word(nums + ".")
    ip4Network = Group(ip4Address + Optional(Word("/") + Word(nums)))
    tableList = Group(delimitedList(Group("!" + ip4Network) | ip4Network))
    leftParen = Suppress("{")
    rightParen = Suppress("}")

    table = key + leftClose + tableName + rightClose + tableType + \
            leftParen + tableList + rightParen
    # skip backslash line continuations and '#' comments
    table.ignore(Literal("\\") + LineEnd())
    table.ignore(Literal("#") + restOfLine)

    # parse the input (line holds the text of your file as one string),
    # and pprint the results
    result = OneOrMore(table).parseString(line)
    pprint(result.asList())

Prints out:

    ['ALINK',
     'const',
     [['207.135.103.128', '/', '26'], ['207.135.112.64', '/', '29']],
     'INTRANET',
     'persist',
     [['!', ['10.200.2', '/', '24']], ['10.200', '/', '22']],
     'RFC_1918',
     'const',
     [['192.168', '/', '16'],
      ['!', ['172.24.1', '/', '29']],
      ['172.16', '/', '12'],
      ['169.254', '/', '16']],
     'DIALER',
     'persist',
     [['10.202', '/', '22']],
     'RAVPN',
     'const',
     [['10.206', '/', '22']],
     'KS',
     'const',
     [['10.205.1', '/', '24'],
      ['169.136.241.68'],
      ['169.136.241.70'],
      ['169.136.241.71'],
      ['169.136.241.72'],
      ['169.136.241.75'],
      ['169.136.241.76'],
      ['169.136.241.77'],
      ['169.136.241.78'],
      ['169.136.241.79'],
      ['169.136.241.81'],
      ['169.136.241.82'],
      ['169.136.241.85']]]

-- Paul