"John Machin" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On 5/06/2006 10:07 AM, Paul McGuire wrote: > > "John Machin" <[EMAIL PROTECTED]> wrote in message > > news:[EMAIL PROTECTED] > >> Fantastic -- at least for the OP's carefully copied-and-pasted input. > >> Meanwhile back in the real world, there might be problems with multiple > >> tabs used for 'prettiness' instead of 1 tab, non-integer values, etc etc. > >> In that case a loop approach that validated as it went and was able to > >> report the position and contents of any invalid input might be better. > > > > Yeah, for that you'd need more like a real parser... hey, wait a minute! > > What about pyparsing?! > > > > Here's a pyparsing version. The definition of the parsing patterns takes > > little more than the re definition does - the bulk of the rest of the code > > is parsing/scanning the input and reporting the results. > > > > [big snip] > > I didn't see any evidence of error handling in there anywhere. > > Pyparsing has a certain amount of error reporting built in, raising a ParseException when a mismatch occurs.
This particular "grammar" is actually pretty error-tolerant.  To force an
error, I replaced "One for the money" with "1 for the money", and here is the
exception reported by pyparsing, along with the output of a diagnostic
method, markInputline:

    stuff = 'Yellow hat\t2\tBlue shirt\t1\nWhite socks\t4\tGreen pants\t1\nBlue bag\t4\tNice perfume\t3\nWrist watch\t7\tMobile phone\t4\nWireless cord!\t2\tBuilding tools\t3\nOne for the money\t7\tTwo for the show\t4'
    badstuff = 'Yellow hat\t2\tBlue shirt\t1\nWhite socks\t4\tGreen pants\t1\nBlue bag\t4\tNice perfume\t3\nWrist watch\t7\tMobile phone\t4\nWireless cord!\t2\tBuilding tools\t3\n1 for the money\t7\tTwo for the show\t4'

    # itemDesc and integer are the expressions from the earlier (snipped) post
    pattern = dictOf( itemDesc, integer ) + stringEnd
    print pattern.parseString(stuff)
    print
    try:
        print pattern.parseString(badstuff)
    except ParseException, pe:
        print pe
        print pe.markInputline()

Gives:

    [['Yellow hat', '2'], ['Blue shirt', '1'], ['White socks', '4'], ['Green pants', '1'], ['Blue bag', '4'], ['Nice perfume', '3'], ['Wrist watch', '7'], ['Mobile phone', '4'], ['Wireless cord!', '2'], ['Building tools', '3'], ['One for the money', '7'], ['Two for the show', '4']]

    Expected stringEnd (at char 210), (line:6, col:1)
    >!<1 for the money    7    Two for the show    4

-- Paul
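
For comparison, the loop approach John describes above (validate each field
as it goes, and report the position and contents of anything invalid) might
look roughly like the sketch below.  It is plain Python 3 and is not code
from this thread: the name parse_pairs and the layout of the error tuples
are made up for illustration, and it assumes tab-separated description/count
pairs like those in stuff.

```python
def parse_pairs(text):
    """Hypothetical helper: split 'description<TAB>count' pairs, validating as we go.

    Returns (pairs, errors).  Each error is a tuple of
    (line_no, field_no, field_text, reason), with both numbers 1-based.
    field_no counts non-empty fields only.
    """
    pairs = []
    errors = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        # Drop empty fields so runs of tabs used for 'prettiness' are tolerated.
        fields = [f for f in line.split("\t") if f]
        for i in range(0, len(fields), 2):
            desc = fields[i]
            if i + 1 >= len(fields):
                # Odd number of fields: the last description has no count.
                errors.append((line_no, i + 1, desc, "description has no count"))
                break
            count = fields[i + 1]
            try:
                value = int(count)
            except ValueError:
                errors.append((line_no, i + 2, count, "count is not an integer"))
                continue
            pairs.append((desc, value))
    return pairs, errors


# Deliberately messy input: a doubled tab, a non-integer count, a missing count.
bad = "Yellow hat\t\t2\tBlue shirt\tone\nWhite socks\t4\tGreen pants"
pairs, errors = parse_pairs(bad)
for line_no, field_no, content, reason in errors:
    print("line %d, field %d: %r (%s)" % (line_no, field_no, content, reason))
```

Unlike the pyparsing version, this collects every problem instead of stopping
at the first mismatch, at the cost of hard-coding the input layout in the loop.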