"John Machin" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> Fantastic -- at least for the OP's carefully copied-and-pasted input.
> Meanwhile back in the real world, there might be problems with multiple
> tabs used for 'prettiness' instead of 1 tab, non-integer values, etc etc.
> In that case a loop approach that validated as it went and was able to
> report the position and contents of any invalid input might be better.

Yeah, for that you'd need more like a real parser... hey, wait a minute!
What about pyparsing?!

Here's a pyparsing version.  The definition of the parsing patterns takes
little more than the re definition does - the bulk of the rest of the code
is parsing/scanning the input and reporting the results.

The pyparsing home page is at http://pyparsing.wikispaces.com.

-- Paul


stuff = 'Yellow hat\t2\tBlue shirt\t1\nWhite socks\t4\tGreen pants\t1\nBlue bag\t4\tNice perfume\t3\nWrist watch\t7\tMobile phone\t4\nWireless cord!\t2\tBuilding tools\t3\nOne for the money\t7\tTwo for the show\t4'

print "Original input string:"
print stuff
print

from pyparsing import *

# define low-level elements for parsing
itemWord = Word(alphas, alphanums+".!?")
itemDesc = OneOrMore(itemWord)
integer = Word(nums)

# add parse action to itemDesc to merge separate words into single string
itemDesc.setParseAction( lambda s,l,t: " ".join(t) )

# define macro element for an entry
entry = itemDesc.setResultsName("item") + integer.setResultsName("qty")

# scan through input string for entry's, print out their named fields
print "Results when scanning for entries:"
for t,s,e in entry.scanString(stuff):
    print t.item,t.qty
print

# parse entire string, building ParseResults with dict-like access
results = dictOf( itemDesc, integer ).parseString(stuff)
print "Results when parsing entries as a dict:"
print "Keys:", results.keys()
for item in results.items():
    print item
for k in results.keys():
    print k,"=", results[k]


prints:

Original input string:
Yellow hat	2	Blue shirt	1
White socks	4	Green pants	1
Blue bag	4	Nice perfume	3
Wrist watch	7	Mobile phone	4
Wireless cord!	2	Building tools	3
One for the money	7	Two for the show	4

Results when scanning for entries:
Yellow hat 2
Blue shirt 1
White socks 4
Green pants 1
Blue bag 4
Nice perfume 3
Wrist watch 7
Mobile phone 4
Wireless cord! 2
Building tools 3
One for the money 7
Two for the show 4

Results when parsing entries as a dict:
Keys: ['Wireless cord!', 'Green pants', 'Blue shirt', 'White socks', 'Mobile phone', 'Two for the show', 'One for the money', 'Blue bag', 'Wrist watch', 'Nice perfume', 'Yellow hat', 'Building tools']
('Wireless cord!', '2')
('Green pants', '1')
('Blue shirt', '1')
('White socks', '4')
('Mobile phone', '4')
('Two for the show', '4')
('One for the money', '7')
('Blue bag', '4')
('Wrist watch', '7')
('Nice perfume', '3')
('Yellow hat', '2')
('Building tools', '3')
Wireless cord! = 2
Green pants = 1
Blue shirt = 1
White socks = 4
Mobile phone = 4
Two for the show = 4
One for the money = 7
Blue bag = 4
Wrist watch = 7
Nice perfume = 3
Yellow hat = 2
Building tools = 3
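
For comparison, the validating loop that John describes in the quoted text could be sketched in plain Python, without pyparsing. This is only a rough illustration, not anything from the thread: the function name parse_entries, the layout of the error tuples, and the choice to treat runs of tabs as a single separator are all assumptions made for the example. It uses only the standard library and is written for Python 3, unlike the Python 2 listing above.

```python
def parse_entries(text):
    """Parse tab-separated 'item<TAB>qty' pairs, validating along the way.

    Returns (entries, errors). entries is a list of (item, qty) tuples.
    errors is a list of (line_no, field_no, raw_text, reason) tuples, where
    field_no is the 1-based position among the non-empty fields of the line.
    """
    entries = []
    errors = []
    for line_no, line in enumerate(text.splitlines(), 1):
        # Splitting on single tabs leaves empty strings where tabs were
        # doubled up for 'prettiness'; drop them so the fields stay paired.
        fields = [f.strip() for f in line.split("\t")]
        fields = [f for f in fields if f]

        # An odd field count means the last item has no quantity.
        if len(fields) % 2:
            errors.append((line_no, len(fields), fields[-1],
                           "item without quantity"))
            fields = fields[:-1]

        for i in range(0, len(fields), 2):
            item, qty = fields[i], fields[i + 1]
            try:
                entries.append((item, int(qty)))
            except ValueError:
                # Report where the bad value was found and what it was.
                errors.append((line_no, i + 2, qty,
                               "quantity is not an integer"))
    return entries, errors


if __name__ == "__main__":
    sample = ("Yellow hat\t2\tBlue shirt\t1\n"
              "White socks\t\t4\tGreen pants\tone\n"
              "Blue bag\t4\tNice perfume")
    good, bad = parse_entries(sample)
    for item, qty in good:
        print(item, qty)
    for line_no, field_no, raw, reason in bad:
        print("line %d, field %d: %r (%s)" % (line_no, field_no, raw, reason))
```

The trade-off against the pyparsing version is that this loop keeps going after a bad field and collects every problem it finds, at the cost of hard-coding the tab-separated layout.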