On Jan 16, 6:54 pm, [EMAIL PROTECTED] wrote:
> Hi there,
> I'm struggling to find a sensible way to process a large chunk of
> data--line by line, but also having the ability to move to subsequent
> 'next' lines within a for loop. I was hoping someone would be willing
> to share some insights to help point me in the right direction. This
> is not a file, so any file modules or methods available for file
> parsing wouldn't apply.
>
> I run a command on a remote host by using the pexpect (pxssh) module.
> I get the result back, which is pages and pages of pre-formatted text.
> This is a pared down example (some will notice it's tivoli schedule
> output).
>

Pyparsing will work on a string or a file, and will do the line-by-line
iteration for you.  You just have to define the expected format of the
data.  The sample code below parses the data that you posted.  From this
example, you can refine the code by assigning names to the different
parsed fields, and use the field names to access the parsed values.

More info about pyparsing at http://pyparsing.wikispaces.com.

-- Paul

from pyparsing import *

# basic building blocks
integer = Word(nums)
timestamp = Combine(Word(nums,exact=2)+":"+Word(nums,exact=2))
dateString = Combine(Word(nums,exact=2)+"/"+
                     Word(nums,exact=2)+"/"+
                     Word(nums,exact=2))

# header line, with an optional trailing field taken from the rest of the line
schedHeader = Literal("Schedule HOST") + Word("#",alphas+"_") + "(" + ")" + \
    timestamp + integer + timestamp+"("+dateString+")" + \
    Optional(~LineEnd() + empty + restOfLine)

# one job line per Group; the end of line is matched but not kept
schedLine = Group(Word("(",alphanums) + Word(alphanums+"_") + timestamp +
    integer + Optional(~LineEnd() + empty + restOfLine)
    ) + LineEnd().suppress()

schedTotal = Literal("Total") + timestamp
sched = schedHeader + Group(OneOrMore(schedLine)) + schedTotal

# data is the string you got back from the remote command
from pprint import pprint
for s in sched.searchString(data):
    pprint( s.asList() )
    print()

Prints:

['Schedule HOST', '#ALL_LETTERS', '(', ')', '00:01', '10', '22:00', '(',
 '01/16/08', ')', 'LTR_CLEANUP ',
 [['(SITE1', 'LTR_DB_LETTER', '00:01', '10']],
 'Total', '00:01']

['Schedule HOST', '#DAILY', '(', ')', '00:44', '10', '18:00', '(',
 '01/16/08', ')', 'DAILY_LTR ',
 [['(SITE3', 'RUN_LTR14_PROC', '00:20', '10'],
  ['(SITE1', 'LTR14A_WRAPPER', '00:06', '10', 'SITE3#RUN_LTR14_PROC '],
  ['(SITE1', 'LTR14B_WRAPPER', '00:04', '10', 'SITE1#LTR14A_WRAPPER '],
  ['(SITE1', 'LTR14C_WRAPPER', '00:03', '10', 'SITE1#LTR14B_WRAPPER '],
  ['(SITE1', 'LTR14D_WRAPPER', '00:02', '10', 'SITE1#LTR14C_WRAPPER '],
  ['(SITE1', 'LTR14E_WRAPPER', '00:01', '10', 'SITE1#LTR14D_WRAPPER '],
  ['(SITE1', 'LTR14F_WRAPPER', '00:03', '10', 'SITE1#LTR14E_WRAPPER '],
  ['(SITE1', 'LTR14G_WRAPPER', '00:03', '10', 'SITE1#LTR14F_WRAPPER '],
  ['(SITE1', 'LTR14H_WRAPPER', '00:02', '10', 'SITE1#LTR14G_WRAPPER ']],
 'Total', '00:44']

['Schedule HOST', '#CARDS', '(', ')', '00:02', '10', '20:30', '(',
 '01/16/08', ')', 'STR2_D ',
 [['(SITE7', 'DAILY_MEETING_FILE', '00:01', '10'],
  ['(SITE3', 'BEHAVE_HALT_FILE', '00:01', '10', 'SITE7#DAILY_HOME_FILE ']],
 'Total', '00:02']
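
As a rough sketch of the "assign names to the parsed fields" refinement mentioned above, something along these lines might work. The sample text is a made-up, simplified stand-in for the real schedule output (fewer columns, no dates or dependencies), and the field names (schedule, elapsed, priority, site, job, jobs, total) are my own guesses at what the columns mean, so adjust both to the actual data.

```python
from pyparsing import Combine, Group, OneOrMore, Suppress, Word, alphanums, nums

# Hypothetical, simplified sample; NOT the original poster's data.
data = """\
Schedule HOST #DAILY 00:44 10
(SITE3 RUN_LTR14_PROC 00:20 10
(SITE1 LTR14A_WRAPPER 00:06 10
Total 00:44

Schedule HOST #CARDS 00:02 10
(SITE7 DAILY_MEETING_FILE 00:01 10
Total 00:02
"""

integer = Word(nums)
timestamp = Combine(Word(nums, exact=2) + ":" + Word(nums, exact=2))

# Calling an expression with a string attaches a results name to it.
# The names used here are assumptions about what each column means.
schedHeader = (Suppress("Schedule HOST")
               + Word("#", alphanums + "_")("schedule")
               + timestamp("elapsed")
               + integer("priority"))

# Names inside a Group are local to that group, so "elapsed" can be reused.
schedLine = Group(Suppress("(")
                  + Word(alphanums)("site")
                  + Word(alphanums + "_")("job")
                  + timestamp("elapsed")
                  + integer("priority"))

schedTotal = Suppress("Total") + timestamp("total")
sched = schedHeader + Group(OneOrMore(schedLine))("jobs") + schedTotal

# Collect the parsed values by name instead of by list position.
schedules = {}
for s in sched.searchString(data):
    schedules[s.schedule] = {
        "total": s.total,
        "jobs": [(j.site, j.job, j.elapsed) for j in s.jobs],
    }

for name, info in schedules.items():
    print(name, info["total"], info["jobs"])
```

With names attached, the loop body no longer depends on where each field sits in the token list, so adding or suppressing a column in the grammar should not break the code that reads the results.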