On Sat, 27 Aug 2016 08:33 am, ddream.mercha...@gmail.com wrote:

> My log file has several sections starting with ==== START ==== and ending
> with ==== END ====.
Um. Is this relevant? Are you saying that you only wish to search the file
between those lines, and ignore anything outside of them? If the file looks
like:

    xxxx
    xxxx
    xxxx
    --operation(): AutoAuthOSUserSubmit StartOperation
    xxxx
    xxxx
    ==== START ====
    xxxx
    xxxx
    xxxx
    ==== END ====
    xxxx
    xxxx
    xxxx
    --operation(): AutoAuthOSUserSubmit StartOperation
    xxxx

do you expect to say that nothing is found?

I'm going to assume that you wouldn't have mentioned this if it wasn't
important, so let's start by filtering out everything outside of
===START=== and ===END=== sections.

For that, we want a filter that swaps between "ignore these lines" and
"search these lines" depending on whether you are inside or outside of a
START...END section. We'll use regular expressions for matching.

    import re

    START = r'''
        ={2,}     # two or more equal signs
        \s*       # any amount of whitespace
        START     # the literal word START in uppercase
        \s*       # more optional whitespace
        ={2,}     # two or more equal signs
        $         # end of the line
        '''
    END = r'={2,}\s*END\s*={2,}$'  # Similar to above, without verbose mode.

    # Verbose mode is requested with re.VERBOSE rather than an inline (?x):
    # Python 3.11 and later reject an inline global flag unless it is at the
    # very start of the pattern, and here it would follow a newline.
    START = re.compile(START, re.VERBOSE)
    END = re.compile(END)

    def filter_sections(lines):
        outside = True
        for line in lines:
            line = line.strip()  # ignore leading and trailing whitespace
            if outside:
                # ignore all lines until we see START
                if START.match(line):
                    outside = False
                else:
                    pass  # just ignore the line
            else:
                # pass on every line until we see END
                if END.match(line):
                    outside = True
                else:
                    yield line

Now you need to test that this does what you expect:

    with open("mylogfile.log") as f:
        for line in filter_sections(f):
            print(line)

should print *only* the lines between the START and END lines.

Once you are satisfied that this works correctly, move on to the next part:
extracting the relevant information from each line. There are three things
you wish to look for, so you want three regular expressions.
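Before writing those, it may be worth checking the section filter on a few
in-memory lines rather than the real log. A minimal sketch, assuming the
markers look like the ones in your message; the sample lines and the
operation names "Ignored" and "AlsoIgnored" are made up for illustration:

```python
import re

# Same marker patterns as above, written compactly.
START = re.compile(r'={2,}\s*START\s*={2,}$')
END = re.compile(r'={2,}\s*END\s*={2,}$')

def filter_sections(lines):
    """Yield only the stripped lines found between START and END markers."""
    outside = True
    for line in lines:
        line = line.strip()
        if outside:
            # Stay outside until a START marker is seen.
            outside = not START.match(line)
        elif END.match(line):
            outside = True
        else:
            yield line

# Made-up sample data standing in for the real log file.
sample = [
    "--operation(): Ignored StartOperation\n",
    "==== START ====\n",
    "--operation(): AutoAuthOSUserSubmit StartOperation\n",
    "some other line\n",
    "==== END ====\n",
    "--operation(): AlsoIgnored StartOperation\n",
]

inside = list(filter_sections(sample))
for line in inside:
    print(line)
```

Only the two lines between the markers should be printed. Now, back to those
three regular expressions.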
I'm not being paid for this, so here's one; the other two are up to you:

    OPERATION = r'''
        --operation\(\):   # literal string
        \s*                # optional whitespace
        (.*?)              # anything at all, in a group (non-greedy, so the
                           # trailing whitespace is not captured as well)
        \s*                # more optional whitespace
        StartOperation     # another literal string
        .*?$               # ignore everything to the end of the line
        '''
    OPERATION = re.compile(OPERATION, re.VERBOSE)
    FOO = ...  # match second thing, similar to above
    BAR = ...  # match third thing

Now let's extract the data we want:

    def extract(lines):
        for line in lines:
            line = line.strip()
            mo = (OPERATION.match(line)
                  or FOO.match(line)
                  or BAR.match(line)
                  )
            if mo:
                yield mo.groups()

    with open('mylogfile.log') as f:
        for match in extract(filter_sections(f)):
            print(match)

By the way, the above code is untested.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

-- 
https://mail.python.org/mailman/listinfo/python-list