I have hundreds of files in a directory, and from each of them I need to extract several values: the filename with its pathname (the filenames start with test*), the object 1,1,25296896:8192 (only from the line containing the pattern "Corrupting"), the data before corruption (a hex value), the offset (digits), and the size (digits).

Sample file contents (all my files are small):

07/22/2017 12:34:28 AM INFO: --offset=18 --mirror=1 --path=/ifs/i/inode.txt --size=4
07/22/2017 12:34:28 AM INFO:The mirror selected is 1,1,25296896:8192
07/22/2017 12:34:28 AM INFO:Data before corruption : 1b000100
07/22/2017 12:34:28 AM INFO:Corrupting disk object 6 at 1,1,25296896:8192
07/22/2017 12:34:28 AM INFO:Data after corruption : 00000000

I am expecting something like this:

# Filename : /var/01010101/test01log object: 1,1,25296896:8192 checksum : 1b000100 offset: 18 size:4
# Filename : /var/01010101/test03log object: 1,2,25296896:8192 checksum : 1b200120 offset: 8 size:8

Here is how I have started coding this, but I am not sure how to group multiple patterns and return the result from a function. I am trying group() and groupdict(). Any tips or better ideas?

    import glob
    import re

    for filename in sorted(glob.glob('/var/01010101/test*.log')):
        with open(filename, 'r') as f:
            for linenum, line in enumerate(f):
                m = re.search(r'(Corrupting.*)', line)
                if not m:
                    # uninteresting line
                    continue
                # the object is the last whitespace-separated token on the line
                x = m.group().split()
                print filename, x[-1]

    x123-45# python test.py
    /var/01010101/test01_.log 1,1,25296896:8192

I am on Python 2.7 and Linux.

Regards,
Ganesh
-- 
https://mail.python.org/mailman/listinfo/python-list
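
One possible way to group several patterns is sketched below. It is only a sketch under assumptions: the four regular expressions are guessed from the single sample log shown above, and the names PATTERNS, extract_fields and report are made up for illustration. Each pattern carries one named group, and the groupdict() results are merged into a single dict per file. The print call takes a single string, so it should behave the same on Python 2.7 and Python 3.

```python
import glob
import re

# One pattern per field; each has a named group so groupdict() can return it.
# These patterns are assumptions based on the one sample log in this post.
PATTERNS = [
    re.compile(r'--offset=(?P<offset>\d+)'),
    re.compile(r'--size=(?P<size>\d+)'),
    re.compile(r'Data before corruption\s*:\s*(?P<checksum>[0-9a-fA-F]+)'),
    re.compile(r'Corrupting disk object \d+ at (?P<object>\S+)'),
]


def extract_fields(lines):
    """Merge the named groups found in an iterable of lines into one dict."""
    fields = {}
    for line in lines:
        for pattern in PATTERNS:
            m = pattern.search(line)
            if m:
                # groupdict() maps each group name to the text it matched
                fields.update(m.groupdict())
    return fields


def report(pathname_glob):
    """Print one summary line per file matching the glob pattern."""
    for filename in sorted(glob.glob(pathname_glob)):
        with open(filename, 'r') as f:
            fields = extract_fields(f)
        # .get() yields None for any field that was not found in the file
        print('# Filename : %s object: %s checksum : %s offset: %s size: %s' % (
            filename, fields.get('object'), fields.get('checksum'),
            fields.get('offset'), fields.get('size')))


if __name__ == '__main__':
    report('/var/01010101/test*.log')
```

If a field appears more than once in a file, the last match wins in this sketch. Keeping the patterns in a list means a new field only needs one more entry with its own named group.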