On Jan 18, 11:56 pm, Tim Chase <python.l...@tim.thechases.com> wrote:
> kak...@gmail.com wrote:
> > I want to parse a log file with the following format for
> > example:
> > TIMESTAMPE                   Operation    FileName      Bytes
> > 12/Jan/2010:16:04:59 +0200   EXISTS       sample3.3gp   37151
> > 12/Jan/2010:16:04:59 +0200   EXISTS       sample3.3gp   37151
> > 12/Jan/2010:16:04:59 +0200   EXISTS       sample3.3gp   37151
> > 12/Jan/2010:16:04:59 +0200   EXISTS       sample3.3gp   37151
> > 12/Jan/2010:16:04:59 +0200   EXISTS       sample3.3gp   37151
> > 12/Jan/2010:16:05:05 +0200   DELETE       sample3.3gp   37151
> >
> > How can i count the operations for a month(e.g total of 40 Operations,
> > 30 exists, 10 delete?)
>
> It can be done pretty easily with a regexp to parse the relevant
> bits:
>
>    import re
>    r = re.compile(r'\d+/([^/]+)/(\d+)\S+\s+\S+\s+(\w+)')
>    stats = {}
>    for line in file('log.txt'):
>      m = r.match(line)
>      if m:
>        stats[m.groups()] = stats.get(m.groups(), 0) + 1
>    print stats
>
> This prints out
>
>    {('Jan', '2010', 'EXISTS'): 5, ('Jan', '2010', 'DELETE'): 1}
>
> With the resulting data structure, you can manipulate it to do
> coarser-grained aggregates such as the total operations, or remap
> month-name abbreviations into integers so they could be sorted
> for output.
>
> -tkc
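For anyone reading this on Python 3 (where file() and the print statement no longer exist), here is a rough sketch of the same regexp approach, extended with the two follow-ups suggested above: month names remapped to integers for sorting, and a coarser per-month total. The names count_operations, monthly_totals and MONTHS are made up for this illustration and are not from the quoted reply; the sample lines are the ones from the original question, and the English month abbreviations are an assumption about the log format.

```python
import re
from collections import Counter

# Same pattern as in the quoted reply: captures month, year and operation.
LINE_RE = re.compile(r'\d+/([^/]+)/(\d+)\S+\s+\S+\s+(\w+)')

# Assumption: the log uses English three-letter month abbreviations.
# Mapping them to integers lets the keys sort chronologically.
MONTHS = {name: number for number, name in enumerate(
    'Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec'.split(), start=1)}


def count_operations(lines):
    """Count log lines per (year, month number, operation)."""
    stats = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        # Skip lines that do not match or use an unknown month name.
        if m and m.group(1) in MONTHS:
            month, year, operation = m.groups()
            stats[(int(year), MONTHS[month], operation)] += 1
    return stats


def monthly_totals(stats):
    """Collapse the per-operation counts into one total per (year, month)."""
    totals = Counter()
    for (year, month, _operation), count in stats.items():
        totals[(year, month)] += count
    return totals


# Sample data copied from the original question.
sample = [
    '12/Jan/2010:16:04:59 +0200   EXISTS   sample3.3gp   37151',
    '12/Jan/2010:16:04:59 +0200   EXISTS   sample3.3gp   37151',
    '12/Jan/2010:16:04:59 +0200   EXISTS   sample3.3gp   37151',
    '12/Jan/2010:16:04:59 +0200   EXISTS   sample3.3gp   37151',
    '12/Jan/2010:16:04:59 +0200   EXISTS   sample3.3gp   37151',
    '12/Jan/2010:16:05:05 +0200   DELETE   sample3.3gp   37151',
]

stats = count_operations(sample)
totals = monthly_totals(stats)

# Keys are (year, month, operation) tuples, so sorted() orders them by date.
for key in sorted(stats):
    print(key, stats[key])
for key in sorted(totals):
    print(key, totals[key])
```

To run it against a real file, something like count_operations(open('log.txt')) should work, since a file object yields its lines one at a time.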
Thank you both so much

Antonis