kak...@gmail.com wrote:
I want to parse a log file with the following format, for example:
TIMESTAMP                    Operation     FileName      Bytes
12/Jan/2010:16:04:59 +0200   EXISTS       sample3.3gp   37151
12/Jan/2010:16:04:59 +0200  EXISTS        sample3.3gp   37151
12/Jan/2010:16:04:59 +0200  EXISTS        sample3.3gp   37151
12/Jan/2010:16:04:59 +0200  EXISTS        sample3.3gp   37151
12/Jan/2010:16:04:59 +0200  EXISTS        sample3.3gp   37151
12/Jan/2010:16:05:05 +0200  DELETE      sample3.3gp   37151

How can I count the operations for a month (e.g. a total of 40 operations:
30 EXISTS, 10 DELETE)?

It can be done pretty easily with a regexp to parse the relevant bits:

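  import re
  # Capture month, year, and operation; skip the day, time, and UTC offset.
  r = re.compile(r'\d+/([^/]+)/(\d+)\S+\s+\S+\s+(\w+)')
  stats = {}
  with open('log.txt') as f:
    for line in f:
      m = r.match(line)
      if m:  # the header line and blank lines do not match
        key = m.groups()  # e.g. ('Jan', '2010', 'EXISTS')
        stats[key] = stats.get(key, 0) + 1
  print(stats)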

This prints out

  {('Jan', '2010', 'EXISTS'): 5, ('Jan', '2010', 'DELETE'): 1}


With the resulting data structure, you can compute coarser-grained aggregates such as the total number of operations, or remap month-name abbreviations to integers so that the results can be sorted for output.
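
As a rough sketch of that follow-up step, something like the following could work. It assumes stats has the (month, year, operation) keys shown above; the names MONTHS and totals are just illustrative, and the sample dict stands in for whatever the parsing loop produced.

```python
from collections import Counter

# Sample input: the dict printed above, keyed by (month, year, operation).
stats = {('Jan', '2010', 'EXISTS'): 5, ('Jan', '2010', 'DELETE'): 1}

# Hypothetical helper table: month abbreviation -> month number,
# so that keys sort chronologically instead of alphabetically.
MONTHS = {name: i for i, name in enumerate(
    'Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec'.split(), 1)}

# Coarser aggregate: total operations per (year, month), ignoring the operation.
totals = Counter()
for (month, year, op), count in stats.items():
    totals[(int(year), MONTHS[month])] += count

# Report months in chronological order, with the per-operation counts beneath.
for (year, month), total in sorted(totals.items()):
    print('%d-%02d: total of %d operations' % (year, month, total))
    for (m, y, op), count in sorted(stats.items()):
        if (int(y), MONTHS[m]) == (year, month):
            print('  %s: %d' % (op, count))
```

Keying totals on (year, month-number) tuples means a plain sorted() call gives chronological order across year boundaries as well.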

-tkc


--
http://mail.python.org/mailman/listinfo/python-list
