Thanks for the responses.
Basically, I have a large file with this format:

Date INFO username command srcipaddress filename

I would like to do statistics on:

- total number of usernames and who they are
- username and commands
- username and filenames
- unique source ip addresses
- unique filenames

Then I would like to bucket the findings by day (date). Overall, I
would like to build a log file analyzer.

On Sat, Apr 2, 2011 at 10:59 PM, Dan Stromberg <drsali...@gmail.com> wrote:
>
> On Sat, Apr 2, 2011 at 5:24 PM, Chris Angelico <ros...@gmail.com> wrote:
>>
>> On Sun, Apr 3, 2011 at 9:58 AM, Mag Gam <magaw...@gmail.com> wrote:
>> > I suppose I can do something like this.
>> > (pseudocode)
>> >
>> > d = {}
>> > try:
>> >     d[key] += 1
>> > except KeyError:
>> >     d[key] = 1
>> >
>> > I was wondering if there is a pythonic way of doing this? I plan on
>> > doing this many times for various files. Would the python collections
>> > class be sufficient?
>>
>> I think you want collections.Counter. From the docs: "Counter objects
>> have a dictionary interface except that they return a zero count for
>> missing items instead of raising a KeyError".
>>
>> ChrisA
>
> I realize you (Mag) asked for a Python solution, but since you mention
> awk... you can also do this with "sort < input | uniq -c" - one line of
> "code". GNU sort doesn't use as nice an algorithm as a hashing-based
> solution (like you'd probably use with Python), but for a sort, GNU sort's
> quite good.
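
For what it's worth, here is a minimal sketch along the lines of ChrisA's
collections.Counter suggestion. It assumes the fields are whitespace-separated
in the order above, and the log file name ("access.log") is just a placeholder:

import collections

user_counts = collections.Counter()                     # hits per username
user_commands = collections.defaultdict(collections.Counter)   # commands per user
user_filenames = collections.defaultdict(collections.Counter)  # filenames per user
per_day = collections.defaultdict(collections.Counter)         # per-day buckets
unique_ips = set()
unique_filenames = set()

with open("access.log") as logfile:
    for line in logfile:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip malformed lines
        date, _info, user, command, srcip, filename = fields[:6]
        user_counts[user] += 1
        user_commands[user][command] += 1
        user_filenames[user][filename] += 1
        per_day[date][user] += 1
        unique_ips.add(srcip)
        unique_filenames.add(filename)

print("total usernames:", len(user_counts))
for user, count in user_counts.most_common():
    print(user, count)
print("unique source ip addresses:", len(unique_ips))
print("unique filenames:", len(unique_filenames))

Dan's "sort < input | uniq -c" covers the plain unique counts just as well; the
dict-of-Counter part above is mainly for the per-user and per-day breakdowns.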