I have a large text file (4GB) that I am parsing.

I am reading the file to collect stats on certain items.

My approach has been simple:

with open(filename) as f:
    for row in f:
        if "INFO" in row:
            fields = row.split()
            user = fields[0]
            host = fields[1]
            __time = fields[2]
            ...

I was wondering if there is a framework or a better algorithm for reading
such a large file and collecting stats according to its content. Also, are
there any libraries, data structures, or functions that would be helpful?
I was told about the 'collections' module.  Here are some of the stats I am
trying to get:

*Number of unique users
*Break down each user's visits by time, from t0 to t1
*Which users came from which hosts
*Which time had the most users?

(There are about 15 different things I want to query; a rough sketch of what
I am picturing follows below.)
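
For concreteness, here is a minimal sketch of what I am picturing with the
collections module (Counter and defaultdict); the field positions and the
"access.log" path are just placeholders for my real format:

from collections import Counter, defaultdict

user_counts = Counter()              # visits per user; unique users = len(user_counts)
visits_by_user = defaultdict(list)   # user -> list of timestamps (filter t0..t1 later)
hosts_by_user = defaultdict(set)     # user -> hosts that user came from
users_by_time = defaultdict(set)     # timestamp -> users seen at that time

with open("access.log") as f:
    for row in f:
        if "INFO" not in row:
            continue
        fields = row.split()
        user, host, when = fields[0], fields[1], fields[2]
        user_counts[user] += 1
        visits_by_user[user].append(when)
        hosts_by_user[user].add(host)
        users_by_time[when].add(user)

print(len(user_counts))          # number of unique users
busiest = max(users_by_time, key=lambda t: len(users_by_time[t]))
print(busiest)                   # the time with the most users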

I understand most of these are redundant, but it would be nice to have a
framework, or even an object-oriented way of doing this, instead of loading
it into a database.
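
By 'object oriented' I mean something roughly like the class below; the
LogStats name and its methods are placeholders I made up, not an existing
library:

from collections import Counter, defaultdict

class LogStats(object):
    # Placeholder class: feed() accumulates counts row by row, and the
    # other methods answer queries afterwards.
    def __init__(self):
        self.user_counts = Counter()
        self.hosts_by_user = defaultdict(set)
        self.users_by_time = defaultdict(set)

    def feed(self, row):
        if "INFO" not in row:
            return
        fields = row.split()
        user, host, when = fields[0], fields[1], fields[2]
        self.user_counts[user] += 1
        self.hosts_by_user[user].add(host)
        self.users_by_time[when].add(user)

    def unique_users(self):
        return len(self.user_counts)

    def busiest_time(self):
        return max(self.users_by_time, key=lambda t: len(self.users_by_time[t]))

stats = LogStats()
with open("access.log") as f:
    for row in f:
        stats.feed(row)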


Any thoughts or ideas?




--- Get your facts first, then you can distort them as you please.--