I have a large text file (4 GB) that I am parsing. I am reading through the file to collect stats on certain items.
My approach has been simple:

    for row in open(file):
        if "INFO" in row:
            line = row.split()
            user = line[0]
            host = line[1]
            __time = line[2]
            ...

I was wondering if there is a framework or a better algorithm for reading such a large file and collecting stats according to its content. Also, are there any libraries, data structures, or functions that could be helpful? I was told about the 'collections' container.

Here are some of the stats I am trying to get:

* Number of unique users
* Breakdown of each user's visits by time, from t0 to t1
* Which user came from which host
* What time had the most users?

(There are about 15 different things I want to query.) I understand most of these are redundant, but it would be nice to have a framework, or even an object-oriented way of doing this, instead of loading everything into a database. Any thoughts or ideas?
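For concreteness, here is a rough, single-pass sketch of what I imagine a collections-based version might look like. The field positions, the file name "logfile.txt", and the parse_hour() helper are just guesses at my format, not the real thing:

    # Rough sketch -- field positions, "logfile.txt", and parse_hour()
    # are placeholders for whatever the real log format turns out to be.
    from collections import Counter, defaultdict

    def parse_hour(timestamp):
        # e.g. "12:34:56" -> "12"; my actual timestamps may differ
        return timestamp.split(":")[0]

    unique_users = set()
    visits_per_user = defaultdict(Counter)   # user -> Counter of hours visited
    hosts_per_user = defaultdict(set)        # user -> set of hosts seen
    users_per_hour = defaultdict(set)        # hour -> set of users seen

    with open("logfile.txt") as f:           # iterate line by line, never load 4 GB
        for row in f:
            if "INFO" not in row:
                continue
            fields = row.split()
            user, host, __time = fields[0], fields[1], fields[2]
            hour = parse_hour(__time)

            unique_users.add(user)
            visits_per_user[user][hour] += 1
            hosts_per_user[user].add(host)
            users_per_hour[hour].add(user)

    print("unique users:", len(unique_users))
    busiest = max(users_per_hour, key=lambda h: len(users_per_hour[h]))
    print("hour with most users:", busiest)

The idea would be one pass over the file, with a separate dict or Counter per question, so the whole 4 GB never sits in memory. Does that seem like a sensible structure?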