david.gar...@gmail.com wrote:
I am looking for the fastest way to parse a log file.
Currently I have this... Can I speed this up at all? The script is
written to be a generic log file parser, so I can't rely on any
predictable pattern.
def check_data(data, keywords):
    # get rid of duplicate lines before searching
    unique_list = list(set(data))
    string_list = ' '.join(unique_list)
    #print string_list
    for keyword in keywords:
        if keyword in string_list:
            return True
    return False
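
For what it's worth, the join can be skipped entirely: a short-circuiting
scan over the unique lines returns on the first hit without building one
big string. This is only a sketch (check_data_fast is a made-up name, and
it assumes a keyword never spans two lines, which the space-joined version
above would allow):

def check_data_fast(data, keywords):
    # Scan each unique line directly; any() stops at the first match,
    # so no joined string is ever built.
    unique_lines = set(data)
    return any(keyword in line
               for line in unique_lines
               for keyword in keywords)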
I am currently using file seek and maintaining a file that records the last byte offset:
with open(filename) as f:
    print "Here is filename:%s" % filename
    f.seek(0, 2)  # jump to end of file
    eof = f.tell()
    print "Here is eof:%s" % eof
    if last is not None and eof - int(last) > 0:
        # last run's offset exists and the file has grown since then
        print "Here is last:%s" % last
        offset = int(last) - eof  # negative: seek back this many bytes from EOF
        print "Here is new offset:%s" % offset
        f.seek(offset, 2)
        mylist = f.readlines()
    else:
        # first run, or last is greater than current (file was truncated)
        f.seek(0)
        bof = f.tell()
        print "Here is bof:%s" % bof
        mylist = f.readlines()
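
For comparison, here is the same idea folded into one function. This is
just a sketch: read_new_lines is a made-up name, and how `last` gets
persisted between runs is left out. It seeks straight to the saved offset
instead of computing a negative offset from EOF; the two are equivalent,
but seeking to `last` also avoids rereading the whole file when nothing
new has been written (the else branch above does a full reread when
eof == last):

def read_new_lines(filename, last):
    # Returns (new_lines, new_offset). `last` is the byte offset
    # recorded after the previous run, or None on the first run.
    with open(filename) as f:
        f.seek(0, 2)           # jump to end of file
        eof = f.tell()
        if last is not None and int(last) <= eof:
            f.seek(int(last))  # resume where the previous run stopped
        else:
            f.seek(0)          # first run, or the file was truncated
        return f.readlines(), eof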
Thanks,
--
David Garvey
I have a log parser that takes action upon some log patterns.
I rely on the system 'grep' program to do the hard work, i.e. find the
occurrences.
Of course that means it is system dependent, but I don't think you can
beat grep's speed.
import subprocess

def _grep(self, link, pattern):
    # return the number of occurrences of the pattern in the file
    proc = subprocess.Popen(['grep', '-c', pattern, link],
                            stdout=subprocess.PIPE)
    return int(proc.communicate()[0])
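
In case it helps, here is the same call in standalone form, with a note
on grep's exit behavior; grep_count and the log path are made up for
illustration:

import subprocess

def grep_count(pattern, path):
    # grep -c prints the number of matching lines; with no matches it
    # still prints 0 (and exits nonzero), which int() parses fine.
    proc = subprocess.Popen(['grep', '-c', pattern, path],
                            stdout=subprocess.PIPE)
    return int(proc.communicate()[0])

# e.g. grep_count('Traceback', '/var/log/app.log')

One caveat: grep treats the pattern as a regular expression, so pass -F
if the keywords are plain strings.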
Cheers,
JM