Brad Causey wrote:
Python Version: Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32

List,

I am trying to do some basic log parsing, and I am absolutely floored by this seemingly simple problem. I am by no means a novice in Python, yet this is really stumping me. I have extracted the pertinent code snippets and modified them to function as a standalone script. Basically, I am reading a log file (in this case, testlog.log) for entries and comparing them to entries in a safe list (in this case, safelist.lst). I have spent numerous hours trying this several ways, and this is the simplest version I can come up with:

<code>
import string

safelistfh = file('safelist.lst', 'r')
safelist = safelistfh.readlines()

logfh = file('testlog.log', 'r')
loglines = logfh.readlines()

def safecheck(line):
    for entry in safelist:
        print 'I am searching for\n'
        print entry
        print '\n'
        print 'to exist in\n'
        print line
        comp = line.find(entry)
        if comp <> -1:
            out = 'Failed'
        else:
            out = 'Passed'
    return out

Unless I've misunderstood what you're doing, wouldn't it be better as:

def safecheck(line):
    for entry in safelist:
        print 'I am searching for\n'
        print entry
        print '\n'
        print 'to exist in\n'
        print line
        if entry in line:
            return 'Passed'
    return 'Failed'

for log in loglines:
    finalentry = safecheck(log)
    if finalentry == 'Failed':
        print 'This is an internal site'
    else:
        print 'This is an external site'
</code>

Actually, I think it would be better to use True and False instead of 'Passed' and 'Failed'.
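
A minimal sketch of that boolean variant follows. It assumes the safe-list entries are already held in a list with their line endings removed; the function name `is_safe`, the `entries` parameter, and the second log line are illustrative and not part of the original script.

```python
# Hypothetical in-memory stand-ins for the contents of safelist.lst and testlog.log.
safelist = ['http://www.mysite.com']
loglines = ['http://www.mysite.com/images/homepage/xmlslideshow-personal.swf',
            'http://www.example.org/index.html']  # second line is made up for contrast

def is_safe(line, entries):
    """Return True if any safe-list entry occurs in the log line, else False."""
    for entry in entries:
        if entry in line:
            return True
    return False

# One boolean per log line; the caller can test it directly with "if is_safe(...)".
results = [is_safe(line, safelist) for line in loglines]
```

Returning a boolean also removes the need to compare against a string in the calling loop, so a misspelled 'Passed' or 'Failed' can no longer silently change the outcome.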

The contents of the two files are as follows:

<safelist.lst>
http://www.mysite.com
</safelist.lst>

<testlog.log>
http://www.mysite.com/images/homepage/xmlslideshow-personal.swf
</testlog.log>

It seems that no matter what I do, I can't get this to fail the "if comp <> -1:" check. (My goal is for the check to fail so that I know this is just a URL to a safe [internal] site.) My assumption is that the "http://" is somehow affecting the searching capabilities of the string.find function, but I can't seem to locate any documentation online that outlines restrictions when using special characters.

Any thoughts?

You'll still need to strip off the '\n'.
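
To make that concrete: readlines() keeps the line terminator on every entry, so the safe-list entry is really 'http://www.mysite.com\n', and that string does not occur in the log line, where the host name is followed by '/images/...'. The sketch below inlines the file contents as strings so it runs without the two files; the extra blank entry is an assumption added to show a related pitfall.

```python
# What safelistfh.readlines() would return for the safelist.lst shown above,
# plus a blank line (assumed) to show why empty entries must be dropped.
raw_entries = ['http://www.mysite.com\n', '\n']
log_line = 'http://www.mysite.com/images/homepage/xmlslideshow-personal.swf\n'

# Strip line endings and discard blank entries ('' is a substring of every string,
# so an empty entry would make every log line look safe).
clean_entries = [e.strip() for e in raw_entries if e.strip()]

raw_match = raw_entries[0] in log_line      # False: the entry still ends in '\n'
clean_match = clean_entries[0] in log_line  # True: the bare URL is a prefix of the line
```

In the original script, the equivalent change would be to strip each entry right after reading safelist.lst, before safecheck() is ever called.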