Parsing a log file
I am looking for a way to parse a simple log file to get the information in a format that I can use. I would like to use Python, but I am just beginning to learn how to use it. I am not a programmer, but have done some simple modifications and revisions of scripts. I am willing to attempt this on my own, if someone can point me in the right direction (any example scripts that do similar things would be helpful). This doesn't have to be Python, but I need a cross-platform solution (i.e. Perl or some other kind of script). I just wanted to try Python because I like the concept of it.

Here is my scenario: I have a program that connects and disconnects to a server. It writes a simple log file like this:

08-13-2005 13:19:37:564 Program: CONNECTED to 'Server'
08-13-2005 15:40:08:313 Program: DISCONNECTED from 'Server'
08-13-2005 15:45:39:234 Program: CONNECTED to 'Server'
08-13-2005 15:55:18:113 Program: DISCONNECTED from 'Server'
08-13-2005 16:30:57:264 Program: CONNECTED to 'Server'
08-13-2005 16:59:46:417 Program: DISCONNECTED from 'Server'
08-13-2005 17:10:33:264 Program: CONNECTED to 'Server'
08-13-2005 18:25:26:316 Program: DISCONNECTED from 'Server'
08-13-2005 18:58:13:564 Program: CONNECTED to 'Server'
08-13-2005 19:29:10:715 Program: DISCONNECTED from 'Server'

What I basically want to do is end up with a text file that can be easily imported into a database with a format like this (or I guess it could be written in a SQL script form that could write directly to a database like MySQL):

Connect_Date  Connect_Time  Disconnect_date  Disconnect_time  User
------------  ------------  ---------------  ---------------  ----
08-13-2005    13:19:37      08-13-2005       15:40:08         John
08-13-2005    15:45:39      08-13-2005       15:55:18         John
08-13-2005    16:30:57      08-13-2005       16:59:46         John
08-13-2005    17:10:33      08-13-2005       18:25:26         John
08-13-2005    18:58:13      08-13-2005       19:29:10         John

Here are some notes about this:

* the username would come from the log file name (i.e. John_Connect.log)
* I don't need the fractions of seconds in the timestamps
* I only need date, time, and connect or disconnect, the other info is not important
* If it is possible to calculate the elapsed time between Connect and Disconnect and create a new field with that data, that would help (but I can easily do that with SQL queries)
* This log file layout seems to be consistent
* There may not be a "disconnect" statement if the log file is read while connected, so the next time it would have to insert the disconnect information. The file will be read quite regularly, so this is very likely.
* This would eventually need to be done without intervention (maybe every 5 minutes).

I am open to other ideas or existing programs and am flexible about the final solution.

Thanks,
Clint
--
http://mail.python.org/mailman/listinfo/python-list
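For anyone looking for a starting point, here is a minimal sketch of the pairing described above. It assumes every line follows the sample layout exactly (date, time with milliseconds, "Program:", then CONNECTED or DISCONNECTED). The names parse_log, user_from_filename and SAMPLE are made up for illustration and are not from any existing tool. A connect with no matching disconnect is returned separately so that a later run can complete it.

```python
import os

# A few lines copied from the sample log in the post above.
SAMPLE = [
    "08-13-2005 13:19:37:564 Program: CONNECTED to 'Server'",
    "08-13-2005 15:40:08:313 Program: DISCONNECTED from 'Server'",
    "08-13-2005 15:45:39:234 Program: CONNECTED to 'Server'",
]


def user_from_filename(path):
    """'John_Connect.log' -> 'John' (assumes the <User>_Connect.log pattern)."""
    return os.path.basename(path).split("_")[0]


def parse_log(lines, user):
    """Pair each CONNECTED line with the DISCONNECTED line that follows it.

    Returns (rows, pending): rows are complete sessions, pending is the
    (date, time) of a connect that has no disconnect yet, or None.
    """
    rows = []
    pending = None
    for line in lines:
        parts = line.split()
        if len(parts) < 4:
            continue  # blank or unexpected line: skip it
        date, stamp, event = parts[0], parts[1], parts[3]
        time = stamp.rsplit(":", 1)[0]  # drop the trailing milliseconds
        if event == "CONNECTED":
            pending = (date, time)
        elif event == "DISCONNECTED" and pending is not None:
            rows.append(pending + (date, time, user))
            pending = None
    return rows, pending


if __name__ == "__main__":
    rows, pending = parse_log(SAMPLE, user_from_filename("John_Connect.log"))
    for row in rows:
        print("\t".join(row))  # tab-separated, easy to import into a database
    print("still connected since:", pending)
```

Writing tab-separated rows keeps the database import simple. Generating SQL INSERT statements instead would only change the print line.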
Re: Parsing a log file
Thanks Andreas,

In your first paragraph, you ask about incorrect input. I guess it is possible, but without that information my collection of the data is useless, so I really don't know what I would do in that case.

As for the other stuff, I can hack the data in other ways, such as with VBA and MS Access, which I am more familiar with, but I am trying to move to Linux and want to do it right the first time. I figure Perl is the more common language for this kind of stuff, but I did want to try to learn some Python while I am at it. I have started the tutorial, but being a businessman, time is an issue. If I had an example script that did a similar thing, I could learn by working from it (I am looking for something similar now). I do live in a low-labor-cost country, so I can hire someone to do it for a small amount of money, but Python people are a little harder to find.

Thanks for the comments,
Clint
Re: Parsing a log file
John,

Your comments are very helpful. I will take the datetime stamp as the way to go. I don't have a need to throw away the time info, it is

You said:
>What do you do if servers are in different
>timezones?

This is all in-house, in a country without daylight saving time, so it would not be an issue.

You also said:
>Any chance of your using ISO standard format
>for representing dates?

I think I have very little control over the actual logfile data. I seem to be able to control what info it collects, but I don't think I can change the formatting.

Thanks,
Clint
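As a rough illustration of the datetime-stamp approach, the sketch below turns one log entry into a single datetime value, which also gives the elapsed time and an ISO-style date without changing the log file itself. It assumes the dates are month-day-year, as the sample (08-13-2005) suggests. The names parse_stamp and LOG_FORMAT are hypothetical.

```python
from datetime import datetime

# Assumed layout of the log's date and time once the milliseconds are removed.
LOG_FORMAT = "%m-%d-%Y %H:%M:%S"


def parse_stamp(date, stamp):
    """Turn '08-13-2005' and '13:19:37:564' into one datetime (milliseconds dropped)."""
    clock = stamp.rsplit(":", 1)[0]
    return datetime.strptime(date + " " + clock, LOG_FORMAT)


connected = parse_stamp("08-13-2005", "13:19:37:564")
disconnected = parse_stamp("08-13-2005", "15:40:08:313")
elapsed = disconnected - connected  # a timedelta

print(connected.isoformat(sep=" "))  # 2005-08-13 13:19:37
print(elapsed)                       # 2:20:31
```

Keeping date and time together in one value means a session that runs past midnight still gives the right elapsed time.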
Re: python script under windows
I ran into a similar issue a couple of months back. The solution on Windows is to run the script as a service. It is very simple; you need Mark Hammond's Win32 extensions. For paths, you have to use the absolute file path for all local files, and for a network drive use the UNC path, i.e. \\servername\folder-filename\ . With these steps, the program keeps running after you log out.

If your machine is part of a Windows network and there is a domain login, then for the service to work after a machine restart you need to go to the Services panel (in Control Panel), find the Python service you registered, right-click it and go to its properties, go to the "Log On" panel, and select a domain user for "This account" by clicking the Browse button. Make sure the selected user has access to the Windows domain and admin access to that particular machine. Enter the user's network password, hit Apply, then OK, and there you go. All this requires admin access to the machine. You can configure a couple of other things about the service in the Services panel.

The code itself is simple:

#--
import win32service
import win32serviceutil


class MyService(win32serviceutil.ServiceFramework):
    """NT Service."""

    _svc_name_ = "MyServiceName"
    _svc_display_name_ = "A Little More Descriptive"

    def SvcDoRun(self):
        # Do your stuff here: call your main application code.
        pass

    def SvcStop(self):
        # The following line is not really needed; basically, put here any
        # code that should execute before the service stops.
        self.ReportServiceStatus(win32service.SERVICE_STOP_PENDING)


if __name__ == '__main__':
    win32serviceutil.HandleCommandLine(MyService)
#--

After this, if you have Python in your path and the Win32 extensions installed, go to a command prompt and run:

c:\> MyService.py --startup=auto install

Getting your service to have network access after a machine restart is a bit tricky. This approach works, but somehow I feel there is more to it. If anyone has a better way, please post.