Israel Carr wrote: > Thanks for anyone who takes the time to read this. If I posted to the > wrong list, I apologize and you can disregard. > > I need help with a script to pull data from a postgres database. I'm ok > with the database connection just not sure how to parse the data to get > the results I need. > > I'm running Python 2.4.4. For what it's worth, once I can get my logic > correct I'll be publishing the reports mentioned below via zope for web > clients. > > Here is a small sample of the records in the table: > > name date time status > machine1 01/01/2008 13:00:00 system ok > machine1 01/01/2008 13:05:00 system ok > machine1 01/01/2008 13:10:00 status1 > machine1 01/01/2008 13:10:30 status1 > machine1 01/01/2008 13:11:00 system ok > machine1 01/01/2008 13:16:30 status2 > machine1 01/01/2008 13:17:00 status2 > machine1 01/01/2008 13:17:30 status2 > machine1 01/01/2008 13:18:00 status2 > machine1 01/01/2008 13:18:30 status2 > machine1 01/01/2008 13:19:00 system ok > machine1 01/01/2008 13:24:00 status2 > machine1 01/01/2008 13:24:30 status2 > machine1 01/01/2008 13:25:00 system ok > > I need to report from this data. > The detail report needs to be something like: > machine1 01/01/2008 13:10:00 status1 00:01:30 > machine1 01/01/2008 13:16:30 status2 00:02:30 > machine1 01/01/2008 13:24:00 status2 00:01:00
Well, just for fun of the SQL challenge, I tossed together the following (using sqlite3) SELECT name, Min(ts) as ts, next_ts, status FROM ( SELECT *, ( SELECT ts FROM test WHERE test.name = t.name AND test.ts > t.ts AND test.status = 'system ok' ORDER BY test.ts ASC LIMIT 1) AS next_ts FROM test t WHERE status <> 'system ok' ) with_next GROUP BY name, status, next_ts where my table has "name", "ts" (a timestamp field combo of your "date" and "time" fields, and for sqlite, formatting in "YYYY-MM-DD mm:ss" format) which yields rows with the machine name, the non "system ok" status, the timestamp of the initial event, and the timestamp of the subsequent "system ok" stamp. There's a bit of an underdefined case where you have more than one non-OK status before OK gets reset: 00:10 status1 00:20 status1 00:30 status2 00:40 status ok If this can't happen, it should work fine. If the above can happen, you'll get odd overlaps in your reporting. Since I couldn't find an Interval data type in sqlite, you'd just have to take the "ts" and "next_ts" columns and subtract them to get the interval you want. > and the summary needs to be > machine1 01/01/2008 total 'status1' time = 00:01:30 > machine1 01/01/2008 total 'status2' time = 00:03:30 > _____ > machine1 01/01/2008 total 'non-OK' time = 00:05:00 #this is the > sum of status1 and status2 times While the below doesn't track the changing of the machine, you can follow the basic framework given here. I threw in a couple helper functions to normalize whatever data types ("normalize_status()" and "make_timestamp()") NO_TIME = datetime.datetime(datetime.MINYEAR, 1, 1) OK = 'system ok' normalize_status = lambda s: s.lower() def log(s): print s print '=' * len(s) def make_timestamp(date, time): d = datetime.datetime(*(int(s) for s in date.split('-') + time.split(':'))) return d status_tally = {} last_status = OK last_ts = NO_TIME log('Intervals (your first request)') for i, (machine, date, time, status) in enumerate(fetchall()): ts = make_timestamp(date, time) status = normalize_status(status) if status == OK and last_status <> OK: interval = ts - last_ts print machine, last_status, last_ts, interval if last_status in status_tally: status_tally[last_status] += interval else: status_tally[last_status] = interval last_status = status elif status <> OK and last_status == OK: last_ts = ts last_status = status log('Summary (your 2nd request)') for k,v in status_tally.iteritems(): print k, v log('Grand Total (your 3rd request)') print sum(status_tally.values(), datetime.timedelta(0)) Thanks for the mental exercise. :) -tkc -- http://mail.python.org/mailman/listinfo/python-list