On Aug 13, 2012, at 20:28, Carey Tilden <carey.til...@gmail.com> wrote:
> Apologies for the awkward title. I haven't quite thought of the right way to > describe my problem, which may be why I've had a hard time figuring out how > to solve it. I have a list of program start/stop times, and I want to know > how long each run takes to complete. The thing that's really tripping me up > is there are gaps in the sequence. I've figured out how to collapse the > results down to a single row per attempt, but I can't quite figure out how to > further collapse down each full run to its own row. It'd be easy if I had a > session_id or something to group on, but I don't. All I have are the > start/stop times. > > Here's some sample data. Hopefully this clarifies what I'm talking about: > > drop table if exists program_runs; > > create temporary table program_runs ( > id serial, > time_stamp timestamptz, > action text > ); > > insert into program_runs (time_stamp, action) values > ('2012-01-01 10:00:00 PST', 'started'), ('2012-01-01 10:10:00 PST', > 'stopped early'), > ('2012-01-01 10:20:00 PST', 'started'), ('2012-01-01 10:30:00 PST', > 'stopped early'), > ('2012-01-01 10:40:00 PST', 'started'), ('2012-01-01 10:47:00 PST', > 'completed'), > ('2012-01-01 10:50:00 PST', 'started'), ('2012-01-01 11:00:00 PST', > 'stopped early'), > ('2012-01-01 11:10:00 PST', 'started'), ('2012-01-01 11:13:00 PST', > 'completed'), > ('2012-01-01 11:20:00 PST', 'started'), ('2012-01-01 11:30:00 PST', > 'stopped early'), > ('2012-01-01 11:40:00 PST', 'started'), ('2012-01-01 11:50:00 PST', > 'stopped early'), > ('2012-01-01 12:00:00 PST', 'started'), ('2012-01-01 12:10:00 PST', > 'stopped early'), > ('2012-01-01 12:20:00 PST', 'started'), ('2012-01-01 12:29:00 PST', > 'completed'); > > select > this_time_stamp as starting_time_stamp, > next_time_stamp - this_time_stamp as time_elapsed, > next_action as closing_action > from ( > select > time_stamp as this_time_stamp, lead(time_stamp) over (order by > id) as next_time_stamp, > action as this_action, lead(action) over (order by id) as > next_action, > id as this_id, lead(id) over (order by id) as next_id > from program_runs > ) q > where this_action = 'started'; > > Note that each run has a pair of entries in the table. The first is always > "started", but the second may be either "stopped early" or "completed". The > final results I'd like to see are: > > starting_time_stamp | total_time_elapsed > ------------------------+-------------------- > 2012-01-01 10:00:00-08 | 00:27:00 > 2012-01-01 10:50:00-08 | 00:13:00 > 2012-01-01 11:20:00-08 | 00:39:00 > > Hope that's enough detail. Any ideas or suggestions gladly accepted! > > Regards, > Carey First artificially generate row (pair) identifiers by integer dividing the ordered row number by 2. Using window or sub-queries identify the bookends for each group (i.e., the identifier for each completed and the prior completed). Give these groups artificial session identifiers/row numbers. Assign the artificial session id to each transaction row by using the bookends. Now you have identifiers with which to group. This makes a number of assumptions regarding the form of the input data. It will solve for your example data but it may not generalize. In particular it assumes non-overlapping sessions. HTH David J. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general