Chuck Fox wrote:
Fellow perl coders,
I am having trouble wrapping my head around how to accomplish the following:
Given a dbi result set that consists of datetime, servername, databasename, procedurename and average elapsed time.
Convert this into an data structure suitable for export to a file for use with gnuplot. The data is hourly and there may be gaps in the hours. The base problem is that I have to pivot the result sets so that for each unique occurrence of server/database/procedure I can associate the elapsed time to a reporting datetime. I think I understand what I have to feed gnuplot, ie. a file with a column for each server/database/procedure of elapsed times for each hour in the period. I believe that I want to make a hash of arrays, but I am confused as to how to pivot the data and accumulate/assign the arrays.
Here is a sample of the test data that I am using, (the full set for one application for a two week period is over 15,000 rows)
<sql_result>
report_datetime server_name database_name procedure_name avg_elapsed
10/30/2003 1:00:00.000 PM BLOGSDB_D01A_SYB BlogData001 CommentEntry_Del 0.0004
10/30/2003 3:00:00.000 PM BLOGSDB_D01A_SYB BlogData001 CommentEntry_Del 0.0020
10/30/2003 4:00:00.000 PM BLOGSDB_D01A_SYB BlogData001 CommentEntry_Del 0.0004
10/30/2003 9:00:00.000 PM BLOGSDB_D01A_SYB BlogData001 CommentEntry_Del 0.0013
10/30/2003 12:00:00.000 PM BLOGSDB_D01A_SYB BlogData001 Comment_RtrByEntry 0.0007
10/30/2003 1:00:00.000 PM BLOGSDB_D01A_SYB BlogData001 Comment_RtrByEntry 0.0007
10/30/2003 2:00:00.000 PM BLOGSDB_D01A_SYB BlogData001 Comment_RtrByEntry 0.0011
10/30/2003 3:00:00.000 PM BLOGSDB_D01A_SYB BlogData001 Comment_RtrByEntry 0.0012
10/30/2003 4:00:00.000 PM BLOGSDB_D01A_SYB BlogData001 Comment_RtrByEntry 0.0007
10/30/2003 2:00:00.000 PM BLOGSDB_D01A_SYB BlogData002 Comment_Rtr 0.0008
10/30/2003 9:00:00.000 PM BLOGSDB_D01A_SYB BlogData002 Comment_Rtr 0.0007
10/30/2003 10:00:00.000 PM BLOGSDB_D01A_SYB BlogData002 Comment_Rtr 0.0006
10/30/2003 11:00:00.000 PM BLOGSDB_D01A_SYB BlogData002 Comment_Rtr 0.0007
</sql_result>
What I am having difficulty with is how do I end up with :
<some_fancy_data_structure>
BLOGSDB_D01A_SYB BLOGSDB_D01A_SYB BLOGSDB_D01A_SYB
BlogData001 BlogData001 BlogData002
report_datetime CommentEntry_Del Comment_RtrByEntry Comment_Rtr 10/30/2003 12:00:00.000 PM 0.0013 0.0007 under
10/30/2003 1:00:00.000 PM 0.0004 0.0007 under
10/30/2003 2:00:00.000 PM undef 0.0011 0.0008
10/30/2003 3:00:00.000 PM 0.0020 0.0012 under
10/30/2003 4:00:00.000 PM 0.0004 0.0007 under
10/30/2003 9:00:00.000 PM undef undef 0.0007
10/30/2003 10:00:00.000 PM undef undef 0.0006
10/30/2003 11:00:00.000 PM undef undef 0.0007
</some_fancy_data_structure>
Ultimately, I am writing the above structure to a file so that I can take advantage of gnuplot (wish I had the luxury of GD or Chart, but system calls will have to do). I think I have to parse the data twice, once to determine the names of all the nodes and then again to populate each node. Is there a single pass solution ?
<groveling>
Thanks in advance, this is a wonderful list, I have over 500 tips saved for the 3 or 4 months that I have been listening in.
</groveling>
Chuck Fox
Got some sleep, some coffee and now that my head is properly wrapped, I am getting the output I want. Now my question is, how to make sure that I am not doing too much extraneous stuff and make it run as fast as possible. I think the only solution involves at least two passes through the data. One to determine the number of rows and columns and one to populate the table. Does anyone see optimizations for a one pass solution ?
Heres the code, <code> #!/usr/local/bin/perl
# always, always ! use strict; use warnings;
# Declare some storage my @ary = (); my %times = (); my %cols = (); my @colnames = (); my @row = ();
# Create some data
$ary[0] = [( "10/30/03 12:00", "BLOGSDB_M01A_SYB", "BlogData001", "Comment_Ins", 0.0004 )];
$ary[1] = [( "10/30/03 2:00", "BLOGSDB_M01A_SYB", "BlogData001", "Comment_Ins", 0.0007 )];
$ary[2] = [( "10/30/03 3:00", "BLOGSDB_M01A_SYB", "BlogData001", "Comment_Ins", 0.0011 )];
$ary[3] = [( "10/30/03 7:00", "BLOGSDB_M01A_SYB", "BlogData001", "Comment_Ins", 0.0007 )];
$ary[4] = [( "10/30/03 12:00", "BLOGSDB_M01A_SYB", "BlogData002", "Comment_Ins", 0.0004 )];
$ary[5] = [( "10/30/03 1:00", "BLOGSDB_M01A_SYB", "BlogData002", "Comment_Ins", 0.0006 )];
$ary[6] = [( "10/30/03 2:00", "BLOGSDB_M01A_SYB", "BlogData002", "Comment_Ins", 0.0009 )];
$ary[7] = [( "10/30/03 12:00", "BLOGSDB_M01A_SYB", "BlogData003", "Comment_Ins", 0.0005 )];
$ary[8] = [( "10/30/03 2:00", "BLOGSDB_M01A_SYB", "BlogData003", "Comment_Ins", 0.0003 )];
$ary[9] = [( "10/30/03 5:00", "BLOGSDB_M01A_SYB", "BlogData003", "Comment_Ins", 0.0006 )];
$ary[10] = [( "10/30/03 7:00", "BLOGSDB_M01A_SYB", "BlogData003", "Comment_Ins", 0.0005 )];
$ary[11] = [( "10/30/03 9:00", "BLOGSDB_M01A_SYB", "BlogData003", "Comment_Ins", 0.0004 )];
# Load up the map of times we are going to use foreach my $elem ( @ary ) { # print @{$elem}[2] . "\n"; $times{ @{ $elem }[0] }++; push( @colnames, join( ".", @{ $elem }[1..3] ) ) unless $cols{ join( ".", @{ $elem }[1..3] ) }++; }
# Assign the array index number to the hash for( my $idx = 0; $idx < $#colnames + 1; $idx++ ) { $cols{ $colnames[ $idx ] } = $idx; # print "creating hash index for $colnames[ $idx ], $idx\n"; }
# Declare some more storage $#row = $#colnames; for( my $idx = 0; $idx <= $#colnames; $idx++ ){ $row[$idx] = "0.0000"; }
# assign for each row of time foreach my $time ( keys %times ) { $times{$time} = [ @row ]; }
# Now load each value into the structure
foreach my $elem ( @ary ) {
@{ $times{ @{ $elem }[0] } }[ $cols{ join( ".", @{ $elem }[1..3] ) } ] = @{ $elem }[4];
}
# And voila, results pivoted foreach my $row ( sort keys %times ) { print $row . ", " . join( ", ", @{ $times{$row} } ) . "\n" ; }
</code>
Regards,
Chuck
-- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]