Serge Shakarian wrote:
>
> thank you for your reply. I think I should have been a bit more descriptive
> in what I am trying to do.
>
> The log file is about 1.2 - 1.5 GB per 24 hours. Each record is look like
> this:
> Mon 10 Mar 2003 12:07:55 PM EST (1047316075.181206)
> [EMAIL PROTECTED]:<16.0>:L4:S14
> req=/script-root-tran/accounts/MyAccounts sid=1433628848 tid=295244861 mod=1
> err=0 hit=0 fix=0 pri=8 qul=0 jah=0 tql=0 tet=24.009162 qut=0.000228
> mut=0.000012 sct=0.000000 set=24.005668 gct=0.000000
>
> This log is a Broadvision performance log. The reason for using split is a
> maintenance issue and I would be open to using regex as a solution as these
> logs are from a production box and the script would be executed on the prod
> box. Here is a jist of my code:
>
> while(<PERFLOG>)
> {
>   chomp;
>   if(/\breq=/o){ &collect_stats; }

First, calling a sub is going to be slower than writing the code in the
loop. Second, calling a sub with the ampersand prefix is frowned upon in
Perl 5 (without parentheses it also passes the caller's @_ through
implicitly). Third, using global variables in a sub is frowned upon; it is
better to pass the variables explicitly. Fourth, the /o option on the match
operator only has an effect if there is variable interpolation in the
regular expression. In other words, it doesn't apply here.

> }
>
> [snip code]

I would probably write it like this:

while ( <PERFLOG> ) {
    next unless /\breq=(\S+)/;

    # get the script/jsp name
    my $script_name = $1;

    # get "(1047316075.181206)" which is
    # time since 00:00:00 UTC January 1, 1970
    my ($script_time) = /\(([\d.]+)\)/;

    # compare the time stamps
    if ( $opt_t ) {
        $start_time ||= $script_time;
        if ( $start_time + $report_interval < $script_time ) {
            $start_time = 0;
            output();
        }
    }

    #============================================================
    # now, all that's left are the name/value pairs
    # store every element in a hash. There will be a
    # hash for every script that is logged and a master
    # hash that stores each script hash. The key for each
    # script hash is the script name.
    #
    # These are the name/value pairs
    # SID= session id
    # TID= transaction id
    # MOD= IM state ( 1=normal, 2=overload, <=0 drain)
    # ERR= error ( 0=no error, 1=error)
    # HIT= request cache hit ( 0=miss, 1=hit)
    # FIX= fix up script hit ( 0=miss, 1=hit)
    # PRI= request priority ( range: 0-15, 0=highest, 15=lowest)
    # QUL= queue length of the priority
    # JAH= jobs ahead in queues
    # TQL= total jobs in all queues
    # TET= total execution time
    # QUT= queuing time
    # MUT= mutex waiting time
    # SCT= script compile time
    # SET= script execution time
    # GCT= garbage collection time
    my %hash = map /^(\S+)=([\d.]+)$/, split;

    for my $key ( keys %hash ) {
        $map_store{$script_name}{$key} += $hash{$key};
    }

    # increment the number of times this script
    # has been processed
    $map_store{$script_name}{occurence}++;
    $lines_processed++;
}


John
--
use Perl;
program
fulfillment
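
For anyone who wants to try the map/split idiom in isolation, here is a
minimal, self-contained sketch run against the sample record from the
original post. It assumes each record sits on a single line in the real log
(the wrapping above may just be the mailer's doing), and parse_record() is a
hypothetical helper introduced only for this illustration; it is not part of
the script discussed above.

```perl
use strict;
use warnings;

# Sample record from the original post, joined onto one line.
# ASSUMPTION: a record occupies exactly one line in the real log.
my $record = 'Mon 10 Mar 2003 12:07:55 PM EST (1047316075.181206) '
           . '[EMAIL PROTECTED]:<16.0>:L4:S14 '
           . 'req=/script-root-tran/accounts/MyAccounts sid=1433628848 '
           . 'tid=295244861 mod=1 err=0 hit=0 fix=0 pri=8 qul=0 jah=0 tql=0 '
           . 'tet=24.009162 qut=0.000228 mut=0.000012 sct=0.000000 '
           . 'set=24.005668 gct=0.000000';

# parse_record() is a hypothetical helper for this sketch only.
# It returns the script name, the epoch time stamp and a reference to a
# hash of the numeric name/value pairs, or the empty list when the line
# has no req= field.
sub parse_record {
    my ($line) = @_;

    # list assignment in scalar context yields the number of captures,
    # so a failed match (0) triggers the return
    my ($script_name) = $line =~ /\breq=(\S+)/ or return;
    my ($script_time) = $line =~ /\(([\d.]+)\)/;

    # In list context a failed match returns the empty list, so fields
    # without a purely numeric value (the date parts, req=...) drop out,
    # while each matching field contributes a (name, value) pair.
    my %pairs = map { /^(\S+)=([\d.]+)$/ } split ' ', $line;

    return ( $script_name, $script_time, \%pairs );
}

my ( $name, $time, $pairs ) = parse_record($record);

printf "%s at %s: tet=%s, %d numeric fields\n",
    $name, $time, $pairs->{tet}, scalar keys %$pairs;
```

Passing the line in and returning the results keeps the sub free of global
variables, which is the third point above. Whether the extra sub call is
affordable on a 1.5 GB log is something to measure rather than guess.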