No, I was not expecting anyone to give me a ready-made program! The script that I wrote is at the end of this email.
Thanks for the pointer. I looked at Apache::ParseLog, and it seems to me that the report this module generates is not what I am looking for. Here is the Perl script that I wrote. I am able to count the pages that share the same timestamp all right. I am having a problem with the time interval, and that is where I need help. Maybe the algorithm I followed needs to be modified. Please have a look at the script and make any constructive suggestions.

Thanks
Anand

--------- The script: ------------
#!/usr/bin/perl

use Getopt::Long;
use Time::Local;

my $file="access_log_modified";
my $line;
my $count;
my $begin_time = "";
my $end_time;
my %seen = ();
my @visual_pages = ();
my ($datetime, $get_post, $Day, $Month, $Year, $Hour, $Minute, $Second);
my $interval = 60; #An interval of 1 minute
my @pages_processed;

count_recs();

sub count_recs {
    open (INFILE, "<$file") || die "Cannot read from $file";
    WHILELOOP: while (<INFILE>) {
        $line = $_;
        chomp;
        ($datetime,$get_post) = (split / /) [3,6];
        $datetime =~ s/\[//;
        ($Day,$Month,$Year,$Hour,$Minute,$Second)= $datetime =~m#^(\d\d)/(\w\w\w)/(\d\d\d\d):(\d\d):(\d\d):(\d\d)#;
        next WHILELOOP if ($get_post =~ /\.js$/ || $get_post =~ /\.gif$/ || $get_post =~ /\.css$/);
        unless ($begin_time) {
            $begin_time = $datetime;
        }
        $end_time = $datetime;
        &calculate_time($begin_time, $end_time);
    } #while

    foreach $visual_page (sort by_seen keys %seen) {
        push (@{$pages_processed{$visual_page}}, $seen{$visual_page});
    }
    foreach $page_processed (sort keys %pages_processed) {
        print "$page_processed: @{$pages_processed{$page_processed}}\n";
    }
    close(INFILE);
}

sub calculate_time {
    my @visual_pages = ();
    my @processed_visual_pages = ();

    ###Break up the date time into Day, Month, Year, Hour, Minute and Second.
    ($begin_Day,$begin_Month,$begin_Year,$begin_Hour,$begin_Minute,$begin_Second )= $begin_time =~m#^(\d\d)/(\w\w\w)/(\d\d\d\d):(\d\d):(\d\d):(\d\d)#;
    ($end_Day,$end_Month,$end_Year,$end_Hour,$end_Minute,$end_Second)= $end_time =~m#^(\d\d)/(\w\w\w)/(\d\d\d\d):(\d\d):(\d\d):(\d\d)#;

    ###Since the Day above is in the Alpha format, Jan, Feb,... and not numeric
    ###format, 01, 02, 03,..., we need to convert it to a numeric format. Otherwise,
    ###we cannot pass Day to timelocal or localtime modules. That's why the
    ###subroutine is called. It converts Jan into 01 and so on.
    &Initialize;
    my $begin_seconds = timelocal($begin_Second, $begin_Minute, $begin_Hour, $begin_Day, $MonthToNumber{$begin_Month}, $begin_Year-1900);
    my $end_seconds = timelocal($end_Second, $end_Minute, $end_Hour, $end_Day, $MonthToNumber{$end_Month}, $end_Year-1900);

    ###elapsed time is the difference between two timestamps of two consecutive
    ###records in the log file.
    my $elapsed = $end_seconds - $begin_seconds;

    ###We check whether the elapsed time is greater than the interval that we
    ###choose, 1 minute or 15 minutes. If yes, then we need to start counting the
    ###records into a new 15 minute interval. If no, count the number of records
    ###in the same interval. Also, reset the begin_time and end_time, for the new
    ###count. Store all the interval periods into an array, processed_visual_pages.
    if ( $elapsed > $interval ){
        $count = 0;
        $begin_time = $end_time;
        $end_time = $datetime;
        push (@processed_visual_pages, $end_time);
    } else {
        push (@visual_pages, $end_time);
        foreach $visual_page (@visual_pages) {
            $seen{$visual_page}++;
        }
    }
}

sub Initialize {
    my %MonthToNumber=(
        'Jan', '01', 'Feb', '02', 'Mar', '03', 'Apr', '04',
        'May', '05', 'Jun', '06', 'Jul', '07', 'Aug', '08',
        'Sep', '09', 'Oct', '10', 'Nov', '11', 'Dec', '12',
    );
    my %NumberToMonth=(
        '01', 'Jan', '02', 'Feb', '03', 'Mar', '04', 'Apr',
        '05', 'May', '06', 'Jun', '07', 'Jul', '08', 'Aug',
        '09', 'Sep', '10', 'Oct', '11', 'Nov', '12', 'Dec',
    );
}

sub by_seen () {
    ( $seen{$b} cmp $seen{$a} );
}
----------------

The output I get is:

25/Apr/2003:13:54:02: 3
25/Apr/2003:13:54:19: 2
25/Apr/2003:13:54:22: 4
25/Apr/2003:13:54:34: 3
25/Apr/2003:13:54:38: 5
25/Apr/2003:13:54:41: 3
25/Apr/2003:13:54:43: 6
25/Apr/2003:13:54:44: 3
25/Apr/2003:13:54:46: 5
25/Apr/2003:13:54:47: 2
25/Apr/2003:13:54:48: 3
25/Apr/2003:13:54:50: 7
25/Apr/2003:13:54:51: 4
25/Apr/2003:13:54:53: 2
25/Apr/2003:13:54:58: 3
25/Apr/2003:13:55:01: 2
25/Apr/2003:13:55:02: 4
25/Apr/2003:13:55:05: 4
25/Apr/2003:13:55:08: 1
25/Apr/2003:13:55:14: 3
25/Apr/2003:13:55:15: 1
25/Apr/2003:13:56:13: 5
25/Apr/2003:13:56:27: 5
25/Apr/2003:13:56:35: 4
25/Apr/2003:13:56:40: 4
25/Apr/2003:13:56:45: 1
25/Apr/2003:13:56:51: 5
------------------------

-----Original Message-----
From: Rai,Dharmender [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 18, 2003 1:30 AM
To: '[EMAIL PROTECTED]'; 'Anand Ayyagary'
Subject: RE: Parsing the Apache web log file, access_log

module Apache::ParseLog would help you !!

> ----------
> From: Anand Ayyagary[SMTP:[EMAIL PROTECTED]
> Sent: Wednesday, June 18, 2003 1:02 AM
> To: '[EMAIL PROTECTED]'
> Subject: Parsing the Apache web log file, access_log
>
> Help needed for Perl script
> Hi all,
>
> I am new to this group. I need help regarding a perl script which parses the
> web log file, access_log.
>
> The format of the access_log is:
>
> 127.0.0.1 - - [15/Jun/2003:13:54:02 -0100] "GET /xxxx HTTP/1.1" 200 34906
>
> The goal is to
>
> 1. Perform a count of the pages for the given timestamp. It is possible that
> multiple pages exist with the same timestamp (as with the timestamp I
> mentioned above).
> 2. Within a range of time interval, say, 15 minutes starting with the
> timestamp of the first line in the log file, I would like to compute the
> average of the number of pages, minimum and maximum number of pages in that
> interval.
>
> 3. I would like the output as below. Following is just an example.
>
> Time                   Average Pages   Min Pages   Max Pages
> --------------------   -------------   ---------   ---------
> 15/Jun/2003:14:09:02   6.5             3           10
> 15/Jun/2003:14:24:02   5.5             4           7
>
>
> I shall appreciate an early response.
>
> Thanks in advance
>
> Regards
> Anand
>
> --
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
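
One possible way to approach the three goals listed in the original message is sketched below. It is only a sketch under stated assumptions, not a drop-in fix for the script above. The names to_epoch, to_stamp and summarise are hypothetical, and the @demo log lines are invented for illustration. It assumes the log is in chronological order, that all entries share the same UTC offset (timegm is used, so the offset is ignored), and that "average, min and max pages" means statistics over the per-timestamp page counts inside each interval. Two details of Time::Local and Perl scoping may be relevant to the script above: Time::Local expects months numbered 0 to 11, and a hash declared with my inside one subroutine is not visible in another.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw(timegm);
use List::Util qw(min max sum);

# Month abbreviations; the array index is the 0-based month Time::Local expects.
my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my %MONTH_INDEX = map { $MONTHS[$_] => $_ } 0 .. $#MONTHS;

# Hypothetical helper: "15/Jun/2003:13:54:02" -> epoch seconds, undef if malformed.
sub to_epoch {
    my ($stamp) = @_;
    my ($day, $mon, $year, $hour, $min, $sec) =
        $stamp =~ m{^(\d\d)/(\w{3})/(\d{4}):(\d\d):(\d\d):(\d\d)}
        or return;
    return unless exists $MONTH_INDEX{$mon};
    # timegm croaks on impossible dates, so trap that and return undef instead.
    return eval { timegm($sec, $min, $hour, $day, $MONTH_INDEX{$mon}, $year) };
}

# Hypothetical helper: epoch seconds -> "15/Jun/2003:13:54:02".
sub to_stamp {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $day, $mon, $year) = gmtime $epoch;
    return sprintf '%02d/%s/%04d:%02d:%02d:%02d',
        $day, $MONTHS[$mon], $year + 1900, $hour, $min, $sec;
}

# Returns one [interval_end, average, min, max] row per non-empty interval.
sub summarise {
    my ($interval, @lines) = @_;

    # Goal 1: count pages per timestamp, skipping .js, .gif and .css requests.
    my (%hits, @order);
    for my $line (@lines) {
        my ($stamp, $path) =
            $line =~ m{\[([^\s\]]+)[^\]]*\]\s+"\S+\s+(\S+)} or next;
        next if $path =~ /\.(?:js|gif|css)$/;
        push @order, $stamp unless $hits{$stamp}++;
    }
    return unless @order;

    # Goal 2: place each timestamp in a fixed-width interval that starts
    # at the first timestamp seen in the log.
    my $start = to_epoch($order[0]);
    return unless defined $start;
    my @buckets;
    for my $stamp (@order) {
        my $epoch = to_epoch($stamp);
        next unless defined $epoch && $epoch >= $start;
        push @{ $buckets[ int(($epoch - $start) / $interval) ] }, $hits{$stamp};
    }

    # Goal 3: one row per interval, labelled with the interval's end time.
    my @rows;
    for my $index (0 .. $#buckets) {
        my $counts = $buckets[$index] or next;    # skip empty intervals
        push @rows, [
            to_stamp($start + ($index + 1) * $interval),
            sum(@$counts) / scalar(@$counts),
            min(@$counts),
            max(@$counts),
        ];
    }
    return @rows;
}

# Invented sample lines in the access_log format quoted above.
my @demo = (
    '127.0.0.1 - - [15/Jun/2003:13:54:02 -0100] "GET /a HTTP/1.1" 200 34906',
    '127.0.0.1 - - [15/Jun/2003:13:54:02 -0100] "GET /b HTTP/1.1" 200 1200',
    '127.0.0.1 - - [15/Jun/2003:13:54:02 -0100] "GET /style.css HTTP/1.1" 200 300',
    '127.0.0.1 - - [15/Jun/2003:13:58:10 -0100] "GET /c HTTP/1.1" 200 512',
    '127.0.0.1 - - [15/Jun/2003:14:10:00 -0100] "GET /d HTTP/1.1" 200 77',
);

printf "%-22s %8s %4s %4s\n", 'Interval end', 'Average', 'Min', 'Max';
printf "%-22s %8.1f %4d %4d\n", @$_ for summarise(15 * 60, @demo);
```

The main difference from the script above is that the interval is chosen by arithmetic on epoch seconds (offset from the first timestamp divided by the interval length) rather than by comparing consecutive records, so each timestamp lands in exactly one interval even when there are gaps in the log.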