On Dec 16, 2003, at 1:15 PM, Perl wrote:

I wrote a small script that uses message IDs as unique values and
extracts recipient address info. The goal is to count 1019 events per
message ID. It also gets the sum of recipients per message ID. The
script works fine but when it runs against a very large file (2GB+) I
receive an out of memory error.

Is there a more efficient way of handling the hash portion that is less
memory intensive and preferably faster?

Sure is.


# Tracking log parser


use strict;


my $recips;
my %event_id;
my $counter;
my $total_recips;
my $count;


# Get log file


die "You must enter a tracking log. \n" if $#ARGV <0;

Or:


die "..." unless @ARGV;

my $logfile = shift;

open (LOGFILE, $logfile) || die "Unable to open $logfile because\n$!\n";

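You won't need either of the above lines after the following change. Drop 'em. (Your report at the end still prints $logfile, so if you want to keep that, copy the name with my $logfile = $ARGV[0]; rather than shift, which would empty @ARGV.)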


foreach (<LOGFILE>) {

Here's your problem. Change to:


while (<>) {

Your old loop reads the entire file into memory, then hands you one line at a time, because foreach evaluates <LOGFILE> in list context. The while version reads one line at a time, and when we leave out the file handle, it reads the files named in @ARGV by default (or STDIN if @ARGV is empty).

It sets $_, just like your foreach() loop was doing, so the rest of this stuff should just work.
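
If you want to see the bare <> behavior in isolation, here is a minimal sketch. The helper name count_lines_from_argv and the temp file are made up for the demo; only while (<>) and @ARGV come from the advice above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a tiny stand-in "log" so the sketch runs without a real file.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "first\nsecond\nthird\n";
close $tmp_fh or die "Cannot close $tmp_name: $!";

# Hypothetical helper: count the lines of the files named in @ARGV.
sub count_lines_from_argv {
    local $_;       # keep the caller's $_ intact
    my $n = 0;
    while (<>) {    # scalar context: one line lands in $_ per iteration
        $n++;
    }
    return $n;
}

# Pretend the temp file was given on the command line.
@ARGV = ($tmp_name);
print count_lines_from_argv(), "\n";    # prints 3
```

Only the current line is held in memory at any time, which is the whole point for a 2GB+ log.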

        next if /^#/;   # skip any comment lines (those starting with the pound sign)
        my @fields = split (/\t/, $_); #split the line by tabs
        
        $recips = $fields[13]; # Number of recipients column

This $recips variable looks like it should be declared here, not above.


        my $message_id = $fields[9]; # message ID
        
         if ($fields[8] == "1019")    {
                
                $event_id{$message_id}++ unless exists $event_id{$message_id};
                $counter++;
                $total_recips = ($total_recips + $recips);

Or:


$total_recips += $recips;

                }
        
        
close LOGFILE;  

You need to drop this also. It sits inside the loop, and with the bare <> there is no LOGFILE handle left to close.


}
        

print "\n\nTotal instances of 1019 events in \"$logfile\" is $counter.\n\n";

print "\nTotal single instances of 1019 event per message ID is ";

#print keys %event_id;

foreach my $key (keys (%event_id))      {
        $count ++;
}

print $count;

print "\n\nTotal # of recipients per message ID is ";
print $total_recips;
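
Putting the pieces together, here is a rough sketch of how the tally could look, not a drop-in replacement. The names tally_1019 and sample_row, the in-memory sample rows, and the switch from == to eq (which assumes the event column holds exactly "1019") are my own assumptions; the column positions 8, 9 and 13 are taken from your script.

```perl
use strict;
use warnings;

# Hypothetical helper: tally 1019 events from any filehandle,
# reading one line at a time.
sub tally_1019 {
    my ($fh) = @_;
    my %event_id;
    my ($counter, $total_recips) = (0, 0);

    while (my $line = <$fh>) {
        next if $line =~ /^#/;              # skip comment lines
        chomp $line;
        my @fields = split /\t/, $line;
        next unless defined $fields[13];    # skip short or malformed lines
        next unless $fields[8] eq '1019';   # assumption: exact string match

        $event_id{ $fields[9] } = 1;        # remember each message ID once
        $counter++;
        $total_recips += $fields[13];
    }

    # keys in scalar context gives the number of keys, no counting loop needed
    return ($counter, scalar(keys %event_id), $total_recips);
}

# Made-up sample rows: only columns 8, 9 and 13 carry real values.
sub sample_row {
    my ($event, $msg_id, $recips) = @_;
    my @f = ('-') x 14;
    @f[8, 9, 13] = ($event, $msg_id, $recips);
    return join("\t", @f) . "\n";
}

my $log = "# header comment\n"
        . sample_row(1019, 'msg-a', 2)
        . sample_row(1019, 'msg-a', 3)
        . sample_row(1020, 'msg-b', 5)
        . sample_row(1019, 'msg-c', 1);

open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
my ($events, $unique_ids, $recips) = tally_1019($fh);
close $fh;

print "1019 events: $events\n";             # prints 3
print "Unique message IDs: $unique_ids\n";  # prints 2
print "Total recipients: $recips\n";        # prints 6
```

Keep in mind the hash still grows with the number of distinct message IDs, so streaming fixes the slurp but not the hash itself.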

Well, hope that helps put you back on track.


James

