On Wed, May 5, 2010 at 7:21 PM, Akhthar Parvez K
<akht...@sysadminguide.com> wrote:

> On Wednesday 05 May 2010, Rob Coops wrote:
> > Would it not be more efficient to reset the file handle to the start of
> > the file?
> >
> > use strict;
> > use warnings;
> > use Fcntl qw(:seek);
> > ...
> > foreach my $condition (@conditions) {
> >  seek ( $fh, 0, SEEK_SET ) or die "ERROR: Could not reset file handle: $!\n";
> >  while ( my $line = <$fh> ) {
> >   ...
> >  }
> > }
> >
> > That way there is no need to re-open the file handle over and over again,
> > saving on that IO overhead. Of course you then need to deal with the
> > possibility of the file being rotated etc. if this is a long-running
> > script, but if the script is only started every now and then, say once
> > every 5 or 10 minutes, then you should be fine doing this.
> >
> > If you want this script to run all the time in a never-ending loop and
> > deal with rotating files due to size/dates etc., then reopening the file
> > is the simplest way of doing this.
>
> He may also avoid opening filehandle again by doing this:
>
> eg:-
>
> while (my $line = <$fh>)
> {
>  if ($line =~ /$condition1/) { print $fh_out1 $line; }
>  elsif ($line =~ /$condition2/) { print $fh_out1 $line; }
> }
>
> However, this wouldn't be appropriate if there're more conditions. So the
> big question is, which way is better if we need to do a check statement
> (grep or regex) with a file multiple times in a program?
>
> 1. Storing the file into a list or a variable at the beginning - But what
> if the file is huge?
> 2. Opening filehandles again and again - would cause I/O overhead
> 3. any other method?
>
> --
> Regards,
> Akhthar Parvez K
> http://Tips.SysAdminGUIDE.COM
> UNIX is basically a simple operating system, but you have to be a genius to
> understand the simplicity - Dennis Ritchie
>

A file never starts life huge, but logs tend to grow and are not always kept
in check, so assume they will be massive (I have seen flat text logs grow by
more than 1GB per day). Assuming that a file will always stay small just
because it is small right now is very dangerous, especially if you have
little or no control over that file yourself.

An if/elsif construction is nice for a few conditions, but it will not work
for, say, 50 of them; even at 10 the script becomes hard to read. Of course
one could construct a hash where the key is the condition and the value is
an array of the lines that match that condition, but this runs into the
same memory problem again.
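
A minimal sketch of that hash idea, to show where the memory goes. The
labels, patterns and sample data are made up for illustration, and the hash
is keyed by a label rather than by the pattern itself to keep it readable:

```perl
use strict;
use warnings;

# Hypothetical labels and patterns; the names are illustrative only.
my %conditions = (
    error   => qr/ERROR/,
    warning => qr/WARN/,
);

# Collect the matching lines per condition label in a single pass.
sub collect_matches {
    my ($fh, $conds) = @_;
    my %matches = map { $_ => [] } keys %$conds;
    while (my $line = <$fh>) {
        foreach my $label (keys %$conds) {
            # Every stored line stays in memory until the hash goes away.
            push @{ $matches{$label} }, $line if $line =~ $conds->{$label};
        }
    }
    return \%matches;
}

# In-memory filehandle standing in for a real log file.
my $sample = "INFO start\nERROR disk full\nWARN slow\nERROR timeout\n";
open my $fh, '<', \$sample or die "ERROR: Could not open sample: $!\n";
my $matches = collect_matches($fh, \%conditions);
close $fh;

printf "%s: %d line(s)\n", $_, scalar @{ $matches->{$_} } for sort keys %$matches;
```

The file is read only once, but every matching line is held in the hash, so
on a log full of matches this grows with the file.
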
The only way to do this without having to worry about memory is to reset
the file handle to the start of the file at the start of each loop
iteration. That way you avoid any memory problems and you avoid having to
open and close the file, say, 50 times.
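
Roughly like this; a sketch only, with a temporary sample file and two
made-up conditions standing in for the real log and the real checks. The
helper name count_matches is mine, not something from the original script:

```perl
use strict;
use warnings;
use Fcntl qw(:seek);
use File::Temp qw(tempfile);

# Hypothetical sample log so the sketch runs on its own.
my ($tmp_fh, $logfile) = tempfile(UNLINK => 1);
print {$tmp_fh} "INFO start\n", "ERROR disk full\n", "WARN slow\n", "ERROR timeout\n";
close $tmp_fh or die "ERROR: Could not close $logfile: $!\n";

# Hypothetical conditions; a real script might have dozens.
my @conditions = (qr/ERROR/, qr/WARN/);

# Count the matching lines per condition, one pass over the file per condition.
sub count_matches {
    my ($path, @patterns) = @_;
    open my $fh, '<', $path or die "ERROR: Could not open $path: $!\n";
    my @counts;
    foreach my $pattern (@patterns) {
        # Rewind instead of reopening; only one line is in memory at a time.
        seek($fh, 0, SEEK_SET) or die "ERROR: Could not reset file handle: $!\n";
        my $count = 0;
        while (my $line = <$fh>) {
            $count++ if $line =~ $pattern;
        }
        push @counts, $count;
    }
    close $fh;
    return @counts;
}

my @counts = count_matches($logfile, @conditions);
print "ERROR: $counts[0], WARN: $counts[1]\n";
```

The file is opened once and read once per condition, so the cost is extra
reads rather than extra memory or extra open/close calls.
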

One other thing I would like to remark on: when you are monitoring a log or
anything else, reporting only errors is not a good idea. You want to report
both the good and the bad. Think of the following situation: your machine's
logs are flooded with errors, but due to a configuration error your scheduler
is not starting the monitoring script, or the script starts but cannot access
the file because of a permissions problem, or it cannot find a path to the
mail server due to a routing issue, etc. You will never know that your logs
are being spammed with errors, since if you don't get a mail this means
everything is fine, right?
I have been in a situation where a critical production system had been
screaming that the restoration of its RAID array had caused the database on
it to become inconsistent, but due to an error in the routing tables the
error mails were being sent out from an incorrect interface and the mail
server could not be reached. It took over 3 months before the system went
down in a catastrophic failure, at which point there was no other solution
than to rebuild the whole database. All of that just because positive news
was not being sent out.
So if you want to rely on mails sent from the machine to monitor logs, make
sure you send a mail even if all it says is that there is nothing to report.
;-)
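
As a rough sketch of that idea: build a report on every run, whether or not
anything matched. The function name build_report, the subject wording and
the host name are all made up here, and actual delivery (sendmail, SMTP,
whatever you use) is deliberately left out:

```perl
use strict;
use warnings;

# Build the report for one monitoring run. A message is produced on
# every run, so a missing mail means the monitoring itself is broken.
sub build_report {
    my ($host, @errors) = @_;
    my $count   = scalar @errors;
    my $subject = $count
        ? "[$host] log check: $count error(s) found"
        : "[$host] log check: nothing to report";
    my $body = $count ? join('', @errors) : "No matching lines in this run.\n";
    return ($subject, $body);
}

# Hypothetical host name; the result would be handed to your mailer.
my ($subject, $body) = build_report('db01');
print "Subject: $subject\n\n$body";
```
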

Rob
