On Wed, May 5, 2010 at 8:50 PM, Akhthar Parvez K <akht...@sysadminguide.com> wrote:
> On Wednesday 05 May 2010, Rob Coops wrote:
> >
> > A file never starts life being huge, but logs certainly tend to grow, and
> > they are not always kept in check properly, so assume they will be massive
> > (I've seen flat text logs that grew by as much as +1GB per day). Assuming
> > that the file will always be relatively small because it is at this point
> > in time is very dangerous, especially if you have little or no control
> > over that file yourself.
>
> Even though what you said does make sense, I don't think it's relevant here.
> It's not about whether you have control over the file or not, but what
> you would do if you want to open a huge file in a Perl program. You can
> either skip it, or you will have to store it in a variable and deal with
> the memory issue.
>
> > An if/elsif construction is nice for a few conditions, but that will not
> > work for, say, 50 of them; even at 10 the script will become hard to read.
> > Of course one could construct a hash where the key is the condition and
> > the value is an array of lines found that match that condition, but this
> > runs into the same memory problem again.
> > The only way to do this and not have to worry about memory issues is by
> > resetting the file handle to the start of the file at the start of the
> > loop; that way you avoid any memory problems and you avoid having to open
> > and close the file, say, 50 times.
>
> Yes, this is the right solution if you are going to play with the file
> contents multiple times, but the only exception is when the file size is
> huge and I've already talked about this above.
>
> > One other thing I would like to remark is that when you are monitoring a
> > log or anything else, just reporting errors is not a good idea; you want
> > to report both the good and the bad.
> > Think of the following situation: your machine's logs are flooded with
> > errors, but due to a configuration error your scheduler is not starting
> > the monitoring script, or the script is started but due to a permissions
> > problem it cannot access the file, or it cannot find a path to the mail
> > server due to a routing issue, etc. You will never know that your logs
> > are being spammed with errors, since if you don't get a mail this means
> > everything is fine, right?
> > I have been in a situation where a critical production system had been
> > screaming that the restoration of its RAID array had caused the database
> > on it to become inconsistent, but due to an error in the routing tables
> > the error mails were being sent out from an incorrect interface and the
> > mail server could not be reached. It took over 3 months before the system
> > went down in a catastrophic failure, when there was no other solution than
> > to rebuild the whole database, all of that just because positive news was
> > not being sent out.
> > So if you want to rely on mails sent from the machine to monitor logs,
> > make sure you send a mail even if all it says is that there is nothing
> > to report.
>
> In a real world scenario, system admins will have to set up a script to send
> out notifications if something fails. You can't set it up in such a way that
> it will send the email whatever the result is. You can do it if you are
> doing it only for a few systems, but it is not really feasible when you have
> a lot more. If you do, you'll end up receiving lots of emails and you might
> miss the failure notifications among them. So it's better to have the
> system send out email only when there's an issue. But then, the issue you
> mentioned might happen. If the email wasn't delivered for some reason, you
> wouldn't know there's an issue.
> But you have a better solution there as well: just perform periodic audits
> so that you can manually confirm everything is fine.
>
> For a critical component such as RAID, it's important to have a live
> monitoring system such as Nagios so that you'll never miss out on such
> things. At the end of the day, it comes down to how you manage what you
> have got in your hands.
>
> --
> Regards,
> Akhthar Parvez K
> http://Tips.SysAdminGUIDE.COM
> UNIX is basically a simple operating system, but you have to be a genius to
> understand the simplicity - Dennis Ritchie

Of course, and a system like Nagios does exactly that: it reports errors and
positives. Mail is of course not the right way to deal with monitoring;
in large environments it is simply not done via mail. I am currently looking
at the monitoring system for some 10k machines in the rooms behind me; thank
god they are not mailing me to tell me they are feeling happy today :-)
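
For anyone reading this thread later, the file handle reset mentioned above could look roughly like the sketch below. It is only an illustration under assumptions, not code from either poster: the pattern names, the sample log lines, and the helper names count_matches and build_report are all made up. The file is read line by line and rewound with seek before each pattern, so it is opened once and never held in memory as a whole. The report always has a body, including a "Nothing to report" line when no pattern matched.

```perl
use strict;
use warnings;

# Hypothetical patterns, for illustration only.
my %patterns = (
    error   => qr/\bERROR\b/,
    warning => qr/\bWARN(?:ING)?\b/,
    panic   => qr/\bPANIC\b/,
);

# Count matching lines per pattern. The handle is rewound before each
# pass, so the file is opened once and only one line is held at a time.
sub count_matches {
    my ($fh, $patterns) = @_;
    my %count;
    for my $name (sort keys %$patterns) {
        seek($fh, 0, 0) or die "Cannot rewind: $!";   # back to the start
        $count{$name} = 0;
        while (my $line = <$fh>) {
            $count{$name}++ if $line =~ $patterns->{$name};
        }
    }
    return \%count;
}

# Build a report that always has a body, even when nothing matched.
sub build_report {
    my ($count) = @_;
    my $total = 0;
    $total += $_ for values %$count;
    my @lines = ($total ? "Problems found" : "Nothing to report");
    push @lines, map { "$_: $count->{$_} matching line(s)" } sort keys %$count;
    return join("\n", @lines) . "\n";
}

# Made-up sample data in an in-memory handle so the sketch runs as-is;
# for a real log, open the file path instead of \$sample.
my $sample = join "\n",
    "10:00 INFO service started",
    "10:01 ERROR disk sda1 failing",
    "10:02 WARN raid degraded",
    "10:03 ERROR database inconsistent",
    "";
open(my $fh, '<', \$sample) or die "Cannot open sample data: $!";
my $count = count_matches($fh, \%patterns);
close($fh);
print build_report($count);
```

The trade-off is one full read of the file per pattern: memory use stays flat, but with 50 patterns the file is read 50 times. A single pass that tests every pattern against each line would also keep memory flat, as long as only counts are stored rather than the matching lines.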