If you want something automated, you may want to write a script that issues
a query event * * t=c and searches the output for scheduled client events
with a status of 'failed', 'severed', 'missed', or 'uncertain'.
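
As a rough sketch of the filtering step, assuming the query output has
already been captured as comma-delimited lines (for example from the
administrative command-line client): the column order used here (scheduled
start, actual start, schedule name, node name, status) is an assumption, so
check it against what your server actually prints. The function name
find_problem_events is made up for illustration.

```python
import csv

# Statuses named above that mark an event as needing attention.
PROBLEM_STATUSES = {"failed", "severed", "missed", "uncertain"}


def find_problem_events(lines):
    """Return (node, scheduled_start, status) for every problem event.

    `lines` is an iterable of comma-delimited text lines. The five-column
    layout is an assumption, not something guaranteed by the server.
    """
    problems = []
    for row in csv.reader(lines):
        if len(row) < 5:
            continue  # skip blank lines, banners, and stray messages
        scheduled_start, _actual_start, _schedule, node, status = (
            field.strip() for field in row[:5]
        )
        if status.lower() in PROBLEM_STATUSES:
            problems.append((node, scheduled_start, status))
    return problems
```

Anything that is not one of the four statuses (completed, future, and so
on) is simply ignored.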

After it identifies the problem clients, the script immediately starts
incremental backups from the command line for each node, staggered every
15, 30, or 45 minutes (depending on how long each node takes to back up, the
number of drives, and the other processes running on the server at the
time).
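
One way to read the staggering is that the first problem node starts right
away and each following node starts one interval later. A minimal sketch of
that reading, with a made-up helper name (stagger_reruns) and the interval
left as a parameter you would tune for your own drives and server load:

```python
from datetime import datetime, timedelta


def stagger_reruns(nodes, first_start, interval_minutes=15):
    """Pair each node with a start time, one interval apart.

    The first node starts at `first_start`; node i starts
    i * interval_minutes later. How the backup is actually kicked off
    at that time is left to the caller.
    """
    step = timedelta(minutes=interval_minutes)
    return [(node, first_start + i * step) for i, node in enumerate(nodes)]
```

For example, three nodes with a 30 minute interval and an 08:00 first start
would be planned for 08:00, 08:30, and 09:00.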

After the server has identified all clients that fit the above criteria, it
would then write each client's name, the date, the time, and the status to a
file (unsuccessful.log) so you can track the failures for resolution
purposes.
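
The logging step could look something like the sketch below. The
tab-separated layout and the function names are my own assumptions; only
the file name and the four fields (name, date, time, status) come from the
idea above.

```python
from datetime import datetime


def format_log_line(node, status, when):
    """Build one log line: node name, date, time, status (tab-separated)."""
    return f"{node}\t{when:%Y-%m-%d}\t{when:%H:%M:%S}\t{status}"


def append_failures(problems, when, path="unsuccessful.log"):
    """Append one line per (node, status) pair to the log file."""
    with open(path, "a", encoding="utf-8") as log:
        for node, status in problems:
            log.write(format_log_line(node, status, when) + "\n")
```

Appending rather than overwriting keeps a running history, which makes it
easier to spot nodes that fail night after night.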


Unfortunately, we do not have a script like that here, but maybe this idea
will point you in the right direction!

Regards,

Demetrius Malbrough
UNIX/TSM Administrator

-----Original Message-----
From: Sean McNamara [mailto:[EMAIL PROTECTED]]
Sent: Thursday, February 07, 2002 7:41 AM
To: [EMAIL PROTECTED]
Subject: Process for re-running failed or missed backups


Good Morning,

        I am looking for some best practices from the list on how to handle
backups that have either failed or were missed during the night.  We
currently back up 1000 nodes to four TSM servers.  Most mornings we have
about 20-30 servers that have had a problem completing their backup for one
reason or another.  We currently investigate why each failure occurred and
then re-run the backup to capture changed data.  It can get to be quite
time consuming.  I'm wondering what other shops do to handle these
situations.  I am looking not only to reduce the number of failures we
experience each night but also to automate the process of checking these
backups and re-running them.  Thanks in advance.


Sean

Sean McNamara
Senior Analyst
PJM Interconnection, L.L.C.
955 Jefferson Ave
Norristown, PA  19403
(610)666-4206
(610)666-4285 (fax)
