Oops. Thanks for pointing me toward the activity log again. It turns out that the problem was "IdleTimeOut" after all. I had been looking for a server message with approximately the same timestamp as the client's lost-connection message, but the server logged the termination at 02:49, about 17 minutes before the client noticed at 03:06.
dsmsched.log, with the server activity log entry (the ANR0482W line) inserted:
11/16/04 02:34:03 Normal File--> 1,048,584,192 /ess36/oracle/BNRD096/bnrd096.fgbtrnd_key_index ** Unsuccessful **
11/16/04 02:34:03 ANS1114I Waiting for mount of offline media.
11/16/2004 02:49:33 ANR0482W Session 37606 for node VCMR-96.SERVER.RPI.EDU (AIX) terminated - idle for more than 15 minutes.
11/16/04 03:06:57 Retry # 1 Normal File--> 1,048,584,192 /ess36/oracle/BNRD096/bnrd096.fgbtrnd_key_index [Sent]
11/16/04 03:06:57 ANS1809W Session is lost; initializing session reopen procedure.
11/16/04 03:06:57 Successful incremental backup of '/ess36'
dsmerror.log:
11/16/04 03:06:56 ANS1005E TCP/IP read error on socket = 4, errno = 73, reason : 'Connection reset by peer'.
11/16/04 03:06:57 ANS1809W Session is lost; initializing session reopen procedure.
11/16/04 03:06:57 ANS1809W Session is lost; initializing session reopen procedure.
11/16/04 03:07:12 ANS1810E TSM session has been reestablished.
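For anyone lining up these logs later, here is a rough Python sketch of the timestamp arithmetic. It only uses the three timestamps quoted above; the helper name parse_ts and the variable names are my own, not anything from TSM. It shows that the server dropped the session roughly 15 minutes after the client began waiting for the mount, and that the client did not report the loss until about 17 minutes after that.

```python
from datetime import datetime

def parse_ts(text):
    """Parse 'MM/DD/YY HH:MM:SS' (client logs) or 'MM/DD/YYYY HH:MM:SS' (server log)."""
    for fmt in ("%m/%d/%y %H:%M:%S", "%m/%d/%Y %H:%M:%S"):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognized timestamp: " + text)

# Timestamps copied from the log excerpts in this message.
mount_wait = parse_ts("11/16/04 02:34:03")      # ANS1114I, client starts waiting
server_drop = parse_ts("11/16/2004 02:49:33")   # ANR0482W, server ends the session
client_notice = parse_ts("11/16/04 03:06:57")   # ANS1809W, client sees the loss

idle_seconds = (server_drop - mount_wait).total_seconds()
lag_seconds = (client_notice - server_drop).total_seconds()

print("idle before server drop: %.1f minutes" % (idle_seconds / 60))
print("lag until client noticed: %.1f minutes" % (lag_seconds / 60))
```

So when searching the activity log, it makes sense to look back well before the client's ANS1809W timestamp, not just at the same minute.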
At 08:01 PM 11/16/2004, you wrote:
... >ANS1809W Session is lost; ...
Chet - This is most commonly caused by preemption, where client scheduling is too clumped rather than spread out over the day, and a higher priority task (e.g., Restore) needs a drive when they are all in use by lower priority tasks (e.g., Backup). Check the server Activity Log.
Richard Sims