Hi Gang,

I've had two cases of a bad scratch tape causing me issues this week, but TSM behaved differently in each case and, to my mind, inconsistently.

Case 1. Full backup of a TDP for Exchange node.

TSM 5.4.4.0 for Windows Server, TSM client 5.3.6, Storage agent 5.3.6, TDP for Exchange 5.3.3.1, TS3100 library with LTO3 drives.

The LAN-free TDP backup is part way through and goes to mount the next tape. The tape mount fails:

02/26/2009 04:16:48 ANR8944E Hardware or media error on drive DRIVE1 (\\.\Tape2) with volume DIA142L3(OP=TESTREADY, Error Number= 23, CC=0, KEY=03, ASC=53, ASCQ=00, SENSE=70.00.03.00.00.00.00.58.00.00.00.00.53.00.36.00.2E.07.00.02.00.02.20.20.20.20.20.20.20.00.00.00.24.98.01.74.00.00.00.00.00.00.00.00.00.00.00.00.00.00.12.02.00.00.00.00.00.00.00.60.00.00.00.00.70.00.03.00.00.00.00.58.00.00.00.00.53.00.36.00.2E.07.00.02.00.02.20.20.20.20.20.20.00.00.00, Description=An undetermined error has occurred). Refer to Appendix C in the 'Messages' manual for recommended action. (SESSION: 995)
02/26/2009 04:16:48 ANR8304E Time out error on drive DRIVE1 (\\.\Tape2) in library ATL. (SESSION: 995)
02/26/2009 04:16:48 ANR8945W Scratch volume mount failed DIA142L3. (SESSION: 995)
02/26/2009 04:17:17 ANR8381E LTO volume DIA142L3 could not be mounted in drive DRIVE1 (\\.\Tape2). (SESSION: 995)
02/26/2009 04:17:17 ANR9790W Request to mount volume *SCRATCH* for library

This is the only scratch tape in the library and it is physically damaged. The TDP aborts the transaction and retries the backup, which writes to the remainder of the first tape until it is full, at which point it tries to mount the scratch again, gets the same error, and the cycle repeats.

Case 2. Library Manager/Library client setup, multiple P595 AIX LPARs.

One TSM Server is set up as Library Manager and Config Manager, with 4 library client servers, all at TSM Server 5.5.1.0. Big TS3500 library, 30 drives and 2500 LTO4 tapes. This site is ramping up and has about 2000 scratch tapes.

For reasons that we haven't quite understood yet, tapes are being left in drives and not properly dismounted. That's not the interesting part.
A library client tries to mount a scratch on a drive that is unable to do so, which produces an immediate I/O error. In this case the scratch that had the I/O error is marked private. TSM assumes the problem is the *tape* and attempts to mount the next available scratch in the same drive. Again this gets the I/O error and is marked private. In a minute or two the server has run through all 2000 scratches and we have none left.

02/24/2009 03:01:22 ANR8300E I/O error on library LTOCV1 (OP=00006C03, CC=207, KEY=05, ASC=21, ASCQ=01, SENSE=70.00.05.00.00.00.00.0A.00.00.00.00.21.01.00.C0.00.06., Description=Device is not in a state capable of performing request). Refer to Appendix C in the 'Messages' manual for recommended action. (SESSION: 7658)
02/24/2009 03:01:22 ANR8779E Unable to open drive , error number=2. (SESSION: 7658)
02/24/2009 03:01:22 ANR8300E I/O error on library LTOCV1 (OP=00006C03, CC=207, KEY=05, ASC=21, ASCQ=01, SENSE=70.00.05.00.00.00.00.0A.00.00.00.00.21.01.00.C0.00.04., Description=Device is not in a state capable of performing request). Refer to Appendix C in the 'Messages' manual for recommended action. (SESSION: 7658)
02/24/2009 03:01:22 ANR8778W Scratch volume CV0288L4 changed to Private Status to prevent re-access. (SESSION: 7658)
02/24/2009 03:01:22 ANR8942E Could not move volume CV0288L4 from slot-element 1356 to slot-element 65535. (SESSION: 7658)
02/24/2009 03:01:22 ANR8381E LTO volume CV0288L4 could not be mounted in drive . (SESSION: 7658)
02/24/2009 03:01:22 ANR9790W Request to mount volume *SCRATCH* for library client CV01 failed. (SESSION: 7658)

Questions.

Why did the first case not mark the tape Private?
Why did the second case retry the mount repeatedly on the same drive rather than moving on to the next one?

Does anyone understand this behavior?

Thanks
Steve

Steven Harris
TSM Admin, Sydney Australia
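
P.S. In case it helps anyone comparing the two failures, here is a rough Python sketch that pulls the sense key and ASC/ASCQ out of the SENSE= strings in the messages above. It assumes the bytes are standard fixed-format SCSI sense data (response code 70, sense key in the low nibble of byte 2, ASC in byte 12, ASCQ in byte 13), which matches the KEY/ASC/ASCQ values TSM prints. The function names and the two small lookup tables are only illustrative; the tables cover just the codes seen in these logs.

```python
# Names taken from the standard SCSI sense key and ASC/ASCQ tables;
# only the codes that appear in the two logs are listed here.
SENSE_KEYS = {0x03: "MEDIUM ERROR", 0x05: "ILLEGAL REQUEST"}
ASC_ASCQ = {
    (0x53, 0x00): "MEDIA LOAD OR EJECT FAILED",
    (0x21, 0x01): "INVALID ELEMENT ADDRESS",
}


def decode_sense(dotted):
    """Return (key, asc, ascq) from dotted-hex fixed-format sense data."""
    data = [int(tok, 16) for tok in dotted.strip(".").split(".")]
    if len(data) < 14 or (data[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("expected at least 14 bytes of fixed-format sense data")
    return data[2] & 0x0F, data[12], data[13]


def describe(dotted):
    """Format the decoded fields with their names, if they are in the tables."""
    key, asc, ascq = decode_sense(dotted)
    return "KEY=%02X (%s), ASC/ASCQ=%02X/%02X (%s)" % (
        key, SENSE_KEYS.get(key, "not in table"),
        asc, ascq, ASC_ASCQ.get((asc, ascq), "not in table"))


# First 18 bytes of the Case 1 SENSE string, with the log's line-wrap hyphen removed.
CASE1 = "70.00.03.00.00.00.00.58.00.00.00.00.53.00.36.00.2E.07"
# Case 2 SENSE string from the first ANR8300E message.
CASE2 = "70.00.05.00.00.00.00.0A.00.00.00.00.21.01.00.C0.00.06."

print(describe(CASE1))
print(describe(CASE2))
```

So Case 1 reports a medium error on the load, while Case 2 reports an illegal request against an element address, which is a different kind of failure even though both end in ANR8381E.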