The pool in question here is not collocated, so I expected to have lots of tape mounts.... but the long period of inactivity is what puzzles us. I've also seen notes indicating that the "No Query Restore" may be the cause... I'll suggest to our Windows guys to try a classic restore.
-Robin

"Connor, Jeffrey P." <[EMAIL PROTECTED]>
Sent by: "ADSM: Dist Stor Manager" <[EMAIL PROTECTED]>
07/07/2005 04:39 PM
Please respond to "ADSM: Dist Stor Manager"
To: ADSM-L@VM.MARIST.EDU
cc:
Subject: Re: Very slow restores (days), hours to locate files

Robin,

I hope the LTO firmware resolves your problem. However, I have seen a similar situation for Windows clients in our shop and it was not a tape drive issue. The situation here was that we had a tape stgpool, 3590Ks/3590E1A drives, collocated by node, that reached its maxscratch value. This led to what some folks call imperfect collocation, where even though the stgpool is set up to collocate by node, data for more than one node can end up on the same tape.

The problem we had with the node intermix in a collocated-by-node pool showed itself in a situation that sounds similar to yours. We attempted to run a restore of an 8GB Win2k C: drive, about 50% full, and saw very long delays where nothing appeared to be happening. A tape change would occur, some data would transfer, and then there would be a VERY long pause before a mount request for the next tape. Query Session while the tape was mounted showed what your Q SE showed below: session in Run state, zero seconds wait time, but sent and received byte counts unchanged. As in your case, though with fewer volumes, our incremental backups of the server's C: drive had files spread around a number of tapes.

We never determined the root cause of the "think" time between tape mount requests. We resolved the issue by moving the tapes in our stgpool to a new pool with a high MAXSCR value, effectively re-collocating the data. All restores ran very happily after that. Sorry I could not provide a root cause for our situation, but that's how we addressed it.

Just curious, but do the 310 tapes you identified also contain data for other nodes, and are you using collocation?
Jeff Connor
National Grid USA

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Robin Sharpe
Sent: Wednesday, July 06, 2005 10:10 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Very slow restores (days), hours to locate files

Hi guys,

We're having problems restoring some Windows servers (W2K)... The servers in question had some disk problems and are being rebuilt, so the Windows admins are restoring the C: drive. It is an 8GB drive and less than 50% used, so only 4GB to restore. It has taken several days to restore.

I know one of our problems is that the data is spread over hundreds of volumes (literally... I counted 310 from a volumeusage query). Another problem is that we have an overflowed library, but we have loaded all of the tapes from the Windows storage pool. What I don't understand is why it takes so long to locate a file once the tape is mounted. We have seen the same tape mounted for hours before any data is transferred. Here is an excerpt from a "q se f=d" of a restore that is running right now:

    Sess Number: 1,143
    Comm. Method: TCP/IP
    Sess State: Run
    Wait Time: 0 S
    Bytes Sent: 670.9 M
    Bytes Recvd: 58.2 K
    Sess Type: Node
    Platform: WinNT
    Client Name: WANO01
    Media Access Status: Current input volume(s): 200658,(2279 Seconds)
    User Name:
    Date/Time First Data Sent:
    Proxy By Storage Agent:

This restore has been running for almost 12 hours now (they have been restarting them periodically). There has been NO DATA transferred from that tape in the 38 minutes it has been mounted... I know this from running lsof and looking at the offset, which indicates the number of bytes transferred.

I know that when I restore a single file, it can be found within seconds of mounting a tape (these are all LTO-2)... so why does it take so long in this case? Is TSM actually reading the entire tape? If so, wouldn't I see lots of data being transferred?
Or is there some kind of SCSI command that allows the drive to read and compare the data it gets? I thought TSM stored the actual locations of the files in the DB, so it could quickly find any file (or aggregate) without reading the whole tape... I've been searching the literature, and I can't find any details on this.

The TSM server is on HP-UX 11i, with IBM LTO-2 drives, fiber-attached, in an STK L700 library. Also, my DB is huge (314GB), and we are currently (for the last year) unable to delete anything, so we have many versions of volatile files. We are planning to split our environment into several TSM servers, and in the short term, our Windows admins will start doing weekly selective backups of the C: drives to consolidate active versions on a few tapes.

Thanks for any thoughts on this....

Robin Sharpe
Berlex Labs
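The stall both posters describe (same input volume mounted, session in Run state, byte counts not moving) can be checked mechanically by comparing two samples of the "q se f=d" output. What follows is only a minimal Python sketch under stated assumptions, not a TSM tool: the function names `parse_session` and `looks_stalled` are hypothetical, the patterns assume the field labels appear exactly as in the excerpt quoted in this thread, and the second sample is invented for illustration.

```python
import re

# Patterns assume the field labels shown in the "q se f=d" excerpt in this thread.
VOLUME_RE = re.compile(r"Current input volume\(s\):\s*(\w+),\((\d+)\s+Seconds\)")
SENT_RE = re.compile(r"Bytes Sent:\s*([\d.,]+\s*[KMGT]?)")


def parse_session(text):
    """Extract the mounted volume, its mount time, and the Bytes Sent figure."""
    vol = VOLUME_RE.search(text)
    sent = SENT_RE.search(text)
    return {
        "volume": vol.group(1) if vol else None,
        "mounted_seconds": int(vol.group(2)) if vol else None,
        "bytes_sent": sent.group(1).strip() if sent else None,
    }


def looks_stalled(earlier, later):
    """True if the same tape stayed mounted while Bytes Sent did not change."""
    if earlier["volume"] is None or later["volume"] is None:
        return False
    return (
        earlier["volume"] == later["volume"]
        and later["mounted_seconds"] > earlier["mounted_seconds"]
        and earlier["bytes_sent"] == later["bytes_sent"]
    )


# First sample: values taken from the excerpt in the original message.
FIRST = (
    "Sess State: Run Wait Time: 0 S Bytes Sent: 670.9 M Bytes Recvd: 58.2 K "
    "Media Access Status: Current input volume(s): 200658,(2279 Seconds)"
)
# Second sample: hypothetical, ten minutes later, with no change in Bytes Sent.
SECOND = FIRST.replace("2279", "2879")

if __name__ == "__main__":
    a, b = parse_session(FIRST), parse_session(SECOND)
    print("volume", a["volume"], "mounted about",
          round(a["mounted_seconds"] / 60), "minutes")
    print("stalled:", looks_stalled(a, b))
```

The displayed Bytes Sent value is rounded (for example to 0.1 M), so a small transfer between samples could go unnoticed; the lsof offset check mentioned in the original message is the finer-grained measurement.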