We need to perform scheduled full/cold backups of a 1TB Oracle database on a Sun Enterprise 10000 (E10K) server running Solaris 8 and TSM Server/Client v5.1. The server has a local 30-slot tape library with 4 SCSI DLT7000 drives and a 30GB DiskPool. The TSM Shared Memory protocol and both "SELFTUNE..." server options are enabled. The Oracle database comprises very large files (most are 10GB) spread across 20 filesystem mountpoints (RAID 1+0).
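For reference, the relevant communication and self-tuning options look roughly like this. The two "SELFTUNE..." options we mean are SELFTUNEBUFPOOLSIZE and SELFTUNETXNSIZE; the server name is a placeholder and 1510 is just the default shared-memory port:

   dsmserv.opt (server):
       COMMMETHOD            SHAREDMEM
       SHMPORT               1510
       SELFTUNEBUFPOOLSIZE   YES
       SELFTUNETXNSIZE       YES

   dsm.sys (client stanza):
       SERVERNAME     E10K_TSM
       COMMMETHOD     SharedMem
       SHMPORT        1510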
We defined 1 TSM client node with ResourceUtilization=10, collocation disabled, compression disabled, and DevClass Format=DLT35C (DLT7000 with hardware compression enabled). The Backup CopyGroup is directed to a primary TapePool. The 30GB DiskPool is not used for the Oracle DB backup because of the huge file sizes.

GOOD - During the first backup we saw 4 active data sessions and 4 mounted tapes at all times. The backup kept all 4 drives busy 100% of the time and, thanks to hardware compression, ultimately used only 4 tapes.

NOT SO GOOD - The backup took approximately 18.5 hours, which translates to about 13.5GB/hour per tape drive. We expected to achieve 25GB/hour per drive, based on prior performance with TSM 3.7 on the same hardware.

BAD - Portions of every filespace (possibly even portions of each 10GB file?) were scattered across all 4 tapes.

BAD - A subsequent 'BACKUP STGPOOL <tapepool> <copypool>' had to decompress and then recompress all 1TB of data, and took longer than the backup itself.

BAD - A subsequent disaster-recovery test took 30 hours. We couldn't find any 'dsmc' command-line syntax to initiate a multi-threaded restore of multiple filespaces, so we had to initiate the filespace restores individually and sequentially to avoid tape contention (i.e., a strong likelihood of concurrent threads or processes requesting data from different locations on the same 4 tapes at the same time). A sketch of that sequential workaround is appended below.

We cannot back up to the DiskPool and let migration manage tape-drive utilization, because the individual files are so large (10GB) that they would almost immediately flood the 30GB DiskPool.

Q1 - We are considering setting TSM client Compression=Yes and using DevClass Format=DLT35 (instead of DLT35C) to solve the decompress/recompress problem (see the excerpt appended below). However, there would then be 4 TSM client processes performing CPU-intensive data compression, plus the TSM server process, all running concurrently on one server. Will that be too processor-intensive, and could the server still deliver data fast enough to keep all 4 drives streaming?

Q2 - We are also considering setting Collocation=Filespace to permit multiple concurrent restores with minimal tape contention (also sketched below). Wouldn't that use 20 or more tapes during a backup instead of just 4, and wouldn't we overrun our library's 30-slot limit just doing the first backup, the first 'BACKUP STGPOOL <tapepool> <copypool>' and the first 'BACKUP DB'?

Q3 - How can we boost the tape drives back to 25GB/hour? What are the likely reasons TSM 5.1 is getting only 13.5GB/hour?

Q4 - If we set Collocation=Filespace, can we limit the number of tapes used, or otherwise influence TSM's selection of tapes, during backup?

Q5 - Does backup of 10GB files to a local tape device via Shared Memory require special performance tuning?

What else should we do - increase slots/drives? increase the DiskPool size? return to the 4-logical-node model and abandon multi-threading in both backups and restores? something else? Comments and recommendations welcome - rsvp, thanks

Kent Monthei
GlaxoSmithKline
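P.S. For concreteness, here are rough excerpts of the current definitions. All names (ORANODE, ORADOM, DLTLIB, DLT7KCLASS, TAPEPOOL) are placeholders for our real ones, and the excerpts are paraphrased from memory rather than cut-and-pasted:

   admin commands (server):
       register node ORANODE secret domain=ORADOM
       define devclass DLT7KCLASS devtype=dlt format=dlt35c library=DLTLIB mountlimit=4
       define stgpool TAPEPOOL DLT7KCLASS maxscratch=30 collocate=no
       update copygroup ORADOM STANDARD STANDARD type=backup destination=TAPEPOOL

   dsm.sys (client stanza):
       RESOURCEUTILIZATION   10
       COMPRESSION           NO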
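The sequential DR restore was essentially the following Bourne-shell loop (the /oraNN mountpoint names are illustrative). Backgrounding the iterations with '&' is exactly what we avoided, for fear of many processes seeking to different spots on the same 4 tapes:

   #!/bin/sh
   # restore the 20 Oracle filespaces one at a time, sequentially,
   # to avoid concurrent reads against the same 4 tape volumes
   for n in 01 02 03 04 05 06 07 08 09 10 \
            11 12 13 14 15 16 17 18 19 20
   do
       dsmc restore "/ora${n}/*" -subdir=yes -replace=all
   done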
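The change we have in mind for Q1 is just the following (same placeholder names as above; not yet tested):

   admin command (server):
       update devclass DLT7KCLASS format=dlt35

   dsm.sys (client stanza):
       COMPRESSION   YES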
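And the change under consideration for Q2/Q4 is:

   admin command (server):
       update stgpool TAPEPOOL collocate=filespace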