We need to perform scheduled full/cold backups of a 1TB Oracle database on
a Sun Enterprise 10000 (E10K) server running Solaris 8 and TSM
Server/Client v5.1.  The server has a local 30-slot tape library with 4
SCSI DLT7000 drives, and a 30GB diskpool.  TSM Shared Memory protocol and
both "SELFTUNE..." settings are enabled.  The Oracle database consists
of very large files (most are 10GB) across 20 filesystem mountpoints (RAID
1+0).

We defined 1 TSM Client node with ResourceUtilization=10, collocation
disabled, compression disabled and the DevClass Format=DLT35C (DLT7000,
hardware compression enabled).  The Backup CopyGroup is directed to a
Primary TapePool.  The DiskPool is 30GB, but is not used for the Oracle DB
backup because of the huge file sizes.

GOOD - During the first backup we saw 4 active data sessions and 4 mounted
tapes at all times.  The backup used all 4 drives 100% of the time, and
because of hardware compression ultimately used only 4 tapes.

NOT SO GOOD - The backup took approx 18.5 hours, which translates to about 13.5
GB/hour per tape drive.  We expected to achieve 25GB/hour per drive (based
on prior performance with TSM 3.7 on the same hardware).
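
For anyone checking the arithmetic, here is a rough sketch of how the per-drive figure falls out of the numbers above.  It assumes 1TB means 1000GB and that all 4 drives were busy for the whole 18.5 hours; the function names are only illustrative.

```python
# Figures taken from the post; treating 1TB as 1000GB is an assumption.
DB_SIZE_GB = 1000
DRIVES = 4
ELAPSED_HOURS = 18.5
TARGET_GB_PER_HOUR = 25  # per-drive rate we expected from the TSM 3.7 days


def per_drive_rate(size_gb, hours, drives):
    """Average GB/hour written by each drive, assuming an even split."""
    return size_gb / hours / drives


def hours_at_rate(size_gb, rate_gb_per_hour, drives):
    """Elapsed hours if every drive sustained the given rate."""
    return size_gb / (rate_gb_per_hour * drives)


observed = per_drive_rate(DB_SIZE_GB, ELAPSED_HOURS, DRIVES)
expected_hours = hours_at_rate(DB_SIZE_GB, TARGET_GB_PER_HOUR, DRIVES)

print(f"observed: {observed:.1f} GB/hour per drive")
print(f"at {TARGET_GB_PER_HOUR} GB/hour per drive: {expected_hours:.1f} hours")
```

In other words, at the expected rate the same backup would have finished in roughly 10 hours rather than 18.5.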

BAD - Portions of every filespace (possibly even portions of each 10GB file?)
were scattered across all 4 tapes.

BAD - A subsequent 'BACKUP STGPOOL <tapepool> <copypool>' had to decompress
and then recompress all 1TB of data, and took longer than the backup itself.

BAD - A subsequent Disaster Recovery test took 30 hours.  We couldn't find
any 'dsmc' command-line syntax to initiate a multi-threaded restore for
multiple filespaces.  We had to initiate filespace restores individually
and sequentially to avoid tape contention problems (i.e., the strong
likelihood of concurrent threads or processes requesting data from
different locations on the same 4 tapes at the same time).

We cannot back up to DiskPool and let Migration manage tape drive
utilization, because the individual files are so large (10GB) that they
would almost immediately flood the 30GB DiskPool.
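
A back-of-the-envelope sketch of why the DiskPool looks too small.  It assumes 4 concurrent sessions, each with one 10GB file in flight, and it ignores any space that Migration might free while the backup runs; those are simplifying assumptions, not measured behaviour.

```python
# Sizes from the post; the session count matches the 4 data sessions we saw.
DISKPOOL_GB = 30
FILE_GB = 10
SESSIONS = 4

# Space needed if every session is sending one 10GB file at the same time.
in_flight_gb = FILE_GB * SESSIONS

# Whole files the pool can hold before it is completely full.
files_that_fit = DISKPOOL_GB // FILE_GB

print(f"in flight: {in_flight_gb}GB against a {DISKPOOL_GB}GB pool")
print(f"whole files that fit: {files_that_fit}")
print(f"pool too small for one file per session: {in_flight_gb > DISKPOOL_GB}")
```

Under those assumptions the first file from each of the 4 sessions already needs more space than the pool has.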

Q1 - We are considering whether to set TSM Client Compression=Yes and use
DevClass Format=DLT35 (instead of DLT35C) to solve the
decompress/recompress problem.  However, there will then be 4 TSM Client
processes performing CPU-intensive data compression plus the TSM Server
process, all running concurrently on one server.  Will that be too
processor-intensive, and could the server deliver data fast enough to keep
all 4 drives streaming?

Q2 - We are also considering whether to set Collocation=Filespace to
permit multiple concurrent restores with minimal tape contention.  Wouldn't
that use 20 or more tapes during a backup instead of just 4, and wouldn't
we overrun our library's 30-slot limit just doing the 1st backup, 1st
'BACKUP STGPOOL <tapepool> <copypool>' and 1st 'BACKUP DB'?

Q3 - How can we boost performance of the tape drives back to 25GB/hour?
What are likely reasons why TSM 5.1 is only getting 13.5GB/hour?

Q4 - If we set Collocation=Filespace, can we limit the number of tapes
used, or influence TSM's selection of tapes, during backup?

Q5 - Does backup of 10GB files to a local tape device via Shared Memory
require special performance tuning?

What else should we do - increase slots/drives; increase DiskPool size;
return to the 4-logical-node model and abandon multi-threading in both
backups and restores; other?

Comments & recommendations welcome  - rsvp, thanks

Kent Monthei
GlaxoSmithKline
