Re: Resourceutilization & Backup Question

Ben Bullock Wed, 17 Aug 2005 09:05:04 -0700

        Debbie,
        We find that problem with the resourceutilization/collocation
from time to time. Most nights, all the threads from one host are going
to the diskpool, so they all run at the same time with no problems. The
times we end up in the situation you describe are:
        - When the diskpool in front of the collocated tapepool fills up
and starts to migrate. Then we have the migration get the one tape and
the others line up for it because the diskpool is full. A large number
of unusual changes across multiple hosts going to the diskpool in front
of a collocated tapepool is usually what has happened. Making the
diskpool larger could fix the issue if it happens all the time.
        - When "maxsize" is set on the diskpool in front of the
collocated tapepool, and you have a bunch of files that are
larger than the threshold. Then we see all the sessions queue up for the
same tape. One way to get around it is to raise the maxsize attribute,
but then your diskpool may fill (see situation one). 
        You might be able to alleviate the issue by changing the
tapepool to "collocation=filesystem" (COLLOCATE=FILESPACE on the
storage pool), but then the number of partially filled tapes may
skyrocket.
___


        As for the backup taking longer than usual when it runs at
different times of the day, I've seen cases where if an expire inventory
is deleting a bunch of files (many DB updates), the movement of data on
the server will slow down, usually when the backup/migrations running
have many little files to move around (also many db updates). Basically
a bottleneck for I/O to the TSM database.
        In your case, however, I'm guessing that these Oracle backup
files are big, so they shouldn't be impacted by heavy TSM DB activity,
as it only needs to make DB inserts once in a while.

        Depending on whether your Oracle data is heading to disk or
directly to tape, there are a couple of other possibilities.
        - If it was going directly to tape and all the tapes were busy
at the time (doing migrations, reclamations, db backups, etc.) then your
session may have waited for a tape mount for a while. I've found the
best way to hunt those down is in one of the fields in the accounting
log on the TSM server that tracks seconds that sessions are in
mediawait. I believe it is field 24.
        - If the data was headed to disk and the diskpool was migrating
data to tape, then there might have been some I/O contention on the
disks as many reads and writes could be happening on the same diskpool
volumes. 
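
        To make the mediawait check above a little more concrete, here
is a rough Python sketch that scans accounting records for sessions
with a long media wait. It assumes the accounting log is plain
comma-separated text, takes the media wait position from the "I believe
it is field 24" guess above, and assumes the node name is field 6.
Both positions and the function name are illustrative assumptions, so
check them against your server's documentation before trusting the
output.

```python
import csv

# Assumptions (1-based positions in each accounting record):
#   MEDIAWAIT_FIELD comes from the "I believe it is field 24" guess above.
#   NODE_FIELD = 6 is likewise an assumption; verify both for your server level.
MEDIAWAIT_FIELD = 24
NODE_FIELD = 6


def sessions_with_mediawait(lines, min_seconds=60,
                            wait_field=MEDIAWAIT_FIELD,
                            node_field=NODE_FIELD):
    """Return (node, wait_seconds) pairs for sessions whose media wait is
    at least min_seconds, sorted with the longest wait first."""
    hits = []
    for row in csv.reader(lines):
        # Skip records that are too short to contain both fields.
        if len(row) < max(wait_field, node_field):
            continue
        try:
            wait = int(row[wait_field - 1])
        except ValueError:
            continue  # non-numeric value, not a usable record
        if wait >= min_seconds:
            hits.append((row[node_field - 1], wait))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

        Pass it an open file handle for the accounting log (the path is
site-specific) and it returns the nodes that sat waiting on a tape
mount, which you can then line up against the time window of the slow
backup.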

        Those are my meandering thoughts on what may be happening.

Ben
        

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of
Debbie Bassler
Sent: Wednesday, August 17, 2005 8:32 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Resourceutilization & Backup Question

Is there any benefit to having the resourceutilization parameter set to
5 when the backup is collocated? The other sessions go into media wait
status, waiting on a tape that one of the other sessions is using. Does
this cause contention, slowing the server down?

Also, I have another question. I'm not sure anyone else uses this
technique, but I'll throw it out there anyway. We do hot backups on
our Oracle databases via a command script using TSM. We create filelists
for our files.  A hot backup means the database can be updated while the
backup is running. We usually run our backup at 9:00 PM, it backs up
200G of data, and takes approx 5 hours to complete. However, due to some
maintenance we wanted to do  yesterday, we started the backup at 4:30
PM. It had only backed up 130G of data in 5 hours. In an effort to start
the maintenance, I cancelled the backup. However, it was decided a good
backup was needed, so, we put the maintenance off until tonight. I
started the backup, again, last night at 22:35 and it took 5 hr and 25
minutes to complete the 201G backup.

I can't understand why the big time difference. There was no more
activity in TSM during the slow backup than there is when the backup
normally executes.

We have an internal Gig switch in our SP environment. Our server level
is 5.1.1 and the client level is 5.1.1.5.

We're going to bring the database down, at 4:30 this afternoon, and
perform the backup again. I need to ensure it will execute in the
allotted 5 hour time frame. We don't want to have people waiting 5 hours
in hopes of performing maintenance, just to send them home again.

Any ideas or suggestions will be greatly appreciated.....

Thanks for any help,
Debbie
