Larry,

We just got through a similar move from AIX to AIX hosts, although with different disks and tape drives. We had to do this process on 6 TSM servers.
The actual downtime to the TSM clients was only the time it took to run an incremental DB backup on the production server and restore it to the new server - typically less than 1 hour.

This is a straight copy of my procedure, so obviously host names and software locations will be different, but it will show you the flow of the process. Take it and adjust it for your own installation and hardware.

Ben

**************************
TSM Server Migration

These steps can be used to migrate the TSM services to a new server. This process is very much like what would need to be done in a disaster recovery situation. This process was used in the summer of 2006 to migrate the TSM servers from old RS/6000 servers and SSA disks to new P550 servers and EMC Clariion CX3-80 SANs.

Steps to be done on the new servers beforehand.

1. Build the new hosts with AIX 5.3 according to the documentation.
   NOTE: You can do the NIM installation of the OS over a 10GB NIC; it works. At the time of this documentation, BOTSMTEST1 is the NIM server used to install AIX 5.3. This can be found in this document: Server - AIX5.3 NIM installation
2. Look in the /etc/inittab file and take out the 'dsmserv' line that automatically starts the TSM server, if it is there.
3. Configure the /etc/ibmatl.conf file and the 3494 tape library itself to talk to each other. The 'mtlib -l /dev/lmcp0 -q L' command will show you if you can connect to the library. You may need to remove the device and reconfigure it back in to get it to connect (see the sketch at the end of this section).
4. Move over a standard copy of the dsmserv.opt to the new host.
   * mount botsmtestX:/export/post /mnt
   * cp /mnt/dsmserv.opt /usr/tivoli/tsm/server/bin/dsmserv.opt
5. Zone the appropriate HBAs to the tape drives and the remaining HBAs to the Clariion.
6. 'cfgmgr' the tape drives and configure them as they are on the current server: Tape drive configuration steps
7. Create a new volume group with the PowerPath devices for TSM to use. Make sure you choose to make a PowerPath VG. Make the data devices raw (see the sketch at the end of this section).
8. Create 2 LVs: 1 database LV and 1 log volume that is 12G. Make them 'raw' devices as it is simpler. There is no need to mirror the devices as they are protected on the Clariion backend.
9. Initialize the new LVs. This step will create a "virgin" TSM server and wipe out the default TSM configuration. You will need to remove the existing "/usr/tivoli/tsm/server/bin/dsmserv.dsk" file before you format these new devices to TSM.
   Format of the command:
   * dsmserv format #-of-log-devices /dev/r??? ... #-of-db-volumes /dev/r????
   Example command:
   * dsmserv format 1 /dev/rtsmlogvol01 1 /dev/rtsmdbvol01
   See the "TSM Administrator's Reference", near the back of the book, for more information.
10. Create as many storagepool volumes as you will need by putting them in the volume group and creating LVs on the LUNs.
11. Bring up the TSM server in interactive mode to make sure it is able to come up.
    * dsmserv
12. Re-enter the TSM license information. You only need to run this one command:
    * register lic file=/usr/tivoli/tsm/server/bin/tsmee.lic
13. Define the devclasses needed to restore the DB backup.
    NOTE: You need only run the first command to define the "file" device type if you will be using an NFS mount to migrate the database.
    NOTE: If you will be using a tape drive, you will not be able to define the path to the tape drive until you take it offline on the production server, so you may not want to do that until the day of the migration.
    * DEFINE DEVCLASS FILE DEVTYPE=FILE FORMAT=DRIVE MAXCAPACITY=66060288K MOUNTLIMIT=1 DIRECTORY=/mnt
    * DEFINE DEVCLASS 3592DEV DEVTYPE=3592 FORMAT=DRIVE MOUNTLIMIT=DRIVES MOUNTWAIT=60 MOUNTRETENTION=60 PREFIX=ADSM LIBRARY=BOITAPELIBX
    * DEFINE LIBRARY BOITAPELIBX LIBTYPE=349X PRIVATECATEGORY=88 SCRATCHCATEGORY=89 SHARED=NO
    * DEFINE PATH SERVER1 BOITAPELIBX SRCTYPE=SERVER DESTTYPE=LIBRARY DEVICE=/dev/lmcp0
    * DEFINE DRIVE BOITAPELIBX 3592DRV10
    * DEFINE PATH SERVER1 3592DRV10 SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=BOITAPELIBX DEVICE=/dev/rmt9.7aeb.0
    * Bring down the server.
      o tsm> halt
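For reference, the remove/reconfigure mentioned in step 3 looked roughly like this on our hosts. This is just a sketch - it assumes the library manager control point shows up as lmcp0 (yours may differ) and that the 3494 library driver is installed so cfgmgr can rebuild the device from /etc/ibmatl.conf:

   * mtlib -l /dev/lmcp0 -q L    # query the library; if this fails, reconfigure the device
   * rmdev -dl lmcp0             # remove the lmcp device definition
   * cfgmgr                      # rediscover devices and rebuild lmcp0
   * mtlib -l /dev/lmcp0 -q L    # verify the library now responds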
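Likewise, a rough sketch of the volume group and raw LV creation from steps 7, 8, and 10. Build the VG on the hdiskpower devices, not the underlying hdisks; the device names, LV names, PP size, and LP counts below are only examples from our setup and will certainly differ on yours:

   * mkvg -y tsmvg -s 256 hdiskpower0 hdiskpower1   # new VG on the PowerPath devices, 256MB PPs
   * mklv -y tsmdbvol01 -t raw tsmvg 80             # raw database LV (80 x 256MB = 20G)
   * mklv -y tsmlogvol01 -t raw tsmvg 48            # raw log LV (48 x 256MB = 12G)
   * mklv -y tsmstgvol01 -t raw tsmvg 400           # repeat for as many storagepool LVs as you need

The raw character devices then show up as /dev/rtsmdbvol01, /dev/rtsmlogvol01, /dev/rtsmstgvol01, and so on, which is what you feed to 'dsmserv format' in step 9 and to 'define vol' later on.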
Steps to be done on the day of the migration.

NOTE - you can move the database over through a tape mount or an NFS mount; for this procedure we will restore the full from the tape backup and the incremental from the NFS mount. If you choose to make a full backup again, the NFS area must be mounted on /mnt for this process. These are the commands you would use:
* BACKUP DB dev=FILE ty=full scratch=yes - to back up to the NFS mount.
* BACKUP DB dev=3592dev ty=full scratch=yes - to back up to a tape drive.

1. Find the full backup of the production server database that was made to tape today. Look at the e-mail sent for the day or do a "q volhist ty=dbb" command within TSM.
2. Make note of the volume name.
3. Restore the full database backup to the new server but do NOT commit the changes on the restore.
   NOTE: If you are going to use a tape drive, you need to take it offline on the production server to use it. You need to use the same drive name that you configured in step #13 in the Pre-upgrade process. In the command below you will obviously change the 'vol=' option to the tape that has the TSM DB backup on it.
   * dsmserv restore db devclass=3592dev vol=A00032
   OR for an NFS mount:
   * dsmserv restore db devclass=file vol=/mnt/52119599.dbs
4. Your mileage may vary, but in our test restores it ran at about 100GB/hour.
5. While the restore is working on the new server, migrate all the data on disk to the tapes with commands similar to these:
   NOTE: try to keep all the tape drives working to speed data getting to tape.
   * The 'q stg *disk*' command will show you which storage pools to drain to tape.
   * update stg I_DISKPOOL hi=0 low=0 migproc=2
   * update stg DB_DISKPOOL hi=0 low=0 migproc=3
   * etc. etc. for all the diskpools.
6. When the migrations are nearly complete, it's time to get ready for the actual server downtime.
   * Update the event to predict the actual time of the downtime.
   * Disable sessions to the TSM server: 'disable sessions'.
   * Cancel any running client sessions.
   * Cancel any 'expire inventory' processes.
7. When the migrations are complete, you should be able to remove all the disks from the diskpools.
   * q vol dev=disk
   * delete vol /dev/rstorage01
   * delete vol /dev/rstorage02
   * ...
   NOTE: If you get any errors deleting the disks, you should try to drain the data out of the storage pool again with the commands listed in step 5.
8. When all the disk volumes have been deleted, make an incremental backup of the database to the NFS mount:
   * BACKUP DB dev=FILE ty=incr scratch=yes
9. Dismount any tape volumes that are still mounted.
   * q mount
   * dismount vol VOLNAME
10. Halt the production TSM services.
    * halt
11. Tar up the /opt/tsm area on the production server, move it to the new host, and untar it (see the transfer sketch after step 24).
    * tar -cvf opt.tar /opt/tsm/*
    * tar -xvf opt.tar
12. Tar up other critical TSM files and put them on the new server:
    * From the /usr/tivoli/tsm/client/ba/bin directory, you want to copy the dsm.sys, dsm.opt, and inclexcl* files over to the new host.
    * From the /usr/tivoli/tsm/server/bin directory, you want to copy over the vol.hist1 and dev.config1 files.
    * From /, copy over the tsm.hints file and make changes to it as needed.
13. Restore the incremental database backup to the new server and DO commit the changes on the restore.
    * dsmserv restore db devclass=file vol=/mnt/52119599.dbb commit=yes
14. Go into DNS and change it so that the old TSM server addresses are now aliases on the new TSM server, i.e. "botsmX" should have aliases of 'tsmhostX'.
15. Once the restore is complete, you will need to upgrade the DB to the current TSM server version with this command:
    * dsmserv upgradedb
16. At this point the TSM server may complain in the log messages about the date on the server; if so, you will need to run this command to assure the TSM server that it did not participate in any time travel:
    * ACCEPT DATE
17. Make note of any errors you see in the activity log. You hopefully will see none. If you see any, try to resolve them.
18. Define the new SAN storagepool LVs to the new TSM server to replace the ones you deleted before.
    * q stg *disk*
    * define vol DB_DISKPOOL /dev/rstgvol01
    * define vol I_DISKPOOL /dev/rstgvol02
    * define vol A_DISKPOOL /dev/rstgvol03
    * ...
19. Update the thresholds on the disk storagepools so they will be back to normal levels.
    * q stg *disk*
    * update stg DB_DISKPOOL hi=90 low=70
    * update stg I_DISKPOOL hi=90 low=70
    * update stg A_DISKPOOL hi=90 low=70
    * ...
20. If you are relatively sure that the TSM server is ready to go, halt the server, remove the 'DISABLESCHEDS YES' line from the dsmserv.opt file, and restart the TSM server.
    * vi /usr/tivoli/tsm/server/bin/dsmserv.opt
21. Copy over the entries in root's crontab and set them to run on the new server (see the crontab sketch after step 24).
22. Eject the tape you used to restore the full DB backup from the library:
    * mtlib -l /dev/lmcp0 -C -V VOLNAME -tFF10
23. Contact the Patrol group and have them put the server in the TSM group so that the TSM KM will start to monitor the system.
    NOTE: Make sure to tell them the host will inherit the settings of the old production server.
24. Monitor the TSM server over the next few hours to make sure tapes are being mounted and no errors are being reported.
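For step 11, the copy to the new host can be done with whatever transfer tool you have available; as a sketch, assuming the new host answers to 'tsmhostX' and you have scp (ftp or rcp would work just as well, and the /tmp path is only an example):

    * tar -cvf /tmp/opt.tar /opt/tsm/*        # on the production server
    * scp /tmp/opt.tar tsmhostX:/tmp/         # move the tarball to the new host
    * tar -xvf /tmp/opt.tar                   # on the new host
The same approach works for the individual files in step 12 (dsm.sys, dsm.opt, inclexcl*, vol.hist1, dev.config1, tsm.hints).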
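And for step 21, a quick way to carry root's crontab over. Review the file before loading it, since some entries may reference the old host name or paths:

    * crontab -l > /tmp/root.crontab          # on the production server, as root
    * scp /tmp/root.crontab tsmhostX:/tmp/    # copy it over (host name is an example)
    * crontab /tmp/root.crontab               # on the new server, as root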
***************************

Good luck,
Ben

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Larry Peifer
Sent: Wednesday, October 04, 2006 11:01 AM
To: ADSM-L@VM.MARIST.EDU
Subject: TSM system migration planning

TSM system migration

We're in the beginning planning stages for a major transition in our TSM configuration. We've purchased all new hardware (host, tape libraries, and disk sub-system), so I'm looking into the best practices for making the transition with as little disruption to ongoing production operations as possible. Any and all experiences from fellow TSM'ers are appreciated.

Backup window runs from 6pm to 6am. Daily TSM administration jobs (expiration / migration / dbbackups / prepare) run from 6am to 4pm. Data to migrate totals about 25TBytes.

======EXISTING CONFIGURATION
All tapes are in libraries at all times - we have no tapes offline.

TSM Host: One AIX p650 with 32G memory hosting a number of heavily used Oracle databases, the TSM server and disk pools, and all tape libraries connected via SAN fabric.
TSM Server version 5.3.1
2G fibre channel to disks via SAN switch (zoned), used for storage pools and large raw logical volumes for the Oracle databases.
1G fibre channel to 4 tape libraries via SAN switch (zoned)

Onsite tape libraries:
IBM 3583 with 4 SCSI LTO-1 drives and 40 tapes, used only for AIX and Oracle node data.
IBM 3583 with 6 fibre channel LTO-2 drives and 60 LTO-2 tapes, used only for Windows and Lotus node data.

Remote tape libraries:
IBM 3583 with 4 SCSI LTO-1 drives and 40 tapes, used only for AIX and Oracle node data.
IBM 3583 with 6 fibre channel LTO-2 drives and 60 LTO-2 tapes, used only for Windows and Lotus node data.

Once per day each onsite tape stgpool is copied to the remote tape stgpool via a 'backup stgpool' process. Each library is a single stgpool.

Clients:
95 Windows NT and 2000 servers with TSM 5.0, 5.02, 5.03; data access via 100Mb Ethernet and some Gb Ethernet
10 TDP for Lotus Notes via Gb Ethernet
12 IBM AIX 5.3 MR4 with TSM 5.3.x via 1Gb fibre channel
1 IBM AIX 4.3.3.x server with TSM 4.3.x via 100Mb Ethernet
15 user-managed Oracle database backups via 1Gb fibre channel (not using TDP for Oracle nor RMAN)

======NEW CONFIGURATION
TSM Host: AIX P520 with 8G memory, used only for the TSM backup server and disk pools, with all tape libraries connected via a new dedicated SAN fabric.
SAN disk available for storage pools is 4.3T
TSM Server version: 5.3, most recent
4G Fibre Channel to storage pool disks via SAN switch (zoned)
4G Fibre Channel to 2 tape libraries via SAN switch (zoned)

Onsite tape library: IBM 3584 with 10 Fibre Channel LTO-2 drives and 140 tapes
Offsite tape library: IBM 3584 with 10 Fibre Channel LTO-2 drives and 140 tapes

Clients all stay the same, with two major exceptions:
12 IBM AIX 5.3 MR4 nodes with TSM 5.3.x via Gb Ethernet rather than 1Gb Fibre Channel
15 user-managed Oracle database backups via Gb Ethernet rather than 1Gb Fibre Channel (not using TDP for Oracle nor RMAN)

Turn on collocation? Since we want to maintain tape separation for AIX/Oracle and Windows/Lotus data, it seems like at least 2 collocation groups are needed for the primary sequential stgpool.

===============QUESTIONS
What is the easiest, fastest, least disruptive method to move data from the 2 onsite tape libraries to the 1 new onsite library? One goal is to NOT disrupt the nightly backup window. Deferring the administrative window would be OK. The migration process could be done during one 8-hour work window, or it could be done over any number of days as long as daily backup and recovery are still available.

It would be possible to connect one new 3584 with a few of its tape drives active, without salvaging NICs and GBICs from the existing configuration, and thereby have both the old system and part of the new system active in parallel. Moving all onsite LTO-1 and LTO-2 tapes to the new 3584 library is another option, and then aging out the LTO-1s via 'move data', perhaps. Any methods and trade-offs for this would be appreciated.

And then there is the issue of getting our data from the primary sequential storage pools to the copy storage pools in the offsite library. I figure to just use 'backup storage pool' to make that happen. The 2 libraries are connected via two 4Gb Fibre Channel SAN switches. The question is how will collocation on the primary affect this, and will using many processes help or hurt future recovery processing from the copy pools?

Thanks for your time,

Larry Peifer
AIX / Oracle / TSM System Administrator
San Clemente, California