Re: Dwindling Performance
I have posted many times in the past saying you should never do a database unload/reload to gain performance. But this just might be the one case where it might make sense - the remaining half of a split server. But before you do something that drastic, dangerous, and time consuming, look for the things that are easier to fix.

My basic metric of whether or not you are in trouble is, how long does expiration take? If you start it daily, the closer it is to 24 hours running time, the closer you are to doomsday. Never-ending expiration is the classic symptom of TSM Server Meltdown.

But on the other hand, if your expiration runs nice and fast, your server and its database are probably OK. Look to clients as the problem. They can't all squeeze in the door at once, so don't let them try. If they use the client-polling scheduler, how long is the backup window, and what is your setting for Schedule Randomization Percentage? Make it as high as possible - SET RANDOMIZE 50. This will also help if you are having any kind of a network bottleneck.

Look at these clients on a micro level. About how much are they each actually backing up? If it's not much, then your theory might be right, that they are very busy downloading their lists of backed up files. In that case, load spreading will be the best thing you could do. You might consider a schedule where not every client does a full "incremental" every night - perhaps they only do one every other night, and on the other nights they do an "incrbydate" backup, which is much faster because it goes only by the timestamps in the file system.

Not to ask the obvious, but what's your Database Cache Hit Percentage? (Q DB F=D) If it's below 99%, it needs help. Even (especially) a badly fragmented database will run a lot faster if you have it swimming in cache.

Look at other differences between your two instances - are they basically different types of clients?
Roger Deschner      University of Illinois at Chicago      [EMAIL PROTECTED]
The short fortuneteller who escaped from prison was a small medium at large.

On Tue, 13 Jan 2004, Andy Carlson wrote:

>We are having terrible performance with one of our instances of TSM. I
>have suspicions, but I want to hear what you guys say. Here is what we
>have:
>
>2 instances of TSM - TSMI and TSMU (TSMI is the problem)
>
>TSM 5.2.1.1
>AIX 5.1 ML4
>RS/6000 P670 - 8 processors, 16GB memory
>FAStT700 SAN
>STK 9840 tape drives
>
>The database is 85% of 88GB (with room to expand another 50GB or so).
>
>Right at this moment, we have 233 sessions with TSMI. The backup
>sessions grind to a halt for hours at a time, with nothing apparently
>happening. I suspect that the directory trees are being downloaded and
>built, but not sure.
>
>When we split TSMI and TSMU, we created the TSMU instance, and did a
>full backup on all the servers that moved there. The TSMI database is a
>restored copy of the original database, with the TSMU stuff deleted out.
>
>Any ideas would be greatly appreciated.
>
>Andy Carlson                |\      _,,,---,,_
>Senior Technical Specialist   ZZZzz /,`.-'`'    -.  ;-;;,_
>BJC Health Care              |,4-  ) )-,_. ,\ (  `'-'
>St. Louis, Missouri         '---''(_/--'  `-'\_)
>Cat Pics: http://andyc.dyndns.org/animal.html
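Roger's SET RANDOMIZE point is easy to picture with a toy sketch (Python, not TSM). The semantics assumed here - each polling client starting at a random offset within the first N% of the schedule's startup window - are my paraphrase, so check them against the Administrator's Reference for your server level:

```python
import random

# Sketch only (not TSM code): what SET RANDOMIZE does for client-polling
# schedulers. ASSUMPTION: RANDOMIZE n lets each client begin at a random
# offset within the first n% of the schedule's startup window.
def randomized_start(window_minutes, randomize_pct, rng=random):
    """Pick one client's start offset, in minutes into the window."""
    return rng.uniform(0, window_minutes * randomize_pct / 100.0)

# 200 clients, a 4-hour window, and SET RANDOMIZE 50: arrivals spread over
# the first two hours instead of everyone pounding the door at minute zero.
offsets = sorted(randomized_start(240, 50) for _ in range(200))
```

With RANDOMIZE 0 every client polls at the window open; anything larger trades a later worst-case start for far fewer simultaneous session setups.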
TDP for SQL backup fails
Hi,

I'm running 5 backups using TDP for SQL on Win2000 SP3 servers. One of them fails with errors that don't give me any clues as to what goes wrong. I've got TSM Server at 5.1.5 on Windows and TDP for SQL 5.1.5. The error message ANS1017E is unknown to my TSM. ACO5436E suggests to look for the next message for clues. And rc=418 or rc=402 cannot be found in the return code section of the Messages manual.

Please help if you can,

Thanks
Yiannakis

Here's what I get from the error logs:

01/12/2004 20:32:14 Log file pruned using log retention period of 60 day(s)
01/12/2004 20:32:14 No log entries pruned
01/12/2004 20:32:18 Request                     : FULL BACKUP
01/12/2004 20:32:18 Database Input List         : BOCDSS
01/12/2004 20:32:18 Group Input List            : -
01/12/2004 20:32:18 File Input List             : -
01/12/2004 20:32:18 Number of Buffers           : 5
01/12/2004 20:32:18 Buffer Size                 : 1024
01/12/2004 20:32:18 Number of SQL Buffers       : 0
01/12/2004 20:32:18 SQL Buffer Size             : 1024
01/12/2004 20:32:18 Number of Stripes specified : 2
01/12/2004 20:32:18 Estimate                    : -
01/12/2004 20:32:18 Truncate Log?               : -
01/12/2004 20:32:18 Wait for Tape Mounts?       : Yes
01/12/2004 20:32:18 TSM Options File            : C:\Progra~1\Tivoli\TSM\TDPSql\dsm.opt
01/12/2004 20:32:18 TSM Nodename Override       : -
01/12/2004 20:32:18 Sqlserver                   : DWDSS
01/12/2004 23:48:54 ACO5436E A failure occurred on stripe number (1), rc = 418
01/12/2004 23:48:54 ANS1017E (RC-50) Session rejected: TCP/IP connection failure
01/12/2004 23:48:55 Backup of BOCDSS failed.
01/12/2004 23:48:55 ANS1017E (RC-50) Session rejected: TCP/IP connection failure
01/12/2004 23:48:55 Total SQL backups selected:    1
01/12/2004 23:48:55 Total SQL backups attempted:   1
01/12/2004 23:48:55 Total SQL backups completed:   0
01/12/2004 23:48:55 Total SQL backups excluded:    0
01/12/2004 23:48:55 Total SQL backups inactivated: 0
01/12/2004 23:48:55 Throughput rate:               2,808.32 Kb/Sec
01/12/2004 23:48:55 Total bytes transferred:       33,917,686,272
01/12/2004 23:48:55 Elapsed processing time:       11,794.51 Secs
01/12/2004 23:48:55 ACO0151E Errors occurred while processing the request.
01/12/2004 23:48:56 ACO0151E Errors occurred while processing the request.

dsierror.log:

01/08/2004 23:48:07 cuConfirm: Received rc: -50 trying to receive ConfirmResp verb
01/08/2004 23:48:07 sessSendVerb: Error sending Verb, rc: -50
01/08/2004 23:48:07 sessSendVerb: Error sending Verb, rc: -50
01/12/2004 23:48:53 cuConfirm: Received rc: -50 trying to receive ConfirmResp verb
01/12/2004 23:48:53 sessSendVerb: Error sending Verb, rc: -50
01/12/2004 23:48:53 sessSendVerb: Error sending Verb, rc: -50

dsmerror.log:

01/08/2004 23:48:09 ANS1909E The scheduled command failed.
01/08/2004 23:48:09 ANS1512E Scheduled event 'DWDSS_BOCDSS_FULL' failed.  Return code = 402.
01/12/2004 23:48:56 ANS1909E The scheduled command failed.
01/12/2004 23:48:56 ANS1512E Scheduled event 'DWDSS_BOCDSS_FULL' failed.  Return code = 402.
Re: Automated / manual DRM actions
Maybe this is what you're looking for. Just one question: was the customer perhaps also not backing up the tape pool to the copy pool? That could be the answer, because no updates (changes) were being made to the copy pool. In that case you cannot restore the TSM environment with the latest database backup volume; you need the database backup (in the DRM plan file) that was made after the latest copy pool backup, otherwise your DRM plan is also useless.

Also the following database backup expiration condition could be interesting: "For volumes other than virtual volumes, all volumes in the series are in the VAULT state."

I hope this helps.

From the help for the database backup expiration parameter:

This value should match the value of the Delay Period for Volume Reuse field for the copy storage pool. This ensures that the database can be restored to an earlier level and that database references to files in the storage pool are valid. A database backup volume is considered eligible for expiration if all of the following conditions exist:
- The age of the last volume of the series is greater than the expiration value specified in this operation.
- For volumes other than virtual volumes, all volumes in the series are in the VAULT state.
- The volume is not part of the most recent series. Disaster recovery manager does not expire the most recent database backup series.

From the TSM Admin manual:

Moving Backup Volumes Onsite

Use the following procedure to expire the non-virtual database backup volumes and return the volumes onsite for reuse or disposal. To specify the number of days before a database backup series is expired, issue the SET DRMDBBACKUPEXPIREDAYS command. To ensure that the database can be returned to an earlier level and database references to files in the copy storage pool are still valid, specify the same value for the REUSEDELAY parameter in your copy storage pool definition. The following example sets the number of days to 30:
set drmdbbackupexpiredays 30

A database backup volume is considered eligible for expiration if all of the following conditions are true:
- The age of the last volume of the series has exceeded the expiration value. This value is the number of days since the last backup in the series. At installation, the expiration value is 60 days. To override this value, issue the SET DRMDBBACKUPEXPIREDAYS command.
- For volumes that are not virtual volumes, all volumes in the series are in the VAULT state.
- The volume is not part of the most recent database backup series.

Note: Database backup volumes that are virtual volumes are removed during expiration processing. This processing is started manually by issuing the EXPIRE INVENTORY command or automatically through the EXPINTERVAL option setting specified in the server options file.

Move a backup volume onsite for reuse or disposal when the volume is reclaimed and:
- The status of a copy storage pool volume is EMPTY.
- The database backup series is EXPIRED.

To determine which volumes to retrieve, issue the following command:

  query drmedia * wherestate=vaultretrieve

After the vault location acknowledges that the volumes have been given to the courier, issue the following command:

  move drmedia * wherestate=vaultretrieve

The server does the following for all volumes in the VAULTRETRIEVE state:
- Changes the volume state to COURIERRETRIEVE.
- Updates the location of the volume according to what is specified in the SET DRMCOURIERNAME command. For more information, see Specifying Defaults for Offsite Recovery Media Management.

When the courier delivers the volumes, acknowledge that the courier has returned the volumes onsite by issuing:

  move drmedia * wherestate=courierretrieve

The server does the following for all volumes in the COURIERRETRIEVE state:
- The volumes are now onsite and can be reused or disposed of.
- The database backup volumes are deleted from the volume history table.
- For scratch copy storage pool volumes, the record in the database is deleted.
- For private copy storage pool volumes, the access is updated to read/write.

If you do not want to step through all the states, you can use the TOSTATE parameter on the MOVE DRMEDIA command to specify the destination state. For example, to transition the volumes from the VAULTRETRIEVE state to the ONSITERETRIEVE state, issue the following command:

  move drmedia * wherestate=vaultretrieve tostate=onsiteretrieve

The server does the following for all volumes in the VAULTRETRIEVE state:
- The volumes are now onsite and can be reused or disposed of.
- The database backup volumes are deleted from the volume history table.
- For scratch copy storage pool volumes, the record in the database is deleted.
- For private copy storage pool volumes, the access is updated to read/write.

Best Regards,
Met Vriendelijke Groet,
Mit freundlichen Gruessen,

Arnold van Wijnbergen

-----Original Message-----
From: Alex den Hartog [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, January 13, 2004 16:13
To: [EMAIL PROTECTED]
Subject: Automated / manual DRM actions

Dear
Errors in Windows client getting more and more vague
Hi *SM-ers!

I have several Windows clients (a mix of NT, 2K and 2003) which produce vague errors every now and then. I see errors like "ANS1301E Server detected system error", "Sending of object 'C:' failed" and "ANS1931E An error saving one or more eventlogs." Just one line every time, no further explanation given... These messages are seen randomly on most 5.1.6 Windows clients, but the same machine runs fine the next day, so tracing isn't an option. I'm a little bit stuck here...

Kindest regards,

Eric van Loon
KLM Royal Dutch Airlines
Recall: Inconsistency between summary and actlog
Robert Ouzen would like to recall the message, "Inconsistency between summary and actlog".
Inconsistency between summary and actlog
Hello to everyone,

I run a script every day to collect the amount and time of the clients' backups, as follows:

SELECT ENTITY AS NODE, -
CAST(SUM(BYTES/1024/1024) AS DECIMAL(8,2)) AS "MB", -
substr(cast(min(start_time) as char(26)),1,19) as "DATETIME", -
cast(substr(cast(max(end_time)-min(start_time) as char(20)),3,8) as char(8)) as "Length" -
FROM SUMMARY -
WHERE ACTIVITY IN ('BACKUP','ARCHIVE') -
and current date - 1 days = date(start_time) and time(start_time)>'18:00:00' -
GROUP BY ENTITY -
ORDER BY MB

SELECT ENTITY AS NODE, -
CAST(SUM(BYTES/1024/1024) AS DECIMAL(8,2)) AS "MB", -
substr(cast(min(start_time) as char(26)),1,19) as "DATETIME", -
cast(substr(cast(max(end_time)-min(start_time) as char(20)),3,8) as char(8)) as "Length" -
FROM SUMMARY -
WHERE ACTIVITY IN ('BACKUP','ARCHIVE') -
and current date = date(start_time) and time(end_time)<'08:00:00' -
GROUP BY ENTITY -
ORDER BY MB

The output looks like:

NODE         MB        DATETIME             Length
-----------  --------  -------------------  --------
HIGHLEARN2K  2766.09   2004-01-13 22:30:06  00:10:30
EXCHANGE1    57807.11  2004-01-13 22:07:16  01:38:54

NODE         MB        DATETIME             Length
-----------  --------  -------------------  --------
WEB2000_DB   0.13      2004-01-14 03:30:07  00:00:23
EXCHANGE2    23187.05  2004-01-14 01:08:48  00:38:03

This script gives almost all my clients, but a few are missing, even though a query actlog shows the missing client in its output:

01/13/2004 23:04:02 ANE4952I (Session: 23, Node: IBROWSE2) Total number of objects inspected:  17,827
01/13/2004 23:04:02 ANE4954I (Session: 23, Node: IBROWSE2) Total number of objects backed up:      67
01/13/2004 23:04:02 ANE4958I (Session: 23, Node: IBROWSE2) Total number of objects updated:         0
01/13/2004 23:04:02 ANE4960I (Session: 23, Node: IBROWSE2) Total number of objects rebound:         0
01/13/2004 23:04:02 ANE4957I (Session: 23, Node: IBROWSE2) Total number of objects deleted:         0
01/13/2004 23:04:02 ANE4970I (Session: 23, Node: IBROWSE2) Total number of objects expired:         1
01/13/2004 23:04:02 ANE4959I (Session: 23, Node: IBROWSE2) Total number of objects failed:          0
01/13/2004 23:04:02 ANE4961I (Session: 23, Node: IBROWSE2) Total number of bytes transferred:  1.07 GB

The node IBROWSE2 is not shown in my summary query. Why is there a difference between the summary table and the actlog? Can I run something else to get a correct result?

TSM Server version 5.1.8.0 on Windows 2000, TSM client version 5.2.2.0.

Regards,
Robert Ouzen
E-mail: [EMAIL PROTECTED]
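Independent of the summary table's general reliability, one mechanical thing worth checking in the script itself: the two WHERE clauses leave a gap. A session that starts before 18:00 the previous evening is excluded by the first query (start time too early) and by the second (start date is not today). The sqlite3 sketch below uses invented rows - the real IBROWSE2 start time isn't in the posting - to show a session falling through both filters:

```python
import sqlite3

# Invented rows, mimicking the two summary queries above. A session that
# starts at 17:55 the previous evening matches neither WHERE clause.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE summary (entity TEXT, start_time TEXT, end_time TEXT)")
con.executemany("INSERT INTO summary VALUES (?,?,?)", [
    ("EXCHANGE1",  "2004-01-13 22:07:16", "2004-01-13 23:46:10"),  # query 1 catches it
    ("WEB2000_DB", "2004-01-14 03:30:07", "2004-01-14 03:30:30"),  # query 2 catches it
    ("IBROWSE2",   "2004-01-13 17:55:00", "2004-01-13 23:04:02"),  # neither catches it
])

today = "2004-01-14"
q1 = con.execute(
    "SELECT entity FROM summary "
    "WHERE date(start_time) = date(?, '-1 day') AND time(start_time) > '18:00:00'",
    (today,)).fetchall()
q2 = con.execute(
    "SELECT entity FROM summary "
    "WHERE date(start_time) = ? AND time(end_time) < '08:00:00'",
    (today,)).fetchall()

caught = {e for (e,) in q1 + q2}   # IBROWSE2 falls through both filters
```

This may or may not be what happened to IBROWSE2, but a single filter on a full timestamp range (start_time between yesterday 18:00 and today 08:00) would close the gap either way.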
Re: Inconsistency between summary and actlog
Robert - As you go through your professional life administering TSM, keep one thing in mind: the Summary table is unreliable. It has historically been unreliable. As testament, see the many postings from customers who have wasted their time trying to get reliable data from it. As we keep saying: Use the TSM accounting records as your principal source of data for usage statistics. That's what they are there for, and they reliably contain solid, basic data about sessions. Beyond that, use the Activity Log, the Events table, and client backup logs. Richard Sims, BU
Re: Inconsistency between summary and actlog
Richard,

Thanks for the advice. Do you know where I can find the format/structure of the accounting records (dsmaccnt.log)?

Regards, Robert

-----Original Message-----
From: Richard Sims [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 14, 2004 2:19 PM
To: [EMAIL PROTECTED]
Subject: Re: Inconsistency between summary and actlog

Robert - As you go through your professional life administering TSM, keep one thing in mind: the Summary table is unreliable. It has historically been unreliable. As testament, see the many postings from customers who have wasted their time trying to get reliable data from it.

As we keep saying: Use the TSM accounting records as your principal source of data for usage statistics. That's what they are there for, and they reliably contain solid, basic data about sessions. Beyond that, use the Activity Log, the Events table, and client backup logs.

   Richard Sims, BU
Re: TDP for SQL backup fails
>I'm running 5 backups using TDP for SQL on Win2000 SP3 servers. One of them
>fails with errors that don't give me any clues as to what goes wrong.
>I've got TSM Server at 5.1.5 on Windows and TDP for SQL 5.1.5.
>The error message ANS1017E is unknown to my TSM. ACO5436E suggests to look
>for the next message for clues. And rc=418 or rc=402 cannot be found in the
>return code section of the messages manual.
...
>01/12/2004 20:32:18 Sqlserver : DWDSS
>01/12/2004 23:48:54 ACO5436E A failure occurred on stripe number (1), rc = 418

Yiannakis - Indeed, all the return codes *should* be in the Messages manual: by all means use IBM feedback procedures to have the publications people get them in there.

Always keep in mind that the TDPs are based upon the TSM API; hence, you can always fall back to the API manual for return code information...up to a point. That manual has the 418 (but not 402):

  DSM_RC_OPT_CLIENT_DOES_NOT_WANT  418  /* Client doesn't want this value */
                                        /* from the server                */

I'd recommend looking in the server Activity Log for further reasons. Note the long period between the 20:32 session initiation time and the 23:48 failure message. This *might* reflect the server abandoning the client session because it exceeded timeout values - which may reflect a problem with client processing (delays from disk reliability or being pushed aside by other processes on the client) or the server timeout values being too low.

   Richard Sims, BU
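For what it's worth, the session was moving data right up to the failure: the log's own figures are self-consistent (plain arithmetic, nothing TSM-specific):

```python
# Check the TDP log's throughput figure against its byte count and elapsed time.
bytes_sent = 33_917_686_272      # "Total bytes transferred"
elapsed_s  = 11_794.51           # "Elapsed processing time" (~3h16m, 20:32 to 23:48)
kb_per_sec = bytes_sent / elapsed_s / 1024
# kb_per_sec comes out at ~2808.32, agreeing with the logged
# "Throughput rate: 2,808.32 Kb/Sec"
```

So roughly 32GB went across at a steady rate before the TCP/IP connection dropped, which fits the timeout/abandoned-session theory better than a client that was stalled from the start.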
Re: Inconsistency between summary and actlog
OK Richard, then why did the propeller heads at IBM decide to use the summary table for collecting stats for the Operational Reporter?

I'm now sitting with ten nodes that don't get reported on, and the number is growing.

Leigh.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED]] On Behalf Of Richard Sims
Sent: 14 January 2004 14:19
To: [EMAIL PROTECTED]
Subject: Re: Inconsistency between summary and actlog

Robert - As you go through your professional life administering TSM, keep one thing in mind: the Summary table is unreliable. It has historically been unreliable. As testament, see the many postings from customers who have wasted their time trying to get reliable data from it.

As we keep saying: Use the TSM accounting records as your principal source of data for usage statistics. That's what they are there for, and they reliably contain solid, basic data about sessions. Beyond that, use the Activity Log, the Events table, and client backup logs.

   Richard Sims, BU
Re: Inconsistency between summary and actlog
>Richard
>
>Thanks for the advice. Do you know where I can find the
>format/structure of the accounting records (dsmaccnt.log)

Sure, Robert: in the Admin Guide manual. See also "ACCOUNTING RECORD FORMAT" toward the bottom of http://people.bu.edu/rbs/ADSM.QuickFacts for further insight on field contents.

Also on my http://people.bu.edu/rbs you can find a "professional level" Perl program which produces a report from the dsmaccnt.log file. I'm embarrassed to say that the program still needs updating from its ADSMv3 level, but it can certainly form the foundation for tailoring to your site's needs. There are plenty of internal comments to make things apparent.

   Richard Sims, BU
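Not to preempt Richard's Perl program, but since the accounting records are plain comma-separated lines, even a few lines of script will total them per node. The field positions below (node name at zero-based index 7, bytes sent at index 14) are placeholders of my own, not the documented layout - check them against the accounting record table in the Admin Guide for your server level before trusting any numbers:

```python
import csv
from collections import defaultdict

# ASSUMED field positions (zero-based) -- verify against the accounting
# record layout in the Admin Guide; they vary by server level.
NODE_FIELD = 7
BYTES_SENT_FIELD = 14

def totals_by_node(path):
    """Sum the assumed bytes-sent field per node across a dsmaccnt.log file."""
    totals = defaultdict(int)
    with open(path) as f:
        for rec in csv.reader(f):
            # Skip short/garbled records rather than crash on them.
            if len(rec) > max(NODE_FIELD, BYTES_SENT_FIELD):
                totals[rec[NODE_FIELD].strip()] += int(rec[BYTES_SENT_FIELD])
    return dict(totals)
```

Once the indices are corrected for your level, the same loop extends naturally to session durations or any other accounting field.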
Re: Inconsistency between summary and actlog
>OK Richard, then why did the propeller heads at IBM decide to use the >summary table for collecting stats for the Operational Reporter? > >I'm now sitting with ten nodes that don't get reported on, and the number is >growing. > >Leigh. Check for there being only one blade on the propeller. :-) Sometimes, development efforts occur outside the main development area, and those developers take things at face value as a premise for their development. I recommend the Dilbert day calendar as causal reference for such things. Richard
Re: Inconsistency between summary and actlog
Hi,

Also please note there is a bug in the accounting records up to version 5.1: as soon as you reach session number 64K, accounting becomes erroneous and the TSM server has to be restarted.

René LAMBELET
NESTEC SA
GLOBE - Global Business Excellence
Central Support Center SD/ESN
Av. Nestlé 55, CH-1800 Vevey (Switzerland)
tél +41 (0)21 924 35 43  fax +41 (0)21 924 13 69
local K4-104
mailto:[EMAIL PROTECTED]

-----Original Message-----
From: Richard Sims [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 14 January 2004 13:39
To: [EMAIL PROTECTED]
Subject: Re: Inconsistency between summary and actlog

>Richard
>
>Thanks for the advice. Do you know where I can find the
>format/structure of the accounting records (dsmaccnt.log)

Sure, Robert: in the Admin Guide manual. See also "ACCOUNTING RECORD FORMAT" toward the bottom of http://people.bu.edu/rbs/ADSM.QuickFacts for further insight on field contents.

Also on my http://people.bu.edu/rbs you can find a "professional level" Perl program which produces a report from the dsmaccnt.log file. I'm embarrassed to say that the program still needs updating from its ADSMv3 level, but it can certainly form the foundation for tailoring to your site's needs. There are plenty of internal comments to make things apparent.

   Richard Sims, BU
Re: Inconsistency between summary and actlog
Robert,

As Richard suggested, the many postings to ADSM-L regarding the summary table's contents (or lack thereof) are very informative. Among other things, the location of the Accounting records is detailed. Repeatedly. (The format is recorded in the Admin Guide.)

Before posting a question to ADSM-L, search the message archives on adsm.org to see if the subject that's vexing you has been discussed before. It's an invaluable resource, and it can save you considerable time in resolving whatever issue you are facing. Richard's ADSM QuickFacts web page (http://people.bu.edu/rbs/ADSM.QuickFacts) is another invaluable resource for the TSM administrator, whether novice or experienced.

Ted

At 02:22 PM 1/14/2004 +0200, you wrote:

Richard

Thanks for the advice. Do you know where I can find the format/structure of the accounting records (dsmaccnt.log)

Regards Robert

-----Original Message-----
From: Richard Sims [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 14, 2004 2:19 PM
To: [EMAIL PROTECTED]
Subject: Re: Inconsistency between summary and actlog

Robert - As you go through your professional life administering TSM, keep one thing in mind: the Summary table is unreliable. It has historically been unreliable. As testament, see the many postings from customers who have wasted their time trying to get reliable data from it.

As we keep saying: Use the TSM accounting records as your principal source of data for usage statistics. That's what they are there for, and they reliably contain solid, basic data about sessions. Beyond that, use the Activity Log, the Events table, and client backup logs.

   Richard Sims, BU
Re: TDP for SQL backup fails
Yiannakis,

RC=418 means there was a TSM API error. The DSIERROR.LOG file should have something that explains what the TSM API thought the problem was. Please look there. Also, look in the TSM Server activity log for a message that might tell you what went wrong.

RC=402 is the general error that means errors occurred while processing the command.

Thanks,

Del

> >I'm running 5 backups using TDP for SQL on Win2000 SP3 servers. One of them
> >fails with errors that don't give me any clues as to what goes wrong.
> >I've got TSM Server at 5.1.5 on Windows and TDP for SQL 5.1.5.
> >The error message ANS1017E is unknown to my TSM. ACO5436E suggests to look
> >for the next message for clues. And rc=418 or rc=402 cannot be found in the
> >return code section of the messages manual.
> ...
> >01/12/2004 20:32:18 Sqlserver : DWDSS
> >01/12/2004 23:48:54 ACO5436E A failure occurred on stripe number (1), rc = 418
>
> Yiannakis - Indeed, all the return codes *should* be in the Messages manual:
> by all means use IBM feedback procedures to have the publications
> people get them in there.
>
> Always keep in mind that the TDPs are based upon the TSM API; hence, you can
> always fall back to the API manual for return code information...up to a point.
> That manual has the 418 (but not 402):
>   DSM_RC_OPT_CLIENT_DOES_NOT_WANT  418  /* Client doesn't want this value */
>                                         /* from the server                */
>
> I'd recommend looking in the server Activity Log for further reasons.
> Note the long period between the 20:32 session initiation time and the
> 23:48 failure message. This *might* reflect the server abandoning the
> client session because it exceeded timeout values - which may reflect a
> problem with client processing (delays from disk reliability or being
> pushed aside by other processes on the client) or the server timeout
> values being too low.
Re: Inconsistency between summary and actlog
Ted,

By the way, I almost always search for subjects dealing with the same question, but sometimes the answers are very old and not absolutely clear. So sorry if I ask again.

Regards, Robert

-----Original Message-----
From: Ted Byrne [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 14, 2004 3:02 PM
To: [EMAIL PROTECTED]
Subject: Re: Inconsistency between summary and actlog

Robert,

As Richard suggested, the many postings to ADSM-L regarding the summary table's contents (or lack thereof) are very informative. Among other things, the location of the Accounting records is detailed. Repeatedly. (The format is recorded in the Admin Guide.)

Before posting a question to ADSM-L, search the message archives on adsm.org to see if the subject that's vexing you has been discussed before. It's an invaluable resource, and it can save you considerable time in resolving whatever issue you are facing. Richard's ADSM QuickFacts web page (http://people.bu.edu/rbs/ADSM.QuickFacts) is another invaluable resource for the TSM administrator, whether novice or experienced.

Ted

At 02:22 PM 1/14/2004 +0200, you wrote:
>Richard
>
>Thanks for the advice. Do you know where I can find the
>format/structure of the accounting records (dsmaccnt.log)
>
>Regards Robert
>
>-----Original Message-----
>From: Richard Sims [mailto:[EMAIL PROTECTED]]
>Sent: Wednesday, January 14, 2004 2:19 PM
>To: [EMAIL PROTECTED]
>Subject: Re: Inconsistency between summary and actlog
>
>Robert - As you go through your professional life administering TSM, keep
>one thing in mind: the Summary table is unreliable. It has historically
>been unreliable. As testament, see the many postings from customers who
>have wasted their time trying to get reliable data from it.
>
>As we keep saying: Use the TSM accounting records as your principal source
>of data for usage statistics. That's what they are there for, and they
>reliably contain solid, basic data about sessions. Beyond that, use the
>Activity Log, the Events table, and client backup logs.
>
>   Richard Sims, BU
Re: Dwindling Performance
Andy - what kind of clients do you have? I've had problems with Windows clients all trying to do systemobject backups simultaneously, and had to reorganize my nightly work to split out the systemobject backups from the regular files and then serialize them, because of contention problems with the TSM database. It's supposed to be better in 5.2, but I'm not there yet.

Also, you may be right about the directory trees; if your stalls seem to happen at the beginning of the backups, you can look at the CPU being used by the TSM BA client process and see if it's churning away. If so, that may be what's happening.

Andy Carlson <[EMAIL PROTECTED]> wrote:

We are having terrible performance with one of our instances of TSM. I have suspicions, but I want to hear what you guys say. Here is what we have:

2 instances of TSM - TSMI and TSMU (TSMI is the problem)

TSM 5.2.1.1
AIX 5.1 ML4
RS/6000 P670 - 8 processors, 16GB memory
FAStT700 SAN
STK 9840 tape drives

The database is 85% of 88GB (with room to expand another 50GB or so).

Right at this moment, we have 233 sessions with TSMI. The backup sessions grind to a halt for hours at a time, with nothing apparently happening. I suspect that the directory trees are being downloaded and built, but not sure.

When we split TSMI and TSMU, we created the TSMU instance, and did a full backup on all the servers that moved there. The TSMI database is a restored copy of the original database, with the TSMU stuff deleted out.

Any ideas would be greatly appreciated.

Andy Carlson                |\      _,,,---,,_
Senior Technical Specialist   ZZZzz /,`.-'`'    -.  ;-;;,_
BJC Health Care              |,4-  ) )-,_. ,\ (  `'-'
St. Louis, Missouri         '---''(_/--'  `-'\_)
Cat Pics: http://andyc.dyndns.org/animal.html

Joe Howell
Shelter Insurance Companies
Columbia, MO
Re: Dwindling Performance
>We are having terrible performance with one of our instances of TSM. ... Andy - We'd like to help, but would need a lot more information about the context of the issue, and in particular what you've already investigated. I'd refer you first to the TSM Performance Tuning Guide at http://publib.boulder.ibm.com/tividd/td/IBMStorageManagerMessages5.2.2.html I also have performance issue summaries in ADSM QuickFacts, compiled from our community experiences over the years. Richard Sims, BU
Re: Dwindling Performance
Thanks for the quick response. Expiration is not finishing. Before the main backups start, it maybe expires 20 objects, but during the backup window it slows to a crawl. It picks up some during the day when the backups and migrations are running, but since we now have some 100 sessions not finished, it's slow then too. I didn't look at randomize, but these sessions are staying out there for hours; I will take a look at that today. I currently have them doing an incrbydate every other day, and a full incremental the other. The cache hit ratio of the database is about 98.5%, but we have about 3.5GB of memory in the cache. I don't think I can go much higher, but I will try it if I can.

P.S. The TSMI clients are Windows and NetWare; the TSMU clients are Unix and a couple of VMS. Thanks for the input.

Andy Carlson
Senior Technical Specialist
BJC Health Care
St. Louis, Missouri
Cat Pics: http://andyc.dyndns.org/animal.html

On Wed, 14 Jan 2004, Roger Deschner wrote:
> [snip - message quoted in full above]
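[For readers following along: the two knobs Roger mentions are server-side settings. A minimal sketch of checking and adjusting them on a TSM 5.x server - the buffer pool value below is an illustrative example, not a recommendation:]

From an administrative client (dsmadmc):

q db f=d                  (look at the "Cache Hit Pct." field)
set randomize 50          (spread client-polling start times across 50% of the window)

In dsmserv.opt (takes effect after a server restart); the value is in KB, so 2097152 would be a 2GB database buffer pool:

* dsmserv.opt - illustrative value, size to your server's real memory
BUFPOOLSIZE 2097152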
Re: Dwindling Performance
I'll supply as much information as anyone wants, if they have any clues about what could be going on. I will take a look at the current performance book, but I have looked at it in the past.

Andy Carlson
Senior Technical Specialist
BJC Health Care
St. Louis, Missouri
Cat Pics: http://andyc.dyndns.org/animal.html

On Wed, 14 Jan 2004, Richard Sims wrote:
> [snip - message quoted in full above]
setting max capacity for storage pools
I have a disk storage pool for which I had previously never coded a maximum size threshold. I changed the disk storage pool to a max of 100MB, and the server started backing up. It was running OK until it tried a 300MB file; I thought the 300MB file would then go directly to the next storage pool, which is a tape pool. This message appeared in the dsmerror.log file:

ANS1312E Server media mount not possible

The device class for the disk storage pool is:

Device Class Name: DISK
Device Access Strategy: Random
Storage Pool Count: 7
Format:
Device Type:
Est/Max Capacity (MB):
Mount Limit:
Mount Retention (min):
Mount Wait (min):
Unit Name:
Comp:
Library Name:

The device class for the tape storage pool is:

Device Class Name: NT3590
Device Access Strategy: Sequential
Storage Pool Count: 1
Format: 3590B
Device Type: 3590
Est/Max Capacity (MB): 9,216.0
Mount Limit: 3
Mount Retention (min): 1
Mount Wait (min): 60
Unit Name: 3590-1
Comp: Yes
Library Name:

tim brown
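[For reference, the settings in play here are the MAXSIZE and NEXTSTGPOOL attributes of the disk pool: when a file exceeds MAXSIZE, the server tries to send it directly to the next pool in the hierarchy, which requires a tape mount point. A sketch with hypothetical pool names - adjust to your environment:]

update stgpool DISKPOOL maxsize=100M nextstgpool=TAPEPOOL
query stgpool DISKPOOL f=d    (verify "Maximum Size Threshold" and "Next Storage Pool")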
Re: Dwindling Performance
Andy,

Have you thought about using the TSM Journal Service? If you're building a ton of directories/files but not backing up much, the journal will cut down on the processing and keep your sessions down to a minimum. Just a thought...

Brian Scott
EDS - EOGDE GM Distributed Management Systems Engineering
MS 3234, 750 Tower Drive, Troy, MI 48098
phone: +01-248-265-4596 (8-365)
mailto:[EMAIL PROTECTED]

-----Original Message-----
From: Andy Carlson [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 14, 2004 9:07 AM
To: [EMAIL PROTECTED]
Subject: Re: Dwindling Performance

> [snip - message quoted in full above]
Re: Dwindling Performance
Hmmm, interesting that the expire inventory grinds to a halt during incremental backups. My setup is AIX, similar to yours (host, DB size), although my disks are on locally attached SSA drives. I recently upgraded my 8 TSM servers from TSM 5.1.1.0 to 5.2.1.3 (mainly to get the NDMP file-level backups, finally). On one of them I saw the same issue. They are all set up almost identically, so why one would misbehave is a mystery to me.

To fix the immediate problem, I put a "duration=" on the expire inventory job so that it would only run during the day, when backups are less likely. Sure, the expire inventory now takes 2 days to run, but it's better than having all the backups go extremely slowly and not complete.

I then started to look into the performance issues. Some of the things I have done:

- Changed the DB volumes from JFS to raw (that made a very good improvement).
- Turned on the SSA fastwrite cache for the db volumes.
- Tried out these settings for vmtune (gleaned from this listserv):
  /usr/samples/kernel/vmtune -t10 -P10 -p5 -s1 -W16 -c8 -R256 -F512 -u25 -b2200 -B2200

All of these changes have improved the speed of the expire inventory, but to be honest I haven't tried to run the expire inventory during the incremental backups since. Once bitten, twice shy, and I can live with the expire inventory taking 2 days to complete. That's kind of where I am now: no solid solution, but performance improved enough that it's workable. I'd love to hear what other changes you make to resolve your situation.

Ben
Micron Technology Inc.
Boise, Id

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Andy Carlson
Sent: Wednesday, January 14, 2004 7:07 AM
To: [EMAIL PROTECTED]
Subject: Re: Dwindling Performance

> [snip - message quoted in full above]
Re: setting max capacity for storage pools
Take a look at MAXNUMMP for the node backing up the data. If it's set to 0, your data will not go to tape.

Ted

At 10:49 AM 1/14/2004 -0500, you wrote:
> [snip - message quoted in full above]
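[Ted's suggestion, spelled out as admin commands with a hypothetical node name; MAXNUMMP controls how many tape mount points a node's sessions may hold:]

query node MYNODE f=d          (look at "Maximum Mount Points Allowed")
update node MYNODE maxnummp=1  (allow the node's sessions one tape mount)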
Re: setting max capacity for storage pools
Ted - works!!! Thanks.

tim

----- Original Message -----
From: "Ted Byrne" <[EMAIL PROTECTED]>
Sent: Wednesday, January 14, 2004 11:13 AM
Subject: Re: setting max capacity for storage pools

> [snip - message quoted in full above]
Re: setting max capacity for storage pools
Ted - disregard my previous note. I thought that fixed it, but I set MAXNUMMP=2 and still have the same problem.

tim

----- Original Message -----
From: "Ted Byrne" <[EMAIL PROTECTED]>
Sent: Wednesday, January 14, 2004 11:13 AM
Subject: Re: setting max capacity for storage pools

> [snip - message quoted in full above]
Upgrading drives in jukebox.
I have 3 TSM servers, each with a SCSI-attached L700 jukebox with 8 DLT7000 drives and 300+ full tapes. I intend on upgrading to SDLT320 by the end of next month. Now for the problem: how do I get the data off the existing tapes and onto the new tapes? Is there a way to define two drive types in a SCSI-attached jukebox? TSM 5.1.6.3 on AIX 5.1. I do have about 1.5TB of diskpools on each TSM server.

Thanks,
Duane Ochs
Enterprise Computing
Quad/Graphics
Sussex, Wisconsin
414-566-2375 phone
414-917-0736 beeper
Win 5.2.2 Client on W2003 Server CAD Error
Hello. I have installed the TSM Client 5.2.2 on a new Win2003 server to back up non-C: disk data. All is fine except when I try to use the CAD daemon (service), per page 14 in the Tivoli 5.2 Users Guide for Windows, "Configuring the Web Client". This is to access Tivoli on the W2003 server via http port 1581 to do restores. The W2003 GUI Tivoli console client works fine, and the auto backups work fine using the regular Win service, with and without the MANAGEDSERVICES setup in dsm.opt. I have tried the auto web access setup using the console GUI client (page 14) and the manual way (page 482). When trying the http 1581 access, all starts up well until you have to log in; then you receive on your browser screen:

ANS2619S The Client Acceptor Daemon was unable to start the Remote Client Agent

In the dsmerror.log:

01/12/2004 16:10:48 Error starting agent service: The service name, '', is invalid.
01/12/2004 16:10:48 Error starting Remote Client Agent.
01/12/2004 16:10:59 Error starting agent service: The service name, '', is invalid.
01/12/2004 16:10:59 Error starting Remote Client Agent.
01/12/2004 16:23:01 Error starting schedule service: CadSchedName registry value is empty
01/12/2004 16:23:01 ANS1977E Dsmcad schedule invocation was unsuccessful - will try again.
01/12/2004 16:24:37 Error starting agent service: The service name, '', is invalid.
01/12/2004 16:24:37 Error starting Remote Client Agent.
01/12/2004 16:25:44 ConsoleEventHandler(): Caught Ctrl-C console event.
01/12/2004 16:25:44 ConsoleEventHandler(): Cleaning up and terminating Process ...

In dsmwebcl.log:

Executing scheduled command now.
01/09/2004 20:47:24 (dsmcad) ANS1977E Dsmcad schedule invocation was unsuccessful - will try again in 10 minutes.
01/09/2004 20:47:24 (dsmcad) Time remaining until execution:

I'm thinking this is a new bug in 5.2.2 for Windows, maybe just on Win2003? I'm using the 5.2.2 TSM server on a Linux ES 2.1 kernel 27 server. Other Linux 5.2.2 clients work fine via http 1581.

Charlie Hurtubise
Tecsys Inc.
[EMAIL PROTECTED]
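[One thing worth checking - a sketch, not a confirmed fix, and the service names below are my own assumptions: the "service name, '', is invalid" errors suggest the CAD was installed without a partner Remote Client Agent service. On Windows, the client's dsmcutil utility can install the pair and link them, along the lines of:]

rem Sketch with assumed service names; run from the baclient directory
dsmcutil install cad /name:"TSM Client Acceptor" /node:MYNODE /password:secret /autostart:yes
dsmcutil install remoteagent /name:"TSM Remote Client Agent" /node:MYNODE /password:secret /partnername:"TSM Client Acceptor"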
Re: Inconsistency between summary and actlog
I have been struggling with this issue recently myself; Richard was kind enough to answer my stupid questions several weeks ago too *8<). I originally looked at the accounting file, but it did not contain all of the info that my management and customers wanted, so I wrote my own script that pulls from the actlog and from the events table to get backup stats and schedule info. If anyone wants a copy of the script, I will let you have it; it's a Korn shell script for Solaris, though. If you just want the backup stats, you can use the query I started out with (which I pilfered from the Operational Reporting tool):

select msgno,nodename,sessid,message from actlog where ( msgno=4952 or msgno=4953 or msgno=4954 or msgno=4955 or msgno=4956 or msgno=4957 or msgno=4958 or msgno=4959 or msgno=4960 or msgno=4961 or msgno=4964 or msgno=4967 or msgno=4968 or msgno=4970 ) and (date_time between '2004-01-06 19:42' and '2004-01-07 19:42') order by sessid

It dumps out data that looks like:

4952,ALL,60358,"ANE4952I Total number of objects inspected: 50,572 "
4954,ALL,60358,"ANE4954I Total number of objects backed up: 50,572 "
4958,ALL,60358,ANE4958I Total number of objects updated: 0
4960,ALL,60358,ANE4960I Total number of objects rebound: 0
4957,ALL,60358,ANE4957I Total number of objects deleted: 0
4970,ALL,60358,ANE4970I Total number of objects expired: 0
4959,ALL,60358,ANE4959I Total number of objects failed: 0
4961,ALL,60358,ANE4961I Total number of bytes transferred: 2.04 GB
4967,ALL,60358,"ANE4967I Aggregate data transfer rate: 4,602.53 KB/sec "
4968,ALL,60358,ANE4968I Objects compressed by: 0%
4964,ALL,60358,ANE4964I Elapsed processing time: 00:07:46

With a little manipulation with sed:

4952,ALL,60358,"50,572 "
4954,ALL,60358,"50,572 "
4958,ALL,60358,0
4960,ALL,60358,0
4957,ALL,60358,0
4970,ALL,60358,0
4959,ALL,60358,0
4961,ALL,60358,2.04 GB
4967,ALL,60358,"4,602.53 KB/sec "
4968,ALL,60358,0%
4964,ALL,60358,00:07:46

My final script returns output like this (field headers listed first for reference):

${NODE},${NODEIPADDRESS},${TSM_SERVER_INFO},${SESSIONID},${SCHEDULENAME},${STARTTIME},${ELAPSEDPROCTIME},${NUMOFBYTESXFERRED},${NUMOFOBJECTS},${NUMOFOBJECTSBACKEDUP},${NUMOFOBJECTSFAILED},${SUCCESSFUL}

AD01-IPP,10.81.10.10,TSM3.USSNTC6,10.81.96.22,60633,DAILY_04,01/13/04 00:53:40,00:01:26,348.72MB,37222,885,0,Completed
AD02-IPP,10.81.10.12,TSM3.USSNTC6,10.81.96.22,61706,DAILY_05,01/13/04 11:59:16,00:01:27,289.16MB,29793,438,0,Completed
ALL,,TSM3.USSNTC6,10.81.96.22,60878,,,00:25:22,2.05GB,50573,50573,0,
CMS-DB1-IPP,10.81.215.11,TSM3.USSNTC6,10.81.96.22,61669,DAILY_06,01/13/04 11:38:13,00:02:46,827.55MB,101,85,0,Completed
DEVSTUDIO-IPP,10.81.215.4,TSM3.USSNTC6,10.81.96.22,61573,DAILY_05,01/13/04 10:45:03,00:02:43,550.66MB,1445,824,0,Completed

At the end of the report are the missed schedules:

${MISSEDNODES},${NODEIPADDRESS},${TSM_SERVER_INFO},,${ASSIGNEDSCHEDULE},${ASSIGNEDSTARTTIME},,${STATUS}

REPORTDB-IPP,10.81.215.14,TSM3.USSNTC6,10.81.96.22,,DAILY_03,01/13/04 08:00:00,,Missed
PCLOBS1-IPP,,TSM3.USSNTC6,10.81.96.22,,DAILY_03,01/13/04 08:00:00,,Missed
S216060SC1SW01,,TSM3.USSNTC6,10.81.96.22,,DAILY_05,01/13/04 10:00:00,,Missed

Michael French
Savvis Communications
IDS01 Santa Clara, CA
(408)450-7812 -- desk
(408)239-9913 -- mobile

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Robert Ouzen
Sent: Wednesday, January 14, 2004 5:19 AM
To: [EMAIL PROTECTED]
Subject: Re: Inconsistency between summary and actlog

Ted,

By the way, I almost always search first for subjects dealing with the same question, but sometimes the answers are very old and not absolutely clear. So sorry if I ask again.

Regards,
Robert

-----Original Message-----
From: Ted Byrne [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 14, 2004 3:02 PM
To: [EMAIL PROTECTED]
Subject: Re: Inconsistency between summary and actlog

Robert,

As Richard suggested, the many postings to ADSM-L regarding the summary table's contents (or lack thereof) are very informative. Among other things, the location of the accounting records is detailed. Repeatedly. (The format is recorded in the Admin Guide.) Before posting a question to ADSM-L, search the message archives on adsm.org to see if the subject that's vexing you has been discussed before. It's an invaluable resource, and it can save you considerable time in resolving whatever issue you are facing. Richard's ADSM QuickFacts web page (http://people.bu.edu/rbs/ADSM.QuickFacts) is another invaluable resource for the TSM administrator, whether novice or experienced.

Ted

At 02:22 PM 1/14/2004 +0200, you wrote:
> [snip]
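[The "little manipulation with sed" Michael describes can be sketched as a one-liner. The exact expression is my own guess, written to match the before/after samples he shows:]

```shell
# Strip the "ANExxxxI <label>: " message prefix from each statistic line,
# leaving msgno,nodename,sessid and the bare value.
strip_stats() {
  sed -E 's/ANE[0-9]{4}I [^:]*: *//'
}

# Feed it two sample lines from the SELECT output above:
printf '%s\n' \
  '4952,ALL,60358,"ANE4952I Total number of objects inspected: 50,572 "' \
  '4961,ALL,60358,ANE4961I Total number of bytes transferred: 2.04 GB' \
  | strip_stats
```

This reproduces the trimmed form shown in the message (e.g. `4961,ALL,60358,2.04 GB`); a real script would also want to handle the quoted fields when splitting on commas.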
Backing up NDS/eDirectory server
I need to back up a NetWare 6.0 SP3 server with eDirectory (NDS + LDAP). Do I need to code anything special in the DSM.OPT / INCLUDE, or is this an automatic thing? If it is automatic, how can I tell? Is it part of the "server specific" stuff? Also, I plan to do a test recovery of this server to another box, to test the restore aspect, including the NDS/eDirectory piece. Any hints/tips/suggestions/war-stories in this arena? So far the only experience we have with recovering NetWare systems has been user data only, not NDS!
Re: Backing up NDS/eDirectory server
The answer on the automatic part is "it depends". The newest version of the TSM client for NetWare (5.2.2.0) will automatically back up NDS on a server that holds a master replica. For other servers, and for previous versions of the TSM client, you need to add NDS: to the domain line in your dsm.opt file. For example:

DOMAIN ALL-LOCAL NDS:

When you query the filespaces on the TSM server, you'll see a filespace called NODE\NDS: which is listed separately from the other filespaces for the node.

Matt Zufelt
Southern Utah University

>>> [EMAIL PROTECTED] 1/14/2004 1:34:14 PM >>>
> [snip - message quoted in full above]
Re: Backing up NDS/eDirectory server
Hi,

>>> I need to backup a Netware 6.0 SP3 server with eDirectory (NDS + LDAP). Do I need to code anything special in the DSM.OPT / INCLUDE or is this an automatic thing? If it is automatic, how can I tell? Is it part of the "server specific" stuff? <<<

You have two requirements to back up eDirectory:

- load TSANDS.NLM before loading DSMC.NLM or DSMCAD.NLM
- specify the domain "NDS:"

I recommend using the NetWare server that holds all master replicas of your NDS tree. Second, it's useful to configure a separate job for this task.

>>> Also plan to do a test recovery of this server to another box, to test the restore aspect, including the NDS/eDirectory piece. <<<

This test server shouldn't be connected to your production network; otherwise you'll get a lot of synchronization problems within your NDS tree. In a real-life scenario you have to recreate your new and empty NDS tree before you can proceed with a full restore of NDS on the server that holds all your master replicas. For further information regarding restoring NDS, you should read the many TIDs you can find on http://support.novell.com.

Best regards,
Wolfgang Bayrhof
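[Wolfgang's load-order requirement, as it might look at the server console or in autoexec.ncf - a sketch only, and the SYS: install path is an assumption:]

# Load the NDS Target Service Agent before the TSM client,
# so the client can see the "NDS:" domain:
load TSANDS.NLM
load SYS:tivoli\tsm\client\ba\DSMCAD.NLM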
Novell Client TSM 5.1.5.15 Deleted File Error
Does anyone know of an option that can be put into the dsm.opt for TSM 5.1.5 on a Novell 5.1 SP5 machine, so that it does not continually try to back up a file that a user deleted after the original backup list was created during a scheduled incremental job? We have found that when there is a large change of data (over 2 GB) on a Novell server, the scheduled job looks at the files, compares them to the include/exclude list, and then starts the incremental backup. By the time it reaches certain files, they might have been deleted (i.e., a user deletes a document on a shared drive). TSM continually tries to find the file, errors out, and then tries over and over again. A backup job that normally takes 30 minutes now goes on for over 12 hours, until we cancel the session from the server.
Re: Novell Client TSM 5.1.5.15 Deleted File Error
Hi Roger,

>>> We have found that when there is a large change of data (over 2 gig's) on a Novell Server [...] A backup job that takes normally 30 minutes now goes on for over 12 hours until we cancel the session from the server. <<<

I would test the option MEMORYEFFICIENTBACKUP on your server. The backup operation will probably take more time to complete, but the incr command then looks at the files directory by directory. Are there any retry errors written into the log? If yes, you may want to look at the option CHANGINGRETRIES as well.

Best regards,
Wolfgang Bayrhof
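[The two client options Wolfgang mentions would go in the NetWare client's dsm.opt, something like the following - the values are illustrative, not tuned recommendations:]

* dsm.opt sketch - illustrative values
* Walk the filesystem directory by directory instead of building
* one large in-memory list:
MEMORYEFFICIENTBACKUP YES
* Retry a file that changes or disappears during backup at most twice:
CHANGINGRETRIES 2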
Re: Upgrading drives in jukebox.
From: Ochs, Duane [mailto:[EMAIL PROTECTED]
> I have 3 TSM servers each with a SCSI attached L700 jukebox with 8 DLT7000 drives and 300+ full tapes. I intend on upgrading to SDLT320 by the end of next month. Now for the problem. How do I get the data off the existing tapes and onto the new tapes? Is there a way to define two drive types in a SCSI attached jukebox?

About the only way to logically partition your L700 (the only way to use mixed media and mixed drives) is to use ACSLS to drive the library -- something that some people think is a cure that is worse than the disease.

--
Mark Stapleton ([EMAIL PROTECTED])