John,

Thanks for the script that checks whether any exports are still running before starting the next batch of 15 exports. It is running now, so we will see whether it makes life easier for TSM.

I managed to get another crash of the TSM server when the export of the first 15 nodes was started. No other activity was running on the TSM server except the Tivoli Operational Reporting tool, which monitors the TSM activities every hour. I've stopped the reporting service as well to see whether it has anything to do with the crash.

No news from IBM support so far. I'll update the thread when I know more.

regards,
Kurt

________________________________
From: ADSM: Dist Stor Manager on behalf of John Monahan
Sent: Mon 2/27/2006 22:36
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] export nodes causes TSM server crash

I agree that the TSM server shouldn't ever crash, but just because it shouldn't crash doesn't necessarily mean you should try to run 75 or 100 or 1000 exports concurrently either. Until a fix is produced, I would just limit your concurrent exports to what you know works without committing a self-imposed denial of service attack on your TSM server.

Here is what I would do with your scripts that have the exports separated into groups of 15 nodes each:

1. Kick off the first one as is.
2. Modify all the other scripts to first check for any export processes still running, and if there are, then have those scripts reschedule themselves.

For example:

    select * from processes where upper(process)='EXPORT NODE'
    if (rc_ok) goto reschedule
    <run next set of export node commands here>
    exit
    :reschedule
    del sched <thisschedname> type=a
    def sched <thisschedname> type=a cmd="run <thisscriptname>" active=yes startt=NOW+0:30 perunits=onetime
    exit

______________________________
John Monahan
Consultant
Infrastructure Solutions Group
Computech Resources, Inc.
Office: 952-833-0930 ext 109
Cell: 952-221-6938
http://www.computechresources.com

Kurt Beyers <[EMAIL PROTECTED]>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
02/27/2006 02:57 PM
Please respond to "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
To: ADSM-L@VM.MARIST.EDU
cc:
Subject: Re: export nodes causes TSM server crash

John,

The export of just 15 nodes was tested earlier on, and that group already contained the larger nodes. At that time the TSM server was just slow (high CPU consumption and a lot of disk I/O, which is normal of course), but it worked fine. The export of all of the nodes at the same time causes an immediate crash of the TSM server.
I did not mean to start all the exports at once, but I had not realized that the PARALLEL/SERIAL commands would not work because the exports are started in the background. So I changed the script to work in groups of 15 nodes.

The export of the nodes in groups of 15 caused a new crash when the last group export was started. A few of the earlier exports were still running at that time, and the nodes in the latest group were rather small.

A support call was logged, of course. The question is what causes the TSM server crash. Apart from the PK_EXCEPTION and PK_THREAD messages in the application log, nothing else is found. I just have to wait for some news from the labs at this time, and I will contact them again tomorrow.

regards,
Kurt

________________________________
From: ADSM: Dist Stor Manager on behalf of John Monahan
Sent: Mon 2/27/2006 20:15
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] export nodes causes TSM server crash

Let me see if I understand you correctly. The export works fine when only 15 nodes are running, but after 2 hours, when the second set of 15 nodes kicks in (while some from the first group of 15 are still running), that is when your server crashes? Or does your server crash with only 15 nodes running an export?

______________________________
John Monahan
Consultant
Infrastructure Solutions Group
Computech Resources, Inc.
Office: 952-833-0930 ext 109
Cell: 952-221-6938
http://www.computechresources.com

Kurt Beyers <[EMAIL PROTECTED]>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
02/27/2006 05:35 AM
Please respond to "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
To: ADSM-L@VM.MARIST.EDU
cc:
Subject: export nodes causes TSM server crash

Hello everybody,

I've got a TSM server 5.3.2.2 running on Windows 2003 Enterprise Edition SP1 (7 GB RAM, Xeon 3.2 GHz CPU) that has about 100 TSM clients defined. Each month an export of each TSM node with the active backup data will be taken to disk (DS4100 with SATA disks of 250 GB).
The disk storage pool that contains the backups is on the DS4100 too.

I scheduled the export of the TSM nodes this past weekend with a few scripts. I first tried to launch just one script that took the export in blocks of 15 nodes using the PARALLEL and SERIAL commands. However, as the export is started in the background, all 75 exports were started immediately. This causes a TSM server crash. After restarting the TSM server, no errors are found in the activity log, except a message that no more than 16 commands can be started in one PARALLEL statement. The last normal message about the export is written to the log, and the next messages are from when the server was started again.

I then split up the export myself: one script started the export of 15 nodes, and 4 administrative schedules each triggered the export of 15 additional nodes at 2-hour intervals. The TSM server crashed once more.

Is this a known issue when the export of a lot of nodes is started? Am I overlooking some parameters here? Can the export be started in a better way using TSM scripting? An 'export server' instead of an 'export node' for each TSM node is not an option, as the import of one node would then take too much time.

thanks in advance,
Kurt
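
Editor's sketch: the batching idea discussed in the thread (start a group of 15 exports, and start the next group only once no export process is still running) can be illustrated outside TSM as below. This is a minimal Python sketch under stated assumptions, not the script used in the thread and not a TSM API. The hooks `start_export` and `exports_running` are hypothetical and would have to be backed by whatever mechanism issues the `export node` command and queries the `processes` table. The 1800-second default only mirrors the `NOW+0:30` reschedule delay in John's example.

```python
import time
from typing import Callable, List, Sequence

BATCH_SIZE = 15  # group size that the thread reports as working on its own


def make_batches(nodes: Sequence[str], size: int = BATCH_SIZE) -> List[List[str]]:
    """Split the node list into consecutive groups of at most `size` nodes."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(nodes[i:i + size]) for i in range(0, len(nodes), size)]


def run_batches(
    nodes: Sequence[str],
    start_export: Callable[[str], None],   # hypothetical hook: start one export
    exports_running: Callable[[], int],    # hypothetical hook: count active exports
    wait_seconds: float = 1800.0,          # assumption: same 30 minutes as NOW+0:30
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Start one batch at a time and return the number of batches started."""
    started = 0
    for batch in make_batches(nodes):
        # Hold the next batch back while any earlier export is still active.
        while exports_running() > 0:
            sleep(wait_seconds)
        for node in batch:
            start_export(node)
        started += 1
    return started
```

With 75 nodes this yields five batches of 15, and no batch starts while an export from an earlier batch is still reported as running. Whether that is enough to avoid the crash is exactly the open question in the thread.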