Thanks Jeff, very nice description of an important issue for all of us!
Hoping Tivoli puts all the necessary resources into improving this.

René Lambelet
Nestec S.A. / Informatique du Centre
55, av. Nestlé CH-1800 Vevey (Switzerland)
*+41'21'924'35'43 7+41'21'924'28'88 * K4-117
email [EMAIL PROTECTED]
Visit our site: http://www.nestle.com

This message is intended only for the use of the addressee and may
contain information that is privileged and confidential.

> -----Original Message-----
> From: Jeff Connor [SMTP:[EMAIL PROTECTED]]
> Sent: Thursday, February 15, 2001 8:01 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Performance Large Files vs. Small Files
>
> Diana,
>
> Sorry to chime in late on this but you've hit a subject I've been
> struggling with for quite some time.
>
> We have some pretty large Windows NT file and print servers using MSCS.
> Each server has lots of small files (1.5 to 2.5 million) and total disk
> space (the D: drive) between 150GB and 200GB, Compaq server, two 400MHz
> Xeon with 400MB RAM. We have been running TSM on the mainframe since
> ADSM version 1 and are currently at 3.7 of the TSM server with 3.7.2.01
> and 4.1.2 on the NT clients.
>
> Our Windows NT admins have had a concern for quite some time regarding
> TSM restore performance and how long it would take to restore that big
> old D: drive. They don't see the value in TSM as a whole compared to
> the competition; they just want to know how fast you can recover their
> entire D: drive. They decided they wanted to perform weekly full
> backups to direct attached DLT drives using Arcserve and would use the
> TSM incrementals to forward recover during a full volume restore. We
> finally had to recover one of those big D: drives this past September.
> The Arcserve portion of the recovery took about 10 hours if I recall
> correctly. The TSM forward recovery ran for 36 hours and only restored
> about 8.5GB. They were not pleased. It seems all that comparing took
> quite some time.
> I've been trying to get to the root of the bottleneck since then. I've
> worked with support on and off over the last few months performing
> various traces and the like. At this point we are looking in the area
> of mainframe TCPIP and delays in acknowledgments coming out of the
> mainframe during test restores.
>
> If you've worked with TSM for a number of years, then through sources
> in IBM/Tivoli and the valuable information from this listserv you learn
> over time about all the TSM client and server "knobs" to turn to try
> and get maximum performance. Things like Bufpoolsize, database cache
> hits, housekeeping processes running at the same time as
> backups/restores slowing things down, network issues like
> auto-negotiate on NICs, MTU sizes, TSM server database and log disk
> placement, tape drive load/seek times and speeds and feeds. Basically,
> I think we are pretty well set with all those important things to
> consider. This problem we are having may be a mainframe TCPIP issue in
> the end, but I am not sure that will be the complete picture.
>
> We have recently installed an AIX TSM server, H80 two-way, 2GB memory,
> 380GB EMC 3430 disk, 6 Fibre Channel 3590-E1A drives in a 3494, TSM
> server at 4.1.2. We plan to move most of the larger clients from the
> TSM OS/390 server to the AIX TSM server. A good move to realize a
> performance improvement according to many posts on this Listserv over
> the years. I am in the process of testing my NT "problem children" as
> quickly as I can to prove this configuration will address the concerns
> our NT Admins have about restores of large NT servers. I'm trying to
> prevent them from installing a Veritas SAN solution and asking them to
> stick with our Enterprise Backup Strategic direction which is to
> utilize TSM. As you probably know, the SAN enabled TSM backup/archive
> client for NT is not here and may never be from what I've heard.
> My only option at this point is SAN tape library sharing with the TSM
> client and server on the same machine for each of our MSCS servers.
>
> Now I'm sure many of you reading this may be thinking of things like,
> "why not break the D: drive into smaller partitions so you can
> collocate by filespace and restore all the data concurrently". No go
> guys, they don't want to change the way they configure their servers
> just to accommodate TSM when they feel they would not have to with
> other products. They feel that with 144GB single drives around the
> corner who is to say what a "big" NT partition is? NT seems to support
> these large drives without issues. (Their words not mine).
>
> Back to the issue. Our initial backup tests using our new AIX TSM
> server have produced significant improvements in performance. I am just
> getting the pieces in place to perform restore tests. My first test a
> couple of days ago was to restore part of the data from that server we
> had the issue with in September. It took about one hour to lay down
> just the directories before restoring any files. Probably still better
> than the mainframe but not great. My plan for future tests is to
> perform backups and restores of the same data to and from both of my
> TSM servers to compare performance. I will share the results with you
> and the rest of the listserv as I progress.
>
> In general I have always, like many other TSM users, achieved much
> better restore/backup rates with larger files versus lots of smaller
> files. Assuming you've done all the right tuning, the question that
> comes to my mind is, does it really come down to the architecture? The
> TSM database makes things very easy for day to day smaller recoveries
> which is the type we perform most. But does the architecture that makes
> day to day operations easier not lend itself well to backup/recovery of
> large amounts of data made up of small files? I have very little
> experience with competing products.
> Do they struggle with lots of small files as well? Veritas, Arcserve
> anyone? If the bottleneck is, as some on the Listserv have suggested,
> frequent interaction with the client file system, then I suppose the
> answer would be yes, the other products have the same problem. Or is
> the issue more on the TSM database side due to its design, and other
> products using different architectures may not have this problem?
> Maybe the competition's architecture is less bulletproof but if you're
> one of our NT Admins you don't seem to care when the client keeps
> calling asking how much longer the restore will be running. I know TSM
> development is aware of the issues with lots of small files and I would
> be curious what they plan to do about the problems Diana and I have
> experienced.
>
> The newer client option, Resourceutilization, has helped with backing
> up clients with lots of small files more quickly. I would love to see
> the same type of automated multi-tasking on restores. I don't know the
> specifics of how this actually works but it seems to me that when I ask
> to restore an entire NT drive, for example, the TSM client/server must
> sort the file list in some fashion to intelligently request tape
> volumes to minimize the mounts required. If that's the case could they
> take things one step further and add an option to the restore
> specifying the number of concurrent sessions/mountpoints to be used to
> perform the restore? For example, if I have a node whose collocated
> data is spread across twenty tapes and I have 6 tape drives available
> for the recovery, how about an option for the restore command like:
>
> RES -subd=y -nummp=6 d:\*
>
> where the -nummp option would be the number of mount points/tape drives
> to be used for the restore. TSM could sort the file list coming up with
> the list of tapes to be used for the restore and perhaps spread the
> mounts across 6 sessions/mount points.
> I'm sure I've probably made a complex task sound simple but this type
> of option would be very useful. I think many of us have seen the
> benefits of running multiple sessions to reduce recovery elapsed time.
> I find my current choices for doing so difficult to implement or
> politically undesirable.
>
> If others have the same issues with lots of small files, in particular
> with Windows NT clients, let's hear from you. Maybe we can come up with
> some enhancement requests. I'll pass on the results of my tests as
> stated above. I'd be interested in hearing from those of you that have
> worked with other products and can tell me if they have the same
> performance problems with lots of small files. If the performance of
> other products is impacted in the same way as TSM performance then that
> would be good to know. If it's more about the Windows NT NTFS file
> system then I'd be satisfied with that explanation as well. If it's
> that lots of interaction with the TSM database leads to slower
> performance, even when optimally configured, then I'd like to know what
> Tivoli has in the works to address the issue. Because if it's the TSM
> database, I could probably install the fattest Fibre Channel/network
> pipe with the fastest peripherals and server hardware around and it
> might not change a thing.
>
> Thanks
> Jeff Connor
> Niagara Mohawk Power Corp.
>
>
> "Diana J.Cline" <[EMAIL PROTECTED]>@VM.MARIST.EDU> on
> 02/14/2001 10:04:52 AM
>
> Please respond to "ADSM: Dist Stor Manager" <[EMAIL PROTECTED]>
>
> Sent by: "ADSM: Dist Stor Manager" <[EMAIL PROTECTED]>
>
> To: [EMAIL PROTECTED]
> cc:
>
> Subject: Performance Large Files vs. Small Files
>
> Using an NT Client and an AIX Server
>
> Does anyone have a TECHNICAL reason why I can back up 30GB of 2GB files
> that are stored in one directory so much faster than 30GB of 2KB files
> that are stored in a bunch of directories?
>
> I know that this is the case, I just would like to find out why. If the
> amount of data is the same and the Network Data Transfer Rate is the
> same between the two backups, why does it take the TSM server so much
> longer to process the files being sent when there is a larger number of
> files in multiple directories?
>
> I sure would like to have the answer to this. We are trying to complete
> an incremental backup of an NT Server with about 3 million small
> objects (according to TSM) in many, many folders and it can't even get
> done in 12 hours. The actual amount of data transferred is only about
> 7GB per night. We have other backups that can complete 50GB in 5 hours
> but they are in one directory and the # of files is smaller.
>
> Thanks
>
>
> Network data transfer rate
> --------------------------
> The average rate at which the network transfers data between
> the TSM client and the TSM server, calculated by dividing the
> total number of bytes transferred by the time to transfer the
> data over the network. The time it takes for TSM to process
> objects is not included in the network transfer rate. Therefore,
> the network transfer rate is higher than the aggregate transfer
> rate.
>
> Aggregate data transfer rate
> ----------------------------
> The average rate at which TSM and the network transfer data
> between the TSM client and the TSM server, calculated by
> dividing the total number of bytes transferred by the time
> that elapses from the beginning to the end of the process.
> Both TSM processing and network time are included in the
> aggregate transfer rate. Therefore, the aggregate transfer
> rate is lower than the network transfer rate.
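
The two rate definitions quoted above can be made concrete with a small calculation. This is only a rough sketch, not a description of how TSM works internally: the function names, the fixed per-object processing cost (10 ms) and the 5 MiB/s network throughput are all assumptions picked for illustration, not measurements. It shows how a fixed cost per object leaves the network rate unchanged while pulling the aggregate rate down when the same 30GB is made up of 2KB files instead of 2GB files.

```python
def network_rate(total_bytes, network_seconds):
    """Bytes transferred divided by the time spent on the network only."""
    return total_bytes / network_seconds


def aggregate_rate(total_bytes, elapsed_seconds):
    """Bytes transferred divided by the total elapsed time of the process."""
    return total_bytes / elapsed_seconds


def estimate(total_bytes, num_objects, net_bytes_per_sec, per_object_sec):
    """Hypothetical model: elapsed time = network time + fixed cost per object."""
    network_seconds = total_bytes / net_bytes_per_sec
    elapsed_seconds = network_seconds + num_objects * per_object_sec
    return (network_rate(total_bytes, network_seconds),
            aggregate_rate(total_bytes, elapsed_seconds))


GB = 1024 ** 3

# Assumed figures, for illustration only.
NET = 5 * 1024 ** 2   # 5 MiB/s network throughput
COST = 0.01           # 10 ms of processing per object

# 30GB as 15 files of 2GB each, versus 30GB as 2KB files.
large = estimate(30 * GB, 15, NET, COST)
small = estimate(30 * GB, (30 * GB) // 2048, NET, COST)

print("large files: network %.0f B/s, aggregate %.0f B/s" % large)
print("small files: network %.0f B/s, aggregate %.0f B/s" % small)
```

Under these assumed numbers both runs report the same network rate, but the small-file run spends most of its elapsed time on per-object work, so its aggregate rate falls to a small fraction of the network rate. Whether the real per-object cost sits in the client file system, the network round trips or the TSM database is exactly the open question in this thread.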
Re: Performance Large Files vs. Small Files
Lambelet,Rene,VEVEY,FC-SIL/INF. Fri, 16 Feb 2001 02:21:23 -0800