Andrew,

The crowd may be right, and the XIV may be your bottleneck for the DB, but I wouldn't focus on that. In your test environment, with only a small number of backups running at once, there probably isn't all that much database traffic generated, is there? And not many database reads, if much of your database fits in memory. Database writes should be going to cache in the XIV, if it is as lightly loaded as you say, so I don't see that as much of a bottleneck when only a few clients are getting backed up.

What kind of client backups are you testing? Are they large file database backups? Those can generate very good I/O throughput, because the client is sending the data as fast as possible. Or incremental filesystem backups on Windows servers? Those can generate very poor I/O throughput, if they have to examine thousands of files for each file that needs to be sent to the server.

Can you say with assurance that the clients themselves are able to send more than 20-30MB/sec? Do you know what performance those same clients get when they back up to your production environment? Try backing them up to their production environment, at some time of night when the TSM server is not maxed out. Use that as a known starting point.

If you just want to test throughput, and don't care about anything else:
1) Turn off client compression, if it is on.
2) Do "selective" backups of the whole filesystem, so the clients send everything without having to make any time-consuming decisions about what gets sent.
3) Pick a time for the test when the client is very lightly loaded.
4) Try to pick a client with a small number of very large (multi-GB) files, not zillions of small files.

Andrew, I know you already know these things, but I include them for the benefit of the rest of the list. The point I am making is to let the TSM client shove data across as fast as it can; if it performs really well, then the device that is absorbing all that incoming data (the DataDomain, or other disk storage pool) is performing well. If another client is sending zillions of files, but performing very slowly, maybe that client is creating a lot more traffic to the database, and that is where your bottleneck is. In other words, different clients can be used to show what part of the TSM server is the slowest performer.

Best Regards,

John D. Schneider
The Computer Coaching Community, LLC
Office: (314) 635-5424 / Toll Free: (866) 796-9226
Cell: (314) 750-8721

-------- Original Message --------
Subject: Re: [ADSM-L] Frustrated by slowness in TSM 6.2
From: Paul Zarnowski <p...@cornell.edu>
Date: Fri, October 08, 2010 11:37 pm
To: ADSM-L@VM.MARIST.EDU

Rick,

I think their response would be something along these lines... The XIV can perform better than other traditional arrays because the [cache miss] I/Os are spread across so many more spindles. I get that. But it seems to me that that can break down when the overall I/O load gets sufficiently high, across all of the spindles. In an I/O intensive environment such as TSM, I think this could be more likely to happen - particularly if you are using XIV for storage pools as well as for database volumes. I'm still skeptical about how far it can go. I can buy that it has good performance --- for a SATA-based product.
But not compared to a pure 15K spindle-based product. Oh, and the SATA drives are larger than the SAS or FC drives, which doesn't help.

..Paul

At 01:57 PM 10/8/2010, Richard Rhodes wrote:
>> I would be suspicious of having the db on XIV. Do you have any FC
>> or SAS Disk you could try putting the DB on? I know XIV has lots
>> of CPU & cache, but underneath it all is still SATA. I've heard
>> Marketing types rave about how fast XIV is, even with SATA,
>> because I/O can be spread across many spindles, but I'm not
>> entirely convinced it's as good as 15k FC or SAS.
>
>This is _exactly_ what IBM has not explained, and seems unwilling to explain.
>
>Soon after IBM finalized the purchase of XIV, they had a series
>of seminars around the country (USA) about the box. This wasn't some
>little out of the way seminar . . . Moshe (inventor of the box)
>was there and gave much of the presentation. I attended one - let's
>just say it was strange!!! They hammered on "high performance", over
>and over. They threw up one graph where they claimed 25k iops at
>3ms response time for a "cache miss" workload. Let's see, cache miss
>means having to go to the spindle to do the I/O. SATA drives come
>nowhere close to this response time. The workload was either
>not cache miss, or they effectively short-stroked the drive such
>that the heads never moved. When I questioned this claim I
>got nowhere - just run-around.
>
>Rick

--
Paul Zarnowski                            Ph: 607-255-4757
Manager, Storage Services                 Fx: 607-255-8521
719 Rhodes Hall, Ithaca, NY 14853-3801    Em: p...@cornell.edu
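
Rick's objection to the "25k iops at 3ms, cache miss" graph can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below is only that: it assumes a 7200 RPM SATA drive and an average seek time of 8.5 ms (both assumptions, not figures from this thread or from any XIV data sheet), and it ignores cache hits, command queuing, transfer time and queueing delay. The function names are made up for illustration.

```python
import math

# Assumptions for illustration only; substitute real drive figures.
RPM = 7200              # assumed SATA spindle speed
AVG_SEEK_MS = 8.5       # assumed average seek time
TARGET_IOPS = 25_000    # figure from the seminar graph quoted above
CLAIMED_RESPONSE_MS = 3.0


def avg_rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a revolution."""
    return 0.5 * 60_000 / rpm


def random_read_service_ms(rpm, avg_seek_ms):
    """Rough service time for one uncached random read on one drive."""
    return avg_seek_ms + avg_rotational_latency_ms(rpm)


def drives_needed(target_iops, service_ms):
    """Spindles required if every I/O is a true cache miss."""
    per_drive_iops = 1000.0 / service_ms
    return math.ceil(target_iops / per_drive_iops)


service_ms = random_read_service_ms(RPM, AVG_SEEK_MS)
print(f"rotational latency: {avg_rotational_latency_ms(RPM):.2f} ms")
print(f"per-I/O service time: {service_ms:.2f} ms "
      f"(claimed response: {CLAIMED_RESPONSE_MS} ms)")
print(f"per-drive random IOPS: {1000.0 / service_ms:.0f}")
print(f"drives needed for {TARGET_IOPS} IOPS: "
      f"{drives_needed(TARGET_IOPS, service_ms)}")
```

Under these assumptions, rotational latency alone is already longer than the claimed 3 ms response time, which is the heart of Rick's point: the workload was either not all cache miss, or the drives were short-stroked so that seeks were close to zero.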