Like it says in the document, it's a recommendation and not a technical limit.
However, having the server running at 100% utilization all the time doesn't seem like a healthy scenario.

Why aren't you deduplicating files larger than 1 GB? In my experience, data files from SQL, Exchange and the like have a very high dedup ratio, whereas TSM's deduplication only skips files smaller than 2 KB.

I have a customer up north who used this configuration on an HP EVA based box with SATA disks. The disks were breaking down so fast that the arrays within the box were in a constant "rebuild" phase. HP claimed it was TSM dedup that was breaking the disks (they actually claimed TSM was writing so often that the disks broke), which I find very hard to believe.

Best Regards

Daniel

Daniel Sparrman
Exist i Stockholm AB
Switchboard: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE

-----"ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote: -----

To: ADSM-L@VM.MARIST.EDU
From: "Colwell, William F." <bcolw...@draper.com>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
Date: 09/28/2011 20:43
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

Hi Daniel,

I remember hearing about a 6 TB limit for dedup in a webinar or conference call, but what I recall is that it was a daily throughput limit. In the same section of the redbook as you quote is this paragraph -

Experienced administrators already know that Tivoli Storage Manager database expiration was one of the more processor-intensive activities on a Tivoli Storage Manager Server. Expiration is still processor intensive, albeit less so in Tivoli Storage Manager V6.1, but this is now second to deduplication in terms of consumption of processor cycles. Calculating the MD5 hash for each object and the SHA1 hash for each chunk is a processor intensive activity.

I can say this is absolutely correct; my processor is frequently running at or near 100%.
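
The hashing work that the redbook paragraph describes can be sketched in a few lines of Python. This is only an illustration and not how TSM implements it: the fixed chunk size, the `fingerprint` and `dedup_stats` names, and the in-memory set of seen chunks are all assumptions made for the example. It computes one MD5 per object and one SHA-1 per chunk, then counts how many chunks are unique.

```python
import hashlib

# Assumed fixed chunk size, for illustration only; not taken from TSM.
CHUNK_SIZE = 256 * 1024


def fingerprint(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Return (MD5 hex digest of the whole object, SHA-1 hex digest per chunk)."""
    object_md5 = hashlib.md5(data).hexdigest()
    chunk_sha1 = [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]
    return object_md5, chunk_sha1


def dedup_stats(objects, chunk_size: int = CHUNK_SIZE):
    """Return (total chunks, unique chunks) across all objects."""
    seen = set()   # hypothetical stand-in for a chunk index
    total = 0
    for data in objects:
        _, chunks = fingerprint(data, chunk_size)
        total += len(chunks)
        seen.update(chunks)
    return total, len(seen)


if __name__ == "__main__":
    first = b"A" * 1024
    second = b"A" * 1024 + b"B" * 512
    total, unique = dedup_stats([first, second], chunk_size=512)
    print(f"{total} chunks, {unique} unique")
```

Every byte of every object passes through both hash functions, so the CPU cost grows with the amount of data ingested, which fits the processor load reported here.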
I have gone way beyond 6 TB of storage for dedup storage pools, as this SQL shows for the 2 instances on my server -

select cast(stgpool_name as char(12)) as "Stgpool", -
  cast(sum(num_files) / 1024 /1024 as decimal(4,1)) as "Mil Files", -
  cast(sum(physical_mb) / 1024 /1024 as decimal(4,1)) as "Physical_TB", -
  cast(sum(logical_mb) / 1024 /1024 as decimal(4,1)) as "Logical_TB", -
  cast(sum(reporting_mb) / 1024 /1024 as decimal(4,1)) as "Reporting_TB" -
  from occupancy -
  where stgpool_name in (select stgpool_name from stgpools where deduplicate = 'YES') -
  group by stgpool_name

Stgpool        Mil Files   Physical_TB   Logical_TB   Reporting_TB
-------------  ----------  ------------  -----------  -------------
BKP_2              368.0           0.0         30.0           95.8
BKP_2X             341.0           0.0         23.9           58.6

Stgpool        Mil Files   Physical_TB   Logical_TB   Reporting_TB
-------------  ----------  ------------  -----------  -------------
BKP_2              224.0           0.0         35.7           74.1
BKP_FS_2            49.0           0.0         21.0           45.5

Also, I am not using any random disk pool; all the disk storage is scratch allocated file class volumes. There is also a tape library (LTO5) for files larger than 1 GB, which are excluded from deduplication.

Regards,

Bill Colwell
Draper Lab

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Daniel Sparrman
Sent: Wednesday, September 28, 2011 3:49 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

To be honest, it doesn't really say. The information is from the Tivoli Storage Manager Technical Guide:

Note: In terms of sizing Tivoli Storage Manager V6.1 deduplication, we currently recommend using Tivoli Storage Manager to deduplicate up to 6 TB total of storage pool space for the deduplicated pools. This is a rule of thumb only and exists solely to give an indication of where to start investigating VTL or filer deduplication. The reason that a particular figure is mentioned is for guidance in typical scenarios on commodity hardware.
If more than 6 TB of real diskspace is to be duplicated, you can either use Tivoli Storage Manager or a hardware deduplication device. The 6 TB is in addition to whatever disk is required by non-deduplicated storage pools. This rule of thumb will change as processor and disk technologies advance, because the recommendation is not an architectural, support, or testing limit.

http://www.redbooks.ibm.com/redbooks/pdfs/sg247718.pdf

I'm guessing it's server-side, since client-side shouldn't use any resources at the server. I'm also guessing you could do 8 TB or 10, but not 60 TB.

Best Regards

Daniel Sparrman

Daniel Sparrman
Exist i Stockholm AB
Switchboard: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE

-----"ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote: -----

To: ADSM-L@VM.MARIST.EDU
From: Hans Christian Riksheim <bull...@gmail.com>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
Date: 09/28/2011 09:56
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

Does this 6 TB supported limit for a deduplicated FILEPOOL apply when one does client-side deduplication only? Just wondering, since I have just set up a 30 TB FILEPOOL for this purpose.

Regards

Hans Chr.

On Tue, Sep 27, 2011 at 8:44 PM, Daniel Sparrman <daniel.sparr...@exist.se> wrote:
> Just to put an end to this discussion, we're kinda running out of limits here:
>
> a) No VTL solution, not DD, not Sepaton, not anyone, is a replacement for
> random diskpools. It doesn't matter if you can configure 50 drives, 500
> drives or 5000 drives; the way TSM works, you're gonna make the system go
> bad, since the system is made for having random pools in front and
> sequential pools in the back. A sequential device is not gonna replace that,
> regardless of whether it is a sequential file pool or a VTL (or, for that
> matter, a tape library).
>
> b) VTLs were invented because most backup software (I've only worked with
> TSM, Legato & Veritas aka Symantec) is used to working with sequential
> devices. That hasn't changed, and won't change in the near future. VTLs (and
> the file device option) are just a replacement. Performance-wise, VTLs are
> gonna win all the time compared to a file device; the question you need to
> ask yourself is, do I need the VTL, or can I get along with using file
> devices. According to the TSM manual (I don't have the link, but if you want
> I'll find it) the maximum supported file device pool for deduplication is
> 6 TB... so if you're thinking of replacing a VTL with a seq. file pool, keep
> that in mind. The limit exists because of the amount of resources TSM needs
> to do the file deduplication, or as the manual says, "until new technologies
> are available".
>
> The discussion here where people are actually planning on just having a
> sequential pool (since no one is actually discussing that there's a random
> pool in front) is plain scary. No sequential device is gonna have the time
> of its life with a fileserver serving 50K blocks at a time.
>
> So my last 50 cents worth is:
>
> a) Have a random pool in front
>
> b) Depending on the size of your environment, you're either gonna go with a
> filepool and use de-dup (limit is 6 TB for each pool, you might not want to
> de-dup everything), or you're gonna go with a full-scale VTL. The choice here
> is size vs cost.
>
> I've seen a lot of posts here lately about the disadvantages of VTLs...
> well, I haven't seen one so far with mine. I have a colleague who bought a
> XXXX VTL and found out he needed another VTL just to do the de-dup, since one
> VTL wasn't a supported configuration for de-dup. I have another colleague
> who bought a very cheap VTL solution (from a name mentioned a lot around
> here) and ended up with the same hashes for different data, leaving him
> with unrestorable data.
>
> Comparing eggs to apples just isn't fair. Different manufacturers of VTLs do
> different things, meaning both performance and availability are completely
> different.
>
> Just to sum up, we've had both 3584's and (back in the days) 3575, and I've
> never been happier with our VTL (and yes, we do restore tests).
>
> Best Regards
>
> Daniel
>
> Daniel Sparrman
> Exist i Stockholm AB
> Switchboard: 08-754 98 00
> Fax: 08-754 97 30
> daniel.sparr...@exist.se
> http://www.existgruppen.se
> Posthusgatan 1 761 30 NORRTÄLJE
>
> -----"ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote: -----
>
> To: ADSM-L@VM.MARIST.EDU
> From: Rick Adamson <rickadam...@winn-dixie.com>
> Sent by: "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
> Date: 09/27/2011 18:02
> Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary
> pool
>
> Interesting. Every VTL based solution, including Data Domain, that I looked
> at had limits on the number of drives that could be emulated which were
> nowhere near a hundred, let alone a thousand. Perhaps it's time to revisit
> this.
>
> The license is a Data Domain fee, and a hefty one at that.
>
> The bigger question I have is, since file based storage is native to TSM,
> why exactly is using file based storage not supported?
>
> ~Rick
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of
> Daniel Sparrman
> Sent: Tuesday, September 27, 2011 10:30 AM
> To: ADSM-L@VM.MARIST.EDU
> Subject: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool
>
> Not really sure where the general idea that a VTL will limit the number of
> available mount points comes from.
>
> I'm not familiar with Data Domain, but generally speaking, the number of
> virtual tape drives configured within a VTL is usually in the thousands.
> Not sure why you'd want that many though; I always prefer having a small
> diskpool in front of whatever sequential pool I have, and let the bigger
> files pass the diskpool and go straight to the seq. pool.
>
> As far as LAN-free goes, the only available option I know of is SANergy. And
> going down that road (concerning both price & complexity) will probably make
> the VTL look cheap.
>
> Not sure what kind of licensing you're talking about concerning VTL, but I
> assume it's a Data Domain license and not a TSM license?
>
> Best Regards
>
> Daniel Sparrman
>
> Daniel Sparrman
> Exist i Stockholm AB
> Switchboard: 08-754 98 00
> Fax: 08-754 97 30
> daniel.sparr...@exist.se
> http://www.existgruppen.se
> Posthusgatan 1 761 30 NORRTÄLJE
>
> -----"ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote: -----
>
> To: ADSM-L@VM.MARIST.EDU
> From: Rick Adamson <rickadam...@winn-dixie.com>
> Sent by: "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
> Date: 09/27/2011 16:52
> Subject: Re: [ADSM-L] vtl versus file systems for pirmary pool
>
> A couple of things that I did not see mentioned here, which I experienced:
> for Data Domain the VTL is an additional license, and it does limit the
> available mount points (or emulated drives), where a TSM file based pool
> does not. Like Wanda stated earlier, it depends on what you can afford!
>
> I myself have grown fond of using the file based approach: easy to
> manage, easy to configure, and never a worry about an available tape drive
> (virtual or otherwise). The LAN-free issue is something to consider, but
> from what I have heard lately it can still be accomplished using
> file based storage. If anyone has any info on it I would appreciate
> it.
>
> ~Rick
> Jax, Fl.
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of
> Tim Brown
> Sent: Monday, September 26, 2011 4:05 PM
> To: ADSM-L@VM.MARIST.EDU
> Subject: [ADSM-L] vtl versus file systems for pirmary pool
>
> What advantage does VTL emulation on a disk primary storage pool have
> as compared to disk storage pool that is non vtl?
>
> It appears to me that a non vtl system would not require the daily
> reclamation process and also allow for more client backups to occur
> simultaneously.
>
> Thanks,
>
> Tim Brown
> Systems Specialist - Project Leader
> Central Hudson Gas & Electric
> 284 South Ave
> Poughkeepsie, NY 12601
> Email: tbr...@cenhud.com
> Phone: 845-486-5643
> Fax: 845-486-5921
> Cell: 845-235-4255
>
> This message contains confidential information and is only for the
> intended recipient. If the reader of this message is not the intended
> recipient, or an employee or agent responsible for delivering this
> message to the intended recipient, please notify the sender immediately
> by replying to this note and deleting all copies and attachments.