Ditto on Lindsay's "it depends".

For my NetApp devices, observed NAS filesystem dedupe ranges from 10% to 70% depending on the data. VMware NFS shares typically show a good ratio. For our VM environment, we split the OS apart from data and paging space, as shown below:

Filesystem               used         saved        %saved
/vol/PROD_VM_OS/         98314436     227793716    70%
/vol/PROD_VM_PAGING/     3107084      1090756      26%
/vol/PROD_VM_DATA1/      11253900     17343096     61%
/vol/DR_VM_OS1/          105852808    236518940    69%
/vol/DR_VM_DATA1/        431134632    216285060    33%
/vol/DR_VM_PAGING1/      35520        4272         11%
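As a sanity check on the table above, the %saved column appears to be saved / (used + saved), that is, the share of the pre-dedupe data that dedupe removed. The following is a minimal Python sketch under that assumption; the formula and the helper name `percent_saved` are inferred from the numbers, not taken from NetApp documentation.

```python
# Rows copied from the table above: (volume, used, saved).
ROWS = [
    ("/vol/PROD_VM_OS/", 98314436, 227793716),
    ("/vol/PROD_VM_PAGING/", 3107084, 1090756),
    ("/vol/PROD_VM_DATA1/", 11253900, 17343096),
    ("/vol/DR_VM_OS1/", 105852808, 236518940),
    ("/vol/DR_VM_DATA1/", 431134632, 216285060),
    ("/vol/DR_VM_PAGING1/", 35520, 4272),
]


def percent_saved(used: int, saved: int) -> int:
    """Rounded percentage of the pre-dedupe data that dedupe removed.

    Assumption: pre-dedupe size = used + saved.
    """
    logical = used + saved
    if logical == 0:
        return 0
    return round(100 * saved / logical)


for volume, used, saved in ROWS:
    print(f"{volume:<24} {percent_saved(used, saved):>3}%")
```

Under that assumption the computed values match the reported column for all six volumes.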
The paging space is very dynamic and I don't expect much savings. The OS space (where VM operating systems are installed) is relatively static and redundant, and its high dedupe ratios reflect that. The data space (where applications and everything else live) shows a wide variance, as expected. The end result is that I am saving disk space and actually improving overall performance, because redundant data has a higher probability of residing in cache, and the reference to a particular bit of redundant data has a higher probability of residing in the cached lookup table.

If you are looking for dedupe on tape media, I don't think it is feasible or desirable. Simple compression now allows me to put nearly 3TB on a single 3592 tape (again, depending on the data). At a nominal cost of $150/tape, this works out to about 5 cents/GB. Not too shabby. I make a second offsite copy of the same data, resulting in an overall cost of 10 cents/GB to provide +"five nines" probability that my company's data is recoverable for the next 6 years. This is less than the cost of electricity for disk-based storage over the same time period.

Dedupe has its place, as do most technologies. It is not a golden egg unless you force it to be ... and then, when it hatches, it may be a fine goose or it may be a platypus - it depends on your environment.

Cheers,
Neil Strand
Storage Engineer - Legg Mason
Baltimore, MD.
(410) 580-7491
Whatever you can do or believe you can, begin it.
Boldness has genius, power and magic.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ads...@vm.marist.edu] On Behalf Of Ochs, Duane
Sent: Thursday, June 25, 2009 7:35 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Dedupe

In common practice, de-dup is not a tape-oriented process. It is usually used to reduce data on disk. One concern would be the number of tape mounts required to restore data in a DR scenario. As the article stated, there are not many "global" de-dup products yet.
We have been able to implement some de-dup on specific applications, for instance e-mail attachments, and it has worked out fairly well. However, the primary goal was to reduce the size of the Storage Groups on our Exchange cluster, which is on tier 1 storage, in the event of a DR scenario. The de-duped attachments are now on tier 2. It reduced our SGs by 1/3. The Exchange SG backups are retained based on legal requirements and replicated. The attachments are not.

I also tested Data Domain and was very unimpressed by the numbers I saw. It had very little impact on our largest amounts of data: imaging, Exchange and DB dumps. But those are also the hardest types of data to de-dup.

My two cents.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ads...@vm.marist.edu] On Behalf Of madunix
Sent: Wednesday, June 24, 2009 11:37 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: Dedupe

My thoughts on dedupe: it could be interesting for those who need to decrease the number of tape cartridges, but they could face significant CPU and I/O requirements for dedupe processing. One issue I was thinking about is a failure, or one part becoming corrupted: many files would be affected by the loss of a common chunk. And what about encryption; is dedupe compatible with it?

Thanks
madunix

>> -----Original Message-----
>> From: ADSM: Dist Stor Manager [mailto:ads...@vm.marist.edu] On Behalf
>> Of lindsay morris
>> Sent: Wednesday, June 24, 2009 1:07 PM
>> To: ADSM-L@VM.MARIST.EDU
>> Subject: Re: [ADSM-L] Dedupe
>>
>> Short and clear answer about de-dupe:
>>
>> It depends.
>>
>> Hope this helps.
>>
>> ------
>> Mr. Lindsay Morris
>> Principal
>> www.tsmworks.com
>> 919-403-8260
>> lind...@tsmworks.com
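For reference, the tape cost arithmetic in Neil Strand's message at the top of the thread can be reproduced with a short Python sketch. It assumes "nearly 3TB" is taken as 3000 GB (decimal units) and that the second offsite copy simply doubles the media cost; the names `TAPE_CAPACITY_GB` and `cost_per_gb` are illustrative only.

```python
TAPE_COST_USD = 150.0      # nominal cost per 3592 tape, from the message
TAPE_CAPACITY_GB = 3000.0  # "nearly 3TB" with compression, taken as 3000 GB (assumption)
COPIES = 2                 # primary copy plus one offsite copy


def cost_per_gb(tape_cost: float, capacity_gb: float, copies: int = 1) -> float:
    """Media cost in USD per GB of data stored, across all copies."""
    return copies * tape_cost / capacity_gb


single = cost_per_gb(TAPE_COST_USD, TAPE_CAPACITY_GB)
dual = cost_per_gb(TAPE_COST_USD, TAPE_CAPACITY_GB, COPIES)

print(f"one copy:   {single * 100:.0f} cents/GB")
print(f"two copies: {dual * 100:.0f} cents/GB")
```

Because the real capacity depends on how well the data compresses, the per-GB figure moves with it; the sketch covers media cost only, not drives, library slots or offsite handling.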