Here's an article on the odds of a collision occurring in a hash-only environment, for anyone who is interested in more information.
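For a rough sense of the numbers being argued about in this thread, here is a minimal sketch of the standard birthday-bound approximation for accidental collisions. It assumes an ideal 160-bit hash (the SHA-1 digest size) with uniformly distributed outputs and no deliberate attack; the function name `collision_probability` and the chunk counts are illustrative only and do not come from TSM or any dedupe appliance. Note that a deduplicating store hashes chunks, so the relevant count may be much larger than the number of backed-up objects.

```python
import math


def collision_probability(n_chunks: int, hash_bits: int = 160) -> float:
    """Approximate chance of at least one accidental collision among n_chunks
    hashes, using the birthday bound p ~= 1 - exp(-n(n-1) / 2^(bits+1)).

    Assumes an ideal hash with uniformly distributed outputs.
    """
    exponent = -n_chunks * (n_chunks - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # Hypothetical chunk counts, from a million up to a quadrillion.
    for n in (10**6, 10**9, 10**12, 10**15):
        print(f"{n:.0e} chunks -> p = {collision_probability(n):.3e}")
```

Under these assumptions the estimate for a trillion chunks is on the order of 10^-25. The probability is tiny but never zero at any count, which is consistent with both sides of the discussion: it is a random chance rather than a quota, and it is very small for a 160-bit hash.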
-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Daniel Sparrman
Sent: Thursday, October 06, 2011 7:19 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

I have customers who, during an audit, have several times seen the object count go from 0 to over 2 billion and then start counting backwards with a "-" sign (that was during TSM 5.5). So no, it's not bullocks.

I did, however, mean "million" (as in several thousands of millions, i.e. several billion), so that was a mistake on my side. Some of those customers also hit the technical limit for the database size in 5.5 (524 GB) on several of their TSM instances, and so have even more TSM instances today.

Sorry for the mistake of saying "billion" and not "million".

As for how many objects they actually have in each TSM instance, it's fairly hard to tell, since it isn't possible to run, for example, a select on contents to count the number of objects. Those kinds of SQL statements just hang.

And like I said, during the last audit we did on one of the TSM instances, it went up to 2100000000 objects and then started counting backwards several times, so we actually have no clue about the exact number of objects in that database.

Regards

Daniel

Daniel Sparrman
Exist i Stockholm AB
Switchboard: 08-754 98 00
Fax: 08-754 97 30
Posthusgatan 1
761 30 NORRTÄLJE

-----"ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote: -----
To: ADSM-L@VM.MARIST.EDU
From: Ben Bullock
Sent by: "ADSM: Dist Stor Manager"
Date: 10/06/2011 16:11
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

Ok, I have been following this thread with some interest, since I have a dedupe appliance. From the conversation, I've come to the conclusion that Daniel is a very cautious administrator who would like to eliminate any risk of data loss. Don't we all; it's a noble and worthwhile endeavor.
All the discussed options are worthwhile if you are concerned about hash collisions (copypools, async replication, reuse delay, etc.). At some point in the pursuit, you reach diminishing returns, and it is not worth the money to eliminate the next .01% probability of failure. Everyone will have a different stopping point. We get it. I think we have beaten this horse within an inch of its life.

But I gotta ask... Daniel, you said "but I've got several TSM customers who have several thousands of billions of objects". Are you telling us that someone has a TSM server with multiple ~TRILLIONS~ of objects backed up? Is that hyperbole or truth?

Ben

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Daniel Sparrman
Sent: Wednesday, October 05, 2011 3:26 PM
To: ADSM-L@VM.MARIST.EDU
Subject: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

Hi Remco

Not sure if you're talking about hardware de-dup or TSM de-dup (which is using a larger block size due to the load) but:

Relatively small? I've only seen it happen once, but then I live in a relatively small market, since I live in Sweden. So you're telling me (based on facts) that this hasn't happened elsewhere? I seriously have to disagree. In my opinion, it's more likely that others who had this issue have decided to keep it in the dark. Sweden is a relatively small market, and the odds that it would have happened here, but nowhere else, are quite small.

Not sure about the size or anything in your TSM comparison, but I've got several TSM customers who have several thousands of billions of objects ...

And like I said, even if it's a chance of 1 in 1,000,000,000,000, it can still hit you at 1,000,000 objects. It's not a quota that needs to be filled before it hits you. It's a random chance.
And, like the customer I had who got it, if it's a very common block getting that hash conflict, yes, it will hit you badly, since every file that contains that block will be invalid.

I do agree with your comment about TSM v6 though. I'd consider it very stable; I'd actually (today, with the amount of checking being done) consider it more stable than still being at version 5.5.

Regards

Daniel

Daniel Sparrman
Exist i Stockholm AB
Switchboard: 08-754 98 00
Fax: 08-754 97 30
Posthusgatan 1
761 30 NORRTÄLJE

-----"ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote: -----
To: ADSM-L@VM.MARIST.EDU
From: Remco Post
Sent by: "ADSM: Dist Stor Manager"
Date: 10/05/2011 21:11
Subject: Re: [ADSM-L] vtl versus file systems for pirmary pool

Hi,

I saw last week that about half of the people visiting the TSM Symposium were running V6; it's been stable for me so far.

The likelihood of an accidental SHA1 hash collision is relatively small; even compared to the total number of objects that a TSM server could possibly ever store during its entire lifetime, it is insignificant. That being said, if you think that your data is too valuable to even risk that, don't dedup.

--
Regards, Remco

On 5 Oct 2011, at 19:24, Shawn Drew <> wrote:

> Along this line, we are still using TSM5.5. Some of the features
> mentioned previously require TSM6. TSM6 still feels risky to me.
> Maybe more risky than a hash collision.
> Just looking for a consensus: do people think it's mature enough now
> that it is as stable/reliable as TSM5?
>
> PS. Test restores are the only way to be sure your backups are good.
> You shouldn't just "trust" TSM.
>
> Regards,
> Shawn
> ________________________________________________
> Shawn Drew
>
> Sent by: ADSM-L@VM.MARIST.EDU
> 10/05/2011 11:03 AM
> Please respond to: ADSM-L@VM.MARIST.EDU
> To: ADSM-L
> Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang:
> Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl
> versus file systems for pirmary pool
>
>> When TSM is duplicating your data (aka backing up storage pools),
>> there is no logical connection between your primary storage pool and
>> your copypool.
>
> Well . . . yes . . . no . . .
>
> All our eggs are in one basket no matter what. The logical connection
> between pri and copy pools is TSM itself. A logical corruption in TSM
> can take out both. Your data could be sitting there on tape and
> completely useless. Yes, that's why we have TSM db backups, but are
> they good? What if there is a TSM bug that renders all your backups
> bad - we don't find out until we need it!
>
> At some point you have to trust something. We all trust TSM. Yes, we
> do the db backup, create pri and copy pools, use reuse delay . . .
> everything to allow for problems . . . but we are still trusting that
> TSM works as advertised. A really, really paranoid person would run two
> completely separate/different backup systems - but who can afford that,
> or want to?
> But then, we do do that for our biggest SAP/Oracle systems. We use
> Oracle/RMAN-to-flasharea/RMAN-to-TDPO/TSM, but we also run EMC/clone
> backups off our DR sites R2's . . . but also to TSM.
>
> Rick