Hi Bob,

Considering the amount of data, I presume this is not just a few files being backed up but a huge number of them, since hardly any system out there holds 50 PB in only a few files.
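To get a feel for how many objects that could be, here is a rough sketch. The average object sizes are purely illustrative assumptions (nothing is known yet about the customer's data), and the helper name `object_count` is made up for this example.

```python
PB = 10**15  # decimal petabyte in bytes


def object_count(total_bytes: int, avg_object_bytes: int) -> int:
    """Objects needed to reach total_bytes at a given average size, rounded up."""
    return -(-total_bytes // avg_object_bytes)  # integer ceiling division


total = 50 * PB

# Assumed average object sizes, for illustration only.
for label, size in [("1 MB", 10**6), ("100 MB", 10**8), ("10 GB", 10**10)]:
    print(f"average object {label:>6}: {object_count(total, size):,} objects")
```

Even at an average of 10 GB per object this comes to millions of objects, and at 1 MB it comes to tens of billions, so the object count matters as much as the raw capacity.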
You will also have to consider what impact storing 50 PB of data will have on the TSM database. We recently hit the maximum database size at a customer location, which is 524 GB (I don't remember the exact size, but you'll notice when TSM says you've reached the maximum ;)). Considering this, you'll probably be using more than one instance, spreading both data and database load across multiple servers.

Cartridge handling will also be a problem, since storing petabytes of data requires a lot of cartridges. That creates a need for fast mounts and dismounts and a fairly large number of tape drives, both for backing up and restoring data and for keeping the internal TSM housekeeping running smoothly.

Another approach is to look at enterprise VTL technology, which eliminates cartridge handling and the need for a large number of tape drives. As far as I know, though, there is no VTL that can store that amount of data in a single system without deduplication.

That brings up another question: what are the customer's expectations / regulations when it comes to disaster recovery?

It would be easier to point out the issues involved if some of the questions you already came up with had been answered ;)

Best Regards
Daniel Sparrman

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Bob Talda
Sent: 23 January 2008 21:16
To: ADSM-L@VM.MARIST.EDU
Subject: Seeking thoughts/experiences on backing up large amounts (say 50 Petabytes) of data

Folks:

Our group has been approached by a customer who asked if we could back up/archive 50 petabytes of data. And yes, they are serious. We've begun building questions for the customer, but as this is roughly 1000 times the amount of data we currently back up, we are on unfamiliar turf here. At a high level, here are some of the questions we are asking:

1) Is the 50 petabytes an initial or envisioned data size?
   If envisioned, how big is the initial data load and how fast will it grow?
2) What makes up the data: databases, video/audio files, other? (subtext: how many objects are involved? What are the opportunities to compress/deduplicate?)
3) How is the data distributed: over a number of systems or from a supercluster?
4) Is the data static, changing slowly, or changing rapidly? (subtext: is it a backup or archive scenario?)
5) What are the security requirements?
6) What are the restore (aka RTO) requirements?

We are planning on approaching vendors to get some sense of the probable data center requirements (cooling, power, footprint). If anyone in the community has experience with managing petabytes of backup data, we'd appreciate any feedback we could incorporate.

Thanks in advance!
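The sizing concerns raised in this thread (database size per instance, cartridge counts, restore time) can be put into rough numbers once some of the questions above are answered. The sketch below is only a back-of-the-envelope aid under stated assumptions: the 524 GB database ceiling is the approximate figure quoted in the reply, while the 500 bytes of database space per object, the 800 GB cartridge capacity, the object count, and the 30-day restore window are placeholder inputs chosen for illustration, not values from the customer or from TSM documentation. All function names are made up for this example.

```python
PB = 10**15  # decimal units throughout
GB = 10**9


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def instances_needed(objects: int, db_bytes_per_object: int, max_db_bytes: int) -> int:
    """Lower bound on server instances if every database is filled to its maximum."""
    return ceil_div(objects * db_bytes_per_object, max_db_bytes)


def cartridges_needed(total_bytes: int, cartridge_bytes: int) -> int:
    """Cartridges for one copy of the data, ignoring compression and reclamation."""
    return ceil_div(total_bytes, cartridge_bytes)


def restore_rate_gb_per_s(total_bytes: int, rto_hours: float) -> float:
    """Aggregate sustained throughput needed to restore everything within the RTO."""
    return total_bytes / (rto_hours * 3600) / GB


total = 50 * PB
objects = 50_000_000_000     # ASSUMPTION: average object size of 1 MB
db_bytes_per_object = 500    # ASSUMPTION: placeholder, measure on a real server
max_db = 524 * GB            # approximate ceiling quoted in the reply
cartridge = 800 * GB         # ASSUMPTION: native capacity of the chosen media
rto_hours = 30 * 24          # ASSUMPTION: full restore allowed to take 30 days

print("instances (lower bound):", instances_needed(objects, db_bytes_per_object, max_db))
print("cartridges per copy:", cartridges_needed(total, cartridge))
print(f"restore rate: {restore_rate_gb_per_s(total, rto_hours):.1f} GB/s")
```

Running a database right at its ceiling is not realistic, so the instance count is a floor rather than a design target; an off-site copy for disaster recovery would double the cartridge count.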