Greetings,

The size of the volumes can be very relevant. Any of us who has managed a large virtual tape environment has run into this same issue. If you have 10TB in your pool, it is tempting to define 100 volumes of 100GB each. That might not cause any problem if you have a relatively small number of clients backing up at once and they don't have enough data to overwhelm your pool, but it is not hard at all to accidentally design your solution so that within a couple of weeks all 100 of your volumes are "full" and you can't back up any new data, even though in reality they are all only about 50% full and can't be reused. 1000 volumes of 10GB each would have worked better, because in reality data doesn't expire evenly across a volume; some of those smaller volumes would have been reclaimed well before the others. And if you have client files that are larger than your volume size, some of your volumes will have all their data expire at once and never need to be reclaimed at all.
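To make that concrete, here is a rough back-of-the-envelope simulation in Python. The client count, the per-client backup sizes, the 50% expiration rate, and the 60% reclamation threshold are all made-up numbers for illustration, not figures from our environment:

import random

POOL_GB = 10 * 1024          # the 10TB pool from the example above
N_CLIENTS = 200              # assumption: 200 clients, ~50GB each on average
RECLAIM_THRESHOLD = 0.60     # assumption: reclaim once 60% of a volume's data is dead

def fill_pool(volume_gb, seed=1):
    """Fill volumes sequentially; each client's data lands on a contiguous
    run of volumes, so it expires together rather than uniformly at random."""
    random.seed(seed)
    volumes = [[] for _ in range(POOL_GB // volume_gb)]   # each entry: (client, gb)
    free = [float(volume_gb)] * len(volumes)
    v = 0
    for client in range(N_CLIENTS):
        data = random.uniform(10, 90)                     # GB sent by this client
        while data > 1e-9 and v < len(volumes):
            chunk = min(data, free[v])
            volumes[v].append((client, chunk))
            free[v] -= chunk
            data -= chunk
            if free[v] <= 1e-9:
                v += 1
    return volumes

def after_expiration(volume_gb, frac_expired=0.5, seed=1):
    """Expire half the clients at random (a whole client's data goes at once)
    and see what happens to the volumes."""
    volumes = fill_pool(volume_gb, seed)
    random.seed(seed + 1)
    expired = set(random.sample(range(N_CLIENTS), int(N_CLIENTS * frac_expired)))
    scratch = stuck = 0
    for vol in volumes:
        used = sum(gb for _, gb in vol)
        if used == 0:
            continue
        dead = sum(gb for c, gb in vol if c in expired)
        if dead >= used - 1e-9:
            scratch += 1          # everything on it expired: straight back to scratch
        elif dead > 0 and dead / used < RECLAIM_THRESHOLD:
            stuck += 1            # dead space trapped below the reclaim threshold
    print(f"{volume_gb:>4}GB volumes: {scratch} go straight back to scratch, "
          f"{stuck} stuck holding dead data below the threshold")

for size in (100, 10):
    after_expiration(size)

The point of the toy model is simply that when a client's data expires all at once, small volumes are much more likely to empty out completely and return to scratch, while large volumes tend to get stranded with a mix of live and dead data.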
I have tried building virtual tape environments using various sizes, and smaller is better, up to a point. We use 50GB volumes in our environment because we have 60-70TB virtual tape libraries with 1600 clients. It would probably not hurt to use even smaller volumes, but volumes smaller than 10GB start to hit a point of diminishing returns. In a virtual tape environment you also have to think about how many simultaneous tape mounts you have going on; with 100+ simultaneous mounts you can run into problems and may have to bump up LIBSHRTIMEOUT in dsmserv.opt if you are using library sharing. In a file-type storage pool you don't have that concern.

Best Regards,

John D. Schneider
The Computer Coaching Community, LLC
Office: (314) 635-5424 / Toll Free: (866) 796-9226
Cell: (314) 750-8721

-------- Original Message --------
Subject: Re: [ADSM-L] Seeking wisdom on dedupe..filepool file size client compression and reclaims
From: "Allen S. Rout" <a...@ufl.edu>
Date: Sat, August 29, 2009 6:57 pm
To: ADSM-L@VM.MARIST.EDU

>> On Sat, 29 Aug 2009 09:24:11 +0200, Stefan Folkerts
>> <stefan.folke...@itaa.nl> said:

> Now I am thinking, dedupe only occurs when you move data off the volumes
> or reclaim them, but 10G volumes might not get reclaimed for a LONG
> time since they contain so little data; the chance of that getting
> reclaimed and thus deduplicated is relatively smaller than that
> happening on a 100G volume.

I think that, to a first approximation, the size of the volume is irrelevant to the issues you're discussing here.

Do a gedankenexperiment: Split 100TB into 100G vols, and into 10G vols. Then randomly expire data from them. What you'll have is a bunch of volumes ranging from (say) 0% to 49% reclaimable. You will reclaim your _first_ volume a skewtch sooner in the 10G case. But on the average, you'll reclaim 500G of space in about the same number of days. Or said differently: in a week you'll reclaim about the same amount of space in each case.

I need to publish a simulator.

So pick volume sizes that avoid being silly in any direction.

- Allen S. Rout
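P.S. Since Allen mentions wanting to publish a simulator, here is a very rough sketch of his thought experiment in Python. The pool size, the 60% reclamation threshold, and the assumption that every gigabyte expires on an independent, uniformly random day are all invented for illustration, and the outcome is quite sensitive to them:

import random

def simulate(volume_gb, pool_gb=100_000, threshold=0.60, days=60, seed=7):
    """Uncorrelated expiration: every GB expires on an independent random day.
    A volume is reclaimed the first day the expired share of its data reaches
    `threshold`; the expired GB count as space recovered (live data is assumed
    to be consolidated onto other volumes)."""
    random.seed(seed)
    n_vols = pool_gb // volume_gb
    # one expiry day per GB of data on each volume
    vols = [[random.randint(1, days) for _ in range(volume_gb)] for _ in range(n_vols)]
    recovered = 0
    first_reclaim = None
    weekly = {}
    for day in range(1, days + 1):
        survivors = []
        for vol in vols:
            dead = sum(1 for d in vol if d <= day)
            if dead / len(vol) >= threshold:
                recovered += dead
                if first_reclaim is None:
                    first_reclaim = day
            else:
                survivors.append(vol)
        vols = survivors
        if day % 7 == 0:
            weekly[day] = recovered
    return first_reclaim, weekly

for size in (100, 10):
    first, weekly = simulate(size)
    print(f"{size:>3}G volumes: first volume reclaimed on day {first}; "
          f"cumulative GB recovered by day 28/42/56: "
          f"{weekly[28]}/{weekly[42]}/{weekly[56]}")

It lets you compare when the first volume comes back and how quickly the cumulative totals for the two sizes converge, which is where the two views in this thread differ.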