TSM gurus of the world,

I am toying around with a new TSM server we have, and I am pondering some options and would like your thoughts on them. I am setting up a TSM 6.1.2.0 server with a filepool only, and I plan to use deduplication.
When I set up a filepool I usually make the volumes fairly small: 10G, maybe ~20G, depending on the expected size of the TSM environment. I do this because when a 100G volume is full and its data starts expiring, reclamation won't occur for a while, and until it does up to 49% (49GB) of the volume space is useless and wasted. So I set up 10G volumes in our shop (very small server) and just accept the fact that I have a lot of volumes. No problem, TSM can handle a lot of volumes.

Now I am thinking: dedupe only takes effect when you move data off the volumes or reclaim them, but 10G volumes might not get reclaimed for a LONG time. Since they contain so little data, the chance of one being reclaimed, and thus deduplicated, is smaller than for a 100G volume.

As an example, I migrated all the data from our old 5.5 TSM server to the new one using an export node command. Once it was done I scripted a move data for all the volumes, and I went from 0% to 20% dedupe savings in 8 hours. If I had let TSM handle this, it would have taken me a LONG time to get there.

If I do a full Exchange backup I fill 10 volumes with data. Identify will mark data on them for deduplication, but it won't have any effect, since the data will expire before the volumes are reclaimed. This full Exchange backup happens every week and is kept for 1 month. That means the bulk of my data gets no benefit from deduplication with this setup, or am I missing something here? :)

So I am thinking: with a 10G volume filling up so easily and then sitting untouched until it crosses the reclaim threshold, and therefore missing the dedupe action, what should one do? I would almost consider a query nodedata script that identifies the Exchange node's data and moves it around for some dedupe action.

Also, client compression: does anybody have any figures on how this affects the effectiveness of deduplication? Both are of interest in a filepool, and if deduplication works just as well in combination with compression, that would be great.

Regards,
Stefan
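
P.S. To put rough numbers on the volume-size argument, here is a small back-of-the-envelope sketch. It is not taken from a real configuration: the function name is made up for illustration, and the 50% reclaim threshold is only an assumption inferred from the 49% figure above.

```python
# Assumption: a volume becomes eligible for reclamation only once its
# reclaimable space reaches the threshold, so the worst case is one
# percentage point below it. The 50% default is assumed, not read from
# an actual storage pool definition.

def stranded_gb(volume_gb, reclaim_threshold_pct=50):
    """Worst-case expired space (GB) stuck on one volume that is not yet reclaimable."""
    return volume_gb * (reclaim_threshold_pct - 1) / 100


# Compare the volume sizes mentioned in this post.
for size in (10, 20, 100):
    print(f"{size:>4} GB volume: up to {stranded_gb(size):.1f} GB stranded")
```

The worst-case fraction is the same for every volume size; what changes is how large each stranded chunk is, and how much data has to expire on a single volume before it gets touched again.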