Hello Stefan,

I do not think you can use client compression and de-duplication together: de-duplication skips all data that was compressed by the TSM client or the TSM server. By the way, the de-duplication process does try to reduce files that were compressed by the operating system (for example, *.Z, *.zip, AIX system images, etc.), although such data rarely contains much duplicate material.
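If you want to double-check both settings from the administrative client, something like the following macro shows them (a sketch only; the node name STEFAN_NODE and pool name FILEPOOL are placeholders, not your actual names):

    query node STEFAN_NODE format=detailed      /* check the "Compression" field    */
    query stgpool FILEPOOL format=detailed      /* check "Deduplicate Data?"        */
    update node STEFAN_NODE compression=no      /* force client compression off so  */
                                                /* the pool sees uncompressed data  */

The UPDATE NODE step only matters if the node was registered with COMPRESSION=YES or left to the client's choice; with compression off, the identify process has raw data to chunk.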
Regards,

Grigori G. Solonovitch
Senior Technical Architect, Information Technology
Bank of Kuwait and Middle East
http://www.bkme.com
Phone: (+965) 2231-2274  Mobile: (+965) 99798073
E-Mail: g.solonovi...@bkme.com

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ads...@vm.marist.edu] On Behalf Of Stefan Folkerts
Sent: Saturday, August 29, 2009 10:24 AM
To: ADSM-L@VM.MARIST.EDU
Subject: [ADSM-L] Seeking wisdom on dedupe..filepool file size client compression and reclaims

TSM gurus of the world,

I am toying around with a new TSM server we have, and I am pondering some options I would like your thoughts on. I am setting up a 6.1.2.0 TSM server with a filepool only, planning on using deduplication.

When I set up a filepool I usually choose a fairly small volume size, 10G to maybe ~20G depending on the expected size of the TSM environment. I do this because if a 100G volume is full and starts expiring, reclamation won't occur for a while, and that leaves up to 49% (49GB) of the volume space useless and wasted. So I set up 10G volumes in our shop (very small server) and just accept that I have a lot of volumes; no problem, TSM can handle a lot of volumes. (A device-class sketch of this setup follows below.)

Now I am thinking: dedupe only takes effect when you move the data on the volumes or reclaim them, but 10G volumes might not get reclaimed for a LONG time. Since they contain so little data, the chance of one getting reclaimed, and thus deduplicated, is relatively smaller than for a 100G volume.

As an example: I migrated all the data from our old 5.5 TSM server to the new one using an export node command. Once it was done, I scripted a move data for all the volumes (sketched below) and went from 0% to 20% dedupe savings in 8 hours. If I had let TSM handle this, it would have taken a LONG time to get there.

If I do a full Exchange backup, I fill 10 volumes with data. Identify will mark data on them for deduplication, but it won't have any effect at all, since the data will expire before the volumes are reclaimed. This full Exchange backup happens every week and is held for 1 month, which means the bulk of my data gets no benefit from deduplication with this setup. Or am I missing something here? :)

So I am thinking: with a 10G volume being filled above the reclaim threshold so easily, and therefore missing the dedupe action, what should one do? I would almost consider a query nodedata script that identifies the Exchange node's data and moves it around for some dedupe action (also sketched below).

Also, client compression: does anybody have any figures on how this affects the effectiveness of deduplication? Both are of interest in a filepool; if deduplication works just as well in combination with compression, that would be great.

Regards,
Stefan
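For reference, the small-volume setup described above maps onto a FILE device class roughly like this (a sketch; the directory, pool name, scratch count, and process count are assumptions, not Stefan's actual values):

    define devclass FILECLASS devtype=file maxcapacity=10G -
       mountlimit=50 directory=/tsm/filepool
    define stgpool FILEPOOL FILECLASS maxscratch=500 -
       deduplicate=yes identifyprocess=2 reclaim=50

With RECLAIM=50, a full volume only becomes eligible for reclamation once more than half of its contents have expired, which is where the "up to 49GB wasted on a 100G volume" arithmetic above comes from; a 10G volume caps that dead space at roughly 4.9GB per volume.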
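The post-migration "move data" pass mentioned above can be driven by a small shell loop around dsmadmc. A sketch, assuming a Unix server, a placeholder admin ID ADMIN, and the hypothetical pool FILEPOOL (the 5.5 data having arrived via something like EXPORT NODE ... TOSERVER=):

    #!/bin/sh
    # Rewrite every full FILE volume so that chunks already flagged by
    # IDENTIFY DUPLICATES are actually removed during the data movement.
    for vol in $(dsmadmc -id=ADMIN -password=secret -dataonly=yes \
        "select volume_name from volumes where stgpool_name='FILEPOOL' and status='FULL'")
    do
        dsmadmc -id=ADMIN -password=secret "move data $vol wait=yes"
    done

This is the manual equivalent of what reclamation would eventually do; it simply refuses to wait for the reclaim threshold.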
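As for the weekly Exchange fulls that expire before reclamation ever touches them, the query nodedata idea above could look like the following macro (the node name EXCHANGE_NODE is hypothetical, and VOLUME_NAME must be filled in from the query output):

    identify duplicates FILEPOOL numprocess=2        /* keep duplicate chunks flagged      */
    query nodedata EXCHANGE_NODE stgpool=FILEPOOL    /* list the volumes holding that node */
    move data VOLUME_NAME wait=yes                   /* re-drive each volume on the list   */

Whether the dedupe savings justify the extra I/O of moving a month's worth of Exchange volumes every week is the open question.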