Data Deduplication

2007-08-26 Thread Hughes, George
Is TSM planning on adding data deduplication similar to avamar? I
understand how TSM does not duplicate data now but minor edits in files
or simple file name changes would result in additional copies of the
entire file using TSM today.

We recently had a pitch from EMC on avamar. I can think of some reasons
to pass on it (Having two separate backup/restore solutions is a big
one, cost etc) but some persuasive arguments were made supporting their
solution. If TSM is going to be adding similar functionality soon it may
be another reason to focus on other efforts.



George Hughes
Senior UNIX Engineer
Children's National Medical Center
12211 Plum Orchard Dr.
Silver Spring, MD 20904
(301) 572-3693





Re: Data Deduplication

2007-08-26 Thread Richard Sims

On Aug 26, 2007, at 4:58 AM, Hughes, George wrote:


> Is TSM planning on adding data deduplication similar to avamar? I
> understand how TSM does not duplicate data now but minor edits in
> files or simple file name changes would result in additional copies
> of the entire file using TSM today.


Except in Windows, where Adaptive Subfile Backup may be employed.
That's as far as it has gone in the product thus far.

Richard Sims


Re: Data Deduplication

2007-08-26 Thread Fred Johanson
But it is being pursued for a future release, after the conversion of
the DB to DB2.





FW: TSM Server Sizing

2007-08-26 Thread Lamb, Charles P.
> Hi,
> 
> We are struggling to size a future TSM system.  Are there any
> programs that would help with sizing a future TSM server?  We predict
> that our TSM system will see 3TB of daily data change in 2010.  We
> predict that the library will need about 200TB of tape capacity.  Our
> off-site storage should also have 200TB of tape capacity.  Here is
> our shot at sizing the TSM server.  Is this the right size for it?
> 
IBM 9133-55A w/6-IBM 7311-D20 drawers 8-WAY 1.9GHZ w/32GB of memory. IBM
3584-L32 & 3-IBM 3584-D32 library having 24-LTO4 tape drives and about
2000 LTO4 tapes on-site and off-site.  10GB & 1GB network connections.
4TB of disk cache from IBM DS4800 w/IBM SVC system.


Re: FW: TSM Server Sizing

2007-08-26 Thread David Longo
Depending on what your main problems/issues are, I think this system
is VERY adequate, maybe even over sized.

I have a TSM server on a p630 (7028-6C4) with 4x 1.45GHz procs and
8GB RAM.  2-3TB on disk pool and the TSM DB on FASTT4300 Turbo.
Have one 7311-D20 drawer with FC adapters.

3584-L32 and (1) D52 Exp frame.  Have 12 LTO2 and 2 LTO3
drives.  Mostly LTO2 tapes, have 1100 tapes total onsite and
offsite.  Backup 2.5 TB per day and have 80TB onsite tape.
We do compression on tape.

And my system works fine, may upgrade my Disk pool later.

So, to your specs compared to mine.

1.  Tape.  If I went to LTO4 tapes, my 80TB (plus the same offsite)
would fit on probably 300 tapes.  200TB then would need say 750 tapes,
maybe 800.  Also, you shouldn't need 24 drives.  And you would need only
one D52 frame, as 24 drives would fit in just 2 frames (base and
expansion), and the tapes would definitely fit in there also.
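
The tape estimate above can be sketched as a quick calculation. This is only a rough illustration: the 800GB figure is the LTO4 native (uncompressed) cartridge capacity, while the 67% average fill and the function name `tapes_needed` are assumptions chosen to land near the numbers quoted above, not anything TSM reports. Compression is ignored.

```python
LTO4_NATIVE_GB = 800  # LTO4 native (uncompressed) capacity per cartridge


def tapes_needed(data_tb, copies=2, tape_gb=LTO4_NATIVE_GB, fill_pct=67):
    """Estimate cartridges for data_tb terabytes kept in `copies` copies.

    fill_pct is an assumed average cartridge utilization (hypothetical
    value); real utilization depends on reclamation and collocation.
    """
    total_gb = data_tb * 1000 * copies
    usable_gb_x100 = tape_gb * fill_pct
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-(total_gb * 100) // usable_gb_x100)


print(tapes_needed(80))    # onsite plus the same offsite
print(tapes_needed(200))
```

With those assumptions the results come out just under 300 and 750 tapes, in line with the estimate above. Changing `fill_pct` or `copies` shows how sensitive the count is to each.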

2.  Server.  A 4-way with 8-16GB would be very adequate.  But why do you
need (6) D20 drawers?  Even if you put 1 tape drive per FC adapter port,
the 24 drives would need 24 ports.  There are dual-port 4Gb adapters
now, and I think you can get (6) in a D20 (double check that, I'm not
completely sure of IBM specs on that).  So only (2) D20 drawers would
support 24 drives.
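
The drawer count works out as follows. This is a minimal sketch: the six-adapters-per-drawer default is the unverified figure from the paragraph above, and `drawers_needed` is just an illustrative name.

```python
import math


def drawers_needed(drives, ports_per_adapter=2, adapters_per_drawer=6,
                   drives_per_port=1):
    """Estimate I/O drawers needed to attach `drives` tape drives.

    adapters_per_drawer is an assumption; check the IBM specs for the
    real limit on a 7311-D20.
    """
    ports = math.ceil(drives / drives_per_port)
    adapters = math.ceil(ports / ports_per_adapter)
    return math.ceil(adapters / adapters_per_drawer)


print(drawers_needed(24))                          # six dual-port adapters per drawer
print(drawers_needed(24, adapters_per_drawer=2))   # if only two adapters fit per drawer
```

If bus load limits a drawer to fewer adapters, the second call shows how quickly the drawer count grows.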

24 drives - maybe you have a lot of restore requests?  Or use HSM?

You say this is for 2010?  Let's see, they will have LTO6 or so by then also.

3.  Other thoughts.  Is money any object?  If not, I may send you my resume!
(A Bunch of others on this list may too!)  Unless you are direct connecting all
devices to TSM, you also need budget for SAN switches.
And don't forget TSM licenses!  Also Disaster Recovery, High Availability
and related items.

You didn't mention much else about your environment.  How many clients,
and what type?  Are you doing backups only, or archives also?  Do you do
many restores?  Are you planning to have multiple instances of TSM
Server?  (My comments about the server being oversized might go away if
you are planning multiple instances.)

Based on what you provided though, I think most experienced folks on this
email list would agree with my comments, with only minor differences.  And
if you get the system you mentioned, you will be the envy of most out there!




David B. Longo
System Administrator
Health First, Inc.
3300 Fiske Blvd.
Rockledge, FL 32955-4305
PH  321.434.5536
Pager  321.634.8230
Fax:321.434.5509
[EMAIL PROTECTED]






Re: Data Deduplication

2007-08-26 Thread Curtis Preston
>Is TSM planning on adding data deduplication similar to avamar? 

As mentioned by Richard, the closest thing TSM has to this now is
subfile backup.  It is related to de-duplication: once it has a
backup of a given file, it backs up only the changed bytes of that file.
This is also referred to as delta incrementals.

True de-duplication takes this much further, as it would recognize a
file or email that's duplicated on two or three different systems, such
as an attachment/email that's sent to users on several different
Exchange servers.  The "compression" ratios it can achieve are therefore
much higher than those of delta incrementals.
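
To make the difference concrete, here is a toy sketch of content-based de-duplication in Python. It is not how Avamar, Puredisk, or any real product is implemented: the fixed 4096-byte chunk size, the `ChunkStore` class, and the host and path names are all hypothetical, and commercial products may chunk data quite differently. The point is only that chunks are keyed by a hash of their content, so the same data arriving from a different system or under a different file name costs nothing extra.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size


class ChunkStore:
    """Toy de-duplicating store: chunks are keyed by their SHA-256 digest."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes
        self.files = {}   # (host, path) -> ordered list of digests

    def backup(self, host, path, data):
        """Store data and return the number of new bytes actually kept."""
        digests, new_bytes = [], 0
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = chunk
                new_bytes += len(chunk)
            digests.append(digest)
        self.files[(host, path)] = digests
        return new_bytes

    def restore(self, host, path):
        """Reassemble a file from its chunk list."""
        return b"".join(self.chunks[d] for d in self.files[(host, path)])


# 16 KiB of deterministic sample data with four distinct chunks.
attachment = b"".join(hashlib.sha256(str(i).encode()).digest()
                      for i in range(512))

store = ChunkStore()
first = store.backup("exch1", "/mail/report.doc", attachment)
second = store.backup("exch2", "/mail/report.doc", attachment)
renamed = store.backup("exch1", "/mail/report-final.doc", attachment)
edited = store.backup("exch1", "/mail/report.doc", attachment + b"edit")
print(first, second, renamed, edited)
```

Only the first backup stores the full 16 KiB. The copy on the second server and the renamed copy add no new chunks, and the appended edit adds only the small trailing chunk.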

>I understand how TSM does not duplicate data now but minor edits in
>files or simple file name changes would result in additional copies
>of the entire file using TSM today.

Instead of switching from TSM to something like Avamar (EMC) or Puredisk
(Symantec), a TSM user can benefit from de-dupe today by using a
de-duplication backup target, such as a de-dupe VTL or NAS device.  Just
make sure you realize that you won't get the same de-dupe ratios as
non-TSM users.  (TSM customers who switch to a de-dupe target are seeing
approximately 10:1 de-dupe ratios, where non-TSM customers are seeing
20:1.)

Most TSM users don't do repeated full backups of their filesystems, and
a lot of the duplicated data comes from those full backups.  But TSM
users still have duplicated data: multiple versions of the same file and
database backups.   You already mentioned edited versions of the same
file.  It is also common that a file will be present in multiple places.
In addition, TSM users do perform periodic full backups of their
database data.

>We recently had a pitch from EMC on avamar. I can think of some reasons
>to pass on it (Having two separate backup/restore solutions is a big
>one, cost etc) but some persuasive arguments were made supporting their
>solution.

If you like the idea of using de-dupe to back up your remote offices
(which is what Avamar and Puredisk are designed for), but want to stay
with TSM, again de-dupe targets can help.  Buy a small de-dupe target to
place at your remote site, perform TSM backups to it, then replicate the
new/unique blocks to a central location as your offsite mechanism.

>If TSM is going to be adding similar functionality soon it may
>be another reason to focus on other efforts.

Writing a de-dupe backup product isn't easy.  EMC bought Avamar and
Symantec bought Data Center Technologies to get their respective
products. I don't know of any other de-dupe companies for IBM to
acquire, so they'll have to write their own.  That may take them a bit
longer.


Re: FW: TSM Server Sizing

2007-08-26 Thread Lamb, Charles P.
Hi...

We presently have an IBM 7026-H80 6-way 750MHz w/16GB memory and an IBM
3584-L32 & D32 w/14 LTO2 tape drives, which are directly connected.
The total number of LTO2 tapes is 800.  We back up 1.5TB per day.  We
have almost the full TSM suite of products for IBM equipment.  We are an
IBM blue data center except for an EMC Symmetrix box.  Each year, we
predict an increase of 500GB of backup data per day.  Each year, we also
predict an increase of 8TB of enterprise disk storage for our multiple
server landscapes on a couple of hundred servers.  We are implementing
the other suites of SAP R/3 (Data Warehouse, Billing, etc.) on multiple
landscapes.  We have AIX V5 and MS servers, including VMware systems,
Exchange, MS SharePoint, etc.  We are a state utility district that
serves all but a few counties of the great state of Nebraska.
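
As a quick check of the growth figures, a linear projection from 1.5TB per day in 2007, growing by 500GB per day each year, can be sketched like this. The function name is illustrative, and linear growth is an assumption taken from the numbers above.

```python
def projected_daily_backup_gb(current_gb, growth_gb_per_year, years):
    """Linear projection of daily backup volume, in GB."""
    return current_gb + growth_gb_per_year * years


for year in range(2007, 2011):
    print(year, projected_daily_backup_gb(1500, 500, year - 2007), "GB/day")
```

Three years of growth brings the daily volume to 3000GB, which matches the 3TB per day predicted for 2010.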

Our IBM/TSM consultant does not endorse dual-port FC adapters because of
I/O restrictions on the 55A & D20 bus.  Only a few of these dual-port FC
adapters can go in each D20 because of the load on the PCI bus.  We must
provide backup & restore service on a 24x7 basis.

We are white collar professional gals & guys who are supposed to work
40 hours per week, M-F, etc.; however, we seem to work 60 hours or so
each week!!
