Data Deduplication
Is TSM planning on adding data deduplication similar to Avamar? I understand that TSM does not duplicate data now, but minor edits to a file or a simple file name change would result in an additional copy of the entire file using TSM today.

We recently had a pitch from EMC on Avamar. I can think of some reasons to pass on it (having two separate backup/restore solutions is a big one, plus cost, etc.), but some persuasive arguments were made supporting their solution. If TSM is going to be adding similar functionality soon, it may be another reason to focus on other efforts.

George Hughes
Senior UNIX Engineer
Children's National Medical Center
12211 Plum Orchard Dr.
Silver Spring, MD 20904
(301) 572-3693

Confidentiality Notice: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.
Re: Data Deduplication
On Aug 26, 2007, at 4:58 AM, Hughes, George wrote:

> Is TSM planning on adding data deduplication similar to avamar? I understand how TSM does not duplicate data now but minor edits in files or simple file name changes would result in additional copies of the entire file using TSM today.

Except in Windows, where Adaptive Subfile Backup may be employed. That's as far as it has gone in the product thus far.

Richard Sims
Re: Data Deduplication
But it is being pursued for a future release, after the conversion of the DB to DB2.

From: ADSM: Dist Stor Manager on behalf of Richard Sims
Sent: Sun 8/26/2007 7:23 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Data Deduplication

> Except in Windows, where Adaptive Subfile Backup may be employed. That's as far as it has gone in the product thus far.
FW: TSM Server Sizing
Hi,

We are struggling with sizing a future TSM system. Are there any programs that would help with sizing a future TSM server? We predict that our TSM system will see 3TB of daily data change in 2010. We predict that the library will need about 200TB of tape capacity. Our off-site storage should also have 200TB of tape capacity. Here is our shot at sizing the TSM server. Should this be the size of the TSM server?

IBM 9133-55A w/6 IBM 7311-D20 drawers, 8-way 1.9GHz w/32GB of memory.
IBM 3584-L32 & 3 IBM 3584-D32 library with 24 LTO4 tape drives and about 2000 LTO4 tapes on-site and off-site.
10GB & 1GB network connections.
4TB of disk cache from IBM DS4800 w/IBM SVC system.
Re: FW: TSM Server Sizing
Depending on what your main problems/issues are, I think this system is VERY adequate, maybe even oversized.

I have a TSM server on a p630 (7028-6C4) with 4x 1.45GHz procs and 8GB RAM, a 2-3TB disk pool, and the TSM DB on a FASTT4300 Turbo. I have one 7311-D20 drawer with FC adapters, and a 3584-L32 with (1) D52 expansion frame. I have 12 LTO2 and 2 LTO3 drives. Mostly LTO2 tapes; 1100 tapes total onsite and offsite. I back up 2.5TB per day and have 80TB on onsite tape. We do compression on tape. And my system works fine, though I may upgrade my disk pool later.

So, to your specs compared to mine.

1. Tape. If I went to LTO4 tapes, my 80TB (plus the same offsite) would fit on probably 300 tapes. 200TB then would need, say, 750 tapes, maybe 800. Also, you shouldn't need 24 drives. And you would need only 1 D52 frame, as 24 drives would fit in just 2 frames (base and expansion), and the tapes will definitely fit in there also.

2. Server. A 4-way with 8-16GB would be very adequate. But why do you need (6) D20 drawers? Even if you put 1 tape drive per FC adapter port, the 24 drives would need 24 ports. There are dual-port 4Gb adapters now, and I think you can get (6) in a D20 (double check that, I'm not completely sure of the IBM specs on that). So only (2) D20 drawers would support 24 drives. 24 drives: maybe you have a lot of restore requests? Or use HSM? You say this is for 2010? Let's see, they will have LTO6 or so by then also.

3. Other thoughts. Is money any object? If not, I may send you my resume! (A bunch of others on this list may too!) Unless you are direct-connecting all devices to TSM, you also need budget for SAN switches. And don't forget TSM licenses! Also Disaster Recovery, High Availability and related items. You didn't mention much else about your environment. How many clients, and what type? Are you doing backups only, or archives also? Do you do many restores? Are you planning to have multiple instances of TSM Server? (My comments about the server being oversized might go away if you are planning multiple instances.)
Based on what you provided though, I think most experienced folks on this email list would agree with my comments, with only minor differences. And if you get the system you mentioned, you will be the envy of most out there!

David B. Longo
System Administrator
Health First, Inc.
3300 Fiske Blvd.
Rockledge, FL 32955-4305
PH 321.434.5536
Pager 321.634.8230
Fax: 321.434.5509
[EMAIL PROTECTED]

>>> "Lamb, Charles P." <[EMAIL PROTECTED]> 8/26/2007 11:31 AM >>>
> We are struggling on sizing a future TSM system. [...]

# This message is for the named person's use only. It may contain confidential, proprietary, or legally privileged information. No confidentiality or privilege is waived or lost by any mistransmission. If you receive this message in error, please immediately delete it and all copies of it from your system, destroy any hard copies of it, and notify the sender. You must not, directly or indirectly, use, disclose, distribute, print, or copy any part of this message if you are not the intended recipient. Health First reserves the right to monitor all e-mail communications through its networks.
Any views or opinions expressed in this message are solely those of the individual sender, except (1) where the message states such views or opinions are on behalf of a particular entity; and (2) the sender is authorized by the entity to give such views or opinions. #
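The tape and drawer arithmetic in David's reply can be checked with a few lines of Python. This is only a sketch: it assumes the same effective capacity per LTO4 cartridge that his own estimate implies (160TB on about 300 tapes), it takes his unverified figure of six FC adapters per D20 at face value, and the function names are made up for illustration.

```python
import math

def scale_tape_count(known_tb, known_tapes, target_tb):
    """Scale a known tapes-per-TB estimate to a new capacity target."""
    return math.ceil(target_tb * known_tapes / known_tb)

def drawers_needed(drives, ports_per_adapter, adapters_per_drawer):
    """I/O drawers needed when every tape drive gets its own FC port."""
    ports_per_drawer = ports_per_adapter * adapters_per_drawer
    return math.ceil(drives / ports_per_drawer)

# David's estimate: 80TB onsite plus 80TB offsite on about 300 LTO4 tapes.
# Target: 200TB onsite plus 200TB offsite.
tapes = scale_tape_count(known_tb=160, known_tapes=300, target_tb=400)

# 24 drives, one drive per port, six adapters per D20 (assumed, unverified).
drawers_dual_port = drawers_needed(24, ports_per_adapter=2, adapters_per_drawer=6)
drawers_single_port = drawers_needed(24, ports_per_adapter=1, adapters_per_drawer=6)

print(tapes, drawers_dual_port, drawers_single_port)
```

With these inputs the tape count comes out at 750, matching the figure in the reply, and the drawer count is 2 with dual-port adapters or 4 with single-port adapters. Any real limit on adapters per drawer or per PCI bus would change the drawer count, so those inputs should be confirmed against the hardware specs.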
Re: Data Deduplication
>Is TSM planning on adding data deduplication similar to avamar?

As mentioned by Richard, the closest thing TSM has to this now is subfile backup. It is related to de-duplication: once it has a backup of a given file, it backs up only the changed bytes of that file. This is also referred to as delta incrementals. True de-duplication takes this much farther, as it would recognize a file or email that's duplicated on two or three different systems, such as an attachment/email that's sent to users on several different Exchange servers. The "compression" ratios it can achieve are therefore much higher than those of delta incrementals.

>I understand how TSM does not duplicate data now but minor edits in files
>or simple file name changes would result in additional copies of the
>entire file using TSM today.

Instead of switching from TSM to something like Avamar (EMC) or Puredisk (Symantec), a TSM user can benefit from de-dupe today by using a de-duplication backup target, such as a de-dupe VTL or NAS device. Just make sure you realize that you won't get the same de-dupe ratios as non-TSM users. (TSM customers who switch to a de-dupe target are seeing approximately 10:1 de-dupe ratios, where non-TSM customers are seeing 20:1.) Most TSM users don't do repeated full backups of their filesystems, and a lot of the duplicated data comes from those full backups. But TSM users still have duplicated data: multiple versions of the same file, and database backups. You already mentioned edited versions of the same file. It is also common for a file to be present in multiple places. In addition, TSM users do perform periodic full backups of their database data.

>We recently had a pitch from EMC on avamar. I can think of some reasons
>to pass on it (Having two separate backup/restore solutions is a big
>one, cost etc) but some persuasive arguments were made supporting their
>solution.
If you like the idea of using de-dupe to back up your remote offices (which is what Avamar and Puredisk are designed for), but want to stay with TSM, again de-dupe targets can help. Buy a small de-dupe target to place at your remote site, perform TSM backups to it, then replicate the new/unique blocks to a central location as your offsite mechanism.

>If TSM is going to be adding similar functionality soon it may
>be another reason to focus on other efforts.

Writing a de-dupe backup product isn't easy. EMC bought Avamar and Symantec bought Data Center Technologies to get their respective products. I don't know of any other de-dupe companies for IBM to acquire, so they'll have to write their own. That may take them a bit longer.
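To make the difference between subfile backup and true de-duplication concrete, here is a minimal sketch of block-level de-dupe in Python. It is not how Avamar, Puredisk, or any de-dupe target actually works internally; it assumes fixed 4KB blocks and SHA-256 fingerprints purely for illustration, and `DedupStore` and its methods are hypothetical names.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, chosen only for this sketch

def split_blocks(data, size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last one may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

class DedupStore:
    """Toy backup target that stores each unique block only once."""

    def __init__(self):
        self.blocks = {}        # fingerprint -> block bytes
        self.logical_bytes = 0  # total bytes clients have sent

    def backup(self, data):
        """Store data and return the list of fingerprints needed to rebuild it."""
        recipe = []
        for block in split_blocks(data):
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # keep only the first copy
            recipe.append(digest)
        self.logical_bytes += len(data)
        return recipe

    def restore(self, recipe):
        return b"".join(self.blocks[d] for d in recipe)

    def stored_bytes(self):
        return sum(len(b) for b in self.blocks.values())

    def ratio(self):
        return self.logical_bytes / self.stored_bytes()

# The same 10-block attachment arrives from three different mail servers,
# then a copy with one block edited arrives from a fourth client.
attachment = b"".join(bytes([i]) * BLOCK_SIZE for i in range(10))
edited = attachment[:BLOCK_SIZE] + b"\xff" * BLOCK_SIZE + attachment[2 * BLOCK_SIZE:]

store = DedupStore()
recipes = [store.backup(attachment) for _ in range(3)]
edited_recipe = store.backup(edited)
print(store.stored_bytes(), round(store.ratio(), 2))
```

The three identical copies cost ten blocks of storage in total, and the edited copy adds only the one block that changed, regardless of which client sent it. A per-file delta scheme would save the edit but not the cross-system duplicates. Note that fixed-size blocks handle in-place edits well but not insertions, which shift every later block boundary.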
Re: FW: TSM Server Sizing
Hi,

We presently have an IBM 7026-H80 6-way 750MHz w/16GB memory, and an IBM 3584-L32 & D32 w/14 LTO2 tape drives, which are directly connected. The total number of LTO2 tapes is 800. We back up 1.5TB per day. We have almost the full TSM suite of products for IBM equipment. We are an IBM blue data center except for an EMC Symmetrix box.

Each year, we predict an increase of 500GB of backup data per day. Each year, we predict an increase of 8TB of additional enterprise disk storage for our multiple server landscapes on a couple of hundred servers. We are implementing the other suites of SAP R/3 (Data Warehouse, Billing, etc.) on multiple landscapes. We have AIX V5 and MS servers, including VMware systems, Exchange, MS SharePoint, etc. We are a state utility district that services all but a few counties of the great state of Nebraska.

Our IBM/TSM consultant does not endorse dual-port FC adapters because of I/O restrictions on the 55A & D20 bus. Only a few of these dual-port FC adapters can go in each D20 because of the load on a PCI bus.

We must provide backup & restore service on a 24x7 basis. We are white-collar professional gals & guys who are supposed to work 40 hours per week, M-F, etc.; however, we seem to work 60 hours or so each week!!

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of David Longo
Sent: Sunday, August 26, 2007 12:16 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] FW: TSM Server Sizing

> Depending on what your main problems/issues are, I think this system is VERY adequate, maybe even over sized. [...]
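The growth figures in Charles's reply (1.5TB backed up per day now, rising by about 500GB per day each year) can be projected forward to see how they line up with the 3TB per day predicted for 2010. The sketch below assumes simple linear growth starting from 2007, the year of this thread, and treats 500GB as 0.5TB; the function name is made up for illustration.

```python
def project_daily_backup_tb(start_year, start_tb, growth_tb_per_year, end_year):
    """Linear projection of daily backup volume, one entry per year."""
    return {
        year: start_tb + growth_tb_per_year * (year - start_year)
        for year in range(start_year, end_year + 1)
    }

projection = project_daily_backup_tb(
    start_year=2007, start_tb=1.5, growth_tb_per_year=0.5, end_year=2010
)
for year, tb in projection.items():
    print(year, tb)
```

Under these assumptions the projection reaches 3.0TB per day in 2010, which agrees with the figure in the original sizing question. If growth is proportional rather than linear, the 2010 number would be higher.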