On Jun 16, 2011, at 8:34 PM, Paul Zarnowski wrote:

> At 05:59 PM 6/16/2011, Nick Laflamme wrote:
>> We need to do a bake-off -- or study someone else's -- between using
>> deduplication in a DataDomain box and using both client-side deduplication
>> and server-side deduplication in TSM V6 and then writing to relatively
>> inexpensive, relatively simple (but replicating) storage arrays. However, we
>> keep pushing the limits of stability with our TSM V6 servers, so we haven't
>> dared to try such a bake-off yet.
>
> Nick,
>
> We are heading down this path. My analysis is that in a TSM environment, the
> fairly low dedup ratio does not justify the higher price of deduping VTLs.
> Commodity disk arrays have gotten very inexpensive. We're using DS3500s,
> which are nice building blocks. We put some behind IBM SVCs for servers, and
> some attached directly to TSM or Exchange servers (without SVC). Common
> technology, different uses. We use them for both TSM DB, LOG, and FILE
> (different-RPM disks, obviously). Using cheap disk vs. VTLs has different
> pros and cons. Using disk allows for source-mode (client-side) dedup, which
> a VTL will not do. VTLs, on the other hand, allow for global dedup pools and
> LAN-free virtual tape targets. Deduping VTLs will be more effective in TSM
> environments where you have known duplicate data, such as lots of Oracle or
> MSSQL full backups, or other cases where you have multiple full backups. For
> normal progressive incremental file backups, however, TSM already does a
> good job of reducing data, so VTL dedup doesn't get you as much, and in this
> case IMHO cheap disk is, well, cheaper and gets you source-mode dedup as
> well.
>
> We are in the process of implementing this, but I know a few others are a
> bit further along.
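For anyone digging through the archives later: the TSM V6 pieces Paul is
describing boil down to a handful of server commands. This is only a rough,
untested sketch -- the device class, pool, and node names are made up, and the
paths and sizes are placeholders to adjust for your environment:

    /* FILE device class on cheap disk (directory and sizes are placeholders) */
    DEFINE DEVCLASS diskclass DEVTYPE=FILE DIRECTORY=/tsm/filepool -
      MAXCAPACITY=50G MOUNTLIMIT=64

    /* Primary pool with server-side dedup; IDENTIFY processes find the
       duplicate chunks in the background */
    DEFINE STGPOOL diskpool diskclass MAXSCRATCH=500 -
      DEDUPLICATE=YES IDENTIFYPROCESS=4

    /* Allow the node to dedup at the source (client-side) when it can */
    UPDATE NODE somenode DEDUPLICATION=CLIENTORSERVER

plus DEDUPLICATION YES in the client's dsm.sys (or dsm.opt on Windows) to
actually turn on source-mode dedup at that node.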
It's too bad there isn't a USA-based users group for TSM that meets annually
or even semi-annually for things like user presentations and panels on topics
like this. :-) I'd love to go to a "TSM Workshop" at some university campus to
geek out on exactly these questions.

Dedupe ratios are all over the place for us. We've got some in the high teens,
and I already mentioned the low end: low single digits. Part of me wishes we'd
broken our library volumes out into smaller replication pools (and
corresponding TSM library pools) so we could get a little more granularity on
dedupe ratios, but what I really want is volume-by-volume (or file-by-file for
NFS mounts? directory-by-directory?) dedupe ratios.

Adding a copy storage pool on the same DDR is cheap from a DDR point of view;
it seems to dedupe great when I do that. The TSM DB size becomes the limiting
factor in that case.

> We will continue to use TSM Backup Stgpool to replicate offsite.

We're replicating at the HW level; TSM doesn't know about it. That makes me a
little nervous.

I forgot one thing I don't like: having specific "cleaning" cycles on the DDR
is annoying. We run cleaning two or three times a week on a couple of our DDRs
to keep them from climbing above 80% utilization, but I wish they just
constantly did their own garbage collection.

> ..Paul

Nick
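P.S. Since "how do you get the offsite copy" keeps coming up: the TSM-native
route Paul mentions is just a copy pool plus an administrative schedule.
Another untested sketch, all names hypothetical (the copy pool's device class
would be whatever your offsite target is):

    /* Copy pool for the second copy (names and device class made up) */
    DEFINE STGPOOL copypool tapeclass POOLTYPE=COPY MAXSCRATCH=100

    /* Nightly BACKUP STGPOOL so TSM itself tracks the offsite copy */
    DEFINE SCHEDULE copy_offsite TYPE=ADMINISTRATIVE -
      CMD="BACKUP STGPOOL diskpool copypool MAXPROCESS=4" -
      ACTIVE=YES STARTTIME=18:00 PERIOD=1 PERUNITS=DAYS

The appeal over HW-level replication is exactly the part that makes me nervous
above: with BACKUP STGPOOL, TSM knows the second copy exists.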