>> On Sun, 9 Nov 2008 14:08:56 -0500, "Wanda Prather" <[EMAIL PROTECTED]> said:
> When you are doing reclaims of the virtual volumes, doesn't the data
> that is being reclaimed from the virtual tape have to travel back
> across the network to the original TSM server's buffer, then out
> across the network again to the new virtual volume?

Short answer: "No". The design is for a new copy volume to be created from
primary volumes, and when all the data on a given to-be-reclaimed copy volume
is present on newly built copy volumes, the old one goes pending.

The answer gets longer, though. "Sometimes" the reclaiming server decides to
read from a remote volume. The Good Reason for this is when the primary
volume is damaged or Unavailable for some reason. There are other times when
the reclaiming server just gets an impulse and changes gears. I had a few
PMRs about this, and the response was somewhat opaque; the conclusion I drew
was "Oh, we found a couple of bits of bad logic, and we tuned it some".

One interesting aspect of this is the changing locality of the offsite data.
When you make initial copies, your offsite data is grouped by time-of-backup.
When you reclaim that same data, the new offsite volumes are built by
mounting one primary volume at a time, so the locality gradually comes to
resemble that of the primary volumes. Collocated, perhaps? It's a side
effect, but a pleasant trend. I have often wished there were a collocation
setting "Do what you can, but don't have a fit about it".

> What has been your experience of managing that? Do you just keep
> the virtual volumes really small compared to physical media?
> (assuming the target physical media is still something enormous like
> LTO3) Or do you just resolve to have a really low utilization on the
> target physical media?

This is something I don't have a good theoretical answer for, yet. And boy
howdy, I've tried. Certainly, I waste more space in reclaimable blocks,
because there are at least two levels of reclaiming going on: the physical
remote volumes, and the virtual volumes within them. (There's a quick
back-of-the-envelope sketch of how that compounds, after the list below.)

Here is the answer I have used, with no particular opinion that it's
theoretically sound:

+ Most of my remote storage access is directly to the remote tapes. I have a
  few clients with tighter bottlenecks whose data goes to disk first, but
  'direct to tape' is the rule. Note that this means when more streams are
  trying to write than there are remote drives, clients get in line and wait
  for a drive, round-robin style.

+ I have some servers storing remote volumes of 50G MAXCAP, some of 20G. I
  haven't noted a big difference between them. The best theoretical basis for
  choosing that I can come up with is how quickly the round-robin cycles
  access to the remote tapes.

+ My biggest pain in the patoot so far comes from individual files that are
  much bigger than the remote volume size. I hate re-sending an initial
  chunk, then 4 intermediate volumes I know to be identical to the remote
  volumes already present, and then re-sending the tail chunk.

+ The other biggest pain in the patoot is that, while devices are
  round-robining at the remote site, the source media is allocated at the
  local site. This means that you can deadlock your way into a mess of 'no
  tapes available' if you get congested. I find this to be a metastable
  situation: things go very smoothly until you hit some boundary condition,
  and then you have a turbulence incident which takes intense, sustained
  effort to resolve.
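Here is the back-of-the-envelope sketch I promised above on how the two
levels of reclamation compound. It's plain Python, not anything out of TSM;
the 60% thresholds and the LTO-3 capacity are assumptions, so plug in your
own numbers. The point is just that the worst-case useful data on a physical
offsite tape is the product of the utilization floors at each level:

# Back-of-the-envelope sketch of how two levels of reclamation compound.
# This is plain illustration, not anything TSM ships; the thresholds and
# capacities below are assumptions to be replaced with your own numbers.

def utilization_floor(reclaim_threshold_pct: float) -> float:
    """A volume becomes eligible for reclamation once its reclaimable space
    exceeds the threshold, so the worst case is a volume sitting just under
    it: (100 - threshold)% of the volume is still useful data."""
    return (100.0 - reclaim_threshold_pct) / 100.0

def worst_case_useful_fraction(virtual_threshold_pct: float,
                               physical_threshold_pct: float) -> float:
    """Worst-case useful data per physical offsite tape when both the
    virtual volumes and the physical volumes holding them are allowed to
    drift down to their respective reclamation thresholds."""
    return (utilization_floor(virtual_threshold_pct)
            * utilization_floor(physical_threshold_pct))

if __name__ == "__main__":
    frac = worst_case_useful_fraction(60, 60)   # e.g. a 60% threshold at each level
    lto3_native_gb = 400                        # LTO-3 native capacity, roughly
    print(f"worst-case useful fraction: {frac:.0%}")                   # 16%
    print(f"useful data per LTO-3 tape: about {lto3_native_gb * frac:.0f} GB")

A pathological worst case, obviously (reclamation keeps chewing away in
practice), but it shows why the wasted space is worse than either level
would be on its own.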
> How do you know how big the "really big pipe" needs to be to take
> care of the reclaims?

This I -do- have a theoretical answer for. See above where I talked about
round-robin on the remote tape drives? You want a pipe big enough to stream
all the remote drives. By implication, you can stream the same count of local
drives; this means that, while you may have processes waiting in line for
remote access, they won't be waiting on a network-constrained bottleneck.

Of course, that's easier said than done: 3592-E05s are theoretically capable
of what, 200M/s? I mean, that's what the brag sheet says... :) In realistic
terms I get 60M sustained, with 80-90M spikes. In other realistic terms, you
don't often have -everything- streaming at once.

So to calculate what your site would want, I suggest:

+ Get a Gb connection up. Run one stream. Optimize. Measure sustained
  bandwidth.

+ Multiply sustained bandwidth by the number of remote drives. Attempt to get
  this size pipe.

+ Return to your cube, frustrated that they Just Don't Understand. Be happy
  you've got a Gb. Work to fill it 24x7.

Actually, I'm lucky: I've got budget to go to about 2G this fiscal year. Woot!
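For the curious, here's that arithmetic as a trivial script. The drive count
is a made-up placeholder and the 60 MB/s is just the sustained number I
quoted above; measure your own single stream first:

# Pipe-sizing arithmetic from the steps above. The drive count and the
# measured rate are placeholders; substitute your own numbers.

def required_pipe_mbit(sustained_mb_per_s: float, remote_drives: int) -> float:
    """Bandwidth needed to keep every remote drive streaming at once,
    in megabits per second (8 bits per byte)."""
    return sustained_mb_per_s * remote_drives * 8

if __name__ == "__main__":
    measured = 60   # MB/s sustained per stream (about what I see in practice)
    drives = 4      # remote tape drives you want to keep busy (assumption)
    need = required_pipe_mbit(measured, drives)
    print(f"{drives} drives x {measured} MB/s = {need:.0f} Mbit/s of pipe")
    # 4 x 60 MB/s = 1920 Mbit/s, which is why a single Gb link fills right up
    # and why something like 2 Gb starts to look comfortable.

- Allen S. Rout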