Does PanFS have a tool like GPFS's tslistall? If it does, you can use it to at least get a list of files to pass to TSM, so TSM itself doesn't have to scan the filesystem. If you can get mtime information, you could even figure out exactly which files have changed since the last backup and just do a selective backup of those. Another thing you can do, if your data is easily partitionable, is to set up a proxy node group and back up only part of the filesystem on each node.
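To make the mtime and partitioning ideas concrete, here is a rough sketch in Python. It assumes you already have a listing from some filesystem-side tool with one file per line in the form "<mtime as epoch seconds><TAB><path>". That format is my own assumption, not something PanFS or tslistall is known to produce, and the function names are made up for illustration. It filters the listing down to files modified since the last backup and then splits them into per-node buckets by top-level directory.

```python
import zlib
from pathlib import PurePosixPath


def changed_since(listing_lines, last_backup_epoch):
    """Yield paths whose mtime is newer than the last backup.

    Assumed (hypothetical) input format: "<mtime_epoch>\t<path>" per line.
    """
    for line in listing_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines in the listing
        mtime_text, path = line.split("\t", 1)
        if float(mtime_text) > last_backup_epoch:
            yield path


def partition_by_top_dir(paths, root, node_count):
    """Group paths into node_count buckets keyed on the first directory under root.

    Everything under the same top-level directory lands in the same bucket,
    so each backup node handles whole subtrees.
    """
    buckets = [[] for _ in range(node_count)]
    for path in paths:
        rel = PurePosixPath(path).relative_to(root)
        top = rel.parts[0] if rel.parts else ""
        # crc32 is stable across runs, unlike Python's built-in hash() for str
        index = zlib.crc32(top.encode("utf-8")) % node_count
        buckets[index].append(path)
    return buckets
```

Each bucket could then be written out one path per line and handed to the client on the corresponding node with something like `dsmc selective -filelist=...`. Check the client documentation for how the file list must quote names that contain spaces. Hashing on the top-level directory is only one way to split the work; if a few directories hold most of the data, an explicit directory-to-node mapping would balance better.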
James R Owen wrote:
Yale uses Panasas PanFS, a massively parallel storage system, to store research data generated by HPC clusters. In considering the feasibility of backing up PanFS using TSM, we are concerned about whether TSM is appropriate to back up and restore:

1. very large volumes,
2. deep subdirectory hierarchies with 100s to 1000s of sublevels,
3. large numbers of files within individual subdirectories,
4. much larger numbers of files within each directory hierarchy.

Are there effective maximum limits for any of the above, beyond which TSM becomes inappropriate for performing backups and restores? Please advise about the feasibility and any configuration recommendation(s) to maximize PanFS backup and restore efficiency using TSM.

Thanks for your help.
--
jim.o...@yale.edu (w#203.432.6693, c#203.494.9201, h#203.387.3030)
--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S048, (206)-685-7354
-- University of Washington School of Medicine