On 08/06/2012 04:12 PM, Arbogast, Warren K wrote:
> There is a Linux fileserver here that serves web content. It has 21 million files in one filesystem named /ip. There are over 4,500 directories at the second level of the filesystem. The server is running the 6.3.0.0 client, and has 2 virtual CPUs and 16 GB of RAM. RESOURCEUTILIZATION is set to 10, and currently there are six client sessions running. I am looking for ways to accelerate the backup of this server, since currently it never ends.
> The filesystem is NFS mounted, so a journal-based backup won't work. Recently we added four proxy agents, and are splitting up the one big filesystem among them using include/exclude statements. Here is one of the agents' include/exclude files:
> exclude /ip/[g-z]*/.../*
> include /ip/[a-f]*/.../*
You say "proxy agents", but it's not clear what you mean by this.
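If by "proxy agents" you mean TSM proxy-node authority, i.e. several agent nodes granted the right to back up on behalf of one target node, the usual shape is roughly the sketch below. The node names are made up for illustration, and note that this is essentially the "possibility two" arrangement I describe further down.

On the TSM server, the agents are granted authority to act for the target node:

    grant proxynode target=BIGFS agent=BIGFS_AGENT1,BIGFS_AGENT2,BIGFS_AGENT3,BIGFS_AGENT4

Then each agent backs up its slice of /ip under the target node's name:

    dsmc incremental /ip -asnodename=BIGFS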
> Since we added the proxies, the proxy backups are copying many thousands of files, as if this were the first backup of the server as a whole. Is that expected behavior?
I see two possible arrangements you might have implemented.

Possibility one: where you had BIGFS-NODE, which was taking a long time, you now have BIGFS-NODEAF with the config above, BIGFS-NODEGL defined to include G through L, and so on. In this case each BIGFS subnode will need to redo its initial incremental, and thereafter you should see normal change rates. I will note that this is unlikely to improve your wall-clock time: with RESOURCEUTILIZATION 10 you've probably already got 5+ threads walking the filesystem, which means your bottleneck has most likely moved to IOPS on your NAS as it pulls the metadata needed to satisfy the walk. Twenty threads won't do that any faster. (A sketch of what this arrangement typically looks like follows below.)

Possibility two: you have four agents with different include/exclude statements, all using the same BIGFS node. In this case you are running a charlie-foxtrot formation: one process is backing stuff up while another is expiring the very same stuff. If you're doing this, stop immediately, go back to one agent, and complete a single incremental, because your backups are in an indeterminate state.
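For reference, possibility one tends to look something like the following on the client side. The stanza name, server address, and file paths here are placeholders, not anything taken from your setup.

dsm.sys, one stanza per subnode:

    SERVERNAME            tsm_bigfs_af
       NODENAME              BIGFS-NODEAF
       TCPSERVERADDRESS      tsm.example.edu
       RESOURCEUTILIZATION   10
       INCLEXCL              /opt/tivoli/tsm/client/ba/bin/inclexcl.af

inclexcl.af, carrying just this subnode's slice of /ip:

    exclude /ip/[g-z]*/.../*
    include /ip/[a-f]*/.../*

Each subnode then runs its own incremental, from cron or a client schedule:

    dsmc incremental /ip -servername=tsm_bigfs_af

Each subnode owns its own /ip filespace on the server, which is why the first pass after the split is a full-sized incremental for every one of them; after that they should settle down to normal change rates.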
> Recently, the TSM server database has been growing faster than it usually does, and I'm wondering whether there could be any correlation between the ultra-long-running backup, the many thousands of files copied, and the faster pace of database growth.
This symptom is what makes me think you're doing the latter. If one process is adding stuff while another throws it away, the result is a rapidly growing tail of inactive versions, capped by (I think) VERDELETED. I don't know off the top of my head whether an excluded file is retained at VEREXISTS or VERDELETED. Interesting.

- Allen S. Rout
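P.S. If you want to see which way the retention actually falls, a couple of queries along these lines would show it. The domain, policy set, and management class names, and the sample file path, are placeholders for whatever your environment uses.

From an admin session, the backup copy group's version limits (VEREXISTS, VERDELETED, RETEXTRA, RETONLY):

    dsmadmc -id=admin -password=xxx "query copygroup STANDARD ACTIVE STANDARD type=backup format=detailed"

From the client, how many inactive versions are piling up behind one of the now-excluded files:

    dsmc query backup "/ip/g-example/somefile.html" -inactive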