:Thank you all for the answers.
:
:..
:A lot of impact also produced by rm -rf of old backups. I assume that
:low performance is also related to a large numbers of hardlinks. There
:was a moment when I had ~15 backups hardlinked by rsync, and rm -rf of
Yes, hardlinked backups pretty much destroy performance, mainly because they destroy all locality of reference on the storage media: as files are slowly modified they get their own copies, mixed in with other 'old' files which have not been modified. But theoretically that should only affect the backup target storage and not the server's production storage.

Here is what I would suggest: move the backups off the production machine and onto another, totally separate machine, then rsync between the two machines. That will solve most of your problems, I think. If the backup disk is a single drive, just use a junk box lying around somewhere for your backup system, with the disk installed in it.

The other half of the problem is the stat()ing of every single file on the production server (whether via local rsync or remote rsync). If your original statement is accurate and you have in excess of 11 million files, the stat()ing will likely force the system vnode cache on the production system to cycle. Whether the cache has a max of 100,000 or 500,000 vnodes doesn't matter... it isn't 11 million, so it will cycle. This in turn will tend to cause the buffer and VM page caches (which are linked to the vnode cache) to get blown away as well.

The vnode cache should have code to detect stat()-style accesses and avoid blowing away unrelated cached vnodes which have cached data associated with them, but it's kinda hit-or-miss how well that works. It is very hard to tune those sorts of algorithms, and when one is talking about an inode:cache ratio of 22:1, even a good algorithm will tend to break down. Generally speaking, when caches become inefficient, server throughput goes to hell. You go from e.g. 10uS to access a file to 6mS to access a file, a 1:600 loss.

:May be it is possible to increase disk performance somehow? Server has
:a lot of memory. At this time vfs.ufs.dirhash_maxmem = 67108864 (max
:monitored value for vfs.ufs.dirhash_mem was 52290119) and
:kern.maxvnodes = 500000 (max monitored value for vfs.numvnodes was
:450567). Can increasing of these (or other) sysctls help? I ask
:because (as you can see) these tunables are already incremented, and I
:am not sure further increment really makes sense.

I'm not sure how this can best be dealt with in FreeBSD. If you are using ZFS, it should be possible to localize or cache the meta-data associated with those 11 million+ files on some very fast storage (i.e. an SSD). Doing so would make the stat() portion of the rsync go very fast (getting it over with as quickly as possible). With UFS, the dirhash stuff only caches the directory entries, not the inode contents (though I'm not 100% positive on that), so it won't help much. The directory entries are already linear, and unless you have thousands of files in each directory, UFS dirhash will not save much in the way of I/O.

:Also, is it possible to limit disk operations for rm -rf somehow? The
:only idea I have at the moment is to replace rm -rf with 'find |
:slow_down_script | xargs rm' (or use similar patch as for rsync)...

No, unfortunately there isn't much you can do about this, due to the fact that the files are hardlinked, other than moving the backup storage entirely off the production server, or otherwise determining why disk I/O to the backup storage is affecting your primary storage and hacking a fix. The effect could be indirect... the accesses to the backup storage are blowing away the system caches and causing the production storage to get overloaded with I/O.
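If you do want to experiment with throttling the removal anyway, something along the lines of the find/xargs idea you mention would at least spread the I/O out over time. This is only a rough sketch (the path, batch size, and sleep interval are made up), and it does nothing about the underlying cache thrashing:

    # remove the old backup tree in small batches, pausing between
    # batches so the unlink traffic doesn't saturate the disk
    find /backups/old -depth -print0 |
        xargs -0 -n 200 sh -c 'rm -rf -- "$@"; sleep 1' sh

That just stretches the same amount of work over a longer period, though... getting the hardlinked trees off the production spindles entirely is still the real fix.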
I don't think there is an easy solution other than to move the work off the production server entirely.

:And also, maybe there are other ways to create incremental backups
:instead of using rsync/hardlinks? I was thinking about generating
:list of changed files with own script and packing it with tar, but I
:did not find a way to remove old backups with such an easy way as it
:is with hardlinks..
:
:Thanks in advance!
:...
:--
:// cronfy

Yes. Use snapshots. ZFS is probably your best bet here in FreeBSD-land, as ZFS not only has snapshots, it also has a streaming backup feature that you can use to stream changes from one ZFS filesystem (i.e. on your production system) to another (i.e. on your backup system). Both the production system AND the backup system would have to be running ZFS to make proper use of the feature.

But before you start worrying about all of that, I suggest taking the first step, which is to move the backups entirely off the production system.

There are many ways to handle LAN backups. My personal favorite (which doesn't help with the stat problem but which is easy to set up) is for the backup system to NFS-mount the production system and periodically 'cpdup' the production system's filesystems over to the backup system, then create a snapshot (don't use hardlinks), and repeat. As a fringe benefit, the backup system does not have to rely on backup management scripts running on the production system... i.e. the production system can be oblivious to the mechanics of the backup. And with NFS's rdirplus (NFSv3 here), scanning the production filesystem via NFS should go fairly quickly. It is possible for files to be caught mid-change, but it is also fairly easy to detect that case if it winds up being a problem. And, of course, more sophisticated methodologies can be built on top.

-Matt
Matthew Dillon
<dil...@backplane.com>
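P.S. To give a concrete feel for that last approach, one backup cycle run on the backup box might look roughly like this. It is only a sketch: the hostname, mount point, and dataset names are made up, and it assumes the backup box keeps its copy on ZFS so snapshots are cheap:

    # pull the production filesystem over NFS, then snapshot the
    # local copy so this run is preserved without any hardlinking
    mount_nfs prodserver:/home /mnt/prod
    cpdup -i0 /mnt/prod /backup/home
    zfs snapshot backup/home@`date +%Y%m%d`
    umount /mnt/prod

If both machines end up running ZFS you can drop the NFS/cpdup step entirely and use the streaming feature mentioned above: take a snapshot on the production box and pipe an incremental 'zfs send' over ssh into 'zfs receive' on the backup box.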