On Fri, Sep 12, 2003 at 08:35:01AM -0400, Dave Mangelsdorf (CBIZ Tech) wrote:
>
> Not sure if this is not the proper channel (forum) for this, but I need some
> help.
>
> We have been using rsync in various ways on various platforms.
>
> Linux-SGI (IRIX)-MacOSX
>
> In all cases the actual LOCAL file transfer seems to be limited to 10MB/sec
> from disk to disk. Always copy whole files. (no rolling checksums)
>
> rsync -avW
>
> rsync version 2.5.6 protocol version 26

Rsync is not an efficient local copy utility. It can be used for
local copying, but local and high-bandwidth network speed is
sacrificed for low-bandwidth performance and for data integrity.
The only sense in which rsync will be faster than a normal copy is
in its selectiveness about which files to copy, and in many cases
that can be had in ways other than rsync.

Even with a local copy the file checksums are still calculated.
What whole-file on the receiver eliminates is one pass over the
baseline file to generate block sums, and it saves the disk-to-disk
copy of matched data. Whole-file on the sender reduces the CPU load
of hash lookups for block matching.

> We have tested everything from Small I-mac, to 16 processor SGI server with
> multiple fibre channel interfaces to very large disk arrays. Always tops out
> at 10MB/sec, plus or minus.

Each of these may have different reasons for the performance
limitations. You would need to examine the system impact to
identify the bottleneck.

You mention the 16 CPU SGI server as though massive SMP will
improve performance. It won't. Nor will fibre channel.

<rant>
With few exceptions disk interfaces have little impact on disk
subsystem performance. The biggest factor is seek time, followed
(in varying order) by I/O protocol constraints, elevator control,
RAID scheduling, fragmentation, bus contention, embedded logic,
rotational latency, cylinder capacity, and one or two others that
have slipped my mind at the moment. As a matter of fact, fibre
channel arrays will often perform worse than other interconnects
because their 100MBps half-duplex interconnect suffers from
contention when too many drives are JBODed for software RAID.
</rant>

Since you are comparing cp to rsync on the same system and disks,
that won't apply.

As for the 16 CPUs... Rsync will only fork three processes for the
transfer, so more than three CPUs will have little benefit on a
quiescent system. I've also noted that large SMP systems often
have slower CPUs than smaller systems.

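If you want a rough feel for how much the checksum pass alone costs
on a given box, something like the following Python sketch may help.
It is not rsync's code: MD5 merely stands in for whatever checksum
rsync computes, the 1 MiB chunk size is arbitrary, and the function
names are made up for illustration. It times a plain read/write copy
against the same copy with every byte also run through a hash.

```python
import hashlib
import time

CHUNK = 1 << 20  # 1 MiB reads; an arbitrary choice for this sketch


def copy_plain(src, dst):
    """Copy src to dst with no checksum. Returns the bytes copied."""
    total = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(CHUNK)
            if not buf:
                break
            fout.write(buf)
            total += len(buf)
    return total


def copy_with_checksum(src, dst):
    """Copy src to dst, hashing every byte on the way through.

    MD5 is an assumption here, standing in for rsync's own checksum.
    Returns (bytes copied, hex digest).
    """
    digest = hashlib.md5()
    total = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(CHUNK)
            if not buf:
                break
            digest.update(buf)
            fout.write(buf)
            total += len(buf)
    return total, digest.hexdigest()


def mb_per_sec(copy_func, src, dst):
    """Time one call of copy_func and return throughput in MiB/s."""
    start = time.perf_counter()
    result = copy_func(src, dst)
    elapsed = time.perf_counter() - start
    nbytes = result[0] if isinstance(result, tuple) else result
    return nbytes / (1024 * 1024) / max(elapsed, 1e-9)
```

Calling mb_per_sec(copy_plain, src, dst) and then
mb_per_sec(copy_with_checksum, src, dst) on a file much larger than
RAM gives two numbers to compare. On a smaller file the page cache
will flatter both, so treat the result as a hint about CPU cost
rather than a disk benchmark.
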
The three processes are the generator, the sender and the receiver.
The generator is the process that walks the receiver's tree
comparing it with the sender's file list. In a local copy all
three are forks, assuming memory is copy-on-write. The generator
and the receiver share the file list.

In an SMP system rsync is likely to suffer from some SMP
pathologies that can make it slower on SMP than on a comparable UP
system. Rsync is fairly well pipelined, so the three processes
will keep each other busy, making it likely that they will be
scheduled on separate CPUs. Their intercommunication is through
pipes, so the efficiency of the underlying pipe implementation will
be significant. If the pipe implementation causes lots of cache
invalidations or, worse, TLB flushes due to page remapping, rsync
will suffer. Remember the copy-on-write? Depending on the precise
nature of the VM system, that is likely to cause a good deal of
cache-line bouncing as well.

Finally, we come to the issue of rsync's IO methodology. Rsync was
optimised for low-bandwidth networks and portability at the expense
of IO, CPU and memory. It does not take advantage of any OS IO
performance enhancements.

I'm sorry that you find rsync's local performance disappointing,
but that isn't what rsync is really for. If you do find specific
enhancements that can be made that won't adversely affect
portability, we'd be glad to hear of them.

--
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		[EMAIL PROTECTED]

		Remember Cernan and Schmitt