On Mon, Aug 02, 2004 at 10:54:19AM -0700, Wayne Davison wrote:
> On Sun, Aug 01, 2004 at 06:16:05PM -0400, Chris Shoemaker wrote:
> > Attached is a patch that makes window strides constant when files are
> > walked with a constant block size.  In these cases, it completely
> > avoids all memmoves.
>
> Seems like a good start to me.  Here's a patch I created that also makes
> these changes:
>
>  - The map_file() function now takes the window-size directly
>    rather than the block-size.  This lets the caller choose
>    the value.

Yes, this is good, especially for file_checksum(), which can now use a
substantially different policy than the others.

>  - Figure out an appropriate window-size for the receiver,
>    sender, generator, and the file_checksum() function to send to
>    map_file().

The modulo checks are good.  Maybe there's some way they can live in one
place instead of three, though.  However, I can't immediately see the
reason for the different min and max window sizes (3x vs. 2x and 16k vs.
MAX_MAP_SIZE).

>  - Also removed the (offset > 2*CHUNK_SIZE) check in map_ptr().
>    (Did you leave this in for a reason?)
>
>  - The sender now calls map_ptr() with a range of memory that
>    encompasses both the rolling-checksum data and the data at
>    last_match that we may need to reread.
>
>  - Defined MAX_BLOCK_SIZE as a separate value from MAX_MAP_SIZE.

I suggest renaming BLOCK_SIZE to MIN_BLOCK_SIZE and removing the report
of it as the "default block-size" in the usage statement.  Maybe add a
comment at the #defines saying that MIN_BLOCK_SIZE can be overridden by
--block-size, and that MAX_MAP_SIZE is a hint, since the actual map size
can sometimes be a bit larger.

>  - Increased the size of MAX_MAP_SIZE.

Makes sense.  Does it still make sense to limit the maximum allowed
--block-size?  After all, won't the modulo checking always give a map
that's big enough, and shouldn't it also avoid the pathological memmoves
when block sizes are large?

> I think this should improve several things.  Comments?

I think improvements in this vein are theoretically sound, but I'm
struggling to measure any "real-world" performance increase.  I have
some pretty wimpy h/w, though.  On the flip side, I'm not aware of any
tests we have to prevent performance regressions.  Perhaps some optional
performance tests (optional since not all h/w would handle them) would
serve both purposes.

	-chris

>
> ..wayne..
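
To make the "one place instead of three" suggestion concrete, here is a
rough sketch of what a shared modulo check might look like.  The names
choose_window_size() and SKETCH_MAX_MAP_SIZE, and the 256k value, are
made up for illustration and are not taken from either patch; the only
idea borrowed from the thread is rounding the window to a whole multiple
of the block size so that strides stay constant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hint value, chosen only for this sketch. */
#define SKETCH_MAX_MAP_SIZE ((size_t)256 * 1024)

/* Return a window size that is a whole multiple of block_len and holds
 * at least min_blocks blocks.  SKETCH_MAX_MAP_SIZE is treated as a hint:
 * the result may exceed it when block_len is large. */
static size_t choose_window_size(size_t block_len, size_t min_blocks)
{
    size_t win = SKETCH_MAX_MAP_SIZE;
    size_t floor_size;

    assert(block_len > 0 && min_blocks > 0);

    /* Round down so the window ends on a block boundary. */
    win -= win % block_len;

    /* Never go below the caller's minimum number of whole blocks. */
    floor_size = block_len * min_blocks;
    if (win < floor_size)
        win = floor_size;

    return win;
}
```

With something like this, each caller (receiver, sender, generator,
file_checksum) would pass only its own minimum block count, e.g.
choose_window_size(700, 2), and the result would be handed to map_file().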