On Tue, May 11, 2010 at 7:27 AM, Stefan Sperling <s...@elego.de> wrote:
> On Tue, May 11, 2010 at 01:36:26AM +0200, Johan Corveleyn wrote:
>> As I understand your set of patches, you're mainly focusing on saving
>> cpu cycles, and not on avoiding I/O where possible (unless I'm missing
>> something). Maybe some of the low- or high-level algorithms in the
>> back-end can be reworked a bit to reduce the amount of I/O? Or maybe
>> some clever caching can avoid some file accesses?
>
> In general, I think trying to work around I/O slowness by loading
> stuff into RAM (caching) is a bad idea. You're just taking away memory
> from the OS buffer cache if you do this. A good buffer cache in the OS
> should make open/close/seek fast. (So don't run a windows server if
> you can avoid it.)
>
> The only point where it's worth thinking about optimizing I/O
> access is when you get to clustered, distributed storage, because
> at that point every I/O request translated into a network packet.

You had me until that last part. I think we should ALWAYS be thinking
about optimizing I/O. I have little doubt that is where the biggest
performance bottlenecks live (other than the network, of course).

I agree that making a big cache is probably not the best way to go, but
I think we should always be looking for optimizations that avoid
repeated, unnecessary file opens and closes. I think it is extremely
common that our customers have their repositories on NFS-mounted or SAN
storage. While these often have fast disk subsystems, there is still a
noticeable penalty for file opens.

Have you looked at Blair's wiki before?

http://www.orcaware.com/svn/wiki/Server_performance_tuning_for_Linux_and_Unix

--
Thanks

Mark Phippard
http://markphip.blogspot.com/
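
To illustrate the point about avoiding unnecessary opens and closes, here
is a minimal Python sketch. It is not Subversion code: the helper names
(CountingOpener, read_reopening, read_single_handle, demo) are hypothetical
and exist only for this example. It reads the same four records from one
file twice, once reopening the file per record and once seeking within a
single open handle, and counts the open() calls each approach makes.

```python
import os
import tempfile


class CountingOpener:
    """Hypothetical helper: wraps open() and counts how often it is called."""

    def __init__(self):
        self.opens = 0

    def open(self, path):
        self.opens += 1
        return open(path, "rb")


def read_reopening(opener, path, offsets, size):
    """Open and close the file once per record (the pattern to avoid)."""
    out = []
    for off in offsets:
        with opener.open(path) as f:
            f.seek(off)
            out.append(f.read(size))
    return out


def read_single_handle(opener, path, offsets, size):
    """Open the file once and seek within the same handle."""
    out = []
    with opener.open(path) as f:
        for off in offsets:
            f.seek(off)
            out.append(f.read(size))
    return out


def demo():
    """Read the same records both ways; return results and open counts."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"0123456789" * 10)
        offsets = [0, 10, 20, 30]
        reopening, single = CountingOpener(), CountingOpener()
        r1 = read_reopening(reopening, path, offsets, 4)
        r2 = read_single_handle(single, path, offsets, 4)
        return r1, r2, reopening.opens, single.opens
    finally:
        os.remove(path)
```

Both functions return the same data; only the number of opens differs. On
local disk with a warm buffer cache the difference may be hard to notice,
but on NFS-mounted or SAN storage each extra open can carry the penalty
described above.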