On Thursday 21 April 2005 18:47, Linus Torvalds wrote:
> On Thu, 21 Apr 2005, Chris Mason wrote:
> > Shrug, we shouldn't need help from the kernel for something like this. 
> > git as a database hits worst case scenarios for almost every FS.

[ ... ]

We somewhat agree on most of this; I snipped out the parts that aren't worth
nitpicking over.  git is really fast right now, and I'm all for throwing 
drive space at things to solve problems.  I just don't think we have to throw 
as much space at it as we are.

> The _seek_ issue is real, but git actually has a very nice architecture
> even there: not only does it cache really really well (and you can do a
> simple "ls-tree $(cat .git/HEAD)" and populate the cache from the results),
> but the low level of indirection in a git archive means that it's almost
> totally prefetchable with near-perfect access patterns.

We can sort the files before reading them in, but even if we order things
perfectly, we're spreading the I/O out too much across the drive.  It works
right now because the git archive is relatively dense.  At a few hundred MB,
when we order things properly, the drive head isn't moving that much.
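
To make the "sort before reading" idea concrete, here is a minimal Python
sketch.  It is only an illustration, not what git does: the function name
read_in_inode_order is made up, and using the inode number as a stand-in for
on-disk position is an assumption that holds only roughly, and only on some
filesystems.

```python
import os


def read_in_inode_order(directory):
    """Read every regular file under `directory`, visiting them sorted by
    inode number (an assumed, approximate proxy for on-disk position)."""
    entries = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            # st_ino is the inode number reported by the filesystem.
            entries.append((os.stat(path).st_ino, path))

    # Sort by inode so the reads sweep the disk in roughly one direction.
    entries.sort()

    contents = {}
    for _ino, path in entries:
        with open(path, "rb") as f:
            contents[path] = f.read()
    return contents
```

Sorting like this only shortens the seeks between files that are already
close together; it does nothing about how far apart the files sit on disk.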

At 3-6 GB this hurts more.  The data gets farther apart as things age, and 
drive performance rots away.  I'll never convince you without numbers, which 
means I'll have to wait for the full load of old history and try it out ;)

-chris