Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-21 Thread Tom Lord
> However you're > right that the original structure proposed by Linus is too flat. That was the only point I *meant* to defend. The rest was error. -t - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to [EMAIL PROTECTED] More majordomo info at ht

Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-21 Thread Tom Lord
> Tom, please stop this ext* filesystem bashing ;-) For one thing... yes, i'm totally embarrassed on this issue. I made a late-night math error in a spec. *hopefully* would have noticed it on my own as I coded to that spec but y'all have been wonderful at pointing out my mistake to me even

Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-21 Thread Tom Lord
> [your 0:3/4:7 directory hierarchy is horked] Absolutely. Made a dumb mistake the night I wrote that spec and embarrassed that I initially defended it. I had an arithmetic error. Thanks, this time, for your persistence in pointing it out. -t

Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-21 Thread Tom Lord
> Yes, it really doesn't make much sense to have so big keys on the > directories. It's official... i'm blushing wildly thank you for the various replies that pointed out my thinko. That part of my spec hasn't been coded yet --- i just wrote text. It really was the silly late-night e

Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-21 Thread Tom Lord
> Using your suggested indexing method that uses [0:4] as the 1st level key and [0:3] > [4:8] as the 2nd level key, I obtain an indexed archive that occupies 159M, > where the top level contains 18665 1st level keys, the largest first level

Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-21 Thread duchier
Tomas Mraz <[EMAIL PROTECTED]> writes: >> Btw, if, as you indicate above, you do believe that a 1 level indexing should >> use [0:2], then it doesn't make much sense to me to also suggest that a 2 >> level >> indexing should use [0:1] as primary subkey :-) > > Why do you think so? IMHO we should

Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-21 Thread Tomas Mraz
On Thu, 2005-04-21 at 11:09 +0200, Denys Duchier wrote: > Tomas Mraz <[EMAIL PROTECTED]> writes: > > > If we suppose the maximum number of stored blobs in the order of millions > > probably the optimal indexing would be 1 level [0:2] indexing or 2 > > levels [0:1] [2:3]. However it would be necessa

Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-21 Thread Denys Duchier
Tomas Mraz <[EMAIL PROTECTED]> writes: > If we suppose the maximum number of stored blobs in the order of millions > probably the optimal indexing would be 1 level [0:2] indexing or 2 > levels [0:1] [2:3]. However it would be necessary to do some > benchmarking first before setting this to stone.
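
The two layouts mentioned above can be sketched as follows. This is only an illustration, not code from the thread: it assumes that [a:b] denotes an inclusive range of hex digits of the key (so [0:2] is the first three digits), and the names index_path and max_dirs_per_level are hypothetical.

```python
from pathlib import PurePosixPath


def index_path(key: str, levels: list[tuple[int, int]]) -> PurePosixPath:
    """Place `key` under one directory per (first, last) digit range.

    ASSUMPTION: ranges are inclusive, so (0, 2) takes three hex digits.
    """
    parts = [key[first:last + 1] for first, last in levels]
    return PurePosixPath(*parts, key)


def max_dirs_per_level(levels: list[tuple[int, int]]) -> list[int]:
    """Upper bound on the number of directory names at each level (hex keys)."""
    return [16 ** (last - first + 1) for first, last in levels]


# Example 40-digit hex key, used purely for illustration.
key = "aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d"

one_level = index_path(key, [(0, 2)])          # 1 level [0:2]
two_level = index_path(key, [(0, 1), (2, 3)])  # 2 levels [0:1] [2:3]
```

Under that reading, the one-level scheme allows at most 4096 top-level directories, while the two-level scheme allows at most 256 directories at each level. Which one performs better on a given filesystem would still need the benchmarking the message calls for.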

Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-21 Thread Tomas Mraz
On Wed, 2005-04-20 at 16:04 -0700, Tom Lord wrote: > I think that to a large extent you are seeing artifacts > of the questionable trade-offs that (reports tell me) the > ext* filesystems make. With a different filesystem, the > results would be very different. Tom, please stop this ext* filesy

Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-20 Thread Denys Duchier
Tom Lord <[EMAIL PROTECTED]> writes: > Thank you for your experiment. you are welcome. > I think that to a large extent you are seeing artifacts > of the questionable trade-offs that (reports tell me) the > ext* filesystems make. With a different filesystem, the > results would be very differ

Re: chunking (Re: [ANNOUNCEMENT] /Arch/ embraces `git')

2005-04-20 Thread C. Scott Ananian
On Wed, 20 Apr 2005, Linus Torvalds wrote: What's the disk usage results? I'm on ext3, for example, which means that even small files invariably take up 4.125kB on disk (with the inode). Even uncompressed, most source files tend to be small. Compressed, I'm seeing the median blob size being ~1.6kB
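
The 4.125kB figure quoted above is consistent with one 4096-byte block plus a 128-byte inode. A rough sketch of that accounting follows; the block and inode sizes are assumptions chosen to match the quoted number, and on_disk_bytes is a hypothetical helper, not a measurement of any real filesystem.

```python
import math


def on_disk_bytes(size: int, block: int = 4096, inode: int = 128) -> int:
    """Approximate footprint of a file: whole data blocks plus one inode.

    ASSUMPTION: 4096-byte blocks and 128-byte inodes; indirect blocks,
    directory entries and other filesystem overhead are ignored.
    """
    blocks = math.ceil(size / block)
    return blocks * block + inode


# A blob of about 1.6kB (1600 bytes here) still occupies a full block.
small_blob = on_disk_bytes(1600)
small_blob_kib = small_blob / 1024
```

With these assumptions, any non-empty file up to 4096 bytes costs the same 4224 bytes, which is why a median blob far below the block size wastes most of its block.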

Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-20 Thread Tom Lord
From: [EMAIL PROTECTED] Thank you for your experiment. I'm not surprised by the result but it is very nice to know that my expectations are right. I think that to a large extent you are seeing artifacts of the questionable trade-offs that (reports tell me) the ext* filesystems make. Wit

Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-20 Thread Tomas Mraz
On Wed, 2005-04-20 at 19:15 +0200, [EMAIL PROTECTED] wrote: ... > As data, I used my /usr/src/linux which uses 301M and contains 20753 files and > 1389 directories. To compute the key for a directory, I considered that its > contents were a mapping from names to keys. I suppose if you used the blo

chunking (Re: [ANNOUNCEMENT] /Arch/ embraces `git')

2005-04-20 Thread Linus Torvalds
On Wed, 20 Apr 2005, C. Scott Ananian wrote: > > I'm hoping my 'chunking' patches will fix this. This ought to reduce the > size of the object store by (in effect) doing delta compression; rsync > will then Do The Right Thing and only transfer the needed deltas. > Running some benchmarks right

Re: [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-20 Thread C. Scott Ananian
On Wed, 20 Apr 2005, Petr Baudis wrote: I think one thing git's objects database is not very well suited for are network transports. You want to have something smart doing the transports, comparing trees so that it can do some delta compression; that could probably reduce the amount of data needed

Re: [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-20 Thread Petr Baudis
Dear diary, on Wed, Apr 20, 2005 at 12:00:36PM CEST, I got a letter where Tom Lord <[EMAIL PROTECTED]> told me that... > From the /Arch/ perspective: `git' technology will form the > basis of a new archive/revlib/cache format and the basis > of new network transports. I think one thing git's obje

Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-20 Thread duchier
Hi Tom, just as a datapoint, here is an experiment I carried out. I wanted to evaluate how much overhead is incurred by using several levels of directories to implement a discriminating index. I used the key format you specified: SHA1,SIZE As data, I used my /usr/src/linux which uses 301
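
A key of the form SHA1,SIZE could be computed for a blob roughly as below. The exact encoding is not given in this excerpt, so the sketch assumes a lowercase hex SHA-1 digest, a comma, and the size in decimal bytes; blob_key is a hypothetical name.

```python
import hashlib


def blob_key(data: bytes) -> str:
    """Return "SHA1,SIZE" for a blob.

    ASSUMPTION: hex digest and decimal byte count; the real spec may differ.
    """
    digest = hashlib.sha1(data).hexdigest()
    return f"{digest},{len(data)}"


example_key = blob_key(b"hello")
digest_part, size_part = example_key.split(",")
```

Any of the directory layouts discussed in this thread could then take their subkeys from the leading hex digits of the digest part.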

Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-20 Thread Miles Bader
Way to go. -Miles -- Do not taunt Happy Fun Ball.

[ANNOUNCEMENT] /Arch/ embraces `git'

2005-04-20 Thread Tom Lord
`git', by Linus Torvalds, contains some very good ideas and some very entertaining source code -- recommended reading for hackers. /GNU Arch/ will adopt `git': From the /Arch/ perspective: `git' technology will form the basis of a new archive/revlib/cache format and the basis of new network tra
