Re: Yet another base64 patch

2005-04-18 Thread David A. Wheeler
I asked: > > Does anyone know of any other issues in how git data is stored that > > might cause problems for some situations? ... Kevin said: > If git is retaining hex naming, and not moving to base64, then I don't > think what I am about to say is relevant. However, if base64 file naming > is st

Re: Yet another base64 patch

2005-04-18 Thread Kevin Smith
David A. Wheeler wrote: > Does anyone know of any other issues in how git data is stored that > might cause problems for some situations? Windows' case-insensitive/ > case-preserving model for NTFS and vfat32 seems to be enough > (since the case is preserved) so that the format should work, If gi

Re: Yet another base64 patch

2005-04-17 Thread H. Peter Anvin
Paul Dickson wrote: Since 160-bits does not go into base64 evenly anyways, what happens if you use 2^10 instead of 2^12 for the subdir names? That will be 1/4 the directories of the base64 given above. I was going to try one-character subdirs, so 2^6, but I haven't had a chance to do that since I

Re: Yet another base64 patch

2005-04-17 Thread H. Peter Anvin
David Lang wrote: note that default configs of ext2 and ext3 don't qualify as sane filesystems by this definition. Not using dir_index *IS* insane. -hpa - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to [EMAIL PROTECTED] More majordomo info at ht

Re: Yet another base64 patch

2005-04-17 Thread David A. Wheeler
I said: I'd look at some of the more constraining, yet still common cases, and make sure it worked reasonably well without requiring magic. My list would be: ext2, ext3, NFS, and Windows' NTFS (stupid short filenames, case-insensitive/case-preserving). Petr Baudis replied: I personally don't mind g

Re: Yet another base64 patch

2005-04-17 Thread Petr Baudis
Dear diary, on Sun, Apr 17, 2005 at 08:38:10AM CEST, I got a letter where "David A. Wheeler" <[EMAIL PROTECTED]> told me that... > I'd look at some of the more constraining, yet still > common cases, and make sure it worked reasonably > well without requiring magic. My list would be: > ext2, ext3,

Re: Yet another base64 patch

2005-04-17 Thread David A. Wheeler
Paul Jackson wrote: David wrote: My list would be: ext2, ext3, NFS, and Windows' NTFS (stupid short filenames, case-insensitive/case-preserving). I'm no mind reader, but I'd bet a pretty penny that what you have in mind and what Linus has in mind have no overlaps in their solution sets. Sadly, I l

Re: Yet another base64 patch

2005-04-17 Thread David A. Wheeler
I wrote: It's a trade-off, I know. Paul Jackson replied: So where do you recommend we make that trade-off? Daniel Barkalow wrote: So why do we have to be consistent? It seems like we need a standard format for these reasons: - We use rsync to interact with remote repositories, and rsync won't u

Re: Yet another base64 patch

2005-04-17 Thread Daniel Barkalow
On Sat, 16 Apr 2005, Paul Jackson wrote: > David wrote: > > It's a trade-off, I know. > > So where do you recommend we make that trade-off? So why do we have to be consistent? It seems like we need a standard format for these reasons: - We use rsync to interact with remote repositories, and rs

Re: Yet another base64 patch

2005-04-17 Thread Paul Jackson
David wrote: > My list would be: > ext2, ext3, NFS, and Windows' NTFS (stupid short filenames, > case-insensitive/case-preserving). I'm no mind reader, but I'd bet a pretty penny that what you have in mind and what Linus has in mind have no overlaps in their solution sets. Happy coding ... --

Re: Yet another base64 patch

2005-04-16 Thread David A. Wheeler
Paul Jackson wrote: David wrote: It's a trade-off, I know. So where do you recommend we make that trade-off? I'd look at some of the more constraining, yet still common cases, and make sure it worked reasonably well without requiring magic. My list would be: ext2, ext3, NFS, and Windows' NTFS (stu

Re: Yet another base64 patch

2005-04-16 Thread David Lang
On Thu, 14 Apr 2005, H. Peter Anvin wrote: Linus Torvalds wrote: Even something as simple as "ls -l" has been known to have O(n**2) behaviour for big directories. For filesystems with linear directories, sure. For sane filesystems, it should have O(n log n). note that default configs of ext2 an

Re: Yet another base64 patch

2005-04-16 Thread Paul Jackson
David wrote: > It's a trade-off, I know. So where do you recommend we make that trade-off? -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401

Re: Yet another base64 patch

2005-04-16 Thread David A. Wheeler
Paul Jackson wrote: Earlier, hpa wrote: The base64 version has 2^12 subdirectories instead of 2^8 (I just used 2 characters as the hash key just like the hex version.) Later, hpa wrote: Ultimately the question is: do we care about old (broken) filesystems? I'd imagine we care a little - just not

Re: Yet another base64 patch

2005-04-15 Thread Paul Dickson
On Wed, 13 Apr 2005 21:19:48 -0700, H. Peter Anvin wrote:
> Checking out the total kernel tree (time checkout-cache -a into an empty
> directory):
>
>           Cache cold   Cache hot
> stock     3:46.95      19.95
> base64    5:56.20      23.74
> flat      2:44.13      15.68
>
> It seems t

Re: Yet another base64 patch

2005-04-14 Thread Paul Jackson
Earlier, hpa wrote: > The base64 version has 2^12 subdirectories instead of 2^8 (I just used 2 > characters as the hash key just like the hex version.) Later, hpa wrote: > Ultimately the question is: do we care about old (broken) filesystems? I'd imagine we care a little - just not a lot. I'd th

Re: Yet another base64 patch

2005-04-14 Thread H. Peter Anvin
Linus Torvalds wrote: On Thu, 14 Apr 2005, bert hubert wrote: It is too easy to get into a O(N^2) situation. Git may be able to deal with it but you may hurt yourself when making backups, or if you ever want to share your tree (possibly with yourself) over the network. Even something as simple as

Re: Yet another base64 patch

2005-04-14 Thread H. Peter Anvin
Linus Torvalds wrote: Even something as simple as "ls -l" has been known to have O(n**2) behaviour for big directories. For filesystems with linear directories, sure. For sane filesystems, it should have O(n log n). -hpa

Re: Yet another base64 patch

2005-04-14 Thread Linus Torvalds
On Thu, 14 Apr 2005, bert hubert wrote: > > It is too easy to get into a O(N^2) situation. Git may be able to deal with > it but you may hurt yourself when making backups, or if you ever want to > share your tree (possibly with yourself) over the network. Even something as simple as "ls -l" has

Re: Yet another base64 patch

2005-04-14 Thread bert hubert
On Thu, Apr 14, 2005 at 12:25:40PM -0700, H. Peter Anvin wrote: > >That may be true :-), but from the "front lines" I can report that > >directories with > 32000 or > 65000 entries is *asking* for trouble. There > >is a whole chain of systems that need to get things right for huge > >directories to

Re: Yet another base64 patch

2005-04-14 Thread H. Peter Anvin
bert hubert wrote: That may be true :-), but from the "front lines" I can report that directories with > 32000 or > 65000 entries is *asking* for trouble. There is a whole chain of systems that need to get things right for huge directories to work well, and it often is not that way. Specifics, plea

Re: Yet another base64 patch

2005-04-14 Thread bert hubert
On Thu, Apr 14, 2005 at 10:42:56AM -0700, Linus Torvalds wrote: > > Eh?! n_link limits the number of *subdirectories* a directory can > > contain, not the number of *entries*. > > Duh. I'm a git. That may be true :-), but from the "front lines" I can report that directories with > 32000 or > 65

Re: Yet another base64 patch

2005-04-14 Thread Linus Torvalds
On Thu, 14 Apr 2005, H. Peter Anvin wrote: > > Eh?! n_link limits the number of *subdirectories* a directory can > contain, not the number of *entries*. Duh. I'm a git. Linus

Re: Yet another base64 patch

2005-04-14 Thread H. Peter Anvin
Linus Torvalds wrote: So why is "base64" worse than the stock one? As mentioned, the "flat" version may be faster, but it really isn't an option. 32000 objects is peanuts. Any respectable source tree may hit that in a short time, and will break in horrible ways on many Linux filesystems. If it does

Re: Yet another base64 patch

2005-04-14 Thread H. Peter Anvin
Linus Torvalds wrote: I'll tell you why a flat object directory format simply isn't an option. Hint: maximum directory size. It's limited by n_link, and it's almost universally a 16-bit number on Linux (and generally artificially limited to 32000 entries). In other words, if you ever expect to have

Re: Yet another base64 patch

2005-04-14 Thread Linus Torvalds
On Wed, 13 Apr 2005, H. Peter Anvin wrote:
>
> Checking out the total kernel tree (time checkout-cache -a into an empty
> directory):
>
>           Cache cold   Cache hot
> stock     3:46.95      19.95
> base64    5:56.20      23.74
> flat      2:44.13      15.68

S

Re: Yet another base64 patch

2005-04-14 Thread Linus Torvalds
On Wed, 13 Apr 2005, H. Peter Anvin wrote: > > Actually, the subdirectory hack has the same effect, so you lose > regardless. Doesn't mean that you can't construct cases where the > subdirectory hack doesn't win, but I maintain that those are likely to > be artificial. I'll tell you why a f

Re: Yet another base64 patch

2005-04-13 Thread H. Peter Anvin
H. Peter Anvin wrote: Actually, the subdirectory hack has the same effect, so you lose regardless. Doesn't mean that you can't construct cases where the subdirectory hack doesn't win, but I maintain that those are likely to be artificial. That should, of course, be "... where the subdirectory

Re: Yet another base64 patch

2005-04-13 Thread H. Peter Anvin
Christopher Li wrote: But if you write a large number of random files, once htree has a three-level index, htree will suffer because it dirties random blocks very quickly; most dirtied blocks contain only one or two new entries. Ext3 will choke on it due to the limited journal size. While n