On Thursday April 12, [EMAIL PROTECTED] wrote:
> On Thu, 12 April 2007 11:46:41 +1000, Neil Brown wrote:
> >
> > I could argue that nfs came before ext3+dirindex, so ext3 should have
> > been designed to work properly with NFS.  You could argue that fixing
> > it in nfsd fixes it for all filesystems.  But I'm not sure either of
> > those arguments is likely to be at all convincing...
>
> Caring about a non-ext3 filesystem, I sure would like an nfs solution as
> well. :)

I have a non-ext3 filesystem I care about too.....  But my perspective
is that a solution in nfsd is at best a work-around.  Caching the whole
'struct file' when there is just a small bit that we might want seems
like a heavy hammer.  The filesystem is in the best place to know what
needs to be cached, and it should be the one doing the caching.

> > Hmmm. I wonder.  Which is more likely?
> >  - That two 64bit hashes from some set are the same
> >  - or that 65536 48bit hashes from a set of equal size are the same.
>
> The former.  Each bit going from hash strength to collision chain
> length reduces the likelihood of an overflow.  In the extreme case of
> a 0bit hash and a 64bit collision chain, you need 2^64 entries,
> compared to 2^32 for the other extreme.
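
That matches the usual birthday arithmetic (back-of-envelope, so the
constants are only approximate): with n entries and a d-bit hash, the
expected number of colliding pairs is about n^2 / 2^(d+1), so a full
64bit hash should see its first collision somewhere around 2^32
entries.  Overflowing a 16bit chain on a 48bit hash instead needs
2^16+1 entries to land on a single 48bit value, which random hashing
essentially never produces at any plausible directory size.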
> However, the collision chain gives me quite a bit of headache.  One
> would have to store each entry's position on the chain, deal with
> older entries getting deleted, newer entries getting inserted, etc.
> All this requires a lot of complicated code that basically never gets
> tested in the wild.

This is a simple consequence of the design decision to use hashes as
the search key.  They aren't dense and they will collide, so any
solution will be a bit fuzzy around the edges.  Maybe that is an
acceptable tradeoff, but the filesystem should take full
responsibility for it, whether in performance or in correctness :-)

> Just settling for a 64bit hash and returning -EEXIST when someone
> causes a collision on creat() sounds more appealing.  Directories
> with 4 billion entries will cause problems, but that is hardly news
> to anyone.

I think you want -EFBIG or -ENOSPC.  -EEXIST sounds just wrong.

But there are alternatives, e.g. internal chaining.  Insist on a
unique 64bit hash for every file: if the hash is in use, increment and
try again.  On lookup, if the hash leads you to a file with the wrong
name, increment and try again until you either find the name or hit a
hole (a hash value that is not stored).  When you delete an entry,
leave a place holder if the next hash is in use; conversely, if the
next hash is not in use, delete the entry, and also delete the
previous entry if it is a place holder (repeating backwards while
place holders remain).  A rough sketch is in the P.S. below.

Then you get 100% correct semantics, and a performance hit in the face
of hash collisions that is probably no worse than what ext3 currently
gets.

It probably does cost you a bit of storage to hold those 64bit hashes,
though I suspect some clever compression can help out there (you only
need one bit more than the filename when there is no chaining).

You have to require 64bit cookies/fpos, but I think that today that is
a reasonable thing to require (5 years ago it might not have been).

NeilBrown
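
P.S. Since the probing rules above are easy to get subtly wrong, here
is a rough userspace sketch of the idea.  It is illustrative only: the
flat-array store, the toy FNV-1a hash, and all of the names are made
up, and a real implementation would live in the filesystem's directory
code.

/*
 * Sketch of internal chaining: every entry owns a unique 64bit hash,
 * collisions probe to the next hash value, and deletion leaves a
 * place holder only while the next hash is still in use.
 */
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 1024

enum slot_state { REAL, PLACEHOLDER };

struct dent {
	uint64_t hash;		/* unique 64bit cookie for this entry */
	enum slot_state state;
	char name[64];
};

static struct dent dir[MAX_ENTRIES];
static int nentries;

/* Return the entry stored under hash h, or NULL if h is a "hole". */
static struct dent *find_slot(uint64_t h)
{
	for (int i = 0; i < nentries; i++)
		if (dir[i].hash == h)
			return &dir[i];
	return NULL;
}

static void remove_slot(struct dent *d)
{
	*d = dir[--nentries];		/* swap-with-last delete */
}

/* Toy FNV-1a; a real filesystem would use a much stronger hash. */
static uint64_t name_hash(const char *name)
{
	uint64_t h = 14695981039346656037ULL;

	while (*name)
		h = (h ^ (uint8_t)*name++) * 1099511628211ULL;
	return h;
}

static int dir_create(const char *name)
{
	uint64_t h = name_hash(name);
	struct dent *d;

	/* The hash must be unique: probe forward to the first hole. */
	while ((d = find_slot(h)) != NULL) {
		if (d->state == REAL && strcmp(d->name, name) == 0)
			return -EEXIST;
		h++;
	}
	if (nentries == MAX_ENTRIES)
		return -ENOSPC;
	d = &dir[nentries++];
	d->hash = h;
	d->state = REAL;
	snprintf(d->name, sizeof(d->name), "%s", name);
	return 0;
}

static struct dent *dir_lookup(const char *name)
{
	uint64_t h = name_hash(name);
	struct dent *d;

	/* Wrong name: increment and retry; a hole ends the chain. */
	for (d = find_slot(h); d != NULL; d = find_slot(++h))
		if (d->state == REAL && strcmp(d->name, name) == 0)
			return d;
	return NULL;
}

static int dir_unlink(const char *name)
{
	struct dent *d = dir_lookup(name);
	uint64_t h;

	if (d == NULL)
		return -ENOENT;
	h = d->hash;
	if (find_slot(h + 1) != NULL) {
		/* Next hash in use: keep the chain walkable. */
		d->state = PLACEHOLDER;
		return 0;
	}
	/* Next hash is a hole: drop this entry, then sweep backwards
	 * removing place holders that no longer protect anything. */
	remove_slot(d);
	while ((d = find_slot(--h)) != NULL && d->state == PLACEHOLDER)
		remove_slot(d);
	return 0;
}

int main(void)
{
	dir_create("alpha");
	dir_create("beta");
	printf("alpha: %s\n", dir_lookup("alpha") ? "found" : "missing");
	dir_unlink("alpha");
	printf("alpha: %s\n", dir_lookup("alpha") ? "found" : "missing");
	return 0;
}

The interesting case is unlink: a place holder is only needed while a
later probe still has to step over it, which is why the backward sweep
can reclaim place holders as soon as a hole opens up behind them.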