> From: Peter Jeremy [mailto:peter.jer...@alcatel-lucent.com]
>
> I gather you are suggesting that the inode be extended to contain a
> list of the inode numbers of all directories that contain a filename
> referring to that inode.

Correct.

> [inodes] can have up to 32767 links [to them]. Where is
> this list of (up to) 32767 "parent" inodes going to be stored?

Naively, I suggest storing the list of "parents" in the inode itself. Let's see if that's unreasonable.

How many bytes long is an inode number? I couldn't find that easily by googling, so for the moment I'll guess it's a fixed size of 64 bits (8 bytes). That means every inode with Link Count 1 would be extended by 8 bytes, and an inode could require at most 32767*8 = 262136 bytes (about 256 kbytes) to store all of its parent inode backpointers.

> Well, you need to find somewhere to store up to 32K inode numbers,
> whilst having minimal space overhead for small numbers of links.

I think you're saying: the number of bytes in an inode is fixed, not variable. How many bytes is that? Would it be exceptionally difficult to extend it and/or make it variable?

Perhaps all inodes (including files) could have a property similar to directories, where they reference a variable number of bytes written somewhere else on disk (kind of like how directories reference variable-sized files). That would allow the list of parent inodes to be stored in a block separate from the usual inode information.

One important consideration in that hypothetical scenario is fragmentation. If every inode were fragmented in two, that would be a real drag on performance. Perhaps every inode could be extended by (for example) 32 bytes to accommodate a list of up to 4 parent inodes, and only when the number of parents exceeds 4 would the inode be fragmented to store a variable-length list of parents.

> >In which case, it would be trivially easy to walk back up the whole
> >tree, almost instantly identifying every combination of paths that
> >could possibly lead to this inode, while simultaneously correctly
> >handling security concerns about bypassing security of parent
> >directories and everything.
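
To make the inline-plus-overflow idea a bit more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the 8-byte inode number is my guess from above, the 4 inline slots are the example figure, and `ParentList` and its methods are made-up names, not anything that exists in ZFS or any other filesystem.

```python
INODE_NUM_BYTES = 8   # assumption: 64-bit inode numbers (a guess, not a known figure)
INLINE_SLOTS = 4      # assumption: 32 extra bytes reserved in every inode
MAX_LINKS = 32767     # link limit quoted in this thread


class ParentList:
    """Hypothetical parent-backpointer list: inline slots plus an overflow block."""

    def __init__(self):
        self.inline = []    # stored inside the inode itself
        self.overflow = []  # stands in for a separately allocated on-disk block

    def __len__(self):
        return len(self.inline) + len(self.overflow)

    def add(self, parent_ino):
        """Record one more parent; spill to overflow once the inline slots are full."""
        if len(self) >= MAX_LINKS:
            raise OverflowError("link count limit reached")
        if len(self.inline) < INLINE_SLOTS:
            self.inline.append(parent_ino)
        else:
            self.overflow.append(parent_ino)

    def remove(self, parent_ino):
        """Drop one parent; pull an overflow entry back inline when a slot frees up."""
        if parent_ino in self.overflow:
            self.overflow.remove(parent_ino)
            return
        self.inline.remove(parent_ino)
        if self.overflow:
            self.inline.append(self.overflow.pop())

    def is_fragmented(self):
        """True when the inode needs a second read to fetch all of its parents."""
        return bool(self.overflow)


def worst_case_bytes():
    """Space needed if an inode has the maximum number of parents."""
    return MAX_LINKS * INODE_NUM_BYTES
```

With these assumptions, only inodes with more than 4 links ever pay the cost of a second block, and `worst_case_bytes()` gives the upper bound computed above.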
>
> Whilst it's trivially easy to get from the file to the list of
> directories containing that file, actually getting from one directory
> to its parent is less so: A directory containing N sub-directories has
> N+2 links. Whilst the '.' link is easy to identify (it points to its
> own inode), distinguishing between the name of this directory in its
> parent and the '..' entries in its subdirectories is rather messy
> (requiring directory scans) unless you mandate that the reference to
> the parent directory is in a fixed location (ie 1st or 2nd entry in
> the parent inode list).

Interesting. In other words, because of the ".." entry in every subdirectory, every directory is linked to not just by its parent, but also by its children. If inodes were extended to include the list of "inodes that link to this inode" as I suggested, there would need to be a simple way of distinguishing which entries in that list are actually parents, and which are backpointers from children.

I would suggest something simple, like this:

The only reason to create a list of "parent inodes" is to quickly identify the absolute path of any arbitrary inode number, so you can quickly locate all the past snapshots of any arbitrary file or directory, even if that file or directory has been renamed, moved, or relocated in the directory tree.

So instead of creating a list of all "inodes that link to this inode", just make it a "parent inodes" list. That is: when you create a subdirectory, even though the subdir does link back to its parent via "..", the inode of the subdir is not stored in the parent's "parent inodes" list. Thus, the Link Count of a directory is allowed to differ from the number of inodes listed in its "parent inodes" field. All inodes listed in the "parent inodes" field would, I think, then be links from a shallower location in the tree hierarchy.

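
Here is a rough Python sketch of how the walk back up the tree might look with a "parent inodes" list that excludes the ".." backlinks. It is a toy in-memory model, not a filesystem implementation: `Inode`, `link`, `paths_to`, and `ROOT_INO` are hypothetical names, and I'm assuming the name of each entry is recovered by scanning the parent directory.

```python
ROOT_INO = 1  # hypothetical inode number for the root directory


class Inode:
    """Toy inode: directories map names to child inode numbers."""

    def __init__(self, ino, is_dir=False):
        self.ino = ino
        self.entries = {} if is_dir else None  # name -> child inode number
        self.parents = []  # "parent inodes" list; '..' backlinks are NOT recorded


def link(table, parent_ino, name, child_ino):
    """Add a directory entry and record the backpointer on the child."""
    table[parent_ino].entries[name] = child_ino
    table[child_ino].parents.append(parent_ino)


def paths_to(table, ino):
    """Return every absolute path that leads to the given inode."""
    if ino == ROOT_INO:
        return ["/"]
    paths = []
    # set() avoids double counting a directory that holds two names for this inode
    for parent_ino in set(table[ino].parents):
        parent = table[parent_ino]
        # Scan the parent directory for the name(s) that point at this inode
        names = [n for n, child in parent.entries.items() if child == ino]
        for prefix in paths_to(table, parent_ino):
            for name in names:
                paths.append(prefix.rstrip("/") + "/" + name)
    return sorted(paths)


# Usage: one file hard-linked from two directories
table = {1: Inode(1, is_dir=True), 2: Inode(2, is_dir=True),
         3: Inode(3, is_dir=True), 4: Inode(4)}
link(table, 1, "a", 2)
link(table, 1, "b", 3)
link(table, 2, "f", 4)
link(table, 3, "g", 4)
print(paths_to(table, 4))  # ['/a/f', '/b/g']
```

Note that this sketch still scans each parent directory to find the name. Storing the name (or the entry's position) alongside each backpointer would avoid that scan, at the cost of more space per parent.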
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss