(many months pass...)

I've recently been doing something a bit related:

While I've been refitting my old Linux 9P server, I decided to alter it to 
use openat(2), and the other fd-relative *at calls. It used to keep an absolute 
pathname string with each FID, so would start a file system walk from the root 
directory on every access. If directories were moved or renamed, things broke. 
But openat can help with that:
On Linux, 
newfd = openat (oldfd, relpath, O_PATH)
effectively does a Twalk. newfd is not an open file, it's a stable reference to 
the file itself - much like the current directory in a process. Or, indeed, 
like a FID. If it's a directory, you can then open it with
dirfd = openat (fd, ".", OREAD);
but if it's a file you can't! You can stat an fd obtained with O_PATH like this:
fstatat (fd, "", &stat, AT_EMPTY_PATH);
but that doesn't work for opening files:
openat (fd, "", OREAD)
doesn't work. There's no O_EMPTYPATH flag for open.
Fortunately:
open ("/proc/self/fd/<fd#>", OREAD)
*does* work. This effectively implements Topen.
Indeed, as hinted on the man page, the /proc/self/fd virtual directory on Linux 
seems to be able to do pretty much everything the openat calls do, without 
requiring the openat calls - you can use O_PATH with plain open(2). This is 
somewhat analogous to the devdup (#d) ideas that were discussed here last April.
There are two things going on here: O_PATH gives you a way to get a file 
descriptor for a file without actually opening it, and openat (or 
/proc/self/fd) lets you start your walk from any directory rather than just . 
or /.
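
The three steps above (walk, stat, open) can be sketched in C roughly as follows. This is only an illustration, not the code from my server: the function names walk1, stat_ref and open_ref are made up for the example, it assumes /proc is mounted, and it uses O_RDONLY where the text above writes OREAD.

```c
#define _GNU_SOURCE             /* for O_PATH and AT_EMPTY_PATH */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Twalk-like step (hypothetical name): get a reference to relpath
 * under dirfd without opening the file. */
int walk1(int dirfd, const char *relpath)
{
    return openat(dirfd, relpath, O_PATH);
}

/* Tstat-like step: stat the file an O_PATH descriptor refers to. */
int stat_ref(int pathfd, struct stat *st)
{
    return fstatat(pathfd, "", st, AT_EMPTY_PATH);
}

/* Topen-like step: turn an O_PATH reference into a real open file by
 * reopening it through /proc/self/fd (assumes /proc is mounted). */
int open_ref(int pathfd, int flags)
{
    char proc[64];
    int n = snprintf(proc, sizeof proc, "/proc/self/fd/%d", pathfd);

    assert(n > 0 && (size_t)n < sizeof proc);
    return open(proc, flags);
}
```

A caller would walk1() from some directory reference, then stat_ref() or open_ref() the result; the descriptor returned by walk1() itself cannot be read or written.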

To use this I added a data structure somewhat like the one exportfs uses: each 
fid points at the leaf of a ref-counted tree of path elements. As my server 
walks the path, it now keeps an O_PATH file descriptor at every step, in 
addition to the element name, so it can maintain stable references and still 
"get dot dot right". v9fs (the Linux 9P client in the kernel) keeps a trail of 
FIDs like this too, so it doesn't need to Twalk "..", but I think the Plan 9 
kernel (devmnt?) keeps just one FID for a cwd, so has to Twalk ".."
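
A minimal sketch of such a structure, assuming one node per walked element with a parent pointer and a reference count. The names (struct pathelem, path_root, path_walk, path_put) are invented for illustration and are not taken from exportfs or from my server; unlike a real tree, this version does not share a node between two fids that walk to the same child.

```c
#define _GNU_SOURCE             /* for O_PATH */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* One element of a walked path (hypothetical layout). */
struct pathelem {
    struct pathelem *parent;    /* NULL at the export root */
    char *name;                 /* name used to reach this element */
    int fd;                     /* O_PATH reference to the file itself */
    int refs;                   /* fids plus children pointing here */
};

static struct pathelem *mkelem(struct pathelem *parent, const char *name, int fd)
{
    struct pathelem *p = calloc(1, sizeof *p);

    if (p != NULL)
        p->name = strdup(name);
    if (fd < 0 || p == NULL || p->name == NULL) {
        if (fd >= 0)
            close(fd);
        if (p != NULL)
            free(p->name);
        free(p);
        return NULL;
    }
    p->parent = parent;
    p->fd = fd;
    p->refs = 1;
    if (parent != NULL)
        parent->refs++;         /* the child keeps its parent alive */
    return p;
}

struct pathelem *path_root(const char *dir)
{
    return mkelem(NULL, "/", open(dir, O_PATH | O_DIRECTORY));
}

/* Walk one element; the caller keeps its own reference to 'from'. */
struct pathelem *path_walk(struct pathelem *from, const char *name)
{
    if (strchr(name, '/') != NULL)
        return NULL;            /* one element at a time */
    if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0) {
        struct pathelem *to = from;

        if (name[1] == '.' && from->parent != NULL)
            to = from->parent;  /* ".." comes from the trail, not the OS */
        to->refs++;
        return to;
    }
    return mkelem(from, name, openat(from->fd, name, O_PATH));
}

/* Drop a reference, freeing elements up the trail as they become unused. */
void path_put(struct pathelem *p)
{
    while (p != NULL) {
        struct pathelem *parent = p->parent;

        assert(p->refs > 0);
        if (--p->refs > 0)
            break;
        close(p->fd);
        free(p->name);
        free(p);
        p = parent;
    }
}
```

Because ".." is answered from the parent pointer, it gives the directory the walk actually came through, even if that directory has since been moved.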

Tremove (and Twstat with a new name) are difficult (impossible?) to implement 
properly on Linux, because Plan 9 files have a single name, whereas on Linux 
they can have several names (hard links). Probably the best we can do is to 
remember the name used to get to a file so we have an old name to give to 
Linux. renameat unfortunately can't take an O_PATH reference to identify the 
file to rename - and indeed, in the presence of hard links, such a reference 
wouldn't say which link to change unless the server remembered how it got there.
I've not been able to think of a way to implement Tremove without either a race 
condition, or risking removing the wrong file. The Linux API (or my Linux-fu) 
seems to fall just short of making this possible.
For my 9P server, at least, the problem is now confined to the file being 
renamed or removed; if an ancestor directory is moved, the reference stays stable.
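
To make the "remember the name" approach concrete, here is a rough sketch of a Tremove that checks the remembered name still refers to the same file before unlinking it. The function name remove_ref is made up, and the use of ESTALE for a mismatch is my own choice for the example. It narrows the window but does not close it: the name can still be rebound between the check and the unlinkat, which is exactly the race described above.

```c
#define _GNU_SOURCE             /* for AT_EMPTY_PATH */
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Remove the entry 'name' in the directory parentfd, but only if it
 * still refers to the same file as the O_PATH descriptor pathfd.
 * Returns 0 on success, -1 with errno set on failure.
 * NOT race-free: 'name' can change between the check and unlinkat().
 * A reference obtained by following a symlink will never match here.
 */
int remove_ref(int parentfd, const char *name, int pathfd)
{
    struct stat want, have;

    if (fstatat(pathfd, "", &want, AT_EMPTY_PATH) < 0)
        return -1;
    if (fstatat(parentfd, name, &have, AT_SYMLINK_NOFOLLOW) < 0)
        return -1;
    if (want.st_dev != have.st_dev || want.st_ino != have.st_ino) {
        errno = ESTALE;         /* the remembered name now points elsewhere */
        return -1;
    }
    return unlinkat(parentfd, name,
                    S_ISDIR(have.st_mode) ? AT_REMOVEDIR : 0);
}
```

Here parentfd and name would come from the parent element and the element name kept for the fid during the walk.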

So I suppose I almost found a use for openat(2)... :)

Glibc on Linux converts open(2) calls into openat(2) calls, but it's still 
possible to call open(2) using the syscall(2) mechanism. Linux has 
open_by_handle_at which is similar to the FreeBSD fhopen, apparently. I think 
they're there mostly for NFS. I haven't really explored whether they have a 
role to play here.
------------------------------------------
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/T675e737e776e5a9c-Mdb5ecf364ba9f0735082d1bc