Re: New procfs implementation

2010-09-01 Thread Jeremie Koenig
Hi,

On Wed, Sep 01, 2010 at 01:06:32AM +0200, Samuel Thibault wrote:
> > { "anonymous-owner", 'a', "USER", 0,
> > "Make USER the owner of files related to processes without one.  "
> > "Be aware that USER will be granted access to the environment and "
> > "other sensitive information about the processes in question.  "
> > "(default: use uid 0)" },
> 
> Which use do you envision?

You may want to add an entry to /etc/passwd (say, "noone"), used only to
distinguish the anonymous processes from those owned by root, though as
the comment suggests you would have to be careful not to use it for
anything else.

> > procfs.c
> 
> is caching data really useful for anything but directories?
> I'm afraid of all the various re-reading patterns that tools may have.

Caching data seemed the only way to ensure that you get consistent
results if you read a file in multiple runs. As for re-reading, seeking
to 0 invalidates the cached contents.[1]

Since Linux only supports reading /proc files in one run, there are
probably not many other re-reading patterns to worry about, though
arguably this could also mean that we shouldn't worry about supporting
that either.

[1] This is not enough for process files, which build their contents
based on the associated, non-refreshed proc_stat structure, though 'top'
and 'htop' seem to work fine regardless (top definitely keeps
/proc/uptime open).

> procfs.c: About 42, 2 can be a better FIXME guess for now, since that's
> the root inode in ext2fs.

Okay, changed (ebf8ede).

> procfs_make_ino: this is not handling collisions. This can pose problems
> with e.g. tar-ing /proc with hardlink management (yes, I sometimes do
> such thing). I'm afraid it might be better to just assign known major
> numbers to the various content providers (yes, that's not elegant), let
> them handle minor numbers, and combine both. That would help to fill
> d_fileno.

I considered this, but ended up implementing the current solution first,
since it was quicker and kept the interface simple.

Instead of explicitely assigning inode numbers, we could make the
probability of collision negligibly small by extending the inode numbers
to 64 bits (but is there widespread userspace support for this?).

Also, since pid_t is 32-bits wide (at least in theory), dividing a
32-bits inode namespace among processes might prove somewhat tricky and
collision-prone as well.

> netfs_*: shouldn't we check the node type?

Such checks are already done in libnetfs; if some are missing I guess it
would be better to add them there. (As for read()-ing directories, it's
allowed by ext2fs too so I figured it would be okay.)

> rootdir.c: you have #define KERNEL_PID 2 while it was made a translator
> parameter in main.c, is it on purpose?  If so, it's probably worth
> documenting.

It's not really on purpose (it was initially INIT_PID and used for
uptime only). Changed (7f3a812).

Thanks for your comments,
-- 
Jeremie Koenig 
http://jk.fr.eu.org



Why SIGLOST on read()?

2010-09-01 Thread Samuel Thibault
Hello,

screen reads /proc/pid/stat, which happened to be bogus on my box.  The
problem I have is that instead of just getting -1 (EIO) or something
similar, screen got the SIGLOST signal.  Indeed, _hurd_fd_error_signal
does so when the server disappears.

This seems quite problematic as POSIX doesn't say that read() might
raise signals (while write may indeed raise SIGPIPE): a posixly correct
application may then unexpectedly terminate due to a non-POSIX signal by
just read()ing some opened file from an odd translator.

Samuel



Re: New procfs implementation

2010-09-01 Thread Samuel Thibault
Jeremie Koenig, le Wed 01 Sep 2010 13:04:33 +0200, a écrit :
> On Wed, Sep 01, 2010 at 01:06:32AM +0200, Samuel Thibault wrote:
> > > { "anonymous-owner", 'a', "USER", 0,
> > >   "Make USER the owner of files related to processes without one.  "
> > >   "Be aware that USER will be granted access to the environment and "
> > >   "other sensitive information about the processes in question.  "
> > >   "(default: use uid 0)" },
> > 
> > Which use do you envision?
> 
> You may want to add an entry to /etc/passwd (say, "noone"), used only to
> distinguish the anonymous processes from those owned by root, though as
> the comment suggests you would have to be careful not to use it for
> anything else.

Ah, so it's really not like "nobody", that's for tasks whose owner is
yet unknown, but potentially root-owned or such, or something like this?

I don't know exactly the rules, but I feel like (uid_t) -1 might be
exactly what we need here.

> > > procfs.c
> > 
> > is caching data really useful for anything but directories?
> > I'm afraid of all the various re-reading patterns that tools may have.
> 
> Caching data seemed the only way to ensure that you get consistent
> results if you read a file in multiple runs.

Does Linux assert that? Reading 128 bytes from /proc/interrupts several
times with an lseek(-128,SEEK_CUR) provides different values. interrupts
is using the seq_* functions like most other files in /proc

> As for re-reading, seeking to 0 invalidates the cached contents.[1]

Yes, that's probably a common pattern, but you never know what people
do.

> Since Linux only supports reading /proc files in one run, there are
> probably not many other re-reading patterns to worry about, though
> arguably this could also mean that we shouldn't worry about supporting
> that either.

Yes.  It makes things quite simpler.

> > procfs_make_ino: this is not handling collisions. This can pose problems
> > with e.g. tar-ing /proc with hardlink management (yes, I sometimes do
> > such thing). I'm afraid it might be better to just assign known major
> > numbers to the various content providers (yes, that's not elegant), let
> > them handle minor numbers, and combine both. That would help to fill
> > d_fileno.
> 
> I considered this, but ended up implementing the current solution first,
> since it was quicker and kept the interface simple.
> 
> Instead of explicitely assigning inode numbers, we could make the
> probability of collision negligibly small by extending the inode numbers
> to 64 bits (but is there widespread userspace support for this?).

Well, I don't like having to neglect something :)

> Also, since pid_t is 32-bits wide (at least in theory), dividing a
> 32-bits inode namespace among processes might prove somewhat tricky and
> collision-prone as well.

Right.

Another way would be to allocate them on the fly, like Linux actually
seems to be doing.  That will however actually increase the amount of
collision too, when not keeping files open.

> > netfs_*: shouldn't we check the node type?
> 
> Such checks are already done in libnetfs;

Ok.

Samuel