On Fri, Jul 11, 2014 at 04:43:51AM +0200, Mateusz Guzik wrote: > On Mon, Jun 23, 2014 at 07:35:23PM +0300, Konstantin Belousov wrote: > > On Mon, Jun 23, 2014 at 03:16:53PM +0200, Mateusz Guzik wrote: > > > If traversal while transition to P_INEXEC is allowed, execve dealing > > > with a setuid binary is problematic. This is more of hypothetical nature, > > > but with sufficienly long delay it could finish the syscall and start > > > opening some files, which paths would now be visible for an unprivileged > > > reader. > > > > > > That said, I propose adding a counter to struct proc which would which > > > would block execve. It would be quite similar to p_lock. > > I thought about this too. In fact, I considered using PHOLD for this. > > > > > > > > iow execve would: > > > > > > PROC_LOCK(p); > > > p->p_flag |= P_INEXEC; > > > while (p->p_execlock > 0) > > > msleep(&p->p_execlock, &p->p_mtx, PWAIT, "execlock", 0); > > > PROC_UNLOCK(p); > > > > > > And it would be mandatory for external fdp consumers to grab the counter. > > > > > > I'm tempted to add P_GETPIN which would both increase p_lock and > > > p_execlock, > > > that way the process is guaranteed not to exit and not to execve even > > > after proc lock is dropped. > > See above about PHOLD. > > > > > > > > There is a separate question if p_execlock should be renamed and > > > extended to also block any kind of credential changes. > > > > > > Then the guarantee is even stronger since we know that credentials we > > > checked against are not going to change for the duration of our > > > operations, but it is unclear if we need this. > > > > If doing separate execlock/p_lock, I think that it could be possible > > to use per-process sx lock instead of hand-rolling the counter. The > > accessors would lock sx shared, while kern_execve would take it in > > exclusive mode. > > Both patches need some cleaning up. The name 'keeplock' is no exactly > the best either. > > In both cases the same mechanism blocks both exec and exit, this can be > split if needed (p_lock would still cover exit, p_something would cover > exec). > > Here is a version with sx lock: > > http://people.freebsd.org/~mjg/patches/exec-exit-hold-wait.patch > > I'm not really happy with this. Reading foreign fdt is very rare and > this adds lock + unlock for every exec and exit. > > On the other hand mere counter version is rather simple: > > http://people.freebsd.org/~mjg/patches/exec-exit-hold-nolock.patch > > I don't have strong opinion here, but prefer the latter.
I suggest the name 'imagelock' for the beast. The nolock version requires two atomics on both entry and leave from the protected region, while sx-locked variant requires only one atomic for entry and leave. I am not sure why you decided to acquire p->p_keeplock in after the proc lock in pget(), which indeed causes the complications of dropping the proc_lock and rechecking to avoid LOR. Did you tried to add a flag to pfind*() functions to indicate that p_keeplock should be acquired, instead ?
pgpDPPmoqHYU0.pgp
Description: PGP signature