On Fri, Jul 19, 2019 at 01:40:13PM -0400, Andy Lutomirski wrote:
> > On Jul 19, 2019, at 1:03 PM, Sean Christopherson 
> > <sean.j.christopher...@intel.com> wrote:
> > 
> > The generic vDSO implementation, starting with commit
> > 
> >   7ac870747988 ("x86/vdso: Switch to generic vDSO implementation")
> > 
> > breaks seccomp-enabled userspace on 32-bit x86 (i386) kernels.  Prior to
> > the generic implementation, the x86 vDSO used identical code for both
> > x86_64 and i386 kernels, which worked because it did all calculations using
> > structs with naturally sized variables, i.e. didn't use __kernel_timespec.
> > 
> > The generic vDSO does its internal calculations using __kernel_timespec,
> > which in turn requires the i386 fallback syscall to use the 64-bit
> > variation, __NR_clock_gettime64.
> 
> This is basically doomed to break eventually, right?

Just so I'm understanding: the vDSO change introduced code to make an
actual syscall on i386, which for most seccomp filters would be rejected?
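
To make that question concrete, here is a toy model of the failure mode,
not a real seccomp BPF program. The allowlist contents are hypothetical
(my assumption of what an older filter might permit); the syscall numbers
are the i386 ones from syscall_32.tbl. A filter written before
clock_gettime64 existed has no entry for it, so the vDSO fallback gets
rejected even though clock_gettime itself is allowed:

```c
#include <stdbool.h>
#include <stddef.h>

/* i386 syscall numbers (arch/x86/entry/syscalls/syscall_32.tbl). */
enum {
    NR_I386_READ            = 3,
    NR_I386_WRITE           = 4,
    NR_I386_CLOCK_GETTIME   = 265,
    NR_I386_CLOCK_GETTIME64 = 403,
};

/*
 * Hypothetical allowlist, as it might have been written before
 * clock_gettime64 was added. Illustration only.
 */
static const int legacy_allowlist[] = {
    NR_I386_READ,
    NR_I386_WRITE,
    NR_I386_CLOCK_GETTIME,
};

/* Toy stand-in for a seccomp filter: true means "allow". */
static bool legacy_filter_allows(int nr)
{
    size_t n = sizeof(legacy_allowlist) / sizeof(legacy_allowlist[0]);

    for (size_t i = 0; i < n; i++) {
        if (legacy_allowlist[i] == nr)
            return true;
    }
    return false;
}
```

With this model, legacy_filter_allows(NR_I386_CLOCK_GETTIME) is true while
legacy_filter_allows(NR_I386_CLOCK_GETTIME64) is false, which is the
breakage being described if the fallback switched syscall numbers.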

> I’ve occasionally considered adding a concept of “seccomp aliases”.  The idea 
> is that, if a filter returns anything other than ALLOW, we re-run it with a 
> different nr that we dig out of a small list of such cases. This would be 
> limited to cases where the new syscall does the same thing with the same 
> arguments.

Would that help here? The kernel just sees this as a direct syscall. I
guess it could whitelist it by checking the return address?
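
For reference, the alias idea quoted above might look roughly like the
sketch below. This is a userspace model, not kernel code: the verdict
enum, the alias table, and run_with_aliases() are all names I made up,
and whether clock_gettime64 actually qualifies as "same thing with the
same arguments" is an open question, so the single table entry is purely
illustrative.

```c
#include <stddef.h>

/* Simplified verdicts; real seccomp has more return actions. */
enum alias_verdict { ALIAS_VERDICT_ALLOW, ALIAS_VERDICT_DENY };

/* Hypothetical alias entry: new syscall number -> older equivalent. */
struct nr_alias {
    int new_nr;
    int old_nr;
};

/* Illustrative table: i386 clock_gettime64 (403) -> clock_gettime (265). */
static const struct nr_alias alias_table[] = {
    { 403, 265 },
};

typedef enum alias_verdict (*alias_filter_fn)(int nr);

/*
 * Run the filter on nr. If the result is anything other than ALLOW and
 * nr has an alias, re-run the filter once with the aliased number and
 * return that second verdict instead.
 */
static enum alias_verdict run_with_aliases(alias_filter_fn filter, int nr)
{
    enum alias_verdict v = filter(nr);
    size_t n = sizeof(alias_table) / sizeof(alias_table[0]);

    if (v == ALIAS_VERDICT_ALLOW)
        return v;

    for (size_t i = 0; i < n; i++) {
        if (alias_table[i].new_nr == nr)
            return filter(alias_table[i].old_nr);
    }
    return v;
}

/* Example filter that only knows about the old clock_gettime number. */
static enum alias_verdict old_style_filter(int nr)
{
    return nr == 265 ? ALIAS_VERDICT_ALLOW : ALIAS_VERDICT_DENY;
}
```

In this model, run_with_aliases(old_style_filter, 403) comes back ALLOW
via the alias, while a number with no alias entry keeps its original
verdict.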

> I want this for restart_syscall: I want to renumber it.

Oh man, don't get me started on restart_syscall. Some architectures make
it invisible to seccomp and others don't. ugh.

-- 
Kees Cook
