On Thu, Jul 02, 2026 at 10:12:35AM +0200, Sven Schnelle wrote:
> Michal Suchánek <[email protected]> writes:
> 
> > The return value of syscall_enter_from_user_mode is used both for the
> > adjusted syscall number and the indicator that a syscall should be
> > skipped.
> >
> > As seccomp can be invoked on any syscall, including invalid ones this
> > somewhat undermines seccomp.
> >
> > While the seccomp variants that terminate the process do not need to
> > care about this for the filter that sets the syscall return value this
> > distinction is required.
> >
> > Pass the syscall number as a pointer to the inline entry functions, and
> > use the return value exclusively for the indication that the syscall is
> > already handled.
> >
> > This should avoid the need for the s390 PIF_SYSCALL_RET_SET which is the
> > workaround for exactly this deficiency.
> 
> I'm not sure whether PIF_SYSCALL_RET_SET can be removed - the syscall
> return might still get set by PTRACE_SET_SYSCALL_INFO when the tracee is
> stopped. This might be a positive number which can't be distinguished
> from a syscall number. But maybe i'm missing something? It's been quite
> a while since I touched all that ptrace stuff.

When the syscall return value is set (in the registers), the function
return value, which doubles as the modified syscall number, is set to -1
to indicate that the syscall was already handled. At least that's how
the API is described.

So yes, if the syscall number range is restricted, or if the syscall
number is returned through a path other than the function return value,
the flag should not be needed in the entry path, because the "already
handled" case can be detected through the return value alone.
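
To illustrate, here is a minimal user-space sketch of the two shapes. It
is not the kernel code: struct regs_model, enter_combined(),
enter_split() and the two helper functions are names made up for this
example, and the model only covers the "return value already set" case
discussed above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct regs_model {
	long ret;	/* stands in for the syscall return register */
};

/*
 * Combined shape: the return value is the (possibly adjusted) syscall
 * number, and -1 doubles as "skip, the return register is already set".
 */
long enter_combined(struct regs_model *regs, long nr, bool ret_set, long ret)
{
	if (ret_set) {
		regs->ret = ret;
		return -1;
	}
	return nr;
}

/*
 * Split shape: the syscall number travels through *nr, and the return
 * value only says whether the syscall was already handled.
 */
bool enter_split(struct regs_model *regs, long *nr, bool ret_set, long ret)
{
	(void)nr;	/* a real hook could rewrite *nr here; the model does not */
	if (ret_set) {
		regs->ret = ret;
		return true;
	}
	return false;
}

/* Returns true if the combined shape cannot tell the two cases apart. */
bool combined_is_ambiguous(void)
{
	struct regs_model a = { 0 }, b = { 0 };
	long handled = enter_combined(&a, 5, true, 42);	/* tracer set ret to 42 */
	long invalid = enter_combined(&b, -1, false, 0);	/* userspace passed -1 */

	return handled == invalid;
}

/* Returns true if the split shape distinguishes the same two cases. */
bool split_is_distinct(void)
{
	struct regs_model a = { 0 }, b = { 0 };
	long nr_a = 5, nr_b = -1;
	bool handled = enter_split(&a, &nr_a, true, 42);
	bool invalid = enter_split(&b, &nr_b, false, 0);

	assert(a.ret == 42);	/* the value set by the tracer is preserved */
	return handled && !invalid && nr_b == -1;
}
```

In the combined shape, an invalid syscall number of -1 coming from
userspace and a syscall whose return value was already set both produce
-1. In the split shape, the two cases stay apart without an extra flag.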

Thanks

Michal
