On Thu, Jul 02, 2026 at 10:12:35AM +0200, Sven Schnelle wrote:
> Michal Suchánek <[email protected]> writes:
>
> > The return value of syscall_enter_from_user_mode is used both for the
> > adjusted syscall number and the indicator that a syscall should be
> > skipped.
> >
> > As seccomp can be invoked on any syscall, including invalid ones this
> > somewhat undermines seccomp.
> >
> > While the seccomp variants that terminate the process do not need to
> > care about this for the filter that sets the syscall return value this
> > distinction is required.
> >
> > Pass the syscall number as a pointer to the inline entry functions, and
> > use the return value exclusively for the indication that the syscall is
> > already handled.
> >
> > This should avoid the need for the s390 PIF_SYSCALL_RET_SET which is the
> > workaround for exactly this deficiency.
>
> I'm not sure whether PIF_SYSCALL_RET_SET can be removed - the syscall
> return might still get set by PTRACE_SET_SYSCALL_INFO when the tracee is
> stopped. This might be a positive number which can't be distinguished
> from a syscall number. But maybe i'm missing something? It's been quite
> a while since I touched all that ptrace stuff.

When the syscall return value is set (in the registers), the function
return value, which doubles as the modified syscall number, is set to -1
to indicate that the syscall was handled. At least that's how the API is
described.

So yes: if the syscall number range is restricted, or the syscall number
is returned through a path other than the function return value, the
flag should not be needed in the entry path, because that case can be
detected through the return value alone.

Thanks

Michal
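
For illustration only, here is a minimal userspace sketch of the two
calling conventions discussed above. It is not kernel code: struct
mock_regs, mock_enter_old(), mock_enter_new() and mock_demo() are
hypothetical names, and the -EPERM value stands in for whatever a filter
or tracer might write into the registers. It only shows why a -1 return
is ambiguous when the same value carries the syscall number, and how
passing the number through a pointer would remove that ambiguity.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the saved user registers; not the kernel's pt_regs. */
struct mock_regs {
	long ret;	/* slot holding the syscall return value */
};

/*
 * Old-style convention (mock): the return value is the possibly adjusted
 * syscall number, and -1 doubles as "skip, return value already set".
 */
static long mock_enter_old(struct mock_regs *regs, long nr, bool ret_set)
{
	if (ret_set) {
		regs->ret = -EPERM;	/* value picked by the hypothetical filter */
		return -1;
	}
	return nr;
}

/*
 * Proposed-style convention (mock): the syscall number travels through a
 * pointer, the return value only says whether the syscall was handled.
 */
static bool mock_enter_new(struct mock_regs *regs, long *nr, bool ret_set)
{
	(void)nr;	/* a tracer could rewrite *nr here; the mock leaves it alone */
	if (ret_set) {
		regs->ret = -EPERM;
		return true;
	}
	return false;
}

/* Returns 0 when both conventions behave as described in the comments. */
int mock_demo(void)
{
	struct mock_regs regs = { .ret = 0 };
	long nr = -1;

	/* Old: an invalid number (-1) and a handled syscall look the same. */
	assert(mock_enter_old(&regs, -1, false) == mock_enter_old(&regs, 5, true));

	/* New: the invalid number is not reported as handled ... */
	assert(!mock_enter_new(&regs, &nr, false) && nr == -1);
	/* ... while a filter that set the return value is. */
	assert(mock_enter_new(&regs, &nr, true) && regs.ret == -EPERM);
	return 0;
}
```

Calling mock_demo() from a small main() exercises both variants. In the
mock, the old-style caller cannot tell an invalid syscall number of -1
from a handled syscall without extra state, while the new-style caller
can decide from the boolean alone.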
