On Wed, Jul 01, 2026 at 11:57:00AM +0530, Mukesh Kumar Chaurasiya wrote:
> On Wed, Jul 01, 2026 at 01:41:09AM +0530, Shrikanth Hegde wrote:
> > Hi Mukesh.
> > 
> > On 6/29/26 11:59 PM, Mukesh Kumar Chaurasiya (IBM) wrote:
> > > After enabling GENERIC_ENTRY on PowerPC, seccomp filters using
> > > SCMP_ACT_ERRNO without an explicit errnoRet value return ENOSYS
> > > (Function not implemented) instead of the expected EPERM (Operation
> > > not permitted).
> > > 
> > > The issue occurs in system_call_exception() when
> > > syscall_enter_from_user_mode() returns -1 to indicate the syscall
> > > should be skipped (e.g., blocked by seccomp). The current code treats
> > > this -1 as a syscall number and compares it against NR_syscalls.
> > > Since -1 is greater than NR_syscalls, the code incorrectly returns
> > > -ENOSYS, overwriting the errno that seccomp already set via
> > > syscall_set_return_value().
> > > 
> > > The generic entry code in syscall_trace_enter() calls
> > > __secure_computing(), which sets the appropriate errno in
> > > regs->gpr[3] and returns -1 to signal that the syscall should be
> > > skipped. However, the PowerPC syscall handler was not checking for
> > > this -1 return value before validating the syscall number.
> > > 
> > > Fix this by explicitly checking if syscall_enter_from_user_mode() returns
> > > -1 and returning the value already set in regs->gpr[3] (the errno from
> > > seccomp) before performing the syscall number validation.
> > > 
> > > Also move the syscall_enter_from_user_mode() call and the seccomp/ptrace
> > > skip check to after the NR_syscalls bounds check.
> > > 
> > > When syscall -1 was passed, the r0 == -1L check would trigger before
> > > the NR_syscalls check, causing syscall_get_error() to return 0 instead
> > > of -ENOSYS. This resulted in a silent success (ret=0, errno=0) instead
> > > of the expected ENOSYS error.
> > > 
> > > By moving syscall_enter_from_user_mode() after the bounds check, an
> > > initial syscall number of -1 is correctly rejected with -ENOSYS first.
> > > The seccomp/ptrace skip path still works correctly for valid syscall
> > > numbers that get overridden to -1 by seccomp or ptrace.
> > > 
> > > This aligns PowerPC's behavior with other architectures using
> > > GENERIC_ENTRY and restores correct seccomp errno handling.
> > > 
> > > Fixes: bee25f97ad24 ("powerpc: Enable GENERIC_ENTRY feature")
> > > Reported-by: Michal Suchánek <[email protected]>
> > > Closes: https://lore.kernel.org/all/[email protected]/
> > > Signed-off-by: Mukesh Kumar Chaurasiya (IBM) <[email protected]>
> > > ---
> > > 
> > > v1 -> v2:
> > >   - Fix issues in the previous fix (Michal)
> > > v1: https://lore.kernel.org/all/[email protected]
> > > 
> > >   arch/powerpc/kernel/syscall.c | 7 ++++++-
> > >   1 file changed, 6 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
> > > index a9da2af6efa8..36d73933a311 100644
> > > --- a/arch/powerpc/kernel/syscall.c
> > > +++ b/arch/powerpc/kernel/syscall.c
> > > @@ -20,7 +20,6 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> > >           syscall_fn f;
> > >           add_random_kstack_offset();
> > > - r0 = syscall_enter_from_user_mode(regs, r0);
> > >           if (unlikely(r0 >= NR_syscalls)) {
> > >                   if (unlikely(trap_is_unsupported_scv(regs))) {
> > > @@ -31,6 +30,12 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> > >                   return -ENOSYS;
> > >           }
> > > + r0 = syscall_enter_from_user_mode(regs, r0);
> > > +
> > 
> > I see many arch first do syscall_enter_from_user_mode and then check for 
> > return value.
> > take x86 for example,
> > 
> > __visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
> > {
> >         nr = syscall_enter_from_user_mode(regs, nr);
> > 
> >         if (!do_syscall_x64(regs, nr) && !do_syscall_x32(regs, nr) && nr != -1) {
> >                 /* Invalid system call, but still a system call. */
> >                 regs->ax = __x64_sys_ni_syscall(regs);
> >         }
> > 
> > }
> > 
> > So seccomp fails silently there if initial nr was -1?
> > 
> Hey,
> 
> No the -1 syscall ignores the error silently and returns 0.
> 

There seems to be some inconsistency in how invalid syscall numbers are
handled.

Adapting the example from the seccomp man page so that it ignores the
architecture, I get on x86_64 (which has presumably used GENERIC_ENTRY
for a long time):

./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
ret=-1 errno=55 (No anode)

but on ppc64le (with GENERIC_ENTRY):

./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
ret=-1 errno=38 (Function not implemented)

That said, the behavior of seccomp on invalid syscalls is not
particularly concerning: the tools people typically use to construct
these filters require a valid syscall number.

It would be nice to align, though.

Thanks

Michal
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

static int
install_filter(int syscall_nr, int f_errno)
{
	struct sock_filter filter[] = {
		/* [0] Load architecture from 'seccomp_data' buffer into
		   accumulator.  The value is overwritten by [1]; the
		   architecture is deliberately not checked here.  */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
				(offsetof(struct seccomp_data, arch))),

		/* [1] Load system call number from 'seccomp_data' buffer into
		   accumulator.  */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
				(offsetof(struct seccomp_data, nr))),

		/* [2] Jump forward 1 instruction if system call number
		   does not match 'syscall_nr'.  */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, syscall_nr, 0, 1),

		/* [3] Matching system call: don't execute
		   the system call, and return 'f_errno' in 'errno'.  */
		BPF_STMT(BPF_RET | BPF_K,
				SECCOMP_RET_ERRNO | (f_errno & SECCOMP_RET_DATA)),

		/* [4] Destination of system call number mismatch: allow other
		   system calls.  */
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};

	struct sock_fprog prog = {
		.len = sizeof(filter) / sizeof(*filter),
		.filter = filter,
	};

	if (syscall(SYS_seccomp, SECCOMP_SET_MODE_FILTER, 0, &prog)) {
		perror("seccomp");
		return 1;
	}

	return 0;
}

int
main(int argc, char *argv[])
{
	if (argc < 4) {
		fprintf(stderr, "Usage: "
				"%s <syscall_nr> <errno> <prog> [<args>]\n"
				"\n", argv[0]);
		exit(EXIT_FAILURE);
	}

	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
		perror("prctl");
		exit(EXIT_FAILURE);
	}

	if (install_filter(strtol(argv[1], NULL, 0),
				strtol(argv[2], NULL, 0)))
		exit(EXIT_FAILURE);

	execv(argv[3], &argv[3]);
	perror("execv");
	exit(EXIT_FAILURE);
}
