On Thu, Jul 02, 2026 at 11:34:55AM +0200, Michal Suchánek wrote:
> On Thu, Jul 02, 2026 at 11:20:03AM +0530, Mukesh Kumar Chaurasiya wrote:
> > On Wed, Jul 01, 2026 at 10:29:49AM +0200, Michal Suchánek wrote:
> > > On Wed, Jul 01, 2026 at 10:01:49AM +0200, Michal Suchánek wrote:
> > > > On Wed, Jul 01, 2026 at 09:41:57AM +0200, Michal Suchánek wrote:
> > > > > > On Wed, Jul 01, 2026 at 11:57:00AM +0530, Mukesh Kumar Chaurasiya wrote:
> > > > > > On Wed, Jul 01, 2026 at 01:41:09AM +0530, Shrikanth Hegde wrote:
> > > > > > > Hi Mukesh.
> > > > > > > 
> > > > > > > On 6/29/26 11:59 PM, Mukesh Kumar Chaurasiya (IBM) wrote:
> > > 
> > > > > > > > > diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
> > > > > > > > index a9da2af6efa8..36d73933a311 100644
> > > > > > > > --- a/arch/powerpc/kernel/syscall.c
> > > > > > > > +++ b/arch/powerpc/kernel/syscall.c
> > > > > > > > @@ -20,7 +20,6 @@ notrace long system_call_exception(struct 
> > > > > > > > pt_regs *regs, unsigned long r0)
> > > > > > > >         syscall_fn f;
> > > > > > > >         add_random_kstack_offset();
> > > > > > > > -       r0 = syscall_enter_from_user_mode(regs, r0);
> > > > > > > >         if (unlikely(r0 >= NR_syscalls)) {
> > > > > > > >                 if (unlikely(trap_is_unsupported_scv(regs))) {
> > > > > > > > @@ -31,6 +30,12 @@ notrace long system_call_exception(struct 
> > > > > > > > pt_regs *regs, unsigned long r0)
> > > > > > > >                 return -ENOSYS;
> > > > > > > >         }
> > > > > > > > +       r0 = syscall_enter_from_user_mode(regs, r0);
> > > > > > > > +
> > > > > > > 
> > > > > > > > I see many architectures call syscall_enter_from_user_mode first
> > > > > > > > and then check the return value.
> > > > > > > > Take x86 for example:
> > > > > > > 
> > > > > > > __visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
> > > > > > > {
> > > > > > >         nr = syscall_enter_from_user_mode(regs, nr);
> > > > > > > 
> > > > > > > >         if (!do_syscall_x64(regs, nr) && !do_syscall_x32(regs, nr) && nr != -1) {
> > > > > > > >                 /* Invalid system call, but still a system call. */
> > > > > > >                 regs->ax = __x64_sys_ni_syscall(regs);
> > > > > > >         }
> > > > > > > 
> > > > > > > }
> > > > > > > 
> > > > > > > So seccomp fails silently there if initial nr was -1?
> > > > > > > 
> > > > > > Hey,
> > > > > > 
> > > > > > > No, the -1 syscall ignores the error silently and returns 0.
> > > > > > 
> > > > > 
> > > > > There seems to be some inconsistency with the invalid syscalls.
> > > > > 
> > > > > > Adapting the example from the seccomp man page to ignore the
> > > > > > architecture, I get on x86_64 (presumably with GENERIC_ENTRY since
> > > > > > long ago):
> > > > > 
> > > > > > ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > > > > ret=-1 errno=55 (No anode)
> > > > > 
> > > > > > but on ppc64le (with GENERIC_ENTRY):
> > > > > 
> > > > > > ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > > > > ret=-1 errno=38 (Function not implemented)
> > > > > 
> > > > > > That said, the behavior of seccomp on invalid syscalls is not
> > > > > > particularly concerning. The tools that people use for constructing
> > > > > > those filters typically require a valid syscall number.
> > > > > 
> > > > > It would be nice to align, though.
> > > > 
> > > > It is more concerning for SECCOMP_SET_MODE_STRICT or similar. So it
> > > > should be resolved to correctly execute seccomp even on invalid
> > > > syscalls. The syscall_enter_from_user_mode API is not particularly
> > > > well-suited for that, though.
> > > 
> > > In particular the fixup per
> > > https://lore.kernel.org/linuxppc-dev/[email protected]/
> > > 
> > > handles some cases
> > > 
> > > > ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > > ret=-1 errno=55 (No anode)
> > > 
> > > but not -1
> > > 
> > > > ./a.out -1 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-1, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > > ret=-1 errno=38 (Function not implemented)
> > > 
> > > which is the direct result of the ambiguous return value of
> > > syscall_enter_from_user_mode
> > > 
> > > Thanks
> > > 
> > > Michal
> > Hey Michal,
> > 
> > > Yeah, this seems to be more complex than anticipated.
> > > As per the conversation on your other patch here
> > https://lore.kernel.org/all/[email protected]/
> > 
> > This patch seems to be redundant at this point.
> 
> Hello,
> 
> > While improving the entry API is a fine goal, it will take time, if
> > anything can be agreed on at all.
> > 
> > In the meantime we should provide a fix using the current API.
> 
> Inability to run container workloads is a significant regression.
> 
> Thanks
> 
> Michal

Yeah, I am working on a fix for now. I will post a new version.

Regards,
Mukesh
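
For readers following the thread, here is a small illustrative model of
the ambiguity discussed above: an entry hook that reports "skip this
syscall" by returning -1 cannot be told apart, by its return value
alone, from a caller that asked for syscall number -1. This is not
kernel code. model_enter(), struct model_regs and MODEL_NO_FILTER are
hypothetical names made up for this sketch, and the single-number
filter is a simplifying assumption.

```c
#include <stdbool.h>

/* Hypothetical model for illustration only; none of this is kernel code. */
#define MODEL_NO_FILTER (-1000L)	/* sentinel: no syscall number is filtered */

struct model_regs {
	long result;	/* stands in for the return-value register */
};

/*
 * Simplified stand-in for an entry hook with a single return value.
 * When the filter rejects the call it stores -err in regs->result and
 * returns -1; otherwise it returns nr unchanged.
 */
static long model_enter(struct model_regs *regs, long nr,
			long filtered_nr, int err)
{
	if (filtered_nr != MODEL_NO_FILTER && nr == filtered_nr) {
		regs->result = -err;
		return -1;
	}
	return nr;
}

/* True when the return value alone cannot tell the two cases apart. */
static bool model_is_ambiguous(void)
{
	struct model_regs a = { 0 }, b = { 0 };

	/* Case A: the caller requests syscall -1 and nothing is filtered. */
	long ra = model_enter(&a, -1, MODEL_NO_FILTER, 0);

	/* Case B: the caller requests syscall 5 and the filter rejects it. */
	long rb = model_enter(&b, 5, 5, 55);

	return ra == rb;
}
```

In this model both cases return -1, so a dispatcher that branches only
on the return value has to consult other state (here regs->result, or
the original syscall number) to decide which error to report.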
