On Mon, Dec 02, 2024 at 12:58:59AM +0530, Shrikanth Hegde wrote:
> 
> On 11/26/24 16:23, Christophe Leroy wrote:
> > 
> > Le 16/11/2024 à 20:23, Shrikanth Hegde a écrit :
> > > Define the preempt lazy bit for powerpc. Use bit 9, which is free and
> > > within the 16-bit range of NEED_RESCHED, so the compiler can issue a
> > > single andi.
> > > 
> > > Since powerpc doesn't use the generic entry/exit code, add the lazy
> > > check at exit to user. CONFIG_PREEMPTION is defined for lazy/full/rt,
> > > so use it for return to kernel.
> > 
> > FWIW, there is work in progress on using the generic entry/exit code
> > for powerpc; if you can help test it, that would be useful. See:
> > https://patchwork.ozlabs.org/project/linuxppc-dev/patch/f0ae0a4571ce3126+20241111031934.1579-2-luming...@shingroup.cn/
> 
> I gave that series a try. After a lot of manual patching on the tip tree
> and removal of the multiple definitions of regs_irqs_disabled, I was able
> to compile and boot.
> 
> I am skimming through the series, but as far as I understand from the
> comments, it needs to be redesigned. I can probably see why. I will go
> through it in more detail to figure out how to do it better. I believe
> the changes have to stem from interrupt_64.S.
Thanks for your time on this work. When you do it, besides compile, boot,
and the ppc linux-ci GitHub workflow, please also run the seccomp
selftests:

	make -C tools/testing/selftests TARGETS=seccomp run_tests

to make sure all 98 seccomp unit tests pass.

> 
> As far as the bits of this patch are concerned with respect to that
> series, it is the NEED_RESCHED bits. There are at least a couple of major
> issues in that patch series that are wrong.
> 
> 1. It only tries to move exit to user to the generic code; exit to kernel
> is not moved. There is a generic irqentry_exit that handles it for
> generic code, but the powerpc exit-to-kernel path is still there.
> 
> 2. Even for exit to user, it ends up calling exit_to_user_mode_loop twice
> for the same syscall, which seems wrong: once in
> interrupt_exit_user_prepare_main and once through
> syscall_exit_to_user_mode in syscall_exit_prepare.

I knew this, as the ppc system call path uses the interrupt path. I
planned to delete the ppc-specific interrupt_exit_user_prepare_main later
in this series. Now might be good timing to complete it, as there is a
real use case that needs this to be done. :-)

> > 
> > Christophe
> 
> So I guess, when we do eventually move, these checks would be removed at
> that point along with the rest of the code.
> 
> > > Ran a few benchmarks and a db workload on Power10. Performance is
> > > close to preempt=none/voluntary.
> > > Since powerpc systems can have large core counts and large memory,
> > > preempt lazy is going to be helpful in avoiding soft lockup issues.
> > > 
> > > Reviewed-by: Sebastian Andrzej Siewior <bige...@linutronix.de>
> > > Reviewed-by: Ankur Arora <ankur.a.ar...@oracle.com>
> > > Signed-off-by: Shrikanth Hegde <sshe...@linux.ibm.com>
> > > ---
> > >  arch/powerpc/Kconfig                   | 1 +
> > >  arch/powerpc/include/asm/thread_info.h | 9 ++++++---
> > >  arch/powerpc/kernel/interrupt.c        | 4 ++--
> > >  3 files changed, 9 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > > index 8094a01974cc..2f625aecf94b 100644
> > > --- a/arch/powerpc/Kconfig
> > > +++ b/arch/powerpc/Kconfig
> > > @@ -145,6 +145,7 @@ config PPC
> > >  	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
> > >  	select ARCH_HAS_PHYS_TO_DMA
> > >  	select ARCH_HAS_PMEM_API
> > > +	select ARCH_HAS_PREEMPT_LAZY
> > >  	select ARCH_HAS_PTE_DEVMAP		if PPC_BOOK3S_64
> > >  	select ARCH_HAS_PTE_SPECIAL
> > >  	select ARCH_HAS_SCALED_CPUTIME		if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
> > > diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
> > > index 6ebca2996f18..2785c7462ebf 100644
> > > --- a/arch/powerpc/include/asm/thread_info.h
> > > +++ b/arch/powerpc/include/asm/thread_info.h
> > > @@ -103,6 +103,7 @@ void arch_setup_new_exec(void);
> > >  #define TIF_PATCH_PENDING	6	/* pending live patching update */
> > >  #define TIF_SYSCALL_AUDIT	7	/* syscall auditing active */
> > >  #define TIF_SINGLESTEP		8	/* singlestepping active */
> > > +#define TIF_NEED_RESCHED_LAZY	9	/* Scheduler driven lazy preemption */
> > >  #define TIF_SECCOMP		10	/* secure computing */
> > >  #define TIF_RESTOREALL		11	/* Restore all regs (implies NOERROR) */
> > >  #define TIF_NOERROR		12	/* Force successful syscall return */
> > > @@ -122,6 +123,7 @@ void arch_setup_new_exec(void);
> > >  #define _TIF_SYSCALL_TRACE	(1<<TIF_SYSCALL_TRACE)
> > >  #define _TIF_SIGPENDING		(1<<TIF_SIGPENDING)
> > >  #define _TIF_NEED_RESCHED	(1<<TIF_NEED_RESCHED)
> > > +#define _TIF_NEED_RESCHED_LAZY	(1<<TIF_NEED_RESCHED_LAZY)
> > >  #define _TIF_NOTIFY_SIGNAL	(1<<TIF_NOTIFY_SIGNAL)
> > >  #define _TIF_POLLING_NRFLAG	(1<<TIF_POLLING_NRFLAG)
> > >  #define _TIF_32BIT		(1<<TIF_32BIT)
> > > @@ -142,9 +144,10 @@ void arch_setup_new_exec(void);
> > >  					 _TIF_SYSCALL_EMU)
> > >  
> > >  #define _TIF_USER_WORK_MASK	(_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
> > > -				 _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
> > > -				 _TIF_RESTORE_TM | _TIF_PATCH_PENDING | \
> > > -				 _TIF_NOTIFY_SIGNAL)
> > > +				 _TIF_NEED_RESCHED_LAZY | _TIF_NOTIFY_RESUME | \
> > > +				 _TIF_UPROBE | _TIF_RESTORE_TM | \
> > > +				 _TIF_PATCH_PENDING | _TIF_NOTIFY_SIGNAL)
> > > +
> > >  #define _TIF_PERSYSCALL_MASK	(_TIF_RESTOREALL|_TIF_NOERROR)
> > >  
> > >  /* Bits in local_flags */
> > > diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> > > index af62ec974b97..8f4acc55407b 100644
> > > --- a/arch/powerpc/kernel/interrupt.c
> > > +++ b/arch/powerpc/kernel/interrupt.c
> > > @@ -185,7 +185,7 @@ interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
> > >  	ti_flags = read_thread_flags();
> > >  	while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) {
> > >  		local_irq_enable();
> > > -		if (ti_flags & _TIF_NEED_RESCHED) {
> > > +		if (ti_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) {
> > >  			schedule();
> > >  		} else {
> > >  			/*
> > > @@ -396,7 +396,7 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
> > >  	/* Returning to a kernel context with local irqs enabled. */
> > >  	WARN_ON_ONCE(!(regs->msr & MSR_EE));
> > >  again:
> > > -	if (IS_ENABLED(CONFIG_PREEMPT)) {
> > > +	if (IS_ENABLED(CONFIG_PREEMPTION)) {
> > >  		/* Return to preemptible kernel context */
> > >  		if (unlikely(read_thread_flags() & _TIF_NEED_RESCHED)) {
> > >  			if (preempt_count() == 0)
> > > 