This patchset for the hardware transactional memory (TM) subsystem aims to avoid spending a lot of time in TM suspended mode in kernel space. It basically changes where the reclaim/recheckpoint is executed.

The hardware is designed so that, once a CPU enters transactional state, it uses a footprint area to track the loads/stores performed in the transaction, so that it can later be verified whether a conflict happened due to a change made by another thread. If a transaction is active in userspace and an exception takes the CPU to kernel space, the CPU moves the transaction to suspended state but does not discard the registers (GPR, VEC, VSX, FP) from the footprint area, although the memory footprint might be discarded.

POWER9 has a known problem [1, 2]: it does not have enough room in the footprint area for several transactions to be suspended at the same time on various CPUs, which leads to CPU stalls. This new model, together with a future 'fake userspace suspended' implementation, may work around the POWER9 hardware issue.

This patchset aims to reclaim the checkpointed registers as soon as the kernel is invoked, at the beginning of the exception handlers. Each CPU then stays in suspended mode only for a short period of time, freeing room for other CPUs to enter suspended mode and avoiding having too many CPUs in suspended state, which can cause the CPUs to stall. The same mechanism applies on kernel exit: the recheckpoint (which reloads the checkpointed registers into the CPU's checkpoint area) is done as late as possible, in the exception return path.

The way to achieve this goal is to create a macro (TM_KERNEL_ENTRY) that checks, just after getting into kernel space, whether userspace was in an active transaction, and reclaims the transaction if that is the case. All exception handlers call this macro as soon as possible. All exceptions should reclaim (if necessary) at this stage and only recheckpoint if the task is tagged as TIF_RESTORE_TM (i.e. was in transactional state before being interrupted), which is done in restore_tm_state().
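
The entry/exit flow described above can be sketched as a small userspace model. This is only an illustration, not the kernel code: the struct, the enum and the model_*() helpers are hypothetical names invented for this sketch, and a plain boolean stands in for the TIF_RESTORE_TM flag. It also assumes that "active" covers both the transactional and the suspended state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the TM state a CPU can be in. */
enum tm_state { TM_NONE, TM_TRANSACTIONAL, TM_SUSPENDED };

/* Hypothetical stand-in for the task/CPU state involved in the flow. */
struct model_task {
	enum tm_state hw_state;   /* what the CPU currently holds */
	enum tm_state user_state; /* state userspace had when interrupted */
	bool restore_tm;          /* models the TIF_RESTORE_TM flag */
};

/* Exception from userspace: an active transaction becomes suspended. */
static void model_exception(struct model_task *t)
{
	t->user_state = t->hw_state;
	if (t->hw_state == TM_TRANSACTIONAL)
		t->hw_state = TM_SUSPENDED;
}

/* Models the role of TM_KERNEL_ENTRY: reclaim as early as possible. */
static void model_kernel_entry(struct model_task *t)
{
	if (t->hw_state == TM_NONE)
		return;                /* userspace was not in a transaction */
	t->hw_state = TM_NONE;         /* reclaim frees the checkpoint area */
	t->restore_tm = true;          /* remember to recheckpoint on exit */
}

/* Models the role of restore_tm_state(): recheckpoint as late as possible. */
static void model_kernel_exit(struct model_task *t)
{
	/* Assumption of this model: no checkpoint is live while in the kernel. */
	assert(t->hw_state == TM_NONE);
	if (!t->restore_tm)
		return;
	t->hw_state = t->user_state;   /* recheckpoint restores user's state */
	t->restore_tm = false;
}
```

In this model the CPU is in TM_SUSPENDED only between model_exception() and model_kernel_entry(), which is the short window the patchset is trying to achieve.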

Hence, by allowing the CPU to be in suspended state for only a brief period, it is possible to create the initial infrastructure that copes with the TM hardware limitations.

This patchset was tested in different scenarios using different test suites, such as the kernel selftests, OpenJDK TM tests, and htm-torture [3], in the following configurations:

 * POWER8/pseries LE and BE
 * POWER8/powernv LE
 * POWER9/pseries LE
 * POWER8/powernv LE hosting KVM guests running TM tests

This patchset is based on initial work done by Cyril Bur:
https://patchwork.ozlabs.org/cover/875341/

V1 patchset URL: https://patchwork.ozlabs.org/cover/969173/

Major changes from v1:

 * restore_tm_state() is now called later in the kernel exit path, so there is no chance to replay an IRQ with TM in suspended state. This is mostly described in the 'Recheckpoint at exit path' patch.

 * No need to force the TEXASR[FS] bit explicitly. This was required because, in a very specific case, the TEXASR SPR was not being restored properly but MSR[TM] was set. Fixed in patch 'Do not restore TM without SPRs'.

 * All treclaim/trechkpoint calls have a WARN_ON() if not called on the kernel entry or exit path. tm_reclaim() is only called by TM_KERNEL_ENTRY and tm_recheckpoint() is only called by restore_tm_state(). Any other caller triggers a warning.

Regards,
Breno

[1] Documentation/powerpc/transactional_memory.txt
[2] commit 4bb3c7a0208fc13ca70598efd109901a7cd45ae7
[3] https://github.com/leitao/htm_torture/

Breno Leitao (14):
  powerpc/tm: Reclaim transaction on kernel entry
  powerpc/tm: Reclaim on unavailable exception
  powerpc/tm: Recheckpoint when exiting from kernel
  powerpc/tm: Always set TIF_RESTORE_TM on reclaim
  powerpc/tm: Refactor the __switch_to_tm code
  powerpc/tm: Do not recheckpoint at sigreturn
  powerpc/tm: Do not reclaim on ptrace
  powerpc/tm: Recheckpoint at exit path
  powerpc/tm: Warn if state is transactional
  powerpc/tm: Improve TM debug information
  powerpc/tm: Save MSR to PACA before RFID
  powerpc/tm: Restore transactional SPRs
  powerpc/tm: Do not restore TM without SPRs
  selftests/powerpc: Adapt tm-syscall test to no suspend

 arch/powerpc/include/asm/exception-64s.h      |  50 ++++
 arch/powerpc/include/asm/thread_info.h        |   2 +-
 arch/powerpc/kernel/asm-offsets.c             |   4 +
 arch/powerpc/kernel/entry_64.S                |  37 ++-
 arch/powerpc/kernel/exceptions-64s.S          |  15 +-
 arch/powerpc/kernel/process.c                 | 242 ++++++++++--------
 arch/powerpc/kernel/ptrace.c                  |  16 +-
 arch/powerpc/kernel/signal.c                  |   2 +-
 arch/powerpc/kernel/signal_32.c               |  38 +--
 arch/powerpc/kernel/signal_64.c               |  42 ++-
 arch/powerpc/kernel/tm.S                      |  19 +-
 arch/powerpc/kernel/traps.c                   |  22 +-
 .../testing/selftests/powerpc/tm/tm-syscall.c |   6 -
 13 files changed, 293 insertions(+), 202 deletions(-)

-- 
2.19.0