On Thu, Oct 10, 2019 at 06:15:16PM +0100, Suzuki K Poulose wrote: > We detect the absence of FP/SIMD after we boot the SMP CPUs, and by then > we have kernel threads running already with TIF_FOREIGN_FPSTATE set which > could be inherited by early userspace applications (e.g, modprobe triggered > from initramfs). This could end up in the applications stuck in > do_nofity_resume() as we never clear the TIF flag, once we now know that > we don't support FP. > > Fix this by making sure that we clear the TIF_FOREIGN_FPSTATE flag > for tasks which may have them set, as we would have done in the normal > case, but avoiding touching the hardware state (since we don't support any). > > Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD") > Cc: Will Deacon <w...@kernel.org> > Cc: Mark Rutland <mark.rutl...@arm.com> > Cc: Catalin Marinas <catalin.mari...@arm.com> > Signed-off-by: Suzuki K Poulose <suzuki.poul...@arm.com> > --- > arch/arm64/kernel/fpsimd.c | 26 ++++++++++++++++---------- > 1 file changed, 16 insertions(+), 10 deletions(-) > > diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c > index 37d3912cfe06..dfcdd077aeca 100644 > --- a/arch/arm64/kernel/fpsimd.c > +++ b/arch/arm64/kernel/fpsimd.c > @@ -1128,12 +1128,19 @@ void fpsimd_bind_state_to_cpu(struct > user_fpsimd_state *st, void *sve_state, > */ > void fpsimd_restore_current_state(void) > { > - if (!system_supports_fpsimd()) > - return; > - > get_cpu_fpsimd_context(); > - > - if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) { > + /* > + * For the tasks that were created before we detected the absence of > + * FP/SIMD, the TIF_FOREIGN_FPSTATE could be set via > fpsimd_thread_switch() > + * and/or could be inherited from the parent(init_task has this set). > Even > + * though userspace has not run yet, this could be inherited by the > + * processes forked from one of those tasks (e.g, modprobe from > initramfs). > + * If the system doesn't support FP/SIMD, we must clear the flag for the > + * tasks mentioned above, to indicate that the FPSTATE is clean (as we > + * can't have one) to avoid looping for ever to clear the flag. > + */ > + if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE) && > + system_supports_fpsimd()) {
I'm not too keen on this approach: elsewhere we just stub out all the FPSIMD handling logic if !system_supports_fpsimd() -- I think we should be using this test everywhere rather than relying on TIF_FOREIGN_FPSTATE. Rather, I feel that TIF_FOREIGN_FPSTATE means "if this is a user task and this task is current() and the system supports FPSIMD at all, this task's FPSIMD state is not loaded in the cpu". I think we should ensure that any check on TIF_FOREIGN_FPSTATE is shadowed by a check on system_supports_fpsimd() somewhere. This already exists in many places -- we just need to fill in the missing ones. fpsimd_save() is a backend function that should only be called if system_supports_fpsimd(), so that should not need any check internally, but we should make sure that calls to this function are appropriately protected with in if (system_supports_fpsimd()). For other maintenance functions intended for outside callers: * fpsimd_bind_task_to_cpu() * fpsimd_bind_state_to_cpu() * fpsimd_flush_task_state() * fpsimd_save_and_flush_cpu_state() the situation is less clear. Does is make sense to call these at all if !system_supports_fpsimd()? I'm not currently sure. We could at least drop some WARN_ON() into these to check, after revieweing their callsites. > task_fpsimd_load(); > fpsimd_bind_task_to_cpu(); > } > @@ -1148,17 +1155,16 @@ void fpsimd_restore_current_state(void) > */ > void fpsimd_update_current_state(struct user_fpsimd_state const *state) > { > - if (!system_supports_fpsimd()) > - return; > - > get_cpu_fpsimd_context(); > > current->thread.uw.fpsimd_state = *state; > if (system_supports_sve() && test_thread_flag(TIF_SVE)) > fpsimd_to_sve(current); Why should we do this stuff on a system that doesn't support FP? > - task_fpsimd_load(); > - fpsimd_bind_task_to_cpu(); > + if (system_supports_fpsimd()) { > + task_fpsimd_load(); > + fpsimd_bind_task_to_cpu(); > + } > > clear_thread_flag(TIF_FOREIGN_FPSTATE); [...] Not in scope for a stable fix, but: It would be interesting to try to strip out TIF_FOREIGN_FPSTATE entirely and do some benchmarks and irq latency measurements: TIF_FOREIGN_FPSTATE is just a cached copy of the wrong_task || wrong_cpu condition defined in fpsimd_thread_switch() -- That means we have to do maintenance on it all over the place to keep it in sync with the condition it represents -- this has proven to be a source of complexity and subtle bugs, as well as making the code fragile to maintain. The only point of all this is so that there is a thread flag for do_notify_resume() to check. Now that do_notify_resume() is C it would be trivial to check the real condition -- there would be a cost increase and interrupt latency increase here, but maybe not that much. This wouldn't solve the whole problem, but it might remove a layer of complexity. Cheers ---Dave