Nathan Lynch <nath...@linux.ibm.com> writes: > Hi Michael, > > Michael Ellerman <m...@ellerman.id.au> writes: >> Nathan Lynch <nath...@linux.ibm.com> writes: >>> The partition suspend sequence as specified in the platform >>> architecture requires that all active processor threads call >>> H_JOIN, which: >> ... >>> diff --git a/arch/powerpc/platforms/pseries/mobility.c >>> b/arch/powerpc/platforms/pseries/mobility.c >>> index 1b8ae221b98a..44ca7d4e143d 100644 >>> --- a/arch/powerpc/platforms/pseries/mobility.c >>> +++ b/arch/powerpc/platforms/pseries/mobility.c >>> @@ -412,6 +414,128 @@ static int wait_for_vasi_session_suspending(u64 >>> handle) >> ... >> >>> + >>> +static int do_join(void *arg) >>> +{ >>> + atomic_t *counter = arg; >>> + long hvrc; >>> + int ret; >>> + >>> + /* Must ensure MSR.EE off for H_JOIN. */ >>> + hard_irq_disable(); >> >> Didn't stop_machine() already do that for us? >> >> In the state machine in multi_cpu_stop(). > > Yes, but I didn't want to rely on something that seems like an > implementation detail of stop_machine(). I assumed it's benign and in > keeping with hard_irq_disable()'s intended semantics to make multiple > calls to it within a critical section.
OK. I think it's part of the contract of stop_machine() these days, but you're right hard_irq_disable() can be called multiple times, so we may as well leave it there as insurance/documentation. >>> + hvrc = plpar_hcall_norets(H_JOIN); >>> + >>> + switch (hvrc) { >>> + case H_CONTINUE: >>> + /* >>> + * All other CPUs are offline or in H_JOIN. This CPU >>> + * attempts the suspend. >>> + */ >>> + ret = do_suspend(); >>> + break; >>> + case H_SUCCESS: >>> + /* >>> + * The suspend is complete and this cpu has received a >>> + * prod. >>> + */ >>> + ret = 0; >>> + break; >>> + case H_BAD_MODE: >>> + case H_HARDWARE: >>> + default: >>> + ret = -EIO; >>> + pr_err_ratelimited("H_JOIN error %ld on CPU %i\n", >>> + hvrc, smp_processor_id()); >>> + break; >>> + } >>> + >>> + if (atomic_inc_return(counter) == 1) { >>> + pr_info("CPU %u waking all threads\n", smp_processor_id()); >>> + prod_others(); >>> + } >> >> Do we even need the counter? IIUC only one CPU receives H_CONTINUE. So >> couldn't we just have that CPU do the prodding of others? > > CPUs may exit H_JOIN due to system reset interrupt at any time, and > H_JOIN may return H_HARDWARE to a caller after other CPUs have entered > the join state successfully. In these cases the counter ensures exactly > one thread performs the prod sequence. OK. >>> + /* >>> + * Execution may have been suspended for several seconds, so >>> + * reset the watchdog. >>> + */ >>> + touch_nmi_watchdog(); >>> + return ret; >>> +} >>> + >>> +static int pseries_migrate_partition(u64 handle) >>> +{ >>> + atomic_t counter = ATOMIC_INIT(0); >>> + int ret; >>> + >>> + ret = wait_for_vasi_session_suspending(handle); >>> + if (ret) >>> + goto out; >> >> Direct return would be clearer IMHO. > > OK, I can change this. Thanks. cheers