On Wed, 27 Jul 2022 at 13:49, Daniel Henrique Barboza <danielhb...@gmail.com> wrote: > > > > On 7/20/22 10:33, Nicholas Piggin wrote: > > ISA v2.06 adds new variations of wait, specified by the WC field. These > > are not all compatible with the prior wait implementation, because they > > add additional conditions that cause the processor to resume, which can > > cause software to hang or run very slowly. > > So I suppose this is not a new feature, but a bug fix to remediate these hangs > cause by the incompatibility of the WC field with other ISA versions. Is > that right?
That's the case. Nick has some kernel patches that make Linux use the new opcode: https://lore.kernel.org/all/20220720132132.903462-1-npig...@gmail.com/ With these applied the kernel hangs during boot if more than one CPU is present. I was able to reproduce with ppc64le_defconfig and this command line: qemu-system-ppc64 -M pseries,x-vof=on -cpu POWER10 -smp 2 -nographic -kernel zImage.pseries -no-reboot Qemu will exit (as there's no filesystem) if the test "passes", or hang during boot if it hits the bug. There's a kernel here with the patches applied in case someone else wants to test: https://ozlabs.org/~joel/zImage.pseries-v5.19-rc8-wait-v3 Tested-by: Joel Stanley <j...@jms.id.au> Because of the hang it would be best if we merged the patch as a fix sooner rather than later. Cheers, Joel > I'm explicitly asking for it because if it's a bug fix it's ok to pick it > during the freeze. Especially here, given that what you're doing is mostly > adding no-ops for conditions that we're not covering. > > > > > ISA v3.0 changed the wait opcode and removed the new variants (retaining > > the WC field but making non-zero values reserved). > > > > ISA v3.1 added new WC values to the new wait opcode, and added a PL > > field. > > > > This implements the new wait encoding and supports WC variants with > > no-op implementations, which provides basic correctness as explained in > > comments. > > > > Signed-off-by: Nicholas Piggin <npig...@gmail.com> > > --- > > v3: > > - Add EXTRACT_HELPERs > > - Reserved fields should be ignored, not trap. > > - v3.1 defines special case of reserved PL values being treated as > > a no-op when WC=2. > > - Change code organization to (hopefully) be easier to follow each > > ISA / variation. > > - Tested old wait variant with Linux e6500 boot and verify that > > gen_wait is called and takes the expected path. > > > > Thanks, > > Nick > > > > target/ppc/internal.h | 3 ++ > > target/ppc/translate.c | 96 ++++++++++++++++++++++++++++++++++++++---- > > 2 files changed, 91 insertions(+), 8 deletions(-) > > > > diff --git a/target/ppc/internal.h b/target/ppc/internal.h > > index 2add128cd1..57c0a42a6b 100644 > > --- a/target/ppc/internal.h > > +++ b/target/ppc/internal.h > > @@ -168,6 +168,9 @@ EXTRACT_HELPER_SPLIT_3(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0) > > /* darn */ > > EXTRACT_HELPER(L, 16, 2); > > #endif > > +/* wait */ > > +EXTRACT_HELPER(WC, 21, 2); > > +EXTRACT_HELPER(PL, 16, 2); > > > > /*** Jump target decoding > > ***/ > > /* Immediate address */ > > diff --git a/target/ppc/translate.c b/target/ppc/translate.c > > index 1d6daa4608..e0a835ac90 100644 > > --- a/target/ppc/translate.c > > +++ b/target/ppc/translate.c > > @@ -4066,12 +4066,91 @@ static void gen_sync(DisasContext *ctx) > > /* wait */ > > static void gen_wait(DisasContext *ctx) > > { > > - TCGv_i32 t0 = tcg_const_i32(1); > > - tcg_gen_st_i32(t0, cpu_env, > > - -offsetof(PowerPCCPU, env) + offsetof(CPUState, > > halted)); > > - tcg_temp_free_i32(t0); > > - /* Stop translation, as the CPU is supposed to sleep from now */ > > - gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next); > > + uint32_t wc; > > + > > + if (ctx->insns_flags & PPC_WAIT) { > > + /* v2.03-v2.07 define an older incompatible 'wait' encoding. */ > > + > > + if (ctx->insns_flags2 & PPC2_PM_ISA206) { > > + /* v2.06 introduced the WC field. WC > 0 may be treated as > > no-op. */ > > + wc = WC(ctx->opcode); > > + } else { > > + wc = 0; > > + } > > + > > + } else if (ctx->insns_flags2 & PPC2_ISA300) { > > + /* v3.0 defines a new 'wait' encoding. */ > > + wc = WC(ctx->opcode); > > > The ISA seems to indicate that WC=3 is always reserved in both ISA300 and > ISA310. I believe you can check for WC=3 and gen_invalid() return right > off the bat at this point. > > > Thanks, > > > Daniel > > > > > + if (ctx->insns_flags2 & PPC2_ISA310) { > > + uint32_t pl = PL(ctx->opcode); > > + > > + /* WC 1,2 may be treated as no-op. WC 3 is reserved. */ > > + if (wc == 3) { > > + gen_invalid(ctx); > > + return; > > + } > > + > > + /* PL 1-3 are reserved. If WC=2 then the insn is treated as > > noop. */ > > + if (pl > 0 && wc != 2) { > > + gen_invalid(ctx); > > + return; > > + } > > + > > + } else { /* ISA300 */ > > + /* WC 1-3 are reserved */ > > + if (wc > 0) { > > + gen_invalid(ctx); > > + return; > > + } > > + } > > + > > + } else { > > + warn_report("wait instruction decoded with wrong ISA flags."); > > + gen_invalid(ctx); > > + return; > > + } > > + > > + /* > > + * wait without WC field or with WC=0 waits for an exception / > > interrupt > > + * to occur. > > + */ > > + if (wc == 0) { > > + TCGv_i32 t0 = tcg_const_i32(1); > > + tcg_gen_st_i32(t0, cpu_env, > > + -offsetof(PowerPCCPU, env) + offsetof(CPUState, > > halted)); > > + tcg_temp_free_i32(t0); > > + /* Stop translation, as the CPU is supposed to sleep from now */ > > + gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next); > > + } > > + > > + /* > > + * Other wait types must not just wait until an exception occurs > > because > > + * ignoring their other wake-up conditions could cause a hang. > > + * > > + * For v2.06 and 2.07, wc=1,2,3 are architected but may be implemented > > as > > + * no-ops. > > + * > > + * wc=1 and wc=3 explicitly allow the instruction to be treated as a > > no-op. > > + * > > + * wc=2 waits for an implementation-specific condition, such could be > > + * always true, so it can be implemented as a no-op. > > + * > > + * For v3.1, wc=1,2 are architected but may be implemented as no-ops. > > + * > > + * wc=1 (waitrsv) waits for an exception or a reservation to be lost. > > + * Reservation-loss may have implementation-specific conditions, so it > > + * can be implemented as a no-op. > > + * > > + * wc=2 waits for an exception or an amount of time to pass. This > > + * amount is implementation-specific so it can be implemented as a > > + * no-op. > > + * > > + * ISA v3.1 allows for execution to resume "in the rare case of > > + * an implementation-dependent event", so in any case software must > > + * not depend on the architected resumption condition to become > > + * true, so no-op implementations should be architecturally correct > > + * (if suboptimal). > > + */ > > } > > > > #if defined(TARGET_PPC64) > > @@ -6852,8 +6931,9 @@ GEN_HANDLER2(stdcx_, "stdcx.", 0x1F, 0x16, 0x06, > > 0x00000000, PPC_64B), > > GEN_HANDLER_E(stqcx_, 0x1F, 0x16, 0x05, 0, PPC_NONE, PPC2_LSQ_ISA207), > > #endif > > GEN_HANDLER(sync, 0x1F, 0x16, 0x12, 0x039FF801, PPC_MEM_SYNC), > > -GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x03FFF801, PPC_WAIT), > > -GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039FF801, PPC_NONE, PPC2_ISA300), > > +/* ISA v3.0 changed the extended opcode from 62 to 30 */ > > +GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x039FF801, PPC_WAIT), > > +GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039CF801, PPC_NONE, PPC2_ISA300), > > GEN_HANDLER(b, 0x12, 0xFF, 0xFF, 0x00000000, PPC_FLOW), > > GEN_HANDLER(bc, 0x10, 0xFF, 0xFF, 0x00000000, PPC_FLOW), > > GEN_HANDLER(bcctr, 0x13, 0x10, 0x10, 0x00000000, PPC_FLOW), >