Fix an issue with single-stepping branch likely instructions incorrectly nullifying (in the next step) the instruction immediately following the delay slot, when the branch is not taken.
The underlying cause of the issue is the MIPS_HFLAG_BL flag is cleared when the branch is determined to be not taken and the delay-slot instruction nullified, but then mistakenly restored prior to raising the single-step exception at the conclusion of the nullified delay-slot instruction. Fix the issue then by making sure the MIPS_HFLAG_BL flag remains cleared throughout once the branch-not-taken TCG execution path is taken. This is illustrated by an excerpt from a GDB session below: (gdb) continue Continuing. Breakpoint 2, 0x8000b460 in __libc_init_array () 4: /x $ra = 0x8000b460 2: x/i $pc => 0x8000b460 <__libc_init_array+124>: sltu v0,s0,s2 (gdb) stepi 0x8000b464 in __libc_init_array () 4: /x $ra = 0x8000b460 2: x/i $pc => 0x8000b464 <__libc_init_array+128>: bnezl v0,0x8000b454 <__libc_init_array+112> 0x8000b468 <__libc_init_array+132>: lw v0,0(s1) (gdb) 0x8000b46c in __libc_init_array () 4: /x $ra = 0x8000b460 2: x/i $pc => 0x8000b46c <__libc_init_array+136>: lw ra,28(sp) (gdb) 0x8000b470 in __libc_init_array () 4: /x $ra = 0x8000b460 2: x/i $pc => 0x8000b470 <__libc_init_array+140>: lw s2,24(sp) (gdb) -- notice how $ra has not been updated. Signed-off-by: Nathan Froyd <froy...@codesourcery.com> Signed-off-by: Maciej W. Rozycki <ma...@codesourcery.com> --- Richard, On Tue, 12 Jun 2012, Richard Henderson wrote: > What about this comment, from the main body of gen_intermediate_code_internal: > > /* Execute a branch and its delay slot as a single instruction. > This is what GDB expects and is consistent with what the > hardware does (e.g. if a delay slot instruction faults, the > reported PC is the PC of the branch). */ > if (env->singlestep_enabled && (ctx.hflags & MIPS_HFLAG_BMASK) == 0) > break; > > That suggests we should not have split the lw away from the bnezl in > the first place, and thus the fix should be elsewhere. The comment stands, the problem is the MIPS_HFLAG_BL flag is mistakenly set back by a TCG instruction produced at the end of the TB (comprising both the branch and the delay-slot instruction) by `save_cpu_state' called by `gen_goto_tb' *only* when single-stepping. It is the LW instruction at 0x8000b46c that suffers from this problem, not the delay-slot LW instruction at 0x8000b468 (that was included in the branch's TB and nullified in this case, before the single-step debug exception was taken). > Even failing that, this > > > tcg_gen_movi_i32(hflags, ctx->hflags & ~MIPS_HFLAG_BMASK); > > + /* Fake saving hflags so that gen_goto_tb doesn't overwrite the > > + * hflags we saved above. */ > > + saved_hflags = ctx->saved_hflags; > > + ctx->saved_hflags = ctx->hflags; > > gen_goto_tb(ctx, 1, ctx->pc + 4); > > + ctx->saved_hflags = saved_hflags; > > seems needlessly convoluted. Does anyone think that > > /* Save and restore the state we're currently in (inside a branch-likely > delay), > while clearing that state as we exit the current TB. */ > temp_sh = ctx->saved_hflags; > temp_h = ctx->hflags; > > ctx->hflags &= ~MIPS_HFLAG_BMASK; > gen_goto_tb(...) > > ctx->saved_hflags = temp_sh; > ctx->hflags = temp_h; > > is any clearer? I.e. let gen_goto_tb do what it so very much wants to do, > which > is save hflags to env on the way out. I think the confusion here is that `hflags' is only updated (`save_cpu_state' called) by `gen_goto_tb' when single-stepping. The need to update `hflags' in all cases is the reason for the `tcg_gen_movi_i32' to `hflags' here. But this is the only place `hflags' is updated outside `save_cpu_state'. So I think for clarity we ought to just call `save_cpu_state' here as well, which will update `hflags', and also update `ctx->saved_hflags' effectively disabling the otherwise possible subsequent second call from `gen_goto_tb' (which would be suboptimal but harmless anyway as `ctx->hflags' will now have the correct value at that stage). This change therefore brings your suggestions and my observations together. I have verified that this change works as well as the original one for me and also does not cause regressions in mips-sde-elf GDB testing, -EB (o32) multilib. Please apply. Maciej qemu-mips-blikely.diff Index: qemu-git-trunk/target-mips/translate.c =================================================================== --- qemu-git-trunk.orig/target-mips/translate.c 2014-12-12 18:06:45.000000000 +0000 +++ qemu-git-trunk/target-mips/translate.c 2014-12-13 01:05:35.857530836 +0000 @@ -18466,11 +18466,25 @@ static void decode_opc(CPUMIPSState *env /* Handle blikely not taken case */ if ((ctx->hflags & MIPS_HFLAG_BMASK_BASE) == MIPS_HFLAG_BL) { int l1 = gen_new_label(); + uint32_t saved_hflags; + uint32_t bl_hflags; MIPS_DEBUG("blikely condition (" TARGET_FMT_lx ")", ctx->pc + 4); tcg_gen_brcondi_tl(TCG_COND_NE, bcond, 0, l1); - tcg_gen_movi_i32(hflags, ctx->hflags & ~MIPS_HFLAG_BMASK); + + /* Save and restore the state we're currently in (inside + a branch-likely delay), while clearing that state + as we exit the current TB. */ + saved_hflags = ctx->saved_hflags; + bl_hflags = ctx->hflags; + + ctx->hflags = bl_hflags & ~MIPS_HFLAG_BMASK; + save_cpu_state(ctx, 0); gen_goto_tb(ctx, 1, ctx->pc + 4); + + ctx->saved_hflags = saved_hflags; + ctx->hflags = bl_hflags; + gen_set_label(l1); }