Hi, Richard.

> From: Richard Henderson <richard.hender...@linaro.org>
> Sent: Saturday, July 21, 2018 8:15 PM
>
> On 07/19/2018 05:54 AM, Stefan Markovic wrote:
> > From: Yongbok Kim <yongbok....@mips.com>
> >
> > Implement nanoMIPS LLWP and SCWP instruction pair.
> >
> > Signed-off-by: Yongbok Kim <yongbok....@mips.com>
> > Signed-off-by: Aleksandar Markovic <amarko...@wavecomp.com>
> > Signed-off-by: Stefan Markovic <smarko...@wavecomp.com>
> > ---
> >  linux-user/mips/cpu_loop.c |  25 ++++++++---
> >  target/mips/cpu.h          |   2 +
> >  target/mips/helper.h       |   2 +
> >  target/mips/op_helper.c    |  35 +++++++++++++++
> >  target/mips/translate.c    | 107 +++++++++++++++++++++++++++++++++++++++++++++
> >  5 files changed, 166 insertions(+), 5 deletions(-)
>
> Hmm. Well, it's ok as far as it goes, but I'd really really like to see
> target/mips updated to use actual atomic operations. Otherwise
> mips*-linux-user will never be reliable and mips*-softmmu cannot run SMP
> in multi-threaded mode.
>
> While converting the rest of target/mips to atomic operations is perhaps
> out of scope for this patch set, there's really no reason not to do these
> two instructions correctly from the start. It'll save the trouble of
> rewriting them from scratch later.
>
> Please see target/arm/translate.c, gen_load_exclusive and
> gen_store_exclusive, for the size == 3 case. That is arm32 doing a 64-bit
> "paired" atomic operation, just like you are attempting here.
>
> Note that single-copy atomic semantics apply in both cases, so LLWP must
> perform one 64-bit load, not two 32-bit loads. The store in SCWP must
> happen with a 64-bit atomic cmpxchg operation.
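To make sure we understood the suggestion, this is the pattern we distilled from the arm32 size == 3 case (a simplified sketch on our part, not the actual target/arm code; addr, cmpval, newval and memidx are placeholders):

    /* One single-copy atomic 64-bit cmpxchg covers the whole pair. */
    TCGv_i64 old = tcg_temp_new_i64();
    tcg_gen_atomic_cmpxchg_i64(old, addr, cmpval, newval, memidx, MO_64);
    /* Success iff memory still held the value seen by the 64-bit load. */
    tcg_gen_setcond_i64(TCG_COND_EQ, old, old, cmpval);
    tcg_temp_free_i64(old);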
This is our work-in-progress version (does it look better?):

static void gen_llwp(DisasContext *ctx, uint32_t base, int16_t offset,
                     uint32_t reg1, uint32_t reg2)
{
    /* taddr is read after the branch over the alignment check,
       so it must be a local temp to keep its value. */
    TCGv taddr = tcg_temp_local_new();
    TCGv_i64 tval = tcg_temp_new_i64();
    TCGv tmp1 = tcg_temp_new();
    TCGv tmp2 = tcg_temp_new();
    TCGv tmp3 = tcg_temp_new();
    TCGLabel *l1 = gen_new_label();

    gen_base_offset_addr(ctx, taddr, base, offset);

    /* The effective address must be 64-bit aligned; otherwise raise AdES. */
    tcg_gen_andi_tl(tmp3, taddr, 0x7);
    tcg_gen_brcondi_tl(TCG_COND_EQ, tmp3, 0, l1);
    tcg_temp_free(tmp3);
    tcg_gen_st_tl(taddr, cpu_env, offsetof(CPUMIPSState, CP0_BadVAddr));
    generate_exception(ctx, EXCP_AdES);
    gen_set_label(l1);

    /* One single-copy atomic 64-bit load of the whole pair. */
    tcg_gen_qemu_ld64(tval, taddr, ctx->mem_idx);
    tcg_gen_extr_i64_tl(tmp1, tmp2, tval);
    gen_store_gpr(tmp1, reg1);
    tcg_temp_free(tmp1);
    gen_store_gpr(tmp2, reg2);
    tcg_temp_free(tmp2);

    /* Remember the address and value for the matching SCWP. */
    tcg_gen_st_i64(tval, cpu_env, offsetof(CPUMIPSState, llval_wp));
    tcg_temp_free_i64(tval);
    tcg_gen_st_tl(taddr, cpu_env, offsetof(CPUMIPSState, lladdr));
    tcg_temp_free(taddr);
}

static void gen_scwp(DisasContext *ctx, uint32_t base, int16_t offset,
                     uint32_t reg1, uint32_t reg2)
{
    /* taddr and lladdr are read after branches, so use local temps. */
    TCGv taddr = tcg_temp_local_new();
    TCGv lladdr = tcg_temp_local_new();
    TCGv_i64 tval = tcg_temp_new_i64();
    TCGv_i64 llval = tcg_temp_new_i64();
    TCGv_i64 val = tcg_temp_new_i64();
    TCGv tmp1 = tcg_temp_new();
    TCGv tmp2 = tcg_temp_new();
    TCGLabel *l1 = gen_new_label();
    TCGLabel *lab_fail = gen_new_label();
    TCGLabel *lab_done = gen_new_label();

    gen_base_offset_addr(ctx, taddr, base, offset);

    /* The effective address must be 64-bit aligned; otherwise raise AdES. */
    tcg_gen_andi_tl(tmp1, taddr, 0x7);
    tcg_gen_brcondi_tl(TCG_COND_EQ, tmp1, 0, l1);
    tcg_gen_st_tl(taddr, cpu_env, offsetof(CPUMIPSState, CP0_BadVAddr));
    generate_exception(ctx, EXCP_AdES);
    gen_set_label(l1);

    /* Fail right away if this is not the address of the preceding LLWP. */
    tcg_gen_ld_tl(lladdr, cpu_env, offsetof(CPUMIPSState, lladdr));
    tcg_gen_brcond_tl(TCG_COND_NE, taddr, lladdr, lab_fail);

    gen_load_gpr(tmp1, reg1);
    gen_load_gpr(tmp2, reg2);
    tcg_gen_concat_tl_i64(tval, tmp1, tmp2);

    /* 64-bit atomic cmpxchg against the value remembered by LLWP. */
    tcg_gen_ld_i64(llval, cpu_env, offsetof(CPUMIPSState, llval_wp));
    tcg_gen_atomic_cmpxchg_i64(val, taddr, llval, tval,
                               ctx->mem_idx, MO_64);

    /* Report success/failure in the result register (never gpr 0). */
    tcg_gen_setcond_i64(TCG_COND_EQ, val, val, llval);
    if (reg2 != 0) {
        tcg_gen_trunc_i64_tl(cpu_gpr[reg2], val);
    }
    tcg_gen_br(lab_done);

    gen_set_label(lab_fail);
    if (reg2 != 0) {
        tcg_gen_movi_tl(cpu_gpr[reg2], 0);
    }

    gen_set_label(lab_done);
    /* The link is broken whether or not the store succeeded. */
    tcg_gen_movi_tl(lladdr, -1);
    tcg_gen_st_tl(lladdr, cpu_env, offsetof(CPUMIPSState, lladdr));
    tcg_temp_free(lladdr);
    tcg_temp_free(taddr);
    tcg_temp_free(tmp1);
    tcg_temp_free(tmp2);
    tcg_temp_free_i64(tval);
    tcg_temp_free_i64(llval);
    tcg_temp_free_i64(val);
}
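One thing we are still unsure about (an assumption on our side, not handled in the code above): the word order inside the 64-bit pair depends on the target's byte order, so for a big-endian configuration the concat operands would presumably need to be swapped, along the lines of:

#if defined(TARGET_WORDS_BIGENDIAN)
    tcg_gen_concat_tl_i64(tval, tmp2, tmp1);
#else
    tcg_gen_concat_tl_i64(tval, tmp1, tmp2);
#endif

with the matching swap of tmp1/tmp2 in the tcg_gen_extr_i64_tl() call in gen_llwp(). Is that the right way to handle it?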