Hi, Richard. On 07/23/2021 08:46 AM, Richard Henderson wrote: > On 7/20/21 11:53 PM, Song Gao wrote: >> +/* Fixed point arithmetic operation instruction translation */ >> +static bool trans_add_w(DisasContext *ctx, arg_add_w *a) >> +{ >> + TCGv Rd = cpu_gpr[a->rd]; >> + TCGv Rj = cpu_gpr[a->rj]; >> + TCGv Rk = cpu_gpr[a->rk]; >> + >> + if (a->rd == 0) { >> + /* Nop */ >> + return true; >> + } >> + >> + if (a->rj != 0 && a->rk != 0) { >> + tcg_gen_add_tl(Rd, Rj, Rk); >> + tcg_gen_ext32s_tl(Rd, Rd); >> + } else if (a->rj == 0 && a->rk != 0) { >> + tcg_gen_mov_tl(Rd, Rk); >> + } else if (a->rj != 0 && a->rk == 0) { >> + tcg_gen_mov_tl(Rd, Rj); >> + } else { >> + tcg_gen_movi_tl(Rd, 0); >> + } >> + >> + return true; >> +} > > Do not do all of this "if reg(n) zero" testing. > > Use a common function to perform the gpr lookup, and a small callback > function for the operation. Often, the callback function already exists > within include/tcg/tcg-op.h. > > Please see my riscv cleanup patch set I referenced vs patch 6.
I am not sure that 'riscv cleanup' patchs at: https://patchew.org/QEMU/20210709042608.883256-1-richard.hender...@linaro.org It seems that gpr_dst/gpr_src are common function to perform the gpr lookup. is that right? > >> +static bool trans_orn(DisasContext *ctx, arg_orn *a) >> +{ >> + TCGv Rd = cpu_gpr[a->rd]; >> + TCGv Rj = cpu_gpr[a->rj]; >> + >> + if (a->rd == 0) { >> + /* Nop */ >> + return true; >> + } >> + >> + TCGv t0 = tcg_temp_new(); >> + gen_load_gpr(t0, a->rk); >> + >> + tcg_gen_not_tl(t0, t0); >> + tcg_gen_or_tl(Rd, Rj, t0); > > tcg_gen_orc_tl. > OK. >> +static bool trans_andn(DisasContext *ctx, arg_andn *a) >> +{ >> + TCGv Rd = cpu_gpr[a->rd]; >> + TCGv Rj = cpu_gpr[a->rj]; >> + >> + if (a->rd == 0) { >> + /* Nop */ >> + return true; >> + } >> + >> + TCGv t0 = tcg_temp_new(); >> + gen_load_gpr(t0, a->rk); >> + >> + tcg_gen_not_tl(t0, t0); >> + tcg_gen_and_tl(Rd, Rj, t0); > > tcg_gen_andc_tl. > OK. >> +static bool trans_mul_d(DisasContext *ctx, arg_mul_d *a) >> +{ >> + TCGv t0, t1; >> + TCGv Rd = cpu_gpr[a->rd]; >> + >> + if (a->rd == 0) { >> + /* Nop */ >> + return true; >> + } >> + >> + t0 = get_gpr(a->rj); >> + t1 = get_gpr(a->rk); >> + >> + check_loongarch_64(ctx); > > Architecture checks go first, before you've decided the operation is a nop. > OK. >> +static bool trans_mulh_d(DisasContext *ctx, arg_mulh_d *a) >> +{ >> + TCGv t0, t1, t2; >> + TCGv Rd = cpu_gpr[a->rd]; >> + >> + if (a->rd == 0) { >> + /* Nop */ >> + return true; >> + } >> + >> + t0 = get_gpr(a->rj); >> + t1 = get_gpr(a->rk); >> + t2 = tcg_temp_new(); >> + >> + check_loongarch_64(ctx); >> + tcg_gen_muls2_i64(t2, Rd, t0, t1); > > If you actually supported LA32, you'd notice this doesn't compile. Are you > planning to support LA32 in the future? > No. >> +static bool trans_lu32i_d(DisasContext *ctx, arg_lu32i_d *a) >> +{ >> + TCGv_i64 t0, t1; >> + TCGv Rd = cpu_gpr[a->rd]; >> + >> + if (a->rd == 0) { >> + /* Nop */ >> + return true; >> + } >> + >> + t0 = tcg_temp_new_i64(); >> + t1 = tcg_temp_new_i64(); >> + >> + tcg_gen_movi_tl(t0, a->si20); >> + tcg_gen_concat_tl_i64(t1, Rd, t0); >> + tcg_gen_mov_tl(Rd, t1); > > Hmm. Better as > > tcg_gen_deposit_tl(Rd, Rd, tcg_constant_tl(a->si20), 32, 32); > OK.>> +static bool trans_lu52i_d(DisasContext *ctx, arg_lu52i_d *a) >> +{ >> + TCGv t0, t1; >> + TCGv Rd = cpu_gpr[a->rd]; >> + >> + if (a->rd == 0) { >> + /* Nop */ >> + return true; >> + } >> + >> + t0 = tcg_temp_new(); >> + t1 = tcg_temp_new(); >> + >> + gen_load_gpr(t1, a->rj); >> + >> + tcg_gen_movi_tl(t0, a->si12); >> + tcg_gen_shli_tl(t0, t0, 52); >> + tcg_gen_andi_tl(t1, t1, 0xfffffffffffffU); >> + tcg_gen_or_tl(Rd, t0, t1); > > Definitely better as > > tcg_gen_deposit_tl(Rd, Rd, tcg_constant_tl(a->si12), 52, 12); > OK. >> +static bool trans_addi_w(DisasContext *ctx, arg_addi_w *a) >> +{ >> + TCGv Rd = cpu_gpr[a->rd]; >> + TCGv Rj = cpu_gpr[a->rj]; >> + target_ulong uimm = (target_long)(a->si12); >> + >> + if (a->rd == 0) { >> + /* Nop */ >> + return true; >> + } >> + >> + if (a->rj != 0) { >> + tcg_gen_addi_tl(Rd, Rj, uimm); >> + tcg_gen_ext32s_tl(Rd, Rd); >> + } else { >> + tcg_gen_movi_tl(Rd, uimm); >> + } >> + >> + return true; >> +} > > Again, there should be a common function for all of the > two-register-immediate operations. The callback here is exactly the same as > for trans_add_w. > OK. >> +static bool trans_xori(DisasContext *ctx, arg_xori *a) >> +{ >> + TCGv Rd = cpu_gpr[a->rd]; >> + TCGv Rj = cpu_gpr[a->rj]; >> + >> + target_ulong uimm = (uint16_t)(a->ui12); > > You shouldn't need these sorts of casts. > OK. Thank you kindly help. Thanks Song Gao