Richard Henderson <r...@twiddle.net> writes:

> On 07/23/2016 02:14 PM, Nikunj A Dadhania wrote:
>> Adding following instructions:
>>
>> moduw: Modulo Unsigned Word
>> modsw: Modulo Signed Word
>>
>> Signed-off-by: Nikunj A Dadhania <nik...@linux.vnet.ibm.com>
>> ---
>>  target-ppc/helper.h     |  2 ++
>>  target-ppc/int_helper.c | 15 +++++++++++++++
>>  target-ppc/translate.c  | 19 +++++++++++++++++++
>>  3 files changed, 36 insertions(+)
>>
>> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
>> index 1f5cfd0..76072fd 100644
>> --- a/target-ppc/helper.h
>> +++ b/target-ppc/helper.h
>> @@ -41,6 +41,8 @@ DEF_HELPER_FLAGS_1(cntlzw, TCG_CALL_NO_RWG_SE, tl, tl)
>>  DEF_HELPER_FLAGS_1(popcntb, TCG_CALL_NO_RWG_SE, tl, tl)
>>  DEF_HELPER_FLAGS_1(popcntw, TCG_CALL_NO_RWG_SE, tl, tl)
>>  DEF_HELPER_FLAGS_2(cmpb, TCG_CALL_NO_RWG_SE, tl, tl, tl)
>> +DEF_HELPER_FLAGS_2(modsw, TCG_CALL_NO_RWG_SE, i32, i32, i32)
>> +DEF_HELPER_FLAGS_2(moduw, TCG_CALL_NO_RWG_SE, i32, i32, i32)
>>  DEF_HELPER_3(sraw, tl, env, tl, tl)
>>  #if defined(TARGET_PPC64)
>>  DEF_HELPER_FLAGS_1(cntlzd, TCG_CALL_NO_RWG_SE, tl, tl)
>> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
>> index 7445376..631e0b4 100644
>> --- a/target-ppc/int_helper.c
>> +++ b/target-ppc/int_helper.c
>> @@ -139,6 +139,21 @@ uint64_t helper_divde(CPUPPCState *env, uint64_t rau,
>> uint64_t rbu, uint32_t oe)
>>
>>  #endif
>>
>> +uint32_t helper_modsw(uint32_t rau, uint32_t rbu)
>> +{
>> +    int32_t ra = (int32_t) rau;
>> +    int32_t rb = (int32_t) rbu;
>> +
>> +    if ((rb == 0) || (ra == INT32_MIN && rb == -1)) {
>> +        return 0;
>> +    }
>> +    return ra % rb;
>> +}
>> +
>> +uint32_t helper_moduw(uint32_t ra, uint32_t rb)
>> +{
>> +    return rb ? ra % rb : 0;
>> +}
>
> I think, like you, I got distracted by the current div implementation in ppc.
> I've just re-read the spec and seen the "undefined" language. Which of course
> gives us much more freedom.
>
> With this freedom, we can do the division inline, without branches. Please see
> target-mips/translate.c, gen_r6_muldiv.
>
> Basically, we check for the offending cases and modify the divisor prior to
> the division. For unsigned:
>
> a / (b == 0 ? 1 : b)

Modulo case: a % (b == 0 ? 1 : b)

    tcg_gen_trunc_tl_i32(t0, cpu_gpr[rA(ctx->opcode)]);
    tcg_gen_trunc_tl_i32(t1, cpu_gpr[rB(ctx->opcode)]);
    tcg_gen_setcondi_i32(TCG_COND_EQ, t2, t1, 0);
    tcg_gen_movi_i32(t3, 0);
    tcg_gen_movcond_i32(TCG_COND_NE, t1, t2, t3, t2, t1);
    tcg_gen_remu_i32(t3, t0, t1);
    tcg_gen_extu_i32_tl(cpu_gpr[rD(ctx->opcode)], t3);

> For signed:
>
> a / ((a == INT_MIN & b == -1) | (b == 0) ? 1 : b)

Modulo case: a % ((a == INT_MIN & b == -1) | (b == 0) ? 1 : b)

    tcg_gen_trunc_tl_i32(t0, cpu_gpr[rA(ctx->opcode)]);
    tcg_gen_trunc_tl_i32(t1, cpu_gpr[rB(ctx->opcode)]);
    tcg_gen_setcondi_i32(TCG_COND_EQ, t2, t0, INT_MIN);
    tcg_gen_setcondi_i32(TCG_COND_EQ, t3, t1, -1);
    tcg_gen_and_i32(t2, t2, t3);
    tcg_gen_setcondi_i32(TCG_COND_EQ, t3, t1, 0);
    tcg_gen_or_i32(t2, t2, t3);
    tcg_gen_movi_i32(t3, 0);
    tcg_gen_movcond_i32(TCG_COND_NE, t1, t2, t3, t2, t1);
    tcg_gen_rem_i32(t3, t0, t1);
    tcg_gen_extu_i32_tl(cpu_gpr[rD(ctx->opcode)], t3);

I think you were suggesting something like the above?

For "div[wd]o." we will have further cases to implement overflow.

Regards,
Nikunj
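
For reference, the divisor-substitution trick can be sketched in plain C, mirroring the setcond/movcond data flow above. This is only an illustration under assumptions: the names moduw_sketch, modsw_sketch and modw_selfcheck are hypothetical and not part of the patch, and a C compiler is free to emit a branch for the conditional, whereas the TCG sequence is branch-free by construction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration, not QEMU code.
 * Unsigned: a zero divisor is replaced by 1, so the result is a % 1 == 0. */
static uint32_t moduw_sketch(uint32_t a, uint32_t b)
{
    uint32_t bad = (b == 0);        /* 0 or 1, like tcg_gen_setcondi_i32 */
    uint32_t div = bad ? bad : b;   /* like movcond NE: take the flag (1) when set */
    return a % div;
}

/* Signed: both offending cases (b == 0, INT32_MIN % -1) get divisor 1. */
static uint32_t modsw_sketch(uint32_t au, uint32_t bu)
{
    int32_t a = (int32_t)au;
    int32_t b = (int32_t)bu;
    int32_t bad = ((a == INT32_MIN) & (b == -1)) | (b == 0);
    int32_t div = bad ? bad : b;    /* the divisor is never 0 or the overflowing -1 */
    return (uint32_t)(a % div);
}

/* Small self-check covering the offending cases and two ordinary ones. */
static void modw_selfcheck(void)
{
    assert(moduw_sketch(7, 0) == 0);
    assert(moduw_sketch(7, 3) == 1);
    assert(modsw_sketch((uint32_t)INT32_MIN, (uint32_t)-1) == 0);
    assert(modsw_sketch(7, 0) == 0);
    assert(modsw_sketch((uint32_t)-7, 2) == (uint32_t)-1);
}
```

Because the setcond result is already 0 or 1, it can be reused directly as the replacement divisor, which is why the movcond above passes t2 both as the condition operand and as the selected value.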