On Thu, Apr 04, 2013 at 05:56:26PM -0500, Richard Henderson wrote: > The TCG optimizer does great work when inserting constants, being able > to fold the open-coded deposit expansion to just an AND or an OR. Avoid > a bit the regression caused by having the deposit opcode by expanding > deposit of zero as an AND. > > Signed-off-by: Richard Henderson <r...@twiddle.net> > --- > tcg/ppc64/tcg-target.c | 22 ++++++++++++++++------ > 1 file changed, 16 insertions(+), 6 deletions(-) > > diff --git a/tcg/ppc64/tcg-target.c b/tcg/ppc64/tcg-target.c > index 9583cf9..772f4ac 100644 > --- a/tcg/ppc64/tcg-target.c > +++ b/tcg/ppc64/tcg-target.c > @@ -1928,12 +1928,22 @@ static void tcg_out_op (TCGContext *s, TCGOpcode opc, > const TCGArg *args, > break; > > case INDEX_op_deposit_i32: > - tcg_out_rlw(s, RLWIMI, args[0], args[2], args[3], > - 32 - args[3] - args[4], 31 - args[3]); > + if (const_args[2]) { > + uint32_t mask = ((2u << (args[4] - 1)) - 1) << args[3]; > + tcg_out_andi32(s, args[0], args[0], ~mask); > + } else { > + tcg_out_rlw(s, RLWIMI, args[0], args[2], args[3], > + 32 - args[3] - args[4], 31 - args[3]); > + } > break; > case INDEX_op_deposit_i64: > - tcg_out_rld(s, RLDIMI, args[0], args[2], args[3], > - 64 - args[3] - args[4]); > + if (const_args[2]) { > + uint64_t mask = ((2ull << (args[4] - 1)) - 1) << args[3]; > + tcg_out_andi64(s, args[0], args[0], ~mask); > + } else { > + tcg_out_rld(s, RLDIMI, args[0], args[2], args[3], > + 64 - args[3] - args[4]); > + } > break; > > case INDEX_op_movcond_i32: > @@ -2136,8 +2146,8 @@ static const TCGTargetOpDef ppc_op_defs[] = { > { INDEX_op_bswap32_i64, { "r", "r" } }, > { INDEX_op_bswap64_i64, { "r", "r" } }, > > - { INDEX_op_deposit_i32, { "r", "0", "r" } }, > - { INDEX_op_deposit_i64, { "r", "0", "r" } }, > + { INDEX_op_deposit_i32, { "r", "0", "rZ" } }, > + { INDEX_op_deposit_i64, { "r", "0", "rZ" } }, > > { INDEX_op_add2_i64, { "r", "r", "r", "rI", "r", "rZM" } }, > { INDEX_op_sub2_i64, { "r", "r", "rI", "r", "rZM", "r" } },
I first thought this should go into the middle end, but OTOH it will de-optimize some TCG targets which have a zero register like MIPS. In the long term I think we should allow the middle end to query the backend for constraints, and take the right decision (which would also improve constant propagation on some hosts). In the meantime it looks like the right thing to do, so: Reviewed-by: Aurelien Jarno <aurel...@aurel32.net> -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net