On Thu, Oct 13, 2016 at 01:14:35PM -0500, Richard Henderson wrote: > On 10/12/2016 07:21 PM, David Gibson wrote: > > > +static void gen_bswap32x4(TCGv_i64 outh, TCGv_i64 outl, > > > + TCGv_i64 inh, TCGv_i64 inl) > > > +{ > > > + TCGv_i64 hi = tcg_temp_new_i64(); > > > + TCGv_i64 lo = tcg_temp_new_i64(); > > > + > > > + tcg_gen_bswap64_i64(hi, inh); > > > + tcg_gen_bswap64_i64(lo, inl); > > > + tcg_gen_shri_i64(outh, hi, 32); > > > + tcg_gen_deposit_i64(outh, outh, hi, 32, 32); > > > + tcg_gen_shri_i64(outl, lo, 32); > > > + tcg_gen_deposit_i64(outl, outl, lo, 32, 32); > > > + > > > + tcg_temp_free_i64(hi); > > > + tcg_temp_free_i64(lo); > > > +} > > > > Is there actually any advantage to having this 128-bit operation > > working on two 64-bit "register"s, as opposed to having a bswap32x2 > > that operates on a single 64-bit register amd calling it twice? > > For this one, no particular advantage. > > > > + gen_bswap16x8(xth, xtl, xbh, xbl); > > > > Likewise for the 16x8 version, I guess, although that would mean > > changing the existing users. > > For this one, we have to build a 64-bit constant, 0x00ff00ff00ff00ff. On > some hosts that's up to 6 insns. Being about to reuse that for both swaps > is useful.
Ah, good point. > > > + tcg_gen_bswap64_i64(t0, xbl); > > > + tcg_gen_bswap64_i64(xtl, xbh); > > > + tcg_gen_bswap64_i64(xth, t0); > > > > This looks wrong. You swap xbl as you move it to t0, then swap it > > again as you put it back into xth. So it looks like you'll translate > > 0011223344556677 8899AABBCCDDEEFF > > to > > 8899AABBCCDDEEFF 7766554433221100 > > whereas it should become > > FFEEDDCCBBAA9977 7766554433221100 > > Indeed, the third line should be a move, not a swap. > > > r~ > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature