Hello, Segher,

On Apr 23, 2020, Segher Boessenkool <seg...@kernel.crashing.org> wrote:

> On Thu, Apr 23, 2020 at 05:08:55AM -0300, Alexandre Oliva wrote:

>> I wasn't sure simplify_gen_subreg might possibly emit any code in
>> obscure cases,

> It never does, it just returns an RTX.  Where would it emit the insns to?

I'm pretty sure there are or were subreg-extraction functions that could
emit new insns onto whatever insn stream was active in certain unusual
circumstances; I wasn't sure whether this was one of them.

> since all the top bits are zeros always, it will always be a subnormal
> number, so all comparisons will work as expected / wanted.

*nod*, as long as there's no trapping on subnormals.


>> Yet another issue is that the test assumed the mmfs bits not extracted
>> by mffsl are all zero.

> All those extra bits are required to be set to 0.

Not in the mffs output, no.  See, at this point I'm not talking about
the emulation, but about the mffs use in the test proper.

Consider the case of emulated mffsl: we end up comparing the mffs output
value with a masked version thereof.  If any of the mffs output bits
could possibly be nonzero, the compare would fail, not because the
emulation is wrong, but because the test is comparing the output of
mffsl with that of mffs, with all its bits.

In my testing, mffs output zero in this program, so it worked, but are
all the non-mffsl bits of FPSCR guaranteed to be zero, so the output of
mffs doesn't have to be masked out by the test?

What if some future CPU defines some of the reserved bits in FPSCR in a
way that they should be nonzero?

(And, could the ISA of such a future CPU possibly specify that mffsl
should preserve those extra bits from FPSCR, rather than zero them out?)


>> for  gcc/ChangeLog
>> 
>> * gcc/config/rs6000/rs6000.md (rs6000_mffsl): Copy result to
>> output operand in emulation.  Simplify.

> Indent with a tab please.

Erhm, what I posted had TABs there.  Did it get mangled? :-(


>> +      rtx tmp_df = operands[0];

> Please don't reuse pseudos

Aah, I see, that's what the original tmp_di was trying to accomplish!

I'm familiar with that general guidance, but I didn't think there's much
for optimizers to do with an mffs output, and the predicate ensures we
have a reg, but...  I suppose additional pseudos make the subregging
safer, and not adding extra uses for operands[0] might help with
register allocation for it, so...  How about this, to also avoid reusing
tmpdi as anddi input and output?


      rtx tmp_df = gen_reg_rtx (DFmode);

      /* The mffs instruction reads the entire FPSCR.  Emulate the mffsl
         instruction using the mffs instruction and masking off the bits
         the mmsl instruction actually reads.  */
      emit_insn (gen_rs6000_mffs (tmp_df));

      rtx tmp_df_as_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
      rtx tmp_di = gen_reg_rtx (DImode);
      emit_insn (gen_anddi3 (tmp_di, tmp_df_as_di, GEN_INT (0x70007f0ffLL)));

      rtx tmp_di_as_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
      emit_move_insn (operands[0], tmp_di_as_df);
      DONE;



> "mffsl instruction".  Heh, I see the original was bad already, oops.

Double oops.  I had that typo fixed in some earlier version of the
patch, but somehow I managed to drop that fix.

>> +      if (operands[0] != tmp_df)
>> +    emit_move_insn (operands[0], tmp_df);
>> +      DONE;

> That "==" won't do what you want, I guess?

If the incoming operands[0] is a REG, we'd be back to it after
subregging back and forth; if it was a SUBREG to begin with, the cheap
test would fail and we'd output the redundant move, that later passes
will catch.  Anyway, that's gone in the snippet above.

> (Is there a PR, btw?)

Not that I know of; we've hit the testcase FAIL in our own testing.

-- 
Alexandre Oliva, freedom fighter    he/him    https://FSFLA.org/blogs/lxo/
Free Software Evangelist              Stallman was right, but he's left :(
GNU Toolchain Engineer           Live long and free, and prosper ethically

Reply via email to