Hello, Segher, On Apr 23, 2020, Segher Boessenkool <seg...@kernel.crashing.org> wrote:
> On Thu, Apr 23, 2020 at 05:08:55AM -0300, Alexandre Oliva wrote: >> I wasn't sure simplify_gen_subreg might possibly emit any code in >> obscure cases, > It never does, it just returns an RTX. Where would it emit the insns to? I'm pretty sure there are or were subreg-extraction functions that could emit new insns onto whatever insn stream was active in certain unusual circumstances; I wasn't sure whether this was one of them. > since all the top bits are zeros always, it will always be a subnormal > number, so all comparisons will work as expected / wanted. *nod*, as long as there's no trapping on subnormals. >> Yet another issue is that the test assumed the mmfs bits not extracted >> by mffsl are all zero. > All those extra bits are required to be set to 0. Not in the mffs output, no. See, at this point I'm not talking about the emulation, but about the mffs use in the test proper. Consider the case of emulated mffsl: we end up comparing the mffs output value with a masked version thereof. If any of the mffs output bits could possibly be nonzero, the compare would fail, not because the emulation is wrong, but because the test is comparing the output of mffsl with that of mffs, with all its bits. In my testing, mffs output zero in this program, so it worked, but are all the non-mffsl bits of FPSCR guaranteed to be zero, so the output of mffs doesn't have to be masked out by the test? What if some future CPU defines some of the reserved bits in FPSCR in a way that they should be nonzero? (And, could the ISA of such a future CPU possibly specify that mffsl should preserve those extra bits from FPSCR, rather than zero them out?) >> for gcc/ChangeLog >> >> * gcc/config/rs6000/rs6000.md (rs6000_mffsl): Copy result to >> output operand in emulation. Simplify. > Indent with a tab please. Erhm, what I posted had TABs there. Did it get mangled? :-( >> + rtx tmp_df = operands[0]; > Please don't reuse pseudos Aah, I see, that's what the original tmp_di was trying to accomplish! I'm familiar with that general guidance, but I didn't think there's much for optimizers to do with an mffs output, and the predicate ensures we have a reg, but... I suppose additional pseudos make the subregging safer, and not adding extra uses for operands[0] might help with register allocation for it, so... How about this, to also avoid reusing tmpdi as anddi input and output? rtx tmp_df = gen_reg_rtx (DFmode); /* The mffs instruction reads the entire FPSCR. Emulate the mffsl instruction using the mffs instruction and masking off the bits the mmsl instruction actually reads. */ emit_insn (gen_rs6000_mffs (tmp_df)); rtx tmp_df_as_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0); rtx tmp_di = gen_reg_rtx (DImode); emit_insn (gen_anddi3 (tmp_di, tmp_df_as_di, GEN_INT (0x70007f0ffLL))); rtx tmp_di_as_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0); emit_move_insn (operands[0], tmp_di_as_df); DONE; > "mffsl instruction". Heh, I see the original was bad already, oops. Double oops. I had that typo fixed in some earlier version of the patch, but somehow I managed to drop that fix. >> + if (operands[0] != tmp_df) >> + emit_move_insn (operands[0], tmp_df); >> + DONE; > That "==" won't do what you want, I guess? If the incoming operands[0] is a REG, we'd be back to it after subregging back and forth; if it was a SUBREG to begin with, the cheap test would fail and we'd output the redundant move, that later passes will catch. Anyway, that's gone in the snippet above. > (Is there a PR, btw?) Not that I know of; we've hit the testcase FAIL in our own testing. -- Alexandre Oliva, freedom fighter he/him https://FSFLA.org/blogs/lxo/ Free Software Evangelist Stallman was right, but he's left :( GNU Toolchain Engineer Live long and free, and prosper ethically