Hi! On Fri, Apr 24, 2020 at 04:52:46AM -0300, Alexandre Oliva wrote: > On Apr 23, 2020, Segher Boessenkool <seg...@kernel.crashing.org> wrote: > > On Thu, Apr 23, 2020 at 05:08:55AM -0300, Alexandre Oliva wrote: > >> I wasn't sure simplify_gen_subreg might possibly emit any code in > >> obscure cases, > > > It never does, it just returns an RTX. Where would it emit the insns to? > > I'm pretty sure there are or were subreg-extraction functions that could > emit new insns onto whatever insn stream was active in certain unusual > circumstances;
Yeah, I think so too. > I wasn't sure whether this was one of them. Doesn't look that way, thankfully :-) > > since all the top bits are zeros always, it will always be a subnormal > > number, so all comparisons will work as expected / wanted. > > *nod*, as long as there's no trapping on subnormals. There isn't :-) I did say this isn't clean at all, just *happens* to work, right? :-) > >> Yet another issue is that the test assumed the mmfs bits not extracted > >> by mffsl are all zero. > > > All those extra bits are required to be set to 0. > > Not in the mffs output, no. See, at this point I'm not talking about > the emulation, but about the mffs use in the test proper. Hrm, I don't see it then. > Consider the case of emulated mffsl: we end up comparing the mffs output > value with a masked version thereof. If any of the mffs output bits > could possibly be nonzero, the compare would fail, not because the > emulation is wrong, but because the test is comparing the output of > mffsl with that of mffs, with all its bits. Oh. I thought that was masked as well. > In my testing, mffs output zero in this program, so it worked, but are > all the non-mffsl bits of FPSCR guaranteed to be zero, so the output of > mffs doesn't have to be masked out by the test? No, and there are many bits not yet defined, so it may change for future versions of the ISA. > What if some future CPU defines some of the reserved bits in FPSCR in a > way that they should be nonzero? In a way different between mffs and mffsl. Yeah. > (And, could the ISA of such a future CPU possibly specify that mffsl > should preserve those extra bits from FPSCR, rather than zero them out?) Yes, I don't see why not? Hrm, the exact language for the mffsl description does not allow that, but that wouldn't be the best thing to reply on. > >> for gcc/ChangeLog > >> > >> * gcc/config/rs6000/rs6000.md (rs6000_mffsl): Copy result to > >> output operand in emulation. Simplify. > > > Indent with a tab please. > > Erhm, what I posted had TABs there. Did it get mangled? :-( Yes, to eight spaces, so that aligned "output" under "gcc". > >> + rtx tmp_df = operands[0]; > > > Please don't reuse pseudos > > Aah, I see, that's what the original tmp_di was trying to accomplish! :-) > I'm familiar with that general guidance, but I didn't think there's much > for optimizers to do with an mffs output, and the predicate ensures we > have a reg, but... I suppose additional pseudos make the subregging > safer, and not adding extra uses for operands[0] might help with > register allocation for it, so... How about this, to also avoid reusing > tmpdi as anddi input and output? No pseudo should be assigned to more than once. A la SSA, or "webs". Many of our optimisers do not work well otherwise. > rtx tmp_df = gen_reg_rtx (DFmode); > > /* The mffs instruction reads the entire FPSCR. Emulate the mffsl > instruction using the mffs instruction and masking off the bits > the mmsl instruction actually reads. */ > emit_insn (gen_rs6000_mffs (tmp_df)); > > rtx tmp_df_as_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0); > rtx tmp_di = gen_reg_rtx (DImode); > emit_insn (gen_anddi3 (tmp_di, tmp_df_as_di, GEN_INT (0x70007f0ffLL))); > > rtx tmp_di_as_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0); > emit_move_insn (operands[0], tmp_di_as_df); > DONE; Sure, that looks fine. Maybe tmp1, tmp2, etc. woukld be clearer here? :-) > > "mffsl instruction". Heh, I see the original was bad already, oops. > > Double oops. I had that typo fixed in some earlier version of the > patch, but somehow I managed to drop that fix. > > >> + if (operands[0] != tmp_df) > >> + emit_move_insn (operands[0], tmp_df); > >> + DONE; > > > That "==" won't do what you want, I guess? > > If the incoming operands[0] is a REG, we'd be back to it after > subregging back and forth; if it was a SUBREG to begin with, the cheap > test would fail and we'd output the redundant move, that later passes > will catch. Anyway, that's gone in the snippet above. > > > (Is there a PR, btw?) > > Not that I know of; we've hit the testcase FAIL in our own testing. Could you open one? This patch will need backporting, etc. I can do it, if you really hate bugzilla, but I think it is easier for you to do it :-) Thanks! Segher