Peter:
On Mon, 2023-07-10 at 16:57 -0500, Peter Bergner wrote:
> On 7/10/23 2:18 PM, Carl Love wrote:
> > +  /* Get the current FPSCR fields, bits 29:31 (DRN) and bits 56:63 (VE, OE, UE,
> > +  ZE, XE, NI, RN) from the FPSCR and return them.  */
>
> The 'Z' above should line up directly under the 'G' in Get.

Yup.  Fixed.

> > -  /* Insert new RN mode into FSCPR.  */
> > -  emit_insn (gen_rs6000_mffs (tmp_df));
> > -  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> > -  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
> > -  emit_insn (gen_iordi3 (tmp_di, tmp_di, tmp_rn));
> > +  /* Insert the new RN value from tmp_rn into FPSCR bit [62:63].  */
> > +  emit_insn (gen_anddi3 (tmp_di1, tmp_di2, GEN_INT (-4)));
> > +  emit_insn (gen_iordi3 (tmp_di1, tmp_di1, tmp_rn));
>
> This is an expander, so you shouldn't reuse temporaries as multiple
> destination pseudos, since that limits the register allocator's freedom.
> I know the old code did it, but since you're changing the line, you
> might as well use a new temp.

OK, wasn't aware that reusing temps was an issue for the register
allocator.  Thanks for letting me know.  So, I think you want something
like:

      rtx tmp_rn = gen_reg_rtx (DImode);
      rtx tmp_di3 = gen_reg_rtx (DImode);

      /* Extract new RN mode from operand.  */
      rtx op1 = convert_to_mode (DImode, operands[1], false);
      emit_insn (gen_anddi3 (tmp_rn, op1, GEN_INT (3)));

      /* Insert the new RN value from tmp_rn into FPSCR bit [62:63].  */
      emit_insn (gen_anddi3 (tmp_di1, tmp_di2, GEN_INT (-4)));
      emit_insn (gen_iordi3 (tmp_di3, tmp_di1, tmp_rn));

      /* Need to write to field k=15.  The fields are [0:15].  Hence with
         L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is an
         8-bit field[0:7].  Need to set the bit that corresponds to the
         value of i that you want [0:7].  */
      tmp_df = simplify_gen_subreg (DFmode, tmp_di3, DImode, 0);

where each destination is a unique register.
Then the register allocator can decide at code generation time whether
to use the same register or not.

I made the change and did a quick check, compiling on Power 10 with
-mcpu=power[8,9,10], and it worked fine.  I will run the full regression
on each of the processor types just to be sure.

                  Carl
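
For reference, the mask-and-insert arithmetic that the anddi3/iordi3 sequence performs can be checked outside the compiler. The following is only a minimal standalone sketch on plain 64-bit integers, not code from the patch: the helper name insert_rn and its parameters are hypothetical, and it assumes the FPSCR image sits in a 64-bit value with RN in the two least significant bits (bits 62:63 in IBM bit numbering). Each intermediate gets its own variable, mirroring the "unique destination" advice.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the rs6000 backend: models what the
   three insns compute on a 64-bit FPSCR image.  */
static uint64_t
insert_rn (uint64_t fpscr, uint64_t operand)
{
  /* Like tmp_rn: keep only the two rounding-mode bits of the operand.  */
  uint64_t rn = operand & 3;

  /* Like tmp_di1: AND with -4 (0x...FFFC) clears bits 62:63.  */
  uint64_t cleared = fpscr & (uint64_t) -4;

  /* Like tmp_di3: merge the new RN value into the cleared image.  */
  return cleared | rn;
}
```

For example, insert_rn (0xFF, 2) yields 0xFE: the low two bits are replaced by 2 and every other bit is left alone.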