Hello Segher: On 24/02/23 8:41 pm, Segher Boessenkool wrote: > Hi! > > For future patches: please don't send patches as replies to existing > threads. Just start a new thread for a new patch (series). You can > mark it as [PATCH v2] in the subject, if you want. > > On Fri, Feb 24, 2023 at 01:41:49PM +0530, Ajit Agarwal wrote: >> Here is the patch that uses xxlor instead of fmr where possible. >> Performance results shows that fmr is better in power9 and >> power10 architectures whereas xxlor is better in power7 and >> power 8 architectures. > > And fmr is the only option before p7. > >> rs6000: Use xxlor instead of fmr where possible >> >> This patch replaces fmr with xxlor instruction for power7 >> and power8 architectures whereas for power9 and power10 >> replaces xxlor with fmr instruction. > > Saying "this patch" in a commit message reads strangely. Just "Replace > fmr with" etc.? >
I will correct this. > The second part is just wrong, you cannot replace xxlor by fmr in > general. > >> Perf measurement results: >> >> Power9 fmr: 201,847,661 cycles. >> Power9 xxlor: 201,877,78 cycles. >> Power8 fmr: 201,057,795 cycles. >> Power8 xxlor: 201,004,671 cycles. > > What is this measuring? 100M insns back-to-back, each dependent on the > previous one? > Yes. > What are the results on p7 and p10? > > These numbers show there is no difference on p8 either. Did you paste > the wrong numbers maybe? > I will measure it again and update with a new patch. >> * config/rs6000/rs6000.md (*movdf_hardfloat64): Use xxlor >> for power7 and power8 and fmr for power9 and power10. > > Please don't break lines early. Changelogs lines can be 80 columns > wide, just like source code lines. > >> --- a/gcc/config/rs6000/rs6000.md >> +++ b/gcc/config/rs6000/rs6000.md >> @@ -354,7 +354,7 @@ (define_attr "cpu" >> (const (symbol_ref "(enum attr_cpu) rs6000_tune"))) >> >> ;; The ISA we implement. >> -(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10" >> +(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p7p8,p10" > > p78v, and sort it after p8v please. > >> + (and (eq_attr "isa" "p7p8") >> + (match_test "TARGET_VSX && !TARGET_P9_VECTOR")) >> + (const_int 1) > > Okay. > >> (define_insn "*mov<mode>_hardfloat64" >> [(set (match_operand:FMOVE64 0 "nonimmediate_operand" >> - "=m, d, d, <f64_p9>, wY, >> - <f64_av>, Z, <f64_vsx>, <f64_vsx>, !r, >> - YZ, r, !r, *c*l, !r, >> - *h, r, <f64_dm>, wa") >> + "=m, d, <f64_vsx>, <f64_p9>, wY, >> + <f64_av>, Z, wa, <f64_vsx>, !r, >> + YZ, r, !r, *c*l, !r, >> + *h, r, <f64_dm>, d, wn, >> + wa") >> (match_operand:FMOVE64 1 "input_operand" > > (You posted this mail as wrapping. That means the patch cannot be > applied non-manually, and that replies to your mail will be mangled. > Just get a Real mail client, and configure it correctly :-) ) > I am using Thunderbird as mail client and the settings are all correct. I have set the mailnews.wrapLength 0. >> - "d, m, d, wY, <f64_p9>, >> - Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>, >> + "d, m, <f64_vsx>, wY, <f64_p9>, >> + Z, <f64_av>, wa, <zero_fp>, <zero_fp>, >> r, YZ, r, r, *h, >> - 0, <f64_dm>, r, eP"))] >> + 0, <f64_dm>, r, d, wn, >> + eP"))] > > No. It is impossible to figure out what you changed here by just > reading it. > > There is no requirement there should be exactly five alternatives per > line, and/or that there should be the same number everywhere. > > If the indentation was incorrect, and you want to fix that, do that in a > separate *earlier* patch in the series, please. > I will Keep indentation as same. >> "TARGET_POWERPC64 && TARGET_HARD_FLOAT >> && (gpc_reg_operand (operands[0], <MODE>mode) >> || gpc_reg_operand (operands[1], <MODE>mode))" >> "@ >> stfd%U0%X0 %1,%0 >> lfd%U1%X1 %0,%1 >> - fmr %0,%1 >> + xxlor %x0,%x1,%x1 >> lxsd %0,%1 >> stxsd %1,%0 >> lxsdx %x0,%y1 >> stxsdx %x1,%y0 >> - xxlor %x0,%x1,%x1 >> + fmr %0,%1 >> xxlxor %x0,%x0,%x0 >> li %0,0 >> std%U0%X0 %1,%0 >> @@ -8467,23 +8474,28 @@ (define_insn "*mov<mode>_hardfloat64" >> nop >> mfvsrd %0,%x1 >> mtvsrd %x0,%1 >> + fmr %0,%1 >> + fmr %0,%1 >> #" >> [(set_attr "type" >> - "fpstore, fpload, fpsimple, fpload, fpstore, >> + "fpstore, fpload, veclogical, fpload, fpstore, >> fpload, fpstore, veclogical, veclogical, integer, >> store, load, *, mtjmpr, mfjmpr, >> - *, mfvsr, mtvsr, vecperm") >> + *, mfvsr, mtvsr, fpsimple, fpsimple, >> + vecperm") >> (set_attr "size" "64") >> (set_attr "isa" >> - "*, *, *, p9v, p9v, >> - p7v, p7v, *, *, *, >> - *, *, *, *, *, >> - *, p8v, p8v, p10") >> + "*, *, p7p8, p9v, p9v, >> + p7v, p7v, *, *, *, >> + *, *, *, *, *, >> + *, p8v, p8v, *, *, >> + p10") > > So, you swapped the xxlor and fmr entries, and added two nextra fmr > entries at the end?! > I have moved xxlor <f64_vsx> "p7p8" before any other constraints with fmr "*". I have added first constraints as xxlor <f64_vsx> with "p7p8" then wa fmr "*" and wn,d "*" as fmr at end. > > Segher Thanks & Regards Ajit