Hi Mike, Thanks for fixing this, some minor comments are inlined below.
on 2023/3/22 07:53, Michael Meissner wrote: > The issue with the bug is the power10 load GPR + cmpi -1/0/1 fusion > optimization generates illegal assembler code. > > Ultimately the code was dying because the fusion load + compare -1/0/1 > patterns > did not handle the possibility that the load might be prefixed. > > The main cause is the constraints for the individual loads in the fusion did > not > match the machine. In particular, LWA is a ds format instruction when it is > unprefixed. The code did not also set the prefixed attribute correctly. > > This patch rewrites the genfusion.pl script so that it will have more accurate > constraints for the LWA and LD instructions (which are DS instructions). The > updated genfusion.pl was then run to update fusion.md. Finally, the code for > the "prefixed" attribute is modified so that it considers load + compare > immediate patterns to be like the normal load insns in checking whether > operand[1] is a prefixed instruction. > > I have tested this patch on a little endian power10 system, on a little endian > power9 system, and a big endian power8 system (both -m32 and -m64 tested on > BE). There were no regressions, can I check this into the trunk? > > The same patch applies to the gcc-12 and gcc-11 branches. Can I check this > patch into those branches also after a burn-in period? > > 2023-03-21 Michael Meissner <meiss...@linux.ibm.com> > Aaron Sawdey <acsaw...@linux.ibm.com> > > gcc/ > > PR target/105325 > * gcc/config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Improve generation > of the ld and lwa instructions which use the DS encoding instead of D. > Use the YZ constraint for these loads. Handle prefixed loads better. > Set the sign_extend attribute as appropriate. > * gcc/config/rs6000/fusion.md: Regenerate. > * gcc/config/rs6000/rs6000.md (prefixed attribute): Add fused_load_cmpi > instructions to the list of instructions that might have a prefixed load > instruction. > > gcc/testsuite/ > > PR target/105325 > * g++.target/powerpc/pr105325.C: New test. > * gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts. > --- > gcc/config/rs6000/genfusion.pl | 26 ++++++++++++++++--- > gcc/config/rs6000/fusion.md | 17 +++++++----- > gcc/config/rs6000/rs6000.md | 2 +- > gcc/testsuite/g++.target/powerpc/pr105325.C | 24 +++++++++++++++++ > .../gcc.target/powerpc/fusion-p10-ldcmpi.c | 4 +-- > 5 files changed, 59 insertions(+), 14 deletions(-) > create mode 100644 gcc/testsuite/g++.target/powerpc/pr105325.C > > diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl > index e4db352e0ce..4f367cadc52 100755 > --- a/gcc/config/rs6000/genfusion.pl > +++ b/gcc/config/rs6000/genfusion.pl > @@ -56,7 +56,7 @@ sub mode_to_ldst_char > sub gen_ld_cmpi_p10 > { > my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred, > - $mempred, $ccmode, $np, $extend, $resultmode); > + $mempred, $ccmode, $np, $extend, $resultmode, $constraint); > LMODE: foreach $lmode ('DI','SI','HI','QI') { > $ldst = mode_to_ldst_char($lmode); > $clobbermode = $lmode; > @@ -71,21 +71,34 @@ sub gen_ld_cmpi_p10 > CCMODE: foreach $ccmode ('CC','CCUNS') { > $np = "NON_PREFIXED_D"; > $mempred = "non_update_memory_operand"; > + $constraint = "m"; The three assignments on $np $mempred $constraint can be moved to place (a) (see below) and add one explicit assignment for $constraint at place (b), since for the condition ccmode eq 'CC', HI/SI/DI have their own settings (btw QI is skipped), these assignments for default value can be moved to else arm (for CCUNS). > if ( $ccmode eq 'CC' ) { > next CCMODE if $lmode eq 'QI'; > - if ( $lmode eq 'DI' || $lmode eq 'SI' ) { > + if ( $lmode eq 'HI' ) { > + $np = "NON_PREFIXED_D"; > + $mempred = "non_update_memory_operand"; > + $echr = "a"; // place b $constraint = "m"; > + } elsif ( $lmode eq 'SI' ) { > + # ld and lwa are both DS-FORM. we have broken it into two different arms for SI and DI, this comment can be removed? > + $np = "NON_PREFIXED_DS"; > + $mempred = "lwa_operand"; > + $echr = "a"; > + $constraint = "YZ"; > + } elsif ( $lmode eq 'DI' ) { > # ld and lwa are both DS-FORM. ... and this comment. > $np = "NON_PREFIXED_DS"; > $mempred = "ds_form_mem_operand"; > + $echr = ""; > + $constraint = "YZ"; > } > $cmpl = ""; > - $echr = "a"; > $constpred = "const_m1_to_1_operand"; > } else { // place a > if ( $lmode eq 'DI' ) { > # ld is DS-form, but lwz is not. > $np = "NON_PREFIXED_DS"; > $mempred = "ds_form_mem_operand"; > + $constraint = "YZ"; > } > $cmpl = "l"; > $echr = "z"; > @@ -108,7 +121,7 @@ sub gen_ld_cmpi_p10 > > print "(define_insn_and_split > \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n"; > print " [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" > \"=x\")\n"; > - print " (compare:${ccmode} (match_operand:${lmode} 1 > \"${mempred}\" \"m\")\n"; > + print " (compare:${ccmode} (match_operand:${lmode} 1 > \"${mempred}\" \"${constraint}\")\n"; > if ($ccmode eq 'CCUNS') { print " "; } > print " (match_operand:${lmode} 3 \"${constpred}\" > \"n\")))\n"; > if ($result eq 'clobber') { > @@ -137,6 +150,11 @@ sub gen_ld_cmpi_p10 > print " \"\"\n"; > print " [(set_attr \"type\" \"fused_load_cmpi\")\n"; > print " (set_attr \"cost\" \"8\")\n"; > + > + if ($extend eq "sign") { > + print " (set_attr \"sign_extend\" \"yes\")\n"; > + } > + > print " (set_attr \"length\" \"8\")])\n"; > print "\n"; > } > diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md > index d45fb138a70..da9953d9ad9 100644 > --- a/gcc/config/rs6000/fusion.md > +++ b/gcc/config/rs6000/fusion.md > @@ -22,7 +22,7 @@ > ;; load mode is DI result mode is clobber compare mode is CC extend is none > (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_m1_to_1_operand" "n"))) > (clobber (match_scratch:DI 0 "=r"))] > "(TARGET_P10_FUSION)" > @@ -43,7 +43,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none" > ;; load mode is DI result mode is clobber compare mode is CCUNS extend is > none > (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none" > [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > - (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") > + (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_0_to_1_operand" "n"))) > (clobber (match_scratch:DI 0 "=r"))] > "(TARGET_P10_FUSION)" > @@ -64,7 +64,7 @@ (define_insn_and_split > "*ld_cmpldi_cr0_DI_clobber_CCUNS_none" > ;; load mode is DI result mode is DI compare mode is CC extend is none > (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_m1_to_1_operand" "n"))) > (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] > "(TARGET_P10_FUSION)" > @@ -85,7 +85,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none" > ;; load mode is DI result mode is DI compare mode is CCUNS extend is none > (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none" > [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > - (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") > + (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_0_to_1_operand" "n"))) > (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] > "(TARGET_P10_FUSION)" > @@ -106,7 +106,7 @@ (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none" > ;; load mode is SI result mode is clobber compare mode is CC extend is none > (define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ") > (match_operand:SI 3 "const_m1_to_1_operand" "n"))) > (clobber (match_scratch:SI 0 "=r"))] > "(TARGET_P10_FUSION)" > @@ -148,7 +148,7 @@ (define_insn_and_split > "*lwz_cmpldi_cr0_SI_clobber_CCUNS_none" > ;; load mode is SI result mode is SI compare mode is CC extend is none > (define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ") > (match_operand:SI 3 "const_m1_to_1_operand" "n"))) > (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))] > "(TARGET_P10_FUSION)" > @@ -190,7 +190,7 @@ (define_insn_and_split "*lwz_cmpldi_cr0_SI_SI_CCUNS_none" > ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign > (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ") > (match_operand:SI 3 "const_m1_to_1_operand" "n"))) > (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI > (match_dup 1)))] > "(TARGET_P10_FUSION)" > @@ -205,6 +205,7 @@ (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign" > "" > [(set_attr "type" "fused_load_cmpi") > (set_attr "cost" "8") > + (set_attr "sign_extend" "yes") > (set_attr "length" "8")]) > > ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > @@ -247,6 +248,7 @@ (define_insn_and_split "*lha_cmpdi_cr0_HI_clobber_CC_sign" > "" > [(set_attr "type" "fused_load_cmpi") > (set_attr "cost" "8") > + (set_attr "sign_extend" "yes") > (set_attr "length" "8")]) > > ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > @@ -289,6 +291,7 @@ (define_insn_and_split "*lha_cmpdi_cr0_HI_EXTHI_CC_sign" > "" > [(set_attr "type" "fused_load_cmpi") > (set_attr "cost" "8") > + (set_attr "sign_extend" "yes") > (set_attr "length" "8")]) > > ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index 81bffb04ceb..0f809c3793f 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -302,7 +302,7 @@ (define_attr "prefixed" "no,yes" > (eq_attr "maybe_prefixed" "no")) > (const_string "no") > > - (eq_attr "type" "load,fpload,vecload") > + (eq_attr "type" "load,fpload,vecload,vecload,fused_load_cmpi") > (if_then_else (match_test "prefixed_load_p (insn)") > (const_string "yes") > (const_string "no")) > diff --git a/gcc/testsuite/g++.target/powerpc/pr105325.C > b/gcc/testsuite/g++.target/powerpc/pr105325.C > new file mode 100644 > index 00000000000..f4ab384daa7 > --- /dev/null > +++ b/gcc/testsuite/g++.target/powerpc/pr105325.C > @@ -0,0 +1,24 @@ > +/* { dg-do assemble } */ > +/* { dg-require-effective-target lp64 } */ Nit: lp64 seems not necessary. The others look good to me, thanks! BR, Kewen