Hi Mike, on 2023/3/25 07:06, Michael Meissner wrote: > I posted a version of patch on March 21st. This patch makes some code changes > suggested in the genfusion.pl code. The only change is in genfusion.pl. The > fusion.md that it makes is the same. > > The issue with the bug is the power10 load GPR + cmpi -1/0/1 fusion > optimization generates illegal assembler code. > > Ultimately the code was dying because the fusion load + compare -1/0/1 > patterns > did not handle the possibility that the load might be prefixed. > > The main cause is the constraints for the individual loads in the fusion did > not > match the machine. In particular, LWA is a ds format instruction when it is > unprefixed. The code did not also set the prefixed attribute correctly. > > This patch rewrites the genfusion.pl script so that it will have more accurate > constraints for the LWA and LD instructions (which are DS instructions). The > updated genfusion.pl was then run to update fusion.md. Finally, the code for > the "prefixed" attribute is modified so that it considers load + compare > immediate patterns to be like the normal load insns in checking whether > operand[1] is a prefixed instruction. > > I am re-running the tests right now, but they should have the same results > since fsuion.md is the same, and only code in genfusion.pl that makes > fusion.md > was modified. Assuming these runs pass can I check this into the master > branch? > > I will also need to check these same patches into GCC 11 and GCC 12 after a > waiting period (the patch applied to those branches as well). > > 2023-03-21 Michael Meissner <meiss...@linux.ibm.com> > > gcc/ > > PR target/105325 > * gcc/config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Improve generation > of the ld and lwa instructions which use the DS encoding instead of D. > Use the YZ constraint for these loads. Handle prefixed loads better. > Set the sign_extend attribute as appropriate. > * gcc/config/rs6000/fusion.md: Regenerate. > * gcc/config/rs6000/rs6000.md (prefixed attribute): Add fused_load_cmpi > instructions to the list of instructions that might have a prefixed load > instruction. > > gcc/testsuite/ > > PR target/105325 > * g++.target/powerpc/pr105325.C: New test. > * gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts. > --- > gcc/config/rs6000/fusion.md | 17 ++++++---- > gcc/config/rs6000/genfusion.pl | 32 +++++++++++++++---- > gcc/config/rs6000/rs6000.md | 2 +- > gcc/testsuite/g++.target/powerpc/pr105325.C | 24 ++++++++++++++ > .../gcc.target/powerpc/fusion-p10-ldcmpi.c | 4 +-- > 5 files changed, 62 insertions(+), 17 deletions(-) > create mode 100644 gcc/testsuite/g++.target/powerpc/pr105325.C > > diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md > index d45fb138a70..da9953d9ad9 100644 > --- a/gcc/config/rs6000/fusion.md > +++ b/gcc/config/rs6000/fusion.md > @@ -22,7 +22,7 @@ > ;; load mode is DI result mode is clobber compare mode is CC extend is none > (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_m1_to_1_operand" "n"))) > (clobber (match_scratch:DI 0 "=r"))] > "(TARGET_P10_FUSION)" > @@ -43,7 +43,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none" > ;; load mode is DI result mode is clobber compare mode is CCUNS extend is > none > (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none" > [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > - (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") > + (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_0_to_1_operand" "n"))) > (clobber (match_scratch:DI 0 "=r"))] > "(TARGET_P10_FUSION)" > @@ -64,7 +64,7 @@ (define_insn_and_split > "*ld_cmpldi_cr0_DI_clobber_CCUNS_none" > ;; load mode is DI result mode is DI compare mode is CC extend is none > (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_m1_to_1_operand" "n"))) > (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] > "(TARGET_P10_FUSION)" > @@ -85,7 +85,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none" > ;; load mode is DI result mode is DI compare mode is CCUNS extend is none > (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none" > [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > - (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") > + (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_0_to_1_operand" "n"))) > (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] > "(TARGET_P10_FUSION)" > @@ -106,7 +106,7 @@ (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none" > ;; load mode is SI result mode is clobber compare mode is CC extend is none > (define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ") > (match_operand:SI 3 "const_m1_to_1_operand" "n"))) > (clobber (match_scratch:SI 0 "=r"))] > "(TARGET_P10_FUSION)" > @@ -148,7 +148,7 @@ (define_insn_and_split > "*lwz_cmpldi_cr0_SI_clobber_CCUNS_none" > ;; load mode is SI result mode is SI compare mode is CC extend is none > (define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ") > (match_operand:SI 3 "const_m1_to_1_operand" "n"))) > (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))] > "(TARGET_P10_FUSION)" > @@ -190,7 +190,7 @@ (define_insn_and_split "*lwz_cmpldi_cr0_SI_SI_CCUNS_none" > ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign > (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ") > (match_operand:SI 3 "const_m1_to_1_operand" "n"))) > (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI > (match_dup 1)))] > "(TARGET_P10_FUSION)" > @@ -205,6 +205,7 @@ (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign" > "" > [(set_attr "type" "fused_load_cmpi") > (set_attr "cost" "8") > + (set_attr "sign_extend" "yes") > (set_attr "length" "8")]) > > ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > @@ -247,6 +248,7 @@ (define_insn_and_split "*lha_cmpdi_cr0_HI_clobber_CC_sign" > "" > [(set_attr "type" "fused_load_cmpi") > (set_attr "cost" "8") > + (set_attr "sign_extend" "yes") > (set_attr "length" "8")]) > > ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > @@ -289,6 +291,7 @@ (define_insn_and_split "*lha_cmpdi_cr0_HI_EXTHI_CC_sign" > "" > [(set_attr "type" "fused_load_cmpi") > (set_attr "cost" "8") > + (set_attr "sign_extend" "yes") > (set_attr "length" "8")]) > > ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl > index e4db352e0ce..fa9ab7e9704 100755 > --- a/gcc/config/rs6000/genfusion.pl > +++ b/gcc/config/rs6000/genfusion.pl > @@ -56,7 +56,7 @@ sub mode_to_ldst_char > sub gen_ld_cmpi_p10 > { > my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred, > - $mempred, $ccmode, $np, $extend, $resultmode); > + $ccmode, $extend, $resultmode);
Sorry for nitpicking here, but this change isn't what I suggested, since for now this script declares all local variables at the beginning of function, I think we can just leave this style unchanged. > LMODE: foreach $lmode ('DI','SI','HI','QI') { > $ldst = mode_to_ldst_char($lmode); > $clobbermode = $lmode; > @@ -69,23 +69,36 @@ sub gen_ld_cmpi_p10 > # Don't allow EXTQI because that would allow HI result which we can't > do. > $result = "GPR" if $result eq "EXTQI"; > CCMODE: foreach $ccmode ('CC','CCUNS') { > - $np = "NON_PREFIXED_D"; > - $mempred = "non_update_memory_operand"; > + my $np = "NON_PREFIXED_D"; > + my $mempred = "non_update_memory_operand"; > + my $constraint = "m"; ... instead I suggested moving these three lines to below else arm for CCUNS, since the arm for CC already has those variables redefined, so it's something like: diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl index e4db352e0ce..3469c9e3c55 100755 --- a/gcc/config/rs6000/genfusion.pl +++ b/gcc/config/rs6000/genfusion.pl @@ -56,7 +56,7 @@ sub mode_to_ldst_char sub gen_ld_cmpi_p10 { my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred, - $mempred, $ccmode, $np, $extend, $resultmode); + $mempred, $ccmode, $np, $extend, $resultmode, $constraint); LMODE: foreach $lmode ('DI','SI','HI','QI') { $ldst = mode_to_ldst_char($lmode); $clobbermode = $lmode; @@ -69,23 +69,37 @@ sub gen_ld_cmpi_p10 # Don't allow EXTQI because that would allow HI result which we can't do. $result = "GPR" if $result eq "EXTQI"; CCMODE: foreach $ccmode ('CC','CCUNS') { - $np = "NON_PREFIXED_D"; - $mempred = "non_update_memory_operand"; if ( $ccmode eq 'CC' ) { next CCMODE if $lmode eq 'QI'; - if ( $lmode eq 'DI' || $lmode eq 'SI' ) { - # ld and lwa are both DS-FORM. + if ( $lmode eq 'HI' ) { + $np = "NON_PREFIXED_D"; + $mempred = "non_update_memory_operand"; + $echr = "a"; + $constraint = "m"; + } elsif ( $lmode eq 'SI' ) { + # ld is DS-FORM. + $np = "NON_PREFIXED_DS"; + $mempred = "lwa_operand"; + $echr = "a"; + $constraint = "YZ"; + } elsif ( $lmode eq 'DI' ) { + # lwa is DS-FORM. $np = "NON_PREFIXED_DS"; $mempred = "ds_form_mem_operand"; + $echr = ""; + $constraint = "YZ"; } $cmpl = ""; - $echr = "a"; $constpred = "const_m1_to_1_operand"; } else { + $np = "NON_PREFIXED_D"; + $mempred = "non_update_memory_operand"; + $constraint = "m"; if ( $lmode eq 'DI' ) { # ld is DS-form, but lwz is not. $np = "NON_PREFIXED_DS"; $mempred = "ds_form_mem_operand"; + $constraint = "YZ"; } $cmpl = "l"; $echr = "z"; > if ( $ccmode eq 'CC' ) { > next CCMODE if $lmode eq 'QI'; > - if ( $lmode eq 'DI' || $lmode eq 'SI' ) { > - # ld and lwa are both DS-FORM. > + if ( $lmode eq 'HI' ) { > + $np = "NON_PREFIXED_D"; > + $mempred = "non_update_memory_operand"; > + $echr = "a"; > + } elsif ( $lmode eq 'SI' ) { > + # ld is DS-FORM. > + $np = "NON_PREFIXED_DS"; > + $mempred = "lwa_operand"; > + $echr = "a"; > + $constraint = "YZ"; > + } elsif ( $lmode eq 'DI' ) { > + # lwa is DS-FORM. > $np = "NON_PREFIXED_DS"; > $mempred = "ds_form_mem_operand"; > + $echr = ""; > + $constraint = "YZ"; > } > $cmpl = ""; > - $echr = "a"; > $constpred = "const_m1_to_1_operand"; > } else { > if ( $lmode eq 'DI' ) { > # ld is DS-form, but lwz is not. > $np = "NON_PREFIXED_DS"; > $mempred = "ds_form_mem_operand"; > + $constraint = "YZ"; > } > $cmpl = "l"; > $echr = "z"; > @@ -108,7 +121,7 @@ sub gen_ld_cmpi_p10 [snip...] > +++ b/gcc/testsuite/g++.target/powerpc/pr105325.C > @@ -0,0 +1,24 @@ > +/* { dg-do assemble } */ > +/* { dg-require-effective-target lp64 } */ In the previous review, I put a comment that "lp64 seems not necessary.". Did you try to test without it? (if yes, any fallouts?) BR, Kewen