Hi Mike,

Thanks for fixing this, some minor comments are inlined below.

on 2023/3/22 07:53, Michael Meissner wrote:
> The issue with the bug is the power10 load GPR + cmpi -1/0/1 fusion
> optimization generates illegal assembler code.
> 
> Ultimately the code was dying because the fusion load + compare -1/0/1 
> patterns
> did not handle the possibility that the load might be prefixed.
> 
> The main cause is the constraints for the individual loads in the fusion did 
> not
> match the machine.  In particular, LWA is a ds format instruction when it is
> unprefixed.  The code did not also set the prefixed attribute correctly.
> 
> This patch rewrites the genfusion.pl script so that it will have more accurate
> constraints for the LWA and LD instructions (which are DS instructions).  The
> updated genfusion.pl was then run to update fusion.md.  Finally, the code for
> the "prefixed" attribute is modified so that it considers load + compare
> immediate patterns to be like the normal load insns in checking whether
> operand[1] is a prefixed instruction.
> 
> I have tested this patch on a little endian power10 system, on a little endian
> power9 system, and a big endian power8 system (both -m32 and -m64 tested on
> BE).  There were no regressions, can I check this into the trunk?
> 
> The same patch applies to the gcc-12 and gcc-11 branches.  Can I check this
> patch into those branches also after a burn-in period?
> 
> 2023-03-21   Michael Meissner  <meiss...@linux.ibm.com>
>              Aaron Sawdey  <acsaw...@linux.ibm.com>
> 
> gcc/
> 
>       PR target/105325
>       * gcc/config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Improve generation
>       of the ld and lwa instructions which use the DS encoding instead of D.
>       Use the YZ constraint for these loads.  Handle prefixed loads better.
>       Set the sign_extend attribute as appropriate.
>       * gcc/config/rs6000/fusion.md: Regenerate.
>       * gcc/config/rs6000/rs6000.md (prefixed attribute): Add fused_load_cmpi
>       instructions to the list of instructions that might have a prefixed load
>       instruction.
> 
> gcc/testsuite/
> 
>       PR target/105325
>       * g++.target/powerpc/pr105325.C: New test.
>       * gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts.
> ---
>  gcc/config/rs6000/genfusion.pl                | 26 ++++++++++++++++---
>  gcc/config/rs6000/fusion.md                   | 17 +++++++-----
>  gcc/config/rs6000/rs6000.md                   |  2 +-
>  gcc/testsuite/g++.target/powerpc/pr105325.C   | 24 +++++++++++++++++
>  .../gcc.target/powerpc/fusion-p10-ldcmpi.c    |  4 +--
>  5 files changed, 59 insertions(+), 14 deletions(-)
>  create mode 100644 gcc/testsuite/g++.target/powerpc/pr105325.C
> 
> diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
> index e4db352e0ce..4f367cadc52 100755
> --- a/gcc/config/rs6000/genfusion.pl
> +++ b/gcc/config/rs6000/genfusion.pl
> @@ -56,7 +56,7 @@ sub mode_to_ldst_char
>  sub gen_ld_cmpi_p10
>  {
>      my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred,
> -     $mempred, $ccmode, $np, $extend, $resultmode);
> +     $mempred, $ccmode, $np, $extend, $resultmode, $constraint);
>    LMODE: foreach $lmode ('DI','SI','HI','QI') {
>        $ldst = mode_to_ldst_char($lmode);
>        $clobbermode = $lmode;
> @@ -71,21 +71,34 @@ sub gen_ld_cmpi_p10
>        CCMODE: foreach $ccmode ('CC','CCUNS') {
>         $np = "NON_PREFIXED_D";
>         $mempred = "non_update_memory_operand";
> +       $constraint = "m";

The three assignments on $np $mempred $constraint can be moved
to place (a) (see below) and add one explicit assignment for
$constraint at place (b), since for the condition ccmode eq 'CC',
HI/SI/DI have their own settings (btw QI is skipped), these
assignments for default value can be moved to else arm (for CCUNS).

>         if ( $ccmode eq 'CC' ) {
>             next CCMODE if $lmode eq 'QI';
> -           if ( $lmode eq 'DI' || $lmode eq 'SI' ) {
> +           if ( $lmode eq 'HI' ) {
> +               $np = "NON_PREFIXED_D";
> +               $mempred = "non_update_memory_operand";
> +               $echr = "a";
                  // place b
                  $constraint = "m";
                   
> +           } elsif ( $lmode eq 'SI' ) {
> +               # ld and lwa are both DS-FORM.
  
we have broken it into two different arms for SI and DI, this
comment can be removed?

> +               $np = "NON_PREFIXED_DS";
> +               $mempred = "lwa_operand";
> +               $echr = "a";
> +               $constraint = "YZ";
> +           } elsif ( $lmode eq 'DI' ) {
>                 # ld and lwa are both DS-FORM.

... and this comment.

>                 $np = "NON_PREFIXED_DS";
>                 $mempred = "ds_form_mem_operand";
> +               $echr = "";
> +               $constraint = "YZ";
>             }
>             $cmpl = "";
> -           $echr = "a";
>             $constpred = "const_m1_to_1_operand";
>         } else {

              // place a

>             if ( $lmode eq 'DI' ) {
>                 # ld is DS-form, but lwz is not.
>                 $np = "NON_PREFIXED_DS";
>                 $mempred = "ds_form_mem_operand";
> +               $constraint = "YZ";
>             }
>             $cmpl = "l";
>             $echr = "z";
> @@ -108,7 +121,7 @@ sub gen_ld_cmpi_p10
>  
>         print "(define_insn_and_split 
> \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n";
>         print "  [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" 
> \"=x\")\n";
> -       print "        (compare:${ccmode} (match_operand:${lmode} 1 
> \"${mempred}\" \"m\")\n";
> +       print "        (compare:${ccmode} (match_operand:${lmode} 1 
> \"${mempred}\" \"${constraint}\")\n";
>         if ($ccmode eq 'CCUNS') { print "   "; }
>         print "                    (match_operand:${lmode} 3 \"${constpred}\" 
> \"n\")))\n";
>         if ($result eq 'clobber') {
> @@ -137,6 +150,11 @@ sub gen_ld_cmpi_p10
>         print "  \"\"\n";
>         print "  [(set_attr \"type\" \"fused_load_cmpi\")\n";
>         print "   (set_attr \"cost\" \"8\")\n";
> +
> +       if ($extend eq "sign") {
> +               print "   (set_attr \"sign_extend\" \"yes\")\n";
> +       }
> +
>         print "   (set_attr \"length\" \"8\")])\n";
>         print "\n";
>        }
> diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
> index d45fb138a70..da9953d9ad9 100644
> --- a/gcc/config/rs6000/fusion.md
> +++ b/gcc/config/rs6000/fusion.md
> @@ -22,7 +22,7 @@
>  ;; load mode is DI result mode is clobber compare mode is CC extend is none
>  (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
>    [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> -        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
> +        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ")
>                      (match_operand:DI 3 "const_m1_to_1_operand" "n")))
>     (clobber (match_scratch:DI 0 "=r"))]
>    "(TARGET_P10_FUSION)"
> @@ -43,7 +43,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
>  ;; load mode is DI result mode is clobber compare mode is CCUNS extend is 
> none
>  (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
>    [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> -        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
> +        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ")
>                         (match_operand:DI 3 "const_0_to_1_operand" "n")))
>     (clobber (match_scratch:DI 0 "=r"))]
>    "(TARGET_P10_FUSION)"
> @@ -64,7 +64,7 @@ (define_insn_and_split 
> "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
>  ;; load mode is DI result mode is DI compare mode is CC extend is none
>  (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
>    [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> -        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
> +        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ")
>                      (match_operand:DI 3 "const_m1_to_1_operand" "n")))
>     (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
>    "(TARGET_P10_FUSION)"
> @@ -85,7 +85,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
>  ;; load mode is DI result mode is DI compare mode is CCUNS extend is none
>  (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
>    [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> -        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
> +        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ")
>                         (match_operand:DI 3 "const_0_to_1_operand" "n")))
>     (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
>    "(TARGET_P10_FUSION)"
> @@ -106,7 +106,7 @@ (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
>  ;; load mode is SI result mode is clobber compare mode is CC extend is none
>  (define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
>    [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> -        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
> +        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
>                      (match_operand:SI 3 "const_m1_to_1_operand" "n")))
>     (clobber (match_scratch:SI 0 "=r"))]
>    "(TARGET_P10_FUSION)"
> @@ -148,7 +148,7 @@ (define_insn_and_split 
> "*lwz_cmpldi_cr0_SI_clobber_CCUNS_none"
>  ;; load mode is SI result mode is SI compare mode is CC extend is none
>  (define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
>    [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> -        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
> +        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
>                      (match_operand:SI 3 "const_m1_to_1_operand" "n")))
>     (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
>    "(TARGET_P10_FUSION)"
> @@ -190,7 +190,7 @@ (define_insn_and_split "*lwz_cmpldi_cr0_SI_SI_CCUNS_none"
>  ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign
>  (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
>    [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> -        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
> +        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
>                      (match_operand:SI 3 "const_m1_to_1_operand" "n")))
>     (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI 
> (match_dup 1)))]
>    "(TARGET_P10_FUSION)"
> @@ -205,6 +205,7 @@ (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
>    ""
>    [(set_attr "type" "fused_load_cmpi")
>     (set_attr "cost" "8")
> +   (set_attr "sign_extend" "yes")
>     (set_attr "length" "8")])
>  
>  ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> @@ -247,6 +248,7 @@ (define_insn_and_split "*lha_cmpdi_cr0_HI_clobber_CC_sign"
>    ""
>    [(set_attr "type" "fused_load_cmpi")
>     (set_attr "cost" "8")
> +   (set_attr "sign_extend" "yes")
>     (set_attr "length" "8")])
>  
>  ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> @@ -289,6 +291,7 @@ (define_insn_and_split "*lha_cmpdi_cr0_HI_EXTHI_CC_sign"
>    ""
>    [(set_attr "type" "fused_load_cmpi")
>     (set_attr "cost" "8")
> +   (set_attr "sign_extend" "yes")
>     (set_attr "length" "8")])
>  
>  ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 81bffb04ceb..0f809c3793f 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -302,7 +302,7 @@ (define_attr "prefixed" "no,yes"
>             (eq_attr "maybe_prefixed" "no"))
>        (const_string "no")
>  
> -      (eq_attr "type" "load,fpload,vecload")
> +      (eq_attr "type" "load,fpload,vecload,vecload,fused_load_cmpi")
>        (if_then_else (match_test "prefixed_load_p (insn)")
>                      (const_string "yes")
>                      (const_string "no"))
> diff --git a/gcc/testsuite/g++.target/powerpc/pr105325.C 
> b/gcc/testsuite/g++.target/powerpc/pr105325.C
> new file mode 100644
> index 00000000000..f4ab384daa7
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/powerpc/pr105325.C
> @@ -0,0 +1,24 @@
> +/* { dg-do assemble } */
> +/* { dg-require-effective-target lp64 } */

Nit: lp64 seems not necessary.

The others look good to me, thanks!

BR,
Kewen

Reply via email to