Hello Jeff: On 16/03/23 8:18 pm, Jeff Law wrote: > > > On 3/16/23 04:11, Ajit Agarwal via Gcc-patches wrote: >> >> Hello Richard: >> >> On 16/03/23 3:22 pm, Richard Biener wrote: >>> On Thu, Mar 16, 2023 at 9:19 AM Ajit Agarwal <aagar...@linux.ibm.com> wrote: >>>> >>>> >>>> >>>> On 16/03/23 1:44 pm, Richard Biener wrote: >>>>> On Thu, Mar 16, 2023 at 9:11 AM Ajit Agarwal <aagar...@linux.ibm.com> >>>>> wrote: >>>>>> >>>>>> Hello Richard: >>>>>> >>>>>> On 16/03/23 1:10 pm, Richard Biener wrote: >>>>>>> On Thu, Mar 16, 2023 at 6:21 AM Ajit Agarwal via Gcc-patches >>>>>>> <gcc-patches@gcc.gnu.org> wrote: >>>>>>>> >>>>>>>> Hello All: >>>>>>>> >>>>>>>> >>>>>>>> This patch eliminates unnecessary zero extension instruction from >>>>>>>> power generated assembly. >>>>>>>> Bootstrapped and regtested on powerpc64-linux-gnu. >>>>>>> >>>>>>> What makes this so special that we cannot deal with it from generic >>>>>>> code? >>>>>>> In particular we do have the REE pass, why is target specific >>>>>>> knowledge neccessary >>>>>>> to eliminate the extension? >>>>>>> >>>>>> >>>>>> For returning bool values and comparision with integers generates the >>>>>> following by all the rtl passes. >>>>>> >>>>>> set compare (subreg) >>>>>> set if_then_else >>>>>> Convert SImode -> QImode >>>>>> set zero_extend to SImode from QImode >>>>>> set return value 0 in one path of cfg. >>>>>> set return value 1 in other path of cfg. >>>>>> >>>>>> This pass replaces the above zero extension and conversion from QImode >>>>>> to DImode with copy operation to keep QImode in 64 bit registers in >>>>>> powerpc target. >>>>> >>>>> Sorry, I can't parse that - as there's no testcase with the patch I >>>>> cannot even try to see what the actual RTL >>>>> looks like (without the pass). >>>>> >>>> >>>> Here is the PR with bugzilla. >>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784 >>>> >>>> I can add the attached testcase with this PR in the patch. >>> >>> I don't see any zero-extends there. >>> >> >> Here is the testcase. >> >> >> bool (int a, int b) >> { >> if (a > 2) >> return false; >> if (b < 10) >> return true; >> return false; >> } >> >> compiled with gcc -O3 -m64 testcase.cc -mcpu=power9 -save-temps. >> >> Here is the rtl after cse. >> (note 12 11 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK) >> (insn 15 12 16 3 (set (reg:CC 123) >> (compare:CC (subreg/s/u:SI (reg/v:DI 120 [ b ]) 0) >> (const_int 9 [0x9]))) "ext.cc":5:5 796 {*cmpsi_signed} >> (expr_list:REG_DEAD (reg/v:DI 120 [ b ]) >> (nil))) >> (insn 16 15 17 3 (set (reg:SI 124) >> (const_int 1 [0x1])) "ext.cc":5:5 555 {*movsi_internal1} >> (nil)) >> (insn 17 16 18 3 (set (reg:SI 122) >> (if_then_else:SI (gt (reg:CC 123) >> (const_int 0 [0])) >> (const_int 0 [0]) >> (reg:SI 124))) "ext.cc":5:5 344 {isel_cc_si} >> (expr_list:REG_DEAD (reg:SI 124) >> (expr_list:REG_DEAD (reg:CC 123) >> (nil)))) >> (insn 18 17 32 3 (set (reg:QI 117 [ _1 ]) >> (subreg:QI (reg:SI 122) 0)) "ext.cc":5:5 562 {*movqi_internal} >> (expr_list:REG_DEAD (reg:SI 122) >> (nil))) >> ; pc falls through to BB 5 >> (code_label 32 18 31 4 3 (nil) [1 uses]) >> (note 31 32 5 4 [bb 4] NOTE_INSN_BASIC_BLOCK) >> (insn 5 31 19 4 (set (reg:QI 117 [ _1 ]) >> (const_int 0 [0])) "ext.cc":4:16 562 {*movqi_internal} >> (nil)) >> (code_label 19 5 20 5 2 (nil) [0 uses]) >> (note 20 19 21 5 [bb 5] NOTE_INSN_BASIC_BLOCK) >> (insn 21 20 22 5 (set (reg:DI 126 [ _1 ]) >> (zero_extend:DI (reg:QI 117 [ _1 ]))) "ext.cc":8:1 5 >> {zero_extendqidi2} >> (expr_list:REG_DEAD (reg:QI 117 [ _1 ]) >> (nil))) >> (insn 22 21 26 5 (set (reg:DI 118 [ <retval> ]) >> (reg:DI 126 [ _1 ])) "ext.cc":8:1 681 {*movdi_internal64} >> (expr_list:REG_DEAD (reg:DI 126 [ _1 ]) >> (nil))) >> (insn 26 22 27 5 (set (reg/i:DI 3 3) >> (reg:DI 126 [ _1 ])) "ext.cc":8:1 681 {*movdi_internal64} >> (expr_list:REG_DEAD (reg:DI 118 [ <retval> ]) >> (nil))) >> (insn 27 26 0 5 (use (reg/i:DI 3 3)) "ext.cc":8:1 -1 >> (nil)) > This looks like it'd be better addressed in REE. > > > We've got two paths to the zero_extend. One sets (reg 117) from a constant. > The other sets (reg 117) from a (subreg:QI (reg:SI)). > > Handling the constant is trivial. For the other set, we can replace the > subreg with the zero_extend. Presumably we'd then proceed to try and > eliminate the zero-extend by realizing both arms of the conditional move are > constants and thus trivially handled. > > While I don't think REE would handle all this today, fixing it to handle this > case seems like it'd be better than doing a specialized pass in the ppc > backend. > > jeff >
Thanks for your advice. At the input of REE pass the RTL has the following wherein zero_extend and subreg( reg 117) is converted to and (subreg DI ( reg QI 117). This needs to be handled. I am working on handling this in REE pass. insn 44 43 18 3 (set (reg:SI 122) (if_then_else:SI (le:SI (reg:CC 130) (const_int 0 [0])) (reg:SI 129) (const_int 0 [0]))) "ext.cc":5:5 -1 (nil)) (insn 18 44 40 3 (set (reg:QI 117 [ _1 ]) (subreg:QI (reg:SI 122) 0)) "ext.cc":5:5 562 {*movqi_internal} (expr_list:REG_DEAD (reg:SI 122) (nil))) (jump_insn 40 18 41 3 (set (pc) (label_ref 19)) -1 (nil) -> 19) (barrier 41 40 32) (code_label 32 41 31 4 3 (nil) [1 uses]) (note 31 32 5 4 [bb 4] NOTE_INSN_BASIC_BLOCK) (insn 5 31 19 4 (set (reg:QI 117 [ _1 ]) (const_int 0 [0])) "ext.cc":4:16 562 {*movqi_internal} (nil)) (code_label 19 5 20 5 2 (nil) [1 uses]) (note 20 19 21 5 [bb 5] NOTE_INSN_BASIC_BLOCK) (note 21 20 26 5 NOTE_INSN_DELETED) (insn 26 21 27 5 (set (reg/i:DI 3 %r3) (and:DI (subreg:DI (reg:QI 117 [ _1 ]) 0) (const_int 1 [0x1]))) "ext.cc":8:1 207 {anddi3_mask} (expr_list:REG_DEAD (reg:QI 117 [ _1 ]) (nil))) (insn 27 26 0 5 (use (reg/i:DI 3 %r3)) "ext.cc":8:1 -1 (nil)) "a-ext.cc.292r.split1" 92L, 4727C Thanks & Regards Ajit