Re: [PATCH-1v2, rs6000] Enable SImode in FP registers on P7 [PR88558]

2023-09-18 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/9/14 16:35, HAO CHEN GUI wrote: > Hi Kewen, > > on 2023/9/12 17:33, Kewen.Lin wrote: >> Ok, at least regression testing doesn't expose any needs to do disparaging >> for this. Could you also test this patch with SPEC2017 for P7 and P8 >> separately at options like -O2 or -O3, to

[PATCH] rs6000: Skip empty inline asm in rs6000_update_ipa_fn_target_info [PR111366]

2023-09-17 Thread Kewen.Lin via Gcc-patches
Hi, PR111366 exposes one thing that can be improved in function rs6000_update_ipa_fn_target_info: skipping the given empty inline asm string, since it's impossible for it to adopt any hardware features (so far HTM). Since this rs6000_update_ipa_fn_target_info related approach exists in GCC12 and later,
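
A minimal sketch of the idea in the message above, not the actual GCC code: the function name asm_may_use_htm is hypothetical, and treating every non-empty template as possibly using HTM is an assumption made only to keep the example short.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the scan of an inline asm template.
// An empty template contains no instruction, so it is skipped
// before any feature scanning would happen.
bool asm_may_use_htm(const std::string &asm_template) {
  if (asm_template.empty())
    return false;  // nothing to scan: no hardware feature can be used
  // Assumption for this sketch: any non-empty template is treated
  // conservatively as possibly using HTM.
  return true;
}
```

With this shape, a caller would only record HTM usage for a function when asm_may_use_htm returns true for at least one of its asm statements.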

[PATCH] rs6000: Use default target option node for callee by default [PR111380]

2023-09-17 Thread Kewen.Lin via Gcc-patches
Hi, As PR111380 (and the discussion in related PRs) shows, for now how function rs6000_can_inline_p treats the callee without any target option node is wrong. It considers it's always safe to inline this kind of callee, but actually its target flags are from the command line options (target_optio
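
The fallback described above might look roughly like the sketch below. Everything here is illustrative: TargetOptions, effective_options and can_inline are hypothetical names, the option node is reduced to a plain bitmask, and the "callee flags must be a subset of caller flags" rule is an assumption of this sketch rather than something stated in the message.

```cpp
#include <cassert>

// Hypothetical, simplified model of a target option node: a flag bitmask.
struct TargetOptions {
  unsigned isa_flags;
};

// When a function has no option node of its own, fall back to the default
// node (the command line options) instead of assuming it is always safe.
const TargetOptions &effective_options(const TargetOptions *own,
                                       const TargetOptions &default_node) {
  return own ? *own : default_node;
}

// Assumed rule for this sketch: inlining is allowed only when every flag
// the callee relies on is also enabled for the caller.
bool can_inline(const TargetOptions *caller, const TargetOptions *callee,
                const TargetOptions &default_node) {
  unsigned caller_flags = effective_options(caller, default_node).isa_flags;
  unsigned callee_flags = effective_options(callee, default_node).isa_flags;
  return (callee_flags & ~caller_flags) == 0;
}
```

The point of the sketch is only the null-callee case: a callee without its own node is compared using the default flags, so a caller with a narrower set of features no longer inlines it unconditionally.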

Re: [PATCH v1] rs6000: unnecessary clear after vctzlsbb in vec_first_match_or_eos_index

2023-09-13 Thread Kewen.Lin via Gcc-patches
Hi, on 2023/9/13 00:39, Ajit Agarwal wrote: > This patch removes zero extension from vctzlsbb as it already zero extends. > Bootstrapped and regtested on powerpc64-linux-gnu. > > Thanks & Regards > Ajit > > rs6000: unnecessary clear after vctzlsbb in vec_first_match_or_eos_index > > For rs6000

Re: [PATCH] rs6000: unnecessary clear after vctzlsbb in vec_first_match_or_eos_index

2023-09-12 Thread Kewen.Lin via Gcc-patches
Hi Ajit, on 2023/8/31 18:44, Ajit Agarwal via Gcc-patches wrote: > > This patch removes zero extension from vctzlsbb as it already zero extends. > Bootstrapped and regtested on powerpc64-linux-gnu. > > Thanks & Regards > Ajit > > rs6000: unnecessary clear after vctzlsbb in vec_first_match_or_eo

Re: [PATCH-2v2, rs6000] Implement 32bit inline lrint [PR88558]

2023-09-12 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/9/4 13:33, HAO CHEN GUI wrote: > Hi, > This patch implements 32bit inline lrint by "fctiw". It depends on > the patch1 to do SImode move from FP registers on P7. > > Compared to last version, the main change is to add tests for "lrintf" > and adjust the count of correspond

Re: [PATCH-1v2, rs6000] Enable SImode in FP registers on P7 [PR88558]

2023-09-12 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/9/4 13:33, HAO CHEN GUI wrote: > Hi, > This patch enables SImode in FP registers on P7. Instruction "fctiw" > stores its integer output in an FP register. So SImode in FP register > needs to be enabled on P7 if we want to support "fctiw" on P7. > > The test case is in the second

Re: [PATCH] rs6000: Update instruction counts to match vec_* calls [PR111228]

2023-08-30 Thread Kewen.Lin via Gcc-patches
Hi Peter, on 2023/8/31 06:42, Peter Bergner wrote: > Commit r14-3258-ge7a36e4715c716 increased the amount of folding we perform, > leading to better code. Update the expected instruction counts to match the > number of associated vec_* built-in calls. > > Tested on powerpc64le-linux with no

Re: [PATCH, rs6000] Call vector load/store with length expand only on 64-bit Power10 [PR96762]

2023-08-30 Thread Kewen.Lin via Gcc-patches
on 2023/8/31 13:47, HAO CHEN GUI wrote: > Kewen, > I refined the patch according to your comments and it passed bootstrap > and regression test. > > I committed it as > https://gcc.gnu.org/g:946b8967b905257ac9f140225db744c9a6ab91be Thanks! We want this to be backported, so it's also ok for b

Re: [PATCH, rs6000] Call vector load/store with length expand only on 64-bit Power10 [PR96762]

2023-08-29 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/8/29 10:50, HAO CHEN GUI wrote: > Hi, > This patch adds "TARGET_64BIT" check when calling vector load/store > with length expand in expand_block_move. It matches the expand condition > of "lxvl" and "stxvl" defined in vsx.md. > > This patch fixes the ICE occurred with the

Re: [PATCH ver 4] rs6000, add overloaded DFP quantize support

2023-08-29 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/8/29 04:00, Carl Love wrote: > > GCC maintainers: > > Version 4, additional define_insn name fix. Change Log fix for the > UNSPEC_DQUAN. Retested patch on Power 10 LE. > > Version 3, fixed the built-in instance names. Missed removing the "n" > the name. Added the tighter co

Re: [PATCH-2, rs6000] Implement 32bit inline lrint [PR88558]

2023-08-28 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/8/25 14:44, HAO CHEN GUI wrote: > Hi, > This patch implements 32bit inline lrint by "fctiw". It depends on > the patch1 to do SImode move from FP register on P7. > > Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. > > Thanks > Gui Haochen > > Ch

Re: [PATCH-1, rs6000] Enable SImode in FP register on P7 [PR88558]

2023-08-28 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/8/25 14:44, HAO CHEN GUI wrote: > Hi, > This patch enables SImode in FP register on P7. Instruction "fctiw" > stores its integer output in an FP register. So SImode in FP register > needs to be enabled on P7 if we want to support "fctiw" on P7. > It sounds reasonable to support S

Re: [PATCH ver 3] rs6000, add overloaded DFP quantize support

2023-08-27 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/8/25 03:53, Carl Love wrote: > GCC maintainers: > > Version 3, fixed the built-in instance names. Missed removing the "n" > the name. Added the tighter constraints on the predicates for the > define_insn. Updated the wording for the built-ins in the > documentation file. Chan

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-08-27 Thread Kewen.Lin via Gcc-patches
on 2023/8/26 06:04, Peter Bergner wrote: > On 8/25/23 6:20 AM, Kewen.Lin wrote: >> Assuming the current PCREL_SUPPORTED_BY_OS unchanged, when >> PCREL_SUPPORTED_BY_OS is true, all its required conditions are >> satisfied, it should be safe. while PCREL_SUPPORTED_BY_OS is >> false, it means the giv

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-08-25 Thread Kewen.Lin via Gcc-patches
on 2023/8/25 11:20, Peter Bergner wrote: > On 8/24/23 12:56 AM, Kewen.Lin wrote: >> By looking into the uses of function rs6000_pcrel_p, I think we can >> just replace it with TARGET_PCREL. Previously we don't require PCREL >> unset for any unsupported target/OS, so we need rs6000_pcrel_p() to >>

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-08-23 Thread Kewen.Lin via Gcc-patches
Hi Peter, on 2023/8/24 10:07, Peter Bergner wrote: > On 8/21/23 8:51 PM, Kewen.Lin wrote: >>> The following patch has been bootstrapped and regtested on powerpc64-linux. >> >> I think we should test this on powerpc64le-linux P8 or P9 (no P10) as well. > > That's a good idea! > > > >> I think t

Re: [PATCH 1/3] vect: Remove some manual release in vectorizable_store

2023-08-22 Thread Kewen.Lin via Gcc-patches
on 2023/8/22 20:32, Richard Biener wrote: > On Tue, Aug 22, 2023 at 10:45 AM Kewen.Lin wrote: >> >> Hi, >> >> To avoid some duplicates in some follow-up patches on >> function vectorizable_store, this patch is to adjust some >> existing vec with auto_vec and remove some manual release >> invocatio

Re: [PATCH] vect: Replace DR_GROUP_STORE_COUNT with DR_GROUP_LAST_ELEMENT

2023-08-22 Thread Kewen.Lin via Gcc-patches
Hi Richi, on 2023/8/22 20:17, Richard Biener wrote: > On Tue, Aug 22, 2023 at 10:44 AM Kewen.Lin wrote: >> >> Hi, >> >> Now we use DR_GROUP_STORE_COUNT to record how many stores >> in a group have been transformed and only do the actual >> transform when encountering the last one. I'm making >>

[PATCH 3/3] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-22 Thread Kewen.Lin via Gcc-patches
Hi, Like r14-3317 which moves the handlings on memory access type VMAT_GATHER_SCATTER in vectorizable_load final loop nest, this one is to deal with vectorizable_store side. Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu and powerpc64{,le}-linux-gnu. Is it ok for trunk? BR

[PATCH 2/3] vect: Move VMAT_LOAD_STORE_LANES handlings from final loop nest

2023-08-22 Thread Kewen.Lin via Gcc-patches
Hi, Like commit r14-3214 which moves the handlings on memory access type VMAT_LOAD_STORE_LANES in vectorizable_load final loop nest, this one is to deal with the function vectorizable_store. Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu and powerpc64{,le}-linux-gnu. Is it

[PATCH 1/3] vect: Remove some manual release in vectorizable_store

2023-08-22 Thread Kewen.Lin via Gcc-patches
Hi, To avoid some duplicates in some follow-up patches on function vectorizable_store, this patch is to adjust some existing vec with auto_vec and remove some manual release invocation. Also refactor a bit and remove some useless codes. Bootstrapped and regtested on x86_64-redhat-linux, aarch64-

[PATCH] vect: Replace DR_GROUP_STORE_COUNT with DR_GROUP_LAST_ELEMENT

2023-08-22 Thread Kewen.Lin via Gcc-patches
Hi, Now we use DR_GROUP_STORE_COUNT to record how many stores in a group have been transformed and only do the actual transform when encountering the last one. I'm making patches to move costing next to the transform code, it's awkward to use this DR_GROUP_STORE_COUNT for both costing and transfo
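
To illustrate the difference between the two ways of detecting the last store in a group, here is a small standalone sketch. StoreGroup and both helper functions are hypothetical and only model the idea; they are not the vectorizer's data structures.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a store group: statement ids in group order plus
// a running counter in the spirit of DR_GROUP_STORE_COUNT.
struct StoreGroup {
  std::vector<int> stmts;
  int transformed_count = 0;
};

// Counter-based check: every visit bumps the counter, so the answer
// depends on how many times the group has been visited so far.
bool is_last_by_count(StoreGroup &group) {
  return ++group.transformed_count == static_cast<int>(group.stmts.size());
}

// Stateless check in the spirit of DR_GROUP_LAST_ELEMENT: compare the
// current statement against the group's last element.
bool is_last_by_element(const StoreGroup &group, int stmt) {
  return !group.stmts.empty() && stmt == group.stmts.back();
}
```

Because the second check does not mutate anything, it can be asked from more than one place (for example both costing and transform) without the two interfering, which is the awkwardness the message mentions for the counter.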

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-08-21 Thread Kewen.Lin via Gcc-patches
Hi Jeevitha, on 2023/8/21 18:32, jeevitha wrote: > Hi All, > > The following patch has been bootstrapped and regtested on powerpc64-linux. I think we should test this on powerpc64le-linux P8 or P9 (no P10) as well. > > It is currently possible to incorrectly enable PCREL for targets that do no

Re: [PATCH V5] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-21 Thread Kewen.Lin via Gcc-patches
Hi Juzhe, on 2023/8/21 18:59, Juzhe-Zhong wrote: > Co-Authored-By: Kewen.Lin > > Hi, @Richi and @Richard, based on previous discussion, I simply fix issues > for > powerpc and s390 with your suggestions: > > - machine_mode len_load_mode = get_len_load_store_mode > -(loop_vinfo->vector_m

[PATCH] vect: Factor out the handling on scatter store having gs_info.decl

2023-08-16 Thread Kewen.Lin via Gcc-patches
Hi, Similar to the existing function vect_build_gather_load_calls, this patch is to factor out the handling on scatter store having gs_info.decl to vect_build_scatter_store_calls which is a new function. It also does some minor refactoring like moving some variables' declarations close to their u

[PATCH] Makefile.in: Make TM_P_H depend on $(TREE_H) [PR111021]

2023-08-16 Thread Kewen.Lin via Gcc-patches
Hi, As PR111021 shows, the below ${port}-protos.h include tree.h for code_helper and tree_code: arm/arm-protos.h:#include "tree.h" cris/cris-protos.h:#include "tree.h" (H-P removed this in r14-3218) microblaze/microblaze-protos.h:#include "tree.h" rl78/rl78-protos.h:#include "tree.h" st

Re: [PATCH ver 2] rs6000, add overloaded DFP quantize support

2023-08-16 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/8/17 08:19, Carl Love wrote: > > GCC maintainers: > > Version 2, renamed the built-in instances. Changed the name of the > overloaded built-in. Added the missing documentation for the new > built-ins. Fixed typos. Changed name of the test. Updated the > effective target for

Re: [PATCH ver 2] rs6000, add overloaded DFP quantize support

2023-08-16 Thread Kewen.Lin via Gcc-patches
on 2023/8/17 11:11, Peter Bergner wrote: > On 8/16/23 7:19 PM, Carl Love wrote: >> +(define_insn "dfp_dquan_" >> + [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d") >> +(unspec:DDTD [(match_operand:DDTD 1 "gpc_reg_operand" "d") >> + (match_operand:DDTD 2 "gpc_reg_operand

Re: [PATCH] Makefile.in: Add variable TM_P_H2 for TM_P_H dependency [PR111021]

2023-08-15 Thread Kewen.Lin via Gcc-patches
on 2023/8/16 10:31, Kewen.Lin via Gcc-patches wrote: > Hi, > > As PR111021 shows, the below ${port}-protos.h include tree.h > for code_helper and tree_code: > > arm/arm-protos.h:#include "tree.h" > cris/cris-protos.h:#include "tree.h" (H-P removed th

[PATCH] Makefile.in: Add variable TM_P_H2 for TM_P_H dependency [PR111021]

2023-08-15 Thread Kewen.Lin via Gcc-patches
Hi, As PR111021 shows, the below ${port}-protos.h include tree.h for code_helper and tree_code: arm/arm-protos.h:#include "tree.h" cris/cris-protos.h:#include "tree.h" (H-P removed this in r14-3218) microblaze/microblaze-protos.h:#include "tree.h" rl78/rl78-protos.h:#include "tree.h" s

Re: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-15 Thread Kewen.Lin via Gcc-patches
on 2023/8/15 17:13, Richard Sandiford wrote: > Richard Biener writes: >>> OK, fair enough. So the idea is: see where we end up and then try to >>> improve/factor the APIs in a less peephole way? >> >> Yeah, I think that's the only good way forward. > > OK, no objection from me. Sorry for holdin

Re: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-15 Thread Kewen.Lin via Gcc-patches
on 2023/8/15 20:07, Richard Biener wrote: > On Tue, Aug 15, 2023 at 1:47 PM Kewen.Lin wrote: >> >> on 2023/8/15 15:53, Richard Biener wrote: >>> On Tue, Aug 15, 2023 at 4:44 AM Kewen.Lin wrote: on 2023/8/14 22:16, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi Richard, >>

Re: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-15 Thread Kewen.Lin via Gcc-patches
on 2023/8/15 15:53, Richard Biener wrote: > On Tue, Aug 15, 2023 at 4:44 AM Kewen.Lin wrote: >> >> on 2023/8/14 22:16, Richard Sandiford wrote: >>> "Kewen.Lin" writes: Hi Richard, on 2023/8/14 20:20, Richard Sandiford wrote: > Thanks for the clean-ups. But... > > "Kewe

Re: [PATCH V4] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-14 Thread Kewen.Lin via Gcc-patches
Hi Stefan, on 2023/8/15 02:51, Stefan Schulze Frielinghaus wrote: > Hi everyone, > > I have bootstrapped and regtested the patch below on s390. For the > 64-bit target I do not see any changes regarding the testsuite. For the > 31-bit target I see the following failures: > > FAIL: gcc.dg/vect/

[PATCH] Makefile.in: Make recog.h depend on $(TREE_H)

2023-08-14 Thread Kewen.Lin via Gcc-patches
Hi, Commit r14-3093 introduced a random build failure on build/gencondmd.cc building. Since r14-3093 makes recog.h include tree.h, which further includes (depends on) some files that are generated during the building, such as: all-tree.def, tree-check.h etc, when building file build/gencondmd.cc,

Re: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-14 Thread Kewen.Lin via Gcc-patches
on 2023/8/14 22:16, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi Richard, >> >> on 2023/8/14 20:20, Richard Sandiford wrote: >>> Thanks for the clean-ups. But... >>> >>> "Kewen.Lin" writes: Hi, Following Richi's suggestion [1], this patch is to move the handlings on V

Re: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-14 Thread Kewen.Lin via Gcc-patches
Hi Richard, on 2023/8/14 20:20, Richard Sandiford wrote: > Thanks for the clean-ups. But... > > "Kewen.Lin" writes: >> Hi, >> >> Following Richi's suggestion [1], this patch is to move the >> handlings on VMAT_GATHER_SCATTER in the final loop nest >> of function vectorizable_load to its own loo

Re: [PATCH] vect: Move VMAT_LOAD_STORE_LANES handlings from final loop nest

2023-08-14 Thread Kewen.Lin via Gcc-patches
Hi Richi, on 2023/8/14 20:04, Richard Biener wrote: > On Mon, Aug 14, 2023 at 10:54 AM Kewen.Lin wrote: >> >> Hi, >> >> Following Richi's suggestion [1], this patch is to move the >> handlings on VMAT_LOAD_STORE_LANES in the final loop nest >> of function vectorizable_load to its own loop. Basic

Re: [PATCH V4] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-14 Thread Kewen.Lin via Gcc-patches
Hi Juzhe, on 2023/8/14 20:08, juzhe.zh...@rivai.ai wrote: > Hi, Kewen. > > Could you test 'can_vec_extract_var_idx_p' and send V5 patch when you pass > the testing? The below diff was bootstrapped and regress-tested on Power10 LE. Compared to the previous v4, the only changes should be the p

Re: [PATCH] rs6000, add overloaded DFP quantize support

2023-08-14 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/8/9 23:52, Carl Love wrote: > > GCC maintainers: > > The following patch adds four built-ins for the decimal floating point > (DFP) quantize instructions on rs6000. The built-ins are for 64-bit > and 128-bit DFP operands. > > The patch also adds a test case for the new builtin

Re: [PATCH V4] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-14 Thread Kewen.Lin via Gcc-patches
Hi Robin, on 2023/8/14 16:58, Robin Dapp wrote: > Hi Kewen, > >> I did a bootstrapping and regression testing on Power10 (LE) and found a lot >> of failures. > > I think the problem is that just like for vec_set we're expecting > the vec_extract expander not to fail. It is probably passed not

[PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-14 Thread Kewen.Lin via Gcc-patches
Hi, Following Richi's suggestion [1], this patch is to move the handlings on VMAT_GATHER_SCATTER in the final loop nest of function vectorizable_load to its own loop. Basically it duplicates the final loop nest, clean up some useless set up code for the case of VMAT_GATHER_SCATTER, remove some un

[PATCH] vect: Move VMAT_LOAD_STORE_LANES handlings from final loop nest

2023-08-14 Thread Kewen.Lin via Gcc-patches
Hi, Following Richi's suggestion [1], this patch is to move the handlings on VMAT_LOAD_STORE_LANES in the final loop nest of function vectorizable_load to its own loop. Basically it duplicates the final loop nest, clean up some useless set up code for the case of VMAT_LOAD_STORE_LANES, remove som

[PATCH] vect: Remove several useless VMAT_INVARIANT checks

2023-08-14 Thread Kewen.Lin via Gcc-patches
Hi, In function vectorizable_load, there is one hunk which is dedicated for the handlings on VMAT_INVARIANT and return early, it means we shouldn't encounter any cases with memory_access_type VMAT_INVARIANT in the following code after that. This patch is to clean up several useless checks on VMAT

Re: [PATCH V4] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-14 Thread Kewen.Lin via Gcc-patches
Hi Juzhe, on 2023/8/14 15:09, juzhe.zh...@rivai.ai wrote: > Thanks Richi. > > CC kewen to see whether this patch is suitable for powerpc and s390. I did a bootstrapping and regression testing on Power10 (LE) and found a lot of failures. A short list looks like: < FAIL: gcc.c-torture/compile/2

Re: [PATCH 2/3] ivopts: Call valid_mem_ref_p with code_helper [PR110248]

2023-08-14 Thread Kewen.Lin via Gcc-patches
Hi, on 2023/8/14 15:53, Jan-Benedict Glaw wrote: > On Fri, 2023-06-30 13:46:40 +0800, Kewen.Lin via Gcc-patches > wrote: >> Bootstrapped and regtested on x86_64-redhat-linux and >> powerpc64{,le}-linux-gnu. >> >> Is it ok for trunk? > [...] > >> diff

Re: [PATCHv4, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-08-14 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/8/14 10:18, HAO CHEN GUI wrote: > Hi, > This patch modifies vsx extract expand and generates mfvsrwz/stxsiwx > for all sub targets when the mode is V4SI and the extracted element is word > 1 from BE order. Also this patch adds an insn pattern for mfvsrwz which > helps eliminat

Re: [PATCH] rs6000: Fix issue in specifying PTImode as an attribute [PR106895]

2023-08-09 Thread Kewen.Lin via Gcc-patches
Hi, on 2023/7/20 12:35, jeevitha via Gcc-patches wrote: > Hi All, > > The following patch has been bootstrapped and regtested on powerpc64le-linux. > > When the user specifies PTImode as an attribute, it breaks. Created > a tree node to handle PTImode types. PTImode attribute helps in generating

Re: [PATCH ver 3] rs6000: Fix __builtin_altivec_vcmpne{b,h,w} implementation

2023-08-09 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/8/8 01:50, Carl Love wrote: > > GCC maintainers: > > Ver 3: Updated description to make it clear the patch fixes the > confusion on the availability of the builtins. Fixed the dg-require- > effective-target on the test cases and the dg-options. Change the test > case so the fo

Re: [PATCH 1/3] targhooks: Extend legitimate_address_p with code_helper [PR110248]

2023-08-07 Thread Kewen.Lin via Gcc-patches
Hi Richi, on 2023/6/30 17:13, Kewen.Lin via Gcc-patches wrote: > Hi Richi, > > Thanks for your review! > > on 2023/6/30 16:56, Richard Biener wrote: >> On Fri, Jun 30, 2023 at 7:38 AM Kewen.Lin wrote: >>> >>> Hi, >>> >>> As PR110248 sh

PING^2 [PATCH v2] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]

2023-08-07 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609993.html BR, Kewen > on 2023/1/16 17:08, Kewen.Lin via Gcc-patches wrote: >> Hi, >> >> As Honza pointed out in [1], the current uses of function >> optimize_function_for_speed_p in rs60

PING^3 [PATCH v2] sched: Change no_real_insns_p to no_real_nondebug_insns_p [PR108273]

2023-08-07 Thread Kewen.Lin via Gcc-patches
Hi, I'd like to gentle ping this patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614818.html BR, Kewen >> on 2023/3/29 15:18, Kewen.Lin via Gcc-patches wrote: >>> Hi, >>> >>> By addressing Alexander's comments, against v1 this >>>

PING^4 [PATCH 0/9] rs6000: Rework rs6000_emit_vector_compare

2023-08-07 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping this series: https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607146.html BR, Kewen >>> on 2022/11/24 17:15, Kewen Lin wrote: Hi, Following Segher's suggestion, this patch series is to rework function rs6000_emit_vector_compare for vector float and int

Re: [PATCH V2] rs6000: Don't allow AltiVec address in movoo & movxo pattern [PR110411]

2023-08-07 Thread Kewen.Lin via Gcc-patches
Hi Jeevitha, on 2023/7/20 00:46, jeevitha wrote: > Hi All, > > The following patch has been bootstrapped and regtested on powerpc64le-linux. > > There are no instructions that do traditional AltiVec addresses (i.e. > with the low four bits of the address masked off) for OOmode and XOmode > objec

Re: [PATCH v2] rs6000: Fix __builtin_altivec_vcmpne{b,h,w} implementation

2023-08-07 Thread Kewen.Lin via Gcc-patches
Hi Carl, Sorry for the late review. on 2023/8/2 02:29, Carl Love wrote: > > GCC maintainers: > > Ver 2: Re-worked the test vec-cmpne.c to create a compile only test > verify the instruction generation and a runnable test to verify the > built-in functionality. Retested the patch on Power 8 LE

Re: [PATCH] rs6000: Fix __builtin_altivec_vcmpne{b,h,w} implementation

2023-07-30 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/28 23:00, Carl Love wrote: > GCC maintainers: > > The following patch cleans up the definition for the > __builtin_altivec_vcmpnet. The current implementation implies that the s/__builtin_altivec_vcmpnet/__builtin_altivec_vcmpne[bhw]/ > built-in is only supported on Power 9

Re: [PATCHv2, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-07-30 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/7/25 10:10, HAO CHEN GUI wrote: > Hi, > This patch modifies vsx extract expand and generates mfvsrwz/stxsiwx > for all subtargets when the mode is V4SI and the index of extracted element > is 1 for BE and 2 for LE. Also this patch adds an insn pattern for mfvsrwz > which helps

Re: [PATCH, rs6000] Skip redundant vector extract if the element is first element of dword0 [PR110429]

2023-07-28 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/7/5 11:22, HAO CHEN GUI wrote: > Hi, > This patch skips redundant vector extract insn to be generated when > the extracted element is the first element of dword0 and the destination "The first element" is confusing, it's easy to be misunderstood as element 0, but in fact the

Re: [PATCH] Optimize vec_splats of vec_extract for V2DI/V2DF (PR target/99293)

2023-07-28 Thread Kewen.Lin via Gcc-patches
Hi Mike, on 2023/7/11 03:50, Michael Meissner wrote: > This patch optimizes cases like: > > vector double v1, v2; > /* ... */ > v2 = vec_splats (vec_extract (v1, 0)); /* or */ > v2 = vec_splats (vec_extract (v1, 1)); > > Previously: > > vector long long > sp

Re: [PATCH] vect: Treat VMAT_ELEMENTWISE as scalar load in costing [PR110776]

2023-07-26 Thread Kewen.Lin via Gcc-patches
on 2023/7/26 18:02, Richard Biener wrote: > On Wed, Jul 26, 2023 at 4:52 AM Kewen.Lin wrote: >> >> Hi, >> >> PR110776 exposes one issue that we could query unaligned >> load for vector type but actually no unaligned vector load >> is supported there. The reason is that the costed load is >> with

Re: [PATCH] Fix typo in insn name.

2023-07-25 Thread Kewen.Lin via Gcc-patches
Hi Mike, on 2023/7/11 03:59, Michael Meissner wrote: > In doing other work, I noticed that there was an insn: > > vsx_extract_v4sf__load > > Which did not have an iterator. I removed the useless . It actually has a mode iterator, the "P" is used for clobber. The whole pattern of this de

[PATCH] rs6000: Correct vsx operands output for xxeval [PR110741]

2023-07-25 Thread Kewen.Lin via Gcc-patches
Hi, PR110741 exposes one issue that we didn't use the correct character for vsx operands in output operand substitution, consequently it can map to the wrong registers which hold some unexpected values. Bootstrapped and regress-tested on powerpc64-linux-gnu P7/P8/P9 and powerpc64le-linux-gnu P9/P

[PATCH] vect: Treat VMAT_ELEMENTWISE as scalar load in costing [PR110776]

2023-07-25 Thread Kewen.Lin via Gcc-patches
Hi, PR110776 exposes one issue that we could query unaligned load for vector type but actually no unaligned vector load is supported there. The reason is that the costed load is with single-lane vector type and its memory access type is VMAT_ELEMENTWISE, we actually take it as scalar load and set

Re: [PATCH] vect: Don't vectorize a single scalar iteration loop [PR110740]

2023-07-23 Thread Kewen.Lin via Gcc-patches
on 2023/7/21 19:49, Richard Biener wrote: > On Fri, Jul 21, 2023 at 8:08 AM Kewen.Lin wrote: >> >> Hi, >> >> The function vect_update_epilogue_niters which has been >> removed by r14-2281 has some code taking care of that if >> there is only one scalar iteration left for epilogue then >> we won't

Re: [PATCHv2, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-07-23 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/7/21 09:32, HAO CHEN GUI wrote: > Hi, > This patch modifies vsx extract expand and generates mfvsrwz/stxsiwx > for all subtargets when the mode is V4SI and the index of extracted element > is 1 for BE and 2 for LE. Also this patch adds an insn pattern for mfvsrwz > which can h

Re: [PATCH 2/2 ver 5] rs6000, fix vec_replace_unaligned built-in arguments

2023-07-23 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/22 07:38, Carl Love wrote: > GCC maintainers: > > Version 5, Fixed patch description, the first argument should be of > type vector. Fixed comment in vsx.md to say "Vector and scalar > extract_elt iterator/attr ". Removed a few of the changes in > version 4. Specifically

Re: [PATCH 1/2 ver 2] rs6000, add argument to function find_instance

2023-07-23 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/22 07:38, Carl Love wrote: > GCC maintainers: > > Version 2: Updated a number of formatting and spacing issues. Added > the NARGS description to the header comment for function find_instance. > This patch was tested on Power 8 LE/BE, Power 9 LE/BE and Power 10 LE > with no r

Re: [PATCH 4/9] vect: Adjust vectorizable_load costing on VMAT_ELEMENTWISE and VMAT_STRIDED_SLP

2023-07-23 Thread Kewen.Lin via Gcc-patches
Hi Iain, on 2023/7/22 23:58, Iain Sandoe wrote: > Hi Kewen, > > This patch breaks bootstrap on powerpc-darwin (which has Altivec, but not > VSX) while building libgfortran. > >> On 3 Jul 2023, at 04:19, Kewen.Lin via Gcc-patches >> wrote: > > Please

[PATCH] vect: Don't vectorize a single scalar iteration loop [PR110740]

2023-07-20 Thread Kewen.Lin via Gcc-patches
Hi, The function vect_update_epilogue_niters, which was removed by r14-2281, had some code ensuring that if there is only one scalar iteration left for the epilogue, we won't try to vectorize it any more. Although costing should be able to take care of it eventually, I think we still want
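
The check described above amounts to something like the following sketch. The function name and its parameter are hypothetical; the real decision in the vectorizer works on loop metadata rather than a plain integer.

```cpp
#include <cassert>

// Hypothetical predicate: with a single scalar iteration left (or none),
// there is nothing for a vector epilogue to gain, so do not vectorize it.
bool epilogue_worth_vectorizing(unsigned long remaining_scalar_iters) {
  return remaining_scalar_iters > 1;
}
```

A guard like this is an early exit placed ahead of costing, so the single-iteration case never reaches the cost model at all.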

Re: [PATCH] sccvn: Correct the index of bias for IFN_LEN_STORE [PR110744]

2023-07-20 Thread Kewen.Lin via Gcc-patches
on 2023/7/20 20:37, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi, >> >> Commit r14-2267-gb8806f6ffbe72e adjusts the arguments order >> of LEN_STORE from {len,vector,bias} to {len,bias,vector}, >> in order to make them consistent with LEN_MASK_STORE and >> MASK_STORE. But it missed to upda

Re: [PATCH] testsuite: Add a test case for PR110729

2023-07-20 Thread Kewen.Lin via Gcc-patches
on 2023/7/20 20:34, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi, >> >> As PR110729 reported, there was one issue for .section >> __patchable_function_entries with -ffunction-sections, that >> is we put the same symbol as link_to section symbol for all >> functions wrongly. The commit r13

Re: [PATCH 2/2 ver 4] rs6000, fix vec_replace_unaligned built-in arguments

2023-07-20 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/18 03:20, Carl Love wrote: > GCC maintainers: > > Version 4, changed the new RS6000_OVLD_VEC_REPLACE_UN case statement > rs6000/rs6000-c.cc. The existing REPLACE_ELT iterator name was changed > to REPLACE_ELT_V along with the associated define_mode_attr. Renamed > VEC_RU to R

Re: [PATCH 1/2] rs6000, add argument to function find_instance

2023-07-20 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/18 03:19, Carl Love wrote: > > GCC maintainers: > > The rs6000 function find_instance assumes that it is called for built- > ins with only two arguments. There is no checking for the actual > number of arguments used in the built-in. This patch adds an > additional paramete

[PATCH] sccvn: Correct the index of bias for IFN_LEN_STORE [PR110744]

2023-07-20 Thread Kewen.Lin via Gcc-patches
Hi, Commit r14-2267-gb8806f6ffbe72e adjusts the argument order of LEN_STORE from {len,vector,bias} to {len,bias,vector}, in order to make it consistent with LEN_MASK_STORE and MASK_STORE. But it missed updating the related handling in tree-ssa-sccvn.cc, which caused the failure shown in PR 1107
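
To make the index shift concrete, here is a tiny sketch covering only the three arguments named in the message. The arrays and position_of are hypothetical, and the positions are relative to that three-element list, not to the full argument list of the internal function.

```cpp
#include <cassert>
#include <cstring>

// The two orders quoted in the message (only the arguments it names).
const char *const kOldOrder[3] = {"len", "vector", "bias"};
const char *const kNewOrder[3] = {"len", "bias", "vector"};

// Returns the position of NAME within ORDER, or -1 when it is absent.
int position_of(const char *const (&order)[3], const char *name) {
  for (int i = 0; i < 3; ++i)
    if (std::strcmp(order[i], name) == 0)
      return i;
  return -1;
}
```

Any code that still reads the bias from its old position after the reorder would pick up the vector operand instead, which is the kind of mismatch the patch corrects.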

[PATCH] testsuite: Add a test case for PR110729

2023-07-20 Thread Kewen.Lin via Gcc-patches
Hi, As PR110729 reported, there was one issue for .section __patchable_function_entries with -ffunction-sections, that is we put the same symbol as link_to section symbol for all functions wrongly. The commit r13-4294 for PR99889 has fixed this with the corresponding label LPFE* which sits in the

Re: PING^2 [PATCH] Adjust the symbol for SECTION_LINK_ORDER linked_to section [PR99889]

2023-07-19 Thread Kewen.Lin via Gcc-patches
Hi Fangrui, on 2023/7/19 14:33, Fangrui Song wrote: > On Thu, Nov 24, 2022 at 7:26 PM Kewen.Lin via Gcc-patches > wrote: >> >> Hi Richard, >> >> on 2022/11/23 00:08, Richard Sandiford wrote: >>> "Kewen.Lin" writes: >>>> Hi Richard, >

Re: [PATCH V2] rs6000: Change GPR2 to volatile & non-fixed register for function that does not use TOC [PR110320]

2023-07-18 Thread Kewen.Lin via Gcc-patches
Hi Jeevitha, on 2023/7/17 11:40, P Jeevitha wrote: > > Hi All, > > The following patch has been bootstrapped and regtested on powerpc64le-linux. Since one line touched has (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) and powerpc64le-linux only adopts ABI_ELFv2, could you also test this

Re: [PATCH v7, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2023-07-18 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2022/9/26 11:35, HAO CHEN GUI wrote: > Hi, > This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000. > Tests show that outputs of xs[min/max]dp are consistent with the standard > of C99 fmin/max. > > This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max

Re: rs6000: Fix expected counts powerpc/p9-vec-length-full

2023-07-18 Thread Kewen.Lin via Gcc-patches
Hi Carl, The issue was tracked by PR109971 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971) and I think it has been resolved. By the way, when the expected insn count changes, it does expose an issue, which can be either a test issue or a functionality issue; if it's taken as a test issue, it needs so

Re: [PATCH, rs6000] Generate mfvsrwz for all platforms and remove redundant zero extend [PR106769]

2023-07-18 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/6/19 09:14, HAO CHEN GUI wrote: > Hi, > This patch modifies vsx extract expander and generates mfvsrwz/stxsiwx > for all platforms when the mode is V4SI and the index of extracted element > is 1 for BE and 2 for LE. Also this patch adds an insn pattern for mfvsrwz > which can
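
The "1 for BE and 2 for LE" condition above names the same physical word under the two element numberings. A minimal sketch of that mapping, assuming a four-element V4SI vector; the helper name v4si_flip_index is made up for illustration and is not taken from the patch:

```c
#include <assert.h>

/* Number of SImode elements in a V4SI vector. */
enum { V4SI_NUNITS = 4 };

/* Hypothetical helper: convert an element index between big-endian and
   little-endian element numbering.  The mapping is its own inverse.  */
unsigned
v4si_flip_index (unsigned idx)
{
  assert (idx < V4SI_NUNITS);
  return V4SI_NUNITS - 1 - idx;
}
```

With this numbering, BE element 1 and LE element 2 are the same word, which is why the snippet lists both indices for one instruction.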

Re: [PATCH] vect: Initialize new_temp to avoid false positive warning [PR110652]

2023-07-17 Thread Kewen.Lin via Gcc-patches
on 2023/7/17 14:39, Richard Biener wrote: > On Mon, Jul 17, 2023 at 4:22 AM Kewen.Lin wrote: >> >> Hi, >> >> As PR110652 and its duplicate PRs show, there could be one >> build error >> >> error: 'new_temp' may be used uninitialized >> >> for some build configurations. It's a false positive war

[PATCH] vect: Initialize new_temp to avoid false positive warning [PR110652]

2023-07-16 Thread Kewen.Lin via Gcc-patches
Hi, As PR110652 and its duplicate PRs show, there could be one build error error: 'new_temp' may be used uninitialized for some build configurations. It's a false positive warning (or error at -Werror), but in order to make the build succeed, this patch is to initialize the reported variable
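
For readers unfamiliar with this kind of fix, here is a small sketch of the pattern only, not the vectorizer code itself. The function last_square is invented for illustration; the assumption is that the compiler cannot prove the loop body runs at least once, so it may report the variable as possibly uninitialized under -Wmaybe-uninitialized:

```c
/* Hypothetical example: new_temp is assigned only inside the loop.
   Giving it an initial value is harmless and keeps the warning (an
   error under -Werror) from breaking the build.  */
int
last_square (int ncopies)
{
  int new_temp = 0;	/* Initialized only to silence the warning.  */
  for (int j = 0; j < ncopies; j++)
    new_temp = j * j;
  return new_temp;
}
```

The initial value is never observed when the loop runs at least once, so behaviour on the paths that matter is unchanged.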

Re: [PATCH ver 3] rs6000, fix vec_replace_unaligned built-in arguments

2023-07-13 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/8 04:18, Carl Love wrote: > > GCC maintainers: > > Version 3, added code to altivec_resolve_overloaded_builtin so the > correct instruction is selected for the size of the second argument. > This restores the instruction counts to the original values where the > correct instr

Re: [PATCH ver4] rs6000, Add return value to __builtin_set_fpscr_rn

2023-07-13 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/12 02:06, Carl Love wrote: > GCC maintainers: > > Ver 4, Removed extra space in subject line. Added comment to commit > log comments about new __SET_FPSCR_RN_RETURNS_FPSCR__ define. Changed > Added to Add and Renamed to Rename in ChangeLog. Updated define_expand > "rs6000_se

Re: [PATCH ver3] rs6000, Add return value to __builtin_set_fpscr_rn

2023-07-10 Thread Kewen.Lin via Gcc-patches
Hi Carl, Apart from Peter's review comments, some nits are inline below. on 2023/7/11 03:18, Carl Love wrote: > > GCC maintainers: > > Ver 3, Renamed the patch per comments on ver 2. Previous subject line > was " [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value". > Fixed spel

Re: [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value

2023-07-10 Thread Kewen.Lin via Gcc-patches
on 2023/7/11 03:18, Carl Love wrote: > On Fri, 2023-07-07 at 12:06 +0800, Kewen.Lin wrote: >> Hi Carl, >> >> Some more minor comments are inline below on top of Peter's >> insightful >> review comments. >> >> on 2023/7/1 08:58, Carl Love wrote: >>> GCC maintainers: >>> >>> Ver 2, Went back thru th

Re: [PATCH] VECT: Add COND_LEN_* operations for loop control with length targets

2023-07-10 Thread Kewen.Lin via Gcc-patches
Hi Juzhe, > Hi, Kewen. > > You mean COND_LEN_* family I added is unlikely used by power in the future? Since AFAIK there is no plan to extend len family on Power, it's very likely that Power will have no chance to leverage them, so yes. > Could you revise them to make it possible used by power

Re: [PATCH] rs6000: Remove redundant initialization [PR106907]

2023-07-10 Thread Kewen.Lin via Gcc-patches
on 2023/7/11 07:11, Peter Bergner wrote: > On 6/29/23 4:31 AM, Kewen.Lin via Gcc-patches wrote: >> This is okay for trunk (no backports needed btw), this fix can even be >> taken as obvious, thanks! >> >>> >>> 2023-06-07 Jeevitha Palanisamy >>

Re: [PATCH v5] rs6000: Update the vsx-vector-6.* tests.

2023-07-10 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/8 04:40, Carl Love wrote: > > GCC maintainers: > > Ver 5. Removed -compile from the names of the compile only tests. Fixed > up the reference to the compile file names in the .h file headers. > Replaced powerpc_vsx_ok with vsx_hw in the run test files. Removed the > -save-te

Re: [PATCH] VECT: Add COND_LEN_* operations for loop control with length targets

2023-07-10 Thread Kewen.Lin via Gcc-patches
on 2023/7/10 18:40, Richard Biener wrote: > On Fri, 7 Jul 2023, juzhe.zh...@rivai.ai wrote: > >> From: Ju-Zhe Zhong >> >> Hi, Richard and Richi. >> >> This patch is adding cond_len_* operations pattern for target support >> loop control with length. > > It looks mostly OK - the probably obvious

Re: [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value

2023-07-06 Thread Kewen.Lin via Gcc-patches
on 2023/7/7 07:00, Peter Bergner wrote: > On 7/6/23 5:54 PM, Peter Bergner wrote: >> On 6/30/23 7:58 PM, Carl Love via Gcc-patches wrote: >>> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c >>> @@ -0,0 +1,153 @@ >>> +/* { dg-do run { target { powerpc*-*-* } } } */ >> >> powerpc*-*-

Re: [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value

2023-07-06 Thread Kewen.Lin via Gcc-patches
Hi Carl, Some more minor comments are inline below on top of Peter's insightful review comments. on 2023/7/1 08:58, Carl Love wrote: > > GCC maintainers: > > Ver 2, Went back thru the requirements and emails. Not sure where I > came up with the requirement for an overloaded version with doubl

Re: [PATCH v4] rs6000: Update the vsx-vector-6.* tests.

2023-07-06 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/6 23:33, Carl Love wrote: > GCC maintainers: > > Ver 4. Fixed a few typos. Redid the tests to create separate run and > compile tests. Thanks! This new version looks good, except that we need vsx_hw for run, and there are two nits; see below. > > Ver 3. Added __attribute__ ((noip

Re: [PATCH V4 1/4] rs6000: build constant via li;rotldi

2023-07-03 Thread Kewen.Lin via Gcc-patches
Hi Jeff, on 2023/7/4 10:18, Jiufu Guo via Gcc-patches wrote: > Hi, > > If a constant can be rotated to/from a positive or negative > value loadable by "li", then "li;rotldi" can be used to build the constant. > > Compare with the previous version: > https://gcc.gnu.org/pipermail/gcc-patches/
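
A rough sketch of the check described in the summary above, not the actual rs6000 implementation: a 64-bit constant qualifies when some rotation of it fits the sign-extended 16-bit immediate that "li" can load. The names rotl64 and can_build_by_li_rotldi are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate V left by N bits (N is taken modulo 64).  */
static uint64_t
rotl64 (uint64_t v, unsigned n)
{
  n &= 63;
  return n ? (v << n) | (v >> (64 - n)) : v;
}

/* Hypothetical helper: return true if C equals rotl64 (imm, rot) for
   some signed 16-bit IMM, i.e. "li imm; rotldi rot" would build C.
   On success *IMM and *ROT describe one such pair.  */
bool
can_build_by_li_rotldi (uint64_t c, int16_t *imm, unsigned *rot)
{
  assert (imm && rot);
  for (unsigned n = 0; n < 64; n++)
    {
      /* Undo a left rotation by N, i.e. rotate right by N.  */
      int64_t s = (int64_t) rotl64 (c, (64 - n) & 63);
      if (s >= INT16_MIN && s <= INT16_MAX)
	{
	  *imm = (int16_t) s;
	  *rot = n;
	  return true;
	}
    }
  return false;
}
```

A caller can confirm a hit by rotating the sign-extended immediate back with rotl64 and comparing it against the original constant.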

Re: [PATCH ver 3] rs6000: Update the vsx-vector-6.* tests.

2023-07-03 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/6/30 05:36, Carl Love wrote: > GCC maintainers: > > Ver 3. Added __attribute__ ((noipa)) to the test files. Changed some > of the scan-assembler-times checks to cover multiple similar > instructions. Change the function check macro to a macro to generate a > function to do the

Re: [PATCH] rs6000: Update the vsx-vector-6.* tests.

2023-07-03 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/3 23:57, Carl Love wrote: > Kewen: > > On Fri, 2023-06-30 at 15:20 -0700, Carl Love wrote: >> Segher never liked the above way of looking at the assembly. He >> prefers: >> gcc -S -g -mcpu=power8 -o vsx-vector-6-func-2lop.s vsx-vector-6- >> func- >> 2lop.c >> >> grep xxlor

Re: [PATCH 0/9] vect: Move costing next to the transform for vect load

2023-07-02 Thread Kewen.Lin via Gcc-patches
Hi Richi, Thanks for your review comments on this and some others! on 2023/6/30 19:37, Richard Biener wrote: > On Tue, Jun 13, 2023 at 4:07 AM Kewen Lin wrote: >> >> This patch series follows Richi's suggestion at the link [1], >> which suggests structuring vectorizable_load to make costing >> ne

Re: [PATCH 4/9] vect: Adjust vectorizable_load costing on VMAT_ELEMENTWISE and VMAT_STRIDED_SLP

2023-07-02 Thread Kewen.Lin via Gcc-patches
Hi Richard, on 2023/7/2 16:58, Richard Sandiford wrote: > Kewen Lin writes: >> @@ -9743,11 +9739,23 @@ vectorizable_load (vec_info *vinfo, >>unsigned int n_groups = 0; >>for (j = 0; j < ncopies; j++) >> { >> - if (nloads > 1) >> + if (nloads > 1 && !costing_p) >>

[PATCH 9/9 v2] vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS

2023-07-02 Thread Kewen.Lin via Gcc-patches
This is version v2, rebased on the latest trunk. This patch adjusts the cost handling on VMAT_CONTIGUOUS in function vectorizable_load. We don't call function vect_model_load_cost for it any more. It removes function vect_model_load_cost, which becomes useless and unreachable now. gcc/Chang
