Re: [PATCH 1/2] vect: Add costing_for_scalar parameter to init_cost hook

2021-05-11 Thread Kewen.Lin via Gcc-patches
Hi Richi, on 2021/5/10 9:55 PM, Richard Biener wrote: > On Sat, May 8, 2021 at 10:05 AM Kewen.Lin wrote: >> >> Hi Richi, >> >> Thanks for the comments! >> >> on 2021/5/7 5:43 PM, Richard Biener wrote: >>> On Fri, May 7, 2021 at 5:30 AM Ke

Re: [PATCH 1/2] vect: Add costing_for_scalar parameter to init_cost hook

2021-05-11 Thread Kewen.Lin via Gcc-patches
Hi Richard, on 2021/5/10 10:08 PM, Richard Sandiford wrote: > "Kewen.Lin via Gcc-patches" writes: >> on 2021/5/7 5:43 PM, Richard Biener wrote: >>> On Fri, May 7, 2021 at 5:30 AM Kewen.Lin via Gcc-patches >>> wrote: >>>> >>>> Hi,

Re: [PATCH 2/2 v2] rs6000: Guard density_test only for vector version

2021-05-11 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2021/5/11 4:12 AM, Segher Boessenkool wrote: > Hi! > > On Sat, May 08, 2021 at 04:12:18PM +0800, Kewen.Lin wrote: >> --- a/gcc/config/rs6000/rs6000.c >> +++ b/gcc/config/rs6000/rs6000.c >> @@ -5234,6 +5234,8 @@ typedef struct _rs6000_cost_data >>

Re: [PATCH 1/2] vect: Add costing_for_scalar parameter to init_cost hook

2021-05-11 Thread Kewen.Lin via Gcc-patches
Hi Richi, > OTOH we already pass scalar_stmt to individual add_stmt_cost, > so not sure whether the context really matters. That said, > the density test looks "interesting" ... the intent was that finish_cost > might look at gathered data from add_stmt, not that it looks at >

[PATCH v2] forwprop: Support vec perm fed by CTOR and CTOR/CST [PR99398]

2021-05-12 Thread Kewen.Lin via Gcc-patches
Hi Richi, Thanks for the review! on 2021/5/11 9:26 PM, Richard Biener wrote: > On Fri, 7 May 2021, Kewen.Lin wrote: > >> Hi, >> >> This patch is to teach forwprop to optimize some cases where the >> permutated operands of vector permutation are from two same typ

Re: [PATCH 1/2] vect: Add costing_for_scalar parameter to init_cost hook

2021-05-13 Thread Kewen.Lin via Gcc-patches
Hi! >>> But in the end the vector code shouldn't end up worse than the >>> scalar code with respect to IVs - the cases where it would should >>> be already costed. So I wonder if you have specific examples >>> where things go worse enough for the heuristic to trigger? >>> >> >> One typical case t

Re: [PATCH 1/2] vect: Add costing_for_scalar parameter to init_cost hook

2021-05-17 Thread Kewen.Lin via Gcc-patches
on 2021/5/17 4:55 PM, Richard Biener wrote: > On Thu, May 13, 2021 at 9:04 AM Kewen.Lin wrote: >> >> Hi! >> >>>>> But in the end the vector code shouldn't end up worse than the >>>>> scalar code with respect to IVs - the cases where it would

[PATCH] vect: Replace hardcoded weight factor with param

2021-05-18 Thread Kewen.Lin via Gcc-patches
Hi, This patch is to replace the current hardcoded weight factor 50 for those statements in an inner loop relative to the loop being vectorized with a specific parameter vect-inner-loop-weight-factor. The motivation behind this change is: if targets want to have one unique function to gather some

Re: [PATCH 1/3] Rework 128-bit complex multiply and divide, PR target/107299

2022-12-12 Thread Kewen.Lin via Gcc-patches
on 2022/12/9 06:04, Michael Meissner wrote: > On Wed, Dec 07, 2022 at 03:55:41PM +0800, Kewen.Lin wrote: >> Hi Mike, >> >> on 2022/12/7 14:44, Michael Meissner wrote: >>> On Tue, Dec 06, 2022 at 05:36:54PM +0800, Kewen.Lin wrote: >>>> Hi Mike, >>>

Re: [PATCH v5, rs6000] Change mode and insn condition for VSX scalar extract/insert instructions

2022-12-12 Thread Kewen.Lin via Gcc-patches
on 2022/12/12 11:23, HAO CHEN GUI wrote: > Hi Kewen, > > On 2022/12/8 16:47, Kewen.Lin wrote: >> This documentation update reminds me of that the current prototype of >> __ieee128 >> variant can be: >> >> unsigned int scalar_extract_exp (__ieee128 source); >

Re: [PATCH 1/3] Rework 128-bit complex multiply and divide, PR target/107299

2022-12-14 Thread Kewen.Lin via Gcc-patches
on 2022/12/13 14:14, Michael Meissner wrote: > On Mon, Dec 12, 2022 at 06:20:14PM +0800, Kewen.Lin wrote: >> Without or with patch #1, the below ICE in libgcc exists, the ICE should have >> nothing to do with the special handling for building_libgcc in patch #1. I >> think

Re: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299

2022-12-14 Thread Kewen.Lin via Gcc-patches
on 2022/12/6 19:27, Kewen.Lin via Gcc-patches wrote: > Hi Mike, > > Thanks for fixing this, some comments are inlined below. > > on 2022/11/2 10:42, Michael Meissner wrote: >> This patch fixes the issue that GCC cannot build when the default long double >> is IEEE 128

Re: [PATCH V2] rs6000: Load high and low part of 64bit constant independently

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi Jeff, on 2022/12/12 09:44, Jiufu Guo via Gcc-patches wrote: > Hi, > > Compare with previous patch, this patch updates according to comments; fixes > conflicts with trunk, and recheck bootstrap&regtest. > https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607333.html > > For a complicate 64bi

Re: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi Jakub, Thanks for the comments! on 2022/12/14 17:36, Jakub Jelinek wrote: > On Wed, Dec 14, 2022 at 04:46:07PM +0800, Kewen.Lin via Gcc-patches wrote: >> on 2022/12/6 19:27, Kewen.Lin via Gcc-patches wrote: >>> Hi Mike, >>> >>> Thanks for fixing t

Re: [PATCH V4 1/2] rs6000: use li;x?oris to build constant

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi Jeff, on 2022/12/12 09:38, Jiufu Guo via Gcc-patches wrote: > Hi, > > For constant C: > If '(c & 0x8000ULL) == 0x8000ULL' or say: > 32(1) || 16(x) || 1(1) || 15(x), using "li; xoris" would be ok. > > If '(c & 0x80008000ULL) == 0x8000ULL' or say: > 32(0) ||

[PATCH] rs6000: Raise error for __vector_{quad, pair} uses without MMA enabled [PR106736]

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi, As PR106736 shows, it's unexpected to use __vector_quad and __vector_pair types without MMA support, it would cause ICE when expanding the corresponding assignment. We can't guard these built-in types registering under MMA support as Peter pointed out in that PR, because the registering is gl

PING^1 [PATCH 0/9] rs6000: Rework rs6000_emit_vector_compare

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping this series: https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607146.html BR, Kewen on 2022/11/24 17:15, Kewen Lin wrote: > Hi, > > Following Segher's suggestion, this patch series is to rework > function rs6000_emit_vector_compare for vector float and int > in multiple

PING^2 [PATCH v2] rs6000: Rework option -mpowerpc64 handling [PR106680]

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603350.html BR, Kewen > on 2022/10/12 16:12, Kewen.Lin via Gcc-patches wrote: >> Hi, >> >> PR106680 shows that -m32 -mpowerpc64 is different from >> -mpowerpc64 -m32, this is determined

PING^1 [PATCH] rs6000: Fix some issues related to Power10 fusion [PR104024]

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping: https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607526.html BR, Kewen on 2022/11/30 16:30, Kewen.Lin via Gcc-patches wrote: > Hi, > > As PR104024 shows, the option -mpower10-fusion isn't guarded by > -mcpu=power10, it causes compiler to fuse for som

PING^1 [PATCH v2] predict: Adjust optimize_function_for_size_p [PR105818]

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping: https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607527.html BR, Kewen on 2022/11/30 16:30, Kewen.Lin via Gcc-patches wrote: > Hi, > > Function optimize_function_for_size_p returns OPTIMIZE_SIZE_NO > if fun->decl is not null but no cgraph node is availab

Re: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299

2022-12-14 Thread Kewen.Lin via Gcc-patches
>> I bet the above workaround in generic code was added for a reason, it would >> surprise me if _Float128 worked at all without that hack. > > OK, I'll have a look at those nan failures soon. By investigating the exposed NaN failures, I found it's due to that it wants to convert _Float128 type c

Re: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299

2022-12-14 Thread Kewen.Lin via Gcc-patches
on 2022/12/14 18:33, Jakub Jelinek wrote: > On Wed, Dec 14, 2022 at 06:11:26PM +0800, Kewen.Lin wrote: >>> The hacks with different precisions of powerpc 128-bit floating types are >>> very unfortunate, it is I assume because the middle-end asserted that scalar >>

Re: PING^1 [PATCH v2] predict: Adjust optimize_function_for_size_p [PR105818]

2022-12-15 Thread Kewen.Lin via Gcc-patches
Hi Honza, Thanks for the comments. on 2022/12/14 21:22, Jan Hubicka wrote: >>> PR middle-end/105818 >>> >>> gcc/ChangeLog: >>> >>> * predict.cc (optimize_function_for_size_p): Further check >>> optimize_size of fun->decl when it is valid but no cgraph node. >>> >>> gcc/testsuite/Chang

Re: [PATCH] rs6000: Fix some issues related to Power10 fusion [PR104024]

2022-12-19 Thread Kewen.Lin via Gcc-patches
Hi Segher, Thanks for the review comments! on 2022/12/15 06:29, Segher Boessenkool wrote: > On Wed, Nov 30, 2022 at 04:30:13PM +0800, Kewen.Lin wrote: >> As PR104024 shows, the option -mpower10-fusion isn't guarded by >> -mcpu=power10, it causes compiler to fuse for some patt

[PATCH] fold-const: Treat fp conversion to a type with same mode as copy

2022-12-19 Thread Kewen.Lin via Gcc-patches
Hi, In function fold_convert_const_real_from_real, when the modes of two types involved in fp conversion are the same, we can simply take it as copy, rebuild with the exactly same TREE_REAL_CST and the target type. It is more efficient and helps to avoid possible unexpected signalling bit clearin

Re: [PATCH] fold-const: Treat fp conversion to a type with same mode as copy

2022-12-19 Thread Kewen.Lin via Gcc-patches
Hi Richi, Thanks for the comments! on 2022/12/19 16:49, Richard Biener wrote: > On Mon, Dec 19, 2022 at 9:12 AM Kewen.Lin wrote: >> >> Hi, >> >> In function fold_convert_const_real_from_real, when the modes of >> two types involved in fp conversion are the same,

Re: [PATCH] fold-const: Treat fp conversion to a type with same mode as copy

2022-12-20 Thread Kewen.Lin via Gcc-patches
on 2022/12/20 20:14, Jakub Jelinek wrote: > On Mon, Dec 19, 2022 at 04:11:59PM +0800, Kewen.Lin wrote: >> In function fold_convert_const_real_from_real, when the modes of >> two types involved in fp conversion are the same, we can simply >> take it as copy, rebuild

Re: [PATCH] rs6000: Raise error for __vector_{quad, pair} uses without MMA enabled [PR106736]

2022-12-20 Thread Kewen.Lin via Gcc-patches
on 2022/12/21 02:56, Segher Boessenkool wrote: > On Wed, Dec 14, 2022 at 07:21:20PM +0800, Kewen.Lin wrote: >> I'm going to push this next week if no objections. > > Please do? > Thanks! Committed in r13-4814-g282462b39584ae. BR, Kewen

[PATCH, committed] rs6000: Fix the wrong location of OPTION_MASK_P10_FUSION setting hunk

2022-12-20 Thread Kewen.Lin via Gcc-patches
Hi, The hunk for setting flag OPTION_MASK_P10_FUSION is wrongly located between the if and else if blocks for OPTION_MASK_MMA. This patch fixes that oversight. Bootstrapped and regtested on powerpc64-linux-gnu P8 and powerpc64le-linux-gnu P9 and P10. IMO this is obvious, already committe

Re: [PATCH] rs6000: Fix some issues related to Power10 fusion [PR104024]

2022-12-20 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2022/12/20 21:19, Segher Boessenkool wrote: > Hi! > > On Mon, Dec 19, 2022 at 02:13:49PM +0800, Kewen.Lin wrote: >> on 2022/12/15 06:29, Segher Boessenkool wrote: >>> On Wed, Nov 30, 2022 at 04:30:13PM +0800, Kewen.Lin wrote: >>>> --- a/gcc/confi

[RFC/PATCH] Remove the workaround for _Float128 precision [PR107299]

2022-12-21 Thread Kewen.Lin via Gcc-patches
Hi, This is a different attempt from Mike's approach[1][2] to fix the issue in PR107299. With option -mabi=ieeelongdouble specified, type long double (and __float128) and _Float128 have the same mode TFmode but different type precisions, which causes the assertion to fail in function fold_us

Re: [RFC/PATCH] Remove the workaround for _Float128 precision [PR107299]

2022-12-21 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2022/12/22 05:24, Segher Boessenkool wrote: > Hi! > > On Wed, Dec 21, 2022 at 05:02:17PM +0800, Kewen.Lin wrote: >> This a different attempt from Mike's approach[1][2] to fix the >> issue in PR107299. > > Ke Wen, Mike: so iiuc with this patch appli

Re: [RFC/PATCH] Remove the workaround for _Float128 precision [PR107299]

2022-12-21 Thread Kewen.Lin via Gcc-patches
Hi Joseph, on 2022/12/22 05:40, Joseph Myers wrote: > On Wed, 21 Dec 2022, Segher Boessenkool wrote: > >>> --- a/gcc/tree.cc >>> +++ b/gcc/tree.cc >>> @@ -9442,15 +9442,6 @@ build_common_tree_nodes (bool signed_char) >>>if (!targetm.floatn_mode (n, extended).exists (&mode)) >>> contin

Re: [PATCH v2] rs6000: Rework option -mpowerpc64 handling [PR106680]

2022-12-27 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2022/12/24 04:26, Segher Boessenkool wrote: > Hi! > > On Wed, Oct 12, 2022 at 04:12:21PM +0800, Kewen.Lin wrote: >> PR106680 shows that -m32 -mpowerpc64 is different from >> -mpowerpc64 -m32, this is determined by the way how we >> handle option powerp

[PATCH] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]

2023-01-04 Thread Kewen.Lin via Gcc-patches
Hi, As Honza pointed out in [1], the current uses of function optimize_function_for_speed_p in rs6000_option_override_internal are too early, since the query results from the functions optimize_function_for_{speed,size}_p could be changed later due to profile feedback and some function attributes

[PATCH] rs6000: Make P10_FUSION honour tuning setting

2023-01-04 Thread Kewen.Lin via Gcc-patches
Hi, We noticed this issue when Segher reviewed the patch for PR104024. When there is no explicit setting for option -mpower10-fusion, we enable OPTION_MASK_P10_FUSION for TARGET_POWER10. But it's not right, it should honour tuning setting instead. This patch is to fix it accordingly, it's boots

Re: [PATCH] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]

2023-01-04 Thread Kewen.Lin via Gcc-patches
Hi Segher, Thanks for the comments. on 2023/1/4 18:46, Segher Boessenkool wrote: > On Wed, Jan 04, 2023 at 05:20:14PM +0800, Kewen.Lin wrote: >> As Honza pointed out in [1], the current uses of function >> optimize_function_for_speed_p in rs6000_option_override_internal >>

Re: [PATCH] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]

2023-01-04 Thread Kewen.Lin via Gcc-patches
on 2023/1/4 22:02, Segher Boessenkool wrote: > Hi! > > On Wed, Jan 04, 2023 at 08:15:03PM +0800, Kewen.Lin wrote: >> on 2023/1/4 18:46, Segher Boessenkool wrote: >>>> @@ -25604,7 +25602,9 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx >>>> tlsarg, rtx co

[PATCH] rs6000: Teach rs6000_opaque_type_invalid_use_p about inline asm [PR108272]

2023-01-06 Thread Kewen.Lin via Gcc-patches
Hi, As PR108272 shows, there are some invalid uses of MMA opaque types in inline asm statements. This patch is to teach the function rs6000_opaque_type_invalid_use_p for inline asm, check and error any invalid use of MMA opaque types in input and output operands. Bootstrapped and regtested on po

[PATCH] rs6000: Allow powerpc64 to be unset for implicit 64 bit [PR108240]

2023-01-06 Thread Kewen.Lin via Gcc-patches
Hi, Before r13-4894, if 64 bit is explicitly specified, option powerpc64 is explicitly enabled too; while if 64 bit is implicitly enabled and there is no explicit setting for option powerpc64, option powerpc64 is eventually enabled or not would rely on the default value of the used cpu. It's initi

Re: [PATCH] rs6000: Make P10_FUSION honour tuning setting

2023-01-06 Thread Kewen.Lin via Gcc-patches
Hi Pat, on 2023/1/6 03:30, Pat Haugen wrote: > On 1/4/23 3:20 AM, Kewen.Lin via Gcc-patches wrote: >> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc >> index 88c865b6b4b..6fa084c0807 100644 >> --- a/gcc/config/rs6000/rs6000.cc >> +++ b/

Re: [PATCH] rs6000: Make P10_FUSION honour tuning setting

2023-01-11 Thread Kewen.Lin via Gcc-patches
on 2023/1/6 17:28, Kewen.Lin via Gcc-patches wrote: > Hi Pat, > > on 2023/1/6 03:30, Pat Haugen wrote: >> On 1/4/23 3:20 AM, Kewen.Lin via Gcc-patches wrote: >>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc >>> index 88c865b6b4b..6fa084

[PATCH, committed] rs6000/test: Make ppc-fortran.exp only available for PowerPC target

2023-01-11 Thread Kewen.Lin via Gcc-patches
Hi, When testing one patch which adds a fortran test case into test bucket powerpc/ppc-fortran/, I found one unexpected failure on a non-PowerPC target. It's due to that ppc-fortran.exp does not exit early if the testing target isn't a PowerPC target. This patch is to make it exit immediately if

[PATCH] rs6000: Imply VSX early to adopt some checkings on conflict [PR108240]

2023-01-11 Thread Kewen.Lin via Gcc-patches
Hi, As PR108240 shows, some options like -mmodulo can enable some flags implicitly including OPTION_MASK_VSX. But the enabled flag can conflict with some existing setting like soft float, it would result in some unexpected cases and consequent ICE. Actually there are already some checkings for VS

Re: [PATCH] rs6000: Teach rs6000_opaque_type_invalid_use_p about inline asm [PR108272]

2023-01-16 Thread Kewen.Lin via Gcc-patches
on 2023/1/6 17:26, Kewen.Lin via Gcc-patches wrote: > Hi, > > As PR108272 shows, there are some invalid uses of MMA opaque > types in inline asm statements. This patch is to teach the > function rs6000_opaque_type_invalid_use_p for inline asm, > check and error any invalid

[PATCH] rs6000: Teach rs6000_opaque_type_invalid_use_p about gcall [PR108348]

2023-01-16 Thread Kewen.Lin via Gcc-patches
Hi, PR108348 shows one special case that MMA opaque types are used in function arguments and treated as pass by reference, it results in one copying from argument to a temp variable, since this copying happens before rs6000_function_arg check, it can cause ICE without MMA support then. This patc

[PATCH v2] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]

2023-01-16 Thread Kewen.Lin via Gcc-patches
Hi, As Honza pointed out in [1], the current uses of function optimize_function_for_speed_p in rs6000_option_override_internal are too early, since the query results from the functions optimize_function_for_{speed,size}_p could be changed later due to profile feedback and some function attributes

Re: [PATCH] rs6000: Teach rs6000_opaque_type_invalid_use_p about gcall [PR108348]

2023-01-16 Thread Kewen.Lin via Gcc-patches
Hi Segher, Thanks for the review comments! on 2023/1/16 16:49, Segher Boessenkool wrote: > Hi! > > On Mon, Jan 16, 2023 at 04:33:36PM +0800, Kewen.Lin wrote: >> PR108348 shows one special case that MMA opaque types are >> used in function arguments and treated as pas

[PATCH/RFC] rs6000: Remove optimize_for_speed check for implicit TARGET_SAVE_TOC_INDIRECT [PR108184]

2023-01-16 Thread Kewen.Lin via Gcc-patches
Hi, Now we will check optimize_function_for_speed_p (cfun) for TARGET_SAVE_TOC_INDIRECT if it's implicitly enabled. But the effect of -msave-toc-indirect is actually to save the TOC in the prologue for indirect calls rather than inline, it's also good for optimize_function_for_size? So this patc

Re: [PATCH] rs6000: Teach rs6000_opaque_type_invalid_use_p about gcall [PR108348]

2023-01-16 Thread Kewen.Lin via Gcc-patches
Hi Segher! on 2023/1/16 18:40, Segher Boessenkool wrote: > Hi! > > On Mon, Jan 16, 2023 at 05:20:56PM +0800, Kewen.Lin wrote: >> on 2023/1/16 16:49, Segher Boessenkool wrote: >>>> +/* { dg-require-effective-target powerpc_p9modulo_ok } */ >>> >>>

[PATCH] rs6000: Fix typo on vec_vsubcuq in rs6000-overload.def [PR108396]

2023-01-16 Thread Kewen.Lin via Gcc-patches
Hi, As Andrew pointed out in PR108396, there is one typo in rs6000-overload.def on built-in function vec_vsubcuq: [VEC_VSUBCUQ, vec_vsubcuqP, __builtin_vec_vsubcuq] "vec_vsubcuqP" should be "vec_vsubcuq", this typo caused us to define vec_vsubcuqP in rs6000-vecdefines.h instead of vec_vsubcuq,

Re: [PATCH] rs6000: Teach rs6000_opaque_type_invalid_use_p about gcall [PR108348]

2023-01-16 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2023/1/16 23:24, Segher Boessenkool wrote: > On Mon, Jan 16, 2023 at 09:05:38PM +0800, Kewen.Lin wrote: >>> The *_ok things should only be used for features that can be disabled >>> during configuration, or features that we *want* users to be able to >>>

[PATCH 1/2] rs6000: Refactor script genfusion.pl

2023-01-18 Thread Kewen.Lin via Gcc-patches
Hi, As Segher suggested in [1], this patch is to refactor the script genfusion.pl for generating fusion.md. It mainly consists of: 1) Add main subroutine, which calls several backbone subroutines, hope it can show the skeleton clearly. 2) Encapsulate copyright and top comments emission t

[PATCH 2/2] rs6000: Refactor genfusion.pl a bit further

2023-01-18 Thread Kewen.Lin via Gcc-patches
Hi, To keep the previous refactoring patch not need to re-generate fusion.md and make the review easier, I didn't merge this patch into the previous one. But I think this one can help to make the subroutine gen_logical_addsubf_scalar more clear, by separating logical-logical and add-logical handl

Re: [PATCH/RFC] rs6000: Remove optimize_for_speed check for implicit TARGET_SAVE_TOC_INDIRECT [PR108184]

2023-01-18 Thread Kewen.Lin via Gcc-patches
Hi Mike, Thanks for the comments! on 2023/1/18 04:57, Michael Meissner wrote: > On Mon, Jan 16, 2023 at 05:39:04PM +0800, Kewen.Lin wrote: >> Hi, >> >> Now we will check optimize_function_for_speed_p (cfun) for >> TARGET_SAVE_TOC_INDIRECT if it's implicitly enabl

Re: Ping: [PATCH v4] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069]

2023-01-18 Thread Kewen.Lin via Gcc-patches
8/24 09:24, Xionghu Luo wrote: > Subject: > Ping: [PATCH v4] rs6000: Fix incorrect RTL for Power LE when removing the > UNSPECS [PR106069] > From: > Xionghu Luo > Date: > 2022/8/24, 09:24 > > To: > "Kewen.Lin" , Segher Boessenkool > > Cc: > Xiongh

[PATCH 3/3] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-22 Thread Kewen.Lin via Gcc-patches
Hi, Like r14-3317 which moves the handlings on memory access type VMAT_GATHER_SCATTER in vectorizable_load final loop nest, this one is to deal with vectorizable_store side. Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu and powerpc64{,le}-linux-gnu. Is it ok for trunk? BR

Re: [PATCH] vect: Replace DR_GROUP_STORE_COUNT with DR_GROUP_LAST_ELEMENT

2023-08-22 Thread Kewen.Lin via Gcc-patches
Hi Richi, on 2023/8/22 20:17, Richard Biener wrote: > On Tue, Aug 22, 2023 at 10:44 AM Kewen.Lin wrote: >> >> Hi, >> >> Now we use DR_GROUP_STORE_COUNT to record how many stores >> in a group have been transformed and only do the actual >> transform when

Re: [PATCH 1/3] vect: Remove some manual release in vectorizable_store

2023-08-22 Thread Kewen.Lin via Gcc-patches
on 2023/8/22 20:32, Richard Biener wrote: > On Tue, Aug 22, 2023 at 10:45 AM Kewen.Lin wrote: >> >> Hi, >> >> To avoid some duplicates in some follow-up patches on >> function vectorizable_store, this patch is to adjust some >> existing vec with

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-08-23 Thread Kewen.Lin via Gcc-patches
Hi Peter, on 2023/8/24 10:07, Peter Bergner wrote: > On 8/21/23 8:51 PM, Kewen.Lin wrote: >>> The following patch has been bootstrapped and regtested on powerpc64-linux. >> >> I think we should test this on powerpc64le-linux P8 or P9 (no P10) as well. > > That'

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-08-25 Thread Kewen.Lin via Gcc-patches
on 2023/8/25 11:20, Peter Bergner wrote: > On 8/24/23 12:56 AM, Kewen.Lin wrote: >> By looking into the uses of function rs6000_pcrel_p, I think we can >> just replace it with TARGET_PCREL. Previously we don't require PCREL >> unset for any unsupported target/OS, so w

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-08-27 Thread Kewen.Lin via Gcc-patches
on 2023/8/26 06:04, Peter Bergner wrote: > On 8/25/23 6:20 AM, Kewen.Lin wrote: >> Assuming the current PCREL_SUPPORTED_BY_OS unchanged, when >> PCREL_SUPPORTED_BY_OS is true, all its required conditions are >> satisfied, it should be safe. while PCREL_SUPPORTED_BY_OS is &

Re: [PATCH ver 3] rs6000, add overloaded DFP quantize support

2023-08-27 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/8/25 03:53, Carl Love wrote: > GCC maintainers: > > Version 3, fixed the built-in instance names. Missed removing the "n" > the name. Added the tighter constraints on the predicates for the > define_insn. Updated the wording for the built-ins in the > documentation file. Chan

Re: [PATCH-1, rs6000] Enable SImode in FP register on P7 [PR88558]

2023-08-28 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/8/25 14:44, HAO CHEN GUI wrote: > Hi, > This patch enables SImode in FP register on P7. Instruction "fctiw" > stores its integer output in an FP register. So SImode in FP register > needs be enabled on P7 if we want support "fctiw" on P7. > It sounds reasonable to support S

Re: [PATCH-2, rs6000] Implement 32bit inline lrint [PR88558]

2023-08-28 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/8/25 14:44, HAO CHEN GUI wrote: > Hi, > This patch implements 32bit inline lrint by "fctiw". It depends on > the patch1 to do SImode move from FP register on P7. > > Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. > > Thanks > Gui Haochen > > Ch

Re: [PATCH ver 4] rs6000, add overloaded DFP quantize support

2023-08-29 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/8/29 04:00, Carl Love wrote: > > GCC maintainers: > > Version 4, additional define_insn name fix. Change Log fix for the > UNSPEC_DQUAN. Retested patch on Power 10 LE. > > Version 3, fixed the built-in instance names. Missed removing the "n" > the name. Added the tighter co

Re: [PATCH, rs6000] Call vector load/store with length expand only on 64-bit Power10 [PR96762]

2023-08-29 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/8/29 10:50, HAO CHEN GUI wrote: > Hi, > This patch adds "TARGET_64BIT" check when calling vector load/store > with length expand in expand_block_move. It matches the expand condition > of "lxvl" and "stxvl" defined in vsx.md. > > This patch fixes the ICE occurred with the

Re: [PATCH, rs6000] Call vector load/store with length expand only on 64-bit Power10 [PR96762]

2023-08-30 Thread Kewen.Lin via Gcc-patches
on 2023/8/31 13:47, HAO CHEN GUI wrote: > Kewen, > I refined the patch according to your comments and it passed bootstrap > and regression test. > > I committed it as > https://gcc.gnu.org/g:946b8967b905257ac9f140225db744c9a6ab91be Thanks! We want this to be backported, so it's also ok for b

Re: [PATCH] rs6000: Update instruction counts to match vec_* calls [PR111228]

2023-08-30 Thread Kewen.Lin via Gcc-patches
Hi Peter, on 2023/8/31 06:42, Peter Bergner wrote: > Commit r14-3258-ge7a36e4715c716 increased the amount of folding we perform, > leading to better code. Update the expected instruction counts to match the > the number of associated vec_* built-in calls. > > Tested on powerpc64le-linux with no

Re: [PATCH-1v2, rs6000] Enable SImode in FP registers on P7 [PR88558]

2023-09-12 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/9/4 13:33, HAO CHEN GUI wrote: > Hi, > This patch enables SImode in FP registers on P7. Instruction "fctiw" > stores its integer output in an FP register. So SImode in FP register > needs be enabled on P7 if we want support "fctiw" on P7. > > The test case is in the second

Re: [PATCH-2v2, rs6000] Implement 32bit inline lrint [PR88558]

2023-09-12 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/9/4 13:33, HAO CHEN GUI wrote: > Hi, > This patch implements 32bit inline lrint by "fctiw". It depends on > the patch1 to do SImode move from FP registers on P7. > > Compared to last version, the main change is to add tests for "lrintf" > and adjust the count of correspond

Re: [PATCH] rs6000: unnecessary clear after vctzlsbb in vec_first_match_or_eos_index

2023-09-12 Thread Kewen.Lin via Gcc-patches
Hi Ajit, on 2023/8/31 18:44, Ajit Agarwal via Gcc-patches wrote: > > This patch removes zero extension from vctzlsbb as it already zero extends. > Bootstrapped and regtested on powerpc64-linux-gnu. > > Thanks & Regards > Ajit > > rs6000: unnecessary clear after vctzlsbb in vec_first_match_or_eo

Re: [PATCH v1] rs6000: unnecessary clear after vctzlsbb in vec_first_match_or_eos_index

2023-09-13 Thread Kewen.Lin via Gcc-patches
Hi, on 2023/9/13 00:39, Ajit Agarwal wrote: > This patch removes zero extension from vctzlsbb as it already zero extends. > Bootstrapped and regtested on powerpc64-linux-gnu. > > Thanks & Regards > Ajit > > rs6000: unnecessary clear after vctzlsbb in vec_first_match_or_eos_index > > For rs6000

[PATCH] rs6000: Use default target option node for callee by default [PR111380]

2023-09-17 Thread Kewen.Lin via Gcc-patches
Hi, As PR111380 (and the discussion in related PRs) shows, for now how function rs6000_can_inline_p treats the callee without any target option node is wrong. It considers it's always safe to inline this kind of callee, but actually its target flags are from the command line options (target_optio

[PATCH] rs6000: Skip empty inline asm in rs6000_update_ipa_fn_target_info [PR111366]

2023-09-17 Thread Kewen.Lin via Gcc-patches
Hi, PR111366 exposes one thing that can be improved in function rs6000_update_ipa_fn_target_info is to skip the given empty inline asm string, since it's impossible to adopt any hardware features (so far HTM). Since this rs6000_update_ipa_fn_target_info related approach exists in GCC12 and later,

Re: [PATCH-1v2, rs6000] Enable SImode in FP registers on P7 [PR88558]

2023-09-18 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/9/14 16:35, HAO CHEN GUI wrote: > Hi Kewen, > > On 2023/9/12 17:33, Kewen.Lin wrote: >> Ok, at least regression testing doesn't expose any needs to do disparaging >> for this. Could you also test this patch with SPEC2017 for P7 and P8 >> separatel

Re: [PATCH 3/4] ivopts: Consider cost_step on different forms during unrolling

2020-08-09 Thread Kewen.Lin via Gcc-patches
Hi Bin, Thanks for the review!! on 2020/8/8 4:01 PM, Bin.Cheng wrote: > Hi Kewen, > Sorry for the late reply. > The patch's most important change is below cost computation: > >> @@ -5890,6 +5973,10 @@ determine_iv_cost (struct ivopts_data *data, struct >> iv_cand *cand) >> cost_step = add_co

[PATCH] options: Make --help= to emit values post-overrided

2020-08-09 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2020/8/7 10:42 PM, Segher Boessenkool wrote: > Hi! > > On Fri, Aug 07, 2020 at 10:44:10AM +0800, Kewen.Lin wrote: >>> I think this makes a lot of sense. >>> >>>> btw, not sure whether it's a good idea to move target_option_override_hook >

Re: [PATCH 3/4] ivopts: Consider cost_step on different forms during unrolling

2020-08-10 Thread Kewen.Lin via Gcc-patches
Hi Bin, on 2020/8/10 8:38 PM, Bin.Cheng wrote: > On Mon, Aug 10, 2020 at 12:27 PM Kewen.Lin wrote: >> >> Hi Bin, >> >> Thanks for the review!! >> >> on 2020/8/8 4:01 PM, Bin.Cheng wrote: >>> Hi Kewen, >>> Sorry for the late reply. >>

[PATCH] testsuite: Add -fno-common to pr82374.c [PR94077]

2020-08-12 Thread Kewen.Lin via Gcc-patches
Hi, As the PR comments show, the case gcc.dg/gomp/pr82374.c fails on Power7 since gcc8. But it passes from gcc10. By looking into the difference, it's due to that gcc10 sets -fno-common as default, which makes vectorizer force the alignment and be able to use aligned vector load/store on those t

Re: [PATCH] options: Make --help= to emit values post-overrided

2020-08-13 Thread Kewen.Lin via Gcc-patches
Hi Richard, Thanks for the comments! on 2020/8/13 12:10 AM, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi Segher, >> >> on 2020/8/7 10:42 PM, Segher Boessenkool wrote: >>> Hi! >>> >>> On Fri, Aug 07, 2020 at 10:44:10AM +0800

[PATCH 3/4 v2] ivopts: Consider cost_step on different forms during unrolling

2020-08-18 Thread Kewen.Lin via Gcc-patches
Hi Bin, > I see, it's similar to the auto-increment case where cost should be > recorded only once. So this is okay given 1) fine predicting > rtl-unroll is likely impossible here; 2) the patch has very limited > impact. > Really appreciate your time and patience! I extended the previous versio

[PATCH v2] options: Make --help= to emit values post-overrided

2020-08-18 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2020/8/15 6:01 AM, Segher Boessenkool wrote: > Hi! > > On Fri, Aug 14, 2020 at 01:42:24PM +0800, Kewen.Lin wrote: >>> I think personally I'd prefer an option (3): call >>> target_option_override_hook directly in decode_options, >>> if help_op

[PATCH v2] testsuite: Update some vect cases for partial vectors

2020-08-18 Thread Kewen.Lin via Gcc-patches
Hi Richard, >> Yeah, the comments were confusing; their intent is to check which targets >> support partial vectors and which usage kind is used. >> >> How about updating them like: >> >> "Return true if loops using partial vectors are supported and usage kind is >> 1/2". > > I wasn't really comment

[PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-08-25 Thread Kewen.Lin via Gcc-patches
Hi Bin, >> >> For one particular case like: >> >> for (i = 0; i < SIZE; i++) >> y[i] = a * x[i] + z[i]; >> >> we will mark reg_offset_p for IV candidates on x as below: >>- (unsigned long) (x_18(D) + 8)// only mark this before. >>- x_18(D) + 8 >>- (unsigne
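
For readers following the example, here is a hand-written C sketch (not compiler output) of the loop above next to a four-way unrolled form. It is meant to show why one base pointer per array plus constant offsets can serve all the replicated accesses after unrolling, so that the IV step is paid once per unrolled iteration rather than once per original iteration. The function names, the unroll factor of 4 and the remainder loop are assumptions made for illustration.

```c
#include <stddef.h>

/* The loop shape from the example: y[i] = a * x[i] + z[i].  */
void
axpy_rolled (double *y, const double *x, const double *z,
             double a, size_t n)
{
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + z[i];
}

/* Hand-unrolled by 4.  Inside the body every access is "base + constant
   offset" (the reg_offset form), and the induction variable is stepped
   only once per unrolled iteration.  */
void
axpy_unrolled4 (double *y, const double *x, const double *z,
                double a, size_t n)
{
  size_t i = 0;
  for (; i + 4 <= n; i += 4)
    {
      const double *xp = x + i;
      const double *zp = z + i;
      double *yp = y + i;
      yp[0] = a * xp[0] + zp[0];
      yp[1] = a * xp[1] + zp[1];
      yp[2] = a * xp[2] + zp[2];
      yp[3] = a * xp[3] + zp[3];
    }
  /* Remainder iterations when n is not a multiple of 4.  */
  for (; i < n; i++)
    y[i] = a * x[i] + z[i];
}
```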

[PATCH,GCC9]rs6000: Backport fixes for PR92923 and PR93136

2020-08-30 Thread Kewen.Lin via Gcc-patches
Hi, This patch is to backport the fix for PR92923 and its subsequent fix for PR93136 to the GCC-9 branch. We found that builtin functions needlessly using VIEW_CONVERT_EXPRs on their operands can probably cause remarkable performance issues, especially when they are in the hotspot. One typical case is h

Re: [PATCH v2] testsuite: Update some vect cases for partial vectors

2020-08-30 Thread Kewen.Lin via Gcc-patches
Hi Richard, > >> +# Return true if loops using partial vectors are supported but only for >> loops >> +# whose need to iterate can be removed, that is, value of >> +# param_vect_partial_vector_usage is set to 1. > > For these comments, I think it would be good to use the sourcebuild.texi > word

PING [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2020-08-30 Thread Kewen.Lin via Gcc-patches
Hi, I'd like to gently ping this since the IVOPTs part is ready to land. https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546698.html BR, Kewen on 2020/5/28 8:19 PM, Kewen.Lin via Gcc-patches wrote: > > gcc/ChangeLog > > 2020-MM-DD Kewen Lin > > * cfgloop.h (

[PATCH] test/rs6000: Add Power9 and up as vect_len target

2020-08-30 Thread Kewen.Lin via Gcc-patches
Hi, Power9 supports vector load/store with length in bytes; this patch teaches check_effective_target_vect_len_load_store to treat it and later processors as effective vector-with-length targets. It also supplements the documentation for has_arch_pwr*. Bootstrapped/regtested on powerpc64le-linux-gnu P8. I

Re: [PATCH] test/rs6000: Add Power9 and up as vect_len target

2020-08-31 Thread Kewen.Lin via Gcc-patches
Hi Will, Thanks for the review! on 2020/9/1 1:13 AM, will schmidt wrote: > On Mon, 2020-08-31 at 14:43 +0800, Kewen.Lin via Gcc-patches wrote: >> Hi, >> >> Power9 supports vector with length in bytes load/store, this patch >> is to teach check_effective_target_vec

Re: [PATCH] test/rs6000: Add Power9 and up as vect_len target

2020-08-31 Thread Kewen.Lin via Gcc-patches
Hi Segher, >> proc check_effective_target_vect_len_load_store { } { >> -return 0 >> +return [expr { [check_effective_target_has_arch_pwr9] }] >> } > > Why not just > > return check_effective_target_has_arch_pwr9; > > ? (Or lose at least two pairs of brackets if not all three :-) )

[PATCH] test/rs6000: Replace test target p8 and p9+

2020-08-31 Thread Kewen.Lin via Gcc-patches
Hi, This is a trivial patch to clean up the existing rs6000 test targets p8 and p9+ by replacing them with a combination of the existing has_arch_pwr8 and has_arch_pwr9 targets, or with only one of them. Not sure if it's a good idea to tidy this, but I'm sending it out for comments. Bootstrapped/regtested on powerpc64le-linux-gnu P9. Any commen

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-01 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2020/9/1 3:41 AM, Segher Boessenkool wrote: > Hi! > > Just a note: > > On Tue, Aug 25, 2020 at 08:46:55PM +0800, Kewen.Lin wrote: >> 1) Currently address_cost hook on rs6000 always returns zero, but at least >> from Power7, pre_inc/pre_dec kind instructio

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-01 Thread Kewen.Lin via Gcc-patches
Hi Bin, >> 2) This case makes me think we should exclude ainc candidates in function >> mark_reg_offset_candidates. The justification is that: ainc candidate >> handles step update itself and when we calculate the cost for it against >> its ainc_use, the cost_step has been reduced. When unrolling

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-01 Thread Kewen.Lin via Gcc-patches
Hi Bin, I've updated the patch to punt ainc_use candidates as below: > + /* Skip AINC candidate since it contains address update itself, > +the replicated AINC computations when unrolling still have > +updates, unlike reg_offset_p candidates ca

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-02 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2020/9/2 6:25 PM, Segher Boessenkool wrote: > Hi! > > On Wed, Sep 02, 2020 at 11:16:00AM +0800, Kewen.Lin wrote: >> on 2020/9/1 3:41 AM, Segher Boessenkool wrote: >>> On Tue, Aug 25, 2020 at 08:46:55PM +0800, Kewen.Lin wrote: >>>> 1) Currentl

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-04 Thread Kewen.Lin via Gcc-patches
Hi Segher, >> Good question! I agree that they can execute in parallel, but it depends >> on how we interpret the addressing cost; if it's for required execution >> resources, I think it's off, since compared with ld, the ldu has two iops >> and an extra ALU requirement. > > OTOH, if you do not us

Re: [PATCH] vec: remove unreachable code

2020-09-04 Thread Kewen.Lin via Gcc-patches
Hi Andrea, on 2020/9/4 8:11 PM, Andrea Corallo wrote: > Hi all, > > just a small patch removing a piece of unreachable code in > 'vect_estimate_min_profitable_iters' given the condition > (LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo)) is always true as > checked just above. > FWIW, I had the

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-04 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2020/9/4 10:16 PM, Segher Boessenkool wrote: > Hi! > > On Fri, Sep 04, 2020 at 04:47:37PM +0800, Kewen.Lin wrote: >>>> Apart from that, one P9 specific point is that the update form load isn't >>>> preferred, the reason is that the instructio

[PATCH] rs6000: Use direct move for char/short vector CTOR [PR96933]

2020-09-08 Thread Kewen.Lin via Gcc-patches
Hi, This patch makes vector CTORs with char/short elements leverage direct move instructions when they are available. With one constructed test case, it gives a 145% speedup for char and 190% for short on P9. Tested with SPEC2017 x264_r at -Ofast on P9, it gets a 1.61% speedup (but based on unexpected SLP see
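
As a rough illustration of the kind of source this affects, here is a minimal sketch using the GCC vector extension: a vector of short elements built from separate scalar values, which the middle end represents as a vector CTOR. The type name v8u16 and the function build_v8u16 are hypothetical, and whether direct move instructions are actually used for it depends on the target and the compiler version; this sketch does not reproduce the constructed test case mentioned above.

```c
#include <stdint.h>

/* Eight 16-bit elements in a 16-byte vector (GCC vector extension).  */
typedef uint16_t v8u16 __attribute__ ((vector_size (16)));

/* Build the vector from eight independent scalars.  The elements
   arrive in scalar registers, so the compiler has to move them into a
   vector register one way or another.  */
v8u16
build_v8u16 (uint16_t a, uint16_t b, uint16_t c, uint16_t d,
             uint16_t e, uint16_t f, uint16_t g, uint16_t h)
{
  v8u16 v = { a, b, c, d, e, f, g, h };
  return v;
}
```

The char case is analogous, with sixteen 8-bit elements instead of eight 16-bit ones.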
