Re: [PATCH] LoongArch: Fix cost model for alsl

2025-01-16 Thread Lulu Cheng
On 2025/1/16 at 8:59 PM, Xi Ruoyao wrote: On Thu, 2025-01-16 at 20:52 +0800, Xi Ruoyao wrote: On Thu, 2025-01-16 at 20:30 +0800, Lulu Cheng wrote: On 2025/1/15 at 6:10 PM, Xi Ruoyao wrote: diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc index 9d97f0216f0..3a8e1297bd3

Re: [PATCH] LoongArch: Fix cost model for alsl

2025-01-16 Thread Lulu Cheng
On 2025/1/15 at 6:10 PM, Xi Ruoyao wrote: diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc index 9d97f0216f0..3a8e1297bd3 100644 --- a/gcc/config/loongarch/loongarch.cc +++ b/gcc/config/loongarch/loongarch.cc @@ -3929,14 +3929,31 @@ loongarch_rtx_costs (rtx x, machine

Re: [PATCH] LoongArch: Add alsl.wu

2025-01-16 Thread Lulu Cheng
LGTM! Thanks! On 2025/1/15 at 6:09 PM, Xi Ruoyao wrote: On 64-bit capable LoongArch hardware, alsl.wu is similar to alsl.w but zero-extending the 32-bit result. gcc/ChangeLog: * config/loongarch/loongarch.md (alslsi3_extend): Add alsl.wu. gcc/testsuite/ChangeLog: * gcc.target/loonga
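
The difference between the two instructions can be sketched in portable C. This is a rough model, not the patch itself: the function names are made up for illustration, the shift amount is assumed to be in the range 1 to 4, and the conversion to `int32_t` relies on the two's-complement behaviour that GCC provides.

```c
#include <stdint.h>

/* Illustrative model of alsl.w: shift rj left by sa, add rk, keep the
   low 32 bits, then sign-extend the result to 64 bits.  */
int64_t alsl_w_model (uint64_t rj, uint64_t rk, unsigned sa)
{
  uint32_t low = (uint32_t) ((rj << sa) + rk);
  return (int64_t) (int32_t) low;
}

/* Illustrative model of alsl.wu: same arithmetic, but the 32-bit
   result is zero-extended instead.  */
uint64_t alsl_wu_model (uint64_t rj, uint64_t rk, unsigned sa)
{
  uint32_t low = (uint32_t) ((rj << sa) + rk);
  return (uint64_t) low;
}
```

The two models only disagree when bit 31 of the 32-bit result is set, which is the case the new pattern has to distinguish.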

Re: [pushed][PATCH v2] LoongArch: Generate the final immediate for lu12i.w, lu32i.d and lu52i.d

2025-01-10 Thread Lulu Cheng
Pushed to r15-6817. On 2025/1/10 at 10:27 AM, mengqinggang wrote: Generate 0x1010 instead of 0x101>>12 for lu12i.w. lu32i.d and lu52i.d use the same processing. gcc/ChangeLog: * config/loongarch/lasx.md: Use new loongarch_output_move. * config/loongarch/loongarch-protos.h (loongarc

Re:[pushed] [PATCH v2] LoongArch: Opitmize the cost of vec_construct.

2025-01-09 Thread Lulu Cheng
Pushed to r15-6755. On 2025/1/7 at 9:04 PM, chenxiaolong wrote: When analyzing 525 on LoongArch architecture, it was found that the for loop of hotspot function x264_pixel_satd_8x4 could not be quantized 256-bit due to the cost of vec_construct setting. After re-adjusting vec_construct, the performan

Re: [pushed] [PATCH v1] LoongArch: Generate the final immediate for lu12i.w, lu32i.d and lu52i.d

2025-01-09 Thread Lulu Cheng
On 2025/1/10 at 10:03 AM, Lulu Cheng wrote: Pushed to r15-6755. Sorry, I replied to the wrong email. On 2025/1/6 at 4:16 PM, mengqinggang wrote: Generate 0x1010 instead of 0x101>>12 for lu12i.w. lu32i.d and lu52i.d use the same processing. gcc/ChangeLog: * config/loongarch/lasx.md: U

Re:[pushed] [PATCH v1] LoongArch: Generate the final immediate for lu12i.w, lu32i.d and lu52i.d

2025-01-09 Thread Lulu Cheng
Pushed to r15-6755. On 2025/1/6 at 4:16 PM, mengqinggang wrote: Generate 0x1010 instead of 0x101>>12 for lu12i.w. lu32i.d and lu52i.d use the same processing. gcc/ChangeLog: * config/loongarch/lasx.md: Use new loongarch_output_move. * config/loongarch/loongarch-protos.h (loongarch_

Re: [PATCH] LoongArch: Adjust the cost of ADDRESS_REG_REG [PR114978].

2025-01-09 Thread Lulu Cheng
On 2025/1/8 at 11:16 PM, Xi Ruoyao wrote: On Tue, 2025-01-07 at 10:44 +0800, Lulu Cheng wrote: After changing this cost from 1 to 3, the performance of spec2006 401 473 416 465 482 can be improved by about 2% on LA664. Would this fix https://gcc.gnu.org/PR114978 (or at least make it latent)? The

Re: [PATCH v1] LoongArch: Opitmize the cost of vec_construct.

2025-01-07 Thread Lulu Cheng
On 2025/1/7 at 12:47 PM, chenxiaolong wrote: When analyzing 525 on LoongArch architecture, it was found that the for loop of hotspot function x264_pixel_satd_8x4 could not be quantized 256-bit due to the cost of vec_construct setting. After re-adjusting vec_construct, the performance of 525 program

[PATCH 2/2] LoongArch: Implement target pragma.

2025-01-07 Thread Lulu Cheng
The target pragmas defined correspond to the target function attributes. This implementation is derived from AArch64. gcc/ChangeLog: * config/loongarch/loongarch-protos.h (loongarch_reset_previous_fndecl): Add function declaration. (loongarch_save_restore_target_globals)

Re: [PATCH] LoongArch: combine related slli operations

2025-01-07 Thread Lulu Cheng
On 2025/1/2 at 5:46 PM, Zhou Zhao wrote: If SImode reg is continuous left shifted twice, combine related instruction to one. gcc/ChangeLog: * config/loongarch/loongarch.md (extsv_ashlsi3): New template Hi, zhaozhou: The indentation here is wrong, it needs to be aligned with *.

Re: [pushed][PATCH] LoongArch: Optimize initializing fp resgister to zero

2025-01-07 Thread Lulu Cheng
Pushed to r15-6617. On 2024/12/31 at 7:33 PM, Deng Jianbo wrote: In LoongArch, currently uses instruction movgr2fr.{d|w} to move zero from fixed-point register to floating-point register for initializing fp register to zero. When LSX or LASX is enabled, we can use instruction vxor.v which has lower la

[PATCH 1/2] LoongArch: Implement target attribute.

2025-01-07 Thread Lulu Cheng
Add function attributes support for LoongArch. Currently, the following items are supported: __attribute__ ((target ("{no-}strict-align"))) __attribute__ ((target ("cmodel="))) __attribute__ ((target ("arch="))) __attribute__ ((target ("tune="))) __attribut
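
As a usage sketch of the attribute syntax listed above: the snippet below enables LSX for a single function. It assumes a LoongArch compiler that already carries this patch series, so the attribute is hidden behind `HAVE_LOONGARCH_TARGET_ATTR`, a hypothetical opt-in macro that is not defined by GCC. The function name is likewise made up for illustration.

```c
#include <stddef.h>

/* HAVE_LOONGARCH_TARGET_ATTR is a hypothetical macro for this sketch;
   define it only when building with a compiler that supports the
   attribute.  Elsewhere the attribute expands to nothing.  */
#if defined (__loongarch__) && defined (HAVE_LOONGARCH_TARGET_ATTR)
# define WITH_LSX __attribute__ ((target ("lsx")))
#else
# define WITH_LSX
#endif

/* With the attribute active, the vectorizer may use LSX inside this
   function even if the rest of the file is built without -mlsx.  */
WITH_LSX
int sum_ints (const int *v, size_t n)
{
  int total = 0;
  for (size_t i = 0; i < n; i++)
    total += v[i];
  return total;
}
```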

[PATCH 0/2] Implement target attribute and pragma.

2025-01-07 Thread Lulu Cheng
__attribute__ ((target ("{no-}lsx"))) __attribute__ ((target ("{no-}lasx"))) Lulu Cheng (2): LoongArch: Implement target attribute. LoongArch: Implement target pragma. gcc/attr-urls.def | 6 + gcc/config.gcc| 2 +-

[PATCH] LoongArch: Adjust the cost of ADDRESS_REG_REG [PR114978].

2025-01-06 Thread Lulu Cheng
After changing this cost from 1 to 3, the performance of spec2006 401 473 416 465 482 can be improved by about 2% on LA664. Add option '-maddr-reg-reg-cost='. gcc/ChangeLog: * config/loongarch/genopts/loongarch.opt.in: Add option '-maddr-reg-reg-cost='. * config/loongarch

Re:[pushed] [PATCH] LoongArch: Optimize for conditional move operations

2025-01-01 Thread Lulu Cheng
Pushed to r15-6493. On 2024/12/30 at 10:39 AM, Guo Jie wrote: The optimization example is as follows. From: if (condition) dest += 1 << 16; To: dest += (condition ? 1 : 0) << 16; It does not use maskeqz and masknez, thus reducing the number of instructions. gcc/ChangeLog: * config
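
The source-level equivalence behind this transformation can be checked in plain C. The sketch below only shows that the two forms compute the same value; the function names are illustrative, and it says nothing about which instructions GCC actually emits.

```c
#include <stdint.h>

/* "From" form: a conditional add.  */
int64_t add_if (int64_t dest, int condition)
{
  if (condition)
    dest += 1 << 16;
  return dest;
}

/* "To" form: turn the condition into 0 or 1, shift it, and always add.  */
int64_t add_shifted_flag (int64_t dest, int condition)
{
  dest += (int64_t) (condition ? 1 : 0) << 16;
  return dest;
}
```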

Re: [pushed][PATCH] LoongArch: Add some vector pack/unpack patterns

2025-01-01 Thread Lulu Cheng
Pushed to r15-6491. On 2024/12/30 at 10:38 AM, Guo Jie wrote: gcc/ChangeLog: * config/loongarch/lasx.md (vec_unpacks_lo_): Redefine. (vec_unpacku_lo_): Ditto. (lasx_vext2xv_h_b): Replaced by vec_unpack_lo_v32qi. (vec_unpack_lo_v32qi): New insn. (lasx_vext2xv_w_h)

Re:[pushed] [PATCH v2] LoongArch: Add standard patterns uabd and sabd

2025-01-01 Thread Lulu Cheng
Pushed to r15-6492. On 2024/12/30 at 3:12 PM, Guo Jie wrote: gcc/ChangeLog: * config/loongarch/lasx.md (lasx_xvabsd_s_): Remove. (abd3): New insn pattern. (lasx_xvabsd_u_): Remove. * config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vabsd_b): Rename. (

Re:[pushed] [PATCH] LoongArch: Adjust insn patterns for better combine

2025-01-01 Thread Lulu Cheng
Pushed to r15-6490. On 2024/12/30 at 10:38 AM, Guo Jie wrote: For some instruction patterns with commutative operands, the order of operands needs to be adjusted to match the rules. gcc/ChangeLog: * config/loongarch/loongarch.md (bytepick_d__rev): New combiner. (bstrpick_alsl_p

Re:[pushed] [PATCH] LoongArch: Fix bugs in insn patterns lasx_xvrepl128vei_b/h/w/d_internal

2025-01-01 Thread Lulu Cheng
Pushed to r15-6489. On 2024/12/30 at 10:37 AM, Guo Jie wrote: There are two aspects that affect the matching of instruction templates: 1. vec_duplicate is redundant in the following operations. set (match_operand:V4DI ...) (vec_duplicate:V4DI (vec_select:V4DI ...)) 2. The range of values

Re: [pushed][PATCH] LoongArch: Fix selector error in lasx_xvexth_h/w/d* patterns

2025-01-01 Thread Lulu Cheng
Pushed to r15-6488. On 2024/12/30 at 10:37 AM, Guo Jie wrote: The xvexth related instructions operate SEPARATELY according to the high and low 128 bits, and sign/zero extend the upper half of every 128 bits in src to the corresponding 128 bits in dest. For xvexth.d.w, the rule for the first element of

Re: [pushed][PATCH] LoongArch: Remove useless UNSPECs and define_mode_attrs

2025-01-01 Thread Lulu Cheng
Pushed to r15-6487. On 2024/12/30 at 10:34 AM, Guo Jie wrote: gcc/ChangeLog: * config/loongarch/lasx.md: Remove useless code. * config/loongarch/lsx.md: Ditto. --- gcc/config/loongarch/lasx.md | 66 gcc/config/loongarch/lsx.md | 35 -

Re:[pushed] [PATCH v3] LoongArch: Implement vector cbranch optab for LSX and LASX

2024-12-31 Thread Lulu Cheng
Pushed to r15-6477. On 2024/12/25 at 5:59 PM, Jiahao Xu wrote: In order to support vectorization of loops with multiple exits, this patch adds the implementation of the conditional branch optab for LoongArch LSX/LASX instructions. This patch causes the gen-vect-{2,25}.c tests to fail. This is because

Re: [pushed][PATCH v2] LoongArch: Support immediate_operand for vec_cmp

2024-12-26 Thread Lulu Cheng
Pushed to r15-6445. On 2024/12/18 at 3:45 PM, Jiahao Xu wrote: We can't vectorize the code into instructions like vslti.w that compare with immediate_operand, because we miss immediate_operand support for integer comparisons. gcc/ChangeLog: * config/loongarch/lasx.md (vec_cmp): Remove.

Re: [pushed][PATCH] LoongArch: Implement TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS hook

2024-12-25 Thread Lulu Cheng
Pushed to r15-6432. On 2024/12/17 at 10:41 AM, Jiahao Xu wrote: The hook changes the allocno class to either FP_REGS or GR_REGS depending on the mode of the register. This results in better register allocation overall, fewer spills and reduced codesize - particularly in SPEC2017 lbm. gcc/ChangeLog:

Re: [PATCH 0/5] LoongArch: CRC optimization

2024-12-17 Thread Lulu Cheng
On 2024/12/16 at 9:19 PM, Xi Ruoyao wrote: A generic CRC optimization pass has been implemented in r15-5850. But without target-specific code, it'll only optimize the CRC loop to a table lookup. With LoongArch-specific code we can do it better: for 64-bit LoongArch and the IEEE 802.3 polynomial or th

Re: [PATCH 2/5] LoongArch: Add bit reverse operations

2024-12-16 Thread Lulu Cheng
On 2024/12/17 at 12:30 PM, Xi Ruoyao wrote: On Tue, 2024-12-17 at 11:27 +0800, Lulu Cheng wrote: On 2024/12/16 at 9:20 PM, Xi Ruoyao wrote: /* snip */ +;; For HImode it's a little complicated... +(define_expand "rbithi" I didn't find rbithi's template description. Are there any tes

Re: [PATCH 2/5] LoongArch: Add bit reverse operations

2024-12-16 Thread Lulu Cheng
On 2024/12/16 at 9:20 PM, Xi Ruoyao wrote: /* snip */ +;; For HImode it's a little complicated... +(define_expand "rbithi" I didn't find rbithi's template description. Are there any test cases? + [(match_operand:HI 0 "register_operand") + (match_operand:HI 1 "register_operand")] + "" + { +r

Re: [pushed][PATCH v3] LoongArch: Mask shift offset when emit {xv,v}{srl,sll,sra} with sameimm vector

2024-11-30 Thread Lulu Cheng
Pushed to r15-5819. On 2024/11/28 at 9:26 AM, Jinyang He wrote: For {xv,v}{srl,sll,sra}, the constraint `vector_same_uimm6` cause overflow in when emit {w,h,b}. Since the number of bits shifted is the remainder of the register value, it is actually unnecessary to constrain the range. Simply mask the sh
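
The "remainder" behaviour described here can be modelled per element in scalar C. This is a sketch under the assumption that each vector shift uses only the low bits of the count that fit the element width (3 bits for .b, 4 bits for .h); the function names are illustrative and are not LSX intrinsics.

```c
#include <stdint.h>

/* One byte element of a vsll.b-style shift: only count % 8 matters.  */
uint8_t sll_b_elem (uint8_t x, unsigned count)
{
  return (uint8_t) (x << (count & 7));
}

/* One halfword element of a vsll.h-style shift: only count % 16 matters.  */
uint16_t sll_h_elem (uint16_t x, unsigned count)
{
  return (uint16_t) (x << (count & 15));
}
```

Because the hardware already reduces the count this way, masking a constant count when emitting the instruction gives the same result as an immediate that has been range-checked beforehand.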

Re:[pushed] [PATCH 2/2] LoongArch: testsuite: Fix l{a}sx-andn-iorn.c.

2024-11-30 Thread Lulu Cheng
Pushed to r15-5818. On 2024/11/26 at 4:06 PM, Lulu Cheng wrote: Add '-fdump-tree-optimized' to this testcases. gcc/testsuite/ChangeLog: * gcc.target/loongarch/lasx-andn-iorn.c: Add '-fdump-tree-optimized'. * gcc.target/loongarch/lsx-andn-iorn.c:

Re:[pushed] [PATCH 1/2] LoongArch: testsuite: Fix loongarch/vect-frint-scalar.c.

2024-11-30 Thread Lulu Cheng
Pushed to r15-5817. On 2024/11/26 at 4:06 PM, Lulu Cheng wrote: In r15-5327, change the default language version for C compilation from -std=gnu17 to -std=gnu23. ISO C99 and C11 allow ceil, floor, round and trunc, and their float and long double variants, to raise the “inexact” exception, but ISO/IEC

Re: [PATCH v3] LoongArch: Mask shift offset when emit {xv,v}{srl,sll,sra} with sameimm vector

2024-11-27 Thread Lulu Cheng
On 2024/11/28 at 9:26 AM, Jinyang He wrote: For {xv,v}{srl,sll,sra}, the constraint `vector_same_uimm6` cause overflow in when emit {w,h,b}. Since the number of bits shifted is the remainder of the register value, it is actually unnecessary to constrain the range. Simply mask the shift number with the

Re: [PATCH] LoongArch: Mask shift offset when emit {xv,v}{srl,sll,sra} with sameimm vector.

2024-11-26 Thread Lulu Cheng
On 2024/11/27 at 3:10 PM, Xi Ruoyao wrote: On Wed, 2024-11-27 at 14:24 +0800, Lulu Cheng wrote: On 2024/11/27 at 12:06 PM, Xi Ruoyao wrote: On Wed, 2024-11-27 at 11:58 +0800, Lulu Cheng wrote: --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-shift-sameimm-vec.c @@ -0,0 +1,72

Re: [PATCH] LoongArch: Mask shift offset when emit {xv,v}{srl,sll,sra} with sameimm vector.

2024-11-26 Thread Lulu Cheng
On 2024/11/27 at 12:06 PM, Xi Ruoyao wrote: On Wed, 2024-11-27 at 11:58 +0800, Lulu Cheng wrote: --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-shift-sameimm-vec.c @@ -0,0 +1,72 @@ +/* Test shift bits overflow in vector */ +/* { dg-do compile } */ +/* { dg-options "-mlas

Re: [PATCH] LoongArch: Mask shift offset when emit {xv,v}{srl,sll,sra} with sameimm vector.

2024-11-26 Thread Lulu Cheng
On 2024/11/27 at 10:14 AM, Xi Ruoyao wrote: On Tue, 2024-11-26 at 18:37 +0800, Jinyang He wrote: For {xv,v}{srl,sll,sra}, the constraint `vector_same_uimm6` cause overflow in when emit {w,h,b}. Since the number of bits shifted is the remainder of the register value, it is actually unnecessary to const

[PATCH 1/2] LoongArch: testsuite: Fix loongarch/vect-frint-scalar.c.

2024-11-26 Thread Lulu Cheng
In r15-5327, change the default language version for C compilation from -std=gnu17 to -std=gnu23. ISO C99 and C11 allow ceil, floor, round and trunc, and their float and long double variants, to raise the “inexact” exception, but ISO/IEC TS 18661-1:2014, the C bindings to IEEE 754-2008, as integra

[PATCH 2/2] LoongArch: testsuite: Fix l{a}sx-andn-iorn.c.

2024-11-26 Thread Lulu Cheng
Add '-fdump-tree-optimized' to this testcases. gcc/testsuite/ChangeLog: * gcc.target/loongarch/lasx-andn-iorn.c: Add '-fdump-tree-optimized'. * gcc.target/loongarch/lsx-andn-iorn.c: Likewise. --- gcc/testsuite/gcc.target/loongarch/lasx-andn-iorn.c | 2 +- gcc/test

[PATCH] Regenerate opt urls for r15-5584.

2024-11-22 Thread Lulu Cheng
gcc/ChangeLog: * config/g.opt.urls: Regenerate. * config/i386/nto.opt.urls: Regenerate. * config/riscv/riscv.opt.urls: Regenerate. * config/rx/rx.opt.urls: Regenerate. * config/sol2.opt.urls: Regenerate. --- gcc/config/g.opt.urls | 2 +- gcc/confi

Re: [pushed][PATCH 0/2] Remove redundant code.

2024-11-21 Thread Lulu Cheng
Pushed to r15-5583 and r15-5584. On 2024/11/2 at 10:48 AM, Lulu Cheng wrote: Lulu Cheng (2): LoongArch: Remove redundant code. LoongArch: Modify the document to remove options that don't exist. gcc/config/loongarch/loongarch-builtins.cc | 102 - gcc/config/loon

Re:[pushed] [PATCH] LoongArch: Fix clerical errors in lasx_xvreplgr2vr_* and lsx_vreplgr2vr_*.

2024-11-21 Thread Lulu Cheng
Pushed to r15-5581 and r14-10961. On 2024/11/2 at 3:37 PM, Lulu Cheng wrote: [x]vldi.{b/h/w/d} is not implemented in LoongArch. Use the macro [x]vrepli.{b/h/w/d} to replace. gcc/ChangeLog: * config/loongarch/lasx.md: Fixed. * config/loongarch/lsx.md: Fixed. --- gcc/config/loongarch

Re: [pushed] [PATCH] LoongArch: Make __builtin_lsx_vorn_v and __builtin_lasx_xvorn_v arguments and return values unsigned

2024-11-21 Thread Lulu Cheng
Pushed to r14-10960. On 2024/11/22 at 9:52 AM, Lulu Cheng wrote: Pushed to r15-5580. We searched in the multimedia package and found no cases of using __builtin_lsx_vorn_v or __builtin_lasx_xvorn_v, so the interface type has been modified in the form of a bugfix. Thanks! On 2024/10/31 at 11:58 PM

Re:[pushed] [PATCH] LoongArch: Make __builtin_lsx_vorn_v and __builtin_lasx_xvorn_v arguments and return values unsigned

2024-11-21 Thread Lulu Cheng
Pushed to r15-5580. We searched in the multimedia package and found no cases of using __builtin_lsx_vorn_v or __builtin_lasx_xvorn_v, so the interface type has been modified in the form of a bugfix. Thanks! On 2024/10/31 at 11:58 PM, Xi Ruoyao wrote: Align them with other vector bitwise builtins.

[PATCH] LoongArch: Fix clerical errors in lasx_xvreplgr2vr_* and lsx_vreplgr2vr_*.

2024-11-02 Thread Lulu Cheng
[x]vldi.{b/h/w/d} is not implemented in LoongArch. Use the macro [x]vrepli.{b/h/w/d} to replace. gcc/ChangeLog: * config/loongarch/lasx.md: Fixed. * config/loongarch/lsx.md: Fixed. --- gcc/config/loongarch/lasx.md | 2 +- gcc/config/loongarch/lsx.md | 2 +- 2 files changed, 2 in

Re: [PATCH] LoongArch: Make __builtin_lsx_vorn_v and __builtin_lasx_xvorn_v arguments and return values unsigned

2024-11-01 Thread Lulu Cheng
On 2024/11/2 at 1:10 AM, Xi Ruoyao wrote: On Thu, 2024-10-31 at 23:58 +0800, Xi Ruoyao wrote: /* snip */ --- Now running bootstrap & regtest. Posted early as a context for some LLVM patch. I'll post the regtest result once it finishes. Done, no regressions. The LLVM patch is https://github.com/

[PATCH 2/2] LoongArch: Modify the document to remove options that don't exist.

2024-11-01 Thread Lulu Cheng
gcc/ChangeLog: * doc/invoke.texi: Remove the non-existent option '-msmall-data-limit' and add a description of '-G'. --- gcc/doc/invoke.texi | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index fd6c0c44709..

[PATCH 1/2] LoongArch: Remove redundant code.

2024-11-01 Thread Lulu Cheng
TARGET_ASM_ALIGNED_{HI,SI,QI}_OP are defined repeatedly and deleted. gcc/ChangeLog: * config/loongarch/loongarch-builtins.cc (loongarch_builtin_vectorized_function): Delete. (LARCH_GET_BUILTIN): Delete. * config/loongarch/loongarch-protos.h (loongarch_built

[PATCH 0/2] Remove redundant code.

2024-11-01 Thread Lulu Cheng
Lulu Cheng (2): LoongArch: Remove redundant code. LoongArch: Modify the document to remove options that don't exist. gcc/config/loongarch/loongarch-builtins.cc | 102 - gcc/config/loongarch/loongarch-protos.h| 1 - gcc/config/loongarch/loongarch.cc

Re: Pushed: [PATCH] LoongArch: testsuite: Add -O for jump-table-annotate.c

2024-11-01 Thread Lulu Cheng
On 2024/11/2 at 1:36 AM, Xi Ruoyao wrote: Without optimization, GCC does not emit a jump table for the test case. I'm not sure if the test case has been wrong in the first place or something has changed in these months... It was r15-4756 that turned -fjump-tables off at the -O0 optimization level. I wa

Re:[pushed] [PATCH] LoongArch: Fix soft-float builds of libffi

2024-10-23 Thread Lulu Cheng
Pushed to r15-4588. On 2024/1/27 at 3:09 PM, Yang Yujie wrote: This patch corresponds to the upstream PR: https://github.com/libffi/libffi/pull/817 libffi/ChangeLog: * src/loongarch64/ffi.c: Avoid defining floats in struct call_context if the ABI is soft-float. --- libffi/src/loongarch

Re:[pushed] [PATCH] LoongArch: Add support to annotate tablejump

2024-10-07 Thread Lulu Cheng
Pushed to r15-4130. On 2024/7/11 at 7:43 PM, Xi Ruoyao wrote: This is per the request from the kernel developers. For generating the ORC unwind info, the objtool program needs to analyze the control flow of a .o file. If a jump table is used, objtool has to correlate the jump instruction with the tab

Re: [pushed][PATCH v1 2/2] LoongArch: Provide ashr lshr and ashl RTL pattern for vectors.

2024-08-11 Thread Lulu Cheng
Pushed to r15-2879. On 2024/8/8 at 2:47 PM, Lulu Cheng wrote: We support vashr vlshr and vashl. However, in r15-1638 support optimize x < 0 ? -1 : 0 into (signed) x >> 31 and x < 0 ? 1 : 0 into (unsigned) x >> 31. To support this optimization, vector ashr lshr and ashl need to be
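
The two rewrites mentioned here are easy to confirm on 32-bit scalars. The sketch below uses illustrative function names and assumes that right-shifting a negative `int32_t` is an arithmetic shift, which is how GCC behaves (the C standard leaves it implementation-defined).

```c
#include <stdint.h>

/* x < 0 ? -1 : 0 as an arithmetic shift: the sign bit fills the word.  */
int32_t sign_mask (int32_t x)
{
  return x >> 31;
}

/* x < 0 ? 1 : 0 as a logical shift: the sign bit lands in bit 0.  */
uint32_t sign_flag (int32_t x)
{
  return (uint32_t) x >> 31;
}
```

The vector ashr and lshr patterns play the same role for each element of a vector.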

Re: [pushed][PATCH v1 1/2] LoongArch: Drop vcond{,u} expanders.

2024-08-11 Thread Lulu Cheng
Pushed to r15-2878. On 2024/8/8 at 2:47 PM, Lulu Cheng wrote: Optabs vcond{,u} will be removed for GCC 15. Since regtest shows no fallout, dropping the expanders, now. gcc/ChangeLog: PR target/114189 * config/loongarch/lasx.md (vcondu): Delete. (vcond): Likewise

Re:[pushed] [PATCH v2] LoongArch: Use iorn and andn standard pattern names.

2024-08-11 Thread Lulu Cheng
Pushed to r15-2877. On 2024/8/2 at 9:19 AM, Lulu Cheng wrote: R15-1890 introduced new optabs iorc and andc, and its corresponding internal functions BIT_{ANDC,IORC}, and if targets defines such optabs for vector modes. And in r15-2258 the iorc and andc were renamed to iorn and andn. So we changed the

[PATCH v1 2/2] LoongArch: Provide ashr lshr and ashl RTL pattern for vectors.

2024-08-07 Thread Lulu Cheng
We support vashr vlshr and vashl. However, in r15-1638 support optimize x < 0 ? -1 : 0 into (signed) x >> 31 and x < 0 ? 1 : 0 into (unsigned) x >> 31. To support this optimization, vector ashr lshr and ashl need to be implemented. gcc/ChangeLog: * config/loongarch/loongarch.md (insn): Ad

[PATCH v1 1/2] LoongArch: Drop vcond{,u} expanders.

2024-08-07 Thread Lulu Cheng
Optabs vcond{,u} will be removed for GCC 15. Since regtest shows no fallout, dropping the expanders, now. gcc/ChangeLog: PR target/114189 * config/loongarch/lasx.md (vcondu): Delete. (vcond): Likewise. * config/loongarch/lsx.md (vcondu): Likewise. (vcond):

Re:[pushed] [PATCH 0/1] LoongArch: Remove gawk extension from a generator script.

2024-08-01 Thread Lulu Cheng
Pushed to r15-2660. On 2024/7/23 at 10:04 AM, Yang Yujie wrote: Builds for the LoongArch target fail if the system "awk" is not "gawk". This patch removes this unnecessary requirement. Thanks to Jan-Benedict Glaw for finding and reporting this issue. Yang Yujie (1): LoongArch: Remove gawk extensio

[PATCH v2] LoongArch: Use iorn and andn standard pattern names.

2024-08-01 Thread Lulu Cheng
R15-1890 introduced new optabs iorc and andc, and its corresponding internal functions BIT_{ANDC,IORC}, and if targets defines such optabs for vector modes. And in r15-2258 the iorc and andc were renamed to iorn and andn. So we changed the andn and iorn implementation templates to the standard tem

[PATCH v2] LoongArch: Use iorn and andn standard pattern names.

2024-08-01 Thread Lulu Cheng
R15-1890 introduced new optabs iorc and andc, and its corresponding internal functions BIT_{ANDC,IORC}, and if targets defines such optabs for vector modes. And in r15-2258 the iorc and andc were renamed to iorn and andn. So we changed the andn and iorn implementation templates to the standard tem

Re: [PATCH] LoongArch: Rework bswap{hi,si,di}2 definition

2024-07-31 Thread Lulu Cheng
On 2024/7/31 at 6:25 PM, Xi Ruoyao wrote: On Wed, 2024-07-31 at 16:57 +0800, Lulu Cheng wrote: On 2024/7/29 at 3:58 PM, Xi Ruoyao wrote: Per a gcc-help thread we are generating sub-optimal code for __builtin_bswap{32,64}. To fix it: - Use a single revb.d instruction for bswapdi2. - Use a single revb.2w

Re: [PATCH] LoongArch: Rework bswap{hi,si,di}2 definition

2024-07-31 Thread Lulu Cheng
On 2024/7/29 at 3:58 PM, Xi Ruoyao wrote: Per a gcc-help thread we are generating sub-optimal code for __builtin_bswap{32,64}. To fix it: - Use a single revb.d instruction for bswapdi2. - Use a single revb.2w instruction for bswapsi2 for TARGET_64BIT, revb.2h + rotri.w for !TARGET_64BIT. - Use a s
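
The revb.2h + rotri.w sequence for the !TARGET_64BIT case can be sanity-checked with a scalar model. The helpers below are illustrative stand-ins for the two instructions, written under the assumption that revb.2h swaps the bytes inside each 16-bit half and rotri.w rotates a 32-bit value right; only `__builtin_bswap32` is a real GCC builtin.

```c
#include <stdint.h>

/* Model of revb.2h: swap the two bytes inside each halfword.  */
uint32_t revb_2h_model (uint32_t x)
{
  return ((x & 0x00ff00ffu) << 8) | ((x >> 8) & 0x00ff00ffu);
}

/* Model of rotri.w: rotate a 32-bit value right by n bits.  */
uint32_t rotri_w_model (uint32_t x, unsigned n)
{
  n &= 31;
  return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Byte swap built from the two steps; swapping the halfwords with a
   16-bit rotate completes the full reversal.  */
uint32_t bswap32_two_step (uint32_t x)
{
  return rotri_w_model (revb_2h_model (x), 16);
}
```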

Re: [PATCH] LoongArch: Relax ins_zero_bitmask_operand and remove and3_align

2024-07-30 Thread Lulu Cheng
On 2024/7/29 at 3:59 PM, Xi Ruoyao wrote: In r15-1207 I was too stupid to realize we just need to relax ins_zero_bitmask_operand to allow using bstrins for aligning, instead of adding a new split. And, "> 12" in ins_zero_bitmask_operand also makes no sense: it rejects bstrins for things like "x & ~4l"

Re: [PATCH] LoongArch: Expand some SImode operations through "si3_extend" instructions if TARGET_64BIT

2024-07-30 Thread Lulu Cheng
On 2024/7/26 at 8:43 PM, Xi Ruoyao wrote: We already had "si3_extend" insns and we hoped the fwprop or combine passes can use them to remove unnecessary sign extensions. But this does not always work: for cases like x << 1 | y, the compiler tends to do (sign_extend:DI (ior:SI (ashift:SI (

Re: [PATCH] LoongArch: Use iorn and andn standard pattern names.

2024-07-28 Thread Lulu Cheng
On 2024/7/28 at 3:30 AM, Andrew Pinski wrote: On Sat, Jul 27, 2024 at 1:55 AM Lulu Cheng wrote: gcc/ChangeLog: * config/loongarch/lasx.md (xvandn3): Rename to ... (andn3): This. (xvorn3): Rename to ... (iorn3): This. * config/loongarch/loongarch

[PATCH] LoongArch: Use iorn and andn standard pattern names.

2024-07-27 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/lasx.md (xvandn3): Rename to ... (andn3): This. (xvorn3): Rename to ... (iorn3): This. * config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vandn_v): Defined as the modified name. (CODE_FOR_lsx_vorn_v): Lik

[PATCH] LoongArch: Use iorn and andn standard pattern names for scalar modes.

2024-07-27 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch.md (n): Rename to ... (n3): This. --- gcc/config/loongarch/loongarch.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md index 459ad30b9bb..4e4ddd5

Re: [PATCH] LoongArch: Use iorn and andn standard pattern names for scalar modes.

2024-07-27 Thread Lulu Cheng
On 2024/7/27 at 4:41 PM, Xi Ruoyao wrote: On Sat, 2024-07-27 at 16:36 +0800, Lulu Cheng wrote: gcc/ChangeLog: * config/loongarch/loongarch.md (n): Rename to ... (n3): This. Ok. Note that [x]vorn3 and [x]vandn3 should be renamed as well. Uh, I just forgot about them, I'm modi

[PATCH] LoongArch: Use iorn and andn standard pattern names for scalar modes.

2024-07-27 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch.md (n): Rename to ... (n3): This. --- gcc/config/loongarch/loongarch.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md index 459ad30b9bb..4e4ddd5

Re:[pushed] [PATCH] LoongArch: Organize the code related to split move and merge the same functions.

2024-07-19 Thread Lulu Cheng
Pushed to r15-2167. On 2024/7/13 at 5:04 PM, Lulu Cheng wrote: gcc/ChangeLog: * config/loongarch/loongarch-protos.h (loongarch_split_128bit_move): Delete. (loongarch_split_128bit_move_p): Delete. (loongarch_split_256bit_move): Delete

Re: [PATCH] LoongArch: Implement scalar isinf, isnormal, and isfinite via fclass

2024-07-15 Thread Lulu Cheng
On 2024/7/11 at 7:45 PM, Xi Ruoyao wrote: Doing so can avoid loading FP constants from the memory. It also partially fixes PR 66462 as fclass does not signal on sNaN. gcc/ChangeLog: * config/loongarch/loongarch.md (extendsidi2): Add ("=r", "f") alternative and use movfr2gr.s for it.

[PATCH] LoongArch: Organize the code related to split move and merge the same functions.

2024-07-13 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch-protos.h (loongarch_split_128bit_move): Delete. (loongarch_split_128bit_move_p): Delete. (loongarch_split_256bit_move): Delete. (loongarch_split_256bit_move_p): Delete. (loongarch_split_vector_move): Add a

Re:[pushed] [PATCH 2/2] LoongArch: Remove unreachable codes.

2024-07-11 Thread Lulu Cheng
Pushed to r15-1987. On 2024/7/4 at 5:56 PM, Lulu Cheng wrote: gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_split_move): Delete. (loongarch_hard_regno_mode_ok_uncached): Likewise. * config/loongarch/loongarch.md (move_doubleword_fpr): Likewise

Re:[pushed] [PATCH 1/2] LoongArch: TFmode is not allowed to be stored in the float register.

2024-07-11 Thread Lulu Cheng
Pushed to r15-1986. On 2024/7/4 at 5:56 PM, Lulu Cheng wrote: PR target/115752 gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_hard_regno_mode_ok_uncached): Replace UNITS_PER_FPVALUE with UNITS_PER_HWFPVALUE. * config/loongarch/loongarch.h

[PATCH 2/2] LoongArch: Remove unreachable codes.

2024-07-04 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_split_move): Delete. (loongarch_hard_regno_mode_ok_uncached): Likewise. * config/loongarch/loongarch.md (move_doubleword_fpr): Likewise. (load_low): Likewise. (load_high): Likewise.

[PATCH 1/2] LoongArch: TFmode is not allowed to be stored in the float register.

2024-07-04 Thread Lulu Cheng
PR target/115752 gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_hard_regno_mode_ok_uncached): Replace UNITS_PER_FPVALUE with UNITS_PER_HWFPVALUE. * config/loongarch/loongarch.h (UNITS_PER_FPVALUE): Delete. gcc/testsuite/ChangeLog: * gcc

Re:[pushed] [PATCH 2/2] LoongArch: Define loongarch_insn_cost and set the cost of movcf2gr and movgr2cf.

2024-07-02 Thread Lulu Cheng
Modified and pushed to r15-1765. On 2024/7/2 at 11:50 AM, Xi Ruoyao wrote: On Tue, 2024-07-02 at 11:22 +0800, Lulu Cheng wrote: +static int +loongarch_insn_cost (rtx_insn *insn, bool speed) +{ + rtx x = PATTERN (insn); + int cost = pattern_cost (x, speed); + + /* On LA464, prevent movcf2fr and

Re: [pushed][PATCH 1/2] LoongArch: Fix explicit-relocs-{extreme-,}tls-desc.c tests.

2024-07-02 Thread Lulu Cheng
Pushed to r15-1764. On 2024/7/2 at 11:21 AM, Lulu Cheng wrote: After r15-1579, ADD and LD/ST pairs will be merged into LDX/STX. Cause these two tests to fail. To guarantee that these two tests pass, add the compilation option '-fno-late-combine-instructions'. gcc/testsuite

Re: [PATCH 2/2] LoongArch: Define loongarch_insn_cost and set the cost of movcf2gr and movgr2cf.

2024-07-01 Thread Lulu Cheng
On 2024/7/2 at 11:50 AM, Xi Ruoyao wrote: On Tue, 2024-07-02 at 11:22 +0800, Lulu Cheng wrote: +static int +loongarch_insn_cost (rtx_insn *insn, bool speed) +{ + rtx x = PATTERN (insn); + int cost = pattern_cost (x, speed); + + /* On LA464, prevent movcf2fr and movfr2gr from merging into movcf2gr

[PATCH 2/2] LoongArch: Define loongarch_insn_cost and set the cost of movcf2gr and movgr2cf.

2024-07-01 Thread Lulu Cheng
The following two FAIL items have been fixed: FAIL: gcc.target/loongarch/movcf2gr-via-fr.c scan-assembler movcf2fr\\t\$f[0-9]+,\$fcc FAIL: gcc.target/loongarch/movcf2gr-via-fr.c scan-assembler movfr2gr.s\\t\$r4 gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_i

[PATCH 1/2] LoongArch: Fix explicit-relocs-{extreme-, }tls-desc.c tests.

2024-07-01 Thread Lulu Cheng
After r15-1579, ADD and LD/ST pairs will be merged into LDX/STX. Cause these two tests to fail. To guarantee that these two tests pass, add the compilation option '-fno-late-combine-instructions'. gcc/testsuite/ChangeLog: * gcc.target/loongarch/explicit-relocs-extreme-tls-desc.c:

Re: Ping: [PATCH v2] LoongArch: Tweak IOR rtx_cost for bstrins

2024-06-26 Thread Lulu Cheng
LGTM! Thanks very much! On 2024/6/26 at 3:53 PM, Xi Ruoyao wrote: Ping. On Sun, 2024-06-16 at 01:50 +0800, Xi Ruoyao wrote: Consider c &= 0xfff; a &= ~0xfff; b &= ~0xfff; a |= c; b |= c; This can be done with 2 bstrins instructions. But we need to recognize it in loongarc
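
To see why the quoted sequence maps onto bit-field inserts, here is a small C model. `bstrins_d_model` is an illustrative helper, written under the assumption that bstrins.d copies the low (msb - lsb + 1) bits of the source into bits msb..lsb of the destination and leaves the other bits alone.

```c
#include <stdint.h>

/* Model of bstrins.d rd, rj, msb, lsb (requires msb >= lsb, msb < 64).  */
uint64_t bstrins_d_model (uint64_t rd, uint64_t rj, unsigned msb, unsigned lsb)
{
  unsigned width = msb - lsb + 1;
  uint64_t field = width == 64 ? ~0ULL : (1ULL << width) - 1;
  return (rd & ~(field << lsb)) | ((rj & field) << lsb);
}

/* The and/and/or sequence from the message, written out for one
   destination.  */
uint64_t insert_low12_by_masks (uint64_t a, uint64_t c)
{
  c &= 0xfff;
  a &= ~0xfffULL;
  return a | c;
}
```

A single insert of bits 11..0 per destination gives the same value as the mask-and-or sequence, which is why two destinations need only two such instructions.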

Re: Ping: [PATCH] LoongArch: Only transform move/move/bstrins to srai/bstrins when -Os

2024-06-26 Thread Lulu Cheng
 ;; We always avoid the shift operation in bstrins__for_ior_mask -;; if possible, but the result may be sub-optimal when one of the masks +;; if possible, but the result may be larger when one of the masks  ;; is (1 << N) - 1 and one of the src register is the dest register.  ;; For example

Re: [PATCH] LoongArch: NFC: Dedup and sort the comment in loongarch_print_operand_reloc

2024-06-17 Thread Lulu Cheng
I think that's fine. Thanks! On 2024/6/16 at 5:11 PM, Xi Ruoyao wrote: gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_print_operand_reloc): Dedup and sort the comment describing modifiers. --- It's a non-functional change thus I've not tested it. Ok for trunk? gcc/confi

Re: [PATCH] LoongArch: Use bstrins for "value & (-1u << const)"

2024-06-12 Thread Lulu Cheng
LGTM! Thanks! On 2024/6/9 at 9:48 PM, Xi Ruoyao wrote: A move/bstrins pair is as fast as a (addi.w|lu12i.w|lu32i.d|lu52i.d)/and pair, and twice as fast as a srli/slli pair. When the src reg and the dst reg happen to be the same, the move instruction can be optimized away. gcc/ChangeLog: * conf
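
The operation under discussion is "clear the low k bits of a value". The sketch below shows two of the equivalent source forms mentioned in the message, the mask form and the shift pair, for k between 1 and 31. The function names are illustrative; the bstrins variant amounts to inserting zero into bits k-1..0 and yields the same value.

```c
#include <stdint.h>

/* value & (-1u << k): build the mask, then AND.  */
uint32_t clear_low_by_mask (uint32_t value, unsigned k)
{
  return value & (~0u << k);
}

/* The srli/slli-style pair: shift the low bits out and back in.  */
uint32_t clear_low_by_shifts (uint32_t value, unsigned k)
{
  return (value >> k) << k;
}
```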

Re: [PATCH] LoongArch: Fix mode size comparision in loongarch_expand_conditional_move

2024-06-11 Thread Lulu Cheng
On 2024/6/12 at 11:06 AM, Xi Ruoyao wrote: We were comparing a mode size with word_mode, but word_mode is an enum value thus this does not really make any sense. (Un)luckily E_DImode happens to be 8 so this seemed to work, but let's make it correct so it won't blow up when we add LA32 support or add a

Re: [PATCH 47/52] loongarch: New hook implementation loongarch_c_mode_for_floating_type

2024-06-03 Thread Lulu Cheng
Ok! Thanks! Lulu Cheng On 2024/6/3 at 11:01 AM, Kewen Lin wrote: This is to add the new port-specific hook implementation loongarch_c_mode_for_floating_type, remove macro defines for FLOAT_TYPE_SIZE and DOUBLE_TYPE_SIZE, and rename LONG_DOUBLE_TYPE_SIZE to LA_LONG_DOUBLE_TYPE_SIZE as we poison

Re: [PATCH] LoongArch: Guard REGNO with REG_P in loongarch_expand_conditional_move [PR115169]

2024-05-22 Thread Lulu Cheng
LGTM! Thanks! On 2024/5/22 at 7:24 PM, Xi Ruoyao wrote: gcc/ChangeLog: PR target/115169 * config/loongarch/loongarch.cc (loongarch_expand_conditional_move): Guard REGNO with REG_P. --- Bootstrapped with --enable-checking=all. Ok for trunk and 14? gcc/config/loongarch/loong

Re: [pushed] [PATCH v4 1/2] LoongArch: Define ISA versions

2024-05-07 Thread Lulu Cheng
dme-ov-file#target-presets has a detailed description of -march. -march=la64v1.0 enables LSX by default. On Tue, 2024-04-23 at 11:31 +0800, Lulu Cheng wrote: Pushed to r14-10083. On 2024/4/23 at 10:42 AM, Yang Yujie wrote: These ISA versions are defined as -march= parameters and are recommended

Re: [pushed][PATCH][gcc-13] LoongArch: Fix eh_return epilogue for normal returns.

2024-04-29 Thread Lulu Cheng
Pushed to r13-8661. On 2024/4/29 at 4:09 PM, Lulu Cheng wrote: From: Yang Yujie On LoongArch, the registers $r4 - $r7 (EH_RETURN_DATA_REGNO) will be saved and restored in the function prologue and epilogue if the given function calls __builtin_eh_return. This causes the return value to be

Re: [pushed][PATCH][gcc-12] LoongArch: Fix eh_return epilogue for normal returns.

2024-04-29 Thread Lulu Cheng
Pushed to r12-10403. On 2024/4/29 at 4:09 PM, Lulu Cheng wrote: From: Yang Yujie On LoongArch, the registers $r4 - $r7 (EH_RETURN_DATA_REGNO) will be saved and restored in the function prologue and epilogue if the given function calls __builtin_eh_return. This causes the return value to be

[PATCH][gcc-12] LoongArch: Fix eh_return epilogue for normal returns.

2024-04-29 Thread Lulu Cheng
From: Yang Yujie On LoongArch, the registers $r4 - $r7 (EH_RETURN_DATA_REGNO) will be saved and restored in the function prologue and epilogue if the given function calls __builtin_eh_return. This causes the return value to be overwritten on normal return paths and breaks a rare case of libgcc'

[PATCH][gcc-13] LoongArch: Fix eh_return epilogue for normal returns.

2024-04-29 Thread Lulu Cheng
From: Yang Yujie On LoongArch, the registers $r4 - $r7 (EH_RETURN_DATA_REGNO) will be saved and restored in the function prologue and epilogue if the given function calls __builtin_eh_return. This causes the return value to be overwritten on normal return paths and breaks a rare case of libgcc'

Re: [PATCH] LoongArch: Add constraints for bit string operation define_insn_and_split's [PR114861]

2024-04-26 Thread Lulu Cheng
LGTM! Thanks. On 2024/4/26 at 9:52 PM, Xi Ruoyao wrote: Without the constraints, the compiler attempts to use a stack slot as the target, causing an ICE building the kernel with -Os: drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c:3144:1: error: could not split insn (insn:TI 1764 67 1745 (s

Re: [pushed][PATCH] wwwdocs: gcc-14/changes.html: Add Loongarch changes.

2024-04-24 Thread Lulu Cheng
On 2024/4/23 at 11:43 AM, Lulu Cheng wrote: --- htdocs/gcc-14/changes.html | 156 + 1 file changed, 156 insertions(+) diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html index 9509487c..f0f0efe0 100644 --- a/htdocs/gcc-14/changes.html +++ b

[PATCH] wwwdocs: gcc-14/changes.html: Add Loongarch changes.

2024-04-22 Thread Lulu Cheng
--- htdocs/gcc-14/changes.html | 156 + 1 file changed, 156 insertions(+) diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html index 9509487c..f0f0efe0 100644 --- a/htdocs/gcc-14/changes.html +++ b/htdocs/gcc-14/changes.html @@ -877,6 +877,162 @

Re:[pushed] [PATCH v4 1/2] LoongArch: Define ISA versions

2024-04-22 Thread Lulu Cheng
Pushed to r14-10083. On 2024/4/23 at 10:42 AM, Yang Yujie wrote: These ISA versions are defined as -march= parameters and are recommended for building binaries for distribution. Detailed description of these definitions can be found at https://github.com/loongson/la-toolchain-conventions, which the Loo

Re: [pushed][PATCH v4 2/2] LoongArch: Define builtin macros for ISA evolutions

2024-04-22 Thread Lulu Cheng
Pushed to r14-10084. On 2024/4/23 at 10:42 AM, Yang Yujie wrote: Detailed description of these definitions can be found at https://github.com/loongson/la-toolchain-conventions, which the LoongArch GCC port aims to conform to. gcc/ChangeLog: * config.gcc: Add loongarch-evolution.o. * co

Re: [PATCH 1/2] LoongArch: Define ISA versions

2024-04-19 Thread Lulu Cheng
On 2024/4/19 at 10:27 PM, Xi Ruoyao wrote: On Fri, 2024-04-19 at 19:04 +0800, Yang Yujie wrote:  @table @samp  @item native -This selects the CPU to generate code for at compilation time by determining -the processor type of the compiling machine.  Using @option{-march=native} -enables all instructio

[PATCH] gcc-13/changes.html (LoongArch): Fix link.

2024-04-18 Thread Lulu Cheng
--- htdocs/gcc-13/changes.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html index 4384c329..15a309d6 100644 --- a/htdocs/gcc-13/changes.html +++ b/htdocs/gcc-13/changes.html @@ -625,7 +625,7 @@ You may also want to chec

Re: [pushed][PATCH] LoongArch: Add indexes for some compilation options.

2024-04-15 Thread Lulu Cheng
Pushed to r14-9984. On 2024/4/9 at 4:19 PM, Lulu Cheng wrote: gcc/ChangeLog: * config/loongarch/loongarch.opt.urls: Regenerate. * config/mn10300/mn10300.opt.urls: Likewise. * config/msp430/msp430.opt.urls: Likewise. * config/nds32/nds32-elf.opt.urls: Likewise

Re:[pushed] [PATCH v2] LoongArch: Enable switchable target

2024-04-09 Thread Lulu Cheng
Pushed to r14-9866. On 2024/4/8 at 4:45 PM, Yang Yujie wrote: This patch fixes the back-end context switching in cases where functions should be built with their own target contexts instead of the global one, such as LTO linking and functions with target attributes (TBD). PR target/113233 gcc/
