Re: [PATCH] [x86_64] Add microarchtecture tunable for pass_align_tight_loops

2024-11-19 Thread Mayshao-oc
> On Fri, Nov 8, 2024 at 10:21 AM Mayshao-oc wrote: > > > > > > -Original Message- > > > > From: Xi Ruoyao > > > > Sent: Thursday, November 7, 2024 1:12 PM > > > > To: Liu, Hongtao ; Mayshao-oc > > > o...@zhaoxin.com>

Re: [PATCH] [x86_64] Add microarchtecture tunable for pass_align_tight_loops

2024-11-07 Thread Mayshao-oc
> On Fri, Nov 8, 2024 at 10:21 AM Mayshao-oc wrote: > > > > -Original Message- > > > > From: Xi Ruoyao > > > > Sent: Thursday, November 7, 2024 1:12 PM > > > > To: Liu, Hongtao ; Mayshao-oc > > > o...@zhaoxin.com>; Hongtao

Re: [PATCH] [x86_64] Add microarchtecture tunable for pass_align_tight_loops

2024-11-07 Thread Mayshao-oc
> > -Original Message- > > From: Xi Ruoyao > > Sent: Thursday, November 7, 2024 1:12 PM > > To: Liu, Hongtao ; Mayshao-oc > o...@zhaoxin.com>; Hongtao Liu > > Cc: gcc-patches@gcc.gnu.org; hubi...@ucw.cz; ubiz...@gmail.com; > > richard.guent...@

Re: [PATCH] [x86_64] Add microarchtecture tunable for pass_align_tight_loops

2024-11-06 Thread Mayshao-oc
> > On Thu, Nov 7, 2024 at 10:29 AM MayShao-oc wrote: > > > > Hi all: > > For Zhaoxin, I find no improvement when enabling pass_align_tight_loops, > > and see a performance drop in some cases. > > This patch adds a new tunable to bypass pass_align_tight_loops

[PATCH] [x86_64] Add microarchtecture tunable for pass_align_tight_loops

2024-11-06 Thread MayShao-oc
Hi all: For Zhaoxin, I find no improvement when enabling pass_align_tight_loops, and see a performance drop in some cases. This patch adds a new tunable to bypass pass_align_tight_loops on Zhaoxin. Bootstrapped on x86_64. Ok for trunk? BR Mayshao gcc/ChangeLog: * config/i386/i386

[PATCH] [x86_64] Add flag to control tight loops alignment opt

2024-11-04 Thread MayShao-oc
3B cycles). So, to avoid this, I propose adding a -malign-tight-loops flag to control the tight-loop alignment optimization; we could disable this optimization by default. Bootstrapped on x86_64. Ok for trunk? BR Mayshao gcc/ChangeLog: * config/i386/i386-features.cc (ix86_align_tight_loops):

[PATCH v2] [libatomic]: Handle AVX+CX16 ZHAOXIN like intel for 16b atomic [PR104688]

2024-07-18 Thread MayShao-oc
From: mayshao Hi Jakub: Thanks for your review. We should just amend this to handle Zhaoxin. Bootstrapped/regtested on x86_64. Ok for trunk? BR Mayshao libatomic/ChangeLog: PR target/104688 * config/x86/init.c (__libat_feat1_init): Don't clear bit_A

[PATCH] [libatomic]: Handle AVX+CX16 ZHAOXIN like intel for 16b atomic [PR104688]

2024-07-11 Thread MayShao-oc
From: mayshao Hi all: We replied in PR104688 that ZHAOXIN guarantees that a 16-byte VMOVDQA on a 16-byte aligned address is atomic if the memory type of the address is WB. So there is no need to clear bit_AVX on ZHAOXIN CPUs. Bootstrapped/regtested on x86_64. Ok for trunk? BR Mayshao

Re: [PATCH] [x86_64]: Zhaoxin shijidadao enablement

2024-06-18 Thread mayshao-oc
On 5/28/24 14:15, Uros Bizjak wrote: On Mon, May 27, 2024 at 10:33 AM MayShao wrote: From: mayshao Hi all: This patch enables -march/-mtune=shijidadao; costs and tunings are set according to the characteristics of the processor. Bootstrapped/regtested on x86_64. Ok for

[PATCH] [x86_64]: Zhaoxin shijidadao enablement

2024-05-27 Thread MayShao
From: mayshao Hi all: This patch enables -march/-mtune=shijidadao; costs and tunings are set according to the characteristics of the processor. Bootstrapped/regtested on x86_64. Ok for trunk? BR Mayshao gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_zhaoxin_cpu

Re: [PATCH] invoke.texi: Clarify -march=lujiazui

2024-05-23 Thread mayshao-oc
Hi Jakub: I think the modified lujiazui description matches what actually happens, thanks. BR Mayshao Hi! Yesterday I was searching which exact CPUs are affected by the PR114576 wrong-code issue and went from the PTA_* bitmasks in GCC, so arrived at the goldmont, goldmont

Re: [PATCH] [x86_64]: Zhaoxin yongfeng enablement

2023-10-30 Thread Mayshao-oc
>On Fri, Oct 27, 2023 at 12:20 PM mayshao wrote: >> >> On 2023/10/26 17:34, Uros Bizjak wrote: >> > On Wed, Oct 25, 2023 at 8:43 AM mayshao wrote: >> >> >> >> Hi all: >> >> This patch enables -march/-mtune=yongfeng, costs and tu

[PATCH] [x86_64]: Zhaoxin yongfeng enablement

2023-10-24 Thread mayshao
Hi all: This patch enables -march/-mtune=yongfeng; costs and tunings are set according to the characteristics of the processor. We add a new md file to describe the yongfeng processor. Bootstrapped/regtested on x86_64. Ok for trunk? BR Mayshao gcc/ChangeLog: * common/config/i386

[gcc10 backport] i386: Call get_available_features for all CPUs with max_level >= 1 [PR100758]

2023-03-08 Thread mayshao
From: mayshao-oc Hi Jakub: This is a backport of the fix for PR target/100758 from mainline to the gcc10 release branch. Because the bug still exists in gcc10 on the Zhaoxin platform, where it causes ISA feature detection failures, we want to fix it as on mainline. This patch has been retested

[gcc11 backport] i386: Call get_available_features for all CPUs with max_level >= 1 [PR100758]

2023-03-08 Thread mayshao
branch on Intel, AMD, and Zhaoxin with make bootstrap and make -k check, without failures. Ok for the gcc11 branch? BR Mayshao gcc/ChangeLog: PR target/100758 * common/config/i386/cpuinfo.h (cpu_indicator_init): Call get_available_features for all CPUs with max_level >= 1, rat

[gcc12 backport] i386: Call get_available_features for all CPUs with max_level >= 1 [PR100758]

2023-03-08 Thread mayshao
From: mayshao-oc Hi Jakub: This is a backport of the fix for PR target/100758 from mainline to the gcc12 release branch. Because the bug still exists in gcc12 on the Zhaoxin platform, where it causes ISA feature detection failures, we want to fix it as on mainline. This patch has been retested

Re: [PATCH] i386: correct division modeling in lujiazui.md

2022-12-29 Thread Mayshao-oc
>Ping. If there are any questions or concerns about the patch, please let me >know: I'm interested in continuing this cleanup at least for older AMD models. > Hi Alexander: According to the SPEC CPU 2017 benchmark results, the patch looks good on lujiazui. BR Mayshao >I

Re: [PATCH] i386: correct division modeling in lujiazui.md

2022-12-20 Thread Mayshao-oc
get the result, we will give feedback right away. BR Mayshao >I noticed I had an extra line in my Changelog: > >> (lua_sseicvt_si): Ditto. > >It got there accidentally and I will drop it. > >Alexander > >On Fri, 9 Dec 2022, Alexander Monakov wrote: > >> M

Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-10-27 Thread Mayshao-oc
>> Yes, both cases exist in our products. > Good. Then we miss a CPU features detection for (vendor == > signature_CENTAUR_ebx && family < 0x07) > aka https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364. But it's not worth > it as it's a legacy hardware, > right? Yes, for legacy hardware we need to keep it working correctly, but we don't spend much time tuning its performance. > Cheers, > Martin >> >>> Thanks, >> Martin >> >> BR >> Mayshao
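The check quoted in this message can be sketched as a small classifier. This is only an illustration: the macro names, the enum, and both functions are hypothetical and are not taken from GCC's cpuinfo.h. Treating every part with the Shanghai signature as Zhaoxin, whatever its family, is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* EBX values returned by CPUID leaf 0 for the two vendor strings discussed
   in this thread.  The macro names are local stand-ins (hypothetical).  */
#define SIG_CENTAUR_EBX  0x746e6543u  /* "Cent" */
#define SIG_SHANGHAI_EBX 0x68532020u  /* "  Sh" */

enum cpu_class { CLASS_OTHER, CLASS_LEGACY_CENTAUR, CLASS_ZHAOXIN };

/* Base family field: bits 11:8 of EAX from CPUID leaf 1.  */
static unsigned
base_family (uint32_t leaf1_eax)
{
  return (leaf1_eax >> 8) & 0x0f;
}

/* Apply the condition quoted above: a Centaur signature with family < 0x07
   is legacy hardware; the remaining Centaur and all Shanghai signatures are
   treated as Zhaoxin (an assumption made for this sketch).  */
static enum cpu_class
classify (uint32_t vendor_ebx, unsigned family)
{
  assert (family <= 0x0f);  /* The base family is a 4-bit field.  */
  if (vendor_ebx == SIG_CENTAUR_EBX && family < 0x07)
    return CLASS_LEGACY_CENTAUR;
  if (vendor_ebx == SIG_CENTAUR_EBX || vendor_ebx == SIG_SHANGHAI_EBX)
    return CLASS_ZHAOXIN;
  return CLASS_OTHER;
}
```

A caller would pass EBX from leaf 0 and the result of base_family applied to EAX from leaf 1, for example classify (ebx, base_family (eax)).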

Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-10-26 Thread Mayshao-oc
family == 0x7 ? > Similarly, are there any signature_SHANGHAI_ebx models with family < 0x7 ? Yes, both cases exist in our products. > Thanks, > Martin BR Mayshao

Re: [PATCH] [x86_64]: Zhaoxin lujiazui enablement

2022-05-17 Thread Mayshao-oc
> On Tue, May 17, 2022 at 5:15 AM mayshao wrote: >> Hi Uros: >> This patch fixes the Zhaoxin CPU vendor ID detection problem and adds >> Zhaoxin "lujiazui" processor support. >> Currently gcc can't recognize Zhaoxin CPU(vendor ID "Centa

[PATCH] [x86_64]: Zhaoxin lujiazui enablement

2022-05-16 Thread mayshao
Hi Uros: This patch fixes the Zhaoxin CPU vendor ID detection problem and adds Zhaoxin "lujiazui" processor support. Currently gcc can't recognize Zhaoxin CPUs (vendor ID "CentaurHauls" and "Shanghai") if the user passes the -march=native option, which is confusing for users. This patch enabl

Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-03-28 Thread Mayshao-oc
On Sun, Mar 27, 2022 at 5:15 PM Uros Bizjak wrote: > On Fri, Mar 25, 2022 at 3:08 AM MayShao wrote: > > > > Hi Uros, > > > > This patch fixes the Zhaoxin CPU vendor ID detection problem > > and adds Zhaoxin "lujiazui" processor support and tuning. > >

[PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-03-24 Thread MayShao
Hi Uros, This patch fixes the Zhaoxin CPU vendor ID detection problem and adds Zhaoxin "lujiazui" processor support and tuning. Currently gcc can't recognize Zhaoxin CPUs (vendor ID "CentaurHauls" and "Shanghai") and wrongly identifies Zhaoxin "lujiazui" as Intel core2 or i386, which is confusing for use
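For readers unfamiliar with how the vendor ID mentioned here is obtained: CPUID leaf 0 returns the 12-character vendor string split across EBX, EDX and ECX, in that order. The sketch below rebuilds the string from the three register values and compares it with the two Zhaoxin IDs. The function names are hypothetical and this is not the code from the patch. As far as I know the second ID is padded with two spaces on each side, i.e. "  Shanghai  ".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the four bytes of one register, lowest byte first.  */
static void
put_reg (uint32_t reg, char *dst)
{
  for (int i = 0; i < 4; i++)
    dst[i] = (char) ((reg >> (8 * i)) & 0xff);
}

/* Rebuild the vendor string from the CPUID leaf 0 registers.
   The string is stored in EBX, EDX, ECX order.  */
static void
vendor_string (uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
  assert (out != NULL);
  put_reg (ebx, out);
  put_reg (edx, out + 4);
  put_reg (ecx, out + 8);
  out[12] = '\0';
}

/* Hypothetical helper: nonzero if the string is one of the two vendor IDs
   named in the message above.  */
static int
is_zhaoxin_vendor_id (const char *vendor)
{
  return strcmp (vendor, "CentaurHauls") == 0
	 || strcmp (vendor, "  Shanghai  ") == 0;
}
```

On real hardware the three values would come from executing CPUID with EAX = 0; here they are plain arguments so the sketch stays portable.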