[Bug tree-optimization/114760] New: traling zero count detection failure

2024-04-17 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114760 Bug ID: 114760 Summary: traling zero count detection failure Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimi

[Bug tree-optimization/98138] BB vect fail to SLP one case

2023-10-04 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98138 --- Comment #12 from Jiangning Liu --- Hi Richi, > That said, "failure" to identify the common (vector) load is known > and I do have experimental patches trying to address that but did > not yet arrive at a conclusive "best" approach. It was

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-08-14 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671 --- Comment #11 from Jiangning Liu --- Hi Wilco, > "it means we will need a linker optimization to remove those redundant BTIs > (eg. by changing them into NOPs)" It will be only for performance optimization, right? If we don't care about pe

[Bug tree-optimization/109603] New: Vectorization failure for a small loop containing a simple branch

2023-04-24 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109603 Bug ID: 109603 Summary: Vectorization failure for a small loop containing a simple branch Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal

[Bug rtl-optimization/109343] New: invalid if conversion optimization for aarch64

2023-03-30 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109343 Bug ID: 109343 Summary: invalid if conversion optimization for aarch64 Product: gcc Version: rust/master Status: UNCONFIRMED Severity: normal Priority: P3 Compo

[Bug tree-optimization/89430] A missing ifcvt optimization to generate csel

2022-11-11 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89430 --- Comment #17 from Jiangning Liu --- Yes. > -----Original Message----- > From: tnfchris at gcc dot gnu.org > Sent: Friday, November 11, 2022 4:48 PM > To: JiangNing Liu > Subject: [Bug tree-optimization/89430] A missing ifcvt optimization t

[Bug c/106823] New: #pragma GCC diagnostic ignored "-Wattribute-warning" doesn't work for -flto

2022-09-03 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106823 Bug ID: 106823 Summary: #pragma GCC diagnostic ignored "-Wattribute-warning" doesn't work for -flto Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: no

[Bug rtl-optimization/98782] [11/12 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies

2021-11-28 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782 --- Comment #7 from Jiangning Liu --- Without reverting the commit g:1118a3ff9d3ad6a64bba25dc01e7703325e23d92, we still see the exchange2 performance issue on aarch64. BTW, we have been using -fno-inline-functions-called-once to get the best perform

[Bug tree-optimization/100511] Fail to remove dead code in loop

2021-05-11 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100511 --- Comment #5 from Jiangning Liu --- If we change "c3 = a" to "c3 = x->b", GCC can optimize it without IPA. It seems VRP is working for this case. $ cat tt7.c #include int a; typedef struct { int b; int count; } XX; int g; __attrib

[Bug tree-optimization/100511] Fail to remove dead code in loop

2021-05-10 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100511 --- Comment #2 from Jiangning Liu --- Then why can't gcc optimize this case either? sizeof (XX) <> sizeof(g) here. #include int a; typedef struct { int b; int count; } XX; int g; __attribute__((noinline)) void f(XX *x) { int c1

[Bug tree-optimization/100511] New: Fail to remove dead code in loop

2021-05-10 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100511 Bug ID: 100511 Summary: Fail to remove dead code in loop Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimizati

[Bug tree-optimization/99946] fail to exchange if conditions in terms of likely/unlikely probability

2021-04-06 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99946 --- Comment #1 from Jiangning Liu --- Is there any gcc pass that can deal with this simple optimization?

[Bug tree-optimization/99946] New: fail to exchange if conditions in terms of likely/unlikely probability

2021-04-06 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99946 Bug ID: 99946 Summary: fail to exchange if conditions in terms of likely/unlikely probability Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal

[Bug rtl-optimization/98782] [11 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies

2021-02-22 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782 --- Comment #4 from Jiangning Liu --- Hi Honza, Do you see any other real case problems if the patch g:1118a3ff9d3ad6a64bba25dc01e7703325e23d92 is not applied? If exchange2 is the only one affected by this patch so far, and because we have obse

[Bug tree-optimization/98598] Missed opportunity to optimize dependent loads in loops

2021-01-14 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98598 --- Comment #12 from Jiangning Liu --- MGO RFC is at https://gcc.gnu.org/pipermail/gcc/2021-January/234682.html

[Bug tree-optimization/98598] Missed opportunity to optimize dependent loads in loops

2021-01-11 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98598 --- Comment #11 from Jiangning Liu --- (In reply to rguent...@suse.de from comment #8) > On Sat, 9 Jan 2021, jiangning.liu at amperecomputing dot com wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98598 > > > > --- Comment #7 from J

[Bug tree-optimization/98598] Missed opportunity to optimize dependent loads in loops

2021-01-11 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98598 --- Comment #10 from Jiangning Liu --- (In reply to Hongtao.liu from comment #9) > It looks like a SOA/AOC opt opportunity which is discussed in > https://gcc.gnu.org/wiki/cauldron2015?action=AttachFile&do=view&target=Olga+Golovanevsky_+Memor

[Bug tree-optimization/98598] Missed opportunity to optimize dependent loads in loops

2021-01-09 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98598 --- Comment #7 from Jiangning Liu --- (In reply to rguent...@suse.de from comment #6) > On January 9, 2021 4:17:17 AM GMT+01:00, "jiangning.liu at amperecomputing > dot com" wrote: > >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98598 > > > >---

[Bug tree-optimization/98598] Missed opportunity to optimize dependent loads in loops

2021-01-08 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98598 --- Comment #5 from Jiangning Liu --- > It has to be done with care of course, cost modeling is difficult > (we need to have a good estimate of n and m or need to version > the whole nest). That said, usually we attempt the reverse transform. B

[Bug tree-optimization/98598] Missed opportunity to optimize dependent loads in loops

2021-01-08 jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98598 --- Comment #2 from Jiangning Liu --- Loop distribution can only handle very simple cases. If the inner loop has complicated control flow and other memory accesses with loop-carried dependence, it would be hard to handle. For example, int foo