https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99504
Bug ID: 99504
Summary: Missing memmove detection
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99704
--- Comment #2 from Hongtao.liu ---
How should we handle -march=native on hybrid core?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99704
--- Comment #3 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #2)
> How should we handle -march=native on hybrid core?
Never mind; I assume you mean the parts below are different on hybrid cores
02H
EAX Cache and TLB Information
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96858
Hongtao.liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96244
Hongtao.liu changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99744
--- Comment #2 from Hongtao.liu ---
in ix86_can_inline_p
static bool
ix86_can_inline_p (tree caller, tree callee)
{
tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
...
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99754
--- Comment #1 from Hongtao.liu ---
Yes, _mm_set_epi32 reverses the order of its parameters. Could you send out a
patch for review?
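A minimal sketch of the parameter ordering mentioned above, assuming an x86 target with SSE2. The SSE2 intrinsic `_mm_set_epi32` takes its arguments from the highest lane to the lowest, while `_mm_setr_epi32` takes them in memory order. The helper names `set_epi32_lane` and `set_matches_setr` are hypothetical and are not part of the bug report.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics */

/* Returns 32-bit lane `idx` (0 = lowest) of _mm_set_epi32(e3, e2, e1, e0). */
static int set_epi32_lane(int e3, int e2, int e1, int e0, int idx)
{
    int lanes[4];
    assert(idx >= 0 && idx < 4);
    __m128i v = _mm_set_epi32(e3, e2, e1, e0);
    /* Store in memory order: lanes[0] is the lowest 32 bits. */
    _mm_storeu_si128((__m128i *)lanes, v);
    return lanes[idx];
}

/* Returns nonzero if set(e3, e2, e1, e0) equals setr(e0, e1, e2, e3). */
static int set_matches_setr(int e3, int e2, int e1, int e0)
{
    __m128i a = _mm_set_epi32(e3, e2, e1, e0);
    __m128i b = _mm_setr_epi32(e0, e1, e2, e3);
    /* All four lanes equal means all 16 byte-mask bits are set. */
    return _mm_movemask_epi8(_mm_cmpeq_epi32(a, b)) == 0xFFFF;
}
```

The last argument of `_mm_set_epi32` ends up in lane 0, which is why code that builds a vector from a single loaded scalar has to pass that scalar last.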
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881
Bug ID: 99881
Summary: Regression compare -O2 -ftree-vectorize with -O2 on
SKX/CLX
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Pri
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99908
--- Comment #2 from Hongtao.liu ---
I'm testing
@@ -17759,6 +17759,35 @@ (define_insn "_pblendvb"
(set_attr "btver2_decode" "vector,vector,vector")
(set_attr "mode" "")])
+(define_split
+ [(set (match_operand:VI1_AVX2 0 "register_opera
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881
--- Comment #4 from Hongtao.liu ---
(In reply to Richard Biener from comment #3)
> But 2 element construction _should_ be cheap. What is missing is the move
> cost from GPR to XMM regs (but we do not have a good idea whether the sources
> are me
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99908
--- Comment #3 from Hongtao.liu ---
Created attachment 50517
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50517&action=edit
tested patch waiting for GCC12.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99941
--- Comment #1 from Hongtao.liu ---
If we were more concerned about the performance of the big core, the answer
would be yes.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99941
--- Comment #2 from Hongtao.liu ---
(In reply to H.J. Lu from comment #0)
> i386-options.c has
>
> #define m_ALDERLAKE (HOST_WIDE_INT_1U< #define m_CORE_AVX512 (m_SKYLAKE_AVX512 | m_CANNONLAKE \
>| m_ICELAKE_CLIENT | m_IC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99930
--- Comment #7 from Hongtao.liu ---
I'm testing
1 file changed, 30 insertions(+)
gcc/combine.c | 30 ++
modified gcc/combine.c
@@ -1811,6 +1811,33 @@ set_nonzero_bits_and_sign_copies (rtx x, const_rtx set,
void *dat
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99930
--- Comment #9 from Hongtao.liu ---
(In reply to Segher Boessenkool from comment #8)
> That patch is no good. The combination is not allowed because it is not
> known what the "use"s are *for*. Checking if something is from the constant
> pools
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19
--- Comment #4 from Hongtao.liu ---
> Oops,
> will backport r10-2664-ga9fcfec30f70c30883f53d4b1bd533fbea0e9fb2 (tigerlake
> part) to gcc9.
PTA_AVX512VP2INTERSECT is enabled in GCC10; I don't plan to backport it to GCC9, so
in GCC9 -march=native wou
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100076
Bug ID: 100076
Summary: eembc/automotive/basefp01 has 30.3% regression compare
-O2 -ftree-vectorize with -O2 on SKX/CLX
Product: gcc
Version: 11.0
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100076
--- Comment #2 from Hongtao.liu ---
(In reply to H.J. Lu from comment #1)
> Is -O3 slower than -O3 -fno-tree-vectorize? If not, why?
For this case O3 is OK, because O3 enables pass_cunroll to completely unroll
loop1/loop2/loop3, and later
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100076
--- Comment #4 from Hongtao.liu ---
Created attachment 50590
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50590&action=edit
eembc_automotive_basefp01.cpp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100089
Bug ID: 100089
Summary: [11 Performance regression ] 30% for
denbench/mp2decoddata2 with -O3
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100088
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100088
--- Comment #3 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #2)
> >
> > This issue does not exist for sse or avx512f. Setting `-march=haswell` or
> > `-mtune=haswell` on the command line also seems to fix this but neither of
> > t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100088
--- Comment #4 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #3)
> (In reply to Hongtao.liu from comment #2)
> > >
> > > This issue does not exist for sse or avx512f. Setting `-march=haswell` or
> > > `-mtune=haswell` on the comman
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100093
Bug ID: 100093
Summary: different behavior between -mtune=cpu_type and
target_attribute (“arch=cputype”)
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100076
--- Comment #6 from Hongtao.liu ---
(In reply to Richard Biener from comment #5)
> Note even when avoiding the STLF hit the vectorized version is slower.
> You can use -mtune-ctl=^sse_unaligned_load_optimal to force loading
> the lower/upper hal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100093
--- Comment #1 from Hongtao.liu ---
When ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD/STORE_OPTIMAL] is false,
GCC goes to set up the bit MASK_AVX256_SPLIT_UNALIGNED_LOAD/STORE, but when
ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD/STO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19
--- Comment #5 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #3)
> > Response from Jim Wilson:
> > Looks like a bug in gcc-9. tigerlake was added to
> > gcc/config/i386/driver-i386.c but not to the arch_names_table in i386.c. I
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100093
--- Comment #2 from Hongtao.liu ---
Created attachment 50611
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50611&action=edit
tested patch waiting for GCC12.
[i386] MASK_AVX256_SPLIT_UNALIGNED_STORE/LOAD should be cleared in
opts->x_targ
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98348
--- Comment #21 from Hongtao.liu ---
(In reply to Dávid Bolvanský from comment #20)
> Some small regression (missed opportunity to use vptestnmd):
>
> Current trunk
>
> compare(unsigned int __vector(16)):
> vpxor xmm1, xmm1, xmm1
> vpcmpd k
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100173
Bug ID: 100173
Summary: telecom/viterb00data_1 has 16.92% regression compared
O2 -ftree-vectorize -fvect-cost-model=very-cheap to O2
on CLX/ICX, 9% regression on znver3
Pr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94680
--- Comment #4 from Hongtao.liu ---
Let me do this.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100093
Hongtao.liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98911
--- Comment #2 from Hongtao.liu ---
Fixed in GCC12.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100253
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #7 f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97249
--- Comment #4 from Hongtao.liu ---
(In reply to Richard Biener from comment #3)
> Guess you want to figure what built the (vec_select:V8QI (V16QI)) and if
> it was appropriately simplified (and simplify_rtx would handle this case).
> In any case
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97286
--- Comment #2 from Hongtao.liu ---
Seems like a similar issue to PR97366?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96849
--- Comment #4 from Hongtao.liu ---
Fixed in GCC11 by
https://gcc.gnu.org/g:1aa71af09350b9ff4d2fad88a440b682545682ec
commit r11-2947-g1aa71af09350b9ff4d2fad88a440b682545682ec
Author: liuhongt
Date: Tue Aug 11 11:05:40 2020 +0800
Refine
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97249
--- Comment #6 from Hongtao.liu ---
We already have the code below in simplify-rtx.c; it seems we can also handle
this situation here.
---
3954 case VEC_SELECT:
3955 if (!VECTOR_MODE_P (mode))
3956 {
3957 gcc_assert (VECTOR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97387
--- Comment #2 from Hongtao.liu ---
Same issue as PR93990?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97249
--- Comment #7 from Hongtao.liu ---
I'm testing
---
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 869f0d11b2e..9c397157f28 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -4170,6 +4170,33 @@ simplify_binary_operation_1 (e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194
--- Comment #15 from Hongtao.liu ---
I'm working on adding the expander, and I've encountered a problem.
For V32HI vec_set with a constant index, the expander exists under
TARGET_AVX512F, but for a variable index, the expander should exist under
TARGET_AV
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194
--- Comment #16 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #15)
> I'm working on add the expander, i encounter a problem.
>
> for V32HI vec_set with constant index, the expander existed under
> TARGET_AVX512F, but for variable in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194
--- Comment #19 from Hongtao.liu ---
(In reply to Richard Biener from comment #17)
> (In reply to Hongtao.liu from comment #15)
> > I'm working on add the expander, i encounter a problem.
> >
> > for V32HI vec_set with constant index, the expand
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194
--- Comment #20 from Hongtao.liu ---
(In reply to Richard Biener from comment #18)
> (In reply to Hongtao.liu from comment #16)
> > (In reply to Hongtao.liu from comment #15)
> > > I'm working on add the expander, i encounter a problem.
> > >
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97366
--- Comment #6 from Hongtao.liu ---
(In reply to Alexander Monakov from comment #5)
> afaict LRA is just following IRA decisions, and IRA allocates that pseudo to
> memory due to costs.
>
> Not sure where strange cost is coming from, but it depe
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97506
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #2 f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97506
--- Comment #5 from Hongtao.liu ---
(In reply to Jakub Jelinek from comment #4)
> Yeah. On the other side, they don't need to try hard to optimize it because
> normally it should be simplified already. So, e.g. the above patch is fine
> if it w
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97506
--- Comment #7 from Hongtao.liu ---
Should I backport to GCC10?
Although it's only exposed in GCC11, it's still a potential bug in GCC10.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97249
Hongtao.liu changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97521
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #6 f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97521
--- Comment #10 from Hongtao.liu ---
Speaking of how to represent the V*BImode constants, I think we need to
extend the vector_size attribute to handle something like
---
typedef bool v8bi __attribute__ ((vector_size (1)));
---
currently there wo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97532
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #6 f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97532
--- Comment #8 from Hongtao.liu ---
(In reply to Jakub Jelinek from comment #7)
> memory_operand calls general_operand which for MEM does:
> /* Use the mem's mode, since it will be reloaded thus. LRA can
> generate move insn with
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97540
--- Comment #2 from Hongtao.liu ---
2588 /* For special_memory_operand, there could be a memory operand
inside,
2589 and it would cause a mismatch for constraint_satisfied_p. */
2590 if (UNARY_P (op) && op == ext
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97532
--- Comment #10 from Hongtao.liu ---
Created attachment 49444
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49444&action=edit
Fix invalid address for special memory constraint
I'm testing this patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97540
--- Comment #4 from Hongtao.liu ---
Created attachment 49445
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49445&action=edit
Don't extract memory from operand for normal memory constraint.
I'm testing this patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97606
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #1 f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97603
--- Comment #1 from Hongtao.liu ---
Shouldn't it be marked as a target issue for x86?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97603
--- Comment #2 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #1)
> Shouldn't it be marked as target issue for x86?
Or do you mean that the middle-end should transform the code to
int g();
int f(int a, int b)
{
int c = a - b;
if (c)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97540
--- Comment #5 from Hongtao.liu ---
The patch is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557143.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97532
--- Comment #13 from Hongtao.liu ---
(In reply to Tom de Vries from comment #12)
> (In reply to Hongtao.liu from comment #10)
> > Created attachment 49444 [details]
> > Fix invalid address for special memory constraint
> >
> > I'm testing this p
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97667
--- Comment #3 from Hongtao.liu ---
(In reply to Martin Liška from comment #2)
> Likely dup of PR97540.
Yes, it should be.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97540
--- Comment #7 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #5)
> The patch is posted at
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557143.html
With the above patch and
https://gcc.gnu.org/pipermail/gcc-patches/2020-Octob
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97685
Bug ID: 97685
Summary: -march=tremont should enable
MOVDIRI/MOVDIR64B/CLDEMOTE/SGX/WAITPKG.
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97642
--- Comment #3 from Hongtao.liu ---
(In reply to Jakub Jelinek from comment #1)
> The problem is that in the RTL representation there is nothing that would
> tell cse, forward propagation or combiner etc. not to optimize the
> (insn 7 6 8 2 (set
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97540
--- Comment #9 from Hongtao.liu ---
Fixed in GCC11.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97532
--- Comment #15 from Hongtao.liu ---
Fixed in GCC11.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97685
--- Comment #1 from Hongtao.liu ---
HRESET wouldn't be supported on SPR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97759
--- Comment #3 from Hongtao.liu ---
for testcase:
---
#include <stdbool.h>
bool
is_power2_popcnt (int a)
{
return __builtin_popcount (a) == 1;
}
bool
is_power2_arithmetic (int a)
{
return !(a & (a - 1)) && a;
}
---
gcc -O2 -mavx2 -S got
---
.
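As a rough cross-check of the two predicates in the testcase above, the following sketch compares them over a range of inputs. It is an illustration under stated assumptions, not part of the bug report: the `_u` variants and `predicates_agree` are hypothetical names, and the operands are `unsigned` here so that `a - 1` cannot overflow when `a` is the most negative `int`.

```c
#include <assert.h>
#include <stdbool.h>

/* Unsigned variant of the popcount-based test (GCC/Clang builtin). */
static bool is_power2_popcnt_u(unsigned a)
{
    return __builtin_popcount(a) == 1;
}

/* Unsigned variant of the arithmetic test: clears the lowest set bit. */
static bool is_power2_arithmetic_u(unsigned a)
{
    return !(a & (a - 1)) && a;
}

/* Returns true if both predicates agree on every value in [lo, hi]. */
static bool predicates_agree(unsigned lo, unsigned hi)
{
    assert(lo <= hi);
    for (unsigned a = lo;; a++) {
        if (is_power2_popcnt_u(a) != is_power2_arithmetic_u(a))
            return false;
        if (a == hi) /* explicit exit avoids wrap-around when hi is UINT_MAX */
            break;
    }
    return true;
}
```

Both forms compute the same result, so the question in the report is only about which instruction sequence the compiler emits for each.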
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97685
Hongtao.liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770
Bug ID: 97770
Summary: Missing vectorization for vpopcnt
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770
--- Comment #1 from Hongtao.liu ---
On the target side, we need to add an expander for popcountm2, where m is a vector mode
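For illustration, a loop of the following shape is the kind of candidate this is about. This is a sketch, not the testcase from the report: `popcount_array` is a hypothetical name, and whether GCC actually vectorizes it depends on the compiler version and target flags (for example, an AVX512VPOPCNTDQ-capable target selected with `-mavx512vpopcntdq`).

```c
#include <assert.h>
#include <stddef.h>

/* Element-wise population count: dst[i] = number of set bits in src[i]. */
void popcount_array(const unsigned *src, int *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = __builtin_popcount(src[i]);
}
```

Without a vector popcount expander, the compiler has no vector pattern to map the builtin onto, so each element is counted with scalar code.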
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97759
--- Comment #11 from Hongtao.liu ---
(In reply to gcc-bugs from comment #10)
> And maybe a related question:
>
> I know that an arithmetic implementation might auto-vectorize, but would a
> popcount implementation do that too?
>
> Since AVX512_
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770
--- Comment #2 from Hongtao.liu ---
After adding the expander, the loop is successfully vectorized.
---
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index b153a87fb98..e8159997c40 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770
--- Comment #3 from Hongtao.liu ---
> But for vector byte/word/quadword, vectorizer still use vpopcntd, but not
> vpopcnt{b,w,q}, missing corresponding ifn?
We don't have __builtin_popcount{w,b}, but we have __builtin_popcountl.
for testcase
--
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770
--- Comment #5 from Hongtao.liu ---
(In reply to Richard Biener from comment #4)
> What's missing is middle-end folding support to narrow popcount to the
> appropriate internal function call with byte/half-word width when target
> support
> is av
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97779
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #1 f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97779
--- Comment #2 from Hongtao.liu ---
patch posted at
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558578.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97779
--- Comment #4 from Hongtao.liu ---
Fixed in GCC10 and GCC9.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770
--- Comment #9 from Hongtao.liu ---
> I guess that the vectorized popcount IFN is defined to be VnDI -> VnDI
> but we want to have VnSImode results. This means the instruction is
> wrongly modeled in vectorized form?
>
Yes, because we have __
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770
--- Comment #11 from Hongtao.liu ---
A patch is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558777.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92492
--- Comment #7 from Hongtao.liu ---
I notice that TARGET_VECTORIZE_RELATED_MODE has been added and can be used to
handle the conversion; I'm working on this.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194
--- Comment #23 from Hongtao.liu ---
Fixed in GCC11; it may need a small adjustment for the modeless operand (the
variable index), as discussed in
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559213.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97873
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #3 f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97891
--- Comment #3 from Hongtao.liu ---
This problem is very similar to the one pass_rpad deals with.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97891
--- Comment #4 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #3)
> This problem is very similar to the one pass_rpad deals with.
We already have mov_xor for mov $0 to reg, so we only need to handle mov
$0 to mem.
and size for:
xor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96906
--- Comment #7 from Hongtao.liu ---
(In reply to Jakub Jelinek from comment #6)
> Implemented now for non-AVX512*. Hongtao, do you think you could have a
> look at the avx512{bw,vl}/avx512bw splitter(s)?
Yes, I'll do it. Thanks for the patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770
--- Comment #13 from Hongtao.liu ---
(In reply to Richard Biener from comment #10)
> Hmm, but
>
> DEF_INTERNAL_INT_FN (POPCOUNT, ECF_CONST | ECF_NOTHROW, popcount, unary)
>
> so there's clearly a mismatch between either the vectorizers interpre
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97642
--- Comment #6 from Hongtao.liu ---
Fixed in GCC11; GCC10 is fine, so there is no need to backport.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96906
--- Comment #9 from Hongtao.liu ---
Fixed in GCC11.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98114
Bug ID: 98114
Summary: [11 regression] FAIL:
gcc.target/i386/avx512vl-vandnpd-2.c execution test
caused by r11-5391
Product: gcc
Version: 11.0
Status:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98114
--- Comment #1 from Hongtao.liu ---
Looking at the testcase, there is a pointer type conversion:
void
CALC (double *s1, double *s2, double *r)
{
int i;
long long tmp;
for (i = 0; i < SIZE; i++)
{
tmp = (~(*(long long *) &s1[i])) & (*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167
--- Comment #3 from Hongtao.liu ---
;; _3 = __builtin_ia32_shufps (b_2(D), b_2(D), 0);
(insn 7 6 8 (set (reg:V4SF 88)
(reg/v:V4SF 86 [ b ])) "./gcc/include/xmmintrin.h":746:19 -1
(nil))
(insn 8 7 9 (set (reg:V4SF 89)
(reg/v
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98114
Hongtao.liu changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98212
Hongtao.liu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #1 f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218
Bug ID: 98218
Summary: [TARGET_MMX_WITH_SSE] Miss vec_cmpmn/vcondmn expander
for 64bit vector
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98219
Bug ID: 98219
Summary: User-interrupt return pop corrupt RIP
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98219
--- Comment #2 from Hongtao.liu ---
(In reply to H.J. Lu from comment #1)
> Created attachment 49723 [details]
> A patch
>
> Hongtao, can you take it over?
I'll validate it.