Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
Target Milestone: ---
Compile and run the following code:
#include
#define __align(n) __attribute__((aligned(n)))
__attribute__((aligned(32))) static struct {
unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71264
--- Comment #17 from Bingfeng Mei ---
OK, I will skip the vectorization check on our port then. Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71264
Bingfeng Mei changed:
What|Removed |Added
CC||bmei at broadcom dot com
--- Comment #15
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
Target Milestone: ---
For the following example:
#include
static int a, b;
static void bar()
{
asm volatile ("" : : : "memory");
}
void foo ()
{
a = 0;
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
Target Milestone: ---
#include
static int
clamp (int x, int lo, int hi)
{
return (x < lo) ? lo : ((x > hi) ? hi : x);
}
__attribute__((noinline))
short
foo (int N)
{
short value =
clamp (N,
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
Compile the following code with gcc 5.0 (
Target: x86_64-unknown-linux-gnu gcc version 5.0.0 20150226 (experimental)
[trunk revision 143368] (GCC))
~/scratch/install
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61868
Bingfeng Mei changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61868
Bingfeng Mei changed:
What|Removed |Added
Component|driver |lto
--- Comment #1 from Bingfeng Mei ---
Component: driver
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
Compile any simple file with the -frandom-seed and -flto options.
#include
extern int foo (int);
int bar (int a)
{
return a * 5;
}
int main ()
{
printf("%d\n", foo (100));
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
typedef struct
{
short real;
short imag;
} complex16_t;
void
libvector_AccSquareNorm_ref (unsigned long long *acc
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59651
--- Comment #5 from Bingfeng Mei ---
Created attachment 31559
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31559&action=edit
initial patch
Hi Tejas, vect_create_cond_for_alias_checks contains a bug in handling a
negative step. The computed
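To illustrate why a negative step matters in a run-time alias check, here is a minimal sketch in plain C. It is not the code of vect_create_cond_for_alias_checks; the helper names segment_bounds and segments_may_overlap are hypothetical, and the sketch assumes at least one iteration. With a negative step the access walks downwards from the base address, so the lowest address touched belongs to the last iteration, not the first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute the half-open byte range [lo, hi) touched by
   niters accesses of elem_size bytes each, starting at base and advancing
   by step bytes per iteration.  step may be negative; niters must be >= 1.  */
void
segment_bounds (uintptr_t base, ptrdiff_t step, size_t niters,
                size_t elem_size, uintptr_t *lo, uintptr_t *hi)
{
  ptrdiff_t span = step * (ptrdiff_t) (niters - 1);
  if (step >= 0)
    {
      *lo = base;
      *hi = base + (uintptr_t) span + elem_size;
    }
  else
    {
      /* Negative step: the last iteration touches the lowest address.  */
      *lo = base - (uintptr_t) (-span);
      *hi = base + elem_size;
    }
}

/* Hypothetical run-time test: true if the two access segments share a byte.  */
bool
segments_may_overlap (uintptr_t base_a, ptrdiff_t step_a,
                      uintptr_t base_b, ptrdiff_t step_b,
                      size_t niters, size_t elem_size)
{
  uintptr_t lo_a, hi_a, lo_b, hi_b;
  segment_bounds (base_a, step_a, niters, elem_size, &lo_a, &hi_a);
  segment_bounds (base_b, step_b, niters, elem_size, &lo_b, &hi_b);
  return lo_a < hi_b && lo_b < hi_a;
}
```

A check that always treated the base as the low end of the segment would place the first access below at [1000, 1016) instead of [988, 1004) and would miss the overlap with [984, 1000).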
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59651
--- Comment #3 from Bingfeng Mei ---
I can reproduce this on aarch64. I am still trying to understand why. I
constructed a similar test but with a positive loop step.
extern void abort (void);
int a[] = { 6, 0, 0, 0 };
int b;
int
main ()
{
for (;;)
{
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59651
--- Comment #1 from Bingfeng Mei ---
That is interesting. On x86-64, GCC does say that it cannot determine the
distance vector between a[3] and a[b] and needs a run-time aliasing test. In
the end it gives up because there are too few iterations.
note: === vect_analyze_data
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59544
Bingfeng Mei changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 59544, which changed state.
Bug 59544 Summary: Vectorizing store with negative step
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59544
What|Removed |Added
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569
--- Comment #9 from Bingfeng Mei ---
It seems a simple patch is to just bypass the permutation for a constant
operand, as vec_oprnd is a constant vector with identical elements.
Index: tree-vect-stmts.c
===
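The reasoning in this comment can be sketched in plain C. This is only an illustration under assumptions, not the patch to tree-vect-stmts.c: VF is an assumed vectorization factor and both function names are hypothetical. When every lane of the vector holds the same constant, reversing the lanes for a negative-step store yields the same vector, so the permutation can be skipped.

```c
#include <stddef.h>

#define VF 4  /* assumed vectorization factor, for illustration only */

/* Scalar loop: store a constant while walking the array backwards.  */
void
store_const_scalar (int *a, size_t n, int c)
{
  for (size_t i = n; i-- > 0;)
    a[i] = c;
}

/* Hand-written blocked form of the same loop.  The "vector" is a splat of
   c, so no reverse permutation is applied before each block store.  */
void
store_const_blocked (int *a, size_t n, int c)
{
  int splat[VF];
  for (size_t k = 0; k < VF; k++)
    splat[k] = c;               /* {c, c, c, c}: its reverse is identical */

  size_t i = n;
  for (; i >= VF; i -= VF)
    for (size_t k = 0; k < VF; k++)
      a[i - VF + k] = splat[k]; /* block store, lanes in forward order */

  while (i-- > 0)
    a[i] = c;                   /* scalar epilogue for the remainder */
}
```

Both functions leave the array in the same state for any n, which is the property the bypass relies on.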
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569
--- Comment #8 from Bingfeng Mei ---
Sorry for the regression. The assertion fires when storing a constant value
with a negative step. Permuting a constant is not the best optimization
here, so the easy way to fix it is to skip vectorizing thi
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
Created attachment 31467
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31467&action=edit
The patch against r206016
I was looking at some loops that can be vectorized by LLVM, but not G
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59249
--- Comment #4 from Bingfeng Mei ---
Even if I split one critical predecessor edge, the predicate of BB6 is still
the ORed result of two conditions from BB4 and BB5. ORing two conditions
results in a sequence of statements that cannot be vectorized. Vectorizer
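To make the ORed predicate concrete, here is a small sketch in C. The block names BB4 to BB7 follow the comment; the conditions c4 and c5, the statements in each block, and the function names are assumptions made up for illustration. BB6 is reachable either directly from BB4 or through BB5, so its predicate after if-conversion is the OR of the two incoming edge conditions.

```c
/* Branchy form.  BB4 tests c4, BB5 tests c5, and BB6 runs when either
   incoming edge is taken.  */
int
branchy (int x, int c4, int c5)
{
  if (c4)           /* BB4 -> BB5 */
    {
      x += 1;       /* BB5 */
      if (!c5)
        return x;   /* BB5 -> BB7 */
    }
  x *= 2;           /* BB6, reached from BB4 or from BB5 */
  return x;         /* BB7 */
}

/* If-converted form: every block becomes a select guarded by a predicate.  */
int
if_converted (int x, int c4, int c5)
{
  int p5 = (c4 != 0);
  int p6 = (!p5) | (p5 & (c5 != 0));  /* ORed predicate of BB6 */
  x = p5 ? x + 1 : x;
  x = p6 ? x * 2 : x;
  return x;
}
```

The two functions agree for every combination of c4 and c5; the extra statements needed to compute p6 are the kind of sequence the comment refers to.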
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59249
--- Comment #3 from Bingfeng Mei ---
Richard, I am not sure I understand how to split the edge.
      BB4
     /   \
    /     \
  BB5      |
   | \     |
   |  \    |
   |   \   |
   |    BB6
   |   /
   |  /
  BB7
Compiler (HEAD) complains "only critic
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
I am investigating loops that can be vectorized by LLVM, but not by GCC.
One example is a loop that contains more than one if-else
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57512
--- Comment #1 from Bingfeng Mei ---
Created attachment 30250
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30250&action=edit
Vectorized assembly code with unsigned char type
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
Created attachment 30249
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30249&action=edit
Unvectorized with signed char type.
GCC (I use
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258
--- Comment #7 from Bingfeng Mei 2011-12-15 10:18:06 UTC ---
Yes, the patch fixes the bug. Thanks.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49157
Summary: Unnecessary stack save/restore code generated for a
leaf function (arm-elf-gcc)
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45416
--- Comment #8 from Bingfeng Mei 2011-04-28 15:22:26 UTC ---
I am currently on vacation until 4/5/2011. I may access my mail irregularly.
Cheers,
Bingfeng Mei
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258
--- Comment #5 from Bingfeng Mei 2011-01-13 15:49:23 UTC ---
It works. But I have no idea about the debug info issue in your other comment.
> (In reply to comment #2)
> > After tried patches one-by-one, I believe the misoptimization is down to
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258
--- Comment #2 from Bingfeng Mei 2011-01-11 16:16:28 UTC ---
After trying the patches one by one, I believe the misoptimization is down to
the following patch.
Index: tree-ssa-copyrename.c
=
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258
--- Comment #1 from Bingfeng Mei 2011-01-11 13:38:13 UTC ---
Created attachment 22944
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22944
Preprocessed test case
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258
Summary: Extra instruction generated in 4.5.2
Product: gcc
Version: 4.5.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: un
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45834
--- Comment #5 from Bingfeng Mei 2010-10-18 13:53:37 UTC ---
>
> Sure, but we have other means of dealing with that (MEM_ALIAS_SET == 0).
Do you mean this check is redundant here? I dug out the ancient code (from
1997)
/* If both references
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45834
--- Comment #3 from Bingfeng Mei 2010-10-18 12:16:59 UTC ---
I think the standard specifies that a char * may alias any object, which is
why QImode is different here. But I am not sure whether a
restrict qualifier will override that r
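The aliasing point can be shown with a short C sketch. The function names are hypothetical and this is not the test case from the bug. In the first function the store goes through a character-type pointer, which is allowed to alias any object, so the compiler has to assume that writing dst[i] may change *len. In the second, by my reading of C99 6.7.3.1, restrict is a promise from the caller that the two objects do not overlap; whether a particular GCC version uses that promise here is not something the sketch demonstrates.

```c
/* dst has character type, so a store to dst[i] may legally modify *len.
   The loop bound therefore has to be treated as possibly changing.  */
void
fill_bytes (unsigned char *dst, const int *len, unsigned char value)
{
  for (int i = 0; i < *len; i++)
    dst[i] = value;
}

/* Same loop, but the caller promises that dst and len do not overlap.  */
void
fill_bytes_restrict (unsigned char *restrict dst, const int *restrict len,
                     unsigned char value)
{
  for (int i = 0; i < *len; i++)
    dst[i] = value;
}
```

When the buffers really are distinct, both functions produce the same result; they differ only in what the compiler is entitled to assume.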
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45834
Bingfeng Mei changed:
What|Removed |Added
CC||richard.guenther at gmail
--- Comment #3 from bmei at broadcom dot com 2010-08-26 12:55 ---
I found that I can reproduce the bug on ARM.
ARM trunk -Os:
foo:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
mov r2
--- Comment #2 from bmei at broadcom dot com 2010-08-26 12:47 ---
Sorry, I first observed this on our target. Then I tried to reproduce it on
x86, but I forgot to turn on the optimization flags. It does work for x86.
Please delete this report. I will figure out what happens with my target
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bmei at broadcom dot com
GCC host triplet: x86_64-unknown-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45416
--- Comment #5 from bmei at broadcom dot com 2010-08-05 13:44 ---
I tried to apply the patches Richard suggested (this one alone is not enough).
It becomes a chain of too many patches in the end. I am not confident any more
about applying them to 4.5.
--
http://gcc.gnu.org/bugzilla
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bmei at broadcom dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45176
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bmei at broadcom dot com
GCC target triplet: x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44365
--- Comment #10 from bmei at broadcom dot com 2010-05-24 13:29 ---
Annotating functions with externally_visible sounds a bit difficult to
maintain. The programmer needs to know whether a function is used outside of
the LTO objects. This can change over time and extra effort is needed to keep
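For readers unfamiliar with the annotation being discussed, here is a minimal sketch. externally_visible is a real GCC function attribute; the macro EXTERNALLY_VISIBLE and the names api_entry and helper are hypothetical and exist only for this example. The attribute marks a symbol that must stay visible outside the current unit even when whole-program assumptions would otherwise let the compiler localize or drop it.

```c
/* Guard the attribute so the sketch also compiles with non-GCC compilers.  */
#if defined(__GNUC__) && !defined(__clang__)
# define EXTERNALLY_VISIBLE __attribute__((externally_visible))
#else
# define EXTERNALLY_VISIBLE
#endif

/* Internal helper: free to be inlined or removed by the optimizer.  */
static int
helper (int x)
{
  return x * 2;
}

/* Hypothetical entry point that code outside the LTO objects calls.  */
EXTERNALLY_VISIBLE int
api_entry (int x)
{
  return helper (x) + 1;
}
```

The maintenance concern raised in the comment is that every function like api_entry has to be identified and kept annotated by hand as callers move in and out of the LTO set.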
--- Comment #8 from bmei at broadcom dot com 2010-05-24 09:31 ---
I integrated Dave's patch into LD with some modification (only emit those with
LTO sections) and hacked collect2 to support that. The size gain of LTO, our
main concern, is quite limited for our application. Large a
--- Comment #6 from bmei at broadcom dot com 2010-05-04 16:54 ---
> So this is a rough first draft of the-kind-of-thing-i-was-thinking-of. We get
> collect2 to run a dummy link early, and extract the output from the
> --lto-assist flag to get a list of archive members that we
--- Comment #12 from bmei at broadcom dot com 2010-03-09 14:20 ---
It seems that this bug still fails on my build:
~/work/install-x86/bin/gcc
/projects/firepath/tools/work/bmei/gcc-head/src/gcc/testsuite/gcc.dg/pr34668-1.c
--combine -O2
/projects/firepath/tools/work/bmei/gcc-head/src
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bmei at broadcom dot com
GCC target triplet: x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43220
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bmei at broadcom dot com
GCC target triplet: x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43098
--- Comment #6 from bmei at broadcom dot com 2009-05-21 08:38 ---
I have only submitted small patches before. Adding a pass (which may need a new
command-line option and disabling the old rtl-level unrolling) seems to be a
big issue to me. I don't know what the procedure is.
My code also conta
--- Comment #4 from bmei at broadcom dot com 2009-05-20 14:17 ---
I implemented a tree-level loop-unrolling pass in our private port, which
takes advantage of the later tree ivopt pass. It produces much better code than
rtl-level loop unrolling in such scenarios. Not sure whether
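As a source-level picture of what unrolling produces, here is a hand-unrolled loop in C. It is a sketch only and says nothing about how the private pass is implemented; the function names and the unroll factor of 4 are assumptions. The point is that the four copies of the body address memory as fixed offsets from a single induction variable, which is the form a later induction-variable optimization can work with.

```c
#include <stddef.h>

/* Original loop: one element per iteration.  */
long
sum_rolled (const int *a, size_t n)
{
  long s = 0;
  for (size_t i = 0; i < n; i++)
    s += a[i];
  return s;
}

/* Unrolled by 4: one induction variable, four constant offsets.  */
long
sum_unrolled4 (const int *a, size_t n)
{
  long s = 0;
  size_t i = 0;
  for (; i + 4 <= n; i += 4)
    {
      s += a[i];
      s += a[i + 1];
      s += a[i + 2];
      s += a[i + 3];
    }
  for (; i < n; i++)   /* remainder iterations */
    s += a[i];
  return s;
}
```

Both functions return the same sum for any n, including lengths that are not a multiple of 4.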
--
Summary: Inefficient loop unrolling
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: b