http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #39 from oleg at smolsky dot net 2012-03-06 19:39:03 UTC ---
Hmm... funky. I can reproduce the issue on a newer Intel machine:
$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model :
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #38 from Jakub Jelinek 2012-03-06 17:26:24 UTC ---
Sorry, can't reproduce any performance degradation between 4.1 and 4.6
on the http://gcc.gnu.org/bugzilla/attachment.cgi?id=26814 testcase (-O3 -m64,
default -mtune=generic):
on i7-26
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #37 from oleg at smolsky dot net 2012-03-06 16:34:27 UTC ---
Hey Jakub, is this smaller example digestible?
http://gcc.gnu.org/bugzilla/attachment.cgi?id=26814
The asm output is straightforward, but I obviously have no clue about
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #36 from oleg at smolsky dot net 2012-03-03 02:59:11 UTC ---
Here is the code emitted by g++ 4.6.3 for smaller_test.cpp (attached to
the bug)
unsigned int test_constant<> proc near
mov r9d, cs:iterations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #35 from oleg at smolsky dot net 2012-03-03 02:45:15 UTC ---
Here is a smaller version. BTW, I've noticed another regression in
optimization in v4.1 when using a const global...
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #34 from oleg at smolsky dot net 2012-03-03 02:19:21 UTC ---
OK, here are some benchmark numbers for the test compiled verbatim with
g++41/g++463 -O2:
$ time ./test41
rv=4243767296
real 0m6.063s
user 0m6.058s
sys 0m0.001s
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #33 from Jakub Jelinek 2012-03-02 09:13:52 UTC ---
After Jason's patch (which needs to be kept, it was a wrong-code bugfix), we
get out of the FE the addition in int type, while previously it was in unsigned
char type. I.e.
int D.
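For readers following the int versus unsigned char point above, here is a minimal sketch of the C++ rule involved: operands of type unsigned char undergo integral promotion, so the addition is an int operation and the result is narrowed afterwards. The helper names add_constant and self_check are hypothetical and are not taken from the attached testcases; the sketch assumes CHAR_BIT == 8.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper, not taken from the attached testcase.
// Adds a constant to an 8-bit value, as a benchmark-style functor might.
inline unsigned char add_constant(unsigned char input, unsigned char k) {
    // Both operands undergo integral promotion to int, so the '+' below is an
    // int addition; the cast converts the result back to unsigned char
    // (modulo 2^CHAR_BIT).
    return static_cast<unsigned char>(input + k);
}

// The type of the sum expression is int, not unsigned char.
static_assert(std::is_same<decltype(static_cast<unsigned char>(1) +
                                    static_cast<unsigned char>(1)),
                           int>::value,
              "unsigned char + unsigned char is computed in int");

// Small self-check; assumes CHAR_BIT == 8.
inline void self_check() {
    unsigned char a = 200, b = 100;
    assert(a + b == 300);              // computed in int, no wrap-around
    assert(add_constant(a, b) == 44);  // 300 mod 256 after narrowing
}
```

Whether a compiler keeps the arithmetic in the wide type or narrows it early is an optimization decision; the sketch only shows what the language requires of the result.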
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #32 from Jakub Jelinek 2012-03-02 08:28:34 UTC ---
For me, 4.1 is equally fast to 4.6 on my CPU and on the reduced testcase I've
attached (not clear if it models what the original benchmark did right or not),
and on the trunk regresse
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #31 from oleg at smolsky dot net 2012-03-02 08:21:41 UTC ---
I don't think there is a need to actually check the result in this
benchmarkable fragment, so that will reduce the code a little. The only
thing that I was hitting is about
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #30 from Jakub Jelinek 2012-03-02 08:07:15 UTC ---
Created attachment 26809
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26809
pr50182.C
Even the reduced testcase is orders of magnitude longer than what would be
desirable for
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #29 from oleg at smolsky dot net 2012-03-02 00:54:53 UTC ---
Is it possible to target this to 4.7? These optimization issues result
in benchmarkably slower code...
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #28 from davidxl 2012-01-11 17:26:46 UTC ---
See comment 24 for shorter test case.
Summary:
1) the regression reported by Oleg in gcc4_6 and earlier versions is due to an FE
code generation difference which leads the backend to gener
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
Richard Guenther changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Last reconfirmed|
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #26 from oleg at smolsky dot net 2012-01-10 18:06:28 UTC ---
Could someone toggle the state and assign a milestone, please?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #25 from davidxl 2011-10-24 23:02:14 UTC ---
Created attachment 25600
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25600
test case for 47
Note that with gcc46, the result is even slower -- it has the RAT stall problem
which
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #24 from davidxl 2011-10-24 23:00:22 UTC ---
(In reply to comment #23)
> Here is the source preprocessed for gcc47. The test exhibits the
> slowdown mentioned in comment 11.
The problem can be reproduced with a simplified test case
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #23 from oleg at smolsky dot net 2011-10-24 21:11:21 UTC ---
Here is the source preprocessed for gcc47. The test exhibits the
slowdown mentioned in comment 11.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #22 from davidxl 2011-10-24 19:58:23 UTC ---
(In reply to comment #21)
> OK, just in case, here is my current test.
Preprocessed test case? I saw the main assembly difference that can explain the
performance diff, but want to make su
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #21 from oleg at smolsky dot net 2011-10-24 19:48:57 UTC ---
OK, just in case, here is my current test.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #20 from davidxl 2011-10-24 19:33:18 UTC ---
The test.cpp attached seems to be the same as the old version.
David
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #19 from oleg at smolsky dot net 2011-10-24 18:33:23 UTC ---
Also note that Bugzilla has quietly replaced an older attachment,
test.cpp, with a new one without adding a comment...
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #17 from oleg at smolsky dot net 2011-10-24 18:27:31 UTC ---
Created attachment 25595
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25595
test.cpp.144t.optimized
--- Comment #18 from oleg at smolsky dot net 2011-10-24 18:27:31 UT
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #16 from oleg at smolsky dot net 2011-10-24 18:27:28 UTC ---
$ /work/tools/gcc47/bin/g++ -v
Using built-in specs.
COLLECT_GCC=/work/tools/gcc47/bin/g++
COLLECT_LTO_WRAPPER=/work/tools/gcc47/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #15 from davidxl 2011-10-21 23:02:16 UTC ---
(In reply to comment #14)
> (In reply to comment #13)
> > David, it looks like we are seeing different things with v4.7... See my
> > comment 11 - I am still observing the slowdown. Do you
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #14 from davidxl 2011-09-15 17:28:10 UTC ---
(In reply to comment #13)
> David, it looks like we are seeing different things with v4.7... See my
> comment 11 - I am still observing the slowdown. Do you have access to
> v4.1 and v4.6
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #13 from oleg at smolsky dot net 2011-09-15 16:53:26 UTC ---
David, it looks like we are seeing different things with v4.7... See my
comment 11 - I am still observing the slowdown. Do you have access to
v4.1 and v4.6? Could you try re
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
Matt Hargett changed:
What|Removed |Added
CC||matt at use dot net
--- Comment #12 from M
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #11 from Oleg Smolsky 2011-08-26 00:48:02 UTC ---
Also, I have just built the same suite with GCC version 4.7 that came from
ftp://gcc.gnu.org/pub/gcc/snapshots/4.7-20110820/gcc-4.7-20110820.tar.bz2 and
the performance degradation rem
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #10 from Oleg Smolsky 2011-08-25 22:08:49 UTC ---
BTW, the uint16_t test also got slower for the very same reason. Here is the
inner-most loop generated by g++4.6:
.text:00400DA0 loc_400DA0:
.text:00400DA0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #9 from Oleg Smolsky 2011-08-25 16:26:05 UTC ---
AFAIK it's a production processor, a couple of years old. From x86info:
Family: 6 Model: 15 Stepping: 4 Type: 0 Brand: 0
CPU Model: Core 2 Duo E6600 Original OEM
Feature flags:
fpu vm
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #8 from davidxl 2011-08-25 16:17:10 UTC ---
gcc46 and gcc47 difference can be reproduced using -O2 -m64.
David
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #7 from H.J. Lu 2011-08-25 15:58:08 UTC ---
(In reply to comment #6)
>
> The processor is Intel quad core something:
>
> processor: 0
> vendor_id: GenuineIntel
> cpu family: 6
> model: 15
> model name: Genuine
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #6 from Oleg Smolsky 2011-08-25 15:25:49 UTC ---
Oh, the settings and things were discussed in the mail thread... Here is the
digest:
I have compiled and run a set of C++ benchmarks on a CentOS4/64 box using the
following compilers:
a)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #5 from Oleg Smolsky 2011-08-25 15:19:57 UTC ---
Created attachment 25103
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25103
The same test preprocessed with g++ 4.1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
Jakub Jelinek changed:
What|Removed |Added
CC||jakub at gcc dot gnu.org
--- Comment #4 f
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #3 from davidxl 2011-08-25 00:13:00 UTC ---
Caused by differences in FE generated code:
46:
D.6887 = (int) D.6886;
D.6888 = custom_constant_add::do_shift (D.6887);
D.6889 = (unsigned char) D.6888;
re
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
davidxl changed:
What|Removed |Added
CC||xinliangli at gmail dot com
--- Comment #2 from
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50182
--- Comment #1 from Oleg Smolsky 2011-08-24 22:13:26 UTC ---
Created attachment 25097
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25097
The test case
This is the preprocessed source for the test discussed in the mail thread.
38 matches