[Bug target/35189] -mno-sse4.2 turns off SSE4a

2008-02-13 Thread michael dot meissner at amd dot com
--- Comment #4 from michael dot meissner at amd dot com 2008-02-14 00:20 --- In terms of shipping systems, no AMD system supports SSSE3 right now. As I understand it, the SSSE3 instructions were in between SSE3 and SSE4.1 on Intel systems, so -mno-sse3 should turn off SSSE3, but -mno

[Bug target/35189] -mno-sse4.2 turns off SSE4a

2008-02-13 Thread michael dot meissner at amd dot com
--- Comment #2 from michael dot meissner at amd dot com 2008-02-13 23:55 --- Umm, SSE4A is completely different from SSE4/SSE4.1/SSE4.2. SSE4A is the set of instructions added with AMD's Barcelona machine, while SSE4.1 is the set of instructions added with the current generation of Intel mac

[Bug c++/35004] Adding 4 more tree codes causes a crash in building libstdc++ pre-compiled headers

2008-02-07 Thread michael dot meissner at amd dot com
--- Comment #6 from michael dot meissner at amd dot com 2008-02-07 17:22 --- Subject: RE: Adding 4 more tree codes causes a crash in building libstdc++ pre-compiled headers The problem is there are two different vector shifts. There is vector shift by a scalar amount (each element

[Bug c++/35004] Adding 4 more tree codes causes a crash in building libstdc++ pre-compiled headers

2008-01-28 Thread michael dot meissner at amd dot com
--- Comment #4 from michael dot meissner at amd dot com 2008-01-29 00:39 --- Created an attachment (id=15043) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15043&action=view) Proposed patch to fix the problem The problem is cp/cp-tree.h stores the tree_code in 8 bits,

[Bug middle-end/35004] Adding 4 more tree codes causes a crash in building libstdc++ pre-compiled headers

2008-01-28 Thread michael dot meissner at amd dot com
--- Comment #2 from michael dot meissner at amd dot com 2008-01-29 00:10 --- Created an attachment (id=15041) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15041&action=view) Traceback for 35005 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35004

[Bug c++/35004] Adding 4 more tree codes causes a crash in building libstdc++ pre-compiled headers

2008-01-28 Thread michael dot meissner at amd dot com
--- Comment #1 from michael dot meissner at amd dot com 2008-01-29 00:04 --- Created an attachment (id=15040) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15040&action=view) Preprocessed file from the build of the libstdc++ pre-compiled headers File is bzip2'ed -9.

[Bug c++/35004] New: Adding 4 more tree codes causes a crash in building libstdc++ pre-compiled headers

2008-01-28 Thread michael dot meissner at amd dot com
libstdc++ pre-compiled headers Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: michael dot meissner at amd dot com GC

[Bug target/34077] GCC -O1 -minline-all-stringops -minline-stringops-dynamically fails for spec 2006 bzip2, gobmk, and h264ref benchmarks

2007-11-13 Thread michael dot meissner at amd dot com
--- Comment #3 from michael dot meissner at amd dot com 2007-11-13 20:48 --- Created an attachment (id=14548) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14548&action=view) Patch to fix PR34077 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34077

[Bug target/34077] GCC -O1 -minline-all-stringops -minline-stringops-dynamically fails for spec 2006 bzip2, gobmk, and h264ref benchmarks

2007-11-12 Thread michael dot meissner at amd dot com
--- Comment #1 from michael dot meissner at amd dot com 2007-11-12 20:38 --- Created an attachment (id=14533) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14533&action=view) Reduced testcase for bug 34077 from 401.bzip2 This is the reduced testcase from 401.bzip2. --

[Bug target/34077] New: GCC -O1 -minline-all-stringops -minline-stringops-dynamically fails for spec 2006 bzip2, gobmk, and h264ref benchmarks

2007-11-12 Thread michael dot meissner at amd dot com
rtedBy: michael dot meissner at amd dot com GCC build triplet: x86_64-unknown-linux-gnu GCC host triplet: x86_64-unknown-linux-gnu GCC target triplet: x86_64-unknown-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34077

[Bug target/33524] SSE5 vectorized SI->DI conversions broken

2007-09-21 Thread michael dot meissner at amd dot com
--- Comment #2 from michael dot meissner at amd dot com 2007-09-21 20:51 --- Created an attachment (id=14242) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14242&action=view) Test case that replicates the file -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33524

[Bug target/33524] SSE5 vectorized SI->DI conversions broken

2007-09-21 Thread michael dot meissner at amd dot com
--- Comment #1 from michael dot meissner at amd dot com 2007-09-21 20:50 --- Created an attachment (id=14241) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14241&action=view) Patch to fix problem -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33524

[Bug target/33524] New: SSE5 vectorized SI->DI conversions broken

2007-09-21 Thread michael dot meissner at amd dot com
MED Severity: normal Priority: P3 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: michael dot meissner at amd dot com GCC build triplet: x86_64-pc-gnu-linux GCC host triplet: x86_64-pc-gnu-linux GCC target triplet: x86_64-pc-

[Bug middle-end/31307] Interaction between x86_64 builtin function and inline functions causes poor code

2007-04-12 Thread michael dot meissner at amd dot com
--- Comment #13 from michael dot meissner at amd dot com 2007-04-12 20:18 --- How hard would it be to back port the change to 4.1.3 and 4.2? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31307

[Bug c++/31307] Interaction between x86_64 builtin function and inline functions causes poor code

2007-03-21 Thread michael dot meissner at amd dot com
--- Comment #3 from michael dot meissner at amd dot com 2007-03-22 00:40 --- Created an attachment (id=13250) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13250&action=view) This is the good source compiled with -DUSE_MACRO -- http://gcc.gnu.org/bugzilla/show_bug

[Bug c++/31307] Interaction between x86_64 builtin function and inline functions causes poor code

2007-03-21 Thread michael dot meissner at amd dot com
--- Comment #2 from michael dot meissner at amd dot com 2007-03-22 00:39 --- Created an attachment (id=13249) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13249&action=view) This is the assembly language with the extra store in it -- http://gcc.gnu.org/bugzilla/show_

[Bug c++/31307] Interaction between x86_64 builtin function and inline functions causes poor code

2007-03-21 Thread michael dot meissner at amd dot com
--- Comment #1 from michael dot meissner at amd dot com 2007-03-22 00:38 --- Created an attachment (id=13248) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13248&action=view) C++ source that shows the bug This is the source that shows the bug. -- http://gcc.gnu.org/b

[Bug c++/31307] New: Interaction between x86_64 builtin function and inline functions causes poor code

2007-03-21 Thread michael dot meissner at amd dot com
: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: michael dot meissner at amd dot com GCC build triplet: x86_64-redhat-linux GCC host triplet: x86_64-redhat-linux GCC target triplet: x86_64-redhat

[Bug target/31018] TARGET_{K8,K6,GENERIC} refered to in i386.md file

2007-03-14 Thread michael dot meissner at amd dot com
--- Comment #2 from michael dot meissner at amd dot com 2007-03-14 20:59 --- Patch committed: http://gcc.gnu.org/ml/gcc-patches/2007-03/msg00951.html -- michael dot meissner at amd dot com changed: What|Removed |Added

[Bug target/31028] New: Microoptimization of the i386 and x86_64 compilers

2007-03-02 Thread michael dot meissner at amd dot com
o: unassigned at gcc dot gnu dot org ReportedBy: michael dot meissner at amd dot com GCC build triplet: x86_64-pc-gnu-linux GCC host triplet: x86_64-pc-gnu-linux GCC target triplet: x86_64-pc-gnu-linux http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31028

[Bug target/31019] New: Microoptimization of the i386 and x86_64 compilers

2007-03-01 Thread michael dot meissner at amd dot com
o: unassigned at gcc dot gnu dot org ReportedBy: michael dot meissner at amd dot com GCC build triplet: x86_64-pc-gnu-linux GCC host triplet: x86_64-pc-gnu-linux GCC target triplet: x86_64-pc-gnu-linux http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31019

[Bug target/31018] New: TARGET_{K8,K6,GENERIC} refered to in i386.md file

2007-03-01 Thread michael dot meissner at amd dot com
RGET_{K8,K6,GENERIC} refered to in i386.md file Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: minor Priority: P3 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: michael dot meissn

[Bug driver/30728] New: Building a 32-bit compiler on a 64-bit system should pass --32 flag to the assembler

2007-02-07 Thread michael dot meissner at amd dot com
nu dot org ReportedBy: michael dot meissner at amd dot com GCC build triplet: i686-pc-linux-gnu GCC host triplet: x86_64-pc-linux-gnu GCC target triplet: i686-pc-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30728

[Bug target/29775] redundant movzbl

2007-02-02 Thread michael dot meissner at amd dot com
--- Comment #1 from michael dot meissner at amd dot com 2007-02-03 04:49 --- If you look at the RTL, in the if statement, the RTL loads the QI value into the register and does the test against the QI value, and the movzbl is how the load is done. The second movzbl is to zero extend

[Bug target/30685] New: Move ASM_OUTPUT_* macros to gcc_target structure

2007-02-02 Thread michael dot meissner at amd dot com
Severity: enhancement Priority: P3 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: michael dot meissner at amd dot com GCC build triplet: x86_64-redhat-linux GCC host triplet: x86_64-redhat-linux GCC target triplet: x86_64-redhat-linux

[Bug inline-asm/28686] ebp from clobber list used as operand

2007-01-30 Thread michael dot meissner at amd dot com
--- Comment #3 from michael dot meissner at amd dot com 2007-01-30 20:17 --- Created an attachment (id=12982) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12982&action=view) Secondary error Note, this is 32-bit only. If you compile epb2.c with -fpic -m32 and no optim

[Bug target/25295] unused register saved in function prolog

2006-12-04 Thread michael dot meissner at amd dot com
--- Comment #3 from michael dot meissner at amd dot com 2006-12-04 23:21 --- I've done some analysis on the test case. The current GCC 4.2 and mainline branches no longer generate the initial push of %r8, but instead do a subq $8,%rsp. I believe in the compiler you used it di

[Bug rtl-optimization/23812] swapping DImode halves produces poor x86 register allocation

2005-10-18 Thread michael dot meissner at amd dot com
--- Comment #3 from michael dot meissner at amd dot com 2005-10-18 17:44 --- Note, since this is a rotate, the patches I proposed in 17886 will generate much better code for this one case (basically mov/mov/xchgl -- it could be improved by a peephole to do the moves directly instead of

[Bug middle-end/17886] variable rotate and long long rotate should be better optimized

2005-10-04 Thread michael dot meissner at amd dot com
--- Comment #21 from michael dot meissner at amd dot com 2005-10-04 20:46 --- Subject: RE: variable rotate and long long rotate should be better optimized Sorry, I got mixed up as to who the original poster was. SSE2 is harder to use because it deals with 128 bit items instead of

[Bug middle-end/17886] variable rotate and long long rotate should be better optimized

2005-10-04 Thread michael dot meissner at amd dot com
--- Comment #19 from michael dot meissner at amd dot com 2005-10-04 20:35 --- Subject: RE: variable rotate and long long rotate should be better optimized I almost forgot, kernels should be using -mno-mmx and -mno-sse as a matter of course (or -msoft-float). I first ran into this

[Bug middle-end/17886] variable rotate and long long rotate should be better optimized

2005-10-04 Thread michael dot meissner at amd dot com
--- Comment #18 from michael dot meissner at amd dot com 2005-10-04 20:32 --- Subject: RE: variable rotate and long long rotate should be better optimized Yep, all valid points. So I don't think it should be done by default. But I suspect the original poster's applicat

[Bug middle-end/17886] variable rotate and long long rotate should be better optimized

2005-10-04 Thread michael dot meissner at amd dot com
--- Comment #16 from michael dot meissner at amd dot com 2005-10-04 20:06 --- Created an attachment (id=9880) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=9880&action=view) Respin of 17886 patch to match new tree contents This patch is meant to apply on top of Mark

[Bug middle-end/17886] variable rotate and long long rotate should be better optimized

2005-10-04 Thread michael dot meissner at amd dot com
--- Comment #15 from michael dot meissner at amd dot com 2005-10-04 19:51 --- Note, Mark's patch as applied to the tree has a minor typo in it. The rotrdi3 define_expand uses (rotate:DI ...) instead of (rotatert:DI ...). It doesn't matter in practice, since the generator f

[Bug middle-end/17886] variable rotate and long long rotate should be better optimized

2005-10-04 Thread michael dot meissner at amd dot com
--- Comment #14 from michael dot meissner at amd dot com 2005-10-04 18:59 --- Created an attachment (id=9876) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=9876&action=view) Patch for x86 double word shifts This patch fixes the bug from the x86 side of things instead of f