[Bug middle-end/27134] [4.1/4.2 regression] ICE with floor and -ffast-math

2006-04-12 Thread uros at kss-loka dot si
--- Comment #3 from uros at kss-loka dot si 2006-04-12 17:54 --- > There seems to be something wrong with -ffast-math and floor. I have done some analysis on this. Start from expand_builtin_int_roundingfn() in builtins.c source, where we fallback to FP rounding optab. fallback_fnd

[Bug middle-end/27139] New: Optimize double INT->FP->INT conversions

2006-04-12 Thread uros at kss-loka dot si
: Optimize double INT->FP->INT conversions Product: gcc Version: 4.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: uros at kss-loka dot

[Bug middle-end/27134] [4.1 regression] ICE with floor and -ffast-math

2006-04-14 Thread uros at kss-loka dot si
--- Comment #5 from uros at kss-loka dot si 2006-04-14 07:18 --- Fixed on SVN head. -- uros at kss-loka dot si changed: What|Removed |Added Known to work

[Bug middle-end/27134] [4.1 regression] ICE with floor and -ffast-math

2006-04-16 Thread uros at kss-loka dot si
--- Comment #7 from uros at kss-loka dot si 2006-04-16 11:22 --- Fixed. -- uros at kss-loka dot si changed: What|Removed |Added Status|ASSIGNED

[Bug target/27277] New: standard i387 constant loading insns (fldz, fld1) are not generated anymore

2006-04-24 Thread uros at kss-loka dot si
get AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: uros at kss-loka dot si GCC target triplet: i386-pc-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27277

[Bug tree-optimization/27474] New: ICE: tree check: expected ssa_name, have struct_field_tag in verify_ssa, at tree-ssa.c:776

2006-05-07 Thread uros at kss-loka dot si
gcc dot gnu dot org ReportedBy: uros at kss-loka dot si GCC build triplet: i686-pc-linux-gnu GCC host triplet: i686-pc-linux-gnu GCC target triplet: i686-pc-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27474

[Bug tree-optimization/27474] ICE: tree check: expected ssa_name, have struct_field_tag in verify_ssa, at tree-ssa.c:776

2006-05-07 Thread uros at kss-loka dot si
--- Comment #1 from uros at kss-loka dot si 2006-05-07 19:30 --- Created an attachment (id=11396) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11396&action=view) Reduced cpp testcase The testcase, reduced with Delta. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27474

[Bug target/27277] [4.2 Regression] standard i387 constant loading insns (fldz, fld1) are not generated anymore

2006-05-07 Thread uros at kss-loka dot si
--- Comment #6 from uros at kss-loka dot si 2006-05-08 06:12 --- Fixed. -- uros at kss-loka dot si changed: What|Removed |Added Status|NEW

[Bug target/26726] -fivopts producing out of bounds array refs

2006-05-13 Thread uros at kss-loka dot si
--- Comment #14 from uros at kss-loka dot si 2006-05-13 08:46 --- (In reply to comment #13) > This is now a target specific problem, on i?86 and x86_64 we are left with an > offset of -4B and so referencing &a[5] in the exit condition. > This is PR target/24669. -- uro

[Bug tree-optimization/27638] New: Strange initialization of uninitialized structure part

2006-05-16 Thread uros at kss-loka dot si
Component: tree-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: uros at kss-loka dot si GCC build triplet: i686-pc-linux-gnu GCC host triplet: i686-pc-linux-gnu GCC target triplet: i686-pc-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27638

[Bug target/27790] [4.1/4.2 Regression] Unrecognizable insn with -ftree-vectorize -O1 -msse2

2006-05-29 Thread uros at kss-loka dot si
--- Comment #3 from uros at kss-loka dot si 2006-05-29 10:29 --- I'm testing a patch. -- uros at kss-loka dot si changed: What|Removed |Added Assig

[Bug target/27790] [4.1/4.2 Regression] Unrecognizable insn with -ftree-vectorize -O1 -msse2

2006-05-29 Thread uros at kss-loka dot si
--- Comment #5 from uros at kss-loka dot si 2006-05-29 11:52 --- (In reply to comment #4) > pr27790.patch > > This seems to work for me. In V4SImode case above, there is emit_insn (gen_subv4si3 (t1, cop0, cop1)); subv4si insn also needs cop0 in the

[Bug target/27827] gcc 4 produces worse x87 code on all platforms than gcc 3

2006-05-31 Thread uros at kss-loka dot si
--- Comment #7 from uros at kss-loka dot si 2006-05-31 10:56 --- IMO the fact that gcc 3.x beats 4.x on this code could be attributed to pure luck. Looking into 3.x RTL, these things can be observed: Instruction that multiplies pA0 and rB0 is described as: __.20.combine: (insn 75 73

[Bug target/27827] gcc 4 produces worse x87 code on all platforms than gcc 3

2006-06-01 Thread uros at kss-loka dot si
--- Comment #9 from uros at kss-loka dot si 2006-06-01 08:43 --- The benchmark run on a Pentium4 3.2G/800MHz FSB (32bit): vendor_id : GenuineIntel cpu family : 15 model : 2 model name : Intel(R) Pentium(R) 4 CPU 3.20GHz stepping: 9 cpu MHz

[Bug tree-optimization/27855] New: reassociation pass produces ~30% slower matrix multiplication code

2006-06-01 Thread uros at kss-loka dot si
Version: 4.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: uros at kss-loka dot si GCC build triplet: i686-pc-linux-gnu GCC host triplet:

[Bug target/27855] reassociation pass produces ~30% slower matrix multiplication code

2006-06-02 Thread uros at kss-loka dot si
--- Comment #2 from uros at kss-loka dot si 2006-06-02 10:04 --- (In reply to comment #1) > There is nothing special about reassociation at all. In fact what you are > seeing is register allocator going funky. This what you get with x87. This is also what you get wi

[Bug target/27790] [4.1 Regression] Unrecognizable insn with -ftree-vectorize -O1 -msse2

2006-06-07 Thread uros at kss-loka dot si
--- Comment #9 from uros at kss-loka dot si 2006-06-07 07:05 --- Fixed on 4.1 branch. -- uros at kss-loka dot si changed: What|Removed |Added Status|ASSIGNED

[Bug target/28007] sse autovectorizer emits wrong code involving shifts

2006-06-13 Thread uros at kss-loka dot si
--- Comment #5 from uros at kss-loka dot si 2006-06-13 07:44 --- Similar problem was solved for gcc-4.1 in PR target/22480. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28007

[Bug c++/28041] New: [gomp] ICE in g++.dg/gomp/atomic-[4,5,9].C

2006-06-15 Thread uros at kss-loka dot si
ICE in g++.dg/gomp/atomic-[4,5,9].C Product: gcc Version: 4.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: uros at kss-loka dot si GCC build

[Bug c++/28041] [gomp] ICE in g++.dg/gomp/atomic-[4,5,9].C

2006-06-19 Thread uros at kss-loka dot si
--- Comment #1 from uros at kss-loka dot si 2006-06-19 08:56 --- Works OK with gcc version 4.2.0 20060619 (experimental). -- uros at kss-loka dot si changed: What|Removed |Added

[Bug target/27827] gcc 4 produces worse x87 code on all platforms than gcc 3

2006-06-25 Thread uros at kss-loka dot si
--- Comment #20 from uros at kss-loka dot si 2006-06-26 06:31 --- (In reply to comment #15) > Can someone tell me if anyone is looking into this problem with the hopes of > fixing it? I just noticed that despite the posted code demonstrating the > problem, and verification on

[Bug target/27827] gcc 4 produces worse x87 code on all platforms than gcc 3

2006-06-26 Thread uros at kss-loka dot si
--- Comment #22 from uros at kss-loka dot si 2006-06-27 05:49 --- (In reply to comment #21) > Note that you are running the opposite of my test case: SSE vs SSE rather than > x87 vs x87. This whole bug report is about x87 performance. You can get more > detail on why I want

[Bug middle-end/24929] long long shift/mask operations should be better optimized

2006-06-27 Thread uros at kss-loka dot si
--- Comment #5 from uros at kss-loka dot si 2006-06-27 10:12 --- (In reply to comment #4) > which may be optimal. movzbl 18(%esp), %eax could be used in this particular case. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24929

[Bug middle-end/28252] pow(x,1/3.0) should be converted to cbrt(x)

2006-07-05 Thread uros at kss-loka dot si
--- Comment #2 from uros at kss-loka dot si 2006-07-05 08:25 --- Created an attachment (id=11824) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11824&action=view) Patch to implement pow(x,1.0/3.0) = cbrt(x) optimization I have the patch that implements the optimizatio

[Bug middle-end/28252] pow(x,1/3.0) should be converted to cbrt(x)

2006-07-05 Thread uros at kss-loka dot si
-- uros at kss-loka dot si changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |uros at kss-loka dot si |dot org

[Bug tree-optimization/27474] ICE: tree check: expected ssa_name, have struct_field_tag in verify_ssa, at tree-ssa.c:776

2006-07-05 Thread uros at kss-loka dot si
--- Comment #4 from uros at kss-loka dot si 2006-07-05 10:10 --- This still fails with current mainline gcc. -- uros at kss-loka dot si changed: What|Removed |Added

[Bug target/26949] [4.2 regression] worse code generated for -march=pentium4

2006-07-06 Thread uros at kss-loka dot si
--- Comment #1 from uros at kss-loka dot si 2006-07-06 08:23 --- This problem appears to be fixed in gcc version 4.2.0 20060705 (experimental). The generated asm for the loop is now: -O2 -march=pentium4 -fno-tree-ch: jmp .L2 .L3: movl %esi, -4(%edx) addl

[Bug target/26949] [4.2 regression] worse code generated for -march=pentium4

2006-07-06 Thread uros at kss-loka dot si
--- Comment #2 from uros at kss-loka dot si 2006-07-06 08:24 --- Closing it for real... -- uros at kss-loka dot si changed: What|Removed |Added Status

[Bug tree-optimization/24669] New: Loop index variable has offset of 1

2005-11-04 Thread uros at kss-loka dot si
ry: Loop index variable has offset of 1 Product: gcc Version: 4.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: uros at kss-lo

[Bug tree-optimization/24669] Loop index variable has offset of 1

2005-11-04 Thread uros at kss-loka dot si
--- Comment #1 from uros at kss-loka dot si 2005-11-04 09:20 --- -fno-ivopts produces: movl 16(%esp), %edi movl 20(%esp), %esi xorl %ebx, %ebx movl $4, %ecx <<< index starts with 1 .L2: leal (%ebx,%e

[Bug tree-optimization/24669] Loop index variable has offset of 1

2005-11-04 Thread uros at kss-loka dot si
--- Comment #3 from uros at kss-loka dot si 2005-11-04 12:19 --- Following patch to ix86_address_cost: --- i386.c (revision 106482) +++ i386.c (working copy) @@ -5396,8 +5396,12 @@ if (parts.index && GET_CODE (parts.index) == SUBREG) parts.index = SU

[Bug target/19340] Compilation SEGFAULTs with -O1 -fschedule-insns2 -fsched2-use-traces on an x86 architecture.

2005-11-07 Thread uros at kss-loka dot si
--- Comment #4 from uros at kss-loka dot si 2005-11-07 13:20 --- Patch here: http://gcc.gnu.org/ml/gcc-patches/2005-11/msg00438.html -- uros at kss-loka dot si changed: What|Removed |Added

[Bug target/19340] Compilation SEGFAULTs with -O1 -fschedule-insns2 -fsched2-use-traces on an x86 architecture.

2005-11-08 Thread uros at kss-loka dot si
--- Comment #7 from uros at kss-loka dot si 2005-11-08 08:12 --- Fixed on mainline and 4.0 branch. -- uros at kss-loka dot si changed: What|Removed |Added

[Bug c/24101] [3.4/4.0/4.1 Regression] Segfault with preprocessed source

2005-11-08 Thread uros at kss-loka dot si
--- Comment #9 from uros at kss-loka dot si 2005-11-08 10:04 --- Patch here: http://gcc.gnu.org/ml/gcc-patches/2005-11/msg00498.html -- uros at kss-loka dot si changed: What|Removed |Added

[Bug target/24265] [4.1 Regression] ICE: in extract_insn, at recog.c:2084 with -O -fgcse -fmove-loop-invariants -mtune=pentiumpro

2005-11-08 Thread uros at kss-loka dot si
--- Comment #7 from uros at kss-loka dot si 2005-11-08 12:40 --- Created an attachment (id=10173) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=10173&action=view) Patch to fix the ice This patch fixes the failure for me, but... -- http://gcc.gnu.org/bugzilla/show_bug

[Bug target/24265] [4.1 Regression] ICE: in extract_insn, at recog.c:2084 with -O -fgcse -fmove-loop-invariants -mtune=pentiumpro

2005-11-08 Thread uros at kss-loka dot si
--- Comment #8 from uros at kss-loka dot si 2005-11-08 12:53 --- > This patch fixes the failure for me, but... ... we actually gain nothing here. From .loop2_done, we have the following sequence, where mem->reg load is pushed out of the loop: (insn 21 16 39 0 (set (

[Bug target/24265] [4.1 Regression] ICE: in extract_insn, at recog.c:2084 with -O -fgcse -fmove-loop-invariants -mtune=pentiumpro

2005-11-08 Thread uros at kss-loka dot si
--- Comment #9 from uros at kss-loka dot si 2005-11-08 13:23 --- Bah... set_unique_reg_note is needed: /* If new move insn is invalid (i.e. move of const_double to 387 stack register), force constant into memory. */ if (recog_memoized (inv->insn) == -1) { rtx

[Bug c/24101] [3.4/4.0/4.1 Regression] Segfault with preprocessed source

2005-11-08 Thread uros at kss-loka dot si
--- Comment #13 from uros at kss-loka dot si 2005-11-09 07:55 --- Fixed everywhere. -- uros at kss-loka dot si changed: What|Removed |Added Status|ASSIGNED

[Bug rtl-optimization/24319] [3.4/4.0/4.1 regression] amd64 register spill error with -fschedule-insns

2005-11-09 Thread uros at kss-loka dot si
--- Comment #6 from uros at kss-loka dot si 2005-11-09 15:27 --- The problem is caused by the combination of (1) x86_64 parameter passing convention, (2) x86 instructions that _require_ parameters in specific registers and (3) sched1 scheduling pass. ad 1) x86_64 passes function

[Bug target/24315] [3.4 Regression] amd64 fails -fpeephole2

2005-11-09 Thread uros at kss-loka dot si
--- Comment #17 from uros at kss-loka dot si 2005-11-10 07:31 --- Fixed on 3.4 branch. -- uros at kss-loka dot si changed: What|Removed |Added Status

[Bug target/19340] Compilation SEGFAULTs with -O1 -fschedule-insns2 -fsched2-use-traces on an x86 architecture.

2005-11-09 Thread uros at kss-loka dot si
--- Comment #9 from uros at kss-loka dot si 2005-11-10 07:33 --- Fixed on 3.4 branch. -- uros at kss-loka dot si changed: What|Removed |Added Known to work|4.0.3

[Bug rtl-optimization/15439] ICE with -fschedule-insns2 -fsched2-use-traces

2005-11-11 Thread uros at kss-loka dot si
--- Comment #4 from uros at kss-loka dot si 2005-11-11 08:20 --- This is in fact duplicate of PR 19340. Fixed in 3.4.5. *** This bug has been marked as a duplicate of 19340 *** -- uros at kss-loka dot si changed: What|Removed |Added

[Bug target/19340] Compilation SEGFAULTs with -O1 -fschedule-insns2 -fsched2-use-traces on an x86 architecture.

2005-11-11 Thread uros at kss-loka dot si
--- Comment #10 from uros at kss-loka dot si 2005-11-11 08:20 --- *** Bug 15439 has been marked as a duplicate of this bug. *** -- uros at kss-loka dot si changed: What|Removed |Added

[Bug libgomp/24797] New: Segfault in libgomp.c/nested-1.c

2005-11-11 Thread uros at kss-loka dot si
ReportedBy: uros at kss-loka dot si GCC build triplet: i686-pc-linux-gnu GCC host triplet: i686-pc-linux-gnu GCC target triplet: i686-pc-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24797

[Bug libgomp/24797] Segfault in libgomp.c/nested-1.c

2005-11-13 Thread uros at kss-loka dot si
--- Comment #2 from uros at kss-loka dot si 2005-11-14 07:13 --- Fixed by Jakub's patch. -- uros at kss-loka dot si changed: What|Removed |Added S

[Bug target/24475] gcc.dg/tls/pr24428.c execution test and gcc.dg/tls/pr24428-2.c execution test fail on IA32

2005-11-15 Thread uros at kss-loka dot si
--- Comment #6 from uros at kss-loka dot si 2005-11-15 08:13 --- Perhaps a runtime check should be added to target-supports.exp ( check_effective_target_tls-runtime perhaps) that would check if the system is capable of running tls enabled binaries. Alternatively, my proposed patch (http

[Bug target/24475] gcc.dg/tls/pr24428.c execution test and gcc.dg/tls/pr24428-2.c execution test fail on IA32

2005-11-15 Thread uros at kss-loka dot si
-- uros at kss-loka dot si changed: What|Removed |Added CC|uros at kss-loka dot si | AssignedTo|unassigned at gcc dot gnu |uros at kss-loka dot si

[Bug target/24476] [4.1/4.2 Regression] gcc.dg/tls/pr24428.c execution test and gcc.dg/tls/pr24428-2.c execution test fail on IA64

2005-11-24 Thread uros at kss-loka dot si
--- Comment #2 from uros at kss-loka dot si 2005-11-24 08:09 --- The testsuite patch that fixes IA32 tests (and should also fix IA64 issues reported here) is at http://gcc.gnu.org/ml/gcc-patches/2005-11/msg01059.html. Patch is still waiting for review, however I can't test it on

[Bug rtl-optimization/24995] [4.1/4.2 Regression] gcc.dg/vect/vect-10.c fails for -march=athlon

2005-11-24 Thread uros at kss-loka dot si
--- Comment #2 from uros at kss-loka dot si 2005-11-24 10:19 --- This also fails for i686-pc-linux-gnu with '-march=athlon'. The patch at http://gcc.gnu.org/ml/gcc-patches/2005-11/msg01648.html fixes x86_64-pc-linux-gnu failure in original report and -march=athlon fail

[Bug target/24982] [4.1/4.2 Regression] Bootstrap failure with ICE in refers_to_regno_for_reload_p

2005-11-24 Thread uros at kss-loka dot si
--- Comment #5 from uros at kss-loka dot si 2005-11-24 10:19 --- *** Bug 24995 has been marked as a duplicate of this bug. *** -- uros at kss-loka dot si changed: What|Removed |Added

[Bug rtl-optimization/24982] [4.1/4.2 Regression] Bootstrap failure with ICE in refers_to_regno_for_reload_p

2005-11-24 Thread uros at kss-loka dot si
--- Comment #6 from uros at kss-loka dot si 2005-11-24 10:24 --- (In reply to comment #4) > I've proposed a patch to this PR in > > http://gcc.gnu.org/ml/gcc-patches/2005-11/msg01648.html > > Does it solve PR 24995? Yes, both x86_64 and -march=athlon failures. -

[Bug rtl-optimization/24982] [4.1/4.2 Regression] Bootstrap failure with ICE in refers_to_regno_for_reload_p

2005-11-24 Thread uros at kss-loka dot si
--- Comment #9 from uros at kss-loka dot si 2005-11-24 14:40 --- Critical, according to comment #7 and #8. -- uros at kss-loka dot si changed: What|Removed |Added

[Bug tree-optimization/20219] Missed optimisation sin / tan --> cos

2005-11-27 Thread uros at kss-loka dot si
--- Comment #3 from uros at kss-loka dot si 2005-11-28 07:20 --- Reopened to ... -- uros at kss-loka dot si changed: What|Removed |Added Status|RESOLVED

[Bug middle-end/20219] Missed optimisation sin / tan --> cos

2005-11-27 Thread uros at kss-loka dot si
--- Comment #5 from uros at kss-loka dot si 2005-11-28 07:32 --- ... close as FIXED. -- uros at kss-loka dot si changed: What|Removed |Added Status|REOPENED

[Bug target/24475] gcc.dg/tls/pr24428.c execution test and gcc.dg/tls/pr24428-2.c execution test fail on IA32

2005-12-01 Thread uros at kss-loka dot si
--- Comment #10 from uros at kss-loka dot si 2005-12-02 06:59 --- Fixed on 4.1 and mainline. -- uros at kss-loka dot si changed: What|Removed |Added Status

[Bug regression/25531] New: [4.0/4.1/4.2 Regression]: Handling of __attribute__ ((alias ("foo+X")))

2005-12-22 Thread uros at kss-loka dot si
fined symbol '_foo_b+2' -- Summary: [4.0/4.1/4.2 Regression]: Handling of __attribute__ ((alias ("foo+X"))) Product: gcc Version: 4.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: regressi

[Bug target/17390] missing floating point compare optimization

2006-01-18 Thread uros at kss-loka dot si
--- Comment #8 from uros at kss-loka dot si 2006-01-18 09:50 --- (In reply to comment #7) > Hmm, I get (but that looks like different branch predictions): It looks that your default is -mtune=pentium. > _testf: > fldl 4(%esp) > ftst >

[Bug target/17390] missing floating point compare optimization

2006-01-18 Thread uros at kss-loka dot si
--- Comment #9 from uros at kss-loka dot si 2006-01-18 09:53 --- Created an attachment (id=10666) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=10666&action=view) patch to SVN GCC: (GNU) 4.2.0 20060117 (experimental) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17390

[Bug middle-end/28411] gfortran: Internal error: Illegal instruction

2006-07-18 Thread uros at kss-loka dot si
--- Comment #4 from uros at kss-loka dot si 2006-07-18 07:29 --- This is the backtrace for the testcase in comment #3: #1 0x0827ae67 in fold_binary_to_constant (code=TRUNC_MOD_EXPR, type=0x402473f4, op0=0x402d9438, op1=0x0) at ../../gcc-svn/trunk/gcc/fold-const.c:12314 #2 0x08174b25

[Bug tree-optimization/28411] gfortran: Internal error: Illegal instruction

2006-07-18 Thread uros at kss-loka dot si
--- Comment #5 from uros at kss-loka dot si 2006-07-18 08:06 --- This error can be tracked down to fold_negate_expr() returning NULL_TREE via this path: (a) constant_multiple_of() calls fold_unary_to_constant(): /* If BOT seems to be negative, try dividing by -BOT instead, and

[Bug middle-end/28685] New: Multiple comparisons are not simplified

2006-08-10 Thread uros at kss-loka dot si
Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: uros at kss-loka dot si GCC build triplet: x86_64-pc-linux-gnu GCC host triplet: x86_64-pc-linux-gnu GCC target triplet: x86

[Bug target/27827] [4.0/4.1 Regression] gcc 4 produces worse x87 code on all platforms than gcc 3

2006-08-11 Thread uros at kss-loka dot si
--- Comment #64 from uros at kss-loka dot si 2006-08-11 09:18 --- Slightly offtopic, but to put some numbers to comment #8 and comment #11, equivalent SSE code now reaches only 50% of x87 single performance and 60% of x87 double performance on AMD x86_64: ALGORITHM NB REPS

[Bug rtl-optimization/21676] [4.0/4.1/4.2 Regression] Optimizer regression: SciMark sparse matrix benchmark

2006-08-16 Thread uros at kss-loka dot si
--- Comment #6 from uros at kss-loka dot si 2006-08-16 12:15 --- IMO the problem here is in IVopts. Using gcc-3.x, the innermost loop compiles to: .L15: movl (%edi,%edx,4), %eax fldl (%ebp,%edx,8) addl $1, %edx fmull (%esi,%eax,8) cmpl

[Bug rtl-optimization/21676] [4.0/4.1/4.2 Regression] Optimizer regression: SciMark sparse matrix benchmark

2006-08-17 Thread uros at kss-loka dot si
--- Comment #7 from uros at kss-loka dot si 2006-08-17 07:21 --- (In reply to comment #6) > I think that remaining time difference is due to strange loop above innermost: ... due to strange _header_ above innermost loop ... The problem is that we load zero in both arms of "if

[Bug rtl-optimization/21676] [4.0/4.1/4.2 Regression] Optimizer regression: SciMark sparse matrix benchmark

2006-08-17 Thread uros at kss-loka dot si
--- Comment #8 from uros at kss-loka dot si 2006-08-17 07:45 --- Also interesting is, that -march=pentium4 produces following "de-optimized" code, adding a couple more instructions and wasting %eax register: .L8: leal (%ebx,%ebx), %eax movl 40(%

[Bug rtl-optimization/21676] [4.0/4.1/4.2 Regression] Optimizer regression: SciMark sparse matrix benchmark

2006-08-28 Thread uros at kss-loka dot si
--- Comment #10 from uros at kss-loka dot si 2006-08-29 06:12 --- (In reply to comment #9) > Fixed on the mainline by: > http://gcc.gnu.org/ml/gcc-patches/2006-08/msg01036.html Not really, the above patch fixed only one of three problems. The other two remain, that is: -

[Bug tree-optimization/28915] [4.2 regression] ICE: tree check: expected class 'constant', have 'declaration' (var_decl) in build_vector, at tree.c:973

2006-08-31 Thread uros at kss-loka dot si
--- Comment #3 from uros at kss-loka dot si 2006-08-31 19:15 --- Confirmed on x86_64. Backtrace: (gdb) bt #0 build_vector (type=0x2db3e6e0, vals=0x2db37cc0) at ../../gcc-svn/trunk/gcc/tree.c:973 #1 0x007b829d in force_const_mem (mode=V2DImode, x=0x2da089e0) at

[Bug target/28909] Missed optimization with x86 sync builtins

2006-09-01 Thread uros at kss-loka dot si
--- Comment #2 from uros at kss-loka dot si 2006-09-01 10:18 --- Patch at http://gcc.gnu.org/ml/gcc-patches/2006-09/msg00010.html -- uros at kss-loka dot si changed: What|Removed |Added

[Bug target/28924] New: x86 sync builtins fail for char and short memory operands

2006-09-01 Thread uros at kss-loka dot si
rity: P3 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: uros at kss-loka dot si GCC build triplet: i686-pc-linux-gnu GCC host triplet: i686-pc-linux-gnu GCC target triplet: i686-pc-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28924

[Bug libgomp/28926] New: FAIL: libgomp.c/ordered-1.c execution test

2006-09-01 Thread uros at kss-loka dot si
libgomp AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: uros at kss-loka dot si GCC build triplet: i686-pc-linux-gnu GCC host triplet: i686-pc-linux-gnu GCC target triplet: i686-pc-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28926

[Bug libgomp/28926] FAIL: libgomp.c/ordered-1.c execution test

2006-09-03 Thread uros at kss-loka dot si
--- Comment #1 from uros at kss-loka dot si 2006-09-04 05:49 --- The problem is that RH8.0 defines SYS_gettid and SYS_futex in headers although futex syscall is not really supported in the kernel. The build process detects this and issues a warning to configure with --disable-linux

[Bug target/28946] [4.0/4.1/4.2 Regression] assembler shifts set the flag ZF, no need to re-test to zero

2006-09-04 Thread uros at kss-loka dot si
--- Comment #4 from uros at kss-loka dot si 2006-09-05 06:20 --- (In reply to comment #2) > It is entirely coincident. For some processors, it is an optimization to avoid > partial flag register stall. When it is fixed, it should be reenabled with a > new flag, somet

[Bug target/28946] [4.0/4.1/4.2 Regression] assembler shifts set the flag ZF, no need to re-test to zero

2006-09-05 Thread uros at kss-loka dot si
--- Comment #5 from uros at kss-loka dot si 2006-09-05 09:35 --- The problem here is following: We already have the patterns, that would satisfy combined instruction (*lshrsi3_cmp) in above testcase. However, combiner rejects combined instruction because the register that holds shifted

[Bug target/28946] [4.0/4.1/4.2 Regression] assembler shifts set the flag ZF, no need to re-test to zero

2006-09-05 Thread uros at kss-loka dot si
--- Comment #6 from uros at kss-loka dot si 2006-09-05 11:45 --- Patch at http://gcc.gnu.org/ml/gcc-patches/2006-09/msg00137.html BTW: This patch eliminates 869 "test" instructions in povray-3.6.1 compile. (And my test raytraced pictures are still correct.) -- uros at ks

[Bug target/28946] [4.0/4.1/4.2 Regression] assembler shifts set the flag ZF, no need to re-test to zero

2006-09-05 Thread uros at kss-loka dot si
--- Comment #7 from uros at kss-loka dot si 2006-09-05 13:43 --- Hm, proposed patch now generates worse code for following test: extern int fnc1(void); extern int fnc2(void); int test(int x) { if (x & 0x02) return fnc1(); else if (x & 0x01)

[Bug target/28946] [4.0/4.1/4.2 Regression] assembler shifts set the flag ZF, no need to re-test to zero

2006-09-06 Thread uros at kss-loka dot si
--- Comment #9 from uros at kss-loka dot si 2006-09-06 11:33 --- Patch at http://gcc.gnu.org/ml/gcc-patches/2006-09/msg00162.html implements missing i386.md RTL patterns. This is i386 target-specific fix for this bug. The patch was bootstrapped on i686-pc-linux-gnu and x86_64-pc-linux

[Bug target/26968] [4.1 Regression] HDF5 1.7.52 test segfaults with 4.1.0, fine with 4.0.2 (regression)

2006-09-06 Thread uros at kss-loka dot si
--- Comment #9 from uros at kss-loka dot si 2006-09-07 06:58 --- I have built and run a testsuite of HDF5 library on i686-pc-linux-gnu with: gcc version 4.2.0 20060906 (experimental) hdf5-1.6.5 (production): (CFLAGS="-fno-strict-aliasing" is needed before configure) All

[Bug target/28924] x86 sync builtins fail for char and short memory operands

2006-09-07 Thread uros at kss-loka dot si
--- Comment #3 from uros at kss-loka dot si 2006-09-08 05:47 --- I have been playing with following patch to optabs.c that forces operands in functions expand_sync_operation(), expand_sync_fetch_operation() and expand_sync_lock_test_and_set() into registers through subregs of word-mode

[Bug target/28946] assembler shifts set the flag ZF, no need to re-test to zero

2006-09-19 Thread uros at kss-loka dot si
--- Comment #14 from uros at kss-loka dot si 2006-09-19 11:31 --- Fixed everywhere. -- uros at kss-loka dot si changed: What|Removed |Added Status|ASSIGNED

[Bug target/29169] sse3-not-fisttp.c scan-assembler-not fisttp FAILs on i386-pc-solaris2.10

2006-09-23 Thread uros at kss-loka dot si
-- uros at kss-loka dot si changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |uros at kss-loka dot si |dot org

[Bug target/29169] sse3-not-fisttp.c scan-assembler-not fisttp FAILs on i386-pc-solaris2.10

2006-09-23 Thread uros at kss-loka dot si
--- Comment #4 from uros at kss-loka dot si 2006-09-23 14:41 --- Fixed. -- uros at kss-loka dot si changed: What|Removed |Added Status|ASSIGNED

[Bug target/29300] FAIL: gcc.dg/pthread-init-[12].c (test for excess errors)

2006-10-03 Thread uros at kss-loka dot si
--- Comment #1 from uros at kss-loka dot si 2006-10-03 07:04 --- Similar problems were recently fixed for solaris and glibc-2.3.5. It looks like hpux needs a fixinclude hack that would cure these errors/warnings, something like: http://gcc.gnu.org/ml/gcc-patches/2006-09/msg01317.html

[Bug target/29337] -mfpmath=387 doesn't use fistp for double-to-integer conversion

2006-10-03 Thread uros at kss-loka dot si
--- Comment #3 from uros at kss-loka dot si 2006-10-04 06:46 --- > I'm afraid you're missing my point. > The problem is that for 64-bit and 32-bit floating-point to integer > conversion, > x86 (32bit) target uses fistp* whereas x86_64 (64-bit) target uses cv

[Bug target/29337] -mfpmath=387 doesn't use fistp for double-to-integer conversion

2006-10-05 Thread uros at kss-loka dot si
--- Comment #8 from uros at kss-loka dot si 2006-10-05 07:08 --- > try -O2 -msse2, you get: > _Z8todoubledd: > subl $12, %esp > fldl 24(%esp) > faddl 16(%esp) > fstpl (%esp) > movsd (%esp), %xmm0 >

[Bug target/29347] i386 mode switching clobbers fp exception handling bits

2006-10-05 Thread uros at kss-loka dot si
--- Comment #2 from uros at kss-loka dot si 2006-10-05 07:51 --- (In reply to comment #0) > The mode switching for floating point rounding that the i386 backend does > does not actually place mode switches, but rather the calculation of values > used for mode switches. Not

[Bug target/28924] x86 sync builtins fail for char and short memory operands

2006-10-06 Thread uros at kss-loka dot si
--- Comment #4 from uros at kss-loka dot si 2006-10-06 08:27 --- Please note, that in addition to http://gcc.gnu.org/ml/gcc-patches/2006-10/msg00250.html, http://gcc.gnu.org/ml/gcc-patches/2006-10/msg00244.html is also needed. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28924

[Bug target/28924] x86 sync builtins fail for char and short memory operands

2006-10-06 Thread uros at kss-loka dot si
--- Comment #8 from uros at kss-loka dot si 2006-10-07 06:12 --- Testcase was committed to trunk and 4.1 branch, and now passes everywhere. -- uros at kss-loka dot si changed: What|Removed |Added

[Bug target/29377] New: Build for h8300-elf crashes on 64bit hosts due to int/HWI mismatch

2006-10-07 Thread uros at kss-loka dot si
ion: 4.2.0 Status: UNCONFIRMED Keywords: build Severity: normal Priority: P3 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: uros at kss-loka dot si GCC build triplet: x86_64-pc-linux-gnu GCC host triplet: x86_64-pc

[Bug target/29377] Build for h8300-elf crashes on 64bit hosts due to int/HWI mismatch

2006-10-07 Thread uros at kss-loka dot si
--- Comment #1 from uros at kss-loka dot si 2006-10-07 07:51 --- Proposed patch at http://gcc.gnu.org/ml/gcc-patches/2006-10/msg00337.html -- uros at kss-loka dot si changed: What|Removed |Added

[Bug target/27440] [4.0/4.1/4.2 regression] code quality regression due to ivopts

2006-10-10 Thread uros at kss-loka dot si
--- Comment #7 from uros at kss-loka dot si 2006-10-10 14:48 --- (In reply to comment #6) > Confirmed (as in comment #1). With -Os instead of -O2 we even produce > > .L3: > movl %ebx, -4(%edx) The -4(...) part comes from PR 24669. -- http://gcc.gnu.

[Bug target/15617] building groff-1.19.1 with "-Os -march=pentium4" causes sig 11

2005-02-17 Thread uros at kss-loka dot si
--- Additional Comments From uros at kss-loka dot si 2005-02-18 06:45 --- FYI: gcc 4.0 doesn't generate any SSE instructions for testcase.cc: gcc -Os -march=pentium4 -S testcase.cc grep xmm testcase.s | wc -l 0 -- What|Removed |

[Bug tree-optimization/16876] [3.3/3.4/4.0 Regression] ICE on testcase with -O3 in gen_lowpart

2005-02-22 Thread uros at kss-loka dot si
--- Additional Comments From uros at kss-loka dot si 2005-02-22 10:49 --- The problem in mainline is that 'tree type' of RECORD_TYPE enters fold_convert() and triggers gcc_unreachable() in line 2003. However Borland C++ exits with error: "Call to function 'g

[Bug middle-end/19987] [meta-bug] fold missing optimizations in general

2005-02-26 Thread uros at kss-loka dot si
-- Bug 19987 depends on bug 20219, which changed state. Bug 20219 Summary: Missed optimisation sin / tan --> cos http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20219 What|Old Value |New Value

[Bug tree-optimization/20219] Missed optimisation sin / tan --> cos

2005-02-26 Thread uros at kss-loka dot si
--- Additional Comments From uros at kss-loka dot si 2005-02-26 09:50 --- Here is the patch to implement missing folds: http://gcc.gnu.org/ml/gcc-patches/2004-03/msg01024.html And here is the explanation why this transformation is not suitable for GCC even with -ffast-math: http

[Bug target/17688] [4.1] x87 fops can handle HImodes

2005-03-07 Thread uros at kss-loka dot si
--- Additional Comments From uros at kss-loka dot si 2005-03-08 06:29 --- Patch for mainline awaiting review: http://gcc.gnu.org/ml/gcc-patches/2005-03/msg00644.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17688

[Bug target/18668] use prescott's fisttp

2005-03-10 Thread uros at kss-loka dot si
--- Additional Comments From uros at kss-loka dot si 2005-03-10 11:01 --- Patch here: http://gcc.gnu.org/ml/gcc-patches/2005-03/msg01009.html -- What|Removed |Added

[Bug target/18668] use prescott's fisttp

2005-03-11 Thread uros at kss-loka dot si
--- Additional Comments From uros at kss-loka dot si 2005-03-11 09:31 --- Updated patch (no need for FLAGS_REG clobber and some mode macro stuff) at http://gcc.gnu.org/ml/gcc-patches/2005-03/msg01119.html Regarding comment #4: I have the same thought as Ferdinand. fisttp insn should

[Bug target/20421] New: 387 mode switching clobbers flags

2005-03-11 Thread uros at kss-loka dot si
Priority: P2 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: uros at kss-loka dot si CC: gcc-bugs at gcc dot gnu dot org GCC build triplet: i386-pc-linux-gnu GCC host triplet: i386-pc-linux-gnu GCC target triplet: i386

[Bug target/12308] '387 mode switching clobbers flags

2005-03-11 Thread uros at kss-loka dot si
-- What|Removed |Added OtherBugsDependingO||20421 nThis|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12308

[Bug target/20415] [4.0/4.1 Regression] Vector init builtin produces invalid instruction pshufw

2005-03-11 Thread uros at kss-loka dot si
--- Additional Comments From uros at kss-loka dot si 2005-03-11 12:32 --- The problem is in "*vec_dupv4hi" pattern. However if constraint is changed to (correct) "TARGET_SSE || TARGET_3DNOW_A", testcase ICEs with unrecognizable insn. -- http://gcc.gnu.org/bugz
