[Bug tree-optimization/92335] New: missed transformation to branchless

2019-11-03 Thread vincenzo.innocente at cern dot ch
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: vincenzo.innocente at cern dot ch Target Milestone: --- in the following code (compiled with -O2 or -O3 and even with -march=haswell) gcc will use a branchless construct in foo but not in bar (changing from float to int

[Bug tree-optimization/92335] missed transformation to branchless

2019-11-07 Thread vincenzo.innocente at cern dot ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92335 --- Comment #3 from vincenzo Innocente --- Understood for float it seems to me that the transformation does not occur for integer neither (signed or unsigned) as in using T= unsigned int; T bar(T const * __restrict__ x, T const * __restrict__

[Bug tree-optimization/56273] [4.8 regression] Bogus -Warray-bounds warning

2013-02-12 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56273 --- Comment #9 from vincenzo Innocente 2013-02-12 16:24:11 UTC --- I am just rebuilding (Updated to revision 195983.) and noticed /home/data/newsoft/gcc-build/./gcc/xgcc -B/home/data/newsoft/gcc-build/./gcc/ -B/afs/cern.ch/user/i/innocent/

[Bug c++/56381] New: ICE: cc1plus: internal compiler error: in gimplify_expr, at gimplify.c:7842

2013-02-18 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56381 Bug #: 56381 Summary: ICE: cc1plus: internal compiler error: in gimplify_expr, at gimplify.c:7842 Classification: Unclassified Product: gcc Version: 4.8.0 Status: UN

[Bug c++/56381] ICE: cc1plus: internal compiler error: in gimplify_expr, at gimplify.c:7842

2013-02-18 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56381 --- Comment #1 from vincenzo Innocente 2013-02-18 17:10:03 UTC --- Created attachment 29484 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29484 preprocessed file of user code (sorry for not reducing)

[Bug middle-end/55266] vector expansion: 24 movs for 4 adds

2013-03-03 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55266 --- Comment #4 from vincenzo Innocente 2013-03-03 11:58:24 UTC --- I see still problems when calling inline functions. It seems that the code to satisfy the "calling ABI" is generated anyhow. take the example below and compare the code g

[Bug rtl-optimization/50728] Inefficient vector loads from aggregates passed by value

2013-03-03 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50728 --- Comment #5 from vincenzo Innocente 2013-03-03 12:01:23 UTC --- crosspost with PR55266. feel free to consolidate in a single PR I see still problems when calling inline functions. It seems that the code to satisfy the "calling ABI"

[Bug tree-optimization/56541] New: vectorizaton fails in conditional assignment of a constant

2013-03-05 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56541 Bug #: 56541 Summary: vectorizaton fails in conditional assignment of a constant Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRME

[Bug tree-optimization/50789] Gather vectorization

2013-04-02 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50789 vincenzo Innocente changed: What|Removed |Added CC||vincenzo.innocente at cern

[Bug tree-optimization/56829] New: Feature request: "generic" builtin for "movemask"

2013-04-03 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56829 Bug #: 56829 Summary: Feature request: "generic" builtin for "movemask" Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: enhance

[Bug libstdc++/57110] New: is the use of "uint_fast32_t" in intentional?

2013-04-29 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57110 Bug #: 57110 Summary: is the use of "uint_fast32_t" in intentional? Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal

[Bug libstdc++/57110] is the use of "uint_fast32_t" in intentional?

2013-04-29 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57110 --- Comment #2 from vincenzo Innocente 2013-04-29 11:47:54 UTC --- Understood. The question should then be escalated to the C++ standard committee. In my opinion the use of a 32-bit unsigned int as storage and return type for a mersenne_

[Bug c++/57132] New: spurious warning: division by zero [-Wdiv-by-zero] in if (m) res %=m;

2013-05-01 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57132 Bug #: 57132 Summary: spurious warning: division by zero [-Wdiv-by-zero] in if (m) res %=m; Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIR

[Bug tree-optimization/57162] New: Ofast does not make use of avx while O3 does

2013-05-03 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57162 Bug #: 57162 Summary: Ofast does not make use of avx while O3 does Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/57169] New: fully unrolled matrix multiplication not vectorized

2013-05-04 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57169 Bug #: 57169 Summary: fully unrolled matrix multiplication not vectorized Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: norma

[Bug lto/53895] [4.7/4.8 Regression][lto] symbol 'std::__once_callable' used as both __thread and non-__thread

2012-08-14 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53895 --- Comment #5 from vincenzo Innocente 2012-08-15 06:58:49 UTC --- btw I opened a gold bug http://sourceware.org/bugzilla/show_bug.cgi?id=14342 which did not get any attention yet

[Bug libstdc++/54268] New: std::string::reserve not consistent with std::vector::reserve

2012-08-15 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54268 Bug #: 54268 Summary: std::string::reserve not consistent with std::vector::reserve Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED

[Bug libstdc++/54268] std::string::reserve not consistent with std::vector::reserve

2012-08-15 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54268 --- Comment #2 from vincenzo Innocente 2012-08-15 14:31:48 UTC --- clang behaves similarly (even with -stdlib=libc++)

[Bug libstdc++/54320] New: [c++11] range access to VLA

2012-08-19 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54320 Bug #: 54320 Summary: [c++11] range access to VLA Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal Priority: P3

[Bug libstdc++/54320] [c++11] range access to VLA

2012-08-19 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54320 --- Comment #3 from vincenzo Innocente 2012-08-19 07:24:52 UTC --- int foo2(int N) { int v[N]; for ( auto a : v) if (a) return a; return 0; } works, though was similar to std::begin(v) std::end(v)

[Bug libstdc++/54320] [c++11] range access to VLA

2012-08-19 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54320 vincenzo Innocente changed: What|Removed |Added Severity|normal |enhancement --- Comment #5 from vinc

[Bug c++/54557] New: [c++ lambda] error in assigning lambda expr though "operator?:" while catching

2012-09-12 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54557 Bug #: 54557 Summary: [c++ lambda] error in assigning lambda expr though "operator?:" while catching Classification: Unclassified Product: gcc Version: 4.8.0 Status:

[Bug c++/54557] [c++ lambda] error in assigning lambda expr though "operator?:" while capturing

2012-09-12 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54557 vincenzo Innocente changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|

[Bug lto/54966] New: Does LTO requires a larger inline-unit-growth?

2012-10-17 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54966 Bug #: 54966 Summary: Does LTO requires a larger inline-unit-growth? Classification: Unclassified Product: gcc Version: 4.7.2 Status: UNCONFIRMED Severity: normal

[Bug lto/54966] Does LTO requires a larger inline-unit-growth?

2012-10-19 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54966 --- Comment #3 from vincenzo Innocente 2012-10-19 08:36:20 UTC --- the patch fails w.r.t. 4.7 patch -p0 < ../../inline.patch patching file ipa-inline.c Hunk #1 FAILED at 473. Hunk #2 FAILED at 491. Hunk #3 FAILED at 545. 3 out of 3

[Bug fortran/48636] Enable more inlining with -O2 and higher

2012-10-19 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48636 --- Comment #26 from vincenzo Innocente 2012-10-19 08:45:03 UTC --- I'm interested to test the patch on our large application currently compiled with 4.7.2. would it be possible to get the same patch against gcc-4_7-branch? thanks

[Bug c++/54999] New: [4.8 regression] ICE in tsubst_copy, at cp/pt.c:12387

2012-10-20 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54999 Bug #: 54999 Summary: [4.8 regression] ICE in tsubst_copy, at cp/pt.c:12387 Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal

[Bug c++/54844] [4.8 Regression] ice tsubst_copy, at cp/pt.c:12352

2012-10-20 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54844 vincenzo Innocente changed: What|Removed |Added CC||vincenzo.innocente at cern

[Bug c++/54999] [4.8 regression] ICE in tsubst_copy, at cp/pt.c:12387

2012-10-20 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54999 vincenzo Innocente changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|

[Bug tree-optimization/55016] New: request for specific builtins for rcp and rsqrt

2012-10-21 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55016 Bug #: 55016 Summary: request for specific builtins for rcp and rsqrt Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: enhanceme

[Bug tree-optimization/55016] request for specific builtins for rcp and rsqrt

2012-10-22 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55016 --- Comment #2 from vincenzo Innocente 2012-10-23 05:19:37 UTC --- For the application I have in mind, a global flag such as -ffaster-math will not be suitable, as it would also affect places where full "single precision" is still requi

[Bug tree-optimization/55071] New: "Horizontal sum" of bultin vectors

2012-10-25 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55071 Bug #: 55071 Summary: "Horizontal sum" of bultin vectors Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: enhancement

[Bug middle-end/54400] recognize vector reductions

2012-10-25 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54400 vincenzo Innocente changed: What|Removed |Added CC||vincenzo.innocente at cern

[Bug tree-optimization/55071] "Horizontal sum" of bultin vectors

2012-10-25 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55071 vincenzo Innocente changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution

[Bug tree-optimization/50713] SLP vs loop: code generated differs (SLP less efficient)

2012-10-25 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50713 vincenzo Innocente changed: What|Removed |Added Summary|SLP vs loop: code generated |SLP vs loop: code generated

[Bug middle-end/50713] SLP vs loop: code generated differs (SLP less efficient)

2012-10-25 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50713 vincenzo Innocente changed: What|Removed |Added Component|tree-optimization |middle-end Versi

[Bug fortran/48636] Enable more inlining with -O2 and higher

2012-10-28 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48636 --- Comment #32 from vincenzo Innocente 2012-10-28 11:27:22 UTC --- In a small test (that I will eventually publish here) the new patch at -O2 looks superior to 4.7.2 at O3. I would like to build a test with multiple source files where lto matte

[Bug c++/55149] New: capturing VLA in lambda (error in 4.7.2 ICE in 4.8

2012-10-31 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55149 Bug #: 55149 Summary: capturing VLA in lambda (error in 4.7.2 ICE in 4.8 Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal

[Bug lto/53746] [lto] segfault in std::vector::__base_ctor (with -fipa-pta)

2012-11-01 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53746 vincenzo Innocente changed: What|Removed |Added Known to work||4.8.0 Known to fail|

[Bug tree-optimization/55213] New: vectorizer ignores __restrict__

2012-11-05 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55213 Bug #: 55213 Summary: vectorizer ignores __restrict__ Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: enhancement Pr

[Bug tree-optimization/55213] vectorizer ignores __restrict__

2012-11-05 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55213 --- Comment #2 from vincenzo Innocente 2012-11-05 13:28:51 UTC --- reading PR49279 it seems to me that gcc should NOT emit runtime alias checks, Instead I see 15: create runtime check for data references *_12 and *_9 15: create runtime

[Bug lto/54966] Does LTO requires a larger inline-unit-growth?

2012-11-08 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54966 --- Comment #8 from vincenzo Innocente 2012-11-09 06:39:33 UTC --- Created attachment 28646 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28646 test case (preprocessed with gcc 4.7.2)

[Bug lto/54966] Does LTO requires a larger inline-unit-growth?

2012-11-08 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54966 --- Comment #9 from vincenzo Innocente 2012-11-09 06:52:22 UTC --- better and worse! better than 4.7.2 lto is worse in 4.8 Attached is a test case, just one file bzip2 -d smatrix.ii.bz2 the main component is this three different way of comput

[Bug lto/54966] Does LTO requires a larger inline-unit-growth?

2012-11-09 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54966 --- Comment #10 from vincenzo Innocente 2012-11-09 11:33:37 UTC --- I've repeated the tests again on a different machine and the result are the same gcc version 4.8.0 20121108 (experimental) [trunk revision 19] (GCC) at O3 lto degrades the

[Bug tree-optimization/47860] New: is vectorization of "condition in nested loop" supported

2011-02-23 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47860 Summary: is vectorization of "condition in nested loop" supported Product: gcc Version: 4.6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component:

[Bug tree-optimization/47860] is vectorization of "condition in nested loop" supported

2011-02-23 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47860 --- Comment #1 from vincenzo Innocente 2011-02-23 16:32:36 UTC --- it seems that there is a problem in the use of "unsigned int": shall I open a different bug report? even a simple comparison fails to vectorize float amin(float * c, unsigned i

[Bug tree-optimization/47860] is vectorization of "condition in nested loop" supported

2011-02-24 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47860 --- Comment #3 from vincenzo Innocente 2011-02-24 08:24:31 UTC --- Thanks Ira for the quick answer. For what concern if (N <= 0) that was the reason to use "unsigned int" which apparently cause vectorization not to work. As we are on the subjec

[Bug tree-optimization/47860] is vectorization of "condition in nested loop" supported

2011-02-24 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47860 --- Comment #5 from vincenzo Innocente 2011-02-24 08:48:25 UTC --- I see, if you ever will commit anything in the mainline please let me know (I do like yet to work with patches in gcc :-). I understand that you need to provide an "architectural

[Bug middle-end/47895] New: usage of __attribute__ ((__target__ ("xyz"))) with buitins

2011-02-25 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47895 Summary: usage of __attribute__ ((__target__ ("xyz"))) with buitins Product: gcc Version: 4.6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Componen

[Bug middle-end/47895] usage of __attribute__ ((__target__ ("xyz"))) with buitins

2011-02-26 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47895 --- Comment #2 from vincenzo Innocente 2011-02-26 09:55:03 UTC --- I find that the solution with multiple files shifts the problem to the build system, which is not necessarily an easier solution in all projects, and make maintenance more difficu

[Bug tree-optimization/57634] New: Missed vectorization for a "fixed point multiplication" reduction

2013-06-17 Thread vincenzo.innocente at cern dot ch
ty: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: vincenzo.innocente at cern dot ch In the following code the loop in "red" does not vectorize "because" of note: reduction: not commutative/ass

[Bug tree-optimization/57796] New: AVX2 gather vectorization: code bloat and reduction of performance

2013-07-03 Thread vincenzo.innocente at cern dot ch
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: vincenzo.innocente at cern dot ch At least in scimark2 sparse matrix multiplication the use of gather instructions ends in code bloat and a substantial reduction of

[Bug tree-optimization/50789] Gather vectorization

2013-07-03 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50789 --- Comment #13 from vincenzo Innocente --- I just submitted a specific bug-report as PR57796

[Bug tree-optimization/57823] New: restrict qualifier non effective with pointer returned by new

2013-07-04 Thread vincenzo.innocente at cern dot ch
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: vincenzo.innocente at cern dot ch I am sure this has been already discussed, not found a specific report though. below the code emitted for "add" is what expected, for

[Bug tree-optimization/57823] restrict qualifier non effective with pointer returned by new

2013-07-04 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57823 --- Comment #3 from vincenzo Innocente --- indeed float * bar3() { const float * a = (float*) malloc(4*128); const float * b = (float*) malloc(4*128); float * c = (float*) malloc(4*128); a = (const float*)__builtin_assume_aligned (a, 16

[Bug tree-optimization/57858] New: AVX2: ymm used for div, not for sqrt

2013-07-09 Thread vincenzo.innocente at cern dot ch
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: vincenzo.innocente at cern dot ch in the following example div uses ymm registries while sqr only xmm ones gcc version 4.9.0 20130630 (experimental) [trunk revision 200570] (GCC) cat avx2sqrt.cc #include double div

[Bug tree-optimization/57858] AVX2: ymm used for div, not for sqrt

2013-07-09 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57858 --- Comment #2 from vincenzo Innocente --- actually the code for div and sqr is different already for standard SSE c++ -std=c++11 -Ofast -S avx2sqrt.cc -ftree-vectorizer-verbose=1 -Wall ; cat avx2sqrt.s .L2: movdqa%xmm0, %xmm1 addl

[Bug tree-optimization/57858] AVX2: ymm used for div, not for sqrt

2013-07-10 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57858 --- Comment #5 from vincenzo Innocente --- I remember something similar in the past --param max-completely-peel-times=1 sort of fix it… (why pre does not recognize that 1/(1+0) == 1 btw?? of course it is just a benchmark (and I can modify it t

[Bug target/57927] New: -march=core-avx2 different than -march=native on INTEL Haswell (i7-4700K)

2013-07-18 Thread vincenzo.innocente at cern dot ch
: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: vincenzo.innocente at cern dot ch for instance mkdir scimark2TMP cd scimark2TMP wget http://math.nist.gov/scimark2/scimark2_1c.zip . unzip scimark2_1c.zip c++ -S LU.c -O3

[Bug target/57927] -march=core-avx2 different than -march=native on INTEL Haswell (i7-4700K)

2013-07-18 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57927 --- Comment #2 from vincenzo Innocente --- COLLECT_GCC_OPTIONS='-S' '-O3' '-march=native' '-o' 'LU.native' '-v' '-shared-libgcc' /afs/cern.ch/user/i/innocent/w2/libexec/gcc/x86_64-unknown-linux-gnu/4.9.0/cc1plus -quiet -v -D_GNU_SOURCE LU.c -marc

[Bug target/57952] New: AVX/AVX2 no ymm registries used in a trivial reduction

2013-07-22 Thread vincenzo.innocente at cern dot ch
Component: target Assignee: unassigned at gcc dot gnu.org Reporter: vincenzo.innocente at cern dot ch in this quite trivial benchmark gcc does not generate avx/avx2 instruction using ymm registries c++ -Ofast -S polyAVX.cpp -march=core-avx2 ; grep -c "ymm" polyAVX.s 0 clan

[Bug target/57954] New: AVX missing vxorps (zeroing) before vcvtsi2s %edx, slow down AVX code

2013-07-22 Thread vincenzo.innocente at cern dot ch
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: vincenzo.innocente at cern dot ch in the following benchmark, performance w/o vectorization is poor wrt expectations. I found out this is due to not zeroing a register before

[Bug target/57952] AVX/AVX2 no ymm registers used in a trivial reduction

2013-07-23 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57952 --- Comment #1 from vincenzo Innocente --- I modified a bit the benchmark adding timing and the new version now vectorize YMM with avx2, still not with old avx if I remove the call to rdtsc(); it does not use YMM anymore -fno-tree-pre does not hel

[Bug target/57954] AVX missing vxorps (zeroing) before vcvtsi2s %edx, slow down AVX code

2013-07-27 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57954 --- Comment #5 from vincenzo Innocente --- confirmed that the patch fixes the issue c++ -O2 -march=corei7-avx polyAVX.cpp time ./a.out 10358474048 2.965u 0.001s 0:02.97 99.6%0+0k 0+0io 146pf+0w

[Bug target/57954] AVX missing vxorps (zeroing) before vcvtsi2s %edx, slow down AVX code

2013-07-29 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57954 --- Comment #8 from vincenzo Innocente --- thanks for getting in the trunk. will be possible to back port to at least 4.8? (this issue is there till 4.4!)

[Bug target/58268] New: umm registers not used for -march=bdver1

2013-08-29 Thread vincenzo.innocente at cern dot ch
Assignee: unassigned at gcc dot gnu.org Reporter: vincenzo.innocente at cern dot ch in this trivial example avx is used for corei7-avx and core-avx2 not for bdver1 float a[1024]; float x[1024]; float bar(float b) { float r=0.; for (int i=0; i!=1024; ++i) r += a[i]+b*x[i

[Bug ipa/58291] New: ICE with ipa-pta

2013-09-01 Thread vincenzo.innocente at cern dot ch
: unassigned at gcc dot gnu.org Reporter: vincenzo.innocente at cern dot ch this is a regression w.r.t. gcc version 4.9.0 20130820 (experimental) [trunk revision 201887] (GCC) c++ -g -O2 -c -std=gnu++11 -fipa-pta ipa_err.i RooMinimizer.cc: In destructor 'RooMinimizer::~RooMini

[Bug ipa/58291] ICE with ipa-pta

2013-09-01 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58291 --- Comment #1 from vincenzo Innocente --- Created attachment 30738 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30738&action=edit real-code file. just preprocessed no reduction attempted

[Bug libgomp/58462] New: gomp4: invalid controlling predicate for != (< is ok)

2013-09-18 Thread vincenzo.innocente at cern dot ch
P3 Component: libgomp Assignee: unassigned at gcc dot gnu.org Reporter: vincenzo.innocente at cern dot ch CC: jakub at gcc dot gnu.org took me years to learn and teach to use != instead of "<"…. float a[1024]; float b[1024]; void err() { #pragma omp sim

[Bug libgomp/58462] gomp4: invalid controlling predicate for != (< is ok)

2013-09-18 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58462 --- Comment #2 from vincenzo Innocente --- Thanks Jakub. Downloaded the standard. waiting for more examples of usage It is a pity that it does not support c++ range loop Let me highjack this bug to congratulate you and your collaborators for the

[Bug tree-optimization/58472] New: gomp4: ICE in in vectorizable_store, at tree-vect-stmts.c:4192

2013-09-19 Thread vincenzo.innocente at cern dot ch
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: vincenzo.innocente at cern dot ch #include float a[1024]; float b[1024]; float sumO1() { auto s = 0.f; #pragma omp simd reduction(+:s) for (auto i=0U;i<1024;++i) {

[Bug tree-optimization/58472] gomp4: ICE in in vectorizable_store, at tree-vect-stmts.c:4192

2013-09-19 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58472 --- Comment #2 from vincenzo Innocente --- yes cat omp4red.cc float a[1024]; float b[1024]; float sumO1() { float s = 0.f; #pragma omp simd reduction(+:s) for (int i=0;i<1024;++i) { s += a[i]*b[i]; } return s; } pb-d-128-141-131-26:ve
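
For readers skimming the index, the reduction quoted in this comment can be laid out as below. This is only a sketch: the function name dot_simd and the pointer/length parameters are hypothetical (the reported test case uses two global float[1024] arrays and a function called sumO1), and the assert is added purely for illustration.

```cpp
#include <cassert>

// Dot product with an OpenMP SIMD reduction on s, following the loop quoted
// in the comment above. dot_simd and its parameters are hypothetical names;
// the original test case works on two global arrays of 1024 floats.
float dot_simd(const float* a, const float* b, int n) {
  assert(n >= 0);
  float s = 0.f;
  // reduction(+:s) lets each SIMD lane keep a private partial sum that is
  // combined into s after the loop.
#pragma omp simd reduction(+:s)
  for (int i = 0; i < n; ++i) {
    s += a[i] * b[i];
  }
  return s;
}
```

The pragma only takes effect when the compiler is invoked with -fopenmp (as in the command lines quoted in this thread) or -fopenmp-simd; without such a flag it is ignored and the loop is compiled as ordinary scalar code.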

[Bug tree-optimization/58472] gomp4: ICE in in vectorizable_store, at tree-vect-stmts.c:4192

2013-09-19 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58472 --- Comment #3 from vincenzo Innocente --- on linux c++ -O2 -ftree-vectorizer-verbose=1 -S omp4red.cc -fopenmp omp4red.cc:8:13: note: loop vectorized omp4red.cc: In function 'float sumO1()': omp4red.cc:4:7: internal compiler error: in vectori

[Bug tree-optimization/58472] gomp4: ICE in in vectorizable_store, at tree-vect-stmts.c:4192

2013-09-19 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58472 --- Comment #4 from vincenzo Innocente --- gcc -O2 libgomp/testsuite/libgomp.c/simd-3.c -fopenmp libgomp/testsuite/libgomp.c/simd-3.c: In function ‘foo’: libgomp/testsuite/libgomp.c/simd-3.c:14:1: internal compiler error: in vectorizable_store, a

[Bug tree-optimization/58472] gomp4: ICE in in vectorizable_store, at tree-vect-stmts.c:4192

2013-09-19 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58472 --- Comment #6 from vincenzo Innocente --- seems so gcc -O2 libgomp/testsuite/libgomp.c/simd-4.c -fopenmp c++ -O2 -S omp4red.cc -fopenmp| cat omp4red.s .text .align 4,0x90 .globl __Z5sumO1v __Z5sumO1v: LFB0: etc could you please

[Bug tree-optimization/58472] gomp4: ICE in in vectorizable_store, at tree-vect-stmts.c:4192

2013-09-19 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58472 --- Comment #8 from vincenzo Innocente --- Yes I compile gcc with -O2 -ftree-vectorize on linux I also do bootstrap-lto strange that the compiler does not warn about this uninitialized variable: it does for a couple of others that force me to com

[Bug tree-optimization/58472] gomp4: ICE in in vectorizable_store, at tree-vect-stmts.c:4192

2013-09-19 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58472 --- Comment #9 from vincenzo Innocente --- w/o opening another bug report c++ -O2 -S omp4red.cc -fopenmp -Wall omp4red.cc: In function ‘float sumO1()’: omp4red.cc:6:9: warning: ‘simduid.0’ is used uninitialized in this function [-Wuninitialized

[Bug libgomp/58482] New: gomp4: user defined reduction produce wrong result

2013-09-20 Thread vincenzo.innocente at cern dot ch
: libgomp Assignee: unassigned at gcc dot gnu.org Reporter: vincenzo.innocente at cern dot ch CC: jakub at gcc dot gnu.org I acknowledge that my understanding of "omp declare" is still limited. Still the example below produces different result wi

[Bug libgomp/58482] gomp4: user defined reduction produce wrong result

2013-09-20 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58482 --- Comment #2 from vincenzo Innocente --- Thanks Jakub for the clear answer. The reduction operator should be strictly commutative! and I now understand the meaning of omp declare reduction (I hope) so I modified it as you can see below results

[Bug libgomp/58482] gomp4: user defined reduction produce wrong result

2013-09-21 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58482 --- Comment #4 from vincenzo Innocente --- I see. I have several use cases in which the reduction requires the access to two variables (minloc for instance: the minimum and its location) btw tried omp parallel for simd got ICE c++ -std=c++11 u

[Bug tree-optimization/48092] associative property of builtins is not exploited on GIMPLE

2011-09-08 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48092 --- Comment #3 from vincenzo Innocente 2011-09-08 10:01:48 UTC --- btw even in C with -Ofast a*exp(x)*exp(y) (same for sqrt) is NOT optimized. compare double exp0(double x, double y) { return exp(x)*exp(y); } double exp1(double a, double x
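
To make the missed optimization concrete, here is a small sketch of the rewrite the comment is asking for. exp0 is taken from the snippet; exp0_fused and rel_diff are hypothetical names added here, and the fused form is written by hand rather than taken from any compiler output.

```cpp
#include <cassert>
#include <cmath>

// As quoted in the comment: two libm calls and one multiplication.
double exp0(double x, double y) { return std::exp(x) * std::exp(y); }

// Hand-fused form (hypothetical name): a single libm call, using the
// identity exp(x) * exp(y) == exp(x + y).
double exp0_fused(double x, double y) { return std::exp(x + y); }

// Relative difference between two results, to compare the two forms.
double rel_diff(double p, double q) {
  assert(q != 0.0);
  return std::fabs(p - q) / std::fabs(q);
}
```

The identity is exact mathematically, but in floating point the two forms can differ in the last bits and in where they overflow, which is why a compiler may only apply such a rewrite under relaxed floating-point options like the -Ofast mentioned in the comment.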

[Bug tree-optimization/50374] New: Support vectorization of min/max location pattern

2011-09-13 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50374 Bug #: 50374 Summary: Support vectorization of min/max location pattern Classification: Unclassified Product: gcc Version: 4.7.0 Status: UNCONFIRMED Severity: enhancement

[Bug tree-optimization/50374] Support vectorization of min/max location pattern

2011-09-13 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50374 --- Comment #3 from vincenzo Innocente 2011-09-13 08:45:40 UTC --- with gcc version 4.7.0 20110910 (experimental) (GCC) int lmin(float const * __restrict__ c, int N) { int k=0; for (int i=1; i!=N; ++i) k = (c[k] > c[i]) ? i : k; re
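
The "min location" pattern from this comment, laid out for readability. The snippet is cut off in the archive; the closing "return k;" is an assumption about how it ends, and the assert is added for illustration only. __restrict__ is a GNU extension accepted by g++ and clang++.

```cpp
#include <cassert>

// Returns the index of the smallest element of c[0..N).
// Loop body as quoted in the comment; the final return is assumed, since
// the archived snippet is truncated at that point.
int lmin(float const* __restrict__ c, int N) {
  assert(N > 0);
  int k = 0;
  for (int i = 1; i != N; ++i)
    k = (c[k] > c[i]) ? i : k;  // strict '>' keeps the first index on ties
  return k;
}
```

The difficulty for a vectorizer is that the loop-carried value k is used as an index into c in the very next iteration, so this is not a plain min reduction over the values.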

[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing

2011-09-13 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770 --- Comment #10 from vincenzo Innocente 2011-09-13 09:52:53 UTC --- resurrecting this: just checked with gcc version 4.7.0 20110910 -mveclibabi=svml -LwhereverYouhaveIntelSoftware/linux/x86_64/Compiler/11.1/072/lib/intel64/ -lsvml -lirc and si

[Bug tree-optimization/50374] Support vectorization of min/max location pattern

2011-09-20 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50374 --- Comment #9 from vincenzo Innocente 2011-09-20 12:05:01 UTC --- does not compile to me ../.././gcc/tree-vect-loop.c: In function 'vect_is_simple_reduction_1': ../.././gcc/tree-vect-loop.c:2237:35: warning: suggest parentheses around '&&' with

[Bug tree-optimization/50374] Support vectorization of min/max location pattern

2011-09-20 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50374 --- Comment #12 from vincenzo Innocente 2011-09-20 13:46:16 UTC --- I'm getting these errors ../.././gcc/optabs.c: In function 'optab_d* optab_for_tree_code(tree_code, const_tree, optab_subtype)': ../.././gcc/optabs.c:470:9: error: cannot conver

[Bug libstdc++/50348] -fvisibility=hidden doesn't hide stl implementation details.

2011-09-22 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50348 vincenzo Innocente changed: What|Removed |Added CC||vincenzo.innocente at cern

[Bug lto/50483] New: lto turns visibility from HIDDEN to DEFAULT

2011-09-22 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50483 Bug #: 50483 Summary: lto turns visibility from HIDDEN to DEFAULT Classification: Unclassified Product: gcc Version: lto Status: UNCONFIRMED Severity: normal Priority

[Bug libstdc++/50348] -fvisibility=hidden doesn't hide stl implementation details.

2011-09-22 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50348 --- Comment #5 from vincenzo Innocente 2011-09-22 11:49:29 UTC --- indeed and in "exception" header-file is a place where visibility is correctly handled #pragma GCC visibility push(default) extern "C++" { namespace std { } #pragma GCC visibilit

[Bug middle-end/50534] New: sincos not supported for svlm

2011-09-27 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50534 Bug #: 50534 Summary: sincos not supported for svlm Classification: Unclassified Product: gcc Version: 4.7.0 Status: UNCONFIRMED Severity: enhancement Priority: P3

[Bug tree-optimization/50596] New: Problems in vectorization of condition expression

2011-10-03 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596 Bug #: 50596 Summary: Problems in vectorization of condition expression Classification: Unclassified Product: gcc Version: 4.7.0 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-03 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596 --- Comment #1 from vincenzo Innocente 2011-10-03 08:40:53 UTC --- manage to vectorize this int j[1024]; void foo5() { for (int i=0; i!=N; ++i) j[i] = (a[i]

[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-04 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596 --- Comment #3 from vincenzo Innocente 2011-10-04 09:11:53 UTC --- for (int i = 0; i < 1024; i++) a[i] = b[i] < c[i] ? d[i] : e[i]; DOES vectorize with -ftree-loop-if-convert-stores even with float * a; float * b; float * c; float * d; floa

[Bug c++/50622] New: ICE: verify_gimple failed for std::complex

2011-10-05 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50622 Bug #: 50622 Summary: ICE: verify_gimple failed for std::complex Classification: Unclassified Product: gcc Version: 4.7.0 Status: UNCONFIRMED Severity: normal Priorit

[Bug tree-optimization/50649] New: REGRESSION: ICE in vect_is_simple_use_1, at tree-vect-stmts.c:5689 after rev 179607

2011-10-07 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50649 Bug #: 50649 Summary: REGRESSION: ICE in vect_is_simple_use_1, at tree-vect-stmts.c:5689 after rev 179607 Classification: Unclassified Product: gcc Version: 4.7.0 S

[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-07 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596 --- Comment #13 from vincenzo Innocente 2011-10-07 07:35:40 UTC --- is not PR50649 caused by your changes?

[Bug middle-end/50650] [4.7 Regression] ICE in vect_is_simple_use_1, at tree-vect-stmts.c:5689

2011-10-07 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50650 --- Comment #4 from vincenzo Innocente 2011-10-07 09:30:46 UTC --- ok in my tests

[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-07 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596 --- Comment #14 from vincenzo Innocente 2011-10-07 10:15:03 UTC --- signed char k[1024]; void foo6() { for (int i=0; i!=N; ++i) k[i] = (a[i]

[Bug libstdc++/50661] std::equal should use more efficient version for arrays of pointers

2011-10-08 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50661 --- Comment #14 from vincenzo Innocente 2011-10-08 13:48:22 UTC --- Thanks for adding me in the loop. I wonder if we can reuse -funsafe-loop-optimizations to force loop vectorization. I know that INTEL has introduced a specific pragma to force v

[Bug tree-optimization/50622] [4.7 Regression] ICE: verify_gimple failed for std::complex

2011-10-09 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50622 vincenzo Innocente changed: What|Removed |Added Severity|normal |blocker --- Comment #4 from vincenzo
