[Bug libgcc/64677] incorrect result with complex division?

2015-01-21 Thread spatel at rotateright dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64677 --- Comment #11 from Sanjay Patel --- (In reply to Mikhail Maltsev from comment #10) > C++11 supports constexpr (and std::complex has constexpr constructor). Ah, that makes sense. Yes, we're only generating the answer using MPFR with c++11 and o

[Bug libgcc/64677] incorrect result with complex division?

2015-01-20 Thread spatel at rotateright dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64677 --- Comment #9 from Sanjay Patel --- (In reply to Sanjay Patel from comment #8) > It seems I don't need the -std=c++11 flag as I do on OS X? Actually, I screwed that up. We don't need that flag on OS X either...and thankfully, the behavior match

[Bug libgcc/64677] incorrect result with complex division?

2015-01-20 Thread spatel at rotateright dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64677 --- Comment #8 from Sanjay Patel --- (In reply to Andrew Pinski from comment #7) > Can you try this under Linux too, just to double check there? Wow, that other bug shows that there are a lot of variables here. I don't know what to make of th

[Bug libgcc/64677] incorrect result with complex division?

2015-01-20 Thread spatel at rotateright dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64677 --- Comment #5 from Sanjay Patel --- (In reply to Mikhail Maltsev from comment #3) > So, compile-time result is more precise. BTW, what does the disassembly look > like? In the -O0 case, it looks like all of the math is handled in: call

[Bug c++/64677] incorrect result with complex division?

2015-01-19 Thread spatel at rotateright dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64677 --- Comment #2 from Sanjay Patel --- This is on plain x86-64 with SSE (before the addition of any FMA instructions), so lack of FMA must be accounted for? The answers differ in the last digit / ULP. Is there some standard or golden implementatio

[Bug c++/64677] New: incorrect result with complex division?

2015-01-19 Thread spatel at rotateright dot com
++ Assignee: unassigned at gcc dot gnu.org Reporter: spatel at rotateright dot com I'm not sure if this is a bug at -O0, at -O1 (in MPFR because all math is folded out in this case?), or neither: #include #include #include int main() { std::complex c(-61.8870735917
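For context on why two evaluations of the same complex quotient can disagree in the last digit, here is a minimal sketch in C. It compares the textbook formula with one well-known alternative (Smith's algorithm). The type `cplx` and both function names are hypothetical, and nothing here is claimed to be what libgcc or MPFR actually does; the two functions are algebraically equal but round at different points, so their results may differ by an ULP for some inputs.

```c
/* Hypothetical helper type: a complex number as a pair of doubles. */
typedef struct { double re, im; } cplx;

/* Textbook formula: (a+bi)/(c+di) = ((ac+bd) + (bc-ad)i) / (c^2+d^2). */
static cplx div_textbook(cplx x, cplx y)
{
    double den = y.re * y.re + y.im * y.im;
    cplx q = { (x.re * y.re + x.im * y.im) / den,
               (x.im * y.re - x.re * y.im) / den };
    return q;
}

/* Smith's algorithm: scale by the ratio of the smaller to the larger
   component of the divisor, which avoids forming c^2 + d^2 directly. */
static cplx div_smith(cplx x, cplx y)
{
    double ac = y.re < 0 ? -y.re : y.re;   /* |c| */
    double ad = y.im < 0 ? -y.im : y.im;   /* |d| */
    cplx q;
    if (ac >= ad) {
        double r = y.im / y.re;
        double den = y.re + y.im * r;
        q.re = (x.re + x.im * r) / den;
        q.im = (x.im - x.re * r) / den;
    } else {
        double r = y.re / y.im;
        double den = y.re * r + y.im;
        q.re = (x.re * r + x.im) / den;
        q.im = (x.im * r - x.re) / den;
    }
    return q;
}
```

Calling both functions on the same operands and printing the results with `%.17g` is one way to see whether a last-digit difference comes from the order of operations rather than from a miscompile.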

[Bug target/62191] New: extra shift generated for vector integer division by constant 2

2014-08-19 Thread spatel at rotateright dot com
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: spatel at rotateright dot com Using gcc 4.9: $ cat sdiv.c typedef int vecint __attribute__((vector_size(16))); vecint f(vecint x) { return x/2; } $ gcc -O2 sdiv.c -S -o
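As background for the division-by-2 report, the following is a scalar sketch in C of the usual shift-based sequence for truncating signed division by 2. The function name `sdiv2` is hypothetical, and this is not a transcription of the assembly in the report (which is cut off above); it only illustrates the arithmetic that a vector version would apply to each lane.

```c
#include <stdint.h>

/* Truncating signed division by 2 without a divide instruction.
   Assumes >> on a negative int32_t is an arithmetic shift, which GCC
   provides but ISO C leaves implementation-defined. */
static int32_t sdiv2(int32_t x)
{
    /* Logical shift moves the sign bit to bit 0: 1 if x < 0, else 0. */
    int32_t bias = (int32_t)((uint32_t)x >> 31);
    /* Adding the bias makes the arithmetic shift round toward zero. */
    return (x + bias) >> 1;
}
```

The bias can be produced with a single logical shift of the input, so the whole operation needs two shifts and one add per element.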

[Bug target/62054] fabsf uses constant pool and andps (x86-64) - use pabsd instead?

2014-08-07 Thread spatel at rotateright dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62054 --- Comment #3 from Sanjay Patel --- I think there's still an optimization possible here regarding the constant pool data - see bug 62055. Hopefully, I didn't mess that one up. :)

[Bug target/62054] fabsf uses constant pool and andps (x86-64) - use pabsd instead?

2014-08-07 Thread spatel at rotateright dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62054 Sanjay Patel changed:
What       | Removed     | Added
Status     | UNCONFIRMED | RESOLVED
Resolution | ---

[Bug target/62055] New: missed optimization: recognize fnabs (FP negative absolute value) (x86-64)

2014-08-07 Thread spatel at rotateright dot com
: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: spatel at rotateright dot com $ cat fnabs.c #include float foo(float a) { return -fabsf(a); } $ gcc49 -O1 fnabs.c -S -o - .text .globl _foo _foo: LFB19: movss

[Bug target/62054] New: fabsf uses constant pool and andps (x86-64) - use pabsd instead?

2014-08-07 Thread spatel at rotateright dot com
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: spatel at rotateright dot com $ cat fabs.c #include float foo(float a) { return fabsf(a); } $ gcc49 -O1 fabs.c -S -o - .text .globl _foo _foo: LFB19: movss LC0
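To make the `andps` approach named in the bug title concrete, here is a small sketch in C of absolute value as a bit mask on an IEEE 754 single-precision value, together with the negative absolute value from the related bug 62055 report. The function names are hypothetical; the masks play the role of the constant that the title says is loaded from the constant pool.

```c
#include <stdint.h>
#include <string.h>

/* fabsf as a bitwise AND: clear bit 31, the IEEE 754 sign bit. */
static float fabs_via_mask(float a)
{
    uint32_t bits;
    memcpy(&bits, &a, sizeof bits);   /* well-defined type pun */
    bits &= 0x7FFFFFFFu;
    memcpy(&a, &bits, sizeof a);
    return a;
}

/* -fabsf(a) as a bitwise OR: set the sign bit unconditionally. */
static float fnabs_via_mask(float a)
{
    uint32_t bits;
    memcpy(&bits, &a, sizeof bits);
    bits |= 0x80000000u;
    memcpy(&a, &bits, sizeof a);
    return a;
}
```

Both operations touch only the sign bit, which is why `-fabsf(a)` can in principle be a single OR rather than an AND followed by a negation.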

[Bug target/62041] New: vector fneg codegen uses a subtract instead of an xor (x86-64)

2014-08-06 Thread spatel at rotateright dot com
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: spatel at rotateright dot com $ cat fneg.c #include __m128 fneg4(__m128 x) { return _mm_sub_ps(_mm_set1_ps(-0.0), x); } $ ~gcc49/local/bin/gcc -march=core-avx2 -O2 -S fneg.c -o
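The subtract-versus-xor distinction in the title can be illustrated per element with a scalar sketch in C. The function names are hypothetical. Under the default round-to-nearest mode, `-0.0f - x` and flipping the sign bit give the same result for every non-NaN input, including both zeros, which is what makes the xor form a candidate replacement.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw IEEE 754 bit pattern of a float. */
static uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Negation as written in the report: subtract from negative zero. */
static float fneg_via_sub(float x)
{
    return -0.0f - x;
}

/* Negation as a bitwise XOR that flips only the sign bit. */
static float fneg_via_xor(float x)
{
    uint32_t bits = float_bits(x) ^ 0x80000000u;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Comparing bit patterns rather than values matters here because `+0.0f == -0.0f` compares equal even though the sign bits differ.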

[Bug target/60847] [4.9/4.10 Regression] x86 BMI intrinsics not recognized

2014-04-30 Thread spatel at rotateright dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60847 --- Comment #10 from Sanjay Patel --- Ah - thank you for the explanation! I found the original checkin from AMD: http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01356.html Strangely, I can't find any documentation for those double-underscores from A

[Bug target/60847] [4.9/4.10 Regression] x86 BMI intrinsics not recognized

2014-04-30 Thread spatel at rotateright dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60847 --- Comment #8 from Sanjay Patel --- Thanks, Jakub. I see that the fix duplicates all of the intrinsics with a double-leading-underscore variant. Why do we need that? AFAIK, no other x86 intrinsics have this kind of duplication.

[Bug c/60847] x86 BMI intrinsics not recognized

2014-04-15 Thread spatel at rotateright dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60847 Sanjay Patel changed:
What      | Removed | Added
Component | target  | c
--- Comment #3 from Sanjay Patel --- He

[Bug c/60847] x86 BMI intrinsics not recognized

2014-04-15 Thread spatel at rotateright dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60847 --- Comment #1 from Sanjay Patel --- It looks like an extra leading underscore is required to recognize the BMI intrinsics. This is not happening with other (BMI2, SSE4) intrinsics. According to the Intel reference docs and previous versions of

[Bug c/60847] New: x86 BMI intrinsics not recognized

2014-04-15 Thread spatel at rotateright dot com
Assignee: unassigned at gcc dot gnu.org Reporter: spatel at rotateright dot com With gcc 4.9.0 (version details below), the x86 bit manipulation instruction (BMI) C intrinsics are not being recognized. This appears to be a regression from gcc 4.8.2. $ cat bmi.c #include int foo(int a