http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54703



             Bug #: 54703
           Summary: [miscompilation] _mm_sub_pd is incorrectly substituted with vandnps
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: kr...@kde.org





The following testcase:



#include <xmmintrin.h>

__attribute__((aligned(16))) static const unsigned long long mask[2] = {
    0xffffffffff000000ull, 0xffffffffff000000ull };

inline __m128d foo(__m128d v1) {
    const __m128d h1 = _mm_and_pd(v1, _mm_load_pd(reinterpret_cast<const double *>(&mask)));
    const __m128d l1 = _mm_sub_pd(v1, h1);
    return _mm_mul_pd(h1, l1);
}

__m128d test() {
    __m128d a = _mm_set1_pd(2.);
    return foo(foo(a));
}



compiles to



        .cfi_startproc
        vmovaps _ZL4mask(%rip), %xmm0
        vandps  .LC0(%rip), %xmm0, %xmm2
        vandnps .LC0(%rip), %xmm0, %xmm1
        vmulpd  %xmm1, %xmm2, %xmm1
        vandps  %xmm0, %xmm1, %xmm0
        vsubpd  %xmm0, %xmm1, %xmm1
        vmulpd  %xmm1, %xmm0, %xmm0
        ret
        .cfi_endproc



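The second foo call is compiled correctly: it uses vandps followed by vsubpd.
The first call, however, uses vandps and vandnps, i.e. _mm_sub_pd(v1, h1) has
been replaced by the bitwise operation v1 & ~mask. That substitution would be
valid for integers, where x - (x & m) == (x & ~m), but it is wrong for
floating-point numbers, because a floating-point subtraction is not a bitwise
operation.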
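To illustrate why the integer identity does not carry over, here is a minimal scalar sketch of the two ways of computing the low part. It is not part of the original report. The helper names (to_bits, from_bits, low_by_sub, low_by_andn, integer_forms_agree, self_check) are made up for this example, and it assumes IEEE 754 binary64 doubles.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Same mask as in the testcase: clears the low 24 bits of each 64-bit lane.
static const std::uint64_t kMask = 0xffffffffff000000ull;

// Hypothetical helpers to move between a double and its bit pattern.
static std::uint64_t to_bits(double d) {
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return u;
}
static double from_bits(std::uint64_t u) {
    double d;
    std::memcpy(&d, &u, sizeof d);
    return d;
}

// What the source asks for: h = v & mask (bitwise), l = v - h (FP subtract).
static double low_by_sub(double v) {
    const double h = from_bits(to_bits(v) & kMask);
    return v - h;
}

// What the vandps/vandnps pair computes instead: l = v & ~mask (bitwise).
static double low_by_andn(double v) {
    return from_bits(to_bits(v) & ~kMask);
}

// For unsigned integers the two forms always agree: (x & m) only contains
// bits that are set in x, so the subtraction never borrows.
static bool integer_forms_agree(std::uint64_t x) {
    return x - (x & kMask) == (x & ~kMask);
}

// Small self-check, assuming IEEE 754 binary64 doubles.
static void self_check() {
    const double v = from_bits(0x3ff0000000000001ull);  // 1 + 2^-52
    assert(integer_forms_agree(0x0123456789abcdefull));
    assert(low_by_sub(v) == std::ldexp(1.0, -52));      // exact result: 2^-52
    assert(low_by_andn(v) != low_by_sub(v));            // bit pattern 0x1 is not 2^-52
}
```

For the value 2.0 used in the testcase both forms happen to return 0, because the low 24 bits of its bit pattern are already zero. The self-check therefore uses a value with a nonzero low mantissa bit.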
