https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78655

            Bug ID: 78655
           Summary: gcc doesn't exploit the fact that the result of
                    pointer addition can not be nullptr
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Consider the following piece of code:

#include <memory>

struct blob
{
    void* data;
    size_t size;
};

void uninitialized_copy(blob* first, blob* last, blob* current)
{
    for (; first != last; ++first, (void) ++current) {
        ::new (static_cast<void*>(current)) blob(*first);
    }
}

The loop that GCC 7 generates for it is the following:

.L4:
        testq   %rdx, %rdx
        je      .L3
        movdqu  (%rdi), %xmm0
        movups  %xmm0, (%rdx)
.L3:
        addq    $16, %rdi
        addq    $16, %rdx
        cmpq    %rdi, %rsi
        jne     .L4

As you can see, on each iteration the generated code checks whether current is
nullptr and skips the copy construction if it is.
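
In source terms, the GCC loop above behaves roughly like the hand-written
version below. This is only a sketch for illustration: the name
copy_with_null_check is made up here, and the explicit test stands in for the
check that the generated code performs before constructing the object.

```cpp
#include <cstddef>
#include <new>

struct blob
{
    void* data;
    std::size_t size;
};

// Hypothetical name; mirrors the control flow of the GCC 7 loop (.L4/.L3).
void copy_with_null_check(blob* first, blob* last, blob* current)
{
    for (; first != last; ++first, (void) ++current) {
        // Corresponds to "testq %rdx, %rdx; je .L3" in the listing.
        if (current != nullptr) {
            ::new (static_cast<void*>(current)) blob(*first);
        }
    }
}
```

The Clang loop corresponds to the same function with the if statement removed.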

Clang 3.9 doesn't exhibit such behavior. It translates the loop into:

.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        movups  (%rdi), %xmm0
        movups  %xmm0, (%rdx)
        addq    $16, %rdi
        addq    $16, %rdx
        cmpq    %rdi, %rsi
        jne     .LBB0_1

This optimization is valid: if adding an integer to a pointer yields nullptr,
the integer was clearly outside the bounds of the allocated memory block, so
the addition itself has undefined behavior.
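
The same reasoning applies to the ++current step of the loop. Below is a
minimal sketch, using the hypothetical helper name next_slot: incrementing a
pointer has defined behavior only when it points to an array element (or to a
single object), and the result then points to the next element or one past the
end, neither of which is null.

```cpp
#include <cstddef>

struct blob
{
    void* data;
    std::size_t size;
};

// Hypothetical helper: returns the slot after `current`.
// In any call with defined behavior the result is non-null,
// so a check such as `next_slot(p) != nullptr` is always true.
blob* next_slot(blob* current)
{
    return current + 1;
}
```

A compiler that exploits this rule could drop a null check on the returned
pointer, which is what the report asks GCC to do.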

The absence of this optimization affects std::uninitialized_copy and any
function that uses it (for example std::vector<T>::push_back).

The issue can be reproduced with a simpler piece of code:

bool g(int* a)
{
    return (a + 10) != nullptr;
}

GCC 7:
g(int*):
        cmpq    $-40, %rdi
        setne   %al
        ret

Clang 3.9:
g(int*):                                 # @g(int*)
        movb    $1, %al
        retq

P.S. Another issue is that GCC uses mismatched instructions to read from and
write to memory: movdqu is for integer data, while movups is for
single-precision floating point. I don't know whether this causes stalls on
modern CPUs; I have heard that on older CPUs, writing a register with an
instruction of one type and reading it with an instruction of another might
cause stalls. Should I file a separate bug report for this?
