https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86619
--- Comment #4 from Michael Veksler <mickey.veksler at gmail dot com> ---
It would be interesting to check the impact on numerical C++ benchmarks.
Fortran has a conceptual restrict on all its parameter arrays,
since aliasing is not allowed. For example, C++ with restrict:
void f(int * __restrict__ v1, int * __restrict__ v2, int n)
{
    for (int i=0 ; i < n ; i++)
        v1[0] += v2[i];
}
and Fortran:
subroutine f(v1, v2, n)
    integer :: v1(100)
    integer :: v2(100)
    integer :: n
    DO i=1, n
        v1(1) = v1(1) + v2(i)
    END DO
end subroutine f
Generate the same loop:
.L3:
        addl    (%rdx), %eax
        addq    $4, %rdx
        cmpq    %rdx, %r8
        jne     .L3
But without restrict, as expected, g++ generates:
.L8:
        addl    (%rdx), %eax
        addq    $4, %rdx
        cmpq    %r8, %rdx
        movl    %eax, (%rcx)
        jne     .L8
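The extra movl in the second loop is there because v1 may point into v2, so every update has to reach memory before the next load. A minimal sketch of why the two strategies are not interchangeable under aliasing follows. The body of g is assumed (only its declaration appears in the benchmark), and f_register and the two helper functions are hypothetical names used only for illustration:

```cpp
#include <cassert>

// Assumed non-restrict counterpart of f(): same loop, no __restrict__.
void g(int *v1, int *v2, int n)
{
    for (int i = 0; i < n; i++)
        v1[0] += v2[i];   // store must be visible to later loads of v2[i]
}

// Hypothetical hand-written version of what the restrict loop does:
// accumulate in a local (register) and store once after the loop.
void f_register(int *v1, const int *v2, int n)
{
    int acc = v1[0];
    for (int i = 0; i < n; i++)
        acc += v2[i];
    v1[0] = acc;
}

// v1 aliases v2[1]; updates go through memory on every iteration.
int aliased_memory_result()
{
    int v2[3] = {1, 2, 3};
    g(&v2[1], v2, 3);
    return v2[1];
}

// Same aliasing, but the running sum never sees its own stores.
int aliased_register_result()
{
    int v2[3] = {1, 2, 3};
    f_register(&v2[1], v2, 3);
    return v2[1];
}
```

With v1 pointing at v2[1], the memory version yields 9 and the register version yields 8, so the compiler may only keep the sum in a register when it can prove (or is told) that the pointers do not alias.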
Calling both variants in a loop from a separate translation unit,
without whole-program optimization (g++ 7.2.0 with -O2 on 64-bit Cygwin):
#include <ctime>
#include <iostream>
void f(int * __restrict__ v1, int *__restrict__ v2, int SIZE);
void g(int * v1, int * v2, int SIZE);
constexpr int SIZE = 1'000'000;
int v2[SIZE];
int main()
{
    int v1 = 0;  // initialized so the warm-up call does not read an indeterminate value
    f(&v1, v2, SIZE); // Warm up cache
    auto start = std::clock();
    constexpr int TIMES = 10'000;
    for (int i=0 ; i < TIMES; ++i) {
        v1 = 0;
        f(&v1, v2, SIZE);
    }
    auto t1 = std::clock();
    for (int i=0 ; i < TIMES; ++i) {
        v1 = 0;
        g(&v1, v2, SIZE);
    }
    auto t2 = std::clock();
    std::cout << "with restrict: "
              << double(t1 - start) / CLOCKS_PER_SEC << " sec\n";
    std::cout << "without restrict: "
              << double(t2 - t1) / CLOCKS_PER_SEC << " sec\n";
}
And the results are:
with restrict: 4.477 sec
without restrict: 5.756 sec
That is roughly 29% more time without restrict (5.756 / 4.477 ≈ 1.29),
which clearly demonstrates the impact of good alias analysis.
With plain C pointers, this is an unavoidable price.
Unfortunately, the same thing also happens when passing pointers or
references to arrays of different sizes, or when deriving two
different types from std::array in order to mark the parameters
as non-aliasing.
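For reference, the two kinds of signatures meant here could look roughly like the sketch below. The type and function names (Accumulator, Samples, sum_arrays, sum_types and the demo helpers) are hypothetical and chosen only for illustration; the sketch shows the source-level pattern and says nothing about what code any particular GCC version emits for it:

```cpp
#include <array>

// Hypothetical tag types: two distinct classes derived from std::array.
struct Accumulator : std::array<int, 1> {};
struct Samples : std::array<int, 100> {};

// Variant 1: references to built-in arrays of different sizes.
void sum_arrays(int (&v1)[1], int (&v2)[100], int n)
{
    for (int i = 0; i < n; i++)
        v1[0] += v2[i];
}

// Variant 2: references to two distinct class types.
void sum_types(Accumulator &v1, Samples &v2, int n)
{
    for (int i = 0; i < n; i++)
        v1[0] += v2[i];
}

// Small drivers: fill the input with 1..100 and return the sum.
int demo_sum_arrays()
{
    int acc[1] = {0};
    int data[100];
    for (int i = 0; i < 100; i++)
        data[i] = i + 1;
    sum_arrays(acc, data, 100);
    return acc[0];
}

int demo_sum_types()
{
    Accumulator acc{};
    Samples data{};
    for (int i = 0; i < 100; i++)
        data[i] = i + 1;
    sum_types(acc, data, 100);
    return acc[0];
}
```

In both variants the parameter types differ, which is the information one would hope the alias analysis could use in place of an explicit __restrict__.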