https://bugs.llvm.org/show_bug.cgi?id=35542

            Bug ID: 35542
           Summary: [OpenMP] reduction on array section produces incorrect
                    result after r316362
           Product: new-bugs
           Version: unspecified
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: new bugs
          Assignee: unassignedb...@nondot.org
          Reporter: daniil.si...@intel.com
                CC: llvm-bugs@lists.llvm.org

test.c:
==============

int printf(const char *, ...);

int main() {
    int total[3] = { 0 };
    int a[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
#pragma omp parallel for reduction(+:total[1:2])
    for (int i = 0; i < 10; i++) {
        total[1 + i % 2] += a[i];
    }

    printf("%d %d %d\n", total[0], total[1], total[2]);
}

===============

Before (trunk 316361, correct result):

$ clang -v
clang version 6.0.0 (trunk 316361)
Target: x86_64-unknown-linux-gnu
Thread model: posix
...
$ clang -O0 -fopenmp test.c
$ ./a.out
0 25 30

---------

After (trunk 316362, incorrect result):

$ clang -v
clang version 6.0.0 (trunk 316362)
Target: x86_64-unknown-linux-gnu
Thread model: posix
...
$ clang -O0 -fopenmp test.c
$ ./a.out
0 30 0

============

------------------------------------------------------------------------
r316362 | hahnfeld | 2017-10-23 21:01:35 +0200 (Mon, 23 Oct 2017) | 28 lines

[OpenMP] Avoid VLAs for some reductions on array sections

In some cases the compiler can deduce the length of an array section
as a constant. With this information, VLAs can be replaced by
a constant-sized array, or even by a scalar value if the length is 1.
Example:
int a[4], b[2];
#pragma omp parallel reduction(+: a[1:2], b[1:1])
{ }

For chained array sections, this optimization is restricted to cases
where all array sections except the last have a constant length 1.
This trivially guarantees that there are no holes in the memory region
that needs to be privatized.
Example:
int c[3][4];
#pragma omp parallel reduction(+: c[1:1][1:2])
{ }

This relands commit r316229 that I reverted in r316235 because it
failed on some bots. During investigation I found that this was because
Clang and GCC evaluate the two arguments to emplace_back() in
ReductionCodeGen::emitSharedLValue() in a different order, hence
leading to a different order of generated instructions in the final
LLVM IR. Fix this by passing in the arguments from temporary variables
that are evaluated in a defined order.

Differential Revision: https://reviews.llvm.org/D39136
------------------------------------------------------------------------
