Issue 136357
Summary [flang][OpenMP] flaky firstprivate/lastprivate behavior due to misplaced barriers
Labels flang
Assignees
Reporter eugeneepshteyn
    (Thanks go to @ebaskakov for doing the actual investigation.)

Variables listed in `firstprivate`/`lastprivate` clauses are not always initialized correctly, which results in flaky behavior.

Consider the following test:
```
  implicit none
  integer :: i, n
  logical :: first
  first = .true.
  n = 42
  !$omp parallel do firstprivate(n) firstprivate(first) lastprivate(n)
    do i=1,10
      if (first) then
        if (n/=42) stop -1*n
        first = .false.
      end if
      n = 100
    end do
  !$omp end parallel do
  if (n/=100) stop -2*n
  print *,'passed'
end
```

Using the following flang on x86_64:
```
$ flang --version
flang version 21.0.0git (https://github.com/eugeneepshteyn/llvm-project.git 31ddaef8d18d643ff4c343d03ddfe2edae7d22a2)
Target: x86_64-unknown-linux-gnu
Thread model: posix
Build config: +unoptimized, +assertions
```
... results in the following behavior:
```
$ ./a.out 
Fortran STOP: code -100

Fortran STOP: code -100

Fortran STOP: code -100

$ ./a.out 
Fortran STOP: code -100

$ ./a.out 
Fortran STOP: code -100

$ ./a.out 
Fortran STOP: code -100

Fortran STOP: code -100

$ ./a.out 
 passed
```
The stop code -100 comes from `stop -1*n`, so it seems that the thread's private copy of `n` was already 100 at the first check, while it should still have had the value 42.

Looking at LLVM IR output of `flang -fopenmp -g -O0 -S -emit-llvm firstprivate.f90`:

The pointers to `first` and `n` are passed as a structure of two fields and are loaded on entry to the outlined function:
```
define internal void @_QQmain..omp_par(ptr noalias %tid.addr, ptr noalias %zero.addr, ptr %0) #1 !dbg !26 {
omp.par.entry:
  %gep_ = getelementptr { ptr, ptr }, ptr %0, i32 0, i32 0
  %loadgep_ = load ptr, ptr %gep_, align 8, !align !29
  %gep_1 = getelementptr { ptr, ptr }, ptr %0, i32 0, i32 1
  %loadgep_2 = load ptr, ptr %gep_1, align 8, !align !29
...
```
Barrier:
```
omp.par.region1: ; preds = %omp.par.region
  %omp_global_thread_num2 = call i32 @__kmpc_global_thread_num(ptr @4)
  call void @__kmpc_barrier(ptr @3, i32 %omp_global_thread_num2)
  br label %omp.private.init, !dbg !30

omp.private.init: ; preds = %omp.par.region1
  br label %omp.private.copy, !dbg !30
```
... but then the values for `first` and `n` are loaded via `loadgep_*` pointers:
```
omp.private.copy: ; preds = %omp.private.init
  %2 = load i32, ptr %loadgep_, align 4, !dbg !31
  store i32 %2, ptr %omp.private.alloc, align 4, !dbg !31
  %3 = load i32, ptr %loadgep_2, align 4, !dbg !32
  store i32 %3, ptr %omp.private.alloc4, align 4, !dbg !32
  br label %omp.wsloop.region, !dbg !32
```
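Since these loads happen after the barrier, a thread that has already executed the last iteration may have written 100 back to the original `n` (the `lastprivate` update) before another thread performs its `firstprivate` copy.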

It seems that the loads of the values through the `loadgep_*` pointers (the `firstprivate` copies) should happen before the barrier, not after it.
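
For illustration, here is a hand-expanded sketch of the ordering that the analysis suggests: copy-in first, then the barrier, then the loop with the write-back on the last iteration. This is only an assumption about the intended semantics written out by hand, not flang's actual lowering; `n_priv` and `first_priv` are hypothetical names standing in for the compiler-generated private copies.

```fortran
program firstprivate_order_sketch
  ! Hand-written sketch, NOT compiler output: n_priv and first_priv are
  ! hypothetical stand-ins for the privatized copies of n and first.
  implicit none
  integer :: i, n, n_priv
  logical :: first, first_priv
  first = .true.
  n = 42
  !$omp parallel private(i, n_priv, first_priv) shared(n, first)
    ! 1. firstprivate copy-in: every thread reads the original variables.
    n_priv = n
    first_priv = first
    ! 2. Barrier: no thread may reach the write-back below until all
    !    threads have finished their copy-in.
    !$omp barrier
    !$omp do
    do i = 1, 10
      if (first_priv) then
        if (n_priv /= 42) stop 1
        first_priv = .false.
      end if
      n_priv = 100
      ! 3. lastprivate write-back: only the sequentially last iteration
      !    updates the original n.
      if (i == 10) n = n_priv
    end do
    !$omp end do
  !$omp end parallel
  if (n /= 100) stop 2
  print *, 'passed'
end program firstprivate_order_sketch
```

With the copy-in placed before the barrier, the write to `n` in step 3 cannot overlap with any thread's read in step 1, so this version should print `passed` on every run when built with `flang -fopenmp`.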