estewart08 added a comment.

In D99432#2726060 <https://reviews.llvm.org/D99432#2726060>, @ABataev wrote:
> In D99432#2726050 <https://reviews.llvm.org/D99432#2726050>, @estewart08 wrote:
>
>> In D99432#2726025 <https://reviews.llvm.org/D99432#2726025>, @ABataev wrote:
>>
>>> In D99432#2726019 <https://reviews.llvm.org/D99432#2726019>, @estewart08 wrote:
>>>
>>>> In reference to https://bugs.llvm.org/show_bug.cgi?id=48851, I do not see how this helps SPMD mode with team privatization of declarations in-between target teams and parallel regions.
>>>
>>> Did you try the reproducer with the applied patch?
>>
>> Yes, I still saw the test fail, although it was not with the latest llvm-project. Are you saying the reproducer passes for you?
>
> I don't have CUDA installed, but from what I see in the LLVM IR it should pass. Do you have a debug log? Does it crash or produce incorrect results?

This is on an AMDGPU, but I assume the behavior would be similar for NVPTX.

It produces incorrect/incomplete results in the dist[0] index after a manual reduction, and in turn the final global gpu_results array is incorrect. When thread 0 does the reduction into dist[0], it has no knowledge of dist[1] having been updated by thread 1, which tells me the array is still thread-private.

Adding some printfs and looking at one team's output:

SPMD mode:

  Thread 0: dist[0]: 1
  Thread 0: dist[1]: 0             // This should be 1
  After reduction into dist[0]: 1  // This should be 2
  gpu_results = [1,1]              // [2,2] expected

Generic mode:

  Thread 0: dist[0]: 1
  Thread 0: dist[1]: 1
  After reduction into dist[0]: 2
  gpu_results = [2,2]
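For reference, a minimal sketch of the kind of pattern I believe the PR48851 reproducer exercises. This is only an approximation, not the actual test: the names dist and gpu_results, the team/thread counts, and the compile line are illustrative.

  // Rough reproducer sketch (illustrative only; see PR48851 for the real test).
  // Build with offloading enabled, e.g.:
  //   clang -fopenmp -fopenmp-targets=<offload-triple> repro.c
  #include <omp.h>
  #include <stdio.h>

  #define N 2

  int main(void) {
    int gpu_results[N] = {0, 0};

    #pragma omp target teams num_teams(N) map(tofrom : gpu_results)
    {
      // 'dist' is declared between the 'target teams' and 'parallel' regions,
      // so each team should get one shared copy of it, not one copy per thread.
      int dist[N] = {0, 0};

      // Each thread of the team increments its own element of 'dist'.
      #pragma omp parallel for num_threads(N)
      for (int i = 0; i < N; ++i)
        dist[i] += 1;

      // Manual reduction by the team's initial thread; this only sees the
      // other threads' updates if 'dist' really is shared within the team.
      for (int i = 1; i < N; ++i)
        dist[0] += dist[i];

      gpu_results[omp_get_team_num()] = dist[0]; // expect 2 in every slot
    }

    // Expected output: gpu_results = [2,2]; the SPMD failure mode gives [1,1].
    printf("gpu_results = [%d,%d]\n", gpu_results[0], gpu_results[1]);
    return 0;
  }

If 'dist' is privatized per thread instead of per team, the reduction loop only ever sees thread 0's element, which matches the [1,1] result above.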