xiangzh1 wrote:

> AMDGPU cannot unroll this case:
>
> https://godbolt.org/z/4Pq3bnzTT
>
> But the same code on X86 looks like it can be unrolled:
>
> https://godbolt.org/z/zr8aTG1KW
>
> We may need to continue debugging it.

X86 unrolls very conservatively too: its upper bound is set to 4 (the default is 8). If we do not fold the loop branch, it can fully unroll (16).

https://github.com/llvm/llvm-project/pull/74268
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits