arpith-jacob marked 2 inline comments as done. arpith-jacob added a comment.
In https://reviews.llvm.org/D29506#669542, @ABataev wrote: > The patch is too big and quite hard to review? Could you split it into > several smaller parts? Alexey, thank you for your time. I have addressed your comments and split the patch into a smaller one, which I will post shortly. The new patch is to implement parallel reductions on the GPU. This is the smallest patch I can get such that the codegen is correct and fully functional (runs correctly on the GPU). Thanks, Arpith ================ Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:4280-4282 + // We don't need debug information in this function as nothing here refers to + // user source code. + CGF.disableDebugInfo(); ---------------- ABataev wrote: > It is not quite so, at least we have a reference in a list of reductions in > `reduction` clause, which may be considered a debug position Ok, removed. ================ Comment at: lib/CodeGen/CGOpenMPRuntime.h:524 + + static bool classof(const CGOpenMPRuntime *RT) { + return RT->getKind() == RK_HOST; ---------------- ABataev wrote: > arpith-jacob wrote: > > This is required to cast to the NVPTX runtime in a static function as > > follows; > > > > CGOpenMPRuntimeNVPTX &RT = > > cast<CGOpenMPRuntimeNVPTX>(CGM.getOpenMPRuntime()); > Are you going to make calls to `isa()`, `dyn_cast()` functions? If not, just > use `static_cast<>()` instead. Yes, I will use a static_cast and remove the OpenMPRuntimeKind functionality. ================ Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:972-973 + auto CastTy = Size <= 4 ? CGM.Int32Ty : CGM.Int64Ty; + auto *ElemCast = Bld.CreateSExtOrBitCast(Elem, CastTy); + auto *WarpSize = Bld.CreateTruncOrBitCast(getNVPTXWarpSize(CGF), CGM.Int16Ty); + ---------------- ABataev wrote: > I'd prefer you to use `CGF.EmitScalarConversion()` rather than Builder casts. Ok. I have used CGF.EmitScalarConversion() when I can, for example for WarpSize. In other cases I want simple bitcasts and truncs so I have used the Builder. ================ Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:975-978 + llvm::SmallVector<llvm::Value *, 3> Args; + Args.push_back(ElemCast); + Args.push_back(Offset); + Args.push_back(WarpSize); ---------------- ABataev wrote: > Do you really need a SmallVector<> or you can use just an array here? Used an array here and other places. ================ Comment at: lib/CodeGen/CGStmtOpenMP.cpp:2249 + ? OMPD_parallel_for_simd + : OMPD_parallel); // Emit post-update of the reduction variables if IsLastIter != 0. ---------------- ABataev wrote: > OMPD_parallel or OMPD_parallel_for? OMPD_parallel is fine. I just need to know that it is a 'parallel' type reduction. ================ Comment at: lib/CodeGen/CGStmtOpenMP.cpp:2424 CGF.OMPCancelStack.emitExit(CGF, S.getDirectiveKind(), CodeGen); - CGF.EmitOMPReductionClauseFinal(S); + CGF.EmitOMPReductionClauseFinal(S, OMPD_parallel); // Emit post-update of the reduction variables if IsLastIter != 0. ---------------- ABataev wrote: > OMPD_parallel_for? This is the reduction codegen for the 'sections' directive. So a reduction type of 'OMPD_parallel' works well. https://reviews.llvm.org/D29506 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits