ABataev added inline comments.

================
Comment at: lib/CodeGen/CGOpenMPRuntime.h:956-962
   virtual void emitReduction(CodeGenFunction &CGF, SourceLocation Loc,
                              ArrayRef<const Expr *> Privates,
                              ArrayRef<const Expr *> LHSExprs,
                              ArrayRef<const Expr *> RHSExprs,
                              ArrayRef<const Expr *> ReductionOps,
-                             bool WithNowait, bool SimpleReduction);
+                             bool WithNowait, bool SimpleReduction,
+                             OpenMPDirectiveKind ReductionKind);
----------------
Number of parameters is getting too big, maybe it is better to aggregate them into a struct/class?

================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:118-133
+// GPU Configuration: This information can be derived from cuda registers,
+// however, providing compile time constants helps generate more efficient
+// code. For all practical purposes this is fine because the configuration
+// is the same for all known NVPTX architectures.
+enum MachineConfiguration : unsigned {
+  WarpSize = 32,
+  // Number of bits required to represent a lane identifier, which is
----------------
It's better to use `///` style of comments here

================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:653-675
+    /// Build int32_t __kmpc_shuffle_int32(int32_t element,
+    /// int16_t lane_offset, int16_t warp_size);
+    llvm::Type *TypeParams[] = {CGM.Int32Ty, CGM.Int16Ty, CGM.Int16Ty};
+    llvm::FunctionType *FnTy =
+        llvm::FunctionType::get(CGM.Int32Ty, TypeParams, /*isVarArg*/ false);
+    RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_shuffle_int32");
+    break;
----------------
Use `//` instead of `///`

================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:963-965
+enum CopyAction : unsigned {
+  RemoteLaneToThread,
+  ThreadCopy,
----------------
Comments here?

================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:969-974
+// Emit instructions to copy a Reduce list, which contains partially
+// aggregated values, in the specified direction.
+//
+// RemoteLaneToThread: Copy over a Reduce list from a remote lane in
+// the warp using shuffle instructions.
+// ThreadCopy: Make a copy of a Reduce list on the thread's stack.
----------------
Use `///`

================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:1272
+
+// Emit a helper that reduces data across two OpenMP threads (lanes)
+// in the same warp. It uses shuffle instructions to copy over data from
----------------
`///` style here

================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:1488
+
+//
+// Design of OpenMP reductions on the GPU
----------------
`///` here

================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.h:245
-public:
+  /// \brief Emit a code for reduction clause.
+  ///
----------------
No \brief

================
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.h:263
+
+  /// \brief Returns specified OpenMP runtime function for the current OpenMP
+  /// implementation. Specialized for the NVPTX device.
----------------
No \brief


https://reviews.llvm.org/D29758