yaxunl added inline comments.
================
Comment at: clang/lib/CodeGen/TargetInfo.cpp:9552
     F->addFnAttr("uniform-work-group-size", "true");

+  if (IsOpenMPkernel)
+    F->addFnAttr("amdgpu-flat-work-group-size",
----------------
jhuber6 wrote:
> arsenm wrote:
> > jhuber6 wrote:
> > > arsenm wrote:
> > > > Probably shouldn’t check the language, just it’s a kernel. Also
> > > > shouldn’t emit this if it’s the default 1024. I’ve been trying to cut
> > > > down on the superfluous attribute spam
> > > There's a section for HIP above that does the same. We could probably
> > > consolidate here for all "AMDGPU" kernels and get rid of the redundant
> > > attribute. Maybe in a separate patch?
> > All the isCUDA || HIP || OpenMP checks scattered around are driving me
> > crazy. A bunch of the out of tree divergent patches are just adding to
> > them. We should just purge everything checking languages to the actual
> > features and stop putting language names in things
> OpenCL is the odd one out as far as I know, HIP and OpenMP are mostly
> equivalent as far as attributes go.
OpenCL uses 256 as the default max block size. This is to avoid performance regressions for existing apps. HIP uses 1024 by default.

================
Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:61
+      DriverArgs.getLastArgValue(options::OPT_gpu_max_threads_per_block_EQ,
+                                 "1024")));
+
----------------
We should keep the default value in Options.td instead of having multiple copies in different places.

Same as below.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142393/new/

https://reviews.llvm.org/D142393

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits