[PATCH] D141700: AMDGPU: Move enqueued block handling into clang

2023-11-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 558095. arsenm added a comment. Drop bitcode auto upgrade handling CHANGES SINCE LAST ACTION https://reviews.llvm.org/D141700/new/ https://reviews.llvm.org/D141700 Files: clang/lib/CodeGen/Targets/AMDGPU.cpp clang/test/CodeGenOpenCL/amdgpu-enqueue-ker

[PATCH] D141700: AMDGPU: Move enqueued block handling into clang

2023-11-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/IR/CMakeLists.txt:84 Demangle + TransformUtils + This introduces a circular dependency between LLVMCore and TransformUtils. Options are: 1. Move appendToUsed into Module 2. Don't bother with bitcode compati

[PATCH] D138507: HIP: Directly use sqrt builtins instead of calling ocml (f32 case)

2023-09-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm abandoned this revision. arsenm added a comment. reposted D158131 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138507/new/ https://reviews.llvm.org/D138507 ___ cfe-commits mailing list cfe-commits@li

[PATCH] D158131: HIP: Directly use f32 sqrt intrinsic

2023-09-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. bca125569f33bd6a27c4c54815697966a823254e CHANGES SINCE LAST ACTION https://reviews.llvm.org/D158131/new/ https://reviews.llvm.org/D158131 __

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-09-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 15e0fe0b6122e32657b98daf74a1fce028d2e5bf CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156743/new/ https://reviews.llvm.org/D156743 __

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-09-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D156743#4644285 , @Anastasia wrote: > If we think there are no better alternatives and implementation is generic > enough for every vendor, LGTM! You could argue annotating the raw callsite is better but I don't know how to i

[PATCH] D156989: FloatingPointMode: Use -1 for "Dynamic"

2023-09-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping. This enum should just match FLT_ROUNDS and designing ABI around whatever this was doing doesn't really make sense CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156989/new/ https://reviews.llvm.org/D156989 ___ cf

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-09-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156743/new/ https://reviews.llvm.org/D156743 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157911: clang: Add __builtin_exp10* and use new llvm.exp10 intrinsic

2023-09-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 6a08cf12d9cbc960159bf40e47078a882ca510ce CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157911/new/ https://reviews.llvm.org/D157911 __

[PATCH] D157911: clang: Add __builtin_exp10* and use new llvm.exp10 intrinsic

2023-09-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 556343. arsenm added a comment. Release notes CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157911/new/ https://reviews.llvm.org/D157911 Files: clang/docs/ReleaseNotes.rst clang/include/clang/Basic/Builtins.def clang/lib/CodeGen/CGBuiltin.cpp

[PATCH] D159206: [Clang] Propagate target-features if compatible when using mlink-builtin-bitcode

2023-09-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGCall.cpp:2030-2031 + bool EnabledForTarget = TEntry->second; + if (EnabledForTarget != EnabledForFunc) +return; +} jmmartinez wrote: > arsenm wrote: > > Early return breaks the ot

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-09-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156743/new/ https://reviews.llvm.org/D156743 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D159206: [Clang] Propagate target-features if compatible when using mlink-builtin-bitcode

2023-08-31 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGCall.cpp:2017 +for (StringRef Feature : llvm::split(FFeatures.getValueAsString(), ',')) { + bool EnabledForFunc = Feature[0] == '+'; + StringRef Name = Feature.substr(1); Do you need to g

[PATCH] D159206: [Clang] Propagate target-features if compatible when using mlink-builtin-bitcode

2023-08-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGCall.h:398-401 +/// If \p F "target-features" are incompatible with the \p TargetOpts features, +/// it is correct to drop the function. \return true if \p F is dropped +bool dropFunctionWithIncompatibleAttributes(llvm

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-08-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping The alternative is to directly put the !fpmath on the sqrt call sites but I have no idea how to do that CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156743/new/ https://reviews.llvm.org/D156743 ___ cfe-commits

[PATCH] D159206: [Clang] Propagate target-features if compatible when using mlink-builtin-bitcode

2023-08-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGCall.cpp:2035 + + FuncAttr.addAttribute("target-features", llvm::join(MergedFeatures, ",")); +} do you need to guard against adding the empty attribute? I don't want to see "target-features"="" Rep

[PATCH] D159206: [Clang] Propagate target-features if compatible when using mlink-builtin-bitcode

2023-08-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGCall.cpp:2034 + } + + FuncAttr.addAttribute("target-features", llvm::join(MergedFeatures, ",")); Really it would be less bad if the incompatible functions were not imported rather than the backend p

[PATCH] D158695: [clang] Fix missing contract flag in sqrt intrinsic

2023-08-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGen/fp-contract-fast-pragma.cpp:77 +// CHECK: _Z13fp_contract_7f +// CHECK: tail call contract float @llvm.sqrt.f32(float %a) + return __builtin_sqrtf(a); This isn't demonstrating the strict support, proba

[PATCH] D158695: [clang] Fix missing contract flag in sqrt intrinsic

2023-08-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:501 if (CGF.Builder.getIsFPConstrained()) { CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, E); Function *F = CGF.CGM.getIntrinsic(ConstrainedIntrinsicID, Src0->getType()); r

[PATCH] D158695: [clang] Fix missing contract flag in sqrt intrinsic

2023-08-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGen/fp-contract-fast-pragma.cpp:11 #pragma clang fp contract(fast) - return a * b + c; + return a * b + c + __builtin_sqrtf(a); } Should leave the existing test function alone and add a new one. Also ca

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Codegen parts LGTM, questions with the driver parts Comment at: clang/lib/Driver/ToolChain.cpp:1368 if (A->getOption().matches(options::OPT_m_Group)) { - if (SameTripleAsHost) + // Pass code objection version to device toolchain + //

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17067 + +Value *Iscov5 = CGF.Builder.CreateICmpSGE( +ABIVersion, Capitalization is weird, IsCOV5? Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17082-17083 +

[PATCH] D158367: [AMDGPU] Add target feature gds/gws to clang

2023-08-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/TargetParser/TargetParser.cpp:289 Features["image-insts"] = true; + Features["gds"] = true; + Features["gws"] = true; yaxunl wrote: > arsenm wrote: > > Gds feature is unused > I am thinking to k

[PATCH] D158367: [AMDGPU] Add target feature gds/gws to clang

2023-08-20 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/TargetParser/TargetParser.cpp:289 Features["image-insts"] = true; + Features["gds"] = true; + Features["gws"] = true; Gds feature is unused CHANGES SINCE LAST ACTION https://reviews.llvm.org

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/DesugarVariadics.cpp:145 +for (Function &F : llvm::make_early_inc_range(M)) + if (Apply || canTransformFunctionInIsolation(F)) +Changed |= runOnFunction(F); I think you need to guard agai

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/DesugarVariadics.cpp:296 +// Note - same attribute handling as DeadArgumentElimination +NF->copyAttributesFrom(&F); +NF->setComdat(F.getComdat()); This might be missing copying the linkage R

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/DesugarVariadics.cpp:74-77 +Value *Mask = ConstantInt::get(IntPtrTy, ~(DataAlignMinusOne)); +Value *vaListAligned = Builder.CreateIntToPtr( +Builder.CreateAnd(Builder.CreatePtrToInt(Incr, IntPtrTy), Mask),

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/DesugarVariadics.cpp:297 +NF->copyAttributesFrom(&F); +NF->setComdat(F.getComdat()); +F.getParent()->getFunctionList().insert(F.getIterator(), NF); Test the comdat? Weird that copyAttributesFr

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/DesugarVariadics.cpp:208-209 + +StructType *VarargsTy = StructType::create( +Ctx, LocalVarTypes, (Twine(NF->getName()) + ".vararg").str()); + Should we go for a packed struct forced to align 4

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/DesugarVariadics.cpp:22 +// 5/ Delete the remaining parts of the original functions +// +//===--===// Can you expand on the ABI requirem

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: libc/config/gpu/entrypoints.txt:84-85 # stdio.h entrypoints +libc.src.stdio.snprintf +libc.src.stdio.vsnprintf libc.src.stdio.puts Split of the libc stuff into a separate patch, the lowering pass should

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenCUDA/amdgpu-code-object-version-linking.cu:40-43 +__device__ void bar(int *out) +{ + *out = __builtin_amdgcn_workgroup_size_x(); +} test all the builtins? Repository: rG LLVM Github Monorepo CHANGE

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17057 + Constant *Offset, *OffsetOld; + Value *DP, *DP1; + Spell out to DispatchPtr? Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1206-1208 + getTargetCodeGenInfo()

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/test/CodeGen/Generic/expand-variadic-intrinsics.ll:76 +} + + arsenm wrote: > arsenm wrote: > > Needs some indirect variadic call tests > Also some metadata and signext/zeroext preservation tests Also a case where the

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/test/CodeGen/Generic/expand-variadic-intrinsics.ll:76 +} + + arsenm wrote: > Needs some indirect variadic call tests Also some metadata and signext/zeroext preservation tests Repository: rG LLVM Github Monorepo

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/ExpandVAIntrinsics.cpp:38 + +#include + Don't need Comment at: llvm/lib/CodeGen/ExpandVAIntrinsics.cpp:44-47 +static cl::opt +ApplyToAllOverride(DEBUG_TYPE "-all", cl::init(false),

[PATCH] D145648: [clang][Driver] recognize `-ffp-contract=fast-honor-pragmas`

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. Herald added a subscriber: wdng. LGTM, not recognizing this in the driver is incomplete CHANGES SINCE LAST ACTION https://reviews.llvm.org/D145648/new/ https://reviews.llvm.org/D145648 ___

[PATCH] D76283: [IRBuilder] Use preferred target type for len argument of memory intrinsic functions

2023-08-17 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/include/llvm/IR/IRBuilder.h:438-446 + ConstantInt *getIntPtrSize(Value *Ptr, uint64_t Size) { +assert(BB && "Must have a basic block to retrieve the module!"); + +Module *M = BB->getParent()->getParent(); +auto *PtrType

[PATCH] D76283: [IRBuilder] Use preferred target type for len argument of memory intrinsic functions

2023-08-17 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested changes to this revision. arsenm added a comment. This revision now requires changes to proceed. Herald added a project: All. I think any size type should be valid for the intrinsic. Legalization should have to cast the type to the target libcall if that's how it chooses to impl

[PATCH] D157917: clang/HIP: Use abs builtins instead of implementing them

2023-08-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 43f314f5e6cebe02ff63d5197c8e5c25204b20d2 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157917/new/ https://reviews.llvm.org/D157917 __

[PATCH] D157738: [OpenMP] Emit offloading entries for indirect target variables

2023-08-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGOpenMPRuntime.cpp:1996-1997 +llvm::GlobalValue *GV) { + std::optional ActiveAttr = + OMPDeclareTargetDeclAttr::getActiveAttr(FD); + not a huge fan

[PATCH] D157917: clang/HIP: Use abs builtins instead of implementing them

2023-08-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, AlexVlx, JonChesterfield, jhuber6, doru1004. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. InstCombine already put these back together so there's no visible change in the -O1 tes

[PATCH] D157750: Properly handle -fsplit-machine-functions for fatbinary compilation

2023-08-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/TargetPassConfig.cpp:1281-1282 +else + WithColor::warning() + << "-fsplit-machine-functions is only valid for X86.\n"; } shenhan wrote: > arsenm wrote: > > You cannot spam warnings he

[PATCH] D157911: clang: Add __builtin_exp10* and use new llvm.exp10 intrinsic

2023-08-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: jcranmer-intel, kpn, sepavloff, andrew.w.kaylor, foad, bob80905. Herald added a subscriber: StephenFan. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. https://reviews.llvm.org/D157911 F

[PATCH] D156737: clang: Add __builtin_elementwise_sqrt

2023-08-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 9e3d9c9eae03910d93e2312e1e0845433c779998 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156737/new/ https://reviews.llvm.org/D156737 __

[PATCH] D157750: Properly handle -fsplit-machine-functions for fatbinary compilation

2023-08-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/TargetPassConfig.cpp:1281-1282 +else + WithColor::warning() + << "-fsplit-machine-functions is only valid for X86.\n"; } You cannot spam warnings here. The other instance of printing

[PATCH] D156737: clang: Add __builtin_elementwise_sqrt

2023-08-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 549490. arsenm added a comment. Release note CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156737/new/ https://reviews.llvm.org/D156737 Files: clang/docs/LanguageExtensions.rst clang/docs/ReleaseNotes.rst clang/include/clang/Basic/Builtins.def

[PATCH] D155773: [llvm][MemoryBuiltins] Add alloca support to getInitialValueOfAllocation

2023-08-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:809-811 + Updater.AddAvailableValue( + Alloca.getParent(), + getInitialValueOfAllocation(&Alloca, nullptr, VectorTy)); This is very specifically handling alloca, n

[PATCH] D156737: clang: Add __builtin_elementwise_sqrt

2023-08-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156737/new/ https://reviews.llvm.org/D156737 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157452: [RFC][Clang][Codegen] `std::type_info` needs special care with explicit address spaces

2023-08-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGCall.cpp:5237-5238 + + if (VTy->isPointerTy() && + VTy->getPointerAddressSpace() != IRTy->getPointerAddressSpace()) { +// In the case of targets that use a non-default address space f

[PATCH] D156816: [Clang] Make generic aliases to OpenCL address spaces

2023-08-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested changes to this revision. arsenm added a comment. This revision now requires changes to proceed. Probably should just wrap uses in macros for now Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156816/new/ https://reviews.llvm.org/D1

[PATCH] D157438: [OpenMP] Ensure wrapper headers are included on both host and device

2023-08-08 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1190-1191 // the resource directory at clang/lib/Headers/llvm_libc_wrappers. -if (C.getActiveOffloadKinds() == Action::OFK_None) { +if ((getToolChain().getTriple().isNVPTX() || +

[PATCH] D155850: [Clang][CodeGen][RFC] Add codegen support for C++ Parallel Algorithm Offload

2023-08-08 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/BackendUtil.cpp:1101-1102 +MPM.addPass(StdParAcceleratorCodeSelectionPass()); +} +else if (LangOpts.HIPStdParInterposeAlloc) { + MPM.addPass(StdParAllocationInterpositionPass()); For

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-08-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Headers/opencl-c-base.h:832 + +inline float __ovld __cnfn sqrt(float __x) { + return __builtin_elementwise_sqrt(__x); Anastasia wrote: > arsenm wrote: > > svenvh wrote: > > > Anastasia wrote: > > > > Is this a

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17143-17145 + llvm::LoadInst *LD; + Constant *Offset, *Offset1; + Value *DP, *DP1; Move down to define and initialize Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17163

[PATCH] D156989: FloatingPointMode: Use -1 for "Dynamic"

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D156989#4558486 , @sepavloff wrote: > Support of rounding mode in C standard is based on IEEE-754 model, where > rounding mode is a global state and affects all FP operations. The case of > two rounding modes does not fit this

[PATCH] D156928: [Clang][AMDGPU] Fix handling of -mcode-object-version=none arg

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D156928#4561811 , @JonChesterfield wrote: > What does code objects version= none mean? Handle any version Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156928/new/ https://revi

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D139730#4561619 , @arsenm wrote: > In D139730#4561575 , @jhuber6 wrote: > >> In D139730#4561573 , @arsenm wrote: >> >>> In D139730#4561540

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D139730#4561575 , @jhuber6 wrote: > In D139730#4561573 , @arsenm wrote: > >> In D139730#4561540 , @jhuber6 >> wrote: >> >>> Could you explain b

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D139730#4561540 , @jhuber6 wrote: > Could you explain briefly what the approach here is? I'm confused as to > what's actually changed and how we're handling this difference. I thought if > this was just the definition of some

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17146 +Value *ABIVersion; +if (ABIVersionC) { + ABIVersion = CGF.Builder.CreateAlignedLoad(CGF.Int32Ty, ABIVersionC, this must always pass Comment at: openmp

[PATCH] D156989: FloatingPointMode: Use -1 for "Dynamic"

2023-08-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D156989#4558133 , @sepavloff wrote: > Rounding mode is presented in FPOptions with 3 bits, so there is only 8 > values available for particular modes. 5 of them, which are specified in > IEEE-754, are listed in `RoundingMode`.

[PATCH] D156989: FloatingPointMode: Use -1 for "Dynamic"

2023-08-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 546815. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156989/new/ https://reviews.llvm.org/D156989 Files: clang/include/clang/Basic/FPOptions.def clang/include/clang/Basic/LangOptions.h clang/lib/AST/JSONNodeDumper.cpp clang/lib/AST/TextNodeDu

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-08-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Headers/opencl-c-base.h:832 + +inline float __ovld __cnfn sqrt(float __x) { + return __builtin_elementwise_sqrt(__x); svenvh wrote: > Anastasia wrote: > > Is this a generic implementation enough? Would some tar

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-08-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 546814. arsenm marked an inline comment as done. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156743/new/ https://reviews.llvm.org/D156743 Files: clang/lib/Headers/opencl-c-base.h clang/lib/Headers/opencl-c.h clang/lib/Sema/OpenCLBuiltins.td

[PATCH] D156989: FloatingPointMode: Use -1 for "Dynamic"

2023-08-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: sepavloff, rjmccall, kpn, cameron.mcinally, uweigand, scanon, jcranmer-intel, foad. Herald added subscribers: StephenFan, tpr. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. Herald added

[PATCH] D156928: [Clang][AMDGPU] Fix handling of -mcode-object-version=none arg

2023-08-02 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. missing tests Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1066 +if (!IsCC1As) { + std::string CodeObjVerStr = (CodeObjVer ? Twine(CodeObjVer) : "none").str(); CmdArgs.insert(CmdArgs.begin() + 1, don't need to go th

[PATCH] D156816: [Clang] Make generic aliases to OpenCL address spaces

2023-08-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D156816#4551409 , @Anastasia wrote: > Why not to just use target address space and define it to some macro with > desirable spelling? If you mean the numbered address spaces, that's the broken thing this is specifically tryin

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-08-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 546164. arsenm retitled this revision from "[wip] clang/OpenCL: Add inline implementations of sqrt in builtin header" to "clang/OpenCL: Add inline implementations of sqrt in builtin header". arsenm edited the summary of this revision. arsenm added a comment.

[PATCH] D156816: [Clang] Make generic aliases to OpenCL address spaces

2023-08-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. I don't really see the point of doing this. These introduce ambiguous terminology. The reason you need the attributes is basically for FFI to opencl code, so might as well make the specific meaning clearer with the opencl bit Repository: rG LLVM Github Monorepo CHANG

[PATCH] D156737: clang: Add __builtin_elementwise_sqrt

2023-07-31 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:2548 +case Builtin::BI__builtin_sqrtf128: +case Builtin::BI__builtin_elementwise_sqrt: { llvm::Value *Call = emitUnaryMaybeConstrainedFPBuiltin( bob80905 wrote: > Nit: I thin

[PATCH] D156743: [wip] clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-07-31 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, svenvh, Anastasia. Herald added a subscriber: Naghasan. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. We want the !fpmath metadata to be attached to the sqrt intrinsic to make it

[PATCH] D156737: clang: Add __builtin_elementwise_sqrt

2023-07-31 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, fhahn, bob80905. Herald added subscribers: StephenFan, Anastasia. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. This will be used in the opencl builtin headers to provide direct

[PATCH] D156539: [Clang][CodeGen] `__builtin_alloca`s should care about address spaces too

2023-07-31 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGen/dynamic-alloca-with-address-space.c:1 +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -emit-llvm %s -o - | FileCheck %s + AlexVlx wrote: > arsenm wrote: > > Can you add an opencl 1.2 and 2.0 run line too

[PATCH] D156539: [Clang][CodeGen] `__builtin_alloca`s should care about address spaces too

2023-07-31 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGen/dynamic-alloca-with-address-space.c:1 +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -emit-llvm %s -o - | FileCheck %s + Can you add an opencl 1.2 and 2.0 run line too Comment at: clan

[PATCH] D86154: AMDGPU: Add llvm.amdgcn.{read,readfirst,write}lane2 intrinsics with type overloads

2023-07-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested changes to this revision. arsenm added a comment. This revision now requires changes to proceed. Herald added subscribers: nlopes, StephenFan. Herald added a project: All. Should be obsoleted by D147732 Repository: rG LLVM Github Monorepo CH

[PATCH] D156539: [Clang][CodeGen] `__builtin_alloca`s should care about address spaces too

2023-07-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:3540 + return RValue::get(Builder.CreateAddrSpaceCast(AI, CGM.Int8PtrTy)); +else + return RValue::get(AI); No return after else Comment at: clang/lib/CodeGe

[PATCH] D156539: [Clang][CodeGen] `__builtin_alloca`s should care about address spaces too

2023-07-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D156539#4542836 , @rjmccall wrote: > We should probably write this code to work properly in case we add a target > that makes `__builtin_alloca` return a pointer in the private address space. > Could you recover the target AS

[PATCH] D156357: clang: Add elementwise bitreverse builtin

2023-07-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/docs/LanguageExtensions.rst:634 the most negative integer remains the most negative integer - T __builtin_elementwise_fma(T x, T y, T z) fused multiply add, (x * y) + z.

[PATCH] D156366: HIP: Use __builtin_sqrt instead of routing through ocml sqrt for f64

2023-07-26 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 63dbe7e808d07bdf25bad85301980bc323b0cd64 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156366/new/ https://reviews.llvm.org/D156366 ___ cfe-commits mailing list cfe-commits@lists.llvm.org h

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2023-07-26 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:9461-9463 + bool CorrectSqrt = CGM.getLangOpts().OpenCL + ? CGM.getCodeGenOpts().OpenCLCorrectlyRoundedDivSqrt + : CGM.getCodeGenOpts().HIPCorrectlyRoundedD

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2023-07-26 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. We should just do this now. clang shouldn't have to dig around on disk to emit a constant definition for a constant it already knows, and we have a clear path to removing these globals altogether. I have adequate patches to completely delete `__oclc_daz_opt` today. `__oc

[PATCH] D156040: [AMDGPU] Add dynamic stack bit info to kernel-resource-usage Rpass output

2023-07-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D156040#4526036 , @JonChesterfield wrote: > What's the use case I'm missing which makes this flag necessary/beneficial? The metadata is also irrelevant to this patch which is just reporting optimization hint information Rep

[PATCH] D156040: [AMDGPU] Add dynamic stack bit info to kernel-resource-usage Rpass output

2023-07-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. lgtm with nits Comment at: llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:1298 + CurrentProgramInfo.DynamicCallStack ? "True" : "False"; + EmitResourceUsageRemark("UsesDyn

[PATCH] D153310: clang: Add elementwise pow builtin

2023-07-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added inline comments. This revision is now accepted and ready to land. Comment at: clang/test/CodeGen/strictfp-elementwise-bulitins.cpp:223 +// CHECK-NEXT: entry: +// CHECK-NEXT:[[TMP0:%.*]] = tail call <4 x float> @llvm.pow.v4f32(<4 x

[PATCH] D153310: Add codegen for llvm pow builtin

2023-07-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Code looks fine, patch title is a bit confusing. Don't say codegen, and say clang: Add elementwise pow builtin? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D153310/new/ https://reviews.llvm.org/D153310 ___

[PATCH] D154123: [HIP] Start document HIP support by clang

2023-07-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/docs/HIPSupport.rst:65 + + clang++ --offload-arch=gfx906 -xhip sample.cpp -o sample + scchan wrote: > missing --hip-link What does hip-link do? Why is it needed? I never use it and it works CHANGES SINCE LAST AC

[PATCH] D156040: [AMDGPU] Add dynamic stack bit info to kernel-resource-usage Rpass output

2023-07-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/Frontend/amdgcn-machine-analysis-remarks.cl:13 +// expected-remark@+2 {{LDS Size [bytes/block]: 0}} +// expected-remark@+1 {{Uses Dynamic Stack: False}} __kernel void foo() { Print right after the scra

[PATCH] D156040: [AMDGPU] Add dynamic stack bit info to kernel-resource-usage Rpass output

2023-07-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D156040#4526036 , @JonChesterfield wrote: > I don't see how this conveys any information. The compiler writes the stack > size to be allocated. If it doesn't know what is sufficient, it's going to > request some maximum and h

[PATCH] D147732: [AMDGPU] Add type mangling for {read, write, readfirst, perm}lane intrinsics

2023-07-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp:187 +Value *AMDGPULateCodeGenPrepare::buildLegalLaneIntrinsic( +IRBuilder<> &B, Intrinsic::ID IID, Value *Data0, Value *Data1, Value *Lane0, jrbyrnes wrote: > ars

[PATCH] D155982: Partially revert "clang/HIP: Remove __llvm_amdgcn_* wrapper hacks"

2023-07-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 9b2dfff57a382b757c358b43ee1df7591cb480ee CHANGES SINCE LAST ACTION https://reviews.llvm.org/D155982/new/ https://reviews.llvm.org/D155982 __

[PATCH] D155982: Partially revert "clang/HIP: Remove __llvm_amdgcn_* wrapper hacks"

2023-07-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 543111. arsenm marked an inline comment as done. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D155982/new/ https://reviews.llvm.org/D155982 Files: clang/lib/Headers/__clang_hip_libdevice_declares.h clang/test/Headers/__clang_hip_math_deprecated.hi

[PATCH] D155982: Partially revert "clang/HIP: Remove __llvm_amdgcn_* wrapper hacks"

2023-07-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 543025. arsenm added a comment. Add versioned deprecated macro CHANGES SINCE LAST ACTION https://reviews.llvm.org/D155982/new/ https://reviews.llvm.org/D155982 Files: clang/lib/Headers/__clang_hip_libdevice_declares.h clang/test/Headers/__clang_hip_ma

[PATCH] D85917: [MSP430] Fix passing C structs and unions as function arguments

2023-07-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Herald added a project: All. Is there a reason this never landed? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D85917/new/ https://reviews.llvm.org/D85917 ___ cfe-commits mailing

[PATCH] D155982: Partially revert "clang/HIP: Remove __llvm_amdgcn_* wrapper hacks"

2023-07-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Headers/__clang_hip_libdevice_declares.h:319 +// relevant. +__device__ inline _Float16 __llvm_amdgcn_rcp_f16(_Float16 x) { + return ((_Float16)1.0f) / x; arsenm wrote: > yaxunl wrote: > > Can we add the depreca

[PATCH] D155982: Partially revert "clang/HIP: Remove __llvm_amdgcn_* wrapper hacks"

2023-07-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Headers/__clang_hip_libdevice_declares.h:319 +// relevant. +__device__ inline _Float16 __llvm_amdgcn_rcp_f16(_Float16 x) { + return ((_Float16)1.0f) / x; yaxunl wrote: > Can we add the deprecated attribute to u

[PATCH] D155982: Partially revert "clang/HIP: Remove __llvm_amdgcn_* wrapper hacks"

2023-07-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added a reviewer: yaxunl. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. Revert part of f407a7399575a6821940973c54754d42e72dd9ce .

[PATCH] D155191: clang/HIP: Directly use f32 exp and log builtins

2023-07-20 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 5f1d3834a2bc3b77e126a278a0e7f00bce5576fc CHANGES SINCE LAST ACTION https://reviews.llvm.org/D155191/new/ https://reviews.llvm.org/D155191 __

[PATCH] D153310: Add codegen for llvm pow builtin

2023-07-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Add a test for the strictfp case (there's an existing strictfp test for all the elementwise builtins) Comment at: clang/lib/CodeGen/CGBuiltin.cpp:3241 +unsigned Opc = llvm::Intrinsic::pow; +Value* Result = Builder.CreateBinaryIntrinsic(Opc, Op0,

[PATCH] D154495: clang: Attach !fpmath metadata to __builtin_sqrt based on language flags

2023-07-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. bac2a075408377a8aa41f6626b17bb3e471221f3 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D154495/new/ https://reviews.llvm.org/D154495 __

  1   2   3   4   5   6   7   8   9   10   >