================ @@ -0,0 +1,95 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -cl-std=CL2.0 -target-cpu verde -emit-llvm -o - %s | FileCheck %s +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -cl-std=CL2.0 -target-cpu tonga -emit-llvm -o - %s | FileCheck %s +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -cl-std=CL2.0 -target-cpu gfx1100 -emit-llvm -o - %s | FileCheck %s + +// CHECK-LABEL: @test_amdgcn_make_buffer_rsrc_p0( +// CHECK-NEXT: entry: +// CHECK-NEXT: [[TMP0:%.*]] = tail call ptr addrspace(8) @llvm.amdgcn.make.buffer.rsrc.p0(ptr [[P:%.*]], i16 [[STRIDE:%.*]], i32 [[NUM:%.*]], i32 [[FLAGS:%.*]]) +// CHECK-NEXT: ret ptr addrspace(8) [[TMP0]] +// +__buffer_rsrc_t test_amdgcn_make_buffer_rsrc_p0(void *p, short stride, int num, int flags) { + return __builtin_amdgcn_make_buffer_rsrc(p, stride, num, flags); ---------------- shiltian wrote:
For example, we have the following code: ``` void test_amdgcn_buffer_rsrc_t_assignment(void *p, short stride, int num, int flags, char c) { __buffer_rsrc_t rsrc = __builtin_amdgcn_make_buffer_rsrc(p, stride, num, flags); bar(); __builtin_amdgcn_raw_ptr_buffer_store_i8(c, rsrc, 0, 0, 0); } ``` The generated IR would be: ``` define dso_local void @test_amdgcn_buffer_rsrc_t_assignment(ptr nocapture noundef writeonly %p, i16 noundef signext %stride, i32 noundef %num, i32 noundef %flags, i8 noundef signext %c) local_unnamed_addr { entry: %0 = tail call ptr addrspace(8) @llvm.amdgcn.make.buffer.rsrc.p0(ptr %p, i16 %stride, i32 %num, i32 %flags) tail call void @bar() tail call void @llvm.amdgcn.raw.ptr.buffer.store.i8(i8 %c, ptr addrspace(8) %0, i32 0, i32 0, i32 0) ret void } declare ptr addrspace(8) @llvm.amdgcn.make.buffer.rsrc.p0(ptr readnone, i16, i32, i32) #1 declare void @bar() local_unnamed_addr #2 ``` However, I just checked the potential use case of this, such as https://github.com/ROCm/composable_kernel/blob/acda4c5a3c34c13b71475fdd963e61182bba8a76/include/ck_tile/core/arch/amd_buffer_addressing.hpp#L71, we will need this type to be able to be passed around, so a sizeless type doesn't work. To move forward, I think we still need to make it a 128-bit fat pointer. I'm not sure yet if we want to make it an `i128` or `4xi32`, or a struct type because we definitely need to prevent the case like `__buffer_rsrc_t rsrc = some_i128_val;` or `__buffer_rsrc_t rsrc = some_4xi32_val;`. At clang codegen level, it is still taken as AS8 pointer. WDYT? @yxsamliu @arsenm @krzysz00 https://github.com/llvm/llvm-project/pull/95276 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits