[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-07-01 Thread Vikash Gupta via cfe-commits

vg0204 wrote:

Ping!!

https://github.com/llvm/llvm-project/pull/95750
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-07-02 Thread Vikash Gupta via cfe-commits


@@ -0,0 +1,255 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 \
+// RUN: -emit-llvm -o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 \
+// RUN: -emit-llvm -o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 \
+// RUN: -emit-llvm -o - | FileCheck --check-prefix=OPENCL30 %s

vg0204 wrote:

Can you let me know what really need to be tested via this test case, so 
exactly I could trim out the unnecessary portion of comment prefix.

https://github.com/llvm/llvm-project/pull/95750
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [WIP] Added builtin_alloca right Address Space for OpenCL (PR #95750)

2024-07-03 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 edited https://github.com/llvm/llvm-project/pull/95750
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [WIP] Added builtin_alloca right Address Space for OpenCL (PR #95750)

2024-07-03 Thread Vikash Gupta via cfe-commits


@@ -0,0 +1,255 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 \
+// RUN: -emit-llvm -o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 \
+// RUN: -emit-llvm -o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 \
+// RUN: -emit-llvm -o - | FileCheck --check-prefix=OPENCL30 %s

vg0204 wrote:

yeah, that's true, as now no more address space casting is happening for any 
version of OpenCL!

https://github.com/llvm/llvm-project/pull/95750
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [WIP] Added builtin_alloca right Address Space for OpenCL (PR #95750)

2024-07-03 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 updated 
https://github.com/llvm/llvm-project/pull/95750

>From cbe656fa6db50319e74c0fab166538518506974e Mon Sep 17 00:00:00 2001
From: vg0204 
Date: Mon, 17 Jun 2024 11:20:02 +0530
Subject: [PATCH 1/3] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2
 and below

The __builtin_alloca was returning a flat pointer with no address
space when compiled using openCL1.2 or below but worked fine with
openCL2.0 and above. This accounts to the fact that later uses the
concept of generic address space which supports cast to other address
space(i.e to private address space which is used for stack allocation)
.

So, in  case of openCL1.2 and below __built_alloca is supposed to
return pointer to private address space to eliminate the need of
casting as not supported here. Thus,it requires redefintion of the
builtin function with appropraite return pointer to appropriate
address space.
---
 clang/lib/Sema/SemaExpr.cpp | 23 +-
 clang/test/CodeGenOpenCL/builtins-alloca.cl | 86 +
 clang/test/CodeGenOpenCL/memcpy.cl  |  0
 3 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-alloca.cl
 mode change 100644 => 100755 clang/test/CodeGenOpenCL/memcpy.cl

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 4db8b4130c3c7..bb63020dadb83 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6231,7 +6231,10 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
 ///  it does not contain any pointer arguments without
 ///  an address space qualifer.  Otherwise the rewritten
 ///  FunctionDecl is returned.
-/// TODO: Handle pointer return types.
+///
+/// Pointer return type with no explicit address space is assigned the
+/// default address space where pointer points to based on the language
+/// option used to compile it.
 static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext 
&Context,
 FunctionDecl *FDecl,
 MultiExprArg ArgExprs) {
@@ -6275,13 +6278,27 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema 
*Sema, ASTContext &Context,
 OverloadParams.push_back(Context.getPointerType(PointeeType));
   }
 
+  QualType ReturnTy = FT->getReturnType();
+  QualType OverloadReturnTy = ReturnTy;
+  if (ReturnTy->isPointerType() &&
+  !ReturnTy->getPointeeType().hasAddressSpace()) {
+if (Sema->getLangOpts().OpenCL) {
+  NeedsNewDecl = true;
+
+  QualType ReturnPtTy = ReturnTy->getPointeeType();
+  LangAS defClAS = Context.getDefaultOpenCLPointeeAddrSpace();
+  ReturnPtTy = Context.getAddrSpaceQualType(ReturnPtTy, defClAS);
+  OverloadReturnTy = Context.getPointerType(ReturnPtTy);
+}
+  }
+
   if (!NeedsNewDecl)
 return nullptr;
 
   FunctionProtoType::ExtProtoInfo EPI;
   EPI.Variadic = FT->isVariadic();
-  QualType OverloadTy = Context.getFunctionType(FT->getReturnType(),
-OverloadParams, EPI);
+  QualType OverloadTy =
+  Context.getFunctionType(OverloadReturnTy, OverloadParams, EPI);
   DeclContext *Parent = FDecl->getParent();
   FunctionDecl *OverloadDecl = FunctionDecl::Create(
   Context, Parent, FDecl->getLocation(), FDecl->getLocation(),
diff --git a/clang/test/CodeGenOpenCL/builtins-alloca.cl 
b/clang/test/CodeGenOpenCL/builtins-alloca.cl
new file mode 100644
index 0..74a86955f2e4f
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-alloca.cl
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:  

[clang] [Clang] [WIP] Added builtin_alloca right Address Space for OpenCL (PR #95750)

2024-07-03 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 updated 
https://github.com/llvm/llvm-project/pull/95750

>From cbe656fa6db50319e74c0fab166538518506974e Mon Sep 17 00:00:00 2001
From: vg0204 
Date: Mon, 17 Jun 2024 11:20:02 +0530
Subject: [PATCH 1/4] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2
 and below

The __builtin_alloca was returning a flat pointer with no address
space when compiled using openCL1.2 or below but worked fine with
openCL2.0 and above. This accounts to the fact that later uses the
concept of generic address space which supports cast to other address
space(i.e to private address space which is used for stack allocation)
.

So, in  case of openCL1.2 and below __built_alloca is supposed to
return pointer to private address space to eliminate the need of
casting as not supported here. Thus,it requires redefintion of the
builtin function with appropraite return pointer to appropriate
address space.
---
 clang/lib/Sema/SemaExpr.cpp | 23 +-
 clang/test/CodeGenOpenCL/builtins-alloca.cl | 86 +
 clang/test/CodeGenOpenCL/memcpy.cl  |  0
 3 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-alloca.cl
 mode change 100644 => 100755 clang/test/CodeGenOpenCL/memcpy.cl

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 4db8b4130c3c7..bb63020dadb83 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6231,7 +6231,10 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
 ///  it does not contain any pointer arguments without
 ///  an address space qualifer.  Otherwise the rewritten
 ///  FunctionDecl is returned.
-/// TODO: Handle pointer return types.
+///
+/// Pointer return type with no explicit address space is assigned the
+/// default address space where pointer points to based on the language
+/// option used to compile it.
 static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext 
&Context,
 FunctionDecl *FDecl,
 MultiExprArg ArgExprs) {
@@ -6275,13 +6278,27 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema 
*Sema, ASTContext &Context,
 OverloadParams.push_back(Context.getPointerType(PointeeType));
   }
 
+  QualType ReturnTy = FT->getReturnType();
+  QualType OverloadReturnTy = ReturnTy;
+  if (ReturnTy->isPointerType() &&
+  !ReturnTy->getPointeeType().hasAddressSpace()) {
+if (Sema->getLangOpts().OpenCL) {
+  NeedsNewDecl = true;
+
+  QualType ReturnPtTy = ReturnTy->getPointeeType();
+  LangAS defClAS = Context.getDefaultOpenCLPointeeAddrSpace();
+  ReturnPtTy = Context.getAddrSpaceQualType(ReturnPtTy, defClAS);
+  OverloadReturnTy = Context.getPointerType(ReturnPtTy);
+}
+  }
+
   if (!NeedsNewDecl)
 return nullptr;
 
   FunctionProtoType::ExtProtoInfo EPI;
   EPI.Variadic = FT->isVariadic();
-  QualType OverloadTy = Context.getFunctionType(FT->getReturnType(),
-OverloadParams, EPI);
+  QualType OverloadTy =
+  Context.getFunctionType(OverloadReturnTy, OverloadParams, EPI);
   DeclContext *Parent = FDecl->getParent();
   FunctionDecl *OverloadDecl = FunctionDecl::Create(
   Context, Parent, FDecl->getLocation(), FDecl->getLocation(),
diff --git a/clang/test/CodeGenOpenCL/builtins-alloca.cl 
b/clang/test/CodeGenOpenCL/builtins-alloca.cl
new file mode 100644
index 0..74a86955f2e4f
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-alloca.cl
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:  

[clang] [Clang] [WIP] Added builtin_alloca right Address Space for OpenCL (PR #95750)

2024-07-03 Thread Vikash Gupta via cfe-commits


@@ -1981,6 +1981,29 @@ static bool OpenCLBuiltinToAddr(Sema &S, unsigned 
BuiltinID, CallExpr *Call) {
   return false;
 }
 
+// In OpenCL, __builtin_alloca_* should return a pointer to address space
+// that corresponds to the stack address space i.e private address space.
+static bool OpenCLBuiltinAllocaAddrSpace(Sema &S, CallExpr *TheCall) {
+  S.Diag(TheCall->getBeginLoc(), diag::warn_alloca)
+  << TheCall->getDirectCallee();
+

vg0204 wrote:

Or  I could place this non-OpenCL specific code just before the invocation of 
this function. My idea to do so was to wrap everything in a function, so just a 
single function call comes into picture.

https://github.com/llvm/llvm-project/pull/95750
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-06-17 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 created 
https://github.com/llvm/llvm-project/pull/95750

The __builtin_alloca was returning a flat pointer with no address space when 
compiled using openCL1.2 or below but worked fine with openCL2.0 and above. 
This accounts to the fact that later uses the concept of generic address space 
which supports cast to other address space(i.e to private address space which 
is used for stack allocation) .

So, in  case of openCL1.2 and below __built_alloca is supposed to return 
pointer to private address space to eliminate the need of casting as not 
supported here. Thus,it requires redefintion of the builtin function with 
appropraite return pointer to appropriate address space.

>From 04ff7bd289d0840b8aa2b0502e4ac050174c95ab Mon Sep 17 00:00:00 2001
From: vg0204 
Date: Mon, 17 Jun 2024 11:20:02 +0530
Subject: [PATCH] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and
 below

The __builtin_alloca was returning a flat pointer with no address
space when compiled using openCL1.2 or below but worked fine with
openCL2.0 and above. This accounts to the fact that later uses the
concept of generic address space which supports cast to other address
space(i.e to private address space which is used for stack allocation)
.

So, in  case of openCL1.2 and below __built_alloca is supposed to
return pointer to private address space to eliminate the need of
casting as not supported here. Thus,it requires redefintion of the
builtin function with appropraite return pointer to appropriate
address space.
---
 clang/lib/Sema/SemaExpr.cpp | 23 +-
 clang/test/CodeGenOpenCL/builtins-alloca.cl | 86 +
 clang/test/CodeGenOpenCL/memcpy.cl  |  0
 3 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-alloca.cl
 mode change 100644 => 100755 clang/test/CodeGenOpenCL/memcpy.cl

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 99a8704298314..e12c2a9209706 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6231,7 +6231,10 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
 ///  it does not contain any pointer arguments without
 ///  an address space qualifer.  Otherwise the rewritten
 ///  FunctionDecl is returned.
-/// TODO: Handle pointer return types.
+///
+/// Pointer return type with no explicit address space is assigned the
+/// default address space where pointer points to based on the language
+/// option used to compile it.
 static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext 
&Context,
 FunctionDecl *FDecl,
 MultiExprArg ArgExprs) {
@@ -6275,13 +6278,27 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema 
*Sema, ASTContext &Context,
 OverloadParams.push_back(Context.getPointerType(PointeeType));
   }
 
+  QualType ReturnTy = FT->getReturnType();
+  QualType OverloadReturnTy = ReturnTy;
+  if (ReturnTy->isPointerType() &&
+  !ReturnTy->getPointeeType().hasAddressSpace()) {
+if (Sema->getLangOpts().OpenCL) {
+  NeedsNewDecl = true;
+
+  QualType ReturnPtTy = ReturnTy->getPointeeType();
+  LangAS defClAS = Context.getDefaultOpenCLPointeeAddrSpace();
+  ReturnPtTy = Context.getAddrSpaceQualType(ReturnPtTy, defClAS);
+  OverloadReturnTy = Context.getPointerType(ReturnPtTy);
+}
+  }
+
   if (!NeedsNewDecl)
 return nullptr;
 
   FunctionProtoType::ExtProtoInfo EPI;
   EPI.Variadic = FT->isVariadic();
-  QualType OverloadTy = Context.getFunctionType(FT->getReturnType(),
-OverloadParams, EPI);
+  QualType OverloadTy =
+  Context.getFunctionType(OverloadReturnTy, OverloadParams, EPI);
   DeclContext *Parent = FDecl->getParent();
   FunctionDecl *OverloadDecl = FunctionDecl::Create(
   Context, Parent, FDecl->getLocation(), FDecl->getLocation(),
diff --git a/clang/test/CodeGenOpenCL/builtins-alloca.cl 
b/clang/test/CodeGenOpenCL/builtins-alloca.cl
new file mode 100644
index 0..74a86955f2e4f
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-alloca.cl
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME

[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-06-17 Thread Vikash Gupta via cfe-commits


@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr, align 8, addrspace(5)
+// OPENCL20-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL20-NEXT:[[TMP1:%.*]] = addrspacecast ptr addrspace(5) [[TMP0]] to 
ptr
+// OPENCL20-NEXT:store ptr [[TMP1]], ptr addrspace(5) [[ALLOC_PTR]], align 
8
+// OPENCL20-NEXT:[[TMP2:%.*]] = load ptr, ptr addrspace(5) [[ALLOC_PTR]], 
align 8
+// OPENCL20-NEXT:ret ptr [[TMP2]]
+//
+// OPENCL30-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL30-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL30-NEXT:  [[ENTRY:.*:]]
+// OPENCL30-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL30-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL30-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL30-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL30-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL30-EXT-LABEL: define dso_local ptr @test1(
+// OPENCL30-EXT-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL30-EXT-NEXT:  [[ENTRY:.*:]]
+// OPENCL30-EXT-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr, align 8, addrspace(5)
+// OPENCL30-EXT-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, 
addrspace(5)
+// OPENCL30-EXT-NEXT:[[TMP1:%.*]] = addrspacecast ptr addrspace(5) 
[[TMP0]] to ptr
+// OPENCL30-EXT-NEXT:store ptr [[TMP1]], ptr addrspace(5) [[ALLOC_PTR]], 
align 8
+// OPENCL30-EXT-NEXT:[[TMP2:%.*]] = load ptr, ptr addrspace(5) 
[[ALLOC_PTR]], align 8
+// OPENCL30-EXT-NEXT:ret ptr [[TMP2]]
+//
+float* test1() {
+float* alloc_ptr = (float*)__builtin_alloca(32 * sizeof(int));

vg0204 wrote:

Currently, I made it as such for opencl1.2 & below, it will return private 
pointer so no cast would be seen if __private qualifier is used. But for 
openCL2.0 & above, it returns a pointer to generic address space, so test2 
becomes problematic as initializing '__private void *__private' with an 
expression of type '__generic void *' changes address space of pointer.

https://github.com/llvm/llvm-project/pull/95750
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-06-17 Thread Vikash Gupta via cfe-commits


@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr, align 8, addrspace(5)
+// OPENCL20-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL20-NEXT:[[TMP1:%.*]] = addrspacecast ptr addrspace(5) [[TMP0]] to 
ptr
+// OPENCL20-NEXT:store ptr [[TMP1]], ptr addrspace(5) [[ALLOC_PTR]], align 
8
+// OPENCL20-NEXT:[[TMP2:%.*]] = load ptr, ptr addrspace(5) [[ALLOC_PTR]], 
align 8
+// OPENCL20-NEXT:ret ptr [[TMP2]]
+//
+// OPENCL30-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL30-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL30-NEXT:  [[ENTRY:.*:]]
+// OPENCL30-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL30-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL30-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL30-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL30-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL30-EXT-LABEL: define dso_local ptr @test1(
+// OPENCL30-EXT-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL30-EXT-NEXT:  [[ENTRY:.*:]]
+// OPENCL30-EXT-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr, align 8, addrspace(5)
+// OPENCL30-EXT-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, 
addrspace(5)
+// OPENCL30-EXT-NEXT:[[TMP1:%.*]] = addrspacecast ptr addrspace(5) 
[[TMP0]] to ptr
+// OPENCL30-EXT-NEXT:store ptr [[TMP1]], ptr addrspace(5) [[ALLOC_PTR]], 
align 8
+// OPENCL30-EXT-NEXT:[[TMP2:%.*]] = load ptr, ptr addrspace(5) 
[[ALLOC_PTR]], align 8
+// OPENCL30-EXT-NEXT:ret ptr [[TMP2]]
+//
+float* test1() {
+float* alloc_ptr = (float*)__builtin_alloca(32 * sizeof(int));

vg0204 wrote:

So, should it return __private pointer everytime?

https://github.com/llvm/llvm-project/pull/95750
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-06-17 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 edited https://github.com/llvm/llvm-project/pull/95750
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-06-17 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 updated 
https://github.com/llvm/llvm-project/pull/95750

>From 04ff7bd289d0840b8aa2b0502e4ac050174c95ab Mon Sep 17 00:00:00 2001
From: vg0204 
Date: Mon, 17 Jun 2024 11:20:02 +0530
Subject: [PATCH 1/2] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2
 and below

The __builtin_alloca was returning a flat pointer with no address
space when compiled using openCL1.2 or below but worked fine with
openCL2.0 and above. This accounts to the fact that later uses the
concept of generic address space which supports cast to other address
space(i.e to private address space which is used for stack allocation)
.

So, in  case of openCL1.2 and below __built_alloca is supposed to
return pointer to private address space to eliminate the need of
casting as not supported here. Thus,it requires redefintion of the
builtin function with appropraite return pointer to appropriate
address space.
---
 clang/lib/Sema/SemaExpr.cpp | 23 +-
 clang/test/CodeGenOpenCL/builtins-alloca.cl | 86 +
 clang/test/CodeGenOpenCL/memcpy.cl  |  0
 3 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-alloca.cl
 mode change 100644 => 100755 clang/test/CodeGenOpenCL/memcpy.cl

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 99a8704298314..e12c2a9209706 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6231,7 +6231,10 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
 ///  it does not contain any pointer arguments without
 ///  an address space qualifer.  Otherwise the rewritten
 ///  FunctionDecl is returned.
-/// TODO: Handle pointer return types.
+///
+/// Pointer return type with no explicit address space is assigned the
+/// default address space where pointer points to based on the language
+/// option used to compile it.
 static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext 
&Context,
 FunctionDecl *FDecl,
 MultiExprArg ArgExprs) {
@@ -6275,13 +6278,27 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema 
*Sema, ASTContext &Context,
 OverloadParams.push_back(Context.getPointerType(PointeeType));
   }
 
+  QualType ReturnTy = FT->getReturnType();
+  QualType OverloadReturnTy = ReturnTy;
+  if (ReturnTy->isPointerType() &&
+  !ReturnTy->getPointeeType().hasAddressSpace()) {
+if (Sema->getLangOpts().OpenCL) {
+  NeedsNewDecl = true;
+
+  QualType ReturnPtTy = ReturnTy->getPointeeType();
+  LangAS defClAS = Context.getDefaultOpenCLPointeeAddrSpace();
+  ReturnPtTy = Context.getAddrSpaceQualType(ReturnPtTy, defClAS);
+  OverloadReturnTy = Context.getPointerType(ReturnPtTy);
+}
+  }
+
   if (!NeedsNewDecl)
 return nullptr;
 
   FunctionProtoType::ExtProtoInfo EPI;
   EPI.Variadic = FT->isVariadic();
-  QualType OverloadTy = Context.getFunctionType(FT->getReturnType(),
-OverloadParams, EPI);
+  QualType OverloadTy =
+  Context.getFunctionType(OverloadReturnTy, OverloadParams, EPI);
   DeclContext *Parent = FDecl->getParent();
   FunctionDecl *OverloadDecl = FunctionDecl::Create(
   Context, Parent, FDecl->getLocation(), FDecl->getLocation(),
diff --git a/clang/test/CodeGenOpenCL/builtins-alloca.cl 
b/clang/test/CodeGenOpenCL/builtins-alloca.cl
new file mode 100644
index 0..74a86955f2e4f
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-alloca.cl
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:  

[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-06-17 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 updated 
https://github.com/llvm/llvm-project/pull/95750

>From 04ff7bd289d0840b8aa2b0502e4ac050174c95ab Mon Sep 17 00:00:00 2001
From: vg0204 
Date: Mon, 17 Jun 2024 11:20:02 +0530
Subject: [PATCH 1/2] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2
 and below

The __builtin_alloca was returning a flat pointer with no address
space when compiled using openCL1.2 or below but worked fine with
openCL2.0 and above. This accounts to the fact that later uses the
concept of generic address space which supports cast to other address
space(i.e to private address space which is used for stack allocation)
.

So, in  case of openCL1.2 and below __built_alloca is supposed to
return pointer to private address space to eliminate the need of
casting as not supported here. Thus,it requires redefintion of the
builtin function with appropraite return pointer to appropriate
address space.
---
 clang/lib/Sema/SemaExpr.cpp | 23 +-
 clang/test/CodeGenOpenCL/builtins-alloca.cl | 86 +
 clang/test/CodeGenOpenCL/memcpy.cl  |  0
 3 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-alloca.cl
 mode change 100644 => 100755 clang/test/CodeGenOpenCL/memcpy.cl

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 99a8704298314..e12c2a9209706 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6231,7 +6231,10 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
 ///  it does not contain any pointer arguments without
 ///  an address space qualifer.  Otherwise the rewritten
 ///  FunctionDecl is returned.
-/// TODO: Handle pointer return types.
+///
+/// Pointer return type with no explicit address space is assigned the
+/// default address space where pointer points to based on the language
+/// option used to compile it.
 static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext 
&Context,
 FunctionDecl *FDecl,
 MultiExprArg ArgExprs) {
@@ -6275,13 +6278,27 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema 
*Sema, ASTContext &Context,
 OverloadParams.push_back(Context.getPointerType(PointeeType));
   }
 
+  QualType ReturnTy = FT->getReturnType();
+  QualType OverloadReturnTy = ReturnTy;
+  if (ReturnTy->isPointerType() &&
+  !ReturnTy->getPointeeType().hasAddressSpace()) {
+if (Sema->getLangOpts().OpenCL) {
+  NeedsNewDecl = true;
+
+  QualType ReturnPtTy = ReturnTy->getPointeeType();
+  LangAS defClAS = Context.getDefaultOpenCLPointeeAddrSpace();
+  ReturnPtTy = Context.getAddrSpaceQualType(ReturnPtTy, defClAS);
+  OverloadReturnTy = Context.getPointerType(ReturnPtTy);
+}
+  }
+
   if (!NeedsNewDecl)
 return nullptr;
 
   FunctionProtoType::ExtProtoInfo EPI;
   EPI.Variadic = FT->isVariadic();
-  QualType OverloadTy = Context.getFunctionType(FT->getReturnType(),
-OverloadParams, EPI);
+  QualType OverloadTy =
+  Context.getFunctionType(OverloadReturnTy, OverloadParams, EPI);
   DeclContext *Parent = FDecl->getParent();
   FunctionDecl *OverloadDecl = FunctionDecl::Create(
   Context, Parent, FDecl->getLocation(), FDecl->getLocation(),
diff --git a/clang/test/CodeGenOpenCL/builtins-alloca.cl 
b/clang/test/CodeGenOpenCL/builtins-alloca.cl
new file mode 100644
index 0..74a86955f2e4f
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-alloca.cl
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:  

[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-06-17 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 updated 
https://github.com/llvm/llvm-project/pull/95750

>From 04ff7bd289d0840b8aa2b0502e4ac050174c95ab Mon Sep 17 00:00:00 2001
From: vg0204 
Date: Mon, 17 Jun 2024 11:20:02 +0530
Subject: [PATCH 1/2] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2
 and below

The __builtin_alloca was returning a flat pointer with no address
space when compiled using openCL1.2 or below but worked fine with
openCL2.0 and above. This accounts to the fact that later uses the
concept of generic address space which supports cast to other address
space(i.e to private address space which is used for stack allocation)
.

So, in  case of openCL1.2 and below __built_alloca is supposed to
return pointer to private address space to eliminate the need of
casting as not supported here. Thus,it requires redefintion of the
builtin function with appropraite return pointer to appropriate
address space.
---
 clang/lib/Sema/SemaExpr.cpp | 23 +-
 clang/test/CodeGenOpenCL/builtins-alloca.cl | 86 +
 clang/test/CodeGenOpenCL/memcpy.cl  |  0
 3 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-alloca.cl
 mode change 100644 => 100755 clang/test/CodeGenOpenCL/memcpy.cl

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 99a8704298314..e12c2a9209706 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6231,7 +6231,10 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
 ///  it does not contain any pointer arguments without
 ///  an address space qualifer.  Otherwise the rewritten
 ///  FunctionDecl is returned.
-/// TODO: Handle pointer return types.
+///
+/// Pointer return type with no explicit address space is assigned the
+/// default address space where pointer points to based on the language
+/// option used to compile it.
 static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext 
&Context,
 FunctionDecl *FDecl,
 MultiExprArg ArgExprs) {
@@ -6275,13 +6278,27 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema 
*Sema, ASTContext &Context,
 OverloadParams.push_back(Context.getPointerType(PointeeType));
   }
 
+  QualType ReturnTy = FT->getReturnType();
+  QualType OverloadReturnTy = ReturnTy;
+  if (ReturnTy->isPointerType() &&
+  !ReturnTy->getPointeeType().hasAddressSpace()) {
+if (Sema->getLangOpts().OpenCL) {
+  NeedsNewDecl = true;
+
+  QualType ReturnPtTy = ReturnTy->getPointeeType();
+  LangAS defClAS = Context.getDefaultOpenCLPointeeAddrSpace();
+  ReturnPtTy = Context.getAddrSpaceQualType(ReturnPtTy, defClAS);
+  OverloadReturnTy = Context.getPointerType(ReturnPtTy);
+}
+  }
+
   if (!NeedsNewDecl)
 return nullptr;
 
   FunctionProtoType::ExtProtoInfo EPI;
   EPI.Variadic = FT->isVariadic();
-  QualType OverloadTy = Context.getFunctionType(FT->getReturnType(),
-OverloadParams, EPI);
+  QualType OverloadTy =
+  Context.getFunctionType(OverloadReturnTy, OverloadParams, EPI);
   DeclContext *Parent = FDecl->getParent();
   FunctionDecl *OverloadDecl = FunctionDecl::Create(
   Context, Parent, FDecl->getLocation(), FDecl->getLocation(),
diff --git a/clang/test/CodeGenOpenCL/builtins-alloca.cl 
b/clang/test/CodeGenOpenCL/builtins-alloca.cl
new file mode 100644
index 0..74a86955f2e4f
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-alloca.cl
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:  

[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-06-17 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 updated 
https://github.com/llvm/llvm-project/pull/95750

>From 04ff7bd289d0840b8aa2b0502e4ac050174c95ab Mon Sep 17 00:00:00 2001
From: vg0204 
Date: Mon, 17 Jun 2024 11:20:02 +0530
Subject: [PATCH 1/2] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2
 and below

The __builtin_alloca was returning a flat pointer with no address
space when compiled using openCL1.2 or below but worked fine with
openCL2.0 and above. This accounts to the fact that later uses the
concept of generic address space which supports cast to other address
space(i.e to private address space which is used for stack allocation)
.

So, in  case of openCL1.2 and below __built_alloca is supposed to
return pointer to private address space to eliminate the need of
casting as not supported here. Thus,it requires redefintion of the
builtin function with appropraite return pointer to appropriate
address space.
---
 clang/lib/Sema/SemaExpr.cpp | 23 +-
 clang/test/CodeGenOpenCL/builtins-alloca.cl | 86 +
 clang/test/CodeGenOpenCL/memcpy.cl  |  0
 3 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-alloca.cl
 mode change 100644 => 100755 clang/test/CodeGenOpenCL/memcpy.cl

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 99a8704298314..e12c2a9209706 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6231,7 +6231,10 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
 ///  it does not contain any pointer arguments without
 ///  an address space qualifer.  Otherwise the rewritten
 ///  FunctionDecl is returned.
-/// TODO: Handle pointer return types.
+///
+/// Pointer return type with no explicit address space is assigned the
+/// default address space where pointer points to based on the language
+/// option used to compile it.
 static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext 
&Context,
 FunctionDecl *FDecl,
 MultiExprArg ArgExprs) {
@@ -6275,13 +6278,27 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema 
*Sema, ASTContext &Context,
 OverloadParams.push_back(Context.getPointerType(PointeeType));
   }
 
+  QualType ReturnTy = FT->getReturnType();
+  QualType OverloadReturnTy = ReturnTy;
+  if (ReturnTy->isPointerType() &&
+  !ReturnTy->getPointeeType().hasAddressSpace()) {
+if (Sema->getLangOpts().OpenCL) {
+  NeedsNewDecl = true;
+
+  QualType ReturnPtTy = ReturnTy->getPointeeType();
+  LangAS defClAS = Context.getDefaultOpenCLPointeeAddrSpace();
+  ReturnPtTy = Context.getAddrSpaceQualType(ReturnPtTy, defClAS);
+  OverloadReturnTy = Context.getPointerType(ReturnPtTy);
+}
+  }
+
   if (!NeedsNewDecl)
 return nullptr;
 
   FunctionProtoType::ExtProtoInfo EPI;
   EPI.Variadic = FT->isVariadic();
-  QualType OverloadTy = Context.getFunctionType(FT->getReturnType(),
-OverloadParams, EPI);
+  QualType OverloadTy =
+  Context.getFunctionType(OverloadReturnTy, OverloadParams, EPI);
   DeclContext *Parent = FDecl->getParent();
   FunctionDecl *OverloadDecl = FunctionDecl::Create(
   Context, Parent, FDecl->getLocation(), FDecl->getLocation(),
diff --git a/clang/test/CodeGenOpenCL/builtins-alloca.cl 
b/clang/test/CodeGenOpenCL/builtins-alloca.cl
new file mode 100644
index 0..74a86955f2e4f
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-alloca.cl
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:  

[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-06-18 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 edited https://github.com/llvm/llvm-project/pull/95750
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-06-18 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 updated 
https://github.com/llvm/llvm-project/pull/95750

>From cbe656fa6db50319e74c0fab166538518506974e Mon Sep 17 00:00:00 2001
From: vg0204 
Date: Mon, 17 Jun 2024 11:20:02 +0530
Subject: [PATCH 1/2] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2
 and below

The __builtin_alloca was returning a flat pointer with no address
space when compiled using openCL1.2 or below but worked fine with
openCL2.0 and above. This accounts to the fact that later uses the
concept of generic address space which supports cast to other address
space(i.e to private address space which is used for stack allocation)
.

So, in  case of openCL1.2 and below __built_alloca is supposed to
return pointer to private address space to eliminate the need of
casting as not supported here. Thus,it requires redefintion of the
builtin function with appropraite return pointer to appropriate
address space.
---
 clang/lib/Sema/SemaExpr.cpp | 23 +-
 clang/test/CodeGenOpenCL/builtins-alloca.cl | 86 +
 clang/test/CodeGenOpenCL/memcpy.cl  |  0
 3 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-alloca.cl
 mode change 100644 => 100755 clang/test/CodeGenOpenCL/memcpy.cl

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 4db8b4130c3c7..bb63020dadb83 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6231,7 +6231,10 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
 ///  it does not contain any pointer arguments without
 ///  an address space qualifer.  Otherwise the rewritten
 ///  FunctionDecl is returned.
-/// TODO: Handle pointer return types.
+///
+/// Pointer return type with no explicit address space is assigned the
+/// default address space where pointer points to based on the language
+/// option used to compile it.
 static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext 
&Context,
 FunctionDecl *FDecl,
 MultiExprArg ArgExprs) {
@@ -6275,13 +6278,27 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema 
*Sema, ASTContext &Context,
 OverloadParams.push_back(Context.getPointerType(PointeeType));
   }
 
+  QualType ReturnTy = FT->getReturnType();
+  QualType OverloadReturnTy = ReturnTy;
+  if (ReturnTy->isPointerType() &&
+  !ReturnTy->getPointeeType().hasAddressSpace()) {
+if (Sema->getLangOpts().OpenCL) {
+  NeedsNewDecl = true;
+
+  QualType ReturnPtTy = ReturnTy->getPointeeType();
+  LangAS defClAS = Context.getDefaultOpenCLPointeeAddrSpace();
+  ReturnPtTy = Context.getAddrSpaceQualType(ReturnPtTy, defClAS);
+  OverloadReturnTy = Context.getPointerType(ReturnPtTy);
+}
+  }
+
   if (!NeedsNewDecl)
 return nullptr;
 
   FunctionProtoType::ExtProtoInfo EPI;
   EPI.Variadic = FT->isVariadic();
-  QualType OverloadTy = Context.getFunctionType(FT->getReturnType(),
-OverloadParams, EPI);
+  QualType OverloadTy =
+  Context.getFunctionType(OverloadReturnTy, OverloadParams, EPI);
   DeclContext *Parent = FDecl->getParent();
   FunctionDecl *OverloadDecl = FunctionDecl::Create(
   Context, Parent, FDecl->getLocation(), FDecl->getLocation(),
diff --git a/clang/test/CodeGenOpenCL/builtins-alloca.cl 
b/clang/test/CodeGenOpenCL/builtins-alloca.cl
new file mode 100644
index 0..74a86955f2e4f
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-alloca.cl
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:  

[clang] [Clang] Use private address space for builtin_alloca return type for OpenCL (PR #95750)

2024-07-25 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 updated 
https://github.com/llvm/llvm-project/pull/95750

>From c9652ae9092e2d42268cd7f843d921c16410bfc9 Mon Sep 17 00:00:00 2001
From: vg0204 
Date: Mon, 17 Jun 2024 11:20:02 +0530
Subject: [PATCH 1/9] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2
 and below

The __builtin_alloca was returning a flat pointer with no address
space when compiled using openCL1.2 or below but worked fine with
openCL2.0 and above. This accounts to the fact that later uses the
concept of generic address space which supports cast to other address
space(i.e to private address space which is used for stack allocation)
.

So, in  case of openCL1.2 and below __built_alloca is supposed to
return pointer to private address space to eliminate the need of
casting as not supported here. Thus,it requires redefintion of the
builtin function with appropraite return pointer to appropriate
address space.
---
 clang/lib/Sema/SemaExpr.cpp | 23 +-
 clang/test/CodeGenOpenCL/builtins-alloca.cl | 86 +
 clang/test/CodeGenOpenCL/memcpy.cl  |  0
 3 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-alloca.cl
 mode change 100644 => 100755 clang/test/CodeGenOpenCL/memcpy.cl

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 8d24e34520e77..cf4c98fbe2c38 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6121,7 +6121,10 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
 ///  it does not contain any pointer arguments without
 ///  an address space qualifer.  Otherwise the rewritten
 ///  FunctionDecl is returned.
-/// TODO: Handle pointer return types.
+///
+/// Pointer return type with no explicit address space is assigned the
+/// default address space where pointer points to based on the language
+/// option used to compile it.
 static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext 
&Context,
 FunctionDecl *FDecl,
 MultiExprArg ArgExprs) {
@@ -6165,13 +6168,27 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema 
*Sema, ASTContext &Context,
 OverloadParams.push_back(Context.getPointerType(PointeeType));
   }
 
+  QualType ReturnTy = FT->getReturnType();
+  QualType OverloadReturnTy = ReturnTy;
+  if (ReturnTy->isPointerType() &&
+  !ReturnTy->getPointeeType().hasAddressSpace()) {
+if (Sema->getLangOpts().OpenCL) {
+  NeedsNewDecl = true;
+
+  QualType ReturnPtTy = ReturnTy->getPointeeType();
+  LangAS defClAS = Context.getDefaultOpenCLPointeeAddrSpace();
+  ReturnPtTy = Context.getAddrSpaceQualType(ReturnPtTy, defClAS);
+  OverloadReturnTy = Context.getPointerType(ReturnPtTy);
+}
+  }
+
   if (!NeedsNewDecl)
 return nullptr;
 
   FunctionProtoType::ExtProtoInfo EPI;
   EPI.Variadic = FT->isVariadic();
-  QualType OverloadTy = Context.getFunctionType(FT->getReturnType(),
-OverloadParams, EPI);
+  QualType OverloadTy =
+  Context.getFunctionType(OverloadReturnTy, OverloadParams, EPI);
   DeclContext *Parent = FDecl->getParent();
   FunctionDecl *OverloadDecl = FunctionDecl::Create(
   Context, Parent, FDecl->getLocation(), FDecl->getLocation(),
diff --git a/clang/test/CodeGenOpenCL/builtins-alloca.cl 
b/clang/test/CodeGenOpenCL/builtins-alloca.cl
new file mode 100644
index 0..74a86955f2e4f
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-alloca.cl
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:  

[clang] [Clang] Use private address space for builtin_alloca return type for OpenCL (PR #95750)

2024-07-26 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 updated 
https://github.com/llvm/llvm-project/pull/95750

>From ac967e4f887bd511bdfcaf30a0c94aa083cbf980 Mon Sep 17 00:00:00 2001
From: vg0204 
Date: Mon, 17 Jun 2024 11:20:02 +0530
Subject: [PATCH 1/9] [Clang] Use private address space for builtin_alloca for
 OpenCL

The __builtin_alloca was returning a flat pointer with no address
space when compiled using openCL1.2 or below but worked fine with
openCL2.0 and above. This accounts to the fact that later uses the
concept of generic address space which supports cast to other address
space(i.e to private address space which is used for stack allocation)
.

So, in general for OpenCL, built_alloca should always return pointer
to private address space, thus eliminating need of use of address
space cast. Thus,it requires redefintion of the builtin function with
return pointer type to private address space.
---
 clang/lib/Sema/SemaExpr.cpp | 23 +-
 clang/test/CodeGenOpenCL/builtins-alloca.cl | 86 +
 clang/test/CodeGenOpenCL/memcpy.cl  |  0
 3 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-alloca.cl
 mode change 100644 => 100755 clang/test/CodeGenOpenCL/memcpy.cl

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 8d24e34520e77..cf4c98fbe2c38 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6121,7 +6121,10 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
 ///  it does not contain any pointer arguments without
 ///  an address space qualifer.  Otherwise the rewritten
 ///  FunctionDecl is returned.
-/// TODO: Handle pointer return types.
+///
+/// Pointer return type with no explicit address space is assigned the
+/// default address space where pointer points to based on the language
+/// option used to compile it.
 static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext 
&Context,
 FunctionDecl *FDecl,
 MultiExprArg ArgExprs) {
@@ -6165,13 +6168,27 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema 
*Sema, ASTContext &Context,
 OverloadParams.push_back(Context.getPointerType(PointeeType));
   }
 
+  QualType ReturnTy = FT->getReturnType();
+  QualType OverloadReturnTy = ReturnTy;
+  if (ReturnTy->isPointerType() &&
+  !ReturnTy->getPointeeType().hasAddressSpace()) {
+if (Sema->getLangOpts().OpenCL) {
+  NeedsNewDecl = true;
+
+  QualType ReturnPtTy = ReturnTy->getPointeeType();
+  LangAS defClAS = Context.getDefaultOpenCLPointeeAddrSpace();
+  ReturnPtTy = Context.getAddrSpaceQualType(ReturnPtTy, defClAS);
+  OverloadReturnTy = Context.getPointerType(ReturnPtTy);
+}
+  }
+
   if (!NeedsNewDecl)
 return nullptr;
 
   FunctionProtoType::ExtProtoInfo EPI;
   EPI.Variadic = FT->isVariadic();
-  QualType OverloadTy = Context.getFunctionType(FT->getReturnType(),
-OverloadParams, EPI);
+  QualType OverloadTy =
+  Context.getFunctionType(OverloadReturnTy, OverloadParams, EPI);
   DeclContext *Parent = FDecl->getParent();
   FunctionDecl *OverloadDecl = FunctionDecl::Create(
   Context, Parent, FDecl->getLocation(), FDecl->getLocation(),
diff --git a/clang/test/CodeGenOpenCL/builtins-alloca.cl 
b/clang/test/CodeGenOpenCL/builtins-alloca.cl
new file mode 100644
index 0..74a86955f2e4f
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-alloca.cl
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:[[ALLOC_PTR:%.*]] = alloca 

[clang] [Clang] Use private address space for builtin_alloca return type for OpenCL (PR #95750)

2024-07-26 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 closed https://github.com/llvm/llvm-project/pull/95750
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [WIP] Added builtin_alloca right Address Space for OpenCL (PR #95750)

2024-07-19 Thread Vikash Gupta via cfe-commits

vg0204 wrote:

PING!

https://github.com/llvm/llvm-project/pull/95750
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [WIP] Added builtin_alloca right Address Space for OpenCL (PR #95750)

2024-07-22 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 updated 
https://github.com/llvm/llvm-project/pull/95750

>From cbe656fa6db50319e74c0fab166538518506974e Mon Sep 17 00:00:00 2001
From: vg0204 
Date: Mon, 17 Jun 2024 11:20:02 +0530
Subject: [PATCH 1/7] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2
 and below

The __builtin_alloca was returning a flat pointer with no address
space when compiled using openCL1.2 or below but worked fine with
openCL2.0 and above. This accounts to the fact that later uses the
concept of generic address space which supports cast to other address
space(i.e to private address space which is used for stack allocation)
.

So, in  case of openCL1.2 and below __built_alloca is supposed to
return pointer to private address space to eliminate the need of
casting as not supported here. Thus,it requires redefintion of the
builtin function with appropraite return pointer to appropriate
address space.
---
 clang/lib/Sema/SemaExpr.cpp | 23 +-
 clang/test/CodeGenOpenCL/builtins-alloca.cl | 86 +
 clang/test/CodeGenOpenCL/memcpy.cl  |  0
 3 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-alloca.cl
 mode change 100644 => 100755 clang/test/CodeGenOpenCL/memcpy.cl

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 4db8b4130c3c7..bb63020dadb83 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6231,7 +6231,10 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
 ///  it does not contain any pointer arguments without
 ///  an address space qualifer.  Otherwise the rewritten
 ///  FunctionDecl is returned.
-/// TODO: Handle pointer return types.
+///
+/// Pointer return type with no explicit address space is assigned the
+/// default address space where pointer points to based on the language
+/// option used to compile it.
 static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext 
&Context,
 FunctionDecl *FDecl,
 MultiExprArg ArgExprs) {
@@ -6275,13 +6278,27 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema 
*Sema, ASTContext &Context,
 OverloadParams.push_back(Context.getPointerType(PointeeType));
   }
 
+  QualType ReturnTy = FT->getReturnType();
+  QualType OverloadReturnTy = ReturnTy;
+  if (ReturnTy->isPointerType() &&
+  !ReturnTy->getPointeeType().hasAddressSpace()) {
+if (Sema->getLangOpts().OpenCL) {
+  NeedsNewDecl = true;
+
+  QualType ReturnPtTy = ReturnTy->getPointeeType();
+  LangAS defClAS = Context.getDefaultOpenCLPointeeAddrSpace();
+  ReturnPtTy = Context.getAddrSpaceQualType(ReturnPtTy, defClAS);
+  OverloadReturnTy = Context.getPointerType(ReturnPtTy);
+}
+  }
+
   if (!NeedsNewDecl)
 return nullptr;
 
   FunctionProtoType::ExtProtoInfo EPI;
   EPI.Variadic = FT->isVariadic();
-  QualType OverloadTy = Context.getFunctionType(FT->getReturnType(),
-OverloadParams, EPI);
+  QualType OverloadTy =
+  Context.getFunctionType(OverloadReturnTy, OverloadParams, EPI);
   DeclContext *Parent = FDecl->getParent();
   FunctionDecl *OverloadDecl = FunctionDecl::Create(
   Context, Parent, FDecl->getLocation(), FDecl->getLocation(),
diff --git a/clang/test/CodeGenOpenCL/builtins-alloca.cl 
b/clang/test/CodeGenOpenCL/builtins-alloca.cl
new file mode 100644
index 0..74a86955f2e4f
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-alloca.cl
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:  

[clang] [Clang] [WIP] Added builtin_alloca right Address Space for OpenCL (PR #95750)

2024-07-22 Thread Vikash Gupta via cfe-commits

vg0204 wrote:

@arsenm, updated with needed reviewed changes!

https://github.com/llvm/llvm-project/pull/95750
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] [WIP] Added builtin_alloca right Address Space for OpenCL (PR #95750)

2024-07-09 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 updated 
https://github.com/llvm/llvm-project/pull/95750

>From cbe656fa6db50319e74c0fab166538518506974e Mon Sep 17 00:00:00 2001
From: vg0204 
Date: Mon, 17 Jun 2024 11:20:02 +0530
Subject: [PATCH 1/5] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2
 and below

The __builtin_alloca was returning a flat pointer with no address
space when compiled using openCL1.2 or below but worked fine with
openCL2.0 and above. This accounts to the fact that later uses the
concept of generic address space which supports cast to other address
space(i.e to private address space which is used for stack allocation)
.

So, in  case of openCL1.2 and below __built_alloca is supposed to
return pointer to private address space to eliminate the need of
casting as not supported here. Thus,it requires redefintion of the
builtin function with appropraite return pointer to appropriate
address space.
---
 clang/lib/Sema/SemaExpr.cpp | 23 +-
 clang/test/CodeGenOpenCL/builtins-alloca.cl | 86 +
 clang/test/CodeGenOpenCL/memcpy.cl  |  0
 3 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-alloca.cl
 mode change 100644 => 100755 clang/test/CodeGenOpenCL/memcpy.cl

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 4db8b4130c3c7..bb63020dadb83 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6231,7 +6231,10 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
 ///  it does not contain any pointer arguments without
 ///  an address space qualifer.  Otherwise the rewritten
 ///  FunctionDecl is returned.
-/// TODO: Handle pointer return types.
+///
+/// Pointer return type with no explicit address space is assigned the
+/// default address space where pointer points to based on the language
+/// option used to compile it.
 static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext 
&Context,
 FunctionDecl *FDecl,
 MultiExprArg ArgExprs) {
@@ -6275,13 +6278,27 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema 
*Sema, ASTContext &Context,
 OverloadParams.push_back(Context.getPointerType(PointeeType));
   }
 
+  QualType ReturnTy = FT->getReturnType();
+  QualType OverloadReturnTy = ReturnTy;
+  if (ReturnTy->isPointerType() &&
+  !ReturnTy->getPointeeType().hasAddressSpace()) {
+if (Sema->getLangOpts().OpenCL) {
+  NeedsNewDecl = true;
+
+  QualType ReturnPtTy = ReturnTy->getPointeeType();
+  LangAS defClAS = Context.getDefaultOpenCLPointeeAddrSpace();
+  ReturnPtTy = Context.getAddrSpaceQualType(ReturnPtTy, defClAS);
+  OverloadReturnTy = Context.getPointerType(ReturnPtTy);
+}
+  }
+
   if (!NeedsNewDecl)
 return nullptr;
 
   FunctionProtoType::ExtProtoInfo EPI;
   EPI.Variadic = FT->isVariadic();
-  QualType OverloadTy = Context.getFunctionType(FT->getReturnType(),
-OverloadParams, EPI);
+  QualType OverloadTy =
+  Context.getFunctionType(OverloadReturnTy, OverloadParams, EPI);
   DeclContext *Parent = FDecl->getParent();
   FunctionDecl *OverloadDecl = FunctionDecl::Create(
   Context, Parent, FDecl->getLocation(), FDecl->getLocation(),
diff --git a/clang/test/CodeGenOpenCL/builtins-alloca.cl 
b/clang/test/CodeGenOpenCL/builtins-alloca.cl
new file mode 100644
index 0..74a86955f2e4f
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-alloca.cl
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:  

[clang] [Clang] [WIP] Added builtin_alloca right Address Space for OpenCL (PR #95750)

2024-07-09 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 updated 
https://github.com/llvm/llvm-project/pull/95750

>From cbe656fa6db50319e74c0fab166538518506974e Mon Sep 17 00:00:00 2001
From: vg0204 
Date: Mon, 17 Jun 2024 11:20:02 +0530
Subject: [PATCH 1/5] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2
 and below

The __builtin_alloca was returning a flat pointer with no address
space when compiled using openCL1.2 or below but worked fine with
openCL2.0 and above. This accounts to the fact that later uses the
concept of generic address space which supports cast to other address
space(i.e to private address space which is used for stack allocation)
.

So, in  case of openCL1.2 and below __built_alloca is supposed to
return pointer to private address space to eliminate the need of
casting as not supported here. Thus,it requires redefintion of the
builtin function with appropraite return pointer to appropriate
address space.
---
 clang/lib/Sema/SemaExpr.cpp | 23 +-
 clang/test/CodeGenOpenCL/builtins-alloca.cl | 86 +
 clang/test/CodeGenOpenCL/memcpy.cl  |  0
 3 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-alloca.cl
 mode change 100644 => 100755 clang/test/CodeGenOpenCL/memcpy.cl

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 4db8b4130c3c7..bb63020dadb83 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6231,7 +6231,10 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
 ///  it does not contain any pointer arguments without
 ///  an address space qualifer.  Otherwise the rewritten
 ///  FunctionDecl is returned.
-/// TODO: Handle pointer return types.
+///
+/// Pointer return type with no explicit address space is assigned the
+/// default address space where pointer points to based on the language
+/// option used to compile it.
 static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext 
&Context,
 FunctionDecl *FDecl,
 MultiExprArg ArgExprs) {
@@ -6275,13 +6278,27 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema 
*Sema, ASTContext &Context,
 OverloadParams.push_back(Context.getPointerType(PointeeType));
   }
 
+  QualType ReturnTy = FT->getReturnType();
+  QualType OverloadReturnTy = ReturnTy;
+  if (ReturnTy->isPointerType() &&
+  !ReturnTy->getPointeeType().hasAddressSpace()) {
+if (Sema->getLangOpts().OpenCL) {
+  NeedsNewDecl = true;
+
+  QualType ReturnPtTy = ReturnTy->getPointeeType();
+  LangAS defClAS = Context.getDefaultOpenCLPointeeAddrSpace();
+  ReturnPtTy = Context.getAddrSpaceQualType(ReturnPtTy, defClAS);
+  OverloadReturnTy = Context.getPointerType(ReturnPtTy);
+}
+  }
+
   if (!NeedsNewDecl)
 return nullptr;
 
   FunctionProtoType::ExtProtoInfo EPI;
   EPI.Variadic = FT->isVariadic();
-  QualType OverloadTy = Context.getFunctionType(FT->getReturnType(),
-OverloadParams, EPI);
+  QualType OverloadTy =
+  Context.getFunctionType(OverloadReturnTy, OverloadParams, EPI);
   DeclContext *Parent = FDecl->getParent();
   FunctionDecl *OverloadDecl = FunctionDecl::Create(
   Context, Parent, FDecl->getLocation(), FDecl->getLocation(),
diff --git a/clang/test/CodeGenOpenCL/builtins-alloca.cl 
b/clang/test/CodeGenOpenCL/builtins-alloca.cl
new file mode 100644
index 0..74a86955f2e4f
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-alloca.cl
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:  

[clang] [Clang] [WIP] Added builtin_alloca right Address Space for OpenCL (PR #95750)

2024-07-11 Thread Vikash Gupta via cfe-commits

https://github.com/vg0204 updated 
https://github.com/llvm/llvm-project/pull/95750

>From cbe656fa6db50319e74c0fab166538518506974e Mon Sep 17 00:00:00 2001
From: vg0204 
Date: Mon, 17 Jun 2024 11:20:02 +0530
Subject: [PATCH 1/6] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2
 and below

The __builtin_alloca was returning a flat pointer with no address
space when compiled using openCL1.2 or below but worked fine with
openCL2.0 and above. This accounts to the fact that later uses the
concept of generic address space which supports cast to other address
space(i.e to private address space which is used for stack allocation)
.

So, in  case of openCL1.2 and below __built_alloca is supposed to
return pointer to private address space to eliminate the need of
casting as not supported here. Thus,it requires redefintion of the
builtin function with appropraite return pointer to appropriate
address space.
---
 clang/lib/Sema/SemaExpr.cpp | 23 +-
 clang/test/CodeGenOpenCL/builtins-alloca.cl | 86 +
 clang/test/CodeGenOpenCL/memcpy.cl  |  0
 3 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-alloca.cl
 mode change 100644 => 100755 clang/test/CodeGenOpenCL/memcpy.cl

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 4db8b4130c3c7..bb63020dadb83 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -6231,7 +6231,10 @@ bool Sema::CheckArgsForPlaceholders(MultiExprArg args) {
 ///  it does not contain any pointer arguments without
 ///  an address space qualifer.  Otherwise the rewritten
 ///  FunctionDecl is returned.
-/// TODO: Handle pointer return types.
+///
+/// Pointer return type with no explicit address space is assigned the
+/// default address space where pointer points to based on the language
+/// option used to compile it.
 static FunctionDecl *rewriteBuiltinFunctionDecl(Sema *Sema, ASTContext 
&Context,
 FunctionDecl *FDecl,
 MultiExprArg ArgExprs) {
@@ -6275,13 +6278,27 @@ static FunctionDecl *rewriteBuiltinFunctionDecl(Sema 
*Sema, ASTContext &Context,
 OverloadParams.push_back(Context.getPointerType(PointeeType));
   }
 
+  QualType ReturnTy = FT->getReturnType();
+  QualType OverloadReturnTy = ReturnTy;
+  if (ReturnTy->isPointerType() &&
+  !ReturnTy->getPointeeType().hasAddressSpace()) {
+if (Sema->getLangOpts().OpenCL) {
+  NeedsNewDecl = true;
+
+  QualType ReturnPtTy = ReturnTy->getPointeeType();
+  LangAS defClAS = Context.getDefaultOpenCLPointeeAddrSpace();
+  ReturnPtTy = Context.getAddrSpaceQualType(ReturnPtTy, defClAS);
+  OverloadReturnTy = Context.getPointerType(ReturnPtTy);
+}
+  }
+
   if (!NeedsNewDecl)
 return nullptr;
 
   FunctionProtoType::ExtProtoInfo EPI;
   EPI.Variadic = FT->isVariadic();
-  QualType OverloadTy = Context.getFunctionType(FT->getReturnType(),
-OverloadParams, EPI);
+  QualType OverloadTy =
+  Context.getFunctionType(OverloadReturnTy, OverloadParams, EPI);
   DeclContext *Parent = FDecl->getParent();
   FunctionDecl *OverloadDecl = FunctionDecl::Create(
   Context, Parent, FDecl->getLocation(), FDecl->getLocation(),
diff --git a/clang/test/CodeGenOpenCL/builtins-alloca.cl 
b/clang/test/CodeGenOpenCL/builtins-alloca.cl
new file mode 100644
index 0..74a86955f2e4f
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-alloca.cl
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL2.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL20 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 -emit-llvm 
-o - | FileCheck --check-prefix=OPENCL30 %s
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL3.0 
-cl-ext=+__opencl_c_generic_address_space -emit-llvm -o - | FileCheck 
--check-prefix=OPENCL30-EXT %s
+
+// OPENCL12-LABEL: define dso_local ptr addrspace(5) @test1(
+// OPENCL12-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL12-NEXT:  [[ENTRY:.*:]]
+// OPENCL12-NEXT:[[ALLOC_PTR:%.*]] = alloca ptr addrspace(5), align 4, 
addrspace(5)
+// OPENCL12-NEXT:[[TMP0:%.*]] = alloca i8, i64 128, align 8, addrspace(5)
+// OPENCL12-NEXT:store ptr addrspace(5) [[TMP0]], ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:[[TMP1:%.*]] = load ptr addrspace(5), ptr addrspace(5) 
[[ALLOC_PTR]], align 4
+// OPENCL12-NEXT:ret ptr addrspace(5) [[TMP1]]
+//
+// OPENCL20-LABEL: define dso_local ptr @test1(
+// OPENCL20-SAME: ) #[[ATTR0:[0-9]+]] {
+// OPENCL20-NEXT:  [[ENTRY:.*:]]
+// OPENCL20-NEXT:  

[clang] [Clang] [WIP] Added builtin_alloca right Address Space for OpenCL (PR #95750)

2024-07-11 Thread Vikash Gupta via cfe-commits


@@ -1981,6 +1981,26 @@ static bool OpenCLBuiltinToAddr(Sema &S, unsigned 
BuiltinID, CallExpr *Call) {
   return false;
 }
 
+// In OpenCL, __builtin_alloca_* should return a pointer to address space
+// that corresponds to the stack address space i.e private address space.
+static bool OpenCLBuiltinAllocaAddrSpace(Sema &S, CallExpr *TheCall) {
+  QualType RT = TheCall->getType();
+  if (!RT->isPointerType() || RT->getPointeeType().hasAddressSpace())
+return true;
+
+  if (S.getLangOpts().OpenCL) {
+RT = RT->getPointeeType();
+
+// Stack Address space corresponds to private address space.
+LangAS openCLStackAS = LangAS::opencl_private;

vg0204 wrote:

Done

https://github.com/llvm/llvm-project/pull/95750
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits