https://github.com/pawosm-arm created 
https://github.com/llvm/llvm-project/pull/183307

This patch is a work-in-progress snapshot of an ongoing effort to introduce a 
portable way of representing scalable vector data types, or to extend the 
existing general vector data types so that they can cover scalable vectors 
too. It is far from complete, the test coverage is insufficient, and the 
discussion behind it is still ongoing, yet we would like to share it to find 
out what other potential users think about it.

The `ext_vector_type` attribute was added to Clang in order to introduce the 
OpenCL vector types. In the case of Arm's NEON, types annotated with this 
attribute are compatible with the `neon_vector_type` types and can be used 
with the intrinsics defined in the arm_neon.h header.

Clang's builtin functions that deal with vector data types (e.g., 
`__builtin_vectorelements()`) also see no difference between vector data 
types defined with either of these attributes. The `[]` operator likewise 
works correctly with those data types.

The `ext_vector_type` attribute is not supported by GCC, but it is neat 
enough that there have been proposals to introduce it there, see [1]. The 
libc C++ support layer contains the libc/src/__support/CPP/simd.h header 
file, which uses the same attribute to introduce a portable simd data type.

Using the `ext_vector_type` attribute (as is done in the libc simd.h header 
file) is a neat way of representing vector data types in an 
architecture-agnostic manner. To extend this with the ability to cover 
scalable vectors, there seem to be two possible solutions:

1. To introduce a new attribute, e.g., `ext_scalable_vector_type`.

2. To extend the existing `ext_vector_type` attribute with the ability to 
encode scalable vector sizes in some way, e.g., with negative numbers.

In this patch I'm exploring the second path (extending the currently existing 
attribute), yet most of the challenges it introduces would be similar if we 
took the first path.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88602

From 71a75cf1e2061409e727f82ac74010ffb8aada6c Mon Sep 17 00:00:00 2001
From: Paul Osmialowski <[email protected]>
Date: Tue, 10 Feb 2026 20:29:46 +0000
Subject: [PATCH] RFC: clang, libc: Extend the ext_vector_type attribute to
 support the scalable vector sizes

This patch is a work-in-progress snapshot of an ongoing effort to
introduce a portable way of representing scalable vector data types,
or to extend the existing general vector data types so that they can
cover scalable vectors too. It is far from complete, the test coverage
is insufficient, and the discussion behind it is still ongoing, yet we
would like to share it to find out what other potential users think
about it.

The `ext_vector_type` attribute was added to Clang in order to
introduce the OpenCL vector types. In the case of Arm's NEON, types
annotated with this attribute are compatible with the
`neon_vector_type` types and can be used with the intrinsics defined
in the arm_neon.h header.

Clang's builtin functions that deal with vector data types
(e.g., `__builtin_vectorelements()`) also see no difference between
vector data types defined with either of these attributes. The `[]`
operator likewise works correctly with those data types.

The `ext_vector_type` attribute is not supported by GCC, but it is
neat enough that there have been proposals to introduce it there, see
[1]. The libc C++ support layer contains the
libc/src/__support/CPP/simd.h header file, which uses the same
attribute to introduce a portable simd data type.

Using the `ext_vector_type` attribute (as is done in the libc simd.h
header file) is a neat way of representing vector data types in an
architecture-agnostic manner. To extend this with the ability to
cover scalable vectors, there seem to be two possible solutions:

1. To introduce a new attribute, e.g., `ext_scalable_vector_type`.

2. To extend the existing `ext_vector_type` attribute with the ability
   to encode scalable vector sizes in some way, e.g., with negative
   numbers.

In this patch I'm exploring the second path (extending the currently
existing attribute), yet most of the challenges it introduces would
be similar if we took the first path.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88602
---
 clang/lib/Sema/SemaType.cpp                 |  10 +
 clang/lib/Sema/TreeTransform.h              |   5 +-
 clang/test/CodeGen/64bit-swiftcall.c        | 128 +++++++++++
 clang/test/CodeGen/arm64-abi-sve.c          | 230 ++++++++++++++++++++
 clang/test/CodeGen/builtin_vectorelements.c |  34 ++-
 libc/src/__support/CPP/simd.h               |  47 ++--
 libc/test/src/__support/CPP/simd_test.cpp   |  14 ++
 7 files changed, 443 insertions(+), 25 deletions(-)
 create mode 100644 clang/test/CodeGen/arm64-abi-sve.c

diff --git a/clang/lib/Sema/SemaType.cpp b/clang/lib/Sema/SemaType.cpp
index 28d1d63ff7acf..4248c4e6c945d 100644
--- a/clang/lib/Sema/SemaType.cpp
+++ b/clang/lib/Sema/SemaType.cpp
@@ -2435,6 +2435,16 @@ QualType Sema::BuildExtVectorType(QualType T, Expr *SizeExpr,
     }
 
     if (VecSize->isNegative()) {
+      if (Context.getTargetInfo().hasFeature("sve")) {
+        // The length of an SVE vector type is only known at runtime, but it is
+        // always a multiple of 128 bits.
+        unsigned NumEls = 128U / Context.getTypeSize(T);
+        unsigned NF = static_cast<unsigned>(-1L * VecSize->getZExtValue());
+        QualType Result = Context.getScalableVectorType(T, NumEls * NF);
+        if (!Result.isNull())
+          return Result;
+      }
+
       Diag(SizeExpr->getExprLoc(), diag::err_attribute_vec_negative_size);
       return QualType();
     }
diff --git a/clang/lib/Sema/TreeTransform.h b/clang/lib/Sema/TreeTransform.h
index fb32b0e70e3c9..1329f8d9967e9 100644
--- a/clang/lib/Sema/TreeTransform.h
+++ b/clang/lib/Sema/TreeTransform.h
@@ -5445,6 +5445,9 @@ TypeSourceInfo *TreeTransform<Derived>::TransformType(TypeSourceInfo *TSI) {
   QualType Result = getDerived().TransformType(TLB, TL);
   if (Result.isNull())
     return nullptr;
+  if (isa<DependentSizedExtVectorType>(TL.getType()) &&
+      isa<BuiltinType>(Result))
+    return SemaRef.Context.CreateTypeSourceInfo(Result);
 
   return TLB.getTypeSourceInfo(SemaRef.Context, Result);
 }
@@ -6132,7 +6135,7 @@ QualType TreeTransform<Derived>::TransformDependentSizedExtVectorType(
     DependentSizedExtVectorTypeLoc NewTL
       = TLB.push<DependentSizedExtVectorTypeLoc>(Result);
     NewTL.setNameLoc(TL.getNameLoc());
-  } else {
+  } else if (!isa<BuiltinType>(Result)) {
     ExtVectorTypeLoc NewTL = TLB.push<ExtVectorTypeLoc>(Result);
     NewTL.setNameLoc(TL.getNameLoc());
   }
diff --git a/clang/test/CodeGen/64bit-swiftcall.c b/clang/test/CodeGen/64bit-swiftcall.c
index 448bca7acbca3..cc60ac0f6844c 100644
--- a/clang/test/CodeGen/64bit-swiftcall.c
+++ b/clang/test/CodeGen/64bit-swiftcall.c
@@ -2,6 +2,7 @@
 // RUN: %clang_cc1 -no-enable-noundef-analysis -triple x86_64-apple-darwin10 -target-cpu core2 -emit-llvm -o - %s | FileCheck %s --check-prefix=X86-64
 // RUN: %clang_cc1 -no-enable-noundef-analysis -triple arm64-apple-ios9 -target-cpu cyclone -emit-llvm -o - %s | FileCheck %s
 // RUN: %clang_cc1 -no-enable-noundef-analysis -triple arm64-apple-ios9 -target-cpu cyclone -emit-llvm -o - %s | FileCheck %s --check-prefix=ARM64
+// RUN: %clang_cc1 -no-enable-noundef-analysis -triple arm64-apple-ios9 -target-feature +sve -emit-llvm -o - %s | FileCheck %s --check-prefixes=ARM64,ARM64-SVE
 
 // REQUIRES: aarch64-registered-target,x86-registered-target
 
@@ -1059,3 +1060,130 @@ TEST(vector_union)
 
 // CHECK-LABEL: define swiftcc { float, float, float, float } @return_vector_union()
 // CHECK-LABEL: define swiftcc void @take_vector_union(float %0, float %1, float %2, float %3)
+
+#if defined(__ARM_FEATURE_SVE)
+
+#define SCALABLE_SIZE(N) (-1 * ((signed)(N)))
+
+typedef float svfloat1 __attribute__((ext_vector_type(SCALABLE_SIZE(1))));
+typedef float svfloat4 __attribute__((ext_vector_type(SCALABLE_SIZE(4))));
+typedef double svdouble1 __attribute__((ext_vector_type(SCALABLE_SIZE(1))));
+typedef double svdouble4 __attribute__((ext_vector_type(SCALABLE_SIZE(4))));
+typedef int svint1 __attribute__((ext_vector_type(SCALABLE_SIZE(1))));
+typedef int svint4 __attribute__((ext_vector_type(SCALABLE_SIZE(4))));
+typedef signed char svchar1 __attribute__((ext_vector_type(SCALABLE_SIZE(1))));
+typedef signed char svchar4 __attribute__((ext_vector_type(SCALABLE_SIZE(4))));
+typedef short svshort1 __attribute__((ext_vector_type(SCALABLE_SIZE(1))));
+typedef short svshort4 __attribute__((ext_vector_type(SCALABLE_SIZE(4))));
+typedef long long svlong1 __attribute__((ext_vector_type(SCALABLE_SIZE(1))));
+typedef long long svlong4 __attribute__((ext_vector_type(SCALABLE_SIZE(4))));
+
+TEST(__SVFloat32_t)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return___SVFloat32_t()
+// ARM64-SVE: ret [[SVFLOAT1_T:.+]] %0
+
+TEST(svfloat1)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return_svfloat1()
+// ARM64-SVE: ret [[SVFLOAT1_T]] %0
+// ARM64-SVE-LABEL: define{{.*}} swiftcc void @take_svfloat1(<vscale x 4 x float> %v)
+
+TEST(__clang_svfloat32x4_t)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return___clang_svfloat32x4_t()
+// ARM64-SVE: ret [[SVFLOAT4_T:.+]] %0
+
+TEST(svfloat4)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return_svfloat4()
+// ARM64-SVE: ret [[SVFLOAT4_T]] %0
+// ARM64-SVE-LABEL: define{{.*}} swiftcc void @take_svfloat4(<vscale x 4 x float> %v.coerce0, <vscale x 4 x float> %v.coerce1, <vscale x 4 x float> %v.coerce2, <vscale x 4 x float> %v.coerce3)
+
+TEST(__SVFloat64_t)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return___SVFloat64_t()
+// ARM64-SVE: ret [[SVDOUBLE1_T:.+]] %0
+
+TEST(svdouble1)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return_svdouble1()
+// ARM64-SVE: ret [[SVDOUBLE1_T]] %0
+// ARM64-SVE-LABEL: define{{.*}} swiftcc void @take_svdouble1(<vscale x 2 x double> %v)
+
+TEST(__clang_svfloat64x4_t)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return___clang_svfloat64x4_t()
+// ARM64-SVE: ret [[SVDOUBLE4_T:.+]] %0
+
+TEST(svdouble4)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return_svdouble4()
+// ARM64-SVE: ret [[SVDOUBLE4_T]] %0
+// ARM64-SVE-LABEL: define{{.*}} swiftcc void @take_svdouble4(<vscale x 2 x double> %v.coerce0, <vscale x 2 x double> %v.coerce1, <vscale x 2 x double> %v.coerce2, <vscale x 2 x double> %v.coerce3)
+
+TEST(__SVInt32_t)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return___SVInt32_t()
+// ARM64-SVE: ret [[SVINT1_T:.+]] %0
+
+TEST(svint1)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return_svint1()
+// ARM64-SVE: ret [[SVINT1_T]] %0
+// ARM64-SVE-LABEL: define{{.*}} swiftcc void @take_svint1(<vscale x 4 x i32> %v)
+
+TEST(__clang_svint32x4_t)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return___clang_svint32x4_t()
+// ARM64-SVE: ret [[SVINT4_T:.+]] %0
+
+TEST(svint4)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return_svint4()
+// ARM64-SVE: ret [[SVINT4_T]] %0
+// ARM64-SVE-LABEL: define{{.*}} swiftcc void @take_svint4(<vscale x 4 x i32> %v.coerce0, <vscale x 4 x i32> %v.coerce1, <vscale x 4 x i32> %v.coerce2, <vscale x 4 x i32> %v.coerce3)
+
+TEST(__SVInt8_t)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return___SVInt8_t()
+// ARM64-SVE: ret [[SVCHAR1_T:.+]] %0
+
+TEST(svchar1)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return_svchar1()
+// ARM64-SVE: ret [[SVCHAR1_T]] %0
+// ARM64-SVE-LABEL: define{{.*}} swiftcc void @take_svchar1(<vscale x 16 x i8> %v)
+
+TEST(__clang_svint8x4_t)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return___clang_svint8x4_t()
+// ARM64-SVE: ret [[SVCHAR4_T:.+]] %0
+
+TEST(svchar4)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return_svchar4()
+// ARM64-SVE: ret [[SVCHAR4_T]] %0
+// ARM64-SVE-LABEL: define{{.*}} swiftcc void @take_svchar4(<vscale x 16 x i8> %v.coerce0, <vscale x 16 x i8> %v.coerce1, <vscale x 16 x i8> %v.coerce2, <vscale x 16 x i8> %v.coerce3)
+
+TEST(__SVInt16_t)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return___SVInt16_t()
+// ARM64-SVE: ret [[SVSHORT1_T:.+]] %0
+
+TEST(svshort1)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return_svshort1()
+// ARM64-SVE: ret [[SVSHORT1_T]] %0
+// ARM64-SVE-LABEL: define{{.*}} swiftcc void @take_svshort1(<vscale x 8 x i16> %v)
+
+TEST(__clang_svint16x4_t)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return___clang_svint16x4_t()
+// ARM64-SVE: ret [[SVSHORT4_T:.+]] %0
+
+TEST(svshort4)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return_svshort4()
+// ARM64-SVE: ret [[SVSHORT4_T]] %0
+// ARM64-SVE-LABEL: define{{.*}} swiftcc void @take_svshort4(<vscale x 8 x i16> %v.coerce0, <vscale x 8 x i16> %v.coerce1, <vscale x 8 x i16> %v.coerce2, <vscale x 8 x i16> %v.coerce3)
+
+TEST(__SVInt64_t)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return___SVInt64_t()
+// ARM64-SVE: ret [[SVLONG1_T:.+]] %0
+
+TEST(svlong1)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return_svlong1()
+// ARM64-SVE: ret [[SVLONG1_T]] %0
+// ARM64-SVE-LABEL: define{{.*}} swiftcc void @take_svlong1(<vscale x 2 x i64> %v)
+
+TEST(__clang_svint64x4_t)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return___clang_svint64x4_t()
+// ARM64-SVE: ret [[SVLONG4_T:.+]] %0
+
+TEST(svlong4)
+// ARM64-SVE-LABEL: define{{.*}} swiftcc {{.+}} @return_svlong4()
+// ARM64-SVE: ret [[SVLONG4_T]] %0
+// ARM64-SVE-LABEL: define{{.*}} swiftcc void @take_svlong4(<vscale x 2 x i64> %v.coerce0, <vscale x 2 x i64> %v.coerce1, <vscale x 2 x i64> %v.coerce2, <vscale x 2 x i64> %v.coerce3)
+
+#endif /* defined(__ARM_FEATURE_SVE) */
diff --git a/clang/test/CodeGen/arm64-abi-sve.c b/clang/test/CodeGen/arm64-abi-sve.c
new file mode 100644
index 0000000000000..9570867d76e6e
--- /dev/null
+++ b/clang/test/CodeGen/arm64-abi-sve.c
@@ -0,0 +1,230 @@
+// RUN: %clang_cc1 -triple arm64-apple-ios7 -target-abi darwinpcs -target-feature +sve -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-linux-android -target-feature +sve -emit-llvm -o - %s | FileCheck %s
+
+#include <stdarg.h>
+
+#define SCALABLE_SIZE(N) (-1 * ((signed)(N)))
+
+typedef __attribute__(( ext_vector_type(SCALABLE_SIZE(1)) ))  char __char1s;
+typedef __attribute__(( ext_vector_type(SCALABLE_SIZE(2)) ))  char __char2s;
+typedef __attribute__(( ext_vector_type(SCALABLE_SIZE(3)) ))  char __char3s;
+typedef __attribute__(( ext_vector_type(SCALABLE_SIZE(4)) ))  char __char4s;
+typedef __attribute__(( ext_vector_type(SCALABLE_SIZE(1)) ))  short __short1s;
+typedef __attribute__(( ext_vector_type(SCALABLE_SIZE(2)) ))  short __short2s;
+typedef __attribute__(( ext_vector_type(SCALABLE_SIZE(3)) ))  short __short3s;
+typedef __attribute__(( ext_vector_type(SCALABLE_SIZE(1)) ))  int __int1s;
+typedef __attribute__(( ext_vector_type(SCALABLE_SIZE(4)) ))  int __int4s;
+typedef __attribute__(( ext_vector_type(SCALABLE_SIZE(1)) ))  double __double1s;
+typedef __attribute__(( ext_vector_type(SCALABLE_SIZE(2)) ))  double __double2s;
+
+double svfunc__char1s(__char1s arg);
+
+double vec_s1c(int fixed, __char1s c1s) {
+// CHECK-LABEL: @vec_s1c
+// CHECK: [[PTR:%.*]] = alloca <vscale x 16 x i8>, align 16
+// CHECK: store <vscale x 16 x i8> %c1s, ptr [[PTR]], align 16
+// CHECK: [[CALL:%.*]] = call double @svfunc__char1s(<vscale x 16 x i8> {{%.*}})
+  double sum = fixed;
+
+  return sum + svfunc__char1s(c1s);
+}
+
+double test_s1c(__char1s *in) {
+// CHECK-LABEL: @test_s1c
+// CHECK: call double @vec_s1c(i32 noundef 1, <vscale x 16 x i8> {{%.*}})
+  return vec_s1c(1, *in);
+}
+
+double svfunc__char2s(__char2s arg);
+
+double vec_s2c(int fixed, __char2s c2s) {
+// CHECK-LABEL: @vec_s2c
+// CHECK: [[PTR:%.*]] = alloca { <vscale x 16 x i8>, <vscale x 16 x i8> }, align 16
+// CHECK: {{%.*}} = insertvalue {{.*}} poison, {{.*}}.coerce{{.*}}
+// CHECK-COUNT-1: {{%.*}} = insertvalue {{.*}} {{%.*}}, {{.*}}.coerce{{.*}}
+// CHECK: store { <vscale x 16 x i8>, <vscale x 16 x i8> } {{%.*}}, ptr [[PTR]], align 16
+// CHECK: [[CALL:%.*]] = call double @svfunc__char2s(<vscale x 16 x i8> {{%.*}}.extract0, <vscale x 16 x i8> {{%.*}}.extract1)
+  double sum = fixed;
+
+  return sum + svfunc__char2s(c2s);
+}
+
+double test_s2c(__char2s *in) {
+// CHECK-LABEL: @test_s2c
+// CHECK: call double @vec_s2c(i32 noundef 1, <vscale x 16 x i8> {{%.*}}.extract0, <vscale x 16 x i8> {{%.*}}.extract1)
+  return vec_s2c(1, *in);
+}
+
+double svfunc__char3s(__char3s arg);
+
+double vec_s3c(int fixed, __char3s c3s) {
+// CHECK-LABEL: @vec_s3c
+// CHECK: [[PTR:%.*]] = alloca { <vscale x 16 x i8>, <vscale x 16 x i8>, <vscale x 16 x i8> }, align 16
+// CHECK: {{%.*}} = insertvalue {{.*}} poison, {{.*}}.coerce{{.*}}
+// CHECK-COUNT-2: {{%.*}} = insertvalue {{.*}} {{%.*}}, {{.*}}.coerce{{.*}}
+// CHECK: store { <vscale x 16 x i8>, <vscale x 16 x i8>, <vscale x 16 x i8> } {{%.*}}, ptr [[PTR]], align 16
+// CHECK: [[CALL:%.*]] = call double @svfunc__char3s(<vscale x 16 x i8> {{%.*}}.extract0, <vscale x 16 x i8> {{%.*}}.extract1, <vscale x 16 x i8> {{%.*}}.extract2)
+  double sum = fixed;
+
+  return sum + svfunc__char3s(c3s);
+}
+
+double test_s3c(__char3s *in) {
+// CHECK-LABEL: @test_s3c
+// CHECK: call double @vec_s3c(i32 noundef 1, <vscale x 16 x i8> {{%.*}}.extract0, <vscale x 16 x i8> {{%.*}}.extract1, <vscale x 16 x i8> {{%.*}}.extract2)
+  return vec_s3c(1, *in);
+}
+
+double svfunc__char4s(__char4s arg);
+
+double vec_s4c(int fixed, __char4s c4s) {
+// CHECK-LABEL: @vec_s4c
+// CHECK: [[PTR:%.*]] = alloca { <vscale x 16 x i8>, <vscale x 16 x i8>, <vscale x 16 x i8>, <vscale x 16 x i8> }, align 16
+// CHECK: {{%.*}} = insertvalue {{.*}} poison, {{.*}}.coerce{{.*}}
+// CHECK-COUNT-3: {{%.*}} = insertvalue {{.*}} {{%.*}}, {{.*}}.coerce{{.*}}
+// CHECK: store { <vscale x 16 x i8>, <vscale x 16 x i8>, <vscale x 16 x i8>, <vscale x 16 x i8> } {{%.*}}, ptr [[PTR]], align 16
+// CHECK: [[CALL:%.*]] = call double @svfunc__char4s(<vscale x 16 x i8> {{%.*}}.extract0, <vscale x 16 x i8> {{%.*}}.extract1, <vscale x 16 x i8> {{%.*}}.extract2, <vscale x 16 x i8> {{%.*}}.extract3)
+  double sum = fixed;
+
+  return sum + svfunc__char4s(c4s);
+}
+
+double test_s4c(__char4s *in) {
+// CHECK-LABEL: @test_s4c
+// CHECK: call double @vec_s4c(i32 noundef 1, <vscale x 16 x i8> {{%.*}}.extract0, <vscale x 16 x i8> {{%.*}}.extract1, <vscale x 16 x i8> {{%.*}}.extract2, <vscale x 16 x i8> {{%.*}}.extract3)
+  return vec_s4c(1, *in);
+}
+
+double svfunc__short1s(__short1s arg);
+
+double vec_s1s(int fixed, __short1s s1s) {
+// CHECK-LABEL: @vec_s1s
+// CHECK: [[PTR:%.*]] = alloca <vscale x 8 x i16>, align 16
+// CHECK: store <vscale x 8 x i16> %s1s, ptr [[PTR]], align 16
+// CHECK: [[CALL:%.*]] = call double @svfunc__short1s(<vscale x 8 x i16> {{%.*}})
+  double sum = fixed;
+
+  return sum + svfunc__short1s(s1s);
+}
+
+double test_s1s(__short1s *in) {
+// CHECK-LABEL: @test_s1s
+// CHECK: call double @vec_s1s(i32 noundef 1, <vscale x 8 x i16> {{%.*}})
+  return vec_s1s(1, *in);
+}
+
+double svfunc__short2s(__short2s arg);
+
+double vec_s2s(int fixed, __short2s s2s) {
+// CHECK-LABEL: @vec_s2s
+// CHECK: [[PTR:%.*]] = alloca { <vscale x 8 x i16>, <vscale x 8 x i16> }, align 16
+// CHECK: {{%.*}} = insertvalue {{.*}} poison, {{.*}}.coerce{{.*}}
+// CHECK-COUNT-1: {{%.*}} = insertvalue {{.*}} {{%.*}}, {{.*}}.coerce{{.*}}
+// CHECK: store { <vscale x 8 x i16>, <vscale x 8 x i16> } {{%.*}}, ptr [[PTR]], align 16
+// CHECK: [[CALL:%.*]] = call double @svfunc__short2s(<vscale x 8 x i16> {{%.*}}.extract0, <vscale x 8 x i16> {{%.*}}.extract1)
+  double sum = fixed;
+
+  return sum + svfunc__short2s(s2s);
+}
+
+double test_s2s(__short2s *in) {
+// CHECK-LABEL: @test_s2s
+// CHECK: call double @vec_s2s(i32 noundef 1, <vscale x 8 x i16> {{%.*}}.extract0, <vscale x 8 x i16> {{%.*}}.extract1)
+  return vec_s2s(1, *in);
+}
+
+double svfunc__short3s(__short3s arg);
+
+double vec_s3s(int fixed, __short3s s3s) {
+// CHECK-LABEL: @vec_s3s
+// CHECK: [[PTR:%.*]] = alloca { <vscale x 8 x i16>, <vscale x 8 x i16>, <vscale x 8 x i16> }, align 16
+// CHECK: {{%.*}} = insertvalue {{.*}} poison, {{.*}}.coerce{{.*}}
+// CHECK-COUNT-2: {{%.*}} = insertvalue {{.*}} {{%.*}}, {{.*}}.coerce{{.*}}
+// CHECK: store { <vscale x 8 x i16>, <vscale x 8 x i16>, <vscale x 8 x i16> } {{%.*}}, ptr [[PTR]], align 16
+// CHECK: [[CALL:%.*]] = call double @svfunc__short3s(<vscale x 8 x i16> {{%.*}}.extract0, <vscale x 8 x i16> {{%.*}}.extract1, <vscale x 8 x i16> {{%.*}}.extract2)
+  double sum = fixed;
+
+  return sum + svfunc__short3s(s3s);
+}
+
+double test_s3s(__short3s *in) {
+// CHECK-LABEL: @test_s3s
+// CHECK: call double @vec_s3s(i32 noundef 1, <vscale x 8 x i16> {{%.*}}.extract0, <vscale x 8 x i16> {{%.*}}.extract1, <vscale x 8 x i16> {{%.*}}.extract2)
+  return vec_s3s(1, *in);
+}
+
+double svfunc__int1s(__int1s arg);
+
+double vec_s1i(int fixed, __int1s i1s) {
+// CHECK-LABEL: @vec_s1i
+// CHECK: [[PTR:%.*]] = alloca <vscale x 4 x i32>, align 16
+// CHECK: store <vscale x 4 x i32> %i1s, ptr [[PTR]], align 16
+// CHECK: [[CALL:%.*]] = call double @svfunc__int1s(<vscale x 4 x i32> {{%.*}})
+  double sum = fixed;
+
+  return sum + svfunc__int1s(i1s);
+}
+
+double test_s1i(__int1s *in) {
+// CHECK-LABEL: @test_s1i
+// CHECK: call double @vec_s1i(i32 noundef 1, <vscale x 4 x i32> {{%.*}})
+  return vec_s1i(1, *in);
+}
+
+double svfunc__int4s(__int4s arg);
+
+double vec_s4i(int fixed, __int4s i4s) {
+// CHECK-LABEL: @vec_s4i
+// CHECK: [[PTR:%.*]] = alloca { <vscale x 4 x i32>, <vscale x 4 x i32>, <vscale x 4 x i32>, <vscale x 4 x i32> }, align 16
+// CHECK: {{%.*}} = insertvalue {{.*}} poison, {{.*}}.coerce{{.*}}
+// CHECK-COUNT-3: {{%.*}} = insertvalue {{.*}} {{%.*}}, {{.*}}.coerce{{.*}}
+// CHECK: store { <vscale x 4 x i32>, <vscale x 4 x i32>, <vscale x 4 x i32>, <vscale x 4 x i32> } {{%.*}}, ptr [[PTR]], align 16
+// CHECK: [[CALL:%.*]] = call double @svfunc__int4s(<vscale x 4 x i32> {{%.*}}.extract0, <vscale x 4 x i32> {{%.*}}.extract1, <vscale x 4 x i32> {{%.*}}.extract2, <vscale x 4 x i32> {{%.*}}.extract3)
+  double sum = fixed;
+
+  return sum + svfunc__int4s(i4s);
+}
+
+double test_s4i(__int4s *in) {
+// CHECK-LABEL: @test_s4i
+// CHECK: call double @vec_s4i(i32 noundef 1, <vscale x 4 x i32> {{%.*}}.extract0, <vscale x 4 x i32> {{%.*}}.extract1, <vscale x 4 x i32> {{%.*}}.extract2, <vscale x 4 x i32> {{%.*}}.extract3)
+  return vec_s4i(1, *in);
+}
+
+double svfunc__double1s(__double1s arg);
+
+double vec_s1d(int fixed, __double1s d1s) {
+// CHECK-LABEL: @vec_s1d
+// CHECK: [[PTR:%.*]] = alloca <vscale x 2 x double>, align 16
+// CHECK: store <vscale x 2 x double> %d1s, ptr [[PTR]], align 16
+// CHECK: [[CALL:%.*]] = call double @svfunc__double1s(<vscale x 2 x double> {{%.*}})
+  double sum = fixed;
+
+  return sum + svfunc__double1s(d1s);
+}
+
+double test_s1d(__double1s *in) {
+// CHECK-LABEL: @test_s1d
+// CHECK: call double @vec_s1d(i32 noundef 1, <vscale x 2 x double> {{%.*}})
+  return vec_s1d(1, *in);
+}
+
+double svfunc__double2s(__double2s arg);
+
+double vec_s2d(int fixed, __double2s d2s) {
+// CHECK-LABEL: @vec_s2d
+// CHECK: [[PTR:%.*]] = alloca { <vscale x 2 x double>, <vscale x 2 x double> }, align 16
+// CHECK: {{%.*}} = insertvalue {{.*}} poison, {{.*}}.coerce{{.*}}
+// CHECK-COUNT-1: {{%.*}} = insertvalue {{.*}} {{%.*}}, {{.*}}.coerce{{.*}}
+// CHECK: store { <vscale x 2 x double>, <vscale x 2 x double> } {{%.*}}, ptr [[PTR]], align 16
+// CHECK: [[CALL:%.*]] = call double @svfunc__double2s(<vscale x 2 x double> {{%.*}}.extract0, <vscale x 2 x double> {{%.*}}.extract1)
+  double sum = fixed;
+
+  return sum + svfunc__double2s(d2s);
+}
+
+double test_s2d(__double2s *in) {
+// CHECK-LABEL: @test_s2d
+// CHECK: call double @vec_s2d(i32 noundef 1, <vscale x 2 x double> {{%.*}}.extract0, <vscale x 2 x double> {{%.*}}.extract1)
+  return vec_s2d(1, *in);
+}
diff --git a/clang/test/CodeGen/builtin_vectorelements.c b/clang/test/CodeGen/builtin_vectorelements.c
index 45f7a3c34562b..7047bd930419f 100644
--- a/clang/test/CodeGen/builtin_vectorelements.c
+++ b/clang/test/CodeGen/builtin_vectorelements.c
@@ -1,13 +1,7 @@
-// RUN: %clang_cc1 -O1 -triple x86_64                        %s -emit-llvm -disable-llvm-passes -o - | FileCheck --check-prefixes=CHECK       %s
-
-// REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -O1 -triple aarch64 -target-feature +neon %s -emit-llvm -disable-llvm-passes -o - | FileCheck --check-prefixes=CHECK,NEON  %s
-
-// REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -O1 -triple aarch64 -target-feature +sve  %s -emit-llvm -disable-llvm-passes -o - | FileCheck --check-prefixes=CHECK,SVE   %s
-
-// REQUIRES: riscv-registered-target
-// RUN: %clang_cc1 -O1 -triple riscv64 -target-feature +v    %s -emit-llvm -disable-llvm-passes -o - | FileCheck --check-prefixes=CHECK,RISCV %s
+// RUN: %clang_cc1                                  -O1 -triple x86_64                        %s -emit-llvm -disable-llvm-passes -o - | FileCheck --check-prefixes=CHECK       %s
+// RUN: %if aarch64-registered-target %{ %clang_cc1 -O1 -triple aarch64 -target-feature +neon %s -emit-llvm -disable-llvm-passes -o - | FileCheck --check-prefixes=CHECK,NEON  %s %}
+// RUN: %if aarch64-registered-target %{ %clang_cc1 -O1 -triple aarch64 -target-feature +sve  %s -emit-llvm -disable-llvm-passes -o - | FileCheck --check-prefixes=CHECK,SVE   %s %}
+// RUN: %if riscv-registered-target   %{ %clang_cc1 -O1 -triple riscv64 -target-feature +v    %s -emit-llvm -disable-llvm-passes -o - | FileCheck --check-prefixes=CHECK,RISCV %s %}
 
 /// Note that this does not make sense to check for x86 SIMD types, because
 /// __m128i, __m256i, and __m512i do not specify the element type. There are no
@@ -19,6 +13,10 @@ typedef int int8 __attribute__((vector_size(32)));
 typedef int int16 __attribute__((vector_size(64)));
 typedef float float2 __attribute__((vector_size(8)));
 typedef long extLong4 __attribute__((ext_vector_type(4)));
+#if defined(__ARM_FEATURE_SVE)
+#define SCALABLE_SIZE(N) (-1 * ((signed)(N)))
+typedef long extLong1s __attribute__((ext_vector_type(SCALABLE_SIZE(1))));
+#endif
 
 
 int test_builtin_vectorelements_int1() {
@@ -82,6 +80,22 @@ int test_builtin_vectorelements_neon64x1() {
 #if defined(__ARM_FEATURE_SVE)
 #include <arm_sve.h>
 
+long test_builtin_vectorelements_sve64() {
+  // SVE: i64 @test_builtin_vectorelements_sve64(
+  // SVE: [[VSCALE:%.+]] = call i64 [[I64_VSCALE_CALL:@llvm.vscale.i64]]()
+  // SVE: [[RES:%.+]] = mul nuw i64 [[VSCALE]], [[I64_MUL:2]]
+  // SVE: ret i64 [[RES]]
+  return __builtin_vectorelements(svuint64_t);
+}
+
+long test_builtin_vectorelements_extLong1s() {
+  // SVE-LABEL: i64 @test_builtin_vectorelements_extLong1s(
+  // SVE: [[VSCALE:%.+]] = call i64 [[I64_VSCALE_CALL]]()
+  // SVE: [[RES:%.+]] = mul nuw i64 %0, [[I64_MUL]]
+  // SVE: ret i64 [[RES]]
+  return __builtin_vectorelements(extLong1s);
+}
+
 long test_builtin_vectorelements_sve32() {
   // SVE: i64 @test_builtin_vectorelements_sve32(
   // SVE: [[VSCALE:%.+]] = call i64 @llvm.vscale.i64()
diff --git a/libc/src/__support/CPP/simd.h b/libc/src/__support/CPP/simd.h
index 422d2f4c8433d..6a5fb73fac456 100644
--- a/libc/src/__support/CPP/simd.h
+++ b/libc/src/__support/CPP/simd.h
@@ -32,6 +32,9 @@
 namespace LIBC_NAMESPACE_DECL {
 namespace cpp {
 
+template <size_t N>
+constexpr signed scalable_size = -1 * static_cast<signed>(N);
+
 namespace internal {
 
 #if defined(LIBC_TARGET_CPU_HAS_AVX512F)
@@ -52,7 +55,9 @@ template <typename T> LIBC_INLINE constexpr size_t native_vector_size = 1;
 // Type aliases.
 template <typename T, size_t N>
 using fixed_size_simd = T [[clang::ext_vector_type(N)]];
-template <typename T, size_t N = internal::native_vector_size<T>>
+template <typename T, size_t N>
+using scalable_size_simd = T [[clang::ext_vector_type(scalable_size<N>)]];
+template <typename T, auto N = internal::native_vector_size<T>>
 using simd = T [[clang::ext_vector_type(N)]];
 template <typename T>
 using simd_mask = simd<bool, internal::native_vector_size<T>>;
@@ -64,18 +69,21 @@ struct simd_size : cpp::integral_constant<size_t, __builtin_vectorelements(T)> {
 template <class T> constexpr size_t simd_size_v = simd_size<T>::value;
 
 template <typename T> struct is_simd : cpp::integral_constant<bool, false> {};
-template <typename T, unsigned N>
+template <typename T, auto N>
 struct is_simd<simd<T, N>> : cpp::integral_constant<bool, true> {};
 template <class T> constexpr bool is_simd_v = is_simd<T>::value;
 
+template <auto N> constexpr bool is_scalable_size_v =
+                                   static_cast<signed>(N) < 0;
+
 template <typename T>
 struct is_simd_mask : cpp::integral_constant<bool, false> {};
-template <unsigned N>
+template <auto N>
 struct is_simd_mask<simd<bool, N>> : cpp::integral_constant<bool, true> {};
 template <class T> constexpr bool is_simd_mask_v = is_simd_mask<T>::value;
 
 template <typename T> struct simd_element_type;
-template <typename T, size_t N> struct simd_element_type<simd<T, N>> {
+template <typename T, auto N> struct simd_element_type<simd<T, N>> {
   using type = T;
 };
 template <typename T>
@@ -153,6 +161,13 @@ using enable_if_integral_t = cpp::enable_if_t<cpp::is_integral_v<T>, T>;
 template <typename T>
 using enable_if_simd_t = cpp::enable_if_t<is_simd_v<T>, bool>;
 
+template <auto N>
+using enable_if_scalable_size_t = cpp::enable_if_t<is_scalable_size_v<N>, bool>;
+
+template <auto N>
+using enable_if_not_scalable_size_t =
+        cpp::enable_if_t<!is_scalable_size_v<N>, bool>;
+
 } // namespace internal
 
 // Casting.
@@ -356,27 +371,31 @@ LIBC_INLINE constexpr static void compress(simd<bool, simd_size_v<T>> mask, T v,
 }
 
 // Construction helpers.
-template <typename T, size_t N>
+template <typename T, auto N = internal::native_vector_size<T>,
+          internal::enable_if_not_scalable_size_t<N> = 0>
 LIBC_INLINE constexpr static simd<T, N> splat(T v) {
   return simd<T, N>(v);
 }
-template <typename T> LIBC_INLINE constexpr static simd<T> splat(T v) {
-  return splat<T, simd_size_v<simd<T>>>(v);
+template <typename T, auto N = internal::native_vector_size<T>,
+          internal::enable_if_scalable_size_t<N> = 0>
+LIBC_INLINE constexpr static simd<T, N> splat(T v) {
+  simd<T, N> sv;
+  size_t n = __builtin_vectorelements(simd<T, N>);
+  for (unsigned i = 0U; i < n; ++i)
+    sv[i] = v;
+  return sv;
 }
-template <typename T, unsigned N>
+template <typename T, auto N = internal::native_vector_size<T>>
 LIBC_INLINE constexpr static simd<T, N> iota(T base = T(0), T step = T(1)) {
   simd<T, N> v{};
-  for (unsigned i = 0; i < N; ++i)
+  size_t n = __builtin_vectorelements(simd<T, N>);
+  for (unsigned i = 0U; i < n; ++i)
     v[i] = base + T(i) * step;
   return v;
 }
-template <typename T>
-LIBC_INLINE constexpr static simd<T> iota(T base = T(0), T step = T(1)) {
-  return iota<T, simd_size_v<simd<T>>>(base, step);
-}
 
 // Conditional helpers.
-template <typename T, size_t N>
+template <typename T, auto N>
 LIBC_INLINE constexpr static simd<T, N> select(simd<bool, N> m, simd<T, N> x,
                                                simd<T, N> y) {
   return m ? x : y;
diff --git a/libc/test/src/__support/CPP/simd_test.cpp b/libc/test/src/__support/CPP/simd_test.cpp
index 8bead8461d649..d9c79c4a2b5e8 100644
--- a/libc/test/src/__support/CPP/simd_test.cpp
+++ b/libc/test/src/__support/CPP/simd_test.cpp
@@ -148,3 +148,17 @@ TEST(LlvmLibcSIMDTest, MaskedCompressExpand) {
 
   EXPECT_TRUE(cpp::all_of(!mask_expand || v2 <= SIZE / 2));
 }
+
+#if defined(LIBC_TARGET_CPU_HAS_SVE) || defined(LIBC_TARGET_CPU_HAS_SVE2)
+
+TEST(LlvmLibcSIMDTest, SizelessVectorCreation) {
+  cpp::simd<int, cpp::scalable_size<1>> svsplat = cpp::splat(5);
+  cpp::simd<int, cpp::scalable_size<1>> sviota = cpp::iota(0);
+
+  EXPECT_EQ(svsplat[0], 5);
+  EXPECT_EQ(svsplat[1], 5);
+  EXPECT_EQ(sviota[0], 0);
+  EXPECT_EQ(sviota[1], 1);
+}
+
+#endif

_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
