[PATCH] D116203: [clang] adds unary type transformations as compiler built-ins

Richard Smith - zygoloid via Phabricator via cfe-commits Mon, 25 Apr 2022 17:06:05 -0700

rsmith added inline comments.


================
Comment at: clang/include/clang/AST/TransformTypeTraits.def:27
+TRANSFORM_TYPE_TRAIT_DEF(RemoveVolatile, remove_volatile)
+TRANSFORM_TYPE_TRAIT_DEF(EnumUnderlyingType, underlying_type)
----------------
This file should undef the `TRANSFORM_TYPE_TRAIT_DEF` macro at the end, like 
our other `.def` files do. Then you can remove all of the `#undef`s elsewhere 
in this patch.


================
Comment at: clang/include/clang/AST/Type.h:4567-4582
+    EnumUnderlyingType,
+    AddLvalueReference,
+    AddPointer,
+    AddRvalueReference,
+    Decay,
+    MakeSigned,
+    MakeUnsigned,
----------------
Consider using the `.def` file here.


================
Comment at: clang/include/clang/AST/Type.h:6528
+  (void)IsCPlusPlus;
+  assert(IsCPlusPlus && "Referenceable types only make sense in C++");
+  const Type &Self = **this;
----------------
I think the existence of the `IsCPlusPlus` flag may have the opposite effect of 
the one that's intended:

* Without this flag, looking only at the declaration of this function, I would 
assume you're not supposed to call this function except in C++ mode.
* With this flag, looking only at the declaration of this function, I would 
assume that passing in `IsCPlusPlus = false` would cause the function to always 
return `false` (because there are no reference types).

It might be better to define "referenceable" as "function type with no 
ref-qualifier nor cv-qualifier, object type, or reference type", and just 
accept calls to `isReferenceable` across all language modes.


================
Comment at: clang/include/clang/AST/Type.h:6532-6537
+  if (!Self.isFunctionType())
+    return false;
+
+  const auto *F = Self.getAs<FunctionProtoType>();
+  assert(F != nullptr);
+  return F->getMethodQuals().empty() && F->getRefQualifier() == RQ_None;
----------------
It's better not to test whether the type is a function type (walking past sugar 
etc) more than once.


================
Comment at: clang/include/clang/Sema/DeclSpec.h:340
   /*TSS*/unsigned TypeSpecSign : 2;
-  /*TST*/unsigned TypeSpecType : 6;
+  /*TST*/ unsigned TypeSpecType : 7;
   unsigned TypeAltiVecVector : 1;
----------------
Please keep the formatting consistent here, even if you need to override 
clang-format.


================
Comment at: clang/include/clang/Sema/DeclSpec.h:407
   static bool isTypeRep(TST T) {
-    return (T == TST_typename || T == TST_typeofType ||
-            T == TST_underlyingType || T == TST_atomic);
+    static const llvm::SmallVector<TST, 19> validTSTs = {
+        TST_atomic,
----------------
aaron.ballman wrote:
> Should we find a way we can drop the `19` here so that this doesn't become 
> incorrect when someone adds anything to `TransformTypeTraits.def`?
It'd be nice to just compare `T` against the least and greatest trait value -- 
this linear scan seems more expensive than we might want. (And sure, we don't 
care about efficiency given that this function is only used by asserts, but I 
don't think we should assume it'll remain that way.)

Eg, something like:

```
constexpr TST Traits[] = {
#define TRANSFORM_TYPE_TRAIT_DEF(_, Trait) TST_##Trait,
#include "clang/AST/TransformTypeTraits.def"
#undef TRANSFORM_TYPE_TRAIT_DEF
};
static_assert(std::is_sorted(std::begin(Traits), std::end(Traits)));
return T == TST_atomic || T == TST_typename || T == TST_typeofType || (T >= 
Traits[0] && T <= std::end(Traits)[-1]);
```


================
Comment at: clang/lib/AST/ItaniumMangle.cpp:3962
   if (T->isDependentType()) {
-    Out << 'U';
+    Out << "Uu";
 
----------------
This should just be `u`, not `Uu`.


================
Comment at: clang/lib/AST/ItaniumMangle.cpp:3976
 
   mangleType(T->getBaseType());
 }
----------------
You're missing the `I`...`E` around this argument.


================
Comment at: clang/lib/Parse/ParseDecl.cpp:3419-3422
     case tok::kw___super:
     case tok::kw_decltype:
     case tok::identifier: {
+    ParseIdentifier:
----------------
Our normal convention is to put the `{` after the last label in a sequence of 
labels.


================
Comment at: clang/lib/Parse/ParseDecl.cpp:4155
+#undef TRANSFORM_TYPE_TRAIT_DEF
+      if (!ParseTypeTransformTypeSpecifier(DS))
+        goto ParseIdentifier;
----------------
The name signature you're using for this function is inconsistent with the 
conventions in the rest of the parser: when a `Parser::Parse...` function with 
a `bool` return type returns `true`, it means "I have failed and issued a 
diagnostic". For "parse this if possible and return whether you did", we 
usually use `Parser::MaybeParse...` (eg, `MaybeParseAttributes`).

Alternatively you could do the one-token lookahead here. If the `setKind` call 
is here, it'll be a lot clearer why it makes sense to `goto ParseIdentifier;`.

Also, for workarounds for a standard library issue, we normally include a `// 
HACK:` comment explaining what we're doing and why.


================
Comment at: clang/lib/Parse/ParseExpr.cpp:1041-1044
   case tok::identifier: {      // primary-expression: identifier
                                // unqualified-id: identifier
                                // constant: enumeration-constant
+  ParseIdentifier:
----------------
Our normal convention is to put the `{` after the last label in a sequence of 
labels.


================
Comment at: clang/lib/Parse/ParseExpr.cpp:1750
+#undef TRANSFORM_TYPE_TRAIT_DEF
+    if (NextToken().is(tok::less)) {
+      Tok.setKind(tok::identifier);
----------------
Here you're checking for `trait<` and treating it as an identifier; elsewhere 
you're checking for `trait(` and treating it as a builtin. Is there a reason to 
treat `trait` followed by a token other than `(` or `<` inconsistently?


================
Comment at: clang/lib/Parse/ParseStmt.cpp:185-186
 
   case tok::identifier: {
+  ParseIdentifier:
     Token Next = NextToken();
----------------
As before, `{` should go after the last label.


================
Comment at: clang/lib/Sema/SemaType.cpp:6049-6050
+      // Make sure it is a unary transform type.
+      assert(DS.getTypeSpecType() >= DeclSpec::TST_add_lvalue_reference &&
+             DS.getTypeSpecType() <= DeclSpec::TST_underlying_type);
       TL.setKWLoc(DS.getTypeSpecTypeLoc());
----------------
This assert will start failing if someone adds another trait that isn't in this 
range. Can we do something more robust?

One possibility would be to extend the `.def` file with macros for selecting 
the first and last trait only. You could also use that in 
`DeclSpec::isTypeRep()`. Another would be to factor out the part of 
`isTypeRep()` that deals with this so you can call it from here.


================
Comment at: clang/lib/Sema/SemaType.cpp:9154
+  constexpr auto UKind = UTTKind::RemovePointer;
+  if (!BaseType->isPointerType() && !BaseType->isObjCObjectPointerType())
+    return Context.getUnaryTransformType(BaseType, BaseType, UKind);
----------------
This is `!BaseType->isAnyPointerType()`.

What about block pointers? I think this patch is doing the right thing here, by 
saying that this only applies to pointers spelled with `*` (plain and obj-c 
pointers) and not to pointers spelled with `^`, but it seems worth calling out 
to ensure that I'm not the only one who's thought about it :)


================
Comment at: clang/lib/Sema/SemaType.cpp:9174
+  Qualifiers Quals = Underlying.getQualifiers();
+  Quals.removeCVRQualifiers();
+  return Context.getUnaryTransformType(
----------------
This seems inconsistent: the specification says that `std::decay_t` produces 
`std::remove_cv_t` in this case, but your implementation of `std::remove_cv_t` 
doesn't remove `restrict` and this does. If this is intentional, I think it 
warrants a comment.


================
Comment at: clang/lib/Sema/SemaType.cpp:9182-9192
+QualType Sema::BuiltinAddReference(QualType BaseType, UTTKind UKind,
+                                   SourceLocation Loc) {
+  DeclarationName EntityName(BaseType.getBaseTypeIdentifier());
+  QualType PointerToT =
+      QualType(BaseType).isReferenceable(LangOpts.CPlusPlus)
+          ? BuildReferenceType(BaseType,
+                               UKind == UnaryTransformType::AddLvalueReference,
----------------
Should we even expose this trait in non-C++ languages? Making this a C++-only 
keyword seems better than making it a no-op in C.


================
Comment at: clang/lib/Sema/SemaType.cpp:9196-9205
+  if (!BaseType->isArrayType())
+    return Context.getUnaryTransformType(BaseType, BaseType, UKind);
+  const ArrayType *ArrayType = BaseType->getAsArrayTypeUnsafe();
+  QualType ElementType =
+      UKind == UnaryTransformType::RemoveAllExtents
+          ? QualType(ArrayType->getBaseElementTypeUnsafe(), {})
+          : ArrayType->getElementType();
----------------
This looks like it'll lose all qualifiers other than CVR qualifiers. Presumably 
that's not the intent. (This will also unnecessarily remove type sugar in some 
cases, but that's not super important here.) Please also don't use 
copy-list-initialization to form a `QualType`; see 
https://llvm.org/docs/CodingStandards.html#do-not-use-braced-initializer-lists-to-call-a-constructor

See the suggested edits for an alternative, simpler formulation (that preserves 
as much type sugar as possible).



================
Comment at: clang/lib/Sema/SemaType.cpp:9240
+
+  QualType Underlying(Split.Ty, Quals.getAsOpaqueValue());
+  return Context.getUnaryTransformType(BaseType, Underlying, UKind);
----------------
It doesn't look like this will properly remove qualifiers from within an array 
type in general, eg:
```
using T = const volatile int;
using U = T[3];
using V = __remove_cv(U);
```

Also, you can't use the opaque value of a `Qualifiers` object like this -- it's 
opaque, and is not always just a CVR mask, which is what the `QualType` 
constructor wants. This will assert or otherwise misbehave if there are non-CVR 
qualifiers on the type.

The simple thing to do would be:

```
Qualifiers Quals;
QualType Unqual = Context.getUnqualifiedArrayType(BaseType, Quals);
// ... remove stuff from Quals ...
QualType Result = Context.getQualifiedType(BaseType, Quals);
```

It'd be nice to leave the type sugar alone to the extent that we can, but it 
doesn't really matter too much given that this will usually be wrapped inside a 
type-sugar-erasing template anyway.


================
Comment at: clang/lib/Sema/SemaType.cpp:9246-9247
                                        SourceLocation Loc) {
-  switch (UKind) {
-  case UnaryTransformType::EnumUnderlyingType:
-    if (!BaseType->isDependentType() && !BaseType->isEnumeralType()) {
-      Diag(Loc, diag::err_only_enums_have_underlying_types);
-      return QualType();
-    } else {
-      QualType Underlying = BaseType;
-      if (!BaseType->isDependentType()) {
-        // The enum could be incomplete if we're parsing its definition or
-        // recovering from an error.
-        NamedDecl *FwdDecl = nullptr;
-        if (BaseType->isIncompleteType(&FwdDecl)) {
-          Diag(Loc, diag::err_underlying_type_of_incomplete_enum) << BaseType;
-          Diag(FwdDecl->getLocation(), diag::note_forward_declaration) << 
FwdDecl;
-          return QualType();
-        }
+  if (BaseType->isDependentType())
+    return Context.getUnaryTransformType(BaseType, BaseType, UKind);
+  bool IsMakeSigned = UKind == UnaryTransformType::MakeSigned;
----------------
This check seems unnecessary: we shouldn't call any of these functions for a 
dependent type, and it looks like we don't.


================
Comment at: clang/lib/Sema/SemaType.cpp:9251
+      BaseType->isBooleanType()) {
+    Diag(Loc, diag::err_make_signed_integral_only) << IsMakeSigned << BaseType;
+    return QualType();
----------------
Should we support vector types here?


================
Comment at: clang/lib/Sema/SemaType.cpp:9257-9258
 
-        EnumDecl *ED = BaseType->castAs<EnumType>()->getDecl();
-        assert(ED && "EnumType has no EnumDecl");
+  // Clang doesn't seem to have a way to distinguish `char` from `signed char`
+  // and `unsigned char`
+  if (IsMakeSigned == IsSigned && !IsNonCharIntegral)
----------------
I don't understand this comment. `char`, `signed char`, and `unsigned char` are 
three different types, and you can certainly tell them apart if you want to, 
for example using `BaseType->isSpecificBuiltinType(BuiltinType::SChar)` or 
`Context.hasSameType(BaseType, Context.SignedCharTy)` to detect `signed char`.


================
Comment at: clang/lib/Sema/SemaType.cpp:9265-9266
+          ? Context.getBitIntType(!IsMakeSigned, Context.getIntWidth(BaseType))
+          : Context.getIntTypeForBitwidth(Context.getIntWidth(BaseType),
+                                          IsMakeSigned);
+  Qualifiers Quals = Underlying.getQualifiers();
----------------
This is wrong: if, say, `int` and `long` are the same bit-width, this will give 
the same type for `__make_unsigned(int)` and `__make_unsigned(long)`, where 
they are required to produce `unsigned int` and `unsigned long` respectively.

Please look at `Context.getCorrespondingSignedType` and 
`Context.getCorrespondingUnsignedType` to see how to do this properly; you may 
be able to call those functions directly in some cases, but be careful about 
the cases where we're required to return the lowest-rank int type of the given 
size. Note that `getIntTypeForBitwidth` does *not* do that; rather, it produces 
the *preferred* type of the given width, and for WebAssembly and AVR it 
produces something other than the lowest-rank type in practice in some cases.


================
Comment at: clang/lib/Sema/SemaType.cpp:9267-9270
+  Qualifiers Quals = Underlying.getQualifiers();
+  Quals.setCVRQualifiers(BaseType.getCVRQualifiers());
+  Underlying = QualType(Underlying.getSplitUnqualifiedType().Ty,
+                        Quals.getAsOpaqueValue());
----------------
As before, the opaque value is, well, opaque, and you shouldn't be using it in 
this way. Also, you just created `Underlying` so you know it's unqualified, so 
there's no need to ask for its qualifiers.

See suggested edit. Alternatively (#1), if you want to preserve all qualifiers:

```
Underlying = Context.getQualifiedType(BaseType.getQualifiers());
```

Alternatively (#2) we could strictly follow the standard wording and preserve 
only cv-qualifiers and not `restrict`. I think preserving all qualifiers is 
more in line with the intent, but preserving only `const` and `volatile` is 
probably the best match for what a C++ implementation of the type trait would 
do. *shrug*


================
Comment at: clang/lib/Sema/SemaType.cpp:9136
+  Qualifiers Quals = Underlying.getQualifiers();
+  Quals.removeCVRQualifiers();
+  return Context.getUnaryTransformType(
----------------
cjdb wrote:
> aaron.ballman wrote:
> > What about things like type attributes (are those lost during decay)?
> According to https://eel.is/c++draft/tab:meta.trans.other, no.
Type attributes are non-standard, so we can't answer this question by looking 
at the standard. Nonetheless, doing what `getDecayedType` does seems like the 
best choice.


================
Comment at: clang/test/CodeGenCXX/mangle.cpp:54
 // CHECK-LABEL: define{{.*}} void @_Z1fno
-void f(__int128_t, __uint128_t) { } 
+void f(__int128_t, __uint128_t) {}
 
----------------
If it's easy, can you undo the unrelated formatting changes in this file? 
That'll make future archaeology a little easier and help reduce the chance of 
merge conflicts. (Not a bit deal though.)


================
Comment at: clang/test/CodeGenCXX/mangle.cpp:1113
 template void fn<E>(E, __underlying_type(E));
-// CHECK-LABEL: @_ZN6test552fnINS_1EEEEvT_U3eutS2_
+// CHECK-LABEL: @_ZN6test552fnINS_1EEEEvT_Uu17__underlying_typeS2_
 }
----------------
Do we have a corresponding test for the MS ABI?


================
Comment at: clang/test/CodeGenCXX/mangle.cpp:1159
+
+namespace test61 {
+template <typename T> void f(T, __add_lvalue_reference(T)) {}
----------------
Let's keep all the tests for this together -- please move these to `namespace 
test55`.


================
Comment at: clang/test/SemaCXX/remove_pointer.mm:7
+
+template <class T> using remove_pointer_t = __remove_pointer(T);
+
----------------
Is there a reason for this wrapper? You're directly using `__is_same` below, 
why not directly use `__remove_pointer` too?


================
Comment at: clang/test/SemaCXX/remove_pointer.mm:12-46
+void remove_pointer() {
+  { int a[T(__is_same(remove_pointer_t<void>, void))]; }
+  { int a[T(__is_same(remove_pointer_t<const void>, const void))]; }
+  { int a[T(__is_same(remove_pointer_t<volatile void>, volatile void))]; }
+  { int a[T(__is_same(remove_pointer_t<const volatile void>, const volatile 
void))]; }
+  { int a[T(__is_same(remove_pointer_t<int>, int))]; }
+  { int a[T(__is_same(remove_pointer_t<const int>, const int))]; }
----------------
I expected to see some tests for ObjC pointers in here, but I don't see any. 
I'd like to see `__remove_pointer(id)` and `@class X; 
static_assert(__is_same(__remove_pointer(X*, X)));`


================
Comment at: clang/test/SemaCXX/remove_pointer.mm:13
+void remove_pointer() {
+  { int a[T(__is_same(remove_pointer_t<void>, void))]; }
+  { int a[T(__is_same(remove_pointer_t<const void>, const void))]; }
----------------
Please use `static_assert` instead of faking it with an array.


================
Comment at: clang/test/SemaCXX/transform-type-traits.cpp:5
+
+// libstdc++ uses __remove_cv as an alias, so we make our transform type 
traits alias-revertible
+template <class T, class U>
----------------
We generally name these tests for libstdc++ workarounds as 
`libstdcxx_..._hack.cpp`


================
Comment at: clang/test/SemaCXX/type-traits.cpp:2862
+struct S {};
+template <class T> using remove_const_t = __remove_const(T);
+
----------------
Consider using the trait directly instead of wrapping it in an alias.


================
Comment at: clang/test/SemaCXX/type-traits.cpp:2865
+void check_remove_const() {
+  { int a[T(__is_same(remove_const_t<void>, void))]; }
+  { int a[T(__is_same(remove_const_t<const void>, void))]; }
----------------
Please use `static_assert`.


================
Comment at: clang/test/SemaCXX/type-traits.cpp:2986
+
+void check_remove_cv() {
+  { int a[T(__is_same(remove_cv_t<void>, void))]; }
----------------
This is should have some test coverage for `restrict` and for at least one 
non-standard qualifier. (Similarly for the `remove_cvref` and `decay` tests.)


================
Comment at: clang/test/SemaCXX/type-traits.cpp:3071
+  { int a[T(__is_same(remove_pointer_t<int>, int))]; }
+  { int a[T(__is_same(remove_pointer_t<const int>, const int))]; }
+  { int a[T(__is_same(remove_pointer_t<volatile int>, volatile int))]; }
----------------
You seem to be missing coverage that `__remove_pointer(const int*)` preserves 
the `const`.
You also seem to be missing coverage that this works on a cv-qualified pointer, 
eg `__remove_pointer(int *const)` -> `int`.


================
Comment at: clang/test/SemaCXX/type-traits.cpp:3099-3100
+
+  { int a[T(__is_same(remove_pointer_t<int __attribute__((address_space(1))) 
*>, int))]; }
+  { int a[T(__is_same(remove_pointer_t<S __attribute__((address_space(2))) *>, 
S))]; }
+}
----------------
It seems strange to remove the address space qualifier here, given that this 
trait doesn't remove cv-qualifiers.


================
Comment at: clang/test/SemaCXX/type-traits.cpp:3269
+  { int a[T(__is_same(remove_cvref_t<int S::*const volatile>, int S::*))]; }
+  { int a[T(__is_same(remove_cvref_t<int (S::*const volatile)()>, 
int(S::*)()))]; }
+  { int a[T(__is_same(remove_cvref_t<int (S::*const volatile)() &>, 
int(S::*)() &))]; }
----------------
The tests you have here for references to pointers-to-members seem excessive 
(throughout); I don't think there's any reason to think that a reference to 
pointer-to-member would be so different from a reference to any other type that 
would justify this level of testing of the former case.

The cases that seem interesting here are the ones for non-referenceable types, 
but those are covered separately below.


================
Comment at: clang/test/SemaCXX/type-traits.cpp:3391
+  static void checks() {
+    { int a[T(__is_same(add_lvalue_reference_t<M>, M))]; }
+    { int a[T(__is_same(add_pointer_t<M>, M))]; }
----------------
It's weird that we permit an abominable function type to show up in the 
argument of a trait during template instantiation but not when written 
directly. That behavior should really be consistent, and I think we should 
consistently allow it. That is, we should allow:

`static_assert(__is_same(__add_pointer(int () &), int () &));`

... just like we allow it if you hide the usage of the traits inside a 
template. That'd make this easier to test, too: you can then put these tests 
with the other tests for the corresponding traits. (This should only require 
passing `DeclaratorContext::TemplateArg` to `ParseTypeName` instead of the 
default of `DeclaratorContext::TypeName` in the places where we parse type 
traits, reflecting that we want to use the rules for template arguments in 
these contexts.) This would also fix an incompatibility between GCC and Clang: 
https://godbolt.org/z/GdxThdvjz

That said, I don't think we need to block this patch on this change, so if 
you'd prefer to not do this now, that's fine :)


================
Comment at: clang/test/SemaCXX/type-traits.cpp:3470
+
+  { int a[T(__is_same(make_signed_t<unsigned long long>, long))]; }
+  { int a[T(__is_same(make_signed_t<const unsigned long long>, const long))]; }
----------------
This is the wrong result type, it should be `long long`.


================
Comment at: clang/test/SemaCXX/type-traits.cpp:3495-3496
+
+  { int a[T(__is_same(make_signed_t<Enum>, int))]; }
+  { int a[T(__is_same(make_signed_t<SignedEnum>, int))]; }
+  { int a[T(__is_same(make_signed_t<const SignedEnum>, const int))]; }
----------------
It'd be useful to test enums with different underlying types. However, this 
test is not portable: if `short` and `int` are the same size, this is required 
to produce `short`, not `int`. It'd be good to have some test coverage of that 
quirk too. Perhaps this is easiest to see with a test like:

```
enum E : long long {};
static_assert(__is_same(__make_signed(E), long));
```

... which should hold in cases where `long` and `long long` are the same size 
and are larger than `int`.


================
Comment at: clang/test/SemaCXX/type-traits.cpp:3822-3825
+  { int x[T(__is_same(remove_all_extents_t<int[][2]>, int))]; }
+  { int x[T(__is_same(remove_all_extents_t<const int[]>, const int))]; }
+  { int x[T(__is_same(remove_all_extents_t<const int[1]>, const int))]; }
+  { int x[T(__is_same(remove_all_extents_t<const int[1][2]>, const int))]; }
----------------
Please also test the case where the qualifiers are attached outside the array 
type:

```
using T = int[1][2];
static_assert(__is_same(__remove_all_extents(const T), const int), "");
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116203/new/

https://reviews.llvm.org/D116203

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D116203: [clang] adds unary type transformations as compiler built-ins

Reply via email to