https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102780
--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
In the real code, I find that a separate constrained specialization like this:

template<typename T, typename... U>
union variadic_union<T, U...>
{
  T first;
  variadic_union<U...> rest;
  static constexpr int size = variadic_union<U...>::size + 1;
};

template<typename T, typename... U>
  requires (!trivially_destructible<T, U...>)
union variadic_union<T, U...>
{
  variadic_union(const variadic_union&) = default;
  variadic_union(variadic_union&&) = default;
  variadic_union& operator=(const variadic_union&) = default;
  variadic_union& operator=(variadic_union&&) = default;

  // Non-trivial dtor is required for this partial specialization
  constexpr ~variadic_union() { }

  T first;
  variadic_union<U...> rest;
  static constexpr int size = variadic_union<U...>::size + 1;
};

is much faster (14s instead of 20s+) than a constrained destructor in the
primary template:

template<typename T, typename... U>
union variadic_union<T, U...>
{
  variadic_union(const variadic_union&) = default;
  variadic_union(variadic_union&&) = default;
  variadic_union& operator=(const variadic_union&) = default;
  variadic_union& operator=(variadic_union&&) = default;

  ~variadic_union() = default;

  // Conditionally non-trivial dtor, if required.
  constexpr ~variadic_union() requires (!trivially_destructible<T, U...>) { }

  T first;
  variadic_union<U...> rest;
  static constexpr int size = variadic_union<U...>::size + 1;
};

I haven't been able to reproduce that time difference in the reduced examples
though.