On Fri, Mar 24, 2023 at 3:06 AM Jonathan Wakely <jwakely....@gmail.com> wrote:
>
> On Fri, 24 Mar 2023 at 09:58, Jonathan Wakely <jwakely....@gmail.com> wrote:
> >
> > On Fri, 24 Mar 2023 at 07:10, Ken Matsui via Gcc <gcc@gcc.gnu.org> wrote:
> > >
> > > Hi,
> > >
> > > I am working on the GSoC project, "C++: Implement compiler built-in
> > > traits for the standard library traits". I found the following library
> > > traits for which I am not sure whether implementing built-in traits
> > > brings a reasonable speed-up.
> > >
> > > * std::is_fundamental
> > > * std::is_arithmetic
> > > * std::is_scalar
> > > * std::is_object
> > > * std::is_compound
> > > * std::is_scoped_enum
> > >
> > > For example, std::is_object has no template specializations, but its
> > > inheriting class looks complicated.
> > >
> > > __not_<__or_<is_function<_Tp>, is_reference<_Tp>, is_void<_Tp>>>::type
> > >
> > > If we define the built-in trait for this trait, we have: (as
> > > an equivalent of the above code)
> > >
> > > __bool_constant<__is_object(_Tp)>
> > >
> > > And the __is_object built-in trait should be like:
> > >
> > > !(type1 == FUNCTION_TYPE || type1 == ...)
> > >
> > > In this case, could someone tell me which one would be faster? Or, is
> > > there no other way to know which but to benchmark?
> >
> > You should benchmark it anyway, I was always expecting that to be a
> > part of this GSoC project :-)
> >
> > But is_object is NOT a "relatively simple" trait. What you show above
> > is very complex. One of the more complex traits we have. Partial
> > specializations are quite fast to match (in general) so eliminating
> > partial specializations (e.g. like we have for is_const) should not
> > be the goal.
> >
> > For is_object<const int> we instantiate:
> >
> > is_function<const int>
> > is_const<const int>
> > is_reference<const int>
> > is_void<const int>
> > __bool_constant<false>
> > __or_<false_type, false_type, false_type>
> > __not_<false_type>
> > __bool_constant<true>
> >
> > This is a ton of work! Instantiating class templates is the slowest
> > part of trait evaluation, not matching partial specializations.
>
> And then if the same program also instantiates is_object<int> (without
> the const), then currently we instantiate:
>
> is_function<int>
> is_const<int>
> is_reference<int>
> is_void<int>
> __bool_constant<false>
> __or_<false_type, false_type, false_type>
> __not_<false_type>
> __bool_constant<true>
>
> The first four instantiations are not shared with is_object<const
> int>, so we have to instantiate them anew. The last four are common
> with is_object<const int> so don't need to be instantiated again, the
> compiler will have cached those instantiations.
> But if the same program also uses is_object<long> then we have another
> four new instantiations to generate. And another four for
> is_object<float>. And another four for is_object<std::string> etc.
> etc.
>
> With a built-in they will all use __bool_constant<true> and nothing
> else (apart from the top-level is_object<T> specialization itself,
> which is unavoidable* since that's the trait actually being
> evaluated).
>
> * In fact, it's not really unavoidable, MSVC avoids it entirely. Their
> compiler pattern matches the standard traits and never even
> instantiates them, so every use of is_object<T>::value gets expanded
> directly to __is_object(T) without instantiating anything. We don't do
> that, and if we just turn 8 or 9 class template instantiations
> into 1 then we'll be doing great.
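To make the difference described above concrete, here is a minimal sketch of the two shapes. It is not the libstdc++ source: the namespace sketch, the helpers not_ and or_, the macro SKETCH_HAS_IS_OBJECT and both trait names are hypothetical, and the built-in is only used when the compiler reports it through __has_builtin (otherwise the sketch falls back to std::is_object so that it still compiles).

```cpp
#include <type_traits>

// Hypothetical names throughout; this only mirrors the shapes discussed.
namespace sketch {

// Library-only helpers. Each distinct argument list is a separate class
// template specialization that the compiler has to instantiate.
template <typename B>
struct not_ : std::integral_constant<bool, !B::value> {};

template <typename B1, typename B2, typename B3>
struct or_ : std::integral_constant<bool, B1::value || B2::value || B3::value> {};

// Composed form: for every new T this pulls in is_function<T>,
// is_reference<T> and is_void<T>, plus the or_/not_ wrappers.
template <typename T>
struct is_object_composed
    : not_<or_<std::is_function<T>, std::is_reference<T>, std::is_void<T>>>::type {};

// Detect the built-in only where the compiler says it has one.
#if defined(__has_builtin)
#  if __has_builtin(__is_object)
#    define SKETCH_HAS_IS_OBJECT 1
#  endif
#endif

// Built-in-backed form: apart from the trait itself, only
// integral_constant<bool, true> or <bool, false> is needed.
template <typename T>
struct is_object_builtin
#ifdef SKETCH_HAS_IS_OBJECT
    : std::integral_constant<bool, __is_object(T)> {};
#else
    // Assumption: fall back to the library trait when no built-in is reported.
    : std::integral_constant<bool, std::is_object<T>::value> {};
#endif

}  // namespace sketch
```

Both forms give the same answers; the difference is only in how many class template specializations the compiler has to create per type, which is what a benchmark would need to measure.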
Thank you so much for your detailed explanation! I had not understood
at all what instantiating class templates involved! (And yes, I need to
prepare my environment to take benchmarks...)

So, do we expect to have a __is_object built-in trait? Since
std::is_object is a combination of multiple traits, would the following
be a better implementation than adding __is_object(T)? (You told me
that built-ins make the compiler slightly bigger and slower.)

__bool_constant<!(__is_function(_Tp) || __is_reference ...)>

This would instantiate only __bool_constant<true> and
__bool_constant<false>, which can be mostly shared, and we can also
avoid adding an additional built-in.

Sincerely,
Ken Matsui
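P.S. A minimal sketch of the flat form proposed above, under some assumptions: the names sketch, is_object_flat and SKETCH_USE_BUILTINS are hypothetical, the per-property built-ins are used only when __has_builtin reports them, and the third term is taken to be the is_void check from the existing library definition. std::is_void<T> is used for that term because I am not assuming a built-in for it, so this particular sketch still instantiates one library trait per type.

```cpp
#include <type_traits>

// Hypothetical names; a sketch of the idea, not a proposed patch.
namespace sketch {

// Use the per-property built-ins only where the compiler reports them.
#if defined(__has_builtin)
#  if __has_builtin(__is_function) && __has_builtin(__is_reference)
#    define SKETCH_USE_BUILTINS 1
#  endif
#endif

// Flat form: one boolean expression, so the only base class ever needed
// is integral_constant<bool, true> or integral_constant<bool, false>.
template <typename T>
struct is_object_flat
#ifdef SKETCH_USE_BUILTINS
    : std::integral_constant<bool, !(__is_function(T) || __is_reference(T) ||
                                     std::is_void<T>::value)>
#else
    // Assumption: fall back to library traits when the built-ins are missing.
    : std::integral_constant<bool, !(std::is_function<T>::value ||
                                     std::is_reference<T>::value ||
                                     std::is_void<T>::value)>
#endif
{};

}  // namespace sketch
```

Whether this is actually cheaper than a single dedicated __is_object(T) is exactly the kind of thing the benchmark would have to show.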