On Fri, Mar 24, 2023 at 3:06 AM Jonathan Wakely <jwakely....@gmail.com> wrote:
>
> On Fri, 24 Mar 2023 at 09:58, Jonathan Wakely <jwakely....@gmail.com> wrote:
> >
> > On Fri, 24 Mar 2023 at 07:10, Ken Matsui via Gcc <gcc@gcc.gnu.org> wrote:
> > >
> > > Hi,
> > >
> > > I am working on the GSoC project, "C++: Implement compiler built-in
> > > traits for the standard library traits". For the following library
> > > traits, I am not sure whether implementing built-in traits would
> > > bring a reasonable speedup.
> > >
> > > * std::is_fundamental
> > > * std::is_arithmetic
> > > * std::is_scalar
> > > * std::is_object
> > > * std::is_compound
> > > * std::is_scoped_enum
> > >
> > > For example, std::is_object has no template specializations, but its
> > > base class looks complicated.
> > >
> > > __not_<__or_<is_function<_Tp>, is_reference<_Tp>, is_void<_Tp>>>::type
> > >
> > > If we define a built-in for this trait, the equivalent of the above
> > > code would be:
> > >
> > > __bool_constant<__is_object(_Tp)>
> > >
> > > And the __is_object built-in should look something like:
> > >
> > > !(type1 == FUNCTION_TYPE || type1 == ...)
> > >
> > > In this case, could someone tell me which one would be faster? Or is
> > > there no way to know other than to benchmark?
> >
> > You should benchmark it anyway, I was always expecting that to be a
> > part of this GSoC project :-)
> >
> > But is_object is NOT a "relatively simple" trait. What you show above
> > is very complex. One of the more complex traits we have. Partial
> > specializations are quite fast to match (in general) so eliminating
> > partial specializations (e.g. like we have for is_const) should not
> > be the goal.
> >
> > For is_object<const int> we instantiate:
> >
> > is_function<const int>
> > is_const<const int>
> > is_reference<const int>
> > is_void<const int>
> > __bool_constant<false>
> > __or_<false_type, false_type, false_type>
> > __not_<false_type>
> > __bool_constant<true>
> >
> > This is a ton of work! Instantiating class templates is the slowest
> > part of trait evaluation, not matching partial specializations.
>
> And then if the same program also instantiates is_object<int> (without
> the const), then currently we instantiate:
>
> is_function<int>
> is_const<int>
> is_reference<int>
> is_void<int>
> __bool_constant<false>
> __or_<false_type, false_type, false_type>
> __not_<false_type>
> __bool_constant<true>
>
> The first four instantiations are not shared with is_object<const
> int>, so we have to instantiate them anew. The last four are common
> with is_object<const int> so don't need to be instantiated again, the
> compiler will have cached those instantiations.
> But if the same program also uses is_object<long> then we have another
> four new instantiations to generate. And another four for
> is_object<float>. And another four for is_object<std::string> etc.
> etc.
>
> With a built-in they will all use __bool_constant<true> and nothing
> else (apart from the top-level is_object<T> specialization itself,
> which is unavoidable* since that's the trait actually being
> evaluated).
>
> * In fact, it's not really unavoidable, MSVC avoids it entirely. Their
> compiler pattern matches the standard traits and never even
> instantiates them, so every use of is_object<T>::value gets expanded
> directly to __is_object(T) without instantiating anything. We don't do
> that, and if we just turn 8 or 9 class template instantiations
> into 1 then we'll be doing great.

Thank you so much for your detailed explanation! I had not properly
understood what class template instantiation was costing! (And yes, I
need to prepare my environment to take benchmarks...)

So, do we expect to have an __is_object built-in trait? Since
std::is_object is a combination of multiple traits, would the
following be a better implementation than adding __is_object(T)?
(You told me that built-ins make the compiler slightly bigger and
slower.)

__bool_constant<!(__is_function(_Tp) || __is_reference ...)>

This would instantiate only __bool_constant<true> and
__bool_constant<false>, which can be mostly shared, and we can also
avoid adding an additional built-in.
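
As a rough sketch of this idea (not the libstdc++ implementation), the
following assumes C++17 and uses the standard *_v variable templates in
place of the compiler built-ins, since built-ins such as __is_function
may not be available in every compiler. The name sketch_is_object is
hypothetical and only for illustration.

```cpp
#include <type_traits>

// Hypothetical stand-in for the proposal; `sketch_is_object` is an
// illustrative name, not a libstdc++ identifier.
// The standard *_v variable templates substitute for the compiler
// built-ins (__is_function, __is_reference, ...), which are assumed
// here and may not exist in a given compiler.
template <typename T>
struct sketch_is_object
    : std::bool_constant<!(std::is_function_v<T> ||
                           std::is_reference_v<T> ||
                           std::is_void_v<T>)> {};

// Every specialization derives from one of only two base classes,
// std::true_type or std::false_type, whatever T is.
static_assert(std::is_base_of_v<std::true_type, sketch_is_object<int>>);
static_assert(std::is_base_of_v<std::true_type, sketch_is_object<const int>>);
static_assert(std::is_base_of_v<std::false_type, sketch_is_object<int&>>);
static_assert(std::is_base_of_v<std::false_type, sketch_is_object<void>>);
static_assert(std::is_base_of_v<std::false_type, sketch_is_object<int()>>);
```

Whether the sub-expressions themselves instantiate anything depends on
how the library defines those variable templates, so this only
illustrates the sharing of the __bool_constant base, not the full cost.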

Sincerely,
Ken Matsui
