Hi,

I have posted v3 patches with changes I have made locally for first 6
patches, and I think this series
is ready to land, in addition to
https://gcc.gnu.org/pipermail/libstdc++/2025-July/062727.html, that I
already reviewed.
I will keep aligned accessor on top of it.

As the separate commit, we should update the __fwd_partial_prod and
__rev_partial_prod computation
to use fact that this is partial prod, and we have previous partial
product computed:
    template<array _Extents>
      constexpr auto __fwd_partial_prods = [] consteval
        {
          constexpr size_t __rank = _Extents.size();
          std::array<size_t, __rank + 1> __ret;
          __ret[0] = 1;
          for(size_t __r = 1; __r < __rank + 1; ++__r)
            if (_Extents[__r] != dynamic_extents)
              __ret[__r] = __ret[0] * _Extents[__r];
          return __ret;
        }();
We are doing this at compile time, but this should help composition speed.
I would then inline __static_prod loop into the __mdspan::__size function
and remove that function
entirely.


I would be interested if we could reduce the __fwd/rev_partial_prod sizes,
but not keep the outermost dimensions there,
i.e. adding a check for __r == 0 early in the __fwd_prod funciton.
        constexpr size_t __rank = _Extents::rank();
        constexpr auto& __sta_exts = __static_extents<_Extents>();
        if constexpr (__rank == 1)
          return 1;
       // new if here, that is extracted from __rank == 2
        else if (__r == 0)
          return 1;
        else if constexpr (__rank == 2)
          return _exts.extent(0);
        else if constexpr
(__all_dynamic(std::span(__sta_exts).first(__rank-1)))
          return __extents_prod(__exts, 1, 0, __r);
        else
          {
            size_t __sta_prod = __fwd_partial_prods<__sta_exts>[__r-1]; //
size reduce by one here
            return __extents_prod(__exts, __sta_prod, 0, __r);
          }

Similarly for __rev_prod:
        constexpr size_t __rank = _Extents::rank();
        constexpr auto& __sta_exts = __static_extents<_Extents>();
        if constexpr (__rank == 1)
          return 1;
       else if (__r == __rank - 1)
          return 1;
        else if constexpr (__rank == 2)
          return __exts.extent(1);
        else if constexpr
(__all_dynamic(std::span(__sta_exts).last(__rank-1)))
          return __extents_prod(__exts, 1, __r + 1, __rank);
        else
          {
            size_t __sta_prod = __rev_partial_prods<__sta_exts>[__r-1]; //
size reduced by one here.
            return __extents_prod(__exts, __sta_prod, __r + 1, __rank);
          }

Regards,
Tomasz


On Mon, Aug 4, 2025 at 7:51 PM Luc Grosheintz <luc.groshei...@gmail.com>
wrote:

>
>
> On 8/4/25 17:42, Tomasz Kaminski wrote:
> > On Mon, Aug 4, 2025 at 1:14 PM Tomasz Kaminski <tkami...@redhat.com>
> wrote:
> >
> >>
> >>
> >> On Mon, Aug 4, 2025 at 1:08 PM Luc Grosheintz <luc.groshei...@gmail.com
> >
> >> wrote:
> >>
> >>> Hi Tomasz,
> >>>
> >>> Thank you for the review! Sorry about the missing parens, even after
> >>> "Ctrl+F"ing each of the emails I can't find the requires clause (the
> >>> two I found both need the parens).
> >>>
> >> I was mentioning this one, but it is possible that I am wrong on this
> one,
> >> I haven't yet checked it, as I was planning to do it during full review.
> >>      template<array _Extents>
> >>        requires (__all_dynamic<_Extents>())
> >>        class _StaticExtents<_Extents>;
> >> But regardless, I will make that change locally if it does work, so you
> do
> >> not
> >> need to update patch series for it.
> >>
> > Parenthesis seem to be indeed required here, so my comment was a red
> > herring.
>
> I'm pretty sure that this is the reason why I wrongly concluded
> that the pattern is `requires(cond)`, i.e. the parens are always
> needed.
>
> >
> >>
> >>> Do you think it's possible to merge this series before the other two
> >>> outstanding patch series? There's some risk of collision; making
> changes
> >>> to this patch series is much more time consuming than rebasing (and
> >>> retesting the other two patch series).
> >>>
> >> This sounds very reasonable. I will prepare a stack of patches in your
> >> suggested
> >> order locally, and push them once they are approved. But it still make
> >>
> >>>
> >>> Thank you,
> >>>
> >> Thank you for your continued contributions.
> >>
> >>> Luc
> >>>
> >>> On 8/4/25 12:43, Tomasz Kaminski wrote:
> >>>> Hi,
> >>>>
> >>>> I this time I made a quick pass through all changes, before commenting
> >>> on
> >>>> first commits,
> >>>> and they look solid to me, and I haven't noticed anything I would like
> >>> to
> >>>> change (except parentheses
> >>>> around requires, but I will handle that locally). I will try to do a
> >>> full
> >>>> review during this week.
> >>>>
> >>>> Regards,
> >>>> Tomasz
> >>>>
> >>>> On Sun, Aug 3, 2025 at 10:59 PM Luc Grosheintz <
> >>> luc.groshei...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> The combined effect of this sequence of change is:
> >>>>>
> >>>>>     * a reduction in the number of template instantiations, by
> >>>>>       - avoiding needless dependency of IndexType,
> >>>>>       - special formulas for low-rank extents,
> >>>>>       - special formulas for (nearly) fully dynamic extents.
> >>>>>
> >>>>>     * improved code quality, by
> >>>>>       - precomputing partial products of the static extents,
> >>>>>       - special cases for low-rank extents,
> >>>>>       - rewriting the condition E[i] == dynamic_extent in a more
> >>>>>         optimizer friendly manner.
> >>>>>       - effectively loop-unrolling extents::operator==.
> >>>>>
> >>>>> While simplistic micro-benchmarking shows the effectiveness of these
> >>>>> changes, likely the stronger argument is presented in each commit:
> >>>>>     a) each change removes needless complexity,
> >>>>>     b) before/after examples of generated code show the
> effectiveness.
> >>>>>
> >>>>> Luc Grosheintz (8):
> >>>>>     libstdc++: Reduce template instantiations in <mdspan>.
> >>>>>     libstdc++: Precompute products of static extents.
> >>>>>     libstdc++: Improve low-rank layout_{left,right}::stride.
> >>>>>     libstdc++: Improve fully dynamic extents in mdspan.
> >>>>>     libstdc++: Improve nearly fully dynamic extents in mdspan.
> >>>>>     libstdc++: Reduce indirection in extents::extent.
> >>>>>     libstdc++: Improve extents::operator==.
> >>>>>     libstdc++: Replace numeric_limit with __int_traits in mdspan.
> >>>>>
> >>>>>    libstdc++-v3/include/std/mdspan               | 282
> >>> +++++++++++++-----
> >>>>>    .../mdspan/extents/class_mandates_neg.cc      |   3 +
> >>>>>    2 files changed, 208 insertions(+), 77 deletions(-)
> >>>>>
> >>>>> --
> >>>>> 2.50.0
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >
>
>

Reply via email to