[Bug libstdc++/115444] std::copy_n generates more code than needed

redi at gcc dot gnu.org via Gcc-bugs Thu, 27 Jun 2024 05:20:32 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115444


--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Arthur O'Dwyer from comment #0)
> But if you compile with `-std=c++20 -ULESS`, you get more verbose codegen:
> first a call to `It::operator+(int)`, then a call to `It::operator-(It)`,
> and then the same loop as before. That is, it appears that libstdc++ is
> "lowering" `std::copy_n(first, n, dest)` into `std::copy(first, first+n,
> dest)`

Yes, so that std::copy_n benefits from the same memmove optimization as
std::copy.

> and then "lowering" that back into `std::copy_n(first, first+n -
> first, dest)` before finally getting to the loop.

No, we just do last - first in std::copy if we have random access iterators:

          for(_Distance __n = __last - __first; __n > 0; --__n)
            {
              *__result = *__first;
              ++__first;
              ++__result;
            }

We could just use while (__first != __last) instead, but that would remove a
very intentional "optimization" that's explicitly mentioned in a comment:

  // All of these auxiliary structs serve two purposes.  (1) Replace
  // calls to copy with memmove whenever possible.  (Memmove, not memcpy,
  // because the input and output ranges are permitted to overlap.)
  // (2) If we're using random access iterators, then write the loop as
  // a for loop with an explicit count.

If we stopped using a loop with an explicit count then we could get rid of the
partial specializations for random access iterators, as they'd now be
equivalent to the default loop for input iterators.

That wouldn't change the fact that copy_n dispatches to copy though, so it
wouldn't do anything about the s+1000 case.
