https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90436

--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> ---
(writing down some notes)

Calling

      size_type
      _M_check_len_one(const char* __s) const
      {
        if (max_size() - size() < 1)
          __throw_length_error(__N(__s));

        const size_type __len = size() + (std::max)(size(), (size_t)1);
        return (__len > max_size()) ? max_size() : __len;
      }

instead of _M_check_len reduces the running time of this micro-benchmark

#include <vector>
int main(){
  volatile int a=0;
  for(int i=0;i<1000000;++i){
    std::vector<int> v;
    for(int j=0;j<1000;++j){
      v.push_back(j);
    }
    a=v[a];
  }
}

from 0.88s to 0.66s at -O3. Two elements are key (the gain only appears when
both are applied): removing the overflow check, and having the comparison
between size and max_size done on byte lengths (not divided by the element
size).

I think the overflow check could be removed from the normal _M_check_len: we
have already checked that max_size() - size() >= __n, so size() + __n cannot
overflow. Moreover, size() must be smaller than max_size(), which should be at
most SIZE_MAX/2 (at least when ptrdiff_t and size_t have the same size), so
size() + size() cannot overflow either.
I should check whether the compiler could help more; it is supposed to know
how to optimize .ADD_OVERFLOW based on the ranges of its operands.

I suspect that a single_use restriction explains why max_size() == size() is
compared without a division while max_size() - size() < __n (for __n = 1) is
not.
