Issue 131692
Summary [libc++] {std, ranges}::copy fails to copy vector<bool> correctly with small storage types
Labels libc++
Assignees
Reporter winner245
    The current implementation of `{std, ranges}::copy` does not correctly handle `vector<bool>` when the underlying storage type (`__storage_type`) is smaller than `int`, e.g., `unsigned char`, `unsigned short`, `uint8_t` and `uint16_t`. 


##### Reproducer
A minimal reproducer is as follows ([Godbolt Link](https://godbolt.org/z/aoWThxbxr)):

```cpp
#include <limits>
#include <memory>
#include <vector>
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iostream>

template <typename T, typename Size, typename Difference>
class sized_allocator {
  template <typename U, typename Sz, typename Diff>
  friend class sized_allocator;

public:
  using value_type                  = T;
  using size_type                   = Size;
  using difference_type             = Difference;
  using propagate_on_container_swap = std::true_type;

  constexpr explicit sized_allocator(int d = 0) : data_(d) {}

  template <typename U, typename Sz, typename Diff>
  constexpr sized_allocator(const sized_allocator<U, Sz, Diff>& a) noexcept : data_(a.data_) {}

  constexpr T* allocate(size_type n) {
    if (n > max_size())
      throw std::bad_array_new_length();
 return std::allocator<T>().allocate(n);
  }

  constexpr void deallocate(T* p, size_type n) noexcept { std::allocator<T>().deallocate(p, n); }

  constexpr size_type max_size() const noexcept {
    return std::numeric_limits<size_type>::max() / sizeof(value_type);
 }

private:
  int data_;

  constexpr friend bool operator==(const sized_allocator& a, const sized_allocator& b) {
    return a.data_ == b.data_;
  }
  constexpr friend bool operator!=(const sized_allocator& a, const sized_allocator& b) {
    return a.data_ != b.data_;
  }
};

int main() { 
  using Alloc = sized_allocator<bool, std::uint8_t, std::int8_t>;
  std::vector<bool, Alloc> in(8, false, Alloc(1));
 std::vector<bool, Alloc> out(8, true, Alloc(1));
  std::copy(in.begin(), in.begin() + 1, out.begin()); // out[0] = false

  // We only assigned a single bit, but entire byte got zeroed!
  for (std::size_t i = 0; i < out.size(); ++i)
    std::cout << out[i];
  std::cout << '\n';
}

```
 

In this example, a single bit from `vector<bool>` is copied and set to `false` using `std::copy`. However, due to incorrect behavior in `std::copy`, all bits in the whole storage word are zeroed. 


##### Root Cause Analysis

The root cause in this specific example lies in the following problematic bit mask computation in `std::copy`: 

https://github.com/llvm/llvm-project/blob/584f8cc30554c89fdd27cc9e527416a6e4e2cc45/libcxx/include/__algorithm/copy.h#L56


The evaluation of `~__storage_type(0)` is first integral-promoted to `-1`, which is subsequently left shifted and right-shifted. However, left-shifting a negative value leads to UB before C++20 [1], and right-shifting a negative value results in sign-bit extension since C++20 [2]. Due to sign-bit extension, all bits in the mask are set to `1`, making the mask incorrect and leading to erroneous copying behavior. 

It appears that Clang exhibits consistent incorrect behavior in `std::copy` even before C++20. I believe this is because Clang already performed sign-bit extension in right-shift operations even before C++20. 


There are several other bitwise operations in `std::copy` that are similarly flawed and need to be fixed.



[1] [[expr.shift]/3](https://eel.is/c++draft/expr.shift#note-2)
> Right-shift on signed integral types is an arithmetic right shift, which performs sign-extension[.](https://eel.is/c++draft/expr.shift#3.sentence-3)

[2] [cppreference](https://en.cppreference.com/w/cpp/language/operator_arithmetic#Bitwise_shift_operators)

> For negative a, the behavior of a << b is undefined. (until C++20)

_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs

Reply via email to