Issue |
131692
|
Summary |
[libc++] {std, ranges}::copy fails to copy vector<bool> correctly with small storage types
|
Labels |
libc++
|
Assignees |
|
Reporter |
winner245
|
The current implementation of `{std, ranges}::copy` does not correctly handle `vector<bool>` when the underlying storage type (`__storage_type`) is smaller than `int`, e.g., `unsigned char`, `unsigned short`, `uint8_t` and `uint16_t`.
##### Reproducer
A minimal reproducer is as follows ([Godbolt Link](https://godbolt.org/z/aoWThxbxr)):
```cpp
#include <limits>
#include <memory>
#include <vector>
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iostream>
template <typename T, typename Size, typename Difference>
class sized_allocator {
template <typename U, typename Sz, typename Diff>
friend class sized_allocator;
public:
using value_type = T;
using size_type = Size;
using difference_type = Difference;
using propagate_on_container_swap = std::true_type;
constexpr explicit sized_allocator(int d = 0) : data_(d) {}
template <typename U, typename Sz, typename Diff>
constexpr sized_allocator(const sized_allocator<U, Sz, Diff>& a) noexcept : data_(a.data_) {}
constexpr T* allocate(size_type n) {
if (n > max_size())
throw std::bad_array_new_length();
return std::allocator<T>().allocate(n);
}
constexpr void deallocate(T* p, size_type n) noexcept { std::allocator<T>().deallocate(p, n); }
constexpr size_type max_size() const noexcept {
return std::numeric_limits<size_type>::max() / sizeof(value_type);
}
private:
int data_;
constexpr friend bool operator==(const sized_allocator& a, const sized_allocator& b) {
return a.data_ == b.data_;
}
constexpr friend bool operator!=(const sized_allocator& a, const sized_allocator& b) {
return a.data_ != b.data_;
}
};
int main() {
using Alloc = sized_allocator<bool, std::uint8_t, std::int8_t>;
std::vector<bool, Alloc> in(8, false, Alloc(1));
std::vector<bool, Alloc> out(8, true, Alloc(1));
std::copy(in.begin(), in.begin() + 1, out.begin()); // out[0] = false
// We only assigned a single bit, but entire byte got zeroed!
for (std::size_t i = 0; i < out.size(); ++i)
std::cout << out[i];
std::cout << '\n';
}
```
In this example, a single bit from `vector<bool>` is copied and set to `false` using `std::copy`. However, due to incorrect behavior in `std::copy`, all bits in the whole storage word are zeroed.
##### Root Cause Analysis
The root cause in this specific example lies in the following problematic bit mask computation in `std::copy`:
https://github.com/llvm/llvm-project/blob/584f8cc30554c89fdd27cc9e527416a6e4e2cc45/libcxx/include/__algorithm/copy.h#L56
The evaluation of `~__storage_type(0)` is first integral-promoted to `-1`, which is subsequently left shifted and right-shifted. However, left-shifting a negative value leads to UB before C++20 [1], and right-shifting a negative value results in sign-bit extension since C++20 [2]. Due to sign-bit extension, all bits in the mask are set to `1`, making the mask incorrect and leading to erroneous copying behavior.
It appears that Clang exhibits consistent incorrect behavior in `std::copy` even before C++20. I believe this is because Clang already performed sign-bit extension in right-shift operations even before C++20.
There are several other bitwise operations in `std::copy` that are similarly flawed and need to be fixed.
[1] [[expr.shift]/3](https://eel.is/c++draft/expr.shift#note-2)
> Right-shift on signed integral types is an arithmetic right shift, which performs sign-extension[.](https://eel.is/c++draft/expr.shift#3.sentence-3)
[2] [cppreference](https://en.cppreference.com/w/cpp/language/operator_arithmetic#Bitwise_shift_operators)
> For negative a, the behavior of a << b is undefined. (until C++20)
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs