Issue | 123855
Summary | [libc++] Question about __libcpp_timed_backoff_policy implementation
Labels | libc++
Assignees |
Reporter | bobsayshilol

I'm not sure if this is the right place to ask, but I tried using `std::barrier` and observed unexpectedly long sleep calls when calling `wait()`. The cause seems to be the interaction between the two largest thresholds checked in `__libcpp_timed_backoff_policy`:
https://github.com/llvm/llvm-project/blob/635e154bbc94342080ccba583ff6fb16ea364f4b/libcxx/include/__thread/timed_backoff_policy.h#L28-L31
Once `__elapsed` exceeds 64 *micro*seconds, we back off at increasing intervals until it reaches 128 *milli*seconds, at which point the sleep is capped at only 8ms. This means that a single sleep can last up to 64ms before dropping down to 8ms.
As an example repro case:
```c++
// Standard headers used by the repro.
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <numeric>
#include <vector>
// libc++ internal headers (not a public API). The second path matches the
// link above; the first is assumed to sit next to it.
#include <__thread/poll_with_backoff.h>
#include <__thread/timed_backoff_policy.h>

// The libc++ internals and the unqualified chrono:: names live in namespace std.
using namespace std;

int main() {
  // In practice this doesn't get close to 1M entries, but play it safe.
  std::vector<chrono::nanoseconds> timings(1'000'000);
  std::size_t counter = 0;
  chrono::nanoseconds last_elapsed = chrono::nanoseconds::zero();

  auto capture_backoff_timings = [&](chrono::nanoseconds elapsed) {
    // Sanity check before writing, so we never index out of bounds.
    assert(counter < timings.size());
    // Save elapsed time to avoid an additional now() call in __poll().
    last_elapsed = elapsed;
    // Record the elapsed time at this call.
    timings[counter] = elapsed;
    ++counter;
    return __libcpp_timed_backoff_policy{}(elapsed);
  };
  auto poll_for_250ms = [&] {
    // As cheap as we can make it - this is just an atomic load in std::barrier::wait().
    return last_elapsed > chrono::milliseconds(250);
  };
  __libcpp_thread_poll_with_backoff(poll_for_250ms, capture_backoff_timings);
  timings.resize(counter);

  // Convert timings to how long the sleeps are.
  std::vector<chrono::nanoseconds> sleeps(counter);
  std::adjacent_difference(timings.begin(), timings.end(), sleeps.begin());

  // Print largest sleep call.
  auto max_sleep = std::max_element(sleeps.begin(), sleeps.end());
  auto &os = std::cout;
  os << "max sleep: "
     << chrono::duration_cast<chrono::milliseconds>(*max_sleep).count()
     << "ms\n";

#if 0 // Print timings too for graphing.
  os << "timings = [";
  for (auto &t : timings) {
    os << t.count() << ',';
  }
  os << "]\nsleeps = [";
  for (auto &s : sleeps) {
    os << s.count() << ',';
  }
  os << "]\n";
#endif
}
```
Running this locally prints `max sleep: 59ms` (it's consistently in the 40-60ms range) and plotting out the timings (with 0 markers) gives this:

I don't know whether `chrono::milliseconds(128)` is meant to be in `microseconds`, whether it should be `chrono::milliseconds(8)` to match the final backoff value, or whether this large wait is intentional. I couldn't see anything about it on the review that introduced this code (https://reviews.llvm.org/D68480), though the "Show Older Changes" button doesn't seem to work, so it might be hidden in there.