Use a better algorithm to calculate n % 7, taking into account that the divisor
is a Mersenne number. This is explained in this two-part lightning talk at
CppCon 2024 [1, 2].

Roughly speaking, the algorithm is used for n = sys_days but it has a
precondition that doesn't hold for all values of sys_days. However, it does hold
for every value coming out of year_month_day::operator sys_days().

Nevertheless, there's an issue because the conversion is done in two steps:
year_month_day -> sys_days -> weekday and in the second step the information
that the input sys_days was obtained from the first step (so the precondition
holds) is lost. In the talk a single function performs the two steps and uses
the optimization. Unless the standard adds a similar function (e.g. a weekday
constructor) or this is added as an extension (not done here), this is not an
option for libstdc++.

The issue is adressed in the following way. At the end of the fist step, the
code informs the compiler, through an [[assume]] attribute that the precondition
holds. Then, at the beginning of the second step, the code checks, through
__builtin_constant_p, if the precondition holds, in which case, it uses the
better algorithm or, otherwise, it falls back to n % 7.

The idea is illustrated in [3] which compares code generation (with and without
the optimization) and provides an exhaustive test of correctness. A benchmark is
shown in [4].

References:

[1] https://youtu.be/64mTEXnSnZs
[2] https://youtu.be/bnVkWEjRNeI
[3] https://godbolt.org/z/E14asPK8r
[4] https://quick-bench.com/q/vzxlrLaEccner3h-ahIrbB0BZvE

libstdc++-v3/ChangeLog:

        * include/std/chrono: Add postcondition to year_month_date::operator
        sys_days and check this condition in weekday::_S_from_days.
---
The approach might be somewhat controversial and might need to be discussed. 
The problem is that passing information from [[assume]] to __builtin_constant_p 
is
quite fragile in 3 different ways.

1) The distance in source code between [[assume]] and __builtin_constant_p makes
hard to understand why they are there in the first place. A reader who doesn't
know the full picture might feel tempted to remove either of these lines.

2) Logically, the post-condition expressed by [[assume]] only needs to imply the
condition tested by __builtin_constant_p. However, even whey they are obviously
equivalent for a human, this might not be the case for the compiler. Even
slight cosmetic changes might break the compiler's chain of thought and thus,
the optimization.

3) Similarly to 2, changes in the compiler code that handles [[assume]] or 
__builtin_constant_p might break the optimization.

If any of the issues above happens, the result will still be right but a
performance regression will appear. Comments can mitigate the issues but a
performance test or a test for code generation would be more effective to
prevent performance regressions. Unfortunately, I don't know how either of these
can be done.

 libstdc++-v3/include/std/chrono | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index 8eb9fd9baac..e762dd46cc4 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -984,7 +984,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       static constexpr weekday
       _S_from_days(const days& __d)
       {
-       return weekday{__detail::__add_modulo<7>(4, __d.count())};
+       const auto __days = __d.count();
+
+       const auto __n = static_cast<unsigned>(__days) + 12'687'434u;
+       const auto __use_fast_mod7 = __n < 178'956'973u;
+       // This condition is true when __d = __dp.time_since_epoch() where __dp
+       // comes from year_month_day::operator sys_days().
+       if (__builtin_constant_p(__use_fast_mod7))
+       {
+         const auto __wd = 613'566'757u * __n >> 29;
+         [[assume(__wd != 7)]];
+         return weekday{__wd};
+       }
+
+       return weekday{__detail::__add_modulo<7>(4, __days)};
       }
 
     public:
@@ -1652,7 +1665,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
       constexpr
       operator sys_days() const noexcept
-      { return sys_days{_M_days_since_epoch()}; }
+      {
+       const auto __days = _M_days_since_epoch();
+
+       // The assume attribute below allows weekday::_S_from_days to use a
+       // faster algorithm for modulus 7.
+       const auto __n = static_cast<unsigned>(__days.count()) + 12'687'434u;
+       const auto __use_fast_mod7 = __n < 178'956'973u;
+       [[assume(__use_fast_mod7)]];
+
+       return sys_days{__days};
+      }
 
       explicit constexpr
       operator local_days() const noexcept
-- 
2.49.0

Reply via email to