Re: [PATCH] P0556R3 Integral power-of-2 operations, P0553R2 Bit operations

Jakub Jelinek Tue, 03 Jul 2018 14:40:50 -0700

On Tue, Jul 03, 2018 at 10:02:47PM +0100, Jonathan Wakely wrote:
> +#ifndef _GLIBCXX_BIT
> +#define _GLIBCXX_BIT 1
> +
> +#pragma GCC system_header
> +
> +#if __cplusplus >= 201402L
> +
> +#include <type_traits>
> +#include <limits>
> +
> +namespace std _GLIBCXX_VISIBILITY(default)
> +{
> +_GLIBCXX_BEGIN_NAMESPACE_VERSION
> +
> +  template<typename _Tp>
> +    constexpr _Tp
> +    __rotl(_Tp __x, unsigned int __s) noexcept
> +    {
> +      constexpr auto _Nd = numeric_limits<_Tp>::digits;
> +      const unsigned __sN = __s % _Nd;
> +      if (__sN)
> +        return (__x << __sN) | (__x >> (_Nd - __sN));


Wouldn't it be better to use a branchless pattern that
GCC can also optimize well, like:
      return (__x << __sN) | (__x >> ((-__sN) & (_Nd - 1)));
(if _Nd is always a power of two), or perhaps
      return (__x << __sN) | (__x >> ((-__sN) % _Nd));
which is going to be folded into the above one for power-of-two constants?
E.g. ia32intrin.h also uses:
/* 64bit rol */
extern __inline unsigned long long
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
__rolq (unsigned long long __X, int __C)
{
  __C &= 63;
  return (__X << __C) | (__X >> (-__C & 63));
}
etc.
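
For illustration, here is a minimal standalone sketch of the masked form
suggested above, written as a C++14 template. The name rotl_branchless is
hypothetical (it is not the __rotl from the patch), and the sketch assumes an
unsigned type whose digit count is a power of two, which the static_asserts
check:

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper for illustration only; not part of libstdc++.
// Assumes T is unsigned and numeric_limits<T>::digits is a power of two.
template<typename T>
constexpr T rotl_branchless(T x, unsigned int s) noexcept
{
  static_assert(std::is_unsigned<T>::value, "unsigned types only");
  constexpr unsigned Nd = std::numeric_limits<T>::digits;
  static_assert((Nd & (Nd - 1)) == 0, "digit count must be a power of two");

  // Reduce the shift count to [0, Nd).
  const unsigned sN = s & (Nd - 1);

  // (-sN) & (Nd - 1) equals (Nd - sN) % Nd, so the right-shift count is 0
  // when sN is 0 and never reaches Nd. No branch is needed.
  // The cast narrows the result back to T after integral promotion.
  return static_cast<T>((x << sN) | (x >> ((-sN) & (Nd - 1))));
}
```

With sN == 0 both shift counts are 0 and the result is just x, which is the
case the "if (__sN)" test in the patch guards against.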

        Jakub
