On Tue, Jun 8, 2021 at 2:23 PM Matthias Kretz <m.kr...@gsi.de> wrote:
>
>
> From: Matthias Kretz <kr...@kde.org>
>
> Explicitly support use of the stdx::simd implementation in situations
> where the user links TUs that were compiled with different -m flags. In
> general, this is always a (quasi) ODR violation for inline functions
> because at least codegen may differ in important ways. However, in the
> resulting executable only one (unspecified which one) of them might be
> used. For simd we want to support users to compile code multiple times,
> with different -m flags and have a runtime dispatch to the TU matching
> the target CPU. But if internal functions are not inlined this may lead
> to unexpected performance loss or execution of illegal instructions.
> Therefore, inline functions that are not marked as always_inline must
> use an additional template parameter somewhere in their name, to
> disambiguate between the different -m translations.

Note that excessive use of always_inline can cause compile-time issues
(see for example PR99785).  I wonder whether the inlines could be
placed in an anonymous namespace instead of relying on the
difficult-to-maintain explicit list of SIMD features?  It also doesn't
solve the issue when instantiating the functions from a TU which
contains #pragma GCC target sections to switch options, of course.
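
To make the two options concrete, here is a minimal sketch, not the
code from the patch.  All names (sketch::machine_flags, machine_tag,
odr_helper, sum4, sum4_internal) are hypothetical, and the flag list is
deliberately incomplete.  It assumes only the usual predefined macros
such as __AVX2__.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {

// Hypothetical stand-in for the patch's __machine_flags: fold the ISA
// extensions enabled for this TU into one number.  The list of macros
// is illustrative and would have to be kept in sync by hand.
constexpr std::uint64_t machine_flags()
{
  std::uint64_t bits = 0;
#ifdef __SSE2__
  bits |= std::uint64_t(1) << 0;
#endif
#ifdef __AVX__
  bits |= std::uint64_t(1) << 1;
#endif
#ifdef __AVX2__
  bits |= std::uint64_t(1) << 2;
#endif
#ifdef __AVX512F__
  bits |= std::uint64_t(1) << 3;
#endif
  return bits;
}

// A distinct type per flag combination.
template <std::uint64_t Flags>
struct machine_tag {};

// Computed once, at this point in the TU.  Code that follows a later
// #pragma GCC target in the same TU would still see this same alias.
using odr_helper = machine_tag<machine_flags()>;

// Option 1: the extra defaulted template parameter becomes part of the
// mangled name, so TUs built with different -m flags instantiate
// differently named functions instead of sharing one definition.
template <typename T, typename = odr_helper>
T sum4(const T* p)
{
  return p[0] + p[1] + p[2] + p[3];
}

} // namespace sketch

// Option 2: internal linkage via an anonymous namespace.  Every TU gets
// its own copy, with no flag list to maintain.
namespace {

template <typename T>
T sum4_internal(const T* p)
{
  return p[0] + p[1] + p[2] + p[3];
}

} // namespace
```

With option 1, sketch::sum4<float> in a TU built with -mavx2 and in one
built with plain -msse2 are different specializations, so the linker
cannot merge them.  With option 2 nothing is shared across TUs at all,
at the cost of possible code duplication.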

Richard.

> Signed-off-by: Matthias Kretz <m.kr...@gsi.de>
>
> libstdc++-v3/ChangeLog:
>
>         * include/experimental/bits/simd.h: Move feature detection bools
>         and add __have_avx512bitalg, __have_avx512vbmi2,
>         __have_avx512vbmi, __have_avx512ifma, __have_avx512cd,
>         __have_avx512vnni, __have_avx512vpopcntdq.
>         (__detail::__machine_flags): New function which returns a unique
>         uint64 depending on relevant -m and -f flags.
>         (__detail::__odr_helper): New type alias for either an anonymous
>         type or a type specialized with the __machine_flags number.
>         (_SimdIntOperators): Change template parameters from _Impl to
>         _Tp, _Abi because _Impl now has an __odr_helper parameter which
>         may be _OdrEnforcer from the anonymous namespace, which makes
>         for a bad base class.
>         (many): Either add __odr_helper template parameter or mark as
>         always_inline.
>         * include/experimental/bits/simd_detail.h: Add defines for
>         AVX512BITALG, AVX512VBMI2, AVX512VBMI, AVX512IFMA, AVX512CD,
>         AVX512VNNI, AVX512VPOPCNTDQ, and AVX512VP2INTERSECT.
>         * include/experimental/bits/simd_builtin.h: Add __odr_helper
>         template parameter or mark as always_inline.
>         * include/experimental/bits/simd_fixed_size.h: Ditto.
>         * include/experimental/bits/simd_math.h: Ditto.
>         * include/experimental/bits/simd_scalar.h: Ditto.
>         * include/experimental/bits/simd_neon.h: Add __odr_helper
>         template parameter.
>         * include/experimental/bits/simd_ppc.h: Ditto.
>         * include/experimental/bits/simd_x86.h: Ditto.
> ---
>  libstdc++-v3/include/experimental/bits/simd.h | 380 ++++++++++++------
>  .../include/experimental/bits/simd_builtin.h  |  41 +-
>  .../include/experimental/bits/simd_detail.h   |  40 ++
>  .../experimental/bits/simd_fixed_size.h       |  39 +-
>  .../include/experimental/bits/simd_math.h     |  45 ++-
>  .../include/experimental/bits/simd_neon.h     |   4 +-
>  .../include/experimental/bits/simd_ppc.h      |   4 +-
>  .../include/experimental/bits/simd_scalar.h   |  71 +++-
>  .../include/experimental/bits/simd_x86.h      |   4 +-
>  9 files changed, 440 insertions(+), 188 deletions(-)
>
>
> --
> ──────────────────────────────────────────────────────────────────────────
>  Dr. Matthias Kretz                           https://mattkretz.github.io
>  GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
>  std::experimental::simd              https://github.com/VcDevel/std-simd
> ──────────────────────────────────────────────────────────────────────────
