[PATCH 00/12] AVX10.2: Support new instructions

Haochen Jiang Mon, 19 Aug 2024 01:57:33 -0700

Hi all,

The AVX10.2 ymm rounding patches were merged to trunk around
6 hours ago. As mentioned before, the next step is AVX10.2 new
instruction support.


This patch series can be divided into three parts.

The first patch will refactor m512-check.h under testsuite to reuse
AVX-512 helper functions and unions and avoid ABI warnings when using
AVX10.

The following ten patches will support all AVX10.2 new instructions,
including:

  - AI Datatypes, Conversions, and post-Convolution Instructions.
  - Media Acceleration.
  - IEEE-754-2019 Minimum and Maximum Support.
  - Saturating Conversions.
  - Zero-extending Partial Vector Copies.
  - FP Scalar Comparison.

For the FP Scalar Comparison part (a.k.a. comx instructions), we will
only provide pattern support, not intrin support, since the intrinsics
would be redundant with the comi ones for common usage. We will also add
some optimizations afterwards for common usage of comx instructions. If
there are strong requests, we will add intrin support in the future.

The final patch will add a bf8 -> fp16 intrin for convenience. Since bf8
and fp16 have the same number of exponent bits, the conversion only needs
to widen the fraction part, so we will use a sequence of existing
instructions instead of a new instruction. This is just like the bf16 ->
fp32 conversion.
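
To illustrate why no new instruction is needed, here is a minimal scalar
sketch in C. It assumes bf8 uses the 1-5-2 layout (sign, exponent,
fraction) and fp16 the 1-5-10 layout, so the widening is a plain left
shift of the bit pattern. The helper names are hypothetical and are not
the intrinsics added by the patch; the actual intrin works on vectors.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: bf8 is 1 sign bit, 5 exponent bits, 2 fraction bits.
   fp16 is 1-5-10, so the exponent field lines up after shifting the
   byte into the high half; the 8 new low fraction bits are zero. */
static inline uint16_t bf8_to_fp16_bits(uint8_t x)
{
    return (uint16_t)((uint16_t)x << 8);
}

/* Same idea for bf16 (1-8-7) -> fp32 (1-8-23): shift left by 16. */
static inline uint32_t bf16_to_fp32_bits(uint16_t x)
{
    return (uint32_t)x << 16;
}

/* Reinterpret a 32-bit pattern as a float without aliasing issues. */
static inline float fp32_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

For example, the bf8 pattern 0x3C (1.0) maps to the fp16 pattern 0x3C00
(also 1.0). Special values such as infinities and NaNs keep their meaning
because the exponent field is copied unchanged.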

After all these patches are merged, the next step will be optimizations
based on AVX10.2 new instructions, including vnni vectorization, bf16
vectorization, comx optimization, etc.

Bootstrapped on x86_64-pc-linux-gnu. Ok for trunk?

Thx,
Haochen
