Hi all,

I committed the AVX10.2 new instruction patches to trunk a few hours
ago. The next and final part of the AVX10.2 upstreaming is to optimize
code generation with the new AVX10.2 instructions.

This patch series contains the following optimizations:

  - Auto-vectorization with VNNI instructions (PATCH 1).
  - Codegen optimization with the new scalar comparison instructions
    to eliminate redundant code (PATCH 2-3).
  - Auto-vectorization with BF16 instructions (PATCH 4-8).
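
As a rough illustration of the first item, a loop like the one below is
the kind of pattern a VNNI dot-product auto-vectorization would
typically target. This is a hypothetical example, not taken from the
patches; the function name dot_s8 is made up, and whether a given
compiler build actually emits a VNNI instruction for it is an
assumption.

```c
/* Hypothetical example, not part of the patch series.
   Signed 8-bit dot product accumulated into a 32-bit integer.
   Each product is computed in int (after integer promotion),
   so the accumulation itself does not overflow for small n.  */
int
dot_s8 (const signed char *a, const signed char *b, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += a[i] * b[i];
  return sum;
}
```

The scalar semantics are the same whether or not the loop is
vectorized, so the result can be checked against a plain build.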

This will finish the upstreaming of the AVX10.2 series.

Afterwards, we may add V2BF/V4BF support in a separate thread, just as
we did for V2HF/V4HF when AVX512FP16 was upstreamed.

Bootstrapped on x86_64-pc-linux-gnu. Ok for trunk?

Thx,
Haochen

