On Mon, Aug 19, 2024 at 4:57 PM Haochen Jiang <haochen.ji...@intel.com> wrote:
>
> Hi all,
>
> The AVX10.2 ymm rounding patches were merged to trunk around
> 6 hours ago. As mentioned before, the next step will be AVX10.2 new
> instruction support.
>
> This patch series can be divided into three parts.
>
> The first patch will refactor m512-check.h under testsuite to reuse
> AVX-512 helper functions and unions and avoid ABI warnings when using
> AVX10.
>
> The following ten patches will support all AVX10.2 new instructions,
> including:
>
> - AI Datatypes, Conversions, and post-Convolution Instructions.
> - Media Acceleration.
> - IEEE-754-2019 Minimum and Maximum Support.
> - Saturating Conversions.
> - Zero-extending Partial Vector Copies.
> - FP Scalar Comparison.
>
> For the FP Scalar Comparison part (a.k.a. comx instructions), we will only
> provide pattern support but not intrin support, since it is redundant
> with the comi ones for common usage. We will also add some optimizations
> afterwards for common usage with comx instructions. If there are
> strong requests, we will add intrin support in the future.
>
> The final patch will add a bf8 -> fp16 intrin for convenience. Since the
> conversion from bf8 to fp16 only requires casting the fraction part, because
> the exponent part has the same number of bits, we will use a sequence of
> instructions instead of new instructions. It is just like the scenario for
> bf16 -> fp32 conversion.
>
> After all these patches are merged, the next step will be optimizations based
> on AVX10.2 new instructions, including vnni vectorization, bf16
> vectorization, comx optimization, etc.
>
> Bootstrapped on x86-64-pc-linux-gnu. Ok for trunk?

Ok for all 12 patches.

>
> Thx,
> Haochen
>
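
To illustrate the bf8 -> fp16 remark in the quoted cover letter, here is a minimal scalar sketch in C. It assumes bf8 is the E5M2 layout (1 sign, 5 exponent, 2 fraction bits), which shares its exponent width with fp16, just as bf16 shares its exponent width with fp32. The function names are hypothetical and are not the intrinsics added by the patch; the patch itself uses a vector instruction sequence, which this sketch does not attempt to reproduce.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen one bf8 value (assumed E5M2) to the fp16
   bit pattern.  Sign and exponent fields line up, so the conversion is
   a left shift that zero-fills the 8 extra fraction bits.  */
static uint16_t
bf8_to_fp16_bits (uint8_t x)
{
  return (uint16_t) ((uint16_t) x << 8);
}

/* Hypothetical helper: the analogous bf16 -> fp32 case.  bf16 is the
   upper 16 bits of an IEEE binary32 value, so shift left by 16 and
   reinterpret the bits as a float.  */
static float
bf16_bits_to_float (uint16_t x)
{
  uint32_t bits = (uint32_t) x << 16;
  float f;
  memcpy (&f, &bits, sizeof f);  /* Bit reinterpretation without aliasing issues.  */
  return f;
}
```

Both helpers are exact: widening only appends zero fraction bits, so no rounding is involved.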
--
BR,
Hongtao