* Yang Zhong (yang.zh...@linux.intel.com) wrote:
> On Sat, Sep 24, 2022 at 12:01:16AM +0800, Xiaoyao Li wrote:
> > On 9/23/2022 9:30 PM, Yang Zhong wrote:
> > > On Wed, Sep 21, 2022 at 03:51:42PM +0100, Dr. David Alan Gilbert wrote:
> > > > * Wang, Lei (lei4.w...@intel.com) wrote:
> > > > > The new CPU model mostly inherits features from Icelake-Server, while
> > > > > adding new features:
> > > > >  - AMX (Advanced Matrix Extensions)
> > > > >  - Bus Lock Debug Exception
> > > > > and new instructions:
> > > > >  - AVX VNNI (Vector Neural Network Instruction):
> > > > >     - VPDPBUSD: Multiply and Add Unsigned and Signed Bytes
> > > > >     - VPDPBUSDS: Multiply and Add Unsigned and Signed Bytes with
> > > > >       Saturation
> > > > >     - VPDPWSSD: Multiply and Add Signed Word Integers
> > > > >     - VPDPWSSDS: Multiply and Add Signed Word Integers with
> > > > >       Saturation
> > > > >  - FP16: Replicates existing AVX512 computational SP (FP32)
> > > > >    instructions using FP16 instead of FP32 for ~2X performance gain
> > > > >  - SERIALIZE: Provide software with a simple way to force the
> > > > >    processor to complete all modifications, faster, allowed in all
> > > > >    privilege levels and not causing an unconditional VM exit
> > > > >  - TSX Suspend Load Address Tracking: Allows programmers to choose
> > > > >    which memory accesses do not need to be tracked in the TSX read set
> > > > >  - AVX512_BF16: Vector Neural Network Instructions supporting
> > > > >    BFLOAT16 inputs and conversion instructions from IEEE single
> > > > >    precision
> > > > >
> > > > > Features may be added in future versions:
> > > > >  - CET (virtualization support hasn't been merged)
> > > > > Instructions may be added in future versions:
> > > > >  - fast zero-length MOVSB (KVM doesn't support yet)
> > > > >  - fast short STOSB (KVM doesn't support yet)
> > > > >  - fast short CMPSB, SCASB (KVM doesn't support yet)
> > > > >
> > > > > Signed-off-by: Wang, Lei <lei4.w...@intel.com>
> > > > > Reviewed-by: Robert Hoo <robert...@linux.intel.com>
> > > >
> > > > Hi,
> > > >   What fills in the AMX tile and tmul information leaves
> > > > (0x1D, 0x1E)?
> > > >   In particular, how would we make sure when we migrate between two
> > > > generations of AMX/Tile/Tmul capable devices with different
> > > > register/palette/tmul limits that the migration is tied to the CPU type
> > > > correctly?
> > > >   Would you expect all devices called a 'SapphireRapids' to have the
> > > > same sizes?
> > > >
> > >
> > > There is only one palette in the current design. This palette includes 8
> > > tiles. Those two CPUID leaves define bytes_per_tile, total_tile_bytes,
> > > max_rows, etc.; the AMX tool will configure those values into TILECFG with
> > > the ldtilecfg instruction. Once the tiles are configured, we can use the
> > > tileload instruction to load data into those tiles.
> > >
> > > We did a migration between two SapphireRapids machines with the amx self
> > > test tool (tools/testing/selftests/x86/amx.c) started on both sides, and
> > > the migration worked well.
> > >
> > > As for SapphireRapids and newer cpu types, those two CPUID leaf
> > > definitions are all the same for AMX.
> >
> > I'm not sure what "definitions" means here. Are you saying the CPUID values
> > of leaf 0x1D and 0x1E won't change for any future Intel silicon?
> >
> > Personally, I doubt it. And we shouldn't make such an assumption unless
> > Intel states it in the SDM.
>
>   The current 0x1D and 0x1E definitions are as below:
>
>   /* CPUID Leaf 0x1D constants: */
>   #define INTEL_AMX_TILE_MAX_SUBLEAF     0x1
>   #define INTEL_AMX_TOTAL_TILE_BYTES     0x2000
>   #define INTEL_AMX_BYTES_PER_TILE       0x400
>   #define INTEL_AMX_BYTES_PER_ROW        0x40
>   #define INTEL_AMX_TILE_MAX_NAMES       0x8
>   #define INTEL_AMX_TILE_MAX_ROWS        0x10
>
>   /* CPUID Leaf 0x1E constants: */
>   #define INTEL_AMX_TMUL_MAX_K           0x10
>   #define INTEL_AMX_TMUL_MAX_N           0x40
>
>   These values are defined by the SDM, and on the new CPU under
>   development they are still the same as on SapphireRapids.
>   thanks!
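
For reference, a minimal sketch of how those constants sit in the CPUID
registers, assuming the SDM layout (leaf 0x1D subleaf 1: EAX[15:0]
total_tile_bytes, EAX[31:16] bytes_per_tile, EBX[15:0] bytes_per_row,
EBX[31:16] max_names, ECX[15:0] max_rows; leaf 0x1E subleaf 0: EBX[7:0]
tmul_maxk, EBX[23:8] tmul_maxn). The names struct amx_limits, amx_decode()
and amx_limits_equal() are made up for illustration and are not QEMU code;
strict equality is only one possible compatibility policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values quoted in the thread for SapphireRapids. */
#define INTEL_AMX_TOTAL_TILE_BYTES  0x2000
#define INTEL_AMX_BYTES_PER_TILE    0x400
#define INTEL_AMX_BYTES_PER_ROW     0x40
#define INTEL_AMX_TILE_MAX_NAMES    0x8
#define INTEL_AMX_TILE_MAX_ROWS     0x10
#define INTEL_AMX_TMUL_MAX_K        0x10
#define INTEL_AMX_TMUL_MAX_N        0x40

/* The quoted constants are internally consistent: 8 tiles of 16 rows x 64 bytes. */
static_assert(INTEL_AMX_TILE_MAX_ROWS * INTEL_AMX_BYTES_PER_ROW ==
              INTEL_AMX_BYTES_PER_TILE, "rows * bytes_per_row");
static_assert(INTEL_AMX_TILE_MAX_NAMES * INTEL_AMX_BYTES_PER_TILE ==
              INTEL_AMX_TOTAL_TILE_BYTES, "names * bytes_per_tile");

/* Hypothetical container for the limits a CPU model would have to pin. */
struct amx_limits {
    uint16_t total_tile_bytes;
    uint16_t bytes_per_tile;
    uint16_t bytes_per_row;
    uint16_t max_names;
    uint16_t max_rows;
    uint16_t tmul_max_n;
    uint8_t  tmul_max_k;
};

/*
 * Decode raw register values from CPUID.(EAX=0x1D,ECX=1) and
 * CPUID.(EAX=0x1E,ECX=0). The caller supplies the registers, so this
 * works on values captured from any host, not only the current one.
 */
static inline struct amx_limits amx_decode(uint32_t eax_1d, uint32_t ebx_1d,
                                           uint32_t ecx_1d, uint32_t ebx_1e)
{
    struct amx_limits l = {
        .total_tile_bytes = (uint16_t)(eax_1d & 0xffff),
        .bytes_per_tile   = (uint16_t)(eax_1d >> 16),
        .bytes_per_row    = (uint16_t)(ebx_1d & 0xffff),
        .max_names        = (uint16_t)(ebx_1d >> 16),
        .max_rows         = (uint16_t)(ecx_1d & 0xffff),
        .tmul_max_n       = (uint16_t)((ebx_1e >> 8) & 0xffff),
        .tmul_max_k       = (uint8_t)(ebx_1e & 0xff),
    };
    return l;
}

/* Assumed policy: a guest may only move to a host with identical limits. */
static inline bool amx_limits_equal(const struct amx_limits *a,
                                    const struct amx_limits *b)
{
    return a->total_tile_bytes == b->total_tile_bytes &&
           a->bytes_per_tile   == b->bytes_per_tile &&
           a->bytes_per_row    == b->bytes_per_row &&
           a->max_names        == b->max_names &&
           a->max_rows         == b->max_rows &&
           a->tmul_max_n       == b->tmul_max_n &&
           a->tmul_max_k       == b->tmul_max_k;
}
```

With the SapphireRapids values, the registers would be EAX=0x04002000,
EBX=0x00080040, ECX=0x10 for leaf 0x1D subleaf 1 and EBX=0x4010 for leaf
0x1E. A destination that decoded to anything else would fail the equality
check, which is the case the CPU model has to account for.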

But there's nothing stopping them from increasing in future versions?

Dave

>   Yang
>
> > > So, from an AMX perspective, the migration
> > > should be workable on subsequent cpu types. thanks!
> >
> > I think what Dave is worried about is migrating a VM created with the
> > "SapphireRapids" model on an SPR machine to some newer platform in the
> > future, where the newer platform reports different values for CPUID
> > leaves 0x1D and 0x1E than the SPR platform.
> >
> > I think we need to include CPUID leaves 0x1D and 0x1E in the CPU model as
> > well. Otherwise we will hit the same problem as with Intel PT, where SPR
> > reports fewer capabilities than ICX.
> >
>
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK