On Fri, Apr 16, 2021 at 06:05:10PM -0400, Len Brown wrote:
> I'm not aware of any intent to transparently use AMX for bcopy, like
> what happened with AVX-512. (didn't they undo that mistake?)

No clue, did they?

> Tasks are created without an 8KB AMX buffer.
> Tasks have to actually touch the AMX TILE registers for us to allocate
> one for them.

When tasks do that it doesn't matter too much - for the library it does!

If the library does that by default and the processes which comprise
that pipe I mentioned earlier all get 8K buffers because the underlying
library decided so, and swinging those buffers around when
saving/restoring contexts turns out to be a performance penalty, then we
have lost.

Lost because if that thing goes upstream this way, with use of AMX
allowed implicitly, there's no fixing it anymore once it becomes an ABI.

So, that library should ask the kernel whether it supports AMX and only
use it if it has gotten a positive answer. And by default that answer
should be "no" because the majority of processes - that same pipe I keep
mentioning - don't need it.

I have no good idea yet how granular that should be - per process, per
thread, whatever, but there should be a way for the kernel to control
whether the library uses AMX, AVX512 or whatever fat state is out there
available.

Then, if a process wants the library to use AMX on its behalf, then it
can say so and the library can do that but only after having been asked
to explicitly.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette