------- Comment From andreas.kreb...@de.ibm.com 2018-11-27 03:43 EDT-------
There were some z13-related patches from IBM, and modified versions of them went upstream, e.g. https://sourceforge.net/p/math-atlas/patches/74/
From our experiments, the z13 support does not appear to be functional in current versions of libatlas. We are working on a patch that also includes z14 support. First measurements show improvements of up to 5x compared to the libatlas version from 18.04. There is one major problem left to fix. After that we plan to bring these patches upstream.

We also plan to provide the proper tuning files for z13 and z14. These will allow building libatlas tuned for z13/z14 without actually requiring such a machine during the build. If such tuning files are missing during the build phase, libatlas will try to perform a lengthy tuning run. This step alone would take more than a day per machine.

In order to make use of this in a distro, I think having separate libraries is the way to go. All the infrastructure has been in place for many years. The dynamic loader already checks various lib subdirs depending on the hardware capabilities. Just placing a library version in the proper subdir will do the trick. No runtime overhead. No extra memory occupied.

Regarding IFUNC: this is good for function-level optimizations, i.e. if there is a subset of functions in a lib which would benefit from machine-specific optimizations, this is the way to go. We do this already in glibc. But in the case of libatlas, IFUNC would be needed for basically everything in the lib. In the end you would have a lib 3 times the size of the current version, plus one more copy for every new CPU level. Without investing some effort in libatlas to keep the ifunc versions for one CPU level together in one particular area (probably an ELF section), the different versions would end up interleaved in the binary, forcing the entire thing to occupy memory.

But the toughest part would probably be to make libatlas build differently tuned versions of a function. The libatlas tuning is not just about building something with different compiler options.
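
To make the per-function dispatch idea concrete, here is a minimal sketch in C. It is not the real IFUNC mechanism and not libatlas code: it hand-rolls the same selection with a plain function pointer so that it builds anywhere. The names dot_generic, dot_vector, select_dot and the DEMO_HWCAP_VECTOR bit are all hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability bit, for illustration only. A real build would
   use the platform's own hardware-capability constants, not this one. */
#define DEMO_HWCAP_VECTOR (1UL << 11)

typedef double (*dot_fn)(const double *, const double *, size_t);

/* Baseline variant: plain scalar loop. */
static double dot_generic(const double *x, const double *y, size_t n)
{
    assert(x != NULL && y != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Stand-in for a machine-tuned variant: unrolled by two with separate
   accumulators. A real tuned kernel would be generated code, not this. */
static double dot_vector(const double *x, const double *y, size_t n)
{
    assert(x != NULL && y != NULL);
    double s0 = 0.0, s1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += x[i] * y[i];
        s1 += x[i + 1] * y[i + 1];
    }
    if (i < n)              /* odd element left over */
        s0 += x[i] * y[i];
    return s0 + s1;
}

/* Plays the role of an IFUNC resolver: map capability bits to one variant. */
dot_fn select_dot(unsigned long hwcap)
{
    return (hwcap & DEMO_HWCAP_VECTOR) ? dot_vector : dot_generic;
}
```

On Linux the hwcap argument could come from getauxval(AT_HWCAP) in <sys/auxv.h>. Note that both variants of the function live in the same binary, and libatlas would need this for nearly every routine, which is where the size and memory concerns above come from.
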
libatlas generates code itself, depending on cache characteristics, the availability of multiply-and-add and vector instructions, and measurements done during tuning. The build mechanism is currently designed to produce one lib and would require substantial changes to perform the tuning on a per-function basis.

--
https://bugs.launchpad.net/bugs/1803077

Title:
  libatlas not using vector instructions - large performance impact