On Sun, Oct 28, 2018 at 8:42 AM Andy Polyakov via cfarm-users
<cfarm-users@lists.tetaneutral.net> wrote:
>
> > The initial benchmarks are kind of flat when using 3.8 GHz as the
> > frequency. I think the problem is, we are not working the machine hard
> > enough so the cpu's are reluctant to move from a low energy state.
>
> I'd say it's more likely because POWER9 appears to be "allergic" to
> mixtures of vector and scalar instructions. And since you are likely to
> reference memory you will always have scalar instructions at least to
> calculate effective addresses. Normalized[!] difference to POWER8 can be
> anywhere from "little" to a "lot". Example of "little" can be ~15% in
> SHA512(*) and VSX Chacha20. Example of "lot" is ~50% for pre-VSX
> Chacha20 implementation where one interleaves scalar and vector in more
> or less equal proportion. Though on the other hand pure scalar code is
> normally faster...

Thanks Andy. That's what we are seeing. AES and SHA slowed down, and the
ChaChaR sped up (even the SIMD version of ChaCha benefited).

Jeff
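
For anyone following along, here is a rough sketch of the normalization
being discussed: converting a measured throughput into cycles per byte at
an assumed clock, then comparing two machines. The 3.8 GHz figure is the
one from this thread; the function names and the throughput numbers in the
usage note are made up for illustration and are not measurements from
either machine.

```python
def cycles_per_byte(mib_per_sec: float, ghz: float) -> float:
    """Convert throughput (MiB/s) to cycles per byte at an assumed clock (GHz)."""
    bytes_per_sec = mib_per_sec * 1024 * 1024
    return (ghz * 1e9) / bytes_per_sec


def normalized_change(cpb_old: float, cpb_new: float) -> float:
    """Relative change in cycles per byte; positive means the new CPU is slower."""
    return (cpb_new - cpb_old) / cpb_old


if __name__ == "__main__":
    # Hypothetical throughputs, chosen only to show the arithmetic.
    old = cycles_per_byte(mib_per_sec=1000.0, ghz=3.8)
    new = cycles_per_byte(mib_per_sec=800.0, ghz=3.8)
    print(f"old: {old:.2f} cpb, new: {new:.2f} cpb, "
          f"change: {normalized_change(old, new):+.0%}")
```

Note that the result is only as good as the assumed frequency. If the
cores are actually running below 3.8 GHz during the test, the computed
cycles per byte will look worse than they really are.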