On Sun, Oct 28, 2018 at 8:42 AM Andy Polyakov via cfarm-users
<cfarm-users@lists.tetaneutral.net> wrote:
>
> > The initial benchmarks are kind of flat when using 3.8 GHz as the
> > frequency. I think the problem is that we are not working the machine
> > hard enough, so the CPUs are reluctant to leave a low-energy state.
>
> I'd say it's more likely because POWER9 appears to be "allergic" to
> mixtures of vector and scalar instructions. And since you are likely to
> reference memory, you will always have scalar instructions, at least to
> calculate effective addresses. The normalized[!] difference from POWER8
> can be anywhere from "little" to a "lot". An example of "little" is ~15%
> in SHA512(*) and VSX ChaCha20. An example of a "lot" is ~50% for the
> pre-VSX ChaCha20 implementation, where one interleaves scalar and vector
> in more or less equal proportion. Though on the other hand, pure scalar
> code is normally faster...

Thanks Andy.

That's what we are seeing. AES and SHA slowed down, and ChaCha sped up
(even the SIMD version of ChaCha benefited).
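
For anyone following along, here is a rough sketch of how a frequency-
normalized comparison might be computed. It assumes "normalized" means
cycles per byte at a stated clock, which is my reading and not something
confirmed in this thread. The function names and the throughput figures
in the usage lines are hypothetical; only the 3.8 GHz clock comes from
the discussion above.

```python
def cycles_per_byte(throughput_mib_s: float, freq_ghz: float) -> float:
    """Convert a measured throughput into cycles per byte at an assumed clock."""
    bytes_per_second = throughput_mib_s * 1024 * 1024
    return (freq_ghz * 1e9) / bytes_per_second


def normalized_change(old_cpb: float, new_cpb: float) -> float:
    """Relative change in cycles per byte.

    Positive means the new machine spends more cycles per byte (slower
    per clock); negative means it spends fewer (faster per clock).
    """
    return (new_cpb - old_cpb) / old_cpb


if __name__ == "__main__":
    # Hypothetical throughput figures, for illustration only.
    old_cpb = cycles_per_byte(throughput_mib_s=900.0, freq_ghz=3.5)
    new_cpb = cycles_per_byte(throughput_mib_s=850.0, freq_ghz=3.8)
    print(f"old: {old_cpb:.2f} cpb, new: {new_cpb:.2f} cpb")
    print(f"normalized change: {normalized_change(old_cpb, new_cpb):+.1%}")
```

If the cores are not actually running at the assumed clock (for example
because they stay in a low-energy state), the cycles-per-byte figure will
be off by the same ratio, so the assumed frequency matters as much as the
measured throughput.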

Jeff
_______________________________________________
cfarm-users mailing list
cfarm-users@lists.tetaneutral.net
https://lists.tetaneutral.net/listinfo/cfarm-users
