On Thu, 26 May 2022 15:20:29 +0200 Mattias Rönnblom <hof...@lysator.liu.se> wrote:
> > +#else
> > +	/* Slower method requiring floating point divide
> > +	 *
>
> Do you know how much slower? I ran rand_perf_test on two of my systems.
>
>                       AMD 5900X   Pi4 (ARM Cortex-A72)
> IEEE754 version       12          1.19
> Non-IEEE754 version   11          1.16
> Naive version*        24          1.16
>
> * (double)rte_rand() / (double)UINT64_MAX
>
> Numbers are TSC cycles/op.
>
> Surprisingly, it seems like the IEEE754 version is slower on both of
> these machines.
>
> Do you have a machine (or a different use case) where the supposedly
> more optimized version actually runs faster?

The direct method is based on the approach used by glibc and others;
the divide method (including the spelling error) is from FreeBSD.

Be careful with micro-benchmarks. A better one would be to call
rte_drand() and compare the result with something, e.g. to check
whether it is in range.
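
For readers following along, here is a rough sketch of the two conversions
being compared, plus a loop of the kind suggested above where the result
feeds a comparison. It is not the code from the patch: next_u64() is a
hypothetical stand-in (splitmix64) for rte_rand(), and drand_direct(),
drand_divide() and count_below() are made-up names. It assumes IEEE 754
binary64 doubles.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for rte_rand(): splitmix64, not DPDK's generator. */
static uint64_t rng_state = 0x9E3779B97F4A7C15ULL;

static uint64_t next_u64(void)
{
	uint64_t z = (rng_state += 0x9E3779B97F4A7C15ULL);

	z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
	z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
	return z ^ (z >> 31);
}

/*
 * Direct method (assumes IEEE 754 binary64): put 52 random bits into the
 * mantissa of a double with exponent 0, giving a value in [1, 2), then
 * subtract 1 to land in [0, 1).
 */
static double drand_direct(void)
{
	uint64_t bits = (next_u64() >> 12) | 0x3FF0000000000000ULL;
	double d;

	memcpy(&d, &bits, sizeof(d));	/* type-pun without aliasing issues */
	return d - 1.0;
}

/* Divide method: keep the top 53 bits and scale by 2^53, giving [0, 1). */
static double drand_divide(void)
{
	return (double)(next_u64() >> 11) / 9007199254740992.0;
}

/*
 * Benchmark body where each result is used in a comparison, so the
 * conversion to double cannot simply be discarded by the compiler.
 */
static uint64_t count_below(double (*draw)(void), uint64_t n, double limit)
{
	uint64_t hits = 0;

	for (uint64_t i = 0; i < n; i++)
		if (draw() < limit)
			hits++;
	return hits;
}
```

Timing count_below(drand_direct, n, 0.5) against
count_below(drand_divide, n, 0.5) would measure the conversion together
with a typical use of its result, rather than the conversion alone.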