On Sat, Mar 12, 2016 at 11:02 AM, Michael Niedermayer <mich...@niedermayer.cc> wrote: > On Sat, Mar 12, 2016 at 08:58:41AM -0500, Ganesh Ajjanagadde wrote: >> Code taken from the Julia project, licensed under MIT: >> https://github.com/JuliaLang/julia/blob/master/base/random.jl, in turn >> derived from: "The Ziggurat Method for generating random variables" - >> Marsaglia and Tsang. >> >> Paper and reference code: http://www.jstatsoft.org/v05/i08/. This is >> well known to be the fastest method empirically for generating normal random >> variables for a fixed PRNG source. >> >> Note that there are a large number of implementations with >> various tunings, this was one of the simpler ones and also has a friendly >> license. >> >> This results in ~ 3x speedup of random number generation: >> old: >> 15090 decicycles in av_bmg_get, 1 runs, 0 skips >> 13140 decicycles in av_bmg_get, 2 runs, 0 skips >> 10117 decicycles in av_bmg_get, 4 runs, 0 skips >> [...] >> 2133 decicycles in av_bmg_get, 524268 runs, 20 skips=60.4x >> 2134 decicycles in av_bmg_get, 1048531 runs, 45 skips=61.3x >> 2135 decicycles in av_bmg_get, 2097061 runs, 91 skips=61.9x >> >> new: >> 7650 decicycles in av_gaussian_get, 1 runs, 0 skips >> 5490 decicycles in av_gaussian_get, 2 runs, 0 skips >> 3982 decicycles in av_gaussian_get, 4 runs, 0 skips >> [...] >> 812 decicycles in av_gaussian_get, 524281 runs, 7 skips >> 813 decicycles in av_gaussian_get, 1048563 runs, 13 skips >> 813 decicycles in av_gaussian_get, 2097131 runs, 21 skips >> >> and accordingly a ~2% speedup in aac encoding (-march=native, Haswell, >> clang): >> >> old: >> ffmpeg -f lavfi -i anoisesrc -t 300 -y sin_new.aac 5.30s user 0.02s system >> 99% cpu 5.322 total >> new: >> ffmpeg -f lavfi -i anoisesrc -t 300 -y sin_new.aac 5.16s user 0.03s system >> 99% cpu 5.198 total >> >> Function added as av_gaussian_get with documentation, minor bumped. >> > [...] > >> +static inline double ziggurat(AVLFG *lfg) >> +{ >> + while (1) { > >> + uint64_t r = (((uint64_t)av_lfg_get(lfg) << 32) + av_lfg_get(lfg)) >> & 0x000fffffffffffff; > > the order of Function evaluations is undefined here i think
And why would that matter? Aren't these just approximations to truly uniform 32 bits? rand()<<32 + rand() is common: https://stackoverflow.com/questions/8120062/generate-random-64-bit-integer, and it is not undefined AFAIK, merely unspecified: https://stackoverflow.com/questions/24348383/order-of-execution-of-function-calls-within-a-line-undefined. > > > [...] > -- > Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB > > I know you won't believe me, but the highest form of Human Excellence is > to question oneself and others. -- Socrates > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel > _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel