On Sat, Oct 10, 2015 at 09:58:47PM -0400, Ganesh Ajjanagadde wrote: > This uses Stein's binary GCD algorithm: > https://en.wikipedia.org/wiki/Binary_GCD_algorithm > to get a roughly 4x speedup over Euclidean GCD on standard architectures > with a compiler intrinsic for ctzll, and a roughly 2x speedup otherwise. > At the moment, the compiler intrinsic is used on GCC and Clang due to > its easy availability. > > Quick note regarding overflow: yes, subtractions on int64_t can, but the > llabs takes care of that. The llabs is also guaranteed to be safe, with > no annoying INT64_MIN business since INT64_MIN being a power of 2, is > shifted down before being sent to llabs. > > The binary GCD needs ff_ctzll, an extension of ff_ctz for long long > (int64_t). On > GCC, this is provided by a built-in. On Microsoft, there is a > BitScanForward64 analog of BitScanForward that should work; but I can't > confirm. > Apparently it is not available on 32 bit builds; so this may or may not > work correctly. On Intel, per the documentation there is only an > intrinsic for _bit_scan_forward and people have posted on forums > regarding _bit_scan_forward64, but often their documentation is > woeful. Again, I don't have it, so I can't test. > > As such, to be safe, for now only the GCC/Clang intrinsic is added, the rest > use a compiled version based on the De-Bruijn method of Leiserson et al: > http://supertech.csail.mit.edu/papers/debruijn.pdf. > > Tested with FATE, sample benchmark (x86-64, GCC 5.2.0, Haswell) > with a START_TIMER and STOP_TIMER in libavutil/rationsl.c, followed by a > make fate. > > aac-am00_88.err: > builtin: > 714 decicycles in av_gcd, 4095 runs, 1 skips > > de-bruijn: > 1440 decicycles in av_gcd, 4096 runs, 0 skips > > previous: > 2889 decicycles in av_gcd, 4096 runs, 0 skips > > Signed-off-by: Ganesh Ajjanagadde <gajjanaga...@gmail.com> > --- > libavutil/intmath.h | 19 +++++++++++++++++++ > libavutil/mathematics.c | 26 +++++++++++++++++++++----- > 2 files changed, 40 insertions(+), 5 deletions(-)
applied thanks [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB It is dangerous to be right in matters on which the established authorities are wrong. -- Voltaire
signature.asc
Description: Digital signature
_______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel