On Tue, Feb 24, 2015 at 10:05:24PM -0300, James Almer wrote:
> Signed-off-by: James Almer <jamr...@gmail.com>
> ---
> I decided to go the configure route since other features (cmov, clz) also do
> it, but if preferred this could instead be done with a new intmath.h header
> in the x86/ folder containing something like
>
> #if defined(__GNUC__) && defined(__POPCNT__)
>     #define av_popcount   __builtin_popcount
> #if ARCH_X86_64
>     #define av_popcount64 __builtin_popcountll
> #endif
> #endif
>
> For a cleaner compile time check.
>
>  configure           | 12 ++++++++++--
>  libavutil/intmath.h | 13 +++++++++++++
>  2 files changed, 23 insertions(+), 2 deletions(-)
>

For the record, the builtin implementation looks like this here:

0000000000000000 <av_popcount_c>:
   0:   89 f8                   mov    %edi,%eax
   2:   d1 e8                   shr    %eax
   4:   25 55 55 55 55          and    $0x55555555,%eax
   9:   29 c7                   sub    %eax,%edi
   b:   89 fa                   mov    %edi,%edx
   d:   c1 ef 02                shr    $0x2,%edi
  10:   81 e2 33 33 33 33       and    $0x33333333,%edx
  16:   81 e7 33 33 33 33       and    $0x33333333,%edi
  1c:   8d 04 17                lea    (%rdi,%rdx,1),%eax
  1f:   89 c2                   mov    %eax,%edx
  21:   c1 ea 04                shr    $0x4,%edx
  24:   01 d0                   add    %edx,%eax
  26:   25 0f 0f 0f 0f          and    $0xf0f0f0f,%eax
  2b:   89 c2                   mov    %eax,%edx
  2d:   c1 ea 08                shr    $0x8,%edx
  30:   01 d0                   add    %edx,%eax
  32:   89 c2                   mov    %eax,%edx
  34:   c1 ea 10                shr    $0x10,%edx
  37:   01 d0                   add    %edx,%eax
  39:   83 e0 3f                and    $0x3f,%eax
  3c:   c3                      retq
  3d:   0f 1f 00                nopl   (%rax)

0000000000000040 <popcount_gcc>:
  40:   48 83 ec 08             sub    $0x8,%rsp
  44:   89 ff                   mov    %edi,%edi
  46:   e8 00 00 00 00          callq  4b <popcount_gcc+0xb>
  4b:   48 83 c4 08             add    $0x8,%rsp
  4f:   c3                      retq

0000000000000040 <popcount_clang>:
  40:   89 f8                   mov    %edi,%eax
  42:   d1 e8                   shr    %eax
  44:   25 55 55 55 55          and    $0x55555555,%eax
  49:   29 c7                   sub    %eax,%edi
  4b:   89 f8                   mov    %edi,%eax
  4d:   25 33 33 33 33          and    $0x33333333,%eax
  52:   c1 ef 02                shr    $0x2,%edi
  55:   81 e7 33 33 33 33       and    $0x33333333,%edi
  5b:   01 c7                   add    %eax,%edi
  5d:   89 f8                   mov    %edi,%eax
  5f:   c1 e8 04                shr    $0x4,%eax
  62:   01 f8                   add    %edi,%eax
  64:   25 0f 0f 0f 0f          and    $0xf0f0f0f,%eax
  69:   69 c0 01 01 01 01       imul   $0x1010101,%eax,%eax
  6f:   c1 e8 18                shr    $0x18,%eax
  72:   c3                      retq

We might see relevant "optimizations" for our reference code.

[...]

--
Clément B.
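
For readers who prefer C to objdump output, here is a rough sketch of what the av_popcount_c and popcount_clang listings above appear to compute. It is reconstructed from the disassembly only, so it may not match the libavutil source or clang's internals line for line. The function names popcount_shift_add, popcount_mul and popcount_naive are made up for this illustration. The two variants share the first three steps and differ only in the final byte summation: two shift/add pairs plus a mask versus a single multiply and shift.

```c
#include <stdint.h>

/* Hypothetical name. Shift-and-add variant, following the
 * av_popcount_c listing step by step. */
static unsigned popcount_shift_add(uint32_t x)
{
    x -= (x >> 1) & 0x55555555;                      /* per-2-bit counts */
    x  = (x & 0x33333333) + ((x >> 2) & 0x33333333); /* per-4-bit counts */
    x  = (x + (x >> 4)) & 0x0F0F0F0F;                /* per-byte counts  */
    x += x >> 8;                                     /* fold bytes ...   */
    return (x + (x >> 16)) & 0x3F;                   /* ... into low 6 bits */
}

/* Hypothetical name. Multiply variant, following the popcount_clang
 * listing: the multiply sums the four byte counts into the top byte. */
static unsigned popcount_mul(uint32_t x)
{
    x -= (x >> 1) & 0x55555555;
    x  = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x  = (x + (x >> 4)) & 0x0F0F0F0F;
    return (x * 0x01010101u) >> 24;
}

/* Hypothetical name. Bit-by-bit loop, used only as a cross-check. */
static unsigned popcount_naive(uint32_t x)
{
    unsigned n = 0;
    for (; x; x >>= 1)
        n += x & 1;
    return n;
}
```

Both variants should agree with the naive loop for every 32-bit input. Whether the multiply form is actually faster depends on the target's multiplier latency, so it would need benchmarking before touching the reference code.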
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel