Hi James,

On Wed, Feb 25, 2015 at 12:25 PM, James Almer <jamr...@gmail.com> wrote:
> On 25/02/15 9:41 AM, Ronald S. Bultje wrote:
> > Hi,
> >
> > On Tue, Feb 24, 2015 at 8:05 PM, James Almer <jamr...@gmail.com> wrote:
> >>
> >> +#if HAVE_FAST_POPCNT
> >> +#if AV_GCC_VERSION_AT_LEAST(4,5)
> >> +#ifndef av_popcount
> >> +    #define av_popcount   __builtin_popcount
> >> +#endif /* av_popcount */
> >> +#if HAVE_FAST_64BIT
> >> +#ifndef av_popcount64
> >> +    #define av_popcount64 __builtin_popcountll
> >> +#endif /* av_popcount64 */
> >> +#endif /* HAVE_FAST_64BIT */
> >> +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */
> >> +#endif /* HAVE_FAST_POPCNT */
> >>
> >
> > Is this just to get the sse4 popcnt instruction if we compile with
> > -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet
> > configure still does an arch/cpu check. I'd expect the built-in/compiler to
> > do that for us based on -mcpu, and we could always unconditionally use this
> > (as long as gcc >= 4.5); alternatively, you could use inline asm and then
> > have the configure check (HAVE_FAST_POPCNT). But doing both seems a little
> > odd. I have no objection to it, patch is still fine, just odd.
> >
> > Ronald
>
> I purposely made the checks for gcc 4.5 and in configure for cpus that
> support popcnt because otherwise __builtin_popcount (at least gcc's) is
> slower than our generic av_popcount_c function from lavu/common.h.
> When the CPU supports popcnt the builtin becomes a single inlined
> instruction.
>
> I tried the __asm__ approach, but the code generated by the builtin seemed
> better.

That's interesting, can you show the code you tried?

Ronald
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
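
For readers following the thread, here is a minimal sketch of the trade-off being discussed: use the compiler builtin only when the target is known to have a hardware popcnt instruction, and fall back to a portable bit-twiddling routine otherwise. This is not the patch itself. The names popcount_generic and my_popcount are hypothetical, and the compiler-defined __POPCNT__ macro (set by GCC and Clang when popcnt is enabled, e.g. via -mpopcnt) stands in here for the configure-time HAVE_FAST_POPCNT check.

```c
#include <stdint.h>

/* Hypothetical portable fallback: classic parallel bit count.
 * Sums bits in 2-, 4-, 8-, then 16-bit groups. */
static inline int popcount_generic(uint32_t x)
{
    x -= (x >> 1) & 0x55555555u;                        /* 2-bit sums  */
    x  = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);  /* 4-bit sums  */
    x  = (x + (x >> 4)) & 0x0F0F0F0Fu;                  /* 8-bit sums  */
    x += x >> 8;                                        /* 16-bit sums */
    return (int)((x + (x >> 16)) & 0x3Fu);              /* total 0..32 */
}

/* Assumption: __POPCNT__ is used as a stand-in for a configure check.
 * With it defined, the builtin can compile to a single popcnt
 * instruction; without it, the compiler emits its own generic code,
 * so the hand-written fallback is chosen instead. */
#if defined(__GNUC__) && defined(__POPCNT__)
#define my_popcount(x) __builtin_popcount(x)
#else
#define my_popcount(x) popcount_generic(x)
#endif
```

Either branch gives the same results, so my_popcount(0xF0u) evaluates to 4 regardless of which one is selected; only the generated code differs.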