On 10/25/2015 1:39 PM, Matt Oliver wrote: > On 26 October 2015 at 00:43, Hendrik Leppkes <h.lepp...@gmail.com> wrote: > >> On Sun, Oct 25, 2015 at 12:30 PM, Matt Oliver <protogo...@gmail.com> >> wrote: >>> On 25 October 2015 at 22:16, Hendrik Leppkes <h.lepp...@gmail.com> >> wrote: >>> >>>> On Sun, Oct 25, 2015 at 11:26 AM, Matt Oliver <protogo...@gmail.com> >>>> wrote: >>>>> On 25 October 2015 at 12:22, Ganesh Ajjanagadde <gajja...@mit.edu> >>>> wrote: >>>>> >>>>>> On Sat, Oct 24, 2015 at 7:03 PM, James Almer <jamr...@gmail.com> >> wrote: >>>>>>> On 10/24/2015 7:48 PM, Ganesh Ajjanagadde wrote: >>>>>>>> On Sat, Oct 24, 2015 at 6:08 PM, James Almer <jamr...@gmail.com> >>>> wrote: >>>>>>>>> This gives the compiler some flexibility >>>>>>>>> >>>>>>>>> Signed-off-by: James Almer <jamr...@gmail.com> >>>>>>>>> --- >>>>>>>>> libavutil/x86/intmath.h | 2 +- >>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>>>> >>>>>>>>> diff --git a/libavutil/x86/intmath.h b/libavutil/x86/intmath.h >>>>>>>>> index 7881e3c..10d3e7f 100644 >>>>>>>>> --- a/libavutil/x86/intmath.h >>>>>>>>> +++ b/libavutil/x86/intmath.h >>>>>>>>> @@ -36,7 +36,7 @@ static av_always_inline av_const int >>>>>> ff_ctzll_x86(long long v) >>>>>>>>> { >>>>>>>>> # if ARCH_X86_64 >>>>>>>>> uint64_t c; >>>>>>>>> - __asm__("bsfq %1,%0" : "=r" (c) : "r" (v)); >>>>>>>>> + __asm__ ("bsfq %1,%0" : "=r" (c) : "rm" (v)); >>>>>>>>> return c; >>>>>>>>> # else >>>>>>>>> return ((uint32_t)v == 0) ? _bit_scan_forward((uint32_t)(v >>>> >>>>>> 32)) + 32 : _bit_scan_forward((uint32_t)v); >>>>>>>>> -- >>>>>>>>> 2.6.2 >>>>>>>> >>>>>>>> This question may be silly, but can you clarify when this asm >> code is >>>>>>>> used instead of __builtin_ctzll? For instance, I think on >> GNU/Linux, >>>>>>>> x86-64, sufficiently modern GCC (even with asm enabled), we should >>>>>>>> always use the builtin: the compiler knows best what to do with >> its >>>>>>>> builtin. >>>>>>> >>>>>>> This is ICC/ICL, not GCC. >>>>>> >>>>>> Ah, now I recall that _bit_scan_forward64 was not always available on >>>>>> ICC. Anyway, this is just a heads up to whoever uses ICC/ICL and the >>>>>> like: you may want to find out to which versions this built in >>>>>> applies. >>>>> >>>>> >>>>> bit_scan_forward64 isnt available on ICL at all, hence the use of asm. >>>>> >>>> >>>> Intels intrinsic guide lists _BitScanForward64 as the intrinsic to use, >>>> however. >>>> >>>> >> https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=Other&expand=373 >>> >>> >>> _BitScanForward64 is a msvc intrinsic only available on windows >> platforms >>> so not usable with ICC. The native asm implementation also performs >> better >>> with ICL hence its use. >> >> The "Intel(R) C++ Compiler User and Reference Guide" would care to >> disagree with you. >> > > Well ill be... I dont have ICC on hand to check it but ill take Intels word > for it. Its interesting that Intel decided to support that intrinsic as > opposed to just extending there own version to 64b (i.e. > _bit_scan_forward64). _BitScanForward64 as far as I know was introduced by > msvc and made available in "winnt.h" hence the different naming convention > to other intrinsics. That said the inline asm version is still preferable > with Intel as the intrinsic passes the result via pointer which when tested > with msvc11 and 12 crt's resulted in horrible performance hits as the > compiler didnt optimise out the memory access in order to keep the result > in register (works fine with newer ICL 16 and msvcrt14 though).
Tried ICC 13 x86_64 from https://gcc.godbolt.org/ and it doesn't recognize _BitScanForward64, or _BitScanForward for that matter. > > Using tzcnt would be the more interesting option. Changing the asm to tzcnt > works with ICL. But also the use of the tzcnt intrinsic (_tzcnt_u64) would > be possible with both intel and msvc. I dont have anything older than > Haswell or Piledriver to double check but I have seen its use in several > other projects without issue so in theory it shouldnt be a problem. > > If tzcnt is preferable I can make up a patch to convert the intel and msvc > versions to use the required intrinsic. _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel