On 9/2/2017 3:29 PM, Clément Bœsch wrote:
> On Sat, Sep 02, 2017 at 02:07:01PM -0300, James Almer wrote:
> [...]
>> +size_t av_cpu_max_align(void)
>> +{
>> +    int av_unused flags = av_get_cpu_flags();
>> +
>> +#if ARCH_ARM || ARCH_AARCH64
>> +    if (flags & AV_CPU_FLAG_NEON)
>> +        return 16;
>> +#elif ARCH_PPC
>> +    if (flags & AV_CPU_FLAG_ALTIVEC)
>> +        return 16;
>
>> +#elif ARCH_X86
>> +    if (flags & AV_CPU_FLAG_AVX)
>> +        return 32;
>> +    if (flags & AV_CPU_FLAG_SSE)
>> +        return 16;
>> +#endif
>
> mmh, will this really work in FFmpeg? I think we have a difference related
> to the flags dependency. Typically, if having SSE2 doesn't imply you have
> SSE. I think you may want to extend the mask.

Mmh, you're right, I forgot we have av_parse_cpu_caps(). What do I do then?
Define two masks with all the CPU flags that would apply for each alignment
value? AVX to AVX2 plus FMA3/4 and the slow variants for 32, then SSE to
SSE4 plus XOP and the slow variants for 16?

> [...]
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
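
For illustration, a minimal sketch of the two-mask idea raised above. Since
flags can be set individually (for instance through av_parse_cpu_caps()), a
higher SSE level may be present without the plain SSE flag, so each alignment
is tested against a whole group of flags. The FLAG_* names and their bit
values are stand-ins invented for this example, not the real AV_CPU_FLAG_*
values; max_align_for_flags() and the fallback of 8 are likewise assumptions.
The grouping follows the one suggested in this message.

```c
#include <stddef.h>

/* Stand-in flag bits for illustration only. Real code would use the
 * AV_CPU_FLAG_* constants from libavutil/cpu.h, whose values differ. */
enum {
    FLAG_SSE      = 1 << 0,
    FLAG_SSE2     = 1 << 1,
    FLAG_SSE2SLOW = 1 << 2,
    FLAG_SSE3     = 1 << 3,
    FLAG_SSE3SLOW = 1 << 4,
    FLAG_SSSE3    = 1 << 5,
    FLAG_SSE4     = 1 << 6,
    FLAG_SSE42    = 1 << 7,
    FLAG_XOP      = 1 << 8,
    FLAG_AVX      = 1 << 9,
    FLAG_AVXSLOW  = 1 << 10,
    FLAG_FMA3     = 1 << 11,
    FLAG_FMA4     = 1 << 12,
    FLAG_AVX2     = 1 << 13,
};

/* Any of these flags is taken to imply 32-byte alignment. */
static const int align32_mask = FLAG_AVX | FLAG_AVXSLOW | FLAG_AVX2 |
                                FLAG_FMA3 | FLAG_FMA4;

/* Any of these flags is taken to imply 16-byte alignment. */
static const int align16_mask = FLAG_SSE | FLAG_SSE2 | FLAG_SSE2SLOW |
                                FLAG_SSE3 | FLAG_SSE3SLOW | FLAG_SSSE3 |
                                FLAG_SSE4 | FLAG_SSE42 | FLAG_XOP;

/* Hypothetical helper: takes the flags as a parameter instead of calling
 * av_get_cpu_flags(), so the logic can be exercised in isolation. */
static size_t max_align_for_flags(int flags)
{
    if (flags & align32_mask)
        return 32;
    if (flags & align16_mask)
        return 16;
    return 8; /* assumed fallback, not taken from the patch */
}
```

With these masks, max_align_for_flags(FLAG_SSE2) yields 16 even though
FLAG_SSE is not set, which is the case the single-flag check would miss.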