Hi,

2014-10-04 1:19 GMT+02:00 James Almer <jamr...@gmail.com>:
> Or how everything is declared as sse4 even though less than half the code
> actually uses sse4 instructions.

The incorrect insn patch actually caught another issue in qpel_hv (where
packusdw is required), which made the macro mess even messier.

> You already tried to deal with this in "x86: hevc_mc: port to SSSE3 v2", but
> it got blocked by one of the patches that broke x86_32. Maybe it's worth
> looking at again.

Well, I've tried this, but:
- you still need sse4 for WP
- maybe not that useful now, but that was a 10% hit
- having sse4 versions where needed, even when reusing the ssse3 functions,
  increased the total object size to near 500K, hence this patchset

> Hell, quite a few are sse2, even, but the macros are kinda messy and it's
> much easier declaring everything as ssse3/sse4 than micromanaging stuff.

Indeed. I bet most of the sse2 versions are for the 10+ bit versions. I
wouldn't have expected main10 to get such wide acceptance, seeing how avc's
"hi10p" fared, but it did, so people may find this sufficiently desirable to
invest what is needed.

In the end, all of this (clean proxying, ssse3/sse2) is what one would like
to have for neat code, but, except for avx2, none of it would matter to a
non-negligible percentage of ffmpeg's users. One may even argue that having
32-bit asm would be more important.

--
Christophe
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel