On 12/3/2017 3:09 PM, Martin Vignali wrote:
>> 2017-12-03 17:46 GMT+01:00 Paul B Mahol <one...@gmail.com>:
>>
>>> On 12/3/17, Martin Vignali <martin.vign...@gmail.com> wrote:
>>>> Hello,
>>>>
>>>> Maybe you can use a macro for byte and short version,
>>>> only few lines are different in each version
>>>
>>> Sure, feel free to send patches.
>>>
>>> I'm not very macro proficient.
>>>
>>
>> Ok, i will take a look.
>>
>> Martin
>>
>
> I write a basic checkasm test. Seems like the byte version is slower than c
>
> hflip_byte_c: 31.8
> hflip_byte_ssse3: 108.1
> hflip_short_c: 300.1
> hflip_short_ssse3: 139.8
>
> (checkasm patch in attach if you want to test)
>
> Martin

$ tests/checkasm/checkasm.exe --test=vf_hflip --bench
benchmarking with native FFmpeg timers
nop: 32.0
hflip_byte_c: 362.0
hflip_byte_ssse3: 96.0
hflip_short_c: 374.0
hflip_short_ssse3: 121.0

Guess your compiler is really good at optimizing this code, or something
funny is going on. Can you post a disassembly of hflip_byte_c?
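
For reference, a byte-wise horizontal flip in plain C can be as small as the loop below. This is only a minimal sketch and not necessarily the code under test: the name hflip_byte_sketch and the convention that src points at the last sample of the row are assumptions made for illustration. A loop this simple is the kind of thing a compiler may auto-vectorize, which is what a disassembly would confirm or rule out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: mirror one row of 8-bit samples.
 * src points at the LAST sample of the input row (assumed convention),
 * dst points at the first sample of the output row, w is the row width. */
static void hflip_byte_sketch(const uint8_t *src, uint8_t *dst, int w)
{
    assert(src && dst && w >= 0);
    for (int j = 0; j < w; j++)
        dst[j] = src[-j];   /* read the source backwards, write forwards */
}
```

Calling it with src set to the address of the row's final sample and w set to the row width writes the mirrored row into dst.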