On Sun, Mar 06, 2016 at 04:46:08PM -0300, James Almer wrote:
> On 3/6/2016 4:14 PM, Reimar Döffinger wrote:
> > On Sun, Mar 06, 2016 at 03:49:00PM -0300, James Almer wrote:
> >> On 3/6/2016 3:35 PM, Reimar Döffinger wrote:
> >> Are you sure this wasn't vectorized already? I remember i checked and it
> >> mostly was, at least on gcc 5.3 mingw-w64 with default settings.
> >
> > Then it would hardly get 10% faster, would it (though
> > I admit I didn't test the two parts separately)?
> > But I am fairly sure that before the patch it only
> > used sqrtss instructions and not sqrtps.
>
> Without your patch, GCC 5.3 mingw-w64 x86_64 default settings.
> [...]
>
> Didn't bench but it seems to help GCC vectorize more efficiently so this patch
> is probably ok, especially if in your case it made your compiler actually be
> able to vectorize at all.

Actually, I retract that patch.
It might give a very minor speedup (maybe 1.5%) due to what you saw,
which is basically that gcc now also uses SIMD in the unaligned input path.
However, the big speedup comes from a different change that I accidentally
mixed into this one.