On Sat, Jan 12, 2019 at 04:49:40PM +0100, Carl Eugen Hoyos wrote:
> 2019-01-12 16:46 GMT+01:00, Michael Niedermayer <mich...@niedermayer.cc>:
> > On Sat, Jan 12, 2019 at 04:07:42PM +0100, Carl Eugen Hoyos wrote:
> >> 2019-01-04 20:22 GMT+01:00, Michael Niedermayer <mich...@niedermayer.cc>:
> >>
> >> > +static void scaledown(uint8_t *dst, const uint8_t *src, int w)
> >> > +{
> >> > +    int x;
> >> > +    for (x = 0; x < w - 7; x+=8) {
> >> > +        dst[x + 0] = src[2*x + 0];
> >> > +        dst[x + 1] = src[2*x + 2];
> >> > +        dst[x + 2] = src[2*x + 4];
> >> > +        dst[x + 3] = src[2*x + 6];
> >> > +        dst[x + 4] = src[2*x + 8];
> >> > +        dst[x + 5] = src[2*x +10];
> >> > +        dst[x + 6] = src[2*x +12];
> >> > +        dst[x + 7] = src[2*x +14];
> >>
> >> Could you add to the commit message the information
> >> which compiler is able to optimize this?
> >> (Assuming this is a reason for the speedup)
> >
> > if what you ask for is "which compiler turns this into SIMD"
> > i do not know, and i suspect mine does not from the limited
> > increase in performance
> > I think the speedup is primarly from simply unrolling the trivial loop
> >
> > is there something you want me to change in the commit message still ?
>
> No, I am a little surprised that unrolling without SIMD makes
> a difference.

will apply
thx

>
> Thank you for the explanation, Carl Eugen
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

There will always be a question for which you do not know the correct answer.
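
For readers following the thread, here is a minimal self-contained sketch of
the unrolled loop discussed above. The 8-wide body is taken from the quoted
hunk. Everything else is an assumption, since the quote ends before the rest
of the function: the scalar tail loop, the plain reference loop
scaledown_ref() and the self-check scaledown_matches_ref() are hypothetical
and only illustrate that the unrolled form should produce the same output as
a one-pixel-per-iteration loop.

```c
#include <stdint.h>

/* 2:1 horizontal decimation: dst[x] takes every second byte of src.
 * The unrolled body follows the quoted patch hunk. */
static void scaledown(uint8_t *dst, const uint8_t *src, int w)
{
    int x;
    for (x = 0; x < w - 7; x += 8) {
        dst[x + 0] = src[2*x +  0];
        dst[x + 1] = src[2*x +  2];
        dst[x + 2] = src[2*x +  4];
        dst[x + 3] = src[2*x +  6];
        dst[x + 4] = src[2*x +  8];
        dst[x + 5] = src[2*x + 10];
        dst[x + 6] = src[2*x + 12];
        dst[x + 7] = src[2*x + 14];
    }
    /* Assumed tail: handle the remaining w % 8 pixels one at a time. */
    for (; x < w; x++)
        dst[x] = src[2*x];
}

/* Hypothetical reference: the same operation without unrolling. */
static void scaledown_ref(uint8_t *dst, const uint8_t *src, int w)
{
    for (int x = 0; x < w; x++)
        dst[x] = src[2*x];
}

/* Hypothetical self-check: returns 1 if both versions agree for a given
 * width (0 <= w <= 64), 0 otherwise. */
static int scaledown_matches_ref(int w)
{
    uint8_t src[128], a[64], b[64];
    if (w < 0 || w > 64)
        return 0;
    for (int i = 0; i < 128; i++)
        src[i] = (uint8_t)(i * 7 + 3);
    for (int i = 0; i < 64; i++) {
        a[i] = 0;
        b[i] = 0;
    }
    scaledown(a, src, w);
    scaledown_ref(b, src, w);
    for (int i = 0; i < 64; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

Whether a compiler turns either form into SIMD depends on the compiler and
flags, which the thread leaves open. Comparing the two functions under a
benchmark is one way to see how much comes from unrolling alone.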