Hi,

On Sun, Jul 12, 2015 at 6:48 AM, Paul B Mahol <one...@gmail.com> wrote:
> On 12. 7. 2015. 01:56, "Ronald S. Bultje" <rsbul...@gmail.com> wrote:
> >
> > ---
> >  libavfilter/vf_ssim.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/libavfilter/vf_ssim.c b/libavfilter/vf_ssim.c
> > index 0721ddd..3ef122f 100644
> > --- a/libavfilter/vf_ssim.c
> > +++ b/libavfilter/vf_ssim.c
> > @@ -134,7 +134,7 @@ static float ssim_end1(int s1, int s2, int ss, int s12)
> >           / ((float)(fs1 * fs1 + fs2 * fs2 + ssim_c1) * (float)(vars + ssim_c2));
> >  }
> >
> > -static float ssim_end4(int sum0[5][4], int sum1[5][4], int width)
> > +static float ssim_endn(int (*sum0)[4], int (*sum1)[4], int width)
> >  {
> >      float ssim = 0.0;
> >      int i;
> > @@ -169,8 +169,7 @@ static float ssim_plane(uint8_t *main, int main_stride,
> >                              &sum0[x]);
> >          }
> >
> > -        for (x = 0; x < width - 1; x += 4)
> > -            ssim += ssim_end4(sum0 + x, sum1 + x, FFMIN(4, width - x - 1));
> > +        ssim += ssim_endn(sum0, sum1, width - 1);
> >      }
> >
> >      return ssim / ((height - 1) * (width - 1));
> > --
> > 2.1.2
> >
>
> Why? There was reason behind this code I guess.

I think it's for simd code simplification. I'm guessing the code you took
from libvpx had an extra condition to do only 4-sized chunks through a
function pointer, and then the odd tail in C code. If you do this, the simd
code has a fixed size (always 4), which makes the implementation much more
trivial: 4 16-byte loads, add, transpose4x4d, and then ssim_end1 to get 4
results, which you horizontal-add and return.

The disadvantage is overhead. First, there is call overhead, since each
4-element chunk requires a function call. Second, there is overhead for
function initialization (anything outside the main loop, either before or
after). This includes the horizontal-add, which is relatively expensive.
Third, it limits us to 16-byte registers: no avx(2).

Doing a variable-size function makes the simd slightly more complex, but is
more future-proof (avx/2) and theoretically faster.

> Does this change results?

No.

Ronald
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
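
For illustration only, here is a rough C sketch of the two call shapes discussed in the reply above: one call per 4-element chunk versus a single variable-width call. It is not the code from vf_ssim.c. The body of ssim_end1 and the values of ssim_c1/ssim_c2 are assumptions modelled on the context line visible in the patch, and ssim_chunked and fill_flat_sums are hypothetical helpers that exist only for this comparison.

```c
#include <assert.h>

/* Assumed constants for 8-bit input and 64-sample windows (not taken from the patch). */
static const int ssim_c1 = (int)(.01 * .01 * 255 * 255 * 64 + .5);
static const int ssim_c2 = (int)(.03 * .03 * 255 * 255 * 64 * 63 + .5);

/* Stand-in for the per-window formula; only its last line appears in the patch. */
static float ssim_end1(int s1, int s2, int ss, int s12)
{
    int vars  = ss * 64 - s1 * s1 - s2 * s2;
    int covar = s12 * 64 - s1 * s2;

    return (float)(2 * s1 * s2 + ssim_c1) * (float)(2 * covar + ssim_c2)
         / ((float)(s1 * s1 + s2 * s2 + ssim_c1) * (float)(vars + ssim_c2));
}

/* Variable-width shape: one call covers `width` results.
 * Reads rows 0..width of both arrays, so each needs width + 1 rows. */
static float ssim_endn(int (*sum0)[4], int (*sum1)[4], int width)
{
    float ssim = 0.0f;
    int i;

    assert(width >= 0);
    for (i = 0; i < width; i++)
        ssim += ssim_end1(sum0[i][0] + sum0[i + 1][0] + sum1[i][0] + sum1[i + 1][0],
                          sum0[i][1] + sum0[i + 1][1] + sum1[i][1] + sum1[i + 1][1],
                          sum0[i][2] + sum0[i + 1][2] + sum1[i][2] + sum1[i + 1][2],
                          sum0[i][3] + sum0[i + 1][3] + sum1[i][3] + sum1[i + 1][3]);
    return ssim;
}

/* Chunked shape (hypothetical helper): one call per group of up to 4 results,
 * which is the pattern the removed loop in the patch used. */
static float ssim_chunked(int (*sum0)[4], int (*sum1)[4], int width)
{
    float ssim = 0.0f;
    int x;

    for (x = 0; x < width; x += 4)
        ssim += ssim_endn(sum0 + x, sum1 + x, width - x < 4 ? width - x : 4);
    return ssim;
}

/* Hypothetical test data: row i holds the sums of a flat 4x4 block with value a
 * in one image and a + 10 in the other, laid out as [s1, s2, ss, s12]. */
static void fill_flat_sums(int (*sum)[4], int rows, int seed)
{
    int i;

    for (i = 0; i < rows; i++) {
        int a = 20 + (i * 53 + seed * 37) % 200;
        int b = a + 10;

        sum[i][0] = 16 * a;
        sum[i][1] = 16 * b;
        sum[i][2] = 16 * (a * a + b * b);
        sum[i][3] = 16 * a * b;
    }
}
```

Both shapes visit the same windows. The chunked one adds per-chunk partial sums, so the float accumulation order differs and the totals may not match bit for bit. The point of the sketch is that the variable-width call leaves the inner loop free to process as many elements per iteration as the register width allows, with setup and the final reduction paid once per row instead of once per chunk.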