Hi,

On Sun, Jul 12, 2015 at 10:29 AM, Paul B Mahol <one...@gmail.com> wrote:

> On 12 Jul 2015 at 14:18, "Ronald S. Bultje" <rsbul...@gmail.com> wrote:
> >
> > Hi,
> >
> > On Sun, Jul 12, 2015 at 6:48 AM, Paul B Mahol <one...@gmail.com> wrote:
> >
> > > On 12 Jul 2015 at 01:56, "Ronald S. Bultje" <rsbul...@gmail.com> wrote:
> > > >
> > > > ---
> > > >  libavfilter/vf_ssim.c | 5 ++---
> > > >  1 file changed, 2 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/libavfilter/vf_ssim.c b/libavfilter/vf_ssim.c
> > > > index 0721ddd..3ef122f 100644
> > > > --- a/libavfilter/vf_ssim.c
> > > > +++ b/libavfilter/vf_ssim.c
> > > > @@ -134,7 +134,7 @@ static float ssim_end1(int s1, int s2, int ss, int s12)
> > > >           / ((float)(fs1 * fs1 + fs2 * fs2 + ssim_c1) * (float)(vars + ssim_c2));
> > > >  }
> > > >
> > > > -static float ssim_end4(int sum0[5][4], int sum1[5][4], int width)
> > > > +static float ssim_endn(int (*sum0)[4], int (*sum1)[4], int width)
> > > >  {
> > > >      float ssim = 0.0;
> > > >      int i;
> > > > @@ -169,8 +169,7 @@ static float ssim_plane(uint8_t *main, int main_stride,
> > > >                                  &sum0[x]);
> > > >          }
> > > >
> > > > -        for (x = 0; x < width - 1; x += 4)
> > > > -            ssim += ssim_end4(sum0 + x, sum1 + x, FFMIN(4, width - x - 1));
> > > > +        ssim += ssim_endn(sum0, sum1, width - 1);
> > > >      }
> > > >
> > > >      return ssim / ((height - 1) * (width - 1));
> > > > --
> > > > 2.1.2
> > > >
> > >
> > > Why? There was a reason behind this code, I guess.
> > >
> >
> > I think it's for SIMD code simplification. See, I'm guessing the code you
> > took from libvpx had an extra condition to do only 4-sized chunks through a
> > function pointer, and then the odd tail in C code. If you do this, the SIMD
> > code has a fixed size (always 4), which makes the implementation much more
> > trivial: 4 16-byte loads, add, transpose4x4d, and then ssim_end1 to get 4
> > results, which you horizontal-add and return.
> >
>
> I took this from tiny_ssim.c, as pengvado said it's OK to relicense to LGPL.

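For anyone following along, here is a rough standalone sketch of the two call patterns being compared, in plain C rather than SIMD. It is not the patch itself. Assumptions: the constants and the vars/covar setup in ssim_end1, and the loop body of ssim_endn, are modelled on tiny_ssim.c and are not visible in the quoted diff; ssim_chunked4 is a hypothetical name for the old "chunks of 4 plus tail" loop in ssim_plane; FFMIN is redefined locally instead of being pulled from libavutil. The only point is that both patterns sum the same per-position terms.

```c
#include <assert.h>

/* Local stand-in for FFmpeg's FFMIN macro so the sketch builds on its own. */
#define FFMIN(a, b) ((a) < (b) ? (a) : (b))

/* Per-position SSIM term. Only the final division appears in the quoted
 * diff; the constants and the vars/covar setup are assumptions modelled on
 * tiny_ssim.c for 8-bit input. */
static float ssim_end1(int s1, int s2, int ss, int s12)
{
    static const int ssim_c1 = (int)(.01 * .01 * 255 * 255 * 64 + .5);
    static const int ssim_c2 = (int)(.03 * .03 * 255 * 255 * 64 * 63 + .5);
    int fs1   = s1;
    int fs2   = s2;
    int vars  = ss * 64 - fs1 * fs1 - fs2 * fs2;
    int covar = s12 * 64 - fs1 * fs2;

    return (float)(2 * fs1 * fs2 + ssim_c1) * (float)(2 * covar + ssim_c2)
         / ((float)(fs1 * fs1 + fs2 * fs2 + ssim_c1) * (float)(vars + ssim_c2));
}

/* Variable-width form: one call covers `width` positions.
 * Reads sum0[0..width] and sum1[0..width], i.e. width + 1 entries each. */
static float ssim_endn(int (*sum0)[4], int (*sum1)[4], int width)
{
    float ssim = 0.0f;
    int i;

    assert(width >= 0);
    for (i = 0; i < width; i++)
        ssim += ssim_end1(sum0[i][0] + sum0[i + 1][0] + sum1[i][0] + sum1[i + 1][0],
                          sum0[i][1] + sum0[i + 1][1] + sum1[i][1] + sum1[i + 1][1],
                          sum0[i][2] + sum0[i + 1][2] + sum1[i][2] + sum1[i + 1][2],
                          sum0[i][3] + sum0[i + 1][3] + sum1[i][3] + sum1[i + 1][3]);
    return ssim;
}

/* Hypothetical wrapper reproducing the old call pattern: full chunks of 4
 * positions, then a shorter tail chunk handled by the same routine. */
static float ssim_chunked4(int (*sum0)[4], int (*sum1)[4], int width)
{
    float ssim = 0.0f;
    int x;

    for (x = 0; x < width; x += 4)
        ssim += ssim_endn(sum0 + x, sum1 + x, FFMIN(4, width - x));
    return ssim;
}
```

Both functions visit the same positions, so they can differ only in the last bits of the float result, because the partial sums are accumulated in a different order. When s1 == s2 and ss == 2 * s12 at every position (as for identical main and reference input), each term is exactly 1.0 and both return the position count.
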
I think the same reasoning still applies - this will get better
performance, particularly if we consider AVX2.

Ronald
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel