On Thu, Nov 19, 2015 at 06:29:23PM +0100, Clément Bœsch wrote:
> On Thu, Nov 19, 2015 at 04:50:54PM +0100, Michael Niedermayer wrote:
> > On Thu, Nov 19, 2015 at 11:48:53AM +0100, Clément Bœsch wrote:
> > > From: Matthieu Bouron <matthieu.bou...@stupeflix.com>
> > >
> > > Signed-off-by: Matthieu Bouron <matthieu.bou...@stupeflix.com>
> > > Signed-off-by: Clément Bœsch <clem...@stupeflix.com>
> > >
> > > ---
> > > The function takes about 29ms with a 1080p source (testsrc2) on a
> > > cortex-a8. Though, 16ms (more than half the time!) is spend in the vst2
> > > call. Any suggestion on how to speed up this?
> > >
> > > Also, the reference code seems to cause some kind of ringing, while our
> > > ASM doesn't:
> > > http://b.pkh.me/nv12-rgba-ref.png
> > > http://b.pkh.me/nv12-rgba-neon.png
> >
> > what did you test exactly here ?
>
> ./ffmpeg -f lavfi -i testsrc2 -vf format=nv12,format=rgba -ss 1 -frames:v 1
> -y nv12-rgba-ref.png
>
> (on ARM though, and with -cpuflags 0)
>
> > but there are several codepathes for rgb output, one uses LUTs and
> > not all use full resolution chroma
> >
>
> Yeah, we noticed...
>
> Note: on x86 there are some yuv2rgb mmx code but it's not called above
> because it doesn't handle nv12 (only yuv420 & friends), so the chroma
> issue is reproducible (it's calling the LUT path).
>
> > >
> > > Last, we noticed that the y_offset is scaled to 1<<9 for some reason we
> > > couldn't figure out. Hopefully we're doing it correctly here.
> >
> > [...]
> > > +.macro compute_half_line dst half_y ofmt
> > > +    vmovl.u8            q7, \half_y          @ 8px of Y
> > > +    vdup.16             q5, r9
> > > +    vsub.s16            q7, q5
> > > +    vmull.s16           q1, d14, d0          @ q1 = (srcY - y_offset) * y_coeff (left)
> > > +    vmull.s16           q2, d15, d0          @ q2 = (srcY - y_offset) * y_coeff (right)
> >
> > if you do something like (srcY) * y_coeff - y_offset2
> > then you could keep a bit more precission in the requested brightness
> > correction
>
> The code in swscale/output.c seems to always use the form we use here. Is
> it on purpose?

if srcY has some extra bits of precision then it should be fine

> > OTOH maybe you want to be bitexact to some existing codepath
> >
>
> Right... I suppose we don't have much tests with custom
> brightness/contrast/saturation. Should I add expose them in vf_scale and
> see how much breaks? :)

contrast/brightness/saturation fate tests are welcome

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

He who knows, does not speak. He who speaks, does not know. -- Lao Tsu
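
To make the precision point about the two luma forms concrete, here is a minimal C sketch. It is not the swscale code: the function names, the 13-bit coefficient scale, the quarter-step offset format and the constant 9539 (roughly 255/219 in that scale) are all assumptions picked for illustration.

```c
#include <stdint.h>

/* All names and constants below are hypothetical, for illustration only. */
#define COEFF_BITS  13  /* fractional bits of the fixed-point luma coefficient */
#define OFFSET_BITS 2   /* fractional bits of the requested offset (1/4 steps) */

/* Round an offset given in 1/4 luma steps to a whole luma step.
 * This is what the "subtract first" form has to work with. */
static inline int32_t offset_whole(int32_t offset_q2)
{
    return (offset_q2 + (1 << (OFFSET_BITS - 1))) >> OFFSET_BITS;
}

/* Pre-multiply the offset by the coefficient, so the result carries
 * COEFF_BITS of fraction instead of being rounded to a whole step. */
static inline int32_t offset_prescaled(int32_t offset_q2, int32_t y_coeff)
{
    return (offset_q2 * y_coeff + (1 << (OFFSET_BITS - 1))) >> OFFSET_BITS;
}

/* Form used in the patch: (srcY - y_offset) * y_coeff */
static inline int32_t luma_sub_then_mul(uint8_t src_y, int32_t y_offset,
                                        int32_t y_coeff)
{
    return ((int32_t)src_y - y_offset) * y_coeff;
}

/* Suggested form: srcY * y_coeff - y_offset2 */
static inline int32_t luma_mul_then_sub(uint8_t src_y, int32_t y_coeff,
                                        int32_t y_offset2)
{
    return (int32_t)src_y * y_coeff - y_offset2;
}
```

With a requested offset of 16.5 (66 in quarter steps), srcY = 100 and y_coeff = 9539, the exact scaled value is 83.5 * 9539 = 796506.5. The first form rounds the offset to 17 and returns 791737, which is off by about 0.58 of an output step after the 13-bit shift. The second form returns 796506, so the error stays below one unit of the 13-bit scale.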
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel