2017-10-10 3:16 GMT+02:00 Ivan Kalvachev <ikalvac...@gmail.com>:

> On 10/9/17, Martin Vignali <martin.vign...@gmail.com> wrote:
> > 2017-10-07 18:16 GMT+02:00 Ronald S. Bultje <rsbul...@gmail.com>:
> >
> >> Hi Martin,
> >>
> >> On Sat, Oct 7, 2017 at 11:49 AM, Martin Vignali <martin.vign...@gmail.com>
> >> wrote:
> >>
> >> > 2017-10-07 17:30 GMT+02:00 Ronald S. Bultje <rsbul...@gmail.com>:
> >> > > On Sat, Oct 7, 2017 at 10:22 AM, Martin Vignali <martin.vign...@gmail.com>
> >> > > wrote:
> >> > > > Patch in attach add a new dsp
> >> > > > for manipulation of qmat
> >> > > >
> >> > > > for now, i move this code inside
> >> > > >
> >> > > > for (i = 0; i < 64; i++) {
> >> > > >     qmat_luma_scaled  [i] = ctx->qmat_luma  [i] * qscale;
> >> > > >     qmat_chroma_scaled[i] = ctx->qmat_chroma[i] * qscale;
> >> > > > }
> >> > > >
> >> > > > i add a special case for qscale == 1
> >> > > > and SSE2, AVX2 optimization
> >> > >
> >> > > This loop only executes once per slice. We typically do not SIMD-optimize
> >> > > at that level, because it won't give significant speed gains...
> >> >
> >> > Ok didn't know that.
> >> > I mostly follow, what there are already done, like in blockdsp.clear_block
> >> >
> >>
> >> Right, so consider that blockdsp is done per block (16x16 pixels), not per
> >> slice.
> >>
> > Ok on principle (only improve, a func which is called quite often)
>
> It's more of: We can't refuse code that makes a measurable improvement.
>
> Also have in mind that compilers are getting smarter and this code is
> good target for auto-vectorization. Of course FFmpeg disables it,
> because of long history of compiler bugs related to it.
>
> >> You could remove this entirely from the slice processing code by simply
> >> pre-calculating the values in the init function once for the whole stream,
> >> there's only 224 qscale values so it's 224*64*2 multiplications, which is
> >> (in the context of prores) virtually negligible.
> >>
> >
> > Not sure, we can do that for prores decoder
> > the qmat seems to be set on the decode frame header func
> > (based on the header of the frame).
>
> You can at least check if the qscale has changed and avoid recalculation.
> I think that the lgpl decoder does that.
>

Yes, you're right: the lgpl decoder only recalculates it if qscale (or qmat)
has changed.

I will take a look at this.
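For reference, here is a rough sketch of the "recalculate only when qscale or qmat has changed" idea discussed above. It is not taken from either ProRes decoder: the QmatCache struct, its field names and the three helper functions are hypothetical, and the int32_t element type for the scaled tables is an assumption made only to keep the example free of overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified state; not FFmpeg's actual ProRes context. */
typedef struct QmatCache {
    uint8_t qmat_luma[64];
    uint8_t qmat_chroma[64];
    int32_t qmat_luma_scaled[64];   /* assumption: wide type to avoid overflow */
    int32_t qmat_chroma_scaled[64];
    int     prev_qscale;            /* qscale the cached tables were built for */
    int     qmat_changed;           /* set whenever new matrices are installed */
} QmatCache;

static void qmat_cache_init(QmatCache *c)
{
    memset(c, 0, sizeof(*c));
    c->prev_qscale  = -1;           /* no valid cached tables yet */
    c->qmat_changed = 1;
}

/* Would be called from the frame header parsing when the matrices are read. */
static void qmat_cache_set(QmatCache *c, const uint8_t *luma,
                           const uint8_t *chroma)
{
    memcpy(c->qmat_luma,   luma,   sizeof(c->qmat_luma));
    memcpy(c->qmat_chroma, chroma, sizeof(c->qmat_chroma));
    c->qmat_changed = 1;
}

/* Would be called once per slice. Returns 1 if the scaled tables were
 * rebuilt, 0 if the cached ones were still valid and reused. */
static int qmat_cache_update(QmatCache *c, int qscale)
{
    int i;

    assert(qscale > 0);
    if (!c->qmat_changed && qscale == c->prev_qscale)
        return 0;

    for (i = 0; i < 64; i++) {
        c->qmat_luma_scaled[i]   = c->qmat_luma[i]   * qscale;
        c->qmat_chroma_scaled[i] = c->qmat_chroma[i] * qscale;
    }
    c->prev_qscale  = qscale;
    c->qmat_changed = 0;
    return 1;
}
```

With something like this, consecutive slices that share a qscale skip the 2*64 multiplications, and the tables are rebuilt only when the frame header installs new matrices or the slice qscale differs from the previous one.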
Thanks

Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel