On Wed, Aug 20, 2014 at 11:54:08AM +0200, Daniel Oberhoff wrote:
> 
> ---
>  Daniel Oberhoff 
>  daniel.oberh...@gmail.com
> 
> 
> 
> On Aug 20, 2014, at 11:47 AM, Michael Niedermayer <michae...@gmx.at> wrote:
> 
> > On Wed, Aug 20, 2014 at 08:14:31AM +0200, Daniel Oberhoff wrote:
> >> 
> >> 
> >> Von meinem iPhone gesendet
> >> 
> >>> Am 20.08.2014 um 03:16 schrieb Michael Niedermayer <michae...@gmx.at>:
> >>> 
> >>>> On Wed, Aug 20, 2014 at 12:12:49AM +0200, Daniel Oberhoff wrote:
> >>>> Hello,
> >>>> 
> >>>> As a follow-up to my last patch I now factored out the floating point 
> >>>> based interpolation from transform.h/transform.c and applied it in the 
> >>>> vf_lenscorrection filter I introduced in my last patch. What I did not 
> >>>> do is also factor out fixed point based interpolation as used in 
> >>>> vf_rotate and vf_perspective and maybe others as could probably also be 
> >>>> done. Also I did not look further for uses of floating point based 
> >>>> interpolation. Basically I just wanted to introduce interpolation in 
> >>>> vf_lenscorrection without code duplication. As a side note I also tried 
> >>>> to introduce fixed point calculations to vf_lenscorrection but found 
> >>>> myself effectively doing floating point “by hand” since due to the high 
> >>>> order of the algorithm (up to 4th order) it is very hard to keep track 
> >>>> of the right amount of pre/post-comma digits for a given step in the 
> >>>> algorithm and given parameters and it felt rather futile after a while.
> >>> 
> >>>> Looking forward to reviews :).
> >>> 
> >>> why did you use the deshake code and not the vf_perspective code ?!
> >>> i suspect this will be quite a bit harder to get into shape for a
> >>> common API
> >>> 
> >> 
> >> As i Said, perspective does fixed point calculations. Also would you care 
> >> to ekaborate what exactly are your whoes with this API?
> > 
> > there where further commments further down in the reply, that list
> > some problems
> > 
> > In addition, being float based is already a problem on its own,
> > we dont really want it to be float based. Its very likely slower and
> > its not bitexact making tests harder.
> 
> Well, as I had outlined fixed point is not a viable option for my algorithm.

blending 4 or 16 8bit samples linearly together works fine in fixed
point. Please see the vf_perspective code, its exactly doing
that.

If you have a problem with fixed point, please show exactly where this
problem is and ill try to help


> 
> > besides a API that works 1 sample at a time redundantly recalculates
> > the Coefficients for each color plane, also doing 1 sample per call
> > isnt very SIMD friendly
> > 
> > or to see this from another point of view
> > what we would like to have is a SSE*/AVX* based generic resampling
> > code using fixed point (with 16bit coeffs for 8bit samples for example)
> > 
> > Theres no need to implement SSE/AVX but the API should be a step
> > toward this, and the patch feels like its going the opposit direction
> > this is even worse as this is common API, so other filters could
> > start using this increasing the amount of code that would eventually
> > be changed/updated/rewritten
> 
> There is some contradiction in your comments. You want to move towards 
> vectorization (btw. how would you do that in ffmpeg? I had previously 
> proposed an sse version using intrinsics and builtin operators, but that was 
> dismissed as not being portable and I was asked to write assembly instead, 
> which I am not proficient in), but you want to do all planes simultanously, 
> even though they are far apart in memory. Imho the gain by being memory local 
> (which is also what you want for vectorization) far outweights the 
> calculation cost for the coefficients.

I think you misunderstand how cpu caches work

theres no reason why reading from 2 or 3 planes sequentially would
be significnatly slower
also i wonder if this filter shouldnt be working in RGB space.

camera sensors work in RGB not YUV and correct gamma correction
for interpolation also is a RGB space thing.
now RGB is generally packed pixel which would avoid the multiple
planes

of course if someone wants to use the filter as a quick correction
for a random video downloaded somewhere thats likely in YUV space


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Avoid a single point of failure, be that a person or equipment.

Attachment: signature.asc
Description: Digital signature

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Reply via email to