On Mon, Jun 20, 2016 at 09:54:15AM +0000, Davinder Singh wrote: > On Sat, Jun 18, 2016 at 3:16 AM Michael Niedermayer <mich...@niedermayer.cc> > wrote: > > > On Fri, Jun 17, 2016 at 08:19:00AM +0000, Davinder Singh wrote: > > [...] > > > Yes, I did that, after understanding it completely. It now works with the > > > motion vectors generated by mEstimate filter. Now I’m trying to improve > > it > > > based on this paper: Overlapped Block Motion Compensation: An > > > Estimation-Theoretic Approach > > > > > < > > http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.112.8359&rep=rep1&type=pdf > > > > > > > this is 22 years old > > > > > > > and > > > this one: Window Motion Compensation > > > <https://www.researchgate.net/publication/252182199>.Takes a lot of time > > > > this is 25 years old > > > > not saying old papers are bad, just that this represents the knowledge > > of 20 years ago > > > > also its important to keep in mind that blind block matching of any > > metric will not be enough. To find true motion the whole motion > > vector fields of multiple frames will need to be considered > > > > For example a ball thrown accross the field of view entering and > > exiting the picture needs to move smoothly and at the ends (in time) > > there are frames without the ball then a frame with the ball > > these 2 are not enough to interpolate the frames between as we have > > just one location where the ball is. With the next frames though > > we can find the motion trajectory of the ball and interpolate it end > > to end > > > > I think papers which work on problems like this and also interpolation > > of all the areas that end up overlapping and covering each other > > like the backgroud behind the ball in that example would be better > > starting points for implementing motion estiation because ultimatly > > that is the kind of ME code we would like to have. > > Block matching with various windows, OBMC, ... are all good but > > if in our example the vectors for the ball or background are off that > > will look rather bad with any motion compensation > > So trying to move a bit toward this would make sense but first > > having some motion estimation even really basic and dumb with > > mc working in a testable filter (pair) should probably be done. > > Iam just mentioning this as a bit of a preview of what i hope could > > eventually be implemented, maybe this would be after GSoC but its > > the kind of code needed to have really usable frame interpolation > > > > > > > > > reading them. I think we need to add new Raised Cosine window (weights) > > > along with Linear Window (currently implemented). What do you say? > > > > i dont know, the windows used in snow are already the best of several > > tried (for snow). > > no great gains will be found by changing the OBMC window from snow. > > > > > > > > > > Also making mInterpolate work with variable macroblock size MC. The > > current > > > interpolation works without half pel accuracy, though. > > > > mcfps has fully working 1/4 pel OBMC code, that should be fine to be > > used as is i think unless i miss something > > > > half pel is 20 years old, it is not usefull > > multiple block sizes on the MC side should not really matter ATM > > smaller blocks are a bit slower but first we should get the code > > working, then working with good quality and then working fast. > > > > multiple block sizes may be usefull for the estimation side if it > > improves estimation somehow. > > > > Can i see your current "work in progress" ? > > > > > > [...] > > > I’m moving estimation code to some new file motion_est.c file and the > > > methods are shared by both mEstimate and mInterpolate filters. mEstimate > > > store the MVs in frame’s side data for any other filter. Moreover, any > > > other filter if need post processing on MVs it can directly use the > > shared > > > methods. But, mInterpolate use them internally, no saving in sidedata, > > and > > > saving unnecessary processing. > > > > This design sounds good > > > > > > > > > > > > > Also, Paper [1] doesn’t uses window with OBMC at all. It just find normal > > > average without weight. Perhaps to compare papers I either need to add > > > multiple option for each setting or need to assign the algorithm as > > > researcher’s name in filter options. > > > > > > > Paper [1] and [2] uses functions or do post processing on motion vectors, > so needs fast ME algorithms, which currently I’m working on. [*M] > > Let me summarize the papers (from Email 1, this thread): > > Paper [1]: Zhai et al. (2005) A Low Complexity Motion Compensated Frame > Interpolation Method > > [Quote] > This paper propose a MCFI method intended for real time processing. It > first examines the motion vectors in the bitstream [*1]. 8x8 block size is > used rather than 16x16 as in most cases; Using smaller block size leads to > denser motion field, so neighboring MVs are more highly correlated, so > prediction is better. To reduce complexity, MVs in bitstream are utilized > [*1]. But need to be filtered as not all of them represent true motion. > They are grouped into “good vectors, can be used directly” and “bad > vectors, need to find true motion”. For classification of MVs into groups, > SAD and BAD is used. For an 8x8 block in to-be-interpolated frame F(in) we > get motion vector MV of block at same location in current frame. If F(in) > is exactly middle of F(prev) and F(cur), then MV/2 points to avblock in > prev frame & -MV/2 points to a block in current frame from F(in). Then SAD > & BAD of both of these blocks are compared to certain thresholds [*2], > based on which blocks are classified. For bad ones, overlapped block > bi-directional motion estimation (OBBME) is carried out to find true > motion. In OBBME, the size of block in F(in) is enlarged to 12x12, then > bi-directional ME is performed to get MV that minimizes the diff. between > two block located at MV/2 & -MV/2 in F(prev) & F(cur) wrt current block in > F(in). Diff is calc by eq (1) in Paper. Like in BMA, we can use any fast ME > algo here [*M]. After this, there are still few MVs. For that post > processing is performed on MVs that break the continuity. We calculate the > variation of each motion vector and its neighboring MVs. If variation > exceeds a certain threshold, the MV is regarded as a single bad motion > vector and then vector median filtering is applied. It finds one vector > among 8, that minimizes eq (2). Finally, OBMC is applied. No weights are > used [*3]. Pixels are simple averages given by eq 4-6. > [/Qoute] > > [*1] We can for now use motion vectors generated on filter side. As you > suggested, later we can use decoder’s vectors. > [*2] Threshold values are not given in paper :( > [*3] Initially, we can test using the generated/refined motion vector field > with the currently implemented window based OBMC. Later to reduce > complexity we can use their method. > > > Paper [2]: Choi et al. (2007) Motion-Compensated Frame Interpolation Using > Bilateral Motion Estimation and Adaptive Overlapped Block Motion > Compensation > > [Quote] > This algorithm has four steps. First, we propose the bilateral ME scheme to > obtain the motion field of an interpolated frame. Then, we partition a > frame into several object regions by clustering MVs. We apply the > variable-size block MC (VS-BMC) to object boundaries in order to > reconstruct edge information with a higher quality. Finally, we use the > adaptive overlapped block MC, which adjusts the coefficients of overlapped > windows based on the reliabilities of neighboring MVs. The adaptive OBMC > (AOBMC) can overcome the limitations of the conventional OBMC, such as > over-smoothing and poor de-blocking. > I. We perform bilateral ME which prevents overlapping and hole problem [*4] > by estimating the motion vectors of interpolated frame directly. If the > conventional BMA is used to find a block-wise motion vector field between > the previous frame and the current frame, the motion trajectories may not > cover all pixels in the interpolated frame, consequently yielding hole > regions. In addition, multiple trajectories may pass through the same > pixel, causing overlapping regions. Therefore, we should estimate the > motion vectors for the blocks in the interpolated frame, instead of using > the motion vectors between the previous frame and the current frame. In > proposed Bilateral ME we obtain the MV by comparing a block at a shifted > position in the F(prev) and another block at the opposite position in > F(curr), by minimizing SAD [*5][*M]. Since there can be multiple > trajectories through the current block, we impose a spatial smoothness > constraint to improve robustness of ME. The SMD is calc which is avg. > between abs. px. values at boundary of predicted and neighboring block. We > find best MV by minimizing weighted sum of SAD & SMD given by eq (6). > II. Then MBs are classified into clusters according to MVs. [TL;DR] First, > all MBs are considered as single object and cluster center is set to avg. > MV of blocks. If diff b/w block's MV and threshold T (=8), the block > belongs to new object. The avg. MV of the blocks in new object is set as a > new cluster center. Each cluster center is updated to the avg. of MVs in > the cluster. Steps 2–4 are iteratively repeated until there is no change in > the cluster centers. [/TL;DR] > III. To express complex motions, we adopt VS-BMC to reconstruct boundary > blocks. We adopt a quadtree-based VS-BMC, which divides an 8x8 boundary > block into 4x4 or 2x2 sub-blocks. [*6] Then we find MVs for sub-block using > SAD like before, if new MV is less than 1/4MV of orig. block, accept the > division - iterate it by subdividing it furthermore, or terminate procedure. > IV. Finally adaptive OBMC is used with window such as raised cosine [*7]. > Conventional OBMC can yield blurring or over-smoothing artifacts. AOBMC > reconstructs the interpolated frame faithfully by controlling the weights > of overlapping windows according to the reliabilities of MVs. See Fig. 6 & > 7. > [/Quote] >
> [*M] The shared methods from motion_est.c will allow this without > repetition of code. Just keep in mind the motion estimation we have is a bit mpeg centric so block sizes below 8x8 will not work with all the routines we have > [*5] It is very similar to OBBME used in Paper [1] except the block size is > not changed. > [*6] This required the current mInterpolate code to support variable size > OBMC. > [*7] We could use the linear window instead of raised cosine one. But too > late, I already implemented it. :) > > Another interesting paper I found is 3D recursive search. It's little old > but very popular. See images here: > http://i65.tinypic.com/zkfgox.png > http://i67.tinypic.com/2dihmb7.png > http://i65.tinypic.com/rgw38n.png interresting one thing that is very noticable on this though is that what they use as comparission (full search) in these 3 images is alot worse than what modern encoders use (rate distortion based predictive zonal ME) this shouldnt matter much but i wanted to point out that its not possible from this to conclude how these relate to what a modern video encoder would use as "full search" > > > Paper [3]: de Haan et al. (1994) True Motion Estimation with 3D Recursive > Search Block Matching > (http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=246088) > Gonna read now. It has unusual notation. > > Once we implement these, then we can deal with objects entering or exiting > the screen. I think it is hole or overlapping problem addressed in paper > [2], several approaches have been proposed to handle it like median > filtering, spatial interpolation ( > http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=389461) or MC using > neighboring motion fields. Will look into it more. The hole or overlapping > problem is handled by bilateral motion estimation used in paper [2] (*4). > Also have to handle scene changing issues. I read in some paper that they yes, scene changing will need to be handled too, it was a problem in mcfps too the quick solution is probably to just detect by some threshold that there is a scene change and then set all MVs to 0,0 that will look alot better than random bits of images randomly moving and merging into each other > are too computational expensive. > > Which one do you think we should start with? I think it should be 3DRS. > 3DRS is fastest of these three. Paper 2 compares result of all these three. > 3DRS is around 16fps, [1] is ~7fps. [2] is ~3fps. Paper 2 outperforms both > of them. is the full text of paper 2 available somewhere ? also motion trajectories should be interpolated through more than 2 frames, i dont know if the quoted papers do that but vf_mcfps already provides the framework for this (aka its neccessary to have 2 future and past frames available) a random paper which seems to compare linear vs cubic shows very significant gains http://www.ripublication.com/ijaer16/ijaerv11n10_42.pdf I dont know if that paper is good or not but for example a ball (to keep the example used previously) would move along the edges of a polygon if linear MV interpolation is used, That might work with a slow moving ball but a spinning wheel should be heavily distorted with interpolation along straight lines. [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB it is not once nor twice but times without number that the same ideas make their appearance in the world. -- Aristotle
signature.asc
Description: Digital signature
_______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel