Hi Lynne,

On Wed, Jul 1, 2020 at 3:37 PM Lynne <d...@lynne.ee> wrote:
> Jun 29, 2020, 18:58 by hanish...@gmail.com:
>
> > v03-20200629IST2208 fbdetile
> >
> > Added a generic detiling logic, which can be easily configured to
> > detile many different tiling schemes.
> >
> > The same is inturn used to detile Intel Tile-Yf layout.
> >
> > NOTE: This is a full patch, it contains the previous versions also
> > in it.
> >
> > v02-20200627IST2331
> >
> > Unrolled Intel Legacy Tile-Y detiling logic.
> >
> > Also a consolidated patch file, instead of the previous development
> > flow based multiple patch files.
> >
> > v01-20200627IST1308
> >
> > Implemented Intel Legacy Tile-X and Tile-Y detiling logic
> >
> > NOTES:
> >
> > This video filter allows framebuffers which are tiled to be detiled
> > using logic running on the cpu, into a linear layout.
> >
> > Currently it supports Intel Legacy Tile-X and Tile-Y layout detiling,
> > as well as the newer Intel Tile-Yf layouts.
> >
> > THis should help one to work with frames captured (say using kmsgrab)
> > on laptops having Intel GPU. This can be done live while capturing
> > itself, or it can be applied later as a seperate pass.
> >
> > Tile-X conversion logic has been explicitly cross checked, with Tile-X
> > based frames. However Tile-Y and Tile-Yf conv logics havent been tested
> > with Tile-Y | Tile-Yf based frames, but it should potentially get the
> > job done, based on my current understanding of these layout formats.
> >
> > TODO1: At a later time have to generate Tile-Y|Yf based frames, and then
> > cross check the corresponding logic explicitly.
> >
> > TODO2: May be use OpenGL or Vulcan buffer helper routines to do the
> > layout conversion. But some online discussions from sometime back seem
> > to indicate that this path is not fully bug free currently.
> >
>
> Still not happening, I'd like to see this done properly with hwdownload.
> While what you have works as a hack, we're not interested in hacks but
> something that works universally.
> As I said before, it can be easily sped up by a factor of 4 or 8 using
> SIMD, so its unjustifiable to have this in the codebase as a filter.
>

Can you tell me how this is not universal? If anything, embedding it
within hwdownload would limit it to use from a hwcontext, whereas
keeping it as a separate filter allows one to use it either with a hw
context or with frames from any other source. It also gives the
flexibility to do the detiling live or offline.

So I am not sure in what sense you call my current flow restricted, and
a possible version embedded within hwdownload universal.

Also, I am assuming that gcc + libc are sensible enough to use an
appropriate fast memcpy, with say rep movs or SIMD load-stores as the
case may be, depending on the cpu architecture for which the code is
being built. The overhead with FullHD content is negligible.

Beyond that, if required, I have structured the generic detile logic
which I have implemented to do parallel detiling of multiple tiles in
step, which could be easily translated into true parallel detiling in a
hw or multicore setup.

-- 
Keep ;-)
HanishKVC
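
P.S. To make the memcpy point concrete, here is a minimal sketch of
row-wise Tile-X detiling. It is for illustration only and is not the
code from the patch. It assumes the legacy Tile-X geometry of 512 bytes
x 8 rows per 4 KiB tile, tiles stored one after another in row-major
order, and a surface whose width in bytes is a multiple of 512 and
whose height is a multiple of 8. The name detile_tilex and its
parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed legacy Tile-X geometry: 512 bytes wide, 8 rows high (4 KiB). */
#define TILEX_W_BYTES 512
#define TILEX_H_ROWS  8

/*
 * Hypothetical helper: copy a Tile-X surface into a linear buffer.
 * width_bytes must be a multiple of 512 and height a multiple of 8.
 */
static void detile_tilex(uint8_t *dst, size_t dst_stride,
                         const uint8_t *src,
                         size_t width_bytes, size_t height)
{
    size_t tiles_per_row = width_bytes / TILEX_W_BYTES;
    size_t tile_rows     = height / TILEX_H_ROWS;

    for (size_t ty = 0; ty < tile_rows; ty++) {
        for (size_t tx = 0; tx < tiles_per_row; tx++) {
            /* Start of this tile in the tiled source. */
            const uint8_t *tile = src +
                (ty * tiles_per_row + tx) * TILEX_W_BYTES * TILEX_H_ROWS;

            /* Each tile row is contiguous, so one memcpy moves 512 bytes. */
            for (size_t r = 0; r < TILEX_H_ROWS; r++) {
                memcpy(dst + (ty * TILEX_H_ROWS + r) * dst_stride
                           + tx * TILEX_W_BYTES,
                       tile + r * TILEX_W_BYTES,
                       TILEX_W_BYTES);
            }
        }
    }
}
```

Each copy is a contiguous 512-byte run, so the heavy lifting is left to
whatever memcpy the toolchain provides. The iterations of the two outer
loops write to disjoint parts of dst, which is what would allow tiles
to be handled in parallel.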