Hi Mark,

A small additional clarification to my last email, in which I responded to your queries/thoughts.

The additional flexible generic logic that I am currently experimenting with allows the more complex Tile-Yf layout to be detiled with around 50% overhead compared to the targeted Tile-X or Tile-Y implementation, while it handles Tile-X with only about 3% additional overhead compared to the targeted Tile-X implementation. So in that sense the generic logic seems to do well at one level.

So for Tile-X and Tile-Y one uses the targeted logic, while for the more intricate tiled layouts one uses the flexible / configurable generic detile logic.

On Mon, Jun 29, 2020 at 3:10 AM C Hanish Menon <hanish...@gmail.com> wrote:

> Hi Mark,
>
> **** hwdownload vs separate filter
>
> True, for the kmsgrab use-case one could potentially do this transform as part
> of the drm_transfer_data logic (which currently mmaps and does a linear
> copy, if I remember correctly). But as I mentioned in my
> previous email, as this is done on the CPU side, if one wants to capture
> very large framebuffers (say 4K or 8K at high fps), it could impact the
> performance to some extent. In such a situation, decoupling the capture
> from detiling allows one to capture the screen at a very high resolution
> without worrying about detiling, and then handle the detile in an offline /
> separate pass.
>
> NOTE1: Also, as a side note, I don't think the existing logic is currently
> fetching the format modifier of the actual framebuffer. I think it gets
> set to NONE by default and remains like that unless the user passes the
> format_modifier argument, but I could be wrong in this understanding,
> as I have only gone through the code flow quickly once and I
> am in alien territory in some sense.
>
> **** Tile layouts
>
> As it mainly supports Intel tile layouts for now, and as older Intel GPUs
> didn't support the Tile-Y format for scan-out, I think currently most
> set the framebuffer layout to Tile-X for display purposes.
> So in that sense
> the default type of Tile-X which is used by the filter should be fine for
> most cases. However, if one wants, one can change the tile conversion format
> to Tile-Y by passing an argument to the filter. Also, as I wasn't very sure
> that the format modifier is being picked up by default, I used the most
> likely case as the default and in turn provided the option to change the
> layout conversion to use if required.
>
> NOTE2: Tile-X being the default is my understanding based on a quick
> glance through the Intel GPU documents and potentially some things which I
> might have seen online.
>
> NOTE3: I am not much clued into this domain in general, nor tracking
> it. Rather, as I had an issue with some capturing which I wanted to do, I
> went through the ffmpeg kmsgrab + hwup/down and hwcontext code path a bit,
> plus some documents and headers quickly, and then based on a rough logical
> understanding I wanted to implement a quick and flexible solution to solve
> my problem as well as potentially help others who might have a similar
> issue. And that is how this filter got done.
>
> Also I am planning to add an additional generic detile logic later, where
> the user can configure the tile format as a list of direction changes and
> a few other constraints, and then the same logic can handle either TileX or
> TileY or TileYs or TileYf or ... This will be slower (based on some initial
> tests the generic logic seems to be around 50% slower compared to the current
> specific targeted conversion logics which I have implemented), but should
> allow one to try and detile any (or rather more correctly - many) kind of
> tile layouts, as the case may be. Again the idea is to use this generic
> path as an offline / second pass.
>
>
> On Mon, Jun 29, 2020 at 2:28 AM Mark Thompson <s...@jkqxz.net> wrote:
>
>> On 27/06/2020 20:57, hanishkvc wrote:
>> > v02-20200627IST2331
>> >
>> > Unrolled Intel Legacy Tile-Y detiling logic.
>> >
>> > Also a consolidated patch file, instead of the previous development
>> > flow based multiple patch files.
>> >
>> > v01-20200627IST1308
>> >
>> > Implemented Intel Legacy Tile-X and Tile-Y detiling logic
>> >
>> > NOTES:
>> >
>> > This video filter allows framebuffers which are tiled to be detiled
>> > using logic running on the cpu, into a linear layout.
>> >
>> > Currently it supports Intel Legacy Tile-X and Tile-Y layout detiling.
>> > This should help one to work with frames captured (say using kmsgrab)
>> > on laptops having Intel GPU.
>> >
>> > Tile-X conversion logic has been explicitly cross checked, with Tile-X
>> > based frames. However Tile-Y conv logic hasn't been tested with Tile-Y
>> > based frames, but it should potentially do the job, based on my current
>> > understanding of the Tile-Y layout format.
>> >
>> > TODO1: At a later time have to generate Tile-Y based frames, and then
>> > cross check the corresponding logic explicitly.
>> >
>> > TODO2: May be use OpenGL or Vulkan buffer helper routines to do the
>> > layout conversion. But some online discussions from sometime back seem
>> > to indicate that this path is not fully bug free currently.
>> > ---
>> >  Changelog                 |   1 +
>> >  doc/filters.texi          |  62 ++++++++
>> >  libavfilter/Makefile      |   1 +
>> >  libavfilter/allfilters.c  |   1 +
>> >  libavfilter/vf_fbdetile.c | 309 ++++++++++++++++++++++++++++++++++++++
>> >  5 files changed, 374 insertions(+)
>> >  create mode 100644 libavfilter/vf_fbdetile.c
>>
>> For your kmsgrab use-case I think you are doing this in the wrong place.
>> There is already a copy during the download step (the hwdownload filter
>> before this), and that does know what the tiling mode
>> is such that it could detile transparently without a need for an extra
>> filter doing another copy. See drm_transfer_data_from() in
>> libavutil/hwcontext_drm.c, which currently just does the linear copy
>> you observe regardless of the format modifier on the input buffer.
>>
>> Unrelated to the previous point, does the dependence of the actual layout
>> of the X and Y tiled formats on the exact model of GPU in use cause any
>> problems here? If the layout is actually the same on
>> everything people might use nowadays then it's probably fine; if that
>> isn't true then maybe it needs some extra check.
>>
>> - Mark
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel@ffmpeg.org
>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>
>> To unsubscribe, visit link above, or email
>> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
>
>
>
> --
> Keep ;-)
> HanishKVC
>

--
Keep ;-)
HanishKVC
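
For reference, the kind of targeted Tile-X detiling discussed in this thread can be sketched roughly as below. This is only a minimal sketch and not the code from vf_fbdetile.c. It assumes the legacy Intel Tile-X layout of 4096-byte tiles, each 512 bytes wide by 8 rows, with rows stored back to back inside a tile and tiles stored left to right, then top to bottom, across the surface. As raised earlier in the thread, the exact layout may depend on the GPU model, so treat these constants as assumptions. The name detile_tilex and all of its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed legacy Intel Tile-X geometry: 512 bytes x 8 rows = 4096 bytes. */
#define TILEX_WIDTH_BYTES 512
#define TILEX_HEIGHT_ROWS 8
#define TILEX_SIZE_BYTES  (TILEX_WIDTH_BYTES * TILEX_HEIGHT_ROWS)

/*
 * Hypothetical helper: copy a Tile-X surface at src into a linear buffer
 * at dst. All sizes are in bytes except height, which is in rows.
 * src must cover whole tile rows, i.e. height rounded up to a multiple
 * of 8 rows times src_pitch bytes.
 * Returns 0 on success, -1 if the pitches do not fit the assumed layout.
 */
static int detile_tilex(uint8_t *dst, size_t dst_pitch,
                        const uint8_t *src, size_t src_pitch,
                        size_t width_bytes, size_t height)
{
    if (src_pitch % TILEX_WIDTH_BYTES != 0 ||
        width_bytes > src_pitch || dst_pitch < width_bytes)
        return -1;

    size_t tiles_per_row = src_pitch / TILEX_WIDTH_BYTES;

    for (size_t y = 0; y < height; y++) {
        size_t tile_row    = y / TILEX_HEIGHT_ROWS; /* which row of tiles   */
        size_t row_in_tile = y % TILEX_HEIGHT_ROWS; /* row inside that tile */

        for (size_t x = 0; x < width_bytes; x += TILEX_WIDTH_BYTES) {
            size_t tile_col = x / TILEX_WIDTH_BYTES;
            size_t src_off  = (tile_row * tiles_per_row + tile_col) * TILEX_SIZE_BYTES
                            + row_in_tile * TILEX_WIDTH_BYTES;
            size_t n = width_bytes - x;   /* last tile column may be partial */
            if (n > TILEX_WIDTH_BYTES)
                n = TILEX_WIDTH_BYTES;
            memcpy(dst + y * dst_pitch + x, src + src_off, n);
        }
    }
    return 0;
}
```

Because each row inside a Tile-X tile is already contiguous, the copy can move up to 512 bytes at a time with memcpy. A Tile-Y style layout splits each tile into narrower columns, so the same approach would need a smaller copy unit and a different offset calculation.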