Re: [Mesa-dev] [PATCH v3 0/3] Software rendering in EGL-DRM
Hello Giovanni,

I have recently been working on a DRM/KMS driver which does not support
OpenGL rendering (it only provides plane composition functionalities):
[1].

If I understand correctly your patch series might solve some of the
issues I am facing.

I'm trying to get wayland working with HW cursor and several planes,
the problem is that it depends on GBM to provide drm plane and drm
cursor support.

I tried to get EGL working with my DRM device and it always asks for an
atmel-hlcdc_dri.so module (I have applied this patch [2] to get to this
point).

First of all, am I mistaken in thinking this series could solve my
issue?

If not, could you tell me on which branch (or which tag) you based
your work?

I'm asking this because I tried to apply your patches on top of the
master branch (a few days ago), and after fixing some conflicts I got a
segfault (sorry, I don't have the backtrace anymore :-(, but this was
related to a negative stride value which was causing faulty memory
accesses).

Yesterday I tried to rebase your patches on top of the latest master
branch modifications, and it seems the gallium dri module architecture
has been completely reworked ([3]); now I get an
'undefined symbol: dd_create_screen' error when mesa tries to load
kms_swrast_dri.so.

Sorry if my questions seem stupid to you, but I'm new to graphics-related
development :-).

Best Regards,

Boris

[1] https://lkml.org/lkml/2014/6/9/487
[2] http://thread.gmane.org/gmane.comp.video.mesa3d.devel/66385
[3] http://thread.gmane.org/gmane.comp.video.mesa3d.devel/78175

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 0/3] Software rendering in EGL-DRM
On Thu, 3 Jul 2014 11:14:48 +0200
Giovanni Campagna wrote:

> 2014-07-03 10:48 GMT+02:00 Boris BREZILLON:
> > Hello Giovanni,
> >
> > I have recently been working on a DRM/KMS driver which does not support
> > OpenGL rendering (it only provides plane composition functionalities):
> > [1].
> >
> > If I understand correctly your patch series might solve some of the
> > issues I am facing.
>
> It might get you a working EGL, but it's not a complete solution,
> because buffer management is limited to linear CPU-addressable "dumb"
> buffers, which is probably not the most efficient choice (although how
> much slower it gets depends on the driver and on the HW).

I'm only providing DUMB ioctls (through the CMA GEM implementation), so
that should be enough for me, shouldn't it?

> > I'm trying to get wayland working with HW cursor and several planes,
> > the problem is that it depends on GBM to provide drm plane and drm
> > cursor support.
> >
> > I tried to get EGL working with my DRM device and it always asks for an
> > atmel-hlcdc_dri.so module (I have applied this patch [2] to get to this
> > point).
> >
> > First of all, am I mistaken in thinking this series could solve my
> > issue?
>
> Indeed, using my patch stack (patches 2 and 3) you will have a working
> GBM device that will allocate GPU memory using the "dumb" interface.
> If your driver is then able to upload these buffers to the plane HW (or
> directly capable of allocating in GPU memory), that may be good enough
> for you.
> OTOH, this will not provide the wayland clients with the ability to
> render directly to the plane buffers, because the "dumb" interface
> does not provide global names that can be shared between processes,
> therefore clients will have to render into a shared memory location,
> that then the wayland compositor (weston, I assume) will have to
> memcpy into a GBM allocated buffer.

Indeed, I'm using weston (I just forgot to mention it).

> If you want to avoid that, you will need to design an ioctl interface
> for your driver to allocate buffers, then write a "winsys" for the
> userspace side that uses those ioctls (directly or through libdrm) -
> first it allocates the buffer with your driver specific ioctl and then
> calls GEM_FLINK to get the global name, which can be passed to weston
> and in there to gbm_bo_import().
> If your HW is incapable of GL rendering (and thus you want to use SW
> rendering always) it is quite likely that your driver will not be that
> different from dri_kms_swrast, except that it will be able to share
> buffers (patch 3) using GEM names.

Okay, thanks for these explanations.
I'll try to get dri_kms_swrast working first and then see if I need
performance improvements ;-).

> > If not, could you tell me on which branch (or which tag) you based
> > your work?
> >
> > I'm asking this because I tried to apply your patches on top of the
> > master branch (a few days ago), and after fixing some conflicts I got a
> > segfault (sorry, I don't have the backtrace anymore :-(, but this was
> > related to a negative stride value which was causing faulty memory
> > accesses).
>
> The patches were made against an old version of mesa, and the build
> system was updated meanwhile. Emil said he will rebase them, and that
> will happen in a couple days. You should just wait until they land.

Perfect!

Emil, could you Cc me on this future submission?

Thanks for taking the time to answer my questions.

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
Re: [Mesa-dev] [PATCH v3 0/3] Software rendering in EGL-DRM
On Thu, 3 Jul 2014 12:24:44 +0300
Pekka Paalanen wrote:

> On Thu, 3 Jul 2014 10:48:26 +0200
> Boris BREZILLON wrote:
>
> > Hello Giovanni,
> >
> > I have recently been working on a DRM/KMS driver which does not support
> > OpenGL rendering (it only provides plane composition functionalities):
> > [1].
> >
> > If I understand correctly your patch series might solve some of the
> > issues I am facing.
> >
> > I'm trying to get wayland working with HW cursor and several planes,
> > the problem is that it depends on GBM to provide drm plane and drm
> > cursor support.
>
> Which compositor? All the dependencies are in the compositors, not
> Wayland per se.

Sorry, I meant weston, not wayland.

> If it is Weston, have you tried --use-pixman to avoid EGL altogether?
> I see Weston still tries to use GBM for cursor fbs, while primary fbs
> are dumb buffers, but then again, I'm not sure if cursors are supposed
> to support dumb buffers.

Yes, weston works fine with --use-pixman, but then I can't use the HW
cursor and DRM overlay planes (because they require GBM).

> Weston's overlay planes code however totally depends on EGL to provide
> hw-capable buffers from clients. A software renderer support in EGL-DRM
> won't help in that.

Okay, if I understand correctly, this means I'll have to implement an
atmel-hlcdc_dri.so module (as suggested by Giovanni), right?

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
Re: [Mesa-dev] [PATCH v3 0/3] Software rendering in EGL-DRM
On Thu, 3 Jul 2014 13:49:06 +0300
Pekka Paalanen wrote:

> On Thu, 3 Jul 2014 12:10:36 +0200
> Boris BREZILLON wrote:
>
> > On Thu, 3 Jul 2014 12:24:44 +0300
> > Pekka Paalanen wrote:
> >
> > > On Thu, 3 Jul 2014 10:48:26 +0200
> > > Boris BREZILLON wrote:
> > >
> > > > Hello Giovanni,
> > > >
> > > > I have recently been working on a DRM/KMS driver which does not support
> > > > OpenGL rendering (it only provides plane composition functionalities):
> > > > [1].
> > > >
> > > > If I understand correctly your patch series might solve some of the
> > > > issues I am facing.
> > > >
> > > > I'm trying to get wayland working with HW cursor and several planes,
> > > > the problem is that it depends on GBM to provide drm plane and drm
> > > > cursor support.
> > >
> > > Which compositor? All the dependencies are in the compositors, not
> > > Wayland per se.
> >
> > Sorry, I meant weston, not wayland.
> >
> > > If it is Weston, have you tried --use-pixman to avoid EGL altogether?
> > > I see Weston still tries to use GBM for cursor fbs, while primary fbs
> > > are dumb buffers, but then again, I'm not sure if cursors are supposed
> > > to support dumb buffers.
> >
> > Yes, weston works fine with --use-pixman, but then I can't use the HW
> > cursor and DRM overlay planes (because they require GBM).
> >
> > > Weston's overlay planes code however totally depends on EGL to provide
> > > hw-capable buffers from clients. A software renderer support in EGL-DRM
> > > won't help in that.
> >
> > Okay, if I understand correctly, this means I'll have to implement an
> > atmel-hlcdc_dri.so module (as suggested by Giovanni), right?
>
> Well, uhh, I suppose...
>
> That means you need to get clients actually rendering into hw-capable
> buffers, so that usually means having a GL(ES) rendering working on
> EGL Wayland platform, too.
>
> Or, clients could use something like libva(?) to fill the buffers and
> use Mesa's internal wl_drm protocol to pass those to the compositor. If
> the compositor is able to initialize EGL_WL_bind_wayland_display
> extension, then with Mesa, the clients will have wl_drm available.
> Still probably requires working GLESv2 rendering for the EGL DRM/GBM
> platform, because the pixman renderer cannot handle anything else than
> wl_shm buffers.
>
> Or, you might hack Weston to copy pixels from wl_shm buffers into dumb
> buffers, and put those into overlays (err, did dumb buffers support
> going on overlays, or were they primary plane only?). But if you have
> like less than 10 overlays in hw, that might be a net loss in
> performance.

I have, at most, 3 overlays (it depends on the HLCDC IP version), so
this might be an acceptable solution.

OTOH, I'd like to keep the implementation as clean as possible in order
to be able to base future work on official weston versions (and tell me
if I'm wrong, but I'm not sure the proposed solution can ever make it to
the official weston version).

> Or, you might go wild and have a hack on my hacky zlinux_dmabuf support
> in weston:
> http://cgit.collabora.com/git/user/pq/weston.git/log/?h=dmabuf-WIP
> It is missing pixman-renderer integration, and the test client is
> intel-only, but if you hack around those, you can have clients filling
> dmabufs, sending those to Weston, and Weston using GBM to import them
> to put them on overlays via DRM - unless the scenegraph forces them to
> go through pixman-renderer in which case you are in a slight pickle.

That sounds interesting!
I'll take a closer look at it.

> Warning: that weston branch may get rewritten or deleted without notice.
>
> I guess the take-away from this is that DRM overlay planes have not
> really been considered for use with server nor client software rendering
> in Weston.

Yes, I kinda realize that now.

My main goal here is to provide a video player demo application where
the primary plane (or an overlay plane) is used to display video player
controls (play, pause, ...) and another plane is used to display video
content (using gstreamer or any other alternative).

This needs to be done using overlays in order to get acceptable
performance (avoid software rendering for plane composition), and
thus should use drm overlay planes.

I thought about developing or using an existing Qt application, but
AFAIU, the problem remains the same with the Qt DRM/KMS backend: it
depends on EGL/GBM.

Please let me know if you have any other ideas.

Thanks.

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
Re: [Mesa-dev] [PATCH v3 0/3] Software rendering in EGL-DRM
On Thu, 3 Jul 2014 15:46:14 +0300
Pekka Paalanen wrote:

> On Thu, 3 Jul 2014 14:15:34 +0200
> Boris BREZILLON wrote:
>
> > On Thu, 3 Jul 2014 13:49:06 +0300
> > Pekka Paalanen wrote:
> >
> > > On Thu, 3 Jul 2014 12:10:36 +0200
> > > Boris BREZILLON wrote:
> > >
> > > > On Thu, 3 Jul 2014 12:24:44 +0300
> > > > Pekka Paalanen wrote:
> > > > > Weston's overlay planes code however totally depends on EGL to provide
> > > > > hw-capable buffers from clients. A software renderer support in EGL-DRM
> > > > > won't help in that.
> > > >
> > > > Okay, if I understand correctly, this means I'll have to implement an
> > > > atmel-hlcdc_dri.so module (as suggested by Giovanni), right?
> > >
> > > Well, uhh, I suppose...
> > >
> > > That means you need to get clients actually rendering into hw-capable
> > > buffers, so that usually means having a GL(ES) rendering working on
> > > EGL Wayland platform, too.
> > >
> > > Or, clients could use something like libva(?) to fill the buffers and
> > > use Mesa's internal wl_drm protocol to pass those to the compositor. If
> > > the compositor is able to initialize EGL_WL_bind_wayland_display
> > > extension, then with Mesa, the clients will have wl_drm available.
> > > Still probably requires working GLESv2 rendering for the EGL DRM/GBM
> > > platform, because the pixman renderer cannot handle anything else than
> > > wl_shm buffers.
> > >
> > > Or, you might hack Weston to copy pixels from wl_shm buffers into dumb
> > > buffers, and put those into overlays (err, did dumb buffers support
> > > going on overlays, or were they primary plane only?). But if you have
> > > like less than 10 overlays in hw, that might be a net loss in
> > > performance.
> >
> > I have, at most, 3 overlays (it depends on the HLCDC IP version), so
> > this might be an acceptable solution.
> >
> > OTOH, I'd like to keep the implementation as clean as possible in order
> > to be able to base future work on official weston versions (and tell me
> > if I'm wrong, but I'm not sure the proposed solution can ever make it to
> > the official weston version).
> >
> > > Or, you might go wild and have a hack on my hacky zlinux_dmabuf support
> > > in weston:
> > > http://cgit.collabora.com/git/user/pq/weston.git/log/?h=dmabuf-WIP
> > > It is missing pixman-renderer integration, and the test client is
> > > intel-only, but if you hack around those, you can have clients filling
> > > dmabufs, sending those to Weston, and Weston using GBM to import them
> > > to put them on overlays via DRM - unless the scenegraph forces them to
> > > go through pixman-renderer in which case you are in a slight pickle.
> >
> > That sounds interesting!
> > I'll take a closer look at it.
>
> Note, that the protocol there does not address the problem of
> compatibility at all, and the implementation does not even advertise any
> pixel formats. It is all based on luck and clairvoyance, that the client
> just happens to create exactly the right kind of dmabufs that the
> compositor can use. If you fail that, the client gets kicked or you
> get a mess on the screen. Obviously not upstreamable material. ;-)
>
> > > Warning: that weston branch may get rewritten or deleted without notice.
> > >
> > > I guess the take-away from this is that DRM overlay planes have not
> > > really been considered for use with server nor client software rendering
> > > in Weston.
> >
> > Yes, I kinda realize that now.
> >
> > My main goal here is to provide a video player demo application where
> > the primary plane (or an overlay plane) is used to display video player
> > controls (play, pause, ...) and another plane is used to display video
> > content (using gstreamer or any other alternative).
> >
> > This needs to be done using overlays in order to get acceptable
> > performance (avoid software rendering for plane composition), and
> > thus should use drm overlay planes.
>
> Oh, a video player! How do you get the video frames? Do you have
> hardware decoding? Can you perhaps decode straight into dmabufs? If
> yes, then you could use zlinux_dmabuf to throw those video frames to
> Weston zero-copy. Then the tricky part is to make Weston cope with those
> video frame buffers, as if they even attempt to go throu
Re: [Mesa-dev] [PATCH v3 0/3] Software rendering in EGL-DRM
Hello Giovanni,

On Thu, 3 Jul 2014 11:14:48 +0200
Giovanni Campagna wrote:

> 2014-07-03 10:48 GMT+02:00 Boris BREZILLON:
> > Hello Giovanni,
> >
> > I have recently been working on a DRM/KMS driver which does not support
> > OpenGL rendering (it only provides plane composition functionalities):
> > [1].
> >
> > If I understand correctly your patch series might solve some of the
> > issues I am facing.
>
> It might get you a working EGL, but it's not a complete solution,
> because buffer management is limited to linear CPU-addressable "dumb"
> buffers, which is probably not the most efficient choice (although how
> much slower it gets depends on the driver and on the HW).
>
> > I'm trying to get wayland working with HW cursor and several planes,
> > the problem is that it depends on GBM to provide drm plane and drm
> > cursor support.
> >
> > I tried to get EGL working with my DRM device and it always asks for an
> > atmel-hlcdc_dri.so module (I have applied this patch [2] to get to this
> > point).
> >
> > First of all, am I mistaken in thinking this series could solve my
> > issue?
>
> Indeed, using my patch stack (patches 2 and 3) you will have a working
> GBM device that will allocate GPU memory using the "dumb" interface.
> If your driver is then able to upload these buffers to the plane HW (or
> directly capable of allocating in GPU memory), that may be good enough
> for you.
> OTOH, this will not provide the wayland clients with the ability to
> render directly to the plane buffers, because the "dumb" interface
> does not provide global names that can be shared between processes,
> therefore clients will have to render into a shared memory location,
> that then the wayland compositor (weston, I assume) will have to
> memcpy into a GBM allocated buffer.
> If you want to avoid that, you will need to design an ioctl interface
> for your driver to allocate buffers, then write a "winsys" for the
> userspace side that uses those ioctls (directly or through libdrm) -
> first it allocates the buffer with your driver specific ioctl and then
> calls GEM_FLINK to get the global name, which can be passed to weston
> and in there to gbm_bo_import().
> If your HW is incapable of GL rendering (and thus you want to use SW
> rendering always) it is quite likely that your driver will not be that
> different from dri_kms_swrast, except that it will be able to share
> buffers (patch 3) using GEM names.

I'm just curious: what are you using this dri_kms_swrast implementation
for?
Okay, my real question here is: are there other people trying to do what
I'm doing, or do you need it for another use case :-)?

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
Re: [Mesa-dev] [PATCH] meson: egl: correctly manage loader/xmlconfig
On Thu, 14 Feb 2019 11:43:13 +
Emil Velikov wrote:

> From: Emil Velikov
>
> Earlier commit introduced support for haiku yet did not properly
> annotate the loader/xmlconfig dependencies.
>
> Thus we ended up adding inc_loader for each !haiku platform - see
> 659910eda01 9a96bf0ecd0 731508b988 cec6cb01e216.
>
> One piece remained though - the wayland platform. Hence the following
> would fail:
>
>   meson -Dgallium-drivers=etnaviv -Ddri-drivers=''\
>     -Dtools=etnaviv -Dplatforms=wayland -Dglx=disabled \
>     build/
>
> Cc: Alexander von Gluck IV
> Cc: Dylan Baker
> Cc: Boris Brezillon
> Reported-by: Boris Brezillon
> Fixes: 834d221512f ("meson: Add Haiku platform support v4")
> Signed-off-by: Emil Velikov

Tested-by: Boris Brezillon

Thanks,

Boris

> ---
>  src/egl/meson.build | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/src/egl/meson.build b/src/egl/meson.build
> index a23cc36fc2b..b7ff09e9fed 100644
> --- a/src/egl/meson.build
> +++ b/src/egl/meson.build
> @@ -93,10 +93,11 @@ if with_dri2
>      'drivers/dri2/egl_dri2.h',
>      'drivers/dri2/egl_dri2_fallbacks.h',
>    )
> +  link_for_egl += [libloader, libxmlconfig]
> +  incs_for_egl += inc_loader
>  
>    if with_platform_x11
>      files_egl += files('drivers/dri2/platform_x11.c')
> -    incs_for_egl += inc_loader
>      if with_dri3
>        files_egl += files('drivers/dri2/platform_x11_dri3.c')
>        link_for_egl += libloader_dri3_helper
> @@ -105,13 +106,12 @@ if with_dri2
>    endif
>    if with_platform_drm
>      files_egl += files('drivers/dri2/platform_drm.c')
> -    link_for_egl += [libloader, libgbm, libxmlconfig]
> -    incs_for_egl += [inc_loader, inc_gbm, include_directories('../gbm/main')]
> +    link_for_egl += libgbm
> +    incs_for_egl += [inc_gbm, include_directories('../gbm/main')]
>      deps_for_egl += dep_libdrm
>    endif
>    if with_platform_surfaceless
>      files_egl += files('drivers/dri2/platform_surfaceless.c')
> -    incs_for_egl += [inc_loader]
>    endif
>    if with_platform_wayland
>      deps_for_egl += [dep_wayland_client, dep_wayland_server, dep_wayland_egl_headers]
> @@ -127,7 +127,6 @@ if with_dri2
>    if with_platform_android
>      deps_for_egl += dep_android
>      files_egl += files('drivers/dri2/platform_android.c')
> -    incs_for_egl += [inc_loader]
>    endif
>  elif with_platform_haiku
>    incs_for_egl += inc_haikugl
> @@ -166,7 +165,7 @@ libegl = shared_library(
>      '-D_EGL_NATIVE_PLATFORM=_EGL_PLATFORM_@0@'.format(egl_native_platform.to_upper()),
>    ],
>    include_directories : incs_for_egl,
> -  link_with : [link_for_egl, libloader, libxmlconfig, libglapi, libmesa_util],
> +  link_with : [link_for_egl, libglapi, libmesa_util],
>    link_args : [ld_args_bsymbolic, ld_args_gc_sections],
>    dependencies : [deps_for_egl, dep_dl, dep_libdrm, dep_clock, dep_thread],
>    install : true,
[Mesa-dev] [RFC PATCH 1/2] etnaviv: add support for CPU-based super/multi tile tiling/untiling
Sometimes using CPU-based tiling/untiling makes sense, for example when
updating a really small region of a texture atlas which is not
RS-aligned.

CPU-based tiling was already supported for the simple tile layout, but
not for the multi/super tile cases. Make the etna_texture_[un]tile()
functions more generic to support those cases.

Signed-off-by: Boris Brezillon
---
 src/gallium/drivers/etnaviv/etnaviv_tiling.c  | 85 ---
 src/gallium/drivers/etnaviv/etnaviv_tiling.h  | 10 ++-
 .../drivers/etnaviv/etnaviv_transfer.c        | 16 ++--
 3 files changed, 84 insertions(+), 27 deletions(-)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_tiling.c b/src/gallium/drivers/etnaviv/etnaviv_tiling.c
index f4f85c1d6e6c..7e2b8bd48d3a 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_tiling.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_tiling.c
@@ -32,39 +32,95 @@
 #define TEX_TILE_WIDTH (4)
 #define TEX_TILE_HEIGHT (4)
 #define TEX_TILE_WORDS (TEX_TILE_WIDTH * TEX_TILE_HEIGHT)
+#define TEX_SUPERTILE_WIDTH (64)
+#define TEX_SUPERTILE_HEIGHT (64)
+#define TEX_SUPERTILE_TWIDTH (16)
+#define TEX_SUPERTILE_THEIGHT (16)
+#define TEX_SUPERTILE_WORDS (TEX_SUPERTILE_WIDTH * TEX_SUPERTILE_HEIGHT)
+#define TEX_TILES_PER_SUPERTILE (TEX_SUPERTILE_TWIDTH * TEX_SUPERTILE_THEIGHT)
+
+static unsigned tile_buf_offset(enum etna_surface_layout tlayout,
+                                unsigned tx, unsigned ty,
+                                unsigned tstride, unsigned th,
+                                unsigned elmtsize)
+{
+   unsigned offs = 0, tile;
+   unsigned tiles_per_line = (tstride / elmtsize) / TEX_TILE_WIDTH;
+
+   /* Multi tile layouts are described here:
+    * https://github.com/laanwj/etna_viv/blob/master/doc/hardware.md#multi-tiling
+    */
+   if (tlayout & ETNA_LAYOUT_BIT_MULTI) {
+      if ((((tx / (TEX_TILE_WIDTH * 2)) & 1) ^ ((ty / TEX_TILE_HEIGHT) & 1))) {
+         offs += ((tstride / elmtsize) * (th / 2));
+         if ((tx / (TEX_TILE_WIDTH * 2)) & 1)
+            tx -= TEX_TILE_WIDTH * 2;
+         else
+            tx += TEX_TILE_HEIGHT * 2;
+      }
+
+      ty = ((ty / 2) & ~(TEX_TILE_HEIGHT - 1)) +
+           (ty % TEX_TILE_HEIGHT);
+   }
+
+   if (tlayout & ETNA_LAYOUT_BIT_SUPER) {
+      /* We use the supertile layout described here:
+       * https://github.com/laanwj/etna_viv/blob/master/doc/hardware.md#supertiling
+       * FIXME: According to
+       * https://github.com/laanwj/etna_viv/blob/master/tools/detiler.py
+       * another layout exists. We should probably support CPU-based detiling
+       * for this layout too and determine the layout to used based on HW
+       * features. */
+      tile = (ty / TEX_SUPERTILE_HEIGHT) * tiles_per_line *
+             TEX_SUPERTILE_THEIGHT;
+      ty &= TEX_SUPERTILE_HEIGHT - 1;
+      tile += (tx / TEX_SUPERTILE_WIDTH) * TEX_TILES_PER_SUPERTILE;
+      tx &= TEX_SUPERTILE_WIDTH - 1;
+      tile += (ty / (4 * TEX_TILE_HEIGHT)) * 16 * 4;
+      ty &= (4 * TEX_TILE_HEIGHT) - 1;
+      tile += (ty / TEX_TILE_HEIGHT) * 2;
+      tile += ((tx / TEX_TILE_WIDTH) & ~0x1) * 4;
+      tile += (tx / TEX_TILE_WIDTH) & 0x1;
+   } else {
+      tile = (ty / TEX_TILE_HEIGHT) * tiles_per_line;
+      tile += tx / TEX_TILE_WIDTH;
+   }
+
+   offs += tile * TEX_TILE_WORDS;
+   ty &= TEX_TILE_HEIGHT - 1;
+   tx &= TEX_TILE_WIDTH - 1;
+   offs += (ty * TEX_TILE_WIDTH) + tx;
+
+   return offs;
+}
 
 #define DO_TILE(type) \
    src_stride /= sizeof(type); \
-   dst_stride = (dst_stride * TEX_TILE_HEIGHT) / sizeof(type); \
    for (unsigned srcy = 0; srcy < height; ++srcy) { \
-      unsigned dsty = basey + srcy; \
-      unsigned ty = (dsty / TEX_TILE_HEIGHT) * dst_stride + \
-                    (dsty % TEX_TILE_HEIGHT) * TEX_TILE_WIDTH; \
       for (unsigned srcx = 0; srcx < width; ++srcx) { \
-         unsigned dstx = basex + srcx; \
-         ((type *)dest)[ty + (dstx / TEX_TILE_WIDTH) * TEX_TILE_WORDS + \
-                        (dstx % TEX_TILE_WIDTH)] = \
+         unsigned destoffs = tile_buf_offset(baselayout, basex + srcx, \
+                                             basey + srcy, dst_stride, \
+                                             baseh, elmtsize); \
+         ((type *)dest)[destoffs] = \
             ((type *)src)[srcy * src_stride + srcx];\
       } \
    }
 
 #define DO_UNTILE(type) \
-   src_stride = (src_stride * TEX_TILE_HEIGHT) / sizeof(type);\
    dst_stride /= sizeof(type);\
    for (unsigned dsty = 0; dsty < height; +
[Mesa-dev] [RFC PATCH 0/2] etnaviv: use CPU-based tiling/untiling for small regions
Hi,

This series aims at optimizing updates of small regions inside a
texture. This is particularly useful when updating a texture atlas one
bit at a time.

Before going for this CPU-based tiling/untiling approach we tried
optimizing things by keeping track of all updates triggered by the
transfer_map/unmap() calls to avoid GPU -> CPU syncs when new regions
are updated without touching other already written regions [1].
Unfortunately, because of the supertile+RS-alignment constraints this
led to almost all updates requiring such a sync (at least in our use
case where each element of the atlas is not properly aligned).

This led us to consider another approach: avoid GPU-based tiling for
small regions, especially when they're not properly aligned. This is
what this patchset implements.

Patch 1 adds support for super and multi tile tiling/untiling, and
patch 2 makes use of the CPU-based tiling/untiling when the region is
small enough. Note that the thresholds used here have been chosen
arbitrarily but they seem to speed things up in our use case.

Possible future improvements include supporting NEON-based
tiling/untiling (as started here [2]).

Another option we considered was keeping a shadow linear buffer
attached to the tiled resource so that we could update the linear
version without waiting for the GPU to finish rendering things on the
final BO, but this option will likely be more invasive.

Please let me know what you think of this CPU-based tiling approach,
and if you don't like it, feel free to propose other solutions that
would be worth investigating.

Thanks,

Boris

[1] https://gitlab.collabora.com/bbrezillon/mesa/commits/etna-texture-atlas-18.2.4
[2] https://github.com/laanwj/mesa/commit/6d575b3094f17e29246be72dce8fbb6fe048db2c

Boris Brezillon (2):
  etnaviv: add support for CPU-based super/multi tile tiling/untiling
  etnaviv: try to use CPU to tile/untile when the region is small enough

 src/gallium/drivers/etnaviv/etnaviv_tiling.c  | 85 ---
 src/gallium/drivers/etnaviv/etnaviv_tiling.h  | 10 ++-
 .../drivers/etnaviv/etnaviv_transfer.c        | 83 ++
 3 files changed, 145 insertions(+), 33 deletions(-)

-- 
2.20.1
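For readers not familiar with the layout, here is a minimal sketch of what CPU-based tiling means for the basic (non-super, non-multi) case only. It assumes 4x4-pixel tiles stored one after the other in row-major order, which is what the existing TEX_TILE_WIDTH/TEX_TILE_HEIGHT code handles. The names basic_tile_offset() and tile_region_u32() are made up for this example and are not part of the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 4x4-pixel tiles, mirroring TEX_TILE_WIDTH/TEX_TILE_HEIGHT. */
#define TILE_W 4u
#define TILE_H 4u

/* Hypothetical helper: element index of pixel (x, y) in a surface made of
 * row-major 4x4 tiles. width_px is the surface width in pixels and is
 * assumed to be a multiple of TILE_W. */
static size_t basic_tile_offset(unsigned x, unsigned y, unsigned width_px)
{
   assert(width_px % TILE_W == 0);

   unsigned tiles_per_line = width_px / TILE_W;
   size_t tile = (size_t)(y / TILE_H) * tiles_per_line + x / TILE_W;

   /* Whole tiles before this one, then the position inside the tile. */
   return tile * (TILE_W * TILE_H) + (y % TILE_H) * TILE_W + x % TILE_W;
}

/* Hypothetical helper: copy a small w x h region of 32-bit elements from a
 * linear source (src_stride in elements) into the tiled destination,
 * starting at pixel (basex, basey). No alignment is required. */
static void tile_region_u32(uint32_t *dst, const uint32_t *src,
                            unsigned basex, unsigned basey,
                            unsigned w, unsigned h,
                            unsigned src_stride, unsigned dst_width_px)
{
   for (unsigned y = 0; y < h; ++y)
      for (unsigned x = 0; x < w; ++x)
         dst[basic_tile_offset(basex + x, basey + y, dst_width_px)] =
            src[y * src_stride + x];
}
```

The per-pixel address computation is slow for large surfaces, but it touches only the pixels the caller asked for, which is why it looks attractive for small unaligned atlas updates.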
[Mesa-dev] [RFC PATCH 2/2] etnaviv: try to use CPU to tile/untile when the region is small enough
Using the GPU has a cost, it implies preparing the RS/BLT request and
then possibly requires extra GPU -> CPU syncs, so let's only use
GPU-based tiling/untiling for rather big regions.

While at it, move the "should we use the GPU to tile/untile" logic to
its own function to make things more readable.

Signed-off-by: Boris Brezillon
---
 .../drivers/etnaviv/etnaviv_transfer.c        | 67 +--
 1 file changed, 61 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_transfer.c b/src/gallium/drivers/etnaviv/etnaviv_transfer.c
index 38648564b701..a3013e624ead 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_transfer.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_transfer.c
@@ -132,6 +132,66 @@ etna_transfer_unmap(struct pipe_context *pctx, struct pipe_transfer *ptrans)
    slab_free(&ctx->transfer_pool, trans);
 }
 
+#define GPU_TILING_MIN_SURFACE_SIZE 1024
+#define GPU_TILING_MIN_UNALIGNED_SURFACE_SIZE 8192
+
+static bool
+etna_transfer_use_gpu(struct etna_context *ctx, struct etna_resource *rsc,
+                      const struct pipe_box *box)
+{
+   unsigned w_align, h_align;
+
+   /* Always use the GPU to tile/untile when the resource has a Tile Status
+    * buffer attached. */
+   if (rsc->ts_bo)
+      return true;
+
+   /* Nothing to tile/untile if the resource is using a linear format. */
+   if (rsc->layout == ETNA_LAYOUT_LINEAR)
+      return false;
+
+   /* Do not use the GPU when the resource is using a 1byte/pixel format. */
+   if (util_format_get_blocksize(rsc->base.format) == 1)
+      return false;
+
+   /* HALIGN 4 resources are incompatible with the resolve engine, so fall back
+    * to using software to detile this resource. */
+   if (rsc->halign == TEXTURE_HALIGN_FOUR)
+      return false;
+
+   /* Using the GPU has a cost, it implies preparing the RS/BLT request and
+    * then possibly requires extra GPU -> CPU syncs, so let's only use
+    * GPU-based tiling for rather big regions. Right now the minimum surface
+    * surface size is arbitrarily set to 1024 pixels. */
+   if (box->width * box->height < GPU_TILING_MIN_SURFACE_SIZE)
+      return false;
+
+   /* No alignment constraints when using BLT. */
+   if (ctx->specs.use_blt)
+      return true;
+
+   if (rsc->layout & ETNA_LAYOUT_BIT_SUPER) {
+      w_align = h_align = 64;
+   } else {
+      w_align = ETNA_RS_WIDTH_MASK + 1;
+      h_align = ETNA_RS_HEIGHT_MASK + 1;
+   }
+   h_align *= ctx->screen->specs.pixel_pipes;
+
+   /* Everything is properly aligned, let's use the RS engine to
+    * tile/untile. */
+   if (!((box->x | box->width) & (w_align - 1)) &&
+       !((box->y | box->height) & (h_align - 1)))
+      return true;
+
+   /* We want the minimum surface size to be even bigger for unaligned
+    * requests. */
+   if (box->width * box->height < GPU_TILING_MIN_UNALIGNED_SURFACE_SIZE)
+      return false;
+
+   return true;
+}
+
 static void *
 etna_transfer_map(struct pipe_context *pctx, struct pipe_resource *prsc,
                   unsigned level,
@@ -179,12 +239,7 @@ etna_transfer_map(struct pipe_context *pctx, struct pipe_resource *prsc,
        * render resource. Use the texture resource, which avoids bouncing
        * pixels between the two resources, and we can de-tile it in s/w. */
       rsc = etna_resource(rsc->texture);
-   } else if (rsc->ts_bo ||
-              (rsc->layout != ETNA_LAYOUT_LINEAR &&
-               util_format_get_blocksize(format) > 1 &&
-               /* HALIGN 4 resources are incompatible with the resolve engine,
-                * so fall back to using software to detile this resource. */
-               rsc->halign != TEXTURE_HALIGN_FOUR)) {
+   } else if (etna_transfer_use_gpu(ctx, rsc, box)) {
       /* If the surface has tile status, we need to resolve it first.
        * The strategy we implement here is to use the RS to copy the
        * depth buffer, filling in the "holes" where the tile status
-- 
2.20.1
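The size and alignment part of the heuristic in this patch can be tried in isolation. The sketch below is a standalone illustration, not driver code: struct box2d is a simplified stand-in for struct pipe_box, both helper names are hypothetical, and the resource-specific checks (tile status, linear layout, 1 byte/pixel formats, HALIGN 4, BLT) are deliberately left out. The 1024 and 8192 pixel thresholds are the ones from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical stand-in for struct pipe_box (2D only). */
struct box2d {
   unsigned x, y, width, height;
};

/* True when origin and size are both multiples of the alignment. OR-ing the
 * two values lets a single mask test cover both, which only works because
 * the alignments are powers of two. */
static bool box_is_aligned(const struct box2d *b,
                           unsigned w_align, unsigned h_align)
{
   assert(w_align && !(w_align & (w_align - 1)));
   assert(h_align && !(h_align & (h_align - 1)));

   return !((b->x | b->width) & (w_align - 1)) &&
          !((b->y | b->height) & (h_align - 1));
}

/* Size-based part of the decision only: small regions go to the CPU, and
 * unaligned regions need to be even bigger before the GPU is used. */
static bool big_enough_for_gpu(const struct box2d *b,
                               unsigned w_align, unsigned h_align)
{
   unsigned pixels = b->width * b->height;

   if (pixels < 1024)
      return false;

   if (box_is_aligned(b, w_align, h_align))
      return true;

   return pixels >= 8192;
}
```

With a 64x64 alignment, a 64x64 box at (64, 0) would go to the GPU, while the same box at (3, 5) would be handled on the CPU because it is unaligned and below the 8192 pixel threshold.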
Re: [Mesa-dev] [PATCH v5] etnaviv: fix resource usage tracking across different pipe_context's
Christian, Marek,

On Wed, 30 Jan 2019 05:28:14 +0100 Marek Vasut wrote:

> From: Christian Gmeiner
>
> A pipe_resource can be shared by all the pipe_context's hanging off the
> same pipe_screen.

We seem to be impacted by the problem you're fixing here, but, while this patch definitely makes things much better, the problem does not seem to be entirely fixed (at least in our case).

A bit more context: we have a Qt app using QtWebEngine to render HTML content. When we call QtWebEngine::initialize(), which has the effect of allowing shared GL contexts, we sometimes notice that part of the web page is mis-rendered. That does not happen when we omit the QtWebEngine::initialize() call.

As said above, this patch makes those rendering issues less likely to happen, but we still hit the problem from time to time. So I thought I'd share my guesses about what could cause these issues before debugging it further.

First thing I noticed: I couldn't reproduce the problem with [1] applied (+ a tiny change forcing CPU-based tiling no matter the size of the region to tile/untile). So, my guess is that it's related to how we handle GPU-based tiling/untiling.

I also noticed that we're testing rsc->status here [2] without the screen->lock held, and there might be a race with another thread calling resource_used(). We might also lack a resource_read(ctx, &src->base) here [3]. But even after fixing those problems, the rendering issues are not gone.

So I'm wondering if the problem is coming from the GPU-based tiling/untiling logic we have in etnaviv_transfer.c. Indeed, RS-based tiling requires the surface we tile/untile to be aligned on at least 16 pixels (and even more in case of multi/super tiles).
In order to handle those constraints when the region the user wants to map/update is not properly aligned, etna_transfer_map() first untiles an RS-aligned region into a temporary linear buffer, which is then updated according to the caller's needs and tiled back to the actual texture when etna_transfer_unmap() is called. But what happens if, between the untile operation in etna_transfer_map() and the tile operation in etna_transfer_unmap(), another rendering request is done on a region that overlaps the RS-aligned region we untiled into our temporary buffer? Can't we end up in a situation where some of the data we are about to copy back to the actual texture is outdated?

I'm completely new to etnaviv and mesa, so I wouldn't be surprised if my analysis of the problem was wrong, but I thought it was worth an email before digging further. Please let me know if you have other ideas or need more details about our use case.

Thanks,

Boris

[1]https://lists.freedesktop.org/archives/mesa-dev/2019-February/215597.html
[2]https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/etnaviv/etnaviv_transfer.c#n316
[3]https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/etnaviv/etnaviv_rs.c#n731

>
> Signed-off-by: Christian Gmeiner
> Signed-off-by: Marek Vasut
> To: mesa-dev@lists.freedesktop.org
> Cc: etna...@lists.freedesktop.org
> ---
> Changes from v1 -> v2:
>  - to remove the resource from the used_resources set when it is destroyed
> Changes from v2 -> v3:
>  - add locking with mtx_*() to resource and screen (Marek)
> Changes from v3 -> v4:
>  - drop rsc->lock, just use screen->lock for the entire serialization (Marek)
>  - simplify etna_resource_used() flush condition, which also prevents
>    potentially flushing resources twice (Marek)
>  - don't remove resouces from screen->used_resources in
>    etna_cmd_stream_reset_notify(), they may still be used in other
>    contexts and may need flushing there later on (Marek)
> Changes from v4 -> v5:
>  - Fix coding style issues reported by Guido
> ---
>  src/gallium/drivers/etnaviv/etnaviv_context.c | 26 +-
>  src/gallium/drivers/etnaviv/etnaviv_context.h |  3 --
>  .../drivers/etnaviv/etnaviv_resource.c        | 52 +++
>  .../drivers/etnaviv/etnaviv_resource.h        |  8 +--
>  src/gallium/drivers/etnaviv/etnaviv_screen.c  | 12 +
>  src/gallium/drivers/etnaviv/etnaviv_screen.h  |  6 +++
>  6 files changed, 78 insertions(+), 29 deletions(-)
>
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_context.c
> b/src/gallium/drivers/etnaviv/etnaviv_context.c
> index 3038d21..2f8cae8 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_context.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_context.c
> @@ -36,6 +36,7 @@
>  #include "etnaviv_query.h"
>  #include "etnaviv_query_hw.h"
>  #include "etnaviv_rasterizer.h"
> +#include "etnaviv_resource.h"
>  #include "etnaviv_screen.h"
>  #include "etnaviv_shader.h"
>  #include "etnaviv_state.h"
> @@ -329,7 +330,8 @@ static void
>  etna_cmd_stream_reset_notify(struct etna_cmd_stream *stream, void *priv)
>  {
>     struct etna_context *ctx = priv;
> -   struct etna_resource *rsc, *rsc_tmp;
> +   struct etna_screen *screen = ctx->screen;
> +   struct set_entry *entry;
>
>     etna_set_state(stream,
Re: [Mesa-dev] [PATCH v5] etnaviv: fix resource usage tracking across different pipe_context's
On Thu, 21 Feb 2019 23:29:53 +0100 Boris Brezillon wrote:

> Christian, Marek,
>
> On Wed, 30 Jan 2019 05:28:14 +0100
> Marek Vasut wrote:
>
> > From: Christian Gmeiner
> >
> > A pipe_resource can be shared by all the pipe_context's hanging off the
> > same pipe_screen.
>
> We seem to be impacted by the problem you're fixing here, but, while
> this patch definitely make things much better, the problem does not
> seem to be entirely fixed (at least in our case).
>
> A bit more context: we have Qt App using QtWebEngine to render html
> content. When we call QtWebEngine::initialize(), which as for effect
> to allow shared GL contexts, we sometimes notice that part of the web
> page is mis-rendered. That does not happen when we omit the
> QtWebEngine::initialize() call.
> As said above, this patch make those rendering issues less likely to
> happen, but we still have the problem from time to time. So I thought
> I'd share my guesses about what could cause these issues before
> debugging it further.
>
> First thing I noticed: I couldn't reproduce the problem with [1]
> applied (+ a tiny change forcing CPU-based tiling no matter the size of
> the region to tile/untile). So, my guess is that it's related to how we
> handle GPU-based tiling/untiling.
> Also noticed that we're testing the rsc->status here [2] without the
> screen->lock held, and there might be a race with another thread calling
> resource_used(). We might also lack a resource_read(ctx, &src->base)
> here [3]. But even after fixing those problems, the rendering issues
> are not gone.

I tested again with the following diff applied on top of your patch, and the remaining rendering issues we had seem to be gone (don't know what I messed up in my previous tests :-/).

--->8---
diff --git a/src/gallium/drivers/etnaviv/etnaviv_rs.c b/src/gallium/drivers/etnaviv/etnaviv_rs.c
index fc4f65dbeee1..b8c8b96a6f72 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_rs.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_rs.c
@@ -729,6 +729,7 @@ etna_try_rs_blit(struct pipe_context *pctx,
 
    etna_submit_rs_state(ctx, &copy_to_screen);
    resource_written(ctx, &dst->base);
+   resource_read(ctx, &src->base);
    dst->seqno++;
    dst->levels[blit_info->dst.level].ts_valid = false;
    ctx->dirty |= ETNA_DIRTY_DERIVE_TS;
diff --git a/src/gallium/drivers/etnaviv/etnaviv_transfer.c b/src/gallium/drivers/etnaviv/etnaviv_transfer.c
index a3013e624ead..e4b2ac605e63 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_transfer.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_transfer.c
@@ -356,6 +356,7 @@ etna_transfer_map(struct pipe_context *pctx, struct pipe_resource *prsc,
     * transfers without a temporary resource.
     */
    if (trans->rsc || !(usage & PIPE_TRANSFER_UNSYNCHRONIZED)) {
+      struct etna_screen *screen = ctx->screen;
       uint32_t prep_flags = 0;
 
       /*
@@ -364,11 +365,13 @@ etna_transfer_map(struct pipe_context *pctx, struct pipe_resource *prsc,
        * current GPU usage (reads must wait for GPU writes, writes must have
        * exclusive access to the buffer).
        */
+      mtx_lock(&screen->lock);
       if ((trans->rsc && (etna_resource(trans->rsc)->status & ETNA_PENDING_WRITE)) ||
           (!trans->rsc &&
            (((usage & PIPE_TRANSFER_READ) && (rsc->status & ETNA_PENDING_WRITE)) ||
            ((usage & PIPE_TRANSFER_WRITE) && rsc->status))))
          pctx->flush(pctx, NULL, 0);
+      mtx_unlock(&screen->lock);
 
       if (usage & PIPE_TRANSFER_READ)
          prep_flags |= DRM_ETNA_PREP_READ;
Re: [Mesa-dev] [PATCH] etnaviv: blt: mark used src resource as read from
On Fri, 22 Feb 2019 11:10:29 +0100 Christian Gmeiner wrote:

> Signed-off-by: Christian Gmeiner

Reviewed-by: Boris Brezillon

> ---
>  src/gallium/drivers/etnaviv/etnaviv_blt.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_blt.c
> b/src/gallium/drivers/etnaviv/etnaviv_blt.c
> index 52731a9c770..42190d75d4e 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_blt.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_blt.c
> @@ -510,7 +510,9 @@ etna_try_blt_blit(struct pipe_context *pctx,
>     etna_stall(ctx->stream, SYNC_RECIPIENT_FE, SYNC_RECIPIENT_BLT);
>     etna_set_state(ctx->stream, VIVS_GL_FLUSH_CACHE, 0x0c23);
>
> +   resource_read(ctx, &src->base);
>     resource_written(ctx, &dst->base);
> +
>     dst->seqno++;
>     dst_lev->ts_valid = false;
>
Re: [Mesa-dev] etnaviv: rs: mark used src resource as read from
On Fri, 22 Feb 2019 11:02:34 +0100 Christian Gmeiner wrote:

> Signed-off-by: Christian Gmeiner

Reviewed-by: Boris Brezillon

> ---
>  src/gallium/drivers/etnaviv/etnaviv_rs.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_rs.c
> b/src/gallium/drivers/etnaviv/etnaviv_rs.c
> index fc4f65dbeee..a9d3872ad41 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_rs.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_rs.c
> @@ -728,6 +728,7 @@ etna_try_rs_blit(struct pipe_context *pctx,
>     });
>
>     etna_submit_rs_state(ctx, &copy_to_screen);
> +   resource_read(ctx, &src->base);
>     resource_written(ctx, &dst->base);
>     dst->seqno++;
>     dst->levels[blit_info->dst.level].ts_valid = false;
[Mesa-dev] [PATCH] gallium/util: Fix off-by-one in box intersection
From: Daniel Stone

pipe_boxes are x/y + width/height, rather than x0/y0 -> x1/y1. This
means that (x+width) is not included in the box.

The box intersection check was seemingly written for inclusive regions,
and would falsely assert that adjacent boxes would overlap.

Fix the off-by-one by being one pixel less greedy.

Signed-off-by: Daniel Stone
Signed-off-by: Boris Brezillon
---
 src/gallium/auxiliary/util/u_box.h | 16
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_box.h b/src/gallium/auxiliary/util/u_box.h
index b3f478e7bfc4..ead7189ecaf8 100644
--- a/src/gallium/auxiliary/util/u_box.h
+++ b/src/gallium/auxiliary/util/u_box.h
@@ -161,15 +161,15 @@ u_box_test_intersection_2d(const struct pipe_box *a,
    unsigned i;
    int a_l[2], a_r[2], b_l[2], b_r[2];
 
-   a_l[0] = MIN2(a->x, a->x + a->width);
-   a_r[0] = MAX2(a->x, a->x + a->width);
-   a_l[1] = MIN2(a->y, a->y + a->height);
-   a_r[1] = MAX2(a->y, a->y + a->height);
+   a_l[0] = MIN2(a->x, a->x + a->width - 1);
+   a_r[0] = MAX2(a->x, a->x + a->width - 1);
+   a_l[1] = MIN2(a->y, a->y + a->height - 1);
+   a_r[1] = MAX2(a->y, a->y + a->height - 1);
 
-   b_l[0] = MIN2(b->x, b->x + b->width);
-   b_r[0] = MAX2(b->x, b->x + b->width);
-   b_l[1] = MIN2(b->y, b->y + b->height);
-   b_r[1] = MAX2(b->y, b->y + b->height);
+   b_l[0] = MIN2(b->x, b->x + b->width - 1);
+   b_r[0] = MAX2(b->x, b->x + b->width - 1);
+   b_l[1] = MIN2(b->y, b->y + b->height - 1);
+   b_r[1] = MAX2(b->y, b->y + b->height - 1);
 
    for (i = 0; i < 2; ++i) {
       if (a_l[i] > b_r[i] || a_r[i] < b_l[i])
-- 
2.20.1
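To make the off-by-one concrete, here is a minimal sketch of an overlap test on half-open ranges. It is not the u_box.h code: struct box2d and both function names are hypothetical, and the sketch assumes non-negative sizes, whereas the real helper uses MIN2/MAX2 to cope with negative width/height. Instead of subtracting one from the far edge, it compares half-open ranges with strict inequalities, which for positive sizes should give the same answer as the patched check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical 2D box: origin plus size, covering [x, x + width) on the
 * X axis and [y, y + height) on the Y axis. */
struct box2d {
   int x, y, width, height;
};

/* Half-open ranges [a0, a1) and [b0, b1) share at least one pixel only if
 * each range starts strictly before the other one ends. */
bool
ranges_overlap(int a0, int a1, int b0, int b1)
{
   return a0 < b1 && b0 < a1;
}

bool
boxes_overlap(const struct box2d *a, const struct box2d *b)
{
   /* Simplification of this sketch: sizes are assumed non-negative. */
   assert(a->width >= 0 && a->height >= 0);
   assert(b->width >= 0 && b->height >= 0);

   return ranges_overlap(a->x, a->x + a->width, b->x, b->x + b->width) &&
          ranges_overlap(a->y, a->y + a->height, b->y, b->y + b->height);
}
```

Two 4x4 boxes at x = 0 and x = 4 touch along an edge but share no pixel, so they are reported as non-overlapping; an inclusive comparison on x + width would report them as intersecting.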
Re: [Mesa-dev] [PATCH v6] etnaviv: fix resource usage tracking across different pipe_context's
On Sat, 23 Feb 2019 16:15:19 +0100 Christian Gmeiner wrote:

> A pipe_resource can be shared by all the pipe_context's hanging off the
> same pipe_screen.
>
> Changes from v2 -> v3:
>  - add locking with mtx_*() to resource and screen (Marek)
> Changes from v3 -> v4:
>  - drop rsc->lock, just use screen->lock for the entire serialization (Marek)
>  - simplify etna_resource_used() flush condition, which also prevents
>    potentially flushing resources twice (Marek)
>  - don't remove resouces from screen->used_resources in
>    etna_cmd_stream_reset_notify(), they may still be used in other
>    contexts and may need flushing there later on (Marek)
> Changes from v4 -> v5:
>  - Fix coding style issues reported by Guido
> Changes from v5 -> v6:
>  - Add missing locking in etna_transfer_map(..) (Boris)
>
> Signed-off-by: Christian Gmeiner
> Signed-off-by: Marek Vasut
> Signed-off-by: Boris Brezillon

Reviewed-by: Boris Brezillon
Tested-by: Boris Brezillon

This being said, I'm still unsure all races are fixed with this patch (see the part about RS-based tiling in my reply to v5).

> Tested-by: Marek Vasut > --- > src/gallium/drivers/etnaviv/etnaviv_context.c | 26 +- > src/gallium/drivers/etnaviv/etnaviv_context.h | 3 -- > .../drivers/etnaviv/etnaviv_resource.c| 52 +++ > .../drivers/etnaviv/etnaviv_resource.h| 8 +-- > src/gallium/drivers/etnaviv/etnaviv_screen.c | 12 + > src/gallium/drivers/etnaviv/etnaviv_screen.h | 6 +++ > .../drivers/etnaviv/etnaviv_transfer.c| 5 ++ > 7 files changed, 83 insertions(+), 29 deletions(-) > > diff --git a/src/gallium/drivers/etnaviv/etnaviv_context.c > b/src/gallium/drivers/etnaviv/etnaviv_context.c > index 44b50925a4f..83a703f7cc2 100644 > --- a/src/gallium/drivers/etnaviv/etnaviv_context.c > +++ b/src/gallium/drivers/etnaviv/etnaviv_context.c > @@ -36,6 +36,7 @@ > #include "etnaviv_query.h" > #include "etnaviv_query_hw.h" > #include "etnaviv_rasterizer.h" > +#include "etnaviv_resource.h" > #include "etnaviv_screen.h" > #include "etnaviv_shader.h" > #include "etnaviv_state.h" > @@ -329,7 +330,8 @@ static void > etna_cmd_stream_reset_notify(struct etna_cmd_stream *stream, void *priv) > { > struct etna_context *ctx = priv; > - struct etna_resource *rsc, *rsc_tmp; > + struct etna_screen *screen = ctx->screen; > + struct set_entry *entry; > > etna_set_state(stream, VIVS_GL_API_MODE, VIVS_GL_API_MODE_OPENGL); > etna_set_state(stream, VIVS_GL_VERTEX_ELEMENT_CONFIG, 0x0001); > @@ -384,16 +386,18 @@ etna_cmd_stream_reset_notify(struct etna_cmd_stream > *stream, void *priv) > ctx->dirty = ~0L; > ctx->dirty_sampler_views = ~0L; > > - /* go through all the used resources and clear their status flag */ > - LIST_FOR_EACH_ENTRY_SAFE(rsc, rsc_tmp, &ctx->used_resources, list) > - { > - debug_assert(rsc->status != 0); > - rsc->status = 0; > - rsc->pending_ctx = NULL; > - list_delinit(&rsc->list); > - } > + /* > +* Go through all _resources_ associated with this _screen_, pending > +* in this _context_ and mark them as not pending in this _context_ > +* anymore, since they were just flushed. 
> +*/ > + mtx_lock(&screen->lock); > + set_foreach(screen->used_resources, entry) { > + struct etna_resource *rsc = (struct etna_resource *)entry->key; > > - assert(LIST_IS_EMPTY(&ctx->used_resources)); > + _mesa_set_remove_key(rsc->pending_ctx, ctx); > + } > + mtx_unlock(&screen->lock); > } > > static void > @@ -437,8 +441,6 @@ etna_context_create(struct pipe_screen *pscreen, void > *priv, unsigned flags) > /* need some sane default in case state tracker doesn't set some state: */ > ctx->sample_mask = 0x; > > - list_inithead(&ctx->used_resources); > - > /* Set sensible defaults for state */ > etna_cmd_stream_reset_notify(ctx->stream, ctx); > > diff --git a/src/gallium/drivers/etnaviv/etnaviv_context.h > b/src/gallium/drivers/etnaviv/etnaviv_context.h > index 6ad9f3431e1..50a2cdf3d07 100644 > --- a/src/gallium/drivers/etnaviv/etnaviv_context.h > +++ b/src/gallium/drivers/etnaviv/etnaviv_context.h > @@ -136,9 +136,6 @@ struct etna_context { > uint32_t prim_hwsupport; > struct primconvert_context *primconvert; > > - /* list of resources used by currently-unsubmitted renders */ > - struct list_head used_resources; > - > struct slab_child_pool transfer_pool; >
Re: [Mesa-dev] [PATCH] gallium/util: Fix off-by-one in box intersection
Hello Gurchetan,

On Wed, 27 Feb 2019 10:34:26 -0800 Gurchetan Singh wrote:

> On Mon, Feb 25, 2019 at 12:35 AM Boris Brezillon
> wrote:
> >
> > From: Daniel Stone
> >
> > pipe_boxes are x/y + width/height, rather than x0/y0 -> x1/y1. This
> > means that (x+width) is not included in the box.
> >
> > The box intersection check was seemingly written for inclusive regions,
> > and would falsely assert that adjacent boxes would overlap.
> >
> > Fix the off-by-one by being one pixel less greedy.
>
> Is there a reason for this change? I only see this used in a warning
> in the nine state tracker and virgl (where reporting adjacent
> intersections is preferred).

This patch was part of a series Daniel worked on to optimize texture atlas updates on Vivante GPUs [1]. In the end, this work was put on hold because the performance gain was smaller than expected, but it might be resurrected at some point.

Anyway, back to the point. In this patchset, the pipe_region_overlaps() helper needs to detect regions that overlap, not regions that are merely adjacent. If other users need u_box_test_intersection_2d() to also report adjacent boxes, then we should definitely keep the code unchanged, but it's probably worth adding a comment in the code to clarify the behavior.

Regards, Boris [1]https://gitlab.collabora.com/bbrezillon/mesa/commits/etna-texture-atlas-18.2.4 > > > > > Signed-off-by: Daniel Stone > > Signed-off-by: Boris Brezillon > > --- > > src/gallium/auxiliary/util/u_box.h | 16 > > 1 file changed, 8 insertions(+), 8 deletions(-) > > > > diff --git a/src/gallium/auxiliary/util/u_box.h > > b/src/gallium/auxiliary/util/u_box.h > > index b3f478e7bfc4..ead7189ecaf8 100644 > > --- a/src/gallium/auxiliary/util/u_box.h > > +++ b/src/gallium/auxiliary/util/u_box.h > > @@ -161,15 +161,15 @@ u_box_test_intersection_2d(const struct pipe_box *a, > > unsigned i; > > int a_l[2], a_r[2], b_l[2], b_r[2]; > > > > - a_l[0] = MIN2(a->x, a->x + a->width); > > - a_r[0] = MAX2(a->x, a->x + a->width); > > - a_l[1] = MIN2(a->y, a->y + a->height); > > - a_r[1] = MAX2(a->y, a->y + a->height); > > + a_l[0] = MIN2(a->x, a->x + a->width - 1); > > + a_r[0] = MAX2(a->x, a->x + a->width - 1); > > + a_l[1] = MIN2(a->y, a->y + a->height - 1); > > + a_r[1] = MAX2(a->y, a->y + a->height - 1); > > > > - b_l[0] = MIN2(b->x, b->x + b->width); > > - b_r[0] = MAX2(b->x, b->x + b->width); > > - b_l[1] = MIN2(b->y, b->y + b->height); > > - b_r[1] = MAX2(b->y, b->y + b->height); > > + b_l[0] = MIN2(b->x, b->x + b->width - 1); > > + b_r[0] = MAX2(b->x, b->x + b->width - 1); > > + b_l[1] = MIN2(b->y, b->y + b->height - 1); > > + b_r[1] = MAX2(b->y, b->y + b->height - 1); > > > > for (i = 0; i < 2; ++i) { > >if (a_l[i] > b_r[i] || a_r[i] < b_l[i]) > > -- > > 2.20.1 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] broadcom/vc4: Add support for extended CL submission
Recent versions of the VC4 driver support CL extensions, which allow
one to pass extra attributes to the CL submission.

In order to make this feature backward compatible with the existing
SUBMIT_CL ioctl, we had to add a new flag and re-use the optional bin_cl
fields. The binner CL is now passed as an extra chunk.

Signed-off-by: Boris Brezillon
---
 src/gallium/drivers/vc4/vc4_job.c    | 23 +--
 src/gallium/drivers/vc4/vc4_screen.c |  2 ++
 src/gallium/drivers/vc4/vc4_screen.h |  1 +
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/vc4/vc4_job.c b/src/gallium/drivers/vc4/vc4_job.c
index 7fe20c16bad9..fb0c5bbc78cf 100644
--- a/src/gallium/drivers/vc4/vc4_job.c
+++ b/src/gallium/drivers/vc4/vc4_job.c
@@ -362,12 +362,18 @@ vc4_submit_setup_rcl_msaa_surface(struct vc4_job *job,
         rsc->writes++;
 }
 
+#define MAX_CHUNKS 1
+
 /**
  * Submits the job to the kernel and then reinitializes it.
  */
 void
 vc4_job_submit(struct vc4_context *vc4, struct vc4_job *job)
 {
+        struct vc4_screen *screen = vc4_screen(vc4->base.screen);
+        union drm_vc4_submit_cl_chunk chunks[MAX_CHUNKS] = { };
+        uint32_t nchunks = 0;
+
         if (!job->needs_flush)
                 goto done;
 
@@ -446,14 +452,27 @@ vc4_job_submit(struct vc4_context *vc4, struct vc4_job *job)
 
         submit.bo_handles = (uintptr_t)job->bo_handles.base;
         submit.bo_handle_count = cl_offset(&job->bo_handles) / 4;
-        submit.bin_cl = (uintptr_t)job->bcl.base;
-        submit.bin_cl_size = cl_offset(&job->bcl);
+        if (!screen->has_extended_cl) {
+                submit.bin_cl = (uintptr_t)job->bcl.base;
+                submit.bin_cl_size = cl_offset(&job->bcl);
+        } else if (cl_offset(&job->bcl)) {
+                chunks[nchunks].bin.type = VC4_BIN_CL_CHUNK;
+                chunks[nchunks].bin.size = cl_offset(&job->bcl);
+                chunks[nchunks].bin.ptr = (uintptr_t)job->bcl.base;
+                nchunks++;
+        }
         submit.shader_rec = (uintptr_t)job->shader_rec.base;
         submit.shader_rec_size = cl_offset(&job->shader_rec);
         submit.shader_rec_count = job->shader_rec_count;
         submit.uniforms = (uintptr_t)job->uniforms.base;
         submit.uniforms_size = cl_offset(&job->uniforms);
 
+        if (nchunks) {
+                submit.flags |= VC4_SUBMIT_CL_EXTENDED;
+                submit.cl_chunks = (uintptr_t)chunks;
+                submit.num_cl_chunks = nchunks;
+        }
+
         assert(job->draw_min_x != ~0 && job->draw_min_y != ~0);
         submit.min_x_tile = job->draw_min_x / job->tile_width;
         submit.min_y_tile = job->draw_min_y / job->tile_height;
diff --git a/src/gallium/drivers/vc4/vc4_screen.c b/src/gallium/drivers/vc4/vc4_screen.c
index a42ba675c130..4b63e940822d 100644
--- a/src/gallium/drivers/vc4/vc4_screen.c
+++ b/src/gallium/drivers/vc4/vc4_screen.c
@@ -696,6 +696,8 @@ vc4_screen_create(int fd, struct renderonly *ro)
                 vc4_has_feature(screen, DRM_VC4_PARAM_SUPPORTS_THREADED_FS);
         screen->has_madvise =
                 vc4_has_feature(screen, DRM_VC4_PARAM_SUPPORTS_MADVISE);
+        screen->has_extended_cl =
+                vc4_has_feature(screen, DRM_VC4_PARAM_SUPPORTS_EXTENDED_CL);
 
         if (!vc4_get_chip_info(screen))
                 goto fail;
diff --git a/src/gallium/drivers/vc4/vc4_screen.h b/src/gallium/drivers/vc4/vc4_screen.h
index 09d1c342ed19..83719d88baf0 100644
--- a/src/gallium/drivers/vc4/vc4_screen.h
+++ b/src/gallium/drivers/vc4/vc4_screen.h
@@ -97,6 +97,7 @@ struct vc4_screen {
         bool has_threaded_fs;
         bool has_madvise;
         bool has_tiling_ioctl;
+        bool has_extended_cl;
 
         struct vc4_simulator_file *sim_file;
 };
-- 
2.11.0
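The chunk mechanism used by this patch can be illustrated in isolation. The following is only a sketch under stated assumptions: every sketch_ / SKETCH_ name is hypothetical and merely mirrors the layout proposed in patch 1/3 of this series (a __u32 type tag followed by 12 bytes of payload); it is not the kernel uapi and performs no ioctl. It shows the two properties the submission code relies on: all union members start with the same type field and have the same size, and a binner chunk is only appended when the binner CL is non-empty.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical chunk types, mirroring the proposed enum. */
enum { SKETCH_BIN_CL_CHUNK, SKETCH_PERFMON_CHUNK };

/* Chunk carrying no parameters: a type tag plus padding. */
struct sketch_dummy_chunk {
   uint32_t type;
   uint32_t pad[3];
};

/* Chunk describing the binner command list. */
struct sketch_bin_chunk {
   uint32_t type;
   uint32_t size; /* size in bytes of the binner CL */
   uint64_t ptr;  /* userspace pointer to the binner CL */
};

/* Every member begins with `type`, so a consumer can read the tag through
 * `dummy` before deciding how to interpret the rest. */
union sketch_chunk {
   struct sketch_dummy_chunk dummy;
   struct sketch_bin_chunk bin;
};

#define SKETCH_MAX_CHUNKS 1

/* Appends a binner chunk when the CL is non-empty and returns the new chunk
 * count, in the same spirit as the nchunks handling in the patch. */
uint32_t
sketch_add_bin_chunk(union sketch_chunk *chunks, uint32_t nchunks,
                     const void *bcl, uint32_t bcl_size)
{
   if (!bcl_size)
      return nchunks;

   assert(nchunks < SKETCH_MAX_CHUNKS);
   chunks[nchunks].bin.type = SKETCH_BIN_CL_CHUNK;
   chunks[nchunks].bin.size = bcl_size;
   chunks[nchunks].bin.ptr = (uintptr_t)bcl;

   return nchunks + 1;
}
```

Keeping every chunk at a fixed 16 bytes with the tag first means a new chunk type can be added later without changing the size of the array elements or of the submit structure itself.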
[Mesa-dev] [PATCH 1/3] drm-uapi: Update vc4 header with CL chunks and perfmon related definitions
Signed-off-by: Boris Brezillon --- include/drm-uapi/vc4_drm.h | 156 ++--- 1 file changed, 146 insertions(+), 10 deletions(-) diff --git a/include/drm-uapi/vc4_drm.h b/include/drm-uapi/vc4_drm.h index 3415a4b71884..ca0220257f05 100644 --- a/include/drm-uapi/vc4_drm.h +++ b/include/drm-uapi/vc4_drm.h @@ -42,6 +42,9 @@ extern "C" { #define DRM_VC4_GET_TILING0x09 #define DRM_VC4_LABEL_BO 0x0a #define DRM_VC4_GEM_MADVISE 0x0b +#define DRM_VC4_PERFMON_CREATE0x0c +#define DRM_VC4_PERFMON_DESTROY 0x0d +#define DRM_VC4_PERFMON_GET_VALUES0x0e #define DRM_IOCTL_VC4_SUBMIT_CL DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_SUBMIT_CL, struct drm_vc4_submit_cl) #define DRM_IOCTL_VC4_WAIT_SEQNO DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_WAIT_SEQNO, struct drm_vc4_wait_seqno) @@ -55,6 +58,9 @@ extern "C" { #define DRM_IOCTL_VC4_GET_TILING DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_GET_TILING, struct drm_vc4_get_tiling) #define DRM_IOCTL_VC4_LABEL_BODRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_LABEL_BO, struct drm_vc4_label_bo) #define DRM_IOCTL_VC4_GEM_MADVISE DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_GEM_MADVISE, struct drm_vc4_gem_madvise) +#define DRM_IOCTL_VC4_PERFMON_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_CREATE, struct drm_vc4_perfmon_create) +#define DRM_IOCTL_VC4_PERFMON_DESTROY DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_DESTROY, struct drm_vc4_perfmon_destroy) +#define DRM_IOCTL_VC4_PERFMON_GET_VALUES DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_GET_VALUES, struct drm_vc4_perfmon_get_values) struct drm_vc4_submit_rcl_surface { __u32 hindex; /* Handle index, or ~0 if not present. */ @@ -70,6 +76,67 @@ struct drm_vc4_submit_rcl_surface { }; /** + * @VC4_BIN_CL_CHUNK: binner CL chunk + * @VC4_PERFMON_CHUNK: performance monitor chunk + */ +enum { + VC4_BIN_CL_CHUNK, + VC4_PERFMON_CHUNK, +}; + +/** + * struct drm_vc4_submit_cl_chunk - dummy chunk + * @type: extension type + * @pad: unused, should be set to zero + * + * Meant to be used for chunks that do not require extra parameters. 
+ */ +struct drm_vc4_submit_cl_dummy_chunk { + __u32 type; + __u32 pad[3]; +}; + +/** + * struct drm_vc4_submit_cl_bin_chunk - binner CL chunk + * + * @type: extention type, should be set to %VC4_BIN_CL_CHUNK + * @size: size in bytes of the binner CL + * @ptr: userspace pointer to the binner CL + */ +struct drm_vc4_submit_cl_bin_chunk { + __u32 type; + __u32 size; + __u64 ptr; +}; + +/** + * struct drm_vc4_submit_cl_perfmon_chunk - performance monitor extension + * + * @type: extention type, should be set to %VC4_PERFMON_CHUNK + * @id: id of the perfmance monitor previously allocated with + * %DRM_IOCTL_VC4_PERFMON_CREATE + * @pad: unused, should be set to zero + */ +struct drm_vc4_submit_cl_perfmon_chunk { + __u32 type; + __u32 id; + __u64 pad; +}; + +/** + * union drm_vc4_submit_cl_chunk - CL chunk + * + * CL chunks allow us to easily extend the set of arguments one can pass + * to the submit CL ioctl without having to add new ioctls/struct everytime + * we run out of free fields in the drm_vc4_submit_cl struct. + */ +union drm_vc4_submit_cl_chunk { + struct drm_vc4_submit_cl_dummy_chunk dummy; + struct drm_vc4_submit_cl_bin_chunk bin; + struct drm_vc4_submit_cl_perfmon_chunk perfmon; +}; + +/** * struct drm_vc4_submit_cl - ioctl argument for submitting commands to the 3D * engine. * @@ -83,14 +150,23 @@ struct drm_vc4_submit_rcl_surface { * BO. */ struct drm_vc4_submit_cl { - /* Pointer to the binner command list. -* -* This is the first set of commands executed, which runs the -* coordinate shader to determine where primitives land on the screen, -* then writes out the state updates and draw calls necessary per tile -* to the tile allocation BO. -*/ - __u64 bin_cl; + union { + /* Pointer to the binner command list. +* +* This is the first set of commands executed, which runs the +* coordinate shader to determine where primitives land on +* the screen, then writes out the state updates and draw calls +* necessary per tile to the tile allocation BO. 
+*/ + __u64 bin_cl; + + /* Pointer to an array of CL chunks. +* +* This is now the preferred way of passing optional attributes +* when submitting a job. +*/ + __u64 cl_chunks; + }; /* Pointer to the shader records. * @@ -120,8 +196,14 @@ struct drm_vc4_submit_cl { __u64 uniforms; __u64 bo_handles; - /* Size in bytes of the
[Mesa-dev] [PATCH 3/3] broadcom/vc4: Add support for HW perfmon
The V3D engine provides several perf counters. Implement ->get_driver_query_[group_]info() so that these counters are exposed through the GL_AMD_performance_monitor extension. Signed-off-by: Boris Brezillon --- src/gallium/drivers/vc4/vc4_context.h | 13 +++ src/gallium/drivers/vc4/vc4_job.c | 9 +- src/gallium/drivers/vc4/vc4_query.c | 197 -- src/gallium/drivers/vc4/vc4_screen.c | 7 ++ src/gallium/drivers/vc4/vc4_screen.h | 1 + 5 files changed, 215 insertions(+), 12 deletions(-) diff --git a/src/gallium/drivers/vc4/vc4_context.h b/src/gallium/drivers/vc4/vc4_context.h index 4a1e4093f1a0..b6d9f041efc7 100644 --- a/src/gallium/drivers/vc4/vc4_context.h +++ b/src/gallium/drivers/vc4/vc4_context.h @@ -309,6 +309,11 @@ struct vc4_job { struct vc4_job_key key; }; +struct vc4_hwperfmon { +uint32_t id; +uint64_t counters[DRM_VC4_MAX_PERF_COUNTERS]; +}; + struct vc4_context { struct pipe_context base; @@ -387,6 +392,8 @@ struct vc4_context { struct pipe_viewport_state viewport; struct vc4_constbuf_stateobj constbuf[PIPE_SHADER_TYPES]; struct vc4_vertexbuf_stateobj vertexbuf; + +struct vc4_hwperfmon *perfmon; /** @} */ }; @@ -444,6 +451,12 @@ vc4_sampler_state(struct pipe_sampler_state *psampler) return (struct vc4_sampler_state *)psampler; } +int vc4_get_driver_query_group_info(struct pipe_screen *pscreen, +unsigned index, +struct pipe_driver_query_group_info *info); +int vc4_get_driver_query_info(struct pipe_screen *pscreen, unsigned index, + struct pipe_driver_query_info *info); + struct pipe_context *vc4_context_create(struct pipe_screen *pscreen, void *priv, unsigned flags); void vc4_draw_init(struct pipe_context *pctx); diff --git a/src/gallium/drivers/vc4/vc4_job.c b/src/gallium/drivers/vc4/vc4_job.c index fb0c5bbc78cf..f75a32565603 100644 --- a/src/gallium/drivers/vc4/vc4_job.c +++ b/src/gallium/drivers/vc4/vc4_job.c @@ -362,7 +362,7 @@ vc4_submit_setup_rcl_msaa_surface(struct vc4_job *job, rsc->writes++; } -#define MAX_CHUNKS 1 +#define MAX_CHUNKS 2 /** * Submits the 
job to the kernel and then reinitializes it. @@ -467,6 +467,13 @@ vc4_job_submit(struct vc4_context *vc4, struct vc4_job *job) submit.uniforms = (uintptr_t)job->uniforms.base; submit.uniforms_size = cl_offset(&job->uniforms); +if (vc4->perfmon && screen->has_extended_cl) { +chunks[nchunks].perfmon.type = VC4_PERFMON_CHUNK; +chunks[nchunks].perfmon.id = vc4->perfmon->id; +chunks[nchunks].perfmon.pad = 0; +nchunks++; +} + if (nchunks) { submit.flags |= VC4_SUBMIT_CL_EXTENDED; submit.cl_chunks = (uintptr_t)chunks; diff --git a/src/gallium/drivers/vc4/vc4_query.c b/src/gallium/drivers/vc4/vc4_query.c index ddf8f8fb0c2c..d6b081bb15d7 100644 --- a/src/gallium/drivers/vc4/vc4_query.c +++ b/src/gallium/drivers/vc4/vc4_query.c @@ -32,49 +32,224 @@ struct vc4_query { -uint8_t pad; +unsigned num_queries; +struct vc4_hwperfmon *hwperfmon; }; +static const char *v3d_counter_names[] = { +"FEP-valid-primitives-no-rendered-pixels", +"FEP-valid-primitives-rendered-pixels", +"FEP-clipped-quads", +"FEP-valid-quads", +"TLB-quads-not-passing-stencil-test", +"TLB-quads-not-passing-z-and-stencil-test", +"TLB-quads-with-zero-coverage", +"TLB-quads-with-non-zero-coverage", +"TLB-quads-written-to-color-buffer", +"PTB-primitives-discarded-outside-viewport", +"PTB-primitives-need-clipping", +"PTB-primitives-discared-reversed", +"QPU-total-idle-clk-cycles", +"QPU-total-clk-cycles-vertex-coord-shading", +"QPU-total-clk-cycles-fragment-shading", +"QPU-total-clk-cycles-executing-valid-instr", +"QPU-total-clk-cycles-waiting-TMU", +"QPU-total-clk-cycles-waiting-scoreboard", +"QPU-total-clk-cycles-waiting-varyings", +"QPU-total-instr-cache-hit", +"QPU-total-instr-cache-miss", +"QPU-total-uniform-cache-hit", +"QPU-total-uniform-cache-miss", +"TMU-total-text-quads-processed", +"TMU-total-text-cache-miss", +"VPM-total-clk-cycles-VDW-stalled", +"VPM-total-clk-cycles-VCD-stalled", +"L2C-total-cache-hit", +"L2C-total
[Mesa-dev] [PATCH 0/3] broadcom/v4: Expose VC4 HW perf counters
Hello,

This series makes use of the VC4 perfmon ioctls to expose HW perf counters through the GL_AMD_performance_monitor interface.

These patches depend on their kernel counterparts and should not be applied until the kernel patches have reached the drm tree.

Regards,

Boris

Boris Brezillon (3):
  drm-uapi: Update vc4 header with CL chunks and perfmon related definitions
  broadcom/vc4: Add support for extended CL submission
  broadcom/vc4: Add support for HW perfmon

 include/drm-uapi/vc4_drm.h            | 156 +--
 src/gallium/drivers/vc4/vc4_context.h |  13 +++
 src/gallium/drivers/vc4/vc4_job.c     |  30 +-
 src/gallium/drivers/vc4/vc4_query.c   | 197 --
 src/gallium/drivers/vc4/vc4_screen.c  |   9 ++
 src/gallium/drivers/vc4/vc4_screen.h  |   2 +
 6 files changed, 384 insertions(+), 23 deletions(-)

-- 
2.11.0
[Mesa-dev] [PATCH v2 1/2] drm-uapi: Update vc4 header with perfmon related definitions
Signed-off-by: Boris Brezillon --- include/drm-uapi/vc4_drm.h | 67 ++ 1 file changed, 67 insertions(+) diff --git a/include/drm-uapi/vc4_drm.h b/include/drm-uapi/vc4_drm.h index 3415a4b71884..3686f451d779 100644 --- a/include/drm-uapi/vc4_drm.h +++ b/include/drm-uapi/vc4_drm.h @@ -42,6 +42,9 @@ extern "C" { #define DRM_VC4_GET_TILING0x09 #define DRM_VC4_LABEL_BO 0x0a #define DRM_VC4_GEM_MADVISE 0x0b +#define DRM_VC4_PERFMON_CREATE0x0c +#define DRM_VC4_PERFMON_DESTROY 0x0d +#define DRM_VC4_PERFMON_GET_VALUES0x0e #define DRM_IOCTL_VC4_SUBMIT_CL DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_SUBMIT_CL, struct drm_vc4_submit_cl) #define DRM_IOCTL_VC4_WAIT_SEQNO DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_WAIT_SEQNO, struct drm_vc4_wait_seqno) @@ -55,6 +58,9 @@ extern "C" { #define DRM_IOCTL_VC4_GET_TILING DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_GET_TILING, struct drm_vc4_get_tiling) #define DRM_IOCTL_VC4_LABEL_BODRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_LABEL_BO, struct drm_vc4_label_bo) #define DRM_IOCTL_VC4_GEM_MADVISE DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_GEM_MADVISE, struct drm_vc4_gem_madvise) +#define DRM_IOCTL_VC4_PERFMON_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_CREATE, struct drm_vc4_perfmon_create) +#define DRM_IOCTL_VC4_PERFMON_DESTROY DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_DESTROY, struct drm_vc4_perfmon_destroy) +#define DRM_IOCTL_VC4_PERFMON_GET_VALUES DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_GET_VALUES, struct drm_vc4_perfmon_get_values) struct drm_vc4_submit_rcl_surface { __u32 hindex; /* Handle index, or ~0 if not present. */ @@ -173,6 +179,15 @@ struct drm_vc4_submit_cl { * wait ioctl). */ __u64 seqno; + + /* ID of the perfmon to attach to this job. 0 means no perfmon. */ + __u32 perfmonid; + + /* Unused field to align this struct on 64 bits. Must be set to 0. +* If one ever needs to add an u32 field to this struct, this field +* can be used. 
+*/ + __u32 pad2; }; /** @@ -308,6 +323,7 @@ struct drm_vc4_get_hang_state { #define DRM_VC4_PARAM_SUPPORTS_THREADED_FS 5 #define DRM_VC4_PARAM_SUPPORTS_FIXED_RCL_ORDER 6 #define DRM_VC4_PARAM_SUPPORTS_MADVISE 7 +#define DRM_VC4_PARAM_SUPPORTS_PERFMON 8 struct drm_vc4_get_param { __u32 param; @@ -352,6 +368,57 @@ struct drm_vc4_gem_madvise { __u32 pad; }; +enum { + VC4_PERFCNT_FEP_VALID_PRIMS_NO_RENDER, + VC4_PERFCNT_FEP_VALID_PRIMS_RENDER, + VC4_PERFCNT_FEP_CLIPPED_QUADS, + VC4_PERFCNT_FEP_VALID_QUADS, + VC4_PERFCNT_TLB_QUADS_NOT_PASSING_STENCIL, + VC4_PERFCNT_TLB_QUADS_NOT_PASSING_Z_AND_STENCIL, + VC4_PERFCNT_TLB_QUADS_PASSING_Z_AND_STENCIL, + VC4_PERFCNT_TLB_QUADS_ZERO_COVERAGE, + VC4_PERFCNT_TLB_QUADS_NON_ZERO_COVERAGE, + VC4_PERFCNT_TLB_QUADS_WRITTEN_TO_COLOR_BUF, + VC4_PERFCNT_PLB_PRIMS_OUTSIDE_VIEWPORT, + VC4_PERFCNT_PLB_PRIMS_NEED_CLIPPING, + VC4_PERFCNT_PSE_PRIMS_REVERSED, + VC4_PERFCNT_QPU_TOTAL_IDLE_CYCLES, + VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_VERTEX_COORD_SHADING, + VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_FRAGMENT_SHADING, + VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_EXEC_VALID_INST, + VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_WAITING_TMUS, + VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_WAITING_SCOREBOARD, + VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_WAITING_VARYINGS, + VC4_PERFCNT_QPU_TOTAL_INST_CACHE_HIT, + VC4_PERFCNT_QPU_TOTAL_INST_CACHE_MISS, + VC4_PERFCNT_QPU_TOTAL_UNIFORM_CACHE_HIT, + VC4_PERFCNT_QPU_TOTAL_UNIFORM_CACHE_MISS, + VC4_PERFCNT_TMU_TOTAL_TEXT_QUADS_PROCESSED, + VC4_PERFCNT_TMU_TOTAL_TEXT_CACHE_MISS, + VC4_PERFCNT_VPM_TOTAL_CLK_CYCLES_VDW_STALLED, + VC4_PERFCNT_VPM_TOTAL_CLK_CYCLES_VCD_STALLED, + VC4_PERFCNT_L2C_TOTAL_L2_CACHE_HIT, + VC4_PERFCNT_L2C_TOTAL_L2_CACHE_MISS, + VC4_PERFCNT_NUM_EVENTS, +}; + +#define DRM_VC4_MAX_PERF_COUNTERS 16 + +struct drm_vc4_perfmon_create { + __u32 id; + __u32 ncounters; + __u8 events[DRM_VC4_MAX_PERF_COUNTERS]; +}; + +struct drm_vc4_perfmon_destroy { + __u32 id; +}; + +struct drm_vc4_perfmon_get_values { + __u32 id; + __u64 values_ptr; +}; + #if 
defined(__cplusplus) } #endif -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
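As a rough illustration of how userspace might fill the new create request, here is a minimal sketch. The struct is a local mirror of drm_vc4_perfmon_create from the patch above (using stdint types instead of __u32/__u8), build_perfmon_create() is a hypothetical helper that does not exist in Mesa, and the assumption that the kernel returns the perfmon ID in the id field is mine. The ioctl call itself is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_PERF_COUNTERS 16   /* mirrors DRM_VC4_MAX_PERF_COUNTERS */

/* Local mirror of struct drm_vc4_perfmon_create from the patch. */
struct perfmon_create_req {
    uint32_t id;          /* assumed to be written back by the kernel */
    uint32_t ncounters;
    uint8_t events[MAX_PERF_COUNTERS];
};

/*
 * Hypothetical helper: validate a list of event indices and pack it
 * into a request. num_events is the number of valid events (the
 * VC4_PERFCNT_NUM_EVENTS value in the header).
 * Returns 0 on success, -1 if the list is empty, too long, or
 * contains an out-of-range event.
 */
static int build_perfmon_create(struct perfmon_create_req *req,
                                const uint8_t *events, uint32_t n,
                                uint32_t num_events)
{
    if (n == 0 || n > MAX_PERF_COUNTERS)
        return -1;

    for (uint32_t i = 0; i < n; i++) {
        if (events[i] >= num_events)
            return -1;
    }

    /* Zero everything first so unused event slots are well defined. */
    memset(req, 0, sizeof(*req));
    req->ncounters = n;
    memcpy(req->events, events, n);
    return 0;
}
```

The request would then be passed to DRM_IOCTL_VC4_PERFMON_CREATE; that step is omitted here because it needs the kernel side of the series.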
[Mesa-dev] [PATCH v2 2/2] broadcom/vc4: Add support for HW perfmon
The V3D engine provides several perf counters. Implement ->get_driver_query_[group_]info() so that these counters are exposed through the GL_AMD_performance_monitor extension. Signed-off-by: Boris Brezillon --- Changes in v2 (all reported by Eric): - Add missing "TLB-quads-passing-z-and-stencil-test" perf counter - Make sure we wait for the results to be available before returning true in vc4_get_query_result() - Flush pending jobs in vc4_begin_query() and vc4_end_query() so that perf counters are not polluted by unrelated jobs - Reset the counters in vc4_begin_query() - Initialize ->group_id in vc4_get_driver_query_info() --- src/gallium/drivers/vc4/vc4_context.h | 18 +++ src/gallium/drivers/vc4/vc4_job.c | 7 ++ src/gallium/drivers/vc4/vc4_query.c | 228 -- src/gallium/drivers/vc4/vc4_screen.c | 7 ++ src/gallium/drivers/vc4/vc4_screen.h | 1 + 5 files changed, 249 insertions(+), 12 deletions(-) diff --git a/src/gallium/drivers/vc4/vc4_context.h b/src/gallium/drivers/vc4/vc4_context.h index 4a1e4093f1a0..41241d36a4bc 100644 --- a/src/gallium/drivers/vc4/vc4_context.h +++ b/src/gallium/drivers/vc4/vc4_context.h @@ -219,6 +219,13 @@ struct vc4_job_key { struct pipe_surface *zsbuf; }; +struct vc4_hwperfmon { +uint32_t id; +uint64_t last_seqno; +uint8_t events[DRM_VC4_MAX_PERF_COUNTERS]; +uint64_t counters[DRM_VC4_MAX_PERF_COUNTERS]; +}; + /** * A complete bin/render job. * @@ -306,6 +313,9 @@ struct vc4_job { /** Any flags to be passed in drm_vc4_submit_cl.flags. */ uint32_t flags; + /* Performance monitor attached to this job. 
*/ + struct vc4_hwperfmon *perfmon; + struct vc4_job_key key; }; @@ -387,6 +397,8 @@ struct vc4_context { struct pipe_viewport_state viewport; struct vc4_constbuf_stateobj constbuf[PIPE_SHADER_TYPES]; struct vc4_vertexbuf_stateobj vertexbuf; + +struct vc4_hwperfmon *perfmon; /** @} */ }; @@ -444,6 +456,12 @@ vc4_sampler_state(struct pipe_sampler_state *psampler) return (struct vc4_sampler_state *)psampler; } +int vc4_get_driver_query_group_info(struct pipe_screen *pscreen, +unsigned index, +struct pipe_driver_query_group_info *info); +int vc4_get_driver_query_info(struct pipe_screen *pscreen, unsigned index, + struct pipe_driver_query_info *info); + struct pipe_context *vc4_context_create(struct pipe_screen *pscreen, void *priv, unsigned flags); void vc4_draw_init(struct pipe_context *pctx); diff --git a/src/gallium/drivers/vc4/vc4_job.c b/src/gallium/drivers/vc4/vc4_job.c index 7fe20c16bad9..f0a59781b298 100644 --- a/src/gallium/drivers/vc4/vc4_job.c +++ b/src/gallium/drivers/vc4/vc4_job.c @@ -90,6 +90,9 @@ vc4_job_create(struct vc4_context *vc4) job->draw_max_x = 0; job->draw_max_y = 0; +if (vc4->perfmon) +job->perfmon = vc4->perfmon; + return job; } @@ -453,6 +456,8 @@ vc4_job_submit(struct vc4_context *vc4, struct vc4_job *job) submit.shader_rec_count = job->shader_rec_count; submit.uniforms = (uintptr_t)job->uniforms.base; submit.uniforms_size = cl_offset(&job->uniforms); + if (job->perfmon) + submit.perfmonid = job->perfmon->id; assert(job->draw_min_x != ~0 && job->draw_min_y != ~0); submit.min_x_tile = job->draw_min_x / job->tile_width; @@ -485,6 +490,8 @@ vc4_job_submit(struct vc4_context *vc4, struct vc4_job *job) warned = true; } else if (!ret) { vc4->last_emit_seqno = submit.seqno; +if (job->perfmon) +job->perfmon->last_seqno = submit.seqno; } } diff --git a/src/gallium/drivers/vc4/vc4_query.c b/src/gallium/drivers/vc4/vc4_query.c index ddf8f8fb0c2c..6e4681e93ccb 100644 --- a/src/gallium/drivers/vc4/vc4_query.c +++ b/src/gallium/drivers/vc4/vc4_query.c 
@@ -22,8 +22,9 @@ */ /** - * Stub support for occlusion queries. + * Expose V3D HW perf counters. * + * We also have code to fake support for occlusion queries. * Since we expose support for GL 2.0, we have to expose occlusion queries, * but the spec allows you to expose 0 query counter bits, so we just return 0 * as the result of all our queries. @@ -32,49 +33,252 @@ struct vc4_query { -uint8_t pad; +unsigned num_queries; +struct vc4_hwperfmon *hwperfmon; }; +static const char *v3d_counter_names[] = { +"FEP-valid-primitives-no-rendered-pixels", +"FEP-valid-primitives-rendered-pixels", +"FEP-clipped-quads", +&
[Mesa-dev] [PATCH v2 0/2] broadcom/vc4: Expose VC4 HW perf counters
Hello,

This series makes use of the VC4 perfmon ioctls to expose HW perf counters through the GL_AMD_performance_monitor interface.

These patches depend on their kernel counterparts and should not be applied until the kernel patches have reached the drm tree.

Regards,

Boris

Changes in v2:
- Drop the extended CL stuff
- Fix bugs reported by Eric (see changelog in patch 2)

Boris Brezillon (2):
  drm-uapi: Update vc4 header with perfmon related definitions
  broadcom/vc4: Add support for HW perfmon

 include/drm-uapi/vc4_drm.h            |  67 ++
 src/gallium/drivers/vc4/vc4_context.h |  18 +++
 src/gallium/drivers/vc4/vc4_job.c     |   7 ++
 src/gallium/drivers/vc4/vc4_query.c   | 228 --
 src/gallium/drivers/vc4/vc4_screen.c  |   7 ++
 src/gallium/drivers/vc4/vc4_screen.h  |   1 +
 6 files changed, 316 insertions(+), 12 deletions(-)

--
2.11.0
Re: [Mesa-dev] [PATCH v2 2/2] broadcom/vc4: Add support for HW perfmon
On Thu, 11 Jan 2018 16:42:45 -0800
Eric Anholt wrote:

> Boris Brezillon writes:
>
> > The V3D engine provides several perf counters.
> > Implement ->get_driver_query_[group_]info() so that these counters are
> > exposed through the GL_AMD_performance_monitor extension.
>
> This all looks good to me! I'm looking forward to the piglit tests,

I'm working on that.

> but
> this patch is:
>
> Reviewed-by: Eric Anholt
[Mesa-dev] [PATCH] vc4: Fix infinite retry in vc4_bo_alloc()
cleared_and_retried is always reset to false when jumping to the retry label, which leads to an infinite retry loop.

Fix that by moving the cleared_and_retried variable definition to the beginning of the function. While we're at it, move the create variable with the other local variables and explicitly reset its content in the retry path.

Signed-off-by: Boris Brezillon
---
 src/gallium/drivers/vc4/vc4_bufmgr.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/vc4/vc4_bufmgr.c b/src/gallium/drivers/vc4/vc4_bufmgr.c
index 12af7f8a9ef2..0653f8823232 100644
--- a/src/gallium/drivers/vc4/vc4_bufmgr.c
+++ b/src/gallium/drivers/vc4/vc4_bufmgr.c
@@ -123,6 +123,8 @@ vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, const char *name)
 struct vc4_bo *
 vc4_bo_alloc(struct vc4_screen *screen, uint32_t size, const char *name)
 {
+        bool cleared_and_retried = false;
+        struct drm_vc4_create_bo create;
         struct vc4_bo *bo;
         int ret;
 
@@ -149,12 +151,8 @@ vc4_bo_alloc(struct vc4_screen *screen, uint32_t size, const char *name)
         bo->private = true;
 
 retry:
-        ;
-
-        bool cleared_and_retried = false;
-        struct drm_vc4_create_bo create = {
-                .size = size
-        };
+        memset(&create, 0, sizeof(create));
+        create.size = size;
 
         ret = vc4_ioctl(screen->fd, DRM_IOCTL_VC4_CREATE_BO, &create);
         bo->handle = create.handle;
--
2.11.0
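The bug pattern can be reproduced outside the driver. The following is a minimal sketch, not the Mesa code: fake_create_ioctl() and fake_cache_clear() are hypothetical stand-ins for the create ioctl and the cache flush, and the two globals exist only to make the behaviour observable. The point is that the retry flag is initialised once, before the label, so the second failure ends the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the driver's create request. */
struct fake_create_bo {
    unsigned size;
    unsigned handle;
};

static int calls;          /* number of allocation attempts so far */
static int fail_first_n;   /* how many attempts should fail */

/* Hypothetical stand-in for the create ioctl. */
static int fake_create_ioctl(struct fake_create_bo *req)
{
    calls++;
    if (calls <= fail_first_n)
        return -1;
    req->handle = 42;
    return 0;
}

/* A real driver would drop cached buffers here. */
static void fake_cache_clear(void)
{
}

/* Returns the new handle, or 0 on failure. Retries at most once. */
static unsigned alloc_with_single_retry(unsigned size)
{
    /* Declared before the label: a jump back to retry does not reset it. */
    bool cleared_and_retried = false;
    struct fake_create_bo create;
    int ret;

retry:
    /* The request is rebuilt explicitly on every pass. */
    memset(&create, 0, sizeof(create));
    create.size = size;

    ret = fake_create_ioctl(&create);
    if (ret != 0) {
        if (!cleared_and_retried) {
            cleared_and_retried = true;
            fake_cache_clear();
            goto retry;
        }
        return 0;
    }

    return create.handle;
}
```

Had the flag been declared with an initializer after the label, each jump would set it back to false and a persistent failure would never terminate.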
[Mesa-dev] [PATCH] vc4: Mark BOs as purgeable when they enter the BO cache
This patch makes use of the DRM_IOCTL_VC4_GEM_MADVISE ioctl to mark all BOs placed in the mesa BO cache as purgeable so that the system can reclaim this memory under memory pressure. Signed-off-by: Boris Brezillon --- Hello, Note that this series depends on kernel code that has not been accepted yet and is just provided to show reviewers how the ioctl can be used and what to expect from it. Please do not consider this for inclusion in MESA until the kernel part has been accepted. I also lack a check to make sure the DRM_IOCTL_VC4_GEM_MADVISE ioctl is available before using it. Thanks, Boris --- src/gallium/drivers/vc4/vc4_bufmgr.c | 16 1 file changed, 16 insertions(+) diff --git a/src/gallium/drivers/vc4/vc4_bufmgr.c b/src/gallium/drivers/vc4/vc4_bufmgr.c index 0653f8823232..8ee37ac7010d 100644 --- a/src/gallium/drivers/vc4/vc4_bufmgr.c +++ b/src/gallium/drivers/vc4/vc4_bufmgr.c @@ -87,6 +87,16 @@ vc4_bo_remove_from_cache(struct vc4_bo_cache *cache, struct vc4_bo *bo) cache->bo_size -= bo->size; } +static int vc4_bo_purgeable(struct vc4_bo *bo, bool purgeable) +{ +struct drm_vc4_gem_madvise arg = { +.handle = bo->handle, +.madv = purgeable ? 
VC4_MADV_DONTNEED : VC4_MADV_WILLNEED, +}; + +return vc4_ioctl(bo->screen->fd, DRM_IOCTL_VC4_GEM_MADVISE, &arg); +} + static struct vc4_bo * vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, const char *name) { @@ -111,6 +121,11 @@ vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, const char *name) return NULL; } +if (vc4_bo_purgeable(bo, false)) { +mtx_unlock(&cache->lock); +return NULL; +} + pipe_reference_init(&bo->reference, 1); vc4_bo_remove_from_cache(cache, bo); @@ -296,6 +311,7 @@ vc4_bo_last_unreference_locked_timed(struct vc4_bo *bo, time_t time) cache->size_list_size = page_index + 1; } +vc4_bo_purgeable(bo, true); bo->free_time = time; list_addtail(&bo->size_list, &cache->size_list[page_index]); list_addtail(&bo->time_list, &cache->time_list); -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
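The flow in this patch can be sketched in isolation as follows. fake_madvise() stands in for the DRM_IOCTL_VC4_GEM_MADVISE call and simply fails on request; the has_madvise flag is one assumed way of adding the availability check that the note above says is still missing. All names here are hypothetical and none of them come from Mesa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_madv { SKETCH_WILLNEED, SKETCH_DONTNEED };

struct sketch_bo {
    bool purgeable;       /* what the fake kernel currently believes */
};

struct sketch_screen {
    bool has_madvise;     /* assumed capability flag */
    bool kernel_fails;    /* makes fake_madvise() fail, e.g. with ENOMEM */
};

/* Stand-in for the madvise ioctl. */
static int fake_madvise(struct sketch_screen *s, struct sketch_bo *bo,
                        enum sketch_madv madv)
{
    if (s->kernel_fails)
        return -1;
    bo->purgeable = (madv == SKETCH_DONTNEED);
    return 0;
}

/* Returns 0 on success or when madvise is unavailable, -1 on failure. */
static int bo_set_purgeable(struct sketch_screen *s, struct sketch_bo *bo,
                            bool purgeable)
{
    if (!s->has_madvise)
        return 0;
    return fake_madvise(s, bo,
                        purgeable ? SKETCH_DONTNEED : SKETCH_WILLNEED);
}

/* Entering the cache: best effort, the result is ignored as in the patch. */
static void cache_put(struct sketch_screen *s, struct sketch_bo *bo)
{
    (void)bo_set_purgeable(s, bo, true);
}

/* Leaving the cache: the BO is only handed out if it could be pinned again. */
static struct sketch_bo *cache_get(struct sketch_screen *s,
                                   struct sketch_bo *bo)
{
    if (bo_set_purgeable(s, bo, false) != 0)
        return NULL;
    return bo;
}
```

With has_madvise cleared, both paths degrade to the pre-patch behaviour, which is what one would want on kernels without the ioctl.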
Re: [Mesa-dev] [PATCH] vc4: Mark BOs as purgeable when they enter the BO cache
On Wed, 27 Sep 2017 14:50:10 +0100
Chris Wilson wrote:

> Quoting Boris Brezillon (2017-09-27 14:45:17)
> > static struct vc4_bo *
> > vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, const char *name)
> > {
> > @@ -111,6 +121,11 @@ vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, const char *name)
> >                 return NULL;
> >         }
> >
> > +        if (vc4_bo_purgeable(bo, false)) {
> > +                mtx_unlock(&cache->lock);
> > +                return NULL;
>
> So this would just mean that the bo was purged in the meantime. Why not
> just try to use the next one in the cache or allocate afresh?

No, this means the BO was purged and the kernel failed to allocate the memory back. We don't care about the retained status here because we don't need to restore the BO's content; that's why we're not checking arg.retained in vc4_bo_purgeable(). Allocating a fresh BO is likely to fail with the same ENOMEM error because both paths use the CMA memory.

> Not sure
> how way allocation failures are handled up the stack, but anyway this is
> not necessarily -ENOMEM.

Right, we should try all elements in the list to see if one of them is still around or if the kernel manages to get some memory back in the meantime.
Re: [Mesa-dev] [PATCH] vc4: Mark BOs as purgeable when they enter the BO cache
On Wed, 27 Sep 2017 15:24:16 +0100
Chris Wilson wrote:

> Quoting Boris Brezillon (2017-09-27 15:06:53)
> > On Wed, 27 Sep 2017 14:50:10 +0100
> > Chris Wilson wrote:
> >
> > > Quoting Boris Brezillon (2017-09-27 14:45:17)
> > > > static struct vc4_bo *
> > > > vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, const char *name)
> > > > {
> > > > @@ -111,6 +121,11 @@ vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, const char *name)
> > > >                 return NULL;
> > > >         }
> > > >
> > > > +        if (vc4_bo_purgeable(bo, false)) {
> > > > +                mtx_unlock(&cache->lock);
> > > > +                return NULL;
> > >
> > > So this would just mean that the bo was purged in the meantime. Why not
> > > just try to use the next one in the cache or allocate afresh?
> >
> > No, this means the BO was purged and the kernel failed to allocate the
> > memory back. We don't care about the retained status here, because we
> > don't need to restore BO's content, that's why we're not checking
> > arg.retained in vc4_bo_purgeable(). Allocating a fresh BO is likely to
> > fail with the same ENOMEM error because both path use the CMA mem.
>
> Hmm, you don't treat purging as permanent. But you do track the lose of
> contents, so retained is false?

vc4_bo_purgeable() is not reporting the retained status, it just reports whether the BO can be used or not. If you prefer, I can change the semantics of vc4_bo_purgeable() to return 1 if the BO content was retained, 0 if it was purged and -1 if the ioctl returned an error (ENOMEM). In the end, though, all I'll check here is 'vc4_bo_purgeable() >= 0', because I don't care about the retained status in this specific use case. All I care about is whether the BO can be re-used or not (IOW, is there a valid CMA region attached to the BO).

> I took a harder line, and said that userspace should recreate the object
> from scratch after it was purged. I thought that would be easier
> overall. But maybe not.:)

Well, maybe I'm wrong in how I implemented this DRM_IOCTL_VC4_GEM_MADVISE ioctl, but right now, when the BO has been purged and someone marks it back as unpurgeable, I try to re-allocate the BO's buffer in the ioctl path, and if the CMA allocation fails I return -ENOMEM. I could move the allocation into the fault handler, but this would result in pretty much the same behavior, except that it would require an extra page fault to realize the memory is not available, or force us to check the retained status and decide to release the BO object from the BO cache.

Note that I tried to stay as close as possible to the existing CMA-based-BO logic, where everything is allocated at creation time rather than on demand, hence the decision to allocate the CMA region in the ioctl path and not in the page-fault handler.

Regards,

Boris
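The alternative return convention mentioned in this reply could look roughly like the sketch below. Only the 1/0/-1 convention and the '>= 0' check come from the message; struct madvise_result and both function names are hypothetical and stand in for the ioctl result.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what the madvise ioctl reported. */
struct madvise_result {
    int err;          /* 0 on success, non-zero on failure (e.g. ENOMEM) */
    bool retained;    /* whether the old contents survived */
};

/* 1 if the contents were retained, 0 if purged, -1 on error. */
static int bo_unpurgeable_status(struct madvise_result r)
{
    if (r.err)
        return -1;
    return r.retained ? 1 : 0;
}

/*
 * The cache only needs to know whether backing memory is attached,
 * not whether the old contents survived, hence the >= 0 test.
 */
static bool bo_reusable(struct madvise_result r)
{
    return bo_unpurgeable_status(r) >= 0;
}
```

A caller that does need the old contents would test for a return value of 1 instead.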
Re: [Mesa-dev] [PATCH] vc4: Mark BOs as purgeable when they enter the BO cache
On Wed, 27 Sep 2017 10:15:23 -0700
Eric Anholt wrote:

> Boris Brezillon writes:
>
> > On Wed, 27 Sep 2017 15:24:16 +0100
> > Chris Wilson wrote:
> >
> >> Quoting Boris Brezillon (2017-09-27 15:06:53)
> >> > On Wed, 27 Sep 2017 14:50:10 +0100
> >> > Chris Wilson wrote:
> >> >
> >> > > Quoting Boris Brezillon (2017-09-27 14:45:17)
> >> > > > static struct vc4_bo *
> >> > > > vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, const char *name)
> >> > > > {
> >> > > > @@ -111,6 +121,11 @@ vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, const char *name)
> >> > > >                 return NULL;
> >> > > >         }
> >> > > >
> >> > > > +        if (vc4_bo_purgeable(bo, false)) {
> >> > > > +                mtx_unlock(&cache->lock);
> >> > > > +                return NULL;
> >> > >
> >> > > So this would just mean that the bo was purged in the meantime. Why not
> >> > > just try to use the next one in the cache or allocate afresh?
> >> >
> >> > No, this means the BO was purged and the kernel failed to allocate the
> >> > memory back. We don't care about the retained status here, because we
> >> > don't need to restore BO's content, that's why we're not checking
> >> > arg.retained in vc4_bo_purgeable(). Allocating a fresh BO is likely to
> >> > fail with the same ENOMEM error because both path use the CMA mem.
> >>
> >> Hmm, you don't treat purging as permanent. But you do track the lose of
> >> contents, so retained is false?
> >
> > vc4_bo_purgeable() is not reporting the retained status, it just
> > reports whether the BO can be used or not. I can change
> > vc4_bo_purgeable() semantic to return 1 if the BO content was retained,
> > 0 if it was purged and -1 if you the ioctl returned an error (ENOMEM)
> > if you prefer, but in the end, all I'll check here is
> > 'vc4_bo_purgeable() >= 0' because I don't don't care about the retained
> > status in this specific use case, all I care about is whether the BO can
> > be re-used or not (IOW, is there a valid CMA region attached to the BO).
> >
> >>
> >> I took a harder line, and said that userspace should recreate the object
> >> from scratch after it was purged. I thought that would be easier
> >> overall. But maybe not.:)
> >
> > Well, maybe I'm wrong in how I implemented this
> > DRM_IOCTL_VC4_GEM_MADVISE ioctl, but right now, when the BO has been
> > purged and someone marks it back as unpurgeable I'm trying to
> > re-allocate BO's buffer in the ioctl path, and if the CMA allocation
> > fails I return -ENOMEM. I could move the allocation in the fault
> > handler, but this would result in pretty much the same behavior except
> > it would require an extra page-fault to realize the memory is not
> > available or force us to check the retained status and decide to
> > release the BO object from the BO cache.
>
> Hmm. The downside I see to this plan is if we eventually decide to have
> the purge operation not clear all the BOs, then we would probably rather
> have userspace freeing objects that had been purged until it finds one
> in the cache that hadn't been purged, rather than forcing reallocation
> of this BO now (and possibly then purging something from elsewhere in
> the cache).

Okay, that's a good reason to move dma_alloc_wc() into the page-fault path. I need to change the implementation a bit to check the cma_gem->vaddr value instead of checking bo->madv != __VC4_MADV_PURGED, otherwise we might pass a non-allocated BO to the GPU/Display-Engine.
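The userspace behaviour Eric describes, freeing purged entries until one that was not purged turns up, can be sketched with a plain linked list. This is only an illustration under stated assumptions: struct cached_bo, its retained field (standing in for what the madvise call would report) and the freed marker are all hypothetical, and a real cache would free the entry rather than flag it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cached_bo {
    struct cached_bo *next;
    bool retained;   /* what the (fake) madvise call would report */
    bool freed;      /* set instead of calling free(), to keep this testable */
};

/*
 * Pop entries from the head of the list. Purged entries are released
 * and skipped; the first entry that still has its backing memory is
 * returned. Returns NULL when every cached entry had been purged, in
 * which case the caller would allocate a fresh BO.
 */
static struct cached_bo *cache_take_first_retained(struct cached_bo **head)
{
    while (*head) {
        struct cached_bo *bo = *head;

        /* Unlink the entry whether or not it turns out to be usable. */
        *head = bo->next;
        bo->next = NULL;

        if (bo->retained)
            return bo;

        bo->freed = true;   /* purged: release it and keep looking */
    }

    return NULL;
}
```

This avoids forcing the kernel to reallocate memory for a purged entry while a usable one may still be sitting further down the list.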
Re: [Mesa-dev] [PATCH] vc4: Mark BOs as purgeable when they enter the BO cache
On Wed, 27 Sep 2017 12:41:52 -0700 Eric Anholt wrote: > Boris Brezillon writes: > > > On Wed, 27 Sep 2017 10:15:23 -0700 > > Eric Anholt wrote: > > > >> Boris Brezillon writes: > >> > >> > On Wed, 27 Sep 2017 15:24:16 +0100 > >> > Chris Wilson wrote: > >> > > >> >> Quoting Boris Brezillon (2017-09-27 15:06:53) > >> >> > On Wed, 27 Sep 2017 14:50:10 +0100 > >> >> > Chris Wilson wrote: > >> >> > > >> >> > > Quoting Boris Brezillon (2017-09-27 14:45:17) > >> >> > > > static struct vc4_bo * > >> >> > > > vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, > >> >> > > > const char *name) > >> >> > > > { > >> >> > > > @@ -111,6 +121,11 @@ vc4_bo_from_cache(struct vc4_screen *screen, > >> >> > > > uint32_t size, const char *name) > >> >> > > > return NULL; > >> >> > > > } > >> >> > > > > >> >> > > > +if (vc4_bo_purgeable(bo, false)) { > >> >> > > > +mtx_unlock(&cache->lock); > >> >> > > > +return NULL; > >> >> > > > >> >> > > So this would just mean that the bo was purged in the meantime. Why > >> >> > > not > >> >> > > just try to use the next one in the cache or allocate afresh? > >> >> > > >> >> > No, this means the BO was purged and the kernel failed to allocate the > >> >> > memory back. We don't care about the retained status here, because we > >> >> > don't need to restore BO's content, that's why we're not checking > >> >> > arg.retained in vc4_bo_purgeable(). Allocating a fresh BO is likely to > >> >> > fail with the same ENOMEM error because both path use the CMA mem. > >> >> > > >> >> > >> >> Hmm, you don't treat purging as permanent. But you do track the lose of > >> >> contents, so retained is false? > >> > > >> > vc4_bo_purgeable() is not reporting the retained status, it just > >> > reports whether the BO can be used or not. 
I can change > >> > vc4_bo_purgeable() semantic to return 1 if the BO content was retained, > >> > 0 if it was purged and -1 if you the ioctl returned an error (ENOMEM) > >> > if you prefer, but in the end, all I'll check here is > >> > 'vc4_bo_purgeable() >= 0' because I don't don't care about the retained > >> > status in this specific use case, all I care about is whether the BO can > >> > be re-used or not (IOW, is there a valid CMA region attached to the BO). > >> > > >> >> > >> >> I took a harder line, and said that userspace should recreate the object > >> >> from scratch after it was purged. I thought that would be easier > >> >> overall. But maybe not.:) > >> > > >> > Well, maybe I'm wrong in how I implemented this > >> > DRM_IOCTL_VC4_GEM_MADVISE ioctl, but right now, when the BO has been > >> > purged and someone marks it back as unpurgeable I'm trying to > >> > re-allocate BO's buffer in the ioctl path, and if the CMA allocation > >> > fails I return -ENOMEM. I could move the allocation in the fault > >> > handler, but this would result in pretty much the same behavior except > >> > it would require an extra page-fault to realize the memory is not > >> > available or force us to check the retained status and decide to > >> > release the BO object from the BO cache. > >> > >> Hmm. The downside I see to this plan is if we eventually decide to have > >> the purge operation not clear all the BOs, then we would probably rather > >> have userspace freeing objects that had been purged until it finds one > >> in the cache that hadn't been purged, rather than forcing reallocation > >> of this BO now (and possibly then purging something from elsewhere in > >> the cache). > > > > Okay, that's a good reason to move dma_alloc_wc() in the page-fault > > path. I need to change a bit the implementation to check cma_gem->
Re: [Mesa-dev] [PATCH] vc4: Mark BOs as purgeable when they enter the BO cache
On Wed, 27 Sep 2017 22:03:15 +0200 Boris Brezillon wrote: > On Wed, 27 Sep 2017 12:41:52 -0700 > Eric Anholt wrote: > > > Boris Brezillon writes: > > > > > On Wed, 27 Sep 2017 10:15:23 -0700 > > > Eric Anholt wrote: > > > > > >> Boris Brezillon writes: > > >> > > >> > On Wed, 27 Sep 2017 15:24:16 +0100 > > >> > Chris Wilson wrote: > > >> > > > >> >> Quoting Boris Brezillon (2017-09-27 15:06:53) > > >> >> > On Wed, 27 Sep 2017 14:50:10 +0100 > > >> >> > Chris Wilson wrote: > > >> >> > > > >> >> > > Quoting Boris Brezillon (2017-09-27 14:45:17) > > >> >> > > > static struct vc4_bo * > > >> >> > > > vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, > > >> >> > > > const char *name) > > >> >> > > > { > > >> >> > > > @@ -111,6 +121,11 @@ vc4_bo_from_cache(struct vc4_screen > > >> >> > > > *screen, uint32_t size, const char *name) > > >> >> > > > return NULL; > > >> >> > > > } > > >> >> > > > > > >> >> > > > +if (vc4_bo_purgeable(bo, false)) { > > >> >> > > > +mtx_unlock(&cache->lock); > > >> >> > > > +return NULL; > > >> >> > > > > >> >> > > So this would just mean that the bo was purged in the meantime. > > >> >> > > Why not > > >> >> > > just try to use the next one in the cache or allocate afresh? > > >> >> > > > > >> >> > > > >> >> > No, this means the BO was purged and the kernel failed to allocate > > >> >> > the > > >> >> > memory back. We don't care about the retained status here, because > > >> >> > we > > >> >> > don't need to restore BO's content, that's why we're not checking > > >> >> > arg.retained in vc4_bo_purgeable(). Allocating a fresh BO is likely > > >> >> > to > > >> >> > fail with the same ENOMEM error because both path use the CMA mem. > > >> >> > > > >> >> > > >> >> Hmm, you don't treat purging as permanent. But you do track the lose > > >> >> of > > >> >> contents, so retained is false? > > >> > > > >> > vc4_bo_purgeable() is not reporting the retained status, it just > > >> > reports whether the BO can be used or not. 
I can change > > >> > vc4_bo_purgeable() semantic to return 1 if the BO content was retained, > > >> > 0 if it was purged and -1 if you the ioctl returned an error (ENOMEM) > > >> > if you prefer, but in the end, all I'll check here is > > >> > 'vc4_bo_purgeable() >= 0' because I don't don't care about the retained > > >> > status in this specific use case, all I care about is whether the BO > > >> > can > > >> > be re-used or not (IOW, is there a valid CMA region attached to the > > >> > BO). > > >> > > > >> >> > > >> >> I took a harder line, and said that userspace should recreate the > > >> >> object > > >> >> from scratch after it was purged. I thought that would be easier > > >> >> overall. But maybe not.:) > > >> > > > >> > Well, maybe I'm wrong in how I implemented this > > >> > DRM_IOCTL_VC4_GEM_MADVISE ioctl, but right now, when the BO has been > > >> > purged and someone marks it back as unpurgeable I'm trying to > > >> > re-allocate BO's buffer in the ioctl path, and if the CMA allocation > > >> > fails I return -ENOMEM. I could move the allocation in the fault > > >> > handler, but this would result in pretty much the same behavior except > > >> > it would require an extra page-fault to realize the memory is not > > >> > available or force us to check the retained status and decide to > > >> > release the
Re: [Mesa-dev] [PATCH] vc4: Mark BOs as purgeable when they enter the BO cache
On Wed, 27 Sep 2017 16:33:23 -0700 Eric Anholt wrote: > Boris Brezillon writes: > > > On Wed, 27 Sep 2017 12:41:52 -0700 > > Eric Anholt wrote: > > > >> Boris Brezillon writes: > >> > >> > On Wed, 27 Sep 2017 10:15:23 -0700 > >> > Eric Anholt wrote: > >> > > >> >> Boris Brezillon writes: > >> >> > >> >> > On Wed, 27 Sep 2017 15:24:16 +0100 > >> >> > Chris Wilson wrote: > >> >> > > >> >> >> Quoting Boris Brezillon (2017-09-27 15:06:53) > >> >> >> > On Wed, 27 Sep 2017 14:50:10 +0100 > >> >> >> > Chris Wilson wrote: > >> >> >> > > >> >> >> > > Quoting Boris Brezillon (2017-09-27 14:45:17) > >> >> >> > > > static struct vc4_bo * > >> >> >> > > > vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, > >> >> >> > > > const char *name) > >> >> >> > > > { > >> >> >> > > > @@ -111,6 +121,11 @@ vc4_bo_from_cache(struct vc4_screen > >> >> >> > > > *screen, uint32_t size, const char *name) > >> >> >> > > > return NULL; > >> >> >> > > > } > >> >> >> > > > > >> >> >> > > > +if (vc4_bo_purgeable(bo, false)) { > >> >> >> > > > +mtx_unlock(&cache->lock); > >> >> >> > > > +return NULL; > >> >> >> > > > >> >> >> > > So this would just mean that the bo was purged in the meantime. > >> >> >> > > Why not > >> >> >> > > just try to use the next one in the cache or allocate afresh? > >> >> >> > > > >> >> >> > > >> >> >> > No, this means the BO was purged and the kernel failed to allocate > >> >> >> > the > >> >> >> > memory back. We don't care about the retained status here, because > >> >> >> > we > >> >> >> > don't need to restore BO's content, that's why we're not checking > >> >> >> > arg.retained in vc4_bo_purgeable(). Allocating a fresh BO is > >> >> >> > likely to > >> >> >> > fail with the same ENOMEM error because both path use the CMA mem. > >> >> >> > > >> >> >> > >> >> >> Hmm, you don't treat purging as permanent. But you do track the lose > >> >> >> of > >> >> >> contents, so retained is false? 
> >> >> > > >> >> > vc4_bo_purgeable() is not reporting the retained status, it just > >> >> > reports whether the BO can be used or not. I can change > >> >> > vc4_bo_purgeable() semantic to return 1 if the BO content was > >> >> > retained, > >> >> > 0 if it was purged and -1 if you the ioctl returned an error (ENOMEM) > >> >> > if you prefer, but in the end, all I'll check here is > >> >> > 'vc4_bo_purgeable() >= 0' because I don't don't care about the > >> >> > retained > >> >> > status in this specific use case, all I care about is whether the BO > >> >> > can > >> >> > be re-used or not (IOW, is there a valid CMA region attached to the > >> >> > BO). > >> >> > > >> >> >> > >> >> >> I took a harder line, and said that userspace should recreate the > >> >> >> object > >> >> >> from scratch after it was purged. I thought that would be easier > >> >> >> overall. But maybe not.:) > >> >> > > >> >> > Well, maybe I'm wrong in how I implemented this > >> >> > DRM_IOCTL_VC4_GEM_MADVISE ioctl, but right now, when the BO has been > >> >> > purged and someone marks it back as unpurgeable I'm trying to > >> >> > re-allocate BO's buffer in the ioctl path, and if the CMA allocation > >> >> &g
[Mesa-dev] [PATCH v2] vc4: Mark BOs as purgeable when they enter the BO cache
This patch makes use of the DRM_IOCTL_VC4_GEM_MADVISE ioctl to mark all BOs placed in the mesa BO cache as purgeable so that the system can reclaim this memory under memory pressure. Signed-off-by: Boris Brezillon --- Hello, Note that this series depends on kernel code that has not been accepted yet and is just provided to show reviewers how the ioctl can be used and what to expect from it. Please do not consider this for inclusion in MESA until the kernel part has been accepted. Thanks, Boris --- Changes in v2: - Remove BOs from the cache when they've been purged by the kernel - Check whether the madvise ioctl is supported or not before using it --- src/gallium/drivers/vc4/vc4_bufmgr.c | 135 ++- src/gallium/drivers/vc4/vc4_screen.c | 2 + src/gallium/drivers/vc4/vc4_screen.h | 1 + 3 files changed, 89 insertions(+), 49 deletions(-) diff --git a/src/gallium/drivers/vc4/vc4_bufmgr.c b/src/gallium/drivers/vc4/vc4_bufmgr.c index 0653f8823232..0c13ad683b6d 100644 --- a/src/gallium/drivers/vc4/vc4_bufmgr.c +++ b/src/gallium/drivers/vc4/vc4_bufmgr.c @@ -87,34 +87,106 @@ vc4_bo_remove_from_cache(struct vc4_bo_cache *cache, struct vc4_bo *bo) cache->bo_size -= bo->size; } +static void vc4_bo_purgeable(struct vc4_bo *bo) +{ +struct drm_vc4_gem_madvise arg = { +.handle = bo->handle, +.madv = VC4_MADV_DONTNEED, +}; + + if (bo->screen->has_madvise) + vc4_ioctl(bo->screen->fd, DRM_IOCTL_VC4_GEM_MADVISE, &arg); +} + +static bool vc4_bo_unpurgeable(struct vc4_bo *bo) +{ +struct drm_vc4_gem_madvise arg = { +.handle = bo->handle, +.madv = VC4_MADV_WILLNEED, +}; + + if (!bo->screen->has_madvise) + return true; + + if (vc4_ioctl(bo->screen->fd, DRM_IOCTL_VC4_GEM_MADVISE, &arg)) + return false; + + return arg.retained; +} + +static void +vc4_bo_free(struct vc4_bo *bo) +{ +struct vc4_screen *screen = bo->screen; + +if (bo->map) { +if (using_vc4_simulator && bo->name && +strcmp(bo->name, "winsys") == 0) { +free(bo->map); +} else { +munmap(bo->map, bo->size); 
+VG(VALGRIND_FREELIKE_BLOCK(bo->map, 0)); +} +} + +struct drm_gem_close c; +memset(&c, 0, sizeof(c)); +c.handle = bo->handle; +int ret = vc4_ioctl(screen->fd, DRM_IOCTL_GEM_CLOSE, &c); +if (ret != 0) +fprintf(stderr, "close object %d: %s\n", bo->handle, strerror(errno)); + +screen->bo_count--; +screen->bo_size -= bo->size; + +if (dump_stats) { +fprintf(stderr, "Freed %s%s%dkb:\n", +bo->name ? bo->name : "", +bo->name ? " " : "", +bo->size / 1024); +vc4_bo_dump_stats(screen); +} + +free(bo); +} + static struct vc4_bo * vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, const char *name) { struct vc4_bo_cache *cache = &screen->bo_cache; uint32_t page_index = size / 4096 - 1; +struct vc4_bo *iter, *tmp, *bo = NULL; if (cache->size_list_size <= page_index) return NULL; -struct vc4_bo *bo = NULL; mtx_lock(&cache->lock); -if (!list_empty(&cache->size_list[page_index])) { -bo = LIST_ENTRY(struct vc4_bo, cache->size_list[page_index].next, -size_list); - -/* Check that the BO has gone idle. If not, then we want to - * allocate something new instead, since we assume that the - * user will proceed to CPU map it and fill it with stuff. + LIST_FOR_EACH_ENTRY_SAFE(iter, tmp, &cache->size_list[page_index], +size_list) { +/* Check that the BO has gone idle. If not, then we try the + * next one in the list, and if none of them are idle then + * we want to allocate something new instead, since we assume + * that the user will proceed to CPU map it and fill it with + * stuff. */ -if (!vc4_bo_wait(bo, 0, NULL)) { -mtx_unlock(&cache->lock); -return NULL; -} - +if (!vc4_bo_wait(iter, 0, NULL)) +continue; + +if (!vc4_bo_unpurgeable(iter)) { +/* The BO has been purged. Free it and try t
Re: [Mesa-dev] [PATCH v2] vc4: Mark BOs as purgeable when they enter the BO cache
On Thu, 05 Oct 2017 11:25:46 -0700 Eric Anholt wrote: > Boris Brezillon writes: > > > This patch makes use of the DRM_IOCTL_VC4_GEM_MADVISE ioctl to mark all > > BOs placed in the mesa BO cache as purgeable so that the system can > > reclaim this memory under memory pressure. > > > > Signed-off-by: Boris Brezillon > > --- > > Hello, > > > > Note that this series depends on kernel code that has not been accepted > > yet and is just provided to show reviewers how the ioctl can be used > > and what to expect from it. > > > > Please do not consider this for inclusion in MESA until the kernel part > > has been accepted. > > > > Thanks, > > > > Boris > > --- > > > > static struct vc4_bo * > > vc4_bo_from_cache(struct vc4_screen *screen, uint32_t size, const char > > *name) > > { > > struct vc4_bo_cache *cache = &screen->bo_cache; > > uint32_t page_index = size / 4096 - 1; > > +struct vc4_bo *iter, *tmp, *bo = NULL; > > > > if (cache->size_list_size <= page_index) > > return NULL; > > > > -struct vc4_bo *bo = NULL; > > mtx_lock(&cache->lock); > > -if (!list_empty(&cache->size_list[page_index])) { > > -bo = LIST_ENTRY(struct vc4_bo, > > cache->size_list[page_index].next, > > -size_list); > > - > > -/* Check that the BO has gone idle. If not, then we want > > to > > - * allocate something new instead, since we assume that the > > - * user will proceed to CPU map it and fill it with stuff. > > + LIST_FOR_EACH_ENTRY_SAFE(iter, tmp, &cache->size_list[page_index], > > +size_list) { > > +/* Check that the BO has gone idle. If not, then we try the > > + * next one in the list, and if none of them are idle then > > + * we want to allocate something new instead, since we > > assume > > + * that the user will proceed to CPU map it and fill it > > with > > + * stuff. 
> > */ > > -if (!vc4_bo_wait(bo, 0, NULL)) { > > -mtx_unlock(&cache->lock); > > -return NULL; > > -} > > - > > +if (!vc4_bo_wait(iter, 0, NULL)) > > +continue; > > Since things get pushed onto the list in the order they will become > available, we can just break when we get a busy one. Makes sense. > > Other than that, and needing a re-import of vc4_drm.h in > include/drm-uapi (see README), this patch is: It's already done [1], I just didn't post the patch to avoid polluting the ML with something that is not definitive yet (the kernel header might change after your review ;-)). > > Reviewed-by: Eric Anholt > > I don't think I'll get to review of the kernel side today -- it'll take > a bit more concentration than I have right now. > > > + > > +if (!vc4_bo_unpurgeable(iter)) { > > +/* The BO has been purged. Free it and try to find > > + * another one in the cache. > > + */ > > +vc4_bo_remove_from_cache(cache, iter); > > +vc4_bo_free(iter); > > +continue; > > + } > > + > > +bo = iter; > > pipe_reference_init(&bo->reference, 1); > > vc4_bo_remove_from_cache(cache, bo); > > > > bo->name = name; > > +break; > > } > > mtx_unlock(&cache->lock); > > return bo; [1]https://github.com/bbrezillon/mesa/commit/e48157273ab90d34fdeb4b1324077d63067e94bd ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
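The lookup discussed above, with Eric's "break on the first busy BO" suggestion applied, can be sketched in isolation. This is only a sketch under stated assumptions: `struct bo`, `bo_is_idle()`, `bo_unpurgeable()` and `bo_from_bucket()` are hypothetical stand-ins for `struct vc4_bo`, `vc4_bo_wait(bo, 0, NULL)`, `vc4_bo_unpurgeable()` and the loop in `vc4_bo_from_cache()`; the ioctls are mocked with plain flags, and the cache mutex is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vc4_bo: only what the scan needs. */
struct bo {
        struct bo *next;   /* next entry in the same size bucket */
        bool busy;         /* mock: the GPU is still using this BO */
        bool purged;       /* mock: the kernel reclaimed the backing pages */
        bool freed;        /* set where the real code would free the BO */
};

/* Mock of a zero-timeout wait: true if the BO has gone idle. */
static bool bo_is_idle(const struct bo *bo) { return !bo->busy; }

/* Mock of the WILLNEED madvise: true if the content was retained. */
static bool bo_unpurgeable(const struct bo *bo) { return !bo->purged; }

/*
 * Pop the first reusable BO from a bucket, assuming entries were pushed
 * in the order they become idle. Purged entries are dropped on the way.
 */
static struct bo *
bo_from_bucket(struct bo **head)
{
        assert(head != NULL);

        while (*head) {
                struct bo *iter = *head;

                /* Everything queued behind a busy BO is busy too. */
                if (!bo_is_idle(iter))
                        break;

                /* Unlink: the BO is either handed out or freed. */
                *head = iter->next;
                iter->next = NULL;

                if (!bo_unpurgeable(iter)) {
                        iter->freed = true;
                        continue;
                }

                return iter;
        }

        return NULL;
}
```

Returning NULL tells the caller to allocate a fresh BO, which matches the behaviour described in the patch comment.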
[Mesa-dev] [PATCH v4 2/5] dri_interface: add DRI2_BufferDamage interface
From: Daniel Stone Add a new DRI2_BufferDamage interface to support the EGL_KHR_partial_update extension, informing the driver of an overriding scissor region for a particular drawable. Based on a commit originally authored by: Harish Krupo renamed extension, retargeted at DRI drawable instead of context, rewritten description Signed-off-by: Boris Brezillon --- include/GL/internal/dri_interface.h | 43 + 1 file changed, 43 insertions(+) diff --git a/include/GL/internal/dri_interface.h b/include/GL/internal/dri_interface.h index af0ee9c56670..ada78c5d53d6 100644 --- a/include/GL/internal/dri_interface.h +++ b/include/GL/internal/dri_interface.h @@ -85,6 +85,7 @@ typedef struct __DRI2throttleExtensionRec __DRI2throttleExtension; typedef struct __DRI2fenceExtensionRec __DRI2fenceExtension; typedef struct __DRI2interopExtensionRec __DRI2interopExtension; typedef struct __DRI2blobExtensionRec __DRI2blobExtension; +typedef struct __DRI2bufferDamageExtensionRec __DRI2bufferDamageExtension; typedef struct __DRIimageLoaderExtensionRec __DRIimageLoaderExtension; typedef struct __DRIimageDriverExtensionRec __DRIimageDriverExtension; @@ -488,6 +489,48 @@ struct __DRI2interopExtensionRec { struct mesa_glinterop_export_out *out); }; + +/** + * Extension for limiting window system back buffer rendering to user-defined + * scissor region. + */ + +#define __DRI2_BUFFER_DAMAGE "DRI2_BufferDamage" +#define __DRI2_BUFFER_DAMAGE_VERSION 1 + +struct __DRI2bufferDamageExtensionRec { + __DRIextension base; + + /** +* Provides an array of rectangles representing an overriding scissor region +* for rendering operations performed to the specified drawable. These +* rectangles do not replace client API scissor regions or draw +* co-ordinates, but instead inform the driver of the overall bounds of all +* operations which will be issued before the next flush. +* +* Any rendering operations writing pixels outside this region to the +* drawable will have an undefined effect on the entire drawable. 
+    *
+    * This entrypoint may only be called after the drawable has either been
+    * newly created or flushed, and before any rendering operations which write
+    * pixels to the drawable. Calling this entrypoint at any other time will
+    * have an undefined effect on the entire drawable.
+    *
+    * Calling this entrypoint with @nrects 0 and @rects NULL will reset the
+    * region to the buffer's full size. This entrypoint may be called once to
+    * reset the region, followed by a second call with a populated region,
+    * before a rendering call is made.
+    *
+    * Used to implement EGL_KHR_partial_update.
+    *
+    * \param drawable affected drawable
+    * \param nrects   number of rectangles provided
+    * \param rects    the array of rectangles, lower-left origin
+    */
+   void (*set_damage_region)(__DRIdrawable *drawable, unsigned int nrects,
+                             int *rects);
+};
+
 /*@}*/
 
 /**
-- 
2.20.1
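To make the argument layout concrete, here is a minimal sketch of how a driver-side callback might turn the arguments into an overall bound. It assumes each rectangle is four ints (x, y, width, height), as in the EGL_KHR_partial_update rectangle list that the EGL layer passes through; `struct extent` and `damage_extent()` are hypothetical names and not part of the interface. No clamping to the buffer size is done here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounding region: [minx, maxx) x [miny, maxy). */
struct extent { int minx, miny, maxx, maxy; };

/*
 * Compute the region covering all damage rects.
 * rects holds nrects groups of {x, y, width, height}.
 */
static struct extent
damage_extent(unsigned int nrects, const int *rects, int width, int height)
{
        struct extent e = { 0, 0, width, height };

        /* nrects == 0 with rects == NULL resets to the full buffer. */
        if (nrects == 0)
                return e;

        assert(rects != NULL);

        e.minx = rects[0];
        e.miny = rects[1];
        e.maxx = rects[0] + rects[2];
        e.maxy = rects[1] + rects[3];

        for (unsigned int i = 1; i < nrects; i++) {
                const int *r = &rects[i * 4];

                if (r[0] < e.minx)
                        e.minx = r[0];
                if (r[1] < e.miny)
                        e.miny = r[1];
                if (r[0] + r[2] > e.maxx)
                        e.maxx = r[0] + r[2];
                if (r[1] + r[3] > e.maxy)
                        e.maxy = r[1] + r[3];
        }

        return e;
}
```

The reset case mirrors the documented call with a count of 0 and a NULL array, which the EGL layer issues after each swap.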
[Mesa-dev] [PATCH v4 5/5] panfrost: Add support for KHR_partial_update()
Implement ->set_damage_region() to support partial updates. This is a
dummy implementation in that it does not try to merge damage rects. It
also does not deal with distinct regions: it picks the largest quad as
the only damage rect and generates up to 4 reload rects out of it (the
left/right/top/bottom regions surrounding the biggest damage rect).

We also do not try to reduce the number of draws by passing all quad
vertices to the blit request (that would require extending u_blitter).

Signed-off-by: Boris Brezillon
---
 src/gallium/drivers/panfrost/pan_blit.c     | 14 ++---
 src/gallium/drivers/panfrost/pan_context.c  | 49 -
 src/gallium/drivers/panfrost/pan_job.c      | 11
 src/gallium/drivers/panfrost/pan_job.h      |  5 ++
 src/gallium/drivers/panfrost/pan_resource.c | 58 +
 src/gallium/drivers/panfrost/pan_resource.h | 12 -
 src/gallium/drivers/panfrost/pan_screen.c   |  1 +
 7 files changed, 141 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_blit.c b/src/gallium/drivers/panfrost/pan_blit.c
index 5859f92f9d1b..3a45277ee287 100644
--- a/src/gallium/drivers/panfrost/pan_blit.c
+++ b/src/gallium/drivers/panfrost/pan_blit.c
@@ -105,18 +105,18 @@ panfrost_blit(struct pipe_context *pipe,
  */
 
 void
-panfrost_blit_wallpaper(struct panfrost_context *ctx)
+panfrost_blit_wallpaper(struct panfrost_context *ctx, struct pipe_box *rect)
 {
         struct pipe_blit_info binfo = { };
 
         panfrost_blitter_save(ctx);
 
-        binfo.src.resource = binfo.dst.resource = ctx->pipe_framebuffer.cbufs[0]->texture;
-        binfo.src.level = binfo.dst.level = 0;
-        binfo.src.box.x = binfo.dst.box.x = 0;
-        binfo.src.box.y = binfo.dst.box.y = 0;
-        binfo.src.box.width = binfo.dst.box.width = ctx->pipe_framebuffer.width;
-        binfo.src.box.height = binfo.dst.box.height = ctx->pipe_framebuffer.height;
+        binfo.src.resource = binfo.dst.resource = ctx->pipe_framebuffer.cbufs[0]->texture;
+        binfo.src.level = binfo.dst.level = 0;
+        binfo.src.box.x = binfo.dst.box.x = rect->x;
+        binfo.src.box.y = binfo.dst.box.y = rect->y;
+binfo.src.box.width = binfo.dst.box.width = rect->width; +binfo.src.box.height = binfo.dst.box.height = rect->height; /* This avoids an assert due to missing nir_texop_txb support */ //binfo.src.box.depth = binfo.dst.box.depth = 1; diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index de6dd38c5566..c1075c6693e8 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -1528,7 +1528,54 @@ panfrost_draw_wallpaper(struct pipe_context *pipe) struct panfrost_job *batch = panfrost_get_job_for_fbo(ctx); ctx->wallpaper_batch = batch; -panfrost_blit_wallpaper(ctx); + +/* Clamp the rendering area to the damage extent. The + * KHR_partial_update() spec states that trying to render outside of + * the damage region is "undefined behavior", so we should be safe. + */ +panfrost_job_intersection_scissor(batch, rsrc->damage.extent.minx, + rsrc->damage.extent.miny, + rsrc->damage.extent.maxx, + rsrc->damage.extent.maxy); + +struct pipe_scissor_state damage; +struct pipe_box rects[4]; + +/* Clamp the damage box to the rendering area. */ +damage.minx = MAX2(batch->minx, rsrc->damage.biggest_rect.x); +damage.miny = MAX2(batch->miny, rsrc->damage.biggest_rect.y); +damage.maxx = MIN2(batch->maxx, + rsrc->damage.biggest_rect.x + + rsrc->damage.biggest_rect.width); +damage.maxy = MIN2(batch->maxy, + rsrc->damage.biggest_rect.y + + rsrc->damage.biggest_rect.height); + +/* One damage rectangle means we can end up with at most 4 reload + * regions: + * 1: left region, only exists if damage.x > 0 + * 2: right region, only exists if damage.x + damage.width < fb->width + * 3: top region, only exists if damage.y > 0. The intersection with + *the left and right regions are dropped + * 4: bottom region, only exists if damage.y + damage.height < fb->height. 
+ *The intersection with the left and right regions are dropped + */ +u_box_2d(batch->minx, batch->miny, damage.minx - batch->minx, + batch->maxy - batch->miny, &rects[0]); +u_box_2d(damage.maxx, batch->miny, batch->maxx - damage.maxx, + batch->maxy - batch->miny, &rects[1]); +u_box_2d(damage.minx
[Mesa-dev] [PATCH v4 0/5] EGL_KHR_partial_update support
This is an attempt at resurrecting Daniel's MR [1] which was already
resurrecting Harish's EGL_KHR_partial_update series [2].

This version implements Marek's suggestion to pass the
set_damage_region() request directly to the gallium driver and let it
decide how to handle it. Some drivers might just calculate the damage
extent (as done in Daniel's initial proposal and in the panfrost
implementation), others might do extra optimizations like trying to
reduce the area we're supposed to reload (only valid for tile-based
rendering) even further.

This patch series has been tested with weston (see Daniel's MR [3]) on
panfrost. Note that the panfrost implementation is rather simple (it
just limits the rendering area to the damage extent and picks the
biggest damage rect as the only damage region) but we can improve it if
we feel the need.

Any feedback and suggestions on how to do it better are welcome.

Regards,

Boris

[1]https://gitlab.freedesktop.org/mesa/mesa/merge_requests/227
[2]https://patchwork.freedesktop.org/series/45915/#rev2
[3]https://gitlab.freedesktop.org/wayland/weston/merge_requests/106

Boris Brezillon (1):
  panfrost: Add support for KHR_partial_update()

Daniel Stone (2):
  dri_interface: add DRI2_BufferDamage interface
  st/dri2: Implement DRI2bufferDamageExtension

Harish Krupo (2):
  egl/android: Delete set_damage_region from egl dri vtbl
  egl/dri: Use __DRI2_DAMAGE extension for KHR_partial_update

 include/GL/internal/dri_interface.h         | 43 +++
 src/egl/drivers/dri2/egl_dri2.c             | 54 +--
 src/egl/drivers/dri2/egl_dri2.h             |  5 +-
 src/egl/drivers/dri2/egl_dri2_fallbacks.h   |  9
 src/egl/drivers/dri2/platform_android.c     | 45
 src/egl/drivers/dri2/platform_device.c      |  1 -
 src/egl/drivers/dri2/platform_drm.c         |  1 -
 src/egl/drivers/dri2/platform_surfaceless.c |  1 -
 src/egl/drivers/dri2/platform_wayland.c     |  1 -
 src/egl/drivers/dri2/platform_x11.c         |  2 -
 src/egl/drivers/dri2/platform_x11_dri3.c    |  1 -
 src/gallium/drivers/panfrost/pan_blit.c     | 14 ++---
 src/gallium/drivers/panfrost/pan_context.c  | 49 -
 src/gallium/drivers/panfrost/pan_job.c      | 11
 src/gallium/drivers/panfrost/pan_job.h      |  5 ++
 src/gallium/drivers/panfrost/pan_resource.c | 58 +
 src/gallium/drivers/panfrost/pan_resource.h | 12 -
 src/gallium/drivers/panfrost/pan_screen.c   |  1 +
 src/gallium/include/pipe/p_screen.h         |  7 +++
 src/gallium/state_trackers/dri/dri2.c       | 22
 20 files changed, 263 insertions(+), 79 deletions(-)

-- 
2.20.1
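The "biggest damage rect plus reload regions" approach described for panfrost can be sketched on its own. This is a simplified illustration, not the driver code: `struct box`, `struct area`, `biggest_rect()` and `reload_boxes()` are hypothetical names, rectangles are assumed to be {x, y, width, height} groups of ints, and the damage rect passed to `reload_boxes()` is assumed to be already clamped to the framebuffer area.

```c
#include <assert.h>

struct box { int x, y, width, height; };
struct area { int minx, miny, maxx, maxy; };

/* Return the index of the damage rect with the largest area. */
static unsigned int
biggest_rect(unsigned int nrects, const int *rects)
{
        unsigned int best = 0;
        long best_area = -1;

        assert(nrects > 0);

        for (unsigned int i = 0; i < nrects; i++) {
                long a = (long)rects[i * 4 + 2] * rects[i * 4 + 3];

                if (a > best_area) {
                        best_area = a;
                        best = i;
                }
        }

        return best;
}

/*
 * Split the part of fb that lies outside keep into at most four
 * non-overlapping boxes: two full-height side bands, then the two
 * bands below and above keep, restricted to keep's x range.
 * Returns the number of non-empty boxes written to out.
 */
static unsigned int
reload_boxes(struct area fb, struct area keep, struct box out[4])
{
        const struct box cand[4] = {
                { fb.minx, fb.miny, keep.minx - fb.minx, fb.maxy - fb.miny },
                { keep.maxx, fb.miny, fb.maxx - keep.maxx, fb.maxy - fb.miny },
                { keep.minx, fb.miny, keep.maxx - keep.minx, keep.miny - fb.miny },
                { keep.minx, keep.maxy, keep.maxx - keep.minx, fb.maxy - keep.maxy },
        };
        unsigned int n = 0;

        for (unsigned int i = 0; i < 4; i++) {
                /* Skip bands that collapse when keep touches an edge. */
                if (cand[i].width <= 0 || cand[i].height <= 0)
                        continue;

                out[n++] = cand[i];
        }

        return n;
}
```

Since the four boxes and the kept rectangle tile the framebuffer area exactly, each pixel is either reloaded once or left for the application to redraw.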
[Mesa-dev] [PATCH v4 1/5] egl/android: Delete set_damage_region from egl dri vtbl
From: Harish Krupo

The intention of KHR_partial_update was not to send the damage back to
the platform but to send the damage to the driver to ensure that the
following rendering could be restricted to those regions. This patch
removes set_damage_region from the egl_dri vtbl and all the
platform_*.c files. Upcoming patches then add a new dri2 interface for
the drivers to implement.

Signed-off-by: Harish Krupo
Reviewed-by: Daniel Stone
Signed-off-by: Boris Brezillon
---
 src/egl/drivers/dri2/egl_dri2.c             |  3 +-
 src/egl/drivers/dri2/egl_dri2.h             |  4 --
 src/egl/drivers/dri2/egl_dri2_fallbacks.h   |  9 -
 src/egl/drivers/dri2/platform_android.c     | 45 -
 src/egl/drivers/dri2/platform_device.c      |  1 -
 src/egl/drivers/dri2/platform_drm.c         |  1 -
 src/egl/drivers/dri2/platform_surfaceless.c |  1 -
 src/egl/drivers/dri2/platform_wayland.c     |  1 -
 src/egl/drivers/dri2/platform_x11.c         |  2 -
 src/egl/drivers/dri2/platform_x11_dri3.c    |  1 -
 10 files changed, 1 insertion(+), 67 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index ee4faaab34f4..3c33b2cf27f8 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -1691,8 +1691,7 @@ static EGLBoolean
 dri2_set_damage_region(_EGLDriver *drv, _EGLDisplay *disp,
                        _EGLSurface *surf, EGLint *rects, EGLint n_rects)
 {
-   struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
-   return dri2_dpy->vtbl->set_damage_region(drv, disp, surf, rects, n_rects);
+   return false;
 }
 
 static EGLBoolean
diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index 943ff1808619..f9237dbe2559 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -119,10 +119,6 @@ struct dri2_egl_display_vtbl {
                                           _EGLSurface *surface,
                                           const EGLint *rects, EGLint n_rects);
 
-   EGLBoolean (*set_damage_region)(_EGLDriver *drv, _EGLDisplay *disp,
-                                   _EGLSurface *surface,
-                                   const EGLint *rects, EGLint n_rects);
-
    EGLBoolean (*swap_buffers_region)(_EGLDriver *drv,
_EGLDisplay *disp, _EGLSurface *surf, EGLint numRects, const EGLint *rects); diff --git a/src/egl/drivers/dri2/egl_dri2_fallbacks.h b/src/egl/drivers/dri2/egl_dri2_fallbacks.h index 6c2c4bbe595e..d975b7a8b130 100644 --- a/src/egl/drivers/dri2/egl_dri2_fallbacks.h +++ b/src/egl/drivers/dri2/egl_dri2_fallbacks.h @@ -62,7 +62,6 @@ dri2_fallback_swap_buffers_with_damage(_EGLDriver *drv, _EGLDisplay *disp, const EGLint *rects, EGLint n_rects) { struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); - dri2_dpy->vtbl->set_damage_region(drv, disp, surf, rects, n_rects); return dri2_dpy->vtbl->swap_buffers(drv, disp, surf); } @@ -90,14 +89,6 @@ dri2_fallback_copy_buffers(_EGLDriver *drv, _EGLDisplay *disp, return _eglError(EGL_BAD_NATIVE_PIXMAP, "no support for native pixmaps"); } -static inline EGLBoolean -dri2_fallback_set_damage_region(_EGLDriver *drv, _EGLDisplay *disp, -_EGLSurface *surf, -const EGLint *rects, EGLint n_rects) -{ - return EGL_FALSE; -} - static inline EGLint dri2_fallback_query_buffer_age(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf) diff --git a/src/egl/drivers/dri2/platform_android.c b/src/egl/drivers/dri2/platform_android.c index db6ba4a4b4d6..6ce04d250c8d 100644 --- a/src/egl/drivers/dri2/platform_android.c +++ b/src/egl/drivers/dri2/platform_android.c @@ -728,43 +728,6 @@ droid_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw) return EGL_TRUE; } -#if ANDROID_API_LEVEL >= 23 -static EGLBoolean -droid_set_damage_region(_EGLDriver *drv, -_EGLDisplay *disp, -_EGLSurface *draw, const EGLint* rects, EGLint n_rects) -{ - struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); - struct dri2_egl_surface *dri2_surf = dri2_egl_surface(draw); - android_native_rect_t* droid_rects = NULL; - int ret; - - if (n_rects == 0) - return EGL_TRUE; - - droid_rects = malloc(n_rects * sizeof(android_native_rect_t)); - if (droid_rects == NULL) - return _eglError(EGL_BAD_ALLOC, "eglSetDamageRegionKHR"); - - for (EGLint num_drects = 0; 
num_drects < n_rects; num_drects++) { - EGLint i = num_drects * 4; - droid_rects[num_drects].left = rects[i]; - droid_rects[num_drects].bottom = rects[i + 1]; - droid_rects[num_drects].right = rects[i] + rects[i + 2]; - droid_rects[num_drects].top = rects[i + 1] + rects[i + 3]; - } - - /* -* XXX
[Mesa-dev] [PATCH v4 3/5] egl/dri: Use __DRI2_DAMAGE extension for KHR_partial_update
From: Harish Krupo Use the DRI2 interface callback to pass the damage rects to the driver. Signed-off-by: Harish Krupo Signed-off-by: Boris Brezillon --- src/egl/drivers/dri2/egl_dri2.c | 55 ++--- src/egl/drivers/dri2/egl_dri2.h | 1 + 2 files changed, 51 insertions(+), 5 deletions(-) diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index 3c33b2cf27f8..fcafcfd73c63 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -452,6 +452,7 @@ static const struct dri2_extension_match optional_core_extensions[] = { { __DRI2_NO_ERROR, 1, offsetof(struct dri2_egl_display, no_error) }, { __DRI2_CONFIG_QUERY, 1, offsetof(struct dri2_egl_display, config) }, { __DRI2_FENCE, 1, offsetof(struct dri2_egl_display, fence) }, + { __DRI2_BUFFER_DAMAGE, 1, offsetof(struct dri2_egl_display, buffer_damage) }, { __DRI2_RENDERER_QUERY, 1, offsetof(struct dri2_egl_display, rendererQuery) }, { __DRI2_INTEROP, 1, offsetof(struct dri2_egl_display, interop) }, { __DRI_IMAGE, 1, offsetof(struct dri2_egl_display, image) }, @@ -721,6 +722,9 @@ dri2_setup_screen(_EGLDisplay *disp) if (dri2_dpy->flush_control) disp->Extensions.KHR_context_flush_control = EGL_TRUE; + + if (dri2_dpy->buffer_damage && dri2_dpy->buffer_damage->set_damage_region) + disp->Extensions.KHR_partial_update = EGL_TRUE; } void @@ -1658,11 +1662,22 @@ static EGLBoolean dri2_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf) { struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); + __DRIdrawable *dri_drawable = dri2_dpy->vtbl->get_dri_drawable(surf); _EGLContext *ctx = _eglGetCurrentContext(); + EGLBoolean ret; if (ctx && surf) dri2_surf_update_fence_fd(ctx, disp, surf); - return dri2_dpy->vtbl->swap_buffers(drv, disp, surf); + ret = dri2_dpy->vtbl->swap_buffers(drv, disp, surf); + + /* SwapBuffers marks the end of the frame; reset the damage region for +* use again next time. 
+*/ + if (ret && dri2_dpy->buffer_damage && + dri2_dpy->buffer_damage->set_damage_region) + dri2_dpy->buffer_damage->set_damage_region(dri_drawable, 0, NULL); + + return ret; } static EGLBoolean @@ -1671,12 +1686,23 @@ dri2_swap_buffers_with_damage(_EGLDriver *drv, _EGLDisplay *disp, const EGLint *rects, EGLint n_rects) { struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); + __DRIdrawable *dri_drawable = dri2_dpy->vtbl->get_dri_drawable(surf); _EGLContext *ctx = _eglGetCurrentContext(); + EGLBoolean ret; if (ctx && surf) dri2_surf_update_fence_fd(ctx, disp, surf); - return dri2_dpy->vtbl->swap_buffers_with_damage(drv, disp, surf, - rects, n_rects); + ret = dri2_dpy->vtbl->swap_buffers_with_damage(drv, disp, surf, + rects, n_rects); + + /* SwapBuffers marks the end of the frame; reset the damage region for +* use again next time. +*/ + if (ret && dri2_dpy->buffer_damage && + dri2_dpy->buffer_damage->set_damage_region) + dri2_dpy->buffer_damage->set_damage_region(dri_drawable, 0, NULL); + + return ret; } static EGLBoolean @@ -1684,14 +1710,33 @@ dri2_swap_buffers_region(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf, EGLint numRects, const EGLint *rects) { struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); - return dri2_dpy->vtbl->swap_buffers_region(drv, disp, surf, numRects, rects); + __DRIdrawable *dri_drawable = dri2_dpy->vtbl->get_dri_drawable(surf); + EGLBoolean ret; + + ret = dri2_dpy->vtbl->swap_buffers_region(drv, disp, surf, numRects, rects); + + /* SwapBuffers marks the end of the frame; reset the damage region for +* use again next time. 
+*/ + if (ret && dri2_dpy->buffer_damage && + dri2_dpy->buffer_damage->set_damage_region) + dri2_dpy->buffer_damage->set_damage_region(dri_drawable, 0, NULL); + + return ret; } static EGLBoolean dri2_set_damage_region(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf, EGLint *rects, EGLint n_rects) { - return false; + struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); + __DRIdrawable *drawable = dri2_dpy->vtbl->get_dri_drawable(surf); + + if (!dri2_dpy->buffer_damage || !dri2_dpy->buffer_damage->set_damage_region) + return EGL_FALSE; + + dri2_dpy->buffer_damage->set_damage_region(drawable, n_rects, rects); + return EGL_TRUE; } static EGLBoolean diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h index f9237dbe2559..00bce2285d3e 100644 --- a/src/e
[Mesa-dev] [PATCH v4 4/5] st/dri2: Implement DRI2bufferDamageExtension
From: Daniel Stone Add a pipe_screen->set_damage_region() hook to propagate set-damage-region requests to the driver, it's then up to the driver to decide what to do with this piece of information. If the hook is left unassigned, the buffer-damage extension is considered unsupported. Signed-off-by: Daniel Stone Signed-off-by: Boris Brezillon --- src/gallium/include/pipe/p_screen.h | 7 +++ src/gallium/state_trackers/dri/dri2.c | 22 ++ 2 files changed, 29 insertions(+) diff --git a/src/gallium/include/pipe/p_screen.h b/src/gallium/include/pipe/p_screen.h index 3f9bad470950..8df12ee4f865 100644 --- a/src/gallium/include/pipe/p_screen.h +++ b/src/gallium/include/pipe/p_screen.h @@ -464,6 +464,13 @@ struct pipe_screen { bool (*is_parallel_shader_compilation_finished)(struct pipe_screen *screen, void *shader, unsigned shader_type); + + /** +* Set damage region. +*/ + void (*set_damage_region)(struct pipe_screen *screen, + struct pipe_resource *resource, + unsigned int nrects, int *rects); }; diff --git a/src/gallium/state_trackers/dri/dri2.c b/src/gallium/state_trackers/dri/dri2.c index 5caaa9deac41..df22e7c41642 100644 --- a/src/gallium/state_trackers/dri/dri2.c +++ b/src/gallium/state_trackers/dri/dri2.c @@ -1806,6 +1806,23 @@ static const __DRI2interopExtension dri2InteropExtension = { .export_object = dri2_interop_export_object }; +/** + * \brief the DRI2bufferDamageExtension set_damage_region method + */ +static void +dri2_set_damage_region(__DRIdrawable *dPriv, unsigned int nrects, int *rects) +{ + struct dri_drawable *drawable = dri_drawable(dPriv); + struct pipe_resource *resource = drawable->textures[ST_ATTACHMENT_BACK_LEFT]; + struct pipe_screen *screen = resource->screen; + + screen->set_damage_region(screen, resource, nrects, rects); +} + +static __DRI2bufferDamageExtension dri2BufferDamageExtension = { + .base = { __DRI2_BUFFER_DAMAGE, 1 }, +}; + /** * \brief the DRI2ConfigQueryExtension configQueryb method */ @@ -1907,6 +1924,7 @@ static const 
__DRIextension *dri_screen_extensions[] = {
     &dri2GalliumConfigQueryExtension.base,
     &dri2ThrottleExtension.base,
     &dri2FenceExtension.base,
+    &dri2BufferDamageExtension.base,
     &dri2InteropExtension.base,
     &dri2NoErrorExtension.base,
     &driBlobExtension.base,
@@ -1922,6 +1940,7 @@ static const __DRIextension *dri_robust_screen_extensions[] = {
     &dri2ThrottleExtension.base,
     &dri2FenceExtension.base,
     &dri2InteropExtension.base,
+    &dri2BufferDamageExtension.base,
     &dri2Robustness.base,
     &dri2NoErrorExtension.base,
     &driBlobExtension.base,
@@ -1982,6 +2001,9 @@ dri2_init_screen(__DRIscreen * sPriv)
        }
     }
 
+    if (pscreen->set_damage_region)
+       dri2BufferDamageExtension.set_damage_region = dri2_set_damage_region;
+
     if (pscreen->get_param(pscreen, PIPE_CAP_DEVICE_RESET_STATUS_QUERY)) {
        sPriv->extensions = dri_robust_screen_extensions;
        screen->has_reset_status_query = true;
-- 
2.20.1
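The "optional hook" wiring in this patch can be shown in isolation. The sketch below uses mock types only: `struct screen`, `struct drawable`, `struct damage_ext`, `init_ext()` and `partial_update_supported()` are hypothetical stand-ins for pipe_screen, the DRI drawable, __DRI2bufferDamageExtension, the check in dri2_init_screen() and the loader-side test, respectively. It assumes, as the patch series does, that a NULL function pointer means "unsupported".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock resource: records the last damage request it received. */
struct resource { unsigned int last_nrects; };

/* Mock driver screen: the hook is optional and may stay NULL. */
struct screen {
        void (*set_damage_region)(struct screen *screen,
                                  struct resource *resource,
                                  unsigned int nrects, int *rects);
};

struct drawable {
        struct screen *screen;
        struct resource back;
};

/* Mock extension: exposed to the loader, filled in at init time. */
struct damage_ext {
        void (*set_damage_region)(struct drawable *drawable,
                                  unsigned int nrects, int *rects);
};

/* Frontend entry point: forwards the request to the driver hook. */
static void
frontend_set_damage_region(struct drawable *d, unsigned int nrects, int *rects)
{
        assert(d->screen->set_damage_region != NULL);
        d->screen->set_damage_region(d->screen, &d->back, nrects, rects);
}

/* Only populate the extension when the driver implements the hook. */
static void
init_ext(struct damage_ext *ext, const struct screen *s)
{
        ext->set_damage_region =
                s->set_damage_region ? frontend_set_damage_region : NULL;
}

/* Loader-side check: a NULL entry point means no partial update. */
static bool
partial_update_supported(const struct damage_ext *ext)
{
        return ext != NULL && ext->set_damage_region != NULL;
}

/* Example driver hook, standing in for a real implementation. */
static void
driver_set_damage_region(struct screen *s, struct resource *r,
                         unsigned int nrects, int *rects)
{
        (void)s;
        (void)rects;
        r->last_nrects = nrects;
}
```

One design point worth noting: in the patch the extension struct is a single non-const global that is filled in during screen init, whereas this sketch passes the struct explicitly to keep the example self-contained.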
Re: [Mesa-dev] [PATCH v4 2/5] dri_interface: add DRI2_BufferDamage interface
On Tue, 25 Jun 2019 18:37:46 +0200 Boris Brezillon wrote: > From: Daniel Stone > > Add a new DRI2_BufferDamage interface to support the > EGL_KHR_partial_update extension, informing the driver of an overriding > scissor region for a particular drawable. > > Based on a commit originally authored by: > Harish Krupo > renamed extension, retargeted at DRI drawable instead of context, > rewritten description > Oops, this patch is missing Daniel's SoB. > Signed-off-by: Boris Brezillon > --- > include/GL/internal/dri_interface.h | 43 + > 1 file changed, 43 insertions(+) > > diff --git a/include/GL/internal/dri_interface.h > b/include/GL/internal/dri_interface.h > index af0ee9c56670..ada78c5d53d6 100644 > --- a/include/GL/internal/dri_interface.h > +++ b/include/GL/internal/dri_interface.h > @@ -85,6 +85,7 @@ typedef struct __DRI2throttleExtensionRec > __DRI2throttleExtension; > typedef struct __DRI2fenceExtensionRec __DRI2fenceExtension; > typedef struct __DRI2interopExtensionRec __DRI2interopExtension; > typedef struct __DRI2blobExtensionRec __DRI2blobExtension; > +typedef struct __DRI2bufferDamageExtensionRec __DRI2bufferDamageExtension; > > typedef struct __DRIimageLoaderExtensionRec __DRIimageLoaderExtension; > typedef struct __DRIimageDriverExtensionRec __DRIimageDriverExtension; > @@ -488,6 +489,48 @@ struct __DRI2interopExtensionRec { > struct mesa_glinterop_export_out *out); > }; > > + > +/** > + * Extension for limiting window system back buffer rendering to user-defined > + * scissor region. > + */ > + > +#define __DRI2_BUFFER_DAMAGE "DRI2_BufferDamage" > +#define __DRI2_BUFFER_DAMAGE_VERSION 1 > + > +struct __DRI2bufferDamageExtensionRec { > + __DRIextension base; > + > + /** > +* Provides an array of rectangles representing an overriding scissor > region > +* for rendering operations performed to the specified drawable. 
These > +* rectangles do not replace client API scissor regions or draw > +* co-ordinates, but instead inform the driver of the overall bounds of > all > +* operations which will be issued before the next flush. > +* > +* Any rendering operations writing pixels outside this region to the > +* drawable will have an undefined effect on the entire drawable. > +* > +* This entrypoint may only be called after the drawable has been either > been > +* newly created or flushed, and before any rendering operations which > write > +* pixels to the drawable. Calling this entrypoint at any other time will > +* have an undefined effect on the entire drawable. > +* > +* Calling this entrypoint with @size 0 and @rects NULL will reset the > +* region to the buffer's full size. This entrypoint may be called once to > +* reset the region, followed by a second call with a populated region, > +* before a rendering call is made. > +* > +* Used to implement EGL_KHR_partial_update. > +* > +* \param drawable affected drawable > +* \param size number of rectangles provided > +* \param rectsthe array of rectangles, lower-left origin > +*/ > + void (*set_damage_region)(__DRIdrawable *drawable, unsigned int nrects, > + int *rects); > +}; > + > /*@}*/ > > /** ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] panfrost: Remove unneeded check in panfrost_scissor_culls_everything()
The ss local var is guaranteed to be != NULL. Get rid of this useless
check.

Signed-off-by: Boris Brezillon
---
 src/gallium/drivers/panfrost/pan_context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c
index c1075c6693e8..f3768240e7ad 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -1675,7 +1675,7 @@ panfrost_scissor_culls_everything(struct panfrost_context *ctx)
 
         /* Check if we're scissoring at all */
 
-        if (!(ss && ctx->rasterizer && ctx->rasterizer->base.scissor))
+        if (!(ctx->rasterizer && ctx->rasterizer->base.scissor))
                 return false;
 
         return (ss->minx == ss->maxx) && (ss->miny == ss->maxy);
-- 
2.20.1
Re: [Mesa-dev] [PATCH v4 5/5] panfrost: Add support for KHR_partial_update()
On Wed, 26 Jun 2019 10:51:23 -0700
Alyssa Rosenzweig wrote:

> Overall, I'm quite happy with how this turns out; I was fearful it would
> be a lot more complicated, though there's always time for that ;)

Same here. I thought it would be more complicated than that, but it
turned out to be pretty simple (mainly because I didn't try too hard to
discard as much of the wallpapering area as theoretically possible).

>
> Some specific comments follow: (mostly minor):
> ---
>
> > +panfrost_blit_wallpaper(struct panfrost_context *ctx, struct pipe_box *rect)
> >  {
> >          struct pipe_blit_info binfo = { };
> >
> >          panfrost_blitter_save(ctx);
> >
> > -        binfo.src.resource = binfo.dst.resource = ctx->pipe_framebuffer.cbufs[0]->texture;
> > -        binfo.src.level = binfo.dst.level = 0;
> > -        binfo.src.box.x = binfo.dst.box.x = 0;
> > -        binfo.src.box.y = binfo.dst.box.y = 0;
> > -        binfo.src.box.width = binfo.dst.box.width = ctx->pipe_framebuffer.width;
> > -        binfo.src.box.height = binfo.dst.box.height = ctx->pipe_framebuffer.height;
> > +        binfo.src.resource = binfo.dst.resource = ctx->pipe_framebuffer.cbufs[0]->texture;
> > +        binfo.src.level = binfo.dst.level = 0;
> > +        binfo.src.box.x = binfo.dst.box.x = rect->x;
> > +        binfo.src.box.y = binfo.dst.box.y = rect->y;
> > +        binfo.src.box.width = binfo.dst.box.width = rect->width;
> > +        binfo.src.box.height = binfo.dst.box.height = rect->height;
> >
> >          /* This avoids an assert due to missing nir_texop_txb support */
> >          //binfo.src.box.depth = binfo.dst.box.depth = 1;
>
> This will need to be rebased in a slightly messy way, since
> panfrost_blit_wallpaper was edited pretty heavily in the mipmap series
> that just landed. Sorry for the conflicts, although conceptually this
> looks good.

No problem, I'll take care of that (not the first time I rebase the
patch series BTW).

>
> Have you considered if this interacts with mipmapping, by the way? I
> suppose surfaces that get partial updates *by definition* are not
> mipmapped, so that's an easy "who cares?" :)

Daniel already replied to that one.

>
> > +u_box_2d(batch->minx, batch->miny, damage.minx - batch->minx,
> > +         batch->maxy - batch->miny, &rects[0]);
> > +u_box_2d(damage.maxx, batch->miny, batch->maxx - damage.maxx,
> > +         batch->maxy - batch->miny, &rects[1]);
> > +u_box_2d(damage.minx, batch->miny, damage.maxx - damage.minx,
> > +         damage.miny - batch->miny, &rects[2]);
> > +u_box_2d(damage.minx, damage.maxy, damage.maxx - damage.minx,
> > +         batch->maxy - damage.maxy, &rects[3]);
> > +
> > +for (unsigned i = 0; i < 4; i++) {
> > +        if (!rects[i].width || !rects[i].height)
> > +                continue;
>
> This 'if' statement seems a little magic. Does u_box_2d clamp
> width/height positive automatically? Is it possible to get negative
> width/height? If the answer is "yes; no" respectively, which seems to be
> how the code works, maybe add a quick comment explaining that.

I'll add a comment to explain that.

>
> > +/* We set the damage extent to the full resource size but keep the
> > + * damage box empty so that the FB content is reloaded by default.
> > + */
>
> English, please? Francais, s'il te plait? I'm not too familiar with
> winsys or the extension -- what's the difference between damage extent
> and damage box?

Yeah, reading the comment again I realize it's not clear at all. The
damage extent is the quad covering all damage rects (even if they don't
intersect or only partially intersect). The damage box is actually the
biggest damage rect (rect1 in the following example):

[ASCII diagram garbled in the archive: rect 1, rect 2 and rect 3 drawn
inside one enclosing box labeled "damage extent"]

>
> > +/* Looks like aligning on a tile is not enough, but aligning on
> > + * twice the tile size works.
> > + */
> > +ss.minx = rect[0] & ~((MALI_TILE_LENGTH * 2) - 1);
> > +ss.miny = y & ~((MALI_TILE_LENGTH * 2) - 1);
> > +ss.maxx = MIN2(ALIGN(rect[0] + rect[2], MALI_TILE_LENGTH * 2),
> > +               res->width0);
> > +ss.maxy = MIN2(ALIGN(y + rect[3], MALI_TILE_LENGTH * 2),
> > +               res->height0);
>
> If aligning to 32x32 but not 16x16 works, that's probably masking over a
> bug somewhere else in the code.

I wish I could come up with a better explanation, but I couldn't find
anything explaining why this alignment is required or spot any o
Re: [Mesa-dev] [PATCH v4 5/5] panfrost: Add support for KHR_partial_update()
On Wed, 26 Jun 2019 12:19:28 -0700
Alyssa Rosenzweig wrote:

> > > > +/* We set the damage extent to the full resource size but keep the
> > > > + * damage box empty so that the FB content is reloaded by default.
> > > > + */
> > >
> > > English, please? Francais, s'il te plait? I'm not too familiar with
> > > winsys or the extension -- what's the difference between damage extent
> > > and damage box?
> >
> > Yeah, reading the comment again I realize it's not clear at all. The
> > damage extent is the quad covering all damage rects (even if they don't
> > intersect or only partially intersect). The damage box is actually the
> > biggest damage rect (rect1 in the following example):
> >
> > [ASCII diagram garbled in the archive: rect 1, rect 2 and rect 3 drawn
> > inside one enclosing box labeled "damage extent"]
>
> Hmm, ok.
>
> > > If aligning to 32x32 but not 16x16 works, that's probably masking over a
> > > bug somewhere else in the code.
> >
> > I wish I could come up with a better explanation, but I couldn't find
> > anything explaining why this alignment is required or spot any
> > obvious bugs in the code :-/.
>
> Hmm, what was the symptom?

The symptom is, black areas around the damage rect when the rendering
area (the area you define in mali_payload_fragment) is not
32x32-aligned. If you want to test it, remove the "* 2" in the code and
run weston+desktop-shell (the partial_update() logic has been merged
earlier today, so you just have to build master) and start a terminal.

> Also, maybe try with the commit I pushed to
> the `tile-aligned` branch on gitlab.fd.o/alyssa/mesa

Will try it. Thanks.

>
> > Oops, I fear intersection of non-32x32-aligned regions is not safe, it's
> > just that I didn't test this case :-). Note that union would not be
> > a problem here, because the intersection is applied last (just before
> > drawing the wallpaper). That's only true if we assume the intersection
> > func aligns things on 32x32 pixels of course (which is not the case
> > right now).
>
> I'm not sure I follow why the intersection of, say, 16x16 aligned
> regions (that are not also 32x32 aligned) shouldn't work?

I'm not saying it shouldn't work, just saying it doesn't work in
practice :P.

> 16x16 *is* the
> tile size for all intents and purposes here; 32x32 only comes up in
> hierarchical tiling which is irrelevant to this code (and in fact, we
> can disable hierarchical tiling entirely if I need to make my point even
> clearer ;P Just poking fun)

As I said, I wish I had a better understanding of the issue, but that's
not the case :-(.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
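For readers unfamiliar with the two terms discussed in this message, here is a minimal sketch of the distinction, written under assumptions: `struct rect`, `damage_extent()` and `biggest_damage_rect()` are hypothetical names made up for the illustration and are not Mesa APIs. The extent is the bounding box of all damage rects, while the damage box, as described above, is simply the largest single rect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rect type for this sketch: min inclusive, max exclusive. */
struct rect {
        int minx, miny, maxx, maxy;
};

static int rect_area(const struct rect *r)
{
        return (r->maxx - r->minx) * (r->maxy - r->miny);
}

/* Damage extent: the smallest rect that covers every damage rect,
 * whether or not the rects intersect each other. */
static struct rect damage_extent(const struct rect *rects, size_t count)
{
        assert(count > 0);

        struct rect ext = rects[0];
        for (size_t i = 1; i < count; i++) {
                if (rects[i].minx < ext.minx) ext.minx = rects[i].minx;
                if (rects[i].miny < ext.miny) ext.miny = rects[i].miny;
                if (rects[i].maxx > ext.maxx) ext.maxx = rects[i].maxx;
                if (rects[i].maxy > ext.maxy) ext.maxy = rects[i].maxy;
        }
        return ext;
}

/* Damage box as described in the thread: the single biggest damage
 * rect. Returns its index in the array. */
static size_t biggest_damage_rect(const struct rect *rects, size_t count)
{
        assert(count > 0);

        size_t best = 0;
        for (size_t i = 1; i < count; i++) {
                if (rect_area(&rects[i]) > rect_area(&rects[best]))
                        best = i;
        }
        return best;
}
```

With three rects, the extent can be much larger than any individual rect, which is why reloading only the area outside the biggest rect is a cheap approximation rather than an exact one.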
Re: [Mesa-dev] [PATCH v4 5/5] panfrost: Add support for KHR_partial_update()
On Wed, 26 Jun 2019 14:03:45 -0700
Alyssa Rosenzweig wrote:

> > The symptom is, black areas around the damage rect when the rendering
> > area (the area you define in mali_payload_fragment) is not
> > 32x32-aligned. If you want to test it, remove the "* 2" in the code and
> > run weston+desktop-shell (the partial_update() logic has been merged
> > earlier today, so you just have to build master) and start a terminal.
>
> So, if the wallpaper is drawing to a 32x32 aligned area but the
> payload_fragment bounds are only 16x16 aligned, that doesn't do it?

Aligning only the wallpaper drawing to 32x32 doesn't work.
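The alignment being debated in this sub-thread can be sketched as follows. This is only an illustration under assumptions: `TILE_LENGTH` stands in for MALI_TILE_LENGTH (16, per the discussion), and `struct scissor`, `align_down()`, `align_up()` and `align_render_area()` are hypothetical helpers, not the driver's code. The minimum corner is rounded down, the maximum corner is rounded up and then clamped to the framebuffer size.

```c
#include <assert.h>

/* Assumption: stands in for MALI_TILE_LENGTH from the quoted patch. */
#define TILE_LENGTH 16u

/* Hypothetical scissor type for this sketch. */
struct scissor {
        unsigned minx, miny, maxx, maxy;
};

static unsigned align_down(unsigned v, unsigned a)
{
        assert(a && !(a & (a - 1)));    /* power of two only */
        return v & ~(a - 1);
}

static unsigned align_up(unsigned v, unsigned a)
{
        assert(a && !(a & (a - 1)));
        return (v + a - 1) & ~(a - 1);
}

static unsigned min_u(unsigned a, unsigned b)
{
        return a < b ? a : b;
}

/* Grow the rect (x, y, w, h) so its edges land on 'align' boundaries,
 * without going past the framebuffer dimensions. */
static struct scissor
align_render_area(unsigned x, unsigned y, unsigned w, unsigned h,
                  unsigned fb_w, unsigned fb_h, unsigned align)
{
        struct scissor ss;

        ss.minx = align_down(x, align);
        ss.miny = align_down(y, align);
        ss.maxx = min_u(align_up(x + w, align), fb_w);
        ss.maxy = min_u(align_up(y + h, align), fb_h);
        return ss;
}
```

Calling it with `TILE_LENGTH` and with `TILE_LENGTH * 2` on the same rect shows how the 32x32 variant covers a superset of the 16x16 one, which is the difference that hides the black areas reported above.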
Re: [Mesa-dev] [PATCH v4 3/5] egl/dri: Use __DRI2_DAMAGE extension for KHR_partial_update
On Thu, 27 Jun 2019 13:29:38 +0530
Harish Krupo wrote:

> Hi Boris,
>
> Thank you for resurrecting this series and taking it further. Just one
> nitpick.
> The commit summary should read:
> "egl/dri: Use __DRI_BUFFER_DAMAGE extension for KHR_partial_update"
>
> While you are at it, could you please update my authorship and signoff
> to:
> "Harish Krupo " ?

Sure, no problem, I'll do that.

Thanks,

Boris
[Mesa-dev] [PATCH] panfrost: Add the sampled texture BO to the job
Otherwise we get random use-after-{free,unmap} errors.

Signed-off-by: Boris Brezillon
---
 src/gallium/drivers/panfrost/pan_context.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c
index bf98d3853f16..37207398e82b 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -816,6 +816,8 @@ panfrost_upload_tex(
         struct panfrost_context *ctx,
         struct panfrost_sampler_view *view)
 {
+        struct panfrost_job *job = panfrost_get_job_for_fbo(ctx);
+
         if (!view)
                 return (mali_ptr) NULL;
 
@@ -848,6 +850,7 @@ panfrost_upload_tex(
         for (unsigned l = first_level; l <= last_level; ++l) {
                 for (unsigned f = first_layer; f <= last_layer; ++f) {
 
+                        panfrost_job_add_bo(job, rsrc->bo);
                         view->hw.payload[idx++] =
                                 panfrost_get_texture_address(rsrc, l, f) + afbc_bit;
 
--
2.21.0
Re: [Mesa-dev] [PATCH] panfrost: Add the sampled texture BO to the job
On Mon, 1 Jul 2019 08:28:03 -0700
Alyssa Rosenzweig wrote:

> >
> > @@ -848,6 +850,7 @@ panfrost_upload_tex(
> >          for (unsigned l = first_level; l <= last_level; ++l) {
> >                  for (unsigned f = first_layer; f <= last_layer; ++f) {
> >
> > +                        panfrost_job_add_bo(job, rsrc->bo);
> >                          view->hw.payload[idx++] =
> >                                  panfrost_get_texture_address(rsrc, l, f) +
> >                                  afbc_bit;
>
> The bo is guaranteed to be the same for all levels and layers. So this
> should be pulled out of both loops.

Absolutely, guess I was too quick to send the patch.
[Mesa-dev] [PATCH v2] panfrost: Add the sampled texture BO to the job
Otherwise we get random use-after-{free,unmap} errors.

Signed-off-by: Boris Brezillon
---
Changes in v2:
- Move the panfrost_job_add_bo() call out of the loop
---
 src/gallium/drivers/panfrost/pan_context.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c
index bf98d3853f16..c103a764edd9 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -840,6 +840,10 @@ panfrost_upload_tex(
         bool is_zs = rsrc->base.bind & PIPE_BIND_DEPTH_STENCIL;
         unsigned afbc_bit = (is_afbc && !is_zs) ? 1 : 0;
 
+        /* Add the BO to the job so it's retained until the job is done. */
+        struct panfrost_job *job = panfrost_get_job_for_fbo(ctx);
+        panfrost_job_add_bo(job, rsrc->bo);
+
         /* Inject the addresses in, interleaving mip levels, cube faces, and
          * strides in that order */
 
--
2.21.0
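To illustrate why adding the BO to the job prevents the use-after-free, here is a small stand-alone sketch. It assumes a job keeps a set of the BOs it references and holds one reference on each until it completes, which is what the comment in the patch suggests; `struct bo`, `struct job`, `job_add_bo()` and `job_release()` are simplified stand-ins invented for the example, not the panfrost implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_JOB_BOS 16

/* Simplified stand-ins for this sketch only. */
struct bo {
        int refcnt;
};

struct job {
        struct bo *bos[MAX_JOB_BOS];
        size_t count;
};

/* Add 'bo' to the job's set. The job takes one reference the first
 * time a BO is added; adding the same BO again is a no-op, so the
 * call can safely be made once per texture upload. */
static void job_add_bo(struct job *job, struct bo *bo)
{
        for (size_t i = 0; i < job->count; i++) {
                if (job->bos[i] == bo)
                        return;
        }

        assert(job->count < MAX_JOB_BOS);
        bo->refcnt++;
        job->bos[job->count++] = bo;
}

/* Called once the job is done: drop every reference the job held. */
static void job_release(struct job *job)
{
        for (size_t i = 0; i < job->count; i++)
                job->bos[i]->refcnt--;
        job->count = 0;
}
```

Because the set deduplicates, the v1 placement inside the loops would have been harmless but wasteful; the v2 placement does the lookup once per sampler view.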
[Mesa-dev] [PATCH 03/10] panfrost: Get rid of the "free imported BO" logic
bo->imported was never set to true which means this path was never
taken. Moreover, panfrost_drm_free_imported_bo() is missing the
munmap() call, which seems wrong because the import BO function calls
mmap(). Let's just kill this function along with the ->imported field.

Signed-off-by: Boris Brezillon
---
 src/gallium/drivers/panfrost/pan_drm.c      | 18 ------------------
 src/gallium/drivers/panfrost/pan_resource.c | 19 +++++++------------
 src/gallium/drivers/panfrost/pan_resource.h |  3 ---
 src/gallium/drivers/panfrost/pan_screen.h   |  3 ---
 4 files changed, 7 insertions(+), 36 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_drm.c b/src/gallium/drivers/panfrost/pan_drm.c
index 3d25eda9667e..8c9a0612d7ed 100644
--- a/src/gallium/drivers/panfrost/pan_drm.c
+++ b/src/gallium/drivers/panfrost/pan_drm.c
@@ -175,24 +175,6 @@ panfrost_drm_export_bo(struct panfrost_screen *screen, int gem_handle, unsigned
         return TRUE;
 }
 
-void
-panfrost_drm_free_imported_bo(struct panfrost_screen *screen, struct panfrost_bo *bo)
-{
-        struct drm_gem_close gem_close = {
-                .handle = bo->gem_handle,
-        };
-        int ret;
-
-        ret = drmIoctl(screen->fd, DRM_IOCTL_GEM_CLOSE, &gem_close);
-        if (ret) {
-                fprintf(stderr, "DRM_IOCTL_GEM_CLOSE failed: %d\n", ret);
-                assert(0);
-        }
-
-        bo->gem_handle = -1;
-        bo->gpu = (mali_ptr)NULL;
-}
-
 int
 panfrost_drm_submit_job(struct panfrost_context *ctx, u64 job_desc, int reqs, struct pipe_surface *surf)
 {
diff --git a/src/gallium/drivers/panfrost/pan_resource.c b/src/gallium/drivers/panfrost/pan_resource.c
index fae535ed4e29..680b98a6cac3 100644
--- a/src/gallium/drivers/panfrost/pan_resource.c
+++ b/src/gallium/drivers/panfrost/pan_resource.c
@@ -435,19 +435,14 @@ panfrost_resource_create(struct pipe_screen *screen,
 static void
 panfrost_destroy_bo(struct panfrost_screen *screen, struct panfrost_bo *bo)
 {
-        if (bo->imported) {
-                panfrost_drm_free_imported_bo(screen, bo);
-        } else {
-                struct panfrost_memory mem = {
-                        .cpu = bo->cpu,
-                        .gpu = bo->gpu,
-                        .size = bo->size,
-                        .gem_handle = bo->gem_handle,
-                };
-
-                panfrost_drm_free_slab(screen, &mem);
-        }
+        struct panfrost_memory mem = {
+                .cpu = bo->cpu,
+                .gpu = bo->gpu,
+                .size = bo->size,
+                .gem_handle = bo->gem_handle,
+        };
 
+        panfrost_drm_free_slab(screen, &mem);
         ralloc_free(bo);
 }
 
diff --git a/src/gallium/drivers/panfrost/pan_resource.h b/src/gallium/drivers/panfrost/pan_resource.h
index 89a4396c0939..003211b8c4a7 100644
--- a/src/gallium/drivers/panfrost/pan_resource.h
+++ b/src/gallium/drivers/panfrost/pan_resource.h
@@ -75,9 +75,6 @@ struct panfrost_bo {
         /* Distance from tree to tree */
         unsigned cubemap_stride;
 
-        /* Set if this bo was imported rather than allocated */
-        bool imported;
-
         /* Internal layout (tiled?) */
         enum panfrost_memory_layout layout;
 
diff --git a/src/gallium/drivers/panfrost/pan_screen.h b/src/gallium/drivers/panfrost/pan_screen.h
index 22565d6b653b..ebc5fee5cfd6 100644
--- a/src/gallium/drivers/panfrost/pan_screen.h
+++ b/src/gallium/drivers/panfrost/pan_screen.h
@@ -88,9 +88,6 @@ panfrost_drm_import_bo(struct panfrost_screen *screen,
 int
 panfrost_drm_export_bo(struct panfrost_screen *screen, int gem_handle,
                        unsigned int stride, struct winsys_handle *whandle);
-void
-panfrost_drm_free_imported_bo(struct panfrost_screen *screen,
-                              struct panfrost_bo *bo);
 int
 panfrost_drm_submit_job(struct panfrost_context *ctx, u64 job_desc, int reqs,
                         struct pipe_surface *surf);
--
2.21.0
[Mesa-dev] [PATCH 09/10] panfrost: Make SLAB pool creation rely on BO helpers
There's no point duplicating the code, and it will help us simplify the bo_handles[] filling logic in panfrost_drm_submit_job(). Signed-off-by: Boris Brezillon --- src/gallium/drivers/panfrost/pan_allocate.c | 21 +++--- src/gallium/drivers/panfrost/pan_allocate.h | 22 -- src/gallium/drivers/panfrost/pan_context.c| 28 +++ src/gallium/drivers/panfrost/pan_drm.c| 74 +++ src/gallium/drivers/panfrost/pan_resource.h | 15 src/gallium/drivers/panfrost/pan_scoreboard.c | 2 +- src/gallium/drivers/panfrost/pan_sfbd.c | 4 +- 7 files changed, 56 insertions(+), 110 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_allocate.c b/src/gallium/drivers/panfrost/pan_allocate.c index 91ace74d0e43..37a6785e7dff 100644 --- a/src/gallium/drivers/panfrost/pan_allocate.c +++ b/src/gallium/drivers/panfrost/pan_allocate.c @@ -48,8 +48,8 @@ panfrost_allocate_chunk(struct panfrost_context *ctx, size_t size, unsigned heap struct panfrost_memory *backing = (struct panfrost_memory *) entry->slab; struct panfrost_transfer transfer = { -.cpu = backing->cpu + p_entry->offset, -.gpu = backing->gpu + p_entry->offset +.cpu = backing->bo->cpu + p_entry->offset, +.gpu = backing->bo->gpu + p_entry->offset }; return transfer; @@ -97,8 +97,8 @@ panfrost_allocate_transient(struct panfrost_context *ctx, size_t sz) struct panfrost_memory *backing = (struct panfrost_memory *) p_entry->base.slab; struct panfrost_transfer ret = { -.cpu = backing->cpu + p_entry->offset + pool->entry_offset, -.gpu = backing->gpu + p_entry->offset + pool->entry_offset +.cpu = backing->bo->cpu + p_entry->offset + pool->entry_offset, +.gpu = backing->bo->gpu + p_entry->offset + pool->entry_offset }; /* Advance the pointer */ @@ -192,18 +192,19 @@ mali_ptr panfrost_upload(struct panfrost_memory *mem, const void *data, size_t sz, bool no_pad) { /* Bounds check */ -if ((mem->stack_bottom + sz) >= mem->size) { -printf("Out of memory, tried to upload %zd but only %zd available\n", sz, mem->size - mem->stack_bottom); +if 
((mem->stack_bottom + sz) >= mem->bo->size) { +printf("Out of memory, tried to upload %zd but only %zd available\n", + sz, mem->bo->size - mem->stack_bottom); assert(0); } -return pandev_upload(-1, &mem->stack_bottom, mem->gpu, mem->cpu, data, sz, no_pad); +return pandev_upload(-1, &mem->stack_bottom, mem->bo->gpu, mem->bo->cpu, data, sz, no_pad); } mali_ptr panfrost_upload_sequential(struct panfrost_memory *mem, const void *data, size_t sz) { -return pandev_upload(last_offset, &mem->stack_bottom, mem->gpu, mem->cpu, data, sz, true); +return pandev_upload(last_offset, &mem->stack_bottom, mem->bo->gpu, mem->bo->cpu, data, sz, true); } /* Simplified interface to allocate a chunk without any upload, to allow @@ -215,6 +216,6 @@ panfrost_allocate_transfer(struct panfrost_memory *mem, size_t sz, mali_ptr *gpu { int offset = pandev_allocate_offset(&mem->stack_bottom, sz); -*gpu = mem->gpu + offset; -return mem->cpu + offset; +*gpu = mem->bo->gpu + offset; +return mem->bo->cpu + offset; } diff --git a/src/gallium/drivers/panfrost/pan_allocate.h b/src/gallium/drivers/panfrost/pan_allocate.h index 5bbb1e4b078d..20ba204dee86 100644 --- a/src/gallium/drivers/panfrost/pan_allocate.h +++ b/src/gallium/drivers/panfrost/pan_allocate.h @@ -58,16 +58,28 @@ struct panfrost_transfer { mali_ptr gpu; }; +struct panfrost_bo { +struct pipe_reference reference; + +/* Mapping for the entire object (all levels) */ +uint8_t *cpu; + +/* GPU address for the object */ +mali_ptr gpu; + +/* Size of all entire trees */ +size_t size; + +int gem_handle; +}; + struct panfrost_memory { /* Subclassing slab object */ struct pb_slab slab; /* Backing for the slab in memory */ -uint8_t *cpu; -mali_ptr gpu; +struct panfrost_bo *bo; int stack_bottom; -size_t size; -int gem_handle; }; /* Slab entry sizes range from 2^min to 2^max. 
In this case, we range from 1k @@ -109,7 +121,7 @@ static inline mali_ptr panfrost_reserve(struct panfrost_memory *mem, size_t sz) { mem->stack_bottom += sz; -return mem->gpu + (mem->stack_bottom - sz); +return mem->bo->gpu + (mem->stack_bottom - sz); } struct panfrost_tr
[Mesa-dev] [PATCH 07/10] panfrost: Move the mmap BO logic out of panfrost_drm_import_bo()
So we can re-use it for the panfrost_drm_create_bo() function we are about to introduce. Signed-off-by: Boris Brezillon --- src/gallium/drivers/panfrost/pan_drm.c | 51 +++--- 1 file changed, 30 insertions(+), 21 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_drm.c b/src/gallium/drivers/panfrost/pan_drm.c index b88ab0e5ce2b..b21005feaebb 100644 --- a/src/gallium/drivers/panfrost/pan_drm.c +++ b/src/gallium/drivers/panfrost/pan_drm.c @@ -38,6 +38,33 @@ #include "pan_util.h" #include "pandecode/decode.h" +static void +panfrost_drm_mmap_bo(struct panfrost_screen *screen, struct panfrost_bo *bo) +{ +struct drm_panfrost_mmap_bo mmap_bo = { .handle = bo->gem_handle }; +int ret; + +if (bo->cpu) +return; + +ret = drmIoctl(screen->fd, DRM_IOCTL_PANFROST_MMAP_BO, &mmap_bo); +if (ret) { +fprintf(stderr, "DRM_IOCTL_PANFROST_MMAP_BO failed: %d\n", ret); +assert(0); +} + +bo->cpu = os_mmap(NULL, bo->size, PROT_READ | PROT_WRITE, MAP_SHARED, + screen->fd, mmap_bo.offset); +if (bo->cpu == MAP_FAILED) { +fprintf(stderr, "mmap failed: %p\n", bo->cpu); +assert(0); +} + +/* Record the mmap if we're tracing */ +if (pan_debug & PAN_DBG_TRACE) +pandecode_inject_mmap(bo->gpu, bo->cpu, bo->size, NULL); +} + void panfrost_drm_allocate_slab(struct panfrost_screen *screen, struct panfrost_memory *mem, @@ -118,7 +145,6 @@ panfrost_drm_import_bo(struct panfrost_screen *screen, int fd) { struct panfrost_bo *bo = rzalloc(screen, struct panfrost_bo); struct drm_panfrost_get_bo_offset get_bo_offset = {0,}; - struct drm_panfrost_mmap_bo mmap_bo = {0,}; int ret; unsigned gem_handle; @@ -131,29 +157,12 @@ panfrost_drm_import_bo(struct panfrost_screen *screen, int fd) bo->gem_handle = gem_handle; bo->gpu = (mali_ptr) get_bo_offset.offset; -pipe_reference_init(&bo->reference, 1); - - // TODO map and unmap on demand? 
- mmap_bo.handle = gem_handle; - ret = drmIoctl(screen->fd, DRM_IOCTL_PANFROST_MMAP_BO, &mmap_bo); - if (ret) { -fprintf(stderr, "DRM_IOCTL_PANFROST_MMAP_BO failed: %d\n", ret); - assert(0); - } - bo->size = lseek(fd, 0, SEEK_END); assert(bo->size > 0); -bo->cpu = os_mmap(NULL, bo->size, PROT_READ | PROT_WRITE, MAP_SHARED, - screen->fd, mmap_bo.offset); -if (bo->cpu == MAP_FAILED) { -fprintf(stderr, "mmap failed: %p\n", bo->cpu); - assert(0); - } - -/* Record the mmap if we're tracing */ -if (pan_debug & PAN_DBG_TRACE) -pandecode_inject_mmap(bo->gpu, bo->cpu, bo->size, NULL); +pipe_reference_init(&bo->reference, 1); +// TODO map and unmap on demand? +panfrost_drm_mmap_bo(screen, bo); return bo; } -- 2.21.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
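The shape of the new helper (map on first use, do nothing if the BO is already mapped) can be tried outside the driver with a plain anonymous mapping. This is a sketch under assumptions: `struct buf`, `buf_map_once()` and `buf_unmap()` are hypothetical names, and an anonymous `mmap()` replaces the DRM_IOCTL_PANFROST_MMAP_BO plus `os_mmap()` sequence used in the patch. Unlike the patch, the sketch only stores the pointer after checking it against MAP_FAILED and reports failure to the caller instead of asserting.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS with strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical buffer type for this sketch. */
struct buf {
        void *cpu;      /* NULL until the buffer is mapped */
        size_t size;
};

/* Map the buffer the first time it is needed; later calls are no-ops.
 * Returns 0 on success, -1 if the mapping failed. */
static int buf_map_once(struct buf *b)
{
        assert(b->size > 0);

        if (b->cpu)
                return 0;

        void *p = mmap(NULL, b->size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
                return -1;

        b->cpu = p;
        return 0;
}

/* Counterpart that must run before the buffer is destroyed. */
static void buf_unmap(struct buf *b)
{
        if (!b->cpu)
                return;

        munmap(b->cpu, b->size);
        b->cpu = NULL;
}
```

Keeping the map and unmap paths symmetric like this is also what makes a missing munmap() easy to spot in review.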
[Mesa-dev] [PATCH 00/10] panfrost: Try to make BO handling more consistent
Hello,

This patch series is an attempt at making memory allocation more
consistent by implementing SLAB pool allocation around the BO allocation
logic. Note that my initial goal was to pass referenced BOs to the
kernel driver, but I've decided to clean things up along the way (just
let me know if you think it was a mistake).

The first 4 patches are simple cleanups and could have been sent
separately, but it's easier for me to post everything as a single series
to avoid dependency issues.

Patch 5 might be a bit more controversial, as I move some of the fields
that were in panfrost_bo to panfrost_resource. The rationale here being
that other drivers (vc4, freedreno, ...) seem to keep _bo objects as
simple memory backing objs that do not contain any meta data describing
the buffer content. This approach allows us to re-use those objects to
allocate anything, not only texture or FB resources. Again, if you think
that's a wrong decision, let me know.

Patch 6 is in the same vein, it makes BO import/export functions more
generic so they can be used for !winsys bufs.

Patches 7-9 are here to re-use the BO creation/destruction logic as much
as possible instead of duplicating the code or having convoluted code
doing panfrost_bo <-> panfrost_memory conversions.

And finally, patch 10 is making use of all the changes described above
to pass all referenced BOs to the kernel driver when a job is submitted,
thus making sure the job will wait for all resources to be ready before
being scheduled.

I know there's a lot going on in the panfrost area right now, and some
of what I'm proposing here might be irrelevant or might have to be
ported on top of other changes (which is fine). Let me know if that's
the case.

Regards,

Boris

Boris Brezillon (10):
  panfrost: Move scanout res creation out of panfrost_resource_create()
  panfrost: Get rid of the panfrost_driver abstraction leftovers
  panfrost: Get rid of the "free imported BO" logic
  panfrost: Stop exposing internal panfrost_drm_*() functions
  panfrost: Move BO meta-data out of panfrost_bo
  panfrost: Avoid passing winsys handles to import/export BO funcs
  panfrost: Move the mmap BO logic out of panfrost_drm_import_bo()
  panfrost: Add the panfrost_drm_{create,release}_bo() helpers
  panfrost: Make SLAB pool creation rely on BO helpers
  panfrost: Pass referenced BOs to the SUBMIT ioctls

 src/gallium/drivers/panfrost/pan_allocate.c   |  21 +-
 src/gallium/drivers/panfrost/pan_allocate.h   |  22 +-
 src/gallium/drivers/panfrost/pan_context.c    |  40 +--
 src/gallium/drivers/panfrost/pan_drm.c        | 263 +-
 src/gallium/drivers/panfrost/pan_drm.h        |  32 ---
 src/gallium/drivers/panfrost/pan_fragment.c   |   2 +-
 src/gallium/drivers/panfrost/pan_mfbd.c       |  24 +-
 src/gallium/drivers/panfrost/pan_resource.c   | 242
 src/gallium/drivers/panfrost/pan_resource.h   |  42 +--
 src/gallium/drivers/panfrost/pan_scoreboard.c |   2 +-
 src/gallium/drivers/panfrost/pan_screen.c     |   2 -
 src/gallium/drivers/panfrost/pan_screen.h     |  17 +-
 src/gallium/drivers/panfrost/pan_sfbd.c       |   8 +-
 13 files changed, 333 insertions(+), 384 deletions(-)
 delete mode 100644 src/gallium/drivers/panfrost/pan_drm.h

--
2.21.0
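As a rough illustration of what patch 10 in the list above aims for, the sketch below flattens a job's set of referenced BOs into the array of handles a submit call would receive. Everything here is an assumption made for the example: `struct bo` is a stand-in with only a GEM handle, the set is modeled as a plain array of pointers, and `collect_bo_handles()` is a hypothetical helper rather than a function from the series.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for a buffer object: only the GEM handle matters here. */
struct bo {
        uint32_t gem_handle;
};

/* Build the handle array for a submission from the job's BO set.
 * Returns a heap-allocated array the caller must free(), or NULL if
 * the allocation failed. '*out_count' receives the number of handles. */
static uint32_t *
collect_bo_handles(struct bo *const *bos, size_t count, uint32_t *out_count)
{
        /* Avoid a zero-sized allocation when the set is empty. */
        uint32_t *handles = calloc(count ? count : 1, sizeof(*handles));
        uint32_t n = 0;

        if (!handles)
                return NULL;

        for (size_t i = 0; i < count; i++) {
                /* Handle 0 is treated as invalid in this sketch. */
                assert(bos[i]->gem_handle > 0);
                handles[n++] = bos[i]->gem_handle;
        }

        *out_count = n;
        return handles;
}
```

Sizing the array from the set, instead of using a fixed-size array on the stack, is what lets every referenced BO be passed along rather than a hand-picked few.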
[Mesa-dev] [PATCH 02/10] panfrost: Get rid of the panfrost_driver abstraction leftovers
Commit 5f81669d880b ("panfrost: Remove the panfrost_driver abstraction") left a few things behind, remove them now. Signed-off-by: Boris Brezillon --- src/gallium/drivers/panfrost/pan_drm.c| 1 - src/gallium/drivers/panfrost/pan_drm.h| 32 --- src/gallium/drivers/panfrost/pan_screen.c | 2 -- 3 files changed, 35 deletions(-) delete mode 100644 src/gallium/drivers/panfrost/pan_drm.h diff --git a/src/gallium/drivers/panfrost/pan_drm.c b/src/gallium/drivers/panfrost/pan_drm.c index 7ffceb9156cd..3d25eda9667e 100644 --- a/src/gallium/drivers/panfrost/pan_drm.c +++ b/src/gallium/drivers/panfrost/pan_drm.c @@ -35,7 +35,6 @@ #include "pan_screen.h" #include "pan_resource.h" #include "pan_context.h" -#include "pan_drm.h" #include "pan_util.h" #include "pandecode/decode.h" diff --git a/src/gallium/drivers/panfrost/pan_drm.h b/src/gallium/drivers/panfrost/pan_drm.h deleted file mode 100644 index e94907aa983b.. --- a/src/gallium/drivers/panfrost/pan_drm.h +++ /dev/null @@ -1,32 +0,0 @@ -/* - * © Copyright 2019 Collabora, Ltd. - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, - * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -#ifndef __PAN_DRM_H__ -#define __PAN_DRM_H__ - -#include "pan_screen.h" - -struct panfrost_driver *panfrost_create_drm_driver(int fd); - -#endif /* __PAN_DRM_H__ */ diff --git a/src/gallium/drivers/panfrost/pan_screen.c b/src/gallium/drivers/panfrost/pan_screen.c index d6b1bc89fc19..d53a906838eb 100644 --- a/src/gallium/drivers/panfrost/pan_screen.c +++ b/src/gallium/drivers/panfrost/pan_screen.c @@ -64,8 +64,6 @@ DEBUG_GET_ONCE_FLAGS_OPTION(pan_debug, "PAN_MESA_DEBUG", debug_options, 0) int pan_debug = 0; -struct panfrost_driver *panfrost_create_drm_driver(int fd); - static const char * panfrost_get_name(struct pipe_screen *screen) { -- 2.21.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 01/10] panfrost: Move scanout res creation out of panfrost_resource_create()
Which improves readability and helps us avoid a memory leak. Signed-off-by: Boris Brezillon --- src/gallium/drivers/panfrost/pan_resource.c | 79 - 1 file changed, 44 insertions(+), 35 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_resource.c b/src/gallium/drivers/panfrost/pan_resource.c index 8db7e45af1b6..fae535ed4e29 100644 --- a/src/gallium/drivers/panfrost/pan_resource.c +++ b/src/gallium/drivers/panfrost/pan_resource.c @@ -180,6 +180,37 @@ panfrost_surface_destroy(struct pipe_context *pipe, ralloc_free(surf); } +static struct pipe_resource * +panfrost_create_scanout_res(struct pipe_screen *screen, +const struct pipe_resource *template) +{ +struct panfrost_screen *pscreen = pan_screen(screen); +struct pipe_resource scanout_templat = *template; +struct renderonly_scanout *scanout; +struct winsys_handle handle; +struct pipe_resource *res; + +scanout = renderonly_scanout_for_resource(&scanout_templat, + pscreen->ro, &handle); +if (!scanout) +return NULL; + +assert(handle.type == WINSYS_HANDLE_TYPE_FD); +/* TODO: handle modifiers? 
*/ +res = screen->resource_from_handle(screen, template, &handle, + PIPE_HANDLE_USAGE_FRAMEBUFFER_WRITE); +close(handle.handle); +if (!res) +return NULL; + +struct panfrost_resource *pres = pan_resource(res); + +pres->scanout = scanout; +pscreen->display_target = pres; + +return res; +} + /* Computes sizes for checksumming, which is 8 bytes per 16x16 tile */ #define CHECKSUM_TILE_WIDTH 16 @@ -368,14 +399,6 @@ static struct pipe_resource * panfrost_resource_create(struct pipe_screen *screen, const struct pipe_resource *template) { -struct panfrost_resource *so = rzalloc(screen, struct panfrost_resource); -struct panfrost_screen *pscreen = (struct panfrost_screen *) screen; - -so->base = *template; -so->base.screen = screen; - -pipe_reference_init(&so->base.reference, 1); - /* Make sure we're familiar */ switch (template->target) { case PIPE_BUFFER: @@ -391,35 +414,21 @@ panfrost_resource_create(struct pipe_screen *screen, assert(0); } +if (template->bind & +(PIPE_BIND_DISPLAY_TARGET | PIPE_BIND_SCANOUT | PIPE_BIND_SHARED)) +return panfrost_create_scanout_res(screen, template); + +struct panfrost_resource *so = rzalloc(screen, struct panfrost_resource); +struct panfrost_screen *pscreen = (struct panfrost_screen *) screen; + +so->base = *template; +so->base.screen = screen; + +pipe_reference_init(&so->base.reference, 1); + util_range_init(&so->valid_buffer_range); -if (template->bind & PIPE_BIND_DISPLAY_TARGET || -template->bind & PIPE_BIND_SCANOUT || -template->bind & PIPE_BIND_SHARED) { -struct pipe_resource scanout_templat = *template; -struct renderonly_scanout *scanout; -struct winsys_handle handle; - -scanout = renderonly_scanout_for_resource(&scanout_templat, - pscreen->ro, &handle); -if (!scanout) -return NULL; - -assert(handle.type == WINSYS_HANDLE_TYPE_FD); -/* TODO: handle modifiers? 
*/ -so = pan_resource(screen->resource_from_handle(screen, template, - &handle, - PIPE_HANDLE_USAGE_FRAMEBUFFER_WRITE)); -close(handle.handle); -if (!so) -return NULL; - -so->scanout = scanout; -pscreen->display_target = so; -} else { -so->bo = panfrost_create_bo(pscreen, template); -} - +so->bo = panfrost_create_bo(pscreen, template); return (struct pipe_resource *)so; } -- 2.21.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 05/10] panfrost: Move BO meta-data out of panfrost_bo
That's what most (all?) implementations seem to do, and my understanding is that a BO is just a bunch of memory that can be used for anything GPU related, not only texture/FB resources. Let's move that meta-data into panfrost_resource so we can use panfrost_bo for all kinds of memory allocation and make BO allocation more consistent. Signed-off-by: Boris Brezillon --- src/gallium/drivers/panfrost/pan_context.c | 12 +- src/gallium/drivers/panfrost/pan_fragment.c | 2 +- src/gallium/drivers/panfrost/pan_mfbd.c | 24 ++-- src/gallium/drivers/panfrost/pan_resource.c | 126 ++-- src/gallium/drivers/panfrost/pan_resource.h | 24 ++-- src/gallium/drivers/panfrost/pan_sfbd.c | 4 +- 6 files changed, 98 insertions(+), 94 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index c78042d412d0..139e0a1140cc 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -797,7 +797,7 @@ panfrost_upload_tex( unsigned last_layer = pview->u.tex.last_layer; /* Lower-bit is set when sampling from colour AFBC */ -bool is_afbc = rsrc->bo->layout == PAN_AFBC; +bool is_afbc = rsrc->layout == PAN_AFBC; bool is_zs = rsrc->base.bind & PIPE_BIND_DEPTH_STENCIL; unsigned afbc_bit = (is_afbc && !is_zs) ? 
1 : 0; @@ -818,7 +818,7 @@ panfrost_upload_tex( if (has_manual_stride) { view->hw.payload[idx++] = -rsrc->bo->slices[l].stride; +rsrc->slices[l].stride; } } } @@ -1469,7 +1469,7 @@ panfrost_draw_wallpaper(struct pipe_context *pipe) struct panfrost_resource *rsrc = pan_resource(surf->texture); unsigned level = surf->u.tex.level; -if (!rsrc->bo->slices[level].initialized) +if (!rsrc->slices[level].initialized) return; /* Save the batch */ @@ -2203,7 +2203,7 @@ panfrost_create_sampler_view( unsigned usage2_layout = 0x10; -switch (prsrc->bo->layout) { +switch (prsrc->layout) { case PAN_AFBC: usage2_layout |= 0x8 | 0x4; break; @@ -2226,9 +2226,9 @@ panfrost_create_sampler_view( unsigned first_level = template->u.tex.first_level; unsigned last_level = template->u.tex.last_level; -if (prsrc->bo->layout == PAN_LINEAR) { +if (prsrc->layout == PAN_LINEAR) { for (unsigned l = first_level; l <= last_level; ++l) { -unsigned actual_stride = prsrc->bo->slices[l].stride; +unsigned actual_stride = prsrc->slices[l].stride; unsigned width = u_minify(texture->width0, l); unsigned comp_stride = width * bytes_per_pixel; diff --git a/src/gallium/drivers/panfrost/pan_fragment.c b/src/gallium/drivers/panfrost/pan_fragment.c index 5dbca021141e..ed8677d1afdd 100644 --- a/src/gallium/drivers/panfrost/pan_fragment.c +++ b/src/gallium/drivers/panfrost/pan_fragment.c @@ -36,7 +36,7 @@ panfrost_initialize_surface(struct pipe_surface *surf) unsigned level = surf->u.tex.level; struct panfrost_resource *rsrc = pan_resource(surf->texture); -rsrc->bo->slices[level].initialized = true; +rsrc->slices[level].initialized = true; } /* Generate a fragment job. This should be called once per frame. 
(According to diff --git a/src/gallium/drivers/panfrost/pan_mfbd.c b/src/gallium/drivers/panfrost/pan_mfbd.c index b209ecbf580a..b435d20b7582 100644 --- a/src/gallium/drivers/panfrost/pan_mfbd.c +++ b/src/gallium/drivers/panfrost/pan_mfbd.c @@ -128,7 +128,7 @@ panfrost_mfbd_set_cbuf( unsigned level = surf->u.tex.level; unsigned first_layer = surf->u.tex.first_layer; assert(surf->u.tex.last_layer == first_layer); -int stride = rsrc->bo->slices[level].stride; +int stride = rsrc->slices[level].stride; mali_ptr base = panfrost_get_texture_address(rsrc, level, first_layer); @@ -136,18 +136,18 @@ panfrost_mfbd_set_cbuf( /* Now, we set the layout specific pieces */ -if (rsrc->bo->layout == PAN_LINEAR) { +if (rsrc->layout == PAN_LINEAR) { rt->format.block = MALI_MFBD_BLOCK_LINEAR; rt->framebuffer = base; rt->framebuffer_stride = stride / 16; -} else if (rsrc->bo->layout == PAN_TILED) { +} else if (rsrc->layout == PAN_TILED) { rt->format.block = MALI_MFBD_BLOCK_TILED; rt->framebuffer = base;
[Mesa-dev] [PATCH 10/10] panfrost: Pass referenced BOs to the SUBMIT ioctls
Instead of manually adding the BOs from the various SLAB pools plus the one backing the color FB, we insert them in the BO set attached to the job and let panfrost_drm_submit_job() pass all BOs from this set to the SUBMIT ioctl. This means we are now passing all referenced BOs and let the scheduler wait on referenced BO fences if needed. Signed-off-by: Boris Brezillon --- src/gallium/drivers/panfrost/pan_drm.c | 46 +++--- 1 file changed, 27 insertions(+), 19 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_drm.c b/src/gallium/drivers/panfrost/pan_drm.c index ac82ec583021..8de4f483435c 100644 --- a/src/gallium/drivers/panfrost/pan_drm.c +++ b/src/gallium/drivers/panfrost/pan_drm.c @@ -192,12 +192,13 @@ panfrost_drm_export_bo(struct panfrost_screen *screen, const struct panfrost_bo } static int -panfrost_drm_submit_job(struct panfrost_context *ctx, u64 job_desc, int reqs, struct pipe_surface *surf) +panfrost_drm_submit_job(struct panfrost_context *ctx, u64 job_desc, int reqs) { struct pipe_context *gallium = (struct pipe_context *) ctx; struct panfrost_screen *screen = pan_screen(gallium->screen); +struct panfrost_job *job = panfrost_get_job_for_fbo(ctx); struct drm_panfrost_submit submit = {0,}; -int bo_handles[7]; +int *bo_handles, ret; submit.in_syncs = (u64) (uintptr_t) &ctx->out_sync; submit.in_sync_count = 1; @@ -207,22 +208,19 @@ panfrost_drm_submit_job(struct panfrost_context *ctx, u64 job_desc, int reqs, st submit.jc = job_desc; submit.requirements = reqs; - if (surf) { - struct panfrost_resource *res = pan_resource(surf->texture); - assert(res->bo->gem_handle > 0); - bo_handles[submit.bo_handle_count++] = res->bo->gem_handle; + bo_handles = calloc(job->bos->entries, sizeof(*bo_handles)); + assert(bo_handles); + + set_foreach(job->bos, entry) { + struct panfrost_bo *bo = (struct panfrost_bo *)entry->key; + assert(bo->gem_handle > 0); + bo_handles[submit.bo_handle_count++] = bo->gem_handle; } - /* TODO: Add here the transient pools */ -/* TODO: Add 
here the BOs listed in the panfrost_job */ -bo_handles[submit.bo_handle_count++] = ctx->shaders.bo->gem_handle; -bo_handles[submit.bo_handle_count++] = ctx->scratchpad.bo->gem_handle; -bo_handles[submit.bo_handle_count++] = ctx->tiler_heap.bo->gem_handle; -bo_handles[submit.bo_handle_count++] = ctx->varying_mem.bo->gem_handle; -bo_handles[submit.bo_handle_count++] = ctx->tiler_polygon_list.bo->gem_handle; submit.bo_handles = (u64) (uintptr_t) bo_handles; - - if (drmIoctl(screen->fd, DRM_IOCTL_PANFROST_SUBMIT, &submit)) { + ret = drmIoctl(screen->fd, DRM_IOCTL_PANFROST_SUBMIT, &submit); + free(bo_handles); + if (ret) { fprintf(stderr, "Error submitting: %m\n"); return errno; } @@ -245,13 +243,23 @@ panfrost_drm_submit_vs_fs_job(struct panfrost_context *ctx, bool has_draws, bool struct panfrost_job *job = panfrost_get_job_for_fbo(ctx); +/* TODO: Add here the transient pools */ +panfrost_job_add_bo(job, ctx->shaders.bo); +panfrost_job_add_bo(job, ctx->scratchpad.bo); +panfrost_job_add_bo(job, ctx->tiler_heap.bo); +panfrost_job_add_bo(job, ctx->varying_mem.bo); +panfrost_job_add_bo(job, ctx->tiler_polygon_list.bo); + if (job->first_job.gpu) { - ret = panfrost_drm_submit_job(ctx, job->first_job.gpu, 0, NULL); - assert(!ret); - } +ret = panfrost_drm_submit_job(ctx, job->first_job.gpu, 0); +assert(!ret); +} if (job->first_tiler.gpu || job->clear) { -ret = panfrost_drm_submit_job(ctx, panfrost_fragment_job(ctx, has_draws), PANFROST_JD_REQ_FS, surf); +struct panfrost_resource *res = pan_resource(surf->texture); +assert(res->bo); +panfrost_job_add_bo(job, res->bo); +ret = panfrost_drm_submit_job(ctx, panfrost_fragment_job(ctx, has_draws), PANFROST_JD_REQ_FS); assert(!ret); } -- 2.21.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
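The handle-collection step from the 10/10 patch above can be sketched in isolation. This is only an illustration under stated assumptions: `sketch_bo`, `sketch_job` and `sketch_collect_handles()` are hypothetical stand-ins, and the job's BO set is modelled as a plain array rather than Mesa's hash set. It shows why sizing the array from the number of referenced BOs is safer than a fixed `bo_handles[7]`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer object; only the GEM handle matters. */
struct sketch_bo {
    uint32_t gem_handle;
};

/* Hypothetical stand-in for the per-job BO set (a plain array here). */
struct sketch_job {
    struct sketch_bo **bos;
    size_t bo_count;
};

/*
 * Build a heap-allocated array holding the handle of every BO referenced
 * by the job. Returns NULL when the job references no BO or when the
 * allocation fails; *count receives the number of handles written.
 * The caller frees the array once the submission call has returned.
 */
static uint32_t *
sketch_collect_handles(const struct sketch_job *job, size_t *count)
{
    *count = 0;
    if (job->bo_count == 0)
        return NULL;

    /* Size the array from the set, so it can never overflow. */
    uint32_t *handles = calloc(job->bo_count, sizeof(*handles));
    if (!handles)
        return NULL;

    for (size_t i = 0; i < job->bo_count; i++) {
        /* Same sanity check as the patch: handles are non-zero. */
        assert(job->bos[i]->gem_handle > 0);
        handles[(*count)++] = job->bos[i]->gem_handle;
    }

    return handles;
}
```

Unlike the patch, which asserts on allocation failure, this sketch reports it to the caller through a NULL return; either choice works as long as the array is freed after the ioctl regardless of its result.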
[Mesa-dev] [PATCH 06/10] panfrost: Avoid passing winsys handles to import/export BO funcs
Let's keep a clear split between ioctl wrappers and the rest of the
driver. All the import BO function needs is a dmabuf FD and the screen
object, and the export one should only take care of generating a dmabuf
FD out of a BO object. Winsys handle manipulation should stay in the
resource.c file.

Signed-off-by: Boris Brezillon
---
 src/gallium/drivers/panfrost/pan_drm.c      | 17 +++--
 src/gallium/drivers/panfrost/pan_resource.c | 16 +++-
 src/gallium/drivers/panfrost/pan_screen.h   |  6 ++
 3 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_drm.c b/src/gallium/drivers/panfrost/pan_drm.c
index f17f56bc6307..b88ab0e5ce2b 100644
--- a/src/gallium/drivers/panfrost/pan_drm.c
+++ b/src/gallium/drivers/panfrost/pan_drm.c
@@ -114,7 +114,7 @@ panfrost_drm_free_slab(struct panfrost_screen *screen, struct panfrost_memory *m
 }

 struct panfrost_bo *
-panfrost_drm_import_bo(struct panfrost_screen *screen, struct winsys_handle *whandle)
+panfrost_drm_import_bo(struct panfrost_screen *screen, int fd)
 {
         struct panfrost_bo *bo = rzalloc(screen, struct panfrost_bo);
         struct drm_panfrost_get_bo_offset get_bo_offset = {0,};
@@ -122,7 +122,7 @@ panfrost_drm_import_bo(struct panfrost_screen *screen, struct winsys_handle *wha
         int ret;
         unsigned gem_handle;

-        ret = drmPrimeFDToHandle(screen->fd, whandle->handle, &gem_handle);
+        ret = drmPrimeFDToHandle(screen->fd, fd, &gem_handle);
         assert(!ret);

         get_bo_offset.handle = gem_handle;
@@ -141,7 +141,7 @@ panfrost_drm_import_bo(struct panfrost_screen *screen, struct winsys_handle *wha
                 assert(0);
         }

-        bo->size = lseek(whandle->handle, 0, SEEK_END);
+        bo->size = lseek(fd, 0, SEEK_END);
         assert(bo->size > 0);
         bo->cpu = os_mmap(NULL, bo->size, PROT_READ | PROT_WRITE, MAP_SHARED,
                           screen->fd, mmap_bo.offset);
@@ -158,21 +158,18 @@ panfrost_drm_import_bo(struct panfrost_screen *screen, struct winsys_handle *wha
 }

 int
-panfrost_drm_export_bo(struct panfrost_screen *screen, int gem_handle, unsigned int stride, struct
winsys_handle *whandle) +panfrost_drm_export_bo(struct panfrost_screen *screen, const struct panfrost_bo *bo) { struct drm_prime_handle args = { -.handle = gem_handle, +.handle = bo->gem_handle, .flags = DRM_CLOEXEC, }; int ret = drmIoctl(screen->fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args); if (ret == -1) -return FALSE; +return -1; -whandle->handle = args.fd; -whandle->stride = stride; - -return TRUE; +return args.fd; } static int diff --git a/src/gallium/drivers/panfrost/pan_resource.c b/src/gallium/drivers/panfrost/pan_resource.c index 8901aeee09b1..f86617f80c20 100644 --- a/src/gallium/drivers/panfrost/pan_resource.c +++ b/src/gallium/drivers/panfrost/pan_resource.c @@ -70,7 +70,7 @@ panfrost_resource_from_handle(struct pipe_screen *pscreen, pipe_reference_init(&prsc->reference, 1); prsc->screen = pscreen; - rsc->bo = panfrost_drm_import_bo(screen, whandle); + rsc->bo = panfrost_drm_import_bo(screen, whandle->handle); rsc->slices[0].stride = whandle->stride; rsc->slices[0].initialized = true; @@ -120,10 +120,16 @@ panfrost_resource_get_handle(struct pipe_screen *pscreen, handle->handle = args.fd; return TRUE; -} else - return panfrost_drm_export_bo(screen, rsrc->bo->gem_handle, - rsrc->slices[0].stride, - handle); +} else { +int fd = panfrost_drm_export_bo(screen, rsrc->bo); + +if (fd < 0) +return FALSE; + +handle->handle = fd; +handle->stride = rsrc->slices[0].stride; +return TRUE; + } } return FALSE; diff --git a/src/gallium/drivers/panfrost/pan_screen.h b/src/gallium/drivers/panfrost/pan_screen.h index 83186ebb2f7f..1a1eb2f8bf27 100644 --- a/src/gallium/drivers/panfrost/pan_screen.h +++ b/src/gallium/drivers/panfrost/pan_screen.h @@ -83,11 +83,9 @@ void panfrost_drm_free_slab(struct panfrost_screen *screen, struct panfrost_memory *mem); struct panfrost_bo * -panfrost_drm_import_bo(struct panfrost_screen *screen, - struct winsys_handle *whandle); +panfrost_drm_import_bo(struct panfrost_screen *screen, int fd); int -panfrost_drm_export_bo(struct panfrost_screen 
*screen, int ge
[Mesa-dev] [PATCH 08/10] panfrost: Add the panfrost_drm_{create, release}_bo() helpers
To avoid the panfrost_memory <-> panfrost_bo dance done in panfrost_resource_create_bo() and panfrost_bo_unreference(). Signed-off-by: Boris Brezillon --- src/gallium/drivers/panfrost/pan_drm.c | 62 + src/gallium/drivers/panfrost/pan_resource.c | 32 +-- src/gallium/drivers/panfrost/pan_screen.h | 5 ++ 3 files changed, 70 insertions(+), 29 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_drm.c b/src/gallium/drivers/panfrost/pan_drm.c index b21005feaebb..d49c999e0773 100644 --- a/src/gallium/drivers/panfrost/pan_drm.c +++ b/src/gallium/drivers/panfrost/pan_drm.c @@ -65,6 +65,68 @@ panfrost_drm_mmap_bo(struct panfrost_screen *screen, struct panfrost_bo *bo) pandecode_inject_mmap(bo->gpu, bo->cpu, bo->size, NULL); } +static void +panfrost_drm_munmap_bo(struct panfrost_screen *screen, struct panfrost_bo *bo) +{ +if (!bo->cpu) +return; + +if (os_munmap((void *) (uintptr_t)bo->cpu, bo->size)) { +perror("munmap"); +abort(); +} + +bo->cpu = NULL; +} + +struct panfrost_bo * +panfrost_drm_create_bo(struct panfrost_screen *screen, size_t size, + uint32_t flags) +{ +struct panfrost_bo *bo = rzalloc(screen, struct panfrost_bo); +struct drm_panfrost_create_bo create_bo = { +.size = size, +.flags = flags, +}; +int ret; + +ret = drmIoctl(screen->fd, DRM_IOCTL_PANFROST_CREATE_BO, &create_bo); +if (ret) { +fprintf(stderr, "DRM_IOCTL_PANFROST_CREATE_BO failed: %d\n", ret); +assert(0); +} + +bo->size = create_bo.size; +bo->gpu = create_bo.offset; +bo->gem_handle = create_bo.handle; + +// TODO map and unmap on demand? 
+panfrost_drm_mmap_bo(screen, bo); + +pipe_reference_init(&bo->reference, 1); +return bo; +} + +void +panfrost_drm_release_bo(struct panfrost_screen *screen, struct panfrost_bo *bo) +{ +struct drm_gem_close gem_close = { .handle = bo->gem_handle }; +int ret; + +if (!bo) +return; + +panfrost_drm_munmap_bo(screen, bo); + +ret = drmIoctl(screen->fd, DRM_IOCTL_GEM_CLOSE, &gem_close); +if (ret) { +fprintf(stderr, "DRM_IOCTL_GEM_CLOSE failed: %d\n", ret); +assert(0); +} + +ralloc_free(bo); +} + void panfrost_drm_allocate_slab(struct panfrost_screen *screen, struct panfrost_memory *mem, diff --git a/src/gallium/drivers/panfrost/pan_resource.c b/src/gallium/drivers/panfrost/pan_resource.c index f86617f80c20..b651fcffb111 100644 --- a/src/gallium/drivers/panfrost/pan_resource.c +++ b/src/gallium/drivers/panfrost/pan_resource.c @@ -391,18 +391,7 @@ panfrost_resource_create_bo(struct panfrost_screen *screen, struct panfrost_reso size_t bo_size; panfrost_setup_slices(pres, &bo_size); - -struct panfrost_memory mem; -struct panfrost_bo *bo = rzalloc(screen, struct panfrost_bo); - -pipe_reference_init(&bo->reference, 1); -panfrost_drm_allocate_slab(screen, &mem, bo_size / 4096, true, 0, 0, 0); - -bo->cpu = mem.cpu; -bo->gpu = mem.gpu; -bo->gem_handle = mem.gem_handle; - bo->size = bo_size; - pres->bo = bo; +pres->bo = panfrost_drm_create_bo(screen, bo_size, 0); } static struct pipe_resource * @@ -442,20 +431,6 @@ panfrost_resource_create(struct pipe_screen *screen, return (struct pipe_resource *)so; } -static void -panfrost_destroy_bo(struct panfrost_screen *screen, struct panfrost_bo *bo) -{ -struct panfrost_memory mem = { -.cpu = bo->cpu, -.gpu = bo->gpu, -.size = bo->size, -.gem_handle = bo->gem_handle, -}; - -panfrost_drm_free_slab(screen, &mem); -ralloc_free(bo); -} - void panfrost_bo_reference(struct panfrost_bo *bo) { @@ -467,9 +442,8 @@ panfrost_bo_unreference(struct pipe_screen *screen, struct panfrost_bo *bo) { /* When the reference count goes to zero, we need to 
cleanup */ -if (pipe_reference(&bo->reference, NULL)) { -panfrost_destroy_bo(pan_screen(screen), bo); -} +if (pipe_reference(&bo->reference, NULL)) +panfrost_drm_release_bo(pan_screen(screen), bo); } static void diff --git a/src/gallium/drivers/panfrost/pan_screen.h b/src/gallium/drivers/panfrost/pan_screen.h index 1a1eb2f8bf27..9bcea6114285 100644 --- a/src/gallium/drivers/panfrost/pan_screen.h +++ b/src/gallium/drivers/panfrost/pan_screen.h @@ -83,6 +83,11 @@ void panfrost_drm_free_slab(struct panfrost_screen
[Mesa-dev] [PATCH 04/10] panfrost: Stop exposing internal panfrost_drm_*() functions
panfrost_drm_submit_job() and panfrost_fence_create() are not used
outside of pan_drm.c.

Signed-off-by: Boris Brezillon
---
 src/gallium/drivers/panfrost/pan_drm.c    | 4 ++--
 src/gallium/drivers/panfrost/pan_screen.h | 5 -----
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_drm.c b/src/gallium/drivers/panfrost/pan_drm.c
index 8c9a0612d7ed..f17f56bc6307 100644
--- a/src/gallium/drivers/panfrost/pan_drm.c
+++ b/src/gallium/drivers/panfrost/pan_drm.c
@@ -175,7 +175,7 @@ panfrost_drm_export_bo(struct panfrost_screen *screen, int gem_handle, unsigned
         return TRUE;
 }

-int
+static int
 panfrost_drm_submit_job(struct panfrost_context *ctx, u64 job_desc, int reqs, struct pipe_surface *surf)
 {
         struct pipe_context *gallium = (struct pipe_context *) ctx;
@@ -242,7 +242,7 @@ panfrost_drm_submit_vs_fs_job(struct panfrost_context *ctx, bool has_draws, bool
         return ret;
 }

-struct panfrost_fence *
+static struct panfrost_fence *
 panfrost_fence_create(struct panfrost_context *ctx)
 {
         struct pipe_context *gallium = (struct pipe_context *) ctx;
diff --git a/src/gallium/drivers/panfrost/pan_screen.h b/src/gallium/drivers/panfrost/pan_screen.h
index ebc5fee5cfd6..83186ebb2f7f 100644
--- a/src/gallium/drivers/panfrost/pan_screen.h
+++ b/src/gallium/drivers/panfrost/pan_screen.h
@@ -89,13 +89,8 @@ int
 panfrost_drm_export_bo(struct panfrost_screen *screen, int gem_handle,
                        unsigned int stride,
                        struct winsys_handle *whandle);
 int
-panfrost_drm_submit_job(struct panfrost_context *ctx, u64 job_desc, int reqs,
-                        struct pipe_surface *surf);
-int
 panfrost_drm_submit_vs_fs_job(struct panfrost_context *ctx, bool has_draws,
                               bool is_scanout);
-struct panfrost_fence *
-panfrost_fence_create(struct panfrost_context *ctx);
 void
 panfrost_drm_force_flush_fragment(struct panfrost_context *ctx,
                                   struct pipe_fence_handle **fence);
--
2.21.0
[Mesa-dev] [PATCH v5 3/5] egl/dri: Use __DRI2_BUFFER_DAMAGE extension for KHR_partial_update
From: Harish Krupo

Use the DRI2 interface callback to pass the damage rects to the driver.

Signed-off-by: Harish Krupo
Signed-off-by: Boris Brezillon
Acked-by: Alyssa Rosenzweig
---
Changes in v5:
* Add Alyssa's a-b
* Update Harish's email address
* s/__DRI2_DAMAGE/__DRI2_BUFFER_DAMAGE/
---
 src/egl/drivers/dri2/egl_dri2.c | 55 ++---
 src/egl/drivers/dri2/egl_dri2.h |  1 +
 2 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 3c33b2cf27f8..fcafcfd73c63 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -452,6 +452,7 @@ static const struct dri2_extension_match optional_core_extensions[] = {
    { __DRI2_NO_ERROR, 1, offsetof(struct dri2_egl_display, no_error) },
    { __DRI2_CONFIG_QUERY, 1, offsetof(struct dri2_egl_display, config) },
    { __DRI2_FENCE, 1, offsetof(struct dri2_egl_display, fence) },
+   { __DRI2_BUFFER_DAMAGE, 1, offsetof(struct dri2_egl_display, buffer_damage) },
    { __DRI2_RENDERER_QUERY, 1, offsetof(struct dri2_egl_display, rendererQuery) },
    { __DRI2_INTEROP, 1, offsetof(struct dri2_egl_display, interop) },
    { __DRI_IMAGE, 1, offsetof(struct dri2_egl_display, image) },
@@ -721,6 +722,9 @@ dri2_setup_screen(_EGLDisplay *disp)

    if (dri2_dpy->flush_control)
       disp->Extensions.KHR_context_flush_control = EGL_TRUE;
+
+   if (dri2_dpy->buffer_damage && dri2_dpy->buffer_damage->set_damage_region)
+      disp->Extensions.KHR_partial_update = EGL_TRUE;
 }

 void
@@ -1658,11 +1662,22 @@ static EGLBoolean
 dri2_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf)
 {
    struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
+   __DRIdrawable *dri_drawable = dri2_dpy->vtbl->get_dri_drawable(surf);
    _EGLContext *ctx = _eglGetCurrentContext();
+   EGLBoolean ret;

    if (ctx && surf)
       dri2_surf_update_fence_fd(ctx, disp, surf);
-   return dri2_dpy->vtbl->swap_buffers(drv, disp, surf);
+   ret = dri2_dpy->vtbl->swap_buffers(drv, disp, surf);
+
+   /* SwapBuffers marks the end of the
frame; reset the damage region for +* use again next time. +*/ + if (ret && dri2_dpy->buffer_damage && + dri2_dpy->buffer_damage->set_damage_region) + dri2_dpy->buffer_damage->set_damage_region(dri_drawable, 0, NULL); + + return ret; } static EGLBoolean @@ -1671,12 +1686,23 @@ dri2_swap_buffers_with_damage(_EGLDriver *drv, _EGLDisplay *disp, const EGLint *rects, EGLint n_rects) { struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); + __DRIdrawable *dri_drawable = dri2_dpy->vtbl->get_dri_drawable(surf); _EGLContext *ctx = _eglGetCurrentContext(); + EGLBoolean ret; if (ctx && surf) dri2_surf_update_fence_fd(ctx, disp, surf); - return dri2_dpy->vtbl->swap_buffers_with_damage(drv, disp, surf, - rects, n_rects); + ret = dri2_dpy->vtbl->swap_buffers_with_damage(drv, disp, surf, + rects, n_rects); + + /* SwapBuffers marks the end of the frame; reset the damage region for +* use again next time. +*/ + if (ret && dri2_dpy->buffer_damage && + dri2_dpy->buffer_damage->set_damage_region) + dri2_dpy->buffer_damage->set_damage_region(dri_drawable, 0, NULL); + + return ret; } static EGLBoolean @@ -1684,14 +1710,33 @@ dri2_swap_buffers_region(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf, EGLint numRects, const EGLint *rects) { struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); - return dri2_dpy->vtbl->swap_buffers_region(drv, disp, surf, numRects, rects); + __DRIdrawable *dri_drawable = dri2_dpy->vtbl->get_dri_drawable(surf); + EGLBoolean ret; + + ret = dri2_dpy->vtbl->swap_buffers_region(drv, disp, surf, numRects, rects); + + /* SwapBuffers marks the end of the frame; reset the damage region for +* use again next time. 
+*/ + if (ret && dri2_dpy->buffer_damage && + dri2_dpy->buffer_damage->set_damage_region) + dri2_dpy->buffer_damage->set_damage_region(dri_drawable, 0, NULL); + + return ret; } static EGLBoolean dri2_set_damage_region(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf, EGLint *rects, EGLint n_rects) { - return false; + struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); + __DRIdrawable *drawable = dri2_dpy->vtbl->get_dri_drawable(surf); + + if (!dri2_dpy->buffer_damage || !dri2_dpy->buffer_damage->set_damage_region) + return EGL_FALSE; + + dri2_dpy->buffer_damage->set_damage_region(drawable, n_rects, rects); + return EGL_TRUE; } static EGLBool
[Mesa-dev] [PATCH v5 4/5] st/dri2: Implement DRI2bufferDamageExtension
From: Daniel Stone

Add a pipe_screen->set_damage_region() hook to propagate
set-damage-region requests to the driver. It is then up to the driver to
decide what to do with this piece of information.

If the hook is left unassigned, the buffer-damage extension is
considered unsupported.

Signed-off-by: Daniel Stone
Signed-off-by: Boris Brezillon
Reviewed-by: Alyssa Rosenzweig
---
Changes in v5:
* Add Alyssa's R-b
---
 src/gallium/include/pipe/p_screen.h   |  7 +++
 src/gallium/state_trackers/dri/dri2.c | 22 ++
 2 files changed, 29 insertions(+)

diff --git a/src/gallium/include/pipe/p_screen.h b/src/gallium/include/pipe/p_screen.h
index 3f9bad470950..8df12ee4f865 100644
--- a/src/gallium/include/pipe/p_screen.h
+++ b/src/gallium/include/pipe/p_screen.h
@@ -464,6 +464,13 @@ struct pipe_screen {
    bool (*is_parallel_shader_compilation_finished)(struct pipe_screen *screen,
                                                    void *shader,
                                                    unsigned shader_type);
+
+   /**
+    * Set damage region.
+    */
+   void (*set_damage_region)(struct pipe_screen *screen,
+                             struct pipe_resource *resource,
+                             unsigned int nrects, int *rects);
 };


diff --git a/src/gallium/state_trackers/dri/dri2.c b/src/gallium/state_trackers/dri/dri2.c
index 5a7ec878bab0..1a86cd244d21 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -1807,6 +1807,23 @@ static const __DRI2interopExtension dri2InteropExtension = {
    .export_object = dri2_interop_export_object
 };

+/**
+ * \brief the DRI2bufferDamageExtension set_damage_region method
+ */
+static void
+dri2_set_damage_region(__DRIdrawable *dPriv, unsigned int nrects, int *rects)
+{
+   struct dri_drawable *drawable = dri_drawable(dPriv);
+   struct pipe_resource *resource = drawable->textures[ST_ATTACHMENT_BACK_LEFT];
+   struct pipe_screen *screen = resource->screen;
+
+   screen->set_damage_region(screen, resource, nrects, rects);
+}
+
+static __DRI2bufferDamageExtension dri2BufferDamageExtension = {
+   .base = { __DRI2_BUFFER_DAMAGE, 1 },
+};
+
 /**
  * \brief the DRI2ConfigQueryExtension configQueryb method
  */
@@ -1908,6 +1925,7 @@ static const __DRIextension *dri_screen_extensions[] = {
    &dri2GalliumConfigQueryExtension.base,
    &dri2ThrottleExtension.base,
    &dri2FenceExtension.base,
+   &dri2BufferDamageExtension.base,
    &dri2InteropExtension.base,
    &dri2NoErrorExtension.base,
    &driBlobExtension.base,
@@ -1923,6 +1941,7 @@ static const __DRIextension *dri_robust_screen_extensions[] = {
    &dri2ThrottleExtension.base,
    &dri2FenceExtension.base,
    &dri2InteropExtension.base,
+   &dri2BufferDamageExtension.base,
    &dri2Robustness.base,
    &dri2NoErrorExtension.base,
    &driBlobExtension.base,
@@ -1983,6 +2002,9 @@ dri2_init_screen(__DRIscreen * sPriv)
       }
    }

+   if (pscreen->set_damage_region)
+      dri2BufferDamageExtension.set_damage_region = dri2_set_damage_region;
+
    if (pscreen->get_param(pscreen, PIPE_CAP_DEVICE_RESET_STATUS_QUERY)) {
       sPriv->extensions = dri_robust_screen_extensions;
       screen->has_reset_status_query = true;
--
2.21.0
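The "unassigned hook means unsupported extension" rule from the st/dri2 patch above can be sketched on its own. This is a minimal illustration, not the Mesa code: every `sketch_*` name is hypothetical, and the screen and extension structs are reduced to the one function pointer that matters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver-side screen: the hook stays NULL when unsupported. */
struct sketch_screen {
    void (*set_damage_region)(struct sketch_screen *screen,
                              unsigned int nrects, int *rects);
    unsigned int last_nrects; /* recorded by the example hook below */
};

/* Hypothetical loader-facing extension table. */
struct sketch_damage_ext {
    void (*set_damage_region)(struct sketch_screen *screen,
                              unsigned int nrects, int *rects);
};

/* Example driver hook: just remembers how many rects it was given. */
static void
sketch_record_damage(struct sketch_screen *screen,
                     unsigned int nrects, int *rects)
{
    (void) rects;
    screen->last_nrects = nrects;
}

/* Thin forwarder, only ever installed when the driver hook exists. */
static void
sketch_forward_damage(struct sketch_screen *screen,
                      unsigned int nrects, int *rects)
{
    assert(screen->set_damage_region != NULL);
    screen->set_damage_region(screen, nrects, rects);
}

/* Populate the extension entry point only if the driver opted in. */
static void
sketch_init_ext(struct sketch_damage_ext *ext,
                const struct sketch_screen *screen)
{
    ext->set_damage_region =
        screen->set_damage_region ? sketch_forward_damage : NULL;
}

/* What a loader would check before advertising partial update. */
static int
sketch_ext_supported(const struct sketch_damage_ext *ext)
{
    return ext != NULL && ext->set_damage_region != NULL;
}
```

Leaving the entry point NULL lets the loader test both the extension and its method, which is the same two-step check the egl/dri patch in this series performs before exposing KHR_partial_update.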
[Mesa-dev] [PATCH v5 2/5] dri_interface: add DRI2_BufferDamage interface
From: Daniel Stone

Add a new DRI2_BufferDamage interface to support the
EGL_KHR_partial_update extension, informing the driver of an overriding
scissor region for a particular drawable.

Based on a commit originally authored by:
Harish Krupo

renamed extension, retargeted at DRI drawable instead of context,
rewritten description

Signed-off-by: Daniel Stone
Signed-off-by: Boris Brezillon
Acked-by: Alyssa Rosenzweig
---
Changes in v5:
* Add Alyssa's a-b
* Add Daniel's SoB
---
 include/GL/internal/dri_interface.h | 43 +
 1 file changed, 43 insertions(+)

diff --git a/include/GL/internal/dri_interface.h b/include/GL/internal/dri_interface.h
index af0ee9c56670..ada78c5d53d6 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -85,6 +85,7 @@ typedef struct __DRI2throttleExtensionRec __DRI2throttleExtension;
 typedef struct __DRI2fenceExtensionRec __DRI2fenceExtension;
 typedef struct __DRI2interopExtensionRec __DRI2interopExtension;
 typedef struct __DRI2blobExtensionRec __DRI2blobExtension;
+typedef struct __DRI2bufferDamageExtensionRec __DRI2bufferDamageExtension;

 typedef struct __DRIimageLoaderExtensionRec __DRIimageLoaderExtension;
 typedef struct __DRIimageDriverExtensionRec __DRIimageDriverExtension;
@@ -488,6 +489,48 @@ struct __DRI2interopExtensionRec {
                                  struct mesa_glinterop_export_out *out);
 };

+
+/**
+ * Extension for limiting window system back buffer rendering to user-defined
+ * scissor region.
+ */
+
+#define __DRI2_BUFFER_DAMAGE "DRI2_BufferDamage"
+#define __DRI2_BUFFER_DAMAGE_VERSION 1
+
+struct __DRI2bufferDamageExtensionRec {
+   __DRIextension base;
+
+   /**
+    * Provides an array of rectangles representing an overriding scissor region
+    * for rendering operations performed to the specified drawable. These
+    * rectangles do not replace client API scissor regions or draw
+    * co-ordinates, but instead inform the driver of the overall bounds of all
+    * operations which will be issued before the next flush.
+    *
+    * Any rendering operations writing pixels outside this region to the
+    * drawable will have an undefined effect on the entire drawable.
+    *
+    * This entrypoint may only be called after the drawable has either been
+    * newly created or flushed, and before any rendering operations which write
+    * pixels to the drawable. Calling this entrypoint at any other time will
+    * have an undefined effect on the entire drawable.
+    *
+    * Calling this entrypoint with @nrects 0 and @rects NULL will reset the
+    * region to the buffer's full size. This entrypoint may be called once to
+    * reset the region, followed by a second call with a populated region,
+    * before a rendering call is made.
+    *
+    * Used to implement EGL_KHR_partial_update.
+    *
+    * \param drawable affected drawable
+    * \param nrects   number of rectangles provided
+    * \param rects    the array of rectangles, lower-left origin
+    */
+   void (*set_damage_region)(__DRIdrawable *drawable, unsigned int nrects,
+                             int *rects);
+};
+
 /*@}*/

 /**
--
2.21.0
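To make the "overall bounds" wording of the interface comment above concrete, here is a rough sketch of how a driver might turn the rectangle array into a single extent. It assumes each rectangle is four ints (x, y, width, height), the layout EGL_KHR_partial_update uses; `sketch_extent` and `sketch_damage_extent()` are hypothetical names, not part of dri_interface.h.

```c
#include <assert.h>
#include <stddef.h>

/* Bounding box of the damage; the max values are exclusive. */
struct sketch_extent {
    int minx, miny, maxx, maxy;
};

/*
 * Compute the bounding box of nrects rectangles, each stored as
 * (x, y, width, height). Passing nrects == 0 or rects == NULL resets
 * the extent to the full buffer, matching the reset rule documented
 * for set_damage_region().
 */
static struct sketch_extent
sketch_damage_extent(unsigned int nrects, const int *rects,
                     int buf_width, int buf_height)
{
    struct sketch_extent e = { 0, 0, buf_width, buf_height };

    if (nrects == 0 || rects == NULL)
        return e;

    /* Start from the first rectangle, then grow to cover the others. */
    e.minx = rects[0];
    e.miny = rects[1];
    e.maxx = rects[0] + rects[2];
    e.maxy = rects[1] + rects[3];

    for (unsigned int i = 1; i < nrects; i++) {
        const int *r = &rects[i * 4];

        if (r[0] < e.minx)
            e.minx = r[0];
        if (r[1] < e.miny)
            e.miny = r[1];
        if (r[0] + r[2] > e.maxx)
            e.maxx = r[0] + r[2];
        if (r[1] + r[3] > e.maxy)
            e.maxy = r[1] + r[3];
    }

    /* Clamp to the buffer so later consumers never see out-of-range values. */
    if (e.minx < 0)
        e.minx = 0;
    if (e.miny < 0)
        e.miny = 0;
    if (e.maxx > buf_width)
        e.maxx = buf_width;
    if (e.maxy > buf_height)
        e.maxy = buf_height;

    return e;
}
```

For example, on a 100x100 buffer the rectangles (10, 20, 30, 40) and (50, 5, 10, 10) give an extent running from (10, 5) to (60, 60).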
[Mesa-dev] [PATCH v5 1/5] egl/android: Delete set_damage_region from egl dri vtbl
From: Harish Krupo

The intention of the KHR_partial_update was not to send the damage back
to the platform but to send the damage to the driver to ensure that the
following rendering could be restricted to those regions.

This patch removes the set_damage_region from the egl_dri vtbl and all
the platform_*.c files.
Upcoming patches then add a new dri2 interface for the drivers to
implement.

Signed-off-by: Harish Krupo
Reviewed-by: Daniel Stone
Signed-off-by: Boris Brezillon
Acked-by: Alyssa Rosenzweig
---
Changes in v5:
* Add Alyssa's a-b
---
 src/egl/drivers/dri2/egl_dri2.c             |  3 +-
 src/egl/drivers/dri2/egl_dri2.h             |  4 --
 src/egl/drivers/dri2/egl_dri2_fallbacks.h   |  9 -
 src/egl/drivers/dri2/platform_android.c     | 45 -
 src/egl/drivers/dri2/platform_device.c      |  1 -
 src/egl/drivers/dri2/platform_drm.c         |  1 -
 src/egl/drivers/dri2/platform_surfaceless.c |  1 -
 src/egl/drivers/dri2/platform_wayland.c     |  1 -
 src/egl/drivers/dri2/platform_x11.c         |  2 -
 src/egl/drivers/dri2/platform_x11_dri3.c    |  1 -
 10 files changed, 1 insertion(+), 67 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index ee4faaab34f4..3c33b2cf27f8 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -1691,8 +1691,7 @@ static EGLBoolean
 dri2_set_damage_region(_EGLDriver *drv, _EGLDisplay *disp,
                        _EGLSurface *surf, EGLint *rects, EGLint n_rects)
 {
-   struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
-   return dri2_dpy->vtbl->set_damage_region(drv, disp, surf, rects, n_rects);
+   return false;
 }

 static EGLBoolean
diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index fa04e3bb616d..1d9fe3db625f 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -122,10 +122,6 @@ struct dri2_egl_display_vtbl {
                                           _EGLSurface *surface,
                                           const EGLint *rects, EGLint n_rects);

-   EGLBoolean (*set_damage_region)(_EGLDriver *drv, _EGLDisplay *disp,
-                                   _EGLSurface *surface,
-                                   const EGLint *rects, EGLint
n_rects); - EGLBoolean (*swap_buffers_region)(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf, EGLint numRects, const EGLint *rects); diff --git a/src/egl/drivers/dri2/egl_dri2_fallbacks.h b/src/egl/drivers/dri2/egl_dri2_fallbacks.h index 6c2c4bbe595e..d975b7a8b130 100644 --- a/src/egl/drivers/dri2/egl_dri2_fallbacks.h +++ b/src/egl/drivers/dri2/egl_dri2_fallbacks.h @@ -62,7 +62,6 @@ dri2_fallback_swap_buffers_with_damage(_EGLDriver *drv, _EGLDisplay *disp, const EGLint *rects, EGLint n_rects) { struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); - dri2_dpy->vtbl->set_damage_region(drv, disp, surf, rects, n_rects); return dri2_dpy->vtbl->swap_buffers(drv, disp, surf); } @@ -90,14 +89,6 @@ dri2_fallback_copy_buffers(_EGLDriver *drv, _EGLDisplay *disp, return _eglError(EGL_BAD_NATIVE_PIXMAP, "no support for native pixmaps"); } -static inline EGLBoolean -dri2_fallback_set_damage_region(_EGLDriver *drv, _EGLDisplay *disp, -_EGLSurface *surf, -const EGLint *rects, EGLint n_rects) -{ - return EGL_FALSE; -} - static inline EGLint dri2_fallback_query_buffer_age(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf) diff --git a/src/egl/drivers/dri2/platform_android.c b/src/egl/drivers/dri2/platform_android.c index db6ba4a4b4d6..6ce04d250c8d 100644 --- a/src/egl/drivers/dri2/platform_android.c +++ b/src/egl/drivers/dri2/platform_android.c @@ -728,43 +728,6 @@ droid_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw) return EGL_TRUE; } -#if ANDROID_API_LEVEL >= 23 -static EGLBoolean -droid_set_damage_region(_EGLDriver *drv, -_EGLDisplay *disp, -_EGLSurface *draw, const EGLint* rects, EGLint n_rects) -{ - struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); - struct dri2_egl_surface *dri2_surf = dri2_egl_surface(draw); - android_native_rect_t* droid_rects = NULL; - int ret; - - if (n_rects == 0) - return EGL_TRUE; - - droid_rects = malloc(n_rects * sizeof(android_native_rect_t)); - if (droid_rects == NULL) - return _eglError(EGL_BAD_ALLOC, 
"eglSetDamageRegionKHR"); - - for (EGLint num_drects = 0; num_drects < n_rects; num_drects++) { - EGLint i = num_drects * 4; - droid_rects[num_drects].left = rects[i]; - droid_rects[num_drects].bottom = rects[i + 1]; - droid_rects[num_drects].right = rects[i] + rects[i + 2]; - droid_r
[Mesa-dev] [PATCH v5 0/5] EGL_KHR_partial_update support
This is an attempt at resurrecting Daniel's MR [1] which was already
resurrecting Harish's EGL_KHR_partial_update series [2].

This version implements Marek's suggestion to pass the
set_damage_region() directly to the gallium driver and let it decide how
to handle the request. Some drivers might just calculate the damage
extent (as done in Daniel's initial proposal and in the panfrost
implementation), others might do extra optimizations like trying to
reduce the area we're supposed to reload (only valid for tile-based
rendering) even further.

This patch series has been tested with weston (see Daniel's MR [3]) on
panfrost. Note that the panfrost implementation is rather simple (just
limits the rendering area to the damage extent and picks the biggest
damage rect as the only damage region) but we can improve it if we feel
the need.

Only minor changes in this v5 (collecting the R-b/A-b tags + addressing
Alyssa's comments on patch 5).

Regards,

Boris

[1] https://gitlab.freedesktop.org/mesa/mesa/merge_requests/227
[2] https://patchwork.freedesktop.org/series/45915/#rev2
[3] https://gitlab.freedesktop.org/wayland/weston/merge_requests/106

Boris Brezillon (1):
  panfrost: Add support for KHR_partial_update()

Daniel Stone (2):
  dri_interface: add DRI2_BufferDamage interface
  st/dri2: Implement DRI2bufferDamageExtension

Harish Krupo (2):
  egl/android: Delete set_damage_region from egl dri vtbl
  egl/dri: Use __DRI2_BUFFER_DAMAGE extension for KHR_partial_update

 include/GL/internal/dri_interface.h         | 43 ++
 src/egl/drivers/dri2/egl_dri2.c             | 54 ++--
 src/egl/drivers/dri2/egl_dri2.h             |  5 +-
 src/egl/drivers/dri2/egl_dri2_fallbacks.h   |  9 --
 src/egl/drivers/dri2/platform_android.c     | 45 --
 src/egl/drivers/dri2/platform_device.c      |  1 -
 src/egl/drivers/dri2/platform_drm.c         |  1 -
 src/egl/drivers/dri2/platform_surfaceless.c |  1 -
 src/egl/drivers/dri2/platform_wayland.c     |  1 -
 src/egl/drivers/dri2/platform_x11.c         |  2 -
 src/egl/drivers/dri2/platform_x11_dri3.c    |  1 -
 src/gallium/drivers/panfrost/pan_blit.c     | 10 +--
 src/gallium/drivers/panfrost/pan_context.c  | 63 +-
 src/gallium/drivers/panfrost/pan_job.c      | 11 +++
 src/gallium/drivers/panfrost/pan_job.h      |  5 ++
 src/gallium/drivers/panfrost/pan_resource.c | 91 +
 src/gallium/drivers/panfrost/pan_resource.h | 12 ++-
 src/gallium/drivers/panfrost/pan_screen.c   |  1 +
 src/gallium/include/pipe/p_screen.h         |  7 ++
 src/gallium/state_trackers/dri/dri2.c       | 22 +
 20 files changed, 308 insertions(+), 77 deletions(-)

--
2.21.0
[Mesa-dev] [PATCH v5 5/5] panfrost: Add support for KHR_partial_update()
Implement ->set_damage_region() to support partial updates.

This is a dummy implementation in that it does not try to merge damage rects. It also does not deal with distinct regions and instead picks the largest quad as the only damage rect and generates up to 4 reload rects out of it (the left/right/top/bottom regions surrounding the biggest damage rect).

We also do not try to reduce the number of draws by passing all quad vertices to the blit request (that would require extending u_blitter).

Signed-off-by: Boris Brezillon
---
Changes in v5:
- rename the second panfrost_blit_wallpaper() argument
- add extra comment to explain how the set_damage_region() logic works
- clarify why checking for negative box->{width,height} is not needed in panfrost_draw_wallpaper()
---
 src/gallium/drivers/panfrost/pan_blit.c | 10 +--
 src/gallium/drivers/panfrost/pan_context.c | 63 +-
 src/gallium/drivers/panfrost/pan_job.c | 11 +++
 src/gallium/drivers/panfrost/pan_job.h | 5 ++
 src/gallium/drivers/panfrost/pan_resource.c | 91 +
 src/gallium/drivers/panfrost/pan_resource.h | 12 ++-
 src/gallium/drivers/panfrost/pan_screen.c | 1 +
 7 files changed, 186 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_blit.c b/src/gallium/drivers/panfrost/pan_blit.c
index 67912a4b130f..226f67e674f5 100644
--- a/src/gallium/drivers/panfrost/pan_blit.c
+++ b/src/gallium/drivers/panfrost/pan_blit.c
@@ -103,7 +103,7 @@ panfrost_blit(struct pipe_context *pipe,
 */

 void
-panfrost_blit_wallpaper(struct panfrost_context *ctx)
+panfrost_blit_wallpaper(struct panfrost_context *ctx, struct pipe_box *box)
 {
 struct pipe_blit_info binfo = { };

@@ -116,11 +116,11 @@ panfrost_blit_wallpaper(struct panfrost_context *ctx)
 binfo.src.resource = binfo.dst.resource = ctx->pipe_framebuffer.cbufs[0]->texture;
 binfo.src.level = binfo.dst.level = level;
- binfo.src.box.x = binfo.dst.box.x = 0;
- binfo.src.box.y = binfo.dst.box.y = 0;
+ binfo.src.box.x = binfo.dst.box.x = box->x;
+ binfo.src.box.y = binfo.dst.box.y = box->y;
 binfo.src.box.z = binfo.dst.box.z = layer;
- binfo.src.box.width = binfo.dst.box.width = ctx->pipe_framebuffer.width;
- binfo.src.box.height = binfo.dst.box.height = ctx->pipe_framebuffer.height;
+ binfo.src.box.width = binfo.dst.box.width = box->width;
+ binfo.src.box.height = binfo.dst.box.height = box->height;
 binfo.src.box.depth = binfo.dst.box.depth = 1;
 binfo.src.format = binfo.dst.format = ctx->pipe_framebuffer.cbufs[0]->format;
diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c
index 88e70c978818..7462e490e229 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -1472,7 +1472,68 @@ panfrost_draw_wallpaper(struct pipe_context *pipe)
 struct panfrost_job *batch = panfrost_get_job_for_fbo(ctx);
 ctx->wallpaper_batch = batch;
-panfrost_blit_wallpaper(ctx);
+
+/* Clamp the rendering area to the damage extent. The
+ * KHR_partial_update() spec states that trying to render outside of
+ * the damage region is "undefined behavior", so we should be safe.
+ */
+panfrost_job_intersection_scissor(batch, rsrc->damage.extent.minx,
+ rsrc->damage.extent.miny,
+ rsrc->damage.extent.maxx,
+ rsrc->damage.extent.maxy);
+
+struct pipe_scissor_state damage;
+struct pipe_box rects[4];
+
+/* Clamp the damage box to the rendering area. */
+damage.minx = MAX2(batch->minx, rsrc->damage.biggest_rect.x);
+damage.miny = MAX2(batch->miny, rsrc->damage.biggest_rect.y);
+damage.maxx = MIN2(batch->maxx,
+ rsrc->damage.biggest_rect.x +
+ rsrc->damage.biggest_rect.width);
+damage.maxy = MIN2(batch->maxy,
+ rsrc->damage.biggest_rect.y +
+ rsrc->damage.biggest_rect.height);
+
+/* One damage rectangle means we can end up with at most 4 reload
+ * regions:
+ * 1: left region, only exists if damage.x > 0
+ * 2: right region, only exists if damage.x + damage.width < fb->width
+ * 3: top region, only exists if damage.y > 0. The intersection with
+ *the left and right regions are dropped
+ * 4: bottom region, only exists if damage.y + damage.height < fb->height.
+ *The intersection with the left and right regions are dropped
+ *
+ *
+ *| | 3 | |
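The reload-region split described in the comment above can be sketched in isolation. This is only an illustration under stated assumptions, not the code from the patch: struct box2d and reload_rects() are hypothetical names, the area is assumed to start at (0, 0), and the damage rect is assumed to be non-empty and already clamped to that area.

```c
#include <assert.h>

/* Hypothetical 2D stand-in for gallium's pipe_box. */
struct box2d {
    int x, y, width, height;
};

/*
 * Hypothetical helper: given an area of area_w x area_h pixels and a single
 * damage rect inside it, fill out[] with the regions that are NOT damaged
 * (the ones whose previous content has to be reloaded). Returns the number
 * of rects written, between 0 and 4.
 */
static int
reload_rects(int area_w, int area_h, const struct box2d *damage,
             struct box2d out[4])
{
    int minx = damage->x, miny = damage->y;
    int maxx = damage->x + damage->width;
    int maxy = damage->y + damage->height;
    int n = 0;

    /* Assumption: the damage rect is non-empty and clamped to the area. */
    assert(minx >= 0 && miny >= 0 && maxx <= area_w && maxy <= area_h);
    assert(damage->width > 0 && damage->height > 0);

    /* 1: left region, spans the full height. */
    if (minx > 0)
        out[n++] = (struct box2d){ 0, 0, minx, area_h };

    /* 2: right region, spans the full height. */
    if (maxx < area_w)
        out[n++] = (struct box2d){ maxx, 0, area_w - maxx, area_h };

    /* 3: top region, limited to the damage width so it does not
     * overlap the left and right regions. */
    if (miny > 0)
        out[n++] = (struct box2d){ minx, 0, maxx - minx, miny };

    /* 4: bottom region, same horizontal limits as the top one. */
    if (maxy < area_h)
        out[n++] = (struct box2d){ minx, maxy, maxx - minx, area_h - maxy };

    return n;
}
```

A damage rect covering the whole area yields zero reload rects, which matches the intent of skipping the wallpaper reload when everything is going to be redrawn.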
Re: [Mesa-dev] [PATCH v5 1/5] egl/android: Delete set_damage_region from egl dri vtbl
On Tue, 2 Jul 2019 15:49:58 +0200 Boris Brezillon wrote: > From: Harish Krupo Crap, forgot to update your email address here > > The intension of the KHR_partial_update was not to send the damage back > to the platform but to send the damage to the driver to ensure that the > following rendering could be restricted to those regions. > This patch removes the set_damage_region from the egl_dri vtbl and all > the platfrom_*.c files. > Then upcomming patches add a new dri2 interface for the drivers to > implement > > Signed-off-by: Harish Krupo and here. > Reviewed-by: Daniel Stone > Signed-off-by: Boris Brezillon > Acked-by: Alyssa Rosenzweig > --- > Changes in v5: > * Add Alyssa's a-b > --- > src/egl/drivers/dri2/egl_dri2.c | 3 +- > src/egl/drivers/dri2/egl_dri2.h | 4 -- > src/egl/drivers/dri2/egl_dri2_fallbacks.h | 9 - > src/egl/drivers/dri2/platform_android.c | 45 - > src/egl/drivers/dri2/platform_device.c | 1 - > src/egl/drivers/dri2/platform_drm.c | 1 - > src/egl/drivers/dri2/platform_surfaceless.c | 1 - > src/egl/drivers/dri2/platform_wayland.c | 1 - > src/egl/drivers/dri2/platform_x11.c | 2 - > src/egl/drivers/dri2/platform_x11_dri3.c| 1 - > 10 files changed, 1 insertion(+), 67 deletions(-) > > diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c > index ee4faaab34f4..3c33b2cf27f8 100644 > --- a/src/egl/drivers/dri2/egl_dri2.c > +++ b/src/egl/drivers/dri2/egl_dri2.c > @@ -1691,8 +1691,7 @@ static EGLBoolean > dri2_set_damage_region(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf, > EGLint *rects, EGLint n_rects) > { > - struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); > - return dri2_dpy->vtbl->set_damage_region(drv, disp, surf, rects, n_rects); > + return false; > } > > static EGLBoolean > diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h > index fa04e3bb616d..1d9fe3db625f 100644 > --- a/src/egl/drivers/dri2/egl_dri2.h > +++ b/src/egl/drivers/dri2/egl_dri2.h > @@ -122,10 +122,6 @@ struct 
dri2_egl_display_vtbl { >_EGLSurface *surface, >const EGLint *rects, EGLint > n_rects); > > - EGLBoolean (*set_damage_region)(_EGLDriver *drv, _EGLDisplay *disp, > - _EGLSurface *surface, > - const EGLint *rects, EGLint n_rects); > - > EGLBoolean (*swap_buffers_region)(_EGLDriver *drv, _EGLDisplay *disp, > _EGLSurface *surf, EGLint numRects, > const EGLint *rects); > diff --git a/src/egl/drivers/dri2/egl_dri2_fallbacks.h > b/src/egl/drivers/dri2/egl_dri2_fallbacks.h > index 6c2c4bbe595e..d975b7a8b130 100644 > --- a/src/egl/drivers/dri2/egl_dri2_fallbacks.h > +++ b/src/egl/drivers/dri2/egl_dri2_fallbacks.h > @@ -62,7 +62,6 @@ dri2_fallback_swap_buffers_with_damage(_EGLDriver *drv, > _EGLDisplay *disp, >const EGLint *rects, EGLint n_rects) > { > struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); > - dri2_dpy->vtbl->set_damage_region(drv, disp, surf, rects, n_rects); > return dri2_dpy->vtbl->swap_buffers(drv, disp, surf); > } > > @@ -90,14 +89,6 @@ dri2_fallback_copy_buffers(_EGLDriver *drv, _EGLDisplay > *disp, > return _eglError(EGL_BAD_NATIVE_PIXMAP, "no support for native pixmaps"); > } > > -static inline EGLBoolean > -dri2_fallback_set_damage_region(_EGLDriver *drv, _EGLDisplay *disp, > -_EGLSurface *surf, > -const EGLint *rects, EGLint n_rects) > -{ > - return EGL_FALSE; > -} > - > static inline EGLint > dri2_fallback_query_buffer_age(_EGLDriver *drv, _EGLDisplay *disp, > _EGLSurface *surf) > diff --git a/src/egl/drivers/dri2/platform_android.c > b/src/egl/drivers/dri2/platform_android.c > index db6ba4a4b4d6..6ce04d250c8d 100644 > --- a/src/egl/drivers/dri2/platform_android.c > +++ b/src/egl/drivers/dri2/platform_android.c > @@ -728,43 +728,6 @@ droid_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, > _EGLSurface *draw) > return EGL_TRUE; > } > > -#if ANDROID_API_LEVEL >= 23 > -static EGLBoolean > -droid_set_damage_region(_EGLDriver *drv, > -_EGLDisplay *disp, > -_EGLSurface *draw, const EGLint* rects, EGLint > n_rects) > -{ > - struct dri2_egl_display 
*dr
Re: [Mesa-dev] [PATCH 06/10] panfrost: Avoid passing winsys handles to import/export BO funcs
On Tue, 2 Jul 2019 06:56:50 -0700 Alyssa Rosenzweig wrote: > Question: Does this allow us to map arbitrary CPU buffers into GPU > space? Depends what you mean by arbitrary. You can map any dmabuf, that means the buffer has to be created kernel side and exported as a DMAbuf. > Stuff with no relation to the winsys, just... arbitrary user > memory? Nope, I don't think so. That might work if you allocate things through udmabuf, but then you're better off allocating a BO directly. > That might be useful for index/vertex buffers (which we > currently are forced to memcpy() into a BO we create if given a user > pointer rather than a resource), but maybe not actually because of sync > requirements. > > On Tue, Jul 02, 2019 at 03:23:49PM +0200, Boris Brezillon wrote: > > Let's keep a clear split between ioctl wrappers and the rest of the > > driver. All the import BO function need is a dmabuf FD and the screen > > object, and the export one should only take care of generating a dmabuf > > FD out of a BO object. Winsys handle manipulation should stay in the > > resource.c file. 
> > > > Signed-off-by: Boris Brezillon > > --- > > src/gallium/drivers/panfrost/pan_drm.c | 17 +++-- > > src/gallium/drivers/panfrost/pan_resource.c | 16 +++- > > src/gallium/drivers/panfrost/pan_screen.h | 6 ++ > > 3 files changed, 20 insertions(+), 19 deletions(-) > > > > diff --git a/src/gallium/drivers/panfrost/pan_drm.c > > b/src/gallium/drivers/panfrost/pan_drm.c > > index f17f56bc6307..b88ab0e5ce2b 100644 > > --- a/src/gallium/drivers/panfrost/pan_drm.c > > +++ b/src/gallium/drivers/panfrost/pan_drm.c > > @@ -114,7 +114,7 @@ panfrost_drm_free_slab(struct panfrost_screen *screen, > > struct panfrost_memory *m > > } > > > > struct panfrost_bo * > > -panfrost_drm_import_bo(struct panfrost_screen *screen, struct > > winsys_handle *whandle) > > +panfrost_drm_import_bo(struct panfrost_screen *screen, int fd) > > { > > struct panfrost_bo *bo = rzalloc(screen, struct panfrost_bo); > > struct drm_panfrost_get_bo_offset get_bo_offset = {0,}; > > @@ -122,7 +122,7 @@ panfrost_drm_import_bo(struct panfrost_screen *screen, > > struct winsys_handle *wha > > int ret; > > unsigned gem_handle; > > > > - ret = drmPrimeFDToHandle(screen->fd, whandle->handle, &gem_handle); > > + ret = drmPrimeFDToHandle(screen->fd, fd, &gem_handle); > > assert(!ret); > > > > get_bo_offset.handle = gem_handle; > > @@ -141,7 +141,7 @@ panfrost_drm_import_bo(struct panfrost_screen *screen, > > struct winsys_handle *wha > > assert(0); > > } > > > > -bo->size = lseek(whandle->handle, 0, SEEK_END); > > +bo->size = lseek(fd, 0, SEEK_END); > > assert(bo->size > 0); > > bo->cpu = os_mmap(NULL, bo->size, PROT_READ | PROT_WRITE, > > MAP_SHARED, > > screen->fd, mmap_bo.offset); > > @@ -158,21 +158,18 @@ panfrost_drm_import_bo(struct panfrost_screen > > *screen, struct winsys_handle *wha > > } > > > > int > > -panfrost_drm_export_bo(struct panfrost_screen *screen, int gem_handle, > > unsigned int stride, struct winsys_handle *whandle) > > +panfrost_drm_export_bo(struct panfrost_screen *screen, const struct > 
> panfrost_bo *bo) > > { > > struct drm_prime_handle args = { > > -.handle = gem_handle, > > +.handle = bo->gem_handle, > > .flags = DRM_CLOEXEC, > > }; > > > > int ret = drmIoctl(screen->fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, > > &args); > > if (ret == -1) > > -return FALSE; > > +return -1; > > > > -whandle->handle = args.fd; > > -whandle->stride = stride; > > - > > -return TRUE; > > +return args.fd; > > } > > > > static int > > diff --git a/src/gallium/drivers/panfrost/pan_resource.c > > b/src/gallium/drivers/panfrost/pan_resource.c > > index 8901aeee09b1..f86617f80c20 100644 > > --- a/src/gallium/drivers/panfrost/pan_resource.c > > +++ b/src/gallium/drivers/panfrost/pan_resource.c > > @@ -70,7 +70,7 @@ panfrost_resource_from_handle(struct pipe_screen *pscreen, > > pipe_reference_init(&prsc->reference, 1); > > prsc->screen = pscreen; > > > > - rsc->bo = panfrost_drm_
Re: [Mesa-dev] [PATCH 05/10] panfrost: Move BO meta-data out of panfrost_bo
On Tue, 2 Jul 2019 06:53:56 -0700 Alyssa Rosenzweig wrote:

> Oh, not controversial at all, I'm quite happy with this!
> Just a question -- I remember some panfrost_resources didn't have a bo
> but had a winsys thingy instead. How does that interact?

I didn't notice any specific test for the !rsrc->bo case, so it's probably gone. Note that winsys handles are converted into BOs using panfrost_drm_import_bo().
Re: [Mesa-dev] [PATCH 09/10] panfrost: Make SLAB pool creation rely on BO helpers
On Tue, 2 Jul 2019 16:54:22 +0200 Tomeu Vizoso wrote:

> On Tue, 2 Jul 2019 at 15:24, Boris Brezillon wrote:
> >
> > There's no point duplicating the code, and it will help us simplify
> > the bo_handles[] filling logic in panfrost_drm_submit_job().
>
> Looks good but, could we drop panfrost_memory completely? Other
> drivers seem to do fine wthout such a thing.

We need a wrapper that contains a BO plus a pb_slab object for SLAB-based allocations (allocation of sub-page-size objects), and that's exactly what panfrost_memory is right now. We can rename it if you like, but I don't think we can get rid of it.
Re: [Mesa-dev] [PATCH v5 4/5] st/dri2: Implement DRI2bufferDamageExtension
On Tue, 2 Jul 2019 13:21:31 -0400 Marek Olšák wrote:

> On Tue., Jul. 2, 2019, 09:50 Boris Brezillon, wrote:
>
> > From: Daniel Stone
> >
> > Add a pipe_screen->set_damage_region() hook to propagate
> > set-damage-region requests to the driver, it's then up to the
> > driver to decide what to do with this piece of information.
> >
> > If the hook is left unassigned, the buffer-damage extension is
> > considered unsupported.
> >
> > Signed-off-by: Daniel Stone
> > Signed-off-by: Boris Brezillon
> > Reviewed-by: Alyssa Rosenzweig
> > ---
> > Changes in v5:
> > * Add Alyssa's R-b
> > ---
> > src/gallium/include/pipe/p_screen.h | 7 +++
> > src/gallium/state_trackers/dri/dri2.c | 22 ++
> > 2 files changed, 29 insertions(+)
> >
> > diff --git a/src/gallium/include/pipe/p_screen.h
> > b/src/gallium/include/pipe/p_screen.h
> > index 3f9bad470950..8df12ee4f865 100644
> > --- a/src/gallium/include/pipe/p_screen.h
> > +++ b/src/gallium/include/pipe/p_screen.h
> > @@ -464,6 +464,13 @@ struct pipe_screen {
> > bool (*is_parallel_shader_compilation_finished)(struct
> > pipe_screen *screen,
> > void *shader,
> > unsigned
> > shader_type); +
> > + /**
> > +* Set damage region.
> >
>
> Can you expand the comment to describe rects? The format of rects is
> not obvious.

Oops, will point to the KHR_partial_update() doc and explain what rects encode and how.

This reminds me that we have a corner case (at least for tile-based GPUs): the dri implementation calls ->set_damage_region(screen, res, 0, NULL) to reset the damage region, but in the KHR_partial_update() spec this means "damage all". If we follow the spec, that would imply the existing FB content is dropped, which in turn means users relying on buffer_age() (without partial_update()) to only update the regions that have changed will stop working properly.

I see 2 options to solve this problem:

1/ Add a new ->reset_damage_region() hook that would be called by the dri implementation after each swap_buf(), replacing the current ->set_damage_region(screen, res, 0, NULL) call. Reset in that case means we consider the damage region "unknown" and force a reload of the FB content into the local tile buffer for the whole resource instead of restricting it to the !damage region.

2/ Deviate from the KHR_partial_update() semantics and reserve ->set_damage_region(screen, res, 0, NULL) for the "reset damage region" operation. That means we'll have to convert actual KHR_partial_update(0, NULL) calls into ->set_damage_region(screen, res, 1, full_res_rect) ones to reflect the behavior described in the spec.
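To make option 2 a bit more concrete, here is a minimal, self-contained sketch of how the two meanings of "0 rects" could be kept apart. All names (struct rect, struct damage_state, driver_set_damage_region(), frontend_partial_update(), frontend_after_swap()) are made up for illustration and are not the gallium or dri interfaces; the driver side only tracks the damage extent, which is one of the simple strategies mentioned in the cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rect type: x, y, width, height. */
struct rect {
    int x, y, width, height;
};

/* Hypothetical per-resource damage tracking. */
struct damage_state {
    bool known;          /* false means "unknown": reload the whole resource */
    struct rect extent;  /* bounding box of all damage rects when known */
};

/* Driver side, option 2: (0, NULL) is reserved for "reset damage region". */
static void
driver_set_damage_region(struct damage_state *d, unsigned nrects,
                         const struct rect *rects)
{
    if (nrects == 0) {
        d->known = false;
        return;
    }

    assert(rects != NULL);

    int minx = rects[0].x, miny = rects[0].y;
    int maxx = rects[0].x + rects[0].width;
    int maxy = rects[0].y + rects[0].height;

    /* Grow the bounding box to cover every damage rect. */
    for (unsigned i = 1; i < nrects; i++) {
        if (rects[i].x < minx) minx = rects[i].x;
        if (rects[i].y < miny) miny = rects[i].y;
        if (rects[i].x + rects[i].width > maxx)
            maxx = rects[i].x + rects[i].width;
        if (rects[i].y + rects[i].height > maxy)
            maxy = rects[i].y + rects[i].height;
    }

    d->known = true;
    d->extent = (struct rect){ minx, miny, maxx - minx, maxy - miny };
}

/* Frontend side: the "no rects means damage all" case from the spec is
 * turned into one explicit rect covering the whole surface. */
static void
frontend_partial_update(struct damage_state *d, int surf_w, int surf_h,
                        unsigned nrects, const struct rect *rects)
{
    struct rect full = { 0, 0, surf_w, surf_h };

    if (nrects == 0)
        driver_set_damage_region(d, 1, &full);
    else
        driver_set_damage_region(d, nrects, rects);
}

/* Frontend side: what would be called after each buffer swap. */
static void
frontend_after_swap(struct damage_state *d)
{
    driver_set_damage_region(d, 0, NULL);
}
```

With this split the driver never has to guess whether an empty rect list came from the application or from the post-swap reset, at the cost of the frontend doing the conversion.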
Re: [Mesa-dev] [PATCH v4 4/5] st/dri2: Implement DRI2bufferDamageExtension
On Wed, 03 Jul 2019 11:54:29 +0200 Erik Faye-Lund wrote:

> On Wed, 2019-06-26 at 10:34 -0700, Alyssa Rosenzweig wrote:
> > Ah-ha, now we're into parts of the stack I can claim to understand!
> > >:)
> >
> > Reviewed-by: Alyssa Rosenzweig
> >
> > On Tue, Jun 25, 2019 at 06:37:48PM +0200, Boris Brezillon wrote:
> > > From: Daniel Stone
> > >
> > > Add a pipe_screen->set_damage_region() hook to propagate
> > > set-damage-region requests to the driver, it's then up to the
> > > driver to
> > > decide what to do with this piece of information.
> > >
> > > If the hook is left unassigned, the buffer-damage extension is
> > > considered unsupported.
> > >
> > > Signed-off-by: Daniel Stone
> > > Signed-off-by: Boris Brezillon
> > > ---
> > > src/gallium/include/pipe/p_screen.h | 7 +++
> > > src/gallium/state_trackers/dri/dri2.c | 22 ++
> > > 2 files changed, 29 insertions(+)
> > >
> > > diff --git a/src/gallium/include/pipe/p_screen.h
> > > b/src/gallium/include/pipe/p_screen.h
> > > index 3f9bad470950..8df12ee4f865 100644
> > > --- a/src/gallium/include/pipe/p_screen.h
> > > +++ b/src/gallium/include/pipe/p_screen.h
> > > @@ -464,6 +464,13 @@ struct pipe_screen {
> > > bool (*is_parallel_shader_compilation_finished)(struct
> > > pipe_screen *screen,
> > > void *shader,
> > > unsigned
> > > shader_type);
> > > +
> > > + /**
> > > +* Set damage region.
> > > +*/
> > > + void (*set_damage_region)(struct pipe_screen *screen,
> > > + struct pipe_resource *resource,
> > > + unsigned int nrects, int *rects);
>
> I would kinda have expected rects to be an array of pipe_box instead of
> just an array of integers, as that'd be a bit easier to know the
> semantics of...

Sure, I can do that. Should I do the Y-flip as part of the ints -> box conversion or should I keep the "origin is bottom-left" semantic?
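For reference, the ints -> box conversion with the Y-flip folded in could look like the sketch below. It assumes the EGL-side layout of 4 ints per rect (x, y, width, height) with a bottom-left origin, and a top-left origin on the box side. struct box2d and rects_to_boxes() are hypothetical names used only for this example, not existing gallium helpers.

```c
#include <assert.h>

/* Hypothetical 2D stand-in for gallium's pipe_box, top-left origin. */
struct box2d {
    int x, y, width, height;
};

/*
 * Convert nrects rects, stored as 4 ints each (x, y, width, height) with
 * the origin at the bottom-left corner, into boxes whose origin is the
 * top-left corner of a surface that is surf_height pixels high.
 */
static void
rects_to_boxes(const int *rects, unsigned nrects, int surf_height,
               struct box2d *boxes)
{
    for (unsigned i = 0; i < nrects; i++) {
        const int *r = &rects[i * 4];

        /* Assumption: callers never pass negative sizes. */
        assert(r[2] >= 0 && r[3] >= 0);

        boxes[i].x = r[0];
        /* The top edge of the rect sits at y + height in bottom-left
         * coordinates, so its distance from the top of the surface is
         * surf_height - (y + height). */
        boxes[i].y = surf_height - (r[1] + r[3]);
        boxes[i].width = r[2];
        boxes[i].height = r[3];
    }
}
```

Doing the flip once at conversion time would mean drivers only ever see one coordinate convention, but it also requires the surface height to be known at the point where the conversion happens.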
[Mesa-dev] [RFC PATCH] mesa: Export BOs in RW mode
Exported BOs might be imported back, then mmap()-ed to be written to. Most drivers handle that by mmap()-ing the GEM handle after it's been imported, but, according to [1], this is illegal. The panfrost driver has recently switched to this generic helper (which was renamed into drm_gem_map_offset() in the meantime) [2], and mmap()-ing of imported BOs now fails.

Now I'm wondering how this should be solved. I guess the first question is, is mmap()-ing of imported BOs really forbidden for all BOs? I guess calling mmap() on a buffer that's been exported by the DRM driver itself then re-imported shouldn't hurt, so maybe we can check that before failing.

Now, if we really want to forbid mmap() on imported BOs, that means we need a solution to mmap() the dmabuf object directly, and sometimes this mapping will request RW permissions. The problem is, all functions exporting BOs in mesa are exporting them in RO mode (the resulting FD is O_RDONLY), thus preventing mmap()s in RW mode.

This patch modifies all drmPrimeHandleToFD()/ioctl(DRM_IOCTL_PRIME_HANDLE_TO_FD) call sites to pass the DRM_RDWR flag so that what's described above becomes possible.

I'm not saying this is what we should do, it's more a way to start the discussion. Feel free to propose alternatives to this solution.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/drm_gem.c?h=v5.2-rc7#n318
[2] https://cgit.freedesktop.org/drm/drm-misc/commit/drivers/gpu/drm/panfrost?id=583bbf46133c726bae277e8f4e32bfba2a528c7f

Signed-off-by: Boris Brezillon
Cc:
Cc:
Cc: Steven Price
Cc: Rob Herring
---
Cc-ing dri-devel is not a mistake, I really want to have feedback from the DRM maintainers, since this started with a kernel-side change.
--- src/etnaviv/drm/etnaviv_bo.c | 4 ++-- src/freedreno/drm/freedreno_bo.c | 4 ++-- src/freedreno/vulkan/tu_drm.c | 2 +- src/gallium/auxiliary/renderonly/renderonly.c | 2 +- src/gallium/drivers/iris/iris_bufmgr.c| 2 +- src/gallium/drivers/lima/lima_bo.c| 2 +- src/gallium/drivers/panfrost/pan_drm.c| 2 +- src/gallium/drivers/v3d/v3d_bufmgr.c | 2 +- src/gallium/drivers/vc4/vc4_bufmgr.c | 2 +- src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 3 ++- src/gallium/winsys/svga/drm/vmw_screen_dri.c | 5 +++-- src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 2 +- src/gallium/winsys/virgl/drm/virgl_drm_winsys.c | 3 ++- src/intel/vulkan/anv_gem.c| 2 +- src/mesa/drivers/dri/i965/brw_bufmgr.c| 2 +- 15 files changed, 21 insertions(+), 18 deletions(-) diff --git a/src/etnaviv/drm/etnaviv_bo.c b/src/etnaviv/drm/etnaviv_bo.c index 6e952fa47858..92634141b580 100644 --- a/src/etnaviv/drm/etnaviv_bo.c +++ b/src/etnaviv/drm/etnaviv_bo.c @@ -294,8 +294,8 @@ int etna_bo_dmabuf(struct etna_bo *bo) { int ret, prime_fd; - ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle, DRM_CLOEXEC, - &prime_fd); + ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle, +DRM_CLOEXEC | DRM_RDWR, &prime_fd); if (ret) { ERROR_MSG("failed to get dmabuf fd: %d", ret); return ret; diff --git a/src/freedreno/drm/freedreno_bo.c b/src/freedreno/drm/freedreno_bo.c index 7449160f1371..ba19b08d7c54 100644 --- a/src/freedreno/drm/freedreno_bo.c +++ b/src/freedreno/drm/freedreno_bo.c @@ -318,8 +318,8 @@ int fd_bo_dmabuf(struct fd_bo *bo) { int ret, prime_fd; - ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle, DRM_CLOEXEC, - &prime_fd); + ret = drmPrimeHandleToFD(bo->dev->fd, bo->handle, +DRM_CLOEXEC | DRM_RDWR, &prime_fd); if (ret) { ERROR_MSG("failed to get dmabuf fd: %d", ret); return ret; diff --git a/src/freedreno/vulkan/tu_drm.c b/src/freedreno/vulkan/tu_drm.c index 9b2e6f78879e..6bef3012ddb5 100644 --- a/src/freedreno/vulkan/tu_drm.c +++ b/src/freedreno/vulkan/tu_drm.c @@ -147,7 +147,7 @@ tu_gem_export_dmabuf(const struct 
tu_device *dev, uint32_t gem_handle) { int prime_fd; int ret = drmPrimeHandleToFD(dev->physical_device->local_fd, gem_handle, -DRM_CLOEXEC, &prime_fd); +DRM_CLOEXEC | DRM_RDWR, &prime_fd); return ret == 0 ? prime_fd : -1; } diff --git a/src/gallium/auxiliary/renderonly/renderonly.c b/src/gallium/auxiliary/renderonly/renderonly.c index d6a344009378..c1cc31115105 100644 --- a/src/gallium/auxiliary/renderonly/renderonly.c +++ b/src/gallium/auxiliary/renderonly/renderonly.c @@ -101,7 +101,7 @@ renderonly_create_kms_dumb_buffer_for_resource(struct pipe_resource *rsc, out_handle->type = WINSYS_HANDLE_TYPE_FD;
Re: [Mesa-dev] [RFC PATCH] mesa: Export BOs in RW mode
On Wed, 3 Jul 2019 07:45:32 -0600 Rob Herring wrote:

> On Wed, Jul 3, 2019 at 7:34 AM Boris Brezillon wrote:
> >
> > Exported BOs might be imported back, then mmap()-ed to be written
> > too. Most drivers handle that by mmap()-ing the GEM handle after it's
> > been imported, but, according to [1], this is illegal.
>
> It's not illegal, but is supposed to go thru the dmabuf mmap
> functions.

That's basically what I'm proposing here. I just didn't post the patch that skips the GET_OFFSET step and does the mmap() on the dmabuf FD instead of the DRM-node one, but I have it working for panfrost.

> However, none of the driver I've looked at (etnaviv, msm,
> v3d, vgem) do that. It probably works because it's the same driver
> doing the import and export or both drivers have essentially the same
> implementations.

Yes, but maybe that's something we should start fixing if mmap()-ing the dmabuf is the recommended solution.
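As a side note on why the export flags matter for this approach: a shared, writable mmap() is refused when the FD was not opened for writing. The sketch below shows that rule with a plain temporary file rather than a dmabuf, so it is only an analogy for the RO-vs-RW export problem raised in the RFC. try_shared_rw_map() and open_scratch() are hypothetical helpers written for this example, and the /tmp path is an assumption about the test machine.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to create a shared read/write mapping of `size` bytes on `fd`.
 * Returns 0 on success, or the errno value reported by mmap(). */
static int
try_shared_rw_map(int fd, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    if (p == MAP_FAILED)
        return errno;

    munmap(p, size);
    return 0;
}

/* Create a scratch file of `size` bytes and return an FD opened with the
 * requested access mode (O_RDONLY or O_RDWR), or -1 on failure. The file
 * is unlinked right away so nothing is left behind. */
static int
open_scratch(size_t size, int flags)
{
    char path[] = "/tmp/rw-map-XXXXXX";
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;

    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    /* Re-open with the access mode we want to test. */
    close(fd);
    fd = open(path, flags);
    unlink(path);
    return fd;
}
```

With a read-only FD, try_shared_rw_map() is expected to report EACCES, while the same call on an O_RDWR FD succeeds. For dmabufs the access mode of the FD is chosen at export time, which is why the RFC adds DRM_RDWR at the drmPrimeHandleToFD() call sites.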
Re: [Mesa-dev] [RFC PATCH] mesa: Export BOs in RW mode
On Wed, 3 Jul 2019 15:13:25 +0100 Steven Price wrote: > On 03/07/2019 14:56, Boris Brezillon wrote: > > On Wed, 3 Jul 2019 07:45:32 -0600 > > Rob Herring wrote: > > > >> On Wed, Jul 3, 2019 at 7:34 AM Boris Brezillon > >> wrote: > >>> > >>> Exported BOs might be imported back, then mmap()-ed to be written > >>> too. Most drivers handle that by mmap()-ing the GEM handle after it's > >>> been imported, but, according to [1], this is illegal. > >> > >> It's not illegal, but is supposed to go thru the dmabuf mmap > >> functions. > > > > That's basically what I'm proposing here, just didn't post the patch > > skipping the GET_OFFSET step and doing the mmap() on the dmabuf FD > > instead of the DRM-node one, but I have it working for panfrost. > > If we want to we could make the Panfrost kernel driver internally call > dma_buf_mmap() so that mapping using the DRM-node "just works". This is > indeed what the kbase driver does. Well, userspace should at least skip DRM_IOCTL_PANFROST_MMAP_BO (or ignore its return code), so calling mmap() on the dmabuf FD instead of the DRM-node FD shouldn't be that hard. > > >> However, none of the driver I've looked at (etnaviv, msm, > >> v3d, vgem) do that. It probably works because it's the same driver > >> doing the import and export or both drivers have essentially the same > >> implementations. > > > > Yes, but maybe that's something we should start fixing if mmap()-ing > > the dmabuf is the recommended solution. > > I'm open to options here. User space calling mmap() on the dma_buf file > descriptor should always be safe (the exporter can do whatever is > necessary to make it work). If that happens then the patches I posted > close off the DRM node version which could be broken if the exporter > needs to do anything to prepare the buffer for CPU access (i.e. cache > maintenance). 
Talking about CPU <-> GPU syncs, I was wondering if the mmap(gem_handle) step was providing any guarantee that would allow us to ignore all the cache maintenance operations that are required when mmap()-ing a dmabuf directly. Note that in both cases the dmabuf is imported. > > Alternatively if user space wants/needs to use the DMA node then we can > take a look at what needs to change in the kernel. From a quick look at > the code it seems we'd need to split drm_gem_mmap() into a helper so > that it can return whether the exporter is handling everything or if the > caller needs to do some more work (e.g. drm_gem_shmem_mmap() needs to > allocate backing pages). But because drm_gem_mmap() is used as the > direct callback for some drivers we'd need to preserve the interface. > > The below (completely untested) patch demonstrates the idea. > > Steve > > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c > index a8c4468f03d9..df661e24cadf 100644 > --- a/drivers/gpu/drm/drm_gem.c > +++ b/drivers/gpu/drm/drm_gem.c > @@ -1140,7 +1140,7 @@ EXPORT_SYMBOL(drm_gem_mmap_obj); > * If the caller is not granted access to the buffer object, the mmap > will fail > * with EACCES. Please see the vma manager for more information. 
> */ > -int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma) > +int drm_gem_mmap_helper(struct file *filp, struct vm_area_struct *vma) > { > struct drm_file *priv = filp->private_data; > struct drm_device *dev = priv->minor->dev; > @@ -1189,6 +1189,11 @@ int drm_gem_mmap(struct file *filp, struct > vm_area_struct *vma) > vma->vm_flags &= ~VM_MAYWRITE; > } > > + if (obj->import_attach) { > + ret = dma_buf_mmap(obj->dma_buf, vma, 0); > + return ret?:1; > + } > + > ret = drm_gem_mmap_obj(obj, drm_vma_node_size(node) << PAGE_SHIFT, > vma); > > @@ -1196,6 +1201,16 @@ int drm_gem_mmap(struct file *filp, struct > vm_area_struct *vma) > > return ret; > } > + > +int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma) > +{ > + int ret; > + > + ret = drm_gem_mmap_helper(filp, vma); > + if (ret == 1) > + return 0; > + return ret; > +} > EXPORT_SYMBOL(drm_gem_mmap); > > void drm_gem_print_info(struct drm_printer *p, unsigned int indent, > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c > b/drivers/gpu/drm/drm_gem_shmem_helper.c > index 472ea5d81f82..b85d84e4d4a8 100644 > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c > @@ -466,8
Re: [Mesa-dev] [RFC PATCH] mesa: Export BOs in RW mode
On Wed, 3 Jul 2019 15:50:08 +0100 Steven Price wrote: > On 03/07/2019 15:33, Boris Brezillon wrote: > > On Wed, 3 Jul 2019 15:13:25 +0100 > > Steven Price wrote: > > > >> On 03/07/2019 14:56, Boris Brezillon wrote: > >>> On Wed, 3 Jul 2019 07:45:32 -0600 > >>> Rob Herring wrote: > >>> > >>>> On Wed, Jul 3, 2019 at 7:34 AM Boris Brezillon > >>>> wrote: > >>>>> > >>>>> Exported BOs might be imported back, then mmap()-ed to be written > >>>>> too. Most drivers handle that by mmap()-ing the GEM handle after it's > >>>>> been imported, but, according to [1], this is illegal. > >>>> > >>>> It's not illegal, but is supposed to go thru the dmabuf mmap > >>>> functions. > >>> > >>> That's basically what I'm proposing here, just didn't post the patch > >>> skipping the GET_OFFSET step and doing the mmap() on the dmabuf FD > >>> instead of the DRM-node one, but I have it working for panfrost. > >> > >> If we want to we could make the Panfrost kernel driver internally call > >> dma_buf_mmap() so that mapping using the DRM-node "just works". This is > >> indeed what the kbase driver does. > > > > Well, userspace should at least skip DRM_IOCTL_PANFROST_MMAP_BO (or > > ignore its return code), so calling mmap() on the dmabuf FD instead of > > the DRM-node FD shouldn't be that hard. > > What I was suggesting is that user space would still call > DRM_IOCTL_PANFROST_MMAP_BO to get an offset which uses in a call to > mmap(..., drm_node_fd, offset). The kernel could detect that the buffer > is imported and call the exporter for the actual mmap() functionality. Oops, sorry, brain fart. I thought it was DRM_IOCTL_PANFROST_MMAP_BO that was failing, but it's actually the mmap() call, so providing this wrapper kernel-side should work. > > The alternative is that user space 'simply' remembers that a buffer is > imported and keeps the file descriptor around so that it can instead > directly mmap() the dma_buf fd. 
Which is certainly easiest from the > kernel's perspective (and was what I assumed panfrost was doing - I > should have checked more closely!). > > >>>> However, none of the driver I've looked at (etnaviv, msm, > >>>> v3d, vgem) do that. It probably works because it's the same driver > >>>> doing the import and export or both drivers have essentially the same > >>>> implementations. > >>> > >>> Yes, but maybe that's something we should start fixing if mmap()-ing > >>> the dmabuf is the recommended solution. > >> > >> I'm open to options here. User space calling mmap() on the dma_buf file > >> descriptor should always be safe (the exporter can do whatever is > >> necessary to make it work). If that happens then the patches I posted > >> close off the DRM node version which could be broken if the exporter > >> needs to do anything to prepare the buffer for CPU access (i.e. cache > >> maintenance). > > > > Talking about CPU <-> GPU syncs, I was wondering if the > > mmap(gem_handle) step was providing any guarantee that would > > allow us to ignore all the cache maintenance operations that are > > required when mmap()-ing a dmabuf directly. Note that in both cases the > > dmabuf is imported. > > In theory the exporter should do whatever is required to ensure that the > CPU is synchronised when a user space mapping exists. There are some > issues here though: > > * In theory the kernel driver should map the dma_buf purely for the > duration that a job is using the buffer (and unmap immediately after). > This gives the exporter the knowledge of when the GPU is using the > memory and allows the exporter to page out of the memory if necessary. > In practise this map/unmap operation is expensive (updating the GPU's > page tables) so most drivers don't actually bother and keep the memory > mapped. This means the exporter cannot tell when the buffer is used or > move the pages. 
> > * The CPU mappings can be faulted on demand (performing the necessary > CPU cache invalidate if needed) and shot-down to allow moving the > memory. In theory when the GPU needs the memory it should map the buffer > and the exporter can then shoot down the mappings, perform the CPU cache > clean and then allow the GPU to use the memory. A subsequent CPU access > would then refault the page, ensuring a
Re: [Mesa-dev] [PATCH] panfrost: Take into account off-screen FBOs
On Thu, 4 Jul 2019 10:02:54 +0200 Tomeu Vizoso wrote: > In that case, ctx->pipe_framebuffer.cbufs[0] can be NULL. > > Signed-off-by: Tomeu Vizoso > Cc: Boris Brezillon Reviewed-by: Boris Brezillon > Fixes: 5375d009be18 ("panfrost: Pass referenced BOs to the SUBMIT ioctls") > --- > src/gallium/drivers/panfrost/pan_drm.c | 10 ++ > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/src/gallium/drivers/panfrost/pan_drm.c > b/src/gallium/drivers/panfrost/pan_drm.c > index 8de4f483435c..b89f8e66a877 100644 > --- a/src/gallium/drivers/panfrost/pan_drm.c > +++ b/src/gallium/drivers/panfrost/pan_drm.c > @@ -238,7 +238,6 @@ panfrost_drm_submit_job(struct panfrost_context *ctx, u64 > job_desc, int reqs) > int > panfrost_drm_submit_vs_fs_job(struct panfrost_context *ctx, bool has_draws, > bool is_scanout) > { > -struct pipe_surface *surf = ctx->pipe_framebuffer.cbufs[0]; > int ret; > > struct panfrost_job *job = panfrost_get_job_for_fbo(ctx); > @@ -256,9 +255,12 @@ panfrost_drm_submit_vs_fs_job(struct panfrost_context > *ctx, bool has_draws, bool > } > > if (job->first_tiler.gpu || job->clear) { > -struct panfrost_resource *res = pan_resource(surf->texture); > -assert(res->bo); > -panfrost_job_add_bo(job, res->bo); > +struct pipe_surface *surf = ctx->pipe_framebuffer.cbufs[0]; > +if (surf) { > +struct panfrost_resource *res = > pan_resource(surf->texture); > +assert(res->bo); > +panfrost_job_add_bo(job, res->bo); > +} > ret = panfrost_drm_submit_job(ctx, > panfrost_fragment_job(ctx, has_draws), PANFROST_JD_REQ_FS); > assert(!ret); > } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] panfrost: Cache BO imports
Hi Tomeu, On Thu, 4 Jul 2019 10:04:41 +0200 Tomeu Vizoso wrote: > If two jobs use the same GEM object at the same time, the job that > finishes first will (previous to this commit) close the GEM object, even > if there's a job still referencing it. > > To prevent this, have all jobs use the same panfrost_bo for a given GEM > object, so it's only closed once the last job is done with it. > > Signed-off-by: Tomeu Vizoso > --- > src/gallium/drivers/panfrost/pan_allocate.h | 2 +- > src/gallium/drivers/panfrost/pan_drm.c | 46 +++-- > src/gallium/drivers/panfrost/pan_resource.c | 20 - > src/gallium/drivers/panfrost/pan_screen.h | 6 +++ > 4 files changed, 68 insertions(+), 6 deletions(-) > > diff --git a/src/gallium/drivers/panfrost/pan_allocate.h > b/src/gallium/drivers/panfrost/pan_allocate.h > index 20ba204dee86..2dfa913b8c4d 100644 > --- a/src/gallium/drivers/panfrost/pan_allocate.h > +++ b/src/gallium/drivers/panfrost/pan_allocate.h > @@ -59,7 +59,7 @@ struct panfrost_transfer { > }; > > struct panfrost_bo { > -struct pipe_reference reference; > +int refcnt; > > /* Mapping for the entire object (all levels) */ > uint8_t *cpu; > diff --git a/src/gallium/drivers/panfrost/pan_drm.c > b/src/gallium/drivers/panfrost/pan_drm.c > index b89f8e66a877..9648ac1d452d 100644 > --- a/src/gallium/drivers/panfrost/pan_drm.c > +++ b/src/gallium/drivers/panfrost/pan_drm.c > @@ -103,7 +103,12 @@ panfrost_drm_create_bo(struct panfrost_screen *screen, > size_t size, > // TODO map and unmap on demand? 
> panfrost_drm_mmap_bo(screen, bo); > > -pipe_reference_init(&bo->reference, 1); > +p_atomic_set(&bo->refcnt, 1); > + > +pthread_mutex_lock(&screen->handle_table_lock); > +_mesa_hash_table_insert(screen->handle_table, &bo->gem_handle, bo); > +pthread_mutex_unlock(&screen->handle_table_lock); > + > return bo; > } > > @@ -116,6 +121,9 @@ panfrost_drm_release_bo(struct panfrost_screen *screen, > struct panfrost_bo *bo) > if (!bo) > return; > > +pthread_mutex_lock(&screen->handle_table_lock); > +_mesa_hash_table_remove_key(screen->handle_table, &bo->gem_handle); > + > panfrost_drm_munmap_bo(screen, bo); > > ret = drmIoctl(screen->fd, DRM_IOCTL_GEM_CLOSE, &gem_close); > @@ -125,6 +133,8 @@ panfrost_drm_release_bo(struct panfrost_screen *screen, > struct panfrost_bo *bo) > } > > ralloc_free(bo); > + > +pthread_mutex_unlock(&screen->handle_table_lock); > } > > void > @@ -150,17 +160,41 @@ panfrost_drm_free_slab(struct panfrost_screen *screen, > struct panfrost_memory *m > mem->bo = NULL; > } > > +/* lookup a buffer, call w/ table_lock held: */ > +static struct panfrost_bo *lookup_bo(struct hash_table *tbl, uint32_t key) I tend to add _locked() suffixes to functions that are supposed to be called with a lock held, just for people who don't read comments (like me :)). > +{ > + struct panfrost_bo *bo = NULL; > + struct hash_entry *entry = _mesa_hash_table_search(tbl, &key); > + if (entry) { > + /* found, incr refcnt and return: */ > + bo = entry->data; You need: /* BO is about to freed, don't return it. */ if (!p_atomic_read(&bo->refcnt)) return NULL; here (see below for a detailed explanation about the race). 
> + panfrost_bo_reference(bo); > + } > + return bo; > +} > + > struct panfrost_bo * > panfrost_drm_import_bo(struct panfrost_screen *screen, int fd) > { > - struct panfrost_bo *bo = rzalloc(screen, struct panfrost_bo); > + struct panfrost_bo *bo = NULL; > struct drm_panfrost_get_bo_offset get_bo_offset = {0,}; > int ret; > unsigned gem_handle; > > +pthread_mutex_lock(&screen->handle_table_lock); > + Unrelated/nit: we should really agree on an indentation model (tab vs spaces). I keep trying to adapt to the context surrounding my changes (using tabs when tabs are used nearby, spaces otherwise), but now we're starting to have a mix of tabs and spaces inside the same functions. > ret = drmPrimeFDToHandle(screen->fd, fd, &gem_handle); > assert(!ret); > > +if (ret) > +goto out_unlock; > + Can't we take the lock here instead? > +bo = lookup_bo(screen->handle_table, gem_handle); > +if (bo) > +goto out_unlock; > + > +bo = rzalloc(screen, struct panfrost_bo); > + > get_bo_offset.handle = gem_handle; > ret = drmIoctl(screen->fd, DRM_IOCTL_PANFROST_GET_BO_OFFSET, > &get_bo_offset); > assert(!ret); > @@ -169,10 +203,16 @@ panfrost_drm_import_bo(struct panfrost_screen *screen, > int fd) > bo->gpu = (mali_ptr) get_bo_offset.offset; > bo->size = lseek(fd, 0, SEEK_END); > assert(bo->size >
Re: [Mesa-dev] [PATCH 2/3] panfrost: Allocate scanout BOs in panfrost device
On Thu, 4 Jul 2019 10:04:42 +0200 Tomeu Vizoso wrote: > @@ -382,11 +362,14 @@ panfrost_resource_create_bo(struct panfrost_screen > *screen, struct panfrost_reso > > /* Tiling textures is almost always faster, unless we only use it > once */ > > +#define SCANOUT (PIPE_BIND_SCANOUT | PIPE_BIND_SHARED | > PIPE_BIND_DISPLAY_TARGET) > + > +bool is_scanout = (res->bind & SCANOUT); > bool is_texture = (res->bind & PIPE_BIND_SAMPLER_VIEW); > bool is_2d = res->depth0 == 1 && res->array_size == 1; > -bool is_streaming = (res->usage != PIPE_USAGE_STREAM); > +bool is_streaming = (res->usage == PIPE_USAGE_STREAM); > > -bool should_tile = is_streaming && is_texture && is_2d; > +bool should_tile = !is_streaming && is_texture && is_2d && > !is_scanout; > > /* Depth/stencil can't be tiled, only linear or AFBC */ > should_tile &= !(res->bind & PIPE_BIND_DEPTH_STENCIL); > @@ -425,10 +408,6 @@ panfrost_resource_create(struct pipe_screen *screen, > assert(0); > } > > -if (template->bind & > -(PIPE_BIND_DISPLAY_TARGET | PIPE_BIND_SCANOUT | > PIPE_BIND_SHARED)) > -return panfrost_create_scanout_res(screen, template); > - > struct panfrost_resource *so = rzalloc(screen, struct > panfrost_resource); > struct panfrost_screen *pscreen = (struct panfrost_screen *) screen; > > @@ -440,6 +419,20 @@ panfrost_resource_create(struct pipe_screen *screen, > util_range_init(&so->valid_buffer_range); > > panfrost_resource_create_bo(pscreen, so); > + > +/* Set up the "scanout resource" (the dmabuf export of our buffer to > + * the KMS handle) if the buffer might ever have > + * resource_get_handle(WINSYS_HANDLE_TYPE_KMS) called on it. > + */ > +if (template->bind & PIPE_BIND_SCANOUT) { That's probably intentional but I thought I'd mention it just to be sure this is what you intend to do: the scanout obj is now only created when PIPE_BIND_SCANOUT is set while it was previously created for PIPE_BIND_SHARED and PIPE_BIND_DISPLAY_TARGET too. 
> +so->scanout = > +renderonly_scanout_for_resource(&so->base, > pscreen->ro, NULL); > +if (!so->scanout) { > +panfrost_resource_destroy(screen, &so->base); > +return NULL; > +} > +} > + > return (struct pipe_resource *)so; > } > > diff --git a/src/gallium/winsys/kmsro/drm/kmsro_drm_winsys.c > b/src/gallium/winsys/kmsro/drm/kmsro_drm_winsys.c > index bf599a1497c9..5b316a2b1f37 100644 > --- a/src/gallium/winsys/kmsro/drm/kmsro_drm_winsys.c > +++ b/src/gallium/winsys/kmsro/drm/kmsro_drm_winsys.c > @@ -90,7 +90,7 @@ struct pipe_screen *kmsro_drm_screen_create(int fd, > ro.gpu_fd = drmOpenWithType("panfrost", NULL, DRM_NODE_RENDER); > > if (ro.gpu_fd >= 0) { > - ro.create_for_resource = > renderonly_create_kms_dumb_buffer_for_resource, > + ro.create_for_resource = renderonly_create_gpu_import_for_resource, >screen = panfrost_drm_screen_create_renderonly(&ro); >if (!screen) > close(ro.gpu_fd); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
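For readers following the tiling change in the quoted hunk, the decision can be sketched as a small standalone predicate. This is only an illustration under stated assumptions: the `BIND_*` flags, `struct resource_desc` and `should_tile()` are hypothetical stand-ins for the `PIPE_BIND_*` bits and the panfrost resource, not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PIPE_BIND_* flags; values are arbitrary. */
enum {
    BIND_SAMPLER_VIEW   = 1 << 0,
    BIND_SCANOUT        = 1 << 1,
    BIND_SHARED         = 1 << 2,
    BIND_DISPLAY_TARGET = 1 << 3,
    BIND_DEPTH_STENCIL  = 1 << 4,
};

/* Same grouping as the SCANOUT macro in the quoted patch. */
#define SCANOUT_MASK (BIND_SCANOUT | BIND_SHARED | BIND_DISPLAY_TARGET)

/* Hypothetical, trimmed-down description of a resource. */
struct resource_desc {
    unsigned bind;       /* OR of BIND_* flags */
    bool streaming;      /* true when the usage hint is "stream" */
    unsigned depth0;
    unsigned array_size;
};

/* Tile only non-streaming 2D textures that can never be scanned out. */
static bool
should_tile(const struct resource_desc *res)
{
    assert(res != NULL);

    bool is_scanout = (res->bind & SCANOUT_MASK) != 0;
    bool is_texture = (res->bind & BIND_SAMPLER_VIEW) != 0;
    bool is_2d = res->depth0 == 1 && res->array_size == 1;

    bool tile = !res->streaming && is_texture && is_2d && !is_scanout;

    /* Depth/stencil is excluded, as in the quoted hunk. */
    return tile && !(res->bind & BIND_DEPTH_STENCIL);
}
```

Note that the sketch, like the hunk, treats SHARED and DISPLAY_TARGET as "scanout-like" for the tiling decision, even though the scanout object itself is only created when the SCANOUT bit is set.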
[Mesa-dev] Getting write permissions on the mesa repo to push panfrost stuff
Hello, Alyssa recently proposed that I push my own panfrost-related submissions once they have received proper review and are considered ready to be merged (meaning they received enough A-b/R-b tags). In order to do that, I'd need to obtain write permissions on the git repo. Note that I already have an account on fd.o (my nick is bbrezillon). I guess Alyssa and/or Tomeu will ack this request soon. Let me know if you need anything else from me. Regards, Boris
Re: [Mesa-dev] Getting write permissions on the mesa repo to push panfrost stuff
On Thu, 04 Jul 2019 20:49:32 -0700 Kenneth Graunke wrote: > On Thursday, July 4, 2019 5:23:44 AM PDT Boris Brezillon wrote: > > Hello, > > > > Alyssa recently proposed that I push my own panfrost-related > > submissions once they have received proper review and are considered > > ready to be merged (meaning they received enough A-b/R-b tags). > > > > In order to do that, I'd need to obtain write permissions on the git > > repo. Note that I already have an account on fd.o (my nick is > > bbrezillon). I guess Alyssa and/or Tomeu will ack this request soon. Let > > me know if you need anything else from me. > > > > Regards, > > > > Boris > > Hi there! > > I just added @bbrezillon as a "Developer" on gitlab.fdo/mesa, which > should grant you commit access! > > I noticed that there are two similar looking Gitlab accounts: > > @bbrezillon - Boris Brezillon > @bbrezillion - Boris Brezillon > > The former seems to be active, and the latter not so much. Is that also > yours - just a typo? Yes, the latter (the one with a typo) was created back when I started contributing to igt and drm-misc, but I don't remember the password, and I no longer have access to my @free-electrons (or @bootlin, I don't remember) address, which means I can't reset the password. > Should it be deleted? Note that I still contribute to drm-misc and might have to push things to igt at some point (I don't need a password for that, I push through ssh). If you delete the account, can you make sure the new one also has push rights to those repos (those 2 are not on gitlab.fd.o IIRC)?
Re: [Mesa-dev] Getting write permissions on the mesa repo to push panfrost stuff
On Fri, 05 Jul 2019 13:43:20 -0700 Kenneth Graunke wrote: > On Thursday, July 4, 2019 10:42:49 PM PDT Boris Brezillon wrote: > > On Thu, 04 Jul 2019 20:49:32 -0700 > > Kenneth Graunke wrote: > > > > > On Thursday, July 4, 2019 5:23:44 AM PDT Boris Brezillon wrote: > > > > Hello, > > > > > > > > Alyssa recently proposed that I push my own panfrost-related > > > > submissions once they received proper review and are considered > > > > ready to merged (meaning that received enough A-b/R-b tags). > > > > > > > > In order to do that, I'd need to obtain write permissions on the git > > > > repo. Note that I already have an account on fd.o (my nick is > > > > bbrezillon). I guess Alyssa and/or Tomew will ack this request soon. Let > > > > me know if you need anything else from me. > > > > > > > > Regards, > > > > > > > > Boris > > > > > > Hi there! > > > > > > I just added @bbrezillon as a "Developer" on gitlab.fdo/mesa, which > > > should grant you commit access! > > > > > > I noticed that there are two similar looking Gitlab accounts: > > > > > > @bbrezillon - Boris Brezillon > > > @bbrezillion - Boris Brezillon > > > > > > The former seems to be active, and the latter not so much. Is that also > > > yours - just a typo? > > > > Yes, the latter (the one with a typo) has been created back when I > > started contributing to igt and drm-misc, but I don't remember the > > password, and I no longer have access to my @free-electrons (or > > @bootlin, I dont rememenber) address, which means I can't reset the > > password. > > > > > Should it be deleted? > > > > Note that I still contribute to drm-misc and might have to push things > > to igt at some point (I don't need a password for that, I push > > through ssh). If you delete the account, can you make sure the new one > > also has push rights to those repos (those 2 are not on gitlab.fd.o > > IIRC)? 
> > igt is on gitlab now: https://gitlab.freedesktop.org/drm/igt-gpu-tools > > But I don't have the ability to grant you access to that, you'd need one > of the igt group leaders to do it. Okay, I'll do that when I need it. Thanks, Boris ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v5 4/5] st/dri2: Implement DRI2bufferDamageExtension
Hello Marek, On Tue, 2 Jul 2019 20:09:23 +0200 Boris Brezillon wrote: > On Tue, 2 Jul 2019 13:21:31 -0400 > Marek Olšák wrote: > > > On Tue., Jul. 2, 2019, 09:50 Boris Brezillon, > > wrote: > > > > > From: Daniel Stone > > > > > > Add a pipe_screen->set_damage_region() hook to propagate > > > set-damage-region requests to the driver, it's then up to the > > > driver to decide what to do with this piece of information. > > > > > > If the hook is left unassigned, the buffer-damage extension is > > > considered unsupported. > > > > > > Signed-off-by: Daniel Stone > > > Signed-off-by: Boris Brezillon > > > Reviewed-by: Alyssa Rosenzweig > > > --- > > > Changes in v5: > > > * Add Alyssa's R-b > > > --- > > > src/gallium/include/pipe/p_screen.h | 7 +++ > > > src/gallium/state_trackers/dri/dri2.c | 22 ++ > > > 2 files changed, 29 insertions(+) > > > > > > diff --git a/src/gallium/include/pipe/p_screen.h > > > b/src/gallium/include/pipe/p_screen.h > > > index 3f9bad470950..8df12ee4f865 100644 > > > --- a/src/gallium/include/pipe/p_screen.h > > > +++ b/src/gallium/include/pipe/p_screen.h > > > @@ -464,6 +464,13 @@ struct pipe_screen { > > > bool (*is_parallel_shader_compilation_finished)(struct > > > pipe_screen *screen, > > > void *shader, > > > unsigned > > > shader_type); + > > > + /** > > > +* Set damage region. > > > > > > > Can you expand the comment to describe rects? The format of rects is > > not obvious. > > Oops, will point to the KHR_partial_update() doc and explain what rects > encode and how. > This reminds me that we have a corner case (at least for tile-based > GPUs): the dri implementation calls > ->set_damage_region(screen, res, 0, NULL) to reset the damage region, > but in KHR_partial_update() spec this means "damage all". If we follow > the spec that would imply existing FB content is dropped which in turn > means users relying on buffer_age() (without partial_update()) to only > update the region that have changed will stop working properly. 
>
> I see 2 options to solve this problem:
>
> 1/ add a new ->reset_damage_region() hook that would be called by the
>    dri implementation after each swap_buf() in replacement of the
>    current ->set_damage_region(screen, res, 0, NULL). Reset in that
>    case means we consider the damage region as "unknown" and force
>    a "reload FB content in the local-tile buffer" for the whole
>    resource instead of restricting it to the !damage region.
> 2/ deviate from the KHR_partial_update() semantic and reserve
>    ->set_damage_region(screen, res, 0, NULL) for the "reset damage
>    region" op. That means we'll have to convert actual
>    KHR_partial_update(0, NULL) calls into
>    ->set_damage_region(screen, res, 1, full_res_rect) ones to reflect
>    the behavior described in the spec.

Any advice on how to solve this problem?

Regards,

Boris
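To make option 2/ above more concrete, here is a minimal standalone sketch of how a frontend could keep the two meanings of "zero rects" apart. Everything in it is hypothetical (`struct damage_rect`, `struct damage_state`, `driver_set_damage_region()`, `frontend_partial_update()`, `frontend_reset_damage()`); it only models the call pattern and is not the dri or gallium code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rectangle: origin plus size, in pixels. */
struct damage_rect {
    int x, y, width, height;
};

/* Hypothetical driver-side record of the last request. */
struct damage_state {
    unsigned nrects;          /* 0 means "reset / damage unknown" */
    struct damage_rect first; /* only meaningful when nrects > 0 */
};

/* Stand-in for the driver hook: under option 2/, (0, NULL) is
 * reserved for "reset the damage region". */
static void
driver_set_damage_region(struct damage_state *st, unsigned nrects,
                         const struct damage_rect *rects)
{
    st->nrects = nrects;
    if (nrects)
        st->first = rects[0];
}

/* Path taken when the application sets a damage region. A request with
 * zero rects means "damage everything" at the API level, so it is
 * converted into one rect covering the whole resource. */
static void
frontend_partial_update(struct damage_state *st, int res_width,
                        int res_height, unsigned nrects,
                        const struct damage_rect *rects)
{
    if (nrects == 0) {
        struct damage_rect full = { 0, 0, res_width, res_height };
        driver_set_damage_region(st, 1, &full);
        return;
    }

    assert(rects != NULL);
    driver_set_damage_region(st, nrects, rects);
}

/* Path taken after a buffer swap: the only caller allowed to pass
 * (0, NULL) down to the driver. */
static void
frontend_reset_damage(struct damage_state *st)
{
    driver_set_damage_region(st, 0, NULL);
}
```

With this split the driver never has to guess whether an empty list came from the application or from the post-swap reset, which is the ambiguity described in the mail.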
[Mesa-dev] [PATCH v6 3/5] egl/dri: Use __DRI2_BUFFER_DAMAGE extension for KHR_partial_update
From: Harish Krupo Use the DRI2 interface callback to pass the damage rects to the driver. Signed-off-by: Harish Krupo Signed-off-by: Boris Brezillon Acked-by: Alyssa Rosenzweig Reviewed-by: Qiang Yu Tested-by: Qiang Yu --- Changes in v6: * None Changes in v5: * Add Alyssa's a-b * Update Arish email address * s/__DRI2_DAMAGE/__DRI2_BUFFER_DAMAGE/ --- src/egl/drivers/dri2/egl_dri2.c | 55 ++--- src/egl/drivers/dri2/egl_dri2.h | 1 + 2 files changed, 51 insertions(+), 5 deletions(-) diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index 3c33b2cf27f8..fcafcfd73c63 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -452,6 +452,7 @@ static const struct dri2_extension_match optional_core_extensions[] = { { __DRI2_NO_ERROR, 1, offsetof(struct dri2_egl_display, no_error) }, { __DRI2_CONFIG_QUERY, 1, offsetof(struct dri2_egl_display, config) }, { __DRI2_FENCE, 1, offsetof(struct dri2_egl_display, fence) }, + { __DRI2_BUFFER_DAMAGE, 1, offsetof(struct dri2_egl_display, buffer_damage) }, { __DRI2_RENDERER_QUERY, 1, offsetof(struct dri2_egl_display, rendererQuery) }, { __DRI2_INTEROP, 1, offsetof(struct dri2_egl_display, interop) }, { __DRI_IMAGE, 1, offsetof(struct dri2_egl_display, image) }, @@ -721,6 +722,9 @@ dri2_setup_screen(_EGLDisplay *disp) if (dri2_dpy->flush_control) disp->Extensions.KHR_context_flush_control = EGL_TRUE; + + if (dri2_dpy->buffer_damage && dri2_dpy->buffer_damage->set_damage_region) + disp->Extensions.KHR_partial_update = EGL_TRUE; } void @@ -1658,11 +1662,22 @@ static EGLBoolean dri2_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf) { struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); + __DRIdrawable *dri_drawable = dri2_dpy->vtbl->get_dri_drawable(surf); _EGLContext *ctx = _eglGetCurrentContext(); + EGLBoolean ret; if (ctx && surf) dri2_surf_update_fence_fd(ctx, disp, surf); - return dri2_dpy->vtbl->swap_buffers(drv, disp, surf); + ret = 
dri2_dpy->vtbl->swap_buffers(drv, disp, surf); + + /* SwapBuffers marks the end of the frame; reset the damage region for +* use again next time. +*/ + if (ret && dri2_dpy->buffer_damage && + dri2_dpy->buffer_damage->set_damage_region) + dri2_dpy->buffer_damage->set_damage_region(dri_drawable, 0, NULL); + + return ret; } static EGLBoolean @@ -1671,12 +1686,23 @@ dri2_swap_buffers_with_damage(_EGLDriver *drv, _EGLDisplay *disp, const EGLint *rects, EGLint n_rects) { struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); + __DRIdrawable *dri_drawable = dri2_dpy->vtbl->get_dri_drawable(surf); _EGLContext *ctx = _eglGetCurrentContext(); + EGLBoolean ret; if (ctx && surf) dri2_surf_update_fence_fd(ctx, disp, surf); - return dri2_dpy->vtbl->swap_buffers_with_damage(drv, disp, surf, - rects, n_rects); + ret = dri2_dpy->vtbl->swap_buffers_with_damage(drv, disp, surf, + rects, n_rects); + + /* SwapBuffers marks the end of the frame; reset the damage region for +* use again next time. +*/ + if (ret && dri2_dpy->buffer_damage && + dri2_dpy->buffer_damage->set_damage_region) + dri2_dpy->buffer_damage->set_damage_region(dri_drawable, 0, NULL); + + return ret; } static EGLBoolean @@ -1684,14 +1710,33 @@ dri2_swap_buffers_region(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf, EGLint numRects, const EGLint *rects) { struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); - return dri2_dpy->vtbl->swap_buffers_region(drv, disp, surf, numRects, rects); + __DRIdrawable *dri_drawable = dri2_dpy->vtbl->get_dri_drawable(surf); + EGLBoolean ret; + + ret = dri2_dpy->vtbl->swap_buffers_region(drv, disp, surf, numRects, rects); + + /* SwapBuffers marks the end of the frame; reset the damage region for +* use again next time. 
+*/ + if (ret && dri2_dpy->buffer_damage && + dri2_dpy->buffer_damage->set_damage_region) + dri2_dpy->buffer_damage->set_damage_region(dri_drawable, 0, NULL); + + return ret; } static EGLBoolean dri2_set_damage_region(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf, EGLint *rects, EGLint n_rects) { - return false; + struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); + __DRIdrawable *drawable = dri2_dpy->vtbl->get_dri_drawable(surf); + + if (!dri2_dpy->buffer_damage || !dri2_dpy->buffer_damage->set_damage_region) + return EGL_FALSE; + + dri2_dpy->buffer_damage->set_damage_region(dr
[Mesa-dev] [PATCH v6 0/5] EGL_KHR_partial_update support
This is an attempt at resurrecting Daniel's MR [1], which was itself
resurrecting Harish's EGL_KHR_partial_update series [2].

This version implements Marek's suggestion to pass the
set_damage_region() request directly to the gallium driver and let it
decide how to handle it. Some drivers might just calculate the damage
extent (as done in Daniel's initial proposal and in the panfrost
implementation), others might do extra optimizations like trying to
reduce the area we're supposed to reload (only valid for tile-based
rendering) even further.

This patch series has been tested with weston on panfrost. Note that
the panfrost implementation is rather simple (it just limits the
rendering area to the damage extent and picks the biggest damage rect
as the only damage region) but we can improve it if we feel the need.

No big changes in this v6, I just addressed Erik's and Marek's concerns
regarding the doc and the prototype of the gallium
->set_damage_region() hook.

Regards,

Boris

[1] https://gitlab.freedesktop.org/mesa/mesa/merge_requests/227
[2] https://patchwork.freedesktop.org/series/45915/#rev2

Boris Brezillon (1):
  panfrost: Add support for KHR_partial_update()

Daniel Stone (2):
  dri_interface: add DRI2_BufferDamage interface
  st/dri2: Implement DRI2bufferDamageExtension

Harish Krupo (2):
  egl/android: Delete set_damage_region from egl dri vtbl
  egl/dri: Use __DRI2_BUFFER_DAMAGE extension for KHR_partial_update

 include/GL/internal/dri_interface.h         | 43 ++
 src/egl/drivers/dri2/egl_dri2.c             | 54 ++--
 src/egl/drivers/dri2/egl_dri2.h             |  5 +-
 src/egl/drivers/dri2/egl_dri2_fallbacks.h   |  9 --
 src/egl/drivers/dri2/platform_android.c     | 45 --
 src/egl/drivers/dri2/platform_device.c      |  1 -
 src/egl/drivers/dri2/platform_drm.c         |  1 -
 src/egl/drivers/dri2/platform_surfaceless.c |  1 -
 src/egl/drivers/dri2/platform_wayland.c     |  1 -
 src/egl/drivers/dri2/platform_x11.c         |  2 -
 src/egl/drivers/dri2/platform_x11_dri3.c    |  1 -
 src/gallium/drivers/panfrost/pan_blit.c     | 10 +--
 src/gallium/drivers/panfrost/pan_context.c  | 63 +-
 src/gallium/drivers/panfrost/pan_job.c      | 11 +++
 src/gallium/drivers/panfrost/pan_job.h      |  5 ++
 src/gallium/drivers/panfrost/pan_resource.c | 91 +
 src/gallium/drivers/panfrost/pan_resource.h | 13 ++-
 src/gallium/drivers/panfrost/pan_screen.c   |  1 +
 src/gallium/include/pipe/p_screen.h         | 17
 src/gallium/state_trackers/dri/dri2.c       | 34
 20 files changed, 331 insertions(+), 77 deletions(-)

-- 
2.21.0
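The simple strategy described in the cover letter (limit rendering to the damage extent, keep only the biggest damage rect) can be sketched in a few lines. This is a standalone illustration under stated assumptions, not the panfrost implementation: `struct rect`, `struct extent` and `compute_damage()` are hypothetical names, and "biggest" is taken to mean largest area.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rectangle: origin plus size, in pixels. */
struct rect {
    int x, y, width, height;
};

/* Hypothetical bounding box: [minx, maxx) x [miny, maxy). */
struct extent {
    int minx, miny, maxx, maxy;
};

/* Compute the bounding box of all damage rects and remember the rect
 * with the largest area. Requires at least one rect. */
static void
compute_damage(const struct rect *rects, unsigned nrects,
               struct extent *ext, struct rect *biggest)
{
    assert(rects != NULL && nrects > 0);
    assert(ext != NULL && biggest != NULL);

    *biggest = rects[0];
    ext->minx = rects[0].x;
    ext->miny = rects[0].y;
    ext->maxx = rects[0].x + rects[0].width;
    ext->maxy = rects[0].y + rects[0].height;

    for (unsigned i = 1; i < nrects; i++) {
        const struct rect *r = &rects[i];
        long long area = (long long)r->width * r->height;
        long long best = (long long)biggest->width * biggest->height;

        /* Grow the extent so it covers this rect too. */
        if (r->x < ext->minx)
            ext->minx = r->x;
        if (r->y < ext->miny)
            ext->miny = r->y;
        if (r->x + r->width > ext->maxx)
            ext->maxx = r->x + r->width;
        if (r->y + r->height > ext->maxy)
            ext->maxy = r->y + r->height;

        /* Keep the largest rect as the single damage region. */
        if (area > best)
            *biggest = *r;
    }
}
```

The extent bounds where rendering may happen at all, while the single biggest rect decides which part of the existing content does not need to be reloaded; everything inside the extent but outside that rect is reloaded.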
[Mesa-dev] [PATCH v6 5/5] panfrost: Add support for KHR_partial_update()
Implement ->set_damage_region() to support partial updates. This is a dummy implementation in that it does not try to merge damage rects. It also does not deal with distinct regions: instead, it picks the largest quad as the only damage rect and generates up to 4 reload rects out of it (the left/right/top/bottom regions surrounding the biggest damage rect). We also do not try to reduce the number of draws by passing all quad vertices to the blit request (that would require extending u_blitter). Signed-off-by: Boris Brezillon Reviewed-by: Alyssa Rosenzweig --- Changes in v6: * Add Alyssa's R-b * Adapt the code to the ->set_damage_region() prototype change Changes in v5: * rename the second panfrost_blit_wallpaper() argument * add extra comment to explain how the set_damage_region() logic works * clarify why checking for negative box->{width,height} is not needed in panfrost_draw_wallpaper() --- src/gallium/drivers/panfrost/pan_blit.c | 10 +-- src/gallium/drivers/panfrost/pan_context.c | 63 +- src/gallium/drivers/panfrost/pan_job.c | 11 +++ src/gallium/drivers/panfrost/pan_job.h | 5 ++ src/gallium/drivers/panfrost/pan_resource.c | 91 + src/gallium/drivers/panfrost/pan_resource.h | 13 ++- src/gallium/drivers/panfrost/pan_screen.c | 1 + 7 files changed, 187 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_blit.c b/src/gallium/drivers/panfrost/pan_blit.c index 35c8507eb8d8..1fa9eef572ea 100644 --- a/src/gallium/drivers/panfrost/pan_blit.c +++ b/src/gallium/drivers/panfrost/pan_blit.c @@ -103,7 +103,7 @@ panfrost_blit(struct pipe_context *pipe, */ void -panfrost_blit_wallpaper(struct panfrost_context *ctx) +panfrost_blit_wallpaper(struct panfrost_context *ctx, struct pipe_box *box) { struct pipe_blit_info binfo = { }; @@ -116,11 +116,11 @@ panfrost_blit_wallpaper(struct panfrost_context *ctx) binfo.src.resource = binfo.dst.resource = ctx->pipe_framebuffer.cbufs[0]->texture; binfo.src.level = binfo.dst.level = level; -binfo.src.box.x = 
binfo.dst.box.x = 0; -binfo.src.box.y = binfo.dst.box.y = 0; +binfo.src.box.x = binfo.dst.box.x = box->x; +binfo.src.box.y = binfo.dst.box.y = box->y; binfo.src.box.z = binfo.dst.box.z = layer; -binfo.src.box.width = binfo.dst.box.width = ctx->pipe_framebuffer.width; -binfo.src.box.height = binfo.dst.box.height = ctx->pipe_framebuffer.height; +binfo.src.box.width = binfo.dst.box.width = box->width; +binfo.src.box.height = binfo.dst.box.height = box->height; binfo.src.box.depth = binfo.dst.box.depth = 1; binfo.src.format = binfo.dst.format = ctx->pipe_framebuffer.cbufs[0]->format; diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index 4bbf5230c6cf..4f3242163501 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -1477,7 +1477,68 @@ panfrost_draw_wallpaper(struct pipe_context *pipe) struct panfrost_job *batch = panfrost_get_job_for_fbo(ctx); ctx->wallpaper_batch = batch; -panfrost_blit_wallpaper(ctx); + +/* Clamp the rendering area to the damage extent. The + * KHR_partial_update() spec states that trying to render outside of + * the damage region is "undefined behavior", so we should be safe. + */ +panfrost_job_intersection_scissor(batch, rsrc->damage.extent.minx, + rsrc->damage.extent.miny, + rsrc->damage.extent.maxx, + rsrc->damage.extent.maxy); + +struct pipe_scissor_state damage; +struct pipe_box rects[4]; + +/* Clamp the damage box to the rendering area. 
*/ +damage.minx = MAX2(batch->minx, rsrc->damage.biggest_rect.x); +damage.miny = MAX2(batch->miny, rsrc->damage.biggest_rect.y); +damage.maxx = MIN2(batch->maxx, + rsrc->damage.biggest_rect.x + + rsrc->damage.biggest_rect.width); +damage.maxy = MIN2(batch->maxy, + rsrc->damage.biggest_rect.y + + rsrc->damage.biggest_rect.height); + +/* One damage rectangle means we can end up with at most 4 reload + * regions: + * 1: left region, only exists if damage.x > 0 + * 2: right region, only exists if damage.x + damage.width < fb->width + * 3: top region, only exists if damage.y > 0. The intersection with + *the left and right regions are dropped + * 4: bottom region, only exists if dama
[Mesa-dev] [PATCH v6 4/5] st/dri2: Implement DRI2bufferDamageExtension
From: Daniel Stone Add a pipe_screen->set_damage_region() hook to propagate set-damage-region requests to the driver, it's then up to the driver to decide what to do with this piece of information. If the hook is left unassigned, the buffer-damage extension is considered unsupported. Signed-off-by: Daniel Stone Signed-off-by: Boris Brezillon Reviewed-by: Alyssa Rosenzweig --- Hello Qiang, I intentionally dropped your R-b/T-b on this patch since the ->set_damage_region() prototype has changed. Feel free to add it back. Regards, Boris Changes in v6: * Pass pipe_box objects instead ints * Document the set_damage_region() hook Changes in v5: * Add Alyssa's R-b --- src/gallium/include/pipe/p_screen.h | 17 ++ src/gallium/state_trackers/dri/dri2.c | 34 +++ 2 files changed, 51 insertions(+) diff --git a/src/gallium/include/pipe/p_screen.h b/src/gallium/include/pipe/p_screen.h index 3f9bad470950..11a6aa939124 100644 --- a/src/gallium/include/pipe/p_screen.h +++ b/src/gallium/include/pipe/p_screen.h @@ -464,6 +464,23 @@ struct pipe_screen { bool (*is_parallel_shader_compilation_finished)(struct pipe_screen *screen, void *shader, unsigned shader_type); + + /** +* Set the damage region (called when KHR_partial_update() is invoked). +* This function is passed an array of rectangles encoding the damage area. +* rects are using the bottom-left origin convention. +* nrects = 0 means 'reset the damage region'. What 'reset' implies is HW +* specific. For tile-based renderers, the damage extent is typically set +* to cover the whole resource with no damage rect (or a 0-size damage +* rect). This way, the existing resource content is reloaded into the +* local tile buffer for every tile thus making partial tile update +* possible. For HW operating in immediate mode, this reset operation is +* likely to be a NOOP. 
+*/ + void (*set_damage_region)(struct pipe_screen *screen, + struct pipe_resource *resource, + unsigned int nrects, + const struct pipe_box *rects); }; diff --git a/src/gallium/state_trackers/dri/dri2.c b/src/gallium/state_trackers/dri/dri2.c index 5a7ec878bab0..5273b95cd5fb 100644 --- a/src/gallium/state_trackers/dri/dri2.c +++ b/src/gallium/state_trackers/dri/dri2.c @@ -1807,6 +1807,35 @@ static const __DRI2interopExtension dri2InteropExtension = { .export_object = dri2_interop_export_object }; +/** + * \brief the DRI2bufferDamageExtension set_damage_region method + */ +static void +dri2_set_damage_region(__DRIdrawable *dPriv, unsigned int nrects, int *rects) +{ + struct dri_drawable *drawable = dri_drawable(dPriv); + struct pipe_resource *resource = drawable->textures[ST_ATTACHMENT_BACK_LEFT]; + struct pipe_screen *screen = resource->screen; + struct pipe_box *boxes = NULL; + + if (nrects) { + boxes = CALLOC(nrects, sizeof(*boxes)); + assert(boxes); + + for (unsigned int i = 0; i < nrects; i++) { + int *rect = &rects[i * 4]; + + u_box_2d(rect[0], rect[1], rect[2], rect[3], &boxes[i]); + } + } + + screen->set_damage_region(screen, resource, nrects, boxes); +} + +static __DRI2bufferDamageExtension dri2BufferDamageExtension = { + .base = { __DRI2_BUFFER_DAMAGE, 1 }, +}; + /** * \brief the DRI2ConfigQueryExtension configQueryb method */ @@ -1908,6 +1937,7 @@ static const __DRIextension *dri_screen_extensions[] = { &dri2GalliumConfigQueryExtension.base, &dri2ThrottleExtension.base, &dri2FenceExtension.base, + &dri2BufferDamageExtension.base, &dri2InteropExtension.base, &dri2NoErrorExtension.base, &driBlobExtension.base, @@ -1923,6 +1953,7 @@ static const __DRIextension *dri_robust_screen_extensions[] = { &dri2ThrottleExtension.base, &dri2FenceExtension.base, &dri2InteropExtension.base, + &dri2BufferDamageExtension.base, &dri2Robustness.base, &dri2NoErrorExtension.base, &driBlobExtension.base, @@ -1983,6 +2014,9 @@ dri2_init_screen(__DRIscreen * sPriv) } } + if 
(pscreen->set_damage_region) + dri2BufferDamageExtension.set_damage_region = dri2_set_damage_region; + if (pscreen->get_param(pscreen, PIPE_CAP_DEVICE_RESET_STATUS_QUERY)) { sPriv->extensions = dri_robust_screen_extensions; screen->has_reset_status_query = true; -- 2.21.0
[Mesa-dev] [PATCH v6 2/5] dri_interface: add DRI2_BufferDamage interface
From: Daniel Stone Add a new DRI2_BufferDamage interface to support the EGL_KHR_partial_update extension, informing the driver of an overriding scissor region for a particular drawable. Based on a commit originally authored by: Harish Krupo renamed extension, retargeted at DRI drawable instead of context, rewritten description Signed-off-by: Daniel Stone Signed-off-by: Boris Brezillon Acked-by: Alyssa Rosenzweig Reviewed-by: Qiang Yu Tested-by: Qiang Yu --- Changes in v6: * Fix the doc Changes in v5: * Add Alyssa's a-b * Add Daniel's SoB --- include/GL/internal/dri_interface.h | 43 + 1 file changed, 43 insertions(+) diff --git a/include/GL/internal/dri_interface.h b/include/GL/internal/dri_interface.h index af0ee9c56670..9f5bc7c569e6 100644 --- a/include/GL/internal/dri_interface.h +++ b/include/GL/internal/dri_interface.h @@ -85,6 +85,7 @@ typedef struct __DRI2throttleExtensionRec __DRI2throttleExtension; typedef struct __DRI2fenceExtensionRec __DRI2fenceExtension; typedef struct __DRI2interopExtensionRec __DRI2interopExtension; typedef struct __DRI2blobExtensionRec __DRI2blobExtension; +typedef struct __DRI2bufferDamageExtensionRec __DRI2bufferDamageExtension; typedef struct __DRIimageLoaderExtensionRec __DRIimageLoaderExtension; typedef struct __DRIimageDriverExtensionRec __DRIimageDriverExtension; @@ -488,6 +489,48 @@ struct __DRI2interopExtensionRec { struct mesa_glinterop_export_out *out); }; + +/** + * Extension for limiting window system back buffer rendering to user-defined + * scissor region. + */ + +#define __DRI2_BUFFER_DAMAGE "DRI2_BufferDamage" +#define __DRI2_BUFFER_DAMAGE_VERSION 1 + +struct __DRI2bufferDamageExtensionRec { + __DRIextension base; + + /** +* Provides an array of rectangles representing an overriding scissor region +* for rendering operations performed to the specified drawable. 
These +* rectangles do not replace client API scissor regions or draw +* co-ordinates, but instead inform the driver of the overall bounds of all +* operations which will be issued before the next flush. +* +* Any rendering operations writing pixels outside this region to the +* drawable will have an undefined effect on the entire drawable. +* +* This entrypoint may only be called after the drawable has either been +* newly created or flushed, and before any rendering operations which write +* pixels to the drawable. Calling this entrypoint at any other time will +* have an undefined effect on the entire drawable. +* +* Calling this entrypoint with @nrects 0 and @rects NULL will reset the +* region to the buffer's full size. This entrypoint may be called once to +* reset the region, followed by a second call with a populated region, +* before a rendering call is made. +* +* Used to implement EGL_KHR_partial_update. +* +* \param drawable affected drawable +* \param nrects number of rectangles provided +* \param rects the array of rectangles, lower-left origin +*/ + void (*set_damage_region)(__DRIdrawable *drawable, unsigned int nrects, + int *rects); +}; + /*@}*/ /** -- 2.21.0
[Mesa-dev] [PATCH v6 1/5] egl/android: Delete set_damage_region from egl dri vtbl
From: Harish Krupo The intention of KHR_partial_update was not to send the damage back to the platform but to send the damage to the driver to ensure that the following rendering could be restricted to those regions. This patch removes set_damage_region from the egl_dri vtbl and all the platform_*.c files. Upcoming patches then add a new dri2 interface for the drivers to implement. Signed-off-by: Harish Krupo Reviewed-by: Daniel Stone Signed-off-by: Boris Brezillon Acked-by: Alyssa Rosenzweig Reviewed-by: Qiang Yu Tested-by: Qiang Yu --- Changes in v6: * Fix Harish's email address Changes in v5: * Add Alyssa's a-b --- src/egl/drivers/dri2/egl_dri2.c | 3 +- src/egl/drivers/dri2/egl_dri2.h | 4 -- src/egl/drivers/dri2/egl_dri2_fallbacks.h | 9 - src/egl/drivers/dri2/platform_android.c | 45 - src/egl/drivers/dri2/platform_device.c | 1 - src/egl/drivers/dri2/platform_drm.c | 1 - src/egl/drivers/dri2/platform_surfaceless.c | 1 - src/egl/drivers/dri2/platform_wayland.c | 1 - src/egl/drivers/dri2/platform_x11.c | 2 - src/egl/drivers/dri2/platform_x11_dri3.c| 1 - 10 files changed, 1 insertion(+), 67 deletions(-) diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index ee4faaab34f4..3c33b2cf27f8 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -1691,8 +1691,7 @@ static EGLBoolean dri2_set_damage_region(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf, EGLint *rects, EGLint n_rects) { - struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); - return dri2_dpy->vtbl->set_damage_region(drv, disp, surf, rects, n_rects); + return false; } static EGLBoolean diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h index fa04e3bb616d..1d9fe3db625f 100644 --- a/src/egl/drivers/dri2/egl_dri2.h +++ b/src/egl/drivers/dri2/egl_dri2.h @@ -122,10 +122,6 @@ struct dri2_egl_display_vtbl { _EGLSurface *surface, const EGLint *rects, EGLint n_rects); - EGLBoolean
(*set_damage_region)(_EGLDriver *drv, _EGLDisplay *disp, - _EGLSurface *surface, - const EGLint *rects, EGLint n_rects); - EGLBoolean (*swap_buffers_region)(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf, EGLint numRects, const EGLint *rects); diff --git a/src/egl/drivers/dri2/egl_dri2_fallbacks.h b/src/egl/drivers/dri2/egl_dri2_fallbacks.h index 6c2c4bbe595e..d975b7a8b130 100644 --- a/src/egl/drivers/dri2/egl_dri2_fallbacks.h +++ b/src/egl/drivers/dri2/egl_dri2_fallbacks.h @@ -62,7 +62,6 @@ dri2_fallback_swap_buffers_with_damage(_EGLDriver *drv, _EGLDisplay *disp, const EGLint *rects, EGLint n_rects) { struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); - dri2_dpy->vtbl->set_damage_region(drv, disp, surf, rects, n_rects); return dri2_dpy->vtbl->swap_buffers(drv, disp, surf); } @@ -90,14 +89,6 @@ dri2_fallback_copy_buffers(_EGLDriver *drv, _EGLDisplay *disp, return _eglError(EGL_BAD_NATIVE_PIXMAP, "no support for native pixmaps"); } -static inline EGLBoolean -dri2_fallback_set_damage_region(_EGLDriver *drv, _EGLDisplay *disp, -_EGLSurface *surf, -const EGLint *rects, EGLint n_rects) -{ - return EGL_FALSE; -} - static inline EGLint dri2_fallback_query_buffer_age(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf) diff --git a/src/egl/drivers/dri2/platform_android.c b/src/egl/drivers/dri2/platform_android.c index db6ba4a4b4d6..6ce04d250c8d 100644 --- a/src/egl/drivers/dri2/platform_android.c +++ b/src/egl/drivers/dri2/platform_android.c @@ -728,43 +728,6 @@ droid_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw) return EGL_TRUE; } -#if ANDROID_API_LEVEL >= 23 -static EGLBoolean -droid_set_damage_region(_EGLDriver *drv, -_EGLDisplay *disp, -_EGLSurface *draw, const EGLint* rects, EGLint n_rects) -{ - struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); - struct dri2_egl_surface *dri2_surf = dri2_egl_surface(draw); - android_native_rect_t* droid_rects = NULL; - int ret; - - if (n_rects == 0) - return EGL_TRUE; - - droid_rects 
= malloc(n_rects * sizeof(android_native_rect_t)); - if (droid_rects == NULL) - return _eglError(EGL_BAD_ALLOC, "eglSetDamageRegionKHR"); - - for (EGLint num_drects = 0; num_drects < n_rects; num_drects++) { - EGLint i = num_drects * 4; - droid_rects[num_drects].left = rects[i]; - droid_rects[num_drects].bottom = rects[
Re: [Mesa-dev] [PATCH v5 4/5] st/dri2: Implement DRI2bufferDamageExtension
On Mon, 15 Jul 2019 09:23:43 +0200 Boris Brezillon wrote: > Hello Marek, > > On Tue, 2 Jul 2019 20:09:23 +0200 > Boris Brezillon wrote: > > > On Tue, 2 Jul 2019 13:21:31 -0400 > > Marek Olšák wrote: > > > > > On Tue., Jul. 2, 2019, 09:50 Boris Brezillon, > > > wrote: > > > > > > > From: Daniel Stone > > > > > > > > Add a pipe_screen->set_damage_region() hook to propagate > > > > set-damage-region requests to the driver, it's then up to the > > > > driver to decide what to do with this piece of information. > > > > > > > > If the hook is left unassigned, the buffer-damage extension is > > > > considered unsupported. > > > > > > > > Signed-off-by: Daniel Stone > > > > Signed-off-by: Boris Brezillon > > > > Reviewed-by: Alyssa Rosenzweig > > > > --- > > > > Changes in v5: > > > > * Add Alyssa's R-b > > > > --- > > > > src/gallium/include/pipe/p_screen.h | 7 +++ > > > > src/gallium/state_trackers/dri/dri2.c | 22 ++ > > > > 2 files changed, 29 insertions(+) > > > > > > > > diff --git a/src/gallium/include/pipe/p_screen.h > > > > b/src/gallium/include/pipe/p_screen.h > > > > index 3f9bad470950..8df12ee4f865 100644 > > > > --- a/src/gallium/include/pipe/p_screen.h > > > > +++ b/src/gallium/include/pipe/p_screen.h > > > > @@ -464,6 +464,13 @@ struct pipe_screen { > > > > bool (*is_parallel_shader_compilation_finished)(struct > > > > pipe_screen *screen, > > > > void *shader, > > > > unsigned > > > > shader_type); + > > > > + /** > > > > +* Set damage region. > > > > > > > > > > Can you expand the comment to describe rects? The format of rects is > > > not obvious. > > > > Oops, will point to the KHR_partial_update() doc and explain what rects > > encode and how. > > This reminds me that we have a corner case (at least for tile-based > > GPUs): the dri implementation calls > > ->set_damage_region(screen, res, 0, NULL) to reset the damage region, > > but in KHR_partial_update() spec this means "damage all". 
If we follow > > the spec that would imply existing FB content is dropped which in turn > > means users relying on buffer_age() (without partial_update()) to only > > update the region that have changed will stop working properly. > > > > I see 2 options to solve this problem: > > > > 1/ add a new ->reset_damage_region() hook that would be called by the > >dri implementation after each swap_buf() in replacement of the > >current ->set_damage_region(screen, res, 0, NULL). Reset in that > >case means we consider the damage region as "unknown" and force > >a "reload FB content in the local-tile buffer" for the whole > >resource instead of restricting it to the !damage region. > > 2/ deviate from the KHR_partial_update() semantic and reserve > >->set_damage_region(screen, res, 0, NULL) for the "reset damage > >region" op. That means we'll have to convert actual > >KHR_partial_update(0, NULL) calls into > >->set_damage_region(screen, res, 1, full_res_rect) ones to reflect > >the behavior described in the spec. > > Any advice on how to solve this problem? I decided to go for a third option in my v6, which is to keep things as they were and document that ->set_damage_region(0, NULL) should act as a 'reset damage region'. This is exactly how it's documented in the DRI2 extension, and I guess we can live with the potential extra penalty when the application calls KHR_partial_update(0, NULL) instead of KHR_partial_update(1, full_res_rect).
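To make the 'nrects = 0 means reset' convention concrete, here is a minimal sketch of how a tile-based driver might track the damage extent. It is not code from this series: `struct box2d`, `struct damage_state` and `damage_set_region()` are hypothetical stand-ins for pipe_box, driver-private state and a driver's ->set_damage_region() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a 2D pipe_box (origin plus size). */
struct box2d {
   int x, y, width, height;
};

/* Hypothetical per-resource damage state kept by a tile-based driver. */
struct damage_state {
   struct box2d extent;   /* bounding box of the current damage region */
   bool has_damage_rects; /* false: damage unknown, reload every tile */
};

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

void
damage_set_region(struct damage_state *state, int res_width, int res_height,
                  unsigned int nrects, const struct box2d *rects)
{
   assert(state);

   if (nrects == 0) {
      /* Reset: the extent covers the whole resource and there is no
       * damage rect, so existing content is reloaded for every tile.
       */
      state->extent = (struct box2d){ 0, 0, res_width, res_height };
      state->has_damage_rects = false;
      return;
   }

   /* Otherwise the extent is the bounding box of all damage rects. */
   int x0 = rects[0].x, y0 = rects[0].y;
   int x1 = rects[0].x + rects[0].width;
   int y1 = rects[0].y + rects[0].height;

   for (unsigned int i = 1; i < nrects; i++) {
      x0 = min_int(x0, rects[i].x);
      y0 = min_int(y0, rects[i].y);
      x1 = max_int(x1, rects[i].x + rects[i].width);
      y1 = max_int(y1, rects[i].y + rects[i].height);
   }

   state->extent = (struct box2d){ x0, y0, x1 - x0, y1 - y0 };
   state->has_damage_rects = true;
}
```

With this shape, an application calling KHR_partial_update(0, NULL) only pays for the full reload; it does not lose the existing FB content, which is the trade-off accepted above.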
Re: [Mesa-dev] [PATCH v6 4/5] st/dri2: Implement DRI2bufferDamageExtension
Hi Qiang, On Sun, 21 Jul 2019 17:02:54 +0800 Qiang Yu wrote: > On Mon, Jul 15, 2019 at 8:50 PM Boris Brezillon > wrote: > > > > From: Daniel Stone > > > > Add a pipe_screen->set_damage_region() hook to propagate > > set-damage-region requests to the driver, it's then up to the driver to > > decide what to do with this piece of information. > > > > If the hook is left unassigned, the buffer-damage extension is > > considered unsupported. > > > > Signed-off-by: Daniel Stone > > Signed-off-by: Boris Brezillon > > Reviewed-by: Alyssa Rosenzweig > > --- > > Hello Qiang, > > > > I intentionally dropped your R-b/T-b on this patch since the > > ->set_damage_region() prototype has changed. Feel free to add it back. > > > > Regards, > > > > Boris > > > > Changes in v6: > > * Pass pipe_box objects instead ints > > * Document the set_damage_region() hook > > > > Changes in v5: > > * Add Alyssa's R-b > > --- > > src/gallium/include/pipe/p_screen.h | 17 ++ > > src/gallium/state_trackers/dri/dri2.c | 34 +++ > > 2 files changed, 51 insertions(+) > > > > diff --git a/src/gallium/include/pipe/p_screen.h > > b/src/gallium/include/pipe/p_screen.h > > index 3f9bad470950..11a6aa939124 100644 > > --- a/src/gallium/include/pipe/p_screen.h > > +++ b/src/gallium/include/pipe/p_screen.h > > @@ -464,6 +464,23 @@ struct pipe_screen { > > bool (*is_parallel_shader_compilation_finished)(struct pipe_screen > > *screen, > > void *shader, > > unsigned shader_type); > > + > > + /** > > +* Set the damage region (called when KHR_partial_update() is invoked). > > +* This function is passed an array of rectangles encoding the damage > > area. > > +* rects are using the bottom-left origin convention. > > +* nrects = 0 means 'reset the damage region'. What 'reset' implies is > > HW > > +* specific. For tile-based renderers, the damage extent is typically > > set > > +* to cover the whole resource with no damage rect (or a 0-size damage > > +* rect). 
This way, the existing resource content is reloaded into the > > +* local tile buffer for every tile thus making partial tile update > > +* possible. For HW operating in immediate mode, this reset operation is > > +* likely to be a NOOP. > > +*/ > > + void (*set_damage_region)(struct pipe_screen *screen, > > + struct pipe_resource *resource, > > + unsigned int nrects, > > + const struct pipe_box *rects); > > }; > > > > > > diff --git a/src/gallium/state_trackers/dri/dri2.c > > b/src/gallium/state_trackers/dri/dri2.c > > index 5a7ec878bab0..5273b95cd5fb 100644 > > --- a/src/gallium/state_trackers/dri/dri2.c > > +++ b/src/gallium/state_trackers/dri/dri2.c > > @@ -1807,6 +1807,35 @@ static const __DRI2interopExtension > > dri2InteropExtension = { > > .export_object = dri2_interop_export_object > > }; > > > > +/** > > + * \brief the DRI2bufferDamageExtension set_damage_region method > > + */ > > +static void > > +dri2_set_damage_region(__DRIdrawable *dPriv, unsigned int nrects, int > > *rects) > > +{ > > + struct dri_drawable *drawable = dri_drawable(dPriv); > > + struct pipe_resource *resource = > > drawable->textures[ST_ATTACHMENT_BACK_LEFT]; > > + struct pipe_screen *screen = resource->screen; > > + struct pipe_box *boxes = NULL; > > + > > + if (nrects) { > > + boxes = CALLOC(nrects, sizeof(*boxes)); > > + assert(boxes); > > Where does this boxes array get freed? I can't find in your patch 6 either. Indeed, the FREE() is missing. > In fact I prefer the v5 way which just uses `int *rects` to avoid unnecessary > conversion. Well, Erik suggested passing an array of pipe_box objects to make things clearer, and I kind of agree with him. Moreover, I'd expect the extra allocation + pipe_box init overhead to be negligible. Regards, Boris
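For illustration, a version of the rect-to-box conversion that releases the temporary array could look roughly like the sketch below. It is self-contained and deliberately avoids Mesa's types: `struct box2d`, `forward_damage_region()`, `record_damage()` and the callback typedef are hypothetical stand-ins for pipe_box, dri2_set_damage_region() and the ->set_damage_region() hook. It also assumes the hook does not keep the `boxes` pointer after it returns.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a 2D pipe_box. */
struct box2d {
   int x, y, width, height;
};

/* Hypothetical stand-in for the driver hook. */
typedef void (*set_damage_region_fn)(void *data, unsigned int nrects,
                                     const struct box2d *boxes);

/* Convert nrects * 4 ints (x, y, width, height) into boxes, forward them
 * to the hook, then release the temporary array.
 * Returns 0 on success, -1 if the allocation failed.
 */
int
forward_damage_region(unsigned int nrects, const int *rects,
                      set_damage_region_fn hook, void *data)
{
   struct box2d *boxes = NULL;

   assert(hook);

   if (nrects) {
      boxes = calloc(nrects, sizeof(*boxes));
      if (!boxes)
         return -1;

      for (unsigned int i = 0; i < nrects; i++) {
         const int *rect = &rects[i * 4];

         boxes[i] = (struct box2d){ rect[0], rect[1], rect[2], rect[3] };
      }
   }

   hook(data, nrects, boxes);

   /* The array is only needed for the duration of the call.
    * free(NULL) is a no-op, so the nrects == 0 path needs no special case.
    */
   free(boxes);
   return 0;
}

/* Example hook: remembers the rect count and the last box it was given. */
struct damage_record {
   unsigned int nrects;
   struct box2d last;
};

void
record_damage(void *data, unsigned int nrects, const struct box2d *boxes)
{
   struct damage_record *rec = data;

   rec->nrects = nrects;
   if (nrects)
      rec->last = boxes[nrects - 1];
}
```

Reporting the allocation failure instead of asserting is a choice made for this sketch only; the patch itself uses assert(boxes).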
Re: [Mesa-dev] [PATCH v6 4/5] st/dri2: Implement DRI2bufferDamageExtension
+Marek (looks like I forgot to Cc you on this v6 :-/). On Mon, 22 Jul 2019 09:49:31 +0200 Boris Brezillon wrote: > Hi Qiang, > > On Sun, 21 Jul 2019 17:02:54 +0800 > Qiang Yu wrote: > > > On Mon, Jul 15, 2019 at 8:50 PM Boris Brezillon > > wrote: > > > > > > From: Daniel Stone > > > > > > Add a pipe_screen->set_damage_region() hook to propagate > > > set-damage-region requests to the driver, it's then up to the driver to > > > decide what to do with this piece of information. > > > > > > If the hook is left unassigned, the buffer-damage extension is > > > considered unsupported. > > > > > > Signed-off-by: Daniel Stone > > > Signed-off-by: Boris Brezillon > > > Reviewed-by: Alyssa Rosenzweig > > > --- > > > Hello Qiang, > > > > > > I intentionally dropped your R-b/T-b on this patch since the > > > ->set_damage_region() prototype has changed. Feel free to add it back. > > > > > > Regards, > > > > > > Boris > > > > > > Changes in v6: > > > * Pass pipe_box objects instead ints > > > * Document the set_damage_region() hook > > > > > > Changes in v5: > > > * Add Alyssa's R-b > > > --- > > > src/gallium/include/pipe/p_screen.h | 17 ++ > > > src/gallium/state_trackers/dri/dri2.c | 34 +++ > > > 2 files changed, 51 insertions(+) > > > > > > diff --git a/src/gallium/include/pipe/p_screen.h > > > b/src/gallium/include/pipe/p_screen.h > > > index 3f9bad470950..11a6aa939124 100644 > > > --- a/src/gallium/include/pipe/p_screen.h > > > +++ b/src/gallium/include/pipe/p_screen.h > > > @@ -464,6 +464,23 @@ struct pipe_screen { > > > bool (*is_parallel_shader_compilation_finished)(struct pipe_screen > > > *screen, > > > void *shader, > > > unsigned shader_type); > > > + > > > + /** > > > +* Set the damage region (called when KHR_partial_update() is > > > invoked). > > > +* This function is passed an array of rectangles encoding the damage > > > area. > > > +* rects are using the bottom-left origin convention. > > > +* nrects = 0 means 'reset the damage region'. 
What 'reset' implies > > > is HW > > > +* specific. For tile-based renderers, the damage extent is typically > > > set > > > +* to cover the whole resource with no damage rect (or a 0-size damage > > > +* rect). This way, the existing resource content is reloaded into the > > > +* local tile buffer for every tile thus making partial tile update > > > +* possible. For HW operating in immediate mode, this reset operation > > > is > > > +* likely to be a NOOP. > > > +*/ > > > + void (*set_damage_region)(struct pipe_screen *screen, > > > + struct pipe_resource *resource, > > > + unsigned int nrects, > > > + const struct pipe_box *rects); > > > }; > > > > > > > > > diff --git a/src/gallium/state_trackers/dri/dri2.c > > > b/src/gallium/state_trackers/dri/dri2.c > > > index 5a7ec878bab0..5273b95cd5fb 100644 > > > --- a/src/gallium/state_trackers/dri/dri2.c > > > +++ b/src/gallium/state_trackers/dri/dri2.c > > > @@ -1807,6 +1807,35 @@ static const __DRI2interopExtension > > > dri2InteropExtension = { > > > .export_object = dri2_interop_export_object > > > }; > > > > > > +/** > > > + * \brief the DRI2bufferDamageExtension set_damage_region method > > > + */ > > > +static void > > > +dri2_set_damage_region(__DRIdrawable *dPriv, unsigned int nrects, int > > > *rects) > > > +{ > > > + struct dri_drawable *drawable = dri_drawable(dPriv); > > > + struct pipe_resource *resource = > > > drawable->textures[ST_ATTACHMENT_BACK_LEFT]; > > > + struct pipe_screen *screen = resource->screen; > > > + struct pipe_box *boxes = NULL; > > > + > > > + if (nrects) { > > > + boxes = CALLOC(nrects, sizeof(*boxes)); > > > + assert(boxes); > > > > Where does this boxes array get freed? I can't find in your patch 6 either. > > > > Indeed, the FREE() is missing. > > > In fact I prefer the v5 way which just uses `int *rects` to avoid > > unnecessary > > conversion. > > Well, Erik suggested to pass an array of pipe_boxe objects to make > things clearer, and I can of agree with him. 
Moreover, I'd expect the > extra allocation + pipe_box init overhead to be negligible.
(*kind of, not "can of", in the quoted text above.)

Erik, Qiang, Marek,

Any comments on this v6? Should I send a v7 adding the missing FREE() call? Anything else you'd like me to change?

Thanks,

Boris
[Mesa-dev] [PATCH 0/9] panfrost: Allocate the polygon lists on-demand
Hello,

This patch series gets rid of the 64MB polygon-list allocation that was done at context init time in favor of a per-job polygon-list allocation, allowing us to shrink the BO size considerably and thus reduce memory consumption.

The first 8 patches are random cleanups. Most of them are needed to get patch 9 working (patches 2, 3, 4, 5, 6 and 8), others are just things I decided to get rid of along the way (patches 1 and 7).

Regards,

Boris

Alyssa Rosenzweig (1):
  panfrost: Allocate polygon lists on-demand

Boris Brezillon (8):
  panfrost: Get rid of ctx->job
  panfrost: Remove job from ctx->jobs at submission time
  panfrost: Delay FB descriptor allocation
  panfrost: Bail out early when new and current FB states are equal
  panfrost: Bail out early when doing a wallpaper blit
  panfrost: Don't emit a new FB desc when setting a new FB state
  panfrost: Get rid of the skippable param in attach_vt_framebuffer()
  panfrost: Handle the bo == NULL case in panfrost_bo_[un]reference()

 src/gallium/drivers/panfrost/pan_context.c    | 61 +++
 src/gallium/drivers/panfrost/pan_context.h    | 7 +--
 src/gallium/drivers/panfrost/pan_drm.c        | 2 +-
 src/gallium/drivers/panfrost/pan_job.c        | 40 +---
 src/gallium/drivers/panfrost/pan_job.h        | 6 ++
 src/gallium/drivers/panfrost/pan_resource.c   | 6 +-
 src/gallium/drivers/panfrost/pan_scoreboard.c | 5 +-
 7 files changed, 82 insertions(+), 45 deletions(-)

-- 
2.21.0
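As a rough illustration of the idea behind the series (allocate the polygon list per job, on first use, sized to what the job needs), here is a minimal self-contained sketch. It is not the code from patch 9: `struct buffer`, `struct job`, `job_get_polygon_list()` and `job_free_polygon_list()` are hypothetical names, plain heap memory stands in for a BO, and it assumes later requests within a job never exceed the size of the first one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer object (BO). */
struct buffer {
   size_t size;
   void *cpu;
};

/* Hypothetical per-job state: the polygon list starts out unallocated. */
struct job {
   struct buffer *polygon_list;
};

/* Return the job's polygon list, allocating it on first use.
 * Returns NULL if the allocation fails.
 */
struct buffer *
job_get_polygon_list(struct job *job, size_t size)
{
   if (job->polygon_list) {
      /* Assumption of this sketch: the first request is the largest. */
      assert(job->polygon_list->size >= size);
      return job->polygon_list;
   }

   struct buffer *buf = malloc(sizeof(*buf));
   if (!buf)
      return NULL;

   buf->cpu = calloc(1, size);
   if (!buf->cpu) {
      free(buf);
      return NULL;
   }

   buf->size = size;
   job->polygon_list = buf;
   return buf;
}

/* Release the polygon list when the job is destroyed. */
void
job_free_polygon_list(struct job *job)
{
   if (!job->polygon_list)
      return;

   free(job->polygon_list->cpu);
   free(job->polygon_list);
   job->polygon_list = NULL;
}
```

A job that never needs a polygon list never calls job_get_polygon_list(), so nothing is allocated for it.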
[Mesa-dev] [PATCH 2/9] panfrost: Remove job from ctx->jobs at submission time
This guarantees that new draws targeting the same framebuffer will get a new job instance. Signed-off-by: Boris Brezillon --- src/gallium/drivers/panfrost/pan_job.c | 8 1 file changed, 8 insertions(+) diff --git a/src/gallium/drivers/panfrost/pan_job.c b/src/gallium/drivers/panfrost/pan_job.c index 960c8556e2f0..d2a4c8c3c600 100644 --- a/src/gallium/drivers/panfrost/pan_job.c +++ b/src/gallium/drivers/panfrost/pan_job.c @@ -173,6 +173,14 @@ panfrost_job_submit(struct panfrost_context *ctx, struct panfrost_job *job) if (ret) fprintf(stderr, "panfrost_job_submit failed: %d\n", ret); + +/* Remove the job from the ctx->jobs set so that future + * panfrost_get_job() calls don't see it. + * We must reset the job key to avoid removing another valid entry when + * the job is freed. + */ +_mesa_hash_table_remove_key(ctx->jobs, &job->key); +memset(&job->key, 0, sizeof(job->key)); } void -- 2.21.0
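The reason for zeroing the key may be easier to see on a toy model. The sketch below is hypothetical and does not use Mesa's hash table: `struct job_table`, `table_insert()`, `table_lookup()`, `table_remove_key()`, `job_submit()` and `job_free()` are made-up names, the table is a tiny fixed-size array, and it assumes an all-zero key is never used by a live job.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_JOBS 8

/* Hypothetical job key: identifies the framebuffer a job renders to.
 * Assumption: fb_id == 0 is never used by a live job.
 */
struct job_key {
   unsigned int fb_id;
};

struct job {
   struct job_key key;
   int submitted;
};

/* Toy replacement for the ctx->jobs table. */
struct job_table {
   struct job *slots[MAX_JOBS];
};

int
table_insert(struct job_table *t, struct job *job)
{
   for (size_t i = 0; i < MAX_JOBS; i++) {
      if (!t->slots[i]) {
         t->slots[i] = job;
         return 0;
      }
   }
   return -1;
}

struct job *
table_lookup(struct job_table *t, const struct job_key *key)
{
   for (size_t i = 0; i < MAX_JOBS; i++) {
      if (t->slots[i] &&
          memcmp(&t->slots[i]->key, key, sizeof(*key)) == 0)
         return t->slots[i];
   }
   return NULL;
}

/* Removal is by key only: it drops whichever entry currently has it. */
void
table_remove_key(struct job_table *t, const struct job_key *key)
{
   for (size_t i = 0; i < MAX_JOBS; i++) {
      if (t->slots[i] &&
          memcmp(&t->slots[i]->key, key, sizeof(*key)) == 0)
         t->slots[i] = NULL;
   }
}

void
job_submit(struct job_table *t, struct job *job)
{
   job->submitted = 1;

   /* Hide the job from future lookups, then clear its key so that a
    * later job_free() cannot match a newer job using the same key.
    */
   table_remove_key(t, &job->key);
   memset(&job->key, 0, sizeof(job->key));
}

void
job_free(struct job_table *t, struct job *job)
{
   /* With a zeroed key this matches nothing. */
   table_remove_key(t, &job->key);
}
```

Without the memset(), freeing the submitted job would remove the newer job that was created for the same framebuffer in the meantime.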
[Mesa-dev] [PATCH 9/9] panfrost: Allocate polygon lists on-demand
From: Alyssa Rosenzweig Rather than allocating a huge (64MB) polygon list on context creation and sharing it across framebuffers, we instead allocate polygon lists as BOs (which consistently hit the cache) sized appropriately; for about a month, we've known how to calculate the polygon list size so this has only recently become possible. The good news is we can render to truly massive framebuffers without crashing and, more importantly, we eliminate the 64MB upfront overhead. If a list that size isn't actually needed, it's not allocated. Signed-off-by: Alyssa Rosenzweig Signed-off-by: Boris Brezillon --- src/gallium/drivers/panfrost/pan_context.c| 8 ++- src/gallium/drivers/panfrost/pan_context.h| 1 - src/gallium/drivers/panfrost/pan_drm.c| 2 +- src/gallium/drivers/panfrost/pan_job.c| 24 +++ src/gallium/drivers/panfrost/pan_job.h| 6 + src/gallium/drivers/panfrost/pan_scoreboard.c | 5 ++-- 6 files changed, 36 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index e781d809812b..26bd0082a339 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -62,6 +62,7 @@ panfrost_emit_midg_tiler( unsigned vertex_count) { struct midgard_tiler_descriptor t = {}; +struct panfrost_job *batch = panfrost_get_job_for_fbo(ctx); t.hierarchy_mask = panfrost_choose_hierarchy_mask(width, height, vertex_count); @@ -77,10 +78,7 @@ panfrost_emit_midg_tiler( /* Sanity check */ if (t.hierarchy_mask) { -assert(ctx->tiler_polygon_list.bo->size >= (header_size + body_size)); - -/* Specify allocated tiler structures */ -t.polygon_list = ctx->tiler_polygon_list.bo->gpu; +t.polygon_list = panfrost_job_get_polygon_list(batch, header_size + body_size); /* Allow the entire tiler heap */ t.heap_start = ctx->tiler_heap.bo->gpu; @@ -2527,7 +2525,6 @@ panfrost_destroy(struct pipe_context *pipe) panfrost_drm_free_slab(screen, &panfrost->scratchpad);
panfrost_drm_free_slab(screen, &panfrost->shaders); panfrost_drm_free_slab(screen, &panfrost->tiler_heap); -panfrost_drm_free_slab(screen, &panfrost->tiler_polygon_list); panfrost_drm_free_slab(screen, &panfrost->tiler_dummy); ralloc_free(pipe); @@ -2673,7 +2670,6 @@ panfrost_setup_hardware(struct panfrost_context *ctx) panfrost_drm_allocate_slab(screen, &ctx->scratchpad, 64*4, false, 0, 0, 0); panfrost_drm_allocate_slab(screen, &ctx->shaders, 4096, true, PAN_ALLOCATE_EXECUTE, 0, 0); panfrost_drm_allocate_slab(screen, &ctx->tiler_heap, 4096, false, PAN_ALLOCATE_INVISIBLE | PAN_ALLOCATE_GROWABLE, 1, 128); -panfrost_drm_allocate_slab(screen, &ctx->tiler_polygon_list, 128*128, false, PAN_ALLOCATE_INVISIBLE | PAN_ALLOCATE_GROWABLE, 1, 128); panfrost_drm_allocate_slab(screen, &ctx->tiler_dummy, 1, false, PAN_ALLOCATE_INVISIBLE, 0, 0); } diff --git a/src/gallium/drivers/panfrost/pan_context.h b/src/gallium/drivers/panfrost/pan_context.h index ac4b21678e65..7556500ae72d 100644 --- a/src/gallium/drivers/panfrost/pan_context.h +++ b/src/gallium/drivers/panfrost/pan_context.h @@ -110,7 +110,6 @@ struct panfrost_context { struct panfrost_memory shaders; struct panfrost_memory scratchpad; struct panfrost_memory tiler_heap; -struct panfrost_memory tiler_polygon_list; struct panfrost_memory tiler_dummy; struct panfrost_memory depth_stencil_buffer; diff --git a/src/gallium/drivers/panfrost/pan_drm.c b/src/gallium/drivers/panfrost/pan_drm.c index 89c7019dd9c7..42cf17503344 100644 --- a/src/gallium/drivers/panfrost/pan_drm.c +++ b/src/gallium/drivers/panfrost/pan_drm.c @@ -288,7 +288,7 @@ panfrost_drm_submit_vs_fs_job(struct panfrost_context *ctx, bool has_draws, bool panfrost_job_add_bo(job, ctx->shaders.bo); panfrost_job_add_bo(job, ctx->scratchpad.bo); panfrost_job_add_bo(job, ctx->tiler_heap.bo); -panfrost_job_add_bo(job, ctx->tiler_polygon_list.bo); +panfrost_job_add_bo(job, job->polygon_list); if (job->first_job.gpu) { ret = panfrost_drm_submit_job(ctx, job->first_job.gpu, 
0); diff --git a/src/gallium/drivers/panfrost/pan_job.c b/src/gallium/drivers/panfrost/pan_job.c index d2a4c8c3c600..9c39181d6e48 100644 --- a/src/gallium/drivers/panfrost/pan_job.c +++ b/src/gallium/drivers/panfrost/pan_job.c @@ -70,6 +70,9 @@ panfrost_free_job(struct panfrost_context *ctx, struct panfrost_job *job) BITSET_SET(screen->free_transient, *ind
[Mesa-dev] [PATCH 5/9] panfrost: Bail out early when doing a wallpaper blit
The wallpaper blit is a bit special in that the operation is targeting the current FB, but the u_blitter logic creates a new surface for it, which makes util_framebuffer_state_equal() return false. In that case we don't want a new FB descriptor to be emitted/attached, so let's just copy the new state into ctx->pipe_framebuffer and exit the function. Signed-off-by: Boris Brezillon --- src/gallium/drivers/panfrost/pan_context.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index d442ae1f2433..1091caeb1148 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -2369,10 +2369,22 @@ panfrost_set_framebuffer_state(struct pipe_context *pctx, if (util_framebuffer_state_equal(&ctx->pipe_framebuffer, fb)) return; -if (!ctx->wallpaper_batch && (!is_scanout || has_draws)) { -panfrost_flush(pctx, NULL, PIPE_FLUSH_END_OF_FRAME); +/* The wallpaper logic sets a new FB state before doing the blit and + * restores the old one when it's done. Those FB states are reported to + * be different because the surfaces they point to are different, + * but those surfaces actually point to the same cbufs/zbufs. In that + * case we definitely don't want new FB descs to be emitted/attached + * since the job is expected to be flushed just after the blit is done, + * so let's just copy the new state and return here. + */ +if (ctx->wallpaper_batch) { +util_copy_framebuffer_state(&ctx->pipe_framebuffer, fb); +return; } +if (!is_scanout || has_draws) +panfrost_flush(pctx, NULL, PIPE_FLUSH_END_OF_FRAME); + util_copy_framebuffer_state(&ctx->pipe_framebuffer, fb); /* Given that we're rendering, we'd love to have compression */ -- 2.21.0
[Mesa-dev] [PATCH 3/9] panfrost: Delay FB descriptor allocation
No need to emit SFBD/MFBD at frame invalidation. They can be emitted when the framebuffer is attached, which saves us a potential FB desc re-allocation if a new FB is bound after the swap. Signed-off-by: Boris Brezillon --- src/gallium/drivers/panfrost/pan_context.c | 21 ++--- src/gallium/drivers/panfrost/pan_context.h | 3 --- 2 files changed, 6 insertions(+), 18 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index 014f8f6a9d07..b63023a16cda 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -189,13 +189,17 @@ panfrost_clear( static mali_ptr panfrost_attach_vt_mfbd(struct panfrost_context *ctx) { -return panfrost_upload_transient(ctx, &ctx->vt_framebuffer_mfbd, sizeof(ctx->vt_framebuffer_mfbd)) | MALI_MFBD; +struct bifrost_framebuffer mfbd = panfrost_emit_mfbd(ctx, ~0); + +return panfrost_upload_transient(ctx, &mfbd, sizeof(mfbd)) | MALI_MFBD; } static mali_ptr panfrost_attach_vt_sfbd(struct panfrost_context *ctx) { -return panfrost_upload_transient(ctx, &ctx->vt_framebuffer_sfbd, sizeof(ctx->vt_framebuffer_sfbd)) | MALI_SFBD; +struct mali_single_framebuffer sfbd = panfrost_emit_sfbd(ctx, ~0); + +return panfrost_upload_transient(ctx, &sfbd, sizeof(sfbd)) | MALI_SFBD; } static void @@ -223,13 +227,6 @@ panfrost_attach_vt_framebuffer(struct panfrost_context *ctx, bool skippable) static void panfrost_invalidate_frame(struct panfrost_context *ctx) { -struct panfrost_screen *screen = pan_screen(ctx->base.screen); - -if (screen->require_sfbd) -ctx->vt_framebuffer_sfbd = panfrost_emit_sfbd(ctx, ~0); -else -ctx->vt_framebuffer_mfbd = panfrost_emit_mfbd(ctx, ~0); - for (unsigned i = 0; i < PIPE_SHADER_TYPES; ++i) ctx->payloads[i].postfix.framebuffer = 0; @@ -2378,12 +2375,6 @@ panfrost_set_framebuffer_state(struct pipe_context *pctx, struct panfrost_screen *screen = pan_screen(ctx->base.screen); panfrost_hint_afbc(screen, &ctx->pipe_framebuffer); - -if 
(screen->require_sfbd) -ctx->vt_framebuffer_sfbd = panfrost_emit_sfbd(ctx, ~0); -else -ctx->vt_framebuffer_mfbd = panfrost_emit_mfbd(ctx, ~0); - panfrost_attach_vt_framebuffer(ctx, false); } diff --git a/src/gallium/drivers/panfrost/pan_context.h b/src/gallium/drivers/panfrost/pan_context.h index a90dbb04e833..ac4b21678e65 100644 --- a/src/gallium/drivers/panfrost/pan_context.h +++ b/src/gallium/drivers/panfrost/pan_context.h @@ -137,9 +137,6 @@ struct panfrost_context { union mali_attr attributes[PIPE_MAX_ATTRIBS]; -struct mali_single_framebuffer vt_framebuffer_sfbd; -struct bifrost_framebuffer vt_framebuffer_mfbd; - /* TODO: Multiple uniform buffers (index =/= 0), finer updates? */ struct panfrost_constant_buffer constant_buffer[PIPE_SHADER_TYPES]; -- 2.21.0
[Mesa-dev] [PATCH 7/9] panfrost: Get rid of the skippable param in attach_vt_framebuffer()
The only user of this function always passes true. Signed-off-by: Boris Brezillon --- src/gallium/drivers/panfrost/pan_context.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index 2b7906eea155..e781d809812b 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -203,11 +203,11 @@ panfrost_attach_vt_sfbd(struct panfrost_context *ctx) } static void -panfrost_attach_vt_framebuffer(struct panfrost_context *ctx, bool skippable) +panfrost_attach_vt_framebuffer(struct panfrost_context *ctx) { /* Skip the attach if we can */ -if (skippable && ctx->payloads[PIPE_SHADER_VERTEX].postfix.framebuffer) { +if (ctx->payloads[PIPE_SHADER_VERTEX].postfix.framebuffer) { assert(ctx->payloads[PIPE_SHADER_FRAGMENT].postfix.framebuffer); return; } @@ -1013,7 +1013,7 @@ panfrost_emit_for_draw(struct panfrost_context *ctx, bool with_vertex_data) struct panfrost_job *job = panfrost_get_job_for_fbo(ctx); struct panfrost_screen *screen = pan_screen(ctx->base.screen); -panfrost_attach_vt_framebuffer(ctx, true); +panfrost_attach_vt_framebuffer(ctx); if (with_vertex_data) { panfrost_emit_vertex_data(job); -- 2.21.0
[Mesa-dev] [PATCH 6/9] panfrost: Don't emit a new FB desc when setting a new FB state
The FB desc will be emitted/attached on the first draw targeting this new FB. Signed-off-by: Boris Brezillon --- src/gallium/drivers/panfrost/pan_context.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index 1091caeb1148..2b7906eea155 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -2384,6 +2384,9 @@ panfrost_set_framebuffer_state(struct pipe_context *pctx, if (!is_scanout || has_draws) panfrost_flush(pctx, NULL, PIPE_FLUSH_END_OF_FRAME); +else +assert(!ctx->payloads[PIPE_SHADER_VERTEX].postfix.framebuffer && + !ctx->payloads[PIPE_SHADER_FRAGMENT].postfix.framebuffer); util_copy_framebuffer_state(&ctx->pipe_framebuffer, fb); @@ -2391,7 +2394,8 @@ panfrost_set_framebuffer_state(struct pipe_context *pctx, struct panfrost_screen *screen = pan_screen(ctx->base.screen); panfrost_hint_afbc(screen, &ctx->pipe_framebuffer); -panfrost_attach_vt_framebuffer(ctx, false); +for (unsigned i = 0; i < PIPE_SHADER_TYPES; ++i) +ctx->payloads[i].postfix.framebuffer = 0; } static void * -- 2.21.0
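The "clear the pointer on state change, emit on first draw" pattern used here can be sketched in isolation as follows. This is a hypothetical model, not panfrost code: `struct fb_ctx`, `ctx_set_framebuffer_state()`, `ctx_draw()` and the fake address counter are made up for the example, and a zero descriptor address is assumed to mean "nothing attached".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical context: fb_desc == 0 means no FB descriptor is attached. */
struct fb_ctx {
   uint64_t fb_desc;        /* fake GPU address of the attached descriptor */
   unsigned int emit_count; /* how many descriptors have been emitted */
   uint64_t next_addr;      /* fake allocator cursor */
};

/* Setting a new FB state only invalidates the attached descriptor. */
void
ctx_set_framebuffer_state(struct fb_ctx *ctx)
{
   ctx->fb_desc = 0;
}

/* Stand-in for emitting and uploading an SFBD/MFBD. */
static uint64_t
emit_fb_desc(struct fb_ctx *ctx)
{
   ctx->emit_count++;
   ctx->next_addr += 0x100;
   return ctx->next_addr;
}

/* The descriptor is emitted lazily, on the first draw that needs it. */
void
ctx_draw(struct fb_ctx *ctx)
{
   if (!ctx->fb_desc)
      ctx->fb_desc = emit_fb_desc(ctx);

   assert(ctx->fb_desc);
}
```

Binding several framebuffers in a row without drawing then costs no descriptor allocation at all; only the framebuffer that is actually drawn to gets one.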