[Mesa-dev] [PATCH] egl: autotools: add missing dependency on generated header
platform_wayland.c includes linux-dmabuf-unstable-v1-client-protocol.h,
which is generated during the build. Add the missing dependency to the
Makefile.

I have seen the following build failure due to a race between generation
of linux-dmabuf-unstable-v1-client-protocol.h and compilation of
platform_wayland.c:

  GEN      drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h
  GEN      drivers/dri2/linux-dmabuf-unstable-v1-protocol.c
  Using "code" is deprecated - use private-code or public-code.
  See the help page for details.
  CC       drivers/dri2/platform_wayland.lo
  ../../../Mesa-18.1.0/src/egl/drivers/dri2/platform_wayland.c: In function 'create_wl_buffer':
  ../../../Mesa-18.1.0/src/egl/drivers/dri2/platform_wayland.c:810:16: error: implicit declaration of function 'zwp_linux_dmabuf_v1_create_params' [-Werror=implicit-function-declaration]

Signed-off-by: Philipp Zabel
---
 src/egl/Makefile.am | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
index 086a4a1e630..116ed4ebf50 100644
--- a/src/egl/Makefile.am
+++ b/src/egl/Makefile.am
@@ -80,6 +80,7 @@ drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h: $(WL_DMABUF_XML)
 if HAVE_PLATFORM_WAYLAND
 drivers/dri2/linux-dmabuf-unstable-v1-protocol.lo: drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h
 drivers/dri2/egl_dri2.lo: drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h
+drivers/dri2/platform_wayland.lo: drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h

 AM_CFLAGS += $(WAYLAND_CLIENT_CFLAGS)
 libEGL_common_la_LIBADD += $(WAYLAND_CLIENT_LIBS)
--
2.17.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] egl/android: Add DRM node probing and filtering
Hi Rob, Finally got to review this. Please see my comments inline. On Fri, May 11, 2018 at 10:48 PM Robert Foss wrote: [snip] > +EGLBoolean > +droid_load_driver(_EGLDisplay *disp) Since this is not EGL-facing, I'd personally use bool. > +{ > + struct dri2_egl_display *dri2_dpy = disp->DriverData; > + const char *err; > + > + dri2_dpy->driver_name = loader_get_driver_for_fd(dri2_dpy->fd); > + if (dri2_dpy->driver_name == NULL) { > + err = "DRI2: failed to get driver name"; > + goto error; It shouldn't be an error if there is no driver for given render node. We should just skip it and try next one, which I believe would be achieved by just returning false here. > + } > + > + dri2_dpy->is_render_node = drmGetNodeTypeFromFd(dri2_dpy->fd) == DRM_NODE_RENDER; > + > + if (!dri2_dpy->is_render_node) { > + #ifdef HAVE_DRM_GRALLOC > + /* Handle control nodes using __DRI_DRI2_LOADER extension and GEM names > +* for backwards compatibility with drm_gralloc. (Do not use on new > +* systems.) */ > + dri2_dpy->loader_extensions = droid_dri2_loader_extensions; > + if (!dri2_load_driver(disp)) { > + err = "DRI2: failed to load driver"; > + goto error; > + } > + #else > + err = "DRI2: handle is not for a render node"; > + goto error; > + #endif > + } else { > + dri2_dpy->loader_extensions = droid_image_loader_extensions; > + if (!dri2_load_driver_dri3(disp)) { > + err = "DRI3: failed to load driver"; > + goto error; > + } > +} > + > + return EGL_TRUE; > + > +error: > + free(dri2_dpy->driver_name); > + dri2_dpy->driver_name = NULL; > + return _eglError(EGL_NOT_INITIALIZED, err); Hmm, if we signal EGL error here, we should break the probing loop and just bail out. This would suggest that a boolean is not the right type for this function to return. Perhaps something like negative error, 0 for skip and 1 for success would make sense? Also, how does it play with the _eglError() called from the error path of dri2_initialize_android()? 
> +} > + > +static int > +droid_probe_driver(_EGLDisplay *disp, int fd) > +{ > + struct dri2_egl_display *dri2_dpy = disp->DriverData; > + dri2_dpy->fd = fd; > + > + if (!droid_load_driver(disp)) > + return false; Given my other suggestion about distinguishing failure, render node skip and success, I think it should be more like this: int ret = droid_load_driver(disp); if (ret <= 0) return ret; Or actually, maybe we don't really need to go as far as loading the driver. I'd say it should be enough to just check if we have a driver for the device by looking at what loader_get_driver_for_fd() returns. (In that case, we can ignore my comment about returning error on loader_get_driver_for_fd() failure in droid_load_driver(), since the skipping would be handling only here.) > + > + /* Since this probe can succeed, but another filter may fail, What another filter could fail? I can see the vendor name being checked before calling this function. The free() below is actually needed, just the comment is off. We need to free the name to be able to probe remaining nodes, without leaking the name. > + this string needs to be deallocated either way. > + Once an FD has been found, this string will be set a second time. */ > + free(dri2_dpy->driver_name); Don't we also need to unload the driver? > + dri2_dpy->driver_name = NULL; > + return true; To match the change above: return 1; > +} > + > +static int > +droid_probe_device(_EGLDisplay *disp, int fd, drmDevicePtr dev, char *vendor) > +{ > + drmVersionPtr ver = drmGetVersion(fd); > + if (!ver) > + goto fail; Something wrong with indentation here. > + > + size_t vendor_len = strlen(vendor); > + if (vendor_len != 0 && strncmp(vendor, ver->name, vendor_len)) > + goto fail; Maybe it's just me, but I don't see any point in using strncmp() if the length argument is obtained by calling strlen() first. Especially if the strlen() call is on a string that comes from some external code (property_get()). 
Perhaps we could just call strncmp() with PROPERTY_VALUE_MAX? This would actually play nice with my other comment about using NULL for vendor string, if the property is not present. Also nit: The label could be named in a more meaningful way, e.g. err_free_version. > + > + if (!droid_probe_driver(disp, fd)) > + goto fail; > + > + drmFreeVersion(ver); > + return true; > + > +fail: > + drmFreeVersion(ver); > + return false; Given my other suggestion about distinguishing failure, render node skip and success, I think it should be more like this: ret = droid_probe_driver(disp, fd); err_free_version: drmFreeVersion(ver); return ret; > +} > + > +static int > +droid_open_device(_EGLDisplay *disp) > +{ > + const int MAX_DRM_DEVICES = 32; > + int prop_set, num_devices, ret; > + int fd = -1, fallbac
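
For reference, the return convention proposed in this review (negative for a hard error, 0 for "skip this node", 1 for success) could drive the probing loop roughly as sketched below. This is only an illustration: `probe_result`, `probe_node()` and `pick_node()` are hypothetical names, and the canned result table stands in for the real driver lookup, which is not part of the patch as posted.

```c
/* Hypothetical tri-state result following the convention suggested above. */
enum probe_result {
    PROBE_ERROR = -1, /* hard failure: stop probing and bail out */
    PROBE_SKIP  = 0,  /* no usable driver for this node: try the next one */
    PROBE_OK    = 1   /* node accepted */
};

/* Return codes of pick_node() that are not a node index (illustrative). */
#define PICK_NONE  (-1) /* every node was skipped */
#define PICK_ERROR (-2) /* a probe reported a hard error */

/* Stand-in for the per-device probe; it only replays canned results. */
int probe_node(const int *results, int index)
{
    return results[index];
}

/* Walk all nodes and return the index of the first accepted one. */
int pick_node(const int *results, int count)
{
    for (int i = 0; i < count; i++) {
        int ret = probe_node(results, i);

        if (ret < 0)
            return PICK_ERROR; /* error: do not look at further nodes */
        if (ret == 0)
            continue;          /* skip: keep probing */
        return i;              /* success */
    }
    return PICK_NONE;
}
```

The point of the third state is that "no driver for this node" no longer has to raise an EGL error, while a genuine failure still ends the loop immediately.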
Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs
Hi Rob, On Thu, May 24, 2018 at 8:23 PM Robert Foss wrote: > Hey, > I don't think I've received any feedback on this version yet. > If anyone has some time to spare, it would be nice to get it merged. Really sorry for taking so long to review. Posted my comments just now. Best regards, Tomasz > Just to be clear about the libdrm branch linked in the cover letter, > it is not required. Only for virgl platforms which happens to be what > I tested on. > Rob. > On 2018-05-11 15:47, Robert Foss wrote: > > This series replaces the dependency on > > GRALLOC_MODULE_PERFORM_GET_DRM_FD with DRM node > > probing and disables the support for drm_gralloc. > > > > The series has been tested on Qemu+AOSP, where a > > virtio gpu was successfully probed for and > > opened. > > > > This however required adding support in libdrm > > for virtio gpus, and virtio buses. An initial > > patch for this can be found here: > > > > https://gitlab.collabora.com/robertfoss/libdrm/tree/virtio_rfc > > > > Changes since v1: > > - Added fix for build issue > > - Do not rely on libdrm for probing > > - Distinguish between errors and when no drm devices are found > > > > Changes since RFC: > > - Rebased work on the libdrm patch [2]. > > - Included patch from Rob Herring disabling drm_gralloc/flink > > support by default. > > - Added device handler driver probing. > > > > > > Rob Herring (1): > >egl/android: #ifdef out flink name support > > > > Robert Foss (2): > >gallium/util: Fix build error due to cast to different size > >egl/android: Add DRM node probing and filtering > > > > src/egl/Android.mk| 6 +- > > src/egl/drivers/dri2/egl_dri2.h | 2 - > > src/egl/drivers/dri2/platform_android.c | 206 ++ > > .../auxiliary/util/u_debug_stack_android.cpp | 4 +- > > 4 files changed, 174 insertions(+), 44 deletions(-) > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs
Hey, On 2018-05-25 02:17, Rob Herring wrote: On Thu, May 24, 2018 at 6:23 AM, Robert Foss wrote: Hey, I don't think I've received any feedback on this version yet. If anyone has some time to spare, it would be nice to get it merged. Just to be clear about the libdrm branch linked in the cover letter, it is not required. Only for virgl platforms which happens to be what I tested on. virgl will still fallback to using the first render node without those libdrm changes, right? If not, I don't think we should apply until we're not breaking a platform... No it will not fall back. I agree that holding off makes more sense. Emil Velikov had some objections to the approach in the libdrm branch, and started a new branch from scratch with the same goals. It isn't yet fully functional, but I'm working with him to have it sent out as soon as possible. Rob. Rob Rob. On 2018-05-11 15:47, Robert Foss wrote: This series replaces the dependency on GRALLOC_MODULE_PERFORM_GET_DRM_FD with DRM node probing and disables the support for drm_gralloc. The series has been tested on Qemu+AOSP, where a virtio gpu was successfully probed for and opened. This however required adding support in libdrm for virtio gpus, and virtio buses. An initial patch for this can be found here: https://gitlab.collabora.com/robertfoss/libdrm/tree/virtio_rfc Changes since v1: - Added fix for build issue - Do not rely on libdrm for probing - Distinguish between errors and when no drm devices are found Changes since RFC: - Rebased work on the libdrm patch [2]. - Included patch from Rob Herring disabling drm_gralloc/flink support by default. - Added device handler driver probing. 
Rob Herring (1): egl/android: #ifdef out flink name support Robert Foss (2): gallium/util: Fix build error due to cast to different size egl/android: Add DRM node probing and filtering src/egl/Android.mk| 6 +- src/egl/drivers/dri2/egl_dri2.h | 2 - src/egl/drivers/dri2/platform_android.c | 206 ++ .../auxiliary/util/u_debug_stack_android.cpp | 4 +- 4 files changed, 174 insertions(+), 44 deletions(-) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs
On Fri, May 25, 2018 at 5:33 PM Robert Foss wrote:
> Hey,
>
> On 2018-05-25 02:17, Rob Herring wrote:
> > On Thu, May 24, 2018 at 6:23 AM, Robert Foss wrote:
> >> Hey,
> >>
> >> I don't think I've received any feedback on this version yet.
> >> If anyone has some time to spare, it would be nice to get it merged.
> >>
> >> Just to be clear about the libdrm branch linked in the cover letter,
> >> it is not required. Only for virgl platforms which happens to be what
> >> I tested on.
> >
> > virgl will still fall back to using the first render node without those
> > libdrm changes, right? If not, I don't think we should apply until
> > we're not breaking a platform...
>
> No it will not fall back. I agree that holding off makes more sense.

What's the reason for these problems? Is it because of drmGetDevices()?
Since we don't really use it for anything other than getting the list of
render nodes in the system, maybe we could just iterate over any
/dev/renderD* nodes explicitly and avoid introducing new problems?

Best regards,
Tomasz
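
A minimal sketch of what iterating over the render nodes explicitly might look like, without drmGetDevices(). It assumes the usual Linux layout, where render nodes appear as /dev/dri/renderD128 through /dev/dri/renderD191; the helper names `render_node_path()` and `open_first_render_node()` are made up for this illustration and do not come from the series.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>

/* Assumption: render nodes live in /dev/dri and use minors 128..191. */
#define RENDER_NODE_MIN 128
#define RENDER_NODE_MAX 191

/* Format the path of one render node. Returns 0 on success, -1 if the
 * minor is out of range or the buffer is too small. */
int render_node_path(char *buf, size_t len, int minor)
{
    if (minor < RENDER_NODE_MIN || minor > RENDER_NODE_MAX)
        return -1;

    int n = snprintf(buf, len, "/dev/dri/renderD%d", minor);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Open the first render node that can be opened; returns an fd or -1. */
int open_first_render_node(void)
{
    char path[64];

    for (int minor = RENDER_NODE_MIN; minor <= RENDER_NODE_MAX; minor++) {
        if (render_node_path(path, sizeof(path), minor) != 0)
            continue;

        int fd = open(path, O_RDWR | O_CLOEXEC);
        if (fd >= 0)
            return fd; /* the caller owns the fd and must close() it */
    }
    return -1;
}
```

A real implementation would still apply the vendor filter and driver probe to each opened fd before accepting it, rather than taking the first node that opens.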
Re: [Mesa-dev] [PATCH] docs: update release calendar for 18.1 series
On Tue, 2018-05-22 at 10:48 -0700, Dylan Baker wrote: > This looks good to me. I'm also opened to doing the 18.1.x releases if Emil > would rather not do them (and I have been updating my 18.1-proposed branch > either way). > I'm fine if you do 18.1.x releases. In fact, I think it would be a good idea if the same person takes care of a full release cycle, from feature releases to the stable releases. J.A. > Reviewed-by: Dylan Baker > > Quoting Juan A. Suarez Romero (2018-05-22 00:48:48) > > CC: Andres Gomez > > CC: Emil Velikov > > CC: Dylan Baker > > --- > > > > As per calendar 18.2.0rc1 starts after the last 18.1.x release, either > > we need to update the release calendar for 18.2 series, or extend 18.1 > > series. > > > > > > docs/release-calendar.html | 34 +- > > 1 file changed, 5 insertions(+), 29 deletions(-) > > > > diff --git a/docs/release-calendar.html b/docs/release-calendar.html > > index ba297532dc3..c67eea1a9de 100644 > > --- a/docs/release-calendar.html > > +++ b/docs/release-calendar.html > > @@ -46,50 +46,26 @@ if you'd like to nominate a patch in the next stable > > release. > > Last planned 18.0.x release > > > > > > -18.1 > > -2018-04-20 > > -18.1.0rc1 > > -Dylan Baker > > - > > - > > - > > -2018-04-27 > > -18.1.0rc2 > > -Dylan Baker > > - > > - > > - > > -2018-05-04 > > -18.1.0rc3 > > -Dylan Baker > > - > > - > > - > > -2018-05-11 > > -18.1.0rc4 > > -Dylan Baker > > -Last planned RC/Final release > > - > > - > > -TBD > > +18.1 > > +2018-06-01 > > 18.1.1 > > Emil Velikov > > > > > > > > -TBD > > +2018-06-15 > > 18.1.2 > > Emil Velikov > > > > > > > > -TBD > > +2018-06-29 > > 18.1.3 > > Emil Velikov > > > > > > > > -TBD > > +2018-07-13 > > 18.1.4 > > Emil Velikov > > Last planned RC/Final release > > -- > > 2.17.0 > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs
On 2018-05-25 10:38, Tomasz Figa wrote:
> On Fri, May 25, 2018 at 5:33 PM Robert Foss wrote:
>> Hey,
>>
>> On 2018-05-25 02:17, Rob Herring wrote:
>>> On Thu, May 24, 2018 at 6:23 AM, Robert Foss wrote:
>>>> Hey,
>>>>
>>>> I don't think I've received any feedback on this version yet.
>>>> If anyone has some time to spare, it would be nice to get it merged.
>>>>
>>>> Just to be clear about the libdrm branch linked in the cover letter,
>>>> it is not required. Only for virgl platforms which happens to be what
>>>> I tested on.
>>>
>>> virgl will still fall back to using the first render node without those
>>> libdrm changes, right? If not, I don't think we should apply until
>>> we're not breaking a platform...
>>
>> No it will not fall back. I agree that holding off makes more sense.
>
> What's the reason for these problems? Is it because of drmGetDevices()?
> Since we don't really use it for anything other than getting the list of
> render nodes in the system, maybe we could just iterate over any
> /dev/renderD* nodes explicitly and avoid introducing new problems?

That's exactly the problem, and yes, we could 100% solve it by iterating
over /dev/renderD* nodes.

I originally assumed we wouldn't want to do that, but rather use the
libdrm interfaces. But for the next spin I could avoid using libdrm;
should I?

Rob.

> Best regards,
> Tomasz
Re: [Mesa-dev] [PATCH 2/2] radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS
On 05/25/2018 04:28 AM, Timothy Arceri wrote: On 25/05/18 11:24, Bas Nieuwenhuizen wrote: On Fri, May 25, 2018 at 2:25 AM, Timothy Arceri wrote: From what I recall with my testing on radeonsi this wasn't really the ideal thing to do. Especially when varyings arrays are accessed via and indirect index, register use very quickly gets out of control. in radv we lower all indirect accesses in nir anyway, so that doesn't really happen in the backend anymore. Thats only for Polaris and higher though, and even then I thought that was an LLVM bug that should eventually be fixed? I don't know, I didn't hit this potential LLVM bug. On 23/05/18 22:31, Samuel Pitoiset wrote: Do not lower FS inputs because this moves all load_var instructions at beginning of shaders and because interp_var_at_sample (and friends) seem broken. That might be eventually enabled later on if we really want to preload all FS inputs at beginning. Polaris10: Totals from affected shaders: SGPRS: 54072 -> 54264 (0.36 %) VGPRS: 38580 -> 38124 (-1.18 %) Spilled SGPRs: 652 -> 652 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 2128116 -> 2127380 (-0.03 %) bytes Max Waves: 8048 -> 8086 (0.47 %) Vega10: Totals from affected shaders: SGPRS: 52616 -> 52656 (0.08 %) VGPRS: 37536 -> 37116 (-1.12 %) Spilled SGPRs: 828 -> 828 (0.00 %) Code Size: 2043756 -> 2042672 (-0.05 %) bytes Max Waves: 9176 -> 9254 (0.85 %) Signed-off-by: Samuel Pitoiset --- src/amd/vulkan/radv_shader.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c index 7ed5d2a421..84ad215ccb 100644 --- a/src/amd/vulkan/radv_shader.c +++ b/src/amd/vulkan/radv_shader.c @@ -278,6 +278,16 @@ radv_shader_compile_to_nir(struct radv_device *device, nir_lower_vars_to_ssa(nir); + if (nir->info.stage == MESA_SHADER_VERTEX || + nir->info.stage == MESA_SHADER_GEOMETRY) { + NIR_PASS_V(nir, nir_lower_io_to_temporaries, + nir_shader_get_entrypoint(nir), true, true); + } else if (nir->info.stage == 
MESA_SHADER_TESS_EVAL|| + nir->info.stage == MESA_SHADER_FRAGMENT) { + NIR_PASS_V(nir, nir_lower_io_to_temporaries, + nir_shader_get_entrypoint(nir), true, false); + } + nir_split_var_copies(nir); nir_lower_var_copies(nir); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS
On 25/05/18 19:57, Samuel Pitoiset wrote: On 05/25/2018 04:28 AM, Timothy Arceri wrote: On 25/05/18 11:24, Bas Nieuwenhuizen wrote: On Fri, May 25, 2018 at 2:25 AM, Timothy Arceri wrote: From what I recall with my testing on radeonsi this wasn't really the ideal thing to do. Especially when varyings arrays are accessed via and indirect index, register use very quickly gets out of control. in radv we lower all indirect accesses in nir anyway, so that doesn't really happen in the backend anymore. Thats only for Polaris and higher though, and even then I thought that was an LLVM bug that should eventually be fixed? I don't know, I didn't hit this potential LLVM bug. I just mean isn't that the only reason we lower indirect access for some varyings in RADV/radeonsi? Because of missing support in LLVM. On 23/05/18 22:31, Samuel Pitoiset wrote: Do not lower FS inputs because this moves all load_var instructions at beginning of shaders and because interp_var_at_sample (and friends) seem broken. That might be eventually enabled later on if we really want to preload all FS inputs at beginning. 
Polaris10: Totals from affected shaders: SGPRS: 54072 -> 54264 (0.36 %) VGPRS: 38580 -> 38124 (-1.18 %) Spilled SGPRs: 652 -> 652 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 2128116 -> 2127380 (-0.03 %) bytes Max Waves: 8048 -> 8086 (0.47 %) Vega10: Totals from affected shaders: SGPRS: 52616 -> 52656 (0.08 %) VGPRS: 37536 -> 37116 (-1.12 %) Spilled SGPRs: 828 -> 828 (0.00 %) Code Size: 2043756 -> 2042672 (-0.05 %) bytes Max Waves: 9176 -> 9254 (0.85 %) Signed-off-by: Samuel Pitoiset --- src/amd/vulkan/radv_shader.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c index 7ed5d2a421..84ad215ccb 100644 --- a/src/amd/vulkan/radv_shader.c +++ b/src/amd/vulkan/radv_shader.c @@ -278,6 +278,16 @@ radv_shader_compile_to_nir(struct radv_device *device, nir_lower_vars_to_ssa(nir); + if (nir->info.stage == MESA_SHADER_VERTEX || + nir->info.stage == MESA_SHADER_GEOMETRY) { + NIR_PASS_V(nir, nir_lower_io_to_temporaries, + nir_shader_get_entrypoint(nir), true, true); + } else if (nir->info.stage == MESA_SHADER_TESS_EVAL|| + nir->info.stage == MESA_SHADER_FRAGMENT) { + NIR_PASS_V(nir, nir_lower_io_to_temporaries, + nir_shader_get_entrypoint(nir), true, false); + } + nir_split_var_copies(nir); nir_lower_var_copies(nir); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS
On 25/05/18 20:40, Timothy Arceri wrote: On 25/05/18 19:57, Samuel Pitoiset wrote: On 05/25/2018 04:28 AM, Timothy Arceri wrote: On 25/05/18 11:24, Bas Nieuwenhuizen wrote: On Fri, May 25, 2018 at 2:25 AM, Timothy Arceri wrote: From what I recall with my testing on radeonsi this wasn't really the ideal thing to do. Especially when varyings arrays are accessed via and indirect index, register use very quickly gets out of control. in radv we lower all indirect accesses in nir anyway, so that doesn't really happen in the backend anymore. Thats only for Polaris and higher though, and even then I thought that was an LLVM bug that should eventually be fixed? I don't know, I didn't hit this potential LLVM bug. I just mean isn't that the only reason we lower indirect access for some varyings in RADV/radeonsi? Because of missing support in LLVM. Also if I'm recalling correctly I believe the tgsi radeonsi backend does something slightly better to work around that than what the NIR backend and RADV does. On 23/05/18 22:31, Samuel Pitoiset wrote: Do not lower FS inputs because this moves all load_var instructions at beginning of shaders and because interp_var_at_sample (and friends) seem broken. That might be eventually enabled later on if we really want to preload all FS inputs at beginning. 
Polaris10: Totals from affected shaders: SGPRS: 54072 -> 54264 (0.36 %) VGPRS: 38580 -> 38124 (-1.18 %) Spilled SGPRs: 652 -> 652 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 2128116 -> 2127380 (-0.03 %) bytes Max Waves: 8048 -> 8086 (0.47 %) Vega10: Totals from affected shaders: SGPRS: 52616 -> 52656 (0.08 %) VGPRS: 37536 -> 37116 (-1.12 %) Spilled SGPRs: 828 -> 828 (0.00 %) Code Size: 2043756 -> 2042672 (-0.05 %) bytes Max Waves: 9176 -> 9254 (0.85 %) Signed-off-by: Samuel Pitoiset --- src/amd/vulkan/radv_shader.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c index 7ed5d2a421..84ad215ccb 100644 --- a/src/amd/vulkan/radv_shader.c +++ b/src/amd/vulkan/radv_shader.c @@ -278,6 +278,16 @@ radv_shader_compile_to_nir(struct radv_device *device, nir_lower_vars_to_ssa(nir); + if (nir->info.stage == MESA_SHADER_VERTEX || + nir->info.stage == MESA_SHADER_GEOMETRY) { + NIR_PASS_V(nir, nir_lower_io_to_temporaries, + nir_shader_get_entrypoint(nir), true, true); + } else if (nir->info.stage == MESA_SHADER_TESS_EVAL|| + nir->info.stage == MESA_SHADER_FRAGMENT) { + NIR_PASS_V(nir, nir_lower_io_to_temporaries, + nir_shader_get_entrypoint(nir), true, false); + } + nir_split_var_copies(nir); nir_lower_var_copies(nir); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 01/16] Added ci yaml file for Gitlab.
On Thursday, 2018-05-24 17:27:04 -0700, Laura Ekstrand wrote:
> For now, all this does is copy our current webpage into a public folder.
> Daniel Stone has the server configured to check this public folder and
> host the index.html as mesa-test.freedesktop.org. When this patch series
> is approved, Daniel will change it to point at mesa-3d.org.
> ---
>  .gitlab-ci.yml | 9 +
>  1 file changed, 9 insertions(+)
>  create mode 100644 .gitlab-ci.yml
>
> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> new file mode 100644
> index 00..29b30541b5
> --- /dev/null
> +++ b/.gitlab-ci.yml
> @@ -0,0 +1,9 @@
> +pages:
> +  stage: deploy
> +  script:
> +  - mkdir .public
> +  - cp -r docs/* .public
> +  - mv .public public

I don't think the two-step approach is needed here; you can drop .public
and put everything in public directly.

If I'm misunderstanding gitlab-ci and this is running on the same
filesystem as the website, then you'll need to `rm -r public` before the
move, otherwise `mv .public public` will not do what you want :)

> +  artifacts:
> +    paths:
> +    - public
> --
> 2.14.3
[Mesa-dev] [Bug 105464] Reading per-patch outputs in Tessellation Control Shader returns undefined values
https://bugs.freedesktop.org/show_bug.cgi?id=105464

--- Comment #16 from Samuel Pitoiset ---
No CTS regressions, I'm fine with it.
Re: [Mesa-dev] [PATCH 02/16] docs: Add python script that converts html to rst.
On Thursday, 2018-05-24 17:27:05 -0700, Laura Ekstrand wrote:
> Use Beautiful Soup to fix bad html, then use pandoc for converting to
> rst.
> ---
>  docs/rstConverter.py | 23 +++
>  1 file changed, 23 insertions(+)
>  create mode 100755 docs/rstConverter.py
>
> diff --git a/docs/rstConverter.py b/docs/rstConverter.py
> new file mode 100755
> index 00..5321fdde8b
> --- /dev/null
> +++ b/docs/rstConverter.py
> @@ -0,0 +1,23 @@
> +#!/usr/bin/python3
> +import glob
> +import subprocess
> +from bs4 import BeautifulSoup
> +
> +pages = glob.glob("*.html")
> +pages += glob.glob("relnotes/*.html")
> +for filename in pages:
> +    # Fix some annoyingly bad html.
> +    with open(filename) as f:
> +        soup = BeautifulSoup(f, 'html5lib')
> +    soup.find("div", "header").extract() # Get rid of old header
> +    soup.iframe.extract() # Get rid of old contents bar.
> +    soup.find("div", "content").unwrap() # Strip the content div.

Good call on using beautifulsoup to clean the html before converting it!

> +
> +    # Write out the better html.
> +    with open(filename, 'wt') as f:
> +        f.write(str(soup))
> +
> +    # Convert to rst with pandoc.
> +    name = filename.split(".html")[0]
> +    bashCmd = "pandoc " + filename + " -o " + name + ".rst"
> +    subprocess.run(bashCmd.split())

Idea: remove the old html in the same commit that introduces the rst, so
that git picks it up as a rename with changes, which hopefully would be
easier to check as a 1:1 of any given conversion?

(In case this is as unclear as I think it is, I'm thinking about how we
can review individual page conversions; say index.html -> index.rst, to
see that no release has been dropped in the process. If git shows this as
a rename with changes, I expect it will be easier to check than if one
commit creates all the rst files and another deletes all the html.)
[Mesa-dev] [Bug 105464] Reading per-patch outputs in Tessellation Control Shader returns undefined values
https://bugs.freedesktop.org/show_bug.cgi?id=105464

--- Comment #17 from Nicolai Hähnle ---
Great, thanks for testing!
Re: [Mesa-dev] [PATCH 01/16] Added ci yaml file for Gitlab.
Hi Eric,

On 25 May 2018 at 12:15, Eric Engestrom wrote:
> If I'm misunderstanding gitlab-ci and this is running on the same
> filesystem as the website, then you'll need to `rm -r public` before the
> move, otherwise `mv .public public` will not do what you want :)

It's always run in a fresh container, and the public/ directory is
captured from that and installed later behind the scenes. So there's no
need to do it here.

Cheers,
Daniel
Re: [Mesa-dev] [PATCH 2/2] radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS
On Fri, May 25, 2018 at 12:45 PM, Timothy Arceri wrote:
> On 25/05/18 20:40, Timothy Arceri wrote:
>> On 25/05/18 19:57, Samuel Pitoiset wrote:
>>> On 05/25/2018 04:28 AM, Timothy Arceri wrote:
>>>> On 25/05/18 11:24, Bas Nieuwenhuizen wrote:
>>>>> On Fri, May 25, 2018 at 2:25 AM, Timothy Arceri wrote:
>>>>>> From what I recall with my testing on radeonsi this wasn't really
>>>>>> the ideal thing to do. Especially when varying arrays are accessed
>>>>>> via an indirect index, register use very quickly gets out of
>>>>>> control.
>>>>>
>>>>> in radv we lower all indirect accesses in nir anyway, so that doesn't
>>>>> really happen in the backend anymore.
>>>>
>>>> That's only for Polaris and higher though, and even then I thought
>>>> that was an LLVM bug that should eventually be fixed?
>>>
>>> I don't know, I didn't hit this potential LLVM bug.
>>
>> I just mean isn't that the only reason we lower indirect access for some
>> varyings in RADV/radeonsi? Because of missing support in LLVM.
>
> Also if I'm recalling correctly I believe the tgsi radeonsi backend does
> something slightly better to work around that than what the NIR backend
> and RADV do.

So for Vega+ we lower indirect indexing for everything because it is
utterly broken in LLVM. For the other GPUs we lower locals, as large
vectors + spilling = nightmare.

radeonsi solves it by explicitly putting the large arrays in memory. That
way you load only one value on an indirect deref instead of loading the
entire array -> doing the indirect deref -> spilling the entire array.

>>>>>> On 23/05/18 22:31, Samuel Pitoiset wrote:
>>>>>>> Do not lower FS inputs because this moves all load_var
>>>>>>> instructions at beginning of shaders and because
>>>>>>> interp_var_at_sample (and friends) seem broken. That might
>>>>>>> be eventually enabled later on if we really want to preload
>>>>>>> all FS inputs at beginning.
>>>>>>>
>>>>>>> Polaris10:
>>>>>>> Totals from affected shaders:
>>>>>>> SGPRS: 54072 -> 54264 (0.36 %)
>>>>>>> VGPRS: 38580 -> 38124 (-1.18 %)
>>>>>>> Spilled SGPRs: 652 -> 652 (0.00 %)
>>>>>>> Spilled VGPRs: 0 -> 0 (0.00 %)
>>>>>>> Code Size: 2128116 -> 2127380 (-0.03 %) bytes
>>>>>>> Max Waves: 8048 -> 8086 (0.47 %)
>>>>>>>
>>>>>>> Vega10:
>>>>>>> Totals from affected shaders:
>>>>>>> SGPRS: 52616 -> 52656 (0.08 %)
>>>>>>> VGPRS: 37536 -> 37116 (-1.12 %)
>>>>>>> Spilled SGPRs: 828 -> 828 (0.00 %)
>>>>>>> Code Size: 2043756 -> 2042672 (-0.05 %) bytes
>>>>>>> Max Waves: 9176 -> 9254 (0.85 %)
>>>>>>>
>>>>>>> Signed-off-by: Samuel Pitoiset
>>>>>>> ---
>>>>>>>  src/amd/vulkan/radv_shader.c | 10 ++
>>>>>>>  1 file changed, 10 insertions(+)
>>>>>>>
>>>>>>> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
>>>>>>> index 7ed5d2a421..84ad215ccb 100644
>>>>>>> --- a/src/amd/vulkan/radv_shader.c
>>>>>>> +++ b/src/amd/vulkan/radv_shader.c
>>>>>>> @@ -278,6 +278,16 @@ radv_shader_compile_to_nir(struct radv_device *device,
>>>>>>>         nir_lower_vars_to_ssa(nir);
>>>>>>> +       if (nir->info.stage == MESA_SHADER_VERTEX ||
>>>>>>> +           nir->info.stage == MESA_SHADER_GEOMETRY) {
>>>>>>> +               NIR_PASS_V(nir, nir_lower_io_to_temporaries,
>>>>>>> +                          nir_shader_get_entrypoint(nir), true, true);
>>>>>>> +       } else if (nir->info.stage == MESA_SHADER_TESS_EVAL||
>>>>>>> +                  nir->info.stage == MESA_SHADER_FRAGMENT) {
>>>>>>> +               NIR_PASS_V(nir, nir_lower_io_to_temporaries,
>>>>>>> +                          nir_shader_get_entrypoint(nir), true, false);
>>>>>>> +       }
>>>>>>> +
>>>>>>>         nir_split_var_copies(nir);
>>>>>>>         nir_lower_var_copies(nir);
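
As a purely conceptual illustration of what "lowering indirect indexing" means in this thread, the sketch below contrasts a dynamically indexed array access with an equivalent chain of selects that only uses constant indices. It is plain C, not NIR, and the function names are invented for the example; the actual lowering pass and the code LLVM generates will look different.

```c
/* Indirect form: the index is only known at run time. If the array is
 * held in registers, the backend has to cope with a dynamic register
 * index (or spill the whole array). */
float load_indirect(const float v[4], unsigned idx)
{
    return v[idx & 3u];
}

/* Lowered form: every array access uses a constant index, and the
 * run-time index only chooses between already-loaded values. */
float load_lowered(const float v[4], unsigned idx)
{
    float r = v[0];

    idx &= 3u;
    r = (idx == 1u) ? v[1] : r;
    r = (idx == 2u) ? v[2] : r;
    r = (idx == 3u) ? v[3] : r;
    return r;
}
```

The lowered form reads every element, which is why it scales badly for large arrays; keeping the array in memory and loading a single element, as described above for radeonsi, avoids that cost.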
[Mesa-dev] [Bug 98581] Dota 2 graphics glitch on autocast abilities.
https://bugs.freedesktop.org/show_bug.cgi?id=98581

Samuel Pitoiset changed:

           What    |Removed |Added
----------------------------------------
         Status    |NEW     |RESOLVED
     Resolution    |---     |WORKSFORME

--- Comment #1 from Samuel Pitoiset ---
I don't think this can still be reproduced. Dota2 and RADV have evolved a lot
since the original bug report. I'm going to close it. Feel free to re-open if
I'm wrong (and explain how to reproduce).
[Mesa-dev] [PATCH 2/3] radv: allow radv_emit_shader_pointer_head() to emit more pointers
Signed-off-by: Samuel Pitoiset --- src/amd/vulkan/radv_private.h | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h index e554fc7acc..708cacf770 100644 --- a/src/amd/vulkan/radv_private.h +++ b/src/amd/vulkan/radv_private.h @@ -1132,9 +1132,11 @@ bool radv_get_memory_fd(struct radv_device *device, static inline void radv_emit_shader_pointer_head(struct radeon_winsys_cs *cs, - unsigned sh_offset, bool use_32bit_pointers) + unsigned sh_offset, unsigned pointer_count, + bool use_32bit_pointers) { - radeon_set_sh_reg_seq(cs, sh_offset, use_32bit_pointers ? 1 : 2); + radeon_emit(cs, PKT3(PKT3_SET_SH_REG, pointer_count * (use_32bit_pointers ? 1 : 2), 0)); + radeon_emit(cs, (sh_offset - SI_SH_REG_OFFSET) >> 2); } static inline void @@ -1159,7 +1161,7 @@ radv_emit_shader_pointer(struct radv_device *device, { bool use_32bit_pointers = HAVE_32BIT_POINTERS && !global; - radv_emit_shader_pointer_head(cs, sh_offset, use_32bit_pointers); + radv_emit_shader_pointer_head(cs, sh_offset, 1, use_32bit_pointers); radv_emit_shader_pointer_body(device, cs, va, use_32bit_pointers); } -- 2.17.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
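The arithmetic behind the new header can be sketched in isolation. This is a minimal illustration under stated assumptions, not the radv code: the helper names are hypothetical, and the value of `SKETCH_SH_REG_OFFSET` is an assumption standing in for `SI_SH_REG_OFFSET`, which should be checked against the real headers. The first helper computes the payload size passed to the packet header (one dword per 32-bit pointer, two per 64-bit pointer); the second computes the dword register index that follows the header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed base of the SH register range; stand-in for SI_SH_REG_OFFSET. */
#define SKETCH_SH_REG_OFFSET 0x0000B000u

/* Number of payload dwords for pointer_count consecutive pointers. */
static unsigned sketch_payload_dwords(unsigned pointer_count,
                                      bool use_32bit_pointers)
{
    return pointer_count * (use_32bit_pointers ? 1u : 2u);
}

/* Dword index of the first register, relative to the SH register base. */
static uint32_t sketch_sh_reg_index(uint32_t sh_offset)
{
    return (sh_offset - SKETCH_SH_REG_OFFSET) >> 2;
}
```

With `pointer_count` fixed at 1 the payload size matches what the old `use_32bit_pointers ? 1 : 2` expression produced, which is why the existing caller only needs the extra argument.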
[Mesa-dev] [PATCH 1/3] radv: split radv_emit_shader_pointer()
This will allow to emit consecutive shader pointers for reducing the number of emitted SET_SH_REG packets, which is recommended. Signed-off-by: Samuel Pitoiset --- src/amd/vulkan/radv_private.h | 25 - 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h index e2fa58d8d1..e554fc7acc 100644 --- a/src/amd/vulkan/radv_private.h +++ b/src/amd/vulkan/radv_private.h @@ -1131,13 +1131,17 @@ bool radv_get_memory_fd(struct radv_device *device, int *pFD); static inline void -radv_emit_shader_pointer(struct radv_device *device, -struct radeon_winsys_cs *cs, -uint32_t sh_offset, uint64_t va, bool global) +radv_emit_shader_pointer_head(struct radeon_winsys_cs *cs, + unsigned sh_offset, bool use_32bit_pointers) { - bool use_32bit_pointers = HAVE_32BIT_POINTERS && !global; - radeon_set_sh_reg_seq(cs, sh_offset, use_32bit_pointers ? 1 : 2); +} + +static inline void +radv_emit_shader_pointer_body(struct radv_device *device, + struct radeon_winsys_cs *cs, + uint64_t va, bool use_32bit_pointers) +{ radeon_emit(cs, va); if (use_32bit_pointers) { @@ -1148,6 +1152,17 @@ radv_emit_shader_pointer(struct radv_device *device, } } +static inline void +radv_emit_shader_pointer(struct radv_device *device, +struct radeon_winsys_cs *cs, +uint32_t sh_offset, uint64_t va, bool global) +{ + bool use_32bit_pointers = HAVE_32BIT_POINTERS && !global; + + radv_emit_shader_pointer_head(cs, sh_offset, use_32bit_pointers); + radv_emit_shader_pointer_body(device, cs, va, use_32bit_pointers); +} + static inline struct radv_descriptor_state * radv_get_descriptors_state(struct radv_cmd_buffer *cmd_buffer, VkPipelineBindPoint bind_point) -- 2.17.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] radv: emit shader descriptor pointers consecutively
This reduces the number of SET_SH_REG packets which are emitted for applications that use more than one descriptor set per stage. We should be able to emit more SET_SH_REG packets consecutively (like push constants and vertex buffers for the vertex stage), but this will be improved later. Signed-off-by: Samuel Pitoiset --- src/amd/vulkan/radv_cmd_buffer.c | 104 +-- 1 file changed, 57 insertions(+), 47 deletions(-) diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c index 5ab577b4c5..206d9b7fad 100644 --- a/src/amd/vulkan/radv_cmd_buffer.c +++ b/src/amd/vulkan/radv_cmd_buffer.c @@ -594,6 +594,46 @@ radv_emit_userdata_address(struct radv_cmd_buffer *cmd_buffer, base_reg + loc->sgpr_idx * 4, va, false); } +static void +radv_emit_descriptor_pointers(struct radv_cmd_buffer *cmd_buffer, + struct radv_pipeline *pipeline, + struct radv_descriptor_state *descriptors_state, + gl_shader_stage stage) +{ + struct radv_device *device = cmd_buffer->device; + struct radeon_winsys_cs *cs = cmd_buffer->cs; + uint32_t sh_base = pipeline->user_data_0[stage]; + struct radv_userdata_locations *locs = + &pipeline->shaders[stage]->info.user_sgprs_locs; + unsigned mask; + + mask = descriptors_state->dirty & descriptors_state->valid; + + for (int i = 0; i < MAX_SETS; i++) { + struct radv_userdata_info *loc = &locs->descriptor_sets[i]; + if (loc->sgpr_idx != -1 && !loc->indirect) + continue; + mask &= ~(1 << i); + } + + while (mask) { + int start, count; + + u_bit_scan_consecutive_range(&mask, &start, &count); + + struct radv_userdata_info *loc = &locs->descriptor_sets[start]; + unsigned sh_offset = sh_base + loc->sgpr_idx * 4; + + radv_emit_shader_pointer_head(cs, sh_offset, count, true); + for (int i = 0; i < count; i++) { + struct radv_descriptor_set *set = + descriptors_state->sets[start + i]; + + radv_emit_shader_pointer_body(device, cs, set->va, true); + } + } +} + static void radv_update_multisample_state(struct radv_cmd_buffer *cmd_buffer, struct 
radv_pipeline *pipeline) @@ -1429,47 +1469,6 @@ radv_cmd_buffer_flush_dynamic_state(struct radv_cmd_buffer *cmd_buffer) cmd_buffer->state.dirty &= ~states; } -static void -emit_stage_descriptor_set_userdata(struct radv_cmd_buffer *cmd_buffer, - struct radv_pipeline *pipeline, - int idx, - uint64_t va, - gl_shader_stage stage) -{ - struct radv_userdata_info *desc_set_loc = &pipeline->shaders[stage]->info.user_sgprs_locs.descriptor_sets[idx]; - uint32_t base_reg = pipeline->user_data_0[stage]; - - if (desc_set_loc->sgpr_idx == -1 || desc_set_loc->indirect) - return; - - assert(!desc_set_loc->indirect); - assert(desc_set_loc->num_sgprs == (HAVE_32BIT_POINTERS ? 1 : 2)); - - radv_emit_shader_pointer(cmd_buffer->device, cmd_buffer->cs, -base_reg + desc_set_loc->sgpr_idx * 4, va, false); -} - -static void -radv_emit_descriptor_set_userdata(struct radv_cmd_buffer *cmd_buffer, - VkShaderStageFlags stages, - struct radv_descriptor_set *set, - unsigned idx) -{ - if (cmd_buffer->state.pipeline) { - radv_foreach_stage(stage, stages) { - if (cmd_buffer->state.pipeline->shaders[stage]) - emit_stage_descriptor_set_userdata(cmd_buffer, cmd_buffer->state.pipeline, - idx, set->va, - stage); - } - } - - if (cmd_buffer->state.compute_pipeline && (stages & VK_SHADER_STAGE_COMPUTE_BIT)) - emit_stage_descriptor_set_userdata(cmd_buffer, cmd_buffer->state.compute_pipeline, - idx, set->va, - MESA_SHADER_COMPUTE); -} - static void radv_flush_push_descriptors(struct radv_cmd_buffer *cmd_buffer, VkPipelineBindPoint bind_point) @@ -1551,7 +1550,6 @@ radv_flush_descriptors(struct radv_cmd_buffer *cmd_buffer, VK_PIPELINE_BIND_POINT_GRAPHICS; struct radv_descriptor_state *descriptors_state = radv_get_descriptors_state(cmd_buffer, bind_point); -
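The loop in radv_emit_descriptor_pointers() relies on pulling runs of adjacent set bits out of the dirty mask, so that each run becomes one packet. The sketch below shows that idea on its own. It is a hypothetical stand-in written for this note, not Mesa's u_bit_scan_consecutive_range(), and the names `sketch_scan_consecutive_range` and `sketch_count_packets` are invented here.

```c
#include <stdint.h>

/* Pop the lowest run of consecutive set bits from *mask.
 * The caller must pass a non-zero mask. */
static void sketch_scan_consecutive_range(uint32_t *mask, int *start, int *count)
{
    uint32_t m = *mask;
    int s = 0, c = 0;

    while (!(m & 1u)) {   /* skip the clear bits below the run */
        m >>= 1;
        s++;
    }
    while (m & 1u) {      /* measure the run itself */
        m >>= 1;
        c++;
    }

    *start = s;
    *count = c;

    /* Clear the run; a full 32-bit run is handled separately to avoid an
     * out-of-range shift. */
    if (c == 32)
        *mask = 0;
    else
        *mask &= ~(((1u << c) - 1u) << s);
}

/* One packet per consecutive run of dirty descriptor sets. */
static int sketch_count_packets(uint32_t dirty_mask)
{
    int packets = 0;

    while (dirty_mask) {
        int start, count;

        sketch_scan_consecutive_range(&dirty_mask, &start, &count);
        packets++;
    }
    return packets;
}
```

For example, sets 0, 1 and 2 being dirty form a single run and therefore one packet, whereas sets 0, 1 and 3 form two runs and need two packets.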
Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs
On Fri, May 25, 2018 at 4:15 AM, Robert Foss wrote: > > > On 2018-05-25 10:38, Tomasz Figa wrote: >> >> On Fri, May 25, 2018 at 5:33 PM Robert Foss >> wrote: >> >>> Hey, >> >> >>> On 2018-05-25 02:17, Rob Herring wrote: On Thu, May 24, 2018 at 6:23 AM, Robert Foss >> >> wrote: > > Hey, > > I don't think I've received any feedback on this version yet. > If anyone has some time to spare, it would be nice to get it merged. > > Just to be clear about the libdrm branch linked in the cover letter, > it is not required. Only for virgl platforms which happens to be what > I tested on. virgl will still fallback to using the first render node without those libdrm changes, right? If not, I don't think we should apply until we're not breaking a platform... >> >> >>> No it will not fall back. I agree that holding off makes more sense. >> >> >> What's the reason of this problems? Is it because of drmGetDevices()? >> Since >> we don't really use it for anything other than getting the list of render >> nodes in the system, maybe we could just iterate over any /dev/renderD* >> nodes explicitly and avoid introducing new problems? > > > That's exactly the problem, and yes we could 100% solve by iterating over > /dev/renderD* nodes. I originally assumed we wouldn't want to do that, but > rather use the libdrm interfaces. > > But for the next spin I could avoid using libdrm, should I? I don't have an opinion on libdrm really, but I do think we should fallback to the 1st (only) render node rather than just fail. Rob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs
On Fri, May 25, 2018 at 10:59 PM Rob Herring wrote: > On Fri, May 25, 2018 at 4:15 AM, Robert Foss wrote: > > > > > > On 2018-05-25 10:38, Tomasz Figa wrote: > >> > >> On Fri, May 25, 2018 at 5:33 PM Robert Foss > >> wrote: > >> > >>> Hey, > >> > >> > >>> On 2018-05-25 02:17, Rob Herring wrote: > > On Thu, May 24, 2018 at 6:23 AM, Robert Foss < robert.f...@collabora.com> > >> > >> wrote: > > > > Hey, > > > > I don't think I've received any feedback on this version yet. > > If anyone has some time to spare, it would be nice to get it merged. > > > > Just to be clear about the libdrm branch linked in the cover letter, > > it is not required. Only for virgl platforms which happens to be what > > I tested on. > > > virgl will still fallback to using the first render node without those > libdrm changes, right? If not, I don't think we should apply until > we're not breaking a platform... > >> > >> > >>> No it will not fall back. I agree that holding off makes more sense. > >> > >> > >> What's the reason of this problems? Is it because of drmGetDevices()? > >> Since > >> we don't really use it for anything other than getting the list of render > >> nodes in the system, maybe we could just iterate over any /dev/renderD* > >> nodes explicitly and avoid introducing new problems? > > > > > > That's exactly the problem, and yes we could 100% solve by iterating over > > /dev/renderD* nodes. I originally assumed we wouldn't want to do that, but > > rather use the libdrm interfaces. > > > > But for the next spin I could avoid using libdrm, should I? > I don't have an opinion on libdrm really, but I do think we should > fallback to the 1st (only) render node rather than just fail. We do, even with libdrm. AFAICT, the problem with virgl seems to be that drmGetDevices() doesn't include devices on virtio bus in the results, which means that there likely wouldn't be any render node returned. 
Best regards,
Tomasz
Re: [Mesa-dev] [PATCH 12/16] docs: Fix Sphinx compile errors.
On Thursday, 2018-05-24 17:27:15 -0700, Laura Ekstrand wrote: > This just involves some quick fixes to formatting of the affected pages. > --- > docs/autoconf.rst| 1 + > docs/conf.py | 2 +- > docs/dispatch.rst| 72 > ++-- > docs/egl.rst | 2 ++ > docs/releasing.rst | 14 +- > docs/relnotes.rst| 72 > +++- > docs/relnotes/17.0.5.rst | 2 +- > docs/relnotes/9.2.2.rst | 1 - > 8 files changed, 86 insertions(+), 80 deletions(-) > > diff --git a/docs/autoconf.rst b/docs/autoconf.rst > index 007252feb0..25ba71cf66 100644 > --- a/docs/autoconf.rst > +++ b/docs/autoconf.rst > @@ -102,6 +102,7 @@ There are also a few general options for altering the > Mesa build: > This option ensures that assembly will not be used. > > ``--build=`` > +.. See host > ``--host=`` > By default, the build will compile code for the architecture that > it's running on. In order to build cross-compile Mesa on a x86-64 > diff --git a/docs/conf.py b/docs/conf.py > index dcdbdd51db..33bf717a87 100644 > --- a/docs/conf.py > +++ b/docs/conf.py > @@ -99,7 +99,7 @@ html_theme = 'sphinx_rtd_theme' > # Add any paths that contain custom static files (such as style sheets) here, > # relative to this directory. They are copied after the builtin static files, > # so a file named "default.css" will overwrite the builtin "default.css". > -html_static_path = ['_static'] > +html_static_path = [] > > > # -- Options for HTMLHelp output -- > diff --git a/docs/dispatch.rst b/docs/dispatch.rst > index d6f8542c68..aba7192c31 100644 > --- a/docs/dispatch.rst > +++ b/docs/dispatch.rst > @@ -62,18 +62,17 @@ conceptually simple: > This can be implemented in just a few lines of C code. The file > ``src/mesa/glapi/glapitemp.h`` contains code very similar to this. 
> > - > +--+ > -| :: > | > -| > | > -| void glVertex3f(GLfloat x, GLfloat y, GLfloat z) > | > -| { > | > -| const struct _glapi_table * const dispatch = GET_DISPATCH(); > | > -| > | > -| (*dispatch->Vertex3f)(x, y, z); > | > -| } > | > - > +--+ > -| Sample dispatch function > | > - > +--+ > +Sample dispatch function > + > + > +.. code-block:: c > + > + void glVertex3f(GLfloat x, GLfloat y, GLfloat z) > + { > + const struct _glapi_table * const dispatch = GET_DISPATCH(); > + > + (*dispatch->Vertex3f)(x, y, z); > + } > > The problem with this simple implementation is the large amount of > overhead that it adds to every GL function call. > @@ -118,16 +117,14 @@ resulting implementation of ``GET_DISPATCH`` is > slightly more complex, > but it avoids the expensive ``pthread_getspecific`` call in the common > case. > > - > +--+ > -| :: > | > -| > | > -| #define GET_DISPATCH() \ > | > -| (_glapi_Dispatch != NULL) \ > | > -| ? _glapi_Dispatch : > pthread_getspecific(&_glapi_Dispatch_key | > -| ) > | > - > +--+ > -| Improved ``GET_DISPATCH`` Implementation > | > - > +--+ > +Improved ``GET_DISPATCH`` Implementation > + > +.. code-block:: c > + > +#define GET_DISPATCH() \ > +(_glapi_Dispatch != NULL) \ > +? _glapi_Dispatch : pthread_getspecific(&_glapi_Dispatch_key) > + > > 3.2. ELF TLS > > @@ -145,16 +142,14 @@ with direct rendering drivers that use either > interface. Once the > pointer is properly declared, ``GET_DISPACH`` becomes a simple variable > reference. > > - > +-
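The dispatch pattern quoted in this patch can be tried out with a small standalone sketch. It is an illustration only and not Mesa's glapi code: all names are invented, and a C11 `_Thread_local` variable is used as the slow-path fallback in place of the `pthread_getspecific` call shown in the documentation, purely to keep the example free of extra library dependencies.

```c
#include <stddef.h>

/* A one-entry dispatch table standing in for the real API table. */
struct sketch_dispatch_table {
    int (*add_one)(int);
};

/* Fast path: non-NULL while only one thread uses the library. */
static struct sketch_dispatch_table *sketch_global_dispatch;

/* Fallback: per-thread table (substitute for pthread_getspecific). */
static _Thread_local struct sketch_dispatch_table *sketch_thread_dispatch;

#define SKETCH_GET_DISPATCH() \
    (sketch_global_dispatch != NULL ? sketch_global_dispatch \
                                    : sketch_thread_dispatch)

static int sketch_add_one_impl(int x)
{
    return x + 1;
}

static struct sketch_dispatch_table sketch_table = { sketch_add_one_impl };

/* Public entry point: look up the current table, then call through it. */
static int sketch_add_one(int x)
{
    const struct sketch_dispatch_table *const dispatch = SKETCH_GET_DISPATCH();

    return (*dispatch->add_one)(x);
}
```

The entry point has the same shape as the glVertex3f sample in the patch: fetch the table pointer, then make an indirect call through it.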
Re: [Mesa-dev] [PATCH 15/16] docs: Human edits to the website code for clarity.
On Thursday, 2018-05-24 17:27:18 -0700, Laura Ekstrand wrote: > There's a lot here. If you're interested, it's mostly whitespace fixes, > switching variable names and function names to the Sphinx orange variable > highlight style, and naming code blocks to take advantage of Pygments > syntax highlighting. > --- > docs/application-issues.rst | 8 +- > docs/autoconf.rst | 9 +- > docs/codingstyle.rst| 36 +++ > docs/conf.py| 2 +- > docs/conform.rst| 2 +- > docs/debugging.rst | 12 +-- > docs/devinfo.rst| 26 ++--- > docs/download.rst | 6 +- > docs/egl.rst| 2 +- > docs/extensions.rst | 42 > docs/faq.rst| 38 +++ > docs/helpwanted.rst | 14 +-- > docs/index.rst | 240 > +++- > docs/install.rst| 64 ++-- > docs/intro.rst | 124 +++ > docs/license.rst| 12 +-- > docs/llvmpipe.rst | 65 +--- > docs/mangling.rst | 4 +- > docs/meson.rst | 18 ++-- > docs/osmesa.rst | 12 +-- > docs/perf.rst | 85 +++- > docs/postprocess.rst| 11 +- > docs/precompiled.rst| 6 +- > docs/release-calendar.rst | 158 ++--- > docs/releasing.rst | 158 ++--- > docs/repository.rst | 59 ++- > docs/shading.rst| 99 -- > docs/sourcetree.rst | 12 +-- > docs/submittingpatches.rst | 123 --- > docs/thanks.rst | 2 +- > docs/versions.rst | 8 +- > docs/viewperf.rst | 94 - > docs/vmware-guest.rst | 146 +-- > docs/xlibdriver.rst | 60 +-- > 34 files changed, 882 insertions(+), 875 deletions(-) > [snip] > diff --git a/docs/conf.py b/docs/conf.py > index 33bf717a87..c6eac2394d 100644 > --- a/docs/conf.py > +++ b/docs/conf.py > @@ -99,7 +99,7 @@ html_theme = 'sphinx_rtd_theme' > # Add any paths that contain custom static files (such as style sheets) here, > # relative to this directory. They are copied after the builtin static files, > # so a file named "default.css" will overwrite the builtin "default.css". > -html_static_path = [] > +html_static_path = ['specs/'] Any reason not to do this right away when creating the file in patch 6? :) That way the corresponding hunk in patch 12 is not necessary either. 
Re: [Mesa-dev] [PATCH 16/16] docs: Remove unneeded mesa css file.
On Thursday, 2018-05-24 17:27:19 -0700, Laura Ekstrand wrote: > Goodbye old css file. You belong in 1999 from whence you came. > --- > docs/mesa.css | 63 > --- > 1 file changed, 63 deletions(-) > delete mode 100644 docs/mesa.css I guess this could be deleted at the same time as the html files, but it doesn't really matter. I'm quite happy with the new website with its default theme already; we can always spend ages debating the theme style later, right now I'd love for this to land as soon as we start using gitlab :) For patch 1 (the yaml file), with or without my comment (can be done later), patch 2 (the python conversion script, which btw I guess we should probably delete once the conversion is done), patch 7 (the sphinx-build yml line), and 12-16 are: Reviewed-by: Eric Engestrom The rest of the series is: Acked-by: Eric Engestrom Thank you very much for finishing the task many of us gave a shot at, but didn't carry through! > > diff --git a/docs/mesa.css b/docs/mesa.css > deleted file mode 100644 > index 7ab8152b04..00 > --- a/docs/mesa.css > +++ /dev/null > @@ -1,63 +0,0 @@ > -/* Mesa CSS */ > -body { > - background-color: #ff; > - font: 14px 'Lucida Grande', Geneva, Arial, Verdana, sans-serif; > - color: black; > - link: #88; > -} > - > -h1 { > - font: 24px 'Lucida Grande', Geneva, Arial, Verdana, sans-serif; > - font-weight: bold; > - color: black; > -} > - > -h2 { > - font: 18px 'Lucida Grande', Geneva, Arial, Verdana, sans-serif, bold; > - font-weight: bold; > - color: black; > -} > - > -code { > - font-family: monospace; > - font-size: 10pt; > - color: black; > -} > - > - > -pre { > - /*font-family: monospace;*/ > - font-size: 10pt; > - /*color: black;*/ > -} > - > -iframe { > - width: 19em; > - height: 80em; > - border: none; > - float: left; > -} > - > -.content { > - position: absolute; > - left: 20em; > - right: 10px; > - overflow: hidden > -} > - > -.header { > - background: black url('gears.png') 15px no-repeat; > - margin:0; > - padding: 5px; > - 
clear:both; > -} > - > -.header h1 { > - background: url('gears.png') right no-repeat; > - color: white; > - font: x-large sans-serif; > - text-align: center; > - height: 50px; > - margin: 0; > - padding-top: 30px; > -} > -- > 2.14.3 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 106644] [llvmpipe] Mesa 18.1.0 fails lp_test_format, lp_test_arit, lp_test_blend, lp_test_printf, lp_test_conv tests
https://bugs.freedesktop.org/show_bug.cgi?id=106644

--- Comment #9 from Ben Crocker ---
We note that this is a build for a PPC970, which is essentially a big-endian
~Power4 equivalent to a G5 Mac. Moreover, it appears to be a 32-bit build.
Re: [Mesa-dev] [PATCH] egl/x11: Move dri2_format_for_depth prototype.
On Friday, 2018-05-25 06:52:25 +, Vinson Lee wrote: > Fix build error without DRI3. D'uh! I forgot building dri3 was optional, sorry :/ Reviewed-by: Eric Engestrom > > CC drivers/dri2/platform_x11.lo > drivers/dri2/platform_x11.c:1010:1: error: no previous prototype for function > 'dri2_format_for_depth' [-Werror,-Wmissing-prototypes] > dri2_format_for_depth(uint32_t depth) > ^ > > Fixes: 473af0b541b2 ("egl/x11: deduplicate depth-to-format logic") > Signed-off-by: Vinson Lee > --- > src/egl/drivers/dri2/egl_dri2.h | 3 +++ > src/egl/drivers/dri2/platform_x11_dri3.h | 3 --- > 2 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h > index adabc527f85b..b91a899e476c 100644 > --- a/src/egl/drivers/dri2/egl_dri2.h > +++ b/src/egl/drivers/dri2/egl_dri2.h > @@ -523,4 +523,7 @@ dri2_init_surface(_EGLSurface *surf, _EGLDisplay *dpy, > EGLint type, > void > dri2_fini_surface(_EGLSurface *surf); > > +uint32_t > +dri2_format_for_depth(uint32_t depth); > + > #endif /* EGL_DRI2_INCLUDED */ > diff --git a/src/egl/drivers/dri2/platform_x11_dri3.h > b/src/egl/drivers/dri2/platform_x11_dri3.h > index e6fd01366978..96e7ee972d9f 100644 > --- a/src/egl/drivers/dri2/platform_x11_dri3.h > +++ b/src/egl/drivers/dri2/platform_x11_dri3.h > @@ -38,7 +38,4 @@ extern struct dri2_egl_display_vtbl dri3_x11_display_vtbl; > EGLBoolean > dri3_x11_connect(struct dri2_egl_display *dri2_dpy); > > -uint32_t > -dri2_format_for_depth(uint32_t depth); > - > #endif > -- > 2.17.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 1/7] swr/rast: Added in-place building to SCATTERPS
SCATTERPS previously assumed it was being used with an existing basic block --- .../drivers/swr/rasterizer/jitter/builder_mem.cpp | 29 +++--- 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.cpp b/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.cpp index 6e17888..77c2095 100644 --- a/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.cpp +++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.cpp @@ -617,17 +617,28 @@ namespace SwrJit Value* pIsUndef = ICMP_EQ(pIndex, C(32)); -// Split current block -BasicBlock* pPostLoop = pCurBB->splitBasicBlock(cast(pIsUndef)->getNextNode()); +// Split current block or create new one if building inline +BasicBlock* pPostLoop; +if (pCurBB->getTerminator()) +{ +pPostLoop = pCurBB->splitBasicBlock(cast(pIsUndef)->getNextNode()); -// Remove unconditional jump created by splitBasicBlock -pCurBB->getTerminator()->eraseFromParent(); +// Remove unconditional jump created by splitBasicBlock +pCurBB->getTerminator()->eraseFromParent(); -// Add terminator to end of original block -IRB()->SetInsertPoint(pCurBB); +// Add terminator to end of original block +IRB()->SetInsertPoint(pCurBB); -// Add conditional branch -COND_BR(pIsUndef, pPostLoop, pLoop); +// Add conditional branch +COND_BR(pIsUndef, pPostLoop, pLoop); +} +else +{ +pPostLoop = BasicBlock::Create(mpJitMgr->mContext, "PostScatter_Loop", pFunc); + +// Add conditional branch +COND_BR(pIsUndef, pPostLoop, pLoop); +} // Add loop basic block contents IRB()->SetInsertPoint(pLoop); @@ -642,7 +653,7 @@ namespace SwrJit Value* pOffsetElem = LOADV(pOffsetsArrayPtr, { pIndexPhi }); // GEP to this offset in dst -Value* pCurDst = GEP(pDst, pOffsetElem); +Value* pCurDst = GEP(pDst, pOffsetElem, mInt8PtrTy); pCurDst = POINTER_CAST(pCurDst, PointerType::get(pSrcTy, 0)); STORE(pSrcElem, pCurDst); -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 0/7] InitMemory inclusion
Version 2 makes a small change to swr_loader.cpp to include the new InitMemory header, which fixes a compile error on single-architecture builds. Alok Hota (7): swr/rast: Added in-place building to SCATTERPS swr/rast: Checking gCoreBuckets and CORE_BUCKETS are equal length at compile time swr/rast: Use metadata to communicate between passes swr/rast: Renamed MetaData calls swr/rast: Removed superfluous JitManager argument from passes swr/rast: Moved memory init out of core swr init swr/rast: Adjusted avx512 primitive assembly for msvc codegen src/gallium/drivers/swr/Makefile.sources | 4 +- src/gallium/drivers/swr/meson.build| 2 + src/gallium/drivers/swr/rasterizer/core/api.cpp| 4 - src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp | 139 +++-- .../drivers/swr/rasterizer/core/rdtsc_core.cpp | 1 + src/gallium/drivers/swr/rasterizer/core/state.h| 3 +- .../drivers/swr/rasterizer/jitter/blend_jit.cpp| 2 +- .../drivers/swr/rasterizer/jitter/builder.cpp | 170 ++--- .../drivers/swr/rasterizer/jitter/builder.h| 32 +++- .../drivers/swr/rasterizer/jitter/builder_mem.cpp | 29 ++-- .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 2 +- .../rasterizer/jitter/functionpasses/lower_x86.cpp | 17 +-- .../swr/rasterizer/jitter/functionpasses/passes.h | 2 +- .../swr/rasterizer/jitter/streamout_jit.cpp| 2 +- .../drivers/swr/rasterizer/memory/InitMemory.cpp | 39 + .../drivers/swr/rasterizer/memory/InitMemory.h | 33 src/gallium/drivers/swr/swr_loader.cpp | 8 +- src/gallium/drivers/swr/swr_shader.cpp | 2 +- 18 files changed, 325 insertions(+), 166 deletions(-) create mode 100644 src/gallium/drivers/swr/rasterizer/memory/InitMemory.cpp create mode 100644 src/gallium/drivers/swr/rasterizer/memory/InitMemory.h -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 3/7] swr/rast: Use metadata to communicate between passes
---
 .../drivers/swr/rasterizer/jitter/builder.h | 28 ++
 1 file changed, 28 insertions(+)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder.h b/src/gallium/drivers/swr/rasterizer/jitter/builder.h
index 6ca128d..08a3a6e 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder.h
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder.h
@@ -124,6 +124,34 @@ namespace SwrJit
         bool SetTexelMaskEvaluate(Instruction* inst);
         bool IsTexelMaskEvaluate(Instruction* inst);
         Type* GetVectorType(Type* pType);
+        void SetMetadata(StringRef s, uint32_t val)
+        {
+            llvm::NamedMDNode *metaData = mpJitMgr->mpCurrentModule->getOrInsertNamedMetadata(s);
+            Constant* cval = mpIRBuilder->getInt32(val);
+            llvm::MDNode *mdNode = llvm::MDNode::get(mpJitMgr->mpCurrentModule->getContext(), llvm::ConstantAsMetadata::get(cval));
+            if (metaData->getNumOperands())
+            {
+                metaData->setOperand(0, mdNode);
+            }
+            else
+            {
+                metaData->addOperand(mdNode);
+            }
+        }
+        uint32_t GetMetadata(StringRef s)
+        {
+            NamedMDNode* metaData = mpJitMgr->mpCurrentModule->getNamedMetadata(s);
+            if (metaData)
+            {
+                MDNode* mdNode = metaData->getOperand(0);
+                Metadata* val = mdNode->getOperand(0);
+                return mdconst::dyn_extract<ConstantInt>(val)->getZExtValue();
+            }
+            else
+            {
+                return 0;
+            }
+        }

 #include "gen_builder.hpp"
 #include "gen_builder_meta.hpp"
-- 
2.7.4
[Mesa-dev] [PATCH v2 2/7] swr/rast: Checking gCoreBuckets and CORE_BUCKETS are equal length at compile time
--- src/gallium/drivers/swr/rasterizer/core/rdtsc_core.cpp | 1 + 1 file changed, 1 insertion(+) diff --git a/src/gallium/drivers/swr/rasterizer/core/rdtsc_core.cpp b/src/gallium/drivers/swr/rasterizer/core/rdtsc_core.cpp index f289a31..48ea397 100644 --- a/src/gallium/drivers/swr/rasterizer/core/rdtsc_core.cpp +++ b/src/gallium/drivers/swr/rasterizer/core/rdtsc_core.cpp @@ -89,6 +89,7 @@ BUCKET_DESC gCoreBuckets[] = { { "BEStoreTiles", "", true, 0xff00 }, { "BEEndTile", "", false, 0x }, }; +static_assert(NumBuckets == (sizeof(gCoreBuckets) / sizeof(gCoreBuckets[0])), "RDTSC Bucket enum and description table size mismatched."); /// @todo bucketmanager and mapping should probably be a part of the SWR context std::vector gBucketMap; -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
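The same compile-time check can be written in plain C11 with `_Static_assert`. The sketch below is a generic illustration of keeping an enum and its description table in sync; the enum, struct, and table names are made up for the example and are not the SWR ones.

```c
#include <stddef.h>

/* The last enumerator doubles as the number of real entries. */
enum sketch_bucket {
    SKETCH_BUCKET_FRONTEND,
    SKETCH_BUCKET_BACKEND,
    SKETCH_BUCKET_STORE,
    SKETCH_NUM_BUCKETS
};

struct sketch_bucket_desc {
    const char *name;
    int enabled;
};

/* One description per enumerator, in the same order as the enum. */
static const struct sketch_bucket_desc sketch_buckets[] = {
    { "Frontend", 1 },
    { "Backend",  1 },
    { "Store",    0 },
};

/* Fails the build if someone adds an enumerator without a description,
 * or a description without an enumerator. */
_Static_assert(SKETCH_NUM_BUCKETS ==
               sizeof(sketch_buckets) / sizeof(sketch_buckets[0]),
               "bucket enum and description table size mismatch");

static size_t sketch_bucket_count(void)
{
    return sizeof(sketch_buckets) / sizeof(sketch_buckets[0]);
}
```

Because the check runs at compile time, a mismatch shows up as a build error rather than as an out-of-bounds table access at run time.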
[Mesa-dev] [PATCH v2 5/7] swr/rast: Removed superfluous JitManager argument from passes
--- src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp | 2 +- src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp | 2 +- .../swr/rasterizer/jitter/functionpasses/lower_x86.cpp | 17 - .../swr/rasterizer/jitter/functionpasses/passes.h | 2 +- .../drivers/swr/rasterizer/jitter/streamout_jit.cpp | 2 +- src/gallium/drivers/swr/swr_shader.cpp | 2 +- 6 files changed, 13 insertions(+), 14 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp index 72bf900..20f2e42 100644 --- a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp +++ b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp @@ -819,7 +819,7 @@ struct BlendJit : public Builder passes.add(createSCCPPass()); passes.add(createAggressiveDCEPass()); -passes.add(createLowerX86Pass(JM(), this)); +passes.add(createLowerX86Pass(this)); passes.run(*blendFunc); diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp index 7b0b80a..0abcd1a 100644 --- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp +++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp @@ -269,7 +269,7 @@ Function* FetchJit::Create(const FETCH_COMPILE_STATE& fetchState) optPasses.run(*fetch); -optPasses.add(createLowerX86Pass(JM(), this)); +optPasses.add(createLowerX86Pass(this)); optPasses.run(*fetch); JitManager::DumpToFile(fetch, "opt"); diff --git a/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp b/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp index 5a69eae..f2bd888 100644 --- a/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp +++ b/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp @@ -136,21 +136,21 @@ namespace SwrJit struct LowerX86 : public FunctionPass { -LowerX86(JitManager* pJitMgr = nullptr, Builder* b = nullptr) -: FunctionPass(ID), mpJitMgr(pJitMgr), B(b) 
+LowerX86(Builder* b = nullptr) +: FunctionPass(ID), B(b) { initializeLowerX86Pass(*PassRegistry::getPassRegistry()); // Determine target arch -if (mpJitMgr->mArch.AVX512F()) +if (JM()->mArch.AVX512F()) { mTarget = AVX512; } -else if (mpJitMgr->mArch.AVX2()) +else if (JM()->mArch.AVX2()) { mTarget = AVX2; } -else if (mpJitMgr->mArch.AVX()) +else if (JM()->mArch.AVX()) { mTarget = AVX; @@ -356,9 +356,8 @@ namespace SwrJit { } -JitManager* JM() { return mpJitMgr; } +JitManager* JM() { return B->JM(); } -JitManager* mpJitMgr; Builder* B; TargetArch mTarget; @@ -368,9 +367,9 @@ namespace SwrJit char LowerX86::ID = 0; // LLVM uses address of ID as the actual ID. -FunctionPass* createLowerX86Pass(JitManager* pJitMgr, Builder* b) +FunctionPass* createLowerX86Pass(Builder* b) { -return new LowerX86(pJitMgr, b); +return new LowerX86(b); } Instruction* NO_EMU(LowerX86* pThis, TargetArch arch, TargetWidth width, CallInst* pCallInst) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/passes.h b/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/passes.h index f7373f0..95ef4bc 100644 --- a/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/passes.h +++ b/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/passes.h @@ -33,5 +33,5 @@ namespace SwrJit { using namespace llvm; -FunctionPass* createLowerX86Pass(JitManager* pJitMgr, Builder* b); +FunctionPass* createLowerX86Pass(Builder* b); } diff --git a/src/gallium/drivers/swr/rasterizer/jitter/streamout_jit.cpp b/src/gallium/drivers/swr/rasterizer/jitter/streamout_jit.cpp index f804900..cb2e3ae 100644 --- a/src/gallium/drivers/swr/rasterizer/jitter/streamout_jit.cpp +++ b/src/gallium/drivers/swr/rasterizer/jitter/streamout_jit.cpp @@ -307,7 +307,7 @@ struct StreamOutJit : public Builder passes.add(createSCCPPass()); passes.add(createAggressiveDCEPass()); -passes.add(createLowerX86Pass(JM(), this)); +passes.add(createLowerX86Pass(this)); passes.run(*soFunc); diff --git 
a/src/gallium/drivers/swr/swr_shader.cpp b/src/gallium/drivers/swr/swr_shader.cpp index 13d8986..afa184f 100644 --- a/src/gallium/drivers/swr/swr_shader.cpp +++ b/src/gallium/drivers/swr/swr_shader.cpp @@ -1402,7 +1402,7 @@ BuilderSWR::CompileFS(struct swr_context *ctx, swr_jit_fs_key &key) // after the gallivm passes, we have to lower the core's intrinsics llvm::legacy::FunctionPassManager lowerPass(JM()->mpCurrentModul
[Mesa-dev] [PATCH v2 4/7] swr/rast: Renamed MetaData calls
--- .../drivers/swr/rasterizer/jitter/builder.cpp | 170 ++--- .../drivers/swr/rasterizer/jitter/builder.h| 4 +- 2 files changed, 87 insertions(+), 87 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp b/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp index e1c5d80..4b06aaa 100644 --- a/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp +++ b/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp @@ -1,32 +1,32 @@ / -* Copyright (C) 2014-2015 Intel Corporation. All Rights Reserved. -* -* Permission is hereby granted, free of charge, to any person obtaining a -* copy of this software and associated documentation files (the "Software"), -* to deal in the Software without restriction, including without limitation -* the rights to use, copy, modify, merge, publish, distribute, sublicense, -* and/or sell copies of the Software, and to permit persons to whom the -* Software is furnished to do so, subject to the following conditions: -* -* The above copyright notice and this permission notice (including the next -* paragraph) shall be included in all copies or substantial portions of the -* Software. -* -* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL -* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING -* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS -* IN THE SOFTWARE. -* -* @file builder.h -* -* @brief Includes all the builder related functionality -* -* Notes: -* -**/ + * Copyright (C) 2014-2015 Intel Corporation. All Rights Reserved. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * @file builder.h + * + * @brief Includes all the builder related functionality + * + * Notes: + * + **/ #include "jit_pch.hpp" #include "builder.h" @@ -38,11 +38,9 @@ namespace SwrJit // /// @brief Contructor for Builder. /// @param pJitMgr - JitManager which contains modules, function passes, etc. 
-Builder::Builder(JitManager *pJitMgr) -: mpJitMgr(pJitMgr), - mpPrivateContext(nullptr) +Builder::Builder(JitManager *pJitMgr) : mpJitMgr(pJitMgr), mpPrivateContext(nullptr) { -mVWidth = pJitMgr->mVWidth; +mVWidth = pJitMgr->mVWidth; mVWidth16 = 16; mpIRBuilder = &pJitMgr->mBuilder; @@ -70,29 +68,29 @@ namespace SwrJit // Built in types: simd16 -mSimd16Int1Ty = VectorType::get(mInt1Ty, mVWidth16); -mSimd16Int16Ty = VectorType::get(mInt16Ty, mVWidth16); -mSimd16Int32Ty = VectorType::get(mInt32Ty, mVWidth16); -mSimd16Int64Ty = VectorType::get(mInt64Ty, mVWidth16); -mSimd16FP16Ty = VectorType::get(mFP16Ty, mVWidth16); -mSimd16FP32Ty = VectorType::get(mFP32Ty, mVWidth16); -mSimd16VectorTy = ArrayType::get(mSimd16FP32Ty, 4); -mSimd16VectorTRTy = ArrayType::get(mSimd16FP32Ty, 5); +mSimd16Int1Ty = VectorType::get(mInt1Ty, mVWidth16); +mSimd16Int16Ty= VectorType::get(mInt16Ty, mVWidth16); +mSimd16Int32Ty= VectorType::get(mInt32Ty, mVWidth16); +mSimd16Int64Ty= VectorType::get(mInt64Ty, mVWidth16); +mSimd16FP16Ty = VectorType::get(mFP16Ty, mVWidth16); +mSimd16FP32Ty = VectorType::get(mFP32Ty, mVWidth16); +mSimd16Vect
[Mesa-dev] [PATCH v2 7/7] swr/rast: Adjusted avx512 primitive assembly for msvc codegen
Optimize AVX-512 PA Assemble (PA_STATE_OPT). Reduced generated code by about 4x, MSVC compiler was going crazy making temporaries and split-loading inputs onto the stack unless explicit AVX-512 load ops were added --- src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp | 139 + 1 file changed, 90 insertions(+), 49 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp b/src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp index 64a90c7..4f89e0c 100644 --- a/src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp +++ b/src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp @@ -755,36 +755,51 @@ bool PaTriList1_simd16(PA_STATE_OPT& pa, uint32_t slot, simd16vector verts[]) bool PaTriList2_simd16(PA_STATE_OPT& pa, uint32_t slot, simd16vector verts[]) { -#if KNOB_ARCH == KNOB_ARCH_AVX -simd16scalar perm0 = _simd16_setzero_ps(); -simd16scalar perm1 = _simd16_setzero_ps(); -simd16scalar perm2 = _simd16_setzero_ps(); -#elif KNOB_ARCH >= KNOB_ARCH_AVX2 +#if KNOB_ARCH >= KNOB_ARCH_AVX2 const simd16scalari perm0 = _simd16_set_epi32(13, 10, 7, 4, 1, 14, 11, 8, 5, 2, 15, 12, 9, 6, 3, 0); const simd16scalari perm1 = _simd16_set_epi32(14, 11, 8, 5, 2, 15, 12, 9, 6, 3, 0, 13, 10, 7, 4, 1); const simd16scalari perm2 = _simd16_set_epi32(15, 12, 9, 6, 3, 0, 13, 10, 7, 4, 1, 14, 11, 8, 5, 2); +#else // KNOB_ARCH == KNOB_ARCH_AVX +simd16scalar perm0 = _simd16_setzero_ps(); +simd16scalar perm1 = _simd16_setzero_ps(); +simd16scalar perm2 = _simd16_setzero_ps(); #endif const simd16vector &a = PaGetSimdVector_simd16(pa, 0, slot); const simd16vector &b = PaGetSimdVector_simd16(pa, 1, slot); const simd16vector &c = PaGetSimdVector_simd16(pa, 2, slot); -simd16vector &v0 = verts[0]; -simd16vector &v1 = verts[1]; -simd16vector &v2 = verts[2]; +const simd16mask mask0 = 0x4924; +const simd16mask mask1 = 0x2492; +const simd16mask mask2 = 0x9249; // v0 -> a0 a3 a6 a9 aC aF b2 b5 b8 bB bE c1 c4 c7 cA cD // v1 -> a1 a4 a7 aA aD b0 b3 b6 b9 bC bF c2 c5 c8 cB cE // v2 -> a2 a5 a8 aB aE b1 b4 
b7 bA bD c0 c3 c6 c9 cC cF +simd16vector &v0 = verts[0]; +simd16vector &v1 = verts[1]; +simd16vector &v2 = verts[2]; + // for simd16 x, y, z, and w for (int i = 0; i < 4; i += 1) { -simd16scalar temp0 = _simd16_blend_ps(_simd16_blend_ps(a[i], b[i], 0x4924), c[i], 0x2492); -simd16scalar temp1 = _simd16_blend_ps(_simd16_blend_ps(a[i], b[i], 0x9249), c[i], 0x4924); -simd16scalar temp2 = _simd16_blend_ps(_simd16_blend_ps(a[i], b[i], 0x2492), c[i], 0x9249); +simd16scalar tempa = _simd16_loadu_ps(reinterpret_cast(&a[i])); +simd16scalar tempb = _simd16_loadu_ps(reinterpret_cast(&b[i])); +simd16scalar tempc = _simd16_loadu_ps(reinterpret_cast(&c[i])); + +simd16scalar temp0 = _simd16_blend_ps(_simd16_blend_ps(tempa, tempb, mask0), tempc, mask1); +simd16scalar temp1 = _simd16_blend_ps(_simd16_blend_ps(tempa, tempb, mask2), tempc, mask0); +simd16scalar temp2 = _simd16_blend_ps(_simd16_blend_ps(tempa, tempb, mask1), tempc, mask2); + +#if KNOB_ARCH >= KNOB_ARCH_AVX2 +v0[i] = _simd16_permute_ps(temp0, perm0); +v1[i] = _simd16_permute_ps(temp1, perm1); +v2[i] = _simd16_permute_ps(temp2, perm2); +#else // #if KNOB_ARCH == KNOB_ARCH_AVX + +// the general permutes (above) are prohibitively slow to emulate on AVX (its scalar code) -#if KNOB_ARCH == KNOB_ARCH_AVX temp0 = _simd16_permute_ps_i(temp0, 0x6C); // (0, 3, 2, 1) => 00 11 01 10 => 0x6C perm0 = _simd16_permute2f128_ps(temp0, temp0, 0xB1);// (1, 0, 3, 2) => 01 00 11 10 => 0xB1 temp0 = _simd16_blend_ps(temp0, perm0, 0x); // 0010 0010 0010 0010 @@ -802,10 +817,6 @@ bool PaTriList2_simd16(PA_STATE_OPT& pa, uint32_t slot, simd16vector verts[]) temp2 = _simd16_blend_ps(temp2, perm2, 0x); // 0100 0100 0100 0100 perm2 = _simd16_permute2f128_ps(temp2, temp2, 0x4E);// (2, 3, 0, 1) => 10 11 00 01 => 0x4E v2[i] = _simd16_blend_ps(temp2, perm2, 0x1C1C); // 0011 1000 0011 1000 -#elif KNOB_ARCH >= KNOB_ARCH_AVX2 -v0[i] = _simd16_permute_ps(temp0, perm0); -v1[i] = _simd16_permute_ps(temp1, perm1); -v2[i] = _simd16_permute_ps(temp2, perm2); 
#endif } @@ -1056,26 +1067,31 @@ bool PaTriStrip1_simd16(PA_STATE_OPT& pa, uint32_t slot, simd16vector verts[]) const simd16vector &a = PaGetSimdVector_simd16(pa, pa.prev, slot); const simd16vector &b = PaGetSimdVector_simd16(pa, pa.cur, slot); -simd16vector &v0 = verts[0]; -simd16vector &v1 = verts[1]; -simd16vector &v2 = verts[2]; +const simd16mask mask0 = 0xF000; // v0 -> a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aA aB aC aD aE aF // v1 -> a1 a3 a3 a5 a5 a7 a7 a9 a9 aB aB aD aD aF aF b1 // v2 -> a2 a2 a4 a4 a6 a6 a8 a8 aA aA aC aC aE aE b0 b0
[Mesa-dev] [PATCH v2 6/7] swr/rast: Moved memory init out of core swr init
Added two new files for a wrapper function for initialization v2: added missing include for single architecture builds --- src/gallium/drivers/swr/Makefile.sources | 4 ++- src/gallium/drivers/swr/meson.build| 2 ++ src/gallium/drivers/swr/rasterizer/core/api.cpp| 4 --- src/gallium/drivers/swr/rasterizer/core/state.h| 3 +- .../drivers/swr/rasterizer/memory/InitMemory.cpp | 39 ++ .../drivers/swr/rasterizer/memory/InitMemory.h | 33 ++ src/gallium/drivers/swr/swr_loader.cpp | 8 - 7 files changed, 86 insertions(+), 7 deletions(-) create mode 100644 src/gallium/drivers/swr/rasterizer/memory/InitMemory.cpp create mode 100644 src/gallium/drivers/swr/rasterizer/memory/InitMemory.h diff --git a/src/gallium/drivers/swr/Makefile.sources b/src/gallium/drivers/swr/Makefile.sources index 6753d50..b298356 100644 --- a/src/gallium/drivers/swr/Makefile.sources +++ b/src/gallium/drivers/swr/Makefile.sources @@ -177,4 +177,6 @@ MEMORY_CXX_SOURCES := \ rasterizer/memory/StoreTile_TileY2.cpp \ rasterizer/memory/StoreTile_TileY.cpp \ rasterizer/memory/TilingFunctions.h \ - rasterizer/memory/tilingtraits.h + rasterizer/memory/tilingtraits.h \ + rasterizer/memory/InitMemory.cpp \ + rasterizer/memory/InitMemory.h diff --git a/src/gallium/drivers/swr/meson.build b/src/gallium/drivers/swr/meson.build index 9b272aa..b95c8bc 100644 --- a/src/gallium/drivers/swr/meson.build +++ b/src/gallium/drivers/swr/meson.build @@ -151,6 +151,8 @@ files_swr_arch = files( 'rasterizer/memory/StoreTile_TileY.cpp', 'rasterizer/memory/TilingFunctions.h', 'rasterizer/memory/tilingtraits.h', + 'rasterizer/memory/InitMemory.h', + 'rasterizer/memory/InitMemory.cpp', ) swr_context_files = files('swr_context.h') diff --git a/src/gallium/drivers/swr/rasterizer/core/api.cpp b/src/gallium/drivers/swr/rasterizer/core/api.cpp index 47f3633..c932ec0 100644 --- a/src/gallium/drivers/swr/rasterizer/core/api.cpp +++ b/src/gallium/drivers/swr/rasterizer/core/api.cpp @@ -1728,10 +1728,6 @@ void InitBackendFuncTables(); /// @brief 
Initialize swr backend and memory internal tables void SwrInit() { -InitSimLoadTilesTable(); -InitSimStoreTilesTable(); -InitSimClearTilesTable(); - InitClearTilesTable(); InitBackendFuncTables(); InitRasterizerFunctions(); diff --git a/src/gallium/drivers/swr/rasterizer/core/state.h b/src/gallium/drivers/swr/rasterizer/core/state.h index c26dabe..9db17ee 100644 --- a/src/gallium/drivers/swr/rasterizer/core/state.h +++ b/src/gallium/drivers/swr/rasterizer/core/state.h @@ -29,10 +29,11 @@ #include "common/formats.h" #include "common/intrin.h" -using gfxptr_t = unsigned long long; #include #include +using gfxptr_t = unsigned long long; + // /// PRIMITIVE_TOPOLOGY. // diff --git a/src/gallium/drivers/swr/rasterizer/memory/InitMemory.cpp b/src/gallium/drivers/swr/rasterizer/memory/InitMemory.cpp new file mode 100644 index 000..bff96e1 --- /dev/null +++ b/src/gallium/drivers/swr/rasterizer/memory/InitMemory.cpp @@ -0,0 +1,39 @@ +/ +* Copyright (C) 2018 Intel Corporation. All Rights Reserved. +* +* Permission is hereby granted, free of charge, to any person obtaining a +* copy of this software and associated documentation files (the "Software"), +* to deal in the Software without restriction, including without limitation +* the rights to use, copy, modify, merge, publish, distribute, sublicense, +* and/or sell copies of the Software, and to permit persons to whom the +* Software is furnished to do so, subject to the following conditions: +* +* The above copyright notice and this permission notice (including the next +* paragraph) shall be included in all copies or substantial portions of the +* Software. +* +* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL +* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS +* IN THE SOFTWARE. +* +* @file InitMemory.cpp +* +* @brief Provide access to tiles table initialization functions +* +**/ +#include "memory/InitMemory.h" + +void InitSimLoadTilesTable(); +void InitSimStoreTilesTable(); +void InitSimClearTilesTable(); + +void InitTilesTable() +{ +InitSimLoadTilesTable(); +InitSimStoreTilesTable(); +InitSimClearTilesTable(); +} diff --git
Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs
On Fri, May 25, 2018 at 9:25 AM, Tomasz Figa wrote: > On Fri, May 25, 2018 at 10:59 PM Rob Herring wrote: > >> On Fri, May 25, 2018 at 4:15 AM, Robert Foss > wrote: >> > >> > >> > On 2018-05-25 10:38, Tomasz Figa wrote: >> >> >> >> On Fri, May 25, 2018 at 5:33 PM Robert Foss >> >> wrote: >> >> >> >>> Hey, >> >> >> >> >> >>> On 2018-05-25 02:17, Rob Herring wrote: >> >> On Thu, May 24, 2018 at 6:23 AM, Robert Foss < > robert.f...@collabora.com> >> >> >> >> wrote: >> > >> > Hey, >> > >> > I don't think I've received any feedback on this version yet. >> > If anyone has some time to spare, it would be nice to get it merged. >> > >> > Just to be clear about the libdrm branch linked in the cover letter, >> > it is not required. Only for virgl platforms which happens to be > what >> > I tested on. >> >> >> virgl will still fallback to using the first render node without > those >> libdrm changes, right? If not, I don't think we should apply until >> we're not breaking a platform... >> >> >> >> >> >>> No it will not fall back. I agree that holding off makes more sense. >> >> >> >> >> >> What's the reason of this problems? Is it because of drmGetDevices()? >> >> Since >> >> we don't really use it for anything other than getting the list of > render >> >> nodes in the system, maybe we could just iterate over any /dev/renderD* >> >> nodes explicitly and avoid introducing new problems? >> > >> > >> > That's exactly the problem, and yes we could 100% solve by iterating > over >> > /dev/renderD* nodes. I originally assumed we wouldn't want to do that, > but >> > rather use the libdrm interfaces. >> > >> > But for the next spin I could avoid using libdrm, should I? > >> I don't have an opinion on libdrm really, but I do think we should >> fallback to the 1st (only) render node rather than just fail. > > We do, even with libdrm. 
> > AFAICT, the problem with virgl seems to be that drmGetDevices() doesn't > include devices on virtio bus in the results, which means that there likely > wouldn't be any render node returned. Okay. I still don't get why we search by bus in the first place. Who cares what bus the gpu sits on. Now I have an opinion. We should just iterate over render nodes matching by name or use the first node if we don't have a set name. Rob
Re: [Mesa-dev] [PATCH v2 0/7] InitMemory inclusion
V2: Good catch! Reviewed-by: Bruce Cherniak > On May 25, 2018, at 10:19 AM, Alok Hota wrote: > > Version 2 makes a small change to swr_loader.cpp to include the new InitMemory > header, which fixes a compile error on single-architecture builds. > > Alok Hota (7): > swr/rast: Added in-place building to SCATTERPS > swr/rast: Checking gCoreBuckets and CORE_BUCKETS are equal length at >compile time > swr/rast: Use metadata to communicate between passes > swr/rast: Renamed MetaData calls > swr/rast: Removed superfluous JitManager argument from passes > swr/rast: Moved memory init out of core swr init > swr/rast: Adjusted avx512 primitive assembly for msvc codegen > > src/gallium/drivers/swr/Makefile.sources | 4 +- > src/gallium/drivers/swr/meson.build| 2 + > src/gallium/drivers/swr/rasterizer/core/api.cpp| 4 - > src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp | 139 +++-- > .../drivers/swr/rasterizer/core/rdtsc_core.cpp | 1 + > src/gallium/drivers/swr/rasterizer/core/state.h| 3 +- > .../drivers/swr/rasterizer/jitter/blend_jit.cpp| 2 +- > .../drivers/swr/rasterizer/jitter/builder.cpp | 170 ++--- > .../drivers/swr/rasterizer/jitter/builder.h| 32 +++- > .../drivers/swr/rasterizer/jitter/builder_mem.cpp | 29 ++-- > .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 2 +- > .../rasterizer/jitter/functionpasses/lower_x86.cpp | 17 +-- > .../swr/rasterizer/jitter/functionpasses/passes.h | 2 +- > .../swr/rasterizer/jitter/streamout_jit.cpp| 2 +- > .../drivers/swr/rasterizer/memory/InitMemory.cpp | 39 + > .../drivers/swr/rasterizer/memory/InitMemory.h | 33 > src/gallium/drivers/swr/swr_loader.cpp | 8 +- > src/gallium/drivers/swr/swr_shader.cpp | 2 +- > 18 files changed, 325 insertions(+), 166 deletions(-) > create mode 100644 src/gallium/drivers/swr/rasterizer/memory/InitMemory.cpp > create mode 100644 src/gallium/drivers/swr/rasterizer/memory/InitMemory.h > > -- > 2.7.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > 
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 106644] [llvmpipe] Mesa 18.1.0 fails lp_test_format, lp_test_arit, lp_test_blend, lp_test_printf, lp_test_conv tests
https://bugs.freedesktop.org/show_bug.cgi?id=106644 --- Comment #10 from erhar...@mailbox.org --- Correct. Should I note (potential?) ppc specific issues in the bug title too, besides selecting hardware "PowerPC"? llvmpipe does run on my other G5 (64bit build), however a lot of piglit tests fail/segfault (see bug #105730). Don't know yet if it's a regression, but I could build and run the tests a few major releases back on both G5s. If it turns out to be a regression I will at least try to bisect it and see how far I get.
Re: [Mesa-dev] [PATCH v2 1/2] util/u_math: Implement a logbase2 function for unsigned long
On Thursday, 2018-05-24 11:47:52 +0200, Karol Herbst wrote: > From: Pierre Moreau > > v2 (Karol Herbst ): > * removed unneeded ll > * ll -> ull Reviewed-by: Eric Engestrom > > Signed-off-by: Karol Herbst > --- > src/gallium/auxiliary/util/u_math.h | 55 + > src/util/bitscan.h | 11 ++ > 2 files changed, 66 insertions(+) > > diff --git a/src/gallium/auxiliary/util/u_math.h > b/src/gallium/auxiliary/util/u_math.h > index 46d02978fd6..79869a119af 100644 > --- a/src/gallium/auxiliary/util/u_math.h > +++ b/src/gallium/auxiliary/util/u_math.h > @@ -421,6 +421,23 @@ util_logbase2(unsigned n) > #endif > } > > +static inline uint64_t > +util_logbase2_64(uint64_t n) > +{ > +#if defined(HAVE___BUILTIN_CLZLL) > + return ((sizeof(uint64_t) * 8 - 1) - __builtin_clzll(n | 1)); > +#else > + uint64_t pos = 0ull; > + if (n >= 1ull<<32) { n >>= 32; pos += 32; } > + if (n >= 1ull<<16) { n >>= 16; pos += 16; } > + if (n >= 1ull<< 8) { n >>= 8; pos += 8; } > + if (n >= 1ull<< 4) { n >>= 4; pos += 4; } > + if (n >= 1ull<< 2) { n >>= 2; pos += 2; } > + if (n >= 1ull<< 1) { pos += 1; } > + return pos; > +#endif > +} > + > /** > * Returns the ceiling of log n base 2, and 0 when n == 0. Equivalently, > * returns the smallest x such that n <= 2**x. 
> @@ -434,6 +451,15 @@ util_logbase2_ceil(unsigned n) > return 1 + util_logbase2(n - 1); > } > > +static inline uint64_t > +util_logbase2_ceil64(uint64_t n) > +{ > + if (n <= 1) > + return 0; > + > + return 1ull + util_logbase2_64(n - 1); > +} > + > /** > * Returns the smallest power of two >= x > */ > @@ -465,6 +491,35 @@ util_next_power_of_two(unsigned x) > #endif > } > > +static inline uint64_t > +util_next_power_of_two64(uint64_t x) > +{ > +#if defined(HAVE___BUILTIN_CLZLL) > + if (x <= 1) > + return 1; > + > + return (1ull << ((sizeof(uint64_t) * 8) - __builtin_clzll(x - 1))); > +#else > + uint64_t val = x; > + > + if (x <= 1) > + return 1; > + > + if (util_is_power_of_two_or_zero64(x)) > + return x; > + > + val--; > + val = (val >> 1) | val; > + val = (val >> 2) | val; > + val = (val >> 4) | val; > + val = (val >> 8) | val; > + val = (val >> 16) | val; > + val = (val >> 32) | val; > + val++; > + return val; > +#endif > +} > + > > /** > * Return number of bits set in n. > diff --git a/src/util/bitscan.h b/src/util/bitscan.h > index 5cc75f0beba..dc89ac93f28 100644 > --- a/src/util/bitscan.h > +++ b/src/util/bitscan.h > @@ -123,6 +123,17 @@ util_is_power_of_two_or_zero(unsigned v) > return (v & (v - 1)) == 0; > } > > +/* Determine if an uint64_t value is a power of two. > + * > + * \note > + * Zero is treated as a power of two. > + */ > +static inline bool > +util_is_power_of_two_or_zero64(uint64_t v) > +{ > + return (v & (v - 1)) == 0; > +} > + > /* Determine if an unsigned value is a power of two. > * > * \note > -- > 2.17.0 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
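For readers following the patch above, here is a minimal standalone sketch of the portable (non-builtin) paths it adds: floor(log2), ceil(log2) and next power of two for 64-bit values. The `_sketch` names are made up for this illustration and are not the Mesa symbols; the logic is only meant to mirror the `#else` branches quoted above, with the power-of-two shortcut dropped because the bit-smearing already handles exact powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Floor of log2(n), found by repeatedly halving the search window.
 * Returns 0 for n == 0 and n == 1, like the fallback branch in the patch. */
static inline uint64_t
logbase2_64_sketch(uint64_t n)
{
   uint64_t pos = 0;
   if (n >= 1ull << 32) { n >>= 32; pos += 32; }
   if (n >= 1ull << 16) { n >>= 16; pos += 16; }
   if (n >= 1ull <<  8) { n >>=  8; pos +=  8; }
   if (n >= 1ull <<  4) { n >>=  4; pos +=  4; }
   if (n >= 1ull <<  2) { n >>=  2; pos +=  2; }
   if (n >= 1ull <<  1) {           pos +=  1; }
   return pos;
}

/* Smallest x such that n <= 2**x; 0 for n <= 1. */
static inline uint64_t
logbase2_ceil64_sketch(uint64_t n)
{
   if (n <= 1)
      return 0;
   return 1 + logbase2_64_sketch(n - 1);
}

/* Smallest power of two >= x. Smears the highest set bit of (x - 1) into
 * every lower position, then adds one. Wraps to 0 when x > 2**63. */
static inline uint64_t
next_power_of_two64_sketch(uint64_t x)
{
   if (x <= 1)
      return 1;
   x--;
   x |= x >> 1;
   x |= x >> 2;
   x |= x >> 4;
   x |= x >> 8;
   x |= x >> 16;
   x |= x >> 32;
   return x + 1;
}

/* Quick self-check, callable from a test harness or from main(). */
static inline void
logbase2_selfcheck_sketch(void)
{
   assert(logbase2_64_sketch(1ull << 40) == 40);
   assert(logbase2_ceil64_sketch(5) == 3);
   assert(next_power_of_two64_sketch(5) == 8);
}
```

The builtin branch in the patch computes the same floor value as `63 - __builtin_clzll(n | 1)`; the `| 1` is there because the builtin's result is undefined for a zero argument.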
Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs
On Sat, May 26, 2018 at 12:38 AM Rob Herring wrote: > On Fri, May 25, 2018 at 9:25 AM, Tomasz Figa wrote: > > On Fri, May 25, 2018 at 10:59 PM Rob Herring wrote: > > > >> On Fri, May 25, 2018 at 4:15 AM, Robert Foss > wrote: > >> > > >> > > >> > On 2018-05-25 10:38, Tomasz Figa wrote: > >> >> > >> >> On Fri, May 25, 2018 at 5:33 PM Robert Foss < robert.f...@collabora.com> > >> >> wrote: > >> >> > >> >>> Hey, > >> >> > >> >> > >> >>> On 2018-05-25 02:17, Rob Herring wrote: > >> > >> On Thu, May 24, 2018 at 6:23 AM, Robert Foss < > > robert.f...@collabora.com> > >> >> > >> >> wrote: > >> > > >> > Hey, > >> > > >> > I don't think I've received any feedback on this version yet. > >> > If anyone has some time to spare, it would be nice to get it merged. > >> > > >> > Just to be clear about the libdrm branch linked in the cover letter, > >> > it is not required. Only for virgl platforms which happens to be > > what > >> > I tested on. > >> > >> > >> virgl will still fallback to using the first render node without > > those > >> libdrm changes, right? If not, I don't think we should apply until > >> we're not breaking a platform... > >> >> > >> >> > >> >>> No it will not fall back. I agree that holding off makes more sense. > >> >> > >> >> > >> >> What's the reason of this problems? Is it because of drmGetDevices()? > >> >> Since > >> >> we don't really use it for anything other than getting the list of > > render > >> >> nodes in the system, maybe we could just iterate over any /dev/renderD* > >> >> nodes explicitly and avoid introducing new problems? > >> > > >> > > >> > That's exactly the problem, and yes we could 100% solve by iterating > > over > >> > /dev/renderD* nodes. I originally assumed we wouldn't want to do that, > > but > >> > rather use the libdrm interfaces. > >> > > >> > But for the next spin I could avoid using libdrm, should I? 
> > > >> I don't have an opinion on libdrm really, but I do think we should > >> fallback to the 1st (only) render node rather than just fail. > > > > We do, even with libdrm. > > > > AFAICT, the problem with virgl seems to be that drmGetDevices() doesn't > > include devices on virtio bus in the results, which means that there likely > > wouldn't be any render node returned. > Okay. I still don't get why we search by bus in the first place. Who > cares what bus the gpu sits on. We don't search by bus. drmGetDevices() iterates over DRI nodes, queries them and discards those whose bus type it fails to recognize. I have no idea why it does so, though. > Now I have an opinion. We should just iterate over render nodes > matching by name or use the first node if we don't have a set name. Yeah, I suggested that too in my previous reply. It doesn't look like libdrm has any sane helper that could help us. Best regards, Tomasz
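The approach this thread converges on, walking the render nodes directly instead of relying on drmGetDevices(), might look roughly like the sketch below. It is not the code from the series. It assumes the usual Linux layout where render nodes appear as /dev/dri/renderD<minor> with minors 128 to 191, and the names `render_node_path`, `open_first_render_node` and `node_filter_fn` are hypothetical. The filter callback stands in for "matching by name"; a real one could, for example, compare the name reported by drmGetVersion() for the fd against the wanted driver.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>   /* only for quick checks by the caller */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>   /* only for quick checks by the caller */
#include <unistd.h>

/* Assumption: render nodes are /dev/dri/renderD128 .. /dev/dri/renderD191. */
#define RENDER_NODE_FIRST 128
#define RENDER_NODE_COUNT 64

/* Write the path of the idx-th render node into buf.
 * Returns 0 on success, -1 if idx is out of range or buf is too small. */
static int
render_node_path(int idx, char *buf, size_t len)
{
   if (idx < 0 || idx >= RENDER_NODE_COUNT)
      return -1;

   int n = snprintf(buf, len, "/dev/dri/renderD%d", RENDER_NODE_FIRST + idx);
   return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Hypothetical filter: return nonzero if the opened node should be used. */
typedef int (*node_filter_fn)(int fd, void *data);

/* Return an fd for the first render node accepted by filter, or for the
 * first node that opens at all when filter is NULL. Returns -1 if no node
 * qualifies. Nodes that are missing or fail to open are skipped rather
 * than treated as errors. */
static int
open_first_render_node(node_filter_fn filter, void *data)
{
   char path[32];

   for (int i = 0; i < RENDER_NODE_COUNT; i++) {
      if (render_node_path(i, path, sizeof(path)) != 0)
         continue;

      int fd = open(path, O_RDWR | O_CLOEXEC);
      if (fd < 0)
         continue;

      if (filter == NULL || filter(fd, data))
         return fd;

      close(fd);
   }

   return -1;
}
```

Passing a NULL filter gives the "use the first node if we don't have a set name" behaviour; passing a filter gives the name-matched variant without any bus-type check.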
Re: [Mesa-dev] [PATCH] egl/x11: Move dri2_format_for_depth prototype.
On Friday, 2018-05-25 16:06:26 +0100, Eric Engestrom wrote: > On Friday, 2018-05-25 06:52:25 +, Vinson Lee wrote: > > Fix build error without DRI3. > > D'uh! > I forgot building dri3 was optional, sorry :/ > > Reviewed-by: Eric Engestrom Actually, wait no, this doesn't look right, the function should be named something else if it's exposed to everyone, since it's quite specific to x11's case, or it should not be exposed to everyone. I feel like the best thing to do here is to just copy the prototype to platform_x11.c: ---8<--- diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c index b2a3000b252ec0ddb12f..ea9b0cc6d6fd04804d2a 100644 --- a/src/egl/drivers/dri2/platform_x11.c +++ b/src/egl/drivers/dri2/platform_x11.c @@ -55,6 +55,9 @@ static EGLBoolean dri2_x11_swap_interval(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf, EGLint interval); +uint32_t +dri2_format_for_depth(uint32_t depth); + static void swrastCreateDrawable(struct dri2_egl_display * dri2_dpy, struct dri2_egl_surface * dri2_surf) --->8--- > > > > > CC drivers/dri2/platform_x11.lo > > drivers/dri2/platform_x11.c:1010:1: error: no previous prototype for > > function 'dri2_format_for_depth' [-Werror,-Wmissing-prototypes] > > dri2_format_for_depth(uint32_t depth) > > ^ > > > > Fixes: 473af0b541b2 ("egl/x11: deduplicate depth-to-format logic") > > Signed-off-by: Vinson Lee > > --- > > src/egl/drivers/dri2/egl_dri2.h | 3 +++ > > src/egl/drivers/dri2/platform_x11_dri3.h | 3 --- > > 2 files changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/src/egl/drivers/dri2/egl_dri2.h > > b/src/egl/drivers/dri2/egl_dri2.h > > index adabc527f85b..b91a899e476c 100644 > > --- a/src/egl/drivers/dri2/egl_dri2.h > > +++ b/src/egl/drivers/dri2/egl_dri2.h > > @@ -523,4 +523,7 @@ dri2_init_surface(_EGLSurface *surf, _EGLDisplay *dpy, > > EGLint type, > > void > > dri2_fini_surface(_EGLSurface *surf); > > > > +uint32_t > > +dri2_format_for_depth(uint32_t depth); > > + > > #endif 
/* EGL_DRI2_INCLUDED */ > > diff --git a/src/egl/drivers/dri2/platform_x11_dri3.h > > b/src/egl/drivers/dri2/platform_x11_dri3.h > > index e6fd01366978..96e7ee972d9f 100644 > > --- a/src/egl/drivers/dri2/platform_x11_dri3.h > > +++ b/src/egl/drivers/dri2/platform_x11_dri3.h > > @@ -38,7 +38,4 @@ extern struct dri2_egl_display_vtbl dri3_x11_display_vtbl; > > EGLBoolean > > dri3_x11_connect(struct dri2_egl_display *dri2_dpy); > > > > -uint32_t > > -dri2_format_for_depth(uint32_t depth); > > - > > #endif > > -- > > 2.17.0 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/53] intel/fs: Add explicit last_rt flag to fb writes orthogonal to eot.
On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand wrote: > From: Francisco Jerez I think some explanation is required. I'm guessing this is because you have to write lo fragments out before high, but we should say that in the commit message.
Re: [Mesa-dev] [PATCH 08/53] intel/fs: FS_OPCODE_REP_FB_WRITE has side effects
On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand wrote: > It doesn't matter since we don't ever run replicated write shaders > through the optimizer but it's good to be complete. Aside: Is there anything that would prevent us from detecting that all fragments are uniform and using this message?
Re: [Mesa-dev] [PATCH 09/53] intel/fs: Fix Gen4-5 FB write AA data payload munging for non-EOT writes.
On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand wrote: > From: Francisco Jerez Okay, I think the problem this patch is fixing is that previously we would unconditionally execute the fire_fb_write() to send the AA data, and conditionally execute the fire_fb_write() that does not. But we actually want to send one or the other, and never both. With that explanation in the commit message, Reviewed-by: Matt Turner
Re: [Mesa-dev] [PATCH 15/53] intel/fs: Set up FB write message headers in the visitor
On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand wrote: > Doing instruction header setup in the generator is aweful for a number Misspelling: awful
Re: [Mesa-dev] [PATCH 00/53] intel/fs: SIMD32 support for fragment shaders
On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand wrote: > This patch series adds back-end compiler support for SIMD32 fragment > shaders. Support is added and everything works but it's currently hidden > behind INTEL_DEBUG=do32. We know that it improves performance in some > cases but we do not yet have a good enough heuristic to start turning it on > by default. The objective of this series is just to get the compiler > infrastructure landed so that it stops bit-rotting in Curro's branch. > Figuring out a good heuristic is left as an exercise to the reader. :-) 1-6, 8-20 are Reviewed-by: Matt Turner
Re: [Mesa-dev] [PATCH] i965: Fix ETC2/EAC GetCompressed* functions on Gen7 GPUs
On Friday, February 23, 2018 7:10:55 AM PDT Eleni Maria Stea wrote: > Gen 7 GPUs store the compressed EAC/ETC2 images in other non-compressed > formats that can render. When GetCompressed* functions are called, the > pixels are returned in the non-compressed format that is used for the > rendering. > > With this patch we store both the compressed and non-compressed versions > of the image, so that both rendering commands and GetCompressed* > commands work. > > Also, the assertions for GL_MAP_WRITE_BIT and GL_MAP_INVALIDATE_RANGE_BIT > in intel_miptree_map_etc function have been removed because when the > miptree is mapped for reading (for example from a GetCompress* > function) the GL_MAP_WRITE_BIT won't be set (and shouldn't be set). > --- > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 10 +- > src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 14 +++ > src/mesa/drivers/dri/i965/intel_tex.c | 157 > +- > src/mesa/drivers/dri/i965/intel_tex.h | 8 ++ > src/mesa/drivers/dri/i965/intel_tex_image.c | 93 ++- > src/mesa/drivers/dri/i965/intel_tex_obj.h | 8 ++ > 6 files changed, 256 insertions(+), 34 deletions(-) Hello, I think this patch could probably be simplified a bit, with less duplication of core Mesa stuff...some suggestions below... > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > index 22977d6659..c8c7c025b6 100644 > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > @@ -730,9 +730,10 @@ miptree_create(struct brw_context *brw, > mesa_format etc_format = MESA_FORMAT_NONE; > uint32_t alloc_flags = 0; > > - format = intel_lower_compressed_format(brw, format); > - > - etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE; > + if (!(flags & MIPTREE_CREATE_ETC)) { > + format = intel_lower_compressed_format(brw, format); > + etc_format = (format != tex_format) ? 
tex_format : MESA_FORMAT_NONE; > + } > > if (flags & MIPTREE_CREATE_BUSY) >alloc_flags |= BO_ALLOC_BUSY; > @@ -3314,9 +3315,6 @@ intel_miptree_map_etc(struct brw_context *brw, >assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM); > } > > - assert(map->mode & GL_MAP_WRITE_BIT); > - assert(map->mode & GL_MAP_INVALIDATE_RANGE_BIT); > - > map->stride = _mesa_format_row_stride(mt->etc_format, map->w); > map->buffer = malloc(_mesa_format_image_size(mt->etc_format, > map->w, map->h, 1)); > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h > index 7fcf09f118..bf6195b97a 100644 > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h > @@ -379,6 +379,20 @@ enum intel_miptree_create_flags { > * that the miptree will be created with mt->aux_usage == NONE. > */ > MIPTREE_CREATE_NO_AUX = 1 << 2, > + > + /** Create a second miptree for the compressed pixels (Gen7 only) > +* > +* On Gen7, we need to store 2 miptrees for some compressed > +* formats so we can handle rendering as well as getting the > +* compressed image data. This flag indicates that the miptree > +* is expected to hold compressed data for the latter case. > +*/ > + MIPTREE_CREATE_ETC = 1 << 3, > +}; Create flags look fine. > + > +enum intel_miptree_upload_flags { > + MIPTREE_UPLOAD_DEFAULT = 0, > + MIPTREE_UPLOAD_ETC, > }; Rather than creating an extra set of flags here, I would just extend the GL_MAP_*_BIT flags that already get passed around as 'mode'. There's some precedent for that with BRW_MAP_DIRECT_BIT in intel_mipmap_tree.h: /** * This bit extends the set of GL_MAP_*_BIT enums. * * When calling intel_miptree_map() on an ETC-transcoded-to-RGB miptree or a * depthstencil-split-to-separate-stencil miptree, we'll normally make a * temporary and recreate the kind of data requested by Mesa core, since we're * satisfying some glGetTexImage() request or something. 
* * However, occasionally you want to actually map the miptree's current data * without transcoding back. This flag to intel_miptree_map() gets you that. */ #define BRW_MAP_DIRECT_BIT 0x8000 So, I'd just make a BRW_MAP_ETC_BIT 0x4000, and use that instead. The advantage is that you should be able to reuse existing functions rather than creating new ones that take an extra 'flags' parameter. > > struct intel_mipmap_tree *intel_miptree_create(struct brw_context *brw, > diff --git a/src/mesa/drivers/dri/i965/intel_tex.c > b/src/mesa/drivers/dri/i965/intel_tex.c > index 65a1cb37d4..56077a7676 100644 > --- a/src/mesa/drivers/dri/i965/intel_tex.c > +++ b/src/mesa/drivers/dri/i965/intel_tex.c > @@ -66,6 +66,8 @@ intel_alloc_texture_image_buffer(struct gl_context *ctx, > struct intel_texture_image *intel_image = intel_texture
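The BRW_MAP_ETC_BIT suggestion above amounts to reserving one more driver-private bit next to the GL_MAP_*_BIT access flags, so that the existing 'mode' argument can carry the request and no separate 'flags' parameter is needed. A minimal sketch, assuming the bit values exactly as quoted in this mail (0x8000 and the proposed 0x4000); map_wants_etc() and gl_access_bits() are hypothetical helpers for illustration, not functions from the driver:

```c
#include <assert.h>
#include <stdbool.h>

/* Standard GL access bits (values from the GL headers). */
#define GL_MAP_READ_BIT    0x0001
#define GL_MAP_WRITE_BIT   0x0002

/* Driver-private bits that extend the GL set; values as quoted in the mail. */
#define BRW_MAP_DIRECT_BIT 0x8000
#define BRW_MAP_ETC_BIT    0x4000   /* proposed in the review, hypothetical */

/* True if the caller asked for the compressed (ETC) copy of the data. */
static bool
map_wants_etc(unsigned mode)
{
   /* The private bits must stay distinct from each other. */
   assert((BRW_MAP_ETC_BIT & BRW_MAP_DIRECT_BIT) == 0);
   return (mode & BRW_MAP_ETC_BIT) != 0;
}

/* Strip the private bits before handing 'mode' to code that only
 * understands the GL-defined flags. */
static unsigned
gl_access_bits(unsigned mode)
{
   return mode & ~(unsigned)(BRW_MAP_DIRECT_BIT | BRW_MAP_ETC_BIT);
}
```

Because the request travels inside 'mode', a function that already takes the map mode can branch on map_wants_etc(mode) without changing its signature.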
Re: [Mesa-dev] [PATCH 10/13] i965/miptree: Use cpu tiling/detiling when mapping
On Monday, April 30, 2018 4:38:57 PM PDT Scott D Phillips wrote: > Kenneth Graunke writes: > > > On Monday, April 30, 2018 10:25:49 AM PDT Scott D Phillips wrote: > >> Rename the (un)map_gtt functions to (un)map_map (map by > >> returning a map) and add new functions (un)map_tiled_memcpy that > >> return a shadow buffer populated with the intel_tiled_memcpy > >> functions. > >> > >> Tiling/detiling with the cpu will be the only way to handle Yf/Ys > >> tiling, when support is added for those formats. > >> > >> v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson) > >> > >> v3: Add units to parameter names of tile_extents (Nanley Chery) > >> Use _mesa_align_malloc for the shadow copy (Nanley) > >> Continue using gtt maps on gen4 (Nanley) > >> > >> v4: Use streaming_load_memcpy when detiling > >> > >> Reviewed-by: Chris Wilson > >> --- > >> src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 98 > >> +-- > >> 1 file changed, 94 insertions(+), 4 deletions(-) > >> > >> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > >> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > >> index b9a564552df..498eebd2f86 100644 > >> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > >> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > >> @@ -31,6 +31,7 @@ > >> #include "intel_image.h" > >> #include "intel_mipmap_tree.h" > >> #include "intel_tex.h" > >> +#include "intel_tiled_memcpy.h" > >> #include "intel_blit.h" > >> #include "intel_fbo.h" > >> > >> @@ -3066,7 +3067,7 @@ intel_miptree_unmap_raw(struct intel_mipmap_tree *mt) > >> } > >> > >> static void > >> -intel_miptree_unmap_gtt(struct brw_context *brw, > >> +intel_miptree_unmap_map(struct brw_context *brw, > >> struct intel_mipmap_tree *mt, > >> struct intel_miptree_map *map, > >> unsigned int level, unsigned int slice) > >> @@ -3075,7 +3076,7 @@ intel_miptree_unmap_gtt(struct brw_context *brw, > >> } > >> > >> static void > >> -intel_miptree_map_gtt(struct brw_context *brw, > >> 
+intel_miptree_map_map(struct brw_context *brw, > >> struct intel_mipmap_tree *mt, > >> struct intel_miptree_map *map, > >> unsigned int level, unsigned int slice) > >> @@ -3120,7 +3121,7 @@ intel_miptree_map_gtt(struct brw_context *brw, > >> mt, _mesa_get_format_name(mt->format), > >> x, y, map->ptr, map->stride); > >> > >> - map->unmap = intel_miptree_unmap_gtt; > >> + map->unmap = intel_miptree_unmap_map; > >> } > >> > >> static void > >> @@ -3145,6 +3146,90 @@ intel_miptree_unmap_blit(struct brw_context *brw, > >> intel_miptree_release(&map->linear_mt); > >> } > >> > >> +/* Compute extent parameters for use with tiled_memcpy functions. > >> + * xs are in units of bytes and ys are in units of strides. */ > >> +static inline void > >> +tile_extents(struct intel_mipmap_tree *mt, struct intel_miptree_map *map, > >> + unsigned int level, unsigned int slice, unsigned int *x1_B, > >> + unsigned int *x2_B, unsigned int *y1_el, unsigned int *y2_el) > >> +{ > >> + unsigned int block_width, block_height; > >> + unsigned int x0_el, y0_el; > >> + > >> + _mesa_get_format_block_size(mt->format, &block_width, &block_height); > >> + > >> + assert(map->x % block_width == 0); > >> + assert(map->y % block_height == 0); > >> + > >> + intel_miptree_get_image_offset(mt, level, slice, &x0_el, &y0_el); > >> + *x1_B = (map->x / block_width + x0_el) * mt->cpp; > >> + *y1_el = map->y / block_height + y0_el; > >> + *x2_B = (DIV_ROUND_UP(map->x + map->w, block_width) + x0_el) * mt->cpp; > >> + *y2_el = DIV_ROUND_UP(map->y + map->h, block_height) + y0_el; > >> +} > >> + > >> +static void > >> +intel_miptree_unmap_tiled_memcpy(struct brw_context *brw, > >> + struct intel_mipmap_tree *mt, > >> + struct intel_miptree_map *map, > >> + unsigned int level, > >> + unsigned int slice) > >> +{ > >> + if (map->mode & GL_MAP_WRITE_BIT) { > >> + unsigned int x1, x2, y1, y2; > >> + tile_extents(mt, map, level, slice, &x1, &x2, &y1, &y2); > >> + > >> + char *dst = intel_miptree_map_raw(brw, mt, map->mode 
| MAP_RAW); > >> + dst += mt->offset; > >> + > >> + linear_to_tiled(x1, x2, y1, y2, dst, map->ptr, mt->surf.row_pitch, > >> + map->stride, brw->has_swizzling, mt->surf.tiling, > >> memcpy); > >> + > >> + intel_miptree_unmap_raw(mt); > >> + } > >> + _mesa_align_free(map->buffer); > >> + map->buffer = map->ptr = NULL; > >> +} > >> + > >> +static void > >> +intel_miptree_map_tiled_memcpy(struct brw_context *brw, > >> + struct intel_mipmap_tree *mt, > >> + struct intel_miptree_map *map, > >> +
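For readers following the tile_extents() arithmetic quoted above, here is a standalone sketch of the same computation with plain integers in place of the miptree and map structs. The struct extents type and the tile_extents_sketch() name are made up for this illustration; only the formulas are taken from the quoted patch.

```c
#include <assert.h>

#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* x values are in bytes, y values are in rows of blocks ("strides"). */
struct extents {
   unsigned x1_B, x2_B;
   unsigned y1_el, y2_el;
};

static struct extents
tile_extents_sketch(unsigned map_x, unsigned map_y,
                    unsigned map_w, unsigned map_h,
                    unsigned block_w, unsigned block_h,
                    unsigned x0_el, unsigned y0_el, unsigned cpp)
{
   struct extents e;

   /* The mapped region must start on a block boundary. */
   assert(map_x % block_w == 0);
   assert(map_y % block_h == 0);

   /* Start: convert pixels to blocks, add the slice offset, scale x to bytes. */
   e.x1_B = (map_x / block_w + x0_el) * cpp;
   e.y1_el = map_y / block_h + y0_el;

   /* End: round up so a partially covered block is still included. */
   e.x2_B = (DIV_ROUND_UP(map_x + map_w, block_w) + x0_el) * cpp;
   e.y2_el = DIV_ROUND_UP(map_y + map_h, block_h) + y0_el;
   return e;
}
```

With 4x4 blocks of 8 bytes each, a slice offset of (2, 3) blocks and a 6x5 pixel map at (4, 8), this gives x from byte 24 to byte 40 and y from row 5 to row 7.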
Re: [Mesa-dev] [PATCH 07/53] intel/fs: Add explicit last_rt flag to fb writes orthogonal to eot.
On Fri, May 25, 2018 at 11:27 AM, Matt Turner wrote:
> On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand wrote:
> > From: Francisco Jerez
>
> I think some explanation is required. I'm guessing this is because you
> have to write lo fragments out before high, but we should say that in
> the commit message.

How about this:

When using multiple RT write messages to the same RT such as for dual-source
blending or all RT writes in SIMD32, we have to set the "Last Render Target
Select" bit on all write messages that target the last RT but only set EOT
on the last RT write in the shader. Special-casing for dual-source blend
works today because that is the only case which requires multiple RT write
messages per RT. When we start doing SIMD32, this will become much more
common so we add a dedicated bit for it.
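The rule in the proposed commit message can be written down as a small sketch. The struct fb_write type and flag_fb_writes() below are hypothetical and are not the compiler's actual IR; the sketch also assumes the writes are already ordered so that the final message targets the last render target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One render target write message (hypothetical, for illustration only). */
struct fb_write {
   unsigned target;   /* index of the render target being written */
   bool last_rt;      /* "Last Render Target Select" */
   bool eot;          /* end of thread */
};

static void
flag_fb_writes(struct fb_write *writes, size_t count, unsigned num_rts)
{
   assert(num_rts > 0);
   for (size_t i = 0; i < count; i++) {
      /* Every message aimed at the last RT carries last_rt... */
      writes[i].last_rt = (writes[i].target == num_rts - 1);
      /* ...but only the very last message in the shader carries EOT. */
      writes[i].eot = (i == count - 1);
   }
}
```

With two render targets and two messages per target (the SIMD32 case), both messages to RT 1 get last_rt while only the second one gets eot, which is why a single combined flag is not enough.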
Re: [Mesa-dev] [PATCH 08/53] intel/fs: FS_OPCODE_REP_FB_WRITE has side effects
On Fri, May 25, 2018 at 11:29 AM, Matt Turner wrote:
> On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand wrote:
> > It doesn't matter since we don't ever run replicated write shaders
> > through the optimizer but it's good to be complete.
>
> Aside: Is there anything that would prevent us from detecting that all
> fragments are uniform and using this message?

We've considered that in the past. Unfortunately, it also has other
restrictions, such as not allowing color masking, so we'd have to put more
stuff in the shader key. It could be done though.

--Jason
[Mesa-dev] [PATCH] intel/blorp: Support blits and clears on surfaces with offsets
For certain EGLImage cases, we represent a single slice or LOD of an image with a byte offset to a tile and X/Y intratile offsets to the given slice. Most of i965 is fine with this but it breaks blorp. This is a terrible way to represent slices of a surface in EGL and we should stop some day but that's a very scary and thorny path. This gets blorp to start working with those surfaces and fixes some dEQP EGL test bugs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629 Cc: mesa-sta...@lists.freedesktop.org --- src/intel/blorp/blorp.c | 22 ++ src/intel/blorp/blorp.h | 3 +++ src/intel/blorp/blorp_blit.c | 4 +++- src/intel/blorp/blorp_clear.c | 9 + src/mesa/drivers/dri/i965/brw_blorp.c | 2 ++ 5 files changed, 39 insertions(+), 1 deletion(-) diff --git a/src/intel/blorp/blorp.c b/src/intel/blorp/blorp.c index e348caf..73f8c67 100644 --- a/src/intel/blorp/blorp.c +++ b/src/intel/blorp/blorp.c @@ -137,6 +137,28 @@ brw_blorp_surface_info_init(struct blorp_context *blorp, */ if (is_render_target && blorp->isl_dev->info->gen <= 6) info->view.array_len = MIN2(info->view.array_len, 512); + + if (surf->tile_x_sa || surf->tile_y_sa) { + /* This is only allowed on simple 2D surfaces without MSAA */ + assert(info->surf.dim == ISL_SURF_DIM_2D); + assert(info->surf.samples == 1); + assert(info->surf.levels == 1); + assert(info->surf.logical_level0_px.array_len == 1); + assert(info->aux_usage == ISL_AUX_USAGE_NONE); + + info->tile_x_sa = surf->tile_x_sa; + info->tile_y_sa = surf->tile_y_sa; + + /* Instead of using the X/Y Offset fields in RENDER_SURFACE_STATE, we + * place the image at the tile boundary and offset our sampling or + * rendering. For this reason, we need to grow the image by the offset + * to ensure that the hardware doesn't think we've gone past the edge. 
+ */ + info->surf.logical_level0_px.w += surf->tile_x_sa; + info->surf.logical_level0_px.h += surf->tile_y_sa; + info->surf.phys_level0_sa.w += surf->tile_x_sa; + info->surf.phys_level0_sa.h += surf->tile_y_sa; + } } diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h index f22110b..0a10ff9 100644 --- a/src/intel/blorp/blorp.h +++ b/src/intel/blorp/blorp.h @@ -114,6 +114,9 @@ struct blorp_surf * that it contains a swizzle of RGBA and resource min LOD of 0. */ struct blorp_address clear_color_addr; + + /* Only allowed for simple 2D non-MSAA surfaces */ + uint32_t tile_x_sa, tile_y_sa; }; void diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c index 67d4266..68e6d4e 100644 --- a/src/intel/blorp/blorp_blit.c +++ b/src/intel/blorp/blorp_blit.c @@ -2510,7 +2510,9 @@ blorp_copy(struct blorp_batch *batch, dst_layer, ISL_FORMAT_UNSUPPORTED, true); struct brw_blorp_blit_prog_key wm_prog_key = { - .shader_type = BLORP_SHADER_TYPE_BLIT + .shader_type = BLORP_SHADER_TYPE_BLIT, + .need_src_offset = src_surf->tile_x_sa || src_surf->tile_y_sa, + .need_dst_offset = dst_surf->tile_x_sa || dst_surf->tile_y_sa, }; const struct isl_format_layout *src_fmtl = diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c index 832e8ee..4d3125a 100644 --- a/src/intel/blorp/blorp_clear.c +++ b/src/intel/blorp/blorp_clear.c @@ -438,6 +438,15 @@ blorp_clear(struct blorp_batch *batch, params.x1 = x1; params.y1 = y1; + if (params.dst.tile_x_sa || params.dst.tile_y_sa) { + assert(params.dst.surf.samples == 1); + assert(num_layers == 1); + params.x0 += params.dst.tile_x_sa; + params.y0 += params.dst.tile_y_sa; + params.x1 += params.dst.tile_x_sa; + params.y1 += params.dst.tile_y_sa; + } + /* The MinLOD and MinimumArrayElement don't work properly for cube maps. * Convert them to a single slice on gen4. 
*/ diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c index d7a2cb2..8c6d77e 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp.c +++ b/src/mesa/drivers/dri/i965/brw_blorp.c @@ -152,6 +152,8 @@ blorp_surf_for_miptree(struct brw_context *brw, .mocs = brw_get_bo_mocs(devinfo, mt->bo), }, .aux_usage = aux_usage, + .tile_x_sa = mt->level[*level].level_x, + .tile_y_sa = mt->level[*level].level_y, }; if (mt->format == MESA_FORMAT_S_UINT8 && is_render_target && -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
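The offset handling in this patch boils down to two adjustments: grow the surface by the intratile offset, and shift the rectangle being cleared or copied by the same amount. A minimal sketch of that idea follows; struct surf_size, struct rect and apply_intratile_offset() are hypothetical names for illustration and do not exist in blorp.

```c
#include <assert.h>

struct surf_size { unsigned w, h; };
struct rect { unsigned x0, y0, x1, y1; };

/* Place the image at the tile boundary and move the operation instead:
 * the surface grows by the offset so the shifted rectangle still fits. */
static void
apply_intratile_offset(struct surf_size *size, struct rect *r,
                       unsigned tile_x, unsigned tile_y)
{
   size->w += tile_x;
   size->h += tile_y;

   r->x0 += tile_x;
   r->y0 += tile_y;
   r->x1 += tile_x;
   r->y1 += tile_y;

   /* The shifted rectangle must stay inside the enlarged surface. */
   assert(r->x1 <= size->w);
   assert(r->y1 <= size->h);
}
```

A full clear of a 64x32 image that sits at offset (16, 8) within its tile becomes a clear of (16, 8)-(80, 40) on an 80x40 surface.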
Re: [Mesa-dev] [PATCH] intel/blorp: Support blits and clears on surfaces with offsets
On Friday, May 25, 2018 12:31:03 PM PDT Jason Ekstrand wrote: > For certain EGLImage cases, we represent a single slice or LOD of an > image with a byte offset to a tile and X/Y intratile offsets to the > given slice. Most of i965 is fine with this but it breaks blorp. This > is a terrible way to represent slices of a surface in EGL and we should > stop some day but that's a very scary and thorny path. This gets blorp > to start working with those surfaces and fixes some dEQP EGL test bugs. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629 > Cc: mesa-sta...@lists.freedesktop.org > --- > src/intel/blorp/blorp.c | 22 ++ > src/intel/blorp/blorp.h | 3 +++ > src/intel/blorp/blorp_blit.c | 4 +++- > src/intel/blorp/blorp_clear.c | 9 + > src/mesa/drivers/dri/i965/brw_blorp.c | 2 ++ > 5 files changed, 39 insertions(+), 1 deletion(-) > > diff --git a/src/intel/blorp/blorp.c b/src/intel/blorp/blorp.c > index e348caf..73f8c67 100644 > --- a/src/intel/blorp/blorp.c > +++ b/src/intel/blorp/blorp.c > @@ -137,6 +137,28 @@ brw_blorp_surface_info_init(struct blorp_context *blorp, > */ > if (is_render_target && blorp->isl_dev->info->gen <= 6) >info->view.array_len = MIN2(info->view.array_len, 512); > + > + if (surf->tile_x_sa || surf->tile_y_sa) { > + /* This is only allowed on simple 2D surfaces without MSAA */ > + assert(info->surf.dim == ISL_SURF_DIM_2D); > + assert(info->surf.samples == 1); > + assert(info->surf.levels == 1); > + assert(info->surf.logical_level0_px.array_len == 1); > + assert(info->aux_usage == ISL_AUX_USAGE_NONE); > + > + info->tile_x_sa = surf->tile_x_sa; > + info->tile_y_sa = surf->tile_y_sa; > + > + /* Instead of using the X/Y Offset fields in RENDER_SURFACE_STATE, we > + * place the image at the tile boundary and offset our sampling or > + * rendering. For this reason, we need to grow the image by the offset > + * to ensure that the hardware doesn't think we've gone past the edge. 
> + */ > + info->surf.logical_level0_px.w += surf->tile_x_sa; > + info->surf.logical_level0_px.h += surf->tile_y_sa; > + info->surf.phys_level0_sa.w += surf->tile_x_sa; > + info->surf.phys_level0_sa.h += surf->tile_y_sa; > + } > } > > > diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h > index f22110b..0a10ff9 100644 > --- a/src/intel/blorp/blorp.h > +++ b/src/intel/blorp/blorp.h > @@ -114,6 +114,9 @@ struct blorp_surf > * that it contains a swizzle of RGBA and resource min LOD of 0. > */ > struct blorp_address clear_color_addr; > + > + /* Only allowed for simple 2D non-MSAA surfaces */ > + uint32_t tile_x_sa, tile_y_sa; > }; > > void > diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c > index 67d4266..68e6d4e 100644 > --- a/src/intel/blorp/blorp_blit.c > +++ b/src/intel/blorp/blorp_blit.c > @@ -2510,7 +2510,9 @@ blorp_copy(struct blorp_batch *batch, > dst_layer, ISL_FORMAT_UNSUPPORTED, true); > > struct brw_blorp_blit_prog_key wm_prog_key = { > - .shader_type = BLORP_SHADER_TYPE_BLIT > + .shader_type = BLORP_SHADER_TYPE_BLIT, > + .need_src_offset = src_surf->tile_x_sa || src_surf->tile_y_sa, > + .need_dst_offset = dst_surf->tile_x_sa || dst_surf->tile_y_sa, > }; > > const struct isl_format_layout *src_fmtl = > diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c > index 832e8ee..4d3125a 100644 > --- a/src/intel/blorp/blorp_clear.c > +++ b/src/intel/blorp/blorp_clear.c > @@ -438,6 +438,15 @@ blorp_clear(struct blorp_batch *batch, >params.x1 = x1; >params.y1 = y1; > > + if (params.dst.tile_x_sa || params.dst.tile_y_sa) { > + assert(params.dst.surf.samples == 1); > + assert(num_layers == 1); > + params.x0 += params.dst.tile_x_sa; > + params.y0 += params.dst.tile_y_sa; > + params.x1 += params.dst.tile_x_sa; > + params.y1 += params.dst.tile_y_sa; > + } > + >/* The MinLOD and MinimumArrayElement don't work properly for cube > maps. > * Convert them to a single slice on gen4. 
> */ > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c > b/src/mesa/drivers/dri/i965/brw_blorp.c > index d7a2cb2..8c6d77e 100644 > --- a/src/mesa/drivers/dri/i965/brw_blorp.c > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c > @@ -152,6 +152,8 @@ blorp_surf_for_miptree(struct brw_context *brw, > .mocs = brw_get_bo_mocs(devinfo, mt->bo), >}, >.aux_usage = aux_usage, > + .tile_x_sa = mt->level[*level].level_x, > + .tile_y_sa = mt->level[*level].level_y, > }; > > if (mt->format == MESA_FORMAT_S_UINT8 && is_render_target && > Hopefully we don't run afoul of surface width/height limits. Probably won't, hard to imagine offsetting into something that's alr
Re: [Mesa-dev] [PATCH 07/53] intel/fs: Add explicit last_rt flag to fb writes orthogonal to eot.
On Fri, May 25, 2018 at 12:14 PM, Jason Ekstrand wrote:
> On Fri, May 25, 2018 at 11:27 AM, Matt Turner wrote:
>>
>> On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand wrote:
>> > From: Francisco Jerez
>>
>> I think some explanation is required. I'm guessing this is because you
>> have to write lo fragments out before high, but we should say that in
>> the commit message.
>
> How about this:
>
> When using multiple RT write messages to the same RT such as for dual-source
> blending or all RT writes in SIMD32, we have to set the "Last Render Target
> Select" bit on all write messages that target the last RT but only set EOT
> on the last RT write in the shader. Special-casing for dual-source blend
> works today because that is the only case which requires multiple RT write
> messages per RT. When we start doing SIMD32, this will become much more
> common so we add a dedicated bit for it.

Sounds good to me.

Reviewed-by: Matt Turner
Re: [Mesa-dev] [PATCH] intel/blorp: Support blits and clears on surfaces with offsets
On Fri, May 25, 2018 at 12:53 PM, Kenneth Graunke wrote: > On Friday, May 25, 2018 12:31:03 PM PDT Jason Ekstrand wrote: > > For certain EGLImage cases, we represent a single slice or LOD of an > > image with a byte offset to a tile and X/Y intratile offsets to the > > given slice. Most of i965 is fine with this but it breaks blorp. This > > is a terrible way to represent slices of a surface in EGL and we should > > stop some day but that's a very scary and thorny path. This gets blorp > > to start working with those surfaces and fixes some dEQP EGL test bugs. > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629 > > Cc: mesa-sta...@lists.freedesktop.org > > --- > > src/intel/blorp/blorp.c | 22 ++ > > src/intel/blorp/blorp.h | 3 +++ > > src/intel/blorp/blorp_blit.c | 4 +++- > > src/intel/blorp/blorp_clear.c | 9 + > > src/mesa/drivers/dri/i965/brw_blorp.c | 2 ++ > > 5 files changed, 39 insertions(+), 1 deletion(-) > > > > diff --git a/src/intel/blorp/blorp.c b/src/intel/blorp/blorp.c > > index e348caf..73f8c67 100644 > > --- a/src/intel/blorp/blorp.c > > +++ b/src/intel/blorp/blorp.c > > @@ -137,6 +137,28 @@ brw_blorp_surface_info_init(struct blorp_context > *blorp, > > */ > > if (is_render_target && blorp->isl_dev->info->gen <= 6) > >info->view.array_len = MIN2(info->view.array_len, 512); > > + > > + if (surf->tile_x_sa || surf->tile_y_sa) { > > + /* This is only allowed on simple 2D surfaces without MSAA */ > > + assert(info->surf.dim == ISL_SURF_DIM_2D); > > + assert(info->surf.samples == 1); > > + assert(info->surf.levels == 1); > > + assert(info->surf.logical_level0_px.array_len == 1); > > + assert(info->aux_usage == ISL_AUX_USAGE_NONE); > > + > > + info->tile_x_sa = surf->tile_x_sa; > > + info->tile_y_sa = surf->tile_y_sa; > > + > > + /* Instead of using the X/Y Offset fields in > RENDER_SURFACE_STATE, we > > + * place the image at the tile boundary and offset our sampling or > > + * rendering. 
For this reason, we need to grow the image by the > offset > > + * to ensure that the hardware doesn't think we've gone past the > edge. > > + */ > > + info->surf.logical_level0_px.w += surf->tile_x_sa; > > + info->surf.logical_level0_px.h += surf->tile_y_sa; > > + info->surf.phys_level0_sa.w += surf->tile_x_sa; > > + info->surf.phys_level0_sa.h += surf->tile_y_sa; > > + } > > } > > > > > > diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h > > index f22110b..0a10ff9 100644 > > --- a/src/intel/blorp/blorp.h > > +++ b/src/intel/blorp/blorp.h > > @@ -114,6 +114,9 @@ struct blorp_surf > > * that it contains a swizzle of RGBA and resource min LOD of 0. > > */ > > struct blorp_address clear_color_addr; > > + > > + /* Only allowed for simple 2D non-MSAA surfaces */ > > + uint32_t tile_x_sa, tile_y_sa; > > }; > > > > void > > diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c > > index 67d4266..68e6d4e 100644 > > --- a/src/intel/blorp/blorp_blit.c > > +++ b/src/intel/blorp/blorp_blit.c > > @@ -2510,7 +2510,9 @@ blorp_copy(struct blorp_batch *batch, > > dst_layer, ISL_FORMAT_UNSUPPORTED, true); > > > > struct brw_blorp_blit_prog_key wm_prog_key = { > > - .shader_type = BLORP_SHADER_TYPE_BLIT > > + .shader_type = BLORP_SHADER_TYPE_BLIT, > > + .need_src_offset = src_surf->tile_x_sa || src_surf->tile_y_sa, > > + .need_dst_offset = dst_surf->tile_x_sa || dst_surf->tile_y_sa, > > }; > > > > const struct isl_format_layout *src_fmtl = > > diff --git a/src/intel/blorp/blorp_clear.c > b/src/intel/blorp/blorp_clear.c > > index 832e8ee..4d3125a 100644 > > --- a/src/intel/blorp/blorp_clear.c > > +++ b/src/intel/blorp/blorp_clear.c > > @@ -438,6 +438,15 @@ blorp_clear(struct blorp_batch *batch, > >params.x1 = x1; > >params.y1 = y1; > > > > + if (params.dst.tile_x_sa || params.dst.tile_y_sa) { > > + assert(params.dst.surf.samples == 1); > > + assert(num_layers == 1); > > + params.x0 += params.dst.tile_x_sa; > > + params.y0 += params.dst.tile_y_sa; > > + 
params.x1 += params.dst.tile_x_sa; > > + params.y1 += params.dst.tile_y_sa; > > + } > > + > >/* The MinLOD and MinimumArrayElement don't work properly for > cube maps. > > * Convert them to a single slice on gen4. > > */ > > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c > b/src/mesa/drivers/dri/i965/brw_blorp.c > > index d7a2cb2..8c6d77e 100644 > > --- a/src/mesa/drivers/dri/i965/brw_blorp.c > > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c > > @@ -152,6 +152,8 @@ blorp_surf_for_miptree(struct brw_context *brw, > > .mocs = brw_get_bo_mocs(devinfo, mt->bo), > >}, > >.aux_usage = aux_usage, > > + .tile_x_sa = mt->level[*level].l
[Mesa-dev] [PATCH 1/2] mesa: handle GL_UNSIGNED_INT64_ARB in _mesa_bytes_per_vertex_attrib
From: Marek Olšák Bindless texture handles can be passed via vertex attribs using this type. This fixes a bunch of bindless piglit tests on radeonsi. Cc: 18.0 18.1 --- src/mesa/main/glformats.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c index cba5e670db0..667020c193c 100644 --- a/src/mesa/main/glformats.c +++ b/src/mesa/main/glformats.c @@ -556,20 +556,22 @@ _mesa_bytes_per_vertex_attrib(GLint comps, GLenum type) case GL_UNSIGNED_INT_2_10_10_10_REV: if (comps == 4) return sizeof(GLuint); else return -1; case GL_UNSIGNED_INT_10F_11F_11F_REV: if (comps == 3) return sizeof(GLuint); else return -1; + case GL_UNSIGNED_INT64_ARB: + return comps * 8; default: return -1; } } /** * Test if the given format is unsized. */ GLboolean _mesa_is_enum_format_unsized(GLenum format) -- 2.17.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] st/mesa: handle GL_UNSIGNED_INT64_ARB in st_pipe_vertex_format
From: Marek Olšák Bindless texture handles can be passed via vertex attribs using this type. This fixes a bunch of bindless piglit tests on radeonsi. Cc: 18.0 18.1 --- src/mesa/state_tracker/st_atom_array.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/state_tracker/st_atom_array.c b/src/mesa/state_tracker/st_atom_array.c index 9a0935e21a5..76dc81975c8 100644 --- a/src/mesa/state_tracker/st_atom_array.c +++ b/src/mesa/state_tracker/st_atom_array.c @@ -292,20 +292,23 @@ st_pipe_vertex_format(const struct gl_array_attributes *attrib) assert(size == 3 && !integer && format == GL_RGBA); return PIPE_FORMAT_R11G11B10_FLOAT; case GL_UNSIGNED_BYTE: if (format == GL_BGRA) { /* this is an odd-ball case */ assert(normalized); return PIPE_FORMAT_B8G8R8A8_UNORM; } break; + + case GL_UNSIGNED_INT64_ARB: + return PIPE_FORMAT_R32G32_UINT; } index = integer*2 + normalized; assert(index <= 2); assert(type >= GL_BYTE && type <= GL_FIXED); return vertex_formats[type - GL_BYTE][index][size-1]; } static void init_velement(struct pipe_vertex_element *velement, int src_offset, int format, -- 2.17.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
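Taken together, the two patches treat each GL_UNSIGNED_INT64_ARB component as 8 bytes and hand it to the driver as a pair of 32-bit unsigned channels. A small sketch of that view of the data, with hypothetical helper names (bytes_per_handle_attrib, handle_to_words, words_to_handle) that are not part of Mesa:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Size in bytes of a vertex attribute made of 64-bit handles. */
static int
bytes_per_handle_attrib(int comps)
{
   assert(comps >= 1 && comps <= 4);
   return comps * (int)sizeof(uint64_t);
}

/* View one 64-bit handle as two 32-bit words, in memory order. */
static void
handle_to_words(uint64_t handle, uint32_t words[2])
{
   memcpy(words, &handle, sizeof(handle));
}

/* Reassemble the handle from the two words, again in memory order. */
static uint64_t
words_to_handle(const uint32_t words[2])
{
   uint64_t handle;
   memcpy(&handle, words, sizeof(handle));
   return handle;
}
```

The round trip through two 32-bit words preserves the handle bit for bit, which is all that fetching it through a two-channel 32-bit format relies on.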
Re: [Mesa-dev] [PATCH 22/53] intel/fs: Disable SIMD32 dispatch on Gen4-6 with control flow
On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand wrote:
> From: Francisco Jerez
>
> The hardware's control flow logic is 16-wide so we're out of luck
> here. We could, in theory, support SIMD32 if we know the control-flow
> is uniform but we don't have that information at this point.

This is what the "fork" instruction is for on Gen6 :)
Re: [Mesa-dev] [PATCH 28/53] intel/fs: Fix logical FB write lowering for SIMD32
On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand wrote:
> From: Francisco Jerez
>

Presumably Jason already reviewed this and just missed attaching his R-b tag.
Re: [Mesa-dev] [PATCH 30/53] intel/fs: Add the group to the flag subreg number on SNB and older
On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand wrote:
> We want consistent behavior in the meaning of the flag_subreg field
> between SNB and IVB+.
>
> v2 (Jason Ekstrand):
> - Add some extra commentary
>
> Reviewed-by: Jason Ekstrand

Presumably you did not intend to review your own patch :)
Re: [Mesa-dev] [PATCH 32/53] intel/fs: Mark LINTERP opcode as writing accumulator implicitly on pre-Gen7.
On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand wrote:
> From: Francisco Jerez
>
> ---
>  src/intel/compiler/brw_shader.cpp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp
> index 141b64e..61211ef 100644
> --- a/src/intel/compiler/brw_shader.cpp
> +++ b/src/intel/compiler/brw_shader.cpp
> @@ -984,7 +984,8 @@ backend_instruction::writes_accumulator_implicitly(const struct gen_device_info
>     return writes_accumulator ||
>            (devinfo->gen < 6 &&
>             ((opcode >= BRW_OPCODE_ADD && opcode < BRW_OPCODE_NOP) ||
> -            (opcode >= FS_OPCODE_DDX_COARSE && opcode <= FS_OPCODE_LINTERP)));
> +            (opcode >= FS_OPCODE_DDX_COARSE && opcode <= FS_OPCODE_LINTERP))) ||
> +          (devinfo->gen < 7 && opcode == FS_OPCODE_LINTERP);

That's heavy-handed. Won't this prevent the scheduler from reordering
LINTERP instructions, even though we can only run into problems on SIMD32?
Re: [Mesa-dev] [PATCH 00/53] intel/fs: SIMD32 support for fragment shaders
On Fri, May 25, 2018 at 11:50 AM, Matt Turner wrote:
> On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand wrote:
>> This patch series adds back-end compiler support for SIMD32 fragment
>> shaders. Support is added and everything works but it's currently hidden
>> behind INTEL_DEBUG=do32. We know that it improves performance in some
>> cases but we do not yet have a good enough heuristic to start turning it on
>> by default. The objective of this series is just to get the compiler
>> infrastructure landed so that it stops bit-rotting in Curro's branch.
>> Figuring out a good heuristic is left as an exercise to the reader. :-)
>
> 1-6, 8-20 are
>
> Reviewed-by: Matt Turner

7, 22-31 are too.
Re: [Mesa-dev] [PATCH 09/13] i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear
Quoting Scott D Phillips (2018-04-30 18:25:48)
> +#if defined(USE_SSE41)
> +static ALWAYS_INLINE void *
> +_memcpy_streaming_load(void *dest, const void *src, size_t count)
> +{
> +   if (count == 16) {
> +      __m128i val = _mm_stream_load_si128((__m128i *)src);
> +      _mm_store_si128((__m128i *)dest, val);
> +      return dest;
> +   } else if (count == 64) {
> +      __m128i val0 = _mm_stream_load_si128(((__m128i *)src) + 0);
> +      __m128i val1 = _mm_stream_load_si128(((__m128i *)src) + 1);
> +      __m128i val2 = _mm_stream_load_si128(((__m128i *)src) + 2);
> +      __m128i val3 = _mm_stream_load_si128(((__m128i *)src) + 3);
> +      _mm_store_si128(((__m128i *)dest) + 0, val0);
> +      _mm_store_si128(((__m128i *)dest) + 1, val1);
> +      _mm_store_si128(((__m128i *)dest) + 2, val2);
> +      _mm_store_si128(((__m128i *)dest) + 3, val3);
> +      return dest;

I didn't spot this before, but we use this to copy from an aligned (tiled)
source to an unaligned user buffer.

s/_mm_store_si128/_mm_storeu_si128/
^ very important :)
-Chris
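The point of the substitution above is that only the source carries an alignment guarantee, so the store has to be the unaligned variant. A minimal sketch of a 16-byte copy under that contract follows. Note the swap: it uses the plain SSE2 aligned load (_mm_load_si128) rather than the SSE4.1 streaming load from the patch, so that it builds without extra compiler flags, and it falls back to memcpy where SSE2 is unavailable. copy16_aligned_src() is a made-up name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Copy 16 bytes from a 16-byte-aligned source to a destination that
 * carries no alignment guarantee (for example a user-supplied buffer). */
static void
copy16_aligned_src(void *dest, const void *src)
{
   /* Only the source has to honor the alignment contract. */
   assert(((uintptr_t)src & 15) == 0);

#if defined(__SSE2__)
   __m128i val = _mm_load_si128((const __m128i *)src);   /* aligned load */
   _mm_storeu_si128((__m128i *)dest, val);                /* unaligned-safe store */
#else
   memcpy(dest, src, 16);                                 /* portable fallback */
#endif
}
```

Using _mm_store_si128 in place of _mm_storeu_si128 here would fault whenever dest is not a multiple of 16, which is the situation the review comment describes.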
[Mesa-dev] [PATCH 4/4] i965: Support packing for intel_tiled_memcpy paths
intel_tiled_memcpy is not restricted to using the same pitch on both the
src/dst buffers, nor does it require row alignment on the user buffer. To
support arbitrary user packing modes, all we need to do is use the core
functions to compute the pixel locations.
---
 src/mesa/drivers/dri/i965/intel_pixel_read.c |  6 ++----
 src/mesa/drivers/dri/i965/intel_tex_image.c  | 17 +++++++----------
 2 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c b/src/mesa/drivers/dri/i965/intel_pixel_read.c
index 57df1178417..e697f63d973 100644
--- a/src/mesa/drivers/dri/i965/intel_pixel_read.c
+++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c
@@ -93,10 +93,6 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx,
     */
    if (pixels == NULL ||
        _mesa_is_bufferobj(pack->BufferObj) ||
-       pack->Alignment > 4 ||
-       pack->SkipPixels > 0 ||
-       pack->SkipRows > 0 ||
-       (pack->RowLength != 0 && pack->RowLength != width) ||
        pack->SwapBytes ||
        pack->LsbFirst ||
        pack->Invert)
@@ -160,6 +156,8 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx,
    xoffset += slice_offset_x;
    yoffset += slice_offset_y;
 
+   pixels = _mesa_image_address(2, pack, pixels, width, height,
+                                format, type, 0, 0, 0);
    dst_pitch = _mesa_image_row_stride(pack, width, format, type);
 
    /* For a window-system renderbuffer, the buffer is actually flipped
diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 5afc8d99462..ebfd6fdd7d4 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -200,10 +200,6 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
         texImage->TexObject->Target == GL_TEXTURE_RECTANGLE) ||
        pixels == NULL ||
        _mesa_is_bufferobj(packing->BufferObj) ||
-       packing->Alignment > 4 ||
-       packing->SkipPixels > 0 ||
-       packing->SkipRows > 0 ||
-       (packing->RowLength != 0 && packing->RowLength != width) ||
        packing->SwapBytes ||
        packing->LsbFirst ||
        packing->Invert)
@@ -244,14 +240,13 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
    if (devinfo->gen < 5 && brw->has_swizzling)
       return false;
 
-   int level = texImage->Level + texImage->TexObject->MinLevel;
-
    /* Since we are going to write raw data to the miptree, we need to resolve
     * any pending fast color clears before we start.
     */
    assert(image->mt->surf.logical_level0_px.depth == 1);
    assert(image->mt->surf.logical_level0_px.array_len == 1);
 
+   int level = texImage->Level + texImage->TexObject->MinLevel;
    intel_miptree_access_raw(brw, image->mt, level, 0, true);
 
    struct brw_bo *bo = image->mt->bo;
@@ -286,6 +281,8 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
    xoffset += level_x;
    yoffset += level_y;
 
+   pixels = _mesa_image_address(dims, packing, pixels, width, height,
+                                format, type, 0, 0, 0);
    uint32_t cpp = _mesa_get_format_bytes(texImage->TexFormat);
 
    linear_to_tiled(
@@ -704,10 +701,6 @@ intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,
         texImage->TexObject->Target == GL_TEXTURE_RECTANGLE) ||
        pixels == NULL ||
        _mesa_is_bufferobj(packing->BufferObj) ||
-       packing->Alignment > 4 ||
-       packing->SkipPixels > 0 ||
-       packing->SkipRows > 0 ||
-       (packing->RowLength != 0 && packing->RowLength != width) ||
        packing->SwapBytes ||
        packing->LsbFirst ||
        packing->Invert)
@@ -780,6 +773,10 @@ intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,
    xoffset += level_x;
    yoffset += level_y;
 
+   int dims = _mesa_get_texture_dimensions(texImage->TexObject->Target);
+   pixels = _mesa_image_address(dims, packing, pixels, width, height,
+                                format, type, 0, 0, 0);
+
    uint32_t cpp = _mesa_get_format_bytes(texImage->TexFormat);
    tiled_to_linear(
       xoffset * cpp, (xoffset + width) * cpp,
-- 
2.17.0
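For readers unfamiliar with the packing state this patch stops rejecting, here is a minimal sketch of how a row stride and a first-pixel address can be derived from the GL pixel-store parameters. It is not Mesa's _mesa_image_address or _mesa_image_row_stride: the struct and function names below are hypothetical, and the sketch assumes a 2D image with whole-byte pixels (no GL_BITMAP, no SwapBytes/Invert).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the GL pixel-store state; not Mesa's struct. */
struct pack_params {
   int alignment;   /* GL_PACK_ALIGNMENT: 1, 2, 4 or 8 */
   int row_length;  /* GL_PACK_ROW_LENGTH, 0 means "use the image width" */
   int skip_pixels; /* GL_PACK_SKIP_PIXELS */
   int skip_rows;   /* GL_PACK_SKIP_ROWS */
};

/* Bytes from the start of one row to the start of the next:
 * the row's byte length rounded up to the requested alignment. */
static size_t
row_stride(const struct pack_params *p, int width, int bytes_per_pixel)
{
   size_t pixels_per_row = p->row_length > 0 ? p->row_length : width;
   size_t bytes = pixels_per_row * (size_t)bytes_per_pixel;
   size_t align = (size_t)p->alignment;

   return (bytes + align - 1) / align * align;
}

/* Address of pixel (0, 0) of the transfer inside the user buffer:
 * skip whole rows first, then skip pixels within the row. */
static uint8_t *
first_pixel(const struct pack_params *p, uint8_t *base,
            int width, int bytes_per_pixel)
{
   return base
          + (size_t)p->skip_rows * row_stride(p, width, bytes_per_pixel)
          + (size_t)p->skip_pixels * (size_t)bytes_per_pixel;
}
```

With these two values in hand, a copy routine that already accepts an arbitrary destination pointer and pitch needs no further knowledge of the packing state, which is the point the commit message makes.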
[Mesa-dev] [PATCH 3/4] i965: Enable fast detiling paths for !llc
Now that we have enabled cache-line at a time transfers to and from GPU memory, we can accelerate access into !llc (WC) memory just as well as WB memory with llc. --- src/mesa/drivers/dri/i965/brw_bufmgr.c | 2 +- src/mesa/drivers/dri/i965/intel_pixel_read.c | 5 ++--- src/mesa/drivers/dri/i965/intel_tex_image.c| 12 ++-- src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 17 - src/mesa/drivers/dri/i965/intel_tiled_memcpy.h | 3 ++- 5 files changed, 27 insertions(+), 12 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c index 66828f319be..fd9e8c49b13 100644 --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c @@ -925,7 +925,7 @@ can_map_cpu(struct brw_bo *bo, unsigned flags) * the GPU for blits or other operations, causing batches to happen at * inconvenient times. */ - if (flags & (MAP_PERSISTENT | MAP_COHERENT | MAP_ASYNC)) + if (flags & (MAP_PERSISTENT | MAP_COHERENT | MAP_ASYNC | MAP_RAW)) return false; return !(flags & MAP_WRITE); diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c b/src/mesa/drivers/dri/i965/intel_pixel_read.c index a545d215ad6..57df1178417 100644 --- a/src/mesa/drivers/dri/i965/intel_pixel_read.c +++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c @@ -91,8 +91,7 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx, * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to support * more types. 
*/ - if (!devinfo->has_llc || - pixels == NULL || + if (pixels == NULL || _mesa_is_bufferobj(pack->BufferObj) || pack->Alignment > 4 || pack->SkipPixels > 0 || @@ -115,7 +114,7 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx, return false; mem_copy_fn mem_copy = - intel_get_memcpy(rb->Format, format, type, INTEL_DOWNLOAD); + intel_get_memcpy(rb->Format, format, type, INTEL_DOWNLOAD, devinfo); if (mem_copy == NULL) return false; diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c b/src/mesa/drivers/dri/i965/intel_tex_image.c index de8832812c1..5afc8d99462 100644 --- a/src/mesa/drivers/dri/i965/intel_tex_image.c +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c @@ -196,8 +196,7 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx, * with _mesa_image_row_stride. However, before removing the restrictions * we need tests. */ - if (!devinfo->has_llc || - !(texImage->TexObject->Target == GL_TEXTURE_2D || + if (!(texImage->TexObject->Target == GL_TEXTURE_2D || texImage->TexObject->Target == GL_TEXTURE_RECTANGLE) || pixels == NULL || _mesa_is_bufferobj(packing->BufferObj) || @@ -218,7 +217,8 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx, return false; mem_copy_fn mem_copy = - intel_get_memcpy(texImage->TexFormat, format, type, INTEL_UPLOAD); + intel_get_memcpy(texImage->TexFormat, format, type, + INTEL_UPLOAD, devinfo); if (mem_copy == NULL) return false; @@ -700,8 +700,7 @@ intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx, * with _mesa_image_row_stride. However, before removing the restrictions * we need tests. 
*/ - if (!devinfo->has_llc || - !(texImage->TexObject->Target == GL_TEXTURE_2D || + if (!(texImage->TexObject->Target == GL_TEXTURE_2D || texImage->TexObject->Target == GL_TEXTURE_RECTANGLE) || pixels == NULL || _mesa_is_bufferobj(packing->BufferObj) || @@ -715,7 +714,8 @@ intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx, return false; mem_copy_fn mem_copy = - intel_get_memcpy(texImage->TexFormat, format, type, INTEL_DOWNLOAD); + intel_get_memcpy(texImage->TexFormat, format, type, + INTEL_DOWNLOAD, devinfo); if (mem_copy == NULL) return false; diff --git a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c index abe0f804f37..ae4144904f6 100644 --- a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c +++ b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c @@ -1004,11 +1004,18 @@ tiled_to_linear(uint32_t xt1, uint32_t xt2, */ mem_copy_fn intel_get_memcpy(mesa_format tiledFormat, GLenum format, GLenum type, - enum intel_memcpy_direction direction) + enum intel_memcpy_direction direction, + const struct gen_device_info *devinfo) { mesa_format user_format; mem_copy_fn fn = NULL; + /* movntdqa support is required for fast reads */ +#if !defined(USE_SSE41) + if (direction == INTEL_DOWNLOAD && !devinfo->has_llc) + return false; +#endif + if (type == GL_BITMAP) return NULL; @@ -1066,5 +1073,13 @@ mem_copy_fn intel_get_memcpy(mesa_format tiledFormat, break; } + /* Only the default
[Mesa-dev] i965: Enable fast detiling paths for !llc
Just a small series to put the new cache-line read back to good use for
ye olde Xorg on bxt (and older/newer with very similar effect).

From
  4 trep @ 0.7007 msec ( 1430.0/sec): ShmPutImage 500x500 square
  4000 trep @ 9.0367 msec ( 111.0/sec): ShmGetImage 500x500 square
to
  6 trep @ 0.5084 msec ( 1970.0/sec): ShmPutImage 500x500 square
  12000 trep @ 2.4808 msec ( 403.0/sec): ShmGetImage 500x500 square
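As a quick reading aid for the numbers above, the ratio of the new rate to the old rate gives the speedup: roughly 1.4x for ShmPutImage (1430.0/sec to 1970.0/sec) and roughly 3.6x for ShmGetImage (111.0/sec to 403.0/sec). The helper below is only an illustrative sketch; its name is hypothetical and it is not part of the series.

```c
/* Ratio of the rate after the series to the rate before it.
 * A result above 1.0 means the operation got faster. */
static double
speedup(double before_per_sec, double after_per_sec)
{
   return after_per_sec / before_per_sec;
}
```

For example, speedup(111.0, 403.0) is the ShmGetImage improvement quoted in the cover letter.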
[Mesa-dev] [PATCH 2/4] i965: Push the format checks to intel_tiled_memcpy
Allow the tiled_memcpy backend to determine if it is able to copy between the source and destination pixel buffer. This allows us to eliminate some duplication in the callers, and permits us to be more flexible in checking for compatible formats. (Hmm, is sRGB handling right?) --- src/mesa/drivers/dri/i965/intel_pixel_read.c | 16 +-- src/mesa/drivers/dri/i965/intel_tex_image.c | 46 +++- .../drivers/dri/i965/intel_tiled_memcpy.c | 108 +++--- .../drivers/dri/i965/intel_tiled_memcpy.h | 17 ++- 4 files changed, 102 insertions(+), 85 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c b/src/mesa/drivers/dri/i965/intel_pixel_read.c index 6ed7895bc76..a545d215ad6 100644 --- a/src/mesa/drivers/dri/i965/intel_pixel_read.c +++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c @@ -86,15 +86,12 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx, /* The miptree's buffer. */ struct brw_bo *bo; - uint32_t cpp; - mem_copy_fn mem_copy = NULL; /* This fastpath is restricted to specific renderbuffer types: * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to support * more types. */ if (!devinfo->has_llc || - !(type == GL_UNSIGNED_BYTE || type == GL_UNSIGNED_INT_8_8_8_8_REV) || pixels == NULL || _mesa_is_bufferobj(pack->BufferObj) || pack->Alignment > 4 || @@ -117,15 +114,9 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx, if (rb->NumSamples > 1) return false; - /* We can't handle copying from RGBX or BGRX because the tiled_memcpy -* function doesn't set the last channel to 1. Note this checks BaseFormat -* rather than TexFormat in case the RGBX format is being simulated with an -* RGBA format. 
-*/ - if (rb->_BaseFormat == GL_RGB) - return false; - - if (!intel_get_memcpy(rb->Format, format, type, &mem_copy, &cpp)) + mem_copy_fn mem_copy = + intel_get_memcpy(rb->Format, format, type, INTEL_DOWNLOAD); + if (mem_copy == NULL) return false; if (!irb->mt || @@ -198,6 +189,7 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx, pack->Alignment, pack->RowLength, pack->SkipPixels, pack->SkipRows); + uint32_t cpp = _mesa_get_format_bytes(rb->Format); tiled_to_linear( xoffset * cpp, (xoffset + width) * cpp, yoffset, yoffset + height, diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c b/src/mesa/drivers/dri/i965/intel_tex_image.c index fae179214dd..de8832812c1 100644 --- a/src/mesa/drivers/dri/i965/intel_tex_image.c +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c @@ -186,13 +186,6 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx, struct brw_context *brw = brw_context(ctx); const struct gen_device_info *devinfo = &brw->screen->devinfo; struct intel_texture_image *image = intel_texture_image(texImage); - int src_pitch; - - /* The miptree's buffer. */ - struct brw_bo *bo; - - uint32_t cpp; - mem_copy_fn mem_copy = NULL; /* This fastpath is restricted to specific texture types: * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to support @@ -204,7 +197,6 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx, * we need tests. 
*/ if (!devinfo->has_llc || - !(type == GL_UNSIGNED_BYTE || type == GL_UNSIGNED_INT_8_8_8_8_REV) || !(texImage->TexObject->Target == GL_TEXTURE_2D || texImage->TexObject->Target == GL_TEXTURE_RECTANGLE) || pixels == NULL || @@ -222,7 +214,12 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx, if (ctx->_ImageTransferState) return false; - if (!intel_get_memcpy(texImage->TexFormat, format, type, &mem_copy, &cpp)) + if (format == GL_COLOR_INDEX) + return false; + + mem_copy_fn mem_copy = + intel_get_memcpy(texImage->TexFormat, format, type, INTEL_UPLOAD); + if (mem_copy == NULL) return false; /* If this is a nontrivial texture view, let another path handle it instead. */ @@ -257,7 +254,7 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx, intel_miptree_access_raw(brw, image->mt, level, 0, true); - bo = image->mt->bo; + struct brw_bo *bo = image->mt->bo; if (brw_batch_references(&brw->batch, bo)) { perf_debug("Flushing before mapping a referenced bo.\n"); @@ -270,7 +267,7 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx, return false; } - src_pitch = _mesa_image_row_stride(packing, width, format, type); + int src_pitch = _mesa_image_row_stride(packing, width, format, type); /* We postponed printing this message until having committed to executing * the function. @@ -289,6 +286,8 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx, xoffset += level_x; yoffset += level_y; + uint32_t cpp = _mesa_get_format_bytes(texImage->TexFormat); + linear_to_tiled( xoffset * cpp, (xoffset + width) * cpp, yoffset, yoffset + hei
[Mesa-dev] [PATCH 1/4] i915: Fix streaming loads for intel_tiled_memcpy
We stream from a tiled and aligned source into an unaligned user buffer,
so we need to use _mm_storeu_si128.
---
 src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
index fac5427d2ed..6440dceac36 100644
--- a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
+++ b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
@@ -223,17 +223,17 @@ _memcpy_streaming_load(void *dest, const void *src, size_t count)
 {
    if (count == 16) {
       __m128i val = _mm_stream_load_si128((__m128i *)src);
-      _mm_store_si128((__m128i *)dest, val);
+      _mm_storeu_si128((__m128i *)dest, val);
       return dest;
    } else if (count == 64) {
       __m128i val0 = _mm_stream_load_si128(((__m128i *)src) + 0);
       __m128i val1 = _mm_stream_load_si128(((__m128i *)src) + 1);
       __m128i val2 = _mm_stream_load_si128(((__m128i *)src) + 2);
       __m128i val3 = _mm_stream_load_si128(((__m128i *)src) + 3);
-      _mm_store_si128(((__m128i *)dest) + 0, val0);
-      _mm_store_si128(((__m128i *)dest) + 1, val1);
-      _mm_store_si128(((__m128i *)dest) + 2, val2);
-      _mm_store_si128(((__m128i *)dest) + 3, val3);
+      _mm_storeu_si128(((__m128i *)dest) + 0, val0);
+      _mm_storeu_si128(((__m128i *)dest) + 1, val1);
+      _mm_storeu_si128(((__m128i *)dest) + 2, val2);
+      _mm_storeu_si128(((__m128i *)dest) + 3, val3);
       return dest;
    } else {
       assert(count < 64); /* and (count < 16) for ytiled */
-- 
2.17.0
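The distinction the patch relies on can be shown in isolation. The sketch below is not the code from intel_tiled_memcpy.c: the function name is hypothetical, and it uses a plain aligned load (_mm_load_si128, SSE2) instead of the streaming load (_mm_stream_load_si128, SSE4.1) so that it builds on any x86-64 compiler without extra flags. What it illustrates is that _mm_store_si128 requires a 16-byte-aligned destination, while _mm_storeu_si128 accepts any address, which is what a user-supplied buffer needs.

```c
#include <emmintrin.h> /* SSE2: _mm_load_si128, _mm_storeu_si128 */
#include <stdint.h>

/* Copy 16 bytes from a 16-byte-aligned source to a destination of
 * arbitrary alignment. Hypothetical helper for illustration only. */
static void
copy16_aligned_to_unaligned(void *dest, const void *src)
{
   /* The source must be 16-byte aligned for this load. */
   __m128i val = _mm_load_si128((const __m128i *)src);

   /* The unaligned store is valid for any destination address;
    * _mm_store_si128 here could fault if dest were not aligned. */
   _mm_storeu_si128((__m128i *)dest, val);
}
```

A caller could pass a tile row (aligned) as src and an odd offset into a user buffer as dest.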
Re: [Mesa-dev] Gitlab migration
Daniel Stone writes: > We had a go at using Jenkins for some of this: Intel's been really > quite successful at doing it internally, but our community efforts > have been a miserable failure. After a few years I've concluded that > it's not going to change - even with Jenkins 2.0. > > Firstly, Jenkins configuration is an absolute dumpster fire. Working > out how to configure it and create the right kind of jobs (and debug > it!) is surprisingly difficult, and involves a lot of clicking through > the web UI, or using external tools like jenkins-job-builder which > seem to be in varying levels of disrepair. If you have dedicated 'QA > people' whose job is driving Jenkins for you, then great! Jenkins will > probably work well for you. This doesn't scale to a community model > though. Especially when people have different usecases and need to > install different plugins. > > Jenkins security is also a tyre fire. Plugins are again in varying > levels of disrepair, and seem remarkably prone to CVEs. There's no > real good model for updating plugins (and doing so is super fragile). > Worse still, Jenkins 2.0 really pushes you to be writing scripts in > Groovy, which can affect Jenkins in totally arbitrary ways, and > subvert the security model entirely. The way upstream deals with this > is to enforce a 'sandbox' model preventing most scripts from doing > anything useful unless manually audited and approved by an admin. > Again, this is fine for companies or small teams where you trust > people to not screw up, but doesn't scale to something like fd.o. > > Adding to these is the permission model, which again requires painful > configuration and a lot of admin clicking. It doesn't integrate well > with external services, and granularity is mostly at an instance > rather than a project level: again not suitable for something like > fd.o. > > From the UI and workflow perspective, something I've never liked is > that the first-order view is very specific pipelines, e.g. 
'Mesa > master build', 'daily Piglit run', etc etc. If all you care about is > master, then this is fine. You _can_ make those pipelines run against > arbitrary branches and commits you pick up from MRs or similar, but > you really are trying to jam it sideways into the UI it wants to > present. Again this is so deeply baked into how Jenkins works that I > don't see it as really being fixable. > > I have a pile of other gripes, like how difficult their remote API is > to use, and the horrible race conditions it has. For instance, when > you schedule a run of a particular job, it doesn't report the run ID > back to you: you have to poll the last job number before you submit, > then poll again for a few seconds to find the next run ID. Good luck > to you if two runs of the same job (e.g. 'build specific Mesa commit') > get scheduled at the same time. I agree with some of your Jenkins critiques. I have implemented CI on *many* different frameworks over the past 15 years, and I think that every implementation has its fans and haters. It is wise to create automation which is mostly independent of the CI framework. Mesa i965 CI could immediately switch from Jenkins to BuildBot or GitLab, if there was a reason to do so. It may be that GitLab is superior to Jenkins by now, but the selection of the CI framework is a minor detail anyways. CI frameworks are often based on build/test pipelines, which I think is exactly the wrong concept for the domain. Flexible CI is best thought of as a multiplatform `make` system. Setting up a "pipeline" is similar to building your project with a shell script instead of a makefile. I disagree with your critique of the Jenkins remote API. It is more flexible than any other API that I have seen for CI. We implement our multiplatform-make system on top of it. It would be nice to have an ID returned when triggering a job, but you can work around by including a GUID as a build parameter, then polling for the GUID. 
The reasons I chose Jenkins over what was available at the time: - job/system configuration is saved as XML for backup/diff/restore - huge number of users -> fewer quality issues > GitLab CI fixes all of these things. Pipelines are strongly and > directly correlated with commits in repositories, though you can also > trigger them manually or on a schedule. Permissions are that of the > repository, and just like Travis, people can fork and work on CI > improvements in their own sandbox without impacting anything else. The > job configuration is in relatively clean YAML, and it strongly > suggests idiomatic form rather than a forest of thousands of > unmaintained plugins. > > Jobs get run in clean containers, rather than special unicorn workers > pre-configured just so, meaning that the builds are totally > reproducible locally and you can use whatever build dependencies you > want without having to bug the admins to install LLVM in some > particular chroot. Those containers can be stored in a registry >
Re: [Mesa-dev] [PATCH 1/4] i915: Fix streaming loads for intel_tiled_memcpy
On Friday, May 25, 2018 4:33:56 PM PDT Chris Wilson wrote:
> We stream from a tiled and aligned source into an unaligned user buffer,
> so we need to use _mm_storeu_si128.
> ---
>  src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> index fac5427d2ed..6440dceac36 100644
> --- a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> +++ b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> @@ -223,17 +223,17 @@ _memcpy_streaming_load(void *dest, const void *src, size_t count)
>  {
>     if (count == 16) {
>        __m128i val = _mm_stream_load_si128((__m128i *)src);
> -      _mm_store_si128((__m128i *)dest, val);
> +      _mm_storeu_si128((__m128i *)dest, val);
>        return dest;
>     } else if (count == 64) {
>        __m128i val0 = _mm_stream_load_si128(((__m128i *)src) + 0);
>        __m128i val1 = _mm_stream_load_si128(((__m128i *)src) + 1);
>        __m128i val2 = _mm_stream_load_si128(((__m128i *)src) + 2);
>        __m128i val3 = _mm_stream_load_si128(((__m128i *)src) + 3);
> -      _mm_store_si128(((__m128i *)dest) + 0, val0);
> -      _mm_store_si128(((__m128i *)dest) + 1, val1);
> -      _mm_store_si128(((__m128i *)dest) + 2, val2);
> -      _mm_store_si128(((__m128i *)dest) + 3, val3);
> +      _mm_storeu_si128(((__m128i *)dest) + 0, val0);
> +      _mm_storeu_si128(((__m128i *)dest) + 1, val1);
> +      _mm_storeu_si128(((__m128i *)dest) + 2, val2);
> +      _mm_storeu_si128(((__m128i *)dest) + 3, val3);
>        return dest;
>     } else {
>        assert(count < 64); /* and (count < 16) for ytiled */
>

Fixes: d21c086d819d78fb3f6abcbb14aa492970f442aa (i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear)
Reviewed-by: Kenneth Graunke
[Mesa-dev] [Bug 98581] Dota 2 graphics glitch on autocast abilities.
https://bugs.freedesktop.org/show_bug.cgi?id=98581

ros...@gmail.com changed:

           What    |Removed                     |Added
         Resolution|WORKSFORME                  |FIXED

--- Comment #2 from ros...@gmail.com ---
That is correct. Issue has long since been fixed.
Re: [Mesa-dev] [PATCH 28/53] intel/fs: Fix logical FB write lowering for SIMD32
On May 25, 2018 15:23:21 Matt Turner wrote:
> On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand wrote:
>> From: Francisco Jerez
>
> Presumably Jason already reviewed this and just missed attaching his
> R-b tag.

Some of these patches have somewhat confusing authorship. I didn't add my
R-b because I sort-of half-wrote this patch. I thought about changing the
particular author to me but it was a toss up. In any case, a third pair
of eyes was needed.
Re: [Mesa-dev] [PATCH 02/16] docs: Add python script that converts html to rst.
I specifically tried forcing a rename earlier, but it doesn't work. Git
sees too much change. The only way I could get it to work was manually
renaming the HTML files to rst first, then committing, then converting to
rst. The problem with that strategy is that then the Pandoc command for
converting to rst doesn't make sense. (.rst to .rst? What?)

Laura

On Fri, May 25, 2018, 4:26 AM Eric Engestrom wrote:

> On Thursday, 2018-05-24 17:27:05 -0700, Laura Ekstrand wrote:
> > Use Beautiful Soup to fix bad html, then use pandoc for converting to
> > rst.
> > ---
> >  docs/rstConverter.py | 23 +++++++++++++++++++++++
> >  1 file changed, 23 insertions(+)
> >  create mode 100755 docs/rstConverter.py
> >
> > diff --git a/docs/rstConverter.py b/docs/rstConverter.py
> > new file mode 100755
> > index 00..5321fdde8b
> > --- /dev/null
> > +++ b/docs/rstConverter.py
> > @@ -0,0 +1,23 @@
> > +#!/usr/bin/python3
> > +import glob
> > +import subprocess
> > +from bs4 import BeautifulSoup
> > +
> > +pages = glob.glob("*.html")
> > +pages += glob.glob("relnotes/*.html")
> > +for filename in pages:
> > +    # Fix some annoyingly bad html.
> > +    with open(filename) as f:
> > +        soup = BeautifulSoup(f, 'html5lib')
> > +    soup.find("div", "header").extract() # Get rid of old header
> > +    soup.iframe.extract() # Get rid of old contents bar.
> > +    soup.find("div", "content").unwrap() # Strip the content div.
>
> Good call on using beautifulsoup to clean the html before converting it!
>
> > +
> > +    # Write out the better html.
> > +    with open(filename, 'wt') as f:
> > +        f.write(str(soup))
> > +
> > +    # Convert to rst with pandoc.
> > +    name = filename.split(".html")[0]
> > +    bashCmd = "pandoc " + filename + " -o " + name + ".rst"
> > +    subprocess.run(bashCmd.split())
>
> Idea: remove the old html at the same time as we introduce the rst
> (commit-wise), so that git picks it up as a rename with changes, which
> hopefully would be easier to check as a 1:1 of any given conversion?
>
> (In case this is as unclear as I think it is, I'm thinking about how we
> can review individual pages conversions; say index.html -> index.rst, to
> see that no release has been dropped in the process. If git shows this
> as a rename with changes, I expect it will be easier to check than if
> one commit creates all the rst files and another deletes all the html)
Re: [Mesa-dev] [PATCH 30/53] intel/fs: Add the group to the flag subreg number on SNB and older
On May 25, 2018 15:24:53 Matt Turner wrote:
> On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand wrote:
>> We want consistent behavior in the meaning of the flag_subreg field
>> between SNB and IVB+.
>>
>> v2 (Jason Ekstrand):
>>  - Add some extra commentary
>>
>> Reviewed-by: Jason Ekstrand
>
> Presumably you did not intend to review your own patch :)

My patch? Curro's patch? It gets kind of hard to tell in this series. :-).
This particular one is a single line plucked out of a Curro patch the
rest of which landed some time ago. I thought about leaving him as
author. Maybe I should switch it back?
Re: [Mesa-dev] [PATCH 22/53] intel/fs: Disable SIMD32 dispatch on Gen4-6 with control flow
On May 25, 2018 15:19:25 Matt Turner wrote:
> On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand wrote:
>> From: Francisco Jerez
>>
>> The hardware's control flow logic is 16-wide so we're out of luck
>> here. We could, in theory, support SIMD32 if we know the control-flow
>> is uniform but we don't have that information at this point.
>
> This is what the "fork" instruction is for on Gen6 :)

Yeah, Curro pointed that out too...
Re: [Mesa-dev] [PATCH 32/53] intel/fs: Mark LINTERP opcode as writing accumulator implicitly on pre-Gen7.
On May 25, 2018 15:28:22 Matt Turner wrote:
> On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand wrote:
>> From: Francisco Jerez
>>
>> ---
>>  src/intel/compiler/brw_shader.cpp | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp
>> index 141b64e..61211ef 100644
>> --- a/src/intel/compiler/brw_shader.cpp
>> +++ b/src/intel/compiler/brw_shader.cpp
>> @@ -984,7 +984,8 @@ backend_instruction::writes_accumulator_implicitly(const struct gen_device_info
>>     return writes_accumulator ||
>>            (devinfo->gen < 6 &&
>>             ((opcode >= BRW_OPCODE_ADD && opcode < BRW_OPCODE_NOP) ||
>> -            (opcode >= FS_OPCODE_DDX_COARSE && opcode <= FS_OPCODE_LINTERP)));
>> +            (opcode >= FS_OPCODE_DDX_COARSE && opcode <= FS_OPCODE_LINTERP))) ||
>> +          (devinfo->gen < 7 && opcode == FS_OPCODE_LINTERP);
>
> That's heavy-handed. Won't this prevent the scheduler from reordering
> LINTERP instructions, even though we can only run into problems on
> SIMD32?

As long as none of them declare that they read it, re-ordering should be
fine. If we don't do this, the compiler may move a LINTERP between a
write and read of the accumulator emitted for some other reason.

That said, this reminds me that we should probably back-port a patch that
declares that they write the accumulator on gen11+ too.
Re: [Mesa-dev] [PATCH 22/53] intel/fs: Disable SIMD32 dispatch on Gen4-6 with control flow
Jason Ekstrand writes:

> On May 25, 2018 15:19:25 Matt Turner wrote:
>
>> On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand wrote:
>>> From: Francisco Jerez
>>>
>>> The hardware's control flow logic is 16-wide so we're out of luck
>>> here. We could, in theory, support SIMD32 if we know the control-flow
>>> is uniform but we don't have that information at this point.
>>
>> This is what the "fork" instruction is for on Gen6 :)
>
> Yeah, Curro pointed that out too...
>

The main problem with the fork instruction is that it prevents the
compiler from interleaving code from the low and high channel groups
within control flow, which largely defeats the purpose of SIMD32, namely
amortizing instruction latency costs. The other problem is that it would
involve substantial effort and it is... well... SNB-specific: earlier
platforms still won't get support for non-uniform control flow in SIMD32,
and newer platforms don't need it. Probably not worth the effort...
Re: [Mesa-dev] Gitlab migration
On Fri, May 25, 2018 at 4:47 PM, Mark Janes wrote: > Daniel Stone writes: > > GitLab CI fixes all of these things. Pipelines are strongly and > > directly correlated with commits in repositories, though you can also > > trigger them manually or on a schedule. Permissions are that of the > > repository, and just like Travis, people can fork and work on CI > > improvements in their own sandbox without impacting anything else. The > > job configuration is in relatively clean YAML, and it strongly > > suggests idiomatic form rather than a forest of thousands of > > unmaintained plugins. > > > > Jobs get run in clean containers, rather than special unicorn workers > > pre-configured just so, meaning that the builds are totally > > reproducible locally and you can use whatever build dependencies you > > want without having to bug the admins to install LLVM in some > > particular chroot. Those containers can be stored in a registry > > attached to the project, with their own lifetime/ownership/etc > > tracking. Jenkins can use Docker if you have an external registry, but > > again this requires setting up external authentication and > > permissions, not to mention that there's no lifetime/ownership/expiry > > tracking, so you have to write more special admin cronjob scripts to > > clean up old images in the registry. > > GitLab may be perfectly suitable for CI, but please do not select Mesa > dev infrastructure based on CI features. > > Any Mesa CI needs to trigger from multiple projects: drm, dEQP, Piglit, > VulkanCTS, SPIRV-Tools, crucible, glslang. They are not all going to be > in GitLab. > > The cart (CI) follows the horse (upstream development process). CI > automation is cheap and flexible, and can easily adapt to changes in the > driver implementation / dev process. > I think part of the difficulty in this discussion is something you referenced in the second paragraph above. 
The type of CI we do in our Jenkins system is in a different domain than the type of CI supported by the likes of gitlab. The CI we do in our lab is more along the lines of integration testing where multiple components all have to come together whereas the gitlab CI framework is more intended to support single-project unit testing. The gitlab CI system also does not scale nearly well enough to handle the kind of testing that we need to do. The gitlab CI hooks would work fairly well for building the website, running some build tests, and maybe make check but it will never be a replacement for the Jenkins system we have in our lab. They're a useful feature (that's a good thing!) but certainly not a replacement for what we have today. I'm sorry if I implied that it would; I certainly did not intend to.
Re: [Mesa-dev] [PATCH 2/2] st/dri: replace format conversion functions with single mapping table
On Thu, May 17, 2018 at 6:50 AM, Lucas Stach wrote: > Each time I have to touch the buffer import/export functions in the dri > state tracker I get lost in the maze of functions converting between > DRI_IMAGE_FOURCC, DRI_IMAGE_FORMAT, DRI_IMAGE_COMPONENTS and pipe format. > > Rip it out and replace by a single table, which defines the correspondence > between the different representations. > > Also this now stores all the known representations in the __DRIimageRec, > to avoid the loss of information we currently have when importing a buffer > with a fourcc, which doesn't have a corresponding dri format. > > Signed-off-by: Lucas Stach > --- > src/gallium/state_trackers/dri/dri2.c | 476 ++-- > src/gallium/state_trackers/dri/dri_screen.h | 1 + > 2 files changed, 138 insertions(+), 339 deletions(-) > > diff --git a/src/gallium/state_trackers/dri/dri2.c > b/src/gallium/state_trackers/dri/dri2.c > index 859161fb87ac..9c74ca54fc89 100644 > --- a/src/gallium/state_trackers/dri/dri2.c > +++ b/src/gallium/state_trackers/dri/dri2.c > @@ -54,295 +54,72 @@ > #define DRM_FORMAT_MOD_INVALID ((1ULL<<56) - 1) > #endif > > -static const int fourcc_formats[] = { > - __DRI_IMAGE_FOURCC_ARGB2101010, > - __DRI_IMAGE_FOURCC_XRGB2101010, > - __DRI_IMAGE_FOURCC_ABGR2101010, > - __DRI_IMAGE_FOURCC_XBGR2101010, > - __DRI_IMAGE_FOURCC_ARGB, > - __DRI_IMAGE_FOURCC_ABGR, > - __DRI_IMAGE_FOURCC_SARGB, > - __DRI_IMAGE_FOURCC_XRGB, > - __DRI_IMAGE_FOURCC_XBGR, > - __DRI_IMAGE_FOURCC_ARGB1555, > - __DRI_IMAGE_FOURCC_RGB565, > - __DRI_IMAGE_FOURCC_R8, > - __DRI_IMAGE_FOURCC_R16, > - __DRI_IMAGE_FOURCC_GR88, > - __DRI_IMAGE_FOURCC_GR1616, > - __DRI_IMAGE_FOURCC_YUV410, > - __DRI_IMAGE_FOURCC_YUV411, > - __DRI_IMAGE_FOURCC_YUV420, > - __DRI_IMAGE_FOURCC_YUV422, > - __DRI_IMAGE_FOURCC_YUV444, > - __DRI_IMAGE_FOURCC_YVU410, > - __DRI_IMAGE_FOURCC_YVU411, > - __DRI_IMAGE_FOURCC_YVU420, > - __DRI_IMAGE_FOURCC_YVU422, > - __DRI_IMAGE_FOURCC_YVU444, > - __DRI_IMAGE_FOURCC_NV12, > - __DRI_IMAGE_FOURCC_NV16, 
> - __DRI_IMAGE_FOURCC_YUYV > -}; > - > -static int convert_fourcc(int format, int *dri_components_p) > -{ > +struct dri2_format_mapping { > + int dri_fourcc; > + int dri_format; > int dri_components; > - switch(format) { > - case __DRI_IMAGE_FOURCC_RGB565: > - format = __DRI_IMAGE_FORMAT_RGB565; > - dri_components = __DRI_IMAGE_COMPONENTS_RGB; > - break; > - case __DRI_IMAGE_FOURCC_ARGB: > - format = __DRI_IMAGE_FORMAT_ARGB; > - dri_components = __DRI_IMAGE_COMPONENTS_RGBA; > - break; > - case __DRI_IMAGE_FOURCC_XRGB: > - format = __DRI_IMAGE_FORMAT_XRGB; > - dri_components = __DRI_IMAGE_COMPONENTS_RGB; > - break; > - case __DRI_IMAGE_FOURCC_ABGR: > - format = __DRI_IMAGE_FORMAT_ABGR; > - dri_components = __DRI_IMAGE_COMPONENTS_RGBA; > - break; > - case __DRI_IMAGE_FOURCC_XBGR: > - format = __DRI_IMAGE_FORMAT_XBGR; > - dri_components = __DRI_IMAGE_COMPONENTS_RGB; > - break; > - case __DRI_IMAGE_FOURCC_ARGB2101010: > - format = __DRI_IMAGE_FORMAT_ARGB2101010; > - dri_components = __DRI_IMAGE_COMPONENTS_RGBA; > - break; > - case __DRI_IMAGE_FOURCC_XRGB2101010: > - format = __DRI_IMAGE_FORMAT_XRGB2101010; > - dri_components = __DRI_IMAGE_COMPONENTS_RGB; > - break; > - case __DRI_IMAGE_FOURCC_ABGR2101010: > - format = __DRI_IMAGE_FORMAT_ABGR2101010; > - dri_components = __DRI_IMAGE_COMPONENTS_RGBA; > - break; > - case __DRI_IMAGE_FOURCC_XBGR2101010: > - format = __DRI_IMAGE_FORMAT_XBGR2101010; > - dri_components = __DRI_IMAGE_COMPONENTS_RGB; > - break; > - case __DRI_IMAGE_FOURCC_R8: > - format = __DRI_IMAGE_FORMAT_R8; > - dri_components = __DRI_IMAGE_COMPONENTS_R; > - break; > - case __DRI_IMAGE_FOURCC_GR88: > - format = __DRI_IMAGE_FORMAT_GR88; > - dri_components = __DRI_IMAGE_COMPONENTS_RG; > - break; > - case __DRI_IMAGE_FOURCC_R16: > - format = __DRI_IMAGE_FORMAT_R16; > - dri_components = __DRI_IMAGE_COMPONENTS_R; > - break; > - case __DRI_IMAGE_FOURCC_GR1616: > - format = __DRI_IMAGE_FORMAT_GR1616; > - dri_components = __DRI_IMAGE_COMPONENTS_RG; > - break; > - 
case __DRI_IMAGE_FOURCC_YUYV: > - format = __DRI_IMAGE_FORMAT_YUYV; > - dri_components = __DRI_IMAGE_COMPONENTS_Y_XUXV; > - break; > - /* > -* For multi-planar YUV formats, we return the format of the first > -* plane only. Since there is only one caller which supports multi- > -* planar YUV it gets to figure out the remaining planes on it's > -* own. > -*/ > - case __DRI_IMAGE_FOURCC_YUV420: > - case __DRI_IMAGE_FOURCC_YVU420: > - format = __DRI_IMAGE_FORMAT_R8; > - dri_components = __DRI_IMAGE_COMPONENTS_Y_U_V; > - break; > - case __DRI_IMAGE_FOURCC_NV12: > -
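The idea in the quoted commit message, a single table recording how the different representations correspond, can be sketched roughly as below. This is only an illustration, not the code from the patch: the struct mirrors the three fields visible in the quoted diff, while the enum values, the table contents and the helpers lookup_by_fourcc() and lookup_by_format() are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the __DRI_IMAGE_* constants. The real
 * values are defined by Mesa's DRI interface, not here. */
enum { FOURCC_RGB565 = 1, FOURCC_ARGB8888, FOURCC_XRGB8888, FOURCC_R8 };
enum { FORMAT_NONE = 0, FORMAT_RGB565, FORMAT_ARGB8888, FORMAT_XRGB8888,
       FORMAT_R8 };
enum { COMPONENTS_RGB = 1, COMPONENTS_RGBA, COMPONENTS_R };

/* One row per format: every representation of it side by side. */
struct format_mapping {
   int dri_fourcc;
   int dri_format;
   int dri_components;
};

static const struct format_mapping format_table[] = {
   { FOURCC_RGB565,   FORMAT_RGB565,   COMPONENTS_RGB  },
   { FOURCC_ARGB8888, FORMAT_ARGB8888, COMPONENTS_RGBA },
   { FOURCC_XRGB8888, FORMAT_XRGB8888, COMPONENTS_RGB  },
   { FOURCC_R8,       FORMAT_R8,       COMPONENTS_R    },
};

#define TABLE_SIZE (sizeof(format_table) / sizeof(format_table[0]))

/* Find the row for a fourcc, or NULL if the fourcc is unknown. */
static const struct format_mapping *
lookup_by_fourcc(int fourcc)
{
   for (size_t i = 0; i < TABLE_SIZE; i++) {
      if (format_table[i].dri_fourcc == fourcc)
         return &format_table[i];
   }
   return NULL;
}

/* Find the row for a DRI format, or NULL if the format is unknown. */
static const struct format_mapping *
lookup_by_format(int format)
{
   for (size_t i = 0; i < TABLE_SIZE; i++) {
      if (format_table[i].dri_format == format)
         return &format_table[i];
   }
   return NULL;
}
```

With this shape, each conversion direction becomes a table scan over the same rows, so adding a format means adding one row rather than touching several switch statements.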
[Mesa-dev] [PATCH v2 32/53] intel/fs: Mark LINTERP opcode as writing accumulator on platforms without PLN
From: Francisco Jerez

When we don't have PLN (gen4 and gen11+), we implement LINTERP as either LINE+MAC or a pair of MADs. In both cases, the accumulator is written by the first of the two instructions and read by the second. Even though the accumulator value isn't actually ever used from a logical instruction perspective, it is trashed so we need to make the scheduler aware. Otherwise, the scheduler could end up re-ordering instructions and putting a LINTERP between an instruction which writes the accumulator and another which tries to use that result.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/intel/compiler/brw_shader.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp
index 141b64e..dfd2c5c 100644
--- a/src/intel/compiler/brw_shader.cpp
+++ b/src/intel/compiler/brw_shader.cpp
@@ -984,7 +984,8 @@ backend_instruction::writes_accumulator_implicitly(const struct gen_device_info
    return writes_accumulator ||
           (devinfo->gen < 6 &&
            ((opcode >= BRW_OPCODE_ADD && opcode < BRW_OPCODE_NOP) ||
-            (opcode >= FS_OPCODE_DDX_COARSE && opcode <= FS_OPCODE_LINTERP)));
+            (opcode >= FS_OPCODE_DDX_COARSE && opcode <= FS_OPCODE_LINTERP))) ||
+          (opcode == FS_OPCODE_LINTERP && !devinfo->has_pln);
 }
 
 bool
-- 
2.5.0.400.gff86faf
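For readers without the Mesa tree at hand, the check this patch extends can be sketched in isolation as follows. The types and opcode names here are hypothetical, simplified stand-ins (Mesa's real opcode enum is much larger and ordered differently); only the shape of the boolean expression follows the diff above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced opcode list. Only the relative order matters
 * for the range checks below. */
enum opcode {
   OP_ADD,
   OP_MUL,
   OP_NOP,
   OP_DDX_COARSE,
   OP_LINTERP,
   OP_OTHER
};

/* Hypothetical stand-in for the device description. */
struct device_info {
   int gen;
   bool has_pln;
};

/* Hypothetical stand-in for a backend instruction. */
struct instruction {
   enum opcode opcode;
   bool writes_accumulator;
};

/* True if the scheduler must treat the instruction as clobbering the
 * accumulator, whether or not it is flagged as writing it explicitly. */
static bool
writes_accumulator_implicitly(const struct device_info *devinfo,
                              const struct instruction *inst)
{
   return inst->writes_accumulator ||
          (devinfo->gen < 6 &&
           ((inst->opcode >= OP_ADD && inst->opcode < OP_NOP) ||
            (inst->opcode >= OP_DDX_COARSE &&
             inst->opcode <= OP_LINTERP))) ||
          /* The new clause: without PLN, LINTERP expands to a pair of
           * instructions that communicate through the accumulator. */
          (inst->opcode == OP_LINTERP && !devinfo->has_pln);
}
```

The last clause is the only part that applies regardless of generation, which is what covers the gen11+ case named in the commit message.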
[Mesa-dev] [PATCH v2 33/53] intel/fs: Emit LINE+MAC for LINTERP with unaligned coordinates
On g4x through Sandy Bridge, src1 (the coordinates) of the PLN instruction is required to be an even register number. When it's odd (which can happen with SIMD32), we have to emit a LINE+MAC combination instead. Unfortunately, we can't just fall through to the gen4 case because the input registers are still set up for PLN which lays out the four src1 registers differently in SIMD16 than LINE. --- src/intel/compiler/brw_fs_generator.cpp | 75 + src/intel/compiler/brw_shader.cpp | 3 +- 2 files changed, 68 insertions(+), 10 deletions(-) diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp index 548a208..0ca9a4e 100644 --- a/src/intel/compiler/brw_fs_generator.cpp +++ b/src/intel/compiler/brw_fs_generator.cpp @@ -761,16 +761,73 @@ fs_generator::generate_linterp(fs_inst *inst, return true; } else if (devinfo->has_pln) { - /* From the Sandy Bridge PRM Vol. 4, Pt. 2, Section 8.3.53, "Plane": - * - *"[DevSNB]: must be even register aligned. - * - * This restriction is lifted on Ivy Bridge. - */ - assert(devinfo->gen >= 7 || (delta_x.nr & 1) == 0); - brw_PLN(p, dst, interp, delta_x); + if (devinfo->gen <= 6 && (delta_x.nr & 1) != 0) { + /* From the Sandy Bridge PRM Vol. 4, Pt. 2, Section 8.3.53, "Plane": + * + *"[DevSNB]: must be even register aligned. + * + * This restriction is lifted on Ivy Bridge. + * + * This means that we need to split PLN into LINE+MAC on-the-fly. + * Unfortunately, the inputs are laid out for PLN and not LIN+MAC so + * we have to split into SIMD8 pieces. + */ + if (inst->exec_size == 8) { +i[0] = brw_LINE(p, brw_null_reg(), interp, delta_x); +i[1] = brw_MAC(p, dst, suboffset(interp, 1), delta_y); - return false; +/* LINE writes the accumulator automatically on gen4-5. On Sandy + * Bridge and later, we have to explicitly enable it. 
+ */ +if (devinfo->gen >= 6) + brw_inst_set_acc_wr_control(p->devinfo, i[0], true); + +brw_inst_set_cond_modifier(p->devinfo, i[1], inst->conditional_mod); + +/* brw_set_default_saturate() is called before emitting + * instructions, so the saturate bit is set in each instruction, + * so we need to unset it on the first instruction. + */ +brw_inst_set_saturate(p->devinfo, i[0], false); + } else { +brw_push_insn_state(p); +brw_set_default_exec_size(p, BRW_EXECUTE_8); + +brw_set_default_group(p, inst->group); +i[0] = brw_LINE(p, brw_null_reg(), interp, offset(delta_x, 0)); +i[1] = brw_MAC(p, offset(dst, 0), + suboffset(interp, 1), offset(delta_x, 1)); + +brw_set_default_group(p, inst->group + 8); +i[2] = brw_LINE(p, brw_null_reg(), interp, offset(delta_y, 0)); +i[3] = brw_MAC(p, offset(dst, 1), + suboffset(interp, 1), offset(delta_y, 1)); + +brw_pop_insn_state(p); + +/* LINE writes the accumulator automatically on gen4-5. On Sandy + * Bridge and later, we have to explicitly enable it. + */ +if (devinfo->gen >= 6) { + brw_inst_set_acc_wr_control(p->devinfo, i[0], true); + brw_inst_set_acc_wr_control(p->devinfo, i[2], true); +} + +brw_inst_set_cond_modifier(p->devinfo, i[1], inst->conditional_mod); +brw_inst_set_cond_modifier(p->devinfo, i[3], inst->conditional_mod); + +/* brw_set_default_saturate() is called before emitting + * instructions, so the saturate bit is set in each instruction, + * so we need to unset it on the first instruction of each pair. 
+ */ +brw_inst_set_saturate(p->devinfo, i[0], false); +brw_inst_set_saturate(p->devinfo, i[2], false); + } + return true; + } else { + brw_PLN(p, dst, interp, delta_x); + return false; + } } else { i[0] = brw_LINE(p, brw_null_reg(), interp, delta_x); i[1] = brw_MAC(p, dst, suboffset(interp, 1), delta_y); diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp index dfd2c5c..6d25d51 100644 --- a/src/intel/compiler/brw_shader.cpp +++ b/src/intel/compiler/brw_shader.cpp @@ -985,7 +985,8 @@ backend_instruction::writes_accumulator_implicitly(const struct gen_device_info (devinfo->gen < 6 && ((opcode >= BRW_OPCODE_ADD && opcode < BRW_OPCODE_NOP) || (opcode >= FS_OPCODE_DDX_COARSE && opcode <= FS_OPCODE_LINTERP))) || - (opcode == FS_OPCODE_LINTERP && !devinfo->has_pln); + (opcode == FS_OPCODE_LINTERP && + (!devin
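The branch structure this patch adds to generate_linterp() can be summarised with a small decision helper. This is a sketch under assumptions, not Mesa code: struct linterp_plan and plan_linterp() are hypothetical names, and the earlier branch of the real function (the one ending in "return true;" before the has_pln check, which the diff does not show) is ignored here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what would be emitted for one LINTERP. */
struct linterp_plan {
   bool use_pln;        /* emit a single PLN instruction */
   int line_mac_pairs;  /* number of LINE+MAC pairs emitted otherwise */
};

/* gen, has_pln: device properties. delta_x_nr: register number of the
 * coordinate source. exec_size: SIMD width of the instruction. */
static struct linterp_plan
plan_linterp(int gen, bool has_pln, unsigned delta_x_nr, unsigned exec_size)
{
   struct linterp_plan plan = { false, 0 };

   if (!has_pln) {
      /* No PLN at all: the pre-existing LINE+MAC path. */
      plan.line_mac_pairs = 1;
   } else if (gen <= 6 && (delta_x_nr & 1) != 0) {
      /* PLN exists but needs an even-aligned coordinate register on
       * these generations. Fall back to LINE+MAC, one pair per SIMD8
       * piece because the inputs are laid out for PLN. */
      plan.line_mac_pairs = (exec_size == 8) ? 1 : 2;
   } else {
      plan.use_pln = true;
   }

   return plan;
}
```

Only the middle case is new behaviour: previously an odd register number on those generations hit an assertion instead of being handled.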
Re: [Mesa-dev] Gitlab migration
On Thu, May 24, 2018 at 6:46 AM, Daniel Stone wrote:
> Hi all,
> I'm going to attempt to interleave a bunch of replies here.
>
> On 23 May 2018 at 20:34, Jason Ekstrand wrote:
> > The freedesktop.org admins are trying to move as many projects and
> > services as possible over to gitlab and somehow I got hoodwinked into
> > spear-heading it for mesa. There are a number of reasons for this
> > change. Some of those reasons have to do with the maintenance cost of
> > our sprawling and aging infrastructure. Some of those reasons provide
> > significant benefit to the project being migrated:
>
> Thanks for starting the discussion! I appreciate the help.
>
> To be clear, we _are_ migrating the hosting for all projects, as in,
> the remote you push to will change. We've slowly staged this with a
> few projects of various shapes and sizes, and are confident that it
> more than holds up to the load. This is something we can pull the
> trigger on roughly any time, and I'm happy to do it whenever. When
> that happens, trying to push to ssh://git.fd.o will give you an error
> message explaining how to update your SSH keys, how to change your
> remotes, etc.
>
> cgit and anongit will not be orphaned: they remain as push mirrors so
> are updated simultaneously with GitLab pushes, as will the GitHub
> mirrors. Realistically, we can't deprecate anongit for a (very) long
> time due to the millions of Yocto forks which have that URL embedded
> in their build recipes. Running cgit alongside that is fairly
> low-intervention. And hey, if we look at the logs in five years' time
> and see 90% of people still using cgit to browse and not GitLab,
> that's a pretty strong hint that we should put effort into keeping it.

Well, I don't know what people are talking about. A cgit commit log is a tight table with 5 columns with information. I can't find anything like that in GitLab.
All I could find is this:
https://gitlab.freedesktop.org/jekstrand/mesa/commits/master
The elements are too large and don't have much information. Why would you have the author name on another line when you could add another column instead? There is a lot of unused screen space. And why have avatars in the commit log? It's not Facebook.

Then there is the project Overview page. It mostly just shows files in the top-level directory. Compare it with cgit, where the Overview page looks like a, guess what, overview!

OK, that was harsh, but there is a lot of truth to it. I guess GitLab is great for admins and I get that. Speaking of the web UI, at least the read-only view is impressively unimpressive.

Marek