[Mesa-dev] [PATCH] egl: autotools: add missing dependency on generated header

2018-05-25 Thread Philipp Zabel
platform_wayland.c includes linux-dmabuf-unstable-v1-client-protocol.h,
which is generated during build. Add the missing dependency to the
Makefile.

I have seen the following build failure due to a race between generation
of linux-dmabuf-unstable-v1-client-protocol.h and compilation of
platform_wayland.c:

  GEN  drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h
  GEN  drivers/dri2/linux-dmabuf-unstable-v1-protocol.c
Using "code" is deprecated - use private-code or public-code.
See the help page for details.
  CC   drivers/dri2/platform_wayland.lo
../../../Mesa-18.1.0/src/egl/drivers/dri2/platform_wayland.c: In function 'create_wl_buffer':
../../../Mesa-18.1.0/src/egl/drivers/dri2/platform_wayland.c:810:16: error: implicit declaration of function 'zwp_linux_dmabuf_v1_create_params' [-Werror=implicit-function-declaration]

Signed-off-by: Philipp Zabel 
---
 src/egl/Makefile.am | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
index 086a4a1e630..116ed4ebf50 100644
--- a/src/egl/Makefile.am
+++ b/src/egl/Makefile.am
@@ -80,6 +80,7 @@ drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h: $(WL_DMABUF_XML)
 if HAVE_PLATFORM_WAYLAND
 drivers/dri2/linux-dmabuf-unstable-v1-protocol.lo: drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h
 drivers/dri2/egl_dri2.lo: drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h
+drivers/dri2/platform_wayland.lo: drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h
 
 AM_CFLAGS += $(WAYLAND_CLIENT_CFLAGS)
 libEGL_common_la_LIBADD += $(WAYLAND_CLIENT_LIBS)
-- 
2.17.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] egl/android: Add DRM node probing and filtering

2018-05-25 Thread Tomasz Figa
Hi Rob,

Finally got to review this. Please see my comments inline.

On Fri, May 11, 2018 at 10:48 PM Robert Foss wrote:
[snip]
> +EGLBoolean
> +droid_load_driver(_EGLDisplay *disp)

Since this is not EGL-facing, I'd personally use bool.

> +{
> +   struct dri2_egl_display *dri2_dpy = disp->DriverData;
> +   const char *err;
> +
> +   dri2_dpy->driver_name = loader_get_driver_for_fd(dri2_dpy->fd);
> +   if (dri2_dpy->driver_name == NULL) {
> +  err = "DRI2: failed to get driver name";
> +  goto error;

It shouldn't be an error if there is no driver for given render node. We
should just skip it and try next one, which I believe would be achieved by
just returning false here.

> +   }
> +
> +   dri2_dpy->is_render_node = drmGetNodeTypeFromFd(dri2_dpy->fd) == DRM_NODE_RENDER;
> +
> +   if (!dri2_dpy->is_render_node) {
> +   #ifdef HAVE_DRM_GRALLOC
> +   /* Handle control nodes using __DRI_DRI2_LOADER extension and GEM names
> +    * for backwards compatibility with drm_gralloc. (Do not use on new
> +    * systems.) */
> +   dri2_dpy->loader_extensions = droid_dri2_loader_extensions;
> +   if (!dri2_load_driver(disp)) {
> +  err = "DRI2: failed to load driver";
> +  goto error;
> +   }
> +   #else
> +   err = "DRI2: handle is not for a render node";
> +   goto error;
> +   #endif
> +   } else {
> +   dri2_dpy->loader_extensions = droid_image_loader_extensions;
> +   if (!dri2_load_driver_dri3(disp)) {
> +  err = "DRI3: failed to load driver";
> +  goto error;
> +   }
> +}
> +
> +   return EGL_TRUE;
> +
> +error:
> +   free(dri2_dpy->driver_name);
> +   dri2_dpy->driver_name = NULL;
> +   return _eglError(EGL_NOT_INITIALIZED, err);

Hmm, if we signal EGL error here, we should break the probing loop and just
bail out. This would suggest that a boolean is not the right type for this
function to return. Perhaps something like negative error, 0 for skip and 1
for success would make sense?

Also, how does it play with the _eglError() called from the error path of
dri2_initialize_android()?
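
To make the "negative error, 0 for skip and 1 for success" idea concrete, here is a minimal sketch of a probing loop built on such a tri-state result. All names in it (PROBE_*, probe_all, demo_probe) are hypothetical and are not part of the patch under review; demo_probe only stands in for something like droid_load_driver().

```c
#include <stddef.h>

/* Hypothetical tri-state result: negative = hard error, 0 = skip, 1 = success. */
enum { PROBE_ERROR = -1, PROBE_SKIP = 0, PROBE_OK = 1 };

/* Hypothetical results of the whole probing loop. */
enum { PROBE_ALL_NONE = -1, PROBE_ALL_ERROR = -2 };

/* Hypothetical per-node probe callback. */
typedef int (*probe_fn)(int node);

/* Stand-in probe for illustration only: negative values fail hard,
 * zero means "no driver for this node", anything else succeeds. */
static int
demo_probe(int node)
{
   if (node < 0)
      return PROBE_ERROR;
   return node == 0 ? PROBE_SKIP : PROBE_OK;
}

/*
 * Walk the candidate nodes. A skip moves on to the next node, a hard
 * error aborts the loop, and the first success wins. Returns the index
 * of the chosen node, PROBE_ALL_NONE if every node was skipped, or
 * PROBE_ALL_ERROR if a probe reported a hard error.
 */
static int
probe_all(const int *nodes, size_t count, probe_fn probe)
{
   for (size_t i = 0; i < count; i++) {
      int ret = probe(nodes[i]);
      if (ret < 0)
         return PROBE_ALL_ERROR; /* bail out, do not try remaining nodes */
      if (ret > 0)
         return (int)i;          /* success */
      /* ret == 0: nothing usable here, try the next node */
   }
   return PROBE_ALL_NONE;
}
```

With this shape the caller can tell "nothing found" apart from "something went wrong", which a plain boolean cannot express.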

> +}
> +
> +static int
> +droid_probe_driver(_EGLDisplay *disp, int fd)
> +{
> +   struct dri2_egl_display *dri2_dpy = disp->DriverData;
> +   dri2_dpy->fd = fd;
> +
> +   if (!droid_load_driver(disp))
> +  return false;

Given my other suggestion about distinguishing failure, render node skip
and success, I think it should be more like this:

int ret = droid_load_driver(disp);
if (ret <= 0)
   return ret;

Or actually, maybe we don't really need to go as far as loading the driver.
I'd say it should be enough to just check if we have a driver for the
device by looking at what loader_get_driver_for_fd() returns. (In that
case, we can ignore my comment about returning error on
loader_get_driver_for_fd() failure in droid_load_driver(), since the
skipping would be handled only here.)

> +
> +   /* Since this probe can succeed, but another filter may fail,

Which other filter could fail? I can see the vendor name being checked
before calling this function.

The free() below is actually needed, just the comment is off. We need to
free the name to be able to probe remaining nodes, without leaking the name.

> +  this string needs to be deallocated either way.
> +  Once an FD has been found, this string will be set a second time. */
> +   free(dri2_dpy->driver_name);

Don't we also need to unload the driver?

> +   dri2_dpy->driver_name = NULL;
> +   return true;

To match the change above:

return 1;

> +}
> +
> +static int
> +droid_probe_device(_EGLDisplay *disp, int fd, drmDevicePtr dev, char *vendor)
> +{
> +   drmVersionPtr ver = drmGetVersion(fd);
> +   if (!ver)
> +   goto fail;

Something wrong with indentation here.

> +
> +   size_t vendor_len = strlen(vendor);
> +   if (vendor_len != 0 && strncmp(vendor, ver->name, vendor_len))
> +  goto fail;

Maybe it's just me, but I don't see any point in using strncmp() if the
length argument is obtained by calling strlen() first. Especially if the
strlen() call is on a string that comes from some external code
(property_get()).

Perhaps we could just call strncmp() with PROPERTY_VALUE_MAX? This would
actually play nice with my other comment about using NULL for vendor
string, if the property is not present.
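
A small sketch of the difference, under stated assumptions: VENDOR_NAME_MAX is a stand-in for Android's PROPERTY_VALUE_MAX (its value here is assumed), and both helper names are hypothetical. Note that the two variants are not equivalent: taking the length from strlen() of the filter gives a prefix match, while a fixed bound compares the whole strings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for Android's PROPERTY_VALUE_MAX; the value is an assumption. */
#define VENDOR_NAME_MAX 92

/* Prefix match, as in the patch: the length comes from the filter itself,
 * so an empty filter accepts everything and "vir" accepts "virtio_gpu". */
static bool
vendor_matches_prefix(const char *vendor, const char *name)
{
   size_t len = strlen(vendor);
   return len == 0 || strncmp(vendor, name, len) == 0;
}

/* Bounded full comparison: a NULL filter means "accept any driver",
 * otherwise the strings must be equal within the bound. */
static bool
vendor_matches_bounded(const char *vendor, const char *name)
{
   return vendor == NULL || strncmp(vendor, name, VENDOR_NAME_MAX) == 0;
}
```

Whether prefix matching is intended is a question for the patch author; the sketch only shows that switching the length argument changes behaviour.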

Also nit: The label could be named in a more meaningful way, e.g.
err_free_version.

> +
> +   if (!droid_probe_driver(disp, fd))
> +  goto fail;
> +
> +   drmFreeVersion(ver);
> +   return true;
> +
> +fail:
> +   drmFreeVersion(ver);
> +   return false;

Given my other suggestion about distinguishing failure, render node skip
and success, I think it should be more like this:

ret = droid_probe_driver(disp, fd);
err_free_version:
drmFreeVersion(ver);
return ret;

> +}
> +
> +static int
> +droid_open_device(_EGLDisplay *disp)
> +{
> +   const int MAX_DRM_DEVICES = 32;
> +   int prop_set, num_devices, ret;
> +   int fd = -1, fallbac

Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs

2018-05-25 Thread Tomasz Figa
Hi Rob,

On Thu, May 24, 2018 at 8:23 PM Robert Foss wrote:

> Hey,

> I don't think I've received any feedback on this version yet.
> If anyone has some time to spare, it would be nice to get it merged.

Really sorry for taking so long to review. Posted my comments just now.

Best regards,
Tomasz


> Just to be clear about the libdrm branch linked in the cover letter,
> it is not required. Only for virgl platforms which happens to be what
> I tested on.


> Rob.

> On 2018-05-11 15:47, Robert Foss wrote:
> > This series replaces the dependency on
> > GRALLOC_MODULE_PERFORM_GET_DRM_FD with DRM node
> > probing and disables the support for drm_gralloc.
> >
> > The series has been tested on Qemu+AOSP, where a
> > virtio gpu was successfully probed for and
> > opened.
> >
> > This however required adding support in libdrm
> > for virtio gpus, and virtio buses. An initial
> > patch for this can be found here:
> >
> > https://gitlab.collabora.com/robertfoss/libdrm/tree/virtio_rfc
> >
> > Changes since v1:
> >   - Added fix for build issue
> >   - Do not rely on libdrm for probing
> >   - Distinguish between errors and when no drm devices are found
> >
> > Changes since RFC:
> >   - Rebased work on the libdrm patch [2].
> >   - Included patch from Rob Herring disabling drm_gralloc/flink
> > support by default.
> >   - Added device handler driver probing.
> >
> >
> > Rob Herring (1):
> >egl/android: #ifdef out flink name support
> >
> > Robert Foss (2):
> >gallium/util: Fix build error due to cast to different size
> >egl/android: Add DRM node probing and filtering
> >
> >   src/egl/Android.mk|   6 +-
> >   src/egl/drivers/dri2/egl_dri2.h   |   2 -
> >   src/egl/drivers/dri2/platform_android.c   | 206 ++
> >   .../auxiliary/util/u_debug_stack_android.cpp  |   4 +-
> >   4 files changed, 174 insertions(+), 44 deletions(-)
> >


Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs

2018-05-25 Thread Robert Foss

Hey,

On 2018-05-25 02:17, Rob Herring wrote:
> On Thu, May 24, 2018 at 6:23 AM, Robert Foss wrote:
>> Hey,
>>
>> I don't think I've received any feedback on this version yet.
>> If anyone has some time to spare, it would be nice to get it merged.
>>
>> Just to be clear about the libdrm branch linked in the cover letter,
>> it is not required. Only for virgl platforms which happens to be what
>> I tested on.
>
> virgl will still fallback to using the first render node without those
> libdrm changes, right? If not, I don't think we should apply until
> we're not breaking a platform...

No it will not fall back. I agree that holding off makes more sense.

Emil Velikov had some objections to the approach in the libdrm branch,
and started a new branch from scratch with the same goals. It isn't
yet fully functional, but I'm working with him to have it sent out
as soon as possible.


Rob.





Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs

2018-05-25 Thread Tomasz Figa
On Fri, May 25, 2018 at 5:33 PM Robert Foss wrote:

> Hey,

> On 2018-05-25 02:17, Rob Herring wrote:
> > On Thu, May 24, 2018 at 6:23 AM, Robert Foss wrote:
> >> Hey,
> >>
> >> I don't think I've received any feedback on this version yet.
> >> If anyone has some time to spare, it would be nice to get it merged.
> >>
> >> Just to be clear about the libdrm branch linked in the cover letter,
> >> it is not required. Only for virgl platforms which happens to be what
> >> I tested on.
> >
> > virgl will still fallback to using the first render node without those
> > libdrm changes, right? If not, I don't think we should apply until
> > we're not breaking a platform...

> No it will not fall back. I agree that holding off makes more sense.

What's the reason for these problems? Is it because of drmGetDevices()? Since
we don't really use it for anything other than getting the list of render
nodes in the system, maybe we could just iterate over any /dev/renderD*
nodes explicitly and avoid introducing new problems?
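
A rough sketch of what such an explicit walk could look like. The directory layout (/dev/dri/renderD128 and up), the range of 64 minors, and both function names are assumptions for illustration, not something taken from the patch:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Assumed conventions: render minors start at 128 and there are at most 64. */
#define RENDER_NODE_FIRST 128
#define RENDER_NODE_COUNT 64

/* Build "<dir>/renderD<N>" for the index-th render node.
 * Returns 0 on success, -1 if the path does not fit into buf. */
static int
render_node_path(char *buf, size_t size, const char *dir, int index)
{
   int n = snprintf(buf, size, "%s/renderD%d", dir, RENDER_NODE_FIRST + index);
   return (n > 0 && (size_t)n < size) ? 0 : -1;
}

/* Try every candidate path in order and return the first fd that opens,
 * or -1 if none could be opened. A real probe would also check the driver
 * behind each fd before accepting it. */
static int
open_first_render_node(const char *dir)
{
   char path[64];

   for (int i = 0; i < RENDER_NODE_COUNT; i++) {
      if (render_node_path(path, sizeof(path), dir, i) < 0)
         continue;
      int fd = open(path, O_RDWR);
      if (fd >= 0)
         return fd;
   }
   return -1;
}
```

This avoids drmGetDevices() entirely, at the cost of hard-coding the node naming scheme.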

Best regards,
Tomasz


Re: [Mesa-dev] [PATCH] docs: update release calendar for 18.1 series

2018-05-25 Thread Juan A. Suarez Romero
On Tue, 2018-05-22 at 10:48 -0700, Dylan Baker wrote:
> This looks good to me. I'm also open to doing the 18.1.x releases if Emil
> would rather not do them (and I have been updating my 18.1-proposed branch
> either way).
> 

I'm fine if you do 18.1.x releases. In fact, I think it would be a good idea if
the same person takes care of a full release cycle, from feature releases to the
stable releases.



J.A.

> Reviewed-by: Dylan Baker 
> 
> Quoting Juan A. Suarez Romero (2018-05-22 00:48:48)
> > CC: Andres Gomez 
> > CC: Emil Velikov 
> > CC: Dylan Baker 
> > ---
> > 
> > As per calendar 18.2.0rc1 starts after the last 18.1.x release, either
> > we need to update the release calendar for 18.2 series, or extend 18.1
> > series.
> > 
> > 
> >  docs/release-calendar.html | 34 +-
> >  1 file changed, 5 insertions(+), 29 deletions(-)
> > 
> > diff --git a/docs/release-calendar.html b/docs/release-calendar.html
> > index ba297532dc3..c67eea1a9de 100644
> > --- a/docs/release-calendar.html
> > +++ b/docs/release-calendar.html
> > @@ -46,50 +46,26 @@ if you'd like to nominate a patch in the next stable 
> > release.
> >  Last planned 18.0.x release
> >  
> >  
> > -18.1
> > -2018-04-20
> > -18.1.0rc1
> > -Dylan Baker
> > -
> > -
> > -
> > -2018-04-27
> > -18.1.0rc2
> > -Dylan Baker
> > -
> > -
> > -
> > -2018-05-04
> > -18.1.0rc3
> > -Dylan Baker
> > -
> > -
> > -
> > -2018-05-11
> > -18.1.0rc4
> > -Dylan Baker
> > -Last planned RC/Final release
> > -
> > -
> > -TBD
> > +18.1
> > +2018-06-01
> >  18.1.1
> >  Emil Velikov
> >  
> >  
> >  
> > -TBD
> > +2018-06-15
> >  18.1.2
> >  Emil Velikov
> >  
> >  
> >  
> > -TBD
> > +2018-06-29
> >  18.1.3
> >  Emil Velikov
> >  
> >  
> >  
> > -TBD
> > +2018-07-13
> >  18.1.4
> >  Emil Velikov
> >  Last planned RC/Final release
> > -- 
> > 2.17.0
> > 


Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs

2018-05-25 Thread Robert Foss



On 2018-05-25 10:38, Tomasz Figa wrote:
> On Fri, May 25, 2018 at 5:33 PM Robert Foss wrote:
>> Hey,
>>
>> On 2018-05-25 02:17, Rob Herring wrote:
>>> On Thu, May 24, 2018 at 6:23 AM, Robert Foss wrote:
>>>> Hey,
>>>>
>>>> I don't think I've received any feedback on this version yet.
>>>> If anyone has some time to spare, it would be nice to get it merged.
>>>>
>>>> Just to be clear about the libdrm branch linked in the cover letter,
>>>> it is not required. Only for virgl platforms which happens to be what
>>>> I tested on.
>>>
>>> virgl will still fallback to using the first render node without those
>>> libdrm changes, right? If not, I don't think we should apply until
>>> we're not breaking a platform...
>>
>> No it will not fall back. I agree that holding off makes more sense.
>
> What's the reason for these problems? Is it because of drmGetDevices()? Since
> we don't really use it for anything other than getting the list of render
> nodes in the system, maybe we could just iterate over any /dev/renderD*
> nodes explicitly and avoid introducing new problems?


That's exactly the problem, and yes, we could solve it completely by iterating
over /dev/renderD* nodes. I originally assumed we wouldn't want to do that, but
would rather use the libdrm interfaces.

For the next spin I could avoid using libdrm. Should I?


Rob.





Re: [Mesa-dev] [PATCH 2/2] radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS

2018-05-25 Thread Samuel Pitoiset



On 05/25/2018 04:28 AM, Timothy Arceri wrote:
> On 25/05/18 11:24, Bas Nieuwenhuizen wrote:
>> On Fri, May 25, 2018 at 2:25 AM, Timothy Arceri wrote:
>>> From what I recall with my testing on radeonsi this wasn't really the
>>> ideal thing to do. Especially when varying arrays are accessed via an
>>> indirect index, register use very quickly gets out of control.
>>
>> in radv we lower all indirect accesses in nir anyway, so that doesn't
>> really happen in the backend anymore.
>
> That's only for Polaris and higher though, and even then I thought that
> was an LLVM bug that should eventually be fixed?

I don't know, I didn't hit this potential LLVM bug.








On 23/05/18 22:31, Samuel Pitoiset wrote:


Do not lower FS inputs because this moves all load_var
instructions at beginning of shaders and because
interp_var_at_sample (and friends) seem broken. That might
be eventually enabled later on if we really want to preload
all FS inputs at beginning.

Polaris10:
Totals from affected shaders:
SGPRS: 54072 -> 54264 (0.36 %)
VGPRS: 38580 -> 38124 (-1.18 %)
Spilled SGPRs: 652 -> 652 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 2128116 -> 2127380 (-0.03 %) bytes
Max Waves: 8048 -> 8086 (0.47 %)

Vega10:
Totals from affected shaders:
SGPRS: 52616 -> 52656 (0.08 %)
VGPRS: 37536 -> 37116 (-1.12 %)
Spilled SGPRs: 828 -> 828 (0.00 %)
Code Size: 2043756 -> 2042672 (-0.05 %) bytes
Max Waves: 9176 -> 9254 (0.85 %)

Signed-off-by: Samuel Pitoiset 
---
   src/amd/vulkan/radv_shader.c | 10 ++
   1 file changed, 10 insertions(+)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 7ed5d2a421..84ad215ccb 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -278,6 +278,16 @@ radv_shader_compile_to_nir(struct radv_device *device,
        nir_lower_vars_to_ssa(nir);
 
+       if (nir->info.stage == MESA_SHADER_VERTEX ||
+           nir->info.stage == MESA_SHADER_GEOMETRY) {
+               NIR_PASS_V(nir, nir_lower_io_to_temporaries,
+                          nir_shader_get_entrypoint(nir), true, true);
+       } else if (nir->info.stage == MESA_SHADER_TESS_EVAL||
+                  nir->info.stage == MESA_SHADER_FRAGMENT) {
+               NIR_PASS_V(nir, nir_lower_io_to_temporaries,
+                          nir_shader_get_entrypoint(nir), true, false);
+       }
+
        nir_split_var_copies(nir);
        nir_lower_var_copies(nir);





Re: [Mesa-dev] [PATCH 2/2] radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS

2018-05-25 Thread Timothy Arceri

On 25/05/18 19:57, Samuel Pitoiset wrote:
> On 05/25/2018 04:28 AM, Timothy Arceri wrote:
>> On 25/05/18 11:24, Bas Nieuwenhuizen wrote:
>>> On Fri, May 25, 2018 at 2:25 AM, Timothy Arceri wrote:
>>>> From what I recall with my testing on radeonsi this wasn't really the
>>>> ideal thing to do. Especially when varying arrays are accessed via an
>>>> indirect index, register use very quickly gets out of control.
>>>
>>> in radv we lower all indirect accesses in nir anyway, so that doesn't
>>> really happen in the backend anymore.
>>
>> That's only for Polaris and higher though, and even then I thought that
>> was an LLVM bug that should eventually be fixed?
>
> I don't know, I didn't hit this potential LLVM bug.

I just mean isn't that the only reason we lower indirect access for some
varyings in RADV/radeonsi? Because of missing support in LLVM.













Re: [Mesa-dev] [PATCH 2/2] radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS

2018-05-25 Thread Timothy Arceri



On 25/05/18 20:40, Timothy Arceri wrote:
> On 25/05/18 19:57, Samuel Pitoiset wrote:
>> On 05/25/2018 04:28 AM, Timothy Arceri wrote:
>>> On 25/05/18 11:24, Bas Nieuwenhuizen wrote:
>>>> On Fri, May 25, 2018 at 2:25 AM, Timothy Arceri wrote:
>>>>> From what I recall with my testing on radeonsi this wasn't really the
>>>>> ideal thing to do. Especially when varying arrays are accessed via an
>>>>> indirect index, register use very quickly gets out of control.
>>>>
>>>> in radv we lower all indirect accesses in nir anyway, so that doesn't
>>>> really happen in the backend anymore.
>>>
>>> That's only for Polaris and higher though, and even then I thought that
>>> was an LLVM bug that should eventually be fixed?
>>
>> I don't know, I didn't hit this potential LLVM bug.
>
> I just mean isn't that the only reason we lower indirect access for some
> varyings in RADV/radeonsi? Because of missing support in LLVM.

Also, if I'm recalling correctly, I believe the tgsi radeonsi backend does
something slightly better to work around that than what the NIR backend
and RADV do.
















Re: [Mesa-dev] [PATCH 01/16] Added ci yaml file for Gitlab.

2018-05-25 Thread Eric Engestrom
On Thursday, 2018-05-24 17:27:04 -0700, Laura Ekstrand wrote:
> For now, all this does is copy our current webpage into a public folder.
> Daniel Stone has the server configured to check this public folder and
> host the index.html as mesa-test.freedesktop.org. When this patch series
> is approved, Daniel will change it to point at mesa-3d.org.
> ---
>  .gitlab-ci.yml | 9 +
>  1 file changed, 9 insertions(+)
>  create mode 100644 .gitlab-ci.yml
> 
> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> new file mode 100644
> index 00..29b30541b5
> --- /dev/null
> +++ b/.gitlab-ci.yml
> @@ -0,0 +1,9 @@
> +pages:
> +   stage: deploy
> +   script:
> +   - mkdir .public
> +   - cp -r docs/* .public
> +   - mv .public public

I don't think the two-steps thing is needed here; you can drop .public
and have everything in public directly.

If I'm misunderstanding gitlab-ci and this is running on the same
filesystem as the website, then you'll need to `rm -r public` before the
move, otherwise `mv .public public` will not do what you want :)

> +   artifacts:
> + paths:
> + - public
> -- 
> 2.14.3
> 


[Mesa-dev] [Bug 105464] Reading per-patch outputs in Tessellation Control Shader returns undefined values

2018-05-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105464

--- Comment #16 from Samuel Pitoiset  ---
No CTS regressions, I'm fine with it.



Re: [Mesa-dev] [PATCH 02/16] docs: Add python script that converts html to rst.

2018-05-25 Thread Eric Engestrom
On Thursday, 2018-05-24 17:27:05 -0700, Laura Ekstrand wrote:
> Use Beautiful Soup to fix bad html, then use pandoc for converting to
> rst.
> ---
>  docs/rstConverter.py | 23 +++
>  1 file changed, 23 insertions(+)
>  create mode 100755 docs/rstConverter.py
> 
> diff --git a/docs/rstConverter.py b/docs/rstConverter.py
> new file mode 100755
> index 00..5321fdde8b
> --- /dev/null
> +++ b/docs/rstConverter.py
> @@ -0,0 +1,23 @@
> +#!/usr/bin/python3
> +import glob
> +import subprocess
> +from bs4 import BeautifulSoup
> +
> +pages = glob.glob("*.html")
> +pages += glob.glob("relnotes/*.html")
> +for filename in pages:
> +    # Fix some annoyingly bad html.
> +    with open(filename) as f:
> +        soup = BeautifulSoup(f, 'html5lib')
> +    soup.find("div", "header").extract() # Get rid of old header
> +    soup.iframe.extract() # Get rid of old contents bar.
> +    soup.find("div", "content").unwrap() # Strip the content div.

Good call on using beautifulsoup to clean the html before converting it!

> +
> +    # Write out the better html.
> +    with open(filename, 'wt') as f:
> +        f.write(str(soup))
> +
> +    # Convert to rst with pandoc.
> +    name = filename.split(".html")[0]
> +    bashCmd = "pandoc " + filename + " -o " + name + ".rst"
> +    subprocess.run(bashCmd.split())

Idea: remove the old html at the same time as we introduce the rst
(commit-wise), so that git picks it up as a rename with changes, which
hopefully would be easier to check as a 1:1 of any given conversion?

(In case this is as unclear as I think it is, I'm thinking about how we
can review individual page conversions, say index.html -> index.rst, to
see that no release has been dropped in the process. If git shows this
as a rename with changes, I expect it will be easier to check than if
one commit creates all the rst files and another deletes all the html)


[Mesa-dev] [Bug 105464] Reading per-patch outputs in Tessellation Control Shader returns undefined values

2018-05-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105464

--- Comment #17 from Nicolai Hähnle  ---
Great, thanks for testing!



Re: [Mesa-dev] [PATCH 01/16] Added ci yaml file for Gitlab.

2018-05-25 Thread Daniel Stone
Hi Eric,

On 25 May 2018 at 12:15, Eric Engestrom  wrote:
> If I'm misunderstanding gitlab-ci and this is running on the same
> filesystem as the website, then you'll need to `rm -r public` before the
> move, otherwise `mv .public public` will not do what you want :)

It's always run in a fresh container, and the public/ directory is
captured from that and installed later behind the scenes. So there's
no need to do it here.

Cheers,
Daniel


Re: [Mesa-dev] [PATCH 2/2] radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS

2018-05-25 Thread Bas Nieuwenhuizen
On Fri, May 25, 2018 at 12:45 PM, Timothy Arceri wrote:
> On 25/05/18 20:40, Timothy Arceri wrote:
>> On 25/05/18 19:57, Samuel Pitoiset wrote:
>>> On 05/25/2018 04:28 AM, Timothy Arceri wrote:
>>>> On 25/05/18 11:24, Bas Nieuwenhuizen wrote:
>>>>> On Fri, May 25, 2018 at 2:25 AM, Timothy Arceri wrote:
>>>>>> From what I recall with my testing on radeonsi this wasn't really the
>>>>>> ideal thing to do. Especially when varying arrays are accessed via an
>>>>>> indirect index, register use very quickly gets out of control.
>>>>>
>>>>> in radv we lower all indirect accesses in nir anyway, so that doesn't
>>>>> really happen in the backend anymore.
>>>>
>>>> That's only for Polaris and higher though, and even then I thought that
>>>> was an LLVM bug that should eventually be fixed?
>>>
>>> I don't know, I didn't hit this potential LLVM bug.
>>
>> I just mean isn't that the only reason we lower indirect access for some
>> varyings in RADV/radeonsi? Because of missing support in LLVM.
>
> Also, if I'm recalling correctly, I believe the tgsi radeonsi backend does
> something slightly better to work around that than what the NIR backend and
> RADV do.

So for Vega+ we lower indirect indexing for everything because it is
utterly broken in LLVM.

For the other GPUs we lower locals, as large vectors + spilling =
nightmare. radeonsi solves it by explicitly putting the large arrays
in memory. That way you load only one value on an indirect deref
instead of loading the entire array -> doing the indirect deref ->
spilling the entire array.

>
>
>
>>
>>>


>>
>>
>>
>> On 23/05/18 22:31, Samuel Pitoiset wrote:
>>>
>>>
>>> Do not lower FS inputs because this moves all load_var
>>> instructions at beginning of shaders and because
>>> interp_var_at_sample (and friends) seem broken. That might
>>> be eventually enabled later on if we really want to preload
>>> all FS inputs at beginning.
>>>
>>> Polaris10:
>>> Totals from affected shaders:
>>> SGPRS: 54072 -> 54264 (0.36 %)
>>> VGPRS: 38580 -> 38124 (-1.18 %)
>>> Spilled SGPRs: 652 -> 652 (0.00 %)
>>> Spilled VGPRs: 0 -> 0 (0.00 %)
>>> Code Size: 2128116 -> 2127380 (-0.03 %) bytes
>>> Max Waves: 8048 -> 8086 (0.47 %)
>>>
>>> Vega10:
>>> Totals from affected shaders:
>>> SGPRS: 52616 -> 52656 (0.08 %)
>>> VGPRS: 37536 -> 37116 (-1.12 %)
>>> Spilled SGPRs: 828 -> 828 (0.00 %)
>>> Code Size: 2043756 -> 2042672 (-0.05 %) bytes
>>> Max Waves: 9176 -> 9254 (0.85 %)
>>>
>>> Signed-off-by: Samuel Pitoiset 
>>> ---
>>>src/amd/vulkan/radv_shader.c | 10 ++
>>>1 file changed, 10 insertions(+)
>>>
>>> diff --git a/src/amd/vulkan/radv_shader.c
>>> b/src/amd/vulkan/radv_shader.c
>>> index 7ed5d2a421..84ad215ccb 100644
>>> --- a/src/amd/vulkan/radv_shader.c
>>> +++ b/src/amd/vulkan/radv_shader.c
>>> @@ -278,6 +278,16 @@ radv_shader_compile_to_nir(struct radv_device
>>> *device,
>>>  nir_lower_vars_to_ssa(nir);
>>>+ if (nir->info.stage == MESA_SHADER_VERTEX ||
>>> +   nir->info.stage == MESA_SHADER_GEOMETRY) {
>>> +   NIR_PASS_V(nir, nir_lower_io_to_temporaries,
>>> +  nir_shader_get_entrypoint(nir), true,
>>> true);
>>> +   } else if (nir->info.stage == MESA_SHADER_TESS_EVAL||
>>> +  nir->info.stage == MESA_SHADER_FRAGMENT) {
>>> +   NIR_PASS_V(nir, nir_lower_io_to_temporaries,
>>> +  nir_shader_get_entrypoint(nir), true,
>>> false);
>>> +   }
>>> +
>>>  nir_split_var_copies(nir);
>>>  nir_lower_var_copies(nir);
>>>
>>


[Mesa-dev] [Bug 98581] Dota 2 graphics glitch on autocast abilities.

2018-05-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98581

Samuel Pitoiset  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |WORKSFORME

--- Comment #1 from Samuel Pitoiset  ---
I don't think this can still be reproduced. Dota2 and RADV have evolved a lot
since the original bug report. I'm going to close it. Feel free to re-open if
I'm wrong (and explain how to reproduce).

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [PATCH 2/3] radv: allow radv_emit_shader_pointer_head() to emit more pointers

2018-05-25 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_private.h | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index e554fc7acc..708cacf770 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -1132,9 +1132,11 @@ bool radv_get_memory_fd(struct radv_device *device,
 
 static inline void
 radv_emit_shader_pointer_head(struct radeon_winsys_cs *cs,
- unsigned sh_offset, bool use_32bit_pointers)
+ unsigned sh_offset, unsigned pointer_count,
+ bool use_32bit_pointers)
 {
-   radeon_set_sh_reg_seq(cs, sh_offset, use_32bit_pointers ? 1 : 2);
+   radeon_emit(cs, PKT3(PKT3_SET_SH_REG, pointer_count * 
(use_32bit_pointers ? 1 : 2), 0));
+   radeon_emit(cs, (sh_offset - SI_SH_REG_OFFSET) >> 2);
 }
 
 static inline void
@@ -1159,7 +1161,7 @@ radv_emit_shader_pointer(struct radv_device *device,
 {
bool use_32bit_pointers = HAVE_32BIT_POINTERS && !global;
 
-   radv_emit_shader_pointer_head(cs, sh_offset, use_32bit_pointers);
+   radv_emit_shader_pointer_head(cs, sh_offset, 1, use_32bit_pointers);
radv_emit_shader_pointer_body(device, cs, va, use_32bit_pointers);
 }
 
-- 
2.17.0



[Mesa-dev] [PATCH 1/3] radv: split radv_emit_shader_pointer()

2018-05-25 Thread Samuel Pitoiset
This will allow emitting consecutive shader pointers, which
reduces the number of emitted SET_SH_REG packets, as
recommended.
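
As a rough illustration of why the head/body split helps, the sketch below
counts dwords in a toy command stream. Everything here is hypothetical
(struct toy_cs, the header value, the helper names); it does not use the real
radeon_winsys_cs or packet encoding, and it assumes 32-bit pointers so each
body is a single dword.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy command stream, NOT the real radeon_winsys_cs. */
struct toy_cs {
    uint32_t buf[64];
    size_t   cdw;     /* dwords written so far */
    unsigned packets; /* packet headers written so far */
};

static void toy_emit(struct toy_cs *cs, uint32_t dw)
{
    cs->buf[cs->cdw++] = dw;
}

/* Head: one header dword plus one register-offset dword.
 * The header value is a placeholder, not a real PKT3 encoding. */
static void toy_pointer_head(struct toy_cs *cs, uint32_t reg, unsigned count)
{
    toy_emit(cs, 0xC0000000u | count);
    toy_emit(cs, reg);
    cs->packets++;
}

/* Body: one dword per pointer (32-bit pointers assumed). */
static void toy_pointer_body(struct toy_cs *cs, uint64_t va)
{
    toy_emit(cs, (uint32_t)va);
}

/* One packet per pointer: n heads and n bodies. */
static void toy_emit_separately(struct toy_cs *cs, uint32_t base_reg,
                                const uint64_t *vas, unsigned n)
{
    for (unsigned i = 0; i < n; i++) {
        toy_pointer_head(cs, base_reg + i, 1);
        toy_pointer_body(cs, vas[i]);
    }
}

/* One packet for all pointers: a single head followed by n bodies. */
static void toy_emit_consecutively(struct toy_cs *cs, uint32_t base_reg,
                                   const uint64_t *vas, unsigned n)
{
    toy_pointer_head(cs, base_reg, n);
    for (unsigned i = 0; i < n; i++)
        toy_pointer_body(cs, vas[i]);
}
```

With four pointers the first variant writes 12 dwords in 4 packets and the
second writes 6 dwords in 1 packet, which is the saving the split is meant to
enable.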

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_private.h | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index e2fa58d8d1..e554fc7acc 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -1131,13 +1131,17 @@ bool radv_get_memory_fd(struct radv_device *device,
int *pFD);
 
 static inline void
-radv_emit_shader_pointer(struct radv_device *device,
-struct radeon_winsys_cs *cs,
-uint32_t sh_offset, uint64_t va, bool global)
+radv_emit_shader_pointer_head(struct radeon_winsys_cs *cs,
+ unsigned sh_offset, bool use_32bit_pointers)
 {
-   bool use_32bit_pointers = HAVE_32BIT_POINTERS && !global;
-
radeon_set_sh_reg_seq(cs, sh_offset, use_32bit_pointers ? 1 : 2);
+}
+
+static inline void
+radv_emit_shader_pointer_body(struct radv_device *device,
+ struct radeon_winsys_cs *cs,
+ uint64_t va, bool use_32bit_pointers)
+{
radeon_emit(cs, va);
 
if (use_32bit_pointers) {
@@ -1148,6 +1152,17 @@ radv_emit_shader_pointer(struct radv_device *device,
}
 }
 
+static inline void
+radv_emit_shader_pointer(struct radv_device *device,
+struct radeon_winsys_cs *cs,
+uint32_t sh_offset, uint64_t va, bool global)
+{
+   bool use_32bit_pointers = HAVE_32BIT_POINTERS && !global;
+
+   radv_emit_shader_pointer_head(cs, sh_offset, use_32bit_pointers);
+   radv_emit_shader_pointer_body(device, cs, va, use_32bit_pointers);
+}
+
 static inline struct radv_descriptor_state *
 radv_get_descriptors_state(struct radv_cmd_buffer *cmd_buffer,
   VkPipelineBindPoint bind_point)
-- 
2.17.0



[Mesa-dev] [PATCH 3/3] radv: emit shader descriptor pointers consecutively

2018-05-25 Thread Samuel Pitoiset
This reduces the number of SET_SH_REG packets which are emitted
for applications that use more than one descriptor set per stage.

We should be able to emit more SET_SH_REG packets consecutively
(like push constants and vertex buffers for the vertex stage),
but this will be improved later.
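
For readers unfamiliar with the range-scanning idea the patch relies on, here
is a minimal standalone sketch of popping runs of consecutive set bits from a
dirty mask. The function name scan_consecutive_range is made up for this
example and the body is a plain re-implementation written for clarity, not the
Mesa utility itself.

```c
#include <assert.h>
#include <stdint.h>

/* Remove the lowest run of consecutive set bits from *mask and report
 * where it started and how long it was. Sketch only. */
static void scan_consecutive_range(uint32_t *mask, int *start, int *count)
{
    uint32_t m = *mask;
    int s = 0, c = 0;

    if (m == 0) { /* nothing to scan */
        *start = 0;
        *count = 0;
        return;
    }

    while (!(m & 1u)) { /* skip clear bits below the run */
        m >>= 1;
        s++;
    }
    while (m & 1u) {    /* measure the run */
        m >>= 1;
        c++;
    }

    *start = s;
    *count = c;

    /* Clear the run; a full 32-bit run is handled separately to avoid
     * an out-of-range shift. */
    uint32_t run = (c == 32) ? 0xFFFFFFFFu : (((1u << c) - 1u) << s);
    *mask &= ~run;
}
```

Each run found this way can be emitted with one packet header followed by
`count` pointer bodies, instead of one header per descriptor set.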

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 104 +--
 1 file changed, 57 insertions(+), 47 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 5ab577b4c5..206d9b7fad 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -594,6 +594,46 @@ radv_emit_userdata_address(struct radv_cmd_buffer 
*cmd_buffer,
 base_reg + loc->sgpr_idx * 4, va, false);
 }
 
+static void
+radv_emit_descriptor_pointers(struct radv_cmd_buffer *cmd_buffer,
+ struct radv_pipeline *pipeline,
+ struct radv_descriptor_state *descriptors_state,
+ gl_shader_stage stage)
+{
+   struct radv_device *device = cmd_buffer->device;
+   struct radeon_winsys_cs *cs = cmd_buffer->cs;
+   uint32_t sh_base = pipeline->user_data_0[stage];
+   struct radv_userdata_locations *locs =
+   &pipeline->shaders[stage]->info.user_sgprs_locs;
+   unsigned mask;
+
+   mask = descriptors_state->dirty & descriptors_state->valid;
+
+   for (int i = 0; i < MAX_SETS; i++) {
+   struct radv_userdata_info *loc = &locs->descriptor_sets[i];
+   if (loc->sgpr_idx != -1 && !loc->indirect)
+   continue;
+   mask &= ~(1 << i);
+   }
+
+   while (mask) {
+   int start, count;
+
+   u_bit_scan_consecutive_range(&mask, &start, &count);
+
+   struct radv_userdata_info *loc = &locs->descriptor_sets[start];
+   unsigned sh_offset = sh_base + loc->sgpr_idx * 4;
+
+   radv_emit_shader_pointer_head(cs, sh_offset, count, true);
+   for (int i = 0; i < count; i++) {
+   struct radv_descriptor_set *set =
+   descriptors_state->sets[start + i];
+
+   radv_emit_shader_pointer_body(device, cs, set->va, 
true);
+   }
+   }
+}
+
 static void
 radv_update_multisample_state(struct radv_cmd_buffer *cmd_buffer,
  struct radv_pipeline *pipeline)
@@ -1429,47 +1469,6 @@ radv_cmd_buffer_flush_dynamic_state(struct 
radv_cmd_buffer *cmd_buffer)
cmd_buffer->state.dirty &= ~states;
 }
 
-static void
-emit_stage_descriptor_set_userdata(struct radv_cmd_buffer *cmd_buffer,
-  struct radv_pipeline *pipeline,
-  int idx,
-  uint64_t va,
-  gl_shader_stage stage)
-{
-   struct radv_userdata_info *desc_set_loc = 
&pipeline->shaders[stage]->info.user_sgprs_locs.descriptor_sets[idx];
-   uint32_t base_reg = pipeline->user_data_0[stage];
-
-   if (desc_set_loc->sgpr_idx == -1 || desc_set_loc->indirect)
-   return;
-
-   assert(!desc_set_loc->indirect);
-   assert(desc_set_loc->num_sgprs == (HAVE_32BIT_POINTERS ? 1 : 2));
-
-   radv_emit_shader_pointer(cmd_buffer->device, cmd_buffer->cs,
-base_reg + desc_set_loc->sgpr_idx * 4, va, 
false);
-}
-
-static void
-radv_emit_descriptor_set_userdata(struct radv_cmd_buffer *cmd_buffer,
- VkShaderStageFlags stages,
- struct radv_descriptor_set *set,
- unsigned idx)
-{
-   if (cmd_buffer->state.pipeline) {
-   radv_foreach_stage(stage, stages) {
-   if (cmd_buffer->state.pipeline->shaders[stage])
-   emit_stage_descriptor_set_userdata(cmd_buffer, 
cmd_buffer->state.pipeline,
-  idx, set->va,
-  stage);
-   }
-   }
-
-   if (cmd_buffer->state.compute_pipeline && (stages & 
VK_SHADER_STAGE_COMPUTE_BIT))
-   emit_stage_descriptor_set_userdata(cmd_buffer, 
cmd_buffer->state.compute_pipeline,
-  idx, set->va,
-  MESA_SHADER_COMPUTE);
-}
-
 static void
 radv_flush_push_descriptors(struct radv_cmd_buffer *cmd_buffer,
VkPipelineBindPoint bind_point)
@@ -1551,7 +1550,6 @@ radv_flush_descriptors(struct radv_cmd_buffer *cmd_buffer,
 VK_PIPELINE_BIND_POINT_GRAPHICS;
struct radv_descriptor_state *descriptors_state =
radv_get_descriptors_state(cmd_buffer, bind_point);
-   

Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs

2018-05-25 Thread Rob Herring
On Fri, May 25, 2018 at 4:15 AM, Robert Foss  wrote:
>
>
> On 2018-05-25 10:38, Tomasz Figa wrote:
>>
>> On Fri, May 25, 2018 at 5:33 PM Robert Foss 
>> wrote:
>>
>>> Hey,
>>
>>
>>> On 2018-05-25 02:17, Rob Herring wrote:

 On Thu, May 24, 2018 at 6:23 AM, Robert Foss 
>>
>> wrote:
>
> Hey,
>
> I don't think I've received any feedback on this version yet.
> If anyone has some time to spare, it would be nice to get it merged.
>
> Just to be clear about the libdrm branch linked in the cover letter,
> it is not required. Only for virgl platforms which happens to be what
> I tested on.


 virgl will still fallback to using the first render node without those
 libdrm changes, right? If not, I don't think we should apply until
 we're not breaking a platform...
>>
>>
>>> No it will not fall back. I agree that holding off makes more sense.
>>
>>
>> What's the reason for this problem? Is it because of drmGetDevices()?
>> Since
>> we don't really use it for anything other than getting the list of render
>> nodes in the system, maybe we could just iterate over any /dev/renderD*
>> nodes explicitly and avoid introducing new problems?
>
>
> That's exactly the problem, and yes we could 100% solve by iterating over
> /dev/renderD* nodes. I originally assumed we wouldn't want to do that, but
> rather use the libdrm interfaces.
>
> But for the next spin I could avoid using libdrm, should I?

I don't have an opinion on libdrm really, but I do think we should
fallback to the 1st (only) render node rather than just fail.

Rob


Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs

2018-05-25 Thread Tomasz Figa
On Fri, May 25, 2018 at 10:59 PM Rob Herring  wrote:

> On Fri, May 25, 2018 at 4:15 AM, Robert Foss 
wrote:
> >
> >
> > On 2018-05-25 10:38, Tomasz Figa wrote:
> >>
> >> On Fri, May 25, 2018 at 5:33 PM Robert Foss 
> >> wrote:
> >>
> >>> Hey,
> >>
> >>
> >>> On 2018-05-25 02:17, Rob Herring wrote:
> 
>  On Thu, May 24, 2018 at 6:23 AM, Robert Foss <
robert.f...@collabora.com>
> >>
> >> wrote:
> >
> > Hey,
> >
> > I don't think I've received any feedback on this version yet.
> > If anyone has some time to spare, it would be nice to get it merged.
> >
> > Just to be clear about the libdrm branch linked in the cover letter,
> > it is not required. Only for virgl platforms which happens to be
what
> > I tested on.
> 
> 
>  virgl will still fallback to using the first render node without
those
>  libdrm changes, right? If not, I don't think we should apply until
>  we're not breaking a platform...
> >>
> >>
> >>> No it will not fall back. I agree that holding off makes more sense.
> >>
> >>
> >> What's the reason for this problem? Is it because of drmGetDevices()?
> >> Since
> >> we don't really use it for anything other than getting the list of
render
> >> nodes in the system, maybe we could just iterate over any /dev/renderD*
> >> nodes explicitly and avoid introducing new problems?
> >
> >
> > That's exactly the problem, and yes we could 100% solve by iterating
over
> > /dev/renderD* nodes. I originally assumed we wouldn't want to do that,
but
> > rather use the libdrm interfaces.
> >
> > But for the next spin I could avoid using libdrm, should I?

> I don't have an opinion on libdrm really, but I do think we should
> fallback to the 1st (only) render node rather than just fail.

We do, even with libdrm.

AFAICT, the problem with virgl seems to be that drmGetDevices() doesn't
include devices on virtio bus in the results, which means that there likely
wouldn't be any render node returned.

Best regards,
Tomasz


Re: [Mesa-dev] [PATCH 12/16] docs: Fix Sphinx compile errors.

2018-05-25 Thread Eric Engestrom
On Thursday, 2018-05-24 17:27:15 -0700, Laura Ekstrand wrote:
> This just involves some quick fixes to formatting of the affected pages.
> ---
>  docs/autoconf.rst|  1 +
>  docs/conf.py |  2 +-
>  docs/dispatch.rst| 72 
> ++--
>  docs/egl.rst |  2 ++
>  docs/releasing.rst   | 14 +-
>  docs/relnotes.rst| 72 
> +++-
>  docs/relnotes/17.0.5.rst |  2 +-
>  docs/relnotes/9.2.2.rst  |  1 -
>  8 files changed, 86 insertions(+), 80 deletions(-)
> 
> diff --git a/docs/autoconf.rst b/docs/autoconf.rst
> index 007252feb0..25ba71cf66 100644
> --- a/docs/autoconf.rst
> +++ b/docs/autoconf.rst
> @@ -102,6 +102,7 @@ There are also a few general options for altering the 
> Mesa build:
>  This option ensures that assembly will not be used.
>  
>  ``--build=``
> +.. See host
>  ``--host=``
>  By default, the build will compile code for the architecture that
>  it's running on. In order to build cross-compile Mesa on a x86-64
> diff --git a/docs/conf.py b/docs/conf.py
> index dcdbdd51db..33bf717a87 100644
> --- a/docs/conf.py
> +++ b/docs/conf.py
> @@ -99,7 +99,7 @@ html_theme = 'sphinx_rtd_theme'
>  # Add any paths that contain custom static files (such as style sheets) here,
>  # relative to this directory. They are copied after the builtin static files,
>  # so a file named "default.css" will overwrite the builtin "default.css".
> -html_static_path = ['_static']
> +html_static_path = []
>  
>  
>  # -- Options for HTMLHelp output --
> diff --git a/docs/dispatch.rst b/docs/dispatch.rst
> index d6f8542c68..aba7192c31 100644
> --- a/docs/dispatch.rst
> +++ b/docs/dispatch.rst
> @@ -62,18 +62,17 @@ conceptually simple:
>  This can be implemented in just a few lines of C code. The file
>  ``src/mesa/glapi/glapitemp.h`` contains code very similar to this.
>  
> -
> +--+
> -| :: 
>   |
> -|
>   |
> -| void glVertex3f(GLfloat x, GLfloat y, GLfloat z)   
>   |
> -| {  
>   |
> -| const struct _glapi_table * const dispatch = GET_DISPATCH();   
>   |
> -|
>   |
> -| (*dispatch->Vertex3f)(x, y, z);
>   |
> -| }  
>   |
> -
> +--+
> -| Sample dispatch function   
>   |
> -
> +--+
> +Sample dispatch function
> +
> +
> +.. code-block:: c
> +
> + void glVertex3f(GLfloat x, GLfloat y, GLfloat z)
> + {
> + const struct _glapi_table * const dispatch = GET_DISPATCH();
> +
> + (*dispatch->Vertex3f)(x, y, z);
> + }
>  
>  The problem with this simple implementation is the large amount of
>  overhead that it adds to every GL function call.
> @@ -118,16 +117,14 @@ resulting implementation of ``GET_DISPATCH`` is 
> slightly more complex,
>  but it avoids the expensive ``pthread_getspecific`` call in the common
>  case.
>  
> -
> +--+
> -| :: 
>   |
> -|
>   |
> -| #define GET_DISPATCH() \   
>   |
> -| (_glapi_Dispatch != NULL) \
>   |
> -| ? _glapi_Dispatch : 
> pthread_getspecific(&_glapi_Dispatch_key |
> -| )  
>   |
> -
> +--+
> -| Improved ``GET_DISPATCH`` Implementation   
>   |
> -
> +--+
> +Improved ``GET_DISPATCH`` Implementation
> +
> +.. code-block:: c
> +
> +#define GET_DISPATCH() \
> +(_glapi_Dispatch != NULL) \
> +? _glapi_Dispatch : pthread_getspecific(&_glapi_Dispatch_key)
> +
>  
>  3.2. ELF TLS
>  
> @@ -145,16 +142,14 @@ with direct rendering drivers that use either 
> interface. Once the
>  pointer is properly declared, ``GET_DISPACH`` becomes a simple variable
>  reference.
>  
> -
> +-

Re: [Mesa-dev] [PATCH 15/16] docs: Human edits to the website code for clarity.

2018-05-25 Thread Eric Engestrom
On Thursday, 2018-05-24 17:27:18 -0700, Laura Ekstrand wrote:
> There's a lot here.  If you're interested, it's mostly whitespace fixes,
> switching variable names and function names to the Sphinx orange variable
> highlight style, and naming code blocks to take advantage of Pygments
> syntax highlighting.
> ---
>  docs/application-issues.rst |   8 +-
>  docs/autoconf.rst   |   9 +-
>  docs/codingstyle.rst|  36 +++
>  docs/conf.py|   2 +-
>  docs/conform.rst|   2 +-
>  docs/debugging.rst  |  12 +--
>  docs/devinfo.rst|  26 ++---
>  docs/download.rst   |   6 +-
>  docs/egl.rst|   2 +-
>  docs/extensions.rst |  42 
>  docs/faq.rst|  38 +++
>  docs/helpwanted.rst |  14 +--
>  docs/index.rst  | 240 
> +++-
>  docs/install.rst|  64 ++--
>  docs/intro.rst  | 124 +++
>  docs/license.rst|  12 +--
>  docs/llvmpipe.rst   |  65 +---
>  docs/mangling.rst   |   4 +-
>  docs/meson.rst  |  18 ++--
>  docs/osmesa.rst |  12 +--
>  docs/perf.rst   |  85 +++-
>  docs/postprocess.rst|  11 +-
>  docs/precompiled.rst|   6 +-
>  docs/release-calendar.rst   | 158 ++---
>  docs/releasing.rst  | 158 ++---
>  docs/repository.rst |  59 ++-
>  docs/shading.rst|  99 --
>  docs/sourcetree.rst |  12 +--
>  docs/submittingpatches.rst  | 123 ---
>  docs/thanks.rst |   2 +-
>  docs/versions.rst   |   8 +-
>  docs/viewperf.rst   |  94 -
>  docs/vmware-guest.rst   | 146 +--
>  docs/xlibdriver.rst |  60 +--
>  34 files changed, 882 insertions(+), 875 deletions(-)
> 
[snip]
> diff --git a/docs/conf.py b/docs/conf.py
> index 33bf717a87..c6eac2394d 100644
> --- a/docs/conf.py
> +++ b/docs/conf.py
> @@ -99,7 +99,7 @@ html_theme = 'sphinx_rtd_theme'
>  # Add any paths that contain custom static files (such as style sheets) here,
>  # relative to this directory. They are copied after the builtin static files,
>  # so a file named "default.css" will overwrite the builtin "default.css".
> -html_static_path = []
> +html_static_path = ['specs/']

Any reason not to do this right away when creating the file in patch 6? :)
That way the corresponding hunk in patch 12 is not necessary either.


Re: [Mesa-dev] [PATCH 16/16] docs: Remove unneeded mesa css file.

2018-05-25 Thread Eric Engestrom
On Thursday, 2018-05-24 17:27:19 -0700, Laura Ekstrand wrote:
> Goodbye old css file.  You belong in 1999 from whence you came.
> ---
>  docs/mesa.css | 63 
> ---
>  1 file changed, 63 deletions(-)
>  delete mode 100644 docs/mesa.css

I guess this could be deleted at the same time as the html files, but it
doesn't really matter.

I'm quite happy with the new website with its default theme already; we
can always spend ages debating the theme style later, right now I'd love
for this to land as soon as we start using gitlab :)

For patch 1 (the yaml file), with or without my comment (can be done
later), patch 2 (the python conversion script, which btw I guess we
should probably delete once the conversion is done), patch 7 (the
sphinx-build yml line), and 12-16 are:
Reviewed-by: Eric Engestrom 

The rest of the series is:
Acked-by: Eric Engestrom 

Thank you very much for finishing the task many of us gave a shot at,
but didn't carry through!

> 
> diff --git a/docs/mesa.css b/docs/mesa.css
> deleted file mode 100644
> index 7ab8152b04..00
> --- a/docs/mesa.css
> +++ /dev/null
> @@ -1,63 +0,0 @@
> -/* Mesa CSS */
> -body {
> - background-color: #ff;
> - font: 14px 'Lucida Grande', Geneva, Arial, Verdana, sans-serif;
> - color: black;
> - link: #88;
> -}
> -
> -h1 {
> - font: 24px 'Lucida Grande', Geneva, Arial, Verdana, sans-serif;
> - font-weight: bold;
> - color: black;
> -}
> -
> -h2 {
> - font: 18px 'Lucida Grande', Geneva, Arial, Verdana, sans-serif, bold;
> - font-weight: bold;
> - color: black;
> -}
> -
> -code {
> - font-family: monospace;
> - font-size: 10pt;
> - color: black;
> -}
> -
> -
> -pre {
> - /*font-family: monospace;*/
> - font-size: 10pt;
> - /*color: black;*/
> -}
> -
> -iframe {
> -  width: 19em;
> -  height: 80em;
> -  border: none;
> -  float: left;
> -}
> -
> -.content {
> -  position: absolute;
> -  left: 20em;
> -  right: 10px;
> -  overflow: hidden
> -}
> -
> -.header {
> -  background: black url('gears.png') 15px no-repeat;
> -  margin:0;
> -  padding: 5px;
> -  clear:both;
> -}
> -
> -.header h1 {
> -  background: url('gears.png') right no-repeat;
> -  color: white;
> -  font: x-large sans-serif;
> -  text-align: center;
> -  height: 50px;
> -  margin: 0;
> -  padding-top: 30px;
> -}
> -- 
> 2.14.3
> 


[Mesa-dev] [Bug 106644] [llvmpipe] Mesa 18.1.0 fails lp_test_format, lp_test_arit, lp_test_blend, lp_test_printf, lp_test_conv tests

2018-05-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106644

--- Comment #9 from Ben Crocker  ---
We note that this is a build for a PPC970, which is essentially
a big-endian ~Power4 equivalent to a G5 Mac.

Moreover, it appears to be a 32-bit build.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] egl/x11: Move dri2_format_for_depth prototype.

2018-05-25 Thread Eric Engestrom
On Friday, 2018-05-25 06:52:25 +, Vinson Lee wrote:
> Fix build error without DRI3.

D'uh!
I forgot building dri3 was optional, sorry :/

Reviewed-by: Eric Engestrom 

> 
>   CC   drivers/dri2/platform_x11.lo
> drivers/dri2/platform_x11.c:1010:1: error: no previous prototype for function 
> 'dri2_format_for_depth' [-Werror,-Wmissing-prototypes]
> dri2_format_for_depth(uint32_t depth)
> ^
> 
> Fixes: 473af0b541b2 ("egl/x11: deduplicate depth-to-format logic")
> Signed-off-by: Vinson Lee 
> ---
>  src/egl/drivers/dri2/egl_dri2.h  | 3 +++
>  src/egl/drivers/dri2/platform_x11_dri3.h | 3 ---
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
> index adabc527f85b..b91a899e476c 100644
> --- a/src/egl/drivers/dri2/egl_dri2.h
> +++ b/src/egl/drivers/dri2/egl_dri2.h
> @@ -523,4 +523,7 @@ dri2_init_surface(_EGLSurface *surf, _EGLDisplay *dpy, 
> EGLint type,
>  void
>  dri2_fini_surface(_EGLSurface *surf);
>  
> +uint32_t
> +dri2_format_for_depth(uint32_t depth);
> +
>  #endif /* EGL_DRI2_INCLUDED */
> diff --git a/src/egl/drivers/dri2/platform_x11_dri3.h 
> b/src/egl/drivers/dri2/platform_x11_dri3.h
> index e6fd01366978..96e7ee972d9f 100644
> --- a/src/egl/drivers/dri2/platform_x11_dri3.h
> +++ b/src/egl/drivers/dri2/platform_x11_dri3.h
> @@ -38,7 +38,4 @@ extern struct dri2_egl_display_vtbl dri3_x11_display_vtbl;
>  EGLBoolean
>  dri3_x11_connect(struct dri2_egl_display *dri2_dpy);
>  
> -uint32_t
> -dri2_format_for_depth(uint32_t depth);
> -
>  #endif
> -- 
> 2.17.0
> 


[Mesa-dev] [PATCH v2 1/7] swr/rast: Added in-place building to SCATTERPS

2018-05-25 Thread Alok Hota
SCATTERPS previously assumed it was being used with an existing basic
block
---
 .../drivers/swr/rasterizer/jitter/builder_mem.cpp  | 29 +++---
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.cpp
index 6e17888..77c2095 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.cpp
@@ -617,17 +617,28 @@ namespace SwrJit
 
 Value* pIsUndef = ICMP_EQ(pIndex, C(32));
 
-// Split current block
-BasicBlock* pPostLoop = 
pCurBB->splitBasicBlock(cast<Instruction>(pIsUndef)->getNextNode());
+// Split current block or create new one if building inline
+BasicBlock* pPostLoop;
+if (pCurBB->getTerminator())
+{
+pPostLoop = 
pCurBB->splitBasicBlock(cast<Instruction>(pIsUndef)->getNextNode());
 
-// Remove unconditional jump created by splitBasicBlock
-pCurBB->getTerminator()->eraseFromParent();
+// Remove unconditional jump created by splitBasicBlock
+pCurBB->getTerminator()->eraseFromParent();
 
-// Add terminator to end of original block
-IRB()->SetInsertPoint(pCurBB);
+// Add terminator to end of original block
+IRB()->SetInsertPoint(pCurBB);
 
-// Add conditional branch
-COND_BR(pIsUndef, pPostLoop, pLoop);
+// Add conditional branch
+COND_BR(pIsUndef, pPostLoop, pLoop);
+}
+else
+{
+pPostLoop = BasicBlock::Create(mpJitMgr->mContext, 
"PostScatter_Loop", pFunc);
+
+// Add conditional branch
+COND_BR(pIsUndef, pPostLoop, pLoop);
+}
 
 // Add loop basic block contents
 IRB()->SetInsertPoint(pLoop);
@@ -642,7 +653,7 @@ namespace SwrJit
 Value* pOffsetElem = LOADV(pOffsetsArrayPtr, { pIndexPhi });
 
 // GEP to this offset in dst
-Value* pCurDst = GEP(pDst, pOffsetElem);
+Value* pCurDst = GEP(pDst, pOffsetElem, mInt8PtrTy);
 pCurDst = POINTER_CAST(pCurDst, PointerType::get(pSrcTy, 0));
 STORE(pSrcElem, pCurDst);
 
-- 
2.7.4



[Mesa-dev] [PATCH v2 0/7] InitMemory inclusion

2018-05-25 Thread Alok Hota
Version 2 makes a small change to swr_loader.cpp to include the new InitMemory
header, which fixes a compile error on single-architecture builds.

Alok Hota (7):
  swr/rast: Added in-place building to SCATTERPS
  swr/rast: Checking gCoreBuckets and CORE_BUCKETS are equal length at
compile time
  swr/rast: Use metadata to communicate between passes
  swr/rast: Renamed MetaData calls
  swr/rast: Removed superfluous JitManager argument from passes
  swr/rast: Moved memory init out of core swr init
  swr/rast: Adjusted avx512 primitive assembly for msvc codegen

 src/gallium/drivers/swr/Makefile.sources   |   4 +-
 src/gallium/drivers/swr/meson.build|   2 +
 src/gallium/drivers/swr/rasterizer/core/api.cpp|   4 -
 src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp | 139 +++--
 .../drivers/swr/rasterizer/core/rdtsc_core.cpp |   1 +
 src/gallium/drivers/swr/rasterizer/core/state.h|   3 +-
 .../drivers/swr/rasterizer/jitter/blend_jit.cpp|   2 +-
 .../drivers/swr/rasterizer/jitter/builder.cpp  | 170 ++---
 .../drivers/swr/rasterizer/jitter/builder.h|  32 +++-
 .../drivers/swr/rasterizer/jitter/builder_mem.cpp  |  29 ++--
 .../drivers/swr/rasterizer/jitter/fetch_jit.cpp|   2 +-
 .../rasterizer/jitter/functionpasses/lower_x86.cpp |  17 +--
 .../swr/rasterizer/jitter/functionpasses/passes.h  |   2 +-
 .../swr/rasterizer/jitter/streamout_jit.cpp|   2 +-
 .../drivers/swr/rasterizer/memory/InitMemory.cpp   |  39 +
 .../drivers/swr/rasterizer/memory/InitMemory.h |  33 
 src/gallium/drivers/swr/swr_loader.cpp |   8 +-
 src/gallium/drivers/swr/swr_shader.cpp |   2 +-
 18 files changed, 325 insertions(+), 166 deletions(-)
 create mode 100644 src/gallium/drivers/swr/rasterizer/memory/InitMemory.cpp
 create mode 100644 src/gallium/drivers/swr/rasterizer/memory/InitMemory.h

-- 
2.7.4



[Mesa-dev] [PATCH v2 3/7] swr/rast: Use metadata to communicate between passes

2018-05-25 Thread Alok Hota
---
 .../drivers/swr/rasterizer/jitter/builder.h| 28 ++
 1 file changed, 28 insertions(+)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder.h 
b/src/gallium/drivers/swr/rasterizer/jitter/builder.h
index 6ca128d..08a3a6e 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder.h
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder.h
@@ -124,6 +124,34 @@ namespace SwrJit
 bool SetTexelMaskEvaluate(Instruction* inst);
 bool IsTexelMaskEvaluate(Instruction* inst);
 Type* GetVectorType(Type* pType);
+void SetMetadata(StringRef s, uint32_t val)
+{
+llvm::NamedMDNode *metaData = 
mpJitMgr->mpCurrentModule->getOrInsertNamedMetadata(s);
+Constant* cval = mpIRBuilder->getInt32(val);
+llvm::MDNode *mdNode = 
llvm::MDNode::get(mpJitMgr->mpCurrentModule->getContext(), 
llvm::ConstantAsMetadata::get(cval));
+if (metaData->getNumOperands())
+{
+metaData->setOperand(0, mdNode);
+}
+else
+{
+metaData->addOperand(mdNode);
+}
+}
+uint32_t GetMetadata(StringRef s)
+{
+NamedMDNode* metaData = 
mpJitMgr->mpCurrentModule->getNamedMetadata(s);
+if (metaData)
+{
+MDNode* mdNode = metaData->getOperand(0);
+Metadata* val = mdNode->getOperand(0);
+return mdconst::dyn_extract<ConstantInt>(val)->getZExtValue();
+}
+else
+{
+return 0;
+}
+}
 
 #include "gen_builder.hpp"
 #include "gen_builder_meta.hpp"
-- 
2.7.4



[Mesa-dev] [PATCH v2 2/7] swr/rast: Checking gCoreBuckets and CORE_BUCKETS are equal length at compile time

2018-05-25 Thread Alok Hota
---
 src/gallium/drivers/swr/rasterizer/core/rdtsc_core.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/swr/rasterizer/core/rdtsc_core.cpp 
b/src/gallium/drivers/swr/rasterizer/core/rdtsc_core.cpp
index f289a31..48ea397 100644
--- a/src/gallium/drivers/swr/rasterizer/core/rdtsc_core.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/rdtsc_core.cpp
@@ -89,6 +89,7 @@ BUCKET_DESC gCoreBuckets[] = {
 { "BEStoreTiles", "", true, 0xff00 },
 { "BEEndTile", "", false, 0x },
 };
+static_assert(NumBuckets == (sizeof(gCoreBuckets) / sizeof(gCoreBuckets[0])), 
"RDTSC Bucket enum and description table size mismatched.");
 
 /// @todo bucketmanager and mapping should probably be a part of the SWR 
context
 std::vector gBucketMap;
-- 
2.7.4
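
The same compile-time check can be sketched in plain C11 with _Static_assert
(the patch itself uses C++ static_assert). All names below (toy_bucket,
toy_buckets, TOY_ARRAY_SIZE) are hypothetical stand-ins for the bucket enum
and description table; the point is only that adding an enum value without a
matching table entry fails the build.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bucket enum; the last entry counts the real ones. */
enum toy_bucket {
    TOY_BUCKET_A,
    TOY_BUCKET_B,
    TOY_BUCKET_C,
    TOY_NUM_BUCKETS
};

struct toy_bucket_desc {
    const char *name;
    int         enabled;
};

/* Description table, expected to have one entry per enum value. */
static const struct toy_bucket_desc toy_buckets[] = {
    { "A", 1 },
    { "B", 0 },
    { "C", 1 },
};

#define TOY_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Fails to compile if the enum and the table drift apart. */
_Static_assert(TOY_NUM_BUCKETS == TOY_ARRAY_SIZE(toy_buckets),
               "bucket enum and description table size mismatch");

/* Safe to index by enum value because of the assertion above. */
static const char *toy_bucket_name(enum toy_bucket b)
{
    return toy_buckets[b].name;
}
```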



[Mesa-dev] [PATCH v2 5/7] swr/rast: Removed superfluous JitManager argument from passes

2018-05-25 Thread Alok Hota
---
 src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp |  2 +-
 src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp |  2 +-
 .../swr/rasterizer/jitter/functionpasses/lower_x86.cpp  | 17 -
 .../swr/rasterizer/jitter/functionpasses/passes.h   |  2 +-
 .../drivers/swr/rasterizer/jitter/streamout_jit.cpp |  2 +-
 src/gallium/drivers/swr/swr_shader.cpp  |  2 +-
 6 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
index 72bf900..20f2e42 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
@@ -819,7 +819,7 @@ struct BlendJit : public Builder
 passes.add(createSCCPPass());
 passes.add(createAggressiveDCEPass());
 
-passes.add(createLowerX86Pass(JM(), this));
+passes.add(createLowerX86Pass(this));
 
 passes.run(*blendFunc);
 
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
index 7b0b80a..0abcd1a 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
@@ -269,7 +269,7 @@ Function* FetchJit::Create(const FETCH_COMPILE_STATE& fetchState)
 
 optPasses.run(*fetch);
 
-optPasses.add(createLowerX86Pass(JM(), this));
+optPasses.add(createLowerX86Pass(this));
 optPasses.run(*fetch);
 
 JitManager::DumpToFile(fetch, "opt");
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp b/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp
index 5a69eae..f2bd888 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp
@@ -136,21 +136,21 @@ namespace SwrJit
 
 struct LowerX86 : public FunctionPass
 {
-LowerX86(JitManager* pJitMgr = nullptr, Builder* b = nullptr)
-: FunctionPass(ID), mpJitMgr(pJitMgr), B(b)
+LowerX86(Builder* b = nullptr)
+: FunctionPass(ID), B(b)
 {
 initializeLowerX86Pass(*PassRegistry::getPassRegistry());
 
 // Determine target arch
-if (mpJitMgr->mArch.AVX512F())
+if (JM()->mArch.AVX512F())
 {
 mTarget = AVX512;
 }
-else if (mpJitMgr->mArch.AVX2())
+else if (JM()->mArch.AVX2())
 {
 mTarget = AVX2;
 }
-else if (mpJitMgr->mArch.AVX())
+else if (JM()->mArch.AVX())
 {
 mTarget = AVX;
 
@@ -356,9 +356,8 @@ namespace SwrJit
 {
 }
 
-JitManager* JM() { return mpJitMgr; }
+JitManager* JM() { return B->JM(); }
 
-JitManager* mpJitMgr;
 Builder* B;
 
 TargetArch mTarget;
@@ -368,9 +367,9 @@ namespace SwrJit
 
 char LowerX86::ID = 0;   // LLVM uses address of ID as the actual ID.
 
-FunctionPass* createLowerX86Pass(JitManager* pJitMgr, Builder* b)
+FunctionPass* createLowerX86Pass(Builder* b)
 {
-return new LowerX86(pJitMgr, b);
+return new LowerX86(b);
 }
 
 Instruction* NO_EMU(LowerX86* pThis, TargetArch arch, TargetWidth width, CallInst* pCallInst)
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/passes.h b/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/passes.h
index f7373f0..95ef4bc 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/passes.h
+++ b/src/gallium/drivers/swr/rasterizer/jitter/functionpasses/passes.h
@@ -33,5 +33,5 @@ namespace SwrJit
 {
 using namespace llvm;
 
-FunctionPass* createLowerX86Pass(JitManager* pJitMgr, Builder* b);
+FunctionPass* createLowerX86Pass(Builder* b);
 }
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/streamout_jit.cpp b/src/gallium/drivers/swr/rasterizer/jitter/streamout_jit.cpp
index f804900..cb2e3ae 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/streamout_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/streamout_jit.cpp
@@ -307,7 +307,7 @@ struct StreamOutJit : public Builder
 passes.add(createSCCPPass());
 passes.add(createAggressiveDCEPass());
 
-passes.add(createLowerX86Pass(JM(), this));
+passes.add(createLowerX86Pass(this));
 
 passes.run(*soFunc);
 
diff --git a/src/gallium/drivers/swr/swr_shader.cpp b/src/gallium/drivers/swr/swr_shader.cpp
index 13d8986..afa184f 100644
--- a/src/gallium/drivers/swr/swr_shader.cpp
+++ b/src/gallium/drivers/swr/swr_shader.cpp
@@ -1402,7 +1402,7 @@ BuilderSWR::CompileFS(struct swr_context *ctx, swr_jit_fs_key &key)
 
// after the gallivm passes, we have to lower the core's intrinsics
llvm::legacy::FunctionPassManager lowerPass(JM()->mpCurrentModul

[Mesa-dev] [PATCH v2 4/7] swr/rast: Renamed MetaData calls

2018-05-25 Thread Alok Hota
---
 .../drivers/swr/rasterizer/jitter/builder.cpp  | 170 ++---
 .../drivers/swr/rasterizer/jitter/builder.h|   4 +-
 2 files changed, 87 insertions(+), 87 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp b/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp
index e1c5d80..4b06aaa 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp
@@ -1,32 +1,32 @@
 /
-* Copyright (C) 2014-2015 Intel Corporation.   All Rights Reserved.
-*
-* Permission is hereby granted, free of charge, to any person obtaining a
-* copy of this software and associated documentation files (the "Software"),
-* to deal in the Software without restriction, including without limitation
-* the rights to use, copy, modify, merge, publish, distribute, sublicense,
-* and/or sell copies of the Software, and to permit persons to whom the
-* Software is furnished to do so, subject to the following conditions:
-*
-* The above copyright notice and this permission notice (including the next
-* paragraph) shall be included in all copies or substantial portions of the
-* Software.
-*
-* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
-* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
-* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
-* IN THE SOFTWARE.
-* 
-* @file builder.h
-* 
-* @brief Includes all the builder related functionality
-* 
-* Notes:
-* 
-**/
+ * Copyright (C) 2014-2015 Intel Corporation.   All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * @file builder.h
+ *
+ * @brief Includes all the builder related functionality
+ *
+ * Notes:
+ *
+ 
**/
 
 #include "jit_pch.hpp"
 #include "builder.h"
@@ -38,11 +38,9 @@ namespace SwrJit
 //
 /// @brief Contructor for Builder.
 /// @param pJitMgr - JitManager which contains modules, function passes, etc.
-Builder::Builder(JitManager *pJitMgr)
-: mpJitMgr(pJitMgr),
-  mpPrivateContext(nullptr)
+Builder::Builder(JitManager *pJitMgr) : mpJitMgr(pJitMgr), mpPrivateContext(nullptr)
 {
-mVWidth = pJitMgr->mVWidth;
+mVWidth   = pJitMgr->mVWidth;
 mVWidth16 = 16;
 
 mpIRBuilder = &pJitMgr->mBuilder;
@@ -70,29 +68,29 @@ namespace SwrJit
 
 // Built in types: simd16
 
-mSimd16Int1Ty   = VectorType::get(mInt1Ty,  mVWidth16);
-mSimd16Int16Ty  = VectorType::get(mInt16Ty, mVWidth16);
-mSimd16Int32Ty  = VectorType::get(mInt32Ty, mVWidth16);
-mSimd16Int64Ty  = VectorType::get(mInt64Ty, mVWidth16);
-mSimd16FP16Ty   = VectorType::get(mFP16Ty,  mVWidth16);
-mSimd16FP32Ty   = VectorType::get(mFP32Ty,  mVWidth16);
-mSimd16VectorTy = ArrayType::get(mSimd16FP32Ty, 4);
-mSimd16VectorTRTy   = ArrayType::get(mSimd16FP32Ty, 5);
+mSimd16Int1Ty = VectorType::get(mInt1Ty, mVWidth16);
+mSimd16Int16Ty= VectorType::get(mInt16Ty, mVWidth16);
+mSimd16Int32Ty= VectorType::get(mInt32Ty, mVWidth16);
+mSimd16Int64Ty= VectorType::get(mInt64Ty, mVWidth16);
+mSimd16FP16Ty = VectorType::get(mFP16Ty, mVWidth16);
+mSimd16FP32Ty = VectorType::get(mFP32Ty, mVWidth16);
+mSimd16Vect

[Mesa-dev] [PATCH v2 7/7] swr/rast: Adjusted avx512 primitive assembly for msvc codegen

2018-05-25 Thread Alok Hota
Optimize AVX-512 PA Assemble (PA_STATE_OPT). This reduces the generated
code by about 4x: unless explicit AVX-512 load ops were added, the MSVC
compiler was creating many temporaries and split-loading inputs onto the
stack.
---
 src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp | 139 +
 1 file changed, 90 insertions(+), 49 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp b/src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp
index 64a90c7..4f89e0c 100644
--- a/src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp
@@ -755,36 +755,51 @@ bool PaTriList1_simd16(PA_STATE_OPT& pa, uint32_t slot, simd16vector verts[])
 
 bool PaTriList2_simd16(PA_STATE_OPT& pa, uint32_t slot, simd16vector verts[])
 {
-#if KNOB_ARCH == KNOB_ARCH_AVX
-simd16scalar perm0 = _simd16_setzero_ps();
-simd16scalar perm1 = _simd16_setzero_ps();
-simd16scalar perm2 = _simd16_setzero_ps();
-#elif KNOB_ARCH >= KNOB_ARCH_AVX2
+#if KNOB_ARCH >= KNOB_ARCH_AVX2
 const simd16scalari perm0 = _simd16_set_epi32(13, 10, 7, 4, 1, 14, 11,  8, 5, 2, 15, 12,  9, 6, 3, 0);
 const simd16scalari perm1 = _simd16_set_epi32(14, 11, 8, 5, 2, 15, 12,  9, 6, 3,  0, 13, 10, 7, 4, 1);
 const simd16scalari perm2 = _simd16_set_epi32(15, 12, 9, 6, 3,  0, 13, 10, 7, 4,  1, 14, 11, 8, 5, 2);
+#else   // KNOB_ARCH == KNOB_ARCH_AVX
+simd16scalar perm0 = _simd16_setzero_ps();
+simd16scalar perm1 = _simd16_setzero_ps();
+simd16scalar perm2 = _simd16_setzero_ps();
 #endif
 
 const simd16vector &a = PaGetSimdVector_simd16(pa, 0, slot);
 const simd16vector &b = PaGetSimdVector_simd16(pa, 1, slot);
 const simd16vector &c = PaGetSimdVector_simd16(pa, 2, slot);
 
-simd16vector &v0 = verts[0];
-simd16vector &v1 = verts[1];
-simd16vector &v2 = verts[2];
+const simd16mask mask0 = 0x4924;
+const simd16mask mask1 = 0x2492;
+const simd16mask mask2 = 0x9249;
 
 //  v0 -> a0 a3 a6 a9 aC aF b2 b5 b8 bB bE c1 c4 c7 cA cD
 //  v1 -> a1 a4 a7 aA aD b0 b3 b6 b9 bC bF c2 c5 c8 cB cE
 //  v2 -> a2 a5 a8 aB aE b1 b4 b7 bA bD c0 c3 c6 c9 cC cF
 
+simd16vector &v0 = verts[0];
+simd16vector &v1 = verts[1];
+simd16vector &v2 = verts[2];
+
 // for simd16 x, y, z, and w
 for (int i = 0; i < 4; i += 1)
 {
-simd16scalar temp0 = _simd16_blend_ps(_simd16_blend_ps(a[i], b[i], 0x4924), c[i], 0x2492);
-simd16scalar temp1 = _simd16_blend_ps(_simd16_blend_ps(a[i], b[i], 0x9249), c[i], 0x4924);
-simd16scalar temp2 = _simd16_blend_ps(_simd16_blend_ps(a[i], b[i], 0x2492), c[i], 0x9249);
+simd16scalar tempa = _simd16_loadu_ps(reinterpret_cast(&a[i]));
+simd16scalar tempb = _simd16_loadu_ps(reinterpret_cast(&b[i]));
+simd16scalar tempc = _simd16_loadu_ps(reinterpret_cast(&c[i]));
+
+simd16scalar temp0 = _simd16_blend_ps(_simd16_blend_ps(tempa, tempb, mask0), tempc, mask1);
+simd16scalar temp1 = _simd16_blend_ps(_simd16_blend_ps(tempa, tempb, mask2), tempc, mask0);
+simd16scalar temp2 = _simd16_blend_ps(_simd16_blend_ps(tempa, tempb, mask1), tempc, mask2);
+
+#if KNOB_ARCH >= KNOB_ARCH_AVX2
+v0[i] = _simd16_permute_ps(temp0, perm0);
+v1[i] = _simd16_permute_ps(temp1, perm1);
+v2[i] = _simd16_permute_ps(temp2, perm2);
+#else   // #if KNOB_ARCH == KNOB_ARCH_AVX
+
+// the general permutes (above) are prohibitively slow to emulate on AVX (its scalar code)
 
-#if KNOB_ARCH == KNOB_ARCH_AVX
 temp0 = _simd16_permute_ps_i(temp0, 0x6C);  // (0, 3, 2, 1) => 00 11 01 10 => 0x6C
 perm0 = _simd16_permute2f128_ps(temp0, temp0, 0xB1);// (1, 0, 3, 2) => 01 00 11 10 => 0xB1
 temp0 = _simd16_blend_ps(temp0, perm0, 0x); // 0010 0010 0010 0010
@@ -802,10 +817,6 @@ bool PaTriList2_simd16(PA_STATE_OPT& pa, uint32_t slot, simd16vector verts[])
 temp2 = _simd16_blend_ps(temp2, perm2, 0x); // 0100 0100 0100 0100
 perm2 = _simd16_permute2f128_ps(temp2, temp2, 0x4E);// (2, 3, 0, 1) => 10 11 00 01 => 0x4E
 v2[i] = _simd16_blend_ps(temp2, perm2, 0x1C1C); // 0011 1000 0011 1000
-#elif KNOB_ARCH >= KNOB_ARCH_AVX2
-v0[i] = _simd16_permute_ps(temp0, perm0);
-v1[i] = _simd16_permute_ps(temp1, perm1);
-v2[i] = _simd16_permute_ps(temp2, perm2);
 #endif
 }
 
@@ -1056,26 +1067,31 @@ bool PaTriStrip1_simd16(PA_STATE_OPT& pa, uint32_t slot, simd16vector verts[])
 const simd16vector &a = PaGetSimdVector_simd16(pa, pa.prev, slot);
 const simd16vector &b = PaGetSimdVector_simd16(pa, pa.cur, slot);
 
-simd16vector &v0 = verts[0];
-simd16vector &v1 = verts[1];
-simd16vector &v2 = verts[2];
+const simd16mask mask0 = 0xF000;
 
 //  v0 -> a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aA aB aC aD aE aF
 //  v1 -> a1 a3 a3 a5 a5 a7 a7 a9 a9 aB aB aD aD aF aF b1
 //  v2 -> a2 a2 a4 a4 a6 a6 a8 a8 aA aA aC aC aE aE b0 b0

[Mesa-dev] [PATCH v2 6/7] swr/rast: Moved memory init out of core swr init

2018-05-25 Thread Alok Hota
Added two new files for a wrapper function for initialization

v2: added missing include for single architecture builds
---
 src/gallium/drivers/swr/Makefile.sources   |  4 ++-
 src/gallium/drivers/swr/meson.build|  2 ++
 src/gallium/drivers/swr/rasterizer/core/api.cpp|  4 ---
 src/gallium/drivers/swr/rasterizer/core/state.h|  3 +-
 .../drivers/swr/rasterizer/memory/InitMemory.cpp   | 39 ++
 .../drivers/swr/rasterizer/memory/InitMemory.h | 33 ++
 src/gallium/drivers/swr/swr_loader.cpp |  8 -
 7 files changed, 86 insertions(+), 7 deletions(-)
 create mode 100644 src/gallium/drivers/swr/rasterizer/memory/InitMemory.cpp
 create mode 100644 src/gallium/drivers/swr/rasterizer/memory/InitMemory.h

diff --git a/src/gallium/drivers/swr/Makefile.sources b/src/gallium/drivers/swr/Makefile.sources
index 6753d50..b298356 100644
--- a/src/gallium/drivers/swr/Makefile.sources
+++ b/src/gallium/drivers/swr/Makefile.sources
@@ -177,4 +177,6 @@ MEMORY_CXX_SOURCES := \
rasterizer/memory/StoreTile_TileY2.cpp \
rasterizer/memory/StoreTile_TileY.cpp \
rasterizer/memory/TilingFunctions.h \
-   rasterizer/memory/tilingtraits.h
+   rasterizer/memory/tilingtraits.h \
+   rasterizer/memory/InitMemory.cpp \
+   rasterizer/memory/InitMemory.h
diff --git a/src/gallium/drivers/swr/meson.build b/src/gallium/drivers/swr/meson.build
index 9b272aa..b95c8bc 100644
--- a/src/gallium/drivers/swr/meson.build
+++ b/src/gallium/drivers/swr/meson.build
@@ -151,6 +151,8 @@ files_swr_arch = files(
   'rasterizer/memory/StoreTile_TileY.cpp',
   'rasterizer/memory/TilingFunctions.h',
   'rasterizer/memory/tilingtraits.h',
+  'rasterizer/memory/InitMemory.h',
+  'rasterizer/memory/InitMemory.cpp',
 )
 
 swr_context_files = files('swr_context.h')
diff --git a/src/gallium/drivers/swr/rasterizer/core/api.cpp b/src/gallium/drivers/swr/rasterizer/core/api.cpp
index 47f3633..c932ec0 100644
--- a/src/gallium/drivers/swr/rasterizer/core/api.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/api.cpp
@@ -1728,10 +1728,6 @@ void InitBackendFuncTables();
 /// @brief Initialize swr backend and memory internal tables
 void SwrInit()
 {
-InitSimLoadTilesTable();
-InitSimStoreTilesTable();
-InitSimClearTilesTable();
-
 InitClearTilesTable();
 InitBackendFuncTables();
 InitRasterizerFunctions();
diff --git a/src/gallium/drivers/swr/rasterizer/core/state.h b/src/gallium/drivers/swr/rasterizer/core/state.h
index c26dabe..9db17ee 100644
--- a/src/gallium/drivers/swr/rasterizer/core/state.h
+++ b/src/gallium/drivers/swr/rasterizer/core/state.h
@@ -29,10 +29,11 @@
 
 #include "common/formats.h"
 #include "common/intrin.h"
-using gfxptr_t = unsigned long long;
 #include 
 #include 
 
+using gfxptr_t = unsigned long long;
+
 //
 /// PRIMITIVE_TOPOLOGY.
 //
diff --git a/src/gallium/drivers/swr/rasterizer/memory/InitMemory.cpp b/src/gallium/drivers/swr/rasterizer/memory/InitMemory.cpp
new file mode 100644
index 000..bff96e1
--- /dev/null
+++ b/src/gallium/drivers/swr/rasterizer/memory/InitMemory.cpp
@@ -0,0 +1,39 @@
+/
+* Copyright (C) 2018 Intel Corporation.   All Rights Reserved.
+*
+* Permission is hereby granted, free of charge, to any person obtaining a
+* copy of this software and associated documentation files (the "Software"),
+* to deal in the Software without restriction, including without limitation
+* the rights to use, copy, modify, merge, publish, distribute, sublicense,
+* and/or sell copies of the Software, and to permit persons to whom the
+* Software is furnished to do so, subject to the following conditions:
+*
+* The above copyright notice and this permission notice (including the next
+* paragraph) shall be included in all copies or substantial portions of the
+* Software.
+*
+* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+* IN THE SOFTWARE.
+*
+* @file InitMemory.cpp
+*
+* @brief Provide access to tiles table initialization functions
+*
+**/
+#include "memory/InitMemory.h"
+
+void InitSimLoadTilesTable();
+void InitSimStoreTilesTable();
+void InitSimClearTilesTable();
+
+void InitTilesTable()
+{
+InitSimLoadTilesTable();
+InitSimStoreTilesTable();
+InitSimClearTilesTable();
+}
diff --git 

Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs

2018-05-25 Thread Rob Herring
On Fri, May 25, 2018 at 9:25 AM, Tomasz Figa  wrote:
> On Fri, May 25, 2018 at 10:59 PM Rob Herring  wrote:
>
>> On Fri, May 25, 2018 at 4:15 AM, Robert Foss 
> wrote:
>> >
>> >
>> > On 2018-05-25 10:38, Tomasz Figa wrote:
>> >>
>> >> On Fri, May 25, 2018 at 5:33 PM Robert Foss 
>> >> wrote:
>> >>
>> >>> Hey,
>> >>
>> >>
>> >>> On 2018-05-25 02:17, Rob Herring wrote:
>> 
>>  On Thu, May 24, 2018 at 6:23 AM, Robert Foss <
> robert.f...@collabora.com>
>> >>
>> >> wrote:
>> >
>> > Hey,
>> >
>> > I don't think I've received any feedback on this version yet.
>> > If anyone has some time to spare, it would be nice to get it merged.
>> >
>> > Just to be clear about the libdrm branch linked in the cover letter,
>> > it is not required. Only for virgl platforms which happens to be
> what
>> > I tested on.
>> 
>> 
>>  virgl will still fallback to using the first render node without
> those
>>  libdrm changes, right? If not, I don't think we should apply until
>>  we're not breaking a platform...
>> >>
>> >>
>> >>> No it will not fall back. I agree that holding off makes more sense.
>> >>
>> >>
>> >> What's the reason of this problems? Is it because of drmGetDevices()?
>> >> Since
>> >> we don't really use it for anything other than getting the list of
> render
>> >> nodes in the system, maybe we could just iterate over any /dev/renderD*
>> >> nodes explicitly and avoid introducing new problems?
>> >
>> >
>> > That's exactly the problem, and yes we could 100% solve by iterating
> over
>> > /dev/renderD* nodes. I originally assumed we wouldn't want to do that,
> but
>> > rather use the libdrm interfaces.
>> >
>> > But for the next spin I could avoid using libdrm, should I?
>
>> I don't have an opinion on libdrm really, but I do think we should
>> fallback to the 1st (only) render node rather than just fail.
>
> We do, even with libdrm.
>
> AFAICT, the problem with virgl seems to be that drmGetDevices() doesn't
> include devices on virtio bus in the results, which means that there likely
> wouldn't be any render node returned.

Okay. I still don't get why we search by bus in the first place. Who
cares what bus the gpu sits on.

Now I have an opinion. We should just iterate over render nodes
matching by name or use the first node if we don't have a set name.

Rob
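
The selection policy proposed above (match render nodes by name, or take the first node when no name is set) could be sketched roughly as follows. This is only an illustration of the policy, not the egl/android code: the RenderNode struct, the pickRenderNode function and the "fail when a requested name is not found" behaviour are assumptions. In a real implementation the driver name would presumably be obtained by opening each render node and calling drmGetVersion(); that part is left out here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one probed render node.
struct RenderNode
{
    std::string path;        // e.g. "/dev/dri/renderD128"
    std::string driverName;  // kernel driver name reported for the node
};

// Returns the index of the node to use, or -1 if none is suitable.
//  - No nodes at all: fail.
//  - No name requested: use the first node.
//  - Name requested: use the first node whose driver name matches,
//    and fail if there is no match (assumed behaviour).
int pickRenderNode(const std::vector<RenderNode>& nodes,
                   const std::string& wantedName)
{
    if (nodes.empty())
        return -1;

    if (wantedName.empty())
        return 0;

    for (std::size_t i = 0; i < nodes.size(); ++i)
    {
        if (nodes[i].driverName == wantedName)
            return static_cast<int>(i);
    }
    return -1;
}
```

Nothing in this policy depends on the bus type, so a virtio device would be treated like any other render node.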


Re: [Mesa-dev] [PATCH v2 0/7] InitMemory inclusion

2018-05-25 Thread Cherniak, Bruce
V2:  Good catch!
Reviewed-by: Bruce Cherniak 

> On May 25, 2018, at 10:19 AM, Alok Hota  wrote:
> 
> Version 2 makes a small change to swr_loader.cpp to include the new InitMemory
> header, which fixes a compile error on single-architecture builds.
> 
> Alok Hota (7):
>  swr/rast: Added in-place building to SCATTERPS
>  swr/rast: Checking gCoreBuckets and CORE_BUCKETS are equal length at
>compile time
>  swr/rast: Use metadata to communicate between passes
>  swr/rast: Renamed MetaData calls
>  swr/rast: Removed superfluous JitManager argument from passes
>  swr/rast: Moved memory init out of core swr init
>  swr/rast: Adjusted avx512 primitive assembly for msvc codegen
> 
> src/gallium/drivers/swr/Makefile.sources   |   4 +-
> src/gallium/drivers/swr/meson.build|   2 +
> src/gallium/drivers/swr/rasterizer/core/api.cpp|   4 -
> src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp | 139 +++--
> .../drivers/swr/rasterizer/core/rdtsc_core.cpp |   1 +
> src/gallium/drivers/swr/rasterizer/core/state.h|   3 +-
> .../drivers/swr/rasterizer/jitter/blend_jit.cpp|   2 +-
> .../drivers/swr/rasterizer/jitter/builder.cpp  | 170 ++---
> .../drivers/swr/rasterizer/jitter/builder.h|  32 +++-
> .../drivers/swr/rasterizer/jitter/builder_mem.cpp  |  29 ++--
> .../drivers/swr/rasterizer/jitter/fetch_jit.cpp|   2 +-
> .../rasterizer/jitter/functionpasses/lower_x86.cpp |  17 +--
> .../swr/rasterizer/jitter/functionpasses/passes.h  |   2 +-
> .../swr/rasterizer/jitter/streamout_jit.cpp|   2 +-
> .../drivers/swr/rasterizer/memory/InitMemory.cpp   |  39 +
> .../drivers/swr/rasterizer/memory/InitMemory.h |  33 
> src/gallium/drivers/swr/swr_loader.cpp |   8 +-
> src/gallium/drivers/swr/swr_shader.cpp |   2 +-
> 18 files changed, 325 insertions(+), 166 deletions(-)
> create mode 100644 src/gallium/drivers/swr/rasterizer/memory/InitMemory.cpp
> create mode 100644 src/gallium/drivers/swr/rasterizer/memory/InitMemory.h
> 
> -- 
> 2.7.4
> 


[Mesa-dev] [Bug 106644] [llvmpipe] Mesa 18.1.0 fails lp_test_format, lp_test_arit, lp_test_blend, lp_test_printf, lp_test_conv tests

2018-05-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106644

--- Comment #10 from erhar...@mailbox.org ---
Correct. Should I note (potential?) ppc specific issues in the bug title too,
besides selecting hardware "PowerPC"?

llvmpipe does run on my other G5 (64bit build), however a lot of piglit tests
fail/segfault (see bug #105730).

Don't know yet if it's a regression, but I could build and run the tests a few
major releases back on both G5. If it turns out to be a regression I will at
least try to bisect it and see how far I get.



Re: [Mesa-dev] [PATCH v2 1/2] util/u_math: Implement a logbase2 function for unsigned long

2018-05-25 Thread Eric Engestrom
On Thursday, 2018-05-24 11:47:52 +0200, Karol Herbst wrote:
> From: Pierre Moreau 
> 
> v2 (Karol Herbst ):
> * removed unneeded ll
> * ll -> ull

Reviewed-by: Eric Engestrom 

> 
> Signed-off-by: Karol Herbst 
> ---
>  src/gallium/auxiliary/util/u_math.h | 55 +
>  src/util/bitscan.h  | 11 ++
>  2 files changed, 66 insertions(+)
> 
> diff --git a/src/gallium/auxiliary/util/u_math.h 
> b/src/gallium/auxiliary/util/u_math.h
> index 46d02978fd6..79869a119af 100644
> --- a/src/gallium/auxiliary/util/u_math.h
> +++ b/src/gallium/auxiliary/util/u_math.h
> @@ -421,6 +421,23 @@ util_logbase2(unsigned n)
>  #endif
>  }
>  
> +static inline uint64_t
> +util_logbase2_64(uint64_t n)
> +{
> +#if defined(HAVE___BUILTIN_CLZLL)
> +   return ((sizeof(uint64_t) * 8 - 1) - __builtin_clzll(n | 1));
> +#else
> +   uint64_t pos = 0ull;
> +   if (n >= 1ull<<32) { n >>= 32; pos += 32; }
> +   if (n >= 1ull<<16) { n >>= 16; pos += 16; }
> +   if (n >= 1ull<< 8) { n >>=  8; pos +=  8; }
> +   if (n >= 1ull<< 4) { n >>=  4; pos +=  4; }
> +   if (n >= 1ull<< 2) { n >>=  2; pos +=  2; }
> +   if (n >= 1ull<< 1) {   pos +=  1; }
> +   return pos;
> +#endif
> +}
> +
>  /**
>   * Returns the ceiling of log n base 2, and 0 when n == 0. Equivalently,
>   * returns the smallest x such that n <= 2**x.
> @@ -434,6 +451,15 @@ util_logbase2_ceil(unsigned n)
> return 1 + util_logbase2(n - 1);
>  }
>  
> +static inline uint64_t
> +util_logbase2_ceil64(uint64_t n)
> +{
> +   if (n <= 1)
> +  return 0;
> +
> +   return 1ull + util_logbase2_64(n - 1);
> +}
> +
>  /**
>   * Returns the smallest power of two >= x
>   */
> @@ -465,6 +491,35 @@ util_next_power_of_two(unsigned x)
>  #endif
>  }
>  
> +static inline uint64_t
> +util_next_power_of_two64(uint64_t x)
> +{
> +#if defined(HAVE___BUILTIN_CLZLL)
> +   if (x <= 1)
> +   return 1;
> +
> +   return (1ull << ((sizeof(uint64_t) * 8) - __builtin_clzll(x - 1)));
> +#else
> +   uint64_t val = x;
> +
> +   if (x <= 1)
> +  return 1;
> +
> +   if (util_is_power_of_two_or_zero64(x))
> +  return x;
> +
> +   val--;
> +   val = (val >> 1)  | val;
> +   val = (val >> 2)  | val;
> +   val = (val >> 4)  | val;
> +   val = (val >> 8)  | val;
> +   val = (val >> 16) | val;
> +   val = (val >> 32) | val;
> +   val++;
> +   return val;
> +#endif
> +}
> +
>  
>  /**
>   * Return number of bits set in n.
> diff --git a/src/util/bitscan.h b/src/util/bitscan.h
> index 5cc75f0beba..dc89ac93f28 100644
> --- a/src/util/bitscan.h
> +++ b/src/util/bitscan.h
> @@ -123,6 +123,17 @@ util_is_power_of_two_or_zero(unsigned v)
> return (v & (v - 1)) == 0;
>  }
>  
> +/* Determine if an uint64_t value is a power of two.
> + *
> + * \note
> + * Zero is treated as a power of two.
> + */
> +static inline bool
> +util_is_power_of_two_or_zero64(uint64_t v)
> +{
> +   return (v & (v - 1)) == 0;
> +}
> +
>  /* Determine if an unsigned value is a power of two.
>   *
>   * \note
> -- 
> 2.17.0
> 
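
For reference, the portable (non-builtin) path of the 64-bit helpers in the quoted patch can be exercised in isolation. The sketch below mirrors that fallback logic under shortened, hypothetical names (logbase2_64, logbase2_ceil64, next_power_of_two64); it is not the Mesa header, and next_power_of_two64 is written here in terms of the ceiling log rather than the bit-smearing loop used in the patch.

```cpp
#include <cstdint>

// Floor of log2(n) by binary search over the bit position.
// Returns 0 for both n == 0 and n == 1.
inline uint64_t logbase2_64(uint64_t n)
{
    uint64_t pos = 0;
    if (n >= 1ull << 32) { n >>= 32; pos += 32; }
    if (n >= 1ull << 16) { n >>= 16; pos += 16; }
    if (n >= 1ull <<  8) { n >>=  8; pos +=  8; }
    if (n >= 1ull <<  4) { n >>=  4; pos +=  4; }
    if (n >= 1ull <<  2) { n >>=  2; pos +=  2; }
    if (n >= 1ull <<  1) {           pos +=  1; }
    return pos;
}

// Ceiling of log2(n): the smallest x such that n <= 2**x (0 when n <= 1).
inline uint64_t logbase2_ceil64(uint64_t n)
{
    if (n <= 1)
        return 0;
    return 1 + logbase2_64(n - 1);
}

// Smallest power of two >= x.
// Precondition: x <= 2**63, otherwise the result does not fit in 64 bits
// and the shift below would be undefined.
inline uint64_t next_power_of_two64(uint64_t x)
{
    if (x <= 1)
        return 1;
    return 1ull << logbase2_ceil64(x);
}
```

A few boundary values (0, 1, exact powers of two and their neighbours, and the all-ones value) are usually enough to catch an off-by-one in this kind of helper.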


Re: [Mesa-dev] [PATCH 0/3] egl/android: Remove dependencies on specific grallocs

2018-05-25 Thread Tomasz Figa
On Sat, May 26, 2018 at 12:38 AM Rob Herring  wrote:

> On Fri, May 25, 2018 at 9:25 AM, Tomasz Figa  wrote:
> > On Fri, May 25, 2018 at 10:59 PM Rob Herring  wrote:
> >
> >> On Fri, May 25, 2018 at 4:15 AM, Robert Foss  > wrote:
> >> >
> >> >
> >> > On 2018-05-25 10:38, Tomasz Figa wrote:
> >> >>
> >> >> On Fri, May 25, 2018 at 5:33 PM Robert Foss <
robert.f...@collabora.com>
> >> >> wrote:
> >> >>
> >> >>> Hey,
> >> >>
> >> >>
> >> >>> On 2018-05-25 02:17, Rob Herring wrote:
> >> 
> >>  On Thu, May 24, 2018 at 6:23 AM, Robert Foss <
> > robert.f...@collabora.com>
> >> >>
> >> >> wrote:
> >> >
> >> > Hey,
> >> >
> >> > I don't think I've received any feedback on this version yet.
> >> > If anyone has some time to spare, it would be nice to get it
merged.
> >> >
> >> > Just to be clear about the libdrm branch linked in the cover
letter,
> >> > it is not required. Only for virgl platforms which happens to be
> > what
> >> > I tested on.
> >> 
> >> 
> >>  virgl will still fallback to using the first render node without
> > those
> >>  libdrm changes, right? If not, I don't think we should apply until
> >>  we're not breaking a platform...
> >> >>
> >> >>
> >> >>> No it will not fall back. I agree that holding off makes more
sense.
> >> >>
> >> >>
> >> >> What's the reason of this problems? Is it because of
drmGetDevices()?
> >> >> Since
> >> >> we don't really use it for anything other than getting the list of
> > render
> >> >> nodes in the system, maybe we could just iterate over any
/dev/renderD*
> >> >> nodes explicitly and avoid introducing new problems?
> >> >
> >> >
> >> > That's exactly the problem, and yes we could 100% solve by iterating
> > over
> >> > /dev/renderD* nodes. I originally assumed we wouldn't want to do
that,
> > but
> >> > rather use the libdrm interfaces.
> >> >
> >> > But for the next spin I could avoid using libdrm, should I?
> >
> >> I don't have an opinion on libdrm really, but I do think we should
> >> fallback to the 1st (only) render node rather than just fail.
> >
> > We do, even with libdrm.
> >
> > AFAICT, the problem with virgl seems to be that drmGetDevices() doesn't
> > include devices on virtio bus in the results, which means that there
likely
> > wouldn't be any render node returned.

> Okay. I still don't get why we search by bus in the first place. Who
> cares what bus the gpu sits on.

We don't search by bus. drmGetDevices() iterates over DRI nodes, queries
them and discards those whose bus type it fails to recognize. I have no
idea why it does so, though.


> Now I have an opinion. We should just iterate over render nodes
> matching by name or use the first node if we don't have a set name.

Yeah, I suggested that too in my previous reply. It doesn't look like
libdrm has any sane helper that could help us.

Best regards,
Tomasz


Re: [Mesa-dev] [PATCH] egl/x11: Move dri2_format_for_depth prototype.

2018-05-25 Thread Eric Engestrom
On Friday, 2018-05-25 16:06:26 +0100, Eric Engestrom wrote:
> On Friday, 2018-05-25 06:52:25 +, Vinson Lee wrote:
> > Fix build error without DRI3.
> 
> D'uh!
> I forgot building dri3 was optional, sorry :/
> 
> Reviewed-by: Eric Engestrom 

Actually, wait no, this doesn't look right, the function should be named
something else if it's exposed to everyone, since it's quite specific to
x11's case, or it should not be exposed to everyone.

I feel like the best thing to do here is to just copy the prototype to
platform_x11.c:

---8<---
diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c
index b2a3000b252ec0ddb12f..ea9b0cc6d6fd04804d2a 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -55,6 +55,9 @@ static EGLBoolean
 dri2_x11_swap_interval(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf,
EGLint interval);
 
+uint32_t
+dri2_format_for_depth(uint32_t depth);
+
 static void
 swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
  struct dri2_egl_surface * dri2_surf)
--->8---

> 
> > 
> >   CC   drivers/dri2/platform_x11.lo
> > drivers/dri2/platform_x11.c:1010:1: error: no previous prototype for 
> > function 'dri2_format_for_depth' [-Werror,-Wmissing-prototypes]
> > dri2_format_for_depth(uint32_t depth)
> > ^
> > 
> > Fixes: 473af0b541b2 ("egl/x11: deduplicate depth-to-format logic")
> > Signed-off-by: Vinson Lee 
> > ---
> >  src/egl/drivers/dri2/egl_dri2.h  | 3 +++
> >  src/egl/drivers/dri2/platform_x11_dri3.h | 3 ---
> >  2 files changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/src/egl/drivers/dri2/egl_dri2.h 
> > b/src/egl/drivers/dri2/egl_dri2.h
> > index adabc527f85b..b91a899e476c 100644
> > --- a/src/egl/drivers/dri2/egl_dri2.h
> > +++ b/src/egl/drivers/dri2/egl_dri2.h
> > @@ -523,4 +523,7 @@ dri2_init_surface(_EGLSurface *surf, _EGLDisplay *dpy, 
> > EGLint type,
> >  void
> >  dri2_fini_surface(_EGLSurface *surf);
> >  
> > +uint32_t
> > +dri2_format_for_depth(uint32_t depth);
> > +
> >  #endif /* EGL_DRI2_INCLUDED */
> > diff --git a/src/egl/drivers/dri2/platform_x11_dri3.h 
> > b/src/egl/drivers/dri2/platform_x11_dri3.h
> > index e6fd01366978..96e7ee972d9f 100644
> > --- a/src/egl/drivers/dri2/platform_x11_dri3.h
> > +++ b/src/egl/drivers/dri2/platform_x11_dri3.h
> > @@ -38,7 +38,4 @@ extern struct dri2_egl_display_vtbl dri3_x11_display_vtbl;
> >  EGLBoolean
> >  dri3_x11_connect(struct dri2_egl_display *dri2_dpy);
> >  
> > -uint32_t
> > -dri2_format_for_depth(uint32_t depth);
> > -
> >  #endif
> > -- 
> > 2.17.0
> > 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 07/53] intel/fs: Add explicit last_rt flag to fb writes orthogonal to eot.

2018-05-25 Thread Matt Turner
On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand  wrote:
> From: Francisco Jerez 

I think some explanation is required. I'm guessing this is because you
have to write lo fragments out before high, but we should say that in
the commit message.


Re: [Mesa-dev] [PATCH 08/53] intel/fs: FS_OPCODE_REP_FB_WRITE has side effects

2018-05-25 Thread Matt Turner
On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand  wrote:
> It doesn't matter since we don't ever run replicated write shaders
> through the optimizer but it's good to be complete.

Aside: Is there anything that would prevent us from detecting that all
fragments are uniform and using this message?


Re: [Mesa-dev] [PATCH 09/53] intel/fs: Fix Gen4-5 FB write AA data payload munging for non-EOT writes.

2018-05-25 Thread Matt Turner
On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand  wrote:
> From: Francisco Jerez 

Okay, I think the problem this patch is fixing is that previously we
would unconditionally execute the fire_fb_write() to send the AA data,
and conditionally execute the fire_fb_write() that does not.

But we actually want to send one or the other, and never both.

With that explanation in the commit message,

Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 15/53] intel/fs: Set up FB write message headers in the visitor

2018-05-25 Thread Matt Turner
On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand  wrote:
> Doing instruction header setup in the generator is aweful for a number

Misspelling: awful


Re: [Mesa-dev] [PATCH 00/53] intel/fs: SIMD32 support for fragment shaders

2018-05-25 Thread Matt Turner
On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand  wrote:
> This patch series adds back-end compiler support for SIMD32 fragment
> shaders.  Support is added and everything works but it's currently hidden
> behind INTEL_DEBUG=do32.  We know that it improves performance in some
> cases but we do not yet have a good enough heuristic to start turning it on
> by default.  The objective of this series is just to get the compiler
> infrastructure landed so that it stops bit-rotting in Curro's branch.
> Figuring out a good heuristic is left as an exercise to the reader. :-)

1-6, 8-20 are

Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH] i965: Fix ETC2/EAC GetCompressed* functions on Gen7 GPUs

2018-05-25 Thread Kenneth Graunke
On Friday, February 23, 2018 7:10:55 AM PDT Eleni Maria Stea wrote:
> Gen 7 GPUs store the compressed EAC/ETC2 images in other non-compressed
> formats that can render. When GetCompressed* functions are called, the
> pixels are returned in the non-compressed format that is used for the
> rendering.
> 
> With this patch we store both the compressed and non-compressed versions
> of the image, so that both rendering commands and GetCompressed*
> commands work.
> 
> Also, the assertions for GL_MAP_WRITE_BIT and GL_MAP_INVALIDATE_RANGE_BIT
> in the intel_miptree_map_etc function have been removed because when the
> miptree is mapped for reading (for example from a GetCompressed*
> function) the GL_MAP_WRITE_BIT won't be set (and shouldn't be set).
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  10 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  14 +++
>  src/mesa/drivers/dri/i965/intel_tex.c | 157 +-
>  src/mesa/drivers/dri/i965/intel_tex.h |   8 ++
>  src/mesa/drivers/dri/i965/intel_tex_image.c   |  93 ++-
>  src/mesa/drivers/dri/i965/intel_tex_obj.h |   8 ++
>  6 files changed, 256 insertions(+), 34 deletions(-)

Hello,

I think this patch could probably be simplified a bit, with less
duplication of core Mesa stuff...some suggestions below...

> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 22977d6659..c8c7c025b6 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -730,9 +730,10 @@ miptree_create(struct brw_context *brw,
> mesa_format etc_format = MESA_FORMAT_NONE;
> uint32_t alloc_flags = 0;
>  
> -   format = intel_lower_compressed_format(brw, format);
> -
> -   etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE;
> +   if (!(flags & MIPTREE_CREATE_ETC)) {
> +  format = intel_lower_compressed_format(brw, format);
> +  etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE;
> +   }
>  
> if (flags & MIPTREE_CREATE_BUSY)
>alloc_flags |= BO_ALLOC_BUSY;
> @@ -3314,9 +3315,6 @@ intel_miptree_map_etc(struct brw_context *brw,
>assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM);
> }
>  
> -   assert(map->mode & GL_MAP_WRITE_BIT);
> -   assert(map->mode & GL_MAP_INVALIDATE_RANGE_BIT);
> -
> map->stride = _mesa_format_row_stride(mt->etc_format, map->w);
> map->buffer = malloc(_mesa_format_image_size(mt->etc_format,
>  map->w, map->h, 1));
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> index 7fcf09f118..bf6195b97a 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> @@ -379,6 +379,20 @@ enum intel_miptree_create_flags {
>  * that the miptree will be created with mt->aux_usage == NONE.
>  */
> MIPTREE_CREATE_NO_AUX   = 1 << 2,
> +
> +   /** Create a second miptree for the compressed pixels (Gen7 only)
> +*
> +* On Gen7, we need to store 2 miptrees for some compressed
> +* formats so we can handle rendering as well as getting the
> +* compressed image data. This flag indicates that the miptree
> +* is expected to hold compressed data for the latter case.
> +*/
> +   MIPTREE_CREATE_ETC  = 1 << 3,
> +};

Create flags look fine.

> +
> +enum intel_miptree_upload_flags {
> +   MIPTREE_UPLOAD_DEFAULT = 0,
> +   MIPTREE_UPLOAD_ETC,
>  };

Rather than creating an extra set of flags here, I would just extend the
GL_MAP_*_BIT flags that already get passed around as 'mode'.  There's
some precedent for that with BRW_MAP_DIRECT_BIT in intel_mipmap_tree.h:

/**
 * This bit extends the set of GL_MAP_*_BIT enums.
 *
 * When calling intel_miptree_map() on an ETC-transcoded-to-RGB miptree or a
 * depthstencil-split-to-separate-stencil miptree, we'll normally make a
 * temporary and recreate the kind of data requested by Mesa core, since we're
 * satisfying some glGetTexImage() request or something.
 *
 * However, occasionally you want to actually map the miptree's current data
 * without transcoding back.  This flag to intel_miptree_map() gets you that.
 */
#define BRW_MAP_DIRECT_BIT  0x8000

So, I'd just make a BRW_MAP_ETC_BIT 0x4000, and use that instead.
The advantage is that you should be able to reuse existing functions
rather than creating new ones that take an extra 'flags' parameter.
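
To make the idea concrete, here is a rough sketch of how a private bit can
ride along in 'mode' next to the GL access bits. Everything prefixed EX_/ex_
is hypothetical and local to this mail: the two private bit values are the
ones written above, the GL values are the standard GL_MAP_READ_BIT and
GL_MAP_WRITE_BIT encodings, and the dispatch function is only an
illustration, not the existing intel_miptree_map() logic.

```c
#include <assert.h>
#include <stdint.h>

/* Standard GL map access bits (same values as GL_MAP_READ_BIT and
 * GL_MAP_WRITE_BIT), redefined locally so the sketch is standalone. */
#define EX_GL_MAP_READ_BIT     0x0001u
#define EX_GL_MAP_WRITE_BIT    0x0002u

/* Driver-private extension bits, using the values quoted in this mail.
 * EX_BRW_MAP_ETC_BIT is the proposed (hypothetical) new bit. */
#define EX_BRW_MAP_DIRECT_BIT  0x8000u
#define EX_BRW_MAP_ETC_BIT     0x4000u
#define EX_BRW_MAP_PRIVATE     (EX_BRW_MAP_DIRECT_BIT | EX_BRW_MAP_ETC_BIT)

enum ex_map_path {
   EX_MAP_PATH_TRANSCODE,   /* default: hand back what Mesa core expects */
   EX_MAP_PATH_DIRECT,      /* map the miptree's current data as-is */
   EX_MAP_PATH_ETC,         /* map the stored compressed copy */
};

enum ex_map_path
ex_choose_map_path(uint32_t mode);

uint32_t
ex_gl_access_bits(uint32_t mode);

/* Pick a mapping path from the private bits only. */
enum ex_map_path
ex_choose_map_path(uint32_t mode)
{
   if (mode & EX_BRW_MAP_DIRECT_BIT)
      return EX_MAP_PATH_DIRECT;
   if (mode & EX_BRW_MAP_ETC_BIT)
      return EX_MAP_PATH_ETC;
   return EX_MAP_PATH_TRANSCODE;
}

/* Strip the private bits so only GL access bits remain. */
uint32_t
ex_gl_access_bits(uint32_t mode)
{
   return mode & ~EX_BRW_MAP_PRIVATE;
}
```

Existing callers that never set the new bit keep their current behaviour,
which is the main attraction over adding a separate flags parameter.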

>  
>  struct intel_mipmap_tree *intel_miptree_create(struct brw_context *brw,
> diff --git a/src/mesa/drivers/dri/i965/intel_tex.c b/src/mesa/drivers/dri/i965/intel_tex.c
> index 65a1cb37d4..56077a7676 100644
> --- a/src/mesa/drivers/dri/i965/intel_tex.c
> +++ b/src/mesa/drivers/dri/i965/intel_tex.c
> @@ -66,6 +66,8 @@ intel_alloc_texture_image_buffer(struct gl_context *ctx,
> struct intel_texture_image *intel_image = intel_texture

Re: [Mesa-dev] [PATCH 10/13] i965/miptree: Use cpu tiling/detiling when mapping

2018-05-25 Thread Kenneth Graunke
On Monday, April 30, 2018 4:38:57 PM PDT Scott D Phillips wrote:
> Kenneth Graunke  writes:
> 
> > On Monday, April 30, 2018 10:25:49 AM PDT Scott D Phillips wrote:
> >> Rename the (un)map_gtt functions to (un)map_map (map by
> >> returning a map) and add new functions (un)map_tiled_memcpy that
> >> return a shadow buffer populated with the intel_tiled_memcpy
> >> functions.
> >> 
> >> Tiling/detiling with the cpu will be the only way to handle Yf/Ys
> >> tiling, when support is added for those formats.
> >> 
> >> v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson)
> >> 
> >> v3: Add units to parameter names of tile_extents (Nanley Chery)
> >> Use _mesa_align_malloc for the shadow copy (Nanley)
> >> Continue using gtt maps on gen4 (Nanley)
> >> 
> >> v4: Use streaming_load_memcpy when detiling
> >> 
> >> Reviewed-by: Chris Wilson 
> >> ---
> >>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 98 +--
> >>  1 file changed, 94 insertions(+), 4 deletions(-)
> >> 
> >> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> >> index b9a564552df..498eebd2f86 100644
> >> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> >> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> >> @@ -31,6 +31,7 @@
> >>  #include "intel_image.h"
> >>  #include "intel_mipmap_tree.h"
> >>  #include "intel_tex.h"
> >> +#include "intel_tiled_memcpy.h"
> >>  #include "intel_blit.h"
> >>  #include "intel_fbo.h"
> >>  
> >> @@ -3066,7 +3067,7 @@ intel_miptree_unmap_raw(struct intel_mipmap_tree *mt)
> >>  }
> >>  
> >>  static void
> >> -intel_miptree_unmap_gtt(struct brw_context *brw,
> >> +intel_miptree_unmap_map(struct brw_context *brw,
> >>  struct intel_mipmap_tree *mt,
> >>  struct intel_miptree_map *map,
> >>  unsigned int level, unsigned int slice)
> >> @@ -3075,7 +3076,7 @@ intel_miptree_unmap_gtt(struct brw_context *brw,
> >>  }
> >>  
> >>  static void
> >> -intel_miptree_map_gtt(struct brw_context *brw,
> >> +intel_miptree_map_map(struct brw_context *brw,
> >>  struct intel_mipmap_tree *mt,
> >>  struct intel_miptree_map *map,
> >>  unsigned int level, unsigned int slice)
> >> @@ -3120,7 +3121,7 @@ intel_miptree_map_gtt(struct brw_context *brw,
> >> mt, _mesa_get_format_name(mt->format),
> >> x, y, map->ptr, map->stride);
> >>  
> >> -   map->unmap = intel_miptree_unmap_gtt;
> >> +   map->unmap = intel_miptree_unmap_map;
> >>  }
> >>  
> >>  static void
> >> @@ -3145,6 +3146,90 @@ intel_miptree_unmap_blit(struct brw_context *brw,
> >> intel_miptree_release(&map->linear_mt);
> >>  }
> >>  
> >> +/* Compute extent parameters for use with tiled_memcpy functions.
> >> + * xs are in units of bytes and ys are in units of strides. */
> >> +static inline void
> >> +tile_extents(struct intel_mipmap_tree *mt, struct intel_miptree_map *map,
> >> + unsigned int level, unsigned int slice, unsigned int *x1_B,
> >> + unsigned int *x2_B, unsigned int *y1_el, unsigned int *y2_el)
> >> +{
> >> +   unsigned int block_width, block_height;
> >> +   unsigned int x0_el, y0_el;
> >> +
> >> +   _mesa_get_format_block_size(mt->format, &block_width, &block_height);
> >> +
> >> +   assert(map->x % block_width == 0);
> >> +   assert(map->y % block_height == 0);
> >> +
> >> +   intel_miptree_get_image_offset(mt, level, slice, &x0_el, &y0_el);
> >> +   *x1_B = (map->x / block_width + x0_el) * mt->cpp;
> >> +   *y1_el = map->y / block_height + y0_el;
> >> +   *x2_B = (DIV_ROUND_UP(map->x + map->w, block_width) + x0_el) * mt->cpp;
> >> +   *y2_el = DIV_ROUND_UP(map->y + map->h, block_height) + y0_el;
> >> +}
> >> +
> >> +static void
> >> +intel_miptree_unmap_tiled_memcpy(struct brw_context *brw,
> >> + struct intel_mipmap_tree *mt,
> >> + struct intel_miptree_map *map,
> >> + unsigned int level,
> >> + unsigned int slice)
> >> +{
> >> +   if (map->mode & GL_MAP_WRITE_BIT) {
> >> +  unsigned int x1, x2, y1, y2;
> >> +  tile_extents(mt, map, level, slice, &x1, &x2, &y1, &y2);
> >> +
> >> +  char *dst = intel_miptree_map_raw(brw, mt, map->mode | MAP_RAW);
> >> +  dst += mt->offset;
> >> +
> >> +  linear_to_tiled(x1, x2, y1, y2, dst, map->ptr, mt->surf.row_pitch,
> >> +  map->stride, brw->has_swizzling, mt->surf.tiling, memcpy);
> >> +
> >> +  intel_miptree_unmap_raw(mt);
> >> +   }
> >> +   _mesa_align_free(map->buffer);
> >> +   map->buffer = map->ptr = NULL;
> >> +}
> >> +
> >> +static void
> >> +intel_miptree_map_tiled_memcpy(struct brw_context *brw,
> >> +   struct intel_mipmap_tree *mt,
> >> +   struct intel_miptree_map *map,
> >> +

Re: [Mesa-dev] [PATCH 07/53] intel/fs: Add explicit last_rt flag to fb writes orthogonal to eot.

2018-05-25 Thread Jason Ekstrand
On Fri, May 25, 2018 at 11:27 AM, Matt Turner  wrote:

> On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand wrote:
> > From: Francisco Jerez 
>
> I think some explanation is required. I'm guessing this is because you
> have to write lo fragments out before high, but we should say that in
> the commit message.
>

How about this:

When using multiple RT write messages to the same RT such as for
dual-source blending or all RT writes in SIMD32, we have to set the "Last
Render Target Select" bit on all write messages that target the last RT but
only set EOT on the last RT write in the shader.  Special-casing for
dual-source blend works today because that is the only case which requires
multiple RT write messages per RT.  When we start doing SIMD32, this will
become much more common so we add a dedicated bit for it.
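
A rough sketch of the flag assignment described above, for illustration
only. This is not the actual brw_fs code: struct ex_fb_write and
ex_emit_fb_writes() are hypothetical names, and parts_per_rt stands for
however many write messages a single RT needs.

```c
#include <assert.h>
#include <stdbool.h>

/* One render-target write message (hypothetical, simplified). */
struct ex_fb_write {
   unsigned rt;     /* render target index */
   unsigned part;   /* which of the messages for this RT */
   bool last_rt;    /* "Last Render Target Select" */
   bool eot;        /* end of thread */
};

unsigned
ex_emit_fb_writes(unsigned num_rts, unsigned parts_per_rt,
                  struct ex_fb_write *out, unsigned max_out);

/* Fill 'out' with the write messages in emission order and return how
 * many were written.  last_rt is set on every message that targets the
 * last RT; eot is set only on the very last message in the shader. */
unsigned
ex_emit_fb_writes(unsigned num_rts, unsigned parts_per_rt,
                  struct ex_fb_write *out, unsigned max_out)
{
   unsigned n = 0;

   for (unsigned rt = 0; rt < num_rts; rt++) {
      for (unsigned part = 0; part < parts_per_rt; part++) {
         if (n == max_out)
            return n;

         const bool is_last_rt = (rt == num_rts - 1);

         out[n].rt = rt;
         out[n].part = part;
         out[n].last_rt = is_last_rt;
         out[n].eot = is_last_rt && (part == parts_per_rt - 1);
         n++;
      }
   }

   return n;
}
```

With two RTs and two messages per RT, the third message has last_rt set
without eot, which is exactly the case a single combined flag cannot
express.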


Re: [Mesa-dev] [PATCH 08/53] intel/fs: FS_OPCODE_REP_FB_WRITE has side effects

2018-05-25 Thread Jason Ekstrand
On Fri, May 25, 2018 at 11:29 AM, Matt Turner  wrote:

> On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand wrote:
> > It doesn't matter since we don't ever run replicated write shaders
> > through the optimizer but it's good to be complete.
>
> Aside: Is there anything that would prevent us from detecting that all
> fragments are uniform and using this message?
>

We've considered that in the past.  Unfortunately, it also has other
restrictions such as not allowing color masking so we'd have to put more
stuff in the shader key.  It could be done though.

--Jason


[Mesa-dev] [PATCH] intel/blorp: Support blits and clears on surfaces with offsets

2018-05-25 Thread Jason Ekstrand
For certain EGLImage cases, we represent a single slice or LOD of an
image with a byte offset to a tile and X/Y intratile offsets to the
given slice.  Most of i965 is fine with this but it breaks blorp.  This
is a terrible way to represent slices of a surface in EGL and we should
stop some day but that's a very scary and thorny path.  This gets blorp
to start working with those surfaces and fixes some dEQP EGL test bugs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629
Cc: mesa-sta...@lists.freedesktop.org
---
 src/intel/blorp/blorp.c   | 22 ++
 src/intel/blorp/blorp.h   |  3 +++
 src/intel/blorp/blorp_blit.c  |  4 +++-
 src/intel/blorp/blorp_clear.c |  9 +
 src/mesa/drivers/dri/i965/brw_blorp.c |  2 ++
 5 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/src/intel/blorp/blorp.c b/src/intel/blorp/blorp.c
index e348caf..73f8c67 100644
--- a/src/intel/blorp/blorp.c
+++ b/src/intel/blorp/blorp.c
@@ -137,6 +137,28 @@ brw_blorp_surface_info_init(struct blorp_context *blorp,
 */
if (is_render_target && blorp->isl_dev->info->gen <= 6)
   info->view.array_len = MIN2(info->view.array_len, 512);
+
+   if (surf->tile_x_sa || surf->tile_y_sa) {
+  /* This is only allowed on simple 2D surfaces without MSAA */
+  assert(info->surf.dim == ISL_SURF_DIM_2D);
+  assert(info->surf.samples == 1);
+  assert(info->surf.levels == 1);
+  assert(info->surf.logical_level0_px.array_len == 1);
+  assert(info->aux_usage == ISL_AUX_USAGE_NONE);
+
+  info->tile_x_sa = surf->tile_x_sa;
+  info->tile_y_sa = surf->tile_y_sa;
+
+  /* Instead of using the X/Y Offset fields in RENDER_SURFACE_STATE, we
+   * place the image at the tile boundary and offset our sampling or
+   * rendering.  For this reason, we need to grow the image by the offset
+   * to ensure that the hardware doesn't think we've gone past the edge.
+   */
+  info->surf.logical_level0_px.w += surf->tile_x_sa;
+  info->surf.logical_level0_px.h += surf->tile_y_sa;
+  info->surf.phys_level0_sa.w += surf->tile_x_sa;
+  info->surf.phys_level0_sa.h += surf->tile_y_sa;
+   }
 }
 
 
diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h
index f22110b..0a10ff9 100644
--- a/src/intel/blorp/blorp.h
+++ b/src/intel/blorp/blorp.h
@@ -114,6 +114,9 @@ struct blorp_surf
 * that it contains a swizzle of RGBA and resource min LOD of 0.
 */
struct blorp_address clear_color_addr;
+
+   /* Only allowed for simple 2D non-MSAA surfaces */
+   uint32_t tile_x_sa, tile_y_sa;
 };
 
 void
diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c
index 67d4266..68e6d4e 100644
--- a/src/intel/blorp/blorp_blit.c
+++ b/src/intel/blorp/blorp_blit.c
@@ -2510,7 +2510,9 @@ blorp_copy(struct blorp_batch *batch,
dst_layer, ISL_FORMAT_UNSUPPORTED, true);
 
struct brw_blorp_blit_prog_key wm_prog_key = {
-  .shader_type = BLORP_SHADER_TYPE_BLIT
+  .shader_type = BLORP_SHADER_TYPE_BLIT,
+  .need_src_offset = src_surf->tile_x_sa || src_surf->tile_y_sa,
+  .need_dst_offset = dst_surf->tile_x_sa || dst_surf->tile_y_sa,
};
 
const struct isl_format_layout *src_fmtl =
diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
index 832e8ee..4d3125a 100644
--- a/src/intel/blorp/blorp_clear.c
+++ b/src/intel/blorp/blorp_clear.c
@@ -438,6 +438,15 @@ blorp_clear(struct blorp_batch *batch,
   params.x1 = x1;
   params.y1 = y1;
 
+  if (params.dst.tile_x_sa || params.dst.tile_y_sa) {
+ assert(params.dst.surf.samples == 1);
+ assert(num_layers == 1);
+ params.x0 += params.dst.tile_x_sa;
+ params.y0 += params.dst.tile_y_sa;
+ params.x1 += params.dst.tile_x_sa;
+ params.y1 += params.dst.tile_y_sa;
+  }
+
   /* The MinLOD and MinimumArrayElement don't work properly for cube maps.
* Convert them to a single slice on gen4.
*/
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index d7a2cb2..8c6d77e 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -152,6 +152,8 @@ blorp_surf_for_miptree(struct brw_context *brw,
  .mocs = brw_get_bo_mocs(devinfo, mt->bo),
   },
   .aux_usage = aux_usage,
+  .tile_x_sa = mt->level[*level].level_x,
+  .tile_y_sa = mt->level[*level].level_y,
};
 
if (mt->format == MESA_FORMAT_S_UINT8 && is_render_target &&
-- 
2.5.0.400.gff86faf
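
For readers following along, a simplified sketch of the offsetting idea in
this patch: the surface is treated as starting at the tile boundary, so its
extent grows by the intratile offset and the rectangle we operate on moves
by the same amount. struct ex_surf, struct ex_rect and
ex_apply_tile_offset() are hypothetical stand-ins, not blorp types, and the
sketch folds the surface-info and clear-rectangle adjustments into one
helper for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, flattened view of a single-level 2D surface. */
struct ex_surf {
   uint32_t width, height;    /* extent the hardware is told about */
   uint32_t tile_x, tile_y;   /* intratile offset of the real image */
};

/* Rectangle in surface coordinates, x1/y1 exclusive. */
struct ex_rect {
   uint32_t x0, y0, x1, y1;
};

void
ex_apply_tile_offset(struct ex_surf *surf, struct ex_rect *rect);

/* Grow the surface by the intratile offset and shift the rectangle so it
 * still covers the same pixels of the real image.  Call once per surface. */
void
ex_apply_tile_offset(struct ex_surf *surf, struct ex_rect *rect)
{
   if (!surf->tile_x && !surf->tile_y)
      return;

   /* Without this, the shifted rectangle would fall past the edge. */
   surf->width += surf->tile_x;
   surf->height += surf->tile_y;

   rect->x0 += surf->tile_x;
   rect->y0 += surf->tile_y;
   rect->x1 += surf->tile_x;
   rect->y1 += surf->tile_y;
}
```

A full-image clear of a 64x32 image at intratile offset (8, 4) then becomes
a clear of (8, 4)-(72, 36) on a 72x36 surface.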



Re: [Mesa-dev] [PATCH] intel/blorp: Support blits and clears on surfaces with offsets

2018-05-25 Thread Kenneth Graunke
On Friday, May 25, 2018 12:31:03 PM PDT Jason Ekstrand wrote:
> For certain EGLImage cases, we represent a single slice or LOD of an
> image with a byte offset to a tile and X/Y intratile offsets to the
> given slice.  Most of i965 is fine with this but it breaks blorp.  This
> is a terrible way to represent slices of a surface in EGL and we should
> stop some day but that's a very scary and thorny path.  This gets blorp
> to start working with those surfaces and fixes some dEQP EGL test bugs.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/intel/blorp/blorp.c   | 22 ++
>  src/intel/blorp/blorp.h   |  3 +++
>  src/intel/blorp/blorp_blit.c  |  4 +++-
>  src/intel/blorp/blorp_clear.c |  9 +
>  src/mesa/drivers/dri/i965/brw_blorp.c |  2 ++
>  5 files changed, 39 insertions(+), 1 deletion(-)
> 
> diff --git a/src/intel/blorp/blorp.c b/src/intel/blorp/blorp.c
> index e348caf..73f8c67 100644
> --- a/src/intel/blorp/blorp.c
> +++ b/src/intel/blorp/blorp.c
> @@ -137,6 +137,28 @@ brw_blorp_surface_info_init(struct blorp_context *blorp,
>  */
> if (is_render_target && blorp->isl_dev->info->gen <= 6)
>info->view.array_len = MIN2(info->view.array_len, 512);
> +
> +   if (surf->tile_x_sa || surf->tile_y_sa) {
> +  /* This is only allowed on simple 2D surfaces without MSAA */
> +  assert(info->surf.dim == ISL_SURF_DIM_2D);
> +  assert(info->surf.samples == 1);
> +  assert(info->surf.levels == 1);
> +  assert(info->surf.logical_level0_px.array_len == 1);
> +  assert(info->aux_usage == ISL_AUX_USAGE_NONE);
> +
> +  info->tile_x_sa = surf->tile_x_sa;
> +  info->tile_y_sa = surf->tile_y_sa;
> +
> +  /* Instead of using the X/Y Offset fields in RENDER_SURFACE_STATE, we
> +   * place the image at the tile boundary and offset our sampling or
> +   * rendering.  For this reason, we need to grow the image by the offset
> +   * to ensure that the hardware doesn't think we've gone past the edge.
> +   */
> +  info->surf.logical_level0_px.w += surf->tile_x_sa;
> +  info->surf.logical_level0_px.h += surf->tile_y_sa;
> +  info->surf.phys_level0_sa.w += surf->tile_x_sa;
> +  info->surf.phys_level0_sa.h += surf->tile_y_sa;
> +   }
>  }
>  
>  
> diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h
> index f22110b..0a10ff9 100644
> --- a/src/intel/blorp/blorp.h
> +++ b/src/intel/blorp/blorp.h
> @@ -114,6 +114,9 @@ struct blorp_surf
>  * that it contains a swizzle of RGBA and resource min LOD of 0.
>  */
> struct blorp_address clear_color_addr;
> +
> +   /* Only allowed for simple 2D non-MSAA surfaces */
> +   uint32_t tile_x_sa, tile_y_sa;
>  };
>  
>  void
> diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c
> index 67d4266..68e6d4e 100644
> --- a/src/intel/blorp/blorp_blit.c
> +++ b/src/intel/blorp/blorp_blit.c
> @@ -2510,7 +2510,9 @@ blorp_copy(struct blorp_batch *batch,
> dst_layer, ISL_FORMAT_UNSUPPORTED, true);
>  
> struct brw_blorp_blit_prog_key wm_prog_key = {
> -  .shader_type = BLORP_SHADER_TYPE_BLIT
> +  .shader_type = BLORP_SHADER_TYPE_BLIT,
> +  .need_src_offset = src_surf->tile_x_sa || src_surf->tile_y_sa,
> +  .need_dst_offset = dst_surf->tile_x_sa || dst_surf->tile_y_sa,
> };
>  
> const struct isl_format_layout *src_fmtl =
> diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
> index 832e8ee..4d3125a 100644
> --- a/src/intel/blorp/blorp_clear.c
> +++ b/src/intel/blorp/blorp_clear.c
> @@ -438,6 +438,15 @@ blorp_clear(struct blorp_batch *batch,
>params.x1 = x1;
>params.y1 = y1;
>  
> +  if (params.dst.tile_x_sa || params.dst.tile_y_sa) {
> + assert(params.dst.surf.samples == 1);
> + assert(num_layers == 1);
> + params.x0 += params.dst.tile_x_sa;
> + params.y0 += params.dst.tile_y_sa;
> + params.x1 += params.dst.tile_x_sa;
> + params.y1 += params.dst.tile_y_sa;
> +  }
> +
>/* The MinLOD and MinimumArrayElement don't work properly for cube maps.
> * Convert them to a single slice on gen4.
> */
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
> index d7a2cb2..8c6d77e 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -152,6 +152,8 @@ blorp_surf_for_miptree(struct brw_context *brw,
>   .mocs = brw_get_bo_mocs(devinfo, mt->bo),
>},
>.aux_usage = aux_usage,
> +  .tile_x_sa = mt->level[*level].level_x,
> +  .tile_y_sa = mt->level[*level].level_y,
> };
>  
> if (mt->format == MESA_FORMAT_S_UINT8 && is_render_target &&
> 

Hopefully we don't run afoul of surface width/height limits.  Probably
won't, hard to imagine offsetting into something that's alr

Re: [Mesa-dev] [PATCH 07/53] intel/fs: Add explicit last_rt flag to fb writes orthogonal to eot.

2018-05-25 Thread Matt Turner
On Fri, May 25, 2018 at 12:14 PM, Jason Ekstrand  wrote:
> On Fri, May 25, 2018 at 11:27 AM, Matt Turner  wrote:
>>
>> On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand wrote:
>> > From: Francisco Jerez 
>>
>> I think some explanation is required. I'm guessing this is because you
>> have to write lo fragments out before high, but we should say that in
>> the commit message.
>
>
> How about this:
>
> When using multiple RT write messages to the same RT such as for dual-source
> blending or all RT writes in SIMD32, we have to set the "Last Render Target
> Select" bit on all write messages that target the last RT but only set EOT
> on the last RT write in the shader.  Special-casing for dual-source blend
> works today because that is the only case which requires multiple RT write
> messages per RT.  When we start doing SIMD32, this will become much more
> common so we add a dedicated bit for it.

Sounds good to me.

Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH] intel/blorp: Support blits and clears on surfaces with offsets

2018-05-25 Thread Jason Ekstrand
On Fri, May 25, 2018 at 12:53 PM, Kenneth Graunke wrote:

> On Friday, May 25, 2018 12:31:03 PM PDT Jason Ekstrand wrote:
> > For certain EGLImage cases, we represent a single slice or LOD of an
> > image with a byte offset to a tile and X/Y intratile offsets to the
> > given slice.  Most of i965 is fine with this but it breaks blorp.  This
> > is a terrible way to represent slices of a surface in EGL and we should
> > stop some day but that's a very scary and thorny path.  This gets blorp
> > to start working with those surfaces and fixes some dEQP EGL test bugs.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629
> > Cc: mesa-sta...@lists.freedesktop.org
> > ---
> >  src/intel/blorp/blorp.c   | 22 ++
> >  src/intel/blorp/blorp.h   |  3 +++
> >  src/intel/blorp/blorp_blit.c  |  4 +++-
> >  src/intel/blorp/blorp_clear.c |  9 +
> >  src/mesa/drivers/dri/i965/brw_blorp.c |  2 ++
> >  5 files changed, 39 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/intel/blorp/blorp.c b/src/intel/blorp/blorp.c
> > index e348caf..73f8c67 100644
> > --- a/src/intel/blorp/blorp.c
> > +++ b/src/intel/blorp/blorp.c
> > @@ -137,6 +137,28 @@ brw_blorp_surface_info_init(struct blorp_context *blorp,
> >  */
> > if (is_render_target && blorp->isl_dev->info->gen <= 6)
> >info->view.array_len = MIN2(info->view.array_len, 512);
> > +
> > +   if (surf->tile_x_sa || surf->tile_y_sa) {
> > +  /* This is only allowed on simple 2D surfaces without MSAA */
> > +  assert(info->surf.dim == ISL_SURF_DIM_2D);
> > +  assert(info->surf.samples == 1);
> > +  assert(info->surf.levels == 1);
> > +  assert(info->surf.logical_level0_px.array_len == 1);
> > +  assert(info->aux_usage == ISL_AUX_USAGE_NONE);
> > +
> > +  info->tile_x_sa = surf->tile_x_sa;
> > +  info->tile_y_sa = surf->tile_y_sa;
> > +
> > +  /* Instead of using the X/Y Offset fields in RENDER_SURFACE_STATE, we
> > +   * place the image at the tile boundary and offset our sampling or
> > +   * rendering.  For this reason, we need to grow the image by the offset
> > +   * to ensure that the hardware doesn't think we've gone past the edge.
> > +   */
> > +  info->surf.logical_level0_px.w += surf->tile_x_sa;
> > +  info->surf.logical_level0_px.h += surf->tile_y_sa;
> > +  info->surf.phys_level0_sa.w += surf->tile_x_sa;
> > +  info->surf.phys_level0_sa.h += surf->tile_y_sa;
> > +   }
> >  }
> >
> >
> > diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h
> > index f22110b..0a10ff9 100644
> > --- a/src/intel/blorp/blorp.h
> > +++ b/src/intel/blorp/blorp.h
> > @@ -114,6 +114,9 @@ struct blorp_surf
> >  * that it contains a swizzle of RGBA and resource min LOD of 0.
> >  */
> > struct blorp_address clear_color_addr;
> > +
> > +   /* Only allowed for simple 2D non-MSAA surfaces */
> > +   uint32_t tile_x_sa, tile_y_sa;
> >  };
> >
> >  void
> > diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c
> > index 67d4266..68e6d4e 100644
> > --- a/src/intel/blorp/blorp_blit.c
> > +++ b/src/intel/blorp/blorp_blit.c
> > @@ -2510,7 +2510,9 @@ blorp_copy(struct blorp_batch *batch,
> > dst_layer, ISL_FORMAT_UNSUPPORTED, true);
> >
> > struct brw_blorp_blit_prog_key wm_prog_key = {
> > -  .shader_type = BLORP_SHADER_TYPE_BLIT
> > +  .shader_type = BLORP_SHADER_TYPE_BLIT,
> > +  .need_src_offset = src_surf->tile_x_sa || src_surf->tile_y_sa,
> > +  .need_dst_offset = dst_surf->tile_x_sa || dst_surf->tile_y_sa,
> > };
> >
> > const struct isl_format_layout *src_fmtl =
> > diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
> > index 832e8ee..4d3125a 100644
> > --- a/src/intel/blorp/blorp_clear.c
> > +++ b/src/intel/blorp/blorp_clear.c
> > @@ -438,6 +438,15 @@ blorp_clear(struct blorp_batch *batch,
> >params.x1 = x1;
> >params.y1 = y1;
> >
> > +  if (params.dst.tile_x_sa || params.dst.tile_y_sa) {
> > + assert(params.dst.surf.samples == 1);
> > + assert(num_layers == 1);
> > + params.x0 += params.dst.tile_x_sa;
> > + params.y0 += params.dst.tile_y_sa;
> > + params.x1 += params.dst.tile_x_sa;
> > + params.y1 += params.dst.tile_y_sa;
> > +  }
> > +
> >/* The MinLOD and MinimumArrayElement don't work properly for cube maps.
> > * Convert them to a single slice on gen4.
> > */
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
> > index d7a2cb2..8c6d77e 100644
> > --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> > @@ -152,6 +152,8 @@ blorp_surf_for_miptree(struct brw_context *brw,
> >   .mocs = brw_get_bo_mocs(devinfo, mt->bo),
> >},
> >.aux_usage = aux_usage,
> > +  .tile_x_sa = mt->level[*level].l

[Mesa-dev] [PATCH 1/2] mesa: handle GL_UNSIGNED_INT64_ARB in _mesa_bytes_per_vertex_attrib

2018-05-25 Thread Marek Olšák
From: Marek Olšák 

Bindless texture handles can be passed via vertex attribs using this type.
This fixes a bunch of bindless piglit tests on radeonsi.

Cc: 18.0 18.1 
---
 src/mesa/main/glformats.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index cba5e670db0..667020c193c 100644
--- a/src/mesa/main/glformats.c
+++ b/src/mesa/main/glformats.c
@@ -556,20 +556,22 @@ _mesa_bytes_per_vertex_attrib(GLint comps, GLenum type)
case GL_UNSIGNED_INT_2_10_10_10_REV:
   if (comps == 4)
  return sizeof(GLuint);
   else
  return -1;
case GL_UNSIGNED_INT_10F_11F_11F_REV:
   if (comps == 3)
  return sizeof(GLuint);
   else
  return -1;
+   case GL_UNSIGNED_INT64_ARB:
+  return comps * 8;
default:
   return -1;
}
 }
 
 /**
  * Test if the given format is unsized.
  */
 GLboolean
 _mesa_is_enum_format_unsized(GLenum format)
-- 
2.17.0
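
As a standalone illustration of the size rule added here (a 64-bit attrib
type takes 8 bytes per component), a minimal sketch follows. The enum and
function names are hypothetical and only model a few types; this is not the
real _mesa_bytes_per_vertex_attrib().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical attribute types, standing in for the GL enums. */
enum ex_attrib_type {
   EX_TYPE_UINT32,
   EX_TYPE_FLOAT32,
   EX_TYPE_UINT64,    /* plays the role of GL_UNSIGNED_INT64_ARB */
   EX_TYPE_UNKNOWN,
};

int
ex_bytes_per_vertex_attrib(int comps, enum ex_attrib_type type);

/* Return the attribute size in bytes, or -1 for unsupported input. */
int
ex_bytes_per_vertex_attrib(int comps, enum ex_attrib_type type)
{
   if (comps < 1 || comps > 4)
      return -1;

   switch (type) {
   case EX_TYPE_UINT32:
      return comps * (int)sizeof(uint32_t);
   case EX_TYPE_FLOAT32:
      return comps * (int)sizeof(float);
   case EX_TYPE_UINT64:
      /* A 64-bit handle is 8 bytes per component. */
      return comps * 8;
   default:
      return -1;
   }
}
```

Before a case like EX_TYPE_UINT64 exists, the type falls through to the
default branch and is reported as unsupported.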



[Mesa-dev] [PATCH 2/2] st/mesa: handle GL_UNSIGNED_INT64_ARB in st_pipe_vertex_format

2018-05-25 Thread Marek Olšák
From: Marek Olšák 

Bindless texture handles can be passed via vertex attribs using this type.
This fixes a bunch of bindless piglit tests on radeonsi.

Cc: 18.0 18.1 
---
 src/mesa/state_tracker/st_atom_array.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/state_tracker/st_atom_array.c 
b/src/mesa/state_tracker/st_atom_array.c
index 9a0935e21a5..76dc81975c8 100644
--- a/src/mesa/state_tracker/st_atom_array.c
+++ b/src/mesa/state_tracker/st_atom_array.c
@@ -292,20 +292,23 @@ st_pipe_vertex_format(const struct gl_array_attributes 
*attrib)
   assert(size == 3 && !integer && format == GL_RGBA);
   return PIPE_FORMAT_R11G11B10_FLOAT;
 
case GL_UNSIGNED_BYTE:
   if (format == GL_BGRA) {
  /* this is an odd-ball case */
  assert(normalized);
  return PIPE_FORMAT_B8G8R8A8_UNORM;
   }
   break;
+
+   case GL_UNSIGNED_INT64_ARB:
+  return PIPE_FORMAT_R32G32_UINT;
}
 
index = integer*2 + normalized;
assert(index <= 2);
assert(type >= GL_BYTE && type <= GL_FIXED);
return vertex_formats[type - GL_BYTE][index][size-1];
 }
 
 static void init_velement(struct pipe_vertex_element *velement,
   int src_offset, int format,
-- 
2.17.0
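
Mapping a 64-bit handle to PIPE_FORMAT_R32G32_UINT means the value is fetched as two 32-bit channels. A minimal sketch of that split, assuming the low dword ends up in R and the high dword in G (as it would on a little-endian host); `split_handle` and `join_handle` are hypothetical helpers, not Mesa or Gallium functions:

```c
#include <stdint.h>

/* Split a 64-bit handle into the two 32-bit channels of an R32G32 fetch.
 * Assumption: R carries the low dword, G the high dword.
 */
void split_handle(uint64_t handle, uint32_t *r, uint32_t *g)
{
   *r = (uint32_t)(handle & 0xffffffffu);
   *g = (uint32_t)(handle >> 32);
}

/* Rebuild the 64-bit handle from the two fetched channels. */
uint64_t join_handle(uint32_t r, uint32_t g)
{
   return ((uint64_t)g << 32) | (uint64_t)r;
}
```

No bits are lost in the round trip, which is why a two-channel 32-bit format is enough to carry the handle.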



Re: [Mesa-dev] [PATCH 22/53] intel/fs: Disable SIMD32 dispatch on Gen4-6 with control flow

2018-05-25 Thread Matt Turner
On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand  wrote:
> From: Francisco Jerez 
>
> The hardware's control flow logic is 16-wide so we're out of luck
> here.  We could, in theory, support SIMD32 if we know the control-flow
> is uniform but we don't have that information at this point.

This is what the "fork" instruction is for on Gen6 :)


Re: [Mesa-dev] [PATCH 28/53] intel/fs: Fix logical FB write lowering for SIMD32

2018-05-25 Thread Matt Turner
On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand  wrote:
> From: Francisco Jerez 
>

Presumably Jason already reviewed this and just missed attaching his R-b tag.


Re: [Mesa-dev] [PATCH 30/53] intel/fs: Add the group to the flag subreg number on SNB and older

2018-05-25 Thread Matt Turner
On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand  wrote:
> We want consistent behavior in the meaning of the flag_subreg field
> between SNB and IVB+.
>
> v2 (Jason Ekstrand):
>  - Add some extra commentary
>
> Reviewed-by: Jason Ekstrand 

Presumably you did not intend to review your own patch :)


Re: [Mesa-dev] [PATCH 32/53] intel/fs: Mark LINTERP opcode as writing accumulator implicitly on pre-Gen7.

2018-05-25 Thread Matt Turner
On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand  wrote:
> From: Francisco Jerez 
>
> ---
>  src/intel/compiler/brw_shader.cpp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/compiler/brw_shader.cpp 
> b/src/intel/compiler/brw_shader.cpp
> index 141b64e..61211ef 100644
> --- a/src/intel/compiler/brw_shader.cpp
> +++ b/src/intel/compiler/brw_shader.cpp
> @@ -984,7 +984,8 @@ backend_instruction::writes_accumulator_implicitly(const 
> struct gen_device_info
> return writes_accumulator ||
>(devinfo->gen < 6 &&
> ((opcode >= BRW_OPCODE_ADD && opcode < BRW_OPCODE_NOP) ||
> -(opcode >= FS_OPCODE_DDX_COARSE && opcode <= 
> FS_OPCODE_LINTERP)));
> +(opcode >= FS_OPCODE_DDX_COARSE && opcode <= 
> FS_OPCODE_LINTERP))) ||
> +  (devinfo->gen < 7 && opcode == FS_OPCODE_LINTERP);

That's heavy-handed. Won't this prevent the scheduler from reordering
LINTERP instructions, even though we can only run into problems on
SIMD32?


Re: [Mesa-dev] [PATCH 00/53] intel/fs: SIMD32 support for fragment shaders

2018-05-25 Thread Matt Turner
On Fri, May 25, 2018 at 11:50 AM, Matt Turner  wrote:
> On Thu, May 24, 2018 at 2:55 PM, Jason Ekstrand  wrote:
>> This patch series adds back-end compiler support for SIMD32 fragment
>> shaders.  Support is added and everything works but it's currently hidden
>> behind INTEL_DEBUG=do32.  We know that it improves performance in some
>> cases but we do not yet have a good enough heuristic to start turning it on
>> by default.  The objective of this series is just to get the compiler
>> infrastructure landed so that it stops bit-rotting in Curro's branch.
>> Figuring out a good heuristic is left as an exercise to the reader. :-)
>
> 1-6, 8-20 are
>
> Reviewed-by: Matt Turner 

7, 22-31 are too.


Re: [Mesa-dev] [PATCH 09/13] i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear

2018-05-25 Thread Chris Wilson
Quoting Scott D Phillips (2018-04-30 18:25:48)
> +#if defined(USE_SSE41)
> +static ALWAYS_INLINE void *
> +_memcpy_streaming_load(void *dest, const void *src, size_t count)
> +{
> +   if (count == 16) {
> +  __m128i val = _mm_stream_load_si128((__m128i *)src);
> +  _mm_store_si128((__m128i *)dest, val);
> +  return dest;
> +   } else if (count == 64) {
> +  __m128i val0 = _mm_stream_load_si128(((__m128i *)src) + 0);
> +  __m128i val1 = _mm_stream_load_si128(((__m128i *)src) + 1);
> +  __m128i val2 = _mm_stream_load_si128(((__m128i *)src) + 2);
> +  __m128i val3 = _mm_stream_load_si128(((__m128i *)src) + 3);
> +  _mm_store_si128(((__m128i *)dest) + 0, val0);
> +  _mm_store_si128(((__m128i *)dest) + 1, val1);
> +  _mm_store_si128(((__m128i *)dest) + 2, val2);
> +  _mm_store_si128(((__m128i *)dest) + 3, val3);
> +  return dest;

I didn't spot this before, but we use this to copy from an aligned
(tiled) source to an unaligned user buffer.

s/_mm_store_si128/_mm_storeu_si128/
   ^ very important :)
-Chris
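
The aligned-load / unaligned-store pairing can be sketched on its own. This is an illustration under assumptions, not the patch: it uses the plain SSE2 aligned load _mm_load_si128 in place of the SSE4.1 streaming load so that it builds without -msse4.1, it falls back to memcpy where SSE2 is unavailable, and `copy16_aligned_to_unaligned` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Copy 16 bytes from a 16-byte-aligned source to a destination with
 * arbitrary alignment (e.g. a user-supplied pixel buffer).
 */
void copy16_aligned_to_unaligned(void *dest, const void *src)
{
   /* The source is assumed to be aligned, like a tiled mapping. */
   assert(((uintptr_t)src & 15) == 0);

#if defined(__SSE2__)
   __m128i val = _mm_load_si128((const __m128i *)src);
   /* storeu has no alignment requirement; _mm_store_si128 would fault
    * on a destination that is not 16-byte aligned.
    */
   _mm_storeu_si128((__m128i *)dest, val);
#else
   memcpy(dest, src, 16);
#endif
}
```

Only the store side needs the unaligned variant, since the alignment guarantee holds for the source alone.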


[Mesa-dev] [PATCH 4/4] i965: Support packing for intel_tiled_memcpy paths

2018-05-25 Thread Chris Wilson
intel_tiled_memcpy is not restricted to using the same pitch on both the
src/dst buffers, nor does it require row alignment on the user buffer. To
support arbitrary packing modes, all we need to do is use the core
functions to compute the pixel locations.
---
 src/mesa/drivers/dri/i965/intel_pixel_read.c |  6 ++
 src/mesa/drivers/dri/i965/intel_tex_image.c  | 17 +++--
 2 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c 
b/src/mesa/drivers/dri/i965/intel_pixel_read.c
index 57df1178417..e697f63d973 100644
--- a/src/mesa/drivers/dri/i965/intel_pixel_read.c
+++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c
@@ -93,10 +93,6 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx,
 */
if (pixels == NULL ||
_mesa_is_bufferobj(pack->BufferObj) ||
-   pack->Alignment > 4 ||
-   pack->SkipPixels > 0 ||
-   pack->SkipRows > 0 ||
-   (pack->RowLength != 0 && pack->RowLength != width) ||
pack->SwapBytes ||
pack->LsbFirst ||
pack->Invert)
@@ -160,6 +156,8 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx,
xoffset += slice_offset_x;
yoffset += slice_offset_y;
 
+   pixels = _mesa_image_address(2, pack, pixels, width, height,
+format, type, 0, 0, 0);
dst_pitch = _mesa_image_row_stride(pack, width, format, type);
 
/* For a window-system renderbuffer, the buffer is actually flipped
diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 5afc8d99462..ebfd6fdd7d4 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -200,10 +200,6 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
  texImage->TexObject->Target == GL_TEXTURE_RECTANGLE) ||
pixels == NULL ||
_mesa_is_bufferobj(packing->BufferObj) ||
-   packing->Alignment > 4 ||
-   packing->SkipPixels > 0 ||
-   packing->SkipRows > 0 ||
-   (packing->RowLength != 0 && packing->RowLength != width) ||
packing->SwapBytes ||
packing->LsbFirst ||
packing->Invert)
@@ -244,14 +240,13 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
if (devinfo->gen < 5 && brw->has_swizzling)
   return false;
 
-   int level = texImage->Level + texImage->TexObject->MinLevel;
-
/* Since we are going to write raw data to the miptree, we need to resolve
 * any pending fast color clears before we start.
 */
assert(image->mt->surf.logical_level0_px.depth == 1);
assert(image->mt->surf.logical_level0_px.array_len == 1);
 
+   int level = texImage->Level + texImage->TexObject->MinLevel;
intel_miptree_access_raw(brw, image->mt, level, 0, true);
 
struct brw_bo *bo = image->mt->bo;
@@ -286,6 +281,8 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
xoffset += level_x;
yoffset += level_y;
 
+   pixels = _mesa_image_address(dims, packing, pixels, width, height,
+   format, type, 0, 0, 0);
uint32_t cpp = _mesa_get_format_bytes(texImage->TexFormat);
 
linear_to_tiled(
@@ -704,10 +701,6 @@ intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,
  texImage->TexObject->Target == GL_TEXTURE_RECTANGLE) ||
pixels == NULL ||
_mesa_is_bufferobj(packing->BufferObj) ||
-   packing->Alignment > 4 ||
-   packing->SkipPixels > 0 ||
-   packing->SkipRows > 0 ||
-   (packing->RowLength != 0 && packing->RowLength != width) ||
packing->SwapBytes ||
packing->LsbFirst ||
packing->Invert)
@@ -780,6 +773,10 @@ intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,
xoffset += level_x;
yoffset += level_y;
 
+   int dims = _mesa_get_texture_dimensions(texImage->TexObject->Target);
+   pixels = _mesa_image_address(dims, packing, pixels, width, height,
+format, type, 0, 0, 0);
+
uint32_t cpp = _mesa_get_format_bytes(texImage->TexFormat);
tiled_to_linear(
   xoffset * cpp, (xoffset + width) * cpp,
-- 
2.17.0
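
How the packing parameters turn into a row stride and a starting offset can be sketched as follows. This is a simplified model for byte-sized 2D pixel data, not the code of _mesa_image_row_stride or _mesa_image_address: `pack_params`, `row_stride` and `first_pixel_offset` are hypothetical names, and bitmaps, 3D images and the Invert flag are ignored.

```c
#include <stddef.h>

/* Hypothetical subset of the GL pixel-store state. */
struct pack_params {
   int alignment;    /* 1, 2, 4 or 8 */
   int row_length;   /* 0 means "use the image width" */
   int skip_pixels;
   int skip_rows;
};

/* Bytes from the start of one row to the start of the next. */
size_t row_stride(const struct pack_params *p, int width, int bytes_per_pixel)
{
   int pixels_per_row = p->row_length > 0 ? p->row_length : width;
   size_t bytes = (size_t)pixels_per_row * (size_t)bytes_per_pixel;
   size_t a = (size_t)p->alignment;

   /* Round the row size up to a multiple of the alignment. */
   return (bytes + a - 1) / a * a;
}

/* Byte offset of pixel (0, 0) of the transfer inside the user buffer. */
size_t first_pixel_offset(const struct pack_params *p, int width,
                          int bytes_per_pixel)
{
   return (size_t)p->skip_rows * row_stride(p, width, bytes_per_pixel) +
          (size_t)p->skip_pixels * (size_t)bytes_per_pixel;
}
```

Once the base pointer is advanced by that offset and the stride is passed as the linear pitch, the copy itself does not need to know about the packing state.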



[Mesa-dev] [PATCH 3/4] i965: Enable fast detiling paths for !llc

2018-05-25 Thread Chris Wilson
Now that we have enabled cache-line-at-a-time transfers to and from GPU
memory, we can accelerate access into !llc (WC) memory just as well as
WB memory with llc.
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c |  2 +-
 src/mesa/drivers/dri/i965/intel_pixel_read.c   |  5 ++---
 src/mesa/drivers/dri/i965/intel_tex_image.c| 12 ++--
 src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 17 -
 src/mesa/drivers/dri/i965/intel_tiled_memcpy.h |  3 ++-
 5 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 66828f319be..fd9e8c49b13 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -925,7 +925,7 @@ can_map_cpu(struct brw_bo *bo, unsigned flags)
 * the GPU for blits or other operations, causing batches to happen at
 * inconvenient times.
 */
-   if (flags & (MAP_PERSISTENT | MAP_COHERENT | MAP_ASYNC))
+   if (flags & (MAP_PERSISTENT | MAP_COHERENT | MAP_ASYNC | MAP_RAW))
   return false;
 
return !(flags & MAP_WRITE);
diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c 
b/src/mesa/drivers/dri/i965/intel_pixel_read.c
index a545d215ad6..57df1178417 100644
--- a/src/mesa/drivers/dri/i965/intel_pixel_read.c
+++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c
@@ -91,8 +91,7 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx,
 * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to support
 * more types.
 */
-   if (!devinfo->has_llc ||
-   pixels == NULL ||
+   if (pixels == NULL ||
_mesa_is_bufferobj(pack->BufferObj) ||
pack->Alignment > 4 ||
pack->SkipPixels > 0 ||
@@ -115,7 +114,7 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx,
   return false;
 
mem_copy_fn mem_copy =
-  intel_get_memcpy(rb->Format, format, type, INTEL_DOWNLOAD);
+  intel_get_memcpy(rb->Format, format, type, INTEL_DOWNLOAD, devinfo);
if (mem_copy == NULL)
   return false;
 
diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
b/src/mesa/drivers/dri/i965/intel_tex_image.c
index de8832812c1..5afc8d99462 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -196,8 +196,7 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
 * with _mesa_image_row_stride. However, before removing the restrictions
 * we need tests.
 */
-   if (!devinfo->has_llc ||
-   !(texImage->TexObject->Target == GL_TEXTURE_2D ||
+   if (!(texImage->TexObject->Target == GL_TEXTURE_2D ||
  texImage->TexObject->Target == GL_TEXTURE_RECTANGLE) ||
pixels == NULL ||
_mesa_is_bufferobj(packing->BufferObj) ||
@@ -218,7 +217,8 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
   return false;
 
mem_copy_fn mem_copy =
-  intel_get_memcpy(texImage->TexFormat, format, type, INTEL_UPLOAD);
+  intel_get_memcpy(texImage->TexFormat, format, type,
+   INTEL_UPLOAD, devinfo);
if (mem_copy == NULL)
   return false;
 
@@ -700,8 +700,7 @@ intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,
 * with _mesa_image_row_stride. However, before removing the restrictions
 * we need tests.
 */
-   if (!devinfo->has_llc ||
-   !(texImage->TexObject->Target == GL_TEXTURE_2D ||
+   if (!(texImage->TexObject->Target == GL_TEXTURE_2D ||
  texImage->TexObject->Target == GL_TEXTURE_RECTANGLE) ||
pixels == NULL ||
_mesa_is_bufferobj(packing->BufferObj) ||
@@ -715,7 +714,8 @@ intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,
   return false;
 
mem_copy_fn mem_copy =
-  intel_get_memcpy(texImage->TexFormat, format, type, INTEL_DOWNLOAD);
+  intel_get_memcpy(texImage->TexFormat, format, type,
+   INTEL_DOWNLOAD, devinfo);
if (mem_copy == NULL)
   return false;
 
diff --git a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c 
b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
index abe0f804f37..ae4144904f6 100644
--- a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
+++ b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
@@ -1004,11 +1004,18 @@ tiled_to_linear(uint32_t xt1, uint32_t xt2,
  */
 mem_copy_fn intel_get_memcpy(mesa_format tiledFormat,
  GLenum format, GLenum type,
- enum intel_memcpy_direction direction)
+ enum intel_memcpy_direction direction,
+ const struct gen_device_info *devinfo)
 {
mesa_format user_format;
mem_copy_fn fn = NULL;
 
+   /* movntdqa support is required for fast reads */
+#if !defined(USE_SSE41)
+   if (direction == INTEL_DOWNLOAD && !devinfo->has_llc)
+  return NULL;
+#endif
+
if (type == GL_BITMAP)
   return NULL;
 
@@ -1066,5 +1073,13 @@ mem_copy_fn intel_get_memcpy(mesa_format tiledFormat,
   break;
}
 
+   /* Only the default 

[Mesa-dev] i965: Enable fast detiling paths for !llc

2018-05-25 Thread Chris Wilson
Just a small series to put the new cache-line read back to good use for
ye olde Xorg on bxt (and older/newer with very similar effect).

From

  4 trep @   0.7007 msec (  1430.0/sec): ShmPutImage 500x500 square
   4000 trep @   9.0367 msec (   111.0/sec): ShmGetImage 500x500 square

to

  6 trep @   0.5084 msec (  1970.0/sec): ShmPutImage 500x500 square
  12000 trep @   2.4808 msec (   403.0/sec): ShmGetImage 500x500 square
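
The ratios implied by the per-second rates above can be worked out with a small helper. `speedup` is a hypothetical function for illustration; the only inputs are the rates quoted in this mail.

```c
/* Ratio of the new rate to the old rate (operations per second). */
double speedup(double before_per_sec, double after_per_sec)
{
   return after_per_sec / before_per_sec;
}
```

Calling speedup(1430.0, 1970.0) for ShmPutImage gives roughly 1.4x, and speedup(111.0, 403.0) for ShmGetImage gives roughly 3.6x.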





[Mesa-dev] [PATCH 2/4] i965: Push the format checks to intel_tiled_memcpy

2018-05-25 Thread Chris Wilson
Allow the tiled_memcpy backend to determine if it is able to copy
between the source and destination pixel buffer. This allows us to
eliminate some duplication in the callers, and permits us to be more
flexible in checking for compatible formats.

(Hmm, is sRGB handling right?)
---
 src/mesa/drivers/dri/i965/intel_pixel_read.c  |  16 +--
 src/mesa/drivers/dri/i965/intel_tex_image.c   |  46 +++-
 .../drivers/dri/i965/intel_tiled_memcpy.c | 108 +++---
 .../drivers/dri/i965/intel_tiled_memcpy.h |  17 ++-
 4 files changed, 102 insertions(+), 85 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c 
b/src/mesa/drivers/dri/i965/intel_pixel_read.c
index 6ed7895bc76..a545d215ad6 100644
--- a/src/mesa/drivers/dri/i965/intel_pixel_read.c
+++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c
@@ -86,15 +86,12 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx,
/* The miptree's buffer. */
struct brw_bo *bo;
 
-   uint32_t cpp;
-   mem_copy_fn mem_copy = NULL;
 
/* This fastpath is restricted to specific renderbuffer types:
 * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to support
 * more types.
 */
if (!devinfo->has_llc ||
-   !(type == GL_UNSIGNED_BYTE || type == GL_UNSIGNED_INT_8_8_8_8_REV) ||
pixels == NULL ||
_mesa_is_bufferobj(pack->BufferObj) ||
pack->Alignment > 4 ||
@@ -117,15 +114,9 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx,
if (rb->NumSamples > 1)
   return false;
 
-   /* We can't handle copying from RGBX or BGRX because the tiled_memcpy
-* function doesn't set the last channel to 1. Note this checks BaseFormat
-* rather than TexFormat in case the RGBX format is being simulated with an
-* RGBA format.
-*/
-   if (rb->_BaseFormat == GL_RGB)
-  return false;
-
-   if (!intel_get_memcpy(rb->Format, format, type, &mem_copy, &cpp))
+   mem_copy_fn mem_copy =
+  intel_get_memcpy(rb->Format, format, type, INTEL_DOWNLOAD);
+   if (mem_copy == NULL)
   return false;
 
if (!irb->mt ||
@@ -198,6 +189,7 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx,
pack->Alignment, pack->RowLength, pack->SkipPixels,
pack->SkipRows);
 
+   uint32_t cpp = _mesa_get_format_bytes(rb->Format);
tiled_to_linear(
   xoffset * cpp, (xoffset + width) * cpp,
   yoffset, yoffset + height,
diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
b/src/mesa/drivers/dri/i965/intel_tex_image.c
index fae179214dd..de8832812c1 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -186,13 +186,6 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
struct brw_context *brw = brw_context(ctx);
const struct gen_device_info *devinfo = &brw->screen->devinfo;
struct intel_texture_image *image = intel_texture_image(texImage);
-   int src_pitch;
-
-   /* The miptree's buffer. */
-   struct brw_bo *bo;
-
-   uint32_t cpp;
-   mem_copy_fn mem_copy = NULL;
 
/* This fastpath is restricted to specific texture types:
 * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to support
@@ -204,7 +197,6 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
 * we need tests.
 */
if (!devinfo->has_llc ||
-   !(type == GL_UNSIGNED_BYTE || type == GL_UNSIGNED_INT_8_8_8_8_REV) ||
!(texImage->TexObject->Target == GL_TEXTURE_2D ||
  texImage->TexObject->Target == GL_TEXTURE_RECTANGLE) ||
pixels == NULL ||
@@ -222,7 +214,12 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
if (ctx->_ImageTransferState)
   return false;
 
-   if (!intel_get_memcpy(texImage->TexFormat, format, type, &mem_copy, &cpp))
+   if (format == GL_COLOR_INDEX)
+  return false;
+
+   mem_copy_fn mem_copy =
+  intel_get_memcpy(texImage->TexFormat, format, type, INTEL_UPLOAD);
+   if (mem_copy == NULL)
   return false;
 
/* If this is a nontrivial texture view, let another path handle it 
instead. */
@@ -257,7 +254,7 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
 
intel_miptree_access_raw(brw, image->mt, level, 0, true);
 
-   bo = image->mt->bo;
+   struct brw_bo *bo = image->mt->bo;
 
if (brw_batch_references(&brw->batch, bo)) {
   perf_debug("Flushing before mapping a referenced bo.\n");
@@ -270,7 +267,7 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
   return false;
}
 
-   src_pitch = _mesa_image_row_stride(packing, width, format, type);
+   int src_pitch = _mesa_image_row_stride(packing, width, format, type);
 
/* We postponed printing this message until having committed to executing
 * the function.
@@ -289,6 +286,8 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
xoffset += level_x;
yoffset += level_y;
 
+   uint32_t cpp = _mesa_get_format_bytes(texImage->TexFormat);
+
linear_to_tiled(
   xoffset * cpp, (xoffset + width) * cpp,
   yoffset, yoffset + hei

[Mesa-dev] [PATCH 1/4] i965: Fix streaming loads for intel_tiled_memcpy

2018-05-25 Thread Chris Wilson
We stream from a tiled and aligned source into an unaligned user buffer,
so we need to use _mm_storeu_si128.
---
 src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c 
b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
index fac5427d2ed..6440dceac36 100644
--- a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
+++ b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
@@ -223,17 +223,17 @@ _memcpy_streaming_load(void *dest, const void *src, 
size_t count)
 {
if (count == 16) {
   __m128i val = _mm_stream_load_si128((__m128i *)src);
-  _mm_store_si128((__m128i *)dest, val);
+  _mm_storeu_si128((__m128i *)dest, val);
   return dest;
} else if (count == 64) {
   __m128i val0 = _mm_stream_load_si128(((__m128i *)src) + 0);
   __m128i val1 = _mm_stream_load_si128(((__m128i *)src) + 1);
   __m128i val2 = _mm_stream_load_si128(((__m128i *)src) + 2);
   __m128i val3 = _mm_stream_load_si128(((__m128i *)src) + 3);
-  _mm_store_si128(((__m128i *)dest) + 0, val0);
-  _mm_store_si128(((__m128i *)dest) + 1, val1);
-  _mm_store_si128(((__m128i *)dest) + 2, val2);
-  _mm_store_si128(((__m128i *)dest) + 3, val3);
+  _mm_storeu_si128(((__m128i *)dest) + 0, val0);
+  _mm_storeu_si128(((__m128i *)dest) + 1, val1);
+  _mm_storeu_si128(((__m128i *)dest) + 2, val2);
+  _mm_storeu_si128(((__m128i *)dest) + 3, val3);
   return dest;
} else {
   assert(count < 64); /* and (count < 16) for ytiled */
-- 
2.17.0



Re: [Mesa-dev] Gitlab migration

2018-05-25 Thread Mark Janes
Daniel Stone  writes:
> We had a go at using Jenkins for some of this: Intel's been really
> quite successful at doing it internally, but our community efforts
> have been a miserable failure. After a few years I've concluded that
> it's not going to change - even with Jenkins 2.0.
>
> Firstly, Jenkins configuration is an absolute dumpster fire. Working
> out how to configure it and create the right kind of jobs (and debug
> it!) is surprisingly difficult, and involves a lot of clicking through
> the web UI, or using external tools like jenkins-job-builder which
> seem to be in varying levels of disrepair. If you have dedicated 'QA
> people' whose job is driving Jenkins for you, then great! Jenkins will
> probably work well for you. This doesn't scale to a community model
> though. Especially when people have different usecases and need to
> install different plugins.
>
> Jenkins security is also a tyre fire. Plugins are again in varying
> levels of disrepair, and seem remarkably prone to CVEs. There's no
> real good model for updating plugins (and doing so is super fragile).
> Worse still, Jenkins 2.0 really pushes you to be writing scripts in
> Groovy, which can affect Jenkins in totally arbitrary ways, and
> subvert the security model entirely. The way upstream deals with this
> is to enforce a 'sandbox' model preventing most scripts from doing
> anything useful unless manually audited and approved by an admin.
> Again, this is fine for companies or small teams where you trust
> people to not screw up, but doesn't scale to something like fd.o.
>
> Adding to these is the permission model, which again requires painful
> configuration and a lot of admin clicking. It doesn't integrate well
> with external services, and granularity is mostly at an instance
> rather than a project level: again not suitable for something like
> fd.o.
>
> From the UI and workflow perspective, something I've never liked is
> that the first-order view is very specific pipelines, e.g. 'Mesa
> master build', 'daily Piglit run', etc etc. If all you care about is
> master, then this is fine. You _can_ make those pipelines run against
> arbitrary branches and commits you pick up from MRs or similar, but
> you really are trying to jam it sideways into the UI it wants to
> present. Again this is so deeply baked into how Jenkins works that I
> don't see it as really being fixable.
>
> I have a pile of other gripes, like how difficult their remote API is
> to use, and the horrible race conditions it has. For instance, when
> you schedule a run of a particular job, it doesn't report the run ID
> back to you: you have to poll the last job number before you submit,
> then poll again for a few seconds to find the next run ID. Good luck
> to you if two runs of the same job (e.g. 'build specific Mesa commit')
> get scheduled at the same time.

I agree with some of your Jenkins critiques.  I have implemented CI on
*many* different frameworks over the past 15 years, and I think that
every implementation has its fans and haters.

It is wise to create automation which is mostly independent of the CI
framework.  Mesa i965 CI could immediately switch from Jenkins to
BuildBot or GitLab, if there was a reason to do so.  It may be that
GitLab is superior to Jenkins by now, but the selection of the CI
framework is a minor detail anyways.

CI frameworks are often based on build/test pipelines, which I think is
exactly the wrong concept for the domain.  Flexible CI is best thought
of as a multiplatform `make` system.  Setting up a "pipeline" is similar
to building your project with a shell script instead of a makefile.

I disagree with your critique of the Jenkins remote API.  It is more
flexible than any other API that I have seen for CI.  We implement our
multiplatform-make system on top of it.  It would be nice to have an ID
returned when triggering a job, but you can work around by including a
GUID as a build parameter, then polling for the GUID.

The reasons I chose Jenkins over what was available at the time:

  - job/system configuration is saved as XML for backup/diff/restore
  - huge number of users -> fewer quality issues

> GitLab CI fixes all of these things. Pipelines are strongly and
> directly correlated with commits in repositories, though you can also
> trigger them manually or on a schedule. Permissions are that of the
> repository, and just like Travis, people can fork and work on CI
> improvements in their own sandbox without impacting anything else. The
> job configuration is in relatively clean YAML, and it strongly
> suggests idiomatic form rather than a forest of thousands of
> unmaintained plugins.
>
> Jobs get run in clean containers, rather than special unicorn workers
> pre-configured just so, meaning that the builds are totally
> reproducible locally and you can use whatever build dependencies you
> want without having to bug the admins to install LLVM in some
> particular chroot. Those containers can be stored in a registry
> 

Re: [Mesa-dev] [PATCH 1/4] i965: Fix streaming loads for intel_tiled_memcpy

2018-05-25 Thread Kenneth Graunke
On Friday, May 25, 2018 4:33:56 PM PDT Chris Wilson wrote:
> We stream from a tiled and aligned source into an unaligned user buffer,
> so we need to use _mm_storeu_si128.
> ---
>  src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c 
> b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> index fac5427d2ed..6440dceac36 100644
> --- a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> +++ b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> @@ -223,17 +223,17 @@ _memcpy_streaming_load(void *dest, const void *src, 
> size_t count)
>  {
> if (count == 16) {
>__m128i val = _mm_stream_load_si128((__m128i *)src);
> -  _mm_store_si128((__m128i *)dest, val);
> +  _mm_storeu_si128((__m128i *)dest, val);
>return dest;
> } else if (count == 64) {
>__m128i val0 = _mm_stream_load_si128(((__m128i *)src) + 0);
>__m128i val1 = _mm_stream_load_si128(((__m128i *)src) + 1);
>__m128i val2 = _mm_stream_load_si128(((__m128i *)src) + 2);
>__m128i val3 = _mm_stream_load_si128(((__m128i *)src) + 3);
> -  _mm_store_si128(((__m128i *)dest) + 0, val0);
> -  _mm_store_si128(((__m128i *)dest) + 1, val1);
> -  _mm_store_si128(((__m128i *)dest) + 2, val2);
> -  _mm_store_si128(((__m128i *)dest) + 3, val3);
> +  _mm_storeu_si128(((__m128i *)dest) + 0, val0);
> +  _mm_storeu_si128(((__m128i *)dest) + 1, val1);
> +  _mm_storeu_si128(((__m128i *)dest) + 2, val2);
> +  _mm_storeu_si128(((__m128i *)dest) + 3, val3);
>return dest;
> } else {
>assert(count < 64); /* and (count < 16) for ytiled */
> 

Fixes: d21c086d819d78fb3f6abcbb14aa492970f442aa (i965/tiled_memcpy: inline 
movntdqa loads in tiled_to_linear)
Reviewed-by: Kenneth Graunke 




[Mesa-dev] [Bug 98581] Dota 2 graphics glitch on autocast abilities.

2018-05-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98581

ros...@gmail.com changed:

   What|Removed |Added

 Resolution|WORKSFORME  |FIXED

--- Comment #2 from ros...@gmail.com ---
That is correct. Issue has long since been fixed.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 28/53] intel/fs: Fix logical FB write lowering for SIMD32

2018-05-25 Thread Jason Ekstrand

On May 25, 2018 15:23:21 Matt Turner  wrote:
> On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand  wrote:
>> From: Francisco Jerez 
>>
>
> Presumably Jason already reviewed this and just missed attaching his R-b tag.

Some of these patches have somewhat confusing authorship.  I didn't add my 
R-b because I sort-of half-wrote this patch. I thought about changing the 
particular author to me but it was a toss up.  In any case, a third pair of 
eyes was needed.





Re: [Mesa-dev] [PATCH 02/16] docs: Add python script that converts html to rst.

2018-05-25 Thread Laura Ekstrand
I specifically tried forcing a rename earlier, but it doesn't work.  Git
sees too much change.  The only way I could get it to work was manually
renaming the HTML files to rst first, then committing, then converting to
rst.

The problem with that strategy is that then the Pandoc command for
converting to rst doesn't make sense.  (.rst to .rst? What?)

Laura

On Fri, May 25, 2018, 4:26 AM Eric Engestrom 
wrote:

> On Thursday, 2018-05-24 17:27:05 -0700, Laura Ekstrand wrote:
> > Use Beautiful Soup to fix bad html, then use pandoc for converting to
> > rst.
> > ---
> >  docs/rstConverter.py | 23 +++
> >  1 file changed, 23 insertions(+)
> >  create mode 100755 docs/rstConverter.py
> >
> > diff --git a/docs/rstConverter.py b/docs/rstConverter.py
> > new file mode 100755
> > index 00..5321fdde8b
> > --- /dev/null
> > +++ b/docs/rstConverter.py
> > @@ -0,0 +1,23 @@
> > +#!/usr/bin/python3
> > +import glob
> > +import subprocess
> > +from bs4 import BeautifulSoup
> > +
> > +pages = glob.glob("*.html")
> > +pages += glob.glob("relnotes/*.html")
> > +for filename in pages:
> > +    # Fix some annoyingly bad html.
> > +    with open(filename) as f:
> > +        soup = BeautifulSoup(f, 'html5lib')
> > +    soup.find("div", "header").extract() # Get rid of old header
> > +    soup.iframe.extract() # Get rid of old contents bar.
> > +    soup.find("div", "content").unwrap() # Strip the content div.
>
> Good call on using beautifulsoup to clean the html before converting it!
>
> > +
> > +    # Write out the better html.
> > +    with open(filename, 'wt') as f:
> > +        f.write(str(soup))
> > +
> > +    # Convert to rst with pandoc.
> > +    name = filename.split(".html")[0]
> > +    bashCmd = "pandoc " + filename + " -o " + name + ".rst"
> > +    subprocess.run(bashCmd.split())
>
> Idea: remove the old html at the same time as we introduce the rst
> (commit-wise), so that git picks it up as a rename with changes, which
> hopefully would be easier to check as a 1:1 of any given conversion?
>
> (In case this is as unclear as I think it is, I'm thinking about how we
> can review individual pages conversions; say index.html -> index.rst, to
> see that no release has been dropped in the process. If git shows this
> as a rename with changes, I expect it will be easier to check than if
> one commit creates all the rst files and another deletes all the html)
>
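
Why Git gives up on the rename can be illustrated with a rough sketch.  Git
only pairs a deleted file with an added one when their contents are similar
enough (50% by default, adjustable with -M), and a converted page shares
almost no lines with its HTML source.  The snippet below uses Python's
difflib as a stand-in for Git's own similarity metric, which is computed
differently, and the two sample pages are made up for illustration.

```python
import difflib


def line_similarity(old_lines, new_lines):
    """Fraction of matching lines between two files, in the range 0.0 to 1.0."""
    return difflib.SequenceMatcher(None, old_lines, new_lines).ratio()


# Hypothetical page before and after conversion.
html_page = [
    "<h1>News</h1>",
    "<p>Mesa 18.1.0 is released.</p>",
]
rst_page = [
    "News",
    "====",
    "",
    "Mesa 18.1.0 is released.",
]

RENAME_THRESHOLD = 0.5  # Git's default similarity index for -M

looks_like_rename = line_similarity(html_page, rst_page) >= RENAME_THRESHOLD
```

Here no line survives the conversion unchanged, so looks_like_rename is
False, which matches the behaviour described above: the markup change touches
every line even though the text is the same.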


Re: [Mesa-dev] [PATCH 30/53] intel/fs: Add the group to the flag subreg number on SNB and older

2018-05-25 Thread Jason Ekstrand

On May 25, 2018 15:24:53 Matt Turner  wrote:

> On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand  wrote:
>> We want consistent behavior in the meaning of the flag_subreg field
>> between SNB and IVB+.
>>
>> v2 (Jason Ekstrand):
>> - Add some extra commentary
>>
>> Reviewed-by: Jason Ekstrand
>
> Presumably you did not intend to review your own patch :)


My patch? Curro's patch? It gets kind of hard to tell in this series. :-). 
This particular one is a single line plucked out of a Curro patch the rest 
of which landed some time ago.  I thought about leaving him as author.  
Maybe I should switch it back?





Re: [Mesa-dev] [PATCH 22/53] intel/fs: Disable SIMD32 dispatch on Gen4-6 with control flow

2018-05-25 Thread Jason Ekstrand

On May 25, 2018 15:19:25 Matt Turner  wrote:

> On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand  wrote:
>> From: Francisco Jerez
>>
>> The hardware's control flow logic is 16-wide so we're out of luck
>> here.  We could, in theory, support SIMD32 if we know the control-flow
>> is uniform but we don't have that information at this point.
>
> This is what the "fork" instruction is for on Gen6 :)


Yeah, Curro pointed that out too...





Re: [Mesa-dev] [PATCH 32/53] intel/fs: Mark LINTERP opcode as writing accumulator implicitly on pre-Gen7.

2018-05-25 Thread Jason Ekstrand

On May 25, 2018 15:28:22 Matt Turner  wrote:

> On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand  wrote:
>> From: Francisco Jerez
>>
>> ---
>> src/intel/compiler/brw_shader.cpp | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp
>> index 141b64e..61211ef 100644
>> --- a/src/intel/compiler/brw_shader.cpp
>> +++ b/src/intel/compiler/brw_shader.cpp
>> @@ -984,7 +984,8 @@ backend_instruction::writes_accumulator_implicitly(const struct gen_device_info
>>     return writes_accumulator ||
>>            (devinfo->gen < 6 &&
>>             ((opcode >= BRW_OPCODE_ADD && opcode < BRW_OPCODE_NOP) ||
>> -            (opcode >= FS_OPCODE_DDX_COARSE && opcode <= FS_OPCODE_LINTERP)));
>> +            (opcode >= FS_OPCODE_DDX_COARSE && opcode <= FS_OPCODE_LINTERP))) ||
>> +          (devinfo->gen < 7 && opcode == FS_OPCODE_LINTERP);
>
> That's heavy-handed. Won't this prevent the scheduler from reordering
> LINTERP instructions, even though we can only run into problems on
> SIMD32?


As long as none of them declare that they read it, re-ordering should be 
fine.  If we don't do this, the compiler may move a LINTERP between a write
and a read of the accumulator emitted for some other reason.  That said,
this reminds me that we should probably back-port a patch that declares 
that they write the accumulator on gen11+ too.
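
The hazard can be sketched with a toy model.  This is only an illustration
under simplifying assumptions, not the scheduler's actual dependency
tracking: an instruction is a (name, writes_acc, reads_acc) tuple, and a
reordering is considered safe for the accumulator as long as every reader
still observes the same writer.  The instruction names are hypothetical.

```python
def acc_sources(seq):
    """Map each accumulator reader to the writer whose value it observes."""
    last_writer = None
    sources = {}
    for name, writes_acc, reads_acc in seq:
        if reads_acc:
            # The read happens before this instruction's own write, if any.
            sources[name] = last_writer
        if writes_acc:
            last_writer = name
    return sources


# Hypothetical instructions: a write/read pair plus two LINTERPs that only
# trash the accumulator and never read it.
mul_acc = ("mul_acc", True, False)
mach = ("mach", False, True)
linterp_a = ("linterp_a", True, False)
linterp_b = ("linterp_b", True, False)

original = [linterp_a, linterp_b, mul_acc, mach]
linterps_swapped = [linterp_b, linterp_a, mul_acc, mach]
linterp_in_between = [mul_acc, linterp_a, mach]
```

Swapping the two LINTERPs leaves mach reading the value written by mul_acc,
so that reordering is harmless.  Moving a LINTERP between mul_acc and mach
makes mach read the LINTERP's leftover value instead, which is the case the
scheduler can only avoid if LINTERP is declared as an accumulator writer.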





Re: [Mesa-dev] [PATCH 22/53] intel/fs: Disable SIMD32 dispatch on Gen4-6 with control flow

2018-05-25 Thread Francisco Jerez
Jason Ekstrand  writes:

> On May 25, 2018 15:19:25 Matt Turner  wrote:
>
>> On Thu, May 24, 2018 at 2:56 PM, Jason Ekstrand  wrote:
>>> From: Francisco Jerez 
>>>
>>> The hardware's control flow logic is 16-wide so we're out of luck
>>> here.  We could, in theory, support SIMD32 if we know the control-flow
>>> is uniform but we don't have that information at this point.
>>
>> This is what the "fork" instruction is for on Gen6 :)
>
> Yeah, Curro pointed that out too...
>
>

The main problem with the fork instruction is that it prevents the
compiler from interleaving code from the low and high channel groups
within control flow, which largely defeats the purpose of SIMD32 of
amortizing instruction latency costs.  The other problem is that it
would involve substantial effort and it is... well... SNB-specific,
earlier platforms still won't get support for non-uniform control flow
in SIMD32, and newer platforms don't need it.  Probably not worth the
effort...
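
The latency argument can be made concrete with a toy model.  It assumes
in-order issue of one instruction per cycle and a fixed result latency, and
it treats the low and high channel groups as two independent dependency
chains; none of these numbers describe real hardware.

```python
def total_cycles(order, latency):
    """Cycles until the last result is ready for a given issue order.

    `order` lists which chain each instruction belongs to; every instruction
    depends on the previous instruction of the same chain.
    """
    last_issue = -1
    ready = {}  # chain -> cycle at which its next input becomes available
    for chain in order:
        issue = max(last_issue + 1, ready.get(chain, 0))
        ready[chain] = issue + latency
        last_issue = issue
    return last_issue + latency


# Three dependent instructions per channel group, latency of 2 cycles.
serialized = ["lo", "lo", "lo", "hi", "hi", "hi"]   # all of one half, then the other
interleaved = ["lo", "hi", "lo", "hi", "lo", "hi"]  # halves alternate

serialized_cycles = total_cycles(serialized, 2)
interleaved_cycles = total_cycles(interleaved, 2)
```

In this model the serialized order stalls on every dependent instruction,
while the interleaved order fills each stall with work from the other half.
Running each half separately inside control flow corresponds to the
serialized order.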



Re: [Mesa-dev] Gitlab migration

2018-05-25 Thread Jason Ekstrand
On Fri, May 25, 2018 at 4:47 PM, Mark Janes  wrote:

> Daniel Stone  writes:
> > GitLab CI fixes all of these things. Pipelines are strongly and
> > directly correlated with commits in repositories, though you can also
> > trigger them manually or on a schedule. Permissions are that of the
> > repository, and just like Travis, people can fork and work on CI
> > improvements in their own sandbox without impacting anything else. The
> > job configuration is in relatively clean YAML, and it strongly
> > suggests idiomatic form rather than a forest of thousands of
> > unmaintained plugins.
> >
> > Jobs get run in clean containers, rather than special unicorn workers
> > pre-configured just so, meaning that the builds are totally
> > reproducible locally and you can use whatever build dependencies you
> > want without having to bug the admins to install LLVM in some
> > particular chroot. Those containers can be stored in a registry
> > attached to the project, with their own lifetime/ownership/etc
> > tracking. Jenkins can use Docker if you have an external registry, but
> > again this requires setting up external authentication and
> > permissions, not to mention that there's no lifetime/ownership/expiry
> > tracking, so you have to write more special admin cronjob scripts to
> > clean up old images in the registry.
>
> GitLab may be perfectly suitable for CI, but please do not select Mesa
> dev infrastructure based on CI features.
>
> Any Mesa CI needs to trigger from multiple projects: drm, dEQP, Piglit,
> VulkanCTS, SPIRV-Tools, crucible, glslang.  They are not all going to be
> in GitLab.
>
> The cart (CI) follows the horse (upstream development process).  CI
> automation is cheap and flexible, and can easily adapt to changes in the
> driver implementation / dev process.
>

I think part of the difficulty in this discussion is something you
referenced in the second paragraph above.  The type of CI we do in our
Jenkins system is in a different domain than the type of CI supported by
the likes of gitlab.  The CI we do in our lab is more along the lines of
integration testing where multiple components all have to come together
whereas the gitlab CI framework is more intended to support single-project
unit testing.  The gitlab CI system also does not scale nearly well enough
to handle the kind of testing that we need to do.  The gitlab CI hooks
would work fairly well for building the website, running some build tests,
and maybe make check but it will never be a replacement for the Jenkins
system we have in our lab.  They're a useful feature (that's a good thing!)
but certainly not a replacement for what we have today.  I'm sorry if I
implied that it would;  I certainly did not intend to.


Re: [Mesa-dev] [PATCH 2/2] st/dri: replace format conversion functions with single mapping table

2018-05-25 Thread Marek Olšák
On Thu, May 17, 2018 at 6:50 AM, Lucas Stach  wrote:

> Each time I have to touch the buffer import/export functions in the dri
> state tracker I get lost in the maze of functions converting between
> DRI_IMAGE_FOURCC, DRI_IMAGE_FORMAT, DRI_IMAGE_COMPONENTS and pipe format.
>
> Rip it out and replace by a single table, which defines the correspondence
> between the different representations.
>
> Also this now stores all the known representations in the __DRIimageRec,
> to avoid the loss of information we currently have when importing a buffer
> with a fourcc, which doesn't have a corresponding dri format.
>
> Signed-off-by: Lucas Stach 
> ---
>  src/gallium/state_trackers/dri/dri2.c   | 476 ++--
>  src/gallium/state_trackers/dri/dri_screen.h |   1 +
>  2 files changed, 138 insertions(+), 339 deletions(-)
>
> diff --git a/src/gallium/state_trackers/dri/dri2.c
> b/src/gallium/state_trackers/dri/dri2.c
> index 859161fb87ac..9c74ca54fc89 100644
> --- a/src/gallium/state_trackers/dri/dri2.c
> +++ b/src/gallium/state_trackers/dri/dri2.c
> @@ -54,295 +54,72 @@
>  #define DRM_FORMAT_MOD_INVALID ((1ULL<<56) - 1)
>  #endif
>
> -static const int fourcc_formats[] = {
> -   __DRI_IMAGE_FOURCC_ARGB2101010,
> -   __DRI_IMAGE_FOURCC_XRGB2101010,
> -   __DRI_IMAGE_FOURCC_ABGR2101010,
> -   __DRI_IMAGE_FOURCC_XBGR2101010,
> -   __DRI_IMAGE_FOURCC_ARGB8888,
> -   __DRI_IMAGE_FOURCC_ABGR8888,
> -   __DRI_IMAGE_FOURCC_SARGB8888,
> -   __DRI_IMAGE_FOURCC_XRGB8888,
> -   __DRI_IMAGE_FOURCC_XBGR8888,
> -   __DRI_IMAGE_FOURCC_ARGB1555,
> -   __DRI_IMAGE_FOURCC_RGB565,
> -   __DRI_IMAGE_FOURCC_R8,
> -   __DRI_IMAGE_FOURCC_R16,
> -   __DRI_IMAGE_FOURCC_GR88,
> -   __DRI_IMAGE_FOURCC_GR1616,
> -   __DRI_IMAGE_FOURCC_YUV410,
> -   __DRI_IMAGE_FOURCC_YUV411,
> -   __DRI_IMAGE_FOURCC_YUV420,
> -   __DRI_IMAGE_FOURCC_YUV422,
> -   __DRI_IMAGE_FOURCC_YUV444,
> -   __DRI_IMAGE_FOURCC_YVU410,
> -   __DRI_IMAGE_FOURCC_YVU411,
> -   __DRI_IMAGE_FOURCC_YVU420,
> -   __DRI_IMAGE_FOURCC_YVU422,
> -   __DRI_IMAGE_FOURCC_YVU444,
> -   __DRI_IMAGE_FOURCC_NV12,
> -   __DRI_IMAGE_FOURCC_NV16,
> -   __DRI_IMAGE_FOURCC_YUYV
> -};
> -
> -static int convert_fourcc(int format, int *dri_components_p)
> -{
> +struct dri2_format_mapping {
> +   int dri_fourcc;
> +   int dri_format;
> int dri_components;
> -   switch(format) {
> -   case __DRI_IMAGE_FOURCC_RGB565:
> -  format = __DRI_IMAGE_FORMAT_RGB565;
> -  dri_components = __DRI_IMAGE_COMPONENTS_RGB;
> -  break;
> -   case __DRI_IMAGE_FOURCC_ARGB8888:
> -  format = __DRI_IMAGE_FORMAT_ARGB8888;
> -  dri_components = __DRI_IMAGE_COMPONENTS_RGBA;
> -  break;
> -   case __DRI_IMAGE_FOURCC_XRGB8888:
> -  format = __DRI_IMAGE_FORMAT_XRGB8888;
> -  dri_components = __DRI_IMAGE_COMPONENTS_RGB;
> -  break;
> -   case __DRI_IMAGE_FOURCC_ABGR8888:
> -  format = __DRI_IMAGE_FORMAT_ABGR8888;
> -  dri_components = __DRI_IMAGE_COMPONENTS_RGBA;
> -  break;
> -   case __DRI_IMAGE_FOURCC_XBGR8888:
> -  format = __DRI_IMAGE_FORMAT_XBGR8888;
> -  dri_components = __DRI_IMAGE_COMPONENTS_RGB;
> -  break;
> -   case __DRI_IMAGE_FOURCC_ARGB2101010:
> -  format = __DRI_IMAGE_FORMAT_ARGB2101010;
> -  dri_components = __DRI_IMAGE_COMPONENTS_RGBA;
> -  break;
> -   case __DRI_IMAGE_FOURCC_XRGB2101010:
> -  format = __DRI_IMAGE_FORMAT_XRGB2101010;
> -  dri_components = __DRI_IMAGE_COMPONENTS_RGB;
> -  break;
> -   case __DRI_IMAGE_FOURCC_ABGR2101010:
> -  format = __DRI_IMAGE_FORMAT_ABGR2101010;
> -  dri_components = __DRI_IMAGE_COMPONENTS_RGBA;
> -  break;
> -   case __DRI_IMAGE_FOURCC_XBGR2101010:
> -  format = __DRI_IMAGE_FORMAT_XBGR2101010;
> -  dri_components = __DRI_IMAGE_COMPONENTS_RGB;
> -  break;
> -   case __DRI_IMAGE_FOURCC_R8:
> -  format = __DRI_IMAGE_FORMAT_R8;
> -  dri_components = __DRI_IMAGE_COMPONENTS_R;
> -  break;
> -   case __DRI_IMAGE_FOURCC_GR88:
> -  format = __DRI_IMAGE_FORMAT_GR88;
> -  dri_components = __DRI_IMAGE_COMPONENTS_RG;
> -  break;
> -   case __DRI_IMAGE_FOURCC_R16:
> -  format = __DRI_IMAGE_FORMAT_R16;
> -  dri_components = __DRI_IMAGE_COMPONENTS_R;
> -  break;
> -   case __DRI_IMAGE_FOURCC_GR1616:
> -  format = __DRI_IMAGE_FORMAT_GR1616;
> -  dri_components = __DRI_IMAGE_COMPONENTS_RG;
> -  break;
> -   case __DRI_IMAGE_FOURCC_YUYV:
> -  format = __DRI_IMAGE_FORMAT_YUYV;
> -  dri_components = __DRI_IMAGE_COMPONENTS_Y_XUXV;
> -  break;
> -   /*
> -* For multi-planar YUV formats, we return the format of the first
> -* plane only.  Since there is only one caller which supports multi-
> -* planar YUV it gets to figure out the remaining planes on it's
> -* own.
> -*/
> -   case __DRI_IMAGE_FOURCC_YUV420:
> -   case __DRI_IMAGE_FOURCC_YVU420:
> -  format = __DRI_IMAGE_FORMAT_R8;
> -  dri_components = __DRI_IMAGE_COMPONENTS_Y_U_V;
> -  break;
> -   case __DRI_IMAGE_FOURCC_NV12:
> -   
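
The single-table idea from the commit message can be sketched as follows.
This is an illustration only, not the code from the patch: the field names
follow the struct in the quoted diff, the three rows are taken from the
switch statement being removed, and plain strings stand in for the C enum
values.  The lookup helper names are hypothetical.

```python
from typing import NamedTuple, Optional


class FormatMapping(NamedTuple):
    dri_fourcc: str
    dri_format: str
    dri_components: str


# Illustrative rows only; a real table would list every supported format.
FORMAT_TABLE = [
    FormatMapping("FOURCC_RGB565", "FORMAT_RGB565", "COMPONENTS_RGB"),
    FormatMapping("FOURCC_R8", "FORMAT_R8", "COMPONENTS_R"),
    FormatMapping("FOURCC_GR88", "FORMAT_GR88", "COMPONENTS_RG"),
]


def lookup_by_fourcc(fourcc: str) -> Optional[FormatMapping]:
    """Return the row for a fourcc, or None if the fourcc is unknown."""
    for row in FORMAT_TABLE:
        if row.dri_fourcc == fourcc:
            return row
    return None


def lookup_by_format(dri_format: str) -> Optional[FormatMapping]:
    """Return the row for a DRI image format, or None if it is unknown."""
    for row in FORMAT_TABLE:
        if row.dri_format == dri_format:
            return row
    return None
```

Because every representation lives in the same row, converting in either
direction is one lookup, and a caller that keeps the whole row loses no
information about how the buffer was originally described.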

[Mesa-dev] [PATCH v2 32/53] intel/fs: Mark LINTERP opcode as writing accumulator on platforms without PLN

2018-05-25 Thread Jason Ekstrand
From: Francisco Jerez 

When we don't have PLN (gen4 and gen11+), we implement LINTERP as either
LINE+MAC or a pair of MADs.  In both cases, the accumulator is written
by the first of the two instructions and read by the second.  Even
though the accumulator value isn't actually ever used from a logical
instruction perspective, it is trashed so we need to make the scheduler
aware.  Otherwise, the scheduler could end up re-ordering instructions
and putting a LINTERP between an instruction which writes the
accumulator and another which tries to use that result.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/intel/compiler/brw_shader.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp
index 141b64e..dfd2c5c 100644
--- a/src/intel/compiler/brw_shader.cpp
+++ b/src/intel/compiler/brw_shader.cpp
@@ -984,7 +984,8 @@ backend_instruction::writes_accumulator_implicitly(const struct gen_device_info
    return writes_accumulator ||
           (devinfo->gen < 6 &&
            ((opcode >= BRW_OPCODE_ADD && opcode < BRW_OPCODE_NOP) ||
-            (opcode >= FS_OPCODE_DDX_COARSE && opcode <= FS_OPCODE_LINTERP)));
+            (opcode >= FS_OPCODE_DDX_COARSE && opcode <= FS_OPCODE_LINTERP))) ||
+          (opcode == FS_OPCODE_LINTERP && !devinfo->has_pln);
 }
 
 bool
-- 
2.5.0.400.gff86faf
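
The difference between the v1 condition (gen < 7) and the v2 condition
(!has_pln) can be sketched as a small table.  Only the newly added LINTERP
clause is modelled here; the pre-existing gen < 6 clause is left out.  The
platform rows are assumptions derived from this thread (no PLN on gen4 and
gen11+, PLN on Sandy Bridge and Ivy Bridge), not values read from
gen_device_info.

```python
# (label, gen, has_pln); assumed values, see the note above.
PLATFORMS = [
    ("gen4", 4, False),
    ("snb", 6, True),
    ("ivb", 7, True),
    ("gen11", 11, False),
]


def linterp_clause_v1(gen, has_pln):
    """Added clause from the first version of the patch."""
    return gen < 7


def linterp_clause_v2(gen, has_pln):
    """Added clause from this version: LINTERP trashes the accumulator
    whenever it cannot be emitted as PLN."""
    return not has_pln


comparison = {
    label: (linterp_clause_v1(gen, pln), linterp_clause_v2(gen, pln))
    for label, gen, pln in PLATFORMS
}
```

Under these assumptions v1 flags Sandy Bridge even though LINTERP becomes a
PLN there, and misses gen11, where the two-instruction lowering does go
through the accumulator.  Keying the clause on has_pln covers both cases.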



[Mesa-dev] [PATCH v2 33/53] intel/fs: Emit LINE+MAC for LINTERP with unaligned coordinates

2018-05-25 Thread Jason Ekstrand
On g4x through Sandy Bridge, src1 (the coordinates) of the PLN
instruction is required to be an even register number.  When it's odd
(which can happen with SIMD32), we have to emit a LINE+MAC combination
instead.  Unfortunately, we can't just fall through to the gen4 case
because the input registers are still set up for PLN which lays out the
four src1 registers differently in SIMD16 than LINE.
---
 src/intel/compiler/brw_fs_generator.cpp | 75 +
 src/intel/compiler/brw_shader.cpp   |  3 +-
 2 files changed, 68 insertions(+), 10 deletions(-)

diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index 548a208..0ca9a4e 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -761,16 +761,73 @@ fs_generator::generate_linterp(fs_inst *inst,
 
       return true;
    } else if (devinfo->has_pln) {
-      /* From the Sandy Bridge PRM Vol. 4, Pt. 2, Section 8.3.53, "Plane":
-       *
-       *    "[DevSNB]: must be even register aligned.
-       *
-       * This restriction is lifted on Ivy Bridge.
-       */
-      assert(devinfo->gen >= 7 || (delta_x.nr & 1) == 0);
-      brw_PLN(p, dst, interp, delta_x);
+      if (devinfo->gen <= 6 && (delta_x.nr & 1) != 0) {
+         /* From the Sandy Bridge PRM Vol. 4, Pt. 2, Section 8.3.53, "Plane":
+          *
+          *    "[DevSNB]: must be even register aligned.
+          *
+          * This restriction is lifted on Ivy Bridge.
+          *
+          * This means that we need to split PLN into LINE+MAC on-the-fly.
+          * Unfortunately, the inputs are laid out for PLN and not LIN+MAC so
+          * we have to split into SIMD8 pieces.
+          */
+         if (inst->exec_size == 8) {
+            i[0] = brw_LINE(p, brw_null_reg(), interp, delta_x);
+            i[1] = brw_MAC(p, dst, suboffset(interp, 1), delta_y);
 
-      return false;
+            /* LINE writes the accumulator automatically on gen4-5.  On Sandy
+             * Bridge and later, we have to explicitly enable it.
+             */
+            if (devinfo->gen >= 6)
+               brw_inst_set_acc_wr_control(p->devinfo, i[0], true);
+
+            brw_inst_set_cond_modifier(p->devinfo, i[1], inst->conditional_mod);
+
+            /* brw_set_default_saturate() is called before emitting
+             * instructions, so the saturate bit is set in each instruction,
+             * so we need to unset it on the first instruction.
+             */
+            brw_inst_set_saturate(p->devinfo, i[0], false);
+         } else {
+            brw_push_insn_state(p);
+            brw_set_default_exec_size(p, BRW_EXECUTE_8);
+
+            brw_set_default_group(p, inst->group);
+            i[0] = brw_LINE(p, brw_null_reg(), interp, offset(delta_x, 0));
+            i[1] = brw_MAC(p, offset(dst, 0),
+                           suboffset(interp, 1), offset(delta_x, 1));
+
+            brw_set_default_group(p, inst->group + 8);
+            i[2] = brw_LINE(p, brw_null_reg(), interp, offset(delta_y, 0));
+            i[3] = brw_MAC(p, offset(dst, 1),
+                           suboffset(interp, 1), offset(delta_y, 1));
+
+            brw_pop_insn_state(p);
+
+            /* LINE writes the accumulator automatically on gen4-5.  On Sandy
+             * Bridge and later, we have to explicitly enable it.
+             */
+            if (devinfo->gen >= 6) {
+               brw_inst_set_acc_wr_control(p->devinfo, i[0], true);
+               brw_inst_set_acc_wr_control(p->devinfo, i[2], true);
+            }
+
+            brw_inst_set_cond_modifier(p->devinfo, i[1], inst->conditional_mod);
+            brw_inst_set_cond_modifier(p->devinfo, i[3], inst->conditional_mod);
+
+            /* brw_set_default_saturate() is called before emitting
+             * instructions, so the saturate bit is set in each instruction,
+             * so we need to unset it on the first instruction of each pair.
+             */
+            brw_inst_set_saturate(p->devinfo, i[0], false);
+            brw_inst_set_saturate(p->devinfo, i[2], false);
+         }
+         return true;
+      } else {
+         brw_PLN(p, dst, interp, delta_x);
+         return false;
+      }
    } else {
       i[0] = brw_LINE(p, brw_null_reg(), interp, delta_x);
       i[1] = brw_MAC(p, dst, suboffset(interp, 1), delta_y);
diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp
index dfd2c5c..6d25d51 100644
--- a/src/intel/compiler/brw_shader.cpp
+++ b/src/intel/compiler/brw_shader.cpp
@@ -985,7 +985,8 @@ backend_instruction::writes_accumulator_implicitly(const struct gen_device_info
           (devinfo->gen < 6 &&
            ((opcode >= BRW_OPCODE_ADD && opcode < BRW_OPCODE_NOP) ||
             (opcode >= FS_OPCODE_DDX_COARSE && opcode <= FS_OPCODE_LINTERP))) ||
-          (opcode == FS_OPCODE_LINTERP && !devinfo->has_pln);
+          (opcode == FS_OPCODE_LINTERP &&
+   (!devin
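
The lowering decision described in the commit message can be summarised in
a short sketch.  It is a simplified model rather than the generator code:
it only returns which opcodes would be emitted for which channel group, and
it ignores the separate path that implements LINTERP as a pair of MADs.
The function name and the tuple representation are hypothetical.

```python
def lower_linterp(gen, has_pln, exec_size, delta_x_nr, group=0):
    """Return the (opcode, channel_group) pairs emitted for one LINTERP."""
    if not has_pln:
        # No PLN available at all: plain LINE+MAC.
        return [("LINE", group), ("MAC", group)]
    if gen <= 6 and (delta_x_nr & 1) != 0:
        # src1 lands in an odd register, which PLN does not accept on g4x
        # through Sandy Bridge.  The inputs are laid out for PLN, so the
        # fallback is emitted as one LINE+MAC pair per SIMD8 group.
        pairs = []
        for g in range(group, group + exec_size, 8):
            pairs += [("LINE", g), ("MAC", g)]
        return pairs
    return [("PLN", group)]


snb_odd_simd16 = lower_linterp(gen=6, has_pln=True, exec_size=16, delta_x_nr=3)
snb_even_simd8 = lower_linterp(gen=6, has_pln=True, exec_size=8, delta_x_nr=2)
ivb_odd_simd16 = lower_linterp(gen=7, has_pln=True, exec_size=16, delta_x_nr=3)
```

On Sandy Bridge an odd src1 register in SIMD16 yields two LINE+MAC pairs,
for groups 0 and 8, while an even register still yields a single PLN.  On
Ivy Bridge the alignment restriction is lifted, so PLN is used either way.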

Re: [Mesa-dev] Gitlab migration

2018-05-25 Thread Marek Olšák
On Thu, May 24, 2018 at 6:46 AM, Daniel Stone  wrote:

> Hi all,
> I'm going to attempt to interleave a bunch of replies here.
>
> On 23 May 2018 at 20:34, Jason Ekstrand  wrote:
> > The freedesktop.org admins are trying to move as many projects and services
> > as possible over to gitlab and somehow I got hoodwinked into spear-heading
> > it for mesa.  There are a number of reasons for this change.  Some of those
> > reasons have to do with the maintenance cost of our sprawling and aging
> > infrastructure.  Some of those reasons provide significant benefit to the
> > project being migrated:
>
> Thanks for starting the discussion! I appreciate the help.
>
> To be clear, we _are_ migrating the hosting for all projects, as in,
> the remote you push to will change. We've slowly staged this with a
> few projects of various shapes and sizes, and are confident that it
> more than holds up to the load. This is something we can pull the
> trigger on roughly any time, and I'm happy to do it whenever. When
> that happens, trying to push to ssh://git.fd.o will give you an error
> message explaining how to update your SSH keys, how to change your
> remotes, etc.
>
> cgit and anongit will not be orphaned: they remain as push mirrors so
> are updated simultaneously with GitLab pushes, as will the GitHub
> mirrors. Realistically, we can't deprecate anongit for a (very) long
> time due to the millions of Yocto forks which have that URL embedded
> in their build recipes. Running cgit alongside that is fairly
> low-intervention. And hey, if we look at the logs in five years' time
> and see 90% of people still using cgit to browse and not GitLab,
> that's a pretty strong hint that we should put effort into keeping it.
>

Well, I don't know what people are talking about. A cgit commit log is a
tight table with 5 columns with information. I can't find anything like
that in GitLab. All I could find is this:
https://gitlab.freedesktop.org/jekstrand/mesa/commits/master

The elements are too large and don't have much information. Why would you
have the author name on another line when you could add another column
instead? There is a lot of unused screen space. And why having avatars in
the commit log. It's not Facebook.

Then there is the project Overview page. It mostly just shows files in the
top level directory. Compare it with cgit where the Overview page looks
like a, guess what, overview!

OK, that was harsh, but there is a lot of truth to it. I guess GitLab is
great for admins and I get that. Speaking of the web UI, at least the
read-only view is impressively unimpressive.

Marek