Re: [Mesa-dev] [PATCH] egl/dri2: implement platform_null.

2015-01-23 Thread Frank Binns
On 23/01/15 01:11, Rob Clark wrote:
> On Thu, Jan 22, 2015 at 4:36 PM, Emil Velikov  
> wrote:
>>> +static const char* node_path_fmt_card = "/dev/dri/card%d";
>> You can reuse the DRM_DIR_NAME + DRM_DEV_NAME macros (from xf86drm.h)
>> for this.
>>
>>> +static const char* node_path_fmt_render = "/dev/dri/renderD%d";
>> There is no macro for the renderD%d, although you can still use
>> DRM_DIR_NAME for the path.
> I suppose for consistency, it wouldn't be a horrible idea to add the
> missing macro to libdrm

I've already posted a couple of patches related to this:
http://lists.freedesktop.org/archives/dri-devel/2015-January/075449.html

Thanks
Frank

>
> (although to avoid libdrm version bump dependency from mesa side, I'm
> fine with open-coding it for now in mesa and clean up some time after
> there has been a libdrm release)
>
> BR,
> -R
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
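
For illustration, the suggestion in this thread could look roughly like the sketch below. DRM_DIR_NAME and DRM_DEV_NAME are the macros from xf86drm.h; they are repeated under #ifndef only so that the snippet builds without libdrm. RENDER_DEV_NAME_FMT and format_node_path() are hypothetical names, open-coded in the way Rob describes until libdrm itself carries a render node macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Normally provided by xf86drm.h; repeated here only so that the sketch
 * compiles without libdrm installed. */
#ifndef DRM_DIR_NAME
#define DRM_DIR_NAME "/dev/dri"
#endif
#ifndef DRM_DEV_NAME
#define DRM_DEV_NAME "%s/card%d"
#endif

/* Hypothetical, open-coded render node format (an assumption of this
 * sketch, not a macro taken from libdrm). */
#define RENDER_DEV_NAME_FMT "%s/renderD%d"

/* Write the node path for the given minor into buf.
 * Returns the path length on success, -1 if the path did not fit. */
int
format_node_path(char *buf, size_t len, int is_render, int minor)
{
   int n;

   assert(buf != NULL && len > 0);

   if (is_render)
      n = snprintf(buf, len, RENDER_DEV_NAME_FMT, DRM_DIR_NAME, minor);
   else
      n = snprintf(buf, len, DRM_DEV_NAME, DRM_DIR_NAME, minor);

   if (n < 0 || (size_t) n >= len)
      return -1;
   return n;
}
```

Keeping the directory in one macro means only the render node format string would need replacing once a libdrm release provides its own.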


[Mesa-dev] [PATCH] i965: Fix URB size for CHV

2015-01-23 Thread ville . syrjala
From: Ville Syrjälä 

Increase the device info .urb.size for CHV to match the default URB
size (192kB).

Signed-off-by: Ville Syrjälä 
---
 src/mesa/drivers/dri/i965/brw_device_info.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c 
b/src/mesa/drivers/dri/i965/brw_device_info.c
index 3c3c564..ba65584 100644
--- a/src/mesa/drivers/dri/i965/brw_device_info.c
+++ b/src/mesa/drivers/dri/i965/brw_device_info.c
@@ -241,7 +241,7 @@ static const struct brw_device_info brw_device_info_chv = {
.max_gs_threads = 80,
.max_wm_threads = 128,
.urb = {
-  .size = 128,
+  .size = 192,
   .min_vs_entries = 34,
   .max_vs_entries = 640,
   .max_gs_entries = 256,
-- 
2.0.5



Re: [Mesa-dev] [PATCH 3/3] i965: Fix URB size for gen8

2015-01-23 Thread Ville Syrjälä
On Wed, Jan 21, 2015 at 12:51:02PM -0800, Kenneth Graunke wrote:
> On Wednesday, January 21, 2015 08:17:36 PM ville.syrj...@linux.intel.com 
> wrote:
> > From: Ville Syrjälä 
> > 
> > Increase the device info .urb.size for BDW GT3 and CHV to match the
> > default URB size for each.
> > 
> > Also add all missing platforms (BYT,BDW,CHV) to the comment describing
> > the default URB size in gen7_urb.c.
> > 
> > Signed-off-by: Ville Syrjälä 
> > ---
> >  src/mesa/drivers/dri/i965/brw_device_info.c | 4 ++--
> >  src/mesa/drivers/dri/i965/gen7_urb.c| 5 -
> >  2 files changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c 
> > b/src/mesa/drivers/dri/i965/brw_device_info.c
> > index bdef42b..d0b9e05 100644
> > --- a/src/mesa/drivers/dri/i965/brw_device_info.c
> > +++ b/src/mesa/drivers/dri/i965/brw_device_info.c
> > @@ -226,7 +226,7 @@ static const struct brw_device_info 
> > brw_device_info_bdw_gt3 = {
> > GEN8_FEATURES, .gt = 3,
> > .max_wm_threads = 384,
> > .urb = {
> > -  .size = 384,
> > +  .size = 768,
> >.min_vs_entries = 64,
> >.max_vs_entries = 2560,
> >.max_gs_entries = 960,
> > @@ -243,7 +243,7 @@ static const struct brw_device_info brw_device_info_chv 
> > = {
> > .max_gs_threads = 80,
> > .max_wm_threads = 128,
> > .urb = {
> > -  .size = 128,
> > +  .size = 192,
> >.min_vs_entries = 34,
> >.max_vs_entries = 640,
> >.max_gs_entries = 256,
> > diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c 
> > b/src/mesa/drivers/dri/i965/gen7_urb.c
> > index 201f42e..f90d6e3 100644
> > --- a/src/mesa/drivers/dri/i965/gen7_urb.c
> > +++ b/src/mesa/drivers/dri/i965/gen7_urb.c
> > @@ -50,9 +50,12 @@
> >   * Currently we split the constant buffer space evenly among whatever 
> > stages
> >   * are active.  This is probably not ideal, but simple.
> >   *
> > - * Ivybridge GT1 and Haswell GT1 have 128kB of URB space.
> > + * Ivybridge GT1, Baytrail and Haswell GT1 have 128kB of URB space.
> >   * Ivybridge GT2 and Haswell GT2 have 256kB of URB space.
> >   * Haswell GT3 has 512kB of URB space.
> > + * Broadwell GT1 and Cherryview have 192kB of URB space.
> > + * Broadwell GT2 has 384kB of URB space.
> > + * Broadwell GT3 has 768kB of URB space.
> >   *
> >   * See "Volume 2a: 3D Pipeline," section 1.8, "Volume 1b: Configurations",
> >   * and the documentation for 3DSTATE_PUSH_CONSTANT_ALLOC_xS.
> > 
> 
> Have you tested this?  I tried 768k on Broadwell GT3 a while back and got
> no end of GPU hangs.  Which is odd, because it should be the correct value.

OK, since the BDW stuff has potential issues, I've split the CHV stuff
into separate patches. I already pushed the max_wm_threads and
min_vs_entries patches with your r-b, and re-posted the URB size patch
with just CHV changed this time.

I'll leave the BDW bits to someone else who has enough BDW hardware
around to test and figure out what works and what doesn't.

-- 
Ville Syrjälä
Intel OTC


Re: [Mesa-dev] [PATCH 3/3] i965: Fix URB size for gen8

2015-01-23 Thread Ville Syrjälä
On Wed, Jan 21, 2015 at 12:51:02PM -0800, Kenneth Graunke wrote:
> On Wednesday, January 21, 2015 08:17:36 PM ville.syrj...@linux.intel.com 
> wrote:
> > From: Ville Syrjälä 
> > 
> > Increase the device info .urb.size for BDW GT3 and CHV to match the
> > default URB size for each.
> > 
> > Also add all missing platforms (BYT,BDW,CHV) to the comment describing
> > the default URB size in gen7_urb.c.
> > 
> > Signed-off-by: Ville Syrjälä 
> > ---
> >  src/mesa/drivers/dri/i965/brw_device_info.c | 4 ++--
> >  src/mesa/drivers/dri/i965/gen7_urb.c| 5 -
> >  2 files changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c 
> > b/src/mesa/drivers/dri/i965/brw_device_info.c
> > index bdef42b..d0b9e05 100644
> > --- a/src/mesa/drivers/dri/i965/brw_device_info.c
> > +++ b/src/mesa/drivers/dri/i965/brw_device_info.c
> > @@ -226,7 +226,7 @@ static const struct brw_device_info 
> > brw_device_info_bdw_gt3 = {
> > GEN8_FEATURES, .gt = 3,
> > .max_wm_threads = 384,
> > .urb = {
> > -  .size = 384,
> > +  .size = 768,
> >.min_vs_entries = 64,
> >.max_vs_entries = 2560,
> >.max_gs_entries = 960,
> > @@ -243,7 +243,7 @@ static const struct brw_device_info brw_device_info_chv 
> > = {
> > .max_gs_threads = 80,
> > .max_wm_threads = 128,
> > .urb = {
> > -  .size = 128,
> > +  .size = 192,
> >.min_vs_entries = 34,
> >.max_vs_entries = 640,
> >.max_gs_entries = 256,
> > diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c 
> > b/src/mesa/drivers/dri/i965/gen7_urb.c
> > index 201f42e..f90d6e3 100644
> > --- a/src/mesa/drivers/dri/i965/gen7_urb.c
> > +++ b/src/mesa/drivers/dri/i965/gen7_urb.c
> > @@ -50,9 +50,12 @@
> >   * Currently we split the constant buffer space evenly among whatever 
> > stages
> >   * are active.  This is probably not ideal, but simple.
> >   *
> > - * Ivybridge GT1 and Haswell GT1 have 128kB of URB space.
> > + * Ivybridge GT1, Baytrail and Haswell GT1 have 128kB of URB space.
> >   * Ivybridge GT2 and Haswell GT2 have 256kB of URB space.
> >   * Haswell GT3 has 512kB of URB space.
> > + * Broadwell GT1 and Cherryview have 192kB of URB space.
> > + * Broadwell GT2 has 384kB of URB space.
> > + * Broadwell GT3 has 768kB of URB space.
> >   *
> >   * See "Volume 2a: 3D Pipeline," section 1.8, "Volume 1b: Configurations",
> >   * and the documentation for 3DSTATE_PUSH_CONSTANT_ALLOC_xS.
> > 
> 
> Have you tested this?  I tried 768k on Broadwell GT3 a while back and got
> no end of GPU hangs.  Which is odd, because it should be the correct value.

I was just reading the spec a bit more and I saw this note in 3DSTATE_URB_VS:
"The offset and size should be programmed as if there is only one slice
 enabled. Hardware will grow the size based on the slice configuration.
 The hardware supports up to 1024KB of URB space so any slice and max urb
 size configuration that goes over that limit is not allowed and will
 cause corruption. Refer to the L3 allocation and programming guide for
 valid URB configurations."

So I guess that explains it. And then 384 is the correct value for GT2
and GT3. I'm not quite sure how that interacts with the push constant
offset/size though...
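
A rough way to see why 768 might hang while 384 works is sketched below. It rests on two assumptions that the quoted note does not spell out: that the hardware multiplies the programmed single-slice size by the number of enabled slices, and that BDW GT3 runs with two slices. URB_HW_LIMIT_KB, urb_effective_size_kb() and urb_config_fits() are hypothetical names for this sketch, not i965 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Upper bound quoted in the 3DSTATE_URB_VS note, in kB. */
#define URB_HW_LIMIT_KB 1024

/* Assumption: the effective URB size is the programmed (single-slice)
 * size multiplied by the number of enabled slices. */
unsigned
urb_effective_size_kb(unsigned programmed_kb, unsigned num_slices)
{
   assert(num_slices > 0);
   return programmed_kb * num_slices;
}

/* True if the grown size stays within the quoted hardware limit. */
bool
urb_config_fits(unsigned programmed_kb, unsigned num_slices)
{
   return urb_effective_size_kb(programmed_kb, num_slices) <= URB_HW_LIMIT_KB;
}
```

Under those assumptions, programming 384 on a two-slice part grows to 768kB and fits, while programming 768 would grow to 1536kB and exceed the 1024KB limit.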

-- 
Ville Syrjälä
Intel OTC


[Mesa-dev] [Bug 88275] [865G] Intel OpenGL rendering isn't starting

2015-01-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=88275

--- Comment #25 from Eugene  ---
(In reply to Timothy Arceri from comment #24)
> What does it say when you run:
> 
> LIBGL_DEBUG=verbose glxinfo | grep direct

$ LIBGL_DEBUG=verbose glxinfo | grep direct
libGL: screen 0 does not appear to be DRI3 capable
libGL: pci id for fd 4: 8086:2572, driver i915
libGL: OpenDriver: trying /usr/lib/x86_64-linux-gnu/dri/tls/i915_dri.so
libGL: OpenDriver: trying /usr/lib/x86_64-linux-gnu/dri/i915_dri.so
libGL error: failed to create dri screen
libGL error: failed to load driver: i915
libGL: OpenDriver: trying /usr/lib/x86_64-linux-gnu/dri/tls/swrast_dri.so
libGL: OpenDriver: trying /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so
libGL: Can't open configuration file /home/teacher/.drirc: No such file or
directory.
libGL: Can't open configuration file /home/teacher/.drirc: No such file or
directory.
direct rendering: Yes

For any additional info/tests, please ask.

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] egl: Pass the correct X visual depth to xcb_put_image().

2015-01-23 Thread Jose Fonseca

It looks like nobody really cares, so I'll take it as consent.

This only happens if X requires 24bit visuals. Maybe a minority of 
drivers do that.  At least Intel X driver does.


BTW, this should go to stable branches too.

Jose

On 19/01/15 23:09, Jose Fonseca wrote:

From: José Fonseca 

The dri2_x11_add_configs_for_visuals() function happily matches a 32-bit
EGLConfig with a 24-bit X visual.  However, it was passing a 32-bit
depth to xcb_put_image(), making the X server unhappy:

   https://github.com/apitrace/apitrace/issues/313#issuecomment-70571911

PS: I rarely use the Mesa DRI software rasterizers (I usually use the
non-DRI Xlib SW renderers), but every time I try them they seem broken
at some fundamental level.  I wonder if it's just me or if nobody truly
uses them on a daily basis.
---
  src/egl/drivers/dri2/platform_x11.c | 24 +---
  1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c 
b/src/egl/drivers/dri2/platform_x11.c
index dd88e90..cbcf6a7 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -49,8 +49,7 @@ dri2_x11_swap_interval(_EGLDriver *drv, _EGLDisplay *disp, 
_EGLSurface *surf,

  static void
  swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
- struct dri2_egl_surface * dri2_surf,
- int depth)
+ struct dri2_egl_surface * dri2_surf)
  {
 uint32_t   mask;
 const uint32_t function = GXcopy;
@@ -66,8 +65,7 @@ swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
 valgc[0] = function;
 valgc[1] = False;
 xcb_create_gc(dri2_dpy->conn, dri2_surf->swapgc, dri2_surf->drawable, 
mask, valgc);
-   dri2_surf->depth = depth;
-   switch (depth) {
+   switch (dri2_surf->depth) {
case 32:
case 24:
   dri2_surf->bytes_per_pixel = 4;
@@ -82,7 +80,7 @@ swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
   dri2_surf->bytes_per_pixel = 0;
   break;
default:
- _eglLog(_EGL_WARNING, "unsupported depth %d", depth);
+ _eglLog(_EGL_WARNING, "unsupported depth %d", dri2_surf->depth);
 }
  }

@@ -257,12 +255,6 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay 
*disp, EGLint type,
_eglError(EGL_BAD_ALLOC, "dri2->createNewDrawable");
goto cleanup_pixmap;
 }
-
-   if (dri2_dpy->dri2) {
-  xcb_dri2_create_drawable (dri2_dpy->conn, dri2_surf->drawable);
-   } else {
-  swrastCreateDrawable(dri2_dpy, dri2_surf, _eglGetConfigKey(conf, 
EGL_BUFFER_SIZE));
-   }

 if (type != EGL_PBUFFER_BIT) {
cookie = xcb_get_geometry (dri2_dpy->conn, dri2_surf->drawable);
@@ -275,9 +267,19 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay 
*disp, EGLint type,

dri2_surf->base.Width = reply->width;
dri2_surf->base.Height = reply->height;
+  dri2_surf->depth = reply->depth;
free(reply);
 }

+   if (dri2_dpy->dri2) {
+  xcb_dri2_create_drawable (dri2_dpy->conn, dri2_surf->drawable);
+   } else {
+  if (type == EGL_PBUFFER_BIT) {
+ dri2_surf->depth = _eglGetConfigKey(conf, EGL_BUFFER_SIZE);
+  }
+  swrastCreateDrawable(dri2_dpy, dri2_surf);
+   }
+
 /* we always copy the back buffer to front */
 dri2_surf->base.PostSubBufferSupportedNV = EGL_TRUE;






Re: [Mesa-dev] New stable-branch 10.4 candidate pushed

2015-01-23 Thread Emil Velikov
Hello list,

As mentioned earlier here is the current status of the 10.4.3 release
candidate:
 - 48 queued
 - 4 nominated (outstanding)
 - and 0 rejected patches


In a nutshell this gives us over 40 fixes in the nine state-tracker,
a couple of i965 fixes, and a build fix for OpenBSD.

Take a look at section "Mesa stable queue" for more information.


Testing
---
The following results are against piglit a68d27e7254.


Changes - classic i965(snb)
---
Intermittent test results
 - GLX_OML_sync_control
 - ARB_buffer_storage/bufferstorage-persistent read coherent

Fixes
 - shaders/glsl-deriv-varyings - fail > pass


Changes - swrast classic, gallium
-
None.


Testing reports/general approval

Any testing reports (or general approval of the state of the branch)
will be greatly appreciated.


Trivial merge conflicts
---
Here are the commits where I manually merged conflicts, (so these might
merit additional review):

commit 021d71b8480393c3f0bbe35fd2f8bb7052c7ff31
Author: Kenneth Graunke 

i965: Respect the no_8 flag on Gen6, not just Gen7+.

(cherry picked from commit f95733ddb7fff0af923fce3a07ebef78fa3139a4)


commit 22c75f9f5a698365ca669428ed1a7899670b1e64
Author: Axel Davy 

st/nine: Implement TEXCOORD special behaviours

(cherry picked from commit 5399119fb1ea646880c5e8a54e4c7f789be4c574)


commit 8e08ba6f9676fd5433f38c8969a1100d69b1
Author: Axel Davy 

st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC

(cherry picked from commit 7865210670a735f268044b0c388a4895bed1ea4c)


The plan is to have 10.4.3 tomorrow morning (Saturday). From 10.4.4
the schedule will be back to normal - RC notice on Tuesdays, release
on Fridays. If you have any questions or comments that you would like
to share before the release, please go ahead.


Cheers,
Emil


Mesa stable queue
-

Nominated (4)
==
Jose Fonseca (1):
  egl: Pass the correct X visual depth to xcb_put_image().

Mario Kleiner (2):
  glx/dri3: Request non-vsynced Present for swapinterval zero.
  glx: Handle out-of-sequence swap completion events correctly.

Marius Predut (1):
  Remove UINT_AS_FLT, INT_AS_FLT, FLOAT_AS_FLT macros.No
functional changes, only bug fixed.


Queued (48)
===
Axel Davy (39):
  st/nine: Add new texture format strings
  st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS
  st/nine: NineBaseTexture9: fix setting of last_layer
  st/nine: CubeTexture: fix GetLevelDesc
  st/nine: Fix crash when deleting non-implicit swapchain
  st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad 
format
  st/nine: NineBaseTexture9: update sampler view creation
  st/nine: Check if srgb format is supported before trying to use it.
  st/nine: Add ATI1 and ATI2 support
  st/nine: Rework of boolean constants
  st/nine: Convert integer constants to floats before storing them when 
cards don't support integers
  st/nine: Remove some shader unused code
  st/nine: Saturate oFog and oPts vs outputs
  st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs
  st/nine: Fix typo for M4x4
  st/nine: Fix POW implementation
  st/nine: Handle RSQ special cases
  st/nine: Handle NRM with input of null norm
  st/nine: Correct LOG on negative values
  st/nine: Rewrite LOOP implementation, and a0 aL handling
  st/nine: Fix CND implementation
  st/nine: Clamp ps 1.X constants
  st/nine: Fix some fixed function pipeline operation
  st/nine: Implement TEXCOORD special behaviours
  st/nine: Fill missing dst and src number for some instructions.
  st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC
  st/nine: implement TEXM3x2DEPTH
  st/nine: Implement TEXM3x2TEX
  st/nine: Implement TEXM3x3SPEC
  st/nine: Implement TEXDEPTH
  st/nine: Implement TEXDP3
  st/nine: Implement TEXDP3TEX
  st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB
  st/nine: Correct rules for relative adressing and constants.
  st/nine: Remove unused code for ps
  st/nine: Fix sm3 relative addressing for non-debug build
  st/nine: Add variables containing the size of the constant buffers
  st/nine: Allocate the correct size for the user constant buffer
  st/nine: Allocate vs constbuf buffer for indirect addressing once.

Jason Ekstrand (1):
  mesa: Fix clamping to -1.0 in snorm_to_float

Jonathan Gray (1):
  glsl: Link glsl_test with pthreads library.

Jose Fonseca (1):
  nine: Drop use of TGSI_OPCODE_CND.

Kenneth Graunke (2):
  i965: Respect the no_8 flag on Gen6, not just Gen7+.
  i965: Work around mysterious Gen4 GPU hangs with minimal state changes.

Stanislaw Halik (1):
  st/nine: Hack to generate resource if it doesn't exist when getting view

Xavier Bouchoux (3):
  st/nine: Additional defines to d3dtypes.h
  st/

Re: [Mesa-dev] [PATCH] egl/dri2: implement platform_null.

2015-01-23 Thread Emil Velikov
On 23/01/15 02:00, Haixia Shi wrote:
> Hi Emil,
> 
> On Thu, Jan 22, 2015 at 4:38 PM, Emil Velikov  
> wrote:
>> On 22/01/15 22:23, Haixia Shi wrote:
>>> Hi Emil
>>>
>>> On Thu, Jan 22, 2015 at 1:36 PM, Emil Velikov  
>>> wrote:
>>>> Hi Haixia Shi,
>>>>
>>>> On 22/01/15 17:35, Haixia Shi wrote:
>>>>> Try the render node first and use it if available. Otherwise fall back to
>>>>> normal nodes.
>>>>>
>>>> What is the use-case for such a platform - I assume it's worth
>>>> mentioning in the commit message ?
>>>>
>>>> No other platform picks the device at random as seen below. Why did you
>>>> choose such an approach ? It seems like one can easily shoot themselves
>>>> by using it.
>>>
>>> CC Stephane. The goal here is just to pick the first available node
>>> for off-screen rendering only.
>>>
>> Hmm I'm guessing that using the drm/gbm platform is out of the question
>> ? Iirc there has been a bit of love on the gbm topic, and afaiu this
>> solution is to be used with minigbm ?
> 
> Yes, this solution is to be used with minigbm.
> 
>>
>> What I'm thinking here is:
>> If you're testing a device which provides two or more nodes (be that the
>> classic card or the render ones), one cannot guarantee that the kernel
>> module for hw#1 will be loaded first. Thus even if one presumes that
>> they are working on (testing) hw#1 that may or may not be the case.
>>
>> Not 100% sure on the module order part, so I could be wrong.
> 
> I don't have a good answer for that... any suggestion on how best to
> pick the right one?
> 
Might be worth having a look at how platform_drm does it. But be warned,
there be dragons :)

>>>>> + char *card_path;
>>>>> + if (asprintf(&card_path, node_path_fmt, base + i) < 0)
>>>>> +continue;
>>>>> +
>>>>> + dri2_dpy->fd = open(card_path, O_RDWR);
>>>> If you open a normal node (card%d) I believe that you'll need an
>>>> authenticate hook in dri2_egl_display_vtbl. Does things work without it
>>>> on your system/platform ?
>>>
>>> You're correct; normal node would require the legacy auth hook, and it
>>> would only work without auth if the process is run as root, which is
>>> why we're trying render nodes first.
>>>
>> So you're saying that people without render nodes should run egl(mesa)
>> as root ? That does not sound like a wise suggestion imho.
>>
>> Basically what I'm trying to say is - if you have a fall-back to normal
>> nodes, some form of auth ought to be in place.
> 
> I see your point. Would it be cleaner if we simply require render node
> to be present? The normal node (card%d) and the auth hook is more
> trouble than it's worth.
> 
It's up-to you if you want to keep it.
I'm just pointing out that having a fall-back that (a) mostly fails,
without giving a clear indication as to why, or (b) forces you to run
the app as root is counter-intuitive (not the best security practise)
for most people.
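
As a sketch of the render-node-only variant discussed above: the helper below walks a range of minors and returns the first node that opens, and the wrapper reports failure explicitly instead of falling back to card%d. The function names and the count of 16 minors are assumptions made for this example; render node minors starting at 128 is the usual DRM numbering.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>

/* Try count consecutive minors starting at base and return the first fd
 * that opens, or -1 if none does. fmt must contain exactly one %d. */
int
open_first_node(const char *fmt, int base, int count)
{
   char path[64];
   int i;

   assert(fmt != NULL);

   for (i = 0; i < count; i++) {
      int n = snprintf(path, sizeof(path), fmt, base + i);
      int fd;

      /* Skip minors whose path would not fit in the buffer. */
      if (n < 0 || (size_t) n >= sizeof(path))
         continue;

      fd = open(path, O_RDWR);
      if (fd >= 0)
         return fd;
   }
   return -1;
}

/* Render nodes only: no fall-back to card%d, so no auth hook is needed,
 * and a failure is reported rather than silently requiring root. */
int
open_render_node(void)
{
   int fd = open_first_node("/dev/dri/renderD%d", 128, 16);

   if (fd < 0)
      fprintf(stderr, "no usable DRM render node found\n");
   return fd;
}
```

This still picks whichever node opens first, so it does not address the module load order concern raised earlier in the thread.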


-Emil


Re: [Mesa-dev] [PATCH v2 2/5] nir: use Python to autogenerate opcode information

2015-01-23 Thread Jason Ekstrand
On Thu, Jan 22, 2015 at 8:32 PM, Connor Abbott  wrote:

> Before, we used a system where a file, nir_opcodes.h, defined some macros
> that
> were included to generate the enum values and the nir_op_infos structure.
> This
> worked pretty well, but for development the error messages were never very
> useful, Python tools couldn't understand the opcode list, and it was
> difficult
> to use nir_opcodes.h to do other things like autogenerate a builder API.
> Now, we
> store opcode information in nir_opcodes.py, and we have nir_opcodes_c.py to
> generate the old nir_opcodes.c and nir_opcodes_h.py to generate
> nir_opcodes.h,
> which contains all the enum names and gets included into nir.h like
> before.  In
> addition to solving the above problems, using Python and Mako to generate
> everything means that it's much easier to keep information centralized
> as we
> add new things like constant propagation that require per-opcode
> information.
>
> v2:
> - make Opcode derive from object (Dylan)
> - don't use assert like it's a function (Dylan)
> - style fixes for fnoise, use xrange (Dylan)
> - use iterkeys() in nir_opcodes_h.py (Dylan)
> - use pydoc-style comments (Jason)
> - don't make fmin/fmax commutative and associative yet (Jason)
>
> Signed-off-by: Connor Abbott 
> ---
>  src/glsl/Makefile.am  |  15 +-
>  src/glsl/Makefile.sources |   6 +-
>  src/glsl/nir/.gitignore   |   2 +
>  src/glsl/nir/nir.h|   9 -
>  src/glsl/nir/nir_opcodes.c|  46 -
>  src/glsl/nir/nir_opcodes.h| 366
> 
>  src/glsl/nir/nir_opcodes.py   | 383
> ++
>  src/glsl/nir/nir_opcodes_c.py |  56 ++
>  src/glsl/nir/nir_opcodes_h.py |  39 +
>  9 files changed, 497 insertions(+), 425 deletions(-)
>  delete mode 100644 src/glsl/nir/nir_opcodes.c
>  delete mode 100644 src/glsl/nir/nir_opcodes.h
>  create mode 100644 src/glsl/nir/nir_opcodes.py
>  create mode 100644 src/glsl/nir/nir_opcodes_c.py
>  create mode 100644 src/glsl/nir/nir_opcodes_h.py
>
> diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
> index 9d9f99a..8474b70 100644
> --- a/src/glsl/Makefile.am
> +++ b/src/glsl/Makefile.am
> @@ -27,6 +27,7 @@ AM_CPPFLAGS = \
> -I$(top_srcdir)/src/glsl/glcpp \
> -I$(top_srcdir)/src/glsl/nir \
> -I$(top_srcdir)/src/gtest/include \
> +   -I$(top_builddir)/src/glsl/nir \
> $(DEFINES)
>  AM_CFLAGS = $(VISIBILITY_CFLAGS)
>  AM_CXXFLAGS = $(VISIBILITY_CXXFLAGS)
> @@ -216,7 +217,9 @@ BUILT_SOURCES =
>  \
> glsl_lexer.cpp  \
> glcpp/glcpp-parse.c \
> glcpp/glcpp-lex.c   \
> -   nir/nir_opt_algebraic.c
> +   nir/nir_opt_algebraic.c \
> +   nir/nir_opcodes.h   \
> +   nir/nir_opcodes.c
>

Alphabetize! (said in my best Matt Turner impression)


>  CLEANFILES =   \
> glcpp/glcpp-parse.h \
> glsl_parser.h   \
> @@ -232,3 +235,13 @@ dist-hook:
>  nir/nir_opt_algebraic.c: nir/nir_opt_algebraic.py nir/nir_algebraic.py
> $(MKDIR_P) nir; \
> $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opt_algebraic.py > $@
> +
> +nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py
> +   $(MKDIR_P) nir; \
> +   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opcodes_h.py > $@
> +
> +nir/nir_opcodes.c: nir/nir_opcodes.py nir/nir_opcodes_c.py
> +   $(MKDIR_P) nir; \
> +   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opcodes_c.py > $@
> +
> +nir/nir.h: nir/nir_opcodes.h
> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index 6237627..56299eb 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -14,7 +14,9 @@ LIBGLCPP_GENERATED_FILES = \
> $(GLSL_BUILDDIR)/glcpp/glcpp-parse.c
>
>  NIR_GENERATED_FILES = \
> -   $(GLSL_BUILDDIR)/nir/nir_opt_algebraic.c
> +   $(GLSL_BUILDDIR)/nir/nir_opt_algebraic.c \
> +   $(GLSL_BUILDDIR)/nir/nir_opcodes.h \
> +   $(GLSL_BUILDDIR)/nir/nir_opcodes.c
>

here too


>
>  NIR_FILES = \
> $(GLSL_SRCDIR)/nir/nir.c \
> @@ -35,8 +37,6 @@ NIR_FILES = \
> $(GLSL_SRCDIR)/nir/nir_lower_var_copies.c \
> $(GLSL_SRCDIR)/nir/nir_lower_vec_to_movs.c \
> $(GLSL_SRCDIR)/nir/nir_metadata.c \
> -   $(GLSL_SRCDIR)/nir/nir_opcodes.c \
> -   $(GLSL_SRCDIR)/nir/nir_opcodes.h \
> $(GLSL_SRCDIR)/nir/nir_opt_constant_folding.c \
> $(GLSL_SRCDIR)/nir/nir_opt_copy_propagate.c \
> $(GLSL_SRCDIR)/nir/nir_opt_cse.c \
> diff --git a/src/glsl/nir/.gitignore b/src/glsl/nir/.gitignore
> index 6d954fe..4c28193 100644
> --
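
For readers unfamiliar with the scheme that the commit message describes as the old approach, the sketch below shows the general X-macro technique in C: one opcode list, expanded once into an enum and once into an info table. It is an illustration only. OPCODE_LIST, op_info, op_from_name() and the three opcodes with their fields are made up here and do not reproduce the real nir_opcodes.h, which was a separate header included with different macro definitions rather than a single list macro.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Single source of truth: name, number of inputs, is_commutative. */
#define OPCODE_LIST(OP) \
   OP(fadd, 2, 1)       \
   OP(fsub, 2, 0)       \
   OP(fneg, 1, 0)

/* First expansion: the enum values. */
#define OP_ENUM(name, num_inputs, commutative) op_##name,
enum op {
   OPCODE_LIST(OP_ENUM)
   num_opcodes
};
#undef OP_ENUM

struct op_info {
   const char *name;
   unsigned num_inputs;
   int commutative;
};

/* Second expansion: the per-opcode info table, indexed by enum value. */
#define OP_INFO(name, num_inputs, commutative) \
   { #name, num_inputs, commutative },
const struct op_info op_infos[num_opcodes] = {
   OPCODE_LIST(OP_INFO)
};
#undef OP_INFO

/* Look an opcode up by name; returns num_opcodes if it is unknown. */
enum op
op_from_name(const char *name)
{
   size_t i;

   assert(name != NULL);

   for (i = 0; i < num_opcodes; i++) {
      if (strcmp(op_infos[i].name, name) == 0)
         return (enum op) i;
   }
   return num_opcodes;
}
```

The list is only visible to the C preprocessor, which is consistent with the complaint that Python tools could not read it; moving the list into Python inverts that relationship.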

[Mesa-dev] [PATCH] clover: CL 1.2 add GetKernelArgInfo

2015-01-23 Thread EdB
---
 depends on clLinkProgram series

 src/gallium/state_trackers/clover/api/dispatch.cpp |  2 +-
 src/gallium/state_trackers/clover/api/dispatch.hpp |  8 +-
 src/gallium/state_trackers/clover/api/kernel.cpp   | 51 
 src/gallium/state_trackers/clover/core/kernel.cpp  |  6 ++
 src/gallium/state_trackers/clover/core/kernel.hpp  |  1 +
 src/gallium/state_trackers/clover/core/module.hpp  | 17 +++-
 .../state_trackers/clover/llvm/invocation.cpp  | 95 +-
 7 files changed, 175 insertions(+), 5 deletions(-)

diff --git a/src/gallium/state_trackers/clover/api/dispatch.cpp 
b/src/gallium/state_trackers/clover/api/dispatch.cpp
index 44bff4f..c0388ec 100644
--- a/src/gallium/state_trackers/clover/api/dispatch.cpp
+++ b/src/gallium/state_trackers/clover/api/dispatch.cpp
@@ -125,7 +125,7 @@ namespace clover {
   clCompileProgram,
   clLinkProgram,
   clUnloadPlatformCompiler,
-  NULL, // clGetKernelArgInfo
+  clGetKernelArgInfo,
   NULL, // clEnqueueFillBuffer
   NULL, // clEnqueueFillImage
   NULL, // clEnqueueMigrateMemObjects
diff --git a/src/gallium/state_trackers/clover/api/dispatch.hpp 
b/src/gallium/state_trackers/clover/api/dispatch.hpp
index ffae1ae..ffe8556 100644
--- a/src/gallium/state_trackers/clover/api/dispatch.hpp
+++ b/src/gallium/state_trackers/clover/api/dispatch.hpp
@@ -693,7 +693,13 @@ struct _cl_icd_dispatch {
CL_API_ENTRY cl_int (CL_API_CALL *clUnloadPlatformCompiler)(
   cl_platform_id platform);
 
-   void *clGetKernelArgInfo;
+   CL_API_ENTRY cl_int (CL_API_CALL *clGetKernelArgInfo)(
+  cl_kernel kernel,
+  cl_uint arg_indx,
+  cl_kernel_arg_info  param_name,
+  size_t param_value_size,
+  void * param_value,
+  size_t * param_value_size_ret);
 
CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueFillBuffer)(
   cl_command_queue command_queue,
diff --git a/src/gallium/state_trackers/clover/api/kernel.cpp 
b/src/gallium/state_trackers/clover/api/kernel.cpp
index 4fe1756..7f4ae9d 100644
--- a/src/gallium/state_trackers/clover/api/kernel.cpp
+++ b/src/gallium/state_trackers/clover/api/kernel.cpp
@@ -148,6 +148,57 @@ clGetKernelInfo(cl_kernel d_kern, cl_kernel_info param,
 }
 
 CLOVER_API cl_int
+clGetKernelArgInfo(cl_kernel d_kern,
+   cl_uint idx, cl_kernel_arg_info param,
+   size_t size, void *r_buf, size_t *r_size) try {
+   property_buffer buf { r_buf, size, r_size };
+   const auto &kern = obj(d_kern);
+   const auto args_info = kern.args_info();
+
+   if (args_info.size() == 0)
+  throw error(CL_KERNEL_ARG_INFO_NOT_AVAILABLE);
+
+   if (idx >= args_info.size())
+  throw error(CL_INVALID_ARG_INDEX);
+
+   const auto &info = args_info[idx];
+
+   switch (param) {
+   case CL_KERNEL_ARG_ADDRESS_QUALIFIER:
+  buf.as_scalar() =
+  info.address_qualifier;
+  break;
+
+   case CL_KERNEL_ARG_ACCESS_QUALIFIER:
+  buf.as_scalar() =
+  info.access_qualifier;
+  break;
+
+   case CL_KERNEL_ARG_TYPE_NAME:
+  buf.as_string() =
+  std::string(info.type_name.begin(), info.type_name.size());
+  break;
+
+   case CL_KERNEL_ARG_TYPE_QUALIFIER:
+  buf.as_scalar() = info.type_qualifier;
+  break;
+
+   case CL_KERNEL_ARG_NAME:
+  buf.as_string() =
+  std::string(info.arg_name.begin(), info.arg_name.size());
+  break;
+
+   default:
+  throw error(CL_INVALID_VALUE);
+   }
+
+   return CL_SUCCESS;
+
+} catch (error &e) {
+   return e.get();
+}
+
+CLOVER_API cl_int
 clGetKernelWorkGroupInfo(cl_kernel d_kern, cl_device_id d_dev,
  cl_kernel_work_group_info param,
  size_t size, void *r_buf, size_t *r_size) try {
diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp 
b/src/gallium/state_trackers/clover/core/kernel.cpp
index ed4b9b0..a212cc1 100644
--- a/src/gallium/state_trackers/clover/core/kernel.cpp
+++ b/src/gallium/state_trackers/clover/core/kernel.cpp
@@ -134,6 +134,12 @@ kernel::args() const {
return map(derefs(), _args);
 }
 
+compat::vector
+kernel::args_info() const {
+   auto &syms = program().symbols();
+   return find(name_equals(_name), syms).args_info;
+}
+
 const module &
 kernel::module(const command_queue &q) const {
return program().binary(q.device());
diff --git a/src/gallium/state_trackers/clover/core/kernel.hpp 
b/src/gallium/state_trackers/clover/core/kernel.hpp
index bf5998d..5ae4690 100644
--- a/src/gallium/state_trackers/clover/core/kernel.hpp
+++ b/src/gallium/state_trackers/clover/core/kernel.hpp
@@ -134,6 +134,7 @@ namespace clover {
 
   argument_range args();
   const_argument_range args() const;
+  compat::vector args_info() const;
 
   const intrusive_ref program;
 
diff --git a/src/gallium/state_trackers/clover/core/module.hpp 
b/src/gallium/state_trackers/clover/core/module.hpp
index 200b9d

Re: [Mesa-dev] [PATCH v2 2/5] nir: use Python to autogenerate opcode information

2015-01-23 Thread Connor Abbott
On Fri, Jan 23, 2015 at 1:07 PM, Jason Ekstrand  wrote:
>
>
> On Thu, Jan 22, 2015 at 8:32 PM, Connor Abbott  wrote:
>>
>> Before, we used a system where a file, nir_opcodes.h, defined some macros
>> that
>> were included to generate the enum values and the nir_op_infos structure.
>> This
>> worked pretty well, but for development the error messages were never very
>> useful, Python tools couldn't understand the opcode list, and it was
>> difficult
>> to use nir_opcodes.h to do other things like autogenerate a builder API.
>> Now, we
>> store opcode information in nir_opcodes.py, and we have nir_opcodes_c.py
>> to
>> generate the old nir_opcodes.c and nir_opcodes_h.py to generate
>> nir_opcodes.h,
>> which contains all the enum names and gets included into nir.h like
>> before.  In
>> addition to solving the above problems, using Python and Mako to generate
>> everything means that it's much easier to keep information centralized
>> as we
>> add new things like constant propagation that require per-opcode
>> information.
>>
>> v2:
>> - make Opcode derive from object (Dylan)
>> - don't use assert like it's a function (Dylan)
>> - style fixes for fnoise, use xrange (Dylan)
>> - use iterkeys() in nir_opcodes_h.py (Dylan)
>> - use pydoc-style comments (Jason)
>> - don't make fmin/fmax commutative and associative yet (Jason)
>>
>> Signed-off-by: Connor Abbott 
>> ---
>>  src/glsl/Makefile.am  |  15 +-
>>  src/glsl/Makefile.sources |   6 +-
>>  src/glsl/nir/.gitignore   |   2 +
>>  src/glsl/nir/nir.h|   9 -
>>  src/glsl/nir/nir_opcodes.c|  46 -
>>  src/glsl/nir/nir_opcodes.h| 366
>> 
>>  src/glsl/nir/nir_opcodes.py   | 383
>> ++
>>  src/glsl/nir/nir_opcodes_c.py |  56 ++
>>  src/glsl/nir/nir_opcodes_h.py |  39 +
>>  9 files changed, 497 insertions(+), 425 deletions(-)
>>  delete mode 100644 src/glsl/nir/nir_opcodes.c
>>  delete mode 100644 src/glsl/nir/nir_opcodes.h
>>  create mode 100644 src/glsl/nir/nir_opcodes.py
>>  create mode 100644 src/glsl/nir/nir_opcodes_c.py
>>  create mode 100644 src/glsl/nir/nir_opcodes_h.py
>>
>> diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
>> index 9d9f99a..8474b70 100644
>> --- a/src/glsl/Makefile.am
>> +++ b/src/glsl/Makefile.am
>> @@ -27,6 +27,7 @@ AM_CPPFLAGS = \
>> -I$(top_srcdir)/src/glsl/glcpp \
>> -I$(top_srcdir)/src/glsl/nir \
>> -I$(top_srcdir)/src/gtest/include \
>> +   -I$(top_builddir)/src/glsl/nir \
>> $(DEFINES)
>>  AM_CFLAGS = $(VISIBILITY_CFLAGS)
>>  AM_CXXFLAGS = $(VISIBILITY_CXXFLAGS)
>> @@ -216,7 +217,9 @@ BUILT_SOURCES =
>> \
>> glsl_lexer.cpp  \
>> glcpp/glcpp-parse.c \
>> glcpp/glcpp-lex.c   \
>> -   nir/nir_opt_algebraic.c
>> +   nir/nir_opt_algebraic.c \
>> +   nir/nir_opcodes.h   \
>> +   nir/nir_opcodes.c
>
>
> Alphabetize! (said in my best Matt Turner impression)
>
>>
>>  CLEANFILES =   \
>> glcpp/glcpp-parse.h \
>> glsl_parser.h   \
>> @@ -232,3 +235,13 @@ dist-hook:
>>  nir/nir_opt_algebraic.c: nir/nir_opt_algebraic.py nir/nir_algebraic.py
>> $(MKDIR_P) nir; \
>> $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opt_algebraic.py > $@
>> +
>> +nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py
>> +   $(MKDIR_P) nir; \
>> +   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opcodes_h.py > $@
>> +
>> +nir/nir_opcodes.c: nir/nir_opcodes.py nir/nir_opcodes_c.py
>> +   $(MKDIR_P) nir; \
>> +   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opcodes_c.py > $@
>> +
>> +nir/nir.h: nir/nir_opcodes.h
>> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
>> index 6237627..56299eb 100644
>> --- a/src/glsl/Makefile.sources
>> +++ b/src/glsl/Makefile.sources
>> @@ -14,7 +14,9 @@ LIBGLCPP_GENERATED_FILES = \
>> $(GLSL_BUILDDIR)/glcpp/glcpp-parse.c
>>
>>  NIR_GENERATED_FILES = \
>> -   $(GLSL_BUILDDIR)/nir/nir_opt_algebraic.c
>> +   $(GLSL_BUILDDIR)/nir/nir_opt_algebraic.c \
>> +   $(GLSL_BUILDDIR)/nir/nir_opcodes.h \
>> +   $(GLSL_BUILDDIR)/nir/nir_opcodes.c
>
>
> here too
>
>>
>>
>>  NIR_FILES = \
>> $(GLSL_SRCDIR)/nir/nir.c \
>> @@ -35,8 +37,6 @@ NIR_FILES = \
>> $(GLSL_SRCDIR)/nir/nir_lower_var_copies.c \
>> $(GLSL_SRCDIR)/nir/nir_lower_vec_to_movs.c \
>> $(GLSL_SRCDIR)/nir/nir_metadata.c \
>> -   $(GLSL_SRCDIR)/nir/nir_opcodes.c \
>> -   $(GLSL_SRCDIR)/nir/nir_opcodes.h \
>> $(GLSL_SRCDIR)/nir/nir_opt_constant_folding.c \
>> $(GLSL_SR

[Mesa-dev] [PATCH] egl/dri2: implement platform_null.

2015-01-23 Thread Haixia Shi
The NULL platform is for off-screen rendering only. Render node support is
required.

Signed-off-by: Haixia Shi 
---
 src/egl/drivers/dri2/Makefile.am |   5 ++
 src/egl/drivers/dri2/egl_dri2.c  |  13 ++-
 src/egl/drivers/dri2/egl_dri2.h  |   3 +
 src/egl/drivers/dri2/platform_null.c | 169 +++
 4 files changed, 187 insertions(+), 3 deletions(-)
 create mode 100644 src/egl/drivers/dri2/platform_null.c

diff --git a/src/egl/drivers/dri2/Makefile.am b/src/egl/drivers/dri2/Makefile.am
index 79a40e8..14b2d60 100644
--- a/src/egl/drivers/dri2/Makefile.am
+++ b/src/egl/drivers/dri2/Makefile.am
@@ -64,3 +64,8 @@ if HAVE_EGL_PLATFORM_DRM
 libegl_dri2_la_SOURCES += platform_drm.c
 AM_CFLAGS += -DHAVE_DRM_PLATFORM
 endif
+
+if HAVE_EGL_PLATFORM_NULL
+libegl_dri2_la_SOURCES += platform_null.c
+AM_CFLAGS += -DHAVE_NULL_PLATFORM
+endif
diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 86e5f24..6ed137e 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -534,7 +534,7 @@ dri2_setup_screen(_EGLDisplay *disp)
  disp->Extensions.KHR_gl_texture_2D_image = EGL_TRUE;
  disp->Extensions.KHR_gl_texture_cubemap_image = EGL_TRUE;
   }
-#ifdef HAVE_DRM_PLATFORM
+#if defined(HAVE_DRM_PLATFORM) || defined(HAVE_NULL_PLATFORM)
   if (dri2_dpy->image->base.version >= 8 &&
   dri2_dpy->image->createImageFromDmaBufs) {
  disp->Extensions.EXT_image_dma_buf_import = EGL_TRUE;
@@ -632,6 +632,13 @@ dri2_initialize(_EGLDriver *drv, _EGLDisplay *disp)
   return EGL_FALSE;
 
switch (disp->Platform) {
+#ifdef HAVE_NULL_PLATFORM
+   case _EGL_PLATFORM_NULL:
+  if (disp->Options.TestOnly)
+ return EGL_TRUE;
+  return dri2_initialize_null(drv, disp);
+#endif
+
 #ifdef HAVE_X11_PLATFORM
case _EGL_PLATFORM_X11:
   if (disp->Options.TestOnly)
@@ -1571,7 +1578,7 @@ dri2_create_wayland_buffer_from_image(_EGLDriver *drv, 
_EGLDisplay *dpy,
return dri2_dpy->vtbl->create_wayland_buffer_from_image(drv, dpy, img);
 }
 
-#ifdef HAVE_DRM_PLATFORM
+#if defined(HAVE_DRM_PLATFORM) || defined(HAVE_NULL_PLATFORM)
 static EGLBoolean
 dri2_check_dma_buf_attribs(const _EGLImageAttribs *attrs)
 {
@@ -1829,7 +1836,7 @@ dri2_create_image_khr(_EGLDriver *drv, _EGLDisplay *disp,
case EGL_WAYLAND_BUFFER_WL:
   return dri2_create_image_wayland_wl_buffer(disp, ctx, buffer, attr_list);
 #endif
-#ifdef HAVE_DRM_PLATFORM
+#if defined(HAVE_DRM_PLATFORM) || defined(HAVE_NULL_PLATFORM)
case EGL_LINUX_DMA_BUF_EXT:
   return dri2_create_image_dma_buf(disp, ctx, buffer, attr_list);
 #endif
diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index 9efe1f7..e206424 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -332,6 +332,9 @@ dri2_initialize_wayland(_EGLDriver *drv, _EGLDisplay *disp);
 EGLBoolean
 dri2_initialize_android(_EGLDriver *drv, _EGLDisplay *disp);
 
+EGLBoolean
+dri2_initialize_null(_EGLDriver *drv, _EGLDisplay *disp);
+
 void
 dri2_flush_drawable_for_swapbuffers(_EGLDisplay *disp, _EGLSurface *draw);
 
diff --git a/src/egl/drivers/dri2/platform_null.c 
b/src/egl/drivers/dri2/platform_null.c
new file mode 100644
index 000..f537e37
--- /dev/null
+++ b/src/egl/drivers/dri2/platform_null.c
@@ -0,0 +1,169 @@
+/*
+ * Mesa 3-D graphics library
+ *
+ * Copyright (c) 2014 The Chromium OS Authors.
+ * Copyright © 2011 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "egl_dri2.h"
+#include "egl_dri2_fallbacks.h"
+#include "loader.h"
+
+static struct dri2_egl_display_vtbl dri2_null_display_vtbl = {
+   .create_pixmap_surface = dri2_fallback_create_pixmap_surface,
+   .create_image = dri2_create_image_khr,
+   .swap_interval = dri2_fallback_swap_interval,
+   .

[Mesa-dev] [PATCH] egl/dri2: implement platform_null.

2015-01-23 Thread Haixia Shi
The NULL platform is for off-screen rendering only. Render node support is
required.

Signed-off-by: Haixia Shi 
---
 src/egl/drivers/dri2/Makefile.am |   5 ++
 src/egl/drivers/dri2/egl_dri2.c  |  13 ++-
 src/egl/drivers/dri2/egl_dri2.h  |   3 +
 src/egl/drivers/dri2/platform_null.c | 169 +++
 4 files changed, 187 insertions(+), 3 deletions(-)
 create mode 100644 src/egl/drivers/dri2/platform_null.c

diff --git a/src/egl/drivers/dri2/Makefile.am b/src/egl/drivers/dri2/Makefile.am
index 79a40e8..14b2d60 100644
--- a/src/egl/drivers/dri2/Makefile.am
+++ b/src/egl/drivers/dri2/Makefile.am
@@ -64,3 +64,8 @@ if HAVE_EGL_PLATFORM_DRM
 libegl_dri2_la_SOURCES += platform_drm.c
 AM_CFLAGS += -DHAVE_DRM_PLATFORM
 endif
+
+if HAVE_EGL_PLATFORM_NULL
+libegl_dri2_la_SOURCES += platform_null.c
+AM_CFLAGS += -DHAVE_NULL_PLATFORM
+endif
diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 86e5f24..6ed137e 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -534,7 +534,7 @@ dri2_setup_screen(_EGLDisplay *disp)
  disp->Extensions.KHR_gl_texture_2D_image = EGL_TRUE;
  disp->Extensions.KHR_gl_texture_cubemap_image = EGL_TRUE;
   }
-#ifdef HAVE_DRM_PLATFORM
+#if defined(HAVE_DRM_PLATFORM) || defined(HAVE_NULL_PLATFORM)
   if (dri2_dpy->image->base.version >= 8 &&
   dri2_dpy->image->createImageFromDmaBufs) {
  disp->Extensions.EXT_image_dma_buf_import = EGL_TRUE;
@@ -632,6 +632,13 @@ dri2_initialize(_EGLDriver *drv, _EGLDisplay *disp)
   return EGL_FALSE;
 
switch (disp->Platform) {
+#ifdef HAVE_NULL_PLATFORM
+   case _EGL_PLATFORM_NULL:
+  if (disp->Options.TestOnly)
+ return EGL_TRUE;
+  return dri2_initialize_null(drv, disp);
+#endif
+
 #ifdef HAVE_X11_PLATFORM
case _EGL_PLATFORM_X11:
   if (disp->Options.TestOnly)
@@ -1571,7 +1578,7 @@ dri2_create_wayland_buffer_from_image(_EGLDriver *drv, 
_EGLDisplay *dpy,
return dri2_dpy->vtbl->create_wayland_buffer_from_image(drv, dpy, img);
 }
 
-#ifdef HAVE_DRM_PLATFORM
+#if defined(HAVE_DRM_PLATFORM) || defined(HAVE_NULL_PLATFORM)
 static EGLBoolean
 dri2_check_dma_buf_attribs(const _EGLImageAttribs *attrs)
 {
@@ -1829,7 +1836,7 @@ dri2_create_image_khr(_EGLDriver *drv, _EGLDisplay *disp,
case EGL_WAYLAND_BUFFER_WL:
   return dri2_create_image_wayland_wl_buffer(disp, ctx, buffer, attr_list);
 #endif
-#ifdef HAVE_DRM_PLATFORM
+#if defined(HAVE_DRM_PLATFORM) || defined(HAVE_NULL_PLATFORM)
case EGL_LINUX_DMA_BUF_EXT:
   return dri2_create_image_dma_buf(disp, ctx, buffer, attr_list);
 #endif
diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index 9efe1f7..e206424 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -332,6 +332,9 @@ dri2_initialize_wayland(_EGLDriver *drv, _EGLDisplay *disp);
 EGLBoolean
 dri2_initialize_android(_EGLDriver *drv, _EGLDisplay *disp);
 
+EGLBoolean
+dri2_initialize_null(_EGLDriver *drv, _EGLDisplay *disp);
+
 void
 dri2_flush_drawable_for_swapbuffers(_EGLDisplay *disp, _EGLSurface *draw);
 
diff --git a/src/egl/drivers/dri2/platform_null.c 
b/src/egl/drivers/dri2/platform_null.c
new file mode 100644
index 000..55ceab6
--- /dev/null
+++ b/src/egl/drivers/dri2/platform_null.c
@@ -0,0 +1,169 @@
+/*
+ * Mesa 3-D graphics library
+ *
+ * Copyright (c) 2014 The Chromium OS Authors.
+ * Copyright © 2011 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "egl_dri2.h"
+#include "egl_dri2_fallbacks.h"
+#include "loader.h"
+
+static struct dri2_egl_display_vtbl dri2_null_display_vtbl = {
+   .create_pixmap_surface = dri2_fallback_create_pixmap_surface,
+   .create_image = dri2_create_image_khr,
+   .swap_interval = dri2_fallback_swap_interval,
+   .

Re: [Mesa-dev] [PATCH] egl/dri2: implement platform_null.

2015-01-23 Thread Haixia Shi
Hi Emil,

On Fri, Jan 23, 2015 at 8:42 AM, Emil Velikov  wrote:
> Might be worth having a look at how platform_drm does it. But be warned
> there be dragons :)

It seems platform_drm would cast disp->PlatformDisplay to a gbm_device
and use it if available; otherwise it always uses the first normal
node (/dev/dri/card0).

Can it be assumed that if render nodes are available then it would
always be the first one (/dev/dri/renderD128)? Otherwise I still think
it is correct to run a for loop to try all the available render nodes
(renderD128..renderD191)
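
The probing loop described above could look roughly like the sketch below. It is not the code from the patch: the helper names render_node_path() and open_first_render_node() are made up for illustration, and the minor range is simply the renderD128..renderD191 range mentioned in this mail.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>

/* Render-node minors as quoted in this thread: renderD128..renderD191. */
#define RENDER_NODE_FIRST 128
#define RENDER_NODE_LAST  191

/* Hypothetical helper: format the device path for one render-node minor.
 * Returns 0 on success, -1 if the buffer is too small. */
static int
render_node_path(char *buf, size_t size, int minor)
{
   assert(buf != NULL);
   int n = snprintf(buf, size, "/dev/dri/renderD%d", minor);
   return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: try every render node in order and return an fd
 * for the first one that opens, or -1 if none could be opened. */
static int
open_first_render_node(void)
{
   char path[64];

   for (int minor = RENDER_NODE_FIRST; minor <= RENDER_NODE_LAST; minor++) {
      if (render_node_path(path, sizeof(path), minor) < 0)
         continue;

      int fd = open(path, O_RDWR);
      if (fd >= 0)
         return fd;
   }
   return -1;
}
```

Note that "first node that opens" still depends on the order in which the kernel drivers registered, which is the concern raised earlier in the thread.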

Thanks,
Haixia

On Fri, Jan 23, 2015 at 8:42 AM, Emil Velikov  wrote:
> On 23/01/15 02:00, Haixia Shi wrote:
>> Hi Emil,
>>
>> On Thu, Jan 22, 2015 at 4:38 PM, Emil Velikov  
>> wrote:
>>> On 22/01/15 22:23, Haixia Shi wrote:
>>>> Hi Emil
>>>>
>>>> On Thu, Jan 22, 2015 at 1:36 PM, Emil Velikov
>>>> wrote:
>>>>> Hi Haixia Shi,
>>>>>
>>>>> On 22/01/15 17:35, Haixia Shi wrote:
>>>>>> Try the render node first and use it if available. Otherwise fall back to
>>>>>> normal nodes.
>>>>>>
>>>>> What is the use-case for such a platform - I assume it's worth
>>>>> mentioning in the commit message ?
>>>>>
>>>>> No other platform picks the device at random as seen below. Why did you
>>>>> choose such an approach ? It seems like one can easily shoot themselves
>>>>> by using it.
>>>>
>>>> CC Stephane. The goal here is just to pick the first available node
>>>> for off-screen rendering only.
>>>>
>>> Hmm I'm guessing that using the drm/gbm platform is out of the question
>>> ? Iirc there has been a bit of love on the gbm topic, and afaiu this
>>> solution is to be used with minigbm ?
>>
>> Yes, this solution is to be used with minigbm.
>>
>>>
>>> What I'm thinking here is:
>>> If you're testing a device which provides two or more nodes (be that the
>>> classic card or the render ones), one cannot guarantee that the kernel
>>> module for hw#1 will be loaded first. Thus even if one presumes that
>>> they are working on (testing) hw#1 that may or may not be the case.
>>>
>>> Not 100% sure on the module order part, so I could be wrong.
>>
>> I don't have a good answer for that... any suggestion on how best to
>> pick the right one?
>>
> Might be worth having a look at how platform_drm does it. But be warned
> there be dragons :)
>
>>>>>> + char *card_path;
>>>>>> + if (asprintf(&card_path, node_path_fmt, base + i) < 0)
>>>>>> +continue;
>>>>>> +
>>>>>> + dri2_dpy->fd = open(card_path, O_RDWR);
>>>>> If you open a normal node (card%d) I believe that you'll need an
>>>>> authenticate hook in dri2_egl_display_vtbl. Does things work without it
>>>>> on your system/platform ?
>>>>
>>>> You're correct; normal node would require the legacy auth hook, and it
>>>> would only work without auth if the process is run as root, which is
>>>> why we're trying render nodes first.
>>>>
>>> So you're saying that people without render nodes should run egl(mesa)
>>> as root ? That does not sound like a wise suggestion imho.
>>>
>>> Basically what I'm trying to say is - if you have a fall-back to normal
>>> nodes, some form of auth ought to be in place.
>>
>> I see your point. Would it be cleaner if we simply require render node
>> to be present? The normal node (card%d) and the auth hook are more
>> trouble than they're worth.
>>
> It's up-to you if you want to keep it.
> I'm just pointing out that having a fall-back that (a) mostly fails,
> without giving a clear indication as to why, or (b) forces you to run
> the app as root is counter-intuitive (not the best security practise)
> for most people.
>
>
> -Emil


Re: [Mesa-dev] [PATCH 07/16] i965: Add is_3src() to backend_instruction.

2015-01-23 Thread Ian Romanick
On 01/20/2015 12:16 AM, Kenneth Graunke wrote:
> On Monday, January 19, 2015 03:31:06 PM Matt Turner wrote:
>> ---
>>  src/mesa/drivers/dri/i965/brw_shader.cpp| 10 ++
>>  src/mesa/drivers/dri/i965/brw_shader.h  |  1 +
>>  src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp |  6 +-
>>  3 files changed, 12 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
>> b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> index cbdf976..c6fead7 100644
>> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> @@ -678,6 +678,16 @@ backend_reg::is_accumulator() const
>>  }
>>  
>>  bool
>> +backend_instruction::is_3src() const
>> +{
>> +   return opcode == BRW_OPCODE_LRP ||
>> +  opcode == BRW_OPCODE_MAD ||
>> +  opcode == BRW_OPCODE_BFE ||
>> +  opcode == BRW_OPCODE_BFI2 ||
>> +  opcode == BRW_OPCODE_CSEL;

Can this also replace is_3src() in brw_eu_compact.c?  FWIW, that
function was already doing basically what Ken suggests below... and you
wrote it!  (Commit 31eed95b)

> Pah, manual listings of things :)  Let's do even better:
> 
>return opcode < 128 && opcode_descs[op].nsrc == 3;

Shouldn't that be

   return opcode < ARRAY_SIZE(opcode_descs) && opcode_descs[opcode].nsrc == 3;

> That would get
> Reviewed-by: Kenneth Graunke 
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev






Re: [Mesa-dev] [PATCH 01/41] glapi: Added ARB_direct_state_access.xml file.

2015-01-23 Thread Jason Ekstrand
On Thu, Jan 22, 2015 at 9:27 AM, Emil Velikov 
wrote:

> On 05/01/15 17:45, Laura Ekstrand wrote:
> > This comment is vague.  Do you have a specific recommendation for the
> > code here?
> >
> Seems like I'm way too subtle - yes I have a few.
>
>
> 1. Add ARB_direct_state_access to struct gl_extension
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -3731,6 +3731,7 @@ struct gl_extensions
> GLboolean ARB_depth_clamp;
> GLboolean ARB_depth_texture;
> GLboolean ARB_derivative_control;
> +   GLboolean ARB_direct_state_access;
> GLboolean ARB_draw_buffers_blend;
> GLboolean ARB_draw_elements_base_vertex;
>
>
> 2. Use it in the extensions table.
> --- a/src/mesa/main/extensions.c
> +++ b/src/mesa/main/extensions.c
> @@ -103,6 +103,7 @@ static const struct extension extension_table[] = {
> { "GL_ARB_depth_clamp", o(ARB_depth_clamp),
> GL, 2003 },
> { "GL_ARB_depth_texture",
> o(ARB_depth_texture),   GLL,2001 },
> { "GL_ARB_derivative_control",
> o(ARB_derivative_control),  GL, 2014 },
> +   { "GL_ARB_direct_state_access",
> o(ARB_direct_state_access), GL, 2014 },
>
>
> 3. Make use of it when the spec amends existing behaviour - most of the
> spec text as of section "New Tokens" onwards. Clearly with this series
> you're adding the new entry points(functions) so it does not apply here :)
>
>
> if (foo->Extensions.ARB_direct_state_access) {
>  
> }
>
>
> Pretty much every extension that was added to mesa follows this approach
> so keeping up with traditions is always nice.
>

Yes, and no...  We have the table of booleans in gl_extensions so that we
can expose different extensions/behavior on different drivers.  However,
ARB_direct_state_access doesn't actually add new functionality, just new
ways of getting at old functionality.  We *should* be able to implement it
in a driver-agnostic way entirely within core mesa.  Therefore, there's no
reason to be able to shut it off on a per-driver basis and no reason for
the flag in gl_extensions.  If we find that, for some reason, we only want
to support it in core contexts or that it adds something some drivers can't
handle, then we'll need the flag.
--Jason
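
To make the trade-off concrete, here is a rough, self-contained sketch of an offset-based extension table. All names in it (example_extensions, example_table, always_on, example_has_extension) are hypothetical stand-ins and not Mesa's actual definitions; the point is only that a driver-agnostic extension can be tied to a flag that is always set, while per-driver extensions keep their own flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for a per-context extension flag struct. */
struct example_extensions {
   bool always_on;              /* assumed to be set unconditionally */
   bool ARB_depth_clamp;        /* set by drivers that support it */
   bool ARB_derivative_control; /* set by drivers that support it */
};

struct example_entry {
   const char *name;
   size_t offset;               /* offset of the controlling flag */
};

#define o(x) offsetof(struct example_extensions, x)

static const struct example_entry example_table[] = {
   { "GL_ARB_depth_clamp",         o(ARB_depth_clamp) },
   { "GL_ARB_derivative_control",  o(ARB_derivative_control) },
   /* No per-driver flag: implemented entirely in shared code. */
   { "GL_ARB_direct_state_access", o(always_on) },
};

/* Look an extension up by name and read its controlling flag. */
static bool
example_has_extension(const struct example_extensions *ext, const char *name)
{
   assert(ext != NULL && name != NULL);

   for (size_t i = 0; i < sizeof(example_table) / sizeof(example_table[0]); i++) {
      if (strcmp(example_table[i].name, name) == 0) {
         const bool *flag =
            (const bool *)((const char *)ext + example_table[i].offset);
         return *flag;
      }
   }
   return false;
}
```

If the extension later needs to be restricted per driver or per context type, only its table entry would have to point at a dedicated flag.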


>
> Cheers,
> Emil
>
> P.S. Pardon if my nitpicking came out a bit weird.
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH 1/2] r600g, radeonsi: Fix calculation of IR target cap string buffer size

2015-01-23 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Thu, Jan 22, 2015 at 4:41 AM, Michel Dänzer  wrote:
> From: Michel Dänzer 
>
> Fixes writing beyond the allocated buffer:
>
> ==31855== Invalid write of size 1
> ==31855==at 0x50AB2A9: vsprintf (iovsprintf.c:43)
> ==31855==by 0x508F6F6: sprintf (sprintf.c:32)
> ==31855==by 0xB59C7EC: r600_get_compute_param (r600_pipe_common.c:526)
> ==31855==by 0x5B2B7DE: get_compute_param (device.cpp:37)
> ==31855==by 0x5B2B7DE: clover::device::ir_target() const (device.cpp:201)
> ==31855==by 0x5B398E0: 
> clover::program::build(clover::ref_vector const&, char 
> const*, clover::compat::vector clover::compat::string> > const&) (program.cpp:63)
> ==31855==by 0x5B20152: clBuildProgram (program.cpp:182)
> ==31855==by 0x400F41: main (hello_world.c:109)
> ==31855==  Address 0x56fed5f is 0 bytes after a block of size 15 alloc'd
> ==31855==at 0x4C29180: operator new(unsigned long) 
> (vg_replace_malloc.c:324)
> ==31855==by 0x5B2B7C2: allocate (new_allocator.h:104)
> ==31855==by 0x5B2B7C2: allocate (alloc_traits.h:357)
> ==31855==by 0x5B2B7C2: _M_allocate (stl_vector.h:170)
> ==31855==by 0x5B2B7C2: _M_create_storage (stl_vector.h:185)
> ==31855==by 0x5B2B7C2: _Vector_base (stl_vector.h:136)
> ==31855==by 0x5B2B7C2: vector (stl_vector.h:278)
> ==31855==by 0x5B2B7C2: get_compute_param (device.cpp:35)
> ==31855==by 0x5B2B7C2: clover::device::ir_target() const (device.cpp:201)
> ==31855==by 0x5B398E0: 
> clover::program::build(clover::ref_vector const&, char 
> const*, clover::compat::vector clover::compat::string> > const&) (program.cpp:63)
> ==31855==by 0x5B20152: clBuildProgram (program.cpp:182)
> ==31855==by 0x400F41: main (hello_world.c:109)
>
> Signed-off-by: Michel Dänzer 
> ---
>  src/gallium/drivers/radeon/r600_pipe_common.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
> b/src/gallium/drivers/radeon/r600_pipe_common.c
> index f91772e..ddb4142 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> @@ -524,9 +524,9 @@ static int r600_get_compute_param(struct pipe_screen 
> *screen,
> }
> if (ret) {
> sprintf(ret, "%s-%s", gpu, triple);
> -
> }
> -   return (strlen(triple) + strlen(gpu)) * sizeof(char);
> +   /* +2 for dash and terminating NIL byte */
> +   return (strlen(triple) + strlen(gpu) + 2) * sizeof(char);
> }
> case PIPE_COMPUTE_CAP_GRID_DIMENSION:
> if (ret) {
> --
> 2.1.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
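
For reference, the size arithmetic behind the fix can be sketched in isolation as below. This is not the driver code: ir_target_size() and get_ir_target() are hypothetical names, and the explicit ret_size parameter is an addition for the sketch (the quoted function has no such argument).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Bytes needed to hold "<gpu>-<triple>": both strings, plus one byte for
 * the dash and one for the terminating NUL. */
static size_t
ir_target_size(const char *gpu, const char *triple)
{
   return strlen(gpu) + strlen(triple) + 2;
}

/* Two-step query in the style of the quoted function: with ret == NULL
 * only the required size is reported; otherwise the string is written too.
 * ret_size is an assumption of this sketch, used to catch short buffers. */
static size_t
get_ir_target(char *ret, size_t ret_size, const char *gpu, const char *triple)
{
   size_t needed = ir_target_size(gpu, triple);

   if (ret) {
      assert(ret_size >= needed);
      snprintf(ret, ret_size, "%s-%s", gpu, triple);
   }
   return needed;
}
```

A caller would first pass NULL to learn the size, allocate that many bytes, and call again to fill the buffer; without the +2 the second call writes past the allocation, as in the valgrind trace above.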


Re: [Mesa-dev] [PATCH 1/2] r600g, radeonsi: Fix calculation of IR target cap string buffer size

2015-01-23 Thread Tom Stellard
Reviewed-by: Tom Stellard 

On Fri, Jan 23, 2015 at 10:01:56PM +0100, Marek Olšák wrote:
> Reviewed-by: Marek Olšák 
> 
> Marek
> 
> On Thu, Jan 22, 2015 at 4:41 AM, Michel Dänzer  wrote:
> > From: Michel Dänzer 
> >
> > Fixes writing beyond the allocated buffer:
> >
> > ==31855== Invalid write of size 1
> > ==31855==at 0x50AB2A9: vsprintf (iovsprintf.c:43)
> > ==31855==by 0x508F6F6: sprintf (sprintf.c:32)
> > ==31855==by 0xB59C7EC: r600_get_compute_param (r600_pipe_common.c:526)
> > ==31855==by 0x5B2B7DE: get_compute_param (device.cpp:37)
> > ==31855==by 0x5B2B7DE: clover::device::ir_target() const 
> > (device.cpp:201)
> > ==31855==by 0x5B398E0: 
> > clover::program::build(clover::ref_vector const&, char 
> > const*, clover::compat::vector > clover::compat::string> > const&) (program.cpp:63)
> > ==31855==by 0x5B20152: clBuildProgram (program.cpp:182)
> > ==31855==by 0x400F41: main (hello_world.c:109)
> > ==31855==  Address 0x56fed5f is 0 bytes after a block of size 15 alloc'd
> > ==31855==at 0x4C29180: operator new(unsigned long) 
> > (vg_replace_malloc.c:324)
> > ==31855==by 0x5B2B7C2: allocate (new_allocator.h:104)
> > ==31855==by 0x5B2B7C2: allocate (alloc_traits.h:357)
> > ==31855==by 0x5B2B7C2: _M_allocate (stl_vector.h:170)
> > ==31855==by 0x5B2B7C2: _M_create_storage (stl_vector.h:185)
> > ==31855==by 0x5B2B7C2: _Vector_base (stl_vector.h:136)
> > ==31855==by 0x5B2B7C2: vector (stl_vector.h:278)
> > ==31855==by 0x5B2B7C2: get_compute_param (device.cpp:35)
> > ==31855==by 0x5B2B7C2: clover::device::ir_target() const 
> > (device.cpp:201)
> > ==31855==by 0x5B398E0: 
> > clover::program::build(clover::ref_vector const&, char 
> > const*, clover::compat::vector > clover::compat::string> > const&) (program.cpp:63)
> > ==31855==by 0x5B20152: clBuildProgram (program.cpp:182)
> > ==31855==by 0x400F41: main (hello_world.c:109)
> >
> > Signed-off-by: Michel Dänzer 
> > ---
> >  src/gallium/drivers/radeon/r600_pipe_common.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
> > b/src/gallium/drivers/radeon/r600_pipe_common.c
> > index f91772e..ddb4142 100644
> > --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> > +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> > @@ -524,9 +524,9 @@ static int r600_get_compute_param(struct pipe_screen 
> > *screen,
> > }
> > if (ret) {
> > sprintf(ret, "%s-%s", gpu, triple);
> > -
> > }
> > -   return (strlen(triple) + strlen(gpu)) * sizeof(char);
> > +   /* +2 for dash and terminating NIL byte */
> > +   return (strlen(triple) + strlen(gpu) + 2) * sizeof(char);
> > }
> > case PIPE_COMPUTE_CAP_GRID_DIMENSION:
> > if (ret) {
> > --
> > 2.1.4
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 07/16] i965: Add is_3src() to backend_instruction.

2015-01-23 Thread Matt Turner
On Fri, Jan 23, 2015 at 11:42 AM, Ian Romanick  wrote:
> On 01/20/2015 12:16 AM, Kenneth Graunke wrote:
>> On Monday, January 19, 2015 03:31:06 PM Matt Turner wrote:
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_shader.cpp| 10 ++
>>>  src/mesa/drivers/dri/i965/brw_shader.h  |  1 +
>>>  src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp |  6 +-
>>>  3 files changed, 12 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_shader.cpp
>>> index cbdf976..c6fead7 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
>>> @@ -678,6 +678,16 @@ backend_reg::is_accumulator() const
>>>  }
>>>
>>>  bool
>>> +backend_instruction::is_3src() const
>>> +{
>>> +   return opcode == BRW_OPCODE_LRP ||
>>> +  opcode == BRW_OPCODE_MAD ||
>>> +  opcode == BRW_OPCODE_BFE ||
>>> +  opcode == BRW_OPCODE_BFI2 ||
>>> +  opcode == BRW_OPCODE_CSEL;
>
> Can this also replace is_3src() in brw_eu_compact.c?  FWIW, that
> function was already doing basically what Ken suggests below... and you
> wrote it!  (Commit 31eed95b)

Not easily (or cleanly) if we want it as a method of
backend_instruction. Not much code savings either.

>> Pah, manual listings of things :)  Let's do even better:
>>
>>return opcode < 128 && opcode_descs[op].nsrc == 3;
>
> Shouldn't that be
>
>    return opcode < ARRAY_SIZE(opcode_descs) && opcode_descs[opcode].nsrc == 3;

Yes.
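
A stand-alone sketch of the bounds-checked, table-driven form agreed on above follows. The opcode numbering and the example_descs table are invented for illustration and do not match the driver's real opcode_descs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical opcode numbering, for illustration only. */
enum example_opcode {
   EX_OPCODE_MOV = 0,
   EX_OPCODE_ADD = 1,
   EX_OPCODE_MAD = 2,
   EX_OPCODE_LRP = 3,
   EX_OPCODE_VIRTUAL = 200,   /* deliberately past the end of the table */
};

struct example_opcode_desc {
   const char *name;
   int nsrc;
};

static const struct example_opcode_desc example_descs[] = {
   [EX_OPCODE_MOV] = { "mov", 1 },
   [EX_OPCODE_ADD] = { "add", 2 },
   [EX_OPCODE_MAD] = { "mad", 3 },
   [EX_OPCODE_LRP] = { "lrp", 3 },
};

/* Unchecked accessor: callers must guarantee the opcode is in range. */
static int
example_nsrc(unsigned opcode)
{
   assert(opcode < ARRAY_SIZE(example_descs));
   return example_descs[opcode].nsrc;
}

/* Checked query: opcodes beyond the table are simply not 3-source. */
static bool
example_is_3src(unsigned opcode)
{
   return opcode < ARRAY_SIZE(example_descs) && example_nsrc(opcode) == 3;
}
```

Using ARRAY_SIZE instead of a literal such as 128 keeps the check correct if the table grows or shrinks.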


Re: [Mesa-dev] [PATCH 1/2] i965/fs: Don't use backend_visitor::instructions after creating the CFG.

2015-01-23 Thread Matt Turner
On Sat, Jan 17, 2015 at 12:07 AM, Kenneth Graunke  wrote:
> With an updated commit message and Piglit passing (I'll test and let you 
> know),
> Reviewed-by: Kenneth Graunke 

Reminder to piglit these the next time you power on your Gen4. :)


[Mesa-dev] [PATCH v3 0/4] A NIR constant folding infrastructure

2015-01-23 Thread Jason Ekstrand
This is a 3rd version of the constant folding architecture that Connor came
up with based on python autogeneration.  This 3rd version has a few trivial
fixes to some of the patches and a complete rewrite of patch 2.  While I think
Connor's patch 2 probably worked, it involved a lot of very opaque wrapping
of strings in strings and I couldn't follow it.  I've rewritten it to use mako
to generate something that is *hopefully* more clear.  I'm ok with using
Connor's version if people would prefer, but I've been kicking this around
in my head and thought I'd knock it out anyway.

Connor Abbott (3):
  nir: use Python to autogenerate opcode information
  nir/constant_folding: use the new constant folding infrastructure
  nir/opt_algebraic: be more careful about constant types

Jason Ekstrand (1):
  nir: add new constant folding infrastructure

 src/glsl/Makefile.am |  18 +
 src/glsl/Makefile.sources|   5 +-
 src/glsl/nir/.gitignore  |   3 +
 src/glsl/nir/nir.h   |  16 +-
 src/glsl/nir/nir_algebraic.py|  39 ++-
 src/glsl/nir/nir_constant_expressions.h  |  31 ++
 src/glsl/nir/nir_constant_expressions.py | 319 +
 src/glsl/nir/nir_opcodes.c   |  46 ---
 src/glsl/nir/nir_opcodes.h   | 366 
 src/glsl/nir/nir_opcodes.py  | 575 +++
 src/glsl/nir/nir_opcodes_c.py|  55 +++
 src/glsl/nir/nir_opcodes_h.py|  47 +++
 src/glsl/nir/nir_opt_algebraic.py|   6 +-
 src/glsl/nir/nir_opt_constant_folding.c  | 179 ++
 src/mesa/drivers/dri/i965/Makefile.am|   1 +
 15 files changed, 1108 insertions(+), 598 deletions(-)
 create mode 100644 src/glsl/nir/nir_constant_expressions.h
 create mode 100644 src/glsl/nir/nir_constant_expressions.py
 delete mode 100644 src/glsl/nir/nir_opcodes.c
 delete mode 100644 src/glsl/nir/nir_opcodes.h
 create mode 100644 src/glsl/nir/nir_opcodes.py
 create mode 100644 src/glsl/nir/nir_opcodes_c.py
 create mode 100644 src/glsl/nir/nir_opcodes_h.py

-- 
2.2.1



[Mesa-dev] [PATCH 2/4] nir: add new constant folding infrastructure

2015-01-23 Thread Jason Ekstrand
Add a required field to the Opcode class, const_expr, that contains an
expression or statement that computes the result of the opcode given known
constant inputs. Then take those const_expr's and expand them into a function
that takes an opcode and an array of constant inputs and spits out the constant
result. This means that when adding opcodes, there's one less place to update,
and almost all the opcodes are self-documenting since the information on how to
compute the result is right next to the definition.

The helper functions in nir_constant_expressions.c were taken from
ir_constant_expressions.cpp.
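
As a rough picture of the shape of code this describes (one small function per opcode plus a dispatcher taking an opcode and an array of constant inputs), consider the hand-written sketch below. It is not the generated output; the opcode names, the float-only signature, and example_eval_const() are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcode list; the real one comes from nir_opcodes.py. */
enum example_op {
   EX_OP_FADD,
   EX_OP_FMUL,
   EX_OP_FNEG,
};

/* One evaluator per opcode, each corresponding to a const_expr such as
 * "src0 + src1" written next to the opcode definition. */
static float eval_fadd(const float *src) { return src[0] + src[1]; }
static float eval_fmul(const float *src) { return src[0] * src[1]; }
static float eval_fneg(const float *src) { return -src[0]; }

/* Dispatcher: given an opcode and its constant inputs, return the
 * constant result. */
static float
example_eval_const(enum example_op op, const float *src)
{
   assert(src != NULL);

   switch (op) {
   case EX_OP_FADD: return eval_fadd(src);
   case EX_OP_FMUL: return eval_fmul(src);
   case EX_OP_FNEG: return eval_fneg(src);
   }
   return 0.0f; /* unreachable for valid opcodes */
}
```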

v3 Jason Ekstrand 
 - Use mako to generate one function per opcode instead of doing piles of
   string splicing

Signed-off-by: Jason Ekstrand 
---
 src/glsl/Makefile.am |   5 +
 src/glsl/Makefile.sources|   1 +
 src/glsl/nir/.gitignore  |   1 +
 src/glsl/nir/nir_constant_expressions.h  |  31 ++
 src/glsl/nir/nir_constant_expressions.py | 319 ++
 src/glsl/nir/nir_opcodes.py  | 562 +--
 6 files changed, 735 insertions(+), 184 deletions(-)
 create mode 100644 src/glsl/nir/nir_constant_expressions.h
 create mode 100644 src/glsl/nir/nir_constant_expressions.py

diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
index 59dda5f..e145cb2 100644
--- a/src/glsl/Makefile.am
+++ b/src/glsl/Makefile.am
@@ -217,6 +217,7 @@ BUILT_SOURCES = 
\
glsl_lexer.cpp  \
glcpp/glcpp-parse.c \
glcpp/glcpp-lex.c   \
+   nir/nir_constant_expressions.c  \
nir/nir_opcodes.c   \
nir/nir_opcodes.h   \
nir/nir_opt_algebraic.c
@@ -232,6 +233,10 @@ dist-hook:
$(RM) glcpp/tests/*.out
$(RM) glcpp/tests/subtest*/*.out
 
+nir/nir_constant_expressions.c: nir/nir_opcodes.py nir/nir_constant_expressions.py nir/nir_constant_expressions.h
+   $(MKDIR_P) nir; \
+   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_constant_expressions.py > $@
+
 nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py
$(MKDIR_P) nir; \
$(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opcodes_h.py > $@
diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index dc1c55d..dd76c44 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -14,6 +14,7 @@ LIBGLCPP_GENERATED_FILES = \
$(GLSL_BUILDDIR)/glcpp/glcpp-parse.c
 
 NIR_GENERATED_FILES = \
+   $(GLSL_BUILDDIR)/nir/nir_constant_expressions.c \
$(GLSL_BUILDDIR)/nir/nir_opcodes.c \
$(GLSL_BUILDDIR)/nir/nir_opcodes.h \
$(GLSL_BUILDDIR)/nir/nir_opt_algebraic.c
diff --git a/src/glsl/nir/.gitignore b/src/glsl/nir/.gitignore
index 4c28193..261f64f 100644
--- a/src/glsl/nir/.gitignore
+++ b/src/glsl/nir/.gitignore
@@ -1,3 +1,4 @@
 nir_opt_algebraic.c
 nir_opcodes.c
 nir_opcodes.h
+nir_constant_expressions.c
diff --git a/src/glsl/nir/nir_constant_expressions.h b/src/glsl/nir/nir_constant_expressions.h
new file mode 100644
index 000..97997f2
--- /dev/null
+++ b/src/glsl/nir/nir_constant_expressions.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright © 2014 Connor Abbott
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *Connor Abbott (cwabbo...@gmail.com)
+ *
+ */
+
+#include "nir.h"
+
+nir_const_value nir_eval_const_opcode(nir_op op, unsigned num_components,
+  nir_const_value *src);
diff --git a/src/glsl/nir/nir_constant_expressions.py b/src/glsl/nir/nir_constant_expressions.py
new file mode 100644
index 000..09c766a
--- /dev/null
+++ b/src/glsl/nir/nir_consta

[Mesa-dev] [PATCH 1/4] nir: use Python to autogenerate opcode information

2015-01-23 Thread Jason Ekstrand
From: Connor Abbott 

Before, we used a system where a file, nir_opcodes.h, defined some macros that
were included to generate the enum values and the nir_op_infos structure. This
worked pretty well, but for development the error messages were never very
useful, Python tools couldn't understand the opcode list, and it was difficult
to use nir_opcodes.h to do other things like autogenerate a builder API. Now, we
store opcode information in nir_opcodes.py, and we have nir_opcodes_c.py to
generate the old nir_opcodes.c and nir_opcodes_h.py to generate nir_opcodes.h,
which contains all the enum names and gets included into nir.h like before.  In
addition to solving the above problems, using Python and Mako to generate
everything means that it's much easier to keep information centralized as we
add new things like constant propagation that require per-opcode information.
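
For readers unfamiliar with the approach, here is a minimal sketch of
generating a complete enum header from a Python opcode table. The table
contents, the include guard name and the render_header() helper are
assumptions made for illustration; the actual nir_opcodes_h.py in this patch
uses Mako and carries more per-opcode data.

```python
# Hypothetical opcode table; the real nir_opcodes.py stores richer data.
opcodes = {
    "fadd": {"num_inputs": 2},
    "fneg": {"num_inputs": 1},
    "ffma": {"num_inputs": 3},
}

def render_header(opcodes):
    """Return the text of a self-contained C header declaring the enum."""
    names = sorted(opcodes)
    lines = ["#ifndef NIR_OPCODES_SKETCH_H",
             "#define NIR_OPCODES_SKETCH_H",
             "",
             "typedef enum {"]
    for name in names:
        lines.append("   nir_op_{0},".format(name))
    lines.append("   nir_last_opcode = nir_op_{0},".format(names[-1]))
    lines.append("   nir_num_opcodes = nir_last_opcode + 1")
    lines.append("} nir_op;")
    lines.append("")
    lines.append("#endif")
    return "\n".join(lines)

print(render_header(opcodes))
```

Because the same table is an ordinary Python dict, other generators (for
example one producing the info array) can import it instead of re-parsing C
macros.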

v2:
 - make Opcode derive from object (Dylan)
 - don't use assert like it's a function (Dylan)
 - style fixes for fnoise, use xrange (Dylan)
 - use iterkeys() in nir_opcodes_h.py (Dylan)
 - use pydoc-style comments (Jason)
 - don't make fmin/fmax commutative and associative yet (Jason)

Signed-off-by: Connor Abbott 

v3 Jason Ekstrand 
 - Alphabetize source file lists
 - Generate nir_opcodes.h in the builddir instead of the source dir
 - Include $(builddir)/src/glsl/nir in the i965 build
 - Rework nir_opcodes.h generation so it generates a complete header file
   instead of one that has to be embedded inside an enum declaration
---
 src/glsl/Makefile.am  |  13 ++
 src/glsl/Makefile.sources |   4 +-
 src/glsl/nir/.gitignore   |   2 +
 src/glsl/nir/nir.h|  16 +-
 src/glsl/nir/nir_opcodes.c|  46 
 src/glsl/nir/nir_opcodes.h| 366 
 src/glsl/nir/nir_opcodes.py   | 381 ++
 src/glsl/nir/nir_opcodes_c.py |  55 +
 src/glsl/nir/nir_opcodes_h.py |  47 +
 src/mesa/drivers/dri/i965/Makefile.am |   1 +
 10 files changed, 503 insertions(+), 428 deletions(-)
 delete mode 100644 src/glsl/nir/nir_opcodes.c
 delete mode 100644 src/glsl/nir/nir_opcodes.h
 create mode 100644 src/glsl/nir/nir_opcodes.py
 create mode 100644 src/glsl/nir/nir_opcodes_c.py
 create mode 100644 src/glsl/nir/nir_opcodes_h.py

diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
index 9d9f99a..59dda5f 100644
--- a/src/glsl/Makefile.am
+++ b/src/glsl/Makefile.am
@@ -27,6 +27,7 @@ AM_CPPFLAGS = \
-I$(top_srcdir)/src/glsl/glcpp \
-I$(top_srcdir)/src/glsl/nir \
-I$(top_srcdir)/src/gtest/include \
+   -I$(top_builddir)/src/glsl/nir \
$(DEFINES)
 AM_CFLAGS = $(VISIBILITY_CFLAGS)
 AM_CXXFLAGS = $(VISIBILITY_CXXFLAGS)
@@ -216,6 +217,8 @@ BUILT_SOURCES = \
glsl_lexer.cpp  \
glcpp/glcpp-parse.c \
glcpp/glcpp-lex.c   \
+   nir/nir_opcodes.c   \
+   nir/nir_opcodes.h   \
nir/nir_opt_algebraic.c
 CLEANFILES =   \
glcpp/glcpp-parse.h \
@@ -229,6 +232,16 @@ dist-hook:
$(RM) glcpp/tests/*.out
$(RM) glcpp/tests/subtest*/*.out
 
+nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py
+   $(MKDIR_P) nir; \
+   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opcodes_h.py > $@
+
+nir/nir.h: $(top_builddir)/src/glsl/nir/nir_opcodes.h
+
+nir/nir_opcodes.c: nir/nir_opcodes.py nir/nir_opcodes_c.py
+   $(MKDIR_P) nir; \
+   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opcodes_c.py > $@
+
 nir/nir_opt_algebraic.c: nir/nir_opt_algebraic.py nir/nir_algebraic.py
$(MKDIR_P) nir; \
$(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opt_algebraic.py > $@
diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index 6237627..dc1c55d 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -14,6 +14,8 @@ LIBGLCPP_GENERATED_FILES = \
$(GLSL_BUILDDIR)/glcpp/glcpp-parse.c
 
 NIR_GENERATED_FILES = \
+   $(GLSL_BUILDDIR)/nir/nir_opcodes.c \
+   $(GLSL_BUILDDIR)/nir/nir_opcodes.h \
$(GLSL_BUILDDIR)/nir/nir_opt_algebraic.c
 
 NIR_FILES = \
@@ -35,8 +37,6 @@ NIR_FILES = \
$(GLSL_SRCDIR)/nir/nir_lower_var_copies.c \
$(GLSL_SRCDIR)/nir/nir_lower_vec_to_movs.c \
$(GLSL_SRCDIR)/nir/nir_metadata.c \
-   $(GLSL_SRCDIR)/nir/nir_opcodes.c \
-   $(GLSL_SRCDIR)/nir/nir_opcodes.h \
$(GLSL_SRCDIR)/nir/nir_opt_constant_folding.c \
$(GLSL_SRCDIR)/nir/nir_opt_copy_propagate.c \
$(GLSL_SRCDIR)/nir/nir_opt_cse.c \
diff --git a/src/glsl/

[Mesa-dev] [PATCH 4/4] nir/opt_algebraic: be more careful about constant types

2015-01-23 Thread Jason Ekstrand
From: Connor Abbott 

We can do this now that we have opcode info available in Python.

Signed-off-by: Connor Abbott 

v2 Jason Ekstrand :
 - Make it build again
---
 src/glsl/nir/nir_algebraic.py | 39 +++
 src/glsl/nir/nir_opt_algebraic.py |  6 ++
 2 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/src/glsl/nir/nir_algebraic.py b/src/glsl/nir/nir_algebraic.py
index f9b246d..cc8624b 100644
--- a/src/glsl/nir/nir_algebraic.py
+++ b/src/glsl/nir/nir_algebraic.py
@@ -28,6 +28,7 @@ import itertools
 import struct
 import sys
 import mako.template
+from nir_opcodes import opcodes
 
 # Represents a set of variables, each with a unique id
 class VarSet(object):
@@ -43,7 +44,7 @@ class VarSet(object):
 
 class Value(object):
@staticmethod
-   def create(val, name_base, varset):
+   def create(val, name_base, type_, varset):
   if isinstance(val, tuple):
  return Expression(val, name_base, varset)
   elif isinstance(val, Expression):
@@ -51,7 +52,7 @@ class Value(object):
   elif isinstance(val, (str, unicode)):
  return Variable(val, name_base, varset)
   elif isinstance(val, (bool, int, long, float)):
- return Constant(val, name_base)
+ return Constant(val, name_base, type_)
 
__template = mako.template.Template("""
 static const ${val.c_type} ${val.name} = {
@@ -89,19 +90,36 @@ static const ${val.c_type} ${val.name} = {
 Expression=Expression)
 
 class Constant(Value):
-   def __init__(self, val, name):
+   def __init__(self, val, name, type_):
   Value.__init__(self, name, "constant")
   self.value = val
+  if type_ == "unknown":
+ if isinstance(self.value, (bool)):
+self.type_ = "bool"
+ elif isinstance(self.value, (int, long)):
+self.type_ = "int"
+ elif isinstance(self.value, float):
+self.type_ = "float"
+  else:
+ if type_ == "bool":
+assert isinstance(self.value, (bool))
+ elif type_ == "int" or type_ == "unsigned":
+assert isinstance(self.value, (int, long))
+ elif type_ == "float":
+assert isinstance(self.value, (float))
+ else:
+assert False
+ self.type_ = type_
 
def __hex__(self):
   # Even if it's an integer, we still need to unpack as an unsigned
   # int.  This is because, without C99, we can only assign to the first
   # element of a union in an initializer.
-  if isinstance(self.value, (bool)):
+  if self.type_ == "bool":
  return 'NIR_TRUE' if self.value else 'NIR_FALSE'
-  if isinstance(self.value, (int, long)):
+  if self.type_ == "int" or self.type_ == "unsigned":
  return hex(struct.unpack('I', struct.pack('i', self.value))[0])
-  elif isinstance(self.value, float):
+  elif self.type_ == "float":
  return hex(struct.unpack('I', struct.pack('f', self.value))[0])
   else:
  assert False
@@ -119,7 +137,11 @@ class Expression(Value):
   assert isinstance(expr, tuple)
 
   self.opcode = expr[0]
-  self.sources = [ Value.create(src, "{0}_{1}".format(name_base, i), varset)
+  assert self.opcode in opcodes
+
+  opcode_info = opcodes[self.opcode]
+  self.sources = [ Value.create(src, "{0}_{1}".format(name_base, i),
+opcode_info.input_types[i], varset)
for (i, src) in enumerate(expr[1:]) ]
 
def render(self):
@@ -141,7 +163,8 @@ class SearchAndReplace(object):
   if isinstance(replace, Value):
  self.replace = replace
   else:
- self.replace = Value.create(replace, "replace{0}".format(self.id), varset)
+ self.replace = Value.create(replace, "replace{0}".format(self.id),
+ "unknown", varset)
 
 _algebraic_pass_template = mako.template.Template("""
 #include "nir.h"
diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py
index 169bb41..a03ad01 100644
--- a/src/glsl/nir/nir_opt_algebraic.py
+++ b/src/glsl/nir/nir_opt_algebraic.py
@@ -37,10 +37,8 @@ d = 'd'
 # defined as a tuple of the form (, , , , )
 # where each source is either an expression or a value.  A value can be
 # either a numeric constant or a string representing a variable name.  For
-# constants, you have to be careful to make sure that it is the right type
-# because python is unaware of the source and destination types of the
-# opcodes.
-
+# constants, you must use the correct type for the opcode or there will be an
+# assertion failure when generating the pass.
 optimizations = [
(('fneg', ('fneg', a)), a),
(('ineg', ('ineg', a)), a),
-- 
2.2.1
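
Two Python details matter when a constant's type is inferred from its value
and then packed into a 32-bit pattern: bool is a subclass of int, so the bool
test has to come first in a single if/elif chain, and struct can reinterpret
a signed int or a float as an unsigned 32-bit value. The helper names below
(infer_type, bits) are made up for this sketch and are not part of
nir_algebraic.py.

```python
import struct

def infer_type(value):
    """Guess a NIR-style type name from a Python constant."""
    # bool must be tested first: isinstance(True, int) is True in Python.
    if isinstance(value, bool):
        return "bool"
    elif isinstance(value, int):
        return "int"
    elif isinstance(value, float):
        return "float"
    raise TypeError("unsupported constant: {0!r}".format(value))

def bits(value, type_):
    """Return the 32-bit pattern of value as a hex string."""
    if type_ == "int":
        # Reinterpret the signed 32-bit integer as unsigned.
        return hex(struct.unpack('I', struct.pack('i', value))[0])
    elif type_ == "float":
        # Reinterpret the IEEE-754 single-precision bits as unsigned.
        return hex(struct.unpack('I', struct.pack('f', value))[0])
    raise ValueError("unsupported type: {0}".format(type_))

print(infer_type(True), infer_type(3), infer_type(0.5))
print(bits(-1, "int"), bits(1.0, "float"))
```

With separate if statements instead of one chain, True would first be tagged
"bool" and then immediately re-tagged "int".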



[Mesa-dev] [PATCH 3/4] nir/constant_folding: use the new constant folding infrastructure

2015-01-23 Thread Jason Ekstrand
From: Connor Abbott 

Signed-off-by: Connor Abbott 
---
 src/glsl/nir/nir_opt_constant_folding.c | 179 
 1 file changed, 21 insertions(+), 158 deletions(-)

diff --git a/src/glsl/nir/nir_opt_constant_folding.c b/src/glsl/nir/nir_opt_constant_folding.c
index 878436b..f727453 100644
--- a/src/glsl/nir/nir_opt_constant_folding.c
+++ b/src/glsl/nir/nir_opt_constant_folding.c
@@ -25,7 +25,7 @@
  *
  */
 
-#include "nir.h"
+#include "nir_constant_expressions.h"
 #include 
 
 /*
@@ -38,20 +38,10 @@ struct constant_fold_state {
bool progress;
 };
 
-#define SRC_COMP(T, IDX, CMP) src[IDX]->value.T[instr->src[IDX].swizzle[CMP]]
-#define SRC(T, IDX) SRC_COMP(T, IDX, i)
-#define DEST_COMP(T, CMP) dest->value.T[CMP]
-#define DEST(T) DEST_COMP(T, i)
-
-#define FOLD_PER_COMP(EXPR) \
-   for (unsigned i = 0; i < instr->dest.dest.ssa.num_components; i++) { \
-  EXPR; \
-   } \
-
 static bool
 constant_fold_alu_instr(nir_alu_instr *instr, void *mem_ctx)
 {
-   nir_load_const_instr *src[4], *dest;
+   nir_const_value src[4];
 
if (!instr->dest.dest.is_ssa)
   return false;
@@ -60,163 +50,36 @@ constant_fold_alu_instr(nir_alu_instr *instr, void *mem_ctx)
   if (!instr->src[i].src.is_ssa)
  return false;
 
-  if (instr->src[i].src.ssa->parent_instr->type != nir_instr_type_load_const)
+  nir_instr *src_instr = instr->src[i].src.ssa->parent_instr;
+
+  if (src_instr->type != nir_instr_type_load_const)
  return false;
+  nir_load_const_instr* load_const = nir_instr_as_load_const(src_instr);
+
+  for (unsigned j = 0; j < instr->dest.dest.ssa.num_components; j++) {
+ src[i].u[j] = load_const->value.u[instr->src[i].swizzle[j]];
+  }
 
   /* We shouldn't have any source modifiers in the optimization loop. */
   assert(!instr->src[i].abs && !instr->src[i].negate);
-
-  src[i] = nir_instr_as_load_const(instr->src[i].src.ssa->parent_instr);
}
 
/* We shouldn't have any saturate modifiers in the optimization loop. */
assert(!instr->dest.saturate);
 
-   dest = nir_load_const_instr_create(mem_ctx,
-  instr->dest.dest.ssa.num_components);
-
-   switch (instr->op) {
-   case nir_op_ineg:
-  FOLD_PER_COMP(DEST(i) = -SRC(i, 0));
-  break;
-   case nir_op_fneg:
-  FOLD_PER_COMP(DEST(f) = -SRC(f, 0));
-  break;
-   case nir_op_inot:
-  FOLD_PER_COMP(DEST(i) = ~SRC(i, 0));
-  break;
-   case nir_op_fnot:
-  FOLD_PER_COMP(DEST(f) = (SRC(f, 0) == 0.0f) ? 1.0f : 0.0f);
-  break;
-   case nir_op_frcp:
-  FOLD_PER_COMP(DEST(f) = 1.0f / SRC(f, 0));
-  break;
-   case nir_op_frsq:
-  FOLD_PER_COMP(DEST(f) = 1.0f / sqrt(SRC(f, 0)));
-  break;
-   case nir_op_fsqrt:
-  FOLD_PER_COMP(DEST(f) = sqrtf(SRC(f, 0)));
-  break;
-   case nir_op_fexp:
-  FOLD_PER_COMP(DEST(f) = expf(SRC(f, 0)));
-  break;
-   case nir_op_flog:
-  FOLD_PER_COMP(DEST(f) = logf(SRC(f, 0)));
-  break;
-   case nir_op_fexp2:
-  FOLD_PER_COMP(DEST(f) = exp2f(SRC(f, 0)));
-  break;
-   case nir_op_flog2:
-  FOLD_PER_COMP(DEST(f) = log2f(SRC(f, 0)));
-  break;
-   case nir_op_f2i:
-  FOLD_PER_COMP(DEST(i) = SRC(f, 0));
-  break;
-   case nir_op_f2u:
-  FOLD_PER_COMP(DEST(u) = SRC(f, 0));
-  break;
-   case nir_op_i2f:
-  FOLD_PER_COMP(DEST(f) = SRC(i, 0));
-  break;
-   case nir_op_f2b:
-  FOLD_PER_COMP(DEST(u) = (SRC(i, 0) == 0.0f) ? NIR_FALSE : NIR_TRUE);
-  break;
-   case nir_op_b2f:
-  FOLD_PER_COMP(DEST(f) = SRC(u, 0) ? 1.0f : 0.0f);
-  break;
-   case nir_op_i2b:
-  FOLD_PER_COMP(DEST(u) = SRC(i, 0) ? NIR_TRUE : NIR_FALSE);
-  break;
-   case nir_op_u2f:
-  FOLD_PER_COMP(DEST(f) = SRC(u, 0));
-  break;
-   case nir_op_bany2:
-  DEST_COMP(u, 0) = (SRC_COMP(u, 0, 0) || SRC_COMP(u, 0, 1)) ?
-NIR_TRUE : NIR_FALSE;
-  break;
-   case nir_op_fadd:
-  FOLD_PER_COMP(DEST(f) = SRC(f, 0) + SRC(f, 1));
-  break;
-   case nir_op_iadd:
-  FOLD_PER_COMP(DEST(i) = SRC(i, 0) + SRC(i, 1));
-  break;
-   case nir_op_fsub:
-  FOLD_PER_COMP(DEST(f) = SRC(f, 0) - SRC(f, 1));
-  break;
-   case nir_op_isub:
-  FOLD_PER_COMP(DEST(i) = SRC(i, 0) - SRC(i, 1));
-  break;
-   case nir_op_fmul:
-  FOLD_PER_COMP(DEST(f) = SRC(f, 0) * SRC(f, 1));
-  break;
-   case nir_op_imul:
-  FOLD_PER_COMP(DEST(i) = SRC(i, 0) * SRC(i, 1));
-  break;
-   case nir_op_fdiv:
-  FOLD_PER_COMP(DEST(f) = SRC(f, 0) / SRC(f, 1));
-  break;
-   case nir_op_idiv:
-  FOLD_PER_COMP(DEST(i) = SRC(i, 0) / SRC(i, 1));
-  break;
-   case nir_op_udiv:
-  FOLD_PER_COMP(DEST(u) = SRC(u, 0) / SRC(u, 1));
-  break;
-   case nir_op_flt:
-  FOLD_PER_COMP(DEST(u) = (SRC(f, 0) < SRC(f, 1)) ? NIR_TRUE : NIR_FALSE);
-  break;
-   case nir_op_fge:
-  FOLD_PER_COMP(DEST(u) = (SRC(f, 0) >= SRC(f, 1)) ? NIR_TRUE : NIR_FALSE);
-  break

Re: [Mesa-dev] [PATCH 01/41] glapi: Added ARB_direct_state_access.xml file.

2015-01-23 Thread Emil Velikov
On 23/01/15 20:51, Jason Ekstrand wrote:
> 
> 
> On Thu, Jan 22, 2015 at 9:27 AM, Emil Velikov wrote:
> 
> On 05/01/15 17:45, Laura Ekstrand wrote:
> > This comment is vague.  Do you have a specific recommendation for the
> > code here?
> >
> Seems like I'm way too subtle - yes I have a few.
> 
> 
> 1. Add ARB_direct_state_access to struct gl_extensions
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -3731,6 +3731,7 @@ struct gl_extensions
> GLboolean ARB_depth_clamp;
> GLboolean ARB_depth_texture;
> GLboolean ARB_derivative_control;
> +   GLboolean ARB_direct_state_access;
> GLboolean ARB_draw_buffers_blend;
> GLboolean ARB_draw_elements_base_vertex;
> 
> 
> 2. Use it in the extensions table.
> --- a/src/mesa/main/extensions.c
> +++ b/src/mesa/main/extensions.c
> @@ -103,6 +103,7 @@ static const struct extension extension_table[] = {
> { "GL_ARB_depth_clamp", o(ARB_depth_clamp),
> GL, 2003 },
> { "GL_ARB_depth_texture",
> o(ARB_depth_texture),   GLL,2001 },
> { "GL_ARB_derivative_control",
> o(ARB_derivative_control),  GL, 2014 },
> +   { "GL_ARB_direct_state_access",
> o(ARB_direct_state_access), GL, 2014 },
> 
> 
> 3. Make use of if when the spec amends existing behaviour - most of the
> spec text as of section "New Tokens" onwards. Clearly with this series
> you're adding the new entry points(functions) so it does not apply
> here :)
> 
> 
> if (foo->Extensions.ARB_direct_state_access) {
>  
> }
> 
> 
> Pretty much every extension that was added to mesa follows this approach
> so keeping up with traditions is always nice.
> 
> 
> Yes, and no...  We have the table of booleans in gl_extensions so that
> we can expose different extensions/behavior on different drivers. 
> However, ARB_direct_state_access doesn't actually add new functionality,
> just new ways of getting at old functionality.  We *should* be able to
> implement it in a driver-agnostic way entirely within core mesa. 
> Therefore, there's no reason to be able to shut it off on a per-driver
> basis and no reason for the flag in gl_extensions.  If we find that, for
> some reason, we only want to support it in core contexts or that it adds
> something some drivers can't handle, then we'll need the flag.
True, yet the usual approach so far had been:
1. add the flag
2. enable when/where possible
3. evaluate if things can be enabled for everyone
4. drop it (replace with dummy_true).
Why bother ? See below.

There will be a point where the extension will still be dummy_false, yet
the amendments to the spec will be applied.
At that point there will be a "few" reports from your QA team and other
people, that piglit (other) has regressed. Going the usual route will
save you that, at the cost of having one extra commit worth
(presumably) ~50 LOC.

Hope with ^^ things make (a bit more) sense :)

Thanks
Emil


Re: [Mesa-dev] [PATCH 08/10] nir: Add algebraic optimizations for simplifying comparisons.

2015-01-23 Thread Matt Turner
Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 06/10] nir: Add a bunch of algebraic optimizations on logic/bit operations.

2015-01-23 Thread Matt Turner
On Thu, Jan 22, 2015 at 7:27 AM, Jason Ekstrand  wrote:
> On Jan 22, 2015 3:41 AM, "Kenneth Graunke"  wrote:
>> Matt and I noticed a bunch of "val <- ior a a" operations in a shader,
>> so we decided to add an algebraic optimization for that.  While there,
>> I decided to add a bunch more of them.
>>
>> total NIR instructions in shared programs: 2023511 -> 2020814 (-0.13%)
>> NIR instructions in affected programs: 149634 -> 146937 (-1.80%)
>> helped:1032
>>
>> i965 already cleans these up, so the final results aren't impressive:
>>
>> total i965 instructions in shared programs: 6035392 -> 6035397 (0.00%)
>> i965 instructions in affected programs: 764 -> 769 (0.65%)
>> HURT:   3
>>
>> However, improving the result of the NIR compile is worth doing.
>>
>> Signed-off-by: Kenneth Graunke 
>> ---
>>  src/glsl/nir/nir_opt_algebraic.py | 16 
>>  1 file changed, 16 insertions(+)
>>
>> diff --git a/src/glsl/nir/nir_opt_algebraic.py
>> b/src/glsl/nir/nir_opt_algebraic.py
>> index 169bb41..cf16b19 100644
>> --- a/src/glsl/nir/nir_opt_algebraic.py
>> +++ b/src/glsl/nir/nir_opt_algebraic.py
>> @@ -68,6 +68,22 @@ optimizations = [
>> (('fadd', ('fmul', a, b), c), ('ffma', a, b, c)),
>> (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
>> (('fmin', ('fmax', a, 1.0), 0.0), ('fsat', a)),
>> +   # Logical and bit operations
>> +   (('fand', a, a), a),
>
> This isn't correct.  The fand operation will normalize to 0.0/1.0.
>
>> +   (('fand', a, 0.0), 0.0),
>
> This is ok
>
>> +   (('iand', a, a), a),
>> +   (('iand', a, 0), 0),
>> +   (('for', a, a), a),
>> +   (('for', a, 0.0), a),
>
> Can't do these two either
>
>> +   (('ior', a, a), a),
>> +   (('ior', a, 0), a),
>> +   (('fxor', a, a), 0.0),
>
> This one should be ok
>
> With the junk optimizations removed,
> Reviewed-by: Jason Ekstrand 

Same,

Reviewed-by: Matt Turner 
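
To make the objection concrete, here is a small check of the proposed float
logic rules. It assumes, per the review comment above, that fand/for/fxor
treat any non-zero input as true and always produce 0.0 or 1.0; the Python
functions are stand-ins written for this note (f_or is used because "for" is
a Python keyword), not NIR code.

```python
# Assumed semantics: inputs are "true" when non-zero, results are 0.0 or 1.0.
def fand(a, b):
    return 1.0 if (a != 0.0 and b != 0.0) else 0.0

def f_or(a, b):
    return 1.0 if (a != 0.0 or b != 0.0) else 0.0

def fxor(a, b):
    return 1.0 if ((a != 0.0) != (b != 0.0)) else 0.0

samples = [0.0, 1.0, 2.0, -3.5]

# Each rule: (description, search expression, replacement), both taking a.
rules = [
    ("fand a a   -> a",   lambda a: fand(a, a),   lambda a: a),
    ("fand a 0.0 -> 0.0", lambda a: fand(a, 0.0), lambda a: 0.0),
    ("for a a    -> a",   lambda a: f_or(a, a),   lambda a: a),
    ("for a 0.0  -> a",   lambda a: f_or(a, 0.0), lambda a: a),
    ("fxor a a   -> 0.0", lambda a: fxor(a, a),   lambda a: 0.0),
]

def rule_holds(search, replace):
    """True if the rewrite preserves the value for every sample input."""
    return all(search(a) == replace(a) for a in samples)

for desc, search, replace in rules:
    print("{0}: {1}".format(desc, "ok" if rule_holds(search, replace) else "WRONG"))
```

The failing rules break on any input other than 0.0 or 1.0: for example
fand(2.0, 2.0) is 1.0 under these semantics, not 2.0.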


Re: [Mesa-dev] [PATCH 04/10] nir: Pull nir_instr_can_cse()'s SSA checks out of the switch.

2015-01-23 Thread Matt Turner
Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 01/41] glapi: Added ARB_direct_state_access.xml file.

2015-01-23 Thread Jason Ekstrand
On Fri, Jan 23, 2015 at 1:46 PM, Emil Velikov 
wrote:

> On 23/01/15 20:51, Jason Ekstrand wrote:
> >
> >
> > On Thu, Jan 22, 2015 at 9:27 AM, Emil Velikov wrote:
> >
> > On 05/01/15 17:45, Laura Ekstrand wrote:
> > > This comment is vague.  Do you have a specific recommendation for
> the
> > > code here?
> > >
> > Seems like I'm way too subtle - yes I have a few.
> >
> >
> > 1. Add ARB_direct_state_access to struct gl_extensions
> > --- a/src/mesa/main/mtypes.h
> > +++ b/src/mesa/main/mtypes.h
> > @@ -3731,6 +3731,7 @@ struct gl_extensions
> > GLboolean ARB_depth_clamp;
> > GLboolean ARB_depth_texture;
> > GLboolean ARB_derivative_control;
> > +   GLboolean ARB_direct_state_access;
> > GLboolean ARB_draw_buffers_blend;
> > GLboolean ARB_draw_elements_base_vertex;
> >
> >
> > 2. Use it in the extensions table.
> > --- a/src/mesa/main/extensions.c
> > +++ b/src/mesa/main/extensions.c
> > @@ -103,6 +103,7 @@ static const struct extension extension_table[]
> = {
> > { "GL_ARB_depth_clamp",
>  o(ARB_depth_clamp),
> > GL, 2003 },
> > { "GL_ARB_depth_texture",
> > o(ARB_depth_texture),   GLL,2001 },
> > { "GL_ARB_derivative_control",
> > o(ARB_derivative_control),  GL, 2014 },
> > +   { "GL_ARB_direct_state_access",
> > o(ARB_direct_state_access), GL, 2014 },
> >
> >
> > 3. Make use of if when the spec amends existing behaviour - most of
> the
> > spec text as of section "New Tokens" onwards. Clearly with this
> series
> > you're adding the new entry points(functions) so it does not apply
> > here :)
> >
> >
> > if (foo->Extensions.ARB_direct_state_access) {
> >  
> > }
> >
> >
> > Pretty much every extension that was added to mesa follows this
> approach
> > so keeping up with traditions is always nice.
> >
> >
> > Yes, and no...  We have the table of booleans in gl_extensions so that
> > we can expose different extensions/behavior on different drivers.
> > However, ARB_direct_state_access doesn't actually add new functionality,
> > just new ways of getting at old functionality.  We *should* be able to
> > implement it in a driver-agnostic way entirely within core mesa.
> > Therefore, there's no reason to be able to shut it off on a per-driver
> > basis and no reason for the flag in gl_extensions.  If we find that, for
> > some reason, we only want to support it in core contexts or that it adds
> > something some drivers can't handle, then we'll need the flag.
> True, yet the usual approach so far had been:
> 1. add the flag
> 2. enable when/where possible
> 3. evaluate if things can be enabled for everyone
> 4. drop it (replace with dummy_true).
> Why bother ? See below.
>

The "usual approach" is for extensions that add functionality and require
per-driver implementation.  This extension is kind of unique in that
*nothing* it adds is per-driver (as far as I know).


> There will be a point where the extension will still be dummy_false, yet
> the amendments to the spec will be applied.
>

What "amendments to the spec"?  Once it gets implemented, we'll turn it on.


> At that point there will be a "few" reports from your QA team and other
> people, that piglit (other) has regressed. Going the usual route will
> save you that, at the cost of having one extra commit worth
> (presumably) ~50 LOC.
>
> Hope with ^^ things make (a bit more) sense :)
>

Not really.  Right now it's not even 100% implemented, so it needs to be
off for everyone.  As far as anyone can tell, it will go directly from
dummy_false to dummy_true.  If we do find something in the way of
implementing it that can't be done on some drivers, we can add the flag and
then turn it on per-driver instead of turning it on for everyone.  I'm
really not seeing how a per-driver flag will do any good.
--Jason


Re: [Mesa-dev] [PATCH 03/10] i965/nir: Report NIR instruction counts (in SSA form) via KHR_debug.

2015-01-23 Thread Matt Turner
Reviewed-by: Matt Turner 

Although I'm not totally sure I want to be getting this information
all the time. I guess we can add an environment variable to control it
if needed.


Re: [Mesa-dev] [PATCH 03/10] i965/nir: Report NIR instruction counts (in SSA form) via KHR_debug.

2015-01-23 Thread Jason Ekstrand
On Thu, Jan 22, 2015 at 7:39 AM, Jason Ekstrand 
wrote:

>
> On Jan 22, 2015 3:41 AM, "Kenneth Graunke"  wrote:
> >
> > This allows us to count NIR instructions via shader-db.
> >
> > Use "run" as normal.  The results file will contain both NIR and
> > assembly.
> >
> > Then, to generate a NIR report:
> > ./report.py <(grep NIR results/foo) <(grep NIR results/bar)
> >
> > Or, to generate an i965 report:
> > ./report.py <(grep -v NIR results/foo) <(grep -v NIR results/bar)
> >
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 32
> 
> >  1 file changed, 32 insertions(+)
> >
> > I'm guessing the counting should really go in nir proper.
> > This is what I used to generate the statistics, in case people were
> > wondering.
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > index 2d30321..0eb137f 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > @@ -49,6 +49,28 @@ nir_optimize(nir_shader *nir)
> > } while (progress);
> >  }
> >
> > +static bool
> > +count_nir_instrs_in_block(nir_block *block, void *state)
> > +{
> > +   int *count = (int *) state;
> > +   nir_foreach_instr(block, instr) {
> > +  *count = *count + 1;
>
> (*count)++?  Also, we could get rid of the braces on this loop.  Otherwise, I
> like this.
>
Reviewed-by: Jason Ekstrand 


> > +   }
> > +   return true;
> > +}
> > +
> > +static int
> > +count_nir_instrs(nir_shader *nir)
> > +{
> > +   int count = 0;
> > +   nir_foreach_overload(nir, overload) {
> > +  if (!overload->impl)
> > + continue;
> > +  nir_foreach_block(overload->impl, count_nir_instrs_in_block,
> &count);
> > +   }
> > +   return count;
> > +}
> > +
> >  void
> >  fs_visitor::emit_nir_code()
> >  {
> > @@ -99,6 +121,16 @@ fs_visitor::emit_nir_code()
> >nir_print_shader(nir, stderr);
> > }
> >
> > +   if (dispatch_width == 8) {
> > +  static GLuint msg_id = 0;
> > +  _mesa_gl_debug(&brw->ctx, &msg_id,
> > + MESA_DEBUG_SOURCE_SHADER_COMPILER,
> > + MESA_DEBUG_TYPE_OTHER,
> > + MESA_DEBUG_SEVERITY_NOTIFICATION,
> > + "FS NIR shader: %d inst\n",
> > + count_nir_instrs(nir));
> > +   }
> > +
> > nir_convert_from_ssa(nir);
> > nir_validate_shader(nir);
> > nir_lower_vec_to_movs(nir);
> > --
> > 2.2.2
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH 02/10] i965/nir: Print NIR on INTEL_DEBUG=fs.

2015-01-23 Thread Matt Turner
Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 04/10] nir: Pull nir_instr_can_cse()'s SSA checks out of the switch.

2015-01-23 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Fri, Jan 23, 2015 at 1:50 PM, Matt Turner  wrote:

> Reviewed-by: Matt Turner 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH v2 05/10] nir: Implement CSE on intrinsics that can be eliminated and reordered.

2015-01-23 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Thu, Jan 22, 2015 at 3:41 AM, Kenneth Graunke 
wrote:

> Matt and I noticed that one of the shaders hurt by INTEL_USE_NIR=1 had
> load_input and load_uniform intrinsics repeated several times, with the
> same parameters, but each one generating a distinct SSA value.  This
> made ALU operations on those values appear distinct as well.
>
> Generating distinct SSA values is silly - these are read only variables.
> CSE'ing them makes everything use a single SSA value, which then allows
> other operations to be CSE'd away as well.
>
> Generalizing a bit, it seems like we should be able to safely CSE any
> intrinsics that can be eliminated and reordered.  I didn't implement
> support for variables for the time being.
>
> v2: Assert that info->num_variables == 0 (requested by Jason).
>
> total NIR instructions in shared programs: 2435936 -> 2023511 (-16.93%)
> NIR instructions in affected programs: 2413496 -> 2001071 (-17.09%)
> helped:16872
>
> total i965 instructions in shared programs: 6028987 -> 6008427 (-0.34%)
> i965 instructions in affected programs: 640654 -> 620094 (-3.21%)
> helped: 2071
> HURT:   585
> GAINED: 14
> LOST:   25
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/glsl/nir/nir_opt_cse.c | 40 ++--
>  1 file changed, 38 insertions(+), 2 deletions(-)
>
> Here's v2 of CSE for intrinsics.  Sounds like it's good to go.
>
> diff --git a/src/glsl/nir/nir_opt_cse.c b/src/glsl/nir/nir_opt_cse.c
> index fef1678..b3e9c0d 100644
> --- a/src/glsl/nir/nir_opt_cse.c
> +++ b/src/glsl/nir/nir_opt_cse.c
> @@ -112,7 +112,34 @@ nir_instrs_equal(nir_instr *instr1, nir_instr *instr2)
>
>return true;
> }
> -   case nir_instr_type_intrinsic:
> +   case nir_instr_type_intrinsic: {
> +  nir_intrinsic_instr *intrinsic1 = nir_instr_as_intrinsic(instr1);
> +  nir_intrinsic_instr *intrinsic2 = nir_instr_as_intrinsic(instr2);
> +  const nir_intrinsic_info *info =
> + &nir_intrinsic_infos[intrinsic1->intrinsic];
> +
> +  if (intrinsic1->intrinsic != intrinsic2->intrinsic ||
> +  intrinsic1->num_components != intrinsic2->num_components)
> + return false;
> +
> +  if (info->has_dest && intrinsic1->dest.ssa.num_components !=
> +intrinsic2->dest.ssa.num_components)
> + return false;
> +
> +  for (unsigned i = 0; i < info->num_srcs; i++) {
> + if (!nir_srcs_equal(intrinsic1->src[i], intrinsic2->src[i]))
> +return false;
> +  }
> +
> +  assert(info->num_variables == 0);
> +
> +  for (unsigned i = 0; i < info->num_indices; i++) {
> + if (intrinsic1->const_index[i] != intrinsic2->const_index[i])
> +return false;
> +  }
> +
> +  return true;
> +   }
> case nir_instr_type_call:
> case nir_instr_type_jump:
> case nir_instr_type_ssa_undef:
> @@ -151,7 +178,13 @@ nir_instr_can_cse(nir_instr *instr)
>return true;
> case nir_instr_type_tex:
>return false; /* TODO */
> -   case nir_instr_type_intrinsic:
> +   case nir_instr_type_intrinsic: {
> +  const nir_intrinsic_info *info =
> + &nir_intrinsic_infos[nir_instr_as_intrinsic(instr)->intrinsic];
> +  return (info->flags & NIR_INTRINSIC_CAN_ELIMINATE) &&
> + (info->flags & NIR_INTRINSIC_CAN_REORDER) &&
> + info->num_variables == 0; /* not implemented yet */
> +   }
> case nir_instr_type_call:
> case nir_instr_type_jump:
> case nir_instr_type_ssa_undef:
> @@ -176,6 +209,9 @@ nir_instr_get_dest_ssa_def(nir_instr *instr)
> case nir_instr_type_phi:
>assert(nir_instr_as_phi(instr)->dest.is_ssa);
>return &nir_instr_as_phi(instr)->dest.ssa;
> +   case nir_instr_type_intrinsic:
> +  assert(nir_instr_as_intrinsic(instr)->dest.is_ssa);
> +  return &nir_instr_as_intrinsic(instr)->dest.ssa;
> default:
>unreachable("We never ask for any of these");
> }
> --
> 2.2.2
>
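
For readers following along, the idea in the commit message can be sketched in a few lines of Python. This is a toy model, not the NIR implementation: the Intrinsic class, the flag values and the cse() helper are all made up for illustration, and it assumes straight-line code (no dominance checks) and no variables, much like the num_variables == 0 restriction in the patch.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical flag bits; the names echo the patch, the values are invented.
CAN_ELIMINATE = 1 << 0
CAN_REORDER = 1 << 1


@dataclass
class Intrinsic:
    op: str                        # e.g. "load_uniform"
    flags: int
    srcs: Tuple[str, ...]          # names of the SSA values used as sources
    const_index: Tuple[int, ...]
    num_components: int
    dest: str                      # name of the SSA value this defines


def can_cse(instr):
    # Only intrinsics that can be both eliminated and reordered qualify.
    wanted = CAN_ELIMINATE | CAN_REORDER
    return (instr.flags & wanted) == wanted


def key(instr):
    # Two intrinsics are "equal" when all of these fields match.
    return (instr.op, instr.num_components, instr.srcs, instr.const_index)


def cse(instrs):
    """Return (kept instructions, map from removed dest to surviving dest)."""
    seen = {}
    rename = {}
    kept = []
    for instr in instrs:
        # A source may name a value that an earlier step already merged away.
        instr.srcs = tuple(rename.get(s, s) for s in instr.srcs)
        if can_cse(instr):
            k = key(instr)
            if k in seen:
                rename[instr.dest] = seen[k]
                continue
            seen[k] = instr.dest
        kept.append(instr)
    return kept, rename


RO = CAN_ELIMINATE | CAN_REORDER
prog = [
    Intrinsic("load_uniform", RO, (), (0,), 4, "ssa_0"),
    Intrinsic("load_uniform", RO, (), (0,), 4, "ssa_1"),  # duplicate of ssa_0
    Intrinsic("load_uniform", RO, (), (1,), 4, "ssa_2"),  # different index
    Intrinsic("store_output", 0, ("ssa_1",), (0,), 4, ""),
]
kept, rename = cse(prog)
```

After the pass the second load is gone and the store reads ssa_0 instead of ssa_1, which is what lets later ALU operations on those values compare equal as well.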


Re: [Mesa-dev] [PATCH 07/10] nir: Add algebraic optimizations for pointless shifts.

2015-01-23 Thread Jason Ekstrand
On Thu, Jan 22, 2015 at 9:09 AM, Matt Turner  wrote:

> On Thu, Jan 22, 2015 at 3:41 AM, Kenneth Graunke wrote:
> > The GLSL IR optimization pass contained these; we may as well include
> > them too.
> >
> > No change in the number of NIR instructions on a shader-db run.
> >
> > total i965 instructions in shared programs: 6035397 -> 6035393 (-0.00%)
> > i965 instructions in affected programs: 772 -> 768 (-0.52%)
> > helped: 3 (all in glamor)
> >
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  src/glsl/nir/nir_opt_algebraic.py | 7 +++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py
> > index cf16b19..58e71e0 100644
> > --- a/src/glsl/nir/nir_opt_algebraic.py
> > +++ b/src/glsl/nir/nir_opt_algebraic.py
> > @@ -83,6 +83,13 @@ optimizations = [
> > # DeMorgan's Laws
> > (('iand', ('inot', a), ('inot', b)), ('inot', ('ior',  a, b))),
> > (('ior',  ('inot', a), ('inot', b)), ('inot', ('iand', a, b))),
> > +   # Shift optimizations
> > +   (('ishl', 0, a), 0),
>
> Shift zero by an unknown -> zero. Yes.
>
> > +   (('ishl', a, 0), 0),
>
> Shift an unknown by zero -> zero?!
>

Yeah, that needs to be fixed


>
> With those fixed and shader-db results confirmed,
>

Same

Reviewed-by: Jason Ekstrand 


>
> Reviewed-by: Matt Turner 
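
The objection is easy to check mechanically. Below is a small Python sketch of a matcher for (search, replace) tuples shaped like the ones in nir_opt_algebraic.py. It is only an illustration: RULES, simplify() and the string "a" standing in for the pattern variable are assumptions of this sketch, not the real generator. The shift-by-zero rules return a rather than 0, and the same identity is applied to the right shifts.

```python
# Toy stand-in for the pattern variable `a` used in nir_opt_algebraic.py.
A = "a"

RULES = [
    (("ishl", 0, A), 0),   # 0 << a == 0
    (("ishl", A, 0), A),   # a << 0 == a, not 0
    (("ishr", 0, A), 0),
    (("ishr", A, 0), A),
    (("ushr", 0, A), 0),
    (("ushr", A, 0), A),
]


def simplify(expr):
    """Apply the first matching rule to (op, lhs, rhs); otherwise return expr."""
    op, lhs, rhs = expr
    for (rule_op, rule_lhs, rule_rhs), result in RULES:
        if rule_op != op:
            continue
        bound = None
        matched = True
        for pattern, value in ((rule_lhs, lhs), (rule_rhs, rhs)):
            if pattern == A:
                bound = value          # the variable matches anything
            elif pattern != value:
                matched = False        # a constant must match exactly
        if matched:
            return bound if result == A else result
    return expr


# Sanity check of the underlying arithmetic on plain integers.
shift_by_zero_is_identity = all((v << 0) == v and (v >> 0) == v
                                for v in range(-8, 9))
```

With these rules, simplify(("ishl", "x", 0)) gives back "x", while simplify(("ishl", 0, "x")) still folds to 0.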


Re: [Mesa-dev] [PATCH 08/10] nir: Add algebraic optimizations for simplifying comparisons.

2015-01-23 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Fri, Jan 23, 2015 at 1:46 PM, Matt Turner  wrote:

> Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 09/10] nir: Add algebraic optimizations for exponential/logarithmic functions.

2015-01-23 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Thu, Jan 22, 2015 at 3:41 AM, Kenneth Graunke wrote:

> Most of these exist in the GLSL IR algebraic pass already.  However,
> SSA allows us to find more instances of the patterns.
>
> total NIR instructions in shared programs: 2015593 -> 2011430 (-0.21%)
> NIR instructions in affected programs: 124189 -> 120026 (-3.35%)
> helped:604
>
> total i965 instructions in shared programs: 6025508 -> 6018718 (-0.11%)
> i965 instructions in affected programs: 261070 -> 254280 (-2.60%)
> helped: 1295
> HURT:   2 (by 1 instruction each)
> GAINED: 6
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/glsl/nir/nir_opt_algebraic.py | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py
> index dec250b..a5b5715 100644
> --- a/src/glsl/nir/nir_opt_algebraic.py
> +++ b/src/glsl/nir/nir_opt_algebraic.py
> @@ -99,6 +99,16 @@ optimizations = [
> (('ishr', a, 0), 0),
> (('ushr', 0, a), 0),
> (('ushr', a, 0), 0),
> +   # Exponential/logarithmic identities
> +   (('fexp2', ('flog2', a)), a), # 2^lg2(a) = a
> +   (('fexp',  ('flog',  a)), a), # e^ln(a)  = a
> +   (('flog2', ('fexp2', a)), a), # lg2(2^a) = a
> +   (('flog',  ('fexp',  a)), a), # ln(e^a)  = a
> +   (('fexp2', ('fmul', ('flog2', a), b)), ('fpow', a, b)), # 2^(lg2(a)*b) = a^b
> +   (('fexp',  ('fmul', ('flog', a), b)),  ('fpow', a, b)), # e^(ln(a)*b) = a^b
> +   (('fpow', a, 1.0), a),
> +   (('fpow', a, 2.0), ('fmul', a, a)),
> +   (('fpow', 2.0, a), ('fexp2', a)),
>
>  # This one may not be exact
> (('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
> --
> 2.2.2
>
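
As a quick numerical sanity check of the identities in the patch, here is a Python sketch using the standard math module. The sample values and the tolerance are arbitrary choices of this sketch. It only exercises positive a, since log2 and log are not defined for a <= 0 (Python raises ValueError there), so the rewrites are only meaningful where the original expression was.

```python
import math


def close(x, y):
    # Floating point round trips are not bit-exact, so compare with a tolerance.
    return math.isclose(x, y, rel_tol=1e-9, abs_tol=1e-12)


def check_identities(samples=(0.25, 1.0, 3.0, 10.0), exponents=(-2.0, 0.5, 3.0)):
    """Return True if every identity from the patch holds on the samples."""
    for a in samples:
        checks = [
            close(2.0 ** math.log2(a), a),       # fexp2(flog2(a)) -> a
            close(math.exp(math.log(a)), a),     # fexp(flog(a))   -> a
            close(math.log2(2.0 ** a), a),       # flog2(fexp2(a)) -> a
            close(math.log(math.exp(a)), a),     # flog(fexp(a))   -> a
            close(a ** 1.0, a),                  # fpow(a, 1.0)    -> a
            close(a ** 2.0, a * a),              # fpow(a, 2.0)    -> a * a
        ]
        for b in exponents:
            checks.append(close(2.0 ** (math.log2(a) * b), a ** b))
            checks.append(close(math.exp(math.log(a) * b), a ** b))
        if not all(checks):
            return False
    return True


def log2_domain_error(x):
    """Return True if math.log2 rejects x."""
    try:
        math.log2(x)
    except ValueError:
        return True
    return False
```

check_identities() passing says nothing about exactness on GPU hardware; it only confirms the algebra on a handful of doubles.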


Re: [Mesa-dev] [PATCH 10/10] nir: Add algebraic optimizations for division and reciprocal.

2015-01-23 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Thu, Jan 22, 2015 at 9:18 AM, Matt Turner  wrote:

> 8-10 are
>
> Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH] egl/dri2: implement platform_null.

2015-01-23 Thread Emil Velikov
On 23/01/15 19:24, Haixia Shi wrote:
> Hi Emil,
> 
> On Fri, Jan 23, 2015 at 8:42 AM, Emil Velikov wrote:
>> Might be worth having a look at how platform_drm does it. But be warned,
>> there be dragons :)
> 
> It seems platform_drm would cast disp->PlatformDisplay to a gbm_device
> and use it if available; otherwise it always uses the first normal
> node (/dev/dri/card0).
> 
> Can it be assumed that if render nodes are available then it would
> always be the first one (/dev/dri/renderD128)? Otherwise I still think
> it is correct to run a for loop to try all the available render nodes
> (renderD128..renderD191)
> 
Ouch, so it seems that things are already in the "funny" lane. In that
case I would like to withdraw my objection on the topic, considering
that we already have a similar "not so elegant" solution.

Perhaps nvidia's GL extensions on the topic might provide a bit more
flexibility ?

As for "should we loop through all the render nodes vs open the first
one", I don't have a strong preference either way.

Cheers,
Emil

P.S. For the future, please add a note as to which patch is the last
one. Revision changelog/info would be appreciated as well.
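
For reference, the "try every render node" loop discussed above could look roughly like the Python sketch below. It assumes the minor range renderD128..renderD191 and the /dev/dri directory mentioned in the thread; open_first_render_node() and its open_node hook are hypothetical names, and the hook exists only so the loop can be exercised without real hardware.

```python
import os

DRM_DIR_NAME = "/dev/dri"      # same value as the libdrm macro of that name
RENDER_MINOR_FIRST = 128       # assumption: range taken from the thread
RENDER_MINOR_LAST = 191


def render_node_paths():
    """List every candidate render node path, lowest minor first."""
    return ["%s/renderD%d" % (DRM_DIR_NAME, minor)
            for minor in range(RENDER_MINOR_FIRST, RENDER_MINOR_LAST + 1)]


def _default_open(path):
    # O_CLOEXEC is not available on every platform, hence the getattr.
    return os.open(path, os.O_RDWR | getattr(os, "O_CLOEXEC", 0))


def open_first_render_node(open_node=_default_open):
    """Return (path, fd) for the first node that opens, or None if none do."""
    for path in render_node_paths():
        try:
            return path, open_node(path)
        except OSError:
            continue   # missing or inaccessible node: try the next one
    return None


def _demo_open(path):
    # Fake opener for illustration: pretend only renderD130 exists.
    if path.endswith("renderD130"):
        return 42
    raise FileNotFoundError(path)


demo_result = open_first_render_node(_demo_open)
```

Opening only the first node is the same loop with the range cut down to a single minor, so the two options differ in policy rather than in code.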



Re: [Mesa-dev] [PATCH 01/41] glapi: Added ARB_direct_state_access.xml file.

2015-01-23 Thread Ian Romanick
On 01/23/2015 01:53 PM, Jason Ekstrand wrote:
> On Fri, Jan 23, 2015 at 1:46 PM, Emil Velikov wrote:
> On 23/01/15 20:51, Jason Ekstrand wrote:
> > On Thu, Jan 22, 2015 at 9:27 AM, Emil Velikov wrote:
> >
> > On 05/01/15 17:45, Laura Ekstrand wrote:
> > > This comment is vague.  Do you have a specific recommendation for the
> > > code here?
> > >
> > Seems like I'm way too subtle - yes I have a few.
> >
> >
> > 1. Add ARB_direct_state_access to struct gl_extensions
> > --- a/src/mesa/main/mtypes.h
> > +++ b/src/mesa/main/mtypes.h
> > @@ -3731,6 +3731,7 @@ struct gl_extensions
> > GLboolean ARB_depth_clamp;
> > GLboolean ARB_depth_texture;
> > GLboolean ARB_derivative_control;
> > +   GLboolean ARB_direct_state_access;
> > GLboolean ARB_draw_buffers_blend;
> > GLboolean ARB_draw_elements_base_vertex;
> >
> >
> > 2. Use it in the extensions table.
> > --- a/src/mesa/main/extensions.c
> > +++ b/src/mesa/main/extensions.c
> > @@ -103,6 +103,7 @@ static const struct extension extension_table[] = {
> > { "GL_ARB_depth_clamp",          o(ARB_depth_clamp),          GL,  2003 },
> > { "GL_ARB_depth_texture",        o(ARB_depth_texture),        GLL, 2001 },
> > { "GL_ARB_derivative_control",   o(ARB_derivative_control),   GL,  2014 },
> > +   { "GL_ARB_direct_state_access",  o(ARB_direct_state_access),  GL,  2014 },
> >
> >
> > 3. Make use of it when the spec amends existing behaviour - most of the
> > spec text as of section "New Tokens" onwards. Clearly with this series
> > you're adding the new entry points (functions) so it does not apply
> > here :)
> >
> >
> > if (foo->Extensions.ARB_direct_state_access) {
> >  
> > }
> >
> >
> > Pretty much every extension that was added to mesa follows this approach
> > so keeping up with traditions is always nice.
> >
> >
> > Yes, and no...  We have the table of booleans in gl_extensions so that
> > we can expose different extensions/behavior on different drivers.
> > However, ARB_direct_state_access doesn't actually add new functionality,
> > just new ways of getting at old functionality.  We *should* be able to
> > implement it in a driver-agnostic way entirely within core mesa.
> > Therefore, there's no reason to be able to shut it off on a per-driver
> > basis and no reason for the flag in gl_extensions.  If we find that, for
> > some reason, we only want to support it in core contexts or that it adds
> > something some drivers can't handle, then we'll need the flag.
> True, yet the usual approach so far has been:
> 1. add the flag
> 2. enable when/where possible
> 3. evaluate if things can be enabled for everyone
> 4. drop it (replace with dummy_true).
> Why bother? See below.
> 
> 
> The "usual approach" is for extensions that add functionality and
> require per-driver implementation.  This extension is kind of unique in
> that *nothing* it adds is per-driver (as far as I know).
>  
> 
> There will be a point where the extension will still be dummy_false, yet
> the amendments to the spec will be applied.
> 
> 
> What "amendments to the spec"?  Once it gets implemented, we'll turn it on.
>  
> 
> At that point there will be a "few" reports from your QA team and other
> people, that piglit (other) has regressed. Going the usual route will
> save you that, at the cost of having one extra commit worth
> (presumably) ~50loc.
> 
> Hope with ^^ things make (a bit more) sense :)
> 
> 
> Not really.  Right now it's not even 100% implemented, so it needs to be
> off for everyone.  As far as anyone can tell, it will go directly from
> dummy_false to dummy_true.  If we do find something in the way of
> implementing it that can't be done on some drivers, we can add the flag
> and then turn it on per-driver instead of turning it on for everyone. 
> I'm really not seeing how a per-driver flag will do any good.

Yeah, I agree.  It's pretty common for things that are just API sugar to
start life as dummy_true.  The risk is generally low.  It may be a bit
higher in this case since there's a LOT of sugar, but I'm not terribly
worried.


[Mesa-dev] [PATCH] GL: Update glext.h to fix ARB_dsa function prototypes.

2015-01-23 Thread Laura Ekstrand
---
 include/GL/glext.h | 48 ++--
 1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/include/GL/glext.h b/include/GL/glext.h
index d3cfbb5..0ca89ca 100644
--- a/include/GL/glext.h
+++ b/include/GL/glext.h
@@ -33,7 +33,7 @@ extern "C" {
 ** used to make the header, and the header can be found at
 **   http://www.opengl.org/registry/
 **
-** Khronos $Revision: 28986 $ on $Date: 2014-11-18 18:43:15 -0800 (Tue, 18 Nov 
2014) $
+** Khronos $Revision: 29537 $ on $Date: 2015-01-22 02:32:35 -0800 (Thu, 22 Jan 
2015) $
 */
 
 #if defined(_WIN32) && !defined(APIENTRY) && !defined(__CYGWIN__) && 
!defined(__SCITECH_SNAP__)
@@ -53,7 +53,7 @@ extern "C" {
 #define GLAPI extern
 #endif
 
-#define GL_GLEXT_VERSION 20141118
+#define GL_GLEXT_VERSION 20150122
 
 /* Generated C header for:
  * API: gl
@@ -2607,25 +2607,25 @@ GLAPI void APIENTRY glBindVertexBuffers (GLuint first, 
GLsizei count, const GLui
 typedef void (APIENTRYP PFNGLCLIPCONTROLPROC) (GLenum origin, GLenum depth);
 typedef void (APIENTRYP PFNGLCREATETRANSFORMFEEDBACKSPROC) (GLsizei n, GLuint 
*ids);
 typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERBASEPROC) (GLuint xfb, 
GLuint index, GLuint buffer);
-typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERRANGEPROC) (GLuint xfb, 
GLuint index, GLuint buffer, GLintptr offset, GLsizei size);
+typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERRANGEPROC) (GLuint xfb, 
GLuint index, GLuint buffer, GLintptr offset, GLsizeiptr size);
 typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKIVPROC) (GLuint xfb, GLenum 
pname, GLint *param);
 typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKI_VPROC) (GLuint xfb, GLenum 
pname, GLuint index, GLint *param);
 typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKI64_VPROC) (GLuint xfb, 
GLenum pname, GLuint index, GLint64 *param);
 typedef void (APIENTRYP PFNGLCREATEBUFFERSPROC) (GLsizei n, GLuint *buffers);
-typedef void (APIENTRYP PFNGLNAMEDBUFFERSTORAGEPROC) (GLuint buffer, GLsizei 
size, const void *data, GLbitfield flags);
-typedef void (APIENTRYP PFNGLNAMEDBUFFERDATAPROC) (GLuint buffer, GLsizei 
size, const void *data, GLenum usage);
-typedef void (APIENTRYP PFNGLNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLintptr 
offset, GLsizei size, const void *data);
-typedef void (APIENTRYP PFNGLCOPYNAMEDBUFFERSUBDATAPROC) (GLuint readBuffer, 
GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset, GLsizei size);
+typedef void (APIENTRYP PFNGLNAMEDBUFFERSTORAGEPROC) (GLuint buffer, 
GLsizeiptr size, const void *data, GLbitfield flags);
+typedef void (APIENTRYP PFNGLNAMEDBUFFERDATAPROC) (GLuint buffer, GLsizeiptr 
size, const void *data, GLenum usage);
+typedef void (APIENTRYP PFNGLNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLintptr 
offset, GLsizeiptr size, const void *data);
+typedef void (APIENTRYP PFNGLCOPYNAMEDBUFFERSUBDATAPROC) (GLuint readBuffer, 
GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset, GLsizeiptr size);
 typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERDATAPROC) (GLuint buffer, GLenum 
internalformat, GLenum format, GLenum type, const void *data);
-typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERSUBDATAPROC) (GLuint buffer, 
GLenum internalformat, GLintptr offset, GLsizei size, GLenum format, GLenum 
type, const void *data);
+typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERSUBDATAPROC) (GLuint buffer, 
GLenum internalformat, GLintptr offset, GLsizeiptr size, GLenum format, GLenum 
type, const void *data);
 typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERPROC) (GLuint buffer, GLenum 
access);
-typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERRANGEPROC) (GLuint buffer, 
GLintptr offset, GLsizei length, GLbitfield access);
+typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERRANGEPROC) (GLuint buffer, 
GLintptr offset, GLsizeiptr length, GLbitfield access);
 typedef GLboolean (APIENTRYP PFNGLUNMAPNAMEDBUFFERPROC) (GLuint buffer);
-typedef void (APIENTRYP PFNGLFLUSHMAPPEDNAMEDBUFFERRANGEPROC) (GLuint buffer, 
GLintptr offset, GLsizei length);
+typedef void (APIENTRYP PFNGLFLUSHMAPPEDNAMEDBUFFERRANGEPROC) (GLuint buffer, 
GLintptr offset, GLsizeiptr length);
 typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPARAMETERIVPROC) (GLuint buffer, 
GLenum pname, GLint *params);
 typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPARAMETERI64VPROC) (GLuint buffer, 
GLenum pname, GLint64 *params);
 typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPOINTERVPROC) (GLuint buffer, 
GLenum pname, void **params);
-typedef void (APIENTRYP PFNGLGETNAMEDBUFFERSUBDATAPROC) (GLuint buffer, 
GLintptr offset, GLsizei size, void *data);
+typedef void (APIENTRYP PFNGLGETNAMEDBUFFERSUBDATAPROC) (GLuint buffer, 
GLintptr offset, GLsizeiptr size, void *data);
 typedef void (APIENTRYP PFNGLCREATEFRAMEBUFFERSPROC) (GLsizei n, GLuint 
*framebuffers);
 typedef void (APIENTRYP PFNGLNAMEDFRAMEBUFFERRENDERBUFFERPROC) (GLuint 
framebuffer, GLenum attachment, GLenum renderbuffertarget, GLuint renderbuffer);
 typedef void (APIENTRYP PFNGLNAMEDFRAMEBUFFERPARAMETERIPROC) (GLuint 
fram

Re: [Mesa-dev] [PATCH 01/41] glapi: Added ARB_direct_state_access.xml file.

2015-01-23 Thread Emil Velikov
On 23/01/15 21:53, Jason Ekstrand wrote:
> 
> 
> On Fri, Jan 23, 2015 at 1:46 PM, Emil Velikov wrote:
> 
> On 23/01/15 20:51, Jason Ekstrand wrote:
> >
> >
> > On Thu, Jan 22, 2015 at 9:27 AM, Emil Velikov wrote:
> >
> > On 05/01/15 17:45, Laura Ekstrand wrote:
> > > This comment is vague.  Do you have a specific recommendation for the
> > > code here?
> > >
> > Seems like I'm way too subtle - yes I have a few.
> >
> >
> > 1. Add ARB_direct_state_access to struct gl_extensions
> > --- a/src/mesa/main/mtypes.h
> > +++ b/src/mesa/main/mtypes.h
> > @@ -3731,6 +3731,7 @@ struct gl_extensions
> > GLboolean ARB_depth_clamp;
> > GLboolean ARB_depth_texture;
> > GLboolean ARB_derivative_control;
> > +   GLboolean ARB_direct_state_access;
> > GLboolean ARB_draw_buffers_blend;
> > GLboolean ARB_draw_elements_base_vertex;
> >
> >
> > 2. Use it in the extensions table.
> > --- a/src/mesa/main/extensions.c
> > +++ b/src/mesa/main/extensions.c
> > @@ -103,6 +103,7 @@ static const struct extension extension_table[] = {
> > { "GL_ARB_depth_clamp",          o(ARB_depth_clamp),          GL,  2003 },
> > { "GL_ARB_depth_texture",        o(ARB_depth_texture),        GLL, 2001 },
> > { "GL_ARB_derivative_control",   o(ARB_derivative_control),   GL,  2014 },
> > +   { "GL_ARB_direct_state_access",  o(ARB_direct_state_access),  GL,  2014 },
> >
> >
> > 3. Make use of it when the spec amends existing behaviour - most of the
> > spec text as of section "New Tokens" onwards. Clearly with this series
> > you're adding the new entry points (functions) so it does not apply
> > here :)
> >
> >
> > if (foo->Extensions.ARB_direct_state_access) {
> >  
> > }
> >
> >
> > Pretty much every extension that was added to mesa follows this approach
> > so keeping up with traditions is always nice.
> >
> >
> > Yes, and no...  We have the table of booleans in gl_extensions so that
> > we can expose different extensions/behavior on different drivers.
> > However, ARB_direct_state_access doesn't actually add new functionality,
> > just new ways of getting at old functionality.  We *should* be able to
> > implement it in a driver-agnostic way entirely within core mesa.
> > Therefore, there's no reason to be able to shut it off on a per-driver
> > basis and no reason for the flag in gl_extensions.  If we find that, for
> > some reason, we only want to support it in core contexts or that it adds
> > something some drivers can't handle, then we'll need the flag.
> True, yet the usual approach so far has been:
> 1. add the flag
> 2. enable when/where possible
> 3. evaluate if things can be enabled for everyone
> 4. drop it (replace with dummy_true).
> Why bother? See below.
> 
> 
> The "usual approach" is for extensions that add functionality and
> require per-driver implementation.  This extension is kind of unique in
> that *nothing* it adds is per-driver (as far as I know).
>  
There have been other similar cases, yet I cannot think of one off the top
of my head. And yes, I did understand that it has *nothing* driver
specific about it :)

> 
> There will be a point where the extension will still be dummy_false, yet
> the amendments to the spec will be applied.
> 
> 
> What "amendments to the spec"?  Once it gets implemented, we'll turn it on.
>  
See note 3, that I've mentioned above. Here is a rough example:

As you handle the following
"
Accepted by the <pname> parameter of GetTextureParameter{if}v and
GetTextureParameterI{i ui}v:

TEXTURE_TARGET  0x1006
"

you will allow the pname, in a scenario when one should not.
I.e. the extension will not be advertised, yet the parameter will be
accepted and no error will be thrown.

This is a silly example, yet I hope it illustrates the point.

> 
> At that point there will be a "few" reports from your QA team and other
> people, that piglit (other) has regressed. Going the usual route will
> save you that, at the cost of having one extra commit worth
> (presumably) ~50loc.
> 
> Hope with ^^ things make (a bit more) sense :)
> 
> 
> Not really.  Right now it's not even 100% implemented, so it needs to be
> off for everyone.
True, I'm not against that.
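
To spell out the TEXTURE_TARGET concern, here is a deliberately tiny Python model. It is not Mesa code: Context, its extensions dictionary and get_texture_parameter_error() are hypothetical stand-ins, and only the two enum values are real GL constants. The point is just that the extension flag gates whether the new pname is accepted.

```python
GL_NO_ERROR = 0
GL_INVALID_ENUM = 0x0500
GL_TEXTURE_TARGET = 0x1006   # value quoted from the spec text above


class Context:
    """Hypothetical stand-in for a GL context with a per-extension flag."""

    def __init__(self, dsa_enabled):
        self.extensions = {"ARB_direct_state_access": dsa_enabled}


def get_texture_parameter_error(ctx, pname):
    """Return the GL error a query for pname would generate in this model."""
    if pname == GL_TEXTURE_TARGET:
        # New token: only legal once the extension is actually advertised.
        if not ctx.extensions["ARB_direct_state_access"]:
            return GL_INVALID_ENUM
        return GL_NO_ERROR
    return GL_INVALID_ENUM   # every other pname is out of scope for the sketch


error_when_off = get_texture_parameter_error(Context(False), GL_TEXTURE_TARGET)
error_when_on = get_texture_parameter_error(Context(True), GL_TEXTURE_TARGET)
```

Without the check on the flag, the first call would return GL_NO_ERROR even though the extension is not advertised, which is the scenario described above.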

Re: [Mesa-dev] [PATCH] GL: Update glext.h to fix ARB_dsa function prototypes.

2015-01-23 Thread Laura Ekstrand
I checked, and all of the currently implemented DSA functions build with
this file.

Laura
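
For anyone wondering why the GLsizei -> GLsizeiptr change matters: GLsizei is a plain int, while GLsizeiptr is pointer-sized, so buffer sizes above 2 GiB only fit in the latter on 64-bit builds. A small Python/ctypes sketch of the difference follows; the two aliases are stand-ins chosen for this sketch (c_int for GLsizei, c_ssize_t for the ptrdiff_t-sized GLsizeiptr), and fits() is a hypothetical helper.

```python
import ctypes

GLsizei = ctypes.c_int          # stand-in for `typedef int GLsizei;`
GLsizeiptr = ctypes.c_ssize_t   # stand-in for the pointer-sized GLsizeiptr


def fits(ctype, value):
    """Return True if value survives a round trip through ctype unchanged."""
    # ctypes does no overflow checking, so an oversized value silently wraps.
    return ctype(value).value == value


big = 3 * 1024 ** 3             # a 3 GiB buffer size

small_fits_sizei = fits(GLsizei, 1024)
big_fits_sizei = fits(GLsizei, big)
big_fits_sizeiptr = fits(GLsizeiptr, big)
```

On a typical 64-bit build big_fits_sizei is False and big_fits_sizeiptr is True; on a 32-bit build neither type can hold the value.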


Re: [Mesa-dev] [PATCH] i965: Do Sandybridge workaround flushes before each primitive.

2015-01-23 Thread Kenneth Graunke
On Thursday, January 22, 2015 07:14:33 PM Emil Velikov wrote:
> On 10/01/15 07:07, Kenneth Graunke wrote:
> > Sandybridge requires the post-sync non-zero workaround in a ton of
> > places, and if you ever miss one, the GPU usually hangs.
> > 
> Would it be worth including this in the stable branch ?
> Cc: 
> 
> Thanks
> Emil

Probably not right away, at least - this is really finicky stuff, and while
I'm hoping it could help users with hangs, it might have the opposite effect
and cause lots of hangs :)

I was hoping to receive confirmation that it actually helped somebody first.

--Ken



Re: [Mesa-dev] [PATCH 01/41] glapi: Added ARB_direct_state_access.xml file.

2015-01-23 Thread Laura Ekstrand
Emil,

In situations such as your TEXTURE_TARGET example, the functionality is not
exposed to non-DSA functions.  I've been making the backend functions take
a bool dsa that tells them how to behave depending on whether
glTexParameter or glTextureParameter was called.  Now this is arguably more
cumbersome than ctx->Extensions.ARB_direct_state_access, but it works to
prevent the user from seeing DSA functionality that would confuse them.

Laura
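
Roughly, the "bool dsa" pattern looks like the Python sketch below. All three function names are hypothetical and the real backend is C inside core mesa; the sketch only shows how one shared backend can hide DSA-only tokens from the non-DSA entry point.

```python
GL_NO_ERROR = 0
GL_INVALID_ENUM = 0x0500
GL_TEXTURE_TARGET = 0x1006


def _get_tex_parameter_backend(pname, dsa):
    """Shared backend (hypothetical): `dsa` says which entry point called us."""
    if pname == GL_TEXTURE_TARGET:
        # DSA-only token: reject it when reached through the old entry point.
        return GL_NO_ERROR if dsa else GL_INVALID_ENUM
    return GL_INVALID_ENUM   # other pnames are left out of this sketch


def get_tex_parameter(pname):
    """Stand-in for the non-DSA entry point."""
    return _get_tex_parameter_backend(pname, dsa=False)


def get_texture_parameter(pname):
    """Stand-in for the DSA entry point."""
    return _get_tex_parameter_backend(pname, dsa=True)
```

The trade-off is that the decision travels with each call rather than living in one per-context flag, which is why it is more cumbersome but needs no entry in gl_extensions.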


[Mesa-dev] [PATCH] i965/fs: Allow SIMD16 on pre-SNB when try_replace_with_sel is successful

2015-01-23 Thread Ian Romanick
From: Ian Romanick 

If try_replace_with_sel is able to replace the flow control with a SEL
instruction, then there is no flow control... failing SIMD16 because
of nonexistent flow control is wrong.

No piglit regressions on any i965 platform in Jenkins.

total instructions in shared programs: 4382707 -> 4382707 (0.00%)
instructions in affected programs: 0 -> 0
helped:0
HURT:  0
GAINED:2089
LOST:  0

No other platforms affected in shader-db.

Signed-off-by: Ian Romanick 
---
 src/mesa/drivers/dri/i965/brw_fs.h   |  2 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  8 +++-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 16 +---
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 1de10bb..419fe48 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -535,7 +535,7 @@ public:
bool try_emit_saturate(ir_expression *ir);
bool try_emit_line(ir_expression *ir);
bool try_emit_mad(ir_expression *ir);
-   void try_replace_with_sel();
+   bool try_replace_with_sel();
bool opt_peephole_sel();
bool opt_peephole_predicated_break();
bool opt_saturate_propagation();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 510092e..2d64f3a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -384,10 +384,6 @@ fs_visitor::nir_emit_cf_list(exec_list *list)
 void
 fs_visitor::nir_emit_if(nir_if *if_stmt)
 {
-   if (brw->gen < 6) {
-  no16("Can't support (non-uniform) control flow on SIMD16\n");
-   }
-
/* first, put the condition into f0 */
fs_inst *inst = emit(MOV(reg_null_d,
 retype(get_nir_src(if_stmt->condition),
@@ -405,7 +401,9 @@ fs_visitor::nir_emit_if(nir_if *if_stmt)
 
emit(BRW_OPCODE_ENDIF);
 
-   try_replace_with_sel();
+   if (!try_replace_with_sel() && brw->gen < 6) {
+  no16("Can't support (non-uniform) control flow on SIMD16\n");
+   }
 }
 
 void
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 9805b55..f5d7383 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -2741,7 +2741,7 @@ fs_visitor::emit_if_gen6(ir_if *ir)
  *
  * If src0 is an immediate value, we promote it to a temporary GRF.
  */
-void
+bool
 fs_visitor::try_replace_with_sel()
 {
fs_inst *endif_inst = (fs_inst *) instructions.get_tail();
@@ -2755,7 +2755,7 @@ fs_visitor::try_replace_with_sel()
fs_inst *match = (fs_inst *) endif_inst->prev;
for (int i = 0; i < 4; i++) {
   if (match->is_head_sentinel() || match->opcode != opcodes[4-i-1])
- return;
+ return false;
   match = (fs_inst *) match->prev;
}
 
@@ -2797,16 +2797,16 @@ fs_visitor::try_replace_with_sel()
  sel->predicate = if_inst->predicate;
  sel->predicate_inverse = if_inst->predicate_inverse;
   }
+
+  return true;
}
+
+   return false;
 }
 
 void
 fs_visitor::visit(ir_if *ir)
 {
-   if (brw->gen < 6) {
-  no16("Can't support (non-uniform) control flow on SIMD16\n");
-   }
-
/* Don't point the annotation at the if statement, because then it plus
 * the then and else blocks get printed.
 */
@@ -2836,7 +2836,9 @@ fs_visitor::visit(ir_if *ir)
 
emit(BRW_OPCODE_ENDIF);
 
-   try_replace_with_sel();
+   if (!try_replace_with_sel() && brw->gen < 6) {
+  no16("Can't support (non-uniform) control flow on SIMD16\n");
+   }
 }
 
 void
-- 
2.1.0



Re: [Mesa-dev] [PATCH] nir: Make vec-to-movs handle src/dest aliasing.

2015-01-23 Thread Eric Anholt
Connor Abbott  writes:

> Argh, nevermind, I was reading it wrong...
>
> On Thu, Jan 22, 2015 at 8:18 PM, Connor Abbott  wrote:
>> What happens if you have something like foo = vec3(foo.z, bar.x,
>> foo.x)? I don't think emitting vector mov's for only the contiguous
>> components is enough.
>>
>> On Thu, Jan 22, 2015 at 4:51 PM, Eric Anholt  wrote:
>>> +static unsigned
>>> +insert_movs(nir_alu_instr *vec, unsigned start_channel,
>>> +unsigned start_src_idx, void *mem_ctx)
>
> We need a comment explaining what this function does and what it
> returns. Also, it only creates a single move so it should be called
> insert_mov().

How about:

/**
 * For a given writemask channel in the vec instruction, insert a MOV of all
 * the src values that come from the same reg to the destination of the vec
 * instruction.
 */




Re: [Mesa-dev] [PATCH] i965: Do Sandybridge workaround flushes before each primitive.

2015-01-23 Thread Emil Velikov
On 23/01/15 22:25, Kenneth Graunke wrote:
> On Thursday, January 22, 2015 07:14:33 PM Emil Velikov wrote:
>> On 10/01/15 07:07, Kenneth Graunke wrote:
>>> Sandybridge requires the post-sync non-zero workaround in a ton of
>>> places, and if you ever miss one, the GPU usually hangs.
>>>
>> Would it be worth including this in the stable branch ?
>> Cc: 
>>
>> Thanks
>> Emil
> 
> Probably not right away, at least - this is really finicky stuff, and while
> I'm hoping it could help users with hangs, it might have the opposite effect
> and cause lots of hangs :)
> 
> I was hoping to receive confirmation that it actually helped somebody first.
> 
I have a Sandy Bridge machine here. If you have a rough idea how one
can replicate the GPU hangs I can give it a try.

I've only seen two GPU hangs, but that was in the dawn of 10.2 (early
10.3). They happened while running piglit, yet the issue was intermittent.

Cheers
-Emil


Re: [Mesa-dev] [PATCH] i965/fs: Allow SIMD16 on pre-SNB when try_replace_with_sel is successful

2015-01-23 Thread Matt Turner
On Fri, Jan 23, 2015 at 2:32 PM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> If try_replace_with_sel is able to replace the flow control with a SEL
> instruction, then there is no flow control... failing SIMD16 because
> of nonexistent flow control is wrong.
>
> No piglit regressions on any i965 platform in Jenkins.
>
> total instructions in shared programs: 4382707 -> 4382707 (0.00%)
> instructions in affected programs: 0 -> 0
> helped:0
> HURT:  0
> GAINED:2089
> LOST:  0
>
> No other platforms affected in shader-db.

We can probably do a little better by bailing in the generator when we
emit an IF instruction, since some optimizations like
opt_peephole_sel() and opt_peephole_predicated_break() remove control
flow.

But this is an improvement by itself, so

Reviewed-by: Matt Turner 
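
For readers following the thread, the caller-side change in the patch can be sketched outside of Mesa. This is a minimal standalone illustration, not the driver code: `Op`, `try_replace_with_sel_sketch` and `emit_if_sketch` are hypothetical names, and the pattern match is simplified (the real function also checks that the two MOVs are compatible before emitting a SEL). It assumes only what the patch shows: the helper reports whether the IF/MOV/ELSE/MOV/ENDIF sequence was collapsed, and SIMD16 is given up only when flow control remains on gen < 6.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified opcodes; the real pass works on fs_inst lists.
enum class Op { IF, MOV, ELSE, ENDIF, SEL, OTHER };

// Returns true if the last five instructions were IF/MOV/ELSE/MOV/ENDIF and
// were replaced by a single SEL. Returns false and leaves the list untouched
// otherwise.
bool try_replace_with_sel_sketch(std::vector<Op> &insts)
{
   static const Op pattern[] = { Op::IF, Op::MOV, Op::ELSE, Op::MOV, Op::ENDIF };
   const std::size_t n = sizeof(pattern) / sizeof(pattern[0]);

   if (insts.size() < n)
      return false;

   const std::size_t base = insts.size() - n;
   for (std::size_t i = 0; i < n; i++) {
      if (insts[base + i] != pattern[i])
         return false;
   }

   insts.resize(base);
   insts.push_back(Op::SEL);
   return true;
}

// Mirrors the caller-side condition from the patch: returns true when SIMD16
// would have to be disabled, i.e. flow control survived and gen < 6.
bool emit_if_sketch(std::vector<Op> &insts, int gen)
{
   return !try_replace_with_sel_sketch(insts) && gen < 6;
}
```

With this shape, an if/else that collapses to a SEL no longer trips the SIMD16 bail-out on older generations, while any other if statement still does.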


Re: [Mesa-dev] [PATCH 11/16] i965/fs: Add pass to propagate conditional modifiers.

2015-01-23 Thread Kenneth Graunke
On Monday, January 19, 2015 03:31:10 PM Matt Turner wrote:
> total instructions in shared programs: 5974160 -> 5959463 (-0.25%)
> instructions in affected programs: 1743737 -> 1729040 (-0.84%)
> GAINED:0
> LOST:  12
> ---
>  src/mesa/drivers/dri/i965/Makefile.sources |  1 +
>  src/mesa/drivers/dri/i965/brw_fs.cpp   |  1 +
>  src/mesa/drivers/dri/i965/brw_fs.h |  1 +
>  .../drivers/dri/i965/brw_fs_cmod_propagation.cpp   | 97 ++
>  4 files changed, 100 insertions(+)
>  create mode 100644 src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
> 
> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
> index 3b72955..da48455 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.sources
> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> @@ -39,6 +39,7 @@ i965_FILES = \
>   brw_ff_gs_emit.c \
>   brw_ff_gs.h \
>   brw_fs_channel_expressions.cpp \
> + brw_fs_cmod_propagation.cpp \
>   brw_fs_copy_propagation.cpp \
>   brw_fs.cpp \
>   brw_fs_cse.cpp \
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 73d722e..994d457 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -3581,6 +3581,7 @@ fs_visitor::optimize()
>OPT(opt_cse);
>OPT(opt_copy_propagate);
>OPT(opt_peephole_predicated_break);
> +  OPT(opt_cmod_propagation);
>OPT(dead_code_eliminate);
>OPT(opt_peephole_sel);
>OPT(dead_control_flow_eliminate, this);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index 9c125a6..e1bc7d7 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -539,6 +539,7 @@ public:
> bool opt_peephole_sel();
> bool opt_peephole_predicated_break();
> bool opt_saturate_propagation();
> +   bool opt_cmod_propagation();
> void emit_bool_to_cond_code(ir_rvalue *condition);
> void emit_if_gen6(ir_if *ir);
> void emit_unspill(bblock_t *block, fs_inst *inst, fs_reg reg,
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
> new file mode 100644
> index 000..5ba2fd6
> --- /dev/null
> +++ b/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
> @@ -0,0 +1,97 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "brw_fs.h"
> +#include "brw_fs_live_variables.h"
> +#include "brw_cfg.h"
> +
> +/** @file brw_fs_cmod_propagation.cpp
> + *
> + * Implements a pass that propagates the conditional modifier from a CMP x 0.0
> + * instruction into the instruction that generated x. For instance, in this
> + * sequence
> + *
> + *    add(8)          g70<1>F    g69<8,8,1>F    4096F
> + *    cmp.ge.f0(8)    null       g70<8,8,1>F    0F
> + *
> + * we can do the comparison as part of the ADD instruction directly:
> + *
> + *    add.ge.f0(8)    g70<1>F    g69<8,8,1>F    4096F
> + */
> +
> +static bool
> +opt_cmod_propagation_local(fs_visitor *v, bblock_t *block)
> +{
> +   bool progress = false;
> +   int ip = block->end_ip + 1;
> +
> +   foreach_inst_in_block_reverse_safe(fs_inst, inst, block) {
> +  ip--;
> +
> +  if (inst->opcode != BRW_OPCODE_CMP ||
> +  inst->predicate != BRW_PREDICATE_NONE ||
> +  !inst->dst.is_null() ||
> +  inst->src[0].file != GRF ||
> +  inst->src[0].abs ||
> +  inst->src[0].negate ||
> +  !inst->src[1].is_zero())
> + continue;
> +
> +  foreach_inst_in_block_reverse_starting_from(fs_inst, scan_inst, inst,
> +  block) {
>

Re: [Mesa-dev] [PATCH 1/4] nir: use Python to autogenerate opcode information

2015-01-23 Thread Matt Turner
On Fri, Jan 23, 2015 at 1:43 PM, Jason Ekstrand  wrote:
>  create mode 100644 src/glsl/nir/nir_opcodes.py
>  create mode 100644 src/glsl/nir/nir_opcodes_c.py
>  create mode 100644 src/glsl/nir/nir_opcodes_h.py

Add the python files to EXTRA_DIST in src/glsl/Makefile.am before you commit.


Re: [Mesa-dev] [PATCH 2/4] nir: add new constant folding infrastructure

2015-01-23 Thread Connor Abbott
To be honest, I'm not sure how much clearer this is than the original
version... I think any way we do it is going to be somewhat messy. But
I do understand what it's doing (at least I think so), and it's better
to have something that both of us understand than something that only
one of us really understands. And I do like the idea of making the C
compiler do more of the work by making dst, src0, src1, etc. actual C
variables so that we can include the original expression/statement
directly.
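
As a rough illustration of the "make dst, src0, src1 actual C variables" idea, here is what one generated evaluator per opcode might look like. This is only a sketch: the function names `fold_fadd` and `fold_fmax` and the flat `float` input array are assumptions made for the example, not the output of the generator in the patch.

```cpp
// Hypothetical per-opcode evaluators, written the way a generator might emit
// them: the inputs are bound to locals named src0/src1, the opcode's
// expression is pasted in verbatim, and the result is read back from dst.
float fold_fadd(const float *src)
{
   const float src0 = src[0];
   const float src1 = src[1];
   float dst = src0 + src1;                 // pasted expression: src0 + src1
   return dst;
}

float fold_fmax(const float *src)
{
   const float src0 = src[0];
   const float src1 = src[1];
   float dst = src0 > src1 ? src0 : src1;   // pasted expression for a max
   return dst;
}
```

Because the expression text is compiled as ordinary code, the C compiler does the type checking and the opcode table only has to carry the expression string.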

On Fri, Jan 23, 2015 at 4:43 PM, Jason Ekstrand  wrote:
> Add a required field to the Opcode class, const_expr, that contains an
> expression or statement that computes the result of the opcode given known
> constant inputs. Then take those const_expr's and expand them into a function
> that takes an opcode and an array of constant inputs and spits out the constant
> result. This means that when adding opcodes, there's one less place to update,
> and almost all the opcodes are self-documenting since the information on how to
> compute the result is right next to the definition.
>
> The helper functions in nir_constant_expressions.c were taken from
> ir_constant_expressions.cpp.
>
> v3 Jason Ekstrand 
>  - Use mako to generate one function per opcode instead of doing piles of
>string splicing
>
> Signed-off-by: Jason Ekstrand 
> ---
>  src/glsl/Makefile.am |   5 +
>  src/glsl/Makefile.sources|   1 +
>  src/glsl/nir/.gitignore  |   1 +
>  src/glsl/nir/nir_constant_expressions.h  |  31 ++
>  src/glsl/nir/nir_constant_expressions.py | 319 ++
>  src/glsl/nir/nir_opcodes.py  | 562 +--
>  6 files changed, 735 insertions(+), 184 deletions(-)
>  create mode 100644 src/glsl/nir/nir_constant_expressions.h
>  create mode 100644 src/glsl/nir/nir_constant_expressions.py
>
> diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
> index 59dda5f..e145cb2 100644
> --- a/src/glsl/Makefile.am
> +++ b/src/glsl/Makefile.am
> @@ -217,6 +217,7 @@ BUILT_SOURCES = \
> glsl_lexer.cpp  \
> glcpp/glcpp-parse.c \
> glcpp/glcpp-lex.c   \
> +   nir/nir_constant_expressions.c  \
> nir/nir_opcodes.c   \
> nir/nir_opcodes.h   \
> nir/nir_opt_algebraic.c
> @@ -232,6 +233,10 @@ dist-hook:
> $(RM) glcpp/tests/*.out
> $(RM) glcpp/tests/subtest*/*.out
>
> +nir/nir_constant_expressions.c: nir/nir_opcodes.py nir/nir_constant_expressions.py nir/nir_constant_expressions.h
> +   $(MKDIR_P) nir; \
> +   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_constant_expressions.py > $@
> +
>  nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py
> $(MKDIR_P) nir; \
> $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opcodes_h.py > $@
> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index dc1c55d..dd76c44 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -14,6 +14,7 @@ LIBGLCPP_GENERATED_FILES = \
> $(GLSL_BUILDDIR)/glcpp/glcpp-parse.c
>
>  NIR_GENERATED_FILES = \
> +   $(GLSL_BUILDDIR)/nir/nir_constant_expressions.c \
> $(GLSL_BUILDDIR)/nir/nir_opcodes.c \
> $(GLSL_BUILDDIR)/nir/nir_opcodes.h \
> $(GLSL_BUILDDIR)/nir/nir_opt_algebraic.c
> diff --git a/src/glsl/nir/.gitignore b/src/glsl/nir/.gitignore
> index 4c28193..261f64f 100644
> --- a/src/glsl/nir/.gitignore
> +++ b/src/glsl/nir/.gitignore
> @@ -1,3 +1,4 @@
>  nir_opt_algebraic.c
>  nir_opcodes.c
>  nir_opcodes.h
> +nir_constant_expressions.c
> diff --git a/src/glsl/nir/nir_constant_expressions.h b/src/glsl/nir/nir_constant_expressions.h
> new file mode 100644
> index 000..97997f2
> --- /dev/null
> +++ b/src/glsl/nir/nir_constant_expressions.h
> @@ -0,0 +1,31 @@
> +/*
> + * Copyright © 2014 Connor Abbott
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILIT

Re: [Mesa-dev] [PATCH] nir: Add nir_lower_alu_scalar.

2015-01-23 Thread Eric Anholt
Jason Ekstrand  writes:

> Also, Could we rename this to nir_lower_alu_to_scalar?  That's more
> descriptive.

That's even better.  Done.




Re: [Mesa-dev] [PATCH] nir: Make vec-to-movs handle src/dest aliasing.

2015-01-23 Thread Connor Abbott
On Fri, Jan 23, 2015 at 5:34 PM, Eric Anholt  wrote:
> Connor Abbott  writes:
>
>> Argh, nevermind, I was reading it wrong...
>>
>> On Thu, Jan 22, 2015 at 8:18 PM, Connor Abbott  wrote:
>>> What happens if you have something like foo = vec3(foo.z, bar.x,
>>> foo.x)? I don't think emitting vector mov's for only the contiguous
>>> components is enough.
>>>
>>> On Thu, Jan 22, 2015 at 4:51 PM, Eric Anholt  wrote:
>>>> +static unsigned
>>>> +insert_movs(nir_alu_instr *vec, unsigned start_channel,
>>>> +unsigned start_src_idx, void *mem_ctx)
>>
>> We need a comment explaining what this function does and what it
>> returns. Also, it only creates a single move so it should be called
>> insert_mov().
>
> How about:
>
> /**
>  * For a given writemask channel in the vec instruction, insert a MOV of all
>  * the src values that come from the same reg to the destination of the vec
>  * instruction.
>  */

First, I think we need to be a little more clear on what it does, e.g.
"for a given starting channel/source, scans the rest of the sources of
the vec instruction to find all sources that come from the same reg
and emits a writemasked MOV from the reg to the destination of the vec
instruction." Also, you need to explain what the return value is (a
mask of all the channels/sources we inserted a MOV for).
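
The scan and return value described above could be sketched like this. It is a standalone illustration under assumptions: `VecSrc` and `same_reg_write_mask` are hypothetical names, registers are plain integers, and the MOV emission itself is left out so only the mask computation is shown.

```cpp
#include <array>

// Hypothetical stand-in for one source of a vec instruction: the register it
// reads and the component (swizzle) taken from that register.
struct VecSrc {
   int reg;
   unsigned swizzle;
};

// For the source at start_idx, find every later source that reads the same
// register and return a write mask with one bit per matching channel. A real
// pass would also emit a single writemasked MOV covering those channels.
unsigned same_reg_write_mask(const std::array<VecSrc, 4> &srcs,
                             unsigned num_srcs, unsigned start_idx)
{
   unsigned mask = 1u << start_idx;
   for (unsigned i = start_idx + 1; i < num_srcs; i++) {
      if (srcs[i].reg == srcs[start_idx].reg)
         mask |= 1u << i;
   }
   return mask;
}
```

For the earlier example foo = vec3(foo.z, bar.x, foo.x), starting at channel 0 yields a mask covering x and z (both read foo), and starting at channel 1 yields a mask covering only y.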


[Mesa-dev] [PATCH] i965: Add a brw_fs_cmod_propagation unit test with large VGRF writes.

2015-01-23 Thread Kenneth Graunke
Signed-off-by: Kenneth Graunke 
---
 .../drivers/dri/i965/test_fs_cmod_propagation.cpp  | 40 ++
 1 file changed, 40 insertions(+)

Yep, that is indeed broken, and my proposed code fixes it.

Here's a unit test that provokes the problem.  Thanks for putting together
the unit tests - they're quite handy!
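
For anyone reading along without the tree handy, the hazard the test exercises can be sketched in isolation. This is a simplified model, not the pass itself: `Inst`, `overwrites` and `find_cmod_candidate` are hypothetical names, registers are integers, and an instruction is reduced to the range of registers it writes. The point is that a wide write (like the TEX with regs_written = 4) sitting between the ADD and the CMP clobbers the ADD's result, so the backward scan must stop there instead of walking past it.

```cpp
#include <vector>

// Hypothetical, simplified instruction: writes regs_written consecutive
// registers starting at dst_reg. A null destination is modelled as {-1, 0}.
struct Inst {
   int dst_reg;
   int regs_written;
};

// True if inst writes any part of register reg.
bool overwrites(const Inst &inst, int reg)
{
   return reg >= inst.dst_reg && reg < inst.dst_reg + inst.regs_written;
}

// Scan backward from the CMP at cmp_ip for the nearest instruction that
// writes cmp_src. Returns its index only if it writes exactly that one
// register; returns -1 if the nearest writer is a wider write that merely
// overlaps it, or if there is no writer at all.
int find_cmod_candidate(const std::vector<Inst> &insts, int cmp_ip, int cmp_src)
{
   for (int ip = cmp_ip - 1; ip >= 0; ip--) {
      if (!overwrites(insts[ip], cmp_src))
         continue;
      const bool exact = insts[ip].dst_reg == cmp_src &&
                         insts[ip].regs_written == 1;
      return exact ? ip : -1;
   }
   return -1;
}
```

Modelling the test's sequence as { {12, 1}, {10, 4}, {-1, 0} } and asking about register 12 from index 2 gives -1, which corresponds to the "(no changes)" expectation in the test below.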

diff --git a/src/mesa/drivers/dri/i965/test_fs_cmod_propagation.cpp b/src/mesa/drivers/dri/i965/test_fs_cmod_propagation.cpp
index d5d7b58..cc184aa 100644
--- a/src/mesa/drivers/dri/i965/test_fs_cmod_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/test_fs_cmod_propagation.cpp
@@ -375,3 +375,43 @@ TEST_F(cmod_propagation_test, movnz)
EXPECT_EQ(BRW_OPCODE_CMP, instruction(block0, 0)->opcode);
EXPECT_EQ(BRW_CONDITIONAL_GE, instruction(block0, 0)->conditional_mod);
 }
+
+TEST_F(cmod_propagation_test, intervening_dest_write)
+{
+   fs_reg dest = v->vgrf(glsl_type::vec4_type);
+   fs_reg src0 = v->vgrf(glsl_type::float_type);
+   fs_reg src1 = v->vgrf(glsl_type::float_type);
+   fs_reg src2 = v->vgrf(glsl_type::vec2_type);
+   fs_reg zero(0.0f);
+   v->emit(BRW_OPCODE_ADD, offset(dest, 2), src0, src1);
+   v->emit(SHADER_OPCODE_TEX, dest, src2)
+  ->regs_written = 4;
+   v->emit(BRW_OPCODE_CMP, v->reg_null_f, offset(dest, 2), zero)
+  ->conditional_mod = BRW_CONDITIONAL_GE;
+
+   /* = Before =
+*
+* 0: add(8)        dest+2  src0    src1
+* 1: tex(8) rlen 4 dest+0  src2
+* 2: cmp.ge.f0(8)  null    dest+2  0.0f
+*
+* = After =
+* (no changes)
+*/
+
+   v->calculate_cfg();
+   bblock_t *block0 = v->cfg->blocks[0];
+
+   EXPECT_EQ(0, block0->start_ip);
+   EXPECT_EQ(2, block0->end_ip);
+
+   EXPECT_FALSE(cmod_propagation(v));
+   EXPECT_EQ(0, block0->start_ip);
+   EXPECT_EQ(2, block0->end_ip);
+   EXPECT_EQ(BRW_OPCODE_ADD, instruction(block0, 0)->opcode);
+   EXPECT_EQ(BRW_CONDITIONAL_NONE, instruction(block0, 0)->conditional_mod);
+   EXPECT_EQ(SHADER_OPCODE_TEX, instruction(block0, 1)->opcode);
+   EXPECT_EQ(BRW_CONDITIONAL_NONE, instruction(block0, 1)->conditional_mod);
+   EXPECT_EQ(BRW_OPCODE_CMP, instruction(block0, 2)->opcode);
+   EXPECT_EQ(BRW_CONDITIONAL_GE, instruction(block0, 2)->conditional_mod);
+}
-- 
2.2.2



[Mesa-dev] [PATCH] egl/dri2: implement platform_null.

2015-01-23 Thread Haixia Shi
The NULL platform is for off-screen rendering only. Render node support is
required.

v2: Only consider the render nodes. Do not use normal nodes as they require
auth hooks.

Signed-off-by: Haixia Shi 
---
 src/egl/drivers/dri2/Makefile.am |   5 ++
 src/egl/drivers/dri2/egl_dri2.c  |  13 ++-
 src/egl/drivers/dri2/egl_dri2.h  |   3 +
 src/egl/drivers/dri2/platform_null.c | 169 +++
 4 files changed, 187 insertions(+), 3 deletions(-)
 create mode 100644 src/egl/drivers/dri2/platform_null.c

diff --git a/src/egl/drivers/dri2/Makefile.am b/src/egl/drivers/dri2/Makefile.am
index 79a40e8..14b2d60 100644
--- a/src/egl/drivers/dri2/Makefile.am
+++ b/src/egl/drivers/dri2/Makefile.am
@@ -64,3 +64,8 @@ if HAVE_EGL_PLATFORM_DRM
 libegl_dri2_la_SOURCES += platform_drm.c
 AM_CFLAGS += -DHAVE_DRM_PLATFORM
 endif
+
+if HAVE_EGL_PLATFORM_NULL
+libegl_dri2_la_SOURCES += platform_null.c
+AM_CFLAGS += -DHAVE_NULL_PLATFORM
+endif
diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 86e5f24..6ed137e 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -534,7 +534,7 @@ dri2_setup_screen(_EGLDisplay *disp)
  disp->Extensions.KHR_gl_texture_2D_image = EGL_TRUE;
  disp->Extensions.KHR_gl_texture_cubemap_image = EGL_TRUE;
   }
-#ifdef HAVE_DRM_PLATFORM
+#if defined(HAVE_DRM_PLATFORM) || defined(HAVE_NULL_PLATFORM)
   if (dri2_dpy->image->base.version >= 8 &&
   dri2_dpy->image->createImageFromDmaBufs) {
  disp->Extensions.EXT_image_dma_buf_import = EGL_TRUE;
@@ -632,6 +632,13 @@ dri2_initialize(_EGLDriver *drv, _EGLDisplay *disp)
   return EGL_FALSE;
 
switch (disp->Platform) {
+#ifdef HAVE_NULL_PLATFORM
+   case _EGL_PLATFORM_NULL:
+  if (disp->Options.TestOnly)
+ return EGL_TRUE;
+  return dri2_initialize_null(drv, disp);
+#endif
+
 #ifdef HAVE_X11_PLATFORM
case _EGL_PLATFORM_X11:
   if (disp->Options.TestOnly)
@@ -1571,7 +1578,7 @@ dri2_create_wayland_buffer_from_image(_EGLDriver *drv, _EGLDisplay *dpy,
return dri2_dpy->vtbl->create_wayland_buffer_from_image(drv, dpy, img);
 }
 
-#ifdef HAVE_DRM_PLATFORM
+#if defined(HAVE_DRM_PLATFORM) || defined(HAVE_NULL_PLATFORM)
 static EGLBoolean
 dri2_check_dma_buf_attribs(const _EGLImageAttribs *attrs)
 {
@@ -1829,7 +1836,7 @@ dri2_create_image_khr(_EGLDriver *drv, _EGLDisplay *disp,
case EGL_WAYLAND_BUFFER_WL:
   return dri2_create_image_wayland_wl_buffer(disp, ctx, buffer, attr_list);
 #endif
-#ifdef HAVE_DRM_PLATFORM
+#if defined(HAVE_DRM_PLATFORM) || defined(HAVE_NULL_PLATFORM)
case EGL_LINUX_DMA_BUF_EXT:
   return dri2_create_image_dma_buf(disp, ctx, buffer, attr_list);
 #endif
diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index 9efe1f7..e206424 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -332,6 +332,9 @@ dri2_initialize_wayland(_EGLDriver *drv, _EGLDisplay *disp);
 EGLBoolean
 dri2_initialize_android(_EGLDriver *drv, _EGLDisplay *disp);
 
+EGLBoolean
+dri2_initialize_null(_EGLDriver *drv, _EGLDisplay *disp);
+
 void
 dri2_flush_drawable_for_swapbuffers(_EGLDisplay *disp, _EGLSurface *draw);
 
diff --git a/src/egl/drivers/dri2/platform_null.c b/src/egl/drivers/dri2/platform_null.c
new file mode 100644
index 000..55ceab6
--- /dev/null
+++ b/src/egl/drivers/dri2/platform_null.c
@@ -0,0 +1,169 @@
+/*
+ * Mesa 3-D graphics library
+ *
+ * Copyright (c) 2014 The Chromium OS Authors.
+ * Copyright © 2011 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "egl_dri2.h"
+#include "egl_dri2_fallbacks.h"
+#include "loader.h"
+
+static struct dri2_egl_display_vtbl dri2_null_display_vtbl = {
+   .create_pixmap_surface = dri2_fallback_create_pixmap_surface,
+   .crea

[Mesa-dev] [PATCH 2/2] gallium: Replace u_simple_list.h with util/simple_list.h

2015-01-23 Thread Eric Anholt
The code was exactly the same, except util/ has c++ guards and a struct
simple_node declaration.
---

For my NIR work, there were clashes between gallium's copy (from my driver)
and Mesa's (from NIR)

 src/gallium/auxiliary/Makefile.sources|   1 -
 src/gallium/auxiliary/draw/draw_llvm.c|   2 +-
 src/gallium/auxiliary/draw/draw_llvm.h|   2 +-
 src/gallium/auxiliary/gallivm/lp_bld_init.c   |   2 +-
 src/gallium/auxiliary/util/u_cache.c  |   2 +-
 src/gallium/auxiliary/util/u_simple_list.h| 199 --
 src/gallium/auxiliary/util/u_slab.c   |   2 +-
 src/gallium/drivers/llvmpipe/lp_context.c |   2 +-
 src/gallium/drivers/llvmpipe/lp_scene.c   |   2 +-
 src/gallium/drivers/llvmpipe/lp_state_fs.c|   2 +-
 src/gallium/drivers/llvmpipe/lp_state_setup.c |   2 +-
 src/gallium/drivers/llvmpipe/lp_texture.c |   2 +-
 src/gallium/drivers/r300/r300_context.c   |   2 +-
 src/gallium/drivers/r300/r300_flush.c |   2 +-
 src/gallium/drivers/r300/r300_query.c |   2 +-
 src/gallium/drivers/rbug/rbug_context.c   |   2 +-
 src/gallium/drivers/rbug/rbug_core.c  |   2 +-
 src/gallium/drivers/rbug/rbug_objects.c   |   2 +-
 src/gallium/drivers/rbug/rbug_screen.c|   2 +-
 src/gallium/drivers/trace/tr_context.c|   2 +-
 src/gallium/drivers/trace/tr_screen.c |   2 +-
 src/gallium/drivers/trace/tr_texture.c|   2 +-
 src/gallium/drivers/vc4/vc4_qir.c |   2 +-
 src/gallium/drivers/vc4/vc4_qir.h |   7 +-
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c |   2 +-
 25 files changed, 23 insertions(+), 228 deletions(-)
 delete mode 100644 src/gallium/auxiliary/util/u_simple_list.h

diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources
index 3460482..c45dd18 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -273,7 +273,6 @@ C_SOURCES := \
util/u_ringbuffer.h \
util/u_sampler.c \
util/u_sampler.h \
-   util/u_simple_list.h \
util/u_simple_shaders.c \
util/u_simple_shaders.h \
util/u_slab.c \
diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c
index e7a72f9..6e1fb40 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -54,7 +54,7 @@
 #include "util/u_math.h"
 #include "util/u_pointer.h"
 #include "util/u_string.h"
-#include "util/u_simple_list.h"
+#include "util/simple_list.h"
 
 
 #define DEBUG_STORE 0
diff --git a/src/gallium/auxiliary/draw/draw_llvm.h b/src/gallium/auxiliary/draw/draw_llvm.h
index e734434..af1960e 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.h
+++ b/src/gallium/auxiliary/draw/draw_llvm.h
@@ -37,7 +37,7 @@
 #include "gallivm/lp_bld_limits.h"
 
 #include "pipe/p_context.h"
-#include "util/u_simple_list.h"
+#include "util/simple_list.h"
 
 
 struct draw_llvm;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index 23a7c45..b9593de 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -31,7 +31,7 @@
 #include "util/u_cpu_detect.h"
 #include "util/u_debug.h"
 #include "util/u_memory.h"
-#include "util/u_simple_list.h"
+#include "util/simple_list.h"
 #include "os/os_time.h"
 #include "lp_bld.h"
 #include "lp_bld_debug.h"
diff --git a/src/gallium/auxiliary/util/u_cache.c b/src/gallium/auxiliary/util/u_cache.c
index 26aab2b..9395c66 100644
--- a/src/gallium/auxiliary/util/u_cache.c
+++ b/src/gallium/auxiliary/util/u_cache.c
@@ -42,7 +42,7 @@
 #include "util/u_math.h"
 #include "util/u_memory.h"
 #include "util/u_cache.h"
-#include "util/u_simple_list.h"
+#include "util/simple_list.h"
 
 
 struct util_cache_entry
diff --git a/src/gallium/auxiliary/util/u_simple_list.h b/src/gallium/auxiliary/util/u_simple_list.h
deleted file mode 100644
index 3f7def5..000
--- a/src/gallium/auxiliary/util/u_simple_list.h
+++ /dev/null
@@ -1,199 +0,0 @@
-/**
- * \file simple_list.h
- * Simple macros for type-safe, intrusive lists.
- *
- *  Intended to work with a list sentinal which is created as an empty
- *  list.  Insert & delete are O(1).
- *  
- * \author
- *  (C) 1997, Keith Whitwell
- */
-
-/*
- * Mesa 3-D graphics library
- *
- * Copyright (C) 1999-2001  Brian Paul   All Rights Reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included
- * in all cop

[Mesa-dev] [PATCH 1/2] mesa: Move simple_list.h to src/util.

2015-01-23 Thread Eric Anholt
We have two copies of it in the tree, and I'm going to delete one.
---
 src/gallium/targets/dri/Makefile.am|   1 +
 src/mesa/Makefile.sources  |   1 -
 src/mesa/drivers/dri/i915/i830_texblend.c  |   2 +-
 src/mesa/drivers/dri/i915/intel_syncobj.c  |   2 +-
 src/mesa/drivers/dri/r200/r200_cmdbuf.c|   2 +-
 src/mesa/drivers/dri/r200/r200_context.c   |   2 +-
 src/mesa/drivers/dri/r200/r200_ioctl.h |   2 +-
 src/mesa/drivers/dri/r200/r200_swtcl.c |   2 +-
 src/mesa/drivers/dri/r200/r200_tex.c   |   2 +-
 .../drivers/dri/radeon/radeon_common_context.c |   2 +-
 src/mesa/drivers/dri/radeon/radeon_context.c   |   2 +-
 src/mesa/drivers/dri/radeon/radeon_dma.c   |   2 +-
 src/mesa/drivers/dri/radeon/radeon_ioctl.c |   2 +-
 src/mesa/drivers/dri/radeon/radeon_ioctl.h |   2 +-
 src/mesa/drivers/dri/radeon/radeon_mipmap_tree.c   |   2 +-
 src/mesa/drivers/dri/radeon/radeon_queryobj.c  |   2 +-
 src/mesa/drivers/dri/radeon/radeon_queryobj.h  |   2 +-
 src/mesa/drivers/dri/radeon/radeon_state.c |   2 +-
 src/mesa/drivers/dri/radeon/radeon_swtcl.c |   2 +-
 src/mesa/drivers/dri/radeon/radeon_tex.c   |   2 +-
 src/mesa/main/context.c|   2 +-
 src/mesa/main/enable.c |   2 +-
 src/mesa/main/light.c  |   2 +-
 src/mesa/main/mtypes.h |   2 +-
 src/mesa/main/simple_list.h| 210 -
 src/mesa/program/prog_hash_table.c |   2 +-
 src/mesa/tnl/t_rasterpos.c |   2 +-
 src/mesa/tnl/t_vb_light.c  |   2 +-
 src/mesa/tnl/t_vertex_generic.c|   2 +-
 src/mesa/tnl/t_vertex_sse.c|   2 +-
 src/util/Makefile.sources  |   1 +
 src/util/simple_list.h | 210 +
 32 files changed, 239 insertions(+), 238 deletions(-)
 delete mode 100644 src/mesa/main/simple_list.h
 create mode 100644 src/util/simple_list.h

diff --git a/src/gallium/targets/dri/Makefile.am 
b/src/gallium/targets/dri/Makefile.am
index 5df3a20..7f2ce6a 100644
--- a/src/gallium/targets/dri/Makefile.am
+++ b/src/gallium/targets/dri/Makefile.am
@@ -3,6 +3,7 @@ include $(top_srcdir)/src/gallium/Automake.inc
 AM_CFLAGS = \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa \
+   -I$(top_srcdir)/src \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/gallium/state_trackers/dri \
$(GALLIUM_TARGET_CFLAGS)
diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 4203563..ced5fb0 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -183,7 +183,6 @@ MAIN_FILES = \
$(SRCDIR)main/shader_query.cpp \
$(SRCDIR)main/shared.c \
$(SRCDIR)main/shared.h \
-   $(SRCDIR)main/simple_list.h \
$(SRCDIR)main/state.c \
$(SRCDIR)main/state.h \
$(SRCDIR)main/stencil.c \
diff --git a/src/mesa/drivers/dri/i915/i830_texblend.c 
b/src/mesa/drivers/dri/i915/i830_texblend.c
index feea383..d5cbb37 100644
--- a/src/mesa/drivers/dri/i915/i830_texblend.c
+++ b/src/mesa/drivers/dri/i915/i830_texblend.c
@@ -28,7 +28,7 @@
 #include "main/glheader.h"
 #include "main/macros.h"
 #include "main/mtypes.h"
-#include "main/simple_list.h"
+#include "util/simple_list.h"
 #include "main/enums.h"
 #include "main/mm.h"
 
diff --git a/src/mesa/drivers/dri/i915/intel_syncobj.c 
b/src/mesa/drivers/dri/i915/intel_syncobj.c
index 9657d9a..d918cd7 100644
--- a/src/mesa/drivers/dri/i915/intel_syncobj.c
+++ b/src/mesa/drivers/dri/i915/intel_syncobj.c
@@ -38,7 +38,7 @@
  * performance bottleneck, though.
  */
 
-#include "main/simple_list.h"
+#include "util/simple_list.h"
 #include "main/imports.h"
 
 #include "intel_context.h"
diff --git a/src/mesa/drivers/dri/r200/r200_cmdbuf.c 
b/src/mesa/drivers/dri/r200/r200_cmdbuf.c
index 1e6c0d8..13ac5af 100644
--- a/src/mesa/drivers/dri/r200/r200_cmdbuf.c
+++ b/src/mesa/drivers/dri/r200/r200_cmdbuf.c
@@ -35,7 +35,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
SOFTWARE.
 #include "main/imports.h"
 #include "main/macros.h"
 #include "main/context.h"
-#include "main/simple_list.h"
+#include "util/simple_list.h"
 
 #include "radeon_common.h"
 #include "r200_context.h"
diff --git a/src/mesa/drivers/dri/r200/r200_context.c 
b/src/mesa/drivers/dri/r200/r200_context.c
index 931f437..fb15082 100644
--- a/src/mesa/drivers/dri/r200/r200_context.c
+++ b/src/mesa/drivers/dri/r200/r200_context.c
@@ -37,7 +37,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
SOFTWARE.
 #include "main/api_arrayelt.h"
 #include "main/api_exec.h"
 #include "main/context.h"
-#include "main/simple_list.h"
+#include "util/simple_list.h"
 #include "main/imports.h"
 #include "ma

Re: [Mesa-dev] [PATCH 1/4] nir: use Python to autogenerate opcode information

2015-01-23 Thread Jason Ekstrand
On Fri, Jan 23, 2015 at 3:10 PM, Matt Turner  wrote:

> On Fri, Jan 23, 2015 at 1:43 PM, Jason Ekstrand 
> wrote:
> >  create mode 100644 src/glsl/nir/nir_opcodes.py
> >  create mode 100644 src/glsl/nir/nir_opcodes_c.py
> >  create mode 100644 src/glsl/nir/nir_opcodes_h.py
>
> Add the python files to EXTRA_DIST in src/glsl/Makefile.am before you
> commit.
>
Fixed locally.


Re: [Mesa-dev] [PATCH] i965: Add a brw_fs_cmod_propagation unit test with large VGRF writes.

2015-01-23 Thread Matt Turner
On Fri, Jan 23, 2015 at 3:30 PM, Kenneth Graunke  wrote:
> Signed-off-by: Kenneth Graunke 
> ---
>  .../drivers/dri/i965/test_fs_cmod_propagation.cpp  | 40 
> ++
>  1 file changed, 40 insertions(+)
>
> Yep, that is indeed broken, and my proposed code fixes it.
>
> Here's a unit test that provokes the problem.  Thanks for putting together
> the unit tests - they're quite handy!

Awesome, looks good. I'll modify the patch the way you suggested and
squash this in.


[Mesa-dev] [PATCH v4 2/4] nir: add new constant folding infrastructure

2015-01-23 Thread Jason Ekstrand
Add a required field to the Opcode class, const_expr, that contains an
expression or statement that computes the result of the opcode given known
constant inputs. Then take those const_expr's and expand them into a function
that takes an opcode and an array of constant inputs and spits out the constant
result. This means that when adding opcodes, there's one less place to update,
and almost all the opcodes are self-documenting since the information on how to
compute the result is right next to the definition.

The helper functions in nir_constant_expressions.c were taken from
ir_constant_expressions.cpp.
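
As a rough illustration of the idea above (not the generator from this patch), the sketch below keeps a const_expr next to each opcode definition and expands it into one folding function per opcode. Assumptions: the expressions here are Python rather than the C-like language the patch describes, the expansion uses eval() instead of mako, and the names opcode(), make_folder() and eval_const_opcode() are made up for the example.

```python
# Hypothetical sketch: each opcode carries a const_expr, and one folding
# function per opcode is generated from it.  The real patch emits C through
# mako; plain Python expressions are used here so the example runs alone.

class Opcode:
    def __init__(self, name, num_inputs, const_expr):
        self.name = name
        self.num_inputs = num_inputs
        self.const_expr = const_expr  # expression over src0, src1, ...

opcodes = {}

def opcode(name, num_inputs, const_expr):
    """Register an opcode; the folding rule sits right next to the name."""
    opcodes[name] = Opcode(name, num_inputs, const_expr)

opcode("fadd", 2, "src0 + src1")
opcode("fmul", 2, "src0 * src1")
opcode("fneg", 1, "-src0")
opcode("fmax", 2, "max(src0, src1)")

def make_folder(op):
    """Expand a const_expr into a function taking the constant inputs."""
    args = ", ".join("src%d" % i for i in range(op.num_inputs))
    return eval("lambda %s: %s" % (args, op.const_expr))

# One generated function per opcode, built once up front.
folders = {name: make_folder(op) for name, op in opcodes.items()}

def eval_const_opcode(name, srcs):
    """Take an opcode name and an array of constant inputs, return the result."""
    op = opcodes[name]
    assert len(srcs) == op.num_inputs
    return folders[name](*srcs)
```

With this layout, adding an opcode means adding a single opcode(...) line; the dispatch table picks it up without a second place to edit.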

v3 Jason Ekstrand 
 - Use mako to generate one function per opcode instead of doing piles of
   string splicing

v4 Jason Ekstrand 
 - More comments and better indentation in the mako
 - Add a description of the constant expression language in nir_opcodes.py
 - Added nir_constant_expressions.py to EXTRA_DIST in Makefile.am

Signed-off-by: Jason Ekstrand 
---
 src/glsl/Makefile.am |   6 +
 src/glsl/Makefile.sources|   1 +
 src/glsl/nir/.gitignore  |   1 +
 src/glsl/nir/nir_constant_expressions.h  |  31 ++
 src/glsl/nir/nir_constant_expressions.py | 351 +++
 src/glsl/nir/nir_opcodes.py  | 580 +--
 6 files changed, 786 insertions(+), 184 deletions(-)
 create mode 100644 src/glsl/nir/nir_constant_expressions.h
 create mode 100644 src/glsl/nir/nir_constant_expressions.py

diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
index bbaffbe..8c6c8b9 100644
--- a/src/glsl/Makefile.am
+++ b/src/glsl/Makefile.am
@@ -37,6 +37,7 @@ EXTRA_DIST = tests glcpp/tests README TODO glcpp/README   
\
glsl_parser.yy  \
glcpp/glcpp-lex.l   \
glcpp/glcpp-parse.y \
+   nir/nir_constant_expressions.py \
nir/nir_opcodes.py  \
nir/nir_opcodes_c.py\
nir/nir_opcodes_h.py\
@@ -220,6 +221,7 @@ BUILT_SOURCES = 
\
glsl_lexer.cpp  \
glcpp/glcpp-parse.c \
glcpp/glcpp-lex.c   \
+   nir/nir_constant_expressions.c  \
nir/nir_opcodes.c   \
nir/nir_opcodes.h   \
nir/nir_opt_algebraic.c
@@ -235,6 +237,10 @@ dist-hook:
$(RM) glcpp/tests/*.out
$(RM) glcpp/tests/subtest*/*.out
 
+nir/nir_constant_expressions.c: nir/nir_opcodes.py 
nir/nir_constant_expressions.py nir/nir_constant_expressions.h
+   $(MKDIR_P) nir; \
+   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_constant_expressions.py > 
$@
+
 nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py
$(MKDIR_P) nir; \
$(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opcodes_h.py > $@
diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index dc1c55d..dd76c44 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -14,6 +14,7 @@ LIBGLCPP_GENERATED_FILES = \
$(GLSL_BUILDDIR)/glcpp/glcpp-parse.c
 
 NIR_GENERATED_FILES = \
+   $(GLSL_BUILDDIR)/nir/nir_constant_expressions.c \
$(GLSL_BUILDDIR)/nir/nir_opcodes.c \
$(GLSL_BUILDDIR)/nir/nir_opcodes.h \
$(GLSL_BUILDDIR)/nir/nir_opt_algebraic.c
diff --git a/src/glsl/nir/.gitignore b/src/glsl/nir/.gitignore
index 4c28193..261f64f 100644
--- a/src/glsl/nir/.gitignore
+++ b/src/glsl/nir/.gitignore
@@ -1,3 +1,4 @@
 nir_opt_algebraic.c
 nir_opcodes.c
 nir_opcodes.h
+nir_constant_expressions.c
diff --git a/src/glsl/nir/nir_constant_expressions.h 
b/src/glsl/nir/nir_constant_expressions.h
new file mode 100644
index 000..97997f2
--- /dev/null
+++ b/src/glsl/nir/nir_constant_expressions.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright © 2014 Connor Abbott
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON

[Mesa-dev] [PATCH 1/2] nir/search: Add support for matching unknown constants

2015-01-23 Thread Jason Ekstrand
There are some algebraic transformations that we want to do but only if
certain things are constants.  For instance, we may want to replace
a * (b + c) with (a * b) + (a * c) as long as a and either b or c is constant.
While this generates more instructions, some of it will get constant
folded.  This commit allows you to match an arbitrary constant value by
adding a "#" on the front of a variable name.
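
A minimal sketch of the "#" handling, loosely following the Variable changes in the diff below. Assumptions: swizzles are left out, the variable index is assigned with a plain dict instead of the real varset, and operands are modelled as hypothetical (kind, value) pairs rather than NIR sources.

```python
# Hypothetical model of a search variable that may be marked constant-only
# by a leading '#'.  Only the '#' prefix convention comes from the patch.

class Variable:
    def __init__(self, val, varset):
        if val.startswith('#'):
            val = val[1:]            # strip the marker, keep the bare name
            self.is_constant = True
        else:
            self.is_constant = False
        self.var_name = val
        # '#b' and 'b' name the same variable, so they share an index.
        self.index = varset.setdefault(val, len(varset))

def matches(var, operand):
    """Return True if operand may bind to var.

    operand is a (kind, value) pair where kind is 'const' or 'ssa'.
    """
    kind, _value = operand
    if var.is_constant and kind != 'const':
        return False
    return True
```

A pattern variable written as '#b' then refuses anything that is not a constant load, while plain 'b' still matches either kind of operand.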
---
 src/glsl/nir/nir_algebraic.py | 8 
 src/glsl/nir/nir_search.c | 6 ++
 src/glsl/nir/nir_search.h | 7 +++
 3 files changed, 21 insertions(+)

diff --git a/src/glsl/nir/nir_algebraic.py b/src/glsl/nir/nir_algebraic.py
index 5be2842..df3ceb3 100644
--- a/src/glsl/nir/nir_algebraic.py
+++ b/src/glsl/nir/nir_algebraic.py
@@ -62,6 +62,7 @@ static const ${val.c_type} ${val.name} = {
 % elif isinstance(val, Variable):
${val.index}, /* ${val.var_name} */
{ ${', '.join(str(s) for s in val.swizzle)} },
+   ${'true' if val.is_constant else 'false'},
 % elif isinstance(val, Expression):
nir_op_${val.opcode},
{ ${', '.join(src.c_ptr for src in val.sources)} },
@@ -116,11 +117,18 @@ class Variable(Value):
 
   match = _swizzle_re.match(val)
   if match:
+ assert not val.startswith('#')
  val = match.group(1)
  self.swizzle = ['xyzw'.find(s) for s in match.group(2)]
   else:
  self.swizzle = range(4)
 
+  if val.startswith('#'):
+ val = val[1:]
+ self.is_constant = True
+  else:
+ self.is_constant = False
+
   self.var_name = val
   self.index = varset[val]
   self.name = name
diff --git a/src/glsl/nir/nir_search.c b/src/glsl/nir/nir_search.c
index 0d83ff5..6589edb 100644
--- a/src/glsl/nir/nir_search.c
+++ b/src/glsl/nir/nir_search.c
@@ -84,6 +84,10 @@ match_value(const nir_search_value *value, nir_alu_instr 
*instr, unsigned src,
 
  return true;
   } else {
+ if (var->is_constant &&
+ instr->src[src].src.ssa->parent_instr->type != 
nir_instr_type_load_const)
+return false;
+
  state->variables_seen |= (1 << var->variable);
  state->variables[var->variable].ssa = instr->src[src].src.ssa;
 
@@ -237,6 +241,8 @@ construct_value(const nir_search_value *value, nir_alu_type 
type,
   const nir_search_variable *var = nir_search_value_as_variable(value);
   assert(state->variables_seen & (1 << var->variable));
 
+  assert(!var->is_constant);
+
   nir_alu_src val;
   val.src = nir_src_for_ssa(state->variables[var->variable].ssa);
   val.abs = false;
diff --git a/src/glsl/nir/nir_search.h b/src/glsl/nir/nir_search.h
index 8b89dd0..902900f 100644
--- a/src/glsl/nir/nir_search.h
+++ b/src/glsl/nir/nir_search.h
@@ -54,6 +54,13 @@ typedef struct {
 * searches.
 */
uint8_t swizzle[4];
+
+   /** Indicates that the given variable must be a constant
+*
+* This is only allowed in search expressions and indicates that the
+* given variable is only allowed to match constant values.
+*/
+   bool is_constant;
 } nir_search_variable;
 
 typedef struct {
-- 
2.2.1



[Mesa-dev] [PATCH 2/2] nir: Add some more optimizations for handling constants in bcsel

2015-01-23 Thread Jason Ekstrand
shader-db results based on my scalarizing patches:

total instructions in shared programs: 6077319 -> 6076895 (-0.01%)
instructions in affected programs: 63509 -> 63085 (-0.67%)
helped:306
HURT:  0
GAINED:0
LOST:  0
---
 src/glsl/nir/nir_opt_algebraic.py | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/glsl/nir/nir_opt_algebraic.py 
b/src/glsl/nir/nir_opt_algebraic.py
index 169bb41..5e40ef5 100644
--- a/src/glsl/nir/nir_opt_algebraic.py
+++ b/src/glsl/nir/nir_opt_algebraic.py
@@ -72,4 +72,17 @@ optimizations = [
(('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
 ]
 
+# Add optimizations to handle the case where the result of a ternary is
+# compared to a constant.  This way we can take things like
+for op in ['flt', 'fge', 'feq', 'fne',
+   'ilt', 'ige', 'ieq', 'ine', 'ult', 'uge']:
+   optimizations += [
+  ((op, ('bcsel', 'a', '#b', '#c'), '#d'),
+   ('bcsel', 'a', (op, 'b', 'd'), (op, 'c', 'd'))),
+  ((op, '#d', ('bcsel', a, '#b', '#c')),
+   ('bcsel', 'a', (op, 'd', 'b'), (op, 'd', 'c'))),
+  (('bcsel', (op, 'a', 'b'), True, False), (op, 'a', 'b')),
+  (('bcsel', (op, 'a', 'b'), False, True), ('inot', (op, 'a', 'b'))),
+   ]
+
 print nir_algebraic.AlgebraicPass("nir_opt_algebraic", optimizations).render()
-- 
2.2.1
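
To make the first two rules in the patch above concrete, here is a small stand-alone sketch that applies them to tuple expressions and then folds the constant comparisons that result. Assumptions: this is not nir_algebraic; constants are plain Python numbers, anything else (such as the string 'a') is treated as a non-constant value, only four float comparisons are modelled, and the helper names are invented for the example.

```python
# Hypothetical tuple rewriter illustrating:
#   (op, (bcsel, a, #b, #c), #d) -> (bcsel, a, (op, b, d), (op, c, d))
#   (op, #d, (bcsel, a, #b, #c)) -> (bcsel, a, (op, d, b), (op, d, c))
import operator

CMP_OPS = {
    'flt': operator.lt,
    'fge': operator.ge,
    'feq': operator.eq,
    'fne': operator.ne,
}

def is_const(x):
    return isinstance(x, (int, float))

def is_const_bcsel(x):
    """True for ('bcsel', cond, b, c) where b and c are both constants."""
    return (isinstance(x, tuple) and len(x) == 4 and x[0] == 'bcsel'
            and is_const(x[2]) and is_const(x[3]))

def distribute_cmp_over_bcsel(expr):
    """Push a comparison against a constant into both arms of a bcsel."""
    if not (isinstance(expr, tuple) and len(expr) == 3 and expr[0] in CMP_OPS):
        return expr
    op, lhs, rhs = expr
    if is_const_bcsel(lhs) and is_const(rhs):
        _, a, b, c = lhs
        return ('bcsel', a, (op, b, rhs), (op, c, rhs))
    if is_const(lhs) and is_const_bcsel(rhs):
        _, a, b, c = rhs
        return ('bcsel', a, (op, lhs, b), (op, lhs, c))
    return expr

def fold(expr):
    """Fold comparisons whose operands are all constants."""
    if isinstance(expr, tuple):
        expr = (expr[0],) + tuple(fold(e) for e in expr[1:])
        if expr[0] in CMP_OPS and all(is_const(e) for e in expr[1:]):
            return CMP_OPS[expr[0]](*expr[1:])
    return expr
```

After the rewrite both arms compare constant against constant, so folding leaves a bcsel over two booleans, which is the shape the last two rules in the patch then simplify further.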



Re: [Mesa-dev] [PATCH 4/5] nir: Make some helpers for copying ALU src/dests.

2015-01-23 Thread Jason Ekstrand
I'll throw together a patch to make nir_src_copy take pointers to keep
things consistent.

Reviewed-by: Jason Ekstrand 

On Thu, Jan 22, 2015 at 11:08 AM, Eric Anholt  wrote:

> Jason Ekstrand  writes:
>
> > On Wed, Jan 21, 2015 at 5:26 PM, Eric Anholt  wrote:
> >
> >> There aren't many users yet, but I wanted to do this from my scalarizing
> >> pass.
> >> ---
> >>  src/glsl/nir/nir.c | 18 ++
> >>  src/glsl/nir/nir.h |  5 -
> >>  src/glsl/nir/nir_lower_vec_to_movs.c   |  7 ++-
> >>  src/glsl/nir/nir_opt_peephole_select.c |  5 +
> >>  4 files changed, 25 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> >> index 16ad2da..e414df9 100644
> >> --- a/src/glsl/nir/nir.c
> >> +++ b/src/glsl/nir/nir.c
> >> @@ -175,6 +175,24 @@ nir_dest nir_dest_copy(nir_dest dest, void
> *mem_ctx)
> >> return ret;
> >>  }
> >>
> >> +void
> >> +nir_alu_src_copy(nir_alu_src *dest, nir_alu_src *src, void *mem_ctx)
> >> +{
> >>
> >
> > We already have nir_src_copy which returns a nir_src instead of taking a
> > pointer.  TBH, I'm not sure which I prefer, but it would be good to be
> > consistent.  Thoughts?
>
> Yeah, I was thinking that as the struct gets bigger, passing it around
> on the stack gets worse.  For API consistency, I think that would mean
> pointers for both.
>
> >> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> >> index 8dc5222..7f0aa36 100644
> >> --- a/src/glsl/nir/nir.h
> >> +++ b/src/glsl/nir/nir.h
> >> @@ -569,7 +569,10 @@ typedef struct {
> >> unsigned write_mask : 4; /* ignored if dest.is_ssa is true */
> >>  } nir_alu_dest;
> >>
> >> -#define OPCODE(name, num_inputs, output_size, output_type, \
> >> +void nir_alu_src_copy(nir_alu_src *dest, nir_alu_src *src, void
> *mem_ctx);
> >> +void nir_alu_dest_copy(nir_alu_dest *dest, nir_alu_dest *src, void
> >> *mem_ctx);
> >> +
> >> +#define OPCODE(name, num_inputs, output_size, output_type,  \
> >>
> >
> > Accidental whitespace change?
>
> Not sure how that happened, will fix.
>


Re: [Mesa-dev] [PATCH 4/5] nir: Make some helpers for copying ALU src/dests.

2015-01-23 Thread Jason Ekstrand
On Fri, Jan 23, 2015 at 4:25 PM, Jason Ekstrand 
wrote:

> I'll throw together a patch to make nir_src_copy take pointers to keep
> things consistent.
>
> Reviewed-by: Jason Ekstrand 
>
> On Thu, Jan 22, 2015 at 11:08 AM, Eric Anholt  wrote:
>
>> Jason Ekstrand  writes:
>>
>> > On Wed, Jan 21, 2015 at 5:26 PM, Eric Anholt  wrote:
>> >
>> >> There aren't many users yet, but I wanted to do this from my
>> scalarizing
>> >> pass.
>> >> ---
>> >>  src/glsl/nir/nir.c | 18 ++
>> >>  src/glsl/nir/nir.h |  5 -
>> >>  src/glsl/nir/nir_lower_vec_to_movs.c   |  7 ++-
>> >>  src/glsl/nir/nir_opt_peephole_select.c |  5 +
>> >>  4 files changed, 25 insertions(+), 10 deletions(-)
>> >>
>> >> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
>> >> index 16ad2da..e414df9 100644
>> >> --- a/src/glsl/nir/nir.c
>> >> +++ b/src/glsl/nir/nir.c
>> >> @@ -175,6 +175,24 @@ nir_dest nir_dest_copy(nir_dest dest, void
>> *mem_ctx)
>> >> return ret;
>> >>  }
>> >>
>> >> +void
>> >> +nir_alu_src_copy(nir_alu_src *dest, nir_alu_src *src, void *mem_ctx)
>>
>
One more comment if it's not too late.  Do you want src to be const here?
I guess it doesn't much matter, but it makes it a bit more clear.
--Jason


> >> +{
>> >>
>> >
>> > We already have nir_src_copy which returns a nir_src instead of taking a
>> > pointer.  TBH, I'm not sure which I prefer, but it would be good to be
>> > consistent.  Thoughts?
>>
>> Yeah, I was thinking that as the struct gets bigger, passing it around
>> on the stack gets worse.  For API consistency, I think that would mean
>> pointers for both.
>>
>> >> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>> >> index 8dc5222..7f0aa36 100644
>> >> --- a/src/glsl/nir/nir.h
>> >> +++ b/src/glsl/nir/nir.h
>> >> @@ -569,7 +569,10 @@ typedef struct {
>> >> unsigned write_mask : 4; /* ignored if dest.is_ssa is true */
>> >>  } nir_alu_dest;
>> >>
>> >> -#define OPCODE(name, num_inputs, output_size, output_type, \
>> >> +void nir_alu_src_copy(nir_alu_src *dest, nir_alu_src *src, void
>> *mem_ctx);
>> >> +void nir_alu_dest_copy(nir_alu_dest *dest, nir_alu_dest *src, void
>> >> *mem_ctx);
>> >> +
>> >> +#define OPCODE(name, num_inputs, output_size, output_type,  \
>> >>
>> >
>> > Accidental whitespace change?
>>
>> Not sure how that happened, will fix.
>>
>
>


[Mesa-dev] [PATCH 1/2] nir: When asked to print with a NULL state, just use bare variable names.

2015-01-23 Thread Eric Anholt
---
 src/glsl/nir/nir_print.c | 22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
index 1a50ae9..2ef55ed 100644
--- a/src/glsl/nir/nir_print.c
+++ b/src/glsl/nir/nir_print.c
@@ -210,7 +210,9 @@ print_var_decl(nir_variable *var, print_var_state *state, 
FILE *fp)
 
glsl_print_type(var->type, fp);
 
-   struct set_entry *entry = _mesa_set_search(state->syms, var->name);
+   struct set_entry *entry = NULL;
+   if (state)
+  entry = _mesa_set_search(state->syms, var->name);
 
char *name;
 
@@ -231,18 +233,26 @@ print_var_decl(nir_variable *var, print_var_state *state, 
FILE *fp)
 
fprintf(fp, "\n");
 
-   _mesa_set_add(state->syms, name);
-   _mesa_hash_table_insert(state->ht, var, name);
+   if (state) {
+  _mesa_set_add(state->syms, name);
+  _mesa_hash_table_insert(state->ht, var, name);
+   }
 }
 
 static void
 print_var(nir_variable *var, print_var_state *state, FILE *fp)
 {
-   struct hash_entry *entry = _mesa_hash_table_search(state->ht, var);
+   const char *name;
+   if (state) {
+  struct hash_entry *entry = _mesa_hash_table_search(state->ht, var);
 
-   assert(entry != NULL);
+  assert(entry != NULL);
+  name = entry->data;
+   } else {
+  name = var->name;
+   }
 
-   fprintf(fp, "%s", (char *) entry->data);
+   fprintf(fp, "%s", name);
 }
 
 static void
-- 
2.1.3



[Mesa-dev] [PATCH 2/2] nir: Expose nir_print_instr() for debug prints

2015-01-23 Thread Eric Anholt
It's nice to have this present in your default cases so you can see what
instruction is triggering an abort.

v2: Just pass a NULL state, now that it won't crash when you do.
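
The fallback this relies on can be modelled roughly as follows. Assumptions: Var and PrintState are hypothetical stand-ins for nir_variable and print_var_state, a dict replaces the hash table, and the "color@1" style of unique name in the usage is made up.

```python
# Hypothetical model of print_var(): with a state, look up the unique name
# recorded when the variable was declared; without one, fall back to the
# bare variable name instead of dereferencing a missing state.
import io

class Var:
    def __init__(self, name):
        self.name = name

class PrintState:
    def __init__(self):
        self.ht = {}  # variable -> unique printed name

def print_var(var, state, fp):
    if state is not None:
        name = state.ht[var]   # must have been declared earlier
    else:
        name = var.name        # no state: bare name
    fp.write(name)

def print_var_to_string(var, state=None):
    """Convenience wrapper for debug output, analogous to passing NULL."""
    out = io.StringIO()
    print_var(var, state, out)
    return out.getvalue()
```

A debug call site can then print a single variable or instruction without first building the whole print state.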
---
 src/glsl/nir/nir.h   | 1 +
 src/glsl/nir/nir_print.c | 9 +++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 58a8efe..0912837 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1484,6 +1484,7 @@ void nir_index_ssa_defs(nir_function_impl *impl);
 void nir_index_blocks(nir_function_impl *impl);
 
 void nir_print_shader(nir_shader *shader, FILE *fp);
+void nir_print_instr(nir_instr *instr, FILE *fp);
 
 #ifdef DEBUG
 void nir_validate_shader(nir_shader *shader);
diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
index 2ef55ed..9c07950 100644
--- a/src/glsl/nir/nir_print.c
+++ b/src/glsl/nir/nir_print.c
@@ -621,8 +621,6 @@ print_instr(nir_instr *instr, print_var_state *state, 
unsigned tabs, FILE *fp)
   unreachable("Invalid instruction type");
   break;
}
-
-   fprintf(fp, "\n");
 }
 
 static int
@@ -668,6 +666,7 @@ print_block(nir_block *block, print_var_state *state, 
unsigned tabs, FILE *fp)
 
nir_foreach_instr(block, instr) {
   print_instr(instr, state, tabs, fp);
+  fprintf(fp, "\n");
}
 
print_tabs(tabs, fp);
@@ -881,3 +880,9 @@ nir_print_shader(nir_shader *shader, FILE *fp)
 
destroy_print_state(&state);
 }
+
+void
+nir_print_instr(nir_instr *instr, FILE *fp)
+{
+   print_instr(instr, NULL, 0, fp);
+}
-- 
2.1.3



Re: [Mesa-dev] [PATCH 1/2] nir: When asked to print with a NULL state, just use bare variable names.

2015-01-23 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Fri, Jan 23, 2015 at 4:35 PM, Eric Anholt  wrote:

> ---
>  src/glsl/nir/nir_print.c | 22 --
>  1 file changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
> index 1a50ae9..2ef55ed 100644
> --- a/src/glsl/nir/nir_print.c
> +++ b/src/glsl/nir/nir_print.c
> @@ -210,7 +210,9 @@ print_var_decl(nir_variable *var, print_var_state
> *state, FILE *fp)
>
> glsl_print_type(var->type, fp);
>
> -   struct set_entry *entry = _mesa_set_search(state->syms, var->name);
> +   struct set_entry *entry = NULL;
> +   if (state)
> +  entry = _mesa_set_search(state->syms, var->name);
>
> char *name;
>
> @@ -231,18 +233,26 @@ print_var_decl(nir_variable *var, print_var_state
> *state, FILE *fp)
>
> fprintf(fp, "\n");
>
> -   _mesa_set_add(state->syms, name);
> -   _mesa_hash_table_insert(state->ht, var, name);
> +   if (state) {
> +  _mesa_set_add(state->syms, name);
> +  _mesa_hash_table_insert(state->ht, var, name);
> +   }
>  }
>
>  static void
>  print_var(nir_variable *var, print_var_state *state, FILE *fp)
>  {
> -   struct hash_entry *entry = _mesa_hash_table_search(state->ht, var);
> +   const char *name;
> +   if (state) {
> +  struct hash_entry *entry = _mesa_hash_table_search(state->ht, var);
>
> -   assert(entry != NULL);
> +  assert(entry != NULL);
> +  name = entry->data;
> +   } else {
> +  name = var->name;
> +   }
>
> -   fprintf(fp, "%s", (char *) entry->data);
> +   fprintf(fp, "%s", name);
>  }
>
>  static void
> --
> 2.1.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH 2/2] nir: Expose nir_print_instr() for debug prints

2015-01-23 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Fri, Jan 23, 2015 at 4:35 PM, Eric Anholt  wrote:

> It's nice to have this present in your default cases so you can see what
> instruction is triggering an abort.
>
> v2: Just pass a NULL state, now that it won't crash when you do.
> ---
>  src/glsl/nir/nir.h   | 1 +
>  src/glsl/nir/nir_print.c | 9 +++--
>  2 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index 58a8efe..0912837 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -1484,6 +1484,7 @@ void nir_index_ssa_defs(nir_function_impl *impl);
>  void nir_index_blocks(nir_function_impl *impl);
>
>  void nir_print_shader(nir_shader *shader, FILE *fp);
> +void nir_print_instr(nir_instr *instr, FILE *fp);
>
>  #ifdef DEBUG
>  void nir_validate_shader(nir_shader *shader);
> diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
> index 2ef55ed..9c07950 100644
> --- a/src/glsl/nir/nir_print.c
> +++ b/src/glsl/nir/nir_print.c
> @@ -621,8 +621,6 @@ print_instr(nir_instr *instr, print_var_state *state,
> unsigned tabs, FILE *fp)
>unreachable("Invalid instruction type");
>break;
> }
> -
> -   fprintf(fp, "\n");
>  }
>
>  static int
> @@ -668,6 +666,7 @@ print_block(nir_block *block, print_var_state *state,
> unsigned tabs, FILE *fp)
>
> nir_foreach_instr(block, instr) {
>print_instr(instr, state, tabs, fp);
> +  fprintf(fp, "\n");
> }
>
> print_tabs(tabs, fp);
> @@ -881,3 +880,9 @@ nir_print_shader(nir_shader *shader, FILE *fp)
>
> destroy_print_state(&state);
>  }
> +
> +void
> +nir_print_instr(nir_instr *instr, FILE *fp)
> +{
> +   print_instr(instr, NULL, 0, fp);
> +}
> --
> 2.1.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


[Mesa-dev] [PATCH] nir: Use pointers for nir_src_copy and nir_dest_copy

2015-01-23 Thread Jason Ekstrand
This avoids the overhead of copying structures and better matches the newly
added nir_alu_src_copy and nir_alu_dest_copy.
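
The shape of the new copy helpers, including the recursion through reg.indirect that the diff below uses where the old code did a plain struct assignment, can be sketched as follows. Assumptions: Src is a simplified stand-in for nir_src with string fields, and Python object allocation stands in for ralloc() with a mem_ctx.

```python
# Hypothetical model of a destination-pointer style copy: the caller owns
# dest, and the helper fills it in, recursing into any indirect source.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Src:
    is_ssa: bool = False
    ssa: Optional[str] = None
    reg: Optional[str] = None
    base_offset: int = 0
    indirect: Optional["Src"] = None

def src_copy(dest, src):
    dest.is_ssa = src.is_ssa
    if src.is_ssa:
        dest.ssa = src.ssa
    else:
        dest.base_offset = src.base_offset
        dest.reg = src.reg
        if src.indirect is not None:
            dest.indirect = Src()                  # fresh allocation
            src_copy(dest.indirect, src.indirect)  # recurse, do not share
        else:
            dest.indirect = None
```

Because each level of indirect gets its own allocation, later changes to the original chain do not show up in the copy.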
---

This should be obvious, but this applies on top of Eric Anholt's patch to
add nir_alu_src_copy and nir_alu_dest_copy

 src/glsl/nir/nir.c  | 56 +++--
 src/glsl/nir/nir.h  |  4 +--
 src/glsl/nir/nir_from_ssa.c |  4 +--
 src/glsl/nir/nir_lower_atomics.c|  4 +--
 src/glsl/nir/nir_lower_io.c |  8 ++---
 src/glsl/nir/nir_lower_locals_to_regs.c | 12 +++
 src/glsl/nir/nir_lower_samplers.cpp |  2 +-
 src/glsl/nir/nir_lower_system_values.c  |  2 +-
 src/glsl/nir/nir_opt_peephole_select.c  |  4 +--
 src/glsl/nir/nir_search.c   |  4 +--
 10 files changed, 47 insertions(+), 53 deletions(-)

diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index e414df9..472fa39 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -135,50 +135,44 @@ nir_function_overload_create(nir_function *func)
return overload;
 }
 
-nir_src nir_src_copy(nir_src src, void *mem_ctx)
+void nir_src_copy(nir_src *dest, const nir_src *src, void *mem_ctx)
 {
-   nir_src ret;
-   ret.is_ssa = src.is_ssa;
-   if (ret.is_ssa) {
-  ret.ssa = src.ssa;
+   dest->is_ssa = src->is_ssa;
+   if (src->is_ssa) {
+  dest->ssa = src->ssa;
} else {
-  ret.reg.base_offset = src.reg.base_offset;
-  ret.reg.reg = src.reg.reg;
-  if (src.reg.indirect) {
- ret.reg.indirect = ralloc(mem_ctx, nir_src);
- *ret.reg.indirect = *src.reg.indirect;
+  dest->reg.base_offset = src->reg.base_offset;
+  dest->reg.reg = src->reg.reg;
+  if (src->reg.indirect) {
+ dest->reg.indirect = ralloc(mem_ctx, nir_src);
+ nir_src_copy(dest->reg.indirect, src->reg.indirect, mem_ctx);
   } else {
- ret.reg.indirect = NULL;
+ dest->reg.indirect = NULL;
   }
}
-
-   return ret;
 }
 
-nir_dest nir_dest_copy(nir_dest dest, void *mem_ctx)
+void nir_dest_copy(nir_dest *dest, const nir_dest *src, void *mem_ctx)
 {
-   nir_dest ret;
-   ret.is_ssa = dest.is_ssa;
-   if (ret.is_ssa) {
-  ret.ssa = dest.ssa;
+   dest->is_ssa = src->is_ssa;
+   if (src->is_ssa) {
+  dest->ssa = src->ssa;
} else {
-  ret.reg.base_offset = dest.reg.base_offset;
-  ret.reg.reg = dest.reg.reg;
-  if (dest.reg.indirect) {
- ret.reg.indirect = ralloc(mem_ctx, nir_src);
- *ret.reg.indirect = *dest.reg.indirect;
+  dest->reg.base_offset = src->reg.base_offset;
+  dest->reg.reg = src->reg.reg;
+  if (src->reg.indirect) {
+ dest->reg.indirect = ralloc(mem_ctx, nir_src);
+ nir_src_copy(dest->reg.indirect, src->reg.indirect, mem_ctx);
   } else {
- ret.reg.indirect = NULL;
+ dest->reg.indirect = NULL;
   }
}
-
-   return ret;
 }
 
 void
 nir_alu_src_copy(nir_alu_src *dest, nir_alu_src *src, void *mem_ctx)
 {
-   dest->src = nir_src_copy(src->src, mem_ctx);
+   nir_src_copy(&dest->src, &src->src, mem_ctx);
dest->abs = src->abs;
dest->negate = src->negate;
for (unsigned i = 0; i < 4; i++)
@@ -188,7 +182,7 @@ nir_alu_src_copy(nir_alu_src *dest, nir_alu_src *src, void 
*mem_ctx)
 void
 nir_alu_dest_copy(nir_alu_dest *dest, nir_alu_dest *src, void *mem_ctx)
 {
-   dest->dest = nir_dest_copy(src->dest, mem_ctx);
+   nir_dest_copy(&dest->dest, &src->dest, mem_ctx);
dest->write_mask = src->write_mask;
dest->saturate = src->saturate;
 }
@@ -563,7 +557,7 @@ copy_deref_array(void *mem_ctx, nir_deref_array *deref)
ret->base_offset = deref->base_offset;
ret->deref_array_type = deref->deref_array_type;
if (deref->deref_array_type == nir_deref_array_type_indirect) {
-  ret->indirect = nir_src_copy(deref->indirect, mem_ctx);
+   nir_src_copy(&ret->indirect, &deref->indirect, mem_ctx);
}
ret->deref.type = deref->deref.type;
if (deref->deref.child)
@@ -1829,7 +1823,7 @@ ssa_def_rewrite_uses_src(nir_src *src, void *void_state)
struct ssa_def_rewrite_state *state = void_state;
 
if (src->is_ssa && src->ssa == state->old)
-  *src = nir_src_copy(state->new_src, state->mem_ctx);
+  nir_src_copy(src, &state->new_src, state->mem_ctx);
 
return true;
 }
@@ -1866,7 +1860,7 @@ nir_ssa_def_rewrite_uses(nir_ssa_def *def, nir_src 
new_src, void *mem_ctx)
   nir_if *if_use = (nir_if *)entry->key;
 
   _mesa_set_remove(def->if_uses, entry);
-  if_use->condition = nir_src_copy(new_src, mem_ctx);
+  nir_src_copy(&if_use->condition, &new_src, mem_ctx);
   _mesa_set_add(new_if_uses, if_use);
}
 }
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index f0d4de7..4770558 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -519,8 +519,8 @@ nir_dest_for_reg(nir_register *reg)
return dest;
 }
 
-nir_src nir_src_copy(nir_src src, void *mem_ctx);
-nir_dest nir_dest_copy(nir_dest dest, void *mem_ctx);
+void nir_src_copy(nir_src *d

Re: [Mesa-dev] [PATCH] i965/fs: Allow SIMD16 on pre-SNB when try_replace_with_sel is successful

2015-01-23 Thread Kenneth Graunke
On Friday, January 23, 2015 02:32:53 PM Ian Romanick wrote:
> From: Ian Romanick 
> 
> If try_replace_with_sel is able to replace the flow control with a SEL
> instruction, then there is no flow control... failing SIMD16 because
> of nonexistent flow control is wrong.
> 
> No piglit regressions on any i965 platform in Jenkins.
> 
> total instructions in shared programs: 4382707 -> 4382707 (0.00%)
> instructions in affected programs: 0 -> 0
> helped:0
> HURT:  0
> GAINED:2089
> LOST:  0
> 
> No other platforms affected in shader-db.
> 
> Signed-off-by: Ian Romanick 

Reviewed-by: Kenneth Graunke 



[Mesa-dev] videos of X, DRM, & Mesa talks from LCA2015

2015-01-23 Thread Alan Coopersmith

The amazing 2015 linux.conf.au video team has already posted all the videos
from the talks at this year's conference - you can find them on their YouTube
channel at https://www.youtube.com/user/linuxconfau2015 or via the links on
http://lca2015.linux.org.au/

Talks of particular relevance to Xorg, DRI, and Mesa developers include:

Botching up IOCTLs
Daniel Vetter (Intel)
https://www.youtube.com/watch?v=WnqXHs_tGR4
http://lca2015.linux.org.au/schedule/30266/view_talk
Slides: http://people.freedesktop.org/~danvet/presentations/lca-2015.pdf

Displayport MST: why do my laptop dock outputs not work?
David Airlie (Red Hat)
https://www.youtube.com/watch?v=6301tGNs9Dc
http://lca2015.linux.org.au/schedule/30303/view_talk
Slides: http://lca2015.linux.org.au/slides/93/lca2015mst.odp

Open-source OpenGL on the Raspberry Pi
Eric Anholt (Broadcom)
https://www.youtube.com/watch?v=EXDeketJNdk
http://lca2015.linux.org.au/schedule/30256/view_talk
Slides: http://lca2015.linux.org.au/slides/125/lca2015-rpi.pdf

Putting the Polish on Glamor
Keith Packard (HP)
https://www.youtube.com/watch?v=dXR-MVQvQZw
http://lca2015.linux.org.au/schedule/30108/view_talk
Slides: http://lca2015.linux.org.au/slides/64/glamor.odp

Reducing GLSL Compiler Memory Usage (or Fitting 5kg of Potatoes in a 2kg Bag)
Ian Romanick (Intel)
https://www.youtube.com/watch?v=K-5DTAD2Isk
http://lca2015.linux.org.au/schedule/30149/view_talk
Slides: http://people.freedesktop.org/~idr/LCA2015/

And semi-related is a talk from former-SFLC-lawyer Karen Sandler on the
troubles open source groups have had with non-profit status that have been
pushing the X.Org Foundation to join a larger group to leverage their
lawyers and accountants in meeting all the needed requirements, resulting
in our current moves to become part of SPI:
https://www.youtube.com/watch?v=Z5uY01QlyK0
http://lca2015.linux.org.au/schedule/30178/view_talk

--
-Alan Coopersmith-  alan.coopersm...@oracle.com
 Oracle Solaris Engineering - http://blogs.oracle.com/alanc


Re: [Mesa-dev] [PATCH] radeonsi: Enable VGPR spilling for all shader types v3

2015-01-23 Thread Tom Stellard
On Thu, Jan 22, 2015 at 11:27:32AM +0900, Michel Dänzer wrote:
> On 21.01.2015 21:12, Marek Olšák wrote:
> > We also had a case when the CPU accidentally corrupted shaders,
> > because the shaders were mapped after textures and a CPU texture
> > upload overflowed and overwrote shaders. I suppose we should have
> > unmapped the shaders.
> 
> Sounds like a good idea.
> 
> 
> Tom, for now I suggest this solution, summarized from Marek's previous
> descriptions:
> 
> (At least) for shaders which have relocations, keep a copy of the
> machine code in malloced memory. When the relocated values change,
> update them in the malloced memory, allocate a new BO, map it, copy the
> machine code from the malloced memory to the BO, replace any existing
> shader BO with the new one and invalidate the shader state.
> 

Hi,

Attached is a WIP patch attempting to implement it this way.
Unfortunately, I was unable to get it working, so I wanted to
submit it for review in case someone can spot what I'm doing wrong.
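
For reference, here is how I read the suggested sequence, written as a toy Python model rather than driver code. `BufferObject`, `Shader` and `update_relocs` are made-up names for illustration only; just the two reloc symbol names come from the patch, and the real code would go through the winsys to allocate and map the BO.

```python
class BufferObject:
    """Stand-in for a GPU buffer object (hypothetical, CPU memory only)."""

    def __init__(self, size):
        self.data = bytearray(size)


class Shader:
    """Keeps a CPU-side copy of the machine code next to the current BO."""

    def __init__(self, code, relocs):
        self.code = bytearray(code)  # the "malloced" copy of the machine code
        self.relocs = relocs         # symbol name -> byte offset of a dword
        self.bo = None               # BO the hardware state points at
        self.state_dirty = False     # True when shader state must be re-emitted

    def update_relocs(self, values):
        # 1. Patch the relocated dwords in the CPU-side copy.
        for name, offset in self.relocs.items():
            self.code[offset:offset + 4] = values[name].to_bytes(4, "little")
        # 2. Allocate a new BO and copy the patched code into it.
        new_bo = BufferObject(len(self.code))
        new_bo.data[:] = self.code
        # 3. Replace the old BO and invalidate the shader state.
        self.bo = new_bo
        self.state_dirty = True


shader = Shader(bytes(8), {"SCRATCH_RSRC_DWORD0": 0, "SCRATCH_RSRC_DWORD1": 4})
shader.update_relocs({"SCRATCH_RSRC_DWORD0": 0x12345678,
                      "SCRATCH_RSRC_DWORD1": 0x9})
print(shader.bo.data.hex())
```

Every update leaves the previous BO untouched, which is what distinguishes this from the in-place patching in the #else branch.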

You can find the broken code wrapped in #if 0 in the
si_update_scratch_buffer() function in si_state_shaders.c

Based on the dmesg output and other tests I've done, it appears
that the GPU is still executing the shader code from the old bo
which does not contain the relocations.

The code in the #else branch works fine, but it updates the existing
bo in place rather than creating a new one.

Any idea what I've done wrong?

Thanks,
Tom
From ba673155672756fb0bf9873b2ae76c3f5ccd02e2 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Wed, 10 Dec 2014 09:13:59 -0500
Subject: [PATCH] radeonsi: Enable VGPR spilling for all shader types v5 (WIP)

v2:
  - Only emit write SPI_TMPRING_SIZE once per packet.
  - Use context global scratch buffer.

v3:
  - Patch shaders using WRITE_DATA packet instead of map/unmap.
  - Emit ICACHE_FLUSH, CS_PARTIAL_FLUSH, PS_PARTIAL_FLUSH, and
VS_PARTIAL_FLUSH when patching shaders.

v4:
  - Code cleanups.
  - Remove unnecessary multiplies.

v5:
  - Patch shaders in system memory and re-upload to vram.
---
 src/gallium/drivers/radeonsi/si_compute.c   |  42 +--
 src/gallium/drivers/radeonsi/si_hw_context.c|   1 +
 src/gallium/drivers/radeonsi/si_pipe.c  |   9 +-
 src/gallium/drivers/radeonsi/si_pipe.h  |   6 +
 src/gallium/drivers/radeonsi/si_shader.c|  54 +++--
 src/gallium/drivers/radeonsi/si_shader.h|   8 +-
 src/gallium/drivers/radeonsi/si_state_draw.c|  15 +++
 src/gallium/drivers/radeonsi/si_state_shaders.c | 141 +++-
 8 files changed, 227 insertions(+), 49 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
index 981bccb..4dd4379 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -42,12 +42,6 @@
 #define NUM_USER_SGPRS 4
 #endif
 
-static const char *scratch_rsrc_dword0_symbol =
-	"SCRATCH_RSRC_DWORD0";
-
-static const char *scratch_rsrc_dword1_symbol =
-	"SCRATCH_RSRC_DWORD1";
-
 struct si_compute {
 	struct si_context *ctx;
 
@@ -68,8 +62,6 @@ struct si_compute {
 #endif
 };
 
-static void apply_scratch_relocs(const struct si_screen *sscreen,
-			struct si_shader *shader, uint64_t scratch_va);
 static void init_scratch_buffer(struct si_context *sctx, struct si_compute *program)
 {
 	unsigned scratch_bytes = 0;
@@ -85,8 +77,8 @@ static void init_scratch_buffer(struct si_context *sctx, struct si_compute *prog
 program->shader.binary.global_symbol_offsets[i];
 		unsigned scratch_bytes_needed;
 
-		si_shader_binary_read_config(&program->shader.binary,
-		&program->shader, offset);
+		si_shader_binary_read_config(sctx->screen, &program->shader.binary,
+&program->shader, offset);
 		scratch_bytes_needed = scratch_waves *
 program->shader.scratch_bytes_per_wave;
 		scratch_bytes = MAX2(scratch_bytes, scratch_bytes_needed);
@@ -101,7 +93,8 @@ static void init_scratch_buffer(struct si_context *sctx, struct si_compute *prog
 	scratch_buffer_va = program->scratch_bo->gpu_address;
 
 	/* Patch the shader with the scratch buffer address. */
-	apply_scratch_relocs(sctx->screen, &program->shader, scratch_buffer_va);
+	si_shader_apply_scratch_relocs(sctx,
+&program->shader, scratch_buffer_va);
 
 }
 
@@ -226,30 +219,6 @@ static unsigned compute_num_waves_for_scratch(
 	return scratch_waves;
 }
 
-static void apply_scratch_relocs(const struct si_screen *sscreen,
-			struct si_shader *shader, uint64_t scratch_va) {
-	unsigned i;
-	uint32_t scratch_rsrc_dword0 = scratch_va & 0xffffffff;
-	uint32_t scratch_rsrc_dword1 =
-		S_008F04_BASE_ADDRESS_HI(scratch_va >> 32)
-		|  S_008F04_STRIDE(shader->scratch_bytes_per_wave / 64);
-
-	if (!shader->binary.reloc_count) {
-		return;
-	}
-
-	for (i = 0 ; i < shader->binary.reloc_count; i++) {
-		const struct radeon_shader_reloc *reloc = &shader->binary.relocs[i];
-		if (!strcmp(scratch_rsrc_dword0_symbol, reloc->name)) {
-			util_memcpy_cpu_to_le32(shader->b

Re: [Mesa-dev] [PATCH] radeonsi: Enable VGPR spilling for all shader types v3

2015-01-23 Thread Michel Dänzer
On 24.01.2015 11:56, Tom Stellard wrote:
> On Thu, Jan 22, 2015 at 11:27:32AM +0900, Michel Dänzer wrote:
>>
>> Tom, for now I suggest this solution, summarized from Marek's previous
>> descriptions:
>>
>> (At least) for shaders which have relocations, keep a copy of the
>> machine code in malloced memory. When the relocated values change,
>> update them in the malloced memory, allocate a new BO, map it, copy the
>> machine code from the malloced memory to the BO, replace any existing
>> shader BO with the new one and invalidate the shader state.
>>
> 
> Hi,
> 
> Attached is a WIP patch attempting to implement it this way.
> Unfortunately, I was unable to get it working, so I wanted to
> submit it for review in case someone can spot what I'm doing wrong.
> 
> You can find the broken code wrapped in #if 0 in the
> si_update_scratch_buffer() function in si_state_shaders.c
> 
> Based on the dmesg output and other tests I've done, it appears
> that the GPU is still executing the shader code from the old bo
> which does not contain the relocations.
> 
> The code in the #else branch works fine, but it updates the existing
> bo in place rather than creating a new one.
> 
> Any idea what I've done wrong?

[...]

> + /* Update the shaders, so they are using the latest scratch.  The
> +  * scratch buffer may have been changed since these shaders were
> +  * last used, so we still need to try to update them, even if
> +  * they require scratch buffers smaller than the current size.
> +  */
> + if (si_update_scratch_buffer(sctx, sctx->ps_shader))
> + sctx->emitted.named.ps = NULL;
> + if (si_update_scratch_buffer(sctx, sctx->gs_shader))
> + sctx->emitted.named.gs = NULL;
> + if (si_update_scratch_buffer(sctx, sctx->vs_shader))
> + sctx->emitted.named.vs = NULL;

Does this work instead?

	if (si_update_scratch_buffer(sctx, sctx->ps_shader))
		si_pm4_bind_state(sctx, ps, sctx->ps_shader->current->pm4);
	if (si_update_scratch_buffer(sctx, sctx->gs_shader))
		si_pm4_bind_state(sctx, gs, sctx->gs_shader->current->pm4);
	if (si_update_scratch_buffer(sctx, sctx->vs_shader))
		si_pm4_bind_state(sctx, vs, sctx->vs_shader->current->pm4);


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [PATCH] glapi: Do not use backtrace on FreeBSD.

2015-01-23 Thread Vinson Lee
Fix build error.

  CCLD libGL.la
libglapi.a(glapi_libglapi_la-glapi_gentable.o): In function `__glapi_gentable_NoOp':
glapi_gentable.c:76: undefined reference to `backtrace'

Signed-off-by: Vinson Lee 
---
 src/mapi/glapi/gen/gl_gentable.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mapi/glapi/gen/gl_gentable.py b/src/mapi/glapi/gen/gl_gentable.py
index 06a5ebf..fb578e3 100644
--- a/src/mapi/glapi/gen/gl_gentable.py
+++ b/src/mapi/glapi/gen/gl_gentable.py
@@ -42,7 +42,7 @@ header = """/* GLXEXT is the define used in the xserver when the GLX extension i
 #endif
 
 #if (defined(GLXEXT) && defined(HAVE_BACKTRACE)) \\
-   || (!defined(GLXEXT) && defined(DEBUG) && !defined(__CYGWIN__) && !defined(__MINGW32__) && !defined(__OpenBSD__) && !defined(__NetBSD__) && !defined(__DragonFly__))
+   || (!defined(GLXEXT) && defined(DEBUG) && !defined(__CYGWIN__) && !defined(__MINGW32__) && !defined(__OpenBSD__) && !defined(__NetBSD__) && !defined(__DragonFly__) && !defined(__FreeBSD__))
 #define USE_BACKTRACE
 #endif
 
-- 
2.2.1



[Mesa-dev] [Bug 88766] codegen/nv50_ir.h:585:9: error: no member named 'tr1' in namespace 'std'

2015-01-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=88766

Bug ID: 88766
   Summary: codegen/nv50_ir.h:585:9: error: no member named 'tr1'
in namespace 'std'
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: FreeBSD
Status: NEW
  Severity: blocker
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org
CC: imir...@alum.mit.edu

mesa: 94e7b59a75fc2ecc51a74196f6cd198546603b85 (master 10.5.0-devel)

Build error on FreeBSD.

  CXX  codegen/nv50_ir.lo
In file included from codegen/nv50_ir.cpp:23:
./codegen/nv50_ir.h:585:9: error: no member named 'tr1' in namespace 'std'
   std::tr1::unordered_set uses;
   ~^

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 88766] codegen/nv50_ir.h:585:9: error: no member named 'tr1' in namespace 'std'

2015-01-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=88766

--- Comment #1 from Ilia Mirkin  ---
Is this a really old compiler? tr1 has been in gcc for quite a while. In any
case, there's no nouveau kernel support on FreeBSD ATM, so this isn't a huge
issue.



[Mesa-dev] [Bug 88766] codegen/nv50_ir.h:585:9: error: no member named 'tr1' in namespace 'std'

2015-01-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=88766

--- Comment #2 from Vinson Lee  ---
The error was seen on FreeBSD 10 with clang 3.4.

Removing the tr1 namespace fixes the build errors.



[Mesa-dev] [Bug 88766] codegen/nv50_ir.h:585:9: error: no member named 'tr1' in namespace 'std'

2015-01-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=88766

--- Comment #3 from Vinson Lee  ---
Created attachment 112757
  --> https://bugs.freedesktop.org/attachment.cgi?id=112757&action=edit
Remove tr1 namespace.



[Mesa-dev] [Bug 88766] codegen/nv50_ir.h:585:9: error: no member named 'tr1' in namespace 'std'

2015-01-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=88766

--- Comment #4 from Ilia Mirkin  ---
(In reply to Vinson Lee from comment #3)
> Created attachment 112757 [details] [review]
> Remove tr1 namespace.

That would require C++11 support, which I feel a lot worse about requiring than
having nouveau not compile with clang. I'm surprised that clang chose not to
ship tr1 headers... oh well.

If someone maps out the various version support for all this, perhaps we can
make a decision. Or some other approach is the standard way to deal with this?



[Mesa-dev] [PATCH v2 1/3] nir: Add a pass to lower vector phi nodes to scalar phi nodes

2015-01-23 Thread Jason Ekstrand
---
 src/glsl/Makefile.sources   |   1 +
 src/glsl/nir/nir.h  |   2 +
 src/glsl/nir/nir_lower_phis_to_scalar.c | 238 
 3 files changed, 241 insertions(+)
 create mode 100644 src/glsl/nir/nir_lower_phis_to_scalar.c

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index 96c4ec5..02d0780 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -28,6 +28,7 @@ NIR_FILES = \
nir/nir_lower_global_vars_to_local.c \
nir/nir_lower_locals_to_regs.c \
nir/nir_lower_io.c \
+   nir/nir_lower_phis_to_scalar.c \
nir/nir_lower_samplers.cpp \
nir/nir_lower_system_values.c \
nir/nir_lower_to_source_mods.c \
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 119ca01..cda14aa 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1523,6 +1523,8 @@ void nir_remove_dead_variables(nir_shader *shader);
 void nir_lower_vec_to_movs(nir_shader *shader);
 void nir_lower_alu_to_scalar(nir_shader *shader);
 
+void nir_lower_phis_to_scalar(nir_shader *shader);
+
 void nir_lower_samplers(nir_shader *shader,
 struct gl_shader_program *shader_program,
 struct gl_program *prog);
diff --git a/src/glsl/nir/nir_lower_phis_to_scalar.c b/src/glsl/nir/nir_lower_phis_to_scalar.c
new file mode 100644
index 0000000..9f901d6
--- /dev/null
+++ b/src/glsl/nir/nir_lower_phis_to_scalar.c
@@ -0,0 +1,238 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *Jason Ekstrand (ja...@jlekstrand.net)
+ *
+ */
+
+#include "nir.h"
+
+/*
+ * Implements a pass that lowers vector phi nodes to scalar phi nodes
+ */
+
+struct lower_phis_to_scalar_state {
+   void *mem_ctx;
+   void *dead_ctx;
+
+   /* Hash table marking which phi nodes are scalarizable.  The key is
+* pointers to phi instructions and the entry is either NULL for not
+* scalarizable or non-null for scalarizable.
+*/
+   struct hash_table *phi_table;
+};
+
+/* Determines if the given phi node should be lowered.  The only phi nodes
+ * we will scalarize at the moment are those where all of the sources are
+ * scalarizable.
+ */
+static bool
+should_lower_phi(nir_phi_instr *phi, struct lower_phis_to_scalar_state *state)
+{
+   /* Already scalar */
+   if (phi->dest.ssa.num_components == 1)
+  return false;
+
+   struct hash_entry *entry = _mesa_hash_table_search(state->phi_table, phi);
+   if (entry)
+  return entry->data != NULL;
+
+   nir_foreach_phi_src(phi, src) {
+  /* Don't know what to do with non-ssa sources */
+  if (!src->src.is_ssa)
+ return false;
+
+  nir_instr *src_instr = src->src.ssa->parent_instr;
+  switch (src_instr->type) {
+  case nir_instr_type_alu: {
+ nir_alu_instr *src_alu = nir_instr_as_alu(src_instr);
+
+ /* ALU operations with output_size == 0 should be scalarized.  We
+  * will also see a bunch of vecN operations from scalarizing ALU
+  * operations and, since they can easily be copy-propagated, they
+  * are ok too.
+  */
+ return nir_op_infos[src_alu->op].output_size == 0 ||
+src_alu->op == nir_op_vec2 ||
+src_alu->op == nir_op_vec3 ||
+src_alu->op == nir_op_vec4;
+  }
+
+  case nir_instr_type_phi: {
+ nir_phi_instr *src_phi = nir_instr_as_phi(src_instr);
+
+ /* Insert an entry and mark it as scalarizable for now. That way
+  * we don't recurse forever and a cycle in the dependence graph
+  * won't automatically make us fail to scalarize.
+  */
+ entry = _mesa_hash_table_insert(state->phi_table, src_phi, (void *)1);
+ bool scalarizable = should_lower_phi(src_phi, state);
+ entry->data = (void *)scalarizable;
+
+ return scalarizable;
+

[Mesa-dev] [PATCH v2 2/3] i965/fs: Use NIR's scalarizing abilities and stop handling vectors

2015-01-23 Thread Jason Ekstrand
Now that we can scalarize with NIR, there's no need for all this code
anymore.  Let's get rid of it and just do scalar operations.
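
The scalar path relies on each scalarized ALU instruction writing exactly one channel, so the register offset can be read straight from the write mask. A minimal sketch of that mapping, in Python for brevity; `channel_from_write_mask` is a hypothetical helper, not a function in the driver, and the patch does the equivalent with `_mesa_bitcount` and `ffs`:

```python
def channel_from_write_mask(write_mask):
    """Return the index of the single channel enabled in write_mask.

    Mirrors the C idiom `ffs(write_mask) - 1` after asserting that exactly
    one bit is set.
    """
    if write_mask <= 0 or write_mask & (write_mask - 1):
        raise ValueError("expected exactly one channel in the write mask")
    # Isolate the lowest set bit, then convert it to a zero-based index.
    return (write_mask & -write_mask).bit_length() - 1


for mask in (0b0001, 0b0010, 0b0100, 0b1000):
    print(bin(mask), "->", channel_from_write_mask(mask))
```

A mask with zero or several bits set is rejected here, which corresponds to the assert in nir_emit_alu.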
---
 src/mesa/drivers/dri/i965/brw_fs.h   |  15 -
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 487 ++-
 2 files changed, 156 insertions(+), 346 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 84e0b9e..25197cd 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -592,21 +592,6 @@ public:
fs_reg get_nir_alu_src(nir_alu_instr *instr, unsigned src);
fs_reg get_nir_dest(nir_dest dest);
void emit_percomp(fs_inst *inst, unsigned wr_mask);
-   void emit_percomp(enum opcode op, fs_reg dest, fs_reg src0,
- unsigned wr_mask, bool saturate = false,
- enum brw_predicate predicate = BRW_PREDICATE_NONE,
- enum brw_conditional_mod mod = BRW_CONDITIONAL_NONE);
-   void emit_percomp(enum opcode op, fs_reg dest, fs_reg src0, fs_reg src1,
- unsigned wr_mask, bool saturate = false,
- enum brw_predicate predicate = BRW_PREDICATE_NONE,
- enum brw_conditional_mod mod = BRW_CONDITIONAL_NONE);
-   void emit_math_percomp(enum opcode op, fs_reg dest, fs_reg src0,
-  unsigned wr_mask, bool saturate = false);
-   void emit_math_percomp(enum opcode op, fs_reg dest, fs_reg src0,
-  fs_reg src1, unsigned wr_mask,
-  bool saturate = false);
-   void emit_reduction(enum opcode op, fs_reg dest, fs_reg src,
-   unsigned num_components);
 
int setup_color_payload(fs_reg *dst, fs_reg color, unsigned components);
void emit_alpha_test();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index de0d780..a722660 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -34,6 +34,8 @@ nir_optimize(nir_shader *nir)
   progress = false;
   nir_lower_vars_to_ssa(nir);
   nir_validate_shader(nir);
+  nir_lower_phis_to_scalar(nir);
+  nir_validate_shader(nir);
   progress |= nir_copy_prop(nir);
   nir_validate_shader(nir);
   progress |= nir_opt_dce(nir);
@@ -85,6 +87,9 @@ fs_visitor::emit_nir_code()
nir_split_var_copies(nir);
nir_validate_shader(nir);
 
+   nir_lower_alu_to_scalar(nir);
+   nir_validate_shader(nir);
+
nir_optimize(nir);
 
/* Lower a bunch of stuff */
@@ -540,20 +545,30 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++)
   op[i] = get_nir_alu_src(instr, i);
 
+   if (nir_op_infos[instr->op].output_size == 0) {
+  /* We've already scalarized, so we know that we only have one
+   * channel.  The only question is which channel.
+   */
+  assert(_mesa_bitcount(instr->dest.write_mask) == 1);
+  unsigned off = ffs(instr->dest.write_mask) - 1;
+  result = offset(result, off);
+
+  for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++)
+ op[i] = offset(op[i], off);
+   }
+
switch (instr->op) {
case nir_op_fmov:
case nir_op_i2f:
-   case nir_op_u2f: {
-  fs_inst *inst = MOV(result, op[0]);
-  inst->saturate = instr->dest.saturate;
-  emit_percomp(inst, instr->dest.write_mask);
-   }
+   case nir_op_u2f:
+  emit(MOV(result, op[0]))
+  ->saturate = instr->dest.saturate;
   break;
 
case nir_op_imov:
case nir_op_f2i:
case nir_op_f2u:
-  emit_percomp(MOV(result, op[0]), instr->dest.write_mask);
+  emit(MOV(result, op[0]));
   break;
 
case nir_op_fsign: {
@@ -562,55 +577,46 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
  * Predicated OR ORs 1.0 (0x3f800000) with the sign bit if val is not
  * zero.
  */
-  emit_percomp(CMP(reg_null_f, op[0], fs_reg(0.0f), BRW_CONDITIONAL_NZ),
-   instr->dest.write_mask);
+  emit(CMP(reg_null_f, op[0], fs_reg(0.0f), BRW_CONDITIONAL_NZ));
 
   fs_reg result_int = retype(result, BRW_REGISTER_TYPE_UD);
   op[0].type = BRW_REGISTER_TYPE_UD;
   result.type = BRW_REGISTER_TYPE_UD;
-  emit_percomp(AND(result_int, op[0], fs_reg(0x80000000u)),
-   instr->dest.write_mask);
+  emit(AND(result_int, op[0], fs_reg(0x80000000u)));
 
-  fs_inst *inst = OR(result_int, result_int, fs_reg(0x3f800000u));
-  inst->predicate = BRW_PREDICATE_NORMAL;
-  emit_percomp(inst, instr->dest.write_mask);
+  emit(OR(result_int, result_int, fs_reg(0x3f800000u)))
+  ->predicate = BRW_PREDICATE_NORMAL;
   if (instr->dest.saturate) {
- fs_inst *inst = MOV(result, result);
- inst->saturate = true;
- emit_percomp(inst, instr->dest.write_mask);
+ emit(MOV(result, result))
+ ->saturate = true;
   

[Mesa-dev] [PATCH v2 3/3] i965/fs_nir: Get rid of get_alu_src

2015-01-23 Thread Jason Ekstrand
Originally, get_alu_src was supposed to handle resolving swizzles and
things like that.  However, now that basically every instruction we have
only takes scalar sources, we don't really need it anymore.  The only case
where it's still marginally useful is for the mov and vecN operations that
are left over from SSA form.  We can handle those cases as a special case
easily enough.  As a side-effect, we don't need the vec_to_movs pass
anymore.
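
To illustrate why the mov/vecN special case needs a temporary when the source and destination are the same register, here is a small Python model of per-channel moves. `apply_swizzle`, `mov_in_place` and `mov_via_temp` are hypothetical names used only for this sketch; the driver emits MOV instructions instead of list assignments.

```python
def apply_swizzle(dst, src, swizzle, write_mask=0xf):
    """Emulate one MOV per enabled channel: dst[i] = src[swizzle[i]]."""
    for i in range(4):
        if write_mask & (1 << i):
            dst[i] = src[swizzle[i]]


def mov_in_place(reg, swizzle, write_mask=0xf):
    """Naive version: source and destination alias the same register."""
    apply_swizzle(reg, reg, swizzle, write_mask)
    return reg


def mov_via_temp(reg, swizzle, write_mask=0xf):
    """Write to a temporary first, then copy the enabled channels back."""
    temp = [0] * 4
    apply_swizzle(temp, reg, swizzle, write_mask)
    for i in range(4):
        if write_mask & (1 << i):
            reg[i] = temp[i]
    return reg


swizzle_yxwz = (1, 0, 3, 2)
print(mov_in_place([10, 20, 30, 40], swizzle_yxwz))
print(mov_via_temp([10, 20, 30, 40], swizzle_yxwz))
```

In the in-place version the first move overwrites a channel that a later move still needs to read, so the swizzled result is lost; going through the temporary keeps every read ahead of every write.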
---
 src/mesa/drivers/dri/i965/brw_fs.h   |   1 -
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 153 +++
 2 files changed, 95 insertions(+), 59 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 25197cd..b95e2c0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -589,7 +589,6 @@ public:
void nir_emit_texture(nir_tex_instr *instr);
void nir_emit_jump(nir_jump_instr *instr);
fs_reg get_nir_src(nir_src src);
-   fs_reg get_nir_alu_src(nir_alu_instr *instr, unsigned src);
fs_reg get_nir_dest(nir_dest dest);
void emit_percomp(fs_inst *inst, unsigned wr_mask);
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index a722660..d3916d6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -138,8 +138,6 @@ fs_visitor::emit_nir_code()
 
nir_convert_from_ssa(nir);
nir_validate_shader(nir);
-   nir_lower_vec_to_movs(nir);
-   nir_validate_shader(nir);
 
/* emit the arrays used for inputs and outputs - load/store intrinsics will
 * be converted to reads/writes of these arrays
@@ -417,6 +415,7 @@ fs_visitor::nir_emit_impl(nir_function_impl *impl)
 void
 fs_visitor::nir_emit_cf_list(exec_list *list)
 {
+   exec_list_validate(list);
foreach_list_typed(nir_cf_node, node, node, list) {
   switch (node->type) {
   case nir_cf_node_if:
@@ -538,34 +537,117 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
 {
struct brw_wm_prog_key *fs_key = (struct brw_wm_prog_key *) this->key;
 
-   fs_reg op[3];
fs_reg result = get_nir_dest(instr->dest.dest);
result.type = brw_type_for_nir_type(nir_op_infos[instr->op].output_type);
 
-   for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++)
-  op[i] = get_nir_alu_src(instr, i);
+   fs_reg op[4];
+   for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++) {
+  op[i] = get_nir_src(instr->src[i].src);
+  op[i].type = brw_type_for_nir_type(nir_op_infos[instr->op].input_types[i]);
+  op[i].abs = instr->src[i].abs;
+  op[i].negate = instr->src[i].negate;
+   }
+
+   /* We get a bunch of mov's out of the from_ssa pass and they may still
+* be vectorized.  We'll handle them as a special-case.  We'll also
+* handle vecN here because it's basically the same thing.
+*/
+   bool need_extra_copy = false;
+   switch (instr->op) {
+   case nir_op_vec4:
+  if (!instr->src[3].src.is_ssa &&
+  instr->dest.dest.reg.reg == instr->src[3].src.reg.reg)
+ need_extra_copy = true;
+  /* fall through */
+   case nir_op_vec3:
+  if (!instr->src[2].src.is_ssa &&
+  instr->dest.dest.reg.reg == instr->src[2].src.reg.reg)
+ need_extra_copy = true;
+  /* fall through */
+   case nir_op_vec2:
+  if (!instr->src[1].src.is_ssa &&
+  instr->dest.dest.reg.reg == instr->src[1].src.reg.reg)
+ need_extra_copy = true;
+  /* fall through */
+   case nir_op_imov:
+   case nir_op_fmov: {
+  if (!instr->src[0].src.is_ssa &&
+  instr->dest.dest.reg.reg == instr->src[0].src.reg.reg)
+ need_extra_copy = true;
+
+  fs_reg temp;
+  if (need_extra_copy) {
+ temp = retype(vgrf(4), result.type);
+  } else {
+ temp = result;
+  }
+
+  if (instr->op == nir_op_imov || instr->op == nir_op_fmov) {
+ for (unsigned i = 0; i < 4; i++) {
+if (!(instr->dest.write_mask & (1 << i)))
+   continue;
+
+emit(MOV(offset(temp, i),
+ offset(op[0], instr->src[0].swizzle[i])))
+->saturate = instr->dest.saturate;
+ }
+  } else {
+ for (unsigned i = 0; i < 4; i++) {
+if (!(instr->dest.write_mask & (1 << i)))
+   continue;
+
+emit(MOV(offset(temp, i),
+ offset(op[i], instr->src[i].swizzle[0])))
+->saturate = instr->dest.saturate;
+ }
+  }
+
+  /* In this case the source and destination registers were the same,
+   * so we need to insert an extra set of moves in order to deal with
+   * any swizzling.
+   */
+  if (need_extra_copy) {
+ for (unsigned i = 0; i < 4; i++) {
+if (!(instr->dest.write_mask & (1 << i)))
+   continue;
+
+emit(MOV(offset(result, i), offset(temp, i)));
+ }
+  }
+  return;
+   }
+   default:
+  brea

[Mesa-dev] [PATCH v2 0/3] i965/fs: Use the NIR scalarizer

2015-01-23 Thread Jason Ekstrand
This is a second version of my scalarizing series.  It uses the scalarizing
pass pushed by Eric Anholt earlier today instead of the one used in the
previous series.  Also, after this series, we no longer use the vec_to_movs
pass.

Jason Ekstrand (3):
  nir: Add a pass to lower vector phi nodes to scalar phi nodes
  i965/fs: Use NIR's scalarizing abilities and stop handling vectors
  i965/fs_nir: Get rid of get_alu_src

 src/glsl/Makefile.sources|   1 +
 src/glsl/nir/nir.h   |   2 +
 src/glsl/nir/nir_lower_phis_to_scalar.c  | 238 
 src/mesa/drivers/dri/i965/brw_fs.h   |  16 -
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 628 ---
 5 files changed, 486 insertions(+), 399 deletions(-)
 create mode 100644 src/glsl/nir/nir_lower_phis_to_scalar.c

-- 
2.2.1



[Mesa-dev] [PATCH 1/3] nir/search: Use nir_alu_src_copy

2015-01-23 Thread Jason Ekstrand
Before we were doing this confusing thing where we copied structures and
then did a nir_src_copy in case there was an indirect.  Now, we just call
the new nir_alu_src_copy function instead.
---
 src/glsl/nir/nir_search.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/nir/nir_search.c b/src/glsl/nir/nir_search.c
index 7ef22e8..18e0330 100644
--- a/src/glsl/nir/nir_search.c
+++ b/src/glsl/nir/nir_search.c
@@ -233,8 +233,8 @@ construct_value(const nir_search_value *value, nir_alu_type type,
   const nir_search_variable *var = nir_search_value_as_variable(value);
   assert(state->variables_seen & (1 << var->variable));
 
-  nir_alu_src val = state->variables[var->variable];
-  val.src = nir_src_copy(val.src, mem_ctx);
+  nir_alu_src val;
+  nir_alu_src_copy(&val, &state->variables[var->variable], mem_ctx);
 
   return val;
}
-- 
2.2.1



[Mesa-dev] [PATCH 2/3] nir/search: Add support for matching unknown constants

2015-01-23 Thread Jason Ekstrand
There are some algebraic transformations that we want to do but only if
certain things are constants.  For instance, we may want to replace
a * (b + c) with (a * b) + (a * c) as long as a and either b or c is constant.
While this generates more instructions, some of it will get constant
folded.  This commit allows you to match an arbitrary constant value by
adding a "#" on the front of a variable name.
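
As a rough illustration of the naming convention, the following standalone Python sketch separates the "#" marker from the variable name and applies the resulting restriction when binding an operand. `parse_variable` and `matches` are hypothetical helpers for this example, not functions in nir_algebraic.py or nir_search.c.

```python
def parse_variable(name):
    """Split a search-variable name into (bare_name, must_be_constant)."""
    if name.startswith('#'):
        return name[1:], True
    return name, False


def matches(name, operand_is_constant):
    """Return True if an operand may bind to the pattern variable `name`.

    A '#'-prefixed variable only accepts constants; a plain variable
    accepts anything.
    """
    _, must_be_constant = parse_variable(name)
    return operand_is_constant or not must_be_constant


print(parse_variable('#b'))
print(matches('#b', operand_is_constant=False))
print(matches('a', operand_is_constant=False))
```

With this convention a pattern such as ('fmul', 'a', ('fadd', '#b', 'c')) would only fire when the operand bound to b is a constant.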
---
 src/glsl/nir/nir_algebraic.py | 8 
 src/glsl/nir/nir_search.c | 6 ++
 src/glsl/nir/nir_search.h | 7 +++
 3 files changed, 21 insertions(+)

diff --git a/src/glsl/nir/nir_algebraic.py b/src/glsl/nir/nir_algebraic.py
index f9b246d..5afd53d 100644
--- a/src/glsl/nir/nir_algebraic.py
+++ b/src/glsl/nir/nir_algebraic.py
@@ -60,6 +60,7 @@ static const ${val.c_type} ${val.name} = {
{ ${hex(val)} /* ${val.value} */ },
 % elif isinstance(val, Variable):
${val.index}, /* ${val.var_name} */
+   ${'true' if val.is_constant else 'false'},
 % elif isinstance(val, Expression):
nir_op_${val.opcode},
{ ${', '.join(src.c_ptr for src in val.sources)} },
@@ -109,6 +110,13 @@ class Constant(Value):
 class Variable(Value):
def __init__(self, val, name, varset):
   Value.__init__(self, name, "variable")
+
+  if val.startswith('#'):
+ val = val[1:]
+ self.is_constant = True
+  else:
+ self.is_constant = False
+
   self.var_name = val
   self.index = varset[val]
   self.name = name
diff --git a/src/glsl/nir/nir_search.c b/src/glsl/nir/nir_search.c
index 18e0330..ec89817 100644
--- a/src/glsl/nir/nir_search.c
+++ b/src/glsl/nir/nir_search.c
@@ -78,6 +78,10 @@ match_value(const nir_search_value *value, nir_alu_instr *instr, unsigned src,
 
  return true;
   } else {
+ if (var->is_constant &&
+ instr->src[src].src.ssa->parent_instr->type != nir_instr_type_load_const)
+return false;
+
  state->variables_seen |= (1 << var->variable);
  state->variables[var->variable].src = instr->src[src].src;
  state->variables[var->variable].abs = false;
@@ -236,6 +240,8 @@ construct_value(const nir_search_value *value, nir_alu_type type,
   nir_alu_src val;
   nir_alu_src_copy(&val, &state->variables[var->variable], mem_ctx);
 
+  assert(!var->is_constant);
+
   return val;
}
 
diff --git a/src/glsl/nir/nir_search.h b/src/glsl/nir/nir_search.h
index 8ec58b0..18aa28d 100644
--- a/src/glsl/nir/nir_search.h
+++ b/src/glsl/nir/nir_search.h
@@ -47,6 +47,13 @@ typedef struct {
 
/** The variable index;  Must be less than NIR_SEARCH_MAX_VARIABLES */
unsigned variable;
+
+   /** Indicates that the given variable must be a constant
+*
+* This is only allowed in search expressions and indicates that the
+* given variable is only allowed to match constant values.
+*/
+   bool is_constant;
 } nir_search_variable;
 
 typedef struct {
-- 
2.2.1
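
For readers skimming the patch, here is a minimal Python sketch of the "#" convention it introduces. It is not the code from nir_algebraic.py: the plain dict used for varset, the setdefault() index assignment and the matches() helper are assumptions made for illustration, and the real matcher is the C function match_value() in nir_search.c.

```python
# Hypothetical, simplified stand-in for the Variable class in
# nir_algebraic.py; the real class also emits a C structure.
class Variable:
    def __init__(self, val, varset):
        # A leading '#' marks a variable that may only match constants.
        if val.startswith('#'):
            val = val[1:]
            self.is_constant = True
        else:
            self.is_constant = False
        self.var_name = val
        # Assumption: varset is a plain dict here. The same name, with or
        # without '#', maps to the same index.
        self.index = varset.setdefault(val, len(varset))


def matches(var, source_is_load_const):
    """Model of the check added to match_value(): a '#' variable rejects
    any source that does not come from a load_const instruction."""
    if var.is_constant and not source_is_load_const:
        return False
    return True


varset = {}
a = Variable('a', varset)
b = Variable('#b', varset)
b_again = Variable('b', varset)
```

In this sketch '#b' and 'b' share one index, which is why a replacement expression can refer to plain 'b' after the search expression has bound '#b'.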



[Mesa-dev] [PATCH 3/3] nir: Add some more optimizations for handling constants in bcsel

2015-01-23 Thread Jason Ekstrand
shader-db results based on my scalarizing patches:

total instructions in shared programs: 6077319 -> 6076895 (-0.01%)
instructions in affected programs: 63509 -> 63085 (-0.67%)
helped: 306
HURT:   0
GAINED: 0
LOST:   0
---
 src/glsl/nir/nir_opt_algebraic.py | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py
index 9c62b28..481ef58 100644
--- a/src/glsl/nir/nir_opt_algebraic.py
+++ b/src/glsl/nir/nir_opt_algebraic.py
@@ -116,4 +116,17 @@ optimizations = [
(('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
 ]
 
+# Add optimizations to handle the case where the result of a ternary is
+# compared to a constant.  This way we can take things like
+for op in ['flt', 'fge', 'feq', 'fne',
+   'ilt', 'ige', 'ieq', 'ine', 'ult', 'uge']:
+   optimizations += [
+  ((op, ('bcsel', 'a', '#b', '#c'), '#d'),
+   ('bcsel', 'a', (op, 'b', 'd'), (op, 'c', 'd'))),
+  ((op, '#d', ('bcsel', a, '#b', '#c')),
+   ('bcsel', 'a', (op, 'd', 'b'), (op, 'd', 'c'))),
+  (('bcsel', (op, 'a', 'b'), True, False), (op, 'a', 'b')),
+  (('bcsel', (op, 'a', 'b'), False, True), ('inot', (op, 'a', 'b'))),
+   ]
+
 print nir_algebraic.AlgebraicPass("nir_opt_algebraic", optimizations).render()
-- 
2.2.1
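
The four rewrite rules in this patch can be sanity-checked with a small Python model. This is only a sketch under simplifying assumptions: OPS, bcsel() and check_rewrites() are hypothetical names, the comparisons are modelled on Python integers and booleans rather than NIR SSA values, and inot is modelled as logical not.

```python
import operator

# Hypothetical scalar models of the comparison opcodes; the float, int and
# unsigned variants in NIR differ in operand types, which is ignored here.
OPS = {
    'lt': operator.lt,
    'ge': operator.ge,
    'eq': operator.eq,
    'ne': operator.ne,
}


def bcsel(cond, x, y):
    """Model of bcsel: select x when cond is true, otherwise y."""
    return x if cond else y


def check_rewrites(values=(-2, 0, 1, 3)):
    """Exhaustively compare both sides of each rule over a few values."""
    for op in OPS.values():
        for a in (True, False):
            for b in values:
                for c in values:
                    for d in values:
                        # (op, bcsel(a, #b, #c), #d)
                        #   -> bcsel(a, op(b, d), op(c, d))
                        assert (op(bcsel(a, b, c), d)
                                == bcsel(a, op(b, d), op(c, d)))
                        # (op, #d, bcsel(a, #b, #c))
                        #   -> bcsel(a, op(d, b), op(d, c))
                        assert (op(d, bcsel(a, b, c))
                                == bcsel(a, op(d, b), op(d, c)))
        for x in values:
            for y in values:
                # bcsel(op(a, b), True, False) -> op(a, b)
                assert bcsel(op(x, y), True, False) == op(x, y)
                # bcsel(op(a, b), False, True) -> inot(op(a, b))
                assert bcsel(op(x, y), False, True) == (not op(x, y))
    return True
```

The first two rules push the comparison into both arms of the select. Because b, c and d are required to be constants, the two new comparisons can then be constant folded.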



Re: [Mesa-dev] [PATCH v4 2/4] nir: add new constant folding infrastructure

2015-01-23 Thread Connor Abbott
Other than the one comment fix below,

Reviewed-by: Connor Abbott 

On Fri, Jan 23, 2015 at 7:17 PM, Jason Ekstrand  wrote:
> Add a required field to the Opcode class, const_expr, that contains an
> expression or statement that computes the result of the opcode given known
> constant inputs. Then take those const_expr's and expand them into a function
> that takes an opcode and an array of constant inputs and spits out the constant
> result. This means that when adding opcodes, there's one less place to update,
> and almost all the opcodes are self-documenting since the information on how to
> compute the result is right next to the definition.
>
> The helper functions in nir_constant_expressions.c were taken from
> ir_constant_expressions.cpp.
>
> v3 Jason Ekstrand 

Might want to fix your email address here and a few lines below.

>  - Use mako to generate one function per opcode instead of doing piles of
>    string splicing
>
> v4 Jason Ekstrand 
>  - More comments and better indentation in the mako
>  - Add a description of the constant expression language in nir_opcodes.py
>  - Added nir_constant_expressions.py to EXTRA_DIST in Makefile.am
>
> Signed-off-by: Jason Ekstrand 
> ---
>  src/glsl/Makefile.am |   6 +
>  src/glsl/Makefile.sources|   1 +
>  src/glsl/nir/.gitignore  |   1 +
>  src/glsl/nir/nir_constant_expressions.h  |  31 ++
>  src/glsl/nir/nir_constant_expressions.py | 351 +++
>  src/glsl/nir/nir_opcodes.py  | 580 +--
>  6 files changed, 786 insertions(+), 184 deletions(-)
>  create mode 100644 src/glsl/nir/nir_constant_expressions.h
>  create mode 100644 src/glsl/nir/nir_constant_expressions.py
>
> diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
> index bbaffbe..8c6c8b9 100644
> --- a/src/glsl/Makefile.am
> +++ b/src/glsl/Makefile.am
> @@ -37,6 +37,7 @@ EXTRA_DIST = tests glcpp/tests README TODO glcpp/README \
> glsl_parser.yy  \
> glcpp/glcpp-lex.l   \
> glcpp/glcpp-parse.y \
> +   nir/nir_constant_expressions.py \
> nir/nir_opcodes.py  \
> nir/nir_opcodes_c.py\
> nir/nir_opcodes_h.py\
> @@ -220,6 +221,7 @@ BUILT_SOURCES = \
> glsl_lexer.cpp  \
> glcpp/glcpp-parse.c \
> glcpp/glcpp-lex.c   \
> +   nir/nir_constant_expressions.c  \
> nir/nir_opcodes.c   \
> nir/nir_opcodes.h   \
> nir/nir_opt_algebraic.c
> @@ -235,6 +237,10 @@ dist-hook:
> $(RM) glcpp/tests/*.out
> $(RM) glcpp/tests/subtest*/*.out
>
> +nir/nir_constant_expressions.c: nir/nir_opcodes.py nir/nir_constant_expressions.py nir/nir_constant_expressions.h
> +   $(MKDIR_P) nir; \
> +   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_constant_expressions.py > $@
> +
>  nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py
> $(MKDIR_P) nir; \
> $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opcodes_h.py > $@
> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index dc1c55d..dd76c44 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -14,6 +14,7 @@ LIBGLCPP_GENERATED_FILES = \
> $(GLSL_BUILDDIR)/glcpp/glcpp-parse.c
>
>  NIR_GENERATED_FILES = \
> +   $(GLSL_BUILDDIR)/nir/nir_constant_expressions.c \
> $(GLSL_BUILDDIR)/nir/nir_opcodes.c \
> $(GLSL_BUILDDIR)/nir/nir_opcodes.h \
> $(GLSL_BUILDDIR)/nir/nir_opt_algebraic.c
> diff --git a/src/glsl/nir/.gitignore b/src/glsl/nir/.gitignore
> index 4c28193..261f64f 100644
> --- a/src/glsl/nir/.gitignore
> +++ b/src/glsl/nir/.gitignore
> @@ -1,3 +1,4 @@
>  nir_opt_algebraic.c
>  nir_opcodes.c
>  nir_opcodes.h
> +nir_constant_expressions.c
> diff --git a/src/glsl/nir/nir_constant_expressions.h b/src/glsl/nir/nir_constant_expressions.h
> new file mode 100644
> index 000..97997f2
> --- /dev/null
> +++ b/src/glsl/nir/nir_constant_expressions.h
> @@ -0,0 +1,31 @@
> +/*
> + * Copyright © 2014 Connor Abbott
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the f

Re: [Mesa-dev] [PATCH v4 2/4] nir: add new constant folding infrastructure

2015-01-23 Thread Jason Ekstrand
On Jan 23, 2015 10:37 PM, "Connor Abbott"  wrote:
>
> Other than the one comment fix below,
>
> Reviewed-by: Connor Abbott 
>
> On Fri, Jan 23, 2015 at 7:17 PM, Jason Ekstrand wrote:
> > Add a required field to the Opcode class, const_expr, that contains an
> > expression or statement that computes the result of the opcode given known
> > constant inputs. Then take those const_expr's and expand them into a function
> > that takes an opcode and an array of constant inputs and spits out the constant
> > result. This means that when adding opcodes, there's one less place to update,
> > and almost all the opcodes are self-documenting since the information on how to
> > compute the result is right next to the definition.
> >
> > The helper functions in nir_constant_expressions.c were taken from
> > ir_constant_expressions.cpp.
> >
> > v3 Jason Ekstrand 
>
> Might want to fix your email address here and a few lines below.

Oops.  I'll fix that.

>
> >  - Use mako to generate one function per opcode instead of doing piles of
> >    string splicing
> >
> > v4 Jason Ekstrand 
> >  - More comments and better indentation in the mako
> >  - Add a description of the constant expression language in nir_opcodes.py
> >  - Added nir_constant_expressions.py to EXTRA_DIST in Makefile.am
> >
> > Signed-off-by: Jason Ekstrand 
> > ---
> >  src/glsl/Makefile.am |   6 +
> >  src/glsl/Makefile.sources|   1 +
> >  src/glsl/nir/.gitignore  |   1 +
> >  src/glsl/nir/nir_constant_expressions.h  |  31 ++
> >  src/glsl/nir/nir_constant_expressions.py | 351 +++
> >  src/glsl/nir/nir_opcodes.py  | 580 +--
> >  6 files changed, 786 insertions(+), 184 deletions(-)
> >  create mode 100644 src/glsl/nir/nir_constant_expressions.h
> >  create mode 100644 src/glsl/nir/nir_constant_expressions.py
> >
> > diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
> > index bbaffbe..8c6c8b9 100644
> > --- a/src/glsl/Makefile.am
> > +++ b/src/glsl/Makefile.am
> > @@ -37,6 +37,7 @@ EXTRA_DIST = tests glcpp/tests README TODO glcpp/README \
> > glsl_parser.yy  \
> > glcpp/glcpp-lex.l   \
> > glcpp/glcpp-parse.y \
> > +   nir/nir_constant_expressions.py \
> > nir/nir_opcodes.py  \
> > nir/nir_opcodes_c.py\
> > nir/nir_opcodes_h.py\
> > @@ -220,6 +221,7 @@ BUILT_SOURCES = \
> > glsl_lexer.cpp  \
> > glcpp/glcpp-parse.c \
> > glcpp/glcpp-lex.c   \
> > +   nir/nir_constant_expressions.c  \
> > nir/nir_opcodes.c   \
> > nir/nir_opcodes.h   \
> > nir/nir_opt_algebraic.c
> > @@ -235,6 +237,10 @@ dist-hook:
> > $(RM) glcpp/tests/*.out
> > $(RM) glcpp/tests/subtest*/*.out
> >
> > +nir/nir_constant_expressions.c: nir/nir_opcodes.py nir/nir_constant_expressions.py nir/nir_constant_expressions.h
> > +   $(MKDIR_P) nir; \
> > +   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_constant_expressions.py > $@
> > +
> >  nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py
> > $(MKDIR_P) nir; \
> > $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opcodes_h.py > $@
> > diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> > index dc1c55d..dd76c44 100644
> > --- a/src/glsl/Makefile.sources
> > +++ b/src/glsl/Makefile.sources
> > @@ -14,6 +14,7 @@ LIBGLCPP_GENERATED_FILES = \
> > $(GLSL_BUILDDIR)/glcpp/glcpp-parse.c
> >
> >  NIR_GENERATED_FILES = \
> > +   $(GLSL_BUILDDIR)/nir/nir_constant_expressions.c \
> > $(GLSL_BUILDDIR)/nir/nir_opcodes.c \
> > $(GLSL_BUILDDIR)/nir/nir_opcodes.h \
> > $(GLSL_BUILDDIR)/nir/nir_opt_algebraic.c
> > diff --git a/src/glsl/nir/.gitignore b/src/glsl/nir/.gitignore
> > index 4c28193..261f64f 100644
> > --- a/src/glsl/nir/.gitignore
> > +++ b/src/glsl/nir/.gitignore
> > @@ -1,3 +1,4 @@
> >  nir_opt_algebraic.c
> >  nir_opcodes.c
> >  nir_opcodes.h
> > +nir_constant_expressions.c
> > diff --git a/src/glsl/nir/nir_constant_expressions.h b/src/glsl/nir/nir_constant_expressions.h
> > new file mode 100644
> > index 000..97997f2
> > --- /dev/null
> > +++ b/src/glsl/nir/nir_constant_expressions.h
> > @@ -0,0 +1,31 @@
> > +/*
> > + * Copyright © 2014 Connor Abbott
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>