date:20130404

[Mesa-dev] [Bug 63097] libegl1-mesa-drivers from git (xedgers ppa) can't be installed due to dependencie change on ubuntu raring.

2013-04-04 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=63097

Kenneth Graunke  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |NOTOURBUG

--- Comment #2 from Kenneth Graunke  ---
This is the bug tracker for upstream Mesa development.  We write the software,
but we don't control distribution specific packaging.  You'll need to report
that to the people that run the xorg-edgers ppa.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] gallium/radeonsi: add 2d tiling support for texture

2013-04-04 Thread Michel Dänzer

On Wed, 2013-04-03 at 17:21 -0400, j.gli...@gmail.com wrote:
> From: Jerome Glisse 
> 
> Signed-off-by: Jerome Glisse 

FWIW, we've been using just 'radeonsi:' as the commit log summary
prefix.


> diff --git a/src/gallium/drivers/radeonsi/si_state.c 
> b/src/gallium/drivers/radeonsi/si_state.c
> index ca9e8b4..9483304 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> [...]
> @@ -1707,6 +1656,7 @@ static void si_cb(struct r600_context *rctx, struct 
> si_pm4_state *pm4,
> S_028C74_FORCE_DST_ALPHA_1(desc->swizzle[3] == 
> UTIL_FORMAT_SWIZZLE_1);
>  
> offset += r600_resource_va(rctx->context.screen, 
> state->cbufs[cb]->texture);
> +   offset += rtex->surface.level[0].offset;
> offset >>= 8;
>  
> /* FIXME handle enabling of CB beyond BASE8 which has different 
> offset */

This looks wrong. offset already includes
rtex->surface.level[level].offset at this point.


> @@ -2239,7 +2183,6 @@ static struct pipe_sampler_view 
> *si_create_sampler_view(struct pipe_context *ctx
>   S_008F24_LAST_ARRAY(state->u.tex.last_layer));
> view->state[6] = 0;
> view->state[7] = 0;
> -
> return &view->base;
>  }
>  

Please drop the whitespace-only hunks. 

-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer

Re: [Mesa-dev] [PATCH] radeonsi: Fallback to shader decoding if UVD isn't available

2013-04-04 Thread Michel Dänzer

On Wed, 2013-04-03 at 20:51 +0200, Andreas Boll wrote:
> @@ -733,6 +757,9 @@ struct pipe_screen *radeonsi_screen_create(struct 
> radeon_winsys *ws)
> return NULL;
> }
>  
> +   /* UVD support. */
> +   rscreen->has_uvd = rscreen->info.drm_minor >= 31;

This might need to change depending how the DRM minor bumps pan out
between UVD and SI 2D tiling support. Other than that,

Reviewed-by: Michel Dänzer  

-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer

Re: [Mesa-dev] mesa 9.1.1 for arm

2013-04-04 Thread Michel Dänzer

On Wed, 2013-04-03 at 11:04 -0700, Matt Turner wrote:
> On Wed, Apr 3, 2013 at 5:25 AM, Michel Dänzer  wrote:
> > On Wed, 2013-04-03 at 14:40 +0400, Alexander Khryukin wrote:
> >> Hi all!
> >>
> >> i'm trying to build latest mesa release 9.1.1 on my arm board i.mx6
> >>
> >> and i got
> >>
> >>
> >> .c  -fPIC -DPIC -o .libs/xorg_driver.o
> >> In file included from xorg_driver.c:36:0:
> >> /usr/include/xorg/xf86PciInfo.h:50:2: warning: #warning "xf86PciInfo.h
> >> is deprecated.  For greater compatibility, drivers should include
> >> necessary PCI IDs locally rather than relying on this file from
> >> xorg-server." [-Wcpp]
> >> In file included from xorg_driver.c:41:0:
> >> /usr/include/xorg/fb.h:94:2: error: #error "GLYPHPADBYTES must be 4"
> >
> > Looks like this would need to be reported to the X.org xserver project.
> >
> > Do you really need to build Mesa with --enable-xorg though?
> 
> If I had to guess, he probably thought --enable-xorg made Mesa work
> with Xorg, rather than enabling the Gallium state tracker.

Apparently he just blindly copied the configure arguments from some RPM
spec file.


> It'd be nice to avoid this confusion by having a better name. I've seen
> others make this mistake.

Patches welcome. :) IME it's a losing battle though I'm afraid. 

-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer

Re: [Mesa-dev] [PATCH 1/4] gallium/hud: replace malloc w/ MALLOC

2013-04-04 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek


On Thu, Apr 4, 2013 at 1:39 AM, Brian Paul  wrote:

> To match the FREE() call used later.  Fixes things on Windows.
> ---
>  src/gallium/auxiliary/hud/hud_context.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/src/gallium/auxiliary/hud/hud_context.c
> b/src/gallium/auxiliary/hud/hud_context.c
> index 5511f8e..b417f5d 100644
> --- a/src/gallium/auxiliary/hud/hud_context.c
> +++ b/src/gallium/auxiliary/hud/hud_context.c
> @@ -621,7 +621,7 @@ hud_pane_add_graph(struct hud_pane *pane, struct
> hud_graph *gr)
> }
>
> assert(pane->num_graphs < Elements(colors));
> -   gr->vertices = malloc(pane->max_num_vertices * sizeof(float) * 2);
> +   gr->vertices = MALLOC(pane->max_num_vertices * sizeof(float) * 2);
> gr->color[0] = colors[pane->num_graphs][0];
> gr->color[1] = colors[pane->num_graphs][1];
> gr->color[2] = colors[pane->num_graphs][2];
> --
> 1.7.3.4

Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction

2013-04-04 Thread Christoph Bumiller

On 04.04.2013 03:45, Zack Rusin wrote:
> It's part of SM4 (http://goo.gl/4IpeK). It's also fairly
> painful to emulate without branching. Most hardware
> supports it natively and even llvm has a 'select' opcode
> which can handle it without too much hassle.
>
> diff --git a/src/gallium/docs/source/tgsi.rst 
> b/src/gallium/docs/source/tgsi.rst
> index 28308cb..6c5a02b 100644
> --- a/src/gallium/docs/source/tgsi.rst
> +++ b/src/gallium/docs/source/tgsi.rst
> @@ -72,6 +72,17 @@ used.
>  
>dst.w = src.w
>  
> +.. opcode:: MOVC - Conditional move
> +
> +.. math::
> +
> +  dst.x = src0.x ? src1.x : src2.x
> +
> +  dst.y = src0.y ? src1.y : src2.y
> +
> +  dst.z = src0.z ? src1.z : src2.z
> +
> +  dst.w = src0.w ? src1.w : src2.w
>  

I think we already have that:

.. opcode:: UCMP - Integer Conditional Move

.. math::

  dst.x = src0.x ? src1.x : src2.x

  dst.y = src0.y ? src1.y : src2.y

  dst.z = src0.z ? src1.z : src2.z

  dst.w = src0.w ? src1.w : src2.w


No difference apart from the source ordering (the "integer" just implies
that any non-zero value counts as true, i.e. also inf, nan and -0).
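
To illustrate the parenthetical above, here is a minimal C sketch (not Mesa
code) of the difference between testing a 32-bit value as an integer and
testing it as a float. The helper names float_bits, cond_as_integer and
cond_as_float are invented for this illustration, and a 32-bit IEEE 754
float is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the storage of a float as a 32-bit unsigned integer. */
static uint32_t float_bits(float f)
{
   uint32_t u;
   assert(sizeof u == sizeof f);   /* assumption: 32-bit float */
   memcpy(&u, &f, sizeof u);
   return u;
}

/* Integer-style condition: any non-zero bit pattern counts as true. */
static int cond_as_integer(float f)
{
   return float_bits(f) != 0;
}

/* Float-style condition: the value is compared against 0.0f. */
static int cond_as_float(float f)
{
   return f != 0.0f;
}
```

With these helpers, -0.0f is true under the integer test because its sign
bit is set, but false under the float test. inf and NaN are non-zero under
both tests.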

And if you want more conditional ops, in theory we also have
predication, albeit support for that depends on the driver
(PIPE_SHADER_CAP_MAX_PREDS).


Re: [Mesa-dev] [PATCH] freedreno: document debug flag

2013-04-04 Thread Erik Faye-Lund

Ping? Anything wrong with it?


On Tue, Mar 26, 2013 at 2:48 PM, Erik Faye-Lund  wrote:

> Signed-off-by: Erik Faye-Lund 
> ---
>
> Here you go, a version of the patch prepared for git-am
>
>  src/gallium/docs/source/debugging.rst | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/gallium/docs/source/debugging.rst
> b/src/gallium/docs/source/debugging.rst
> index e081cbf..dc308e5 100644
> --- a/src/gallium/docs/source/debugging.rst
> +++ b/src/gallium/docs/source/debugging.rst
> @@ -81,6 +81,10 @@ Debug :ref:`flags` for the llvmpipe driver.
>
>  Number of threads that the llvmpipe driver should use.
>
> +.. envvar:: FD_MESA_DEBUG  (0x0)
> +
> +Debug :ref:`flags` for the freedreno driver.
> +
>
>  .. _flags:
>
> --
> 1.8.1.4
>
>

[Mesa-dev] [Bug 63117] New: OSMesa Gallium Empty Output

2013-04-04 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=63117

  Priority: medium
Bug ID: 63117
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: OSMesa Gallium Empty Output
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: hob...@ohiou.edu
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Mesa core
   Product: Mesa


[Mesa-dev] [Bug 63117] OSMesa Gallium Empty Output

2013-04-04 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=63117

--- Comment #1 from Kevin Hobbs  ---
Since OSMesa switched to the Gallium driver, VTK tests that use OSMesa produce
black output images.

Before the change VTK's LoadOpenGLExtension test had : 

GL_VERSION: 2.1 Mesa 9.2-devel (git-6173cc1)
GL_RENDERER: Mesa OffScreen

in the output, and after the change it had :

GL_VERSION: 2.1 Mesa 9.2.0 (git-f7ef83c)
GL_RENDERER: Gallium 0.4 on llvmpipe (LLVM 3.0, 128 bits)

VTK sets up the OSMesa context like this:

contextId = OSMesaCreateContext(GL_RGBA, NULL);
OSMesaMakeCurrent(contextId, window,  GL_UNSIGNED_BYTE, xsize, ysize);

VTK gets the image data from OSMesa like:

MakeCurrent();
while(glGetError() != GL_NO_ERROR) {};
glReadBuffer(front_or_back);


Re: [Mesa-dev] [PATCH V2] mesa: don't memcmp() off the end of a cache key.

2013-04-04 Thread Paul Berry

On 2 April 2013 01:31, Chris Forbes  wrote:

> Reported-by: `per` in #intel-gfx
>
> The size of the cache key varies, so store the actual size as well as
> the key blob itself, rather than just assuming it's the same as the size
> passed in.
>
> NOTE: This is a candidate for stable branches.
>
> V2: Don't leave silly holes in structure; use unsigned instead of
> GLuint.
>
> Signed-off-by: Chris Forbes 
> ---
>  src/mesa/program/prog_cache.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/program/prog_cache.c b/src/mesa/program/prog_cache.c
> index 47f926b..1041f35 100644
> --- a/src/mesa/program/prog_cache.c
> +++ b/src/mesa/program/prog_cache.c
> @@ -37,6 +37,7 @@
>  struct cache_item
>  {
> GLuint hash;
> +   unsigned keysize;
> void *key;
> struct gl_program *program;
> struct cache_item *next;
> @@ -183,7 +184,10 @@ _mesa_search_program_cache(struct gl_program_cache
> *cache,
>struct cache_item *c;
>
>for (c = cache->items[hash % cache->size]; c; c = c->next) {
> - if (c->hash == hash && memcmp(c->key, key, keysize) == 0) {
> + if (c->hash == hash &&
> +c->keysize == keysize &&
> +memcmp(c->key, key, keysize) == 0) {
> +
>

At the top of this function (_mesa_search_program_cache) there's another
memcmp that needs to be fixed:

   if (cache->last &&
   memcmp(cache->last->key, key, keysize) == 0) {
  return cache->last->program;
   }

needs to change to:

   if (cache->last &&
   cache->last->keysize == keysize &&
   memcmp(cache->last->key, key, keysize) == 0) {
  return cache->last->program;
   }

With that additional fix, this patch is:

Reviewed-by: Paul Berry 
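
A minimal sketch of the comparison being asked for, assuming a cut-down
item type. The struct cache_item_sketch and the helper key_matches are
hypothetical names for illustration, not the actual prog_cache.c code. The
point is that the stored size is checked before memcmp() runs, so memcmp()
never reads past the end of a shorter stored key.

```c
#include <assert.h>
#include <string.h>

/* Cut-down stand-in for a cache item: only the fields needed here. */
struct cache_item_sketch {
   unsigned keysize;   /* size in bytes of the stored key blob */
   const void *key;    /* the stored key blob */
};

/* Return non-zero only if the sizes agree and the bytes are equal.
 * The && short-circuits, so memcmp() runs only when keysize bytes
 * are known to be available in c->key. */
static int key_matches(const struct cache_item_sketch *c,
                       const void *key, unsigned keysize)
{
   assert(c && key);
   return c->keysize == keysize &&
          memcmp(c->key, key, keysize) == 0;
}
```

The same guard applies to both lookup sites: the cache->last shortcut and
the hash bucket walk.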



>  cache->last = c;
>  return c->program;
>   }
> @@ -207,6 +211,7 @@ _mesa_program_cache_insert(struct gl_context *ctx,
>
> c->key = malloc(keysize);
> memcpy(c->key, key, keysize);
> +   c->keysize = keysize;
>
> c->program = program;  /* no refcount change */
>
> @@ -235,6 +240,7 @@ _mesa_shader_cache_insert(struct gl_context *ctx,
>
> c->key = malloc(keysize);
> memcpy(c->key, key, keysize);
> +   c->keysize = keysize;
>
> c->program = (struct gl_program *)program;  /* no refcount change */
>
> --
> 1.8.2

Re: [Mesa-dev] [PATCH 3/3] i965: Use ctx->Stencil._WriteEnabled in DEPTH_STENCIL_STATE.

2013-04-04 Thread Paul Berry

On 2 April 2013 11:11, Kenneth Graunke  wrote:

> This is the same computation as the _WriteEnabled flag, so we may as
> well use it.
>
> Signed-off-by: Kenneth Graunke 
>

This series is:

Reviewed-by: Paul Berry 


> ---
>  src/mesa/drivers/dri/i965/gen6_depthstencil.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen6_depthstencil.c
> b/src/mesa/drivers/dri/i965/gen6_depthstencil.c
> index 4ea517f..940d91f 100644
> --- a/src/mesa/drivers/dri/i965/gen6_depthstencil.c
> +++ b/src/mesa/drivers/dri/i965/gen6_depthstencil.c
> @@ -74,11 +74,7 @@ gen6_upload_depth_stencil_state(struct brw_context *brw)
>  ds->ds1.bf_stencil_test_mask = ctx->Stencil.ValueMask[back];
>}
>
> -  /* Not really sure about this:
> -   */
> -  if (ctx->Stencil.WriteMask[0] ||
> - (ctx->Stencil._TestTwoSide && ctx->Stencil.WriteMask[back]))
> -ds->ds0.stencil_write_enable = 1;
> +  ds->ds0.stencil_write_enable = ctx->Stencil._WriteEnabled;
> }
>
> /* _NEW_DEPTH */
> --
> 1.8.2
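
For reference, the condition removed by this patch can be written as a
small standalone function. This is only a sketch: struct stencil_sketch and
stencil_write_enabled are hypothetical names, index 1 is assumed to stand
for the back face, and the real _WriteEnabled flag is computed elsewhere in
Mesa core.

```c
#include <assert.h>

/* Hypothetical, cut-down stencil state. */
struct stencil_sketch {
   unsigned write_mask[2];   /* [0] = front face, [1] = back face (assumed) */
   int test_two_side;        /* non-zero when two-sided stencil is active */
};

/* Writes are enabled if the front mask is non-zero, or if two-sided
 * stencil is active and the back mask is non-zero. */
static int stencil_write_enabled(const struct stencil_sketch *s)
{
   assert(s);
   return s->write_mask[0] != 0 ||
          (s->test_two_side && s->write_mask[1] != 0);
}
```

According to the commit message, the driver can read the precomputed flag
instead of repeating this test at state upload time.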

Re: [Mesa-dev] [PATCH 2/2] i965: Use a variable for the push constant size in kB.

2013-04-04 Thread Paul Berry

On 2 April 2013 21:11, Kenneth Graunke  wrote:

> This clarifies that the offset of 2 is actually 16 kB / 8kB units.
> It also keys both computations off of a single variable, which should
> make it easier to change in the future.
>
> Signed-off-by: Kenneth Graunke 
>

This series is:

Reviewed-by: Paul Berry 


> ---
>  src/mesa/drivers/dri/i965/gen7_urb.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c
> b/src/mesa/drivers/dri/i965/gen7_urb.c
> index dafe1ad..5ac3885 100644
> --- a/src/mesa/drivers/dri/i965/gen7_urb.c
> +++ b/src/mesa/drivers/dri/i965/gen7_urb.c
> @@ -78,8 +78,9 @@ static void
>  gen7_upload_urb(struct brw_context *brw)
>  {
> struct intel_context *intel = &brw->intel;
> +   const int push_size_kB = 16;
> /* Total space for entries is URB size - 16kB for push constants */
> -   int handle_region_size = (brw->urb.size - 16) * 1024; /* bytes */
> +   int handle_region_size = (brw->urb.size - push_size_kB) * 1024; /*
> bytes */
>
> /* CACHE_NEW_VS_PROG */
> unsigned vs_size = MAX2(brw->vs.prog_data->urb_entry_size, 1);
> @@ -92,7 +93,7 @@ gen7_upload_urb(struct brw_context *brw)
> brw->urb.nr_vs_entries = ROUND_DOWN_TO(nr_vs_entries, 8);
>
> /* URB Starting Addresses are specified in multiples of 8kB. */
> -   brw->urb.vs_start = 2; /* skip over push constants */
> +   brw->urb.vs_start = push_size_kB / 8; /* skip over push constants */
>
> assert(brw->urb.nr_vs_entries % 8 == 0);
> assert(brw->urb.nr_gs_entries % 8 == 0);
> --
> 1.8.1.1
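
The arithmetic behind the two changed lines, as a small sketch. The names
PUSH_SIZE_KB, URB_START_UNIT_KB, handle_region_bytes and vs_start_units are
made up for illustration, and the 128 kB URB size mentioned below is only
an example value, not a claim about any particular hardware.

```c
#include <assert.h>

enum {
   PUSH_SIZE_KB = 16,       /* space reserved for push constants */
   URB_START_UNIT_KB = 8    /* URB start addresses are in 8 kB units */
};

/* Bytes left for URB entries once the push constant space is removed. */
static int handle_region_bytes(int urb_size_kb)
{
   assert(urb_size_kb >= PUSH_SIZE_KB);
   return (urb_size_kb - PUSH_SIZE_KB) * 1024;
}

/* First URB start address after the push constants, in 8 kB units. */
static int vs_start_units(void)
{
   return PUSH_SIZE_KB / URB_START_UNIT_KB;
}
```

With a 16 kB push constant region, vs_start_units() gives 16 / 8 = 2, the
literal the patch replaces. For an example URB size of 128 kB,
handle_region_bytes(128) gives (128 - 16) * 1024 = 114688 bytes.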

[Mesa-dev] [Bug 47607] [advocacy] Make Anomaly Warzone Earth work with Mesa

2013-04-04 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=47607

--- Comment #9 from imamdxl8...@gmail.com ---
fixed with steam version on my Intel GMA 4500,

please update the installer.

Cheers


Re: [Mesa-dev] [PATCH] freedreno: document debug flag

2013-04-04 Thread Brian Paul


On 04/04/2013 05:20 AM, Erik Faye-Lund wrote:
> Ping? Anything wrong with it?

Looks OK to me.  Do you need someone to commit for you?

-Brian

> On Tue, Mar 26, 2013 at 2:48 PM, Erik Faye-Lund <kusmab...@gmail.com> wrote:
>
>> Signed-off-by: Erik Faye-Lund <kusmab...@gmail.com>
>> ---
>>
>> Here you go, a version of the patch prepared for git-am
>>
>>  src/gallium/docs/source/debugging.rst | 4
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/src/gallium/docs/source/debugging.rst
>> b/src/gallium/docs/source/debugging.rst
>> index e081cbf..dc308e5 100644
>> --- a/src/gallium/docs/source/debugging.rst
>> +++ b/src/gallium/docs/source/debugging.rst
>> @@ -81,6 +81,10 @@ Debug :ref:`flags` for the llvmpipe driver.
>>
>>  Number of threads that the llvmpipe driver should use.
>>
>> +.. envvar:: FD_MESA_DEBUG  (0x0)
>> +
>> +Debug :ref:`flags` for the freedreno driver.
>> +
>>
>>  .. _flags:
>>
>> --
>> 1.8.1.4


[Mesa-dev] [Bug 61364] LLVM assertion when starting X11

2013-04-04 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=61364

deimosaff...@yahoo.com changed:

   What|Removed |Added

 CC||deimosaff...@yahoo.com


Re: [Mesa-dev] [PATCH 3/4] st/wgl: make stw_current_context() non-static

2013-04-04 Thread Jose Fonseca



- Original Message -
> ---
>  src/gallium/state_trackers/wgl/stw_context.c |2 +-
>  src/gallium/state_trackers/wgl/stw_context.h |4 +++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/wgl/stw_context.c
> b/src/gallium/state_trackers/wgl/stw_context.c
> index 1488bee..5e5b41f 100644
> --- a/src/gallium/state_trackers/wgl/stw_context.c
> +++ b/src/gallium/state_trackers/wgl/stw_context.c
> @@ -48,7 +48,7 @@
>  #include "stw_tls.h"
>  
>  
> -static INLINE struct stw_context *
> +struct stw_context *
>  stw_current_context(void)
>  {
> struct st_context_iface *st;
> diff --git a/src/gallium/state_trackers/wgl/stw_context.h
> b/src/gallium/state_trackers/wgl/stw_context.h
> index 07a5c7d..2a4afb6 100644
> --- a/src/gallium/state_trackers/wgl/stw_context.h
> +++ b/src/gallium/state_trackers/wgl/stw_context.h
> @@ -28,7 +28,7 @@
>  #ifndef STW_CONTEXT_H
>  #define STW_CONTEXT_H
>  
> -#include 
> +#include "stw_icd.h"

Alternatively, you could pre-declare "struct stw_context;" here.
  
>  struct stw_framebuffer;
>  struct st_context_iface;

Either way,

  Reviewed-by: Jose Fonseca 

Jose
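
A short sketch of the pre-declaration suggested above, using invented names
(struct ctx_sketch, current_ctx_sketch) rather than the real st/wgl types.
An incomplete struct declaration is enough to declare a function that
returns a pointer to that struct, so the header does not need to pull in
another include for it.

```c
#include <assert.h>
#include <stddef.h>

/* What the header would contain: a forward declaration plus prototype.
 * The struct layout is not needed to pass pointers around. */
struct ctx_sketch;
struct ctx_sketch *current_ctx_sketch(void);

/* What the .c file would contain: the full definition. */
struct ctx_sketch {
   int id;
};

static struct ctx_sketch the_ctx = { 7 };

struct ctx_sketch *current_ctx_sketch(void)
{
   return &the_ctx;
}
```

Only code that dereferences the pointer needs to see the full struct
definition.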

Re: [Mesa-dev] [PATCH 2/4] util: add debug_memory_check_block(), debug_memory_tag()

2013-04-04 Thread Jose Fonseca



- Original Message -
> The former just checks that the given block is valid by checking
> the header and footer.
> 
> The latter sets the memory block's tag.  With extra debug code, we
> can use that for monitoring/checking particular allocations.
> ---
>  src/gallium/auxiliary/os/os_memory_debug.h  |6 +++
>  src/gallium/auxiliary/util/u_debug_memory.c |   55
>  +++
>  2 files changed, 61 insertions(+), 0 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/os/os_memory_debug.h
> b/src/gallium/auxiliary/os/os_memory_debug.h
> index 36b8fc6..9a487de 100644
> --- a/src/gallium/auxiliary/os/os_memory_debug.h
> +++ b/src/gallium/auxiliary/os/os_memory_debug.h
> @@ -60,6 +60,12 @@ void *
>  debug_realloc(const char *file, unsigned line, const char *function,
>void *old_ptr, size_t old_size, size_t new_size );
>  
> +void
> +debug_memory_tag(void *ptr, unsigned tag);
> +
> +void
> +debug_memory_check_block(void *ptr);
> +
>  void
>  debug_memory_check(void);
>  
> diff --git a/src/gallium/auxiliary/util/u_debug_memory.c
> b/src/gallium/auxiliary/util/u_debug_memory.c
> index 4bf26a5..4723547 100644
> --- a/src/gallium/auxiliary/util/u_debug_memory.c
> +++ b/src/gallium/auxiliary/util/u_debug_memory.c
> @@ -76,6 +76,7 @@ struct debug_memory_header
>  #endif
>  
> unsigned magic;
> +   unsigned tag;

Long term, I think a "const char * tag" would be handier -- it could be used in 
the debug messages.

>  };
>  
>  struct debug_memory_footer
> @@ -140,6 +141,7 @@ debug_malloc(const char *file, unsigned line, const char
> *function,
> hdr->function = function;
> hdr->size = size;
> hdr->magic = DEBUG_MEMORY_MAGIC;
> +   hdr->tag = 0;
>  #if DEBUG_FREED_MEMORY
> hdr->freed = FALSE;
>  #endif
> @@ -263,6 +265,7 @@ debug_realloc(const char *file, unsigned line, const char
> *function,
> new_hdr->function = old_hdr->function;
> new_hdr->size = new_size;
> new_hdr->magic = DEBUG_MEMORY_MAGIC;
> +   new_hdr->tag = 0;
>  #if DEBUG_FREED_MEMORY
> new_hdr->freed = FALSE;
>  #endif
> @@ -348,6 +351,58 @@ debug_memory_end(unsigned long start_no)
>  
>  
>  /**
> + * Put a tag (arbitrary integer) on a memory block.
> + * Can be useful for debugging.
> + */
> +void
> +debug_memory_tag(void *ptr, unsigned tag)
> +{
> +   struct debug_memory_header *hdr;
> +
> +   if (!ptr)
> +  return;
> +
> +   hdr = header_from_data(ptr);
> +   if (hdr->magic != DEBUG_MEMORY_MAGIC) {
> +  debug_printf("%s corrupted memory at %p\n", __FUNCTION__, ptr);
> +  debug_assert(0);
> +   }
> +
> +   hdr->tag = tag;
> +}
> +
> +
> +/**
> + * Check the given block of memory for validity/corruption.
> + */
> +void
> +debug_memory_check_block(void *ptr)
> +{
> +   struct debug_memory_header *hdr;
> +   struct debug_memory_footer *ftr;
> +
> +   if (!ptr)
> +  return;
> +
> +   hdr = header_from_data(ptr);
> +   ftr = footer_from_header(hdr);
> +
> +   if (hdr->magic != DEBUG_MEMORY_MAGIC) {
> +  debug_printf("%s:%u:%s: bad or corrupted memory %p\n",
> +   hdr->file, hdr->line, hdr->function, ptr);
> +  debug_assert(0);
> +   }
> +
> +   if (ftr->magic != DEBUG_MEMORY_MAGIC) {
> +  debug_printf("%s:%u:%s: buffer overflow %p\n",
> +   hdr->file, hdr->line, hdr->function, ptr);
> +  debug_assert(0);
> +   }
> +}
> +
> +
> +
> +/**
>   * We can periodically call this from elsewhere to do a basic sanity
>   * check of the heap memory we've allocated.
>   */
>


Reviewed-by: Jose Fonseca 
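
The header/footer scheme the patch relies on can be sketched in a few
lines. This is a simplified, hypothetical version and not the code in
u_debug_memory.c: the sketch_* names, the SKETCH_MAGIC value and the
integer return codes are all assumptions made for the illustration, and
the real code also records file, line and function and keeps a list of
live blocks.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAGIC 0x1ee7beefu   /* arbitrary marker value */

struct sketch_header { size_t size; unsigned magic; unsigned tag; };
struct sketch_footer { unsigned magic; };

/* Allocate [header][user data][footer] and return the user data. */
static void *sketch_malloc(size_t size)
{
   struct sketch_footer ftr = { SKETCH_MAGIC };
   struct sketch_header *hdr =
      malloc(sizeof *hdr + size + sizeof ftr);
   if (!hdr)
      return NULL;
   hdr->size = size;
   hdr->magic = SKETCH_MAGIC;
   hdr->tag = 0;
   /* The footer may be unaligned, so copy it byte-wise. */
   memcpy((char *)(hdr + 1) + size, &ftr, sizeof ftr);
   return hdr + 1;
}

static struct sketch_header *sketch_header_from_data(void *ptr)
{
   return (struct sketch_header *)ptr - 1;
}

/* Attach an arbitrary integer tag to a block. */
static void sketch_tag(void *ptr, unsigned tag)
{
   if (ptr)
      sketch_header_from_data(ptr)->tag = tag;
}

/* 0 = block looks fine, 1 = header damaged, 2 = footer damaged
 * (typically a write past the end of the block). */
static int sketch_check_block(void *ptr)
{
   struct sketch_header *hdr;
   struct sketch_footer ftr;
   if (!ptr)
      return 0;
   hdr = sketch_header_from_data(ptr);
   if (hdr->magic != SKETCH_MAGIC)
      return 1;
   memcpy(&ftr, (char *)ptr + hdr->size, sizeof ftr);
   return ftr.magic != SKETCH_MAGIC ? 2 : 0;
}

static void sketch_free(void *ptr)
{
   if (ptr)
      free(sketch_header_from_data(ptr));
}
```

A block obtained from sketch_malloc(8) checks out as 0 until something
writes to byte 8 or beyond, after which sketch_check_block() reports 2.
Pointers from plain malloc() must not be passed to these functions, since
there is no header in front of them.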

Re: [Mesa-dev] [PATCH 3/4] st/wgl: make stw_current_context() non-static

2013-04-04 Thread Brian Paul


On 04/04/2013 08:38 AM, Jose Fonseca wrote:
>
> - Original Message -
>> ---
>>  src/gallium/state_trackers/wgl/stw_context.c |2 +-
>>  src/gallium/state_trackers/wgl/stw_context.h |4 +++-
>>  2 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/state_trackers/wgl/stw_context.c
>> b/src/gallium/state_trackers/wgl/stw_context.c
>> index 1488bee..5e5b41f 100644
>> --- a/src/gallium/state_trackers/wgl/stw_context.c
>> +++ b/src/gallium/state_trackers/wgl/stw_context.c
>> @@ -48,7 +48,7 @@
>>  #include "stw_tls.h"
>>
>>
>> -static INLINE struct stw_context *
>> +struct stw_context *
>>  stw_current_context(void)
>>  {
>> struct st_context_iface *st;
>> diff --git a/src/gallium/state_trackers/wgl/stw_context.h
>> b/src/gallium/state_trackers/wgl/stw_context.h
>> index 07a5c7d..2a4afb6 100644
>> --- a/src/gallium/state_trackers/wgl/stw_context.h
>> +++ b/src/gallium/state_trackers/wgl/stw_context.h
>> @@ -28,7 +28,7 @@
>>  #ifndef STW_CONTEXT_H
>>  #define STW_CONTEXT_H
>>
>> -#include
>> +#include "stw_icd.h"
>
> Alternatively, you could pre-declare "struct stw_context;" here.

Actually, I think I can just omit that hunk.  I think it's a left-over
from something I tried earlier.

>>  struct stw_framebuffer;
>>  struct st_context_iface;
>
> Either way,
>
>   Reviewed-by: Jose Fonseca

Thanks.

-Brian

Re: [Mesa-dev] [PATCH 2/4] util: add debug_memory_check_block(), debug_memory_tag()

2013-04-04 Thread Brian Paul


On 04/04/2013 08:40 AM, Jose Fonseca wrote:
>
> - Original Message -
>> The former just checks that the given block is valid by checking
>> the header and footer.
>>
>> The latter sets the memory block's tag.  With extra debug code, we
>> can use that for monitoring/checking particular allocations.
>> ---
>>  src/gallium/auxiliary/os/os_memory_debug.h  |6 +++
>>  src/gallium/auxiliary/util/u_debug_memory.c |   55
>>  +++
>>  2 files changed, 61 insertions(+), 0 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/os/os_memory_debug.h
>> b/src/gallium/auxiliary/os/os_memory_debug.h
>> index 36b8fc6..9a487de 100644
>> --- a/src/gallium/auxiliary/os/os_memory_debug.h
>> +++ b/src/gallium/auxiliary/os/os_memory_debug.h
>> @@ -60,6 +60,12 @@ void *
>>  debug_realloc(const char *file, unsigned line, const char *function,
>>void *old_ptr, size_t old_size, size_t new_size );
>>
>> +void
>> +debug_memory_tag(void *ptr, unsigned tag);
>> +
>> +void
>> +debug_memory_check_block(void *ptr);
>> +
>>  void
>>  debug_memory_check(void);
>>
>> diff --git a/src/gallium/auxiliary/util/u_debug_memory.c
>> b/src/gallium/auxiliary/util/u_debug_memory.c
>> index 4bf26a5..4723547 100644
>> --- a/src/gallium/auxiliary/util/u_debug_memory.c
>> +++ b/src/gallium/auxiliary/util/u_debug_memory.c
>> @@ -76,6 +76,7 @@ struct debug_memory_header
>>  #endif
>>
>> unsigned magic;
>> +   unsigned tag;
>
> Long term, I think a "const char * tag" would be handier -- it could be used in
> the debug messages.

Yeah.  In the case I was investigating, it was just easier to hack in
code like "if (tag==42)" vs. "if (strcmp(tag, "foo")==0)".

-Brian

Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction

2013-04-04 Thread Zack Rusin

> On 04.04.2013 03:45, Zack Rusin wrote:
> > It's part of SM4 (http://goo.gl/4IpeK). It's also fairly
> > painful to emulate without branching. Most hardware
> > supports it natively and even llvm has a 'select' opcode
> > which can handle it without too much hassle.
> >
> > diff --git a/src/gallium/docs/source/tgsi.rst
> > b/src/gallium/docs/source/tgsi.rst
> > index 28308cb..6c5a02b 100644
> > --- a/src/gallium/docs/source/tgsi.rst
> > +++ b/src/gallium/docs/source/tgsi.rst
> > @@ -72,6 +72,17 @@ used.
> >  
> >dst.w = src.w
> >  
> > +.. opcode:: MOVC - Conditional move
> > +
> > +.. math::
> > +
> > +  dst.x = src0.x ? src1.x : src2.x
> > +
> > +  dst.y = src0.y ? src1.y : src2.y
> > +
> > +  dst.z = src0.z ? src1.z : src2.z
> > +
> > +  dst.w = src0.w ? src1.w : src2.w
> >  
> 
> I think we already have that:
> 
> .. opcode:: UCMP - Integer Conditional Move
> 
> .. math::
> 
>   dst.x = src0.x ? src1.x : src2.x
> 
>   dst.y = src0.y ? src1.y : src2.y
> 
>   dst.z = src0.z ? src1.z : src2.z
> 
>   dst.w = src0.w ? src1.w : src2.w
> 
> 
> No difference apart from the source ordering (the "integer" just implies
> that any non-zero value counts as true, i.e. also inf, nan and -0).

That's really broken. UCMP needs to be an unsigned version of the CMP
instruction which does
dst.chan = (src0.chan < 0) ? src1.chan : src2.chan
not a whole new instruction. It's what everyone implements anyway. So if 
st_glsl_to_tgsi needs
a conditional move we need to add the above patch and change it to use it.
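
As a rough illustration of the distinction drawn above, here is a C sketch
of the two per-channel semantics on 4-component vectors. The function names
cmp4 and movc4 are invented for this example. Treating the MOVC condition
as a float compared against zero is an assumption; the proposed
documentation only says "src0 ? src1 : src2".

```c
#include <assert.h>

/* CMP as described above: select src1 where src0 is negative. */
static void cmp4(float dst[4], const float src0[4],
                 const float src1[4], const float src2[4])
{
   int i;
   assert(dst && src0 && src1 && src2);
   for (i = 0; i < 4; i++)
      dst[i] = (src0[i] < 0.0f) ? src1[i] : src2[i];
}

/* MOVC as proposed: select src1 where src0 is non-zero.
 * Assumption: the condition is tested as a float value. */
static void movc4(float dst[4], const float src0[4],
                  const float src1[4], const float src2[4])
{
   int i;
   assert(dst && src0 && src1 && src2);
   for (i = 0; i < 4; i++)
      dst[i] = (src0[i] != 0.0f) ? src1[i] : src2[i];
}
```

The two differ exactly on positive conditions: for src0 = 2.0, cmp4 picks
src2 while movc4 picks src1.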

> And if you want more conditional ops, in theory we also have
> predication, albeit support for that depends on the driver
> (PIPE_SHADER_CAP_MAX_PREDS).

No, that's a completely different thing. 

z

Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction

2013-04-04 Thread Jose Fonseca



- Original Message -
> > On 04.04.2013 03:45, Zack Rusin wrote:
> > > It's part of SM4 (http://goo.gl/4IpeK). It's also fairly
> > > painful to emulate without branching. Most hardware
> > > supports it natively and even llvm has a 'select' opcode
> > > which can handle it without too much hassle.
> > >
> > > diff --git a/src/gallium/docs/source/tgsi.rst
> > > b/src/gallium/docs/source/tgsi.rst
> > > index 28308cb..6c5a02b 100644
> > > --- a/src/gallium/docs/source/tgsi.rst
> > > +++ b/src/gallium/docs/source/tgsi.rst
> > > @@ -72,6 +72,17 @@ used.
> > >  
> > >dst.w = src.w
> > >  
> > > +.. opcode:: MOVC - Conditional move
> > > +
> > > +.. math::
> > > +
> > > +  dst.x = src0.x ? src1.x : src2.x
> > > +
> > > +  dst.y = src0.y ? src1.y : src2.y
> > > +
> > > +  dst.z = src0.z ? src1.z : src2.z
> > > +
> > > +  dst.w = src0.w ? src1.w : src2.w
> > >  
> > 
> > I think we already have that:
> > 
> > .. opcode:: UCMP - Integer Conditional Move
> > 
> > .. math::
> > 
> >   dst.x = src0.x ? src1.x : src2.x
> > 
> >   dst.y = src0.y ? src1.y : src2.y
> > 
> >   dst.z = src0.z ? src1.z : src2.z
> > 
> >   dst.w = src0.w ? src1.w : src2.w
> > 
> > 
> > No difference apart from the source ordering (the "integer" just implies
> > that any non-zero value counts as true, i.e. also inf, nan and -0).
> 
> > That's really broken. UCMP needs to be an unsigned version of the CMP
> instruction which does
> dst.chan = (src0.chan < 0) ? src1.chan : src2.chan
> not a whole new instruction. It's what everyone implements anyway. So if
> st_glsl_to_tgsi needs
> a conditional move we need to add the above patch and change it to use it.

Yes, it doesn't seem that any of the TGSI_OPCODE_UCMP implementations does what
the spec says it supposedly does -- it seems everybody implements it as an
unsigned version of CMP. That is, it seems UCMP's description needs to be fixed.

Jose


Re: [Mesa-dev] [PATCH] freedreno: document debug flag

2013-04-04 Thread Erik Faye-Lund

On Thu, Apr 4, 2013 at 4:32 PM, Brian Paul  wrote:
>
> On 04/04/2013 05:20 AM, Erik Faye-Lund wrote:
>>
>> Ping? Anything wrong with it?
>
>
> Looks OK to me.  Do you need someone to commit for you?
>

Well, I don't have commit access. But I was expecting to see rob pick
it up and push it out the next time he had some updates, but I can't
see it in his repo either.

Either would be fine by me.

Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction

2013-04-04 Thread Christoph Bumiller

On 04.04.2013 16:53, Zack Rusin wrote:
>> On 04.04.2013 03:45, Zack Rusin wrote:
>>> It's part of SM4 (http://goo.gl/4IpeK). It's also fairly
>>> painful to emulate without branching. Most hardware
>>> supports it natively and even llvm has a 'select' opcode
>>> which can handle it without too much hassle.
>>>
>>> diff --git a/src/gallium/docs/source/tgsi.rst
>>> b/src/gallium/docs/source/tgsi.rst
>>> index 28308cb..6c5a02b 100644
>>> --- a/src/gallium/docs/source/tgsi.rst
>>> +++ b/src/gallium/docs/source/tgsi.rst
>>> @@ -72,6 +72,17 @@ used.
>>>  
>>>dst.w = src.w
>>>  
>>> +.. opcode:: MOVC - Conditional move
>>> +
>>> +.. math::
>>> +
>>> +  dst.x = src0.x ? src1.x : src2.x
>>> +
>>> +  dst.y = src0.y ? src1.y : src2.y
>>> +
>>> +  dst.z = src0.z ? src1.z : src2.z
>>> +
>>> +  dst.w = src0.w ? src1.w : src2.w
>>>  
>> I think we already have that:
>>
>> .. opcode:: UCMP - Integer Conditional Move
>>
>> .. math::
>>
>>   dst.x = src0.x ? src1.x : src2.x
>>
>>   dst.y = src0.y ? src1.y : src2.y
>>
>>   dst.z = src0.z ? src1.z : src2.z
>>
>>   dst.w = src0.w ? src1.w : src2.w
>>
>>
>> No difference apart from the source ordering (the "integer" just implies
>> that any non-zero value counts as true, i.e. also inf, nan and -0).
> That's really broken. UCMP needs to be a an unsigned version of the CMP 
> instruction which does
Did you mean a signed version?
Would you mind doing an s/UCMP/ICMP/ in TGSI and then changing all the
UCMPs in other code to MOVC?
You're right, it would make more sense like this, though you might want
to call it IMOVC so the condition register isn't interpreted as a float
... or is it supposed to be?

> dst.chan = (src0.chan < 0) ? src1.chan : src2.chan
> not a whole new instruction. It's what everyone implements anyway. So if 
> st_glsl_to_tgsi needs
> a conditional move we need to add the above patch and change it to use it.
>
>> And if you want more conditional ops, in theory we also have
>> predication, albeit support for that depends on the driver
>> (PIPE_SHADER_CAP_MAX_PREDS).
> No, that's a completely different thing. 
>
> z
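
For readers following the thread, the point about inf, nan and -0 can be sketched in plain C. This is only an illustration, not Mesa code: `float_bits`, `cmp_select` and `ucmp_select` are hypothetical helper names, and the sketch assumes 32-bit IEEE 754 floats. It contrasts a float "less than zero" select (CMP as described above) with a "bits are non-zero" select (UCMP as documented).

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit unsigned integer. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* CMP as described in the thread: pick src1 when the float condition is < 0. */
static float cmp_select(float cond, float src1, float src2)
{
    return (cond < 0.0f) ? src1 : src2;
}

/* UCMP as documented: pick src1 when the condition bits are non-zero. */
static float ucmp_select(float cond, float src1, float src2)
{
    return float_bits(cond) ? src1 : src2;
}

/* Self-check over the special values mentioned in the thread. */
static void check_special_values(void)
{
    /* -0 has bit pattern 0x80000000: non-zero bits, but not < 0. */
    assert(ucmp_select(-0.0f, 1.0f, 2.0f) == 1.0f);
    assert(cmp_select(-0.0f, 1.0f, 2.0f) == 2.0f);

    /* NaN has non-zero bits, and every ordered comparison with it is false. */
    assert(ucmp_select(NAN, 1.0f, 2.0f) == 1.0f);
    assert(cmp_select(NAN, 1.0f, 2.0f) == 2.0f);

    /* +inf has non-zero bits and is not < 0. */
    assert(ucmp_select(INFINITY, 1.0f, 2.0f) == 1.0f);
    assert(cmp_select(INFINITY, 1.0f, 2.0f) == 2.0f);

    /* The two agree on +0 (false) and on ordinary negative values (true). */
    assert(ucmp_select(0.0f, 1.0f, 2.0f) == 2.0f);
    assert(cmp_select(0.0f, 1.0f, 2.0f) == 2.0f);
    assert(ucmp_select(-1.0f, 1.0f, 2.0f) == 1.0f);
    assert(cmp_select(-1.0f, 1.0f, 2.0f) == 1.0f);
}
```

Calling `check_special_values()` from a test program exercises each case; the two selects only differ on values whose bits are non-zero but which are not negative in the float sense.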


Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction

2013-04-04 Thread Christoph Bumiller

On 04.04.2013 17:01, Jose Fonseca wrote:
>
> - Original Message -
>>> On 04.04.2013 03:45, Zack Rusin wrote:
>>>> It's part of SM4 (http://goo.gl/4IpeK). It's also fairly
>>>> painful to emulate without branching. Most hardware
>>>> supports it natively and even llvm has a 'select' opcode
>>>> which can handle it without too much hassle.
>>>>
>>>> diff --git a/src/gallium/docs/source/tgsi.rst
>>>> b/src/gallium/docs/source/tgsi.rst
>>>> index 28308cb..6c5a02b 100644
>>>> --- a/src/gallium/docs/source/tgsi.rst
>>>> +++ b/src/gallium/docs/source/tgsi.rst
>>>> @@ -72,6 +72,17 @@ used.
>>>>  
>>>>    dst.w = src.w
>>>>  
>>>> +.. opcode:: MOVC - Conditional move
>>>> +
>>>> +.. math::
>>>> +
>>>> +  dst.x = src0.x ? src1.x : src2.x
>>>> +
>>>> +  dst.y = src0.y ? src1.y : src2.y
>>>> +
>>>> +  dst.z = src0.z ? src1.z : src2.z
>>>> +
>>>> +  dst.w = src0.w ? src1.w : src2.w
>>>>  
>>> I think we already have that:
>>>
>>> .. opcode:: UCMP - Integer Conditional Move
>>>
>>> .. math::
>>>
>>>   dst.x = src0.x ? src1.x : src2.x
>>>
>>>   dst.y = src0.y ? src1.y : src2.y
>>>
>>>   dst.z = src0.z ? src1.z : src2.z
>>>
>>>   dst.w = src0.w ? src1.w : src2.w
>>>
>>>
>>> No difference apart from the source ordering (the "integer" just implies
>>> that any non-zero value counts as true, i.e. also inf, nan and -0).
>> That's really broken. UCMP needs to be a an unsigned version of the CMP
>> instruction which does
>> dst.chan = (src0.chan < 0) ? src1.chan : src2.chan
>> not a whole new instruction. It's what everyone implements anyway. So if
>> st_glsl_to_tgsi needs
>> a conditional move we need to add the above patch and change it to use it.
> Yes, it doesn't seem that any of the TGSI_OPCODE_UCMP implementation does 
> that the spec says it supposedly does -- it seems everybody implements it as 
> an unsigned version of CMP. That is, it seems UCMP's description needs to be 
> fixed.

Erm, unsigned < 0 doesn't make sense.

Definitely what the description says:
static void
micro_ucmp(union tgsi_exec_channel *dst,
           const union tgsi_exec_channel *src0,
           const union tgsi_exec_channel *src1,
           const union tgsi_exec_channel *src2)
{
   dst->u[0] = src0->u[0] ? src1->u[0] : src2->u[0];
   dst->u[1] = src0->u[1] ? src1->u[1] : src2->u[1];
   dst->u[2] = src0->u[2] ? src1->u[2] : src2->u[2];
   dst->u[3] = src0->u[3] ? src1->u[3] : src2->u[3];
}

or

   case TGSI_OPCODE_UCMP:
   case TGSI_OPCODE_CMP:
      FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
         src0 = fetchSrc(0, c);
         src1 = fetchSrc(1, c);
         src2 = fetchSrc(2, c);
         if (src1 == src2)
            mkMov(dst0[c], src1);
         else
            mkCmp(OP_SLCT, (srcTy == TYPE_F32) ? CC_LT(less than 0) :
                  CC_NE(not equal 0),
                  srcTy, dst0[c], src1, src2, src0);
      }


> Jose
>


Re: [Mesa-dev] [PATCH 2/2] glsl: Add a pass to flip matrix/vector multiplies to use dot products.

2013-04-04 Thread Paul Berry

On 2 April 2013 23:33, Kenneth Graunke  wrote:

> This pass flips (matrix * vector) operations to (vector *
> matrixTranspose) for certain built-in matrices (currently
> gl_ModelViewProjectionMatrix and gl_TextureMatrix).
>
> This is equivalent, but results in dot products rather than multiplies
> and adds.  On some hardware, this is more efficient.
>
> This pass is conditionalized on ctx->mvp_with_dp4, the flag drivers set
> to indicate they prefer dot products.
>
> Improves performance in Lightsmark by 1.01131% +/- 0.162069% (n = 10)
> on a Haswell GT2 system.  Passes Piglit on Ivybridge.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/glsl/Makefile.sources  |   1 +
>  src/glsl/glsl_parser_extras.cpp|   7 +-
>  src/glsl/ir_optimization.h |   4 +-
>  src/glsl/linker.cpp|   2 +-
>  src/glsl/main.cpp  |   2 +-
>  src/glsl/opt_flip_matrices.cpp | 122
> +
>  src/glsl/test_optpass.cpp  |   2 +-
>  src/mesa/drivers/dri/i965/brw_shader.cpp   |   2 +-
>  src/mesa/main/ff_fragment_shader.cpp   |   2 +-
>  src/mesa/program/ir_to_mesa.cpp|   5 +-
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   3 +-
>  11 files changed, 142 insertions(+), 10 deletions(-)
>  create mode 100644 src/glsl/opt_flip_matrices.cpp
>
> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index c294aa4..df1e9d5 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -80,6 +80,7 @@ LIBGLSL_FILES = \
> $(GLSL_SRCDIR)/opt_dead_code.cpp \
> $(GLSL_SRCDIR)/opt_dead_code_local.cpp \
> $(GLSL_SRCDIR)/opt_dead_functions.cpp \
> +   $(GLSL_SRCDIR)/opt_flip_matrices.cpp \
> $(GLSL_SRCDIR)/opt_function_inlining.cpp \
> $(GLSL_SRCDIR)/opt_if_simplification.cpp \
> $(GLSL_SRCDIR)/opt_noop_swizzle.cpp \
> diff --git a/src/glsl/glsl_parser_extras.cpp
> b/src/glsl/glsl_parser_extras.cpp
> index 9740903..e0387b8 100644
> --- a/src/glsl/glsl_parser_extras.cpp
> +++ b/src/glsl/glsl_parser_extras.cpp
> @@ -1206,7 +1206,8 @@ ast_struct_specifier::ast_struct_specifier(const
> char *identifier,
>  bool
>  do_common_optimization(exec_list *ir, bool linked,
>bool uniform_locations_assigned,
> -  unsigned max_unroll_iterations)
> +  unsigned max_unroll_iterations,
> +   bool prefer_dp4)
>  {
> GLboolean progress = GL_FALSE;
>
> @@ -1220,6 +1221,10 @@ do_common_optimization(exec_list *ir, bool linked,
> progress = do_if_simplification(ir) || progress;
> progress = do_copy_propagation(ir) || progress;
> progress = do_copy_propagation_elements(ir) || progress;
> +
> +   if (prefer_dp4 && !linked)
> +  progress = opt_flip_matrices(ir) || progress;
> +
> if (linked)
>progress = do_dead_code(ir, uniform_locations_assigned) || progress;
> else
> diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
> index 2454bbe..b404256 100644
> --- a/src/glsl/ir_optimization.h
> +++ b/src/glsl/ir_optimization.h
> @@ -65,7 +65,8 @@ enum lower_packing_builtins_op {
>
>  bool do_common_optimization(exec_list *ir, bool linked,
> bool uniform_locations_assigned,
> -   unsigned max_unroll_iterations);
> +   unsigned max_unroll_iterations,
> +bool prefer_dp4);
>
>  bool do_algebraic(exec_list *instructions);
>  bool do_constant_folding(exec_list *instructions);
> @@ -78,6 +79,7 @@ bool do_dead_code(exec_list *instructions, bool
> uniform_locations_assigned);
>  bool do_dead_code_local(exec_list *instructions);
>  bool do_dead_code_unlinked(exec_list *instructions);
>  bool do_dead_functions(exec_list *instructions);
> +bool opt_flip_matrices(exec_list *instructions);
>  bool do_function_inlining(exec_list *instructions);
>  bool do_lower_jumps(exec_list *instructions, bool pull_out_jumps = true,
> bool lower_sub_return = true, bool lower_main_return = false, bool
> lower_continue = false, bool lower_break = false);
>  bool do_lower_texture_projection(exec_list *instructions);
> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
> index 2b30d2b..67a6c5a 100644
> --- a/src/glsl/linker.cpp
> +++ b/src/glsl/linker.cpp
> @@ -1767,7 +1767,7 @@ link_shaders(struct gl_context *ctx, struct
> gl_shader_program *prog)
>
>unsigned max_unroll =
> ctx->ShaderCompilerOptions[i].MaxUnrollIterations;
>
> -  while (do_common_optimization(prog->_LinkedShaders[i]->ir, true,
> false, max_unroll))
> +  while (do_common_optimization(prog->_LinkedShaders[i]->ir, true,
> false, max_unroll, ctx->mvp_with_dp4))
>  ;
> }
>
> diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp
> index ce084b4..13dfdd3 100644
> --- a/src/glsl/main.cpp
> +++ b/src/glsl/main.cpp
> @@ -176,7 +176,7 @@ compile_shader(struct gl
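
The equivalence stated in the commit message above, that (matrix * vector) equals (vector * matrixTranspose), can be checked with a small C sketch. This is not the pass itself, only an illustration under assumptions: `mat4`, `mat_mul_vec`, `vec_mul_mat`, `mat_transpose` and `flip_matches` are hypothetical names, and the matrix is stored column-major as in GLSL. The first form accumulates scaled columns (multiplies and adds), the second takes one 4-component dot product per output component.

```c
#include <assert.h>

/* Column-major 4x4 matrix, as in GLSL: col[j][i] is column j, row i. */
typedef struct { float col[4][4]; } mat4;

/* M * v: accumulate each column scaled by the matching vector component. */
static void mat_mul_vec(const mat4 *m, const float v[4], float out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = 0.0f;
    for (int j = 0; j < 4; j++)
        for (int i = 0; i < 4; i++)
            out[i] += m->col[j][i] * v[j];
}

static mat4 mat_transpose(const mat4 *m)
{
    mat4 t;
    for (int j = 0; j < 4; j++)
        for (int i = 0; i < 4; i++)
            t.col[j][i] = m->col[i][j];
    return t;
}

static float dot4(const float a[4], const float b[4])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

/* v * M: each output component is a dot product of v with one column of M. */
static void vec_mul_mat(const float v[4], const mat4 *m, float out[4])
{
    for (int j = 0; j < 4; j++)
        out[j] = dot4(v, m->col[j]);
}

/* Returns 1 if M * v and v * transpose(M) agree on a small-integer example. */
static int flip_matches(void)
{
    mat4 m;
    const float v[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    float a[4], b[4];

    for (int j = 0; j < 4; j++)
        for (int i = 0; i < 4; i++)
            m.col[j][i] = (float)(4 * j + i + 1);

    mat4 mt = mat_transpose(&m);
    mat_mul_vec(&m, v, a);
    vec_mul_mat(v, &mt, b);

    for (int i = 0; i < 4; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}

/* Convenience wrapper for test programs. */
static void self_check(void)
{
    assert(flip_matches());
}
```

The example uses small integer values so that both forms are exact in floating point; with arbitrary inputs the two forms may round differently.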

Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction

2013-04-04 Thread Jose Fonseca



- Original Message -
> On 04.04.2013 17:01, Jose Fonseca wrote:
> >
> > - Original Message -
> >>> On 04.04.2013 03:45, Zack Rusin wrote:
> >>>> It's part of SM4 (http://goo.gl/4IpeK). It's also fairly
> >>>> painful to emulate without branching. Most hardware
> >>>> supports it natively and even llvm has a 'select' opcode
> >>>> which can handle it without too much hassle.
> >>>>
> >>>> diff --git a/src/gallium/docs/source/tgsi.rst
> >>>> b/src/gallium/docs/source/tgsi.rst
> >>>> index 28308cb..6c5a02b 100644
> >>>> --- a/src/gallium/docs/source/tgsi.rst
> >>>> +++ b/src/gallium/docs/source/tgsi.rst
> >>>> @@ -72,6 +72,17 @@ used.
> >>>>  
> >>>>    dst.w = src.w
> >>>>  
> >>>> +.. opcode:: MOVC - Conditional move
> >>>> +
> >>>> +.. math::
> >>>> +
> >>>> +  dst.x = src0.x ? src1.x : src2.x
> >>>> +
> >>>> +  dst.y = src0.y ? src1.y : src2.y
> >>>> +
> >>>> +  dst.z = src0.z ? src1.z : src2.z
> >>>> +
> >>>> +  dst.w = src0.w ? src1.w : src2.w
> >>>>  
> >>> I think we already have that:
> >>>
> >>> .. opcode:: UCMP - Integer Conditional Move
> >>>
> >>> .. math::
> >>>
> >>>   dst.x = src0.x ? src1.x : src2.x
> >>>
> >>>   dst.y = src0.y ? src1.y : src2.y
> >>>
> >>>   dst.z = src0.z ? src1.z : src2.z
> >>>
> >>>   dst.w = src0.w ? src1.w : src2.w
> >>>
> >>>
> >>> No difference apart from the source ordering (the "integer" just implies
> >>> that any non-zero value counts as true, i.e. also inf, nan and -0).
> >> That's really broken. UCMP needs to be a an unsigned version of the CMP
> >> instruction which does
> >> dst.chan = (src0.chan < 0) ? src1.chan : src2.chan
> >> not a whole new instruction. It's what everyone implements anyway. So if
> >> st_glsl_to_tgsi needs
> >> a conditional move we need to add the above patch and change it to use it.
> > Yes, it doesn't seem that any of the TGSI_OPCODE_UCMP implementation does
> > that the spec says it supposedly does -- it seems everybody implements it
> > as an unsigned version of CMP. That is, it seems UCMP's description needs
> > to be fixed.
> 
> Erm, unsigned < 0 doesn't make sense.

Ah indeed!

> Definitely what the description says:
> static void
> micro_ucmp(union tgsi_exec_channel *dst,
>const union tgsi_exec_channel *src0,
>const union tgsi_exec_channel *src1,
>const union tgsi_exec_channel *src2)
> {
>dst->u[0] = src0->u[0] ? src1->u[0] : src2->u[0];
>dst->u[1] = src0->u[1] ? src1->u[1] : src2->u[1];
>dst->u[2] = src0->u[2] ? src1->u[2] : src2->u[2];
>dst->u[3] = src0->u[3] ? src1->u[3] : src2->u[3];
> }
> 
> or
> 
>case TGSI_OPCODE_UCMP:
>case TGSI_OPCODE_CMP:
>   FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
>  src0 = fetchSrc(0, c);
>  src1 = fetchSrc(1, c);
>  src2 = fetchSrc(2, c);
>  if (src1 == src2)
> mkMov(dst0[c], src1);
>  else
> mkCmp(OP_SLCT, (srcTy == TYPE_F32) ? CC_LT(less than 0) :
> CC_NE(not equal 0),
>   srcTy, dst0[c], src1, src2, src0);
>   }
> 

But oddly enough, the implementations I happened to look at seem to do
"foo >= 0":

src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c has:

static void emit_ucmp(
    const struct lp_build_tgsi_action * action,
    struct lp_build_tgsi_context * bld_base,
    struct lp_build_emit_data * emit_data)
{
    LLVMBuilderRef builder = bld_base->base.gallivm->builder;

    LLVMValueRef v = LLVMBuildFCmp(builder, LLVMRealUGE,
            emit_data->args[0],
            lp_build_const_float(bld_base->base.gallivm, 0.), "");

    emit_data->output[emit_data->chan] = LLVMBuildSelect(builder, v,
            emit_data->args[2], emit_data->args[1], "");
}

(it doesn't even seem to do integers at all)

src/gallium/drivers/r600/r600_shader.c:

static int tgsi_ucmp(struct r600_shader_ctx *ctx)
{
    struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
    struct r600_bytecode_alu alu;
    int i, r;
    int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);

    for (i = 0; i < lasti + 1; i++) {
        if (!(inst->Dst[0].Register.WriteMask & (1 << i)))
            continue;

        memset(&alu, 0, sizeof(struct r600_bytecode_alu));
        alu.op = ALU_OP3_CNDGE_INT;
        r600_bytecode_src(&alu.src[0], &ctx->src[0], i);
        r600_bytecode_src(&alu.src[1], &ctx->src[2], i);
        r600_bytecode_src(&alu.src[2], &ctx->src[1], i);
        tgsi_dst(ctx, &inst->Dst[0], i, &alu.dst);
        alu.dst.chan = i;
        alu.dst.write = 1;
        alu.is_op3 = 1;
        if (i == lasti)
            alu.last = 1;
        r = r600_bytecode_add_alu(ctx->bc, &alu);
        if (r)
            return r;
    }
    return 0;
}


Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction

2013-04-04 Thread Zack Rusin

> > Erm, unsigned < 0 doesn't make sense.
> 
> Ah indeed!
> 
> > Definitely what the description says:
> > static void
> > micro_ucmp(union tgsi_exec_channel *dst,
> >            const union tgsi_exec_channel *src0,
> >            const union tgsi_exec_channel *src1,
> >            const union tgsi_exec_channel *src2)
> > {
> >    dst->u[0] = src0->u[0] ? src1->u[0] : src2->u[0];
> >    dst->u[1] = src0->u[1] ? src1->u[1] : src2->u[1];
> >    dst->u[2] = src0->u[2] ? src1->u[2] : src2->u[2];
> >    dst->u[3] = src0->u[3] ? src1->u[3] : src2->u[3];
> > }
> > 
> > or
> > 
> >    case TGSI_OPCODE_UCMP:
> >    case TGSI_OPCODE_CMP:
> >       FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
> >          src0 = fetchSrc(0, c);
> >          src1 = fetchSrc(1, c);
> >          src2 = fetchSrc(2, c);
> >          if (src1 == src2)
> >             mkMov(dst0[c], src1);
> >          else
> >             mkCmp(OP_SLCT, (srcTy == TYPE_F32) ? CC_LT(less than 0) :
> > CC_NE(not equal 0),
> >                   srcTy, dst0[c], src1, src2, src0);
> >       }
> > 
> 
> But odd enough, the implementations I happend to look at seemed to do "foo >=
> 0":

Yea, like I mentioned it's pretty broken. Sometimes it's implemented as UCMP, 
sometimes it's implemented as MOVC.
It seems to be used only as MOVC. 
It feels silly writing this, but we should probably make UCMP act like UCMP and 
add MOVC and use it when we need a MOVC.

z

Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction

2013-04-04 Thread Christoph Bumiller

On 04.04.2013 17:23, Jose Fonseca wrote:
>
> - Original Message -
>> On 04.04.2013 17:01, Jose Fonseca wrote:
>>> - Original Message -
>>>>> On 04.04.2013 03:45, Zack Rusin wrote:
>>>>>> It's part of SM4 (http://goo.gl/4IpeK). It's also fairly
>>>>>> painful to emulate without branching. Most hardware
>>>>>> supports it natively and even llvm has a 'select' opcode
>>>>>> which can handle it without too much hassle.
>>>>>>
>>>>>> diff --git a/src/gallium/docs/source/tgsi.rst
>>>>>> b/src/gallium/docs/source/tgsi.rst
>>>>>> index 28308cb..6c5a02b 100644
>>>>>> --- a/src/gallium/docs/source/tgsi.rst
>>>>>> +++ b/src/gallium/docs/source/tgsi.rst
>>>>>> @@ -72,6 +72,17 @@ used.
>>>>>>  
>>>>>>    dst.w = src.w
>>>>>>  
>>>>>> +.. opcode:: MOVC - Conditional move
>>>>>> +
>>>>>> +.. math::
>>>>>> +
>>>>>> +  dst.x = src0.x ? src1.x : src2.x
>>>>>> +
>>>>>> +  dst.y = src0.y ? src1.y : src2.y
>>>>>> +
>>>>>> +  dst.z = src0.z ? src1.z : src2.z
>>>>>> +
>>>>>> +  dst.w = src0.w ? src1.w : src2.w
>>>>>>  
>>>>> I think we already have that:
>>>>>
>>>>> .. opcode:: UCMP - Integer Conditional Move
>>>>>
>>>>> .. math::
>>>>>
>>>>>   dst.x = src0.x ? src1.x : src2.x
>>>>>
>>>>>   dst.y = src0.y ? src1.y : src2.y
>>>>>
>>>>>   dst.z = src0.z ? src1.z : src2.z
>>>>>
>>>>>   dst.w = src0.w ? src1.w : src2.w
>>>>>
>>>>>
>>>>> No difference apart from the source ordering (the "integer" just implies
>>>>> that any non-zero value counts as true, i.e. also inf, nan and -0).
>>>> That's really broken. UCMP needs to be a an unsigned version of the CMP
>>>> instruction which does
>>>> dst.chan = (src0.chan < 0) ? src1.chan : src2.chan
>>>> not a whole new instruction. It's what everyone implements anyway. So if
>>>> st_glsl_to_tgsi needs
>>>> a conditional move we need to add the above patch and change it to use it.
>>> Yes, it doesn't seem that any of the TGSI_OPCODE_UCMP implementation does
>>> that the spec says it supposedly does -- it seems everybody implements it
>>> as an unsigned version of CMP. That is, it seems UCMP's description needs
>>> to be fixed.
>> Erm, unsigned < 0 doesn't make sense.
> Ah indeed!
>
>> Definitely what the description says:
>> static void
>> micro_ucmp(union tgsi_exec_channel *dst,
>>const union tgsi_exec_channel *src0,
>>const union tgsi_exec_channel *src1,
>>const union tgsi_exec_channel *src2)
>> {
>>dst->u[0] = src0->u[0] ? src1->u[0] : src2->u[0];
>>dst->u[1] = src0->u[1] ? src1->u[1] : src2->u[1];
>>dst->u[2] = src0->u[2] ? src1->u[2] : src2->u[2];
>>dst->u[3] = src0->u[3] ? src1->u[3] : src2->u[3];
>> }
>>
>> or
>>
>>case TGSI_OPCODE_UCMP:
>>case TGSI_OPCODE_CMP:
>>   FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
>>  src0 = fetchSrc(0, c);
>>  src1 = fetchSrc(1, c);
>>  src2 = fetchSrc(2, c);
>>  if (src1 == src2)
>> mkMov(dst0[c], src1);
>>  else
>> mkCmp(OP_SLCT, (srcTy == TYPE_F32) ? CC_LT(less than 0) :
>> CC_NE(not equal 0),
>>   srcTy, dst0[c], src1, src2, src0);
>>   }
>>
> But odd enough, the implementations I happend to look at seemed to do "foo >= 
> 0":

Well, some people can't read documentation ... or they rely on the
condition value always being a glsl-to-tgsi boolean which is only either
0 or ~0/-1.
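
To make that concrete, here is a small C sketch (an illustration only, not driver code). `ucmp_documented` and `ucmp_signed_ge` are hypothetical names. The second function mirrors the shape of the r600 code quoted below, under the assumption that CNDGE_INT picks its second operand when the first is >= 0 as a signed integer, with the TGSI sources swapped.

```c
#include <assert.h>
#include <stdint.h>

/* UCMP as documented: any non-zero condition selects src1. */
static uint32_t ucmp_documented(uint32_t cond, uint32_t src1, uint32_t src2)
{
    return cond != 0u ? src1 : src2;
}

/*
 * Signed "cond >= 0" test with swapped sources: a non-negative condition
 * selects src2, a negative one selects src1.  The cast assumes the usual
 * two's complement conversion from uint32_t to int32_t.
 */
static uint32_t ucmp_signed_ge(uint32_t cond, uint32_t src1, uint32_t src2)
{
    return ((int32_t)cond >= 0) ? src2 : src1;
}

/* Self-check: the two agree on 0 and ~0, but not on other non-zero values. */
static void check_boolean_assumption(void)
{
    /* Boolean false (0): both pick src2. */
    assert(ucmp_documented(0u, 10u, 20u) == ucmp_signed_ge(0u, 10u, 20u));

    /* Boolean true (~0, i.e. -1 as signed): both pick src1. */
    assert(ucmp_documented(0xffffffffu, 10u, 20u) ==
           ucmp_signed_ge(0xffffffffu, 10u, 20u));

    /* A positive non-zero condition such as 1: the two disagree. */
    assert(ucmp_documented(1u, 10u, 20u) == 10u);
    assert(ucmp_signed_ge(1u, 10u, 20u) == 20u);
}
```

So the signed test only matches the documented behaviour as long as the condition really is restricted to 0 or ~0.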

> src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c has:
>
> static void emit_ucmp(
> const struct lp_build_tgsi_action * action,
> struct lp_build_tgsi_context * bld_base,
> struct lp_build_emit_data * emit_data)
> {
> LLVMBuilderRef builder = bld_base->base.gallivm->builder;
>
> LLVMValueRef v = LLVMBuildFCmp(builder, LLVMRealUGE,
> emit_data->args[0], 
> lp_build_const_float(bld_base->base.gallivm, 0.), "");
>
> emit_data->output[emit_data->chan] = LLVMBuildSelect(builder, v, 
> emit_data->args[2], emit_data->args[1], "");
> }
>
> (it doesn't even seem to do integers at all)
>
> src/gallium/drivers/r600/r600_shader.c:
>
> static int tgsi_ucmp(struct r600_shader_ctx *ctx)
> {
>   struct tgsi_full_instruction *inst = 
> &ctx->parse.FullToken.FullInstruction;
>   struct r600_bytecode_alu alu;
>   int i, r;
>   int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
>
>   for (i = 0; i < lasti + 1; i++) {
>   if (!(inst->Dst[0].Register.WriteMask & (1 << i)))
>   continue;
>
>   memset(&alu, 0, sizeof(struct r600_bytecode_alu));
>   alu.op = ALU_OP3_CNDGE_INT;
>   r600_bytecode_src(&alu.src[0], &ctx->src[0], i);
>   r600_bytecode_src(&alu.src[1], &ctx->src[2], i);
>   r600_bytecode_src(&alu.src[2], &ctx->src[1], i);
>   tgsi_dst(ctx, &inst->Dst[0], i, &alu.dst);
>   alu.dst.chan = i;
>   alu.dst.write = 1;
>   alu.is_op3 = 1;
>   if (i == lasti

[Mesa-dev] [Bug 47607] [advocacy] Make Anomaly Warzone Earth work with Mesa

2013-04-04 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=47607

Eric Anholt  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Eric Anholt  ---
Great to hear it's fixed -- closing this bug for us.


Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction

2013-04-04 Thread Jose Fonseca



- Original Message -
> > > Erm, unsigned < 0 doesn't make sense.
> > 
> > Ah indeed!
> > 
> > > Definitely what the description says:
> > > static void
> > > micro_ucmp(union tgsi_exec_channel *dst,
> > >            const union tgsi_exec_channel *src0,
> > >            const union tgsi_exec_channel *src1,
> > >            const union tgsi_exec_channel *src2)
> > > {
> > >    dst->u[0] = src0->u[0] ? src1->u[0] : src2->u[0];
> > >    dst->u[1] = src0->u[1] ? src1->u[1] : src2->u[1];
> > >    dst->u[2] = src0->u[2] ? src1->u[2] : src2->u[2];
> > >    dst->u[3] = src0->u[3] ? src1->u[3] : src2->u[3];
> > > }
> > > 
> > > or
> > > 
> > >    case TGSI_OPCODE_UCMP:
> > >    case TGSI_OPCODE_CMP:
> > >       FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
> > >          src0 = fetchSrc(0, c);
> > >          src1 = fetchSrc(1, c);
> > >          src2 = fetchSrc(2, c);
> > >          if (src1 == src2)
> > >             mkMov(dst0[c], src1);
> > >          else
> > >             mkCmp(OP_SLCT, (srcTy == TYPE_F32) ? CC_LT(less than 0) :
> > > CC_NE(not equal 0),
> > >                   srcTy, dst0[c], src1, src2, src0);
> > >       }
> > > 
> > 
> > But odd enough, the implementations I happend to look at seemed to do "foo
> > >=
> > 0":
> 
> Yea, like I mentioned it's pretty broken. Sometimes it's implemented as UCMP,
> sometimes it's implemented as MOVC.
> It seems to be used only as MOVC.
> It feels silly writing this, but we should probably make UCMP act like UCMP
> and add MOVC and use it when we need a MOVC.

Zack, I believe Christoph has a point when he says that UCMP is semantically 
the same as MOVC.

Because for unsigned integers, "foo > 0" is the same as "foo != 0", therefore 
having UCMP defined as

  dst = src0 > 0 ? src1 : src2

or a MOVC as

  dst = src0 != 0 ? src1 : src2

is pretty much the same.
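
A quick C sketch of that claim, for illustration only (`select_gt_zero`, `select_ne_zero` and `selects_agree` are hypothetical names): on unsigned 32-bit conditions, the "> 0" select and the "!= 0" select pick the same source for every sampled value, including the ones that would be negative if read as signed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* UCMP read as an unsigned compare: dst = src0 > 0 ? src1 : src2. */
static uint32_t select_gt_zero(uint32_t cond, uint32_t src1, uint32_t src2)
{
    return cond > 0u ? src1 : src2;
}

/* MOVC: dst = src0 != 0 ? src1 : src2. */
static uint32_t select_ne_zero(uint32_t cond, uint32_t src1, uint32_t src2)
{
    return cond != 0u ? src1 : src2;
}

/* Returns 1 if both selects agree on a set of boundary conditions. */
static int selects_agree(void)
{
    const uint32_t samples[] = {
        0u, 1u, 0x7fffffffu, 0x80000000u, 0xffffffffu
    };

    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        if (select_gt_zero(samples[i], 10u, 20u) !=
            select_ne_zero(samples[i], 10u, 20u))
            return 0;
    }
    return 1;
}

/* Convenience wrapper for test programs. */
static void self_check(void)
{
    assert(selects_agree());
}
```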

Jose


Re: [Mesa-dev] [PATCH 1/4] i965: Remove fixed-function texture projection avoidance optimization.

2013-04-04 Thread Eric Anholt

Kenneth Graunke  writes:

> On 03/13/2013 09:17 AM, Eric Anholt wrote:
>> Kenneth Graunke  writes:
>>
>>> This optimization attempts to avoid extra attribute interpolation
>>> instructions for texture coordinates where the W-component is 1.0.
>>>
>>> Unfortunately, it requires a lot of complexity: the brw_wm_input_sizes
>>> state atom (all the brw_vs_constval.c code) needs to run on each draw.
>>> It computes the input_size_masks array, then uses that to compute
>>> proj_attrib_mask.  Differences in proj_attrib_mask can cause
>>> state-dependent fragment shader recompiles.  We also often fail to guess
>>> proj_attrib_mask for the fragment shader precompile, causing us to
>>> needlessly compile it twice.
>>>
>>> Furthermore, this optimization only applies to fixed-function programs;
>>> it does not help modern GLSL-based programs at all.  Generally, older
>>> fixed-function programs run fine on modern hardware anyway.
>>>
>>> The optimization has existed in some form since the initial commit.  When
>>> we rewrote the fragment shader backend, we dropped it for a while.  Eric
>>> readded it in commit eb30820f268608cf451da32de69723036dddbc62 as part of
>>> an attempt to cure a ~1% performance regression caused by converting the
>>> fixed-function fragment shader generation code from Mesa IR to GLSL IR.
>>> However, no performance data was included in the commit message, so it's
>>> unclear whether or not it was successful.
>>>
>>> Time has passed, so I decided to re-measure this.  Surprisingly,
>>> Eric's OpenArena timedemo actually runs /faster/ after removing this and
>>> the brw_wm_input_sizes atom.  On Ivybridge at 1024x768, I measured a
>>> 1.39532% +/- 0.91833% increase in FPS (n = 55).
>>
>> Removing it on SNB+ makes sense to me.  But given the higher cost of
>> math pre-gen6, I think we should test on one of those too.
>
> I finally rebased this patch series ("proj_attrib" in my tree).  On 
> Ironlake, OpenArena at 1024x768 (n = 37) shows no statistically 
> significant difference.
>
> So, can we just scrap it altogether?

Sweet.  Let's do it.



Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction

2013-04-04 Thread Zack Rusin

Hah, yea, I'm sorry, that's a good point. So movc is a bitcast to unsigned 
followed by ucmp. Alright, I'm withdrawing the patch.

z

- Original Message -
> 
> 
> - Original Message -
> > > > Erm, unsigned < 0 doesn't make sense.
> > > 
> > > Ah indeed!
> > > 
> > > > Definitely what the description says:
> > > > static void
> > > > micro_ucmp(union tgsi_exec_channel *dst,
> > > >            const union tgsi_exec_channel *src0,
> > > >            const union tgsi_exec_channel *src1,
> > > >            const union tgsi_exec_channel *src2)
> > > > {
> > > >    dst->u[0] = src0->u[0] ? src1->u[0] : src2->u[0];
> > > >    dst->u[1] = src0->u[1] ? src1->u[1] : src2->u[1];
> > > >    dst->u[2] = src0->u[2] ? src1->u[2] : src2->u[2];
> > > >    dst->u[3] = src0->u[3] ? src1->u[3] : src2->u[3];
> > > > }
> > > > 
> > > > or
> > > > 
> > > >    case TGSI_OPCODE_UCMP:
> > > >    case TGSI_OPCODE_CMP:
> > > >       FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
> > > >          src0 = fetchSrc(0, c);
> > > >          src1 = fetchSrc(1, c);
> > > >          src2 = fetchSrc(2, c);
> > > >          if (src1 == src2)
> > > >             mkMov(dst0[c], src1);
> > > >          else
> > > >             mkCmp(OP_SLCT, (srcTy == TYPE_F32) ? CC_LT(less than 0) :
> > > > CC_NE(not equal 0),
> > > >                   srcTy, dst0[c], src1, src2, src0);
> > > >       }
> > > > 
> > > 
> > > But odd enough, the implementations I happend to look at seemed to do
> > > "foo
> > > >=
> > > 0":
> > 
> > Yea, like I mentioned it's pretty broken. Sometimes it's implemented as
> > UCMP,
> > sometimes it's implemented as MOVC.
> > It seems to be used only as MOVC.
> > It feels silly writing this, but we should probably make UCMP act like UCMP
> > and add MOVC and use it when we need a MOVC.
> 
> Zack, I believe Christoph has a point when he says that UCMP is semantically
> the same as MOVC.
> 
> Because for unsigned integers, "foo > 0" is the same as "foo != 0", therefore
> having UCMP defined as
> 
>   dst = src0 > 0 ? src1 : src2
> 
> or a MOVC as
> 
>   dst = src0 != 0 ? src1 : src2
> 
> is pretty much the same.
> 
> Jose
> 
> 

Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction

2013-04-04 Thread Jose Fonseca

There might be some value in renaming UCMP to be MOVC though.  I think 
everybody here can agree that UCMP, though semantically correct, is misleading.

Jose

- Original Message -
> Hah, yea, I'm sorry, that's a good point. So movc is a bitcast to unsigned
> followed by ucmp. Alright, I'm withdrawing the patch.
> 
> z
> 
> - Original Message -
> > 
> > 
> > - Original Message -
> > > > > Erm, unsigned < 0 doesn't make sense.
> > > > 
> > > > Ah indeed!
> > > > 
> > > > > Definitely what the description says:
> > > > > static void
> > > > > micro_ucmp(union tgsi_exec_channel *dst,
> > > > >            const union tgsi_exec_channel *src0,
> > > > >            const union tgsi_exec_channel *src1,
> > > > >            const union tgsi_exec_channel *src2)
> > > > > {
> > > > >    dst->u[0] = src0->u[0] ? src1->u[0] : src2->u[0];
> > > > >    dst->u[1] = src0->u[1] ? src1->u[1] : src2->u[1];
> > > > >    dst->u[2] = src0->u[2] ? src1->u[2] : src2->u[2];
> > > > >    dst->u[3] = src0->u[3] ? src1->u[3] : src2->u[3];
> > > > > }
> > > > > 
> > > > > or
> > > > > 
> > > > >    case TGSI_OPCODE_UCMP:
> > > > >    case TGSI_OPCODE_CMP:
> > > > >       FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
> > > > >          src0 = fetchSrc(0, c);
> > > > >          src1 = fetchSrc(1, c);
> > > > >          src2 = fetchSrc(2, c);
> > > > >          if (src1 == src2)
> > > > >             mkMov(dst0[c], src1);
> > > > >          else
> > > > >             mkCmp(OP_SLCT, (srcTy == TYPE_F32) ? CC_LT(less than 0) :
> > > > > CC_NE(not equal 0),
> > > > >                   srcTy, dst0[c], src1, src2, src0);
> > > > >       }
> > > > > 
> > > > 
> > > > But odd enough, the implementations I happend to look at seemed to do
> > > > "foo
> > > > >=
> > > > 0":
> > > 
> > > Yea, like I mentioned it's pretty broken. Sometimes it's implemented as
> > > UCMP,
> > > sometimes it's implemented as MOVC.
> > > It seems to be used only as MOVC.
> > > It feels silly writing this, but we should probably make UCMP act like
> > > UCMP
> > > and add MOVC and use it when we need a MOVC.
> > 
> > Zack, I believe Christoph has a point when he says that UCMP is
> > semantically
> > the same as MOVC.
> > 
> > Because for unsigned integers, "foo > 0" is the same as "foo != 0",
> > therefore
> > having UCMP defined as
> > 
> >   dst = src0 > 0 ? src1 : src2
> > 
> > or a MOVC as
> > 
> >   dst = src0 != 0 ? src1 : src2
> > 
> > is pretty much the same.
> > 
> > Jose
> > 
> > 
> 

Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction

2013-04-04 Thread Roland Scheidegger

FWIW it looks like we could use that opcode a bit more in glsl to tgsi
translation. There's one use of it in st_glsl_to_tgsi.cpp (though
coupled with a USNE which I'm not sure is even necessary) but another
place states that "If TGSI had a UCMP instruction or similar, this extra
instruction would not be necessary" (when it translates away what would
be a ucmp into i2f->cmp).
I don't know whether "MOVC" or "UCMP" is the better name.

Roland



Am 04.04.2013 18:06, schrieb Zack Rusin:
> Hah, yea, I'm sorry, that's a good point. So movc is a bitcast to unsigned 
> followed by ucmp. Alright, I'm withdrawing the patch.
> 
> z
> 
> - Original Message -
>>
>>
>> - Original Message -
>>>>> Erm, unsigned < 0 doesn't make sense.
>>>>
>>>> Ah indeed!
>>>>
>>>>> Definitely what the description says:
>>>>> static void
>>>>> micro_ucmp(union tgsi_exec_channel *dst,
>>>>>            const union tgsi_exec_channel *src0,
>>>>>            const union tgsi_exec_channel *src1,
>>>>>            const union tgsi_exec_channel *src2)
>>>>> {
>>>>>    dst->u[0] = src0->u[0] ? src1->u[0] : src2->u[0];
>>>>>    dst->u[1] = src0->u[1] ? src1->u[1] : src2->u[1];
>>>>>    dst->u[2] = src0->u[2] ? src1->u[2] : src2->u[2];
>>>>>    dst->u[3] = src0->u[3] ? src1->u[3] : src2->u[3];
>>>>> }
>>>>>
>>>>> or
>>>>>
>>>>>    case TGSI_OPCODE_UCMP:
>>>>>    case TGSI_OPCODE_CMP:
>>>>>       FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
>>>>>          src0 = fetchSrc(0, c);
>>>>>          src1 = fetchSrc(1, c);
>>>>>          src2 = fetchSrc(2, c);
>>>>>          if (src1 == src2)
>>>>>             mkMov(dst0[c], src1);
>>>>>          else
>>>>>             mkCmp(OP_SLCT, (srcTy == TYPE_F32) ? CC_LT(less than 0) :
>>>>> CC_NE(not equal 0),
>>>>>                   srcTy, dst0[c], src1, src2, src0);
>>>>>       }
>>>>>
>>>>
>>>> But odd enough, the implementations I happend to look at seemed to do
>>>> "foo >= 0":
>>>
>>> Yea, like I mentioned it's pretty broken. Sometimes it's implemented as
>>> UCMP,
>>> sometimes it's implemented as MOVC.
>>> It seems to be used only as MOVC.
>>> It feels silly writing this, but we should probably make UCMP act like UCMP
>>> and add MOVC and use it when we need a MOVC.
>>
>> Zack, I believe Christoph has a point when he says that UCMP is semantically
>> the same as MOVC.
>>
>> Because for unsigned integers, "foo > 0" is the same as "foo != 0", therefore
>> having UCMP defined as
>>
>>   dst = src0 > 0 ? src1 : src2
>>
>> or a MOVC as
>>
>>   dst = src0 != 0 ? src1 : src2
>>
>> is pretty much the same.
>>
>> Jose
>>
>>

Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move inststruction

2013-04-04 Thread Marek Olšák

FWIW, I think UCMP is a misleading name. Whatever the name will be, it
should be prefixed with "I" or "U", because it's not a floating-point
opcode. How about UCND? :D

Marek


On Thu, Apr 4, 2013 at 6:23 PM, Jose Fonseca  wrote:

> There might be some value in renaming UCMP to be MOVC though.  I think
> everybody here can agree that UCMP, though semantically correct, is
> misleading.
>
> Jose
>
> - Original Message -
> > Hah, yea, I'm sorry, that's a good point. So movc is a bitcast to unsigned
> > followed by ucmp. Alright, I'm withdrawing the patch.
> >
> > z
> >
> > - Original Message -
> > >
> > >
> > > - Original Message -
> > > > > > Erm, unsigned < 0 doesn't make sense.
> > > > >
> > > > > Ah indeed!
> > > > >
> > > > > > Definitely what the description says:
> > > > > > static void
> > > > > > micro_ucmp(union tgsi_exec_channel *dst,
> > > > > >const union tgsi_exec_channel *src0,
> > > > > >const union tgsi_exec_channel *src1,
> > > > > >const union tgsi_exec_channel *src2)
> > > > > > {
> > > > > >dst->u[0] = src0->u[0] ? src1->u[0] : src2->u[0];
> > > > > >dst->u[1] = src0->u[1] ? src1->u[1] : src2->u[1];
> > > > > >dst->u[2] = src0->u[2] ? src1->u[2] : src2->u[2];
> > > > > >dst->u[3] = src0->u[3] ? src1->u[3] : src2->u[3];
> > > > > > }
> > > > > >
> > > > > > or
> > > > > >
> > > > > >case TGSI_OPCODE_UCMP:
> > > > > >case TGSI_OPCODE_CMP:
> > > > > >   FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
> > > > > >  src0 = fetchSrc(0, c);
> > > > > >  src1 = fetchSrc(1, c);
> > > > > >  src2 = fetchSrc(2, c);
> > > > > >  if (src1 == src2)
> > > > > > mkMov(dst0[c], src1);
> > > > > >  else
> > > > > > mkCmp(OP_SLCT, (srcTy == TYPE_F32) ? CC_LT(less than
> 0) :
> > > > > > CC_NE(not equal 0),
> > > > > >   srcTy, dst0[c], src1, src2, src0);
> > > > > >   }
> > > > > >
> > > > >
> > > > > > But oddly enough, the implementations I happened to look at seemed to
> > > > > > do "foo >= 0":
> > > >
> > > > Yea, like I mentioned it's pretty broken. Sometimes it's implemented
> as
> > > > UCMP,
> > > > sometimes it's implemented as MOVC.
> > > > It seems to be used only as MOVC.
> > > > It feels silly writing this, but we should probably make UCMP act
> like
> > > > UCMP
> > > > and add MOVC and use it when we need a MOVC.
> > >
> > > Zack, I believe Christoph has a point when he says that UCMP is
> > > semantically
> > > the same as MOVC.
> > >
> > > Because for unsigned integers, "foo > 0" is the same as "foo != 0",
> > > therefore
> > > having UCMP defined as
> > >
> > >   dst = src0 > 0 ? src1 : src2
> > >
> > > or a MOVC as
> > >
> > >   dst = src0 != 0 ? src1 : src2
> > >
> > > is pretty much the same.
> > >
> > > Jose
> > >
> > >
> >

[Mesa-dev] [PATCH] llvmpipe: implement ucmp

2013-04-04 Thread Zack Rusin

and add a test for it

Signed-off-by: Zack Rusin 
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c |   21 
 .../tests/graw/fragment-shader/frag-ucmp.sh|   11 ++
 2 files changed, 32 insertions(+)
 create mode 100644 src/gallium/tests/graw/fragment-shader/frag-ucmp.sh

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
index dfe581d..f3ae7b6 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
@@ -986,6 +986,26 @@ cmp_emit_cpu(
 cond, emit_data->args[1], emit_data->args[2]);
 }
 
+/* TGSI_OPCODE_UCMP (CPU Only) */
+static void
+ucmp_emit_cpu(
+   const struct lp_build_tgsi_action * action,
+   struct lp_build_tgsi_context * bld_base,
+   struct lp_build_emit_data * emit_data)
+{
+   LLVMBuilderRef builder = bld_base->base.gallivm->builder;
+   struct lp_build_context *uint_bld = &bld_base->uint_bld;
+   LLVMValueRef unsigned_cond = 
+  LLVMBuildBitCast(builder, emit_data->args[0], uint_bld->vec_type, "");
+   LLVMValueRef cond = lp_build_cmp(uint_bld, PIPE_FUNC_NOTEQUAL,
+unsigned_cond,
+uint_bld->zero);
+   emit_data->output[emit_data->chan] =
+  lp_build_select(&bld_base->base,
+  cond, emit_data->args[1], emit_data->args[2]);
+}
+
+
 /* TGSI_OPCODE_CND (CPU Only) */
 static void
 cnd_emit_cpu(
@@ -1701,6 +1721,7 @@ lp_set_default_actions_cpu(
bld_base->sqrt_action.emit = sqrt_emit_cpu;
 
bld_base->op_actions[TGSI_OPCODE_UADD].emit = uadd_emit_cpu;
+   bld_base->op_actions[TGSI_OPCODE_UCMP].emit = ucmp_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_UDIV].emit = udiv_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_UMAX].emit = umax_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_UMIN].emit = umin_emit_cpu;
diff --git a/src/gallium/tests/graw/fragment-shader/frag-ucmp.sh 
b/src/gallium/tests/graw/fragment-shader/frag-ucmp.sh
new file mode 100644
index 000..fa4ea25
--- /dev/null
+++ b/src/gallium/tests/graw/fragment-shader/frag-ucmp.sh
@@ -0,0 +1,11 @@
+FRAG
+DCL IN[0], COLOR, LINEAR
+DCL OUT[0], COLOR
+DCL TEMP[0]
+IMM[0] FLT32 {   10., 1., 0., 0.}
+IMM[1] UINT32 {1, 0, 0, 0}
+0: MUL TEMP[0].x, IN[0]., IMM[0].
+1: F2U TEMP[0].x, TEMP[0].
+2: AND TEMP[0].x, TEMP[0]., IMM[1].
+3: UCMP OUT[0], TEMP[0]., IMM[0].yzzz, IMM[0].yyyz
+4: END
-- 
1.7.10.4
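
For readers without a graw setup, the per-pixel logic of the test shader can be approximated on the CPU. This is only a sketch: the archived copy of the shader has lost its source swizzles, so the code assumes the MUL/F2U/AND chain works on the x channel of IN[0], and frag_ucmp_reference is a made-up name.

```c
#include <stdint.h>

/* Hypothetical CPU reference for frag-ucmp.sh.
 * Assumption: the condition is derived from the x channel of IN[0],
 * and in_x lies in [0, 1] as an interpolated colour would. */
static void frag_ucmp_reference(float in_x, float out[4])
{
   static const float imm0[4] = { 10.0f, 1.0f, 0.0f, 0.0f };
   /* IMM[0].yzzz, selected when the condition is non-zero. */
   const float if_set[4]   = { imm0[1], imm0[2], imm0[2], imm0[2] };
   /* IMM[0].yyyz, selected when the condition is zero. */
   const float if_clear[4] = { imm0[1], imm0[1], imm0[1], imm0[2] };
   uint32_t t;
   int c;

   t = (uint32_t)(in_x * imm0[0]);   /* MUL by 10, then F2U (truncates) */
   t &= 1u;                          /* AND with IMM[1].x, which is 1 */

   for (c = 0; c < 4; c++)
      out[c] = t ? if_set[c] : if_clear[c];   /* UCMP */
}
```

Under that assumption the output alternates between (1, 0, 0, 0) and (1, 1, 1, 0) in ten bands as the input goes from 0 to 1, which makes a broken UCMP easy to spot by eye.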


Re: [Mesa-dev] [PATCH] llvmpipe: implement ucmp

2013-04-04 Thread Jose Fonseca



- Original Message -
> and add a test for it
> 
> Signed-off-by: Zack Rusin 
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c |   21
>  
>  .../tests/graw/fragment-shader/frag-ucmp.sh|   11 ++
>  2 files changed, 32 insertions(+)
>  create mode 100644 src/gallium/tests/graw/fragment-shader/frag-ucmp.sh
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> index dfe581d..f3ae7b6 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> @@ -986,6 +986,26 @@ cmp_emit_cpu(
>  cond, emit_data->args[1],
>  emit_data->args[2]);
>  }
>  
> +/* TGSI_OPCODE_UCMP (CPU Only) */
> +static void
> +ucmp_emit_cpu(
> +   const struct lp_build_tgsi_action * action,
> +   struct lp_build_tgsi_context * bld_base,
> +   struct lp_build_emit_data * emit_data)
> +{
> +   LLVMBuilderRef builder = bld_base->base.gallivm->builder;
> +   struct lp_build_context *uint_bld = &bld_base->uint_bld;
> +   LLVMValueRef unsigned_cond =
> +  LLVMBuildBitCast(builder, emit_data->args[0], uint_bld->vec_type, "");
> +   LLVMValueRef cond = lp_build_cmp(uint_bld, PIPE_FUNC_NOTEQUAL,
> +unsigned_cond,
> +uint_bld->zero);
> +   emit_data->output[emit_data->chan] =
> +  lp_build_select(&bld_base->base,
> +  cond, emit_data->args[1], emit_data->args[2]);
> +}
> +
> +
>  /* TGSI_OPCODE_CND (CPU Only) */
>  static void
>  cnd_emit_cpu(
> @@ -1701,6 +1721,7 @@ lp_set_default_actions_cpu(
> bld_base->sqrt_action.emit = sqrt_emit_cpu;
>  
> bld_base->op_actions[TGSI_OPCODE_UADD].emit = uadd_emit_cpu;
> +   bld_base->op_actions[TGSI_OPCODE_UCMP].emit = ucmp_emit_cpu;
> bld_base->op_actions[TGSI_OPCODE_UDIV].emit = udiv_emit_cpu;
> bld_base->op_actions[TGSI_OPCODE_UMAX].emit = umax_emit_cpu;
> bld_base->op_actions[TGSI_OPCODE_UMIN].emit = umin_emit_cpu;
> diff --git a/src/gallium/tests/graw/fragment-shader/frag-ucmp.sh
> b/src/gallium/tests/graw/fragment-shader/frag-ucmp.sh
> new file mode 100644
> index 000..fa4ea25
> --- /dev/null
> +++ b/src/gallium/tests/graw/fragment-shader/frag-ucmp.sh
> @@ -0,0 +1,11 @@
> +FRAG
> +DCL IN[0], COLOR, LINEAR
> +DCL OUT[0], COLOR
> +DCL TEMP[0]
> +IMM[0] FLT32 {   10., 1., 0., 0.}
> +IMM[1] UINT32 {1, 0, 0, 0}
> +0: MUL TEMP[0].x, IN[0]., IMM[0].
> +1: F2U TEMP[0].x, TEMP[0].
> +2: AND TEMP[0].x, TEMP[0]., IMM[1].
> +3: UCMP OUT[0], TEMP[0]., IMM[0].yzzz, IMM[0].yyyz
> +4: END
> --
> 1.7.10.4
> 
> 


Looks great. Thanks Zack.

Jose

[Mesa-dev] [Bug 61364] LLVM assertion when starting X11

2013-04-04 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=61364

--- Comment #7 from Armin K  ---
Does it work if you build Mesa with --with-llvm-shared-libs? Tom mentioned
that something like this happens if Mesa is linked against static LLVM libraries.


[Mesa-dev] [Bug 61364] LLVM assertion when starting X11

2013-04-04 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=61364

--- Comment #8 from Armin K  ---
It might be related to this ... See the explanation

http://lists.freedesktop.org/archives/mesa-dev/2013-January/032944.html


Re: [Mesa-dev] [PATCH] register_allocate: Fix the type of best_benefit.

2013-04-04 Thread Eric Anholt

Ian Romanick  writes:

> On 04/02/2013 01:38 PM, Matt Turner wrote:
>
> Candidate for stable branches?

Our floats all happen to have integer values, and radeon doesn't use
this code.



Re: [Mesa-dev] [PATCH] glsl: Add an optimization pass to flatten simple nested if blocks.

2013-04-04 Thread Eric Anholt

Kenneth Graunke  writes:
> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index c294aa4..b5282a6 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -80,6 +80,7 @@ LIBGLSL_FILES = \
>   $(GLSL_SRCDIR)/opt_dead_code.cpp \
>   $(GLSL_SRCDIR)/opt_dead_code_local.cpp \
>   $(GLSL_SRCDIR)/opt_dead_functions.cpp \
> + $(GLSL_SRCDIR)/opt_flatten_nested_if_blocks.cpp \
>   $(GLSL_SRCDIR)/opt_function_inlining.cpp \
>   $(GLSL_SRCDIR)/opt_if_simplification.cpp \
>   $(GLSL_SRCDIR)/opt_noop_swizzle.cpp \
> diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
> index 9740903..0992294 100644
> --- a/src/glsl/glsl_parser_extras.cpp
> +++ b/src/glsl/glsl_parser_extras.cpp
> @@ -1218,6 +1218,7 @@ do_common_optimization(exec_list *ir, bool linked,
>progress = do_structure_splitting(ir) || progress;
> }
> progress = do_if_simplification(ir) || progress;
> +   progress = opt_flatten_nested_if_blocks(ir) || progress;
> progress = do_copy_propagation(ir) || progress;
> progress = do_copy_propagation_elements(ir) || progress;
> if (linked)
> diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
> index 2454bbe..a8885d7 100644
> --- a/src/glsl/ir_optimization.h
> +++ b/src/glsl/ir_optimization.h
> @@ -82,6 +82,7 @@ bool do_function_inlining(exec_list *instructions);
>  bool do_lower_jumps(exec_list *instructions, bool pull_out_jumps = true, 
> bool lower_sub_return = true, bool lower_main_return = false, bool 
> lower_continue = false, bool lower_break = false);
>  bool do_lower_texture_projection(exec_list *instructions);
>  bool do_if_simplification(exec_list *instructions);
> +bool opt_flatten_nested_if_blocks(exec_list *instructions);
>  bool do_discard_simplification(exec_list *instructions);
>  bool lower_if_to_cond_assign(exec_list *instructions, unsigned max_depth = 
> 0);
>  bool do_mat_op_to_vec(exec_list *instructions);
> diff --git a/src/glsl/opt_flatten_nested_if_blocks.cpp 
> b/src/glsl/opt_flatten_nested_if_blocks.cpp
> new file mode 100644
> index 000..c702102
> --- /dev/null
> +++ b/src/glsl/opt_flatten_nested_if_blocks.cpp
> @@ -0,0 +1,103 @@
> +/*
> + * Copyright © 2013 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +/**
> + * \file opt_flatten_nested_if_blocks.cpp
> + *
> + * Flattens nested if blocks such as:
> + *
> + * if (x) {
> + *if (y) {
> + *   ...
> + *}
> + * }
> + *
> + * into a single if block with a combined condition:
> + *
> + * if (x && y) {
> + *...
> + * }
> + */
> +
> +#include "ir.h"
> +#include "ir_builder.h"
> +
> +using namespace ir_builder;
> +
> +namespace {
> +
> +class nested_if_flattener : public ir_hierarchical_visitor {
> +public:
> +   nested_if_flattener()
> +   {
> +  progress = false;
> +   }
> +
> +   ir_visitor_status visit_leave(ir_if *);
> +   ir_visitor_status visit_enter(ir_assignment *);
> +
> +   bool progress;
> +};
> +
> +} /* unnamed namespace */
> +
> +/* We only care about the top level "if" instructions, so don't
> + * descend into expressions.
> + */
> +ir_visitor_status
> +nested_if_flattener::visit_enter(ir_assignment *ir)
> +{
> +   (void) ir;
> +   return visit_continue_with_parent;
> +}
> +
> +bool
> +opt_flatten_nested_if_blocks(exec_list *instructions)
> +{
> +   nested_if_flattener v;
> +
> +   v.run(instructions);
> +   return v.progress;
> +}
> +
> +
> +ir_visitor_status
> +nested_if_flattener::visit_leave(ir_if *ir)
> +{
> +   /* Only handle a single ir_if within the then clause of an ir_if.  No 
> extra
> +* instructions, no else clauses, nothing.
> +*/
> +   if (ir->then_instructions.is_empty() || !ir->else_instructions.is_empty())
> +  return visit_continue;
> +
> +   ir_if *inner = ((ir_instruction *) ir->then_instructions.head)->as_if()

Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move instruction

2013-04-04 Thread Roland Scheidegger

Well if the condition is just "any bit set" then it doesn't matter if
the input is a float or int or whatever (of course, for floats, that
definition is different than != zero, as it doesn't hold for negative zero).
That would be the same as for instance the bitwise instructions which
also don't have a "U" or "I" prefix.
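
The negative-zero caveat can be made concrete with a small sketch, assuming IEEE 754 single-precision floats; the function names are illustrative only and do not exist in Mesa.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as an unsigned integer (a "bitcast"). */
static uint32_t float_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof u);
   return u;
}

/* Condition read as "any bit set", independent of the input type. */
static int cond_any_bit_set(float f)
{
   return float_bits(f) != 0u;
}

/* Condition read as a floating-point comparison against zero. */
static int cond_float_nonzero(float f)
{
   return f != 0.0f;
}
```

For -0.0f only the sign bit is set, so cond_any_bit_set() is true while cond_float_nonzero() is false; for every other finite value the two agree.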

Roland

On 04.04.2013 18:49, Marek Olšák wrote:
> FWIW, I think UCMP is a misleading name. Whatever the name will be, it
> should be prefixed with "I" or "U", because it's not a floating-point
> opcode. How about UCND? :D
> 
> Marek
> 
> 
> On Thu, Apr 4, 2013 at 6:23 PM, Jose Fonseca wrote:
> 
> There might be some value in renaming UCMP to be MOVC though.  I
> think everybody here can agree that UCMP, though semantically
> correct, is misleading.
> 
> Jose
> 
> - Original Message -
> > Hah, yea, I'm sorry, that's a good point. So movc is a bitcast to unsigned
> > followed by ucmp. Alright, I'm withdrawing the patch.
> >
> > z
> >
> > - Original Message -
> > >
> > >
> > > - Original Message -
> > > > > > Erm, unsigned < 0 doesn't make sense.
> > > > >
> > > > > Ah indeed!
> > > > >
> > > > > > Definitely what the description says:
> > > > > > static void
> > > > > > micro_ucmp(union tgsi_exec_channel *dst,
> > > > > >const union tgsi_exec_channel *src0,
> > > > > >const union tgsi_exec_channel *src1,
> > > > > >const union tgsi_exec_channel *src2)
> > > > > > {
> > > > > >dst->u[0] = src0->u[0] ? src1->u[0] : src2->u[0];
> > > > > >dst->u[1] = src0->u[1] ? src1->u[1] : src2->u[1];
> > > > > >dst->u[2] = src0->u[2] ? src1->u[2] : src2->u[2];
> > > > > >dst->u[3] = src0->u[3] ? src1->u[3] : src2->u[3];
> > > > > > }
> > > > > >
> > > > > > or
> > > > > >
> > > > > >case TGSI_OPCODE_UCMP:
> > > > > >case TGSI_OPCODE_CMP:
> > > > > >   FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
> > > > > >  src0 = fetchSrc(0, c);
> > > > > >  src1 = fetchSrc(1, c);
> > > > > >  src2 = fetchSrc(2, c);
> > > > > >  if (src1 == src2)
> > > > > > mkMov(dst0[c], src1);
> > > > > >  else
> > > > > > mkCmp(OP_SLCT, (srcTy == TYPE_F32) ?
> CC_LT(less than 0) :
> > > > > > CC_NE(not equal 0),
> > > > > >   srcTy, dst0[c], src1, src2, src0);
> > > > > >   }
> > > > > >
> > > > >
> > > > > > But oddly enough, the implementations I happened to look at
> > > > > > seemed to do "foo >= 0":
> > > >
> > > > Yea, like I mentioned it's pretty broken. Sometimes it's
> implemented as
> > > > UCMP,
> > > > sometimes it's implemented as MOVC.
> > > > It seems to be used only as MOVC.
> > > > It feels silly writing this, but we should probably make UCMP
> act like
> > > > UCMP
> > > > and add MOVC and use it when we need a MOVC.
> > >
> > > Zack, I believe Christoph has a point when he says that UCMP is
> > > semantically
> > > the same as MOVC.
> > >
> > > Because for unsigned integers, "foo > 0" is the same as "foo != 0",
> > > therefore
> > > having UCMP defined as
> > >
> > >   dst = src0 > 0 ? src1 : src2
> > >
> > > or a MOVC as
> > >
> > >   dst = src0 != 0 ? src1 : src2
> > >
> > > is pretty much the same.
> > >
> > > Jose
> > >
> > >
> >

[Mesa-dev] [Bug 63132] New: [r600/llvm] src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c:1887:lp_emit_declaration_soa: Assertion `idx < 256' failed.

2013-04-04 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=63132

  Priority: medium
Bug ID: 63132
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: [r600/llvm]
src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c:1887:lp_emit_declaration_soa:
Assertion `idx < 256' failed.
  Severity: normal
Classification: Unclassified
OS: All
  Reporter: johannesoberm...@gmx.de
  Hardware: Other
Status: NEW
   Version: git
 Component: Mesa core
   Product: Mesa

Created attachment 77438
  --> https://bugs.freedesktop.org/attachment.cgi?id=77438&action=edit
R600_DUMP_SHADERS=1

00:01.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI
Wrestler [Radeon HD 6310] (prog-if 00 [VGA controller])
Subsystem: ASUSTeK Computer Inc. Device 84a5
Flags: bus master, fast devsel, latency 0, IRQ 40
Memory at c000 (32-bit, prefetchable) [size=256M]
I/O ports at f000 [size=256]
Memory at feb0 (32-bit, non-prefetchable) [size=256K]
Expansion ROM at  [disabled]
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Root Complex Integrated Endpoint, MSI 00
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010

Kernel driver in use: radeon

Trying to open https://www.shadertoy.com/browse in konqueror.


Re: [Mesa-dev] [PATCH] glsl: Add an optimization pass to flatten simple nested if blocks.

2013-04-04 Thread Matt Turner

On Wed, Apr 3, 2013 at 11:56 PM, Kenneth Graunke wrote:

> GLBenchmark 2.7's shaders contain conditional blocks like:
>
> if (x) {
> if (y) {
> ...
> }
> }
>
> where the outer conditional's then clause contains exactly one statement
> (the nested if) and there are no else clauses.  This can easily be
> optimized into:
>
> if (x && y) {
> ...
> }
>
> This saves a few instructions in GLBenchmark 2.7:
>
> total instructions in shared programs: 11833 -> 11649 (-1.55%)
> instructions in affected programs: 8234 -> 8050 (-2.23%)
>
> It also helps CS:GO slightly (-0.05%/-0.22%).  More importantly,
> however, it simplifies the control flow graph, which could enable other
> optimizations.
>
> Signed-off-by: Kenneth Graunke 
>

Reviewed-by: Matt Turner 

Any measurable performance improvement?
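
The transformation described in the quoted commit message can be sketched on a toy statement tree. This is not Mesa's ir_if or its visitor; toy_stmt, TOY_MAX_CONDS and toy_flatten_nested_if are hypothetical names, and conditions are kept as strings that are implicitly AND-ed.

```c
#include <stddef.h>

#define TOY_MAX_CONDS 8

enum toy_kind { TOY_OTHER, TOY_IF };

/* Toy statement node; only loosely modelled on an if-tree. */
struct toy_stmt {
   enum toy_kind kind;
   const char *conds[TOY_MAX_CONDS]; /* for TOY_IF: AND-ed left to right */
   size_t num_conds;
   struct toy_stmt *then_body;       /* first statement of the then clause */
   struct toy_stmt *else_body;       /* first statement of the else clause */
   struct toy_stmt *next;            /* next statement in the same block */
};

/* Flatten "if (a) { if (b) { ... } }" into "if (a && b) { ... }".
 * Returns 1 if the outer node was changed, 0 otherwise. */
static int toy_flatten_nested_if(struct toy_stmt *outer)
{
   struct toy_stmt *inner;
   size_t i;

   if (outer->kind != TOY_IF || outer->else_body != NULL)
      return 0;

   /* The then clause must hold exactly one statement: an if without else. */
   inner = outer->then_body;
   if (inner == NULL || inner->next != NULL ||
       inner->kind != TOY_IF || inner->else_body != NULL)
      return 0;

   if (outer->num_conds + inner->num_conds > TOY_MAX_CONDS)
      return 0;

   /* Appending keeps the evaluation order: outer condition first. */
   for (i = 0; i < inner->num_conds; i++)
      outer->conds[outer->num_conds++] = inner->conds[i];
   outer->then_body = inner->then_body;
   return 1;
}
```

Because && short-circuits, the inner condition is still only evaluated when the outer one holds, which is why the rewrite is safe when neither if has an else clause.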

Re: [Mesa-dev] [PATCH] tgsi: Add a conditional move instruction

2013-04-04 Thread Marek Olšák

I see. Fair point.

Marek


On Thu, Apr 4, 2013 at 7:32 PM, Roland Scheidegger wrote:

> Well if the condition is just "any bit set" then it doesn't matter if
> the input is a float or int or whatever (of course, for floats, that
> definition is different than != zero, as it doesn't hold for negative
> zero).
> That would be the same as for instance the bitwise instructions which
> also don't have a "U" or "I" prefix.
>
> Roland
>
> On 04.04.2013 18:49, Marek Olšák wrote:
> > FWIW, I think UCMP is a misleading name. Whatever the name will be, it
> > should be prefixed with "I" or "U", because it's not a floating-point
> > opcode. How about UCND? :D
> >
> > Marek
> >
> >
> > On Thu, Apr 4, 2013 at 6:23 PM, Jose Fonseca wrote:
> >
> > There might be some value in renaming UCMP to be MOVC though.  I
> > think everybody here can agree that UCMP, though semantically
> > correct, is misleading.
> >
> > Jose
> >
> > - Original Message -
> > > > Hah, yea, I'm sorry, that's a good point. So movc is a bitcast to unsigned
> > > > followed by ucmp. Alright, I'm withdrawing the patch.
> > >
> > > z
> > >
> > > - Original Message -
> > > >
> > > >
> > > > - Original Message -
> > > > > > > Erm, unsigned < 0 doesn't make sense.
> > > > > >
> > > > > > Ah indeed!
> > > > > >
> > > > > > > Definitely what the description says:
> > > > > > > static void
> > > > > > > micro_ucmp(union tgsi_exec_channel *dst,
> > > > > > >const union tgsi_exec_channel *src0,
> > > > > > >const union tgsi_exec_channel *src1,
> > > > > > >const union tgsi_exec_channel *src2)
> > > > > > > {
> > > > > > >dst->u[0] = src0->u[0] ? src1->u[0] : src2->u[0];
> > > > > > >dst->u[1] = src0->u[1] ? src1->u[1] : src2->u[1];
> > > > > > >dst->u[2] = src0->u[2] ? src1->u[2] : src2->u[2];
> > > > > > >dst->u[3] = src0->u[3] ? src1->u[3] : src2->u[3];
> > > > > > > }
> > > > > > >
> > > > > > > or
> > > > > > >
> > > > > > >case TGSI_OPCODE_UCMP:
> > > > > > >case TGSI_OPCODE_CMP:
> > > > > > >   FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
> > > > > > >  src0 = fetchSrc(0, c);
> > > > > > >  src1 = fetchSrc(1, c);
> > > > > > >  src2 = fetchSrc(2, c);
> > > > > > >  if (src1 == src2)
> > > > > > > mkMov(dst0[c], src1);
> > > > > > >  else
> > > > > > > mkCmp(OP_SLCT, (srcTy == TYPE_F32) ?
> > CC_LT(less than 0) :
> > > > > > > CC_NE(not equal 0),
> > > > > > >   srcTy, dst0[c], src1, src2, src0);
> > > > > > >   }
> > > > > > >
> > > > > >
> > > > > > > But oddly enough, the implementations I happened to look at
> > > > > > > seemed to do "foo >= 0":
> > > > >
> > > > > Yea, like I mentioned it's pretty broken. Sometimes it's
> > implemented as
> > > > > UCMP,
> > > > > sometimes it's implemented as MOVC.
> > > > > It seems to be used only as MOVC.
> > > > > It feels silly writing this, but we should probably make UCMP
> > act like
> > > > > UCMP
> > > > > and add MOVC and use it when we need a MOVC.
> > > >
> > > > Zack, I believe Christoph has a point when he says that UCMP is
> > > > semantically
> > > > the same as MOVC.
> > > >
> > > > > Because for unsigned integers, "foo > 0" is the same as "foo != 0",
> > > > > therefore
> > > > having UCMP defined as
> > > >
> > > >   dst = src0 > 0 ? src1 : src2
> > > >
> > > > or a MOVC as
> > > >
> > > >   dst = src0 != 0 ? src1 : src2
> > > >
> > > > is pretty much the same.
> > > >
> > > > Jose
> > > >
> > > >
> > >

[Mesa-dev] [PATCH 2/4] gallium: add PIPE_BIND_COMMAND_BUFFER

2013-04-04 Thread Christoph Bumiller

Intended for use with GL_ARB_draw_indirect's DRAW_INDIRECT_BUFFER
target or for D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS.
---
 src/gallium/docs/source/screen.rst   |2 ++
 src/gallium/include/pipe/p_defines.h |1 +
 2 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index c1a3c0b..f8cdded 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -306,6 +306,8 @@ resources might be created and handled quite differently.
   bound to the graphics pipeline as a shader resource.
 * ``PIPE_BIND_COMPUTE_RESOURCE``: A buffer or texture that can be
   bound to the compute program as a shader resource.
+* ``PIPE_BIND_COMMAND_BUFFER``: A buffer that may be sourced by the
+  GPU command processor, like with indirect drawing.
 
 .. _pipe_usage:
 
diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index 5b00acc..2b79f2a 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -315,6 +315,7 @@ enum pipe_flush_flags {
 #define PIPE_BIND_GLOBAL   (1 << 18) /* set_global_binding */
 #define PIPE_BIND_SHADER_RESOURCE  (1 << 19) /* set_shader_resources */
 #define PIPE_BIND_COMPUTE_RESOURCE (1 << 20) /* set_compute_resources */
+#define PIPE_BIND_COMMAND_BUFFER   (1 << 21) /* pipe_draw_info.indirect */
 
 /* The first two flags above were previously part of the amorphous
  * TEXTURE_USAGE, most of which are now descriptions of the ways a
-- 
1.7.3.4
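
As a rough illustration of how a driver might consume the new flag: bind flags are single bits that get OR-ed together at resource creation and tested with a mask. The flag values below are copied from the p_defines.h hunk; can_source_indirect_draws is a hypothetical helper, not an existing Gallium function.

```c
#include <stdint.h>

/* Values as they appear in the p_defines.h hunk of this patch. */
#define PIPE_BIND_GLOBAL           (1 << 18)
#define PIPE_BIND_SHADER_RESOURCE  (1 << 19)
#define PIPE_BIND_COMPUTE_RESOURCE (1 << 20)
#define PIPE_BIND_COMMAND_BUFFER   (1 << 21)

/* Hypothetical helper: was the resource created so that the GPU command
 * processor may read draw parameters from it? */
static int can_source_indirect_draws(uint32_t bind)
{
   return (bind & PIPE_BIND_COMMAND_BUFFER) != 0;
}
```

A buffer created with PIPE_BIND_COMMAND_BUFFER | PIPE_BIND_SHADER_RESOURCE would pass this check, while one created only with PIPE_BIND_COMPUTE_RESOURCE would not.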


[Mesa-dev] [PATCH 1/4] mesa: implement GL_ARB_draw_indirect

2013-04-04 Thread Christoph Bumiller

---
 src/mapi/glapi/gen/Makefile.am   |1 +
 src/mapi/glapi/gen/gl_API.xml|4 +-
 src/mesa/drivers/dri/i965/brw_draw.c |3 +-
 src/mesa/drivers/dri/i965/brw_draw.h |3 +-
 src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c |9 +-
 src/mesa/main/api_validate.c |  159 
 src/mesa/main/api_validate.h |   26 +++
 src/mesa/main/bufferobj.c|9 +
 src/mesa/main/dd.h   |   12 ++
 src/mesa/main/dlist.c|   41 
 src/mesa/main/extensions.c   |2 +
 src/mesa/main/get.c  |5 +
 src/mesa/main/get_hash_params.py |2 +
 src/mesa/main/mtypes.h   |4 +
 src/mesa/main/tests/dispatch_sanity.cpp  |8 +-
 src/mesa/main/vtxfmt.c   |7 +
 src/mesa/state_tracker/st_cb_rasterpos.c |2 +-
 src/mesa/state_tracker/st_draw.c |3 +-
 src/mesa/state_tracker/st_draw.h |6 +-
 src/mesa/state_tracker/st_draw_feedback.c|3 +-
 src/mesa/tnl/tnl.h   |3 +-
 src/mesa/vbo/vbo.h   |5 +-
 src/mesa/vbo/vbo_exec_array.c|  255 +-
 src/mesa/vbo/vbo_exec_draw.c |2 +-
 src/mesa/vbo/vbo_primitive_restart.c |4 +-
 src/mesa/vbo/vbo_rebase.c|2 +-
 src/mesa/vbo/vbo_save_api.c  |   53 ++
 src/mesa/vbo/vbo_save_draw.c |2 +-
 src/mesa/vbo/vbo_split_copy.c|2 +-
 src/mesa/vbo/vbo_split_inplace.c |2 +-
 30 files changed, 611 insertions(+), 28 deletions(-)

diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index 36e47e2..243c148 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -96,6 +96,7 @@ API_XML = \
ARB_depth_clamp.xml \
ARB_draw_buffers_blend.xml \
ARB_draw_elements_base_vertex.xml \
+   ARB_draw_indirect.xml \
ARB_draw_instanced.xml \
ARB_ES2_compatibility.xml \
ARB_ES3_compatibility.xml \
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index df95924..f22fdac 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8240,6 +8240,8 @@
 
 
 
+http://www.w3.org/2001/XInclude"/>
+
 
   
   
@@ -8317,7 +8319,7 @@
 
 http://www.w3.org/2001/XInclude"/>
 
-
+
 
 http://www.w3.org/2001/XInclude"/>
 
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 809bcc5..d0c8415 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -548,7 +548,8 @@ void brw_draw_prims( struct gl_context *ctx,
 GLboolean index_bounds_valid,
 GLuint min_index,
 GLuint max_index,
-struct gl_transform_feedback_object *tfb_vertcount )
+struct gl_transform_feedback_object *tfb_vertcount,
+struct gl_buffer_object *indirect )
 {
struct intel_context *intel = intel_context(ctx);
const struct gl_client_array **arrays = ctx->Array._DrawArrays;
diff --git a/src/mesa/drivers/dri/i965/brw_draw.h 
b/src/mesa/drivers/dri/i965/brw_draw.h
index d86a9e7..3dfac2e 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.h
+++ b/src/mesa/drivers/dri/i965/brw_draw.h
@@ -41,7 +41,8 @@ void brw_draw_prims( struct gl_context *ctx,
 GLboolean index_bounds_valid,
 GLuint min_index,
 GLuint max_index,
-struct gl_transform_feedback_object *tfb_vertcount );
+struct gl_transform_feedback_object *tfb_vertcount,
+struct gl_buffer_object *indirect );
 
 void brw_draw_init( struct brw_context *brw );
 void brw_draw_destroy( struct brw_context *brw );
diff --git a/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c 
b/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c
index 436db32..4dee0b8 100644
--- a/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c
+++ b/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c
@@ -222,7 +222,8 @@ TAG(vbo_render_prims)(struct gl_context *ctx,
  const struct _mesa_index_buffer *ib,
  GLboolean index_bounds_valid,
  GLuint min_index, GLuint max_index,
- struct gl_transform_feedback_object *tfb_vertcount);
+ struct gl_transform_feedback_object *tfb_vertcount,
+ struct gl_buffer_object *indirect);
 
 static GLboolean
 vbo_maybe_split(struct gl_context *ctx, const struct gl_client_array **arrays,
@@ -453,7 +454,8 @@ TAG(vbo_render_prims)(struct gl_context *ctx,
  const struct _mesa_index_buffer *ib,
  GLboolean index_bounds_valid,

[Mesa-dev] [PATCH 4/4] st/mesa: add support for indirect drawing

2013-04-04 Thread Christoph Bumiller

---
 src/mesa/state_tracker/st_cb_bufferobjects.c |3 +++
 src/mesa/state_tracker/st_draw.c |   11 ++-
 src/mesa/state_tracker/st_extensions.c   |4 +++-
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_bufferobjects.c 
b/src/mesa/state_tracker/st_cb_bufferobjects.c
index 8ff32c8..5a44bf2 100644
--- a/src/mesa/state_tracker/st_cb_bufferobjects.c
+++ b/src/mesa/state_tracker/st_cb_bufferobjects.c
@@ -205,6 +205,9 @@ st_bufferobj_data(struct gl_context *ctx,
case GL_UNIFORM_BUFFER:
   bind = PIPE_BIND_CONSTANT_BUFFER;
   break;
+   case GL_DRAW_INDIRECT_BUFFER:
+  bind = PIPE_BIND_COMMAND_BUFFER;
+  break;
default:
   bind = 0;
}
diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c
index ee1c902..f1379ab 100644
--- a/src/mesa/state_tracker/st_draw.c
+++ b/src/mesa/state_tracker/st_draw.c
@@ -256,6 +256,14 @@ st_draw_vbo(struct gl_context *ctx,
   }
}
 
+   if (indirect) {
+  info.indirect = st_buffer_object(indirect)->buffer;
+
+  /* Primitive restart is not handled by the VBO module in this case. */
+  info.primitive_restart = ctx->Array._PrimitiveRestart;
+  info.restart_index = ctx->Array._RestartIndex;
+   }
+
/* do actual drawing */
for (i = 0; i < nr_prims; i++) {
   info.mode = translate_prim( ctx, prims[i].mode );
@@ -268,6 +276,7 @@ st_draw_vbo(struct gl_context *ctx,
  info.min_index = info.start;
  info.max_index = info.start + info.count - 1;
   }
+  info.indirect_offset = prims[i].indirect_offset;
 
   if (ST_DEBUG & DEBUG_DRAW) {
  debug_printf("st/draw: mode %s  start %u  count %u  indexed %d\n",
@@ -277,7 +286,7 @@ st_draw_vbo(struct gl_context *ctx,
   info.indexed);
   }
 
-  if (info.count_from_stream_output) {
+  if (info.count_from_stream_output || info.indirect) {
  cso_draw_vbo(st->cso_context, &info);
   }
   else if (info.primitive_restart) {
diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 11db9d3..c021cda 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -398,7 +398,9 @@ void st_init_extensions(struct st_context *st)
   { o(MESA_texture_array),   PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS 
},
 
   { o(OES_standard_derivatives), PIPE_CAP_SM3  
},
-  { o(ARB_texture_cube_map_array),   PIPE_CAP_CUBE_MAP_ARRAY   
}
+  { o(ARB_texture_cube_map_array),   PIPE_CAP_CUBE_MAP_ARRAY   
},
+  { o(ARB_draw_indirect),PIPE_CAP_DRAW_INDIRECT
},
+  { o(ARB_multi_draw_indirect),  PIPE_CAP_DRAW_INDIRECT
}
};
 
/* Required: render target and sampler support */
-- 
1.7.3.4


[Mesa-dev] [PATCH 3/4] gallium: add facilities for indirect drawing

2013-04-04 Thread Christoph Bumiller

---
 src/gallium/auxiliary/util/u_draw.c  |   39 ++
 src/gallium/auxiliary/util/u_draw.h  |5 +++
 src/gallium/auxiliary/util/u_dump_state.c|3 ++
 src/gallium/docs/source/screen.rst   |3 ++
 src/gallium/drivers/freedreno/freedreno_screen.c |1 +
 src/gallium/drivers/i915/i915_screen.c   |1 +
 src/gallium/drivers/llvmpipe/lp_draw_arrays.c|5 +++
 src/gallium/drivers/llvmpipe/lp_screen.c |2 +
 src/gallium/drivers/nv30/nv30_screen.c   |1 +
 src/gallium/drivers/nv50/nv50_screen.c   |2 +
 src/gallium/drivers/r300/r300_screen.c   |1 +
 src/gallium/drivers/r600/r600_pipe.c |1 +
 src/gallium/drivers/radeonsi/radeonsi_pipe.c |1 +
 src/gallium/drivers/softpipe/sp_draw_arrays.c|6 +++
 src/gallium/drivers/softpipe/sp_screen.c |2 +
 src/gallium/drivers/svga/svga_screen.c   |1 +
 src/gallium/drivers/trace/tr_dump_state.c|3 ++
 src/gallium/include/pipe/p_defines.h |3 +-
 src/gallium/include/pipe/p_state.h   |   22 
 19 files changed, 101 insertions(+), 1 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_draw.c 
b/src/gallium/auxiliary/util/u_draw.c
index 83d9284..7a28cf1 100644
--- a/src/gallium/auxiliary/util/u_draw.c
+++ b/src/gallium/auxiliary/util/u_draw.c
@@ -27,6 +27,7 @@
 
 
 #include "util/u_debug.h"
+#include "util/u_inlines.h"
 #include "util/u_math.h"
 #include "util/u_format.h"
 #include "util/u_draw.h"
@@ -123,3 +124,41 @@ util_draw_max_index(
 
return max_index + 1;
 }
+
+
+void
+util_draw_indirect(struct pipe_context *pipe,
+   const struct pipe_draw_info *_info)
+{
+   struct pipe_draw_info info;
+   struct pipe_transfer *transfer;
+   uint32_t *params;
+
+   assert(_info->indirect);
+   assert(!_info->count_from_stream_output);
+
+   memcpy(&info, _info, sizeof(info));
+
+   params = (uint32_t *)
+  pipe_buffer_map_range(pipe,
+_info->indirect,
+_info->indirect_offset,
+_info->indexed ? (4 * 4) : (3 * 4),
+PIPE_TRANSFER_READ,
+&transfer);
+   if (!transfer) {
+  debug_printf("%s: failed to map indirect buffer\n", __FUNCTION__);
+  return;
+   }
+
+   info.count = params[0];
+   info.instance_count = params[1];
+   info.start = params[2];
+   info.index_bias = _info->indexed ? params[3] : 0;
+   info.start_instance = _info->indexed ? params[4] : params[3];
+   info.indirect = NULL;
+
+   pipe_buffer_unmap(pipe, transfer);
+
+   pipe->draw_vbo(pipe, &info);
+}
diff --git a/src/gallium/auxiliary/util/u_draw.h 
b/src/gallium/auxiliary/util/u_draw.h
index 3dc6918..acec56e 100644
--- a/src/gallium/auxiliary/util/u_draw.h
+++ b/src/gallium/auxiliary/util/u_draw.h
@@ -142,6 +142,11 @@ util_draw_range_elements(struct pipe_context *pipe,
 }
 
 
+void
+util_draw_indirect(struct pipe_context *pipe,
+   const struct pipe_draw_info *info);
+
+
 unsigned
 util_draw_max_index(
   const struct pipe_vertex_buffer *vertex_buffers,
diff --git a/src/gallium/auxiliary/util/u_dump_state.c 
b/src/gallium/auxiliary/util/u_dump_state.c
index 2f28f3c..21b6044 100644
--- a/src/gallium/auxiliary/util/u_dump_state.c
+++ b/src/gallium/auxiliary/util/u_dump_state.c
@@ -758,6 +758,9 @@ util_dump_draw_info(FILE *stream, const struct 
pipe_draw_info *state)
 
util_dump_member(stream, ptr, state, count_from_stream_output);
 
+   util_dump_member(stream, ptr, state, indirect);
+   util_dump_member(stream, uint, state, indirect_offset);
+
util_dump_struct_end(stream);
 }
 
diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index f8cdded..ed4749d 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -151,6 +151,9 @@ The integer capabilities:
   dedicated memory should return 1 and all software rasterizers should return 
0.
 * ``PIPE_CAP_QUERY_PIPELINE_STATISTICS``: Whether 
PIPE_QUERY_PIPELINE_STATISTICS
   is supported.
+* ``PIPE_CAP_DRAW_INDIRECT``: Whether the driver supports taking draw arguments
+  { count, instance_count, start, index_bias } from a PIPE_BUFFER resource.
+  See pipe_draw_info.
 
 
 .. _pipe_capf:
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index 283d07f..2b13e29 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -200,6 +200,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_USER_VERTEX_BUFFERS:
case PIPE_CAP_USER_INDEX_BUFFERS:
case PIPE_CAP_QUERY_PIPELINE_STATISTICS:
+   case PIPE_CAP_DRAW_INDIRECT:
return 0;
 
/* Stream output. */
diff --git a/src/gallium/drivers/i915/i91

[Mesa-dev] [PATCH] mesa: implement GL_ARB_draw_indirect (added missing ARB_draw_indirect.xml)

2013-04-04 Thread Christoph Bumiller

---
 src/mapi/glapi/gen/ARB_draw_indirect.xml |   45 +
 src/mapi/glapi/gen/Makefile.am   |1 +
 src/mapi/glapi/gen/gl_API.xml|4 +-
 src/mesa/drivers/dri/i965/brw_draw.c |3 +-
 src/mesa/drivers/dri/i965/brw_draw.h |3 +-
 src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c |9 +-
 src/mesa/main/api_validate.c |  159 
 src/mesa/main/api_validate.h |   26 +++
 src/mesa/main/bufferobj.c|9 +
 src/mesa/main/dd.h   |   12 ++
 src/mesa/main/dlist.c|   41 
 src/mesa/main/extensions.c   |2 +
 src/mesa/main/get.c  |5 +
 src/mesa/main/get_hash_params.py |2 +
 src/mesa/main/mtypes.h   |4 +
 src/mesa/main/tests/dispatch_sanity.cpp  |8 +-
 src/mesa/main/vtxfmt.c   |7 +
 src/mesa/state_tracker/st_cb_rasterpos.c |2 +-
 src/mesa/state_tracker/st_draw.c |3 +-
 src/mesa/state_tracker/st_draw.h |6 +-
 src/mesa/state_tracker/st_draw_feedback.c|3 +-
 src/mesa/tnl/tnl.h   |3 +-
 src/mesa/vbo/vbo.h   |5 +-
 src/mesa/vbo/vbo_exec_array.c|  255 +-
 src/mesa/vbo/vbo_exec_draw.c |2 +-
 src/mesa/vbo/vbo_primitive_restart.c |4 +-
 src/mesa/vbo/vbo_rebase.c|2 +-
 src/mesa/vbo/vbo_save_api.c  |   53 ++
 src/mesa/vbo/vbo_save_draw.c |2 +-
 src/mesa/vbo/vbo_split_copy.c|2 +-
 src/mesa/vbo/vbo_split_inplace.c |2 +-
 31 files changed, 656 insertions(+), 28 deletions(-)
 create mode 100644 src/mapi/glapi/gen/ARB_draw_indirect.xml

diff --git a/src/mapi/glapi/gen/ARB_draw_indirect.xml 
b/src/mapi/glapi/gen/ARB_draw_indirect.xml
new file mode 100644
index 000..7de03cd
--- /dev/null
+++ b/src/mapi/glapi/gen/ARB_draw_indirect.xml
@@ -0,0 +1,45 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index 36e47e2..243c148 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -96,6 +96,7 @@ API_XML = \
ARB_depth_clamp.xml \
ARB_draw_buffers_blend.xml \
ARB_draw_elements_base_vertex.xml \
+   ARB_draw_indirect.xml \
ARB_draw_instanced.xml \
ARB_ES2_compatibility.xml \
ARB_ES3_compatibility.xml \
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index df95924..f22fdac 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8240,6 +8240,8 @@
 
 
 
+http://www.w3.org/2001/XInclude"/>
+
 
   
   
@@ -8317,7 +8319,7 @@
 
 http://www.w3.org/2001/XInclude"/>
 
-
+
 
 http://www.w3.org/2001/XInclude"/>
 
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 809bcc5..d0c8415 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -548,7 +548,8 @@ void brw_draw_prims( struct gl_context *ctx,
 GLboolean index_bounds_valid,
 GLuint min_index,
 GLuint max_index,
-struct gl_transform_feedback_object *tfb_vertcount )
+struct gl_transform_feedback_object *tfb_vertcount,
+struct gl_buffer_object *indirect )
 {
struct intel_context *intel = intel_context(ctx);
const struct gl_client_array **arrays = ctx->Array._DrawArrays;
diff --git a/src/mesa/drivers/dri/i965/brw_draw.h 
b/src/mesa/drivers/dri/i965/brw_draw.h
index d86a9e7..3dfac2e 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.h
+++ b/src/mesa/drivers/dri/i965/brw_draw.h
@@ -41,7 +41,8 @@ void brw_draw_prims( struct gl_context *ctx,
 GLboolean index_bounds_valid,
 GLuint min_index,
 GLuint max_index,
-struct gl_transform_feedback_object *tfb_vertcount );
+struct gl_transform_feedback_object *tfb_vertcount,
+struct gl_buffer_object *indirect );
 
 void brw_draw_init( struct brw_context *brw );
 void brw_draw_destroy( struct brw_context *brw );
diff --git a/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c 
b/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c
index 436db32..4dee0b8 100644
--- a/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c
+++ b/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c
@@ -222,7 +222,8 @@ TAG(vbo_render_prims)(struct gl_context *ctx,
  const struct _mesa_index_buffer *ib,

Re: [Mesa-dev] [PATCH] glsl: Add an optimization pass to flatten simple nested if blocks.

2013-04-04 Thread Kenneth Graunke


On 04/04/2013 09:08 AM, Eric Anholt wrote:

Kenneth Graunke  writes:

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index c294aa4..b5282a6 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -80,6 +80,7 @@ LIBGLSL_FILES = \
$(GLSL_SRCDIR)/opt_dead_code.cpp \
$(GLSL_SRCDIR)/opt_dead_code_local.cpp \
$(GLSL_SRCDIR)/opt_dead_functions.cpp \
+   $(GLSL_SRCDIR)/opt_flatten_nested_if_blocks.cpp \
$(GLSL_SRCDIR)/opt_function_inlining.cpp \
$(GLSL_SRCDIR)/opt_if_simplification.cpp \
$(GLSL_SRCDIR)/opt_noop_swizzle.cpp \
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 9740903..0992294 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -1218,6 +1218,7 @@ do_common_optimization(exec_list *ir, bool linked,
progress = do_structure_splitting(ir) || progress;
 }
 progress = do_if_simplification(ir) || progress;
+   progress = opt_flatten_nested_if_blocks(ir) || progress;
 progress = do_copy_propagation(ir) || progress;
 progress = do_copy_propagation_elements(ir) || progress;
 if (linked)
diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
index 2454bbe..a8885d7 100644
--- a/src/glsl/ir_optimization.h
+++ b/src/glsl/ir_optimization.h
@@ -82,6 +82,7 @@ bool do_function_inlining(exec_list *instructions);
  bool do_lower_jumps(exec_list *instructions, bool pull_out_jumps = true, bool 
lower_sub_return = true, bool lower_main_return = false, bool lower_continue = 
false, bool lower_break = false);
  bool do_lower_texture_projection(exec_list *instructions);
  bool do_if_simplification(exec_list *instructions);
+bool opt_flatten_nested_if_blocks(exec_list *instructions);
  bool do_discard_simplification(exec_list *instructions);
  bool lower_if_to_cond_assign(exec_list *instructions, unsigned max_depth = 0);
  bool do_mat_op_to_vec(exec_list *instructions);
diff --git a/src/glsl/opt_flatten_nested_if_blocks.cpp 
b/src/glsl/opt_flatten_nested_if_blocks.cpp
new file mode 100644
index 000..c702102
--- /dev/null
+++ b/src/glsl/opt_flatten_nested_if_blocks.cpp
@@ -0,0 +1,103 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+/**
+ * \file opt_flatten_nested_if_blocks.cpp
+ *
+ * Flattens nested if blocks such as:
+ *
+ * if (x) {
+ *if (y) {
+ *   ...
+ *}
+ * }
+ *
+ * into a single if block with a combined condition:
+ *
+ * if (x && y) {
+ *...
+ * }
+ */
+
+#include "ir.h"
+#include "ir_builder.h"
+
+using namespace ir_builder;
+
+namespace {
+
+class nested_if_flattener : public ir_hierarchical_visitor {
+public:
+   nested_if_flattener()
+   {
+  progress = false;
+   }
+
+   ir_visitor_status visit_leave(ir_if *);
+   ir_visitor_status visit_enter(ir_assignment *);
+
+   bool progress;
+};
+
+} /* unnamed namespace */
+
+/* We only care about the top level "if" instructions, so don't
+ * descend into expressions.
+ */
+ir_visitor_status
+nested_if_flattener::visit_enter(ir_assignment *ir)
+{
+   (void) ir;
+   return visit_continue_with_parent;
+}
+
+bool
+opt_flatten_nested_if_blocks(exec_list *instructions)
+{
+   nested_if_flattener v;
+
+   v.run(instructions);
+   return v.progress;
+}
+
+
+ir_visitor_status
+nested_if_flattener::visit_leave(ir_if *ir)
+{
+   /* Only handle a single ir_if within the then clause of an ir_if.  No extra
+* instructions, no else clauses, nothing.
+*/
+   if (ir->then_instructions.is_empty() || !ir->else_instructions.is_empty())
+  return visit_continue;
+
+   ir_if *inner = ((ir_instruction *) ir->then_instructions.head)->as_if();
+   if (!inner || !inner->next->is_tail_sentinel() ||
+   !inner->else_instructions.is_empty())
+  return visit_continue;
+
+   ir->condition = logic_and(ir->condition, inner->condition);
+   inner->t

Re: [Mesa-dev] [PATCH 2/2] glsl: Add a pass to flip matrix/vector multiplies to use dot products.

2013-04-04 Thread Kenneth Graunke


On 04/04/2013 08:13 AM, Paul Berry wrote:

On 2 April 2013 23:33, Kenneth Graunke  wrote:

[snip]

diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp
index ce084b4..13dfdd3 100644
--- a/src/glsl/main.cpp
+++ b/src/glsl/main.cpp
@@ -176,7 +176,7 @@ compile_shader(struct gl_context *ctx, struct
gl_shader *shader)
 if (!state->error && !shader->ir->is_empty()) {
bool progress;
do {
-progress = do_common_optimization(shader->ir, false, false,
32);
+progress = do_common_optimization(shader->ir, false, false,
32, false);


What's the reason for passing false in this case?  It seems like we
ought to pass ctx->mvp_with_dp4 in all cases.


Fair enough.  For the standalone compiler, I just picked something 
rather arbitrarily.  ctx->mvp_with_dp4 is false for now.



For that matter, I'm curious why we don't just check the value of
ctx->mvp_with_dp4 from inside do_common_optimization()--it seems like
that would be easier to maintain.


It doesn't currently have access to gl_context.  I could instead pass 
that...or move this flag inside ctx->ShaderCompilerOptions and pass a 
const pointer to that instead.  Preferences?



With that question addressed, this series is:

Reviewed-by: Paul Berry 


Thanks Paul!

Re: [Mesa-dev] [PATCH 2/2] glsl: Add a pass to flip matrix/vector multiplies to use dot products.

2013-04-04 Thread Paul Berry

On 4 April 2013 12:13, Kenneth Graunke  wrote:

> On 04/04/2013 08:13 AM, Paul Berry wrote:
>
>> On 2 April 2013 23:33, Kenneth Graunke  wrote:
>>
> [snip]
>
>  diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp
>> index ce084b4..13dfdd3 100644
>> --- a/src/glsl/main.cpp
>> +++ b/src/glsl/main.cpp
>> @@ -176,7 +176,7 @@ compile_shader(struct gl_context *ctx, struct
>> gl_shader *shader)
>>  if (!state->error && !shader->ir->is_empty()) {
>> bool progress;
>> do {
>> -progress = do_common_optimization(shader->ir, false,
>> false,
>> 32);
>> +progress = do_common_optimization(shader->ir, false,
>> false,
>> 32, false);
>>
>>
>> What's the reason for passing false in this case?  It seems like we
>> ought to pass ctx->mvp_with_dp4 in all cases.
>>
>
> Fair enough.  For the standalone compiler, I just picked something rather
> arbitrarily.  ctx->mvp_with_dp4 is false for now.


Ah! I got confused and didn't realize that this call site was in the
standalone compiler.  Makes sense.


>
>
>  For that matter, I'm curious why we don't just check the value of
>> ctx->mvp_with_dp4 from inside do_common_optimization()--it seems like
>> that would be easier to maintain.
>>
>
> It doesn't currently have access to gl_context.  I could instead pass
> that...or move this flag inside ctx->ShaderCompilerOptions and pass a const
> pointer to that instead.  Preferences?


I like the idea of moving the flag inside ctx->ShaderCompilerOptions and
passing a const pointer to that.  Assuming it's not too much trouble.
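
To make the suggestion concrete, here is a minimal sketch of the "pass a const pointer to the options" shape. It is only an illustration under assumptions: the struct and function names below are made up for this example and are not the real Mesa declarations, and the pass count merely stands in for the real optimization loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the compiler options struct; only the
 * flag discussed in this thread is modelled. */
struct example_compiler_options {
   bool mvp_with_dp4;
};

/* Hypothetical stand-in for the common optimization entry point.
 * It receives the options by const pointer instead of a growing list
 * of bool arguments, and returns how many passes it would run. */
static unsigned
example_common_optimization(const struct example_compiler_options *options)
{
   unsigned passes = 1;   /* the passes that always run */

   assert(options != NULL);

   /* The matrix/vector flip pass is the only one gated on the flag. */
   if (options->mvp_with_dp4)
      passes++;

   return passes;
}
```

With this shape the standalone compiler could pass a zero-initialized options struct, and a new flag would not change any call site.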


>
>
>  With that question addressed, this series is:
>>
>> Reviewed-by: Paul Berry 
>>
>
> Thanks Paul!
>

Re: [Mesa-dev] [PATCH 1/4] mesa: implement GL_ARB_draw_indirect

2013-04-04 Thread Brian Paul


I just did a quick skim and found a few minor things.

First, the subject might be "mesa: implement GL_ARB_draw_indirect and 
GL_ARB_multi_draw_indirect"


This is a big patch and I think it could have been broken down into 
smaller pieces, but I know it's a PITA to redo.  Next time.



On 04/04/2013 12:18 PM, Christoph Bumiller wrote:

---
  src/mapi/glapi/gen/Makefile.am   |1 +
  src/mapi/glapi/gen/gl_API.xml|4 +-
  src/mesa/drivers/dri/i965/brw_draw.c |3 +-
  src/mesa/drivers/dri/i965/brw_draw.h |3 +-
  src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c |9 +-
  src/mesa/main/api_validate.c |  159 
  src/mesa/main/api_validate.h |   26 +++
  src/mesa/main/bufferobj.c|9 +
  src/mesa/main/dd.h   |   12 ++
  src/mesa/main/dlist.c|   41 
  src/mesa/main/extensions.c   |2 +
  src/mesa/main/get.c  |5 +
  src/mesa/main/get_hash_params.py |2 +
  src/mesa/main/mtypes.h   |4 +
  src/mesa/main/tests/dispatch_sanity.cpp  |8 +-
  src/mesa/main/vtxfmt.c   |7 +
  src/mesa/state_tracker/st_cb_rasterpos.c |2 +-
  src/mesa/state_tracker/st_draw.c |3 +-
  src/mesa/state_tracker/st_draw.h |6 +-
  src/mesa/state_tracker/st_draw_feedback.c|3 +-
  src/mesa/tnl/tnl.h   |3 +-
  src/mesa/vbo/vbo.h   |5 +-
  src/mesa/vbo/vbo_exec_array.c|  255 +-
  src/mesa/vbo/vbo_exec_draw.c |2 +-
  src/mesa/vbo/vbo_primitive_restart.c |4 +-
  src/mesa/vbo/vbo_rebase.c|2 +-
  src/mesa/vbo/vbo_save_api.c  |   53 ++
  src/mesa/vbo/vbo_save_draw.c |2 +-
  src/mesa/vbo/vbo_split_copy.c|2 +-
  src/mesa/vbo/vbo_split_inplace.c |2 +-
  30 files changed, 611 insertions(+), 28 deletions(-)

diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index 36e47e2..243c148 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -96,6 +96,7 @@ API_XML = \
ARB_depth_clamp.xml \
ARB_draw_buffers_blend.xml \
ARB_draw_elements_base_vertex.xml \
+   ARB_draw_indirect.xml \
ARB_draw_instanced.xml \
ARB_ES2_compatibility.xml \
ARB_ES3_compatibility.xml \
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index df95924..f22fdac 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8240,6 +8240,8 @@

  

+http://www.w3.org/2001/XInclude"/>
+
  


@@ -8317,7 +8319,7 @@

  http://www.w3.org/2001/XInclude"/>

-
+

  http://www.w3.org/2001/XInclude"/>

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 809bcc5..d0c8415 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -548,7 +548,8 @@ void brw_draw_prims( struct gl_context *ctx,
 GLboolean index_bounds_valid,
 GLuint min_index,
 GLuint max_index,
-struct gl_transform_feedback_object *tfb_vertcount )
+struct gl_transform_feedback_object *tfb_vertcount,
+struct gl_buffer_object *indirect )
  {
 struct intel_context *intel = intel_context(ctx);
 const struct gl_client_array **arrays = ctx->Array._DrawArrays;
diff --git a/src/mesa/drivers/dri/i965/brw_draw.h 
b/src/mesa/drivers/dri/i965/brw_draw.h
index d86a9e7..3dfac2e 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.h
+++ b/src/mesa/drivers/dri/i965/brw_draw.h
@@ -41,7 +41,8 @@ void brw_draw_prims( struct gl_context *ctx,
 GLboolean index_bounds_valid,
 GLuint min_index,
 GLuint max_index,
-struct gl_transform_feedback_object *tfb_vertcount );
+struct gl_transform_feedback_object *tfb_vertcount,
+struct gl_buffer_object *indirect );

  void brw_draw_init( struct brw_context *brw );
  void brw_draw_destroy( struct brw_context *brw );
diff --git a/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c 
b/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c
index 436db32..4dee0b8 100644
--- a/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c
+++ b/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c
@@ -222,7 +222,8 @@ TAG(vbo_render_prims)(struct gl_context *ctx,
  const struct _mesa_index_buffer *ib,
  GLboolean index_bounds_valid,
  GLuint min_index, GLuint max_index,
- struct gl_transform_feedback_object *tfb_vertcount);
+ struct gl_transform_

Re: [Mesa-dev] [PATCH 3/4] gallium: add facilities for indirect drawing

2013-04-04 Thread Brian Paul


On 04/04/2013 12:18 PM, Christoph Bumiller wrote:

---
  src/gallium/auxiliary/util/u_draw.c  |   39 ++
  src/gallium/auxiliary/util/u_draw.h  |5 +++
  src/gallium/auxiliary/util/u_dump_state.c|3 ++
  src/gallium/docs/source/screen.rst   |3 ++
  src/gallium/drivers/freedreno/freedreno_screen.c |1 +
  src/gallium/drivers/i915/i915_screen.c   |1 +
  src/gallium/drivers/llvmpipe/lp_draw_arrays.c|5 +++
  src/gallium/drivers/llvmpipe/lp_screen.c |2 +
  src/gallium/drivers/nv30/nv30_screen.c   |1 +
  src/gallium/drivers/nv50/nv50_screen.c   |2 +
  src/gallium/drivers/r300/r300_screen.c   |1 +
  src/gallium/drivers/r600/r600_pipe.c |1 +
  src/gallium/drivers/radeonsi/radeonsi_pipe.c |1 +
  src/gallium/drivers/softpipe/sp_draw_arrays.c|6 +++
  src/gallium/drivers/softpipe/sp_screen.c |2 +
  src/gallium/drivers/svga/svga_screen.c   |1 +
  src/gallium/drivers/trace/tr_dump_state.c|3 ++
  src/gallium/include/pipe/p_defines.h |3 +-
  src/gallium/include/pipe/p_state.h   |   22 
  19 files changed, 101 insertions(+), 1 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_draw.c 
b/src/gallium/auxiliary/util/u_draw.c
index 83d9284..7a28cf1 100644
--- a/src/gallium/auxiliary/util/u_draw.c
+++ b/src/gallium/auxiliary/util/u_draw.c
@@ -27,6 +27,7 @@


  #include "util/u_debug.h"
+#include "util/u_inlines.h"
  #include "util/u_math.h"
  #include "util/u_format.h"
  #include "util/u_draw.h"
@@ -123,3 +124,41 @@ util_draw_max_index(

 return max_index + 1;
  }
+
+



Could you put a comment on this function to explain exactly what it's 
doing?  E.g. convert an "indirect" draw into a "direct" draw by 
mapping the parameter buffer, etc.
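
For what such a comment might describe, here is a minimal standalone sketch of the indirect-to-direct conversion. The struct and function names are invented for this example and are not gallium API; the field order is assumed to follow the command layouts given in the ARB_draw_indirect and ARB_base_instance specs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the non-indexed indirect command: 4 words. */
struct example_draw_arrays_cmd {
   uint32_t count;
   uint32_t instance_count;
   uint32_t first;
   uint32_t base_instance;
};

/* Hypothetical mirror of the indexed indirect command: 5 words. */
struct example_draw_elements_cmd {
   uint32_t count;
   uint32_t instance_count;
   uint32_t first_index;
   int32_t  base_vertex;
   uint32_t base_instance;
};

/* Stand-in for the few pipe_draw_info fields that matter here. */
struct example_direct_draw {
   uint32_t count;
   uint32_t instance_count;
   uint32_t start;
   int32_t  index_bias;
   uint32_t start_instance;
};

/* Number of bytes that must be readable at the indirect offset. */
static size_t
example_indirect_cmd_size(bool indexed)
{
   return indexed ? sizeof(struct example_draw_elements_cmd)
                  : sizeof(struct example_draw_arrays_cmd);
}

/* Turn the raw words of an indirect command into direct draw parameters. */
static struct example_direct_draw
example_unpack_indirect(const uint32_t *params, bool indexed)
{
   struct example_direct_draw d;

   assert(params != NULL);

   d.count = params[0];
   d.instance_count = params[1];
   d.start = params[2];
   d.index_bias = indexed ? (int32_t) params[3] : 0;
   d.start_instance = indexed ? params[4] : params[3];
   return d;
}
```

Under these assumed layouts, the range that has to be mapped is 16 bytes for the non-indexed command and 20 bytes for the indexed one, since the last word read is params[3] and params[4] respectively.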




+void
+util_draw_indirect(struct pipe_context *pipe,
+   const struct pipe_draw_info *_info)


I know there's other instances of this, but params with leading 
underscores look kind of weird.  Maybe "infoIn"?




+{
+   struct pipe_draw_info info;
+   struct pipe_transfer *transfer;
+   uint32_t *params;
+
+   assert(_info->indirect);
+   assert(!_info->count_from_stream_output);
+
+   memcpy(&info, _info, sizeof(info));
+
+   params = (uint32_t *)
+  pipe_buffer_map_range(pipe,
+_info->indirect,
+_info->indirect_offset,
+_info->indexed ? (4 * 4) : (3 * 4),


Document those magic numbers?



+PIPE_TRANSFER_READ,
+&transfer);
+   if (!transfer) {
+  debug_printf("%s: failed to map indirect buffer\n", __FUNCTION__);
+  return;
+   }
+
+   info.count = params[0];
+   info.instance_count = params[1];
+   info.start = params[2];
+   info.index_bias = _info->indexed ? params[3] : 0;
+   info.start_instance = _info->indexed ? params[4] : params[3];
+   info.indirect = NULL;
+
+   pipe_buffer_unmap(pipe, transfer);
+
+   pipe->draw_vbo(pipe, &info);
+}
diff --git a/src/gallium/auxiliary/util/u_draw.h 
b/src/gallium/auxiliary/util/u_draw.h
index 3dc6918..acec56e 100644
--- a/src/gallium/auxiliary/util/u_draw.h
+++ b/src/gallium/auxiliary/util/u_draw.h
@@ -142,6 +142,11 @@ util_draw_range_elements(struct pipe_context *pipe,
  }


+void
+util_draw_indirect(struct pipe_context *pipe,
+   const struct pipe_draw_info *info);
+
+
  unsigned
  util_draw_max_index(
const struct pipe_vertex_buffer *vertex_buffers,
diff --git a/src/gallium/auxiliary/util/u_dump_state.c 
b/src/gallium/auxiliary/util/u_dump_state.c
index 2f28f3c..21b6044 100644
--- a/src/gallium/auxiliary/util/u_dump_state.c
+++ b/src/gallium/auxiliary/util/u_dump_state.c
@@ -758,6 +758,9 @@ util_dump_draw_info(FILE *stream, const struct 
pipe_draw_info *state)

 util_dump_member(stream, ptr, state, count_from_stream_output);

+   util_dump_member(stream, ptr, state, indirect);
+   util_dump_member(stream, uint, state, indirect_offset);
+
 util_dump_struct_end(stream);
  }

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index f8cdded..ed4749d 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -151,6 +151,9 @@ The integer capabilities:
dedicated memory should return 1 and all software rasterizers should return 
0.
  * ``PIPE_CAP_QUERY_PIPELINE_STATISTICS``: Whether 
PIPE_QUERY_PIPELINE_STATISTICS
is supported.
+* ``PIPE_CAP_DRAW_INDIRECT``: Whether the driver supports taking draw arguments
+  { count, instance_count, start, index_bias } from a PIPE_BUFFER resource.
+  See pipe_draw_info.


  .. _pipe_capf:
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index 283d07f..2b13e29 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedr

Re: [Mesa-dev] [PATCH 4/4] st/mesa: add support for indirect drawing

2013-04-04 Thread Brian Paul


On 04/04/2013 12:18 PM, Christoph Bumiller wrote:

---
  src/mesa/state_tracker/st_cb_bufferobjects.c |3 +++
  src/mesa/state_tracker/st_draw.c |   11 ++-
  src/mesa/state_tracker/st_extensions.c   |4 +++-
  3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_bufferobjects.c 
b/src/mesa/state_tracker/st_cb_bufferobjects.c
index 8ff32c8..5a44bf2 100644
--- a/src/mesa/state_tracker/st_cb_bufferobjects.c
+++ b/src/mesa/state_tracker/st_cb_bufferobjects.c
@@ -205,6 +205,9 @@ st_bufferobj_data(struct gl_context *ctx,
 case GL_UNIFORM_BUFFER:
bind = PIPE_BIND_CONSTANT_BUFFER;
break;
+   case GL_DRAW_INDIRECT_BUFFER:
+  bind = PIPE_BIND_COMMAND_BUFFER;
+  break;
 default:
bind = 0;
 }
diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c
index ee1c902..f1379ab 100644
--- a/src/mesa/state_tracker/st_draw.c
+++ b/src/mesa/state_tracker/st_draw.c
@@ -256,6 +256,14 @@ st_draw_vbo(struct gl_context *ctx,
}
 }

+   if (indirect) {
+  info.indirect = st_buffer_object(indirect)->buffer;
+
+  /* Primitive restart is not handled by the VBO module in this case. */
+  info.primitive_restart = ctx->Array._PrimitiveRestart;
+  info.restart_index = ctx->Array._RestartIndex;
+   }
+
 /* do actual drawing */
 for (i = 0; i < nr_prims; i++) {
info.mode = translate_prim( ctx, prims[i].mode );
@@ -268,6 +276,7 @@ st_draw_vbo(struct gl_context *ctx,
   info.min_index = info.start;
   info.max_index = info.start + info.count - 1;
}
+  info.indirect_offset = prims[i].indirect_offset;

if (ST_DEBUG & DEBUG_DRAW) {
   debug_printf("st/draw: mode %s  start %u  count %u  indexed %d\n",
@@ -277,7 +286,7 @@ st_draw_vbo(struct gl_context *ctx,
info.indexed);
}

-  if (info.count_from_stream_output) {
+  if (info.count_from_stream_output || info.indirect) {
   cso_draw_vbo(st->cso_context, &info);
}
else if (info.primitive_restart) {
diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 11db9d3..c021cda 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -398,7 +398,9 @@ void st_init_extensions(struct st_context *st)
{ o(MESA_texture_array),   
PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS },

{ o(OES_standard_derivatives), PIPE_CAP_SM3 
 },
-  { o(ARB_texture_cube_map_array),   PIPE_CAP_CUBE_MAP_ARRAY   
}
+  { o(ARB_texture_cube_map_array),   PIPE_CAP_CUBE_MAP_ARRAY   
},
+  { o(ARB_draw_indirect),PIPE_CAP_DRAW_INDIRECT
},
+  { o(ARB_multi_draw_indirect),  PIPE_CAP_DRAW_INDIRECT
}
 };

 /* Required: render target and sampler support */


Reviewed-by: Brian Paul 

Re: [Mesa-dev] [PATCH] pipe-loader: Fix out of source build

2013-04-04 Thread Niels Ole Salscheider

On Sunday, 24 February 2013, 15:02:33, Matt Turner wrote:
> On Sun, Feb 24, 2013 at 2:00 PM, Niels Ole Salscheider
> 
>  wrote:
> > Signed-off-by: Niels Ole Salscheider 
> > ---
> > 
> >  src/gallium/targets/opencl/Makefile.am | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/gallium/targets/opencl/Makefile.am
> > b/src/gallium/targets/opencl/Makefile.am index c5c3003..709112f 100644
> > --- a/src/gallium/targets/opencl/Makefile.am
> > +++ b/src/gallium/targets/opencl/Makefile.am
> > @@ -32,11 +32,11 @@ libOpenCL_la_SOURCES =
> > 
> >  # Force usage of a C++ linker
> >  nodist_EXTRA_libOpenCL_la_SOURCES = dummy.cpp
> > 
> > -PIPE_SRC_DIR = $(top_srcdir)/src/gallium/targets/pipe-loader
> > +PIPE_BUILD_DIR = $(top_builddir)/src/gallium/targets/pipe-loader
> > 
> >  # Provide compatibility with scripts for the old Mesa build system for
> >  # a while by putting a link to the driver into /lib of the build tree.
> >  all-local: libOpenCL.la
> > 
> > -   @$(MAKE) -C $(PIPE_SRC_DIR)
> > +   @$(MAKE) -C $(PIPE_BUILD_DIR)
> > 
> > $(MKDIR_P) $(top_builddir)/$(LIB_DIR)
> > ln -f .libs/libOpenCL.so* $(top_builddir)/$(LIB_DIR)/
> > 
> > --
> > 1.8.1.3
> 
> I think I've fixed this in a different way (that doesn't involve
> calling $(MAKE)) in this branch:
> http://cgit.freedesktop.org/~mattst88/mesa/log/?h=make-dist

Do you intend to merge this branch in the foreseeable future?

[Mesa-dev] [PATCH] clover: Fix linkage of libOpenCL

2013-04-04 Thread Niels Ole Salscheider

Clover needs the irreader component of llvm
---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 81d4a3f..bfba1b3 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1649,7 +1649,7 @@ if test "x$enable_gallium_llvm" = xyes; then
 fi
 
 if test "x$enable_opencl" = xyes; then
-LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
+LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo irreader linker 
instrumentation"
 fi
LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags`
LLVM_BINDIR=`$LLVM_CONFIG --bindir`
-- 
1.8.2

[Mesa-dev] [PATCH] gallivm: some minor cube map cleanup

2013-04-04 Thread sroland

From: Roland Scheidegger 

The ar_ge_as_at variable was just very very confusing since the condition
was actually the other way around (as_at_ge_ar). So change the condition
(and the selects depending on it) to match the variable name.
Also, while here, change the chosen major axis in case the coord values
are the same. OpenGL doesn't care one bit which one is chosen in this
case but it looks like dx10 would require z chosen over y, and y chosen
over x (previously did x chosen over y, y chosen over z). Since it's all
the same effort just honor dx10's wishes.
---
 src/gallium/auxiliary/gallivm/lp_bld_sample.c |   21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c 
b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
index fe29d25..734cfe0 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
@@ -1403,12 +1403,13 @@ lp_build_cube_lookup(struct lp_build_sample_context 
*bld,
   signr = LLVMBuildAnd(builder, ri, signmask, "");
 
   /*
-   * major face determination: select x if x >= y else select y
-   * select previous result if y >= max(x,y) else select z
+   * major face determination: select x if x > y else select y
+   * select z if z >= max(x,y) else select previous result
+   * if some axis are the same we chose z over y, y over x.
*/
-  as_ge_at = lp_build_cmp(coord_bld, PIPE_FUNC_GEQUAL, as, at);
+  as_ge_at = lp_build_cmp(coord_bld, PIPE_FUNC_GREATER, as, at);
   maxasat = lp_build_max(coord_bld, as, at);
-  ar_ge_as_at = lp_build_cmp(coord_bld, PIPE_FUNC_GEQUAL, maxasat, ar);
+  ar_ge_as_at = lp_build_cmp(coord_bld, PIPE_FUNC_GEQUAL, ar, maxasat);
 
   /*
* compute all possible new s/t coords
@@ -1449,13 +1450,13 @@ lp_build_cube_lookup(struct lp_build_sample_context 
*bld,
  dmaxtnew = lp_build_select(coord_bld, as_ge_at, dmax[1], dmax[2]);
   }
 
-  *face_s = lp_build_select(cint_bld, ar_ge_as_at, *face_s, snewz);
-  *face_t = lp_build_select(cint_bld, ar_ge_as_at, *face_t, tnewz);
-  ma = lp_build_select(coord_bld, ar_ge_as_at, ma, r);
-  *face = lp_build_select(cint_bld, ar_ge_as_at, *face, facez);
+  *face_s = lp_build_select(cint_bld, ar_ge_as_at, snewz, *face_s);
+  *face_t = lp_build_select(cint_bld, ar_ge_as_at, tnewz, *face_t);
+  ma = lp_build_select(coord_bld, ar_ge_as_at, r, ma);
+  *face = lp_build_select(cint_bld, ar_ge_as_at, facez, *face);
   if (need_derivs) {
- dmaxsnew = lp_build_select(coord_bld, ar_ge_as_at, dmaxsnew, dmax[0]);
- dmaxtnew = lp_build_select(coord_bld, ar_ge_as_at, dmaxtnew, dmax[1]);
+ dmaxsnew = lp_build_select(coord_bld, ar_ge_as_at, dmax[0], dmaxsnew);
+ dmaxtnew = lp_build_select(coord_bld, ar_ge_as_at, dmax[1], dmaxtnew);
   }
 
   *face_s = LLVMBuildBitCast(builder, *face_s,
-- 
1.7.9.5
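
For readers following the tie-breaking change, the major axis selection from the commit message can be sketched in scalar C. This is only an illustration under assumptions: the enum and the function name are hypothetical, and the real code emits vectorized LLVM IR through lp_build_cmp/lp_build_select instead of branching.

```c
#include <math.h>

/* Hypothetical axis labels, used only in this sketch. */
enum major_axis { AXIS_X = 0, AXIS_Y = 1, AXIS_Z = 2 };

/*
 * Pick the major axis of a cube map coordinate (s, t, r).
 * Ties resolve as z over y, y over x, which is the order the
 * commit message attributes to dx10.
 */
static enum major_axis
select_major_axis(float s, float t, float r)
{
   float as = fabsf(s);
   float at = fabsf(t);
   float ar = fabsf(r);

   /* x beats y only when strictly greater, so y wins an x/y tie. */
   int as_gt_at = as > at;
   float maxasat = as > at ? as : at;

   /* z beats the x/y winner when greater or equal, so z wins ties. */
   int ar_ge_as_at = ar >= maxasat;

   if (ar_ge_as_at)
      return AXIS_Z;
   return as_gt_at ? AXIS_X : AXIS_Y;
}
```

With this sketch, select_major_axis(1.0f, 1.0f, 1.0f) picks z and select_major_axis(1.0f, 1.0f, 0.5f) picks y.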

Re: [Mesa-dev] [PATCH] gallivm: some minor cube map cleanup

2013-04-04 Thread Jose Fonseca

Looks good Roland.

- Original Message -
> From: Roland Scheidegger 
> 
> The ar_ge_as_at variable was just very very confusing since the condition
> was actually the other way around (as_at_ge_ar). So change the condition
> (and the selects depending on it) to match the variable name.
> Also, while here, change the chosen major axis in case the coord values
> are the same. OpenGL doesn't care one bit which one is chosen in this
> case but it looks like dx10 would require z chosen over y, and y chosen
> over x (previously did x chosen over y, y chosen over z). Since it's all
> the same effort just honor dx10's wishes.

It would be nice to have this paragraph in as a code comment too.

Jose

> ---
>  src/gallium/auxiliary/gallivm/lp_bld_sample.c |   21 +++--
>  1 file changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c
> b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
> index fe29d25..734cfe0 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
> @@ -1403,12 +1403,13 @@ lp_build_cube_lookup(struct lp_build_sample_context
> *bld,
>signr = LLVMBuildAnd(builder, ri, signmask, "");
>  
>/*
> -   * major face determination: select x if x >= y else select y
> -   * select previous result if y >= max(x,y) else select z
> +   * major face determination: select x if x > y else select y
> +   * select z if z >= max(x,y) else select previous result
> +   * if some axis are the same we chose z over y, y over x.
> */
> -  as_ge_at = lp_build_cmp(coord_bld, PIPE_FUNC_GEQUAL, as, at);
> +  as_ge_at = lp_build_cmp(coord_bld, PIPE_FUNC_GREATER, as, at);
>maxasat = lp_build_max(coord_bld, as, at);
> -  ar_ge_as_at = lp_build_cmp(coord_bld, PIPE_FUNC_GEQUAL, maxasat, ar);
> +  ar_ge_as_at = lp_build_cmp(coord_bld, PIPE_FUNC_GEQUAL, ar, maxasat);
>  
>/*
> * compute all possible new s/t coords
> @@ -1449,13 +1450,13 @@ lp_build_cube_lookup(struct lp_build_sample_context
> *bld,
>   dmaxtnew = lp_build_select(coord_bld, as_ge_at, dmax[1], dmax[2]);
>}
>  
> -  *face_s = lp_build_select(cint_bld, ar_ge_as_at, *face_s, snewz);
> -  *face_t = lp_build_select(cint_bld, ar_ge_as_at, *face_t, tnewz);
> -  ma = lp_build_select(coord_bld, ar_ge_as_at, ma, r);
> -  *face = lp_build_select(cint_bld, ar_ge_as_at, *face, facez);
> +  *face_s = lp_build_select(cint_bld, ar_ge_as_at, snewz, *face_s);
> +  *face_t = lp_build_select(cint_bld, ar_ge_as_at, tnewz, *face_t);
> +  ma = lp_build_select(coord_bld, ar_ge_as_at, r, ma);
> +  *face = lp_build_select(cint_bld, ar_ge_as_at, facez, *face);
>if (need_derivs) {
> - dmaxsnew = lp_build_select(coord_bld, ar_ge_as_at, dmaxsnew,
> dmax[0]);
> - dmaxtnew = lp_build_select(coord_bld, ar_ge_as_at, dmaxtnew,
> dmax[1]);
> + dmaxsnew = lp_build_select(coord_bld, ar_ge_as_at, dmax[0],
> dmaxsnew);
> + dmaxtnew = lp_build_select(coord_bld, ar_ge_as_at, dmax[1],
> dmaxtnew);
>}
>  
>*face_s = LLVMBuildBitCast(builder, *face_s,
> --
> 1.7.9.5
> 
> 

Re: [Mesa-dev] [PATCH 2/4] gallium: add PIPE_BIND_COMMAND_BUFFER

2013-04-04 Thread Jose Fonseca

I think that PIPE_BIND_INDIRECT_BUFFER would be more self-descriptive.

Or do you envision other uses of such buffer?

Jose

- Original Message -
> Intended for use with GL_ARB_draw_indirect's DRAW_INDIRECT_BUFFER
> target or for D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS.
> ---
>  src/gallium/docs/source/screen.rst   |2 ++
>  src/gallium/include/pipe/p_defines.h |1 +
>  2 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/src/gallium/docs/source/screen.rst
> b/src/gallium/docs/source/screen.rst
> index c1a3c0b..f8cdded 100644
> --- a/src/gallium/docs/source/screen.rst
> +++ b/src/gallium/docs/source/screen.rst
> @@ -306,6 +306,8 @@ resources might be created and handled quite differently.
>bound to the graphics pipeline as a shader resource.
>  * ``PIPE_BIND_COMPUTE_RESOURCE``: A buffer or texture that can be
>bound to the compute program as a shader resource.
> +* ``PIPE_BIND_COMMAND_BUFFER``: A buffer or that may be sourced by the
> +  GPU command processor, like with indirect drawing.
>  
>  .. _pipe_usage:
>  
> diff --git a/src/gallium/include/pipe/p_defines.h
> b/src/gallium/include/pipe/p_defines.h
> index 5b00acc..2b79f2a 100644
> --- a/src/gallium/include/pipe/p_defines.h
> +++ b/src/gallium/include/pipe/p_defines.h
> @@ -315,6 +315,7 @@ enum pipe_flush_flags {
>  #define PIPE_BIND_GLOBAL   (1 << 18) /* set_global_binding */
>  #define PIPE_BIND_SHADER_RESOURCE  (1 << 19) /* set_shader_resources */
>  #define PIPE_BIND_COMPUTE_RESOURCE (1 << 20) /* set_compute_resources */
> +#define PIPE_BIND_COMMAND_BUFFER   (1 << 21) /* pipe_draw_info.indirect
> */
>  
>  /* The first two flags above were previously part of the amorphous
>   * TEXTURE_USAGE, most of which are now descriptions of the ways a
> --
> 1.7.3.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

Re: [Mesa-dev] [PATCH 2/4] gallium: add PIPE_BIND_COMMAND_BUFFER

2013-04-04 Thread Christoph Bumiller

On 04.04.2013 21:44, Jose Fonseca wrote:
> I think that PIPE_BIND_INDIRECT_BUFFER would be more self-descriptive.
>
> Or do you envision other uses of such buffer?

It's possible that at some point we add a mechanism to let the driver
store arbitrary commands into a buffer created by the st, or have
resources used as arguments for conditional rendering ...
Lots of possibilities, but nothing concrete, and for the command lists
like with D3D's deferred contexts we'd probably return opaque objects
that can contain more auxiliary data.
I like it to be more generic, but then it could turn out that there will be
different requirements on these "command source" buffers in the future
... I'm undecided now.


>
> Jose
>
> - Original Message -
>> Intended for use with GL_ARB_draw_indirect's DRAW_INDIRECT_BUFFER
>> target or for D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS.
>> ---
>>  src/gallium/docs/source/screen.rst   |2 ++
>>  src/gallium/include/pipe/p_defines.h |1 +
>>  2 files changed, 3 insertions(+), 0 deletions(-)
>>
>> diff --git a/src/gallium/docs/source/screen.rst
>> b/src/gallium/docs/source/screen.rst
>> index c1a3c0b..f8cdded 100644
>> --- a/src/gallium/docs/source/screen.rst
>> +++ b/src/gallium/docs/source/screen.rst
>> @@ -306,6 +306,8 @@ resources might be created and handled quite differently.
>>bound to the graphics pipeline as a shader resource.
>>  * ``PIPE_BIND_COMPUTE_RESOURCE``: A buffer or texture that can be
>>bound to the compute program as a shader resource.
>> +* ``PIPE_BIND_COMMAND_BUFFER``: A buffer or that may be sourced by the
>> +  GPU command processor, like with indirect drawing.
>>  
>>  .. _pipe_usage:
>>  
>> diff --git a/src/gallium/include/pipe/p_defines.h
>> b/src/gallium/include/pipe/p_defines.h
>> index 5b00acc..2b79f2a 100644
>> --- a/src/gallium/include/pipe/p_defines.h
>> +++ b/src/gallium/include/pipe/p_defines.h
>> @@ -315,6 +315,7 @@ enum pipe_flush_flags {
>>  #define PIPE_BIND_GLOBAL   (1 << 18) /* set_global_binding */
>>  #define PIPE_BIND_SHADER_RESOURCE  (1 << 19) /* set_shader_resources */
>>  #define PIPE_BIND_COMPUTE_RESOURCE (1 << 20) /* set_compute_resources */
>> +#define PIPE_BIND_COMMAND_BUFFER   (1 << 21) /* pipe_draw_info.indirect
>> */
>>  
>>  /* The first two flags above were previously part of the amorphous
>>   * TEXTURE_USAGE, most of which are now descriptions of the ways a
>> --
>> 1.7.3.4
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>

Re: [Mesa-dev] [PATCH] glsl: Add an optimization pass to flatten simple nested if blocks.

2013-04-04 Thread Eric Anholt

Kenneth Graunke  writes:

> On 04/04/2013 09:08 AM, Eric Anholt wrote:
>> Kenneth Graunke  writes:
>>> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
>>> index c294aa4..b5282a6 100644
>>> --- a/src/glsl/Makefile.sources
>>> +++ b/src/glsl/Makefile.sources
>>> @@ -80,6 +80,7 @@ LIBGLSL_FILES = \
>>> $(GLSL_SRCDIR)/opt_dead_code.cpp \
>>> $(GLSL_SRCDIR)/opt_dead_code_local.cpp \
>>> $(GLSL_SRCDIR)/opt_dead_functions.cpp \
>>> +   $(GLSL_SRCDIR)/opt_flatten_nested_if_blocks.cpp \
>>> $(GLSL_SRCDIR)/opt_function_inlining.cpp \
>>> $(GLSL_SRCDIR)/opt_if_simplification.cpp \
>>> $(GLSL_SRCDIR)/opt_noop_swizzle.cpp \
>>> diff --git a/src/glsl/glsl_parser_extras.cpp 
>>> b/src/glsl/glsl_parser_extras.cpp
>>> index 9740903..0992294 100644
>>> --- a/src/glsl/glsl_parser_extras.cpp
>>> +++ b/src/glsl/glsl_parser_extras.cpp
>>> @@ -1218,6 +1218,7 @@ do_common_optimization(exec_list *ir, bool linked,
>>> progress = do_structure_splitting(ir) || progress;
>>>  }
>>>  progress = do_if_simplification(ir) || progress;
>>> +   progress = opt_flatten_nested_if_blocks(ir) || progress;
>>>  progress = do_copy_propagation(ir) || progress;
>>>  progress = do_copy_propagation_elements(ir) || progress;
>>>  if (linked)
>>> diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
>>> index 2454bbe..a8885d7 100644
>>> --- a/src/glsl/ir_optimization.h
>>> +++ b/src/glsl/ir_optimization.h
>>> @@ -82,6 +82,7 @@ bool do_function_inlining(exec_list *instructions);
>>>   bool do_lower_jumps(exec_list *instructions, bool pull_out_jumps = true, 
>>> bool lower_sub_return = true, bool lower_main_return = false, bool 
>>> lower_continue = false, bool lower_break = false);
>>>   bool do_lower_texture_projection(exec_list *instructions);
>>>   bool do_if_simplification(exec_list *instructions);
>>> +bool opt_flatten_nested_if_blocks(exec_list *instructions);
>>>   bool do_discard_simplification(exec_list *instructions);
>>>   bool lower_if_to_cond_assign(exec_list *instructions, unsigned max_depth 
>>> = 0);
>>>   bool do_mat_op_to_vec(exec_list *instructions);
>>> diff --git a/src/glsl/opt_flatten_nested_if_blocks.cpp 
>>> b/src/glsl/opt_flatten_nested_if_blocks.cpp
>>> new file mode 100644
>>> index 000..c702102
>>> --- /dev/null
>>> +++ b/src/glsl/opt_flatten_nested_if_blocks.cpp
>>> @@ -0,0 +1,103 @@
>>> +/*
>>> + * Copyright © 2013 Intel Corporation
>>> + *
>>> + * Permission is hereby granted, free of charge, to any person obtaining a
>>> + * copy of this software and associated documentation files (the 
>>> "Software"),
>>> + * to deal in the Software without restriction, including without 
>>> limitation
>>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>>> + * and/or sell copies of the Software, and to permit persons to whom the
>>> + * Software is furnished to do so, subject to the following conditions:
>>> + *
>>> + * The above copyright notice and this permission notice (including the 
>>> next
>>> + * paragraph) shall be included in all copies or substantial portions of 
>>> the
>>> + * Software.
>>> + *
>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>>> OR
>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
>>> OTHER
>>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>>> + * DEALINGS IN THE SOFTWARE.
>>> + */
>>> +
>>> +/**
>>> + * \file opt_flatten_nested_if_blocks.cpp
>>> + *
>>> + * Flattens nested if blocks such as:
>>> + *
>>> + * if (x) {
>>> + *if (y) {
>>> + *   ...
>>> + *}
>>> + * }
>>> + *
>>> + * into a single if block with a combined condition:
>>> + *
>>> + * if (x && y) {
>>> + *...
>>> + * }
>>> + */
>>> +
>>> +#include "ir.h"
>>> +#include "ir_builder.h"
>>> +
>>> +using namespace ir_builder;
>>> +
>>> +namespace {
>>> +
>>> +class nested_if_flattener : public ir_hierarchical_visitor {
>>> +public:
>>> +   nested_if_flattener()
>>> +   {
>>> +  progress = false;
>>> +   }
>>> +
>>> +   ir_visitor_status visit_leave(ir_if *);
>>> +   ir_visitor_status visit_enter(ir_assignment *);
>>> +
>>> +   bool progress;
>>> +};
>>> +
>>> +} /* unnamed namespace */
>>> +
>>> +/* We only care about the top level "if" instructions, so don't
>>> + * descend into expressions.
>>> + */
>>> +ir_visitor_status
>>> +nested_if_flattener::visit_enter(ir_assignment *ir)
>>> +{
>>> +   (void) ir;
>>> +   return visit_continue_with_parent;
>>> +}
>>> +
>>> +bool
>>> +opt_flatten_nested_if_blocks(exec_list *instructions)
>>> +{
>>> +   nested_if_flattener v;
>>> +
>>> +   v.run(instructions);
>>> +   return v.progress;
>>> +}
>>> +
>>> +
>>> +ir_visitor_status
>>> +nested_if_flattene

[Mesa-dev] [PATCH] st/xlib: add HUD support for xlib/GLX

2013-04-04 Thread Brian Paul

For the softpipe and llvmpipe drivers.
---
 src/gallium/state_trackers/glx/xlib/xm_api.c |   15 +++
 src/gallium/state_trackers/glx/xlib/xm_api.h |3 +++
 src/gallium/state_trackers/glx/xlib/xm_st.c  |   12 
 src/gallium/state_trackers/glx/xlib/xm_st.h  |4 
 4 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/src/gallium/state_trackers/glx/xlib/xm_api.c 
b/src/gallium/state_trackers/glx/xlib/xm_api.c
index 021175c..04960f3 100644
--- a/src/gallium/state_trackers/glx/xlib/xm_api.c
+++ b/src/gallium/state_trackers/glx/xlib/xm_api.c
@@ -64,6 +64,8 @@
 #include "util/u_atomic.h"
 #include "util/u_inlines.h"
 
+#include "hud/hud_context.h"
+
 #include "xm_public.h"
 #include 
 
@@ -910,6 +912,8 @@ XMesaContext XMesaCreateContext( XMesaVisual v, 
XMesaContext share_list,
 
c->st->st_manager_private = (void *) c;
 
+   c->hud = hud_create(c->st->pipe, c->st->cso_context);
+
return c;
 
 fail:
@@ -925,6 +929,10 @@ fail:
 PUBLIC
 void XMesaDestroyContext( XMesaContext c )
 {
+   if (c->hud) {
+  hud_destroy(c->hud);
+   }
+
c->st->destroy(c->st);
 
/* FIXME: We should destroy the screen here, but if we do so, surfaces may 
@@ -1224,6 +1232,13 @@ void XMesaSwapBuffers( XMesaBuffer b )
 {
XMesaContext xmctx = XMesaGetCurrentContext();
 
+   /* Need to draw HUD before flushing */
+   if (xmctx && xmctx->hud) {
+  struct pipe_resource *back =
+ xmesa_get_framebuffer_resource(b->stfb, ST_ATTACHMENT_BACK_LEFT);
+  hud_draw(xmctx->hud, back);
+   }
+
if (xmctx && xmctx->xm_buffer == b) {
   xmctx->st->flush( xmctx->st, ST_FLUSH_FRONT, NULL);
}
diff --git a/src/gallium/state_trackers/glx/xlib/xm_api.h 
b/src/gallium/state_trackers/glx/xlib/xm_api.h
index 606bcf3..6d37ed7 100644
--- a/src/gallium/state_trackers/glx/xlib/xm_api.h
+++ b/src/gallium/state_trackers/glx/xlib/xm_api.h
@@ -67,6 +67,8 @@ and create a window, you must do the following to use the 
X/Mesa interface:
 # include 
 # include 
 
+struct hud_context;
+
 typedef struct xmesa_display *XMesaDisplay;
 typedef struct xmesa_buffer *XMesaBuffer;
 typedef struct xmesa_context *XMesaContext;
@@ -305,6 +307,7 @@ struct xmesa_context {
XMesaVisual xm_visual;  /** pixel format info */
XMesaBuffer xm_buffer;  /** current drawbuffer */
XMesaBuffer xm_read_buffer;  /** current readbuffer */
+   struct hud_context *hud;
 };
 
 
diff --git a/src/gallium/state_trackers/glx/xlib/xm_st.c 
b/src/gallium/state_trackers/glx/xlib/xm_st.c
index a681e82..1cfd89e 100644
--- a/src/gallium/state_trackers/glx/xlib/xm_st.c
+++ b/src/gallium/state_trackers/glx/xlib/xm_st.c
@@ -317,6 +317,18 @@ xmesa_destroy_st_framebuffer(struct st_framebuffer_iface 
*stfbi)
free(stfbi);
 }
 
+/**
+ * Return the pipe_surface which corresponds to the given
+ * framebuffer attachment.
+ */
+struct pipe_resource *
+xmesa_get_framebuffer_resource(struct st_framebuffer_iface *stfbi,
+   enum st_attachment_type att)
+{
+   struct xmesa_st_framebuffer *xstfb = xmesa_st_framebuffer(stfbi);
+   return xstfb->textures[att];
+}
+
 void
 xmesa_swap_st_framebuffer(struct st_framebuffer_iface *stfbi)
 {
diff --git a/src/gallium/state_trackers/glx/xlib/xm_st.h 
b/src/gallium/state_trackers/glx/xlib/xm_st.h
index a293728..c939c1e 100644
--- a/src/gallium/state_trackers/glx/xlib/xm_st.h
+++ b/src/gallium/state_trackers/glx/xlib/xm_st.h
@@ -40,6 +40,10 @@ xmesa_create_st_framebuffer(XMesaDisplay xmdpy, XMesaBuffer 
b);
 void
 xmesa_destroy_st_framebuffer(struct st_framebuffer_iface *stfbi);
 
+struct pipe_resource *
+xmesa_get_framebuffer_resource(struct st_framebuffer_iface *stfbi,
+   enum st_attachment_type att);
+
 void
 xmesa_swap_st_framebuffer(struct st_framebuffer_iface *stfbi);
 
-- 
1.7.3.4

Re: [Mesa-dev] [PATCH] st/xlib: add HUD support for xlib/GLX

2013-04-04 Thread Jose Fonseca


- Original Message -
> For the softpipe and llvmpipe drivers.
> ---
>  src/gallium/state_trackers/glx/xlib/xm_api.c |   15 +++
>  src/gallium/state_trackers/glx/xlib/xm_api.h |3 +++
>  src/gallium/state_trackers/glx/xlib/xm_st.c  |   12 
>  src/gallium/state_trackers/glx/xlib/xm_st.h  |4 
>  4 files changed, 34 insertions(+), 0 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/glx/xlib/xm_api.c
> b/src/gallium/state_trackers/glx/xlib/xm_api.c
> index 021175c..04960f3 100644
> --- a/src/gallium/state_trackers/glx/xlib/xm_api.c
> +++ b/src/gallium/state_trackers/glx/xlib/xm_api.c
> @@ -64,6 +64,8 @@
>  #include "util/u_atomic.h"
>  #include "util/u_inlines.h"
>  
> +#include "hud/hud_context.h"
> +
>  #include "xm_public.h"
>  #include 
>  
> @@ -910,6 +912,8 @@ XMesaContext XMesaCreateContext( XMesaVisual v,
> XMesaContext share_list,
>  
> c->st->st_manager_private = (void *) c;
>  
> +   c->hud = hud_create(c->st->pipe, c->st->cso_context);
> +
> return c;
>  
>  fail:
> @@ -925,6 +929,10 @@ fail:
>  PUBLIC
>  void XMesaDestroyContext( XMesaContext c )
>  {
> +   if (c->hud) {
> +  hud_destroy(c->hud);
> +   }
> +
> c->st->destroy(c->st);
>  
> /* FIXME: We should destroy the screen here, but if we do so, surfaces
> may
> @@ -1224,6 +1232,13 @@ void XMesaSwapBuffers( XMesaBuffer b )
>  {
> XMesaContext xmctx = XMesaGetCurrentContext();
>  
> +   /* Need to draw HUD before flushing */
> +   if (xmctx && xmctx->hud) {
> +  struct pipe_resource *back =
> + xmesa_get_framebuffer_resource(b->stfb, ST_ATTACHMENT_BACK_LEFT);
> +  hud_draw(xmctx->hud, back);
> +   }
> +
> if (xmctx && xmctx->xm_buffer == b) {
>xmctx->st->flush( xmctx->st, ST_FLUSH_FRONT, NULL);
> }
> diff --git a/src/gallium/state_trackers/glx/xlib/xm_api.h
> b/src/gallium/state_trackers/glx/xlib/xm_api.h
> index 606bcf3..6d37ed7 100644
> --- a/src/gallium/state_trackers/glx/xlib/xm_api.h
> +++ b/src/gallium/state_trackers/glx/xlib/xm_api.h
> @@ -67,6 +67,8 @@ and create a window, you must do the following to use the
> X/Mesa interface:
>  # include 
>  # include 
>  
> +struct hud_context;
> +
>  typedef struct xmesa_display *XMesaDisplay;
>  typedef struct xmesa_buffer *XMesaBuffer;
>  typedef struct xmesa_context *XMesaContext;
> @@ -305,6 +307,7 @@ struct xmesa_context {
> XMesaVisual xm_visual;/** pixel format info */
> XMesaBuffer xm_buffer;/** current drawbuffer */
> XMesaBuffer xm_read_buffer;  /** current readbuffer */
> +   struct hud_context *hud;
>  };
>  
>  
> diff --git a/src/gallium/state_trackers/glx/xlib/xm_st.c
> b/src/gallium/state_trackers/glx/xlib/xm_st.c
> index a681e82..1cfd89e 100644
> --- a/src/gallium/state_trackers/glx/xlib/xm_st.c
> +++ b/src/gallium/state_trackers/glx/xlib/xm_st.c
> @@ -317,6 +317,18 @@ xmesa_destroy_st_framebuffer(struct st_framebuffer_iface
> *stfbi)
> free(stfbi);
>  }
>  
> +/**
> + * Return the pipe_surface which corresponds to the given
> + * framebuffer attachment.
> + */
> +struct pipe_resource *
> +xmesa_get_framebuffer_resource(struct st_framebuffer_iface *stfbi,
> +   enum st_attachment_type att)
> +{
> +   struct xmesa_st_framebuffer *xstfb = xmesa_st_framebuffer(stfbi);
> +   return xstfb->textures[att];
> +}
> +
>  void
>  xmesa_swap_st_framebuffer(struct st_framebuffer_iface *stfbi)
>  {
> diff --git a/src/gallium/state_trackers/glx/xlib/xm_st.h
> b/src/gallium/state_trackers/glx/xlib/xm_st.h
> index a293728..c939c1e 100644
> --- a/src/gallium/state_trackers/glx/xlib/xm_st.h
> +++ b/src/gallium/state_trackers/glx/xlib/xm_st.h
> @@ -40,6 +40,10 @@ xmesa_create_st_framebuffer(XMesaDisplay xmdpy,
> XMesaBuffer b);
>  void
>  xmesa_destroy_st_framebuffer(struct st_framebuffer_iface *stfbi);
>  
> +struct pipe_resource *
> +xmesa_get_framebuffer_resource(struct st_framebuffer_iface *stfbi,
> +   enum st_attachment_type att);
> +
>  void
>  xmesa_swap_st_framebuffer(struct st_framebuffer_iface *stfbi);
>  
> --
> 1.7.3.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

Looks good to me.

Reviewed-by: Jose Fonseca 

[Mesa-dev] [PATCH] llvmpipe: Work without sse2 if llvm is new enough

2013-04-04 Thread Adam Jackson

At least on llvm 3.2 this appears to work fine.  Tested on an Athlon XP
2600+, which has sse and 3dnow but not sse2.

Signed-off-by: Adam Jackson 
---
 src/gallium/drivers/llvmpipe/lp_screen.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index 6700887..ebcf680 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -485,9 +485,10 @@ llvmpipe_create_screen(struct sw_winsys *winsys)
 {
struct llvmpipe_screen *screen;
 
-#ifdef PIPE_ARCH_X86
-   /* require SSE2 due to LLVM PR6960. */
util_cpu_detect();
+
+#if defined(PIPE_ARCH_X86) && HAVE_LLVM < 0x0302
+   /* require SSE2 due to LLVM PR6960. */
if (!util_cpu_caps.has_sse2)
return NULL;
 #endif
-- 
1.8.2

[Mesa-dev] [PATCH] clover: Fix linkage of libOpenCL

2013-04-04 Thread Niels Ole Salscheider

Clover needs the irreader component of llvm

v2: Check for irreader component
irreader is only available with LLVM 3.3 >= 177971

Signed-off-by: Niels Ole Salscheider 
---
 configure.ac | 4 
 1 file changed, 4 insertions(+)

diff --git a/configure.ac b/configure.ac
index 81d4a3f..fea5868 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1650,6 +1650,10 @@ if test "x$enable_gallium_llvm" = xyes; then
 
 if test "x$enable_opencl" = xyes; then
 LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
+# LLVM 3.3 >= 177971 requires IRReader
+if $LLVM_CONFIG --components | grep -q '\<irreader\>'; then
+LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader"
+fi
 fi
LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags`
LLVM_BINDIR=`$LLVM_CONFIG --bindir`
-- 
1.8.2

Re: [Mesa-dev] [PATCH] llvmpipe: Work without sse2 if llvm is new enough

2013-04-04 Thread Jose Fonseca

- Original Message -
> At least on llvm 3.2 this appears to work fine.  Tested on an Athlon XP
> 2600+, which has sse and 3dnow but not sse2.
> 
> Signed-off-by: Adam Jackson 
> ---
>  src/gallium/drivers/llvmpipe/lp_screen.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c
> b/src/gallium/drivers/llvmpipe/lp_screen.c
> index 6700887..ebcf680 100644
> --- a/src/gallium/drivers/llvmpipe/lp_screen.c
> +++ b/src/gallium/drivers/llvmpipe/lp_screen.c
> @@ -485,9 +485,10 @@ llvmpipe_create_screen(struct sw_winsys *winsys)
>  {
> struct llvmpipe_screen *screen;
>  
> -#ifdef PIPE_ARCH_X86
> -   /* require SSE2 due to LLVM PR6960. */
> util_cpu_detect();
> +
> +#if defined(PIPE_ARCH_X86) && HAVE_LLVM < 0x0302
> +   /* require SSE2 due to LLVM PR6960. */
> if (!util_cpu_caps.has_sse2)
> return NULL;
>  #endif
> --
> 1.8.2
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

Sounds good to me.

Reviewed-by: Jose Fonseca 

[Mesa-dev] [PATCH] gallivm: fix breakc

2013-04-04 Thread Zack Rusin

We break when the mask values are 0, not 1; plus it's a bit comparison,
not a floating point comparison. This fixes both.

Signed-off-by: Zack Rusin 
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |   26 ---
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index d8c419b..1e062e9 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -217,15 +217,14 @@ static void lp_exec_break_condition(struct lp_exec_mask 
*mask,
 LLVMValueRef cond)
 {
LLVMBuilderRef builder = mask->bld->gallivm->builder;
-   LLVMValueRef exec_mask = LLVMBuildNot(builder,
+   LLVMValueRef cond_mask = LLVMBuildAnd(builder,
  mask->exec_mask,
- "break");
-
-   exec_mask = LLVMBuildAnd(builder, exec_mask, cond, "");
+ cond, "cond_mask");
+   cond_mask = LLVMBuildNot(builder, cond, "break_cond");
 
mask->break_mask = LLVMBuildAnd(builder,
mask->break_mask,
-   exec_mask, "break_full");
+   cond_mask, "breakc_full");
 
lp_exec_mask_update(mask);
 }
@@ -287,14 +286,14 @@ static void lp_exec_endloop(struct gallivm_state *gallivm,
   builder,
   LLVMIntNE,
   LLVMBuildBitCast(builder, mask->exec_mask, reg_type, ""),
-  LLVMConstNull(reg_type), "");
+  LLVMConstNull(reg_type), "i1cond");
 
/* i2cond = (looplimiter > 0) */
i2cond = LLVMBuildICmp(
   builder,
   LLVMIntSGT,
   limiter,
-  LLVMConstNull(int_type), "");
+  LLVMConstNull(int_type), "i2cond");
 
/* if( i1cond && i2cond ) */
icond = LLVMBuildAnd(builder, i1cond, i2cond, "");
@@ -2298,13 +2297,16 @@ breakc_emit(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
 {
-   LLVMValueRef tmp;
struct lp_build_tgsi_soa_context * bld = lp_soa_context(bld_base);
+   LLVMBuilderRef builder = bld_base->base.gallivm->builder;
+   struct lp_build_context *uint_bld = &bld_base->uint_bld;
+   LLVMValueRef unsigned_cond = 
+  LLVMBuildBitCast(builder, emit_data->args[0], uint_bld->vec_type, "");
+   LLVMValueRef cond = lp_build_cmp(uint_bld, PIPE_FUNC_NOTEQUAL,
+unsigned_cond,
+uint_bld->zero);
 
-   tmp = lp_build_cmp(&bld_base->base, PIPE_FUNC_NOTEQUAL,
-  emit_data->args[0], bld->bld_base.base.zero);
-
-   lp_exec_break_condition(&bld->exec_mask, tmp);
+   lp_exec_break_condition(&bld->exec_mask, cond);
 }
 
 static void
-- 
1.7.10.4
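
The mask semantics named in the commit message can be sketched per lane in plain C. This is a hedged illustration of the intended behaviour, not a transcription of the LLVM builder calls in the patch: LANES and both function names are hypothetical, and the masks are assumed to hold all ones for an active lane and zero for an inactive one.

```c
#include <stdint.h>

#define LANES 4 /* hypothetical vector width for this sketch */

/*
 * Build the break condition from the raw register bits.
 * An integer "true" of 0xFFFFFFFF is a NaN bit pattern when viewed as
 * a float, which is why the test is done on the bits, not on floats.
 */
static void
breakc_cond_mask(const uint32_t src_bits[LANES], uint32_t cond[LANES])
{
   for (int i = 0; i < LANES; i++)
      cond[i] = src_bits[i] != 0 ? UINT32_MAX : 0;
}

/*
 * A lane leaves the loop when it is currently executing and its
 * condition is set; leaving is recorded as 0 in break_mask.
 */
static void
apply_break(uint32_t break_mask[LANES],
            const uint32_t exec_mask[LANES],
            const uint32_t cond[LANES])
{
   for (int i = 0; i < LANES; i++)
      break_mask[i] &= ~(exec_mask[i] & cond[i]);
}
```

Lanes that are not executing keep their break_mask value untouched, whatever their condition bits are.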

[Mesa-dev] [PATCH 1/2] gallium/hud: initialize sampler state

2013-04-04 Thread Brian Paul

The default wrap mode (PIPE_TEX_WRAP_REPEAT) is incompatible with
unnormalized texcoords (at least for softpipe).
---
 src/gallium/auxiliary/hud/hud_context.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index b417f5d..5722df3 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -1018,6 +1018,12 @@ hud_create(struct pipe_context *pipe, struct cso_context 
*cso)
hud->font_sampler_view = pipe->create_sampler_view(pipe, hud->font.texture,
   &view_templ);
 
+   /* sampler state (for font drawing) */
+   hud->font_sampler_state.wrap_s = PIPE_TEX_WRAP_CLAMP;
+   hud->font_sampler_state.wrap_t = PIPE_TEX_WRAP_CLAMP;
+   hud->font_sampler_state.wrap_r = PIPE_TEX_WRAP_CLAMP;
+   hud->font_sampler_state.normalized_coords = 0;
+
/* constants */
hud->constbuf.buffer_size = sizeof(hud->constants);
hud->constbuf.user_buffer = &hud->constants;
-- 
1.7.3.4

[Mesa-dev] [PATCH 2/2] gallium/hud: add GALLIUM_HUD_PERIOD env var

2013-04-04 Thread Brian Paul

To set the graph update rate, in seconds.  The default update rate
has also been changed to 1/2 second.
---
 src/gallium/auxiliary/hud/hud_context.c |   17 -
 1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c b/src/gallium/auxiliary/hud/hud_context.c
index 5722df3..a5145c2 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -703,12 +703,27 @@ hud_parse_env_var(struct hud_context *hud, const char *env)
    struct hud_pane *pane = NULL;
    unsigned x = 10, y = 10;
    unsigned width = 251, height = 100;
+   unsigned period = 500 * 1000;  /* default period (1/2 second) */
+   const char *period_env;
+
+   /*
+    * The GALLIUM_HUD_PERIOD env var sets the graph update rate.
+    * The env var is in seconds (a float).
+    * Zero means update after every frame.
+    */
+   period_env = getenv("GALLIUM_HUD_PERIOD");
+   if (period_env) {
+      float p = atof(period_env);
+      if (p >= 0.0) {
+         period = (unsigned) (p * 1000 * 1000);
+      }
+   }
 
    while ((num = parse_string(env, name)) != 0) {
       env += num;
 
       if (!pane) {
-         pane = hud_pane_create(x, y, x + width, y + height, 4, 10);
+         pane = hud_pane_create(x, y, x + width, y + height, period, 10);
          if (!pane)
             return;
       }
-- 
1.7.3.4
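
Editorial illustration of the seconds-to-microseconds conversion in the patch above. This is a sketch under stated assumptions, not the patch's code: parse_period_us is a hypothetical helper, and it deliberately differs from the atof() call in the patch by rejecting unparsable text and clamping very large values.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the patch: convert a period given in
 * seconds (as text) to microseconds.  Falls back to default_us when the
 * text is missing, unparsable, negative, or NaN. */
static unsigned parse_period_us(const char *text, unsigned default_us)
{
   char *end;
   double seconds;

   if (!text)
      return default_us;

   seconds = strtod(text, &end);

   /* end == text means no characters were consumed; the negated
    * comparison also rejects NaN. */
   if (end == text || !(seconds >= 0.0))
      return default_us;

   /* Clamp so the conversion to unsigned cannot overflow. */
   if (seconds * 1e6 >= (double) UINT_MAX)
      return UINT_MAX;

   return (unsigned) (seconds * 1e6);
}
```

A caller would pass getenv("GALLIUM_HUD_PERIOD") and the 500 * 1000 default. Note the behavioural difference: with atof(), unparsable text yields 0.0, which the patch treats as "update after every frame", whereas this sketch keeps the default.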

Re: [Mesa-dev] [PATCH] glsl: Add an optimization pass to flatten simple nested if blocks.

2013-04-04 Thread Eric Anholt

Eric Anholt  writes:

> Kenneth Graunke  writes:
>
>> On 04/04/2013 09:08 AM, Eric Anholt wrote:
>>> Kenneth Graunke  writes:
 diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
 index c294aa4..b5282a6 100644
 --- a/src/glsl/Makefile.sources
 +++ b/src/glsl/Makefile.sources
 @@ -80,6 +80,7 @@ LIBGLSL_FILES = \
$(GLSL_SRCDIR)/opt_dead_code.cpp \
$(GLSL_SRCDIR)/opt_dead_code_local.cpp \
$(GLSL_SRCDIR)/opt_dead_functions.cpp \
 +  $(GLSL_SRCDIR)/opt_flatten_nested_if_blocks.cpp \
$(GLSL_SRCDIR)/opt_function_inlining.cpp \
$(GLSL_SRCDIR)/opt_if_simplification.cpp \
$(GLSL_SRCDIR)/opt_noop_swizzle.cpp \
 diff --git a/src/glsl/glsl_parser_extras.cpp 
 b/src/glsl/glsl_parser_extras.cpp
 index 9740903..0992294 100644
 --- a/src/glsl/glsl_parser_extras.cpp
 +++ b/src/glsl/glsl_parser_extras.cpp
 @@ -1218,6 +1218,7 @@ do_common_optimization(exec_list *ir, bool linked,
 progress = do_structure_splitting(ir) || progress;
  }
  progress = do_if_simplification(ir) || progress;
 +   progress = opt_flatten_nested_if_blocks(ir) || progress;
  progress = do_copy_propagation(ir) || progress;
  progress = do_copy_propagation_elements(ir) || progress;
  if (linked)
 diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
 index 2454bbe..a8885d7 100644
 --- a/src/glsl/ir_optimization.h
 +++ b/src/glsl/ir_optimization.h
 @@ -82,6 +82,7 @@ bool do_function_inlining(exec_list *instructions);
   bool do_lower_jumps(exec_list *instructions, bool pull_out_jumps = true, 
 bool lower_sub_return = true, bool lower_main_return = false, bool 
 lower_continue = false, bool lower_break = false);
   bool do_lower_texture_projection(exec_list *instructions);
   bool do_if_simplification(exec_list *instructions);
 +bool opt_flatten_nested_if_blocks(exec_list *instructions);
   bool do_discard_simplification(exec_list *instructions);
   bool lower_if_to_cond_assign(exec_list *instructions, unsigned max_depth 
 = 0);
   bool do_mat_op_to_vec(exec_list *instructions);
 diff --git a/src/glsl/opt_flatten_nested_if_blocks.cpp 
 b/src/glsl/opt_flatten_nested_if_blocks.cpp
 new file mode 100644
 index 000..c702102
 --- /dev/null
 +++ b/src/glsl/opt_flatten_nested_if_blocks.cpp
 @@ -0,0 +1,103 @@
 +/*
 + * Copyright © 2013 Intel Corporation
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a
 + * copy of this software and associated documentation files (the 
 "Software"),
 + * to deal in the Software without restriction, including without 
 limitation
 + * the rights to use, copy, modify, merge, publish, distribute, 
 sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the 
 next
 + * paragraph) shall be included in all copies or substantial portions of 
 the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
 EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
 MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT 
 SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
 OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 + * DEALINGS IN THE SOFTWARE.
 + */
 +
 +/**
 + * \file opt_flatten_nested_if_blocks.cpp
 + *
 + * Flattens nested if blocks such as:
 + *
 + * if (x) {
 + *if (y) {
 + *   ...
 + *}
 + * }
 + *
 + * into a single if block with a combined condition:
 + *
 + * if (x && y) {
 + *...
 + * }
 + */
 +
 +#include "ir.h"
 +#include "ir_builder.h"
 +
 +using namespace ir_builder;
 +
 +namespace {
 +
 +class nested_if_flattener : public ir_hierarchical_visitor {
 +public:
 +   nested_if_flattener()
 +   {
 +  progress = false;
 +   }
 +
 +   ir_visitor_status visit_leave(ir_if *);
 +   ir_visitor_status visit_enter(ir_assignment *);
 +
 +   bool progress;
 +};
 +
 +} /* unnamed namespace */
 +
 +/* We only care about the top level "if" instructions, so don't
 + * descend into expressions.
 + */
 +ir_visitor_status
 +nested_if_flattener::visit_enter(ir_assignment *ir)
 +{
 +   (void) ir;
 +   return visit_continue_with_parent;
 +}
 +
 +bool
 +opt_flatten_nested_if_blocks(exec_list *instructions)
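
Editorial illustration of the transformation described in the file comment of the quoted patch. This is a minimal sketch on a toy statement list, not the GLSL IR: struct stmt, cond_mask and flatten_nested_if are hypothetical, and a condition is modelled as a bitmask of terms that must all hold, so combining "x && y" is a bitwise OR of the two masks. It assumes the flattening only applies when the outer "if" has no else and its body is exactly one else-less "if"; the quoted patch is cut off before that logic is visible.

```c
#include <assert.h>
#include <stddef.h>

enum stmt_kind { STMT_OTHER, STMT_IF };

/* Toy statement node (hypothetical, not ir_if). */
struct stmt {
   enum stmt_kind kind;
   unsigned cond_mask;      /* STMT_IF: terms that must all be true */
   struct stmt *then_body;  /* STMT_IF: first statement of the then-branch */
   struct stmt *else_body;  /* STMT_IF: NULL when there is no else */
   struct stmt *next;       /* next statement in the same block */
};

/* Merge "if (x) { if (y) { ... } }" into "if (x && y) { ... }".
 * Returns 1 if anything changed, 0 otherwise. */
static int flatten_nested_if(struct stmt *outer)
{
   int progress = 0;

   assert(outer != NULL);

   while (outer->kind == STMT_IF && outer->else_body == NULL) {
      struct stmt *inner = outer->then_body;

      /* The body must be a single "if" with no else of its own. */
      if (inner == NULL || inner->next != NULL ||
          inner->kind != STMT_IF || inner->else_body != NULL)
         break;

      outer->cond_mask |= inner->cond_mask;   /* x && y */
      outer->then_body = inner->then_body;    /* hoist the inner body */
      progress = 1;
   }

   return progress;
}
```

The loop lets the same call collapse deeper chains such as three nested ifs. Returning a progress flag mirrors how the quoted do_common_optimization() hunk ORs each pass's result into "progress".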

Re: [Mesa-dev] [PATCH 1/2] gallium/hud: initialize sampler state

2013-04-04 Thread Marek Olšák

It would be better to use PIPE_TEX_WRAP_CLAMP_TO_EDGE. In any case:

Reviewed-by: Marek Olšák 

Marek


On Fri, Apr 5, 2013 at 12:38 AM, Brian Paul  wrote:

> The default wrap mode (PIPE_TEX_WRAP_REPEAT) is incompatible with
> unnormalized texcoords (at least for softpipe).
> ---
>  src/gallium/auxiliary/hud/hud_context.c |6 ++
>  1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/src/gallium/auxiliary/hud/hud_context.c
> b/src/gallium/auxiliary/hud/hud_context.c
> index b417f5d..5722df3 100644
> --- a/src/gallium/auxiliary/hud/hud_context.c
> +++ b/src/gallium/auxiliary/hud/hud_context.c
> @@ -1018,6 +1018,12 @@ hud_create(struct pipe_context *pipe, struct
> cso_context *cso)
> hud->font_sampler_view = pipe->create_sampler_view(pipe,
> hud->font.texture,
>&view_templ);
>
> +   /* sampler state (for font drawing) */
> +   hud->font_sampler_state.wrap_s = PIPE_TEX_WRAP_CLAMP;
> +   hud->font_sampler_state.wrap_t = PIPE_TEX_WRAP_CLAMP;
> +   hud->font_sampler_state.wrap_r = PIPE_TEX_WRAP_CLAMP;
> +   hud->font_sampler_state.normalized_coords = 0;
> +
> /* constants */
> hud->constbuf.buffer_size = sizeof(hud->constants);
> hud->constbuf.user_buffer = &hud->constants;
> --
> 1.7.3.4
>

Re: [Mesa-dev] [PATCH 2/2] gallium/hud: add GALLIUM_HUD_PERIOD env var

2013-04-04 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek


On Fri, Apr 5, 2013 at 12:38 AM, Brian Paul  wrote:

> To set the graph update rate, in seconds.  The default update rate
> has also been changed to 1/2 second.
> ---
>  src/gallium/auxiliary/hud/hud_context.c |   17 -
>  1 files changed, 16 insertions(+), 1 deletions(-)
>
> diff --git a/src/gallium/auxiliary/hud/hud_context.c
> b/src/gallium/auxiliary/hud/hud_context.c
> index 5722df3..a5145c2 100644
> --- a/src/gallium/auxiliary/hud/hud_context.c
> +++ b/src/gallium/auxiliary/hud/hud_context.c
> @@ -703,12 +703,27 @@ hud_parse_env_var(struct hud_context *hud, const
> char *env)
> struct hud_pane *pane = NULL;
> unsigned x = 10, y = 10;
> unsigned width = 251, height = 100;
> +   unsigned period = 500 * 1000;  /* default period (1/2 second) */
> +   const char *period_env;
> +
> +   /*
> +* The GALLIUM_HUD_PERIOD env var sets the graph update rate.
> +* The env var is in seconds (a float).
> +* Zero means update after every frame.
> +*/
> +   period_env = getenv("GALLIUM_HUD_PERIOD");
> +   if (period_env) {
> +  float p = atof(period_env);
> +  if (p >= 0.0) {
> + period = (unsigned) (p * 1000 * 1000);
> +  }
> +   }
>
> while ((num = parse_string(env, name)) != 0) {
>env += num;
>
>if (!pane) {
> - pane = hud_pane_create(x, y, x + width, y + height, 4, 10);
> + pane = hud_pane_create(x, y, x + width, y + height, period, 10);
>   if (!pane)
>  return;
>}
> --
> 1.7.3.4
>

[Mesa-dev] [PATCH 2/2] i965/vs: Use GRFs for pull constant offsets on gen7.

2013-04-04 Thread Eric Anholt

This allows the computation of the offset to get written directly into the
message source.  Improves performance of low-resolution GLB2.7 by 4.6% +/-
1.4% (n=11).
---
 src/mesa/drivers/dri/i965/brw_defines.h|1 +
 src/mesa/drivers/dri/i965/brw_shader.cpp   |2 ++
 src/mesa/drivers/dri/i965/brw_vec4.cpp |8 -
 src/mesa/drivers/dri/i965/brw_vec4.h   |4 +++
 src/mesa/drivers/dri/i965/brw_vec4_emit.cpp|   45 +++-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |   18 +++---
 6 files changed, 56 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
index 3d07c36..a13f9dc 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -739,6 +739,7 @@ enum opcode {
VS_OPCODE_SCRATCH_READ,
VS_OPCODE_SCRATCH_WRITE,
VS_OPCODE_PULL_CONSTANT_LOAD,
+   VS_OPCODE_PULL_CONSTANT_LOAD_GEN7,
 };
 
 #define BRW_PREDICATE_NONE 0
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 1a52039..b3bd1b9 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -498,6 +498,8 @@ brw_instruction_name(enum opcode op)
   return "scratch_write";
case VS_OPCODE_PULL_CONSTANT_LOAD:
   return "pull_constant_load";
+   case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
+  return "pull_constant_load_gen7";
 
default:
   /* Yes, this leaks.  It's in debug code, it should never occur, and if
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index c58fb44..1013aae 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -223,7 +223,13 @@ vec4_instruction::is_math()
 bool
 vec4_instruction::is_send_from_grf()
 {
-   return opcode == SHADER_OPCODE_SHADER_TIME_ADD;
+   switch (opcode) {
+   case SHADER_OPCODE_SHADER_TIME_ADD:
+   case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
+  return true;
+   default:
+  return false;
+   }
 }
 
 bool
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index 8f130e1..e286925 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -544,6 +544,10 @@ private:
struct brw_reg dst,
struct brw_reg index,
struct brw_reg offset);
+   void generate_pull_constant_load_gen7(vec4_instruction *inst,
+ struct brw_reg dst,
+ struct brw_reg surf_index,
+ struct brw_reg offset);
 
struct brw_context *brw;
struct intel_context *intel;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
index e378f7f..963901c 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
@@ -558,27 +558,11 @@ vec4_generator::generate_pull_constant_load(vec4_instruction *inst,
 struct brw_reg index,
 struct brw_reg offset)
 {
+   assert(intel->gen <= 7);
assert(index.file == BRW_IMMEDIATE_VALUE &&
  index.type == BRW_REGISTER_TYPE_UD);
uint32_t surf_index = index.dw1.ud;
 
-   if (intel->gen == 7) {
-  gen6_resolve_implied_move(p, &offset, inst->base_mrf);
-  brw_instruction *insn = brw_next_insn(p, BRW_OPCODE_SEND);
-  brw_set_dest(p, insn, dst);
-  brw_set_src0(p, insn, offset);
-  brw_set_sampler_message(p, insn,
-  surf_index,
-  0, /* LD message ignores sampler unit */
-  GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
-  1, /* rlen */
-  1, /* mlen */
-  false, /* no header */
-  BRW_SAMPLER_SIMD_MODE_SIMD4X2,
-  0);
-  return;
-   }
-
struct brw_reg header = brw_vec8_grf(0, 0);
 
gen6_resolve_implied_move(p, &header, inst->base_mrf);
@@ -614,6 +598,29 @@ vec4_generator::generate_pull_constant_load(vec4_instruction *inst,
 }
 
 void
+vec4_generator::generate_pull_constant_load_gen7(vec4_instruction *inst,
+ struct brw_reg dst,
+ struct brw_reg surf_index,
+ struct brw_reg offset)
+{
+   assert(surf_index.file == BRW_IMMEDIATE_VALUE &&
+ surf_index.type == BRW_REGISTER_TYPE_UD);
+
+   brw_instruction *insn = brw_next_insn(p, BRW_OPCODE_SEND);
+   brw_set_dest(p, insn, dst);
+   brw_set_src0(p, insn, offset);
+   brw_set_sampler_message(p, insn,
+

[Mesa-dev] [PATCH 1/2] i965/vs: Fix writemasks on pull constant offset setup.

2013-04-04 Thread Eric Anholt

When you src_reg(dst_reg(int_type)), you get a grf. (swizzle).  But if
you dst_reg(src_reg(int_type)), you get a grf.xyzw (writemask).  By going
the direction we did, we were writing more channels than were read, so we
wouldn't register coalesce the ADD or MUL.

Right now the MOV is still baked into the emit, but I'm about to fix it
for gen7.
---
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 8bd2fd8..ce07381 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -2716,18 +2716,18 @@ vec4_visitor::get_pull_constant_offset(vec4_instruction *inst,
                                        src_reg *reladdr, int reg_offset)
 {
    if (reladdr) {
-      src_reg index = src_reg(this, glsl_type::int_type);
+      dst_reg index = dst_reg(this, glsl_type::int_type);
 
-      emit_before(inst, ADD(dst_reg(index), *reladdr, src_reg(reg_offset)));
+      emit_before(inst, ADD(index, *reladdr, src_reg(reg_offset)));
 
       /* Pre-gen6, the message header uses byte offsets instead of vec4
        * (16-byte) offset units.
        */
       if (intel->gen < 6) {
-         emit_before(inst, MUL(dst_reg(index), index, src_reg(16)));
+         emit_before(inst, MUL(index, src_reg(index), src_reg(16)));
       }
 
-      return index;
+      return src_reg(index);
    } else {
       int message_header_scale = intel->gen < 6 ? 16 : 1;
       return src_reg(reg_offset * message_header_scale);
-- 
1.7.10.4

Re: [Mesa-dev] [PATCH 2/2] i965/vs: Use GRFs for pull constant offsets on gen7.

2013-04-04 Thread Eric Anholt

Eric Anholt  writes:

> This allows the computation of the offset to get written directly into the
> message source.  Improves performance of low-resolution GLB2.7 by 4.6% +/-
> 1.4% (n=11).

Scratch that.  Looks like I tested against a bad build, so the numbers
are invalid and I'll have to try again.  I also see a chance to remove
some more MOVs with a replacement for patch 1, I think.

Re: [Mesa-dev] [PATCH] gallivm: fix breakc

2013-04-04 Thread Roland Scheidegger

On 05.04.2013 00:08, Zack Rusin wrote:
> we break when the mask values are 0 not, 1, plus it's bit comparison
> not a floating point comparison. This fixes both.
This sentence doesn't quite parse for me.

> 
> Signed-off-by: Zack Rusin 
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |   26 
> ---
>  1 file changed, 14 insertions(+), 12 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> index d8c419b..1e062e9 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> @@ -217,15 +217,14 @@ static void lp_exec_break_condition(struct lp_exec_mask 
> *mask,
>  LLVMValueRef cond)
>  {
> LLVMBuilderRef builder = mask->bld->gallivm->builder;
> -   LLVMValueRef exec_mask = LLVMBuildNot(builder,
> +   LLVMValueRef cond_mask = LLVMBuildAnd(builder,
>   mask->exec_mask,
> - "break");
> -
> -   exec_mask = LLVMBuildAnd(builder, exec_mask, cond, "");
> + cond, "cond_mask");
> +   cond_mask = LLVMBuildNot(builder, cond, "break_cond");
>  
> mask->break_mask = LLVMBuildAnd(builder,
> mask->break_mask,
> -   exec_mask, "break_full");
> +   cond_mask, "breakc_full");
>  
> lp_exec_mask_update(mask);
>  }
So the old logic did ((~exec_mask) & cond) & break_mask,
whereas the new is (~(exec_mask & cond)) & break_mask.
That is not just inverting the cond bits, which is what I gathered from
the commit message (that would be ~(exec_mask | cond)), but I guess it's
right...
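
Editorial illustration of the three expressions compared above, evaluated on plain integer lane masks (one bit per lane). This is a sketch, not gallivm code: the function names are hypothetical, and the "new" variant follows the reading given in the review. The hunk as quoted passes cond, rather than cond_mask, to LLVMBuildNot, which would compute something different again.

```c
#include <stdint.h>

/* Old logic: ((~exec_mask) & cond) & break_mask */
static uint32_t break_mask_old(uint32_t exec_mask, uint32_t cond,
                               uint32_t break_mask)
{
   return (~exec_mask & cond) & break_mask;
}

/* New logic as read in the review: (~(exec_mask & cond)) & break_mask */
static uint32_t break_mask_new(uint32_t exec_mask, uint32_t cond,
                               uint32_t break_mask)
{
   return ~(exec_mask & cond) & break_mask;
}

/* Old logic with only the cond bits inverted:
 * (~exec_mask & ~cond) & break_mask, which by De Morgan's law equals
 * ~(exec_mask | cond) & break_mask. */
static uint32_t break_mask_cond_inverted(uint32_t exec_mask, uint32_t cond,
                                         uint32_t break_mask)
{
   return ~(exec_mask | cond) & break_mask;
}
```

With exec_mask = 0xC, cond = 0xA and break_mask = 0xF, the three functions return 0x2, 0x7 and 0x1 respectively. Only the second clears exactly the lane that is both executing and has cond set (bit 3) while leaving the other lanes in the break mask.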


> @@ -287,14 +286,14 @@ static void lp_exec_endloop(struct gallivm_state 
> *gallivm,
>builder,
>LLVMIntNE,
>LLVMBuildBitCast(builder, mask->exec_mask, reg_type, ""),
> -  LLVMConstNull(reg_type), "");
> +  LLVMConstNull(reg_type), "i1cond");
>  
> /* i2cond = (looplimiter > 0) */
> i2cond = LLVMBuildICmp(
>builder,
>LLVMIntSGT,
>limiter,
> -  LLVMConstNull(int_type), "");
> +  LLVMConstNull(int_type), "i2cond");
>  
> /* if( i1cond && i2cond ) */
> icond = LLVMBuildAnd(builder, i1cond, i2cond, "");
> @@ -2298,13 +2297,16 @@ breakc_emit(
> struct lp_build_tgsi_context * bld_base,
> struct lp_build_emit_data * emit_data)
>  {
> -   LLVMValueRef tmp;
> struct lp_build_tgsi_soa_context * bld = lp_soa_context(bld_base);
> +   LLVMBuilderRef builder = bld_base->base.gallivm->builder;
> +   struct lp_build_context *uint_bld = &bld_base->uint_bld;
> +   LLVMValueRef unsigned_cond = 
> +  LLVMBuildBitCast(builder, emit_data->args[0], uint_bld->vec_type, "");
> +   LLVMValueRef cond = lp_build_cmp(uint_bld, PIPE_FUNC_NOTEQUAL,
> +unsigned_cond,
> +uint_bld->zero);
>  
> -   tmp = lp_build_cmp(&bld_base->base, PIPE_FUNC_NOTEQUAL,
> -  emit_data->args[0], bld->bld_base.base.zero);
> -
> -   lp_exec_break_condition(&bld->exec_mask, tmp);
> +   lp_exec_break_condition(&bld->exec_mask, cond);
>  }
>  

I think you could avoid doing the bitcast manually if the type of the
breakc src were set correctly in tgsi_opcode_infer_src_type().
(breakc also seems very poorly documented: it is listed under the ps2_x
section, which doesn't even look correct, and there is no description of
what it does or what its arguments are.)

Otherwise looks good to me.

Roland