date:20140930

Re: [Mesa-dev] [PATCH] glsl: make consistent use of DECLARE_RALLOC_CXX_OPERATORS

2014-09-30 Thread Kenneth Graunke

On Tuesday, September 30, 2014 12:14:06 AM Ilia Mirkin wrote:
> Signed-off-by: Ilia Mirkin 
> ---
> 
> Noticed this when investigating how ralloc worked. I'm moderately sure that
> the old code was fine, but seems nicer to use the cooked known-to-work macro.

Yeah, Curro fixed this to actually hook up the destructor, so we can use it now
(it didn't use to).

Reviewed-by: Kenneth Graunke 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mesa: relax draw api validation on ES2

2014-09-30 Thread Tapani Pälli

Patch fixes failing test in WebGL conformance test
'point-no-attributes' when running Chrome on OpenGL ES.
(Shader program may draw points using constant data in shader.)

No Piglit regressions.

Signed-off-by: Tapani Pälli 
---
 src/mesa/main/api_validate.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index 51a3d1f..9b80600 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -112,9 +112,8 @@ check_valid_to_render(struct gl_context *ctx, const char *function)
 
switch (ctx->API) {
case API_OPENGLES2:
-  /* For ES2, we can draw if any vertex array is enabled (and we
-   * should always have a vertex program/shader). */
-  if (ctx->Array.VAO->_Enabled == 0x0 || !ctx->VertexProgram._Current)
+  /* For ES2, we can draw if we have a vertex program/shader). */
+  if (!ctx->VertexProgram._Current)
 return GL_FALSE;
   break;
 
-- 
1.9.3


Re: [Mesa-dev] [PATCH] mesa: relax draw api validation on ES2

2014-09-30 Thread Kenneth Graunke

On Tuesday, September 30, 2014 10:28:26 AM Tapani Pälli wrote:
> Patch fixes failing test in WebGL conformance test
> 'point-no-attributes' when running Chrome on OpenGL ES.
> (Shader program may draw points using constant data in shader.)
> 
> No Piglit regressions.
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/api_validate.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
> index 51a3d1f..9b80600 100644
> --- a/src/mesa/main/api_validate.c
> +++ b/src/mesa/main/api_validate.c
> @@ -112,9 +112,8 @@ check_valid_to_render(struct gl_context *ctx, const char *function)
>  
> switch (ctx->API) {
> case API_OPENGLES2:
> -  /* For ES2, we can draw if any vertex array is enabled (and we
> -   * should always have a vertex program/shader). */
> -  if (ctx->Array.VAO->_Enabled == 0x0 || !ctx->VertexProgram._Current)
> +  /* For ES2, we can draw if we have a vertex program/shader). */
> +  if (!ctx->VertexProgram._Current)
>return GL_FALSE;
>break;

Looks right to me.  The git history shows that it's been this way since it was 
written 5 years ago, and I see no comments, git commit explanations, or spec 
text saying why it should be like this.  Using constant data seems totally 
reasonable, and we allow it on GL.
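
The effect of the relaxed check can be sketched in isolation. This is only an illustration: `draw_state`, `es2_can_draw_old` and `es2_can_draw_new` are hypothetical stand-ins for the two `gl_context` fields the patch touches, not Mesa's real types or functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the two gl_context fields the
 * patch looks at; this is not Mesa's real structure. */
struct draw_state {
   uint64_t enabled_arrays;     /* models ctx->Array.VAO->_Enabled */
   const void *vertex_program;  /* models ctx->VertexProgram._Current */
};

/* Old rule: at least one enabled vertex array AND a vertex program. */
static bool
es2_can_draw_old(const struct draw_state *s)
{
   return s->enabled_arrays != 0 && s->vertex_program != NULL;
}

/* New rule: a vertex program alone is sufficient, so a shader that
 * generates its positions from constant data may draw. */
static bool
es2_can_draw_new(const struct draw_state *s)
{
   return s->vertex_program != NULL;
}
```

With no arrays enabled but a program bound, the old predicate rejects the draw and the new one accepts it, which is the 'point-no-attributes' situation described in the commit message.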

Thanks, Tapani.

Reviewed-by: Kenneth Graunke 


[Mesa-dev] [PATCH 2/2] i965: Use BDW_MOCS_PTE for renderbuffers.

2014-09-30 Thread Kenneth Graunke

Write-back caching cannot be used for buffers being scanned out by the
display engine; surfaces used for scan-out must be write-through or
uncached.  I originally chose WT for render targets because it works in
all cases.  However, we really want to use write-back caching where
possible, as it is more efficient.

Most renderbuffers are not used for scanout - off-screen FBOs certainly
are fine, and non-pageflipped backbuffers should be fine as well.  So
in most cases WB will work.  However, we don't know what will be used
for scan-out, so we instead simply use the PTE value specified by the
kernel, as it knows these things.

This matches our MOCS choice on Haswell.

Fixes performance regressions since commit ee4484be3dc827cf15bcf109f5
in a microbenchmark (spotted by Eero Tamminen).  Improves performance
in GLBenchmark 2.7/EgyptHD by 7.44362% +/- 0.496939% (n=55) on a
Broadwell GT2.

Signed-off-by: Kenneth Graunke 
Reported-by: Eero Tamminen 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/drivers/dri/i965/gen8_surface_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Cc'd to stable because it's a pretty trivial change and provides a sizable
boost to performance on new hardware.

diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index 40eb2ea..6dd343f 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -377,7 +377,7 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
  horizontal_alignment(mt) |
  surface_tiling_mode(tiling);
 
-   surf[1] = SET_FIELD(BDW_MOCS_WT, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
+   surf[1] = SET_FIELD(BDW_MOCS_PTE, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
 
surf[2] = SET_FIELD(width - 1, GEN7_SURFACE_WIDTH) |
  SET_FIELD(height - 1, GEN7_SURFACE_HEIGHT);
-- 
2.1.1


[Mesa-dev] [PATCH 1/2] i965: Add a BRW_MOCS_PTE #define.

2014-09-30 Thread Kenneth Graunke

Like BDW_MOCS_WB and BDW_MOCS_WT, this specifies that we want to use all
three caches (L3, LLC, and eLLC where available), but leaves the LLC
caching mode up to the kernel's page table entry.

This allows the kernel to pick WB/WT/UC based on whether it's using a
buffer for scanout.
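
To see what distinguishes the three values, the constants from this patch can be compared bit by bit. The sketch below is illustrative only: the `MOCS_COMMON_BITS` and `MOCS_MODE_BITS` names and the `pack_mocs` helper are made up here, and the shift of 24 with a 7-bit mask is an assumption standing in for `GEN8_SURFACE_MOCS` that should be checked against brw_defines.h.

```c
#include <stdint.h>

/* Values taken from the patch. */
#define BDW_MOCS_WB  0x78
#define BDW_MOCS_WT  0x58
#define BDW_MOCS_PTE 0x18

/* Bits shared by all three values (the "use all caches" part) and the
 * bits in which they differ (the caching mode).  Names are illustrative. */
#define MOCS_COMMON_BITS 0x18
#define MOCS_MODE_BITS   0x60

/* Hypothetical helper modelled on SET_FIELD(): shift a value into a
 * field and mask it.  Shift and mask width are assumptions. */
#define EXAMPLE_MOCS_SHIFT 24
#define EXAMPLE_MOCS_MASK  (0x7fu << EXAMPLE_MOCS_SHIFT)

static uint32_t
pack_mocs(uint32_t mocs, uint32_t qpitch)
{
   uint32_t field = (mocs << EXAMPLE_MOCS_SHIFT) & EXAMPLE_MOCS_MASK;

   /* Same shape as the surf[1] expression: MOCS field OR'd with qpitch / 4. */
   return field | (qpitch >> 2);
}
```

All three defines share the 0x18 bits; WB and WT additionally set mode bits (0x60 and 0x40), while PTE leaves them clear so the page table entry decides.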

Signed-off-by: Kenneth Graunke 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/drivers/dri/i965/brw_defines.h | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

Cc'd to stable because it's required by the next patch.

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
index 2faebe8..5d09409 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -2386,8 +2386,12 @@ enum brw_wm_barycentric_interp_mode {
 #define HSW_MOCS_WB_LLC_WB_ELLC (2 << 1)
 #define HSW_MOCS_UC_LLC_WB_ELLC (3 << 1)
 
-/* Broadwell: write-back or write-through; always use all the caches. */
-#define BDW_MOCS_WB 0x78
-#define BDW_MOCS_WT 0x58
+/* Broadwell: these defines always use all available caches (L3, LLC, eLLC),
+ * and let you force write-back (WB) or write-through (WT) caching, or leave
+ * it up to the page table entry (PTE) specified by the kernel.
+ */
+#define BDW_MOCS_WB  0x78
+#define BDW_MOCS_WT  0x58
+#define BDW_MOCS_PTE 0x18
 
 #endif
-- 
2.1.1


Re: [Mesa-dev] [PATCH 2/2] i965: Use BDW_MOCS_PTE for renderbuffers.

2014-09-30 Thread Daniel Vetter

On Tue, Sep 30, 2014 at 01:15:56AM -0700, Kenneth Graunke wrote:
> Write-back caching cannot be used for buffers being scanned out by the
> display engine; surfaces used for scan-out must be write-through or
> uncached.  I originally chose WT for render targets because it works in
> all cases.  However, we really want to use write-back caching where
> possible, as it is more efficient.
> 
> Most renderbuffers are not used for scanout - off-screen FBOs certainly
> are fine, and non-pageflipped backbuffers should be fine as well.  So
> in most cases WB will work.  However, we don't know what will be used
> for scan-out, so we instead simply use the PTE value specified by the
> kernel, as it knows these things.
> 
> This matches our MOCS choice on Haswell.
> 
> Fixes performance regressions since commit ee4484be3dc827cf15bcf109f5
> in a microbenchmark (spotted by Eero Tamminen).  Improves performance
> in GLBenchmark 2.7/EgyptHD by 7.44362% +/- 0.496939% (n=55) on a
> Broadwell GT2.
> 
> Signed-off-by: Kenneth Graunke 
> Reported-by: Eero Tamminen 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/mesa/drivers/dri/i965/gen8_surface_state.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Cc'd to stable because it's a pretty trivial change and provides a sizable
> boost to performance on new hardware.

Both patches are Reviewed-by: Daniel Vetter 

Aside: Not using WT on display can lead to corruption (apparently bdw is
fairly aggressive with writeback so hard to spot in reality), so imo
definitely stable material.

With the hw display crc stuff we now support in the kernel/igt we could
even write an automated testcase for these corruptions, but probably not
worth the hassle.
-Daniel

> 
> diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> index 40eb2ea..6dd343f 100644
> --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> @@ -377,7 +377,7 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
>   horizontal_alignment(mt) |
>   surface_tiling_mode(tiling);
>  
> -   surf[1] = SET_FIELD(BDW_MOCS_WT, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
> +   surf[1] = SET_FIELD(BDW_MOCS_PTE, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
>  
> surf[2] = SET_FIELD(width - 1, GEN7_SURFACE_WIDTH) |
>   SET_FIELD(height - 1, GEN7_SURFACE_HEIGHT);
> -- 
> 2.1.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

Re: [Mesa-dev] [PATCH 1/4] radeonsi: Fix tiling mode index for stencil resources

2014-09-30 Thread Marek Olšák

For the series:

Reviewed-by: Marek Olšák 

Marek

On Tue, Sep 30, 2014 at 5:58 AM, Michel Dänzer  wrote:
> From: Michel Dänzer 
>
> We are currently only dealing with depth-only or stencil-only resources
> here, not with resources having both depth and stencil[0]. In both cases,
> the tiling mode index is in the tile_mode field, not in the
> stencil_tile_mode field.
>
> [0] Add an assertion for that.
>
> Signed-off-by: Michel Dänzer 
> ---
>  src/gallium/drivers/radeonsi/si_dma.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_dma.c b/src/gallium/drivers/radeonsi/si_dma.c
> index c067cd9..cd6ff4a 100644
> --- a/src/gallium/drivers/radeonsi/si_dma.c
> +++ b/src/gallium/drivers/radeonsi/si_dma.c
> @@ -162,6 +162,8 @@ static void si_dma_copy_tile(struct si_context *ctx,
> tiled_y = detile ? src_y : dst_y;
> tiled_z = detile ? src_z : dst_z;
>
> +   assert(!util_format_is_depth_and_stencil(rtiled->resource.b.b.format));
> +
> array_mode = si_array_mode(rtiled->surface.level[tiled_lvl].mode);
> slice_tile_max = (rtiled->surface.level[tiled_lvl].nblk_x *
>   rtiled->surface.level[tiled_lvl].nblk_y) / (8*8) - 1;
> @@ -179,8 +181,7 @@ static void si_dma_copy_tile(struct si_context *ctx,
> bank_w = cik_bank_wh(rtiled->surface.bankw);
> mt_aspect = cik_macro_tile_aspect(rtiled->surface.mtilea);
> tile_split = cik_tile_split(rtiled->surface.tile_split);
> -   tile_mode_index = si_tile_mode_index(rtiled, tiled_lvl,
> -                                        util_format_has_stencil(util_format_description(rtiled->resource.b.b.format)));
> +   tile_mode_index = si_tile_mode_index(rtiled, tiled_lvl, false);
> nbanks = si_num_banks(sscreen, rtiled);
> base += rtiled->resource.gpu_address;
> addr += rlinear->resource.gpu_address;
> --
> 2.1.1
>

Re: [Mesa-dev] [PATCH libdrm] radeon: Always multiply pitch_bytes by nsamples, not by slice_pt

2014-09-30 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek
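
The arithmetic behind the fix quoted below can be illustrated with a small sketch. The two formulas are copied from the diff; the wrapper functions, the clamp to at least 1 and the sample numbers are assumptions made only for this example.

```c
/* Illustrative only: simplified versions of the two pitch formulas from
 * the quoted diff.  Parameter names follow the patch; the functions
 * themselves are hypothetical. */
static unsigned
pitch_bytes_old(unsigned nblk_x, unsigned bpe, unsigned slice_pt)
{
   return nblk_x * bpe * slice_pt;
}

static unsigned
pitch_bytes_new(unsigned nblk_x, unsigned bpe, unsigned nsamples)
{
   return nblk_x * bpe * nsamples;
}

/* slice_pt as described in the commit message: tile size in bytes divided
 * by tile_split.  Clamping to at least 1 is an assumption of this sketch. */
static unsigned
slice_pt_from(unsigned tileb, unsigned tile_split)
{
   unsigned n = tile_split ? tileb / tile_split : 1;

   return n ? n : 1;
}
```

For a single-sampled surface with 16 bytes per element, 64 blocks per row, an assumed tile size of 1024 bytes and a tile_split of 256, slice_pt is 4, so the old formula yields 4096 bytes where the new one yields 1024.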

On Tue, Sep 30, 2014 at 5:58 AM, Michel Dänzer  wrote:
> From: Michel Dänzer 
>
> slice_pt is tileb[0] / tile_split, which isn't directly related to the
> pitch.
>
> This caused pitch_bytes to be too large in some cases.
>
> [0] Tile size in bytes
>
> Signed-off-by: Michel Dänzer 
> ---
>  radeon/radeon_surface.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/radeon/radeon_surface.c b/radeon/radeon_surface.c
> index 0723425..930017e 100644
> --- a/radeon/radeon_surface.c
> +++ b/radeon/radeon_surface.c
> @@ -595,7 +595,7 @@ static void eg_surf_minify(struct radeon_surface *surf,
>  mtile_ps = (mtile_pr * surflevel->nblk_y) / mtileh;
>
>  surflevel->offset = offset;
> -surflevel->pitch_bytes = surflevel->nblk_x * bpe * slice_pt;
> +surflevel->pitch_bytes = surflevel->nblk_x * bpe * surf->nsamples;
>  surflevel->slice_size = mtile_ps * mtileb * slice_pt;
>
>  surf->bo_size = offset + surflevel->slice_size * surflevel->nblk_z * surf->array_size;
> @@ -1498,7 +1498,7 @@ static void si_surf_minify_2d(struct radeon_surface *surf,
>  /* macro tile per slice */
>  mtile_ps = (mtile_pr * surflevel->nblk_y) / yalign;
>  surflevel->offset = offset;
> -surflevel->pitch_bytes = surflevel->nblk_x * bpe * slice_pt;
> +surflevel->pitch_bytes = surflevel->nblk_x * bpe * surf->nsamples;
>  surflevel->slice_size = mtile_ps * mtileb * slice_pt;
>
>  surf->bo_size = offset + surflevel->slice_size * surflevel->nblk_z * surf->array_size;
> --
> 2.1.1
>

[Mesa-dev] [Bug 84242] FTBFS: libOpenCL.so.1.0.0: ld: .eh_frame_hdr table[5707] FDE at 0000000000c45b8c overlaps table[5708] FDE at 0000000000c45a88

2014-09-30 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=84242

--- Comment #10 from David Kredba  ---
The same result with gcc 5.0 svn rev. 215679.

.eh_frame_hdr table[5712] FDE at 00c45788 overlaps table[5713] FDE at
00c45684.

Next I will try an older binutils with the two gcc versions I used before.

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH v3] glsl: Optimize min/max expression trees

2014-09-30 Thread Iago Toral Quiroga

I just noticed that we should add:

index a4fe2bd..ca53eb8 100644
--- a/src/glsl/opt_minmax.cpp
+++ b/src/glsl/opt_minmax.cpp
@@ -415,6 +415,17 @@ ir_minmax_visitor::prune_expression(ir_expression *expr, minmax_range baserange)
   }
}
 
+   /* If we got here we could not discard any of the operands of the minmax
+* expression, but we can still try to resolve the expression if both
+* operands are constant. We do this after the loop above, to make sure
+* that if our operands are minmax expressions we have tried to prune them
+* first (hopefully reducing them to constants).
+*/
+   ir_constant *a = expr->operands[0]->as_constant();
+   ir_constant *b = expr->operands[1]->as_constant();
+   if (a && b)
+  return combine_constant(ismin, a, b);
+
return expr;
 }

at the bottom of prune_expression. This makes sure that when we prune
the operands of a minmax expression to constants, we also resolve the
parent expression to a constant; otherwise we would leave the parent with
two constant arguments. I noticed this while reworking the unit tests
for mixed vectors.
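
What resolving the parent expression to a constant amounts to can be sketched as a component-wise fold. This is a plain C model, not the pass's code: `const_vec` and `fold_minmax` are hypothetical names that only imitate what `combine_constant` is described as doing for two constant operands.

```c
#include <stdbool.h>

#define VEC_SIZE 4

/* Hypothetical stand-in for a constant vector operand. */
struct const_vec {
   float v[VEC_SIZE];
};

/* Component-wise min or max of two constant vectors.  For "mixed"
 * vectors, where neither operand is smaller in every component, the
 * result takes some components from each operand. */
static struct const_vec
fold_minmax(bool is_min, struct const_vec a, struct const_vec b)
{
   struct const_vec r;

   for (int i = 0; i < VEC_SIZE; i++) {
      if (is_min)
         r.v[i] = a.v[i] < b.v[i] ? a.v[i] : b.v[i];
      else
         r.v[i] = a.v[i] > b.v[i] ? a.v[i] : b.v[i];
   }
   return r;
}
```

Once both operands of a min or max have been pruned down to constants, a fold like this replaces the whole expression with a single constant.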

Connor: if you give the okay to this change I will squash it in before
pushing.

Iago

On Mon, 2014-09-29 at 13:19 -0400, Connor Abbott wrote:
> On Mon, Sep 29, 2014 at 7:49 AM, Iago Toral Quiroga  wrote:
> > Original patch by Petri Latvala :
> >
> > Add an optimization pass that drops min/max expression operands that
> > can be proven to not contribute to the final result. The algorithm is
> > similar to alpha-beta pruning on a minmax search, from the field of
> > AI.
> >
> > This optimization pass can optimize min/max expressions where operands
> > are min/max expressions. Such code can appear in shaders by itself, or
> > as the result of clamp() or AMD_shader_trinary_minmax functions.
> >
> > This optimization pass improves the generated code for piglit's
> > AMD_shader_trinary_minmax tests as follows:
> >
> > total instructions in shared programs: 75 -> 67 (-10.67%)
> > instructions in affected programs: 60 -> 52 (-13.33%)
> > GAINED:0
> > LOST:  0
> >
> > All tests (max3, min3, mid3) improved.
> >
> > A full shader-db run:
> >
> > total instructions in shared programs: 4293603 -> 4293575 (-0.00%)
> > instructions in affected programs: 1188 -> 1160 (-2.36%)
> > GAINED:0
> > LOST:  0
> >
> > Improvements happen in Guacamelee and Serious Sam 3. One shader from
> > Dungeon Defenders is hurt by shader-db metrics (26 -> 28), because of
> > dropping of a (constant float (0.0)) operand, which was
> > compiled to a saturate modifier.
> >
> > Version 2 by Iago Toral Quiroga :
> >
> > Changes from review feedback:
> > - Squashed various cosmetic changes sent by Matt Turner.
> > - Make less_all_components return an enum rather than setting a class
> >   member. (Suggested by Matt Turner). Also, renamed it to
> >   compare_components.
> > - Make less_all_components, smaller_constant and larger_constant static.
> >   (Suggested by Matt Turner)
> > - Change minmax_range to call its limits "low" and "high" instead of
> >   "range[0]" and "range[1]". (Suggested by Connor Abbott).
> > - Use ir_builder swizzle helpers in swizzle_if_required(). (Suggested by
> >   Connor Abbott).
> > - Make the logic clearer by rearranging the code and commenting.
> >   (Suggested by Connor Abbott).
> > - Added comment to explain why we need to recurse twice. (Suggested by
> >   Connor Abbott).
> > - If we cannot prune an expression, do not return early. Instead, attempt
> >   to prune its children. (Suggested by Connor Abbott).
> >
> > Other changes:
> > - Instead of having a global "valid" visitor member, let the various
> >   functions that can determine this status return a boolean and check
> >   its value to decide what to do in each case. This is more flexible and
> >   allows us to recurse into children of parents that could not be pruned
> >   due to invalid ranges (so related to the last bullet in the review
> >   feedback).
> > - Make sure we always check if a range is valid before working with it.
> >   Since any use of get_range, combine_range or range_intersection can
> >   invalidate a range, we should check for this situation every time we
> >   use any of these functions.
> >
> > Version 3 by Iago Toral Quiroga :
> >
> > Changes from review feedback:
> > - Now we can make get_range, combine_range and range_intersection static too
> >   (suggested by Connor Abbott).
> > - Do not return NULL when looking for the smaller or larger constant in
> >   mixed vector constants. Instead, produce a new constant by doing a
> >   component-wise minmax. With this we can also remove some of the
> >   validations when we call into these functions (suggested by Connor
> >   Abbott).
> > - Add a comment explaining the meaning of the baserange argument in
> >   prune_expression (suggested by Connor Abbott).
> >
> > Oth

Re: [Mesa-dev] [PATCH] egl: setup screen iterator before using it

2014-09-30 Thread Juha-Pekka Heikkila

On 29.09.2014 19:07, Matt Turner wrote:
> On Mon, Sep 29, 2014 at 5:08 AM, Tapani Pälli  wrote:
>> commit 4ed23fd broke creation of pbuffer surfaces, patch fixes
>> the failure, noticed when running chrome with '--use-gl=egl'.
> 
> Cc'ing JP so he can review as well.
> 
> Reviewed-by: Matt Turner 
> 

Just to ack, we discussed this with Tapani last night and the patch was
already committed as

Reviewed-by: Juha-Pekka Heikkila 

[Mesa-dev] SandyBridge's 'resinfo' -> returned value for SURFTYPE_BUFFER?

2014-09-30 Thread Samuel Iglesias Gonsálvez

Hello,

I am looking at bug 57439 [0], which reports an error in a piglit test [1]
related to the textureSize() function on Intel SandyBridge hardware.

According to SNB's PRM documentation (vol4 part1 page 141), the
returned value for SURFTYPE_BUFFER (the surface type used in the test)
is not defined in the 'resinfo' message type. In IvyBridge's documentation
it is defined as the buffer size, which is calculated from the combined
Depth/Height/Width values.

As it is not clear that SNB returns the same value as IVB for that
kind of message and surface type, I am sending this email to ask for
clarification :-)

Best regards,

Sam

[0] https://bugs.freedesktop.org/show_bug.cgi?id=57439
[1] ./bin/textureSize 140 fs samplerBuffer -auto -fbo



[Mesa-dev] [Bug 84242] FTBFS: libOpenCL.so.1.0.0: ld: .eh_frame_hdr table[5707] FDE at 0000000000c45b8c overlaps table[5708] FDE at 0000000000c45a88

2014-09-30 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=84242

--- Comment #11 from David Kredba  ---
With Gentoo vanilla binutils 2.24-r3 with two slim LTO patches and the patch
referred to by Emil Velikov in Comment #3

https://projects.archlinux.org/svntogit/packages.git/plain/trunk/binutils-2.24-shared-pie.patch?h=packages/binutils&id=47bdd59a9967ee8dd2bcc47797855185c6471546

it builds fine even with LTO enabled (using a trick of calling configure with
LTO turned off and then removing -fno-lto -fno-use-linker-plugin from each
Makefile).

So trunk binutils seems to be the source of the problem.
I will have to start bisecting it.


[Mesa-dev] [Bug 81680] [r600g] Firefox crashes with hardware acceleration turned on

2014-09-30 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=81680

Marek Olšák  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #105815|0                           |1
        is obsolete|                            |

--- Comment #40 from Marek Olšák  ---
Created attachment 107124
  --> https://bugs.freedesktop.org/attachment.cgi?id=107124&action=edit
possible fix

Could you please test this patch?


Re: [Mesa-dev] [PATCH v2] replace file specific compiler optimization with inline attribute

2014-09-30 Thread Marc Dietrich

Hi Matt,

On Thursday, 25 September 2014, 09:56:42, Marc Dietrich wrote:
> On Wednesday, 24 September 2014, 18:35:24, Matt Turner wrote:
> > On Wed, Sep 24, 2014 at 6:25 AM, Marc Dietrich  wrote:
> > > On Monday, 22 September 2014, 11:48:29, Matt Turner wrote:
> > >> We need a configure check for support for __attribute__((target)). I'm
> > >> going to send a series that adds support for this (and does the check
> > >> for existing attribute uses, so once that goes in you can rebase this
> > >> patch on that).
> > > 
> > > nice, but won't work with the workaround above. Pragma and attribute
> > > do the same, so we could check for the attribute and use the pragma
> > > instead.
> > 
> > I wonder if the best thing to do is to add target("sse4.1") in
> > addition to using -msse4.1. That way, we'll retain compatibility with
> 
> The idea of this patch was to remove per file optimization flags because
> this breaks LTO. LTO will recompile all files during the final link and
> apply any "high-level" compiler flags from a single file (e.g. -msse4.1) to
> all files used in the linking process.

I tried to find some hints on how gcc handles this. Unfortunately, the gcc docs
aren't very helpful [1] and I failed to construct a test case :-( I tend to
say that gcc does not apply the target options in the final link to *all*
files, so this problem does not seem to exist at all (I'm running LTO-compiled
mesa on amdfam10h, which has no sse4.1 support, and have seen no crashes so
far). As a side note, using "-msse4.1 -fno-lto" would prevent it in any case
and also be compatible with clang.
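
For reference, the per-function approach discussed in this thread could look roughly like the sketch below. It assumes GCC or a compatible compiler on x86 and falls back to plain C elsewhere; the function names are invented for illustration and none of this is taken from the Mesa patch.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_TARGET_ATTR 1

/* Only this function is compiled with SSE4.1 enabled; the rest of the
 * file keeps the baseline instruction set, so no per-file -msse4.1 flag
 * is needed. */
__attribute__((target("sse4.1")))
static int
sum_sse41(const int *v, size_t n)
{
   int s = 0;

   for (size_t i = 0; i < n; i++)
      s += v[i];
   return s;
}
#endif

/* Baseline version that runs on any CPU. */
static int
sum_generic(const int *v, size_t n)
{
   int s = 0;

   for (size_t i = 0; i < n; i++)
      s += v[i];
   return s;
}

/* Pick the SSE4.1 variant only when the running CPU reports support. */
static int
sum_dispatch(const int *v, size_t n)
{
#ifdef HAVE_TARGET_ATTR
   if (__builtin_cpu_supports("sse4.1"))
      return sum_sse41(v, n);
#endif
   return sum_generic(v, n);
}
```

The runtime check matters on CPUs such as the amdfam10h machine mentioned above, where calling the SSE4.1 variant unconditionally could fault.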

Marc

[1]: info gcc on -flto:
 When producing the final binary with `-flto', GCC only applies
 link-time optimizations to those files that contain bytecode.
 Therefore, you can mix and match object files and libraries with
 GIMPLE bytecodes and final object code.  GCC automatically selects
 which files to optimize in LTO mode and which files to link without
 further processing.

 There are some code generation flags preserved by GCC when
 generating bytecodes, as they need to be used during the final link
 stage.  Currently, the following options are saved into the GIMPLE
 bytecode files: `-fPIC', `-fcommon' and all the `-m' target flags.

 At link time, these options are read in and reapplied.  Note that
 the current implementation makes no attempt to recognize
 conflicting values for these options.  If different files have
 conflicting option values (e.g., one file is compiled with `-fPIC'
 and another isn't), the compiler simply uses the last value read
 from the bytecode files.  It is recommended, then, that you
 compile all the files participating in the same link with the same
 options.


[Mesa-dev] [PATCH] llvmpipe: move lp_jit_screen_init() call after allocation of screen object

2014-09-30 Thread Brian Paul

The screen argument isn't actually used by lp_jit_screen_init() at this
time, but let's move the call so that we pass a valid pointer.

v2: don't leak screen if lp_jit_screen_init() fails.
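
The allocate-then-initialize ordering with cleanup on failure can be sketched as follows. This is a generic illustration with hypothetical names (`example_screen`, `example_jit_init`, `example_create_screen`) and plain calloc/free in place of the Gallium allocation macros; it is not the llvmpipe code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the screen object and its JIT setup. */
struct example_screen {
   int jit_ready;
};

/* Models an init step that may fail; should_fail exists only so the
 * failure path can be exercised in this sketch. */
static bool
example_jit_init(struct example_screen *screen, bool should_fail)
{
   if (should_fail)
      return false;
   screen->jit_ready = 1;
   return true;
}

static struct example_screen *
example_create_screen(bool jit_should_fail)
{
   struct example_screen *screen = calloc(1, sizeof(*screen));

   if (!screen)
      return NULL;

   /* Initialize only after the allocation succeeded, and release the
    * object if initialization fails so nothing is leaked. */
   if (!example_jit_init(screen, jit_should_fail)) {
      free(screen);
      return NULL;
   }
   return screen;
}
```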
---
 src/gallium/drivers/llvmpipe/lp_screen.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 3025322..a264f99 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -557,9 +557,6 @@ llvmpipe_create_screen(struct sw_winsys *winsys)
return NULL;
 #endif
 
-   if (!lp_jit_screen_init(screen))
-  return NULL;
-
 #ifdef DEBUG
LP_DEBUG = debug_get_flags_option("LP_DEBUG", lp_debug_flags, 0 );
 #endif
@@ -570,6 +567,11 @@ llvmpipe_create_screen(struct sw_winsys *winsys)
if (!screen)
   return NULL;
 
+   if (!lp_jit_screen_init(screen)) {
+  FREE(screen);
+  return NULL;
+   }
+
screen->winsys = winsys;
 
screen->base.destroy = llvmpipe_destroy_screen;
-- 
1.7.10.4


Re: [Mesa-dev] [PATCH 1/2] llvmpipe: move lp_jit_screen_init() call after allocation of screen object

2014-09-30 Thread Brian Paul


On 09/29/2014 07:46 PM, Michel Dänzer wrote:
> On 30.09.2014 10:45, Michel Dänzer wrote:
>> On 30.09.2014 07:16, Brian Paul wrote:
>>> The screen argument isn't actually used by lp_jit_screen_init() at this
>>> time,
>>
>> I guess that's why gcc didn't warn about it?
>
> Nope, it actually does warn about it. Mea culpa for not noticing that.

Yeah, I patched this after seeing the gcc warning.  New, non-leaking
patch posted.  Thanks, Michel.

-Brian



[Mesa-dev] [Bug 84242] FTBFS: libOpenCL.so.1.0.0: ld: .eh_frame_hdr table[5707] FDE at 0000000000c45b8c overlaps table[5708] FDE at 0000000000c45a88

2014-09-30 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=84242

--- Comment #12 from Emil Velikov  ---
(In reply to comment #11)
> With Gentoo vanilla binutils 2.24-r3 with two slim LTO patches and the patch
> referred by Emil Velikov in Comment #3
> 
> https://projects.archlinux.org/svntogit/packages.git/plain/trunk/binutils-2.24-shared-pie.patch?h=packages/binutils&id=47bdd59a9967ee8dd2bcc47797855185c6471546
> 
> it builds fine even with LTO enabled (using a trick with calling configure
> with LTO turned off and then -fno-lto -fno-use-linker-plugin removed from
> each Makefile).
> 
> So trunk binutils seems to be source of the problem.
> I have to start with bisecting them.

Nicely done. I hope that the problem does not end up a sheep in wolf's clothing
- i.e. somewhere else. Using gcc+binutils to compile a third library, which
links to another two compiler(ish) products... there are so many things that
can be happening in there.

Thank you for the great initiative :)


Re: [Mesa-dev] [PATCH] mesa: relax draw api validation on ES2

2014-09-30 Thread Ian Romanick

On 09/30/2014 12:28 AM, Tapani Pälli wrote:
> Patch fixes failing test in WebGL conformance test
> 'point-no-attributes' when running Chrome on OpenGL ES.
> (Shader program may draw points using constant data in shader.)
> 
> No Piglit regressions.

This sounds believable.  Did you also try the ES2 or ES3 conformance
suite?  I could have sworn that we had a bug related to this a long time
ago, and we discovered it using the conformance suite.

Either way, we should get a piglit test too... I think we have a test
for desktop OpenGL (maybe 3.1?), so it shouldn't be too hard to adapt that.

> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/api_validate.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
> index 51a3d1f..9b80600 100644
> --- a/src/mesa/main/api_validate.c
> +++ b/src/mesa/main/api_validate.c
> @@ -112,9 +112,8 @@ check_valid_to_render(struct gl_context *ctx, const char *function)
>  
> switch (ctx->API) {
> case API_OPENGLES2:
> -  /* For ES2, we can draw if any vertex array is enabled (and we
> -   * should always have a vertex program/shader). */
> -  if (ctx->Array.VAO->_Enabled == 0x0 || !ctx->VertexProgram._Current)
> +  /* For ES2, we can draw if we have a vertex program/shader). */
> +  if (!ctx->VertexProgram._Current)
>return GL_FALSE;
>break;
>  
> 


Re: [Mesa-dev] [PATCH] radeonsi: fix CS tracing and remove excessive CS dumping

2014-09-30 Thread Marek Olšák

Jerome,

Could you please review this?

Thanks,

Marek

On Sat, Sep 20, 2014 at 12:26 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> ---
>  src/gallium/drivers/radeonsi/si_hw_context.c | 36 ++----------------------------------
>  src/gallium/drivers/radeonsi/si_pipe.c       |  3 ++-
>  src/gallium/drivers/radeonsi/si_state_draw.c | 21 +++++++++++++++++++++
>  3 files changed, 25 insertions(+), 35 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_hw_context.c b/src/gallium/drivers/radeonsi/si_hw_context.c
> index eaefa6a..e030c75 100644
> --- a/src/gallium/drivers/radeonsi/si_hw_context.c
> +++ b/src/gallium/drivers/radeonsi/si_hw_context.c
> @@ -102,20 +102,8 @@ void si_context_gfx_flush(void *context, unsigned flags,
> /* force to keep tiling flags */
> flags |= RADEON_FLUSH_KEEP_TILING_FLAGS;
>
> -#if SI_TRACE_CS
> -   if (ctx->screen->b.trace_bo) {
> -   struct si_screen *sscreen = ctx->screen;
> -   unsigned i;
> -
> -   for (i = 0; i < cs->cdw; i++) {
> -   fprintf(stderr, "[%4d] [%5d] 0x%08x\n", sscreen->b.cs_count, i, cs->buf[i]);
> -   }
> -   sscreen->b.cs_count++;
> -   }
> -#endif
> -
> /* Flush the CS. */
> -   ctx->b.ws->cs_flush(cs, flags, fence, 0);
> +   ctx->b.ws->cs_flush(cs, flags, fence, ctx->screen->b.cs_count++);
> ctx->b.rings.gfx.flushing = false;
>
>  #if SI_TRACE_CS
> @@ -125,7 +113,7 @@ void si_context_gfx_flush(void *context, unsigned flags,
>
> for (i = 0; i < 10; i++) {
> usleep(5);
> -   if 
> (!ctx->ws->buffer_is_busy(sscreen->b.trace_bo->buf, RADEON_USAGE_READWRITE)) {
> +   if 
> (!ctx->b.ws->buffer_is_busy(sscreen->b.trace_bo->buf, 
> RADEON_USAGE_READWRITE)) {
> break;
> }
> }
> @@ -169,23 +157,3 @@ void si_begin_new_cs(struct si_context *ctx)
>
> ctx->b.initial_gfx_cs_size = ctx->b.rings.gfx.cs->cdw;
>  }
> -
> -#if SI_TRACE_CS
> -void si_trace_emit(struct si_context *sctx)
> -{
> -   struct si_screen *sscreen = sctx->screen;
> -   struct radeon_winsys_cs *cs = sctx->cs;
> -   uint64_t va;
> -
> -   va = sscreen->b.trace_bo->gpu_address;
> -   r600_context_bo_reloc(sctx, sscreen->b.trace_bo, 
> RADEON_USAGE_READWRITE);
> -   cs->buf[cs->cdw++] = PKT3(PKT3_WRITE_DATA, 4, 0);
> -   cs->buf[cs->cdw++] = 
> PKT3_WRITE_DATA_DST_SEL(PKT3_WRITE_DATA_DST_SEL_MEM_SYNC) |
> -   PKT3_WRITE_DATA_WR_CONFIRM |
> -   
> PKT3_WRITE_DATA_ENGINE_SEL(PKT3_WRITE_DATA_ENGINE_SEL_ME);
> -   cs->buf[cs->cdw++] = va & 0xFFFFFFFFUL;
> -   cs->buf[cs->cdw++] = (va >> 32UL) & 0xFFFFFFFFUL;
> -   cs->buf[cs->cdw++] = cs->cdw;
> -   cs->buf[cs->cdw++] = sscreen->b.cs_count;
> -}
> -#endif
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index 2cce5cc..cba6d98 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -94,7 +94,8 @@ static struct pipe_context *si_create_context(struct 
> pipe_screen *screen, void *
> }
>
> sctx->b.rings.gfx.cs = ws->cs_create(ws, RING_GFX, 
> si_context_gfx_flush,
> -sctx, NULL);
> +sctx, sscreen->b.trace_bo ?
> +   sscreen->b.trace_bo->cs_buf : 
> NULL);
> sctx->b.rings.gfx.flush = si_context_gfx_flush;
>
> si_init_all_descriptors(sctx);
> diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
> b/src/gallium/drivers/radeonsi/si_state_draw.c
> index 041..a475344 100644
> --- a/src/gallium/drivers/radeonsi/si_state_draw.c
> +++ b/src/gallium/drivers/radeonsi/si_state_draw.c
> @@ -1025,3 +1025,24 @@ void si_draw_vbo(struct pipe_context *ctx, const 
> struct pipe_draw_info *info)
> pipe_resource_reference(&ib.buffer, NULL);
> sctx->b.num_draw_calls++;
>  }
> +
> +#if SI_TRACE_CS
> +void si_trace_emit(struct si_context *sctx)
> +{
> +   struct si_screen *sscreen = sctx->screen;
> +   struct radeon_winsys_cs *cs = sctx->b.rings.gfx.cs;
> +   uint64_t va;
> +
> +   va = sscreen->b.trace_bo->gpu_address;
> +   r600_context_bo_reloc(&sctx->b, &sctx->b.rings.gfx, 
> sscreen->b.trace_bo,
> + RADEON_USAGE_READWRITE, RADEON_PRIO_MIN);
> +   radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 4, 0));
> +   radeon_emit(cs, 
> PKT3_WRITE_DATA_DST_SEL(PKT3_WRITE_DATA_DST_SEL_MEM_SYNC) |
> +   PKT3_WRITE_DATA_WR_CONFIRM |
> +   
> PKT3_WRITE_DATA_ENGINE_SEL(PKT3_WRITE_DATA_ENGINE_SEL_ME));
> +   radeon_emit(cs, va & 0xFFFFFFFFUL);
> +   radeon_emit(cs, (va >> 32UL) & 0xFFFFFFFFUL);
> +   radeon_emit(cs,

Re: [Mesa-dev] [RFC PATCH 05/56] mesa/main: Add tessellation shader state and limits

2014-09-30 Thread Ian Romanick

On 09/20/2014 07:41 PM, Matt Turner wrote:
> On Sat, Sep 20, 2014 at 6:40 PM, Chris Forbes  wrote:
>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
>> index 79d2e94..c11ad4f 100644
>> --- a/src/mesa/main/shaderapi.c
>> +++ b/src/mesa/main/shaderapi.c
>> @@ -105,6 +105,7 @@ _mesa_get_shader_flags(void)
>>  void
>>  _mesa_init_shader_state(struct gl_context *ctx)
>>  {
>> +   int i;
> 
> In context, this declaration looks odd. Move it below the two just
> after this hunk?

Not in core Mesa where we have to do dumb ol' C89. :(
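
For illustration, a minimal sketch of the C89 constraint mentioned above: every declaration has to sit at the top of its block, before the first statement, so a loop counter cannot be declared inside the for statement. The helper sum_first() is hypothetical and not part of Mesa.

```c
#include <stddef.h>

/* Hypothetical helper, not Mesa code: sums the first n entries of v. */
static int
sum_first(const int *v, size_t n)
{
   size_t i;       /* C89: declarations open the block ...             */
   int total = 0;  /* ... and must precede the first statement.        */

   /* "for (size_t i = 0; ...)" would need C99 or later. */
   for (i = 0; i < n; i++)
      total += v[i];

   return total;
}
```

Building with something like -std=c89 -pedantic is one way to catch a declaration that drifted below a statement.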

>> /* Device drivers may override these to control what kind of instructions
>>  * are generated by the GLSL compiler.
>>  */

Re: [Mesa-dev] [RFC PATCH 10/56] mesa: Generalize sso stage interleaving check for tess

2014-09-30 Thread Ian Romanick

On 09/20/2014 06:40 PM, Chris Forbes wrote:
> Signed-off-by: Chris Forbes 
> ---
>  src/mesa/main/pipelineobj.c | 53 
> +++--
>  1 file changed, 37 insertions(+), 16 deletions(-)
> 
> diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
> index c902107..b91289e 100644
> --- a/src/mesa/main/pipelineobj.c
> +++ b/src/mesa/main/pipelineobj.c
> @@ -662,6 +662,38 @@ program_stages_all_active(struct gl_pipeline_object 
> *pipe,
> return status;
>  }
>  
> +static bool
> +program_stages_interleaved_illegally(struct gl_pipeline_object *pipe)

const

> +{
> +   struct gl_shader_program *prev = NULL;
> +   unsigned i, j;
> +
> +   /* Look for programs bound to stages: A -> B -> A, with
> +* any intervening sequence of unrelated programs or
> +* empty stages
> +*/

I think this (and perhaps the next comment) are wrapped too narrow. :)

> +
> +   for (i = 0; i < MESA_SHADER_STAGES; i++) {
> +  /* Empty stages anywhere in the pipe are OK */
> +  if (!pipe->CurrentProgram[i])
> + continue;
> +
> +  if (prev && pipe->CurrentProgram[i] != prev) {
> + /* We've seen an A -> B transition; look at the rest of
> +  * the pipe to see if we ever see A again.
> +  */
> + for (j = i + 1; j < MESA_SHADER_STAGES; j++) {
> +if (pipe->CurrentProgram[j] == prev)
> +   return true;
> + }
> +  }

It took me a bit to convince myself that this code is correct.  I think
this would be a good place for a unit test.

Since this is a good clean-up for this code, I think it could also land
before the rest of the series.

> +
> +  prev = pipe->CurrentProgram[i];
> +   }
> +
> +   return false;
> +}
> +
>  extern GLboolean
>  _mesa_validate_program_pipeline(struct gl_context* ctx,
>  struct gl_pipeline_object *pipe,
> @@ -714,22 +746,11 @@ _mesa_validate_program_pipeline(struct gl_context* ctx,
>  * Without Tesselation, the only case where this can occur is the geometry
>  * shader between the fragment shader and vertex shader.
>  */
> -   if (pipe->CurrentProgram[MESA_SHADER_GEOMETRY]
> -   && pipe->CurrentProgram[MESA_SHADER_FRAGMENT]
> -   && pipe->CurrentProgram[MESA_SHADER_VERTEX]) {
> -  if (pipe->CurrentProgram[MESA_SHADER_VERTEX]->Name == 
> pipe->CurrentProgram[MESA_SHADER_FRAGMENT]->Name &&
> -  pipe->CurrentProgram[MESA_SHADER_GEOMETRY]->Name != 
> pipe->CurrentProgram[MESA_SHADER_VERTEX]->Name) {
> - pipe->InfoLog =
> -ralloc_asprintf(pipe,
> -"Program %d is active for geometry stage between 
> "
> -"two stages for which another program %d is "
> -"active",
> -pipe->CurrentProgram[MESA_SHADER_GEOMETRY]->Name,
> -pipe->CurrentProgram[MESA_SHADER_VERTEX]->Name);
> - goto err;
> -  }
> -
> -  /* XXX tess */
> +   if (program_stages_interleaved_illegally(pipe)) {
> +  pipe->InfoLog = ralloc_strdup(pipe, "Program is active for multiple 
> shader"
> +  "stages with an intervening stage 
> provided"
> +  "by another program");
> +  goto err;
> }
>  
> /* Section 2.11.11 (Shader Execution), subheading "Validation," of the
> 
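
As a rough idea of what a unit test for the A -> B -> A check could exercise, here is a self-contained sketch. It assumes the logic of program_stages_interleaved_illegally() from the quoted patch, but runs over a plain array of opaque pointers rather than struct gl_pipeline_object; NUM_STAGES and stages_interleaved_illegally() are stand-in names, not Mesa identifiers.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_STAGES 6   /* stand-in for MESA_SHADER_STAGES */

/* Returns true if some program A is bound to two stages with a stage
 * owned by a different program B in between (empty stages are ignored).
 */
static bool
stages_interleaved_illegally(const void *const *prog)
{
   const void *prev = NULL;
   unsigned i, j;

   for (i = 0; i < NUM_STAGES; i++) {
      /* Empty stages anywhere in the pipe are OK */
      if (!prog[i])
         continue;

      if (prev && prog[i] != prev) {
         /* Seen an A -> B transition; look for A again further on. */
         for (j = i + 1; j < NUM_STAGES; j++) {
            if (prog[j] == prev)
               return true;
         }
      }

      prev = prog[i];
   }

   return false;
}
```

Useful cases to cover would be: all stages empty, one program everywhere, A -> B with no return to A, A -> B -> A, and A -> (empty) -> B -> (empty) -> A.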

Re: [Mesa-dev] [RFC PATCH 15/56] mesa/main: Add misc tessellation shader stuff.

2014-09-30 Thread Ian Romanick

On 09/20/2014 06:40 PM, Chris Forbes wrote:
> From: Fabian Bieler 
> 
> ---
>  src/mesa/main/context.c   |  6 +
>  src/mesa/main/mtypes.h|  3 ++-
>  src/mesa/main/shaderapi.c | 29 
>  src/mesa/main/state.c | 67 
> +--
>  4 files changed, 102 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
> index d9be2f5..d4190b6 100644
> --- a/src/mesa/main/context.c
> +++ b/src/mesa/main/context.c
> @@ -1904,6 +1904,12 @@ _mesa_valid_to_render(struct gl_context *ctx, const 
> char *where)
>  */
> (void) from_glsl_shader[MESA_SHADER_GEOMETRY];
>  
> +   /* FINISHME: If GL_NV_tessellation_program is ever supported, the current
> +* FINISHME: tessellation control and evaluation programs should 
> validated here.
> +*/
> +   (void) from_glsl_shader[GL_TESS_CONTROL_PROGRAM_NV];
> +   (void) from_glsl_shader[GL_TESS_EVALUATION_PROGRAM_NV];

I think you mean MESA_.

> +
> if (!from_glsl_shader[MESA_SHADER_FRAGMENT]) {
>if (ctx->FragmentProgram.Enabled && !ctx->FragmentProgram._Enabled) {
>_mesa_error(ctx, GL_INVALID_OPERATION,
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 9088e97..9bd78e4 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -2566,7 +2566,8 @@ struct gl_sl_pragmas
>   */
>  struct gl_shader
>  {
> -   /** GL_FRAGMENT_SHADER || GL_VERTEX_SHADER || GL_GEOMETRY_SHADER_ARB.
> +   /** GL_FRAGMENT_SHADER || GL_VERTEX_SHADER || GL_GEOMETRY_SHADER_ARB ||
> +*  GL_TESS_CONTROL_SHADER || GL_TESS_EVALUATION_SHADER.
>  * Must be the first field.
>  */
> GLenum Type;
> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
> index 8160062..7ef9f74 100644
> --- a/src/mesa/main/shaderapi.c
> +++ b/src/mesa/main/shaderapi.c
> @@ -206,6 +206,10 @@ _mesa_validate_shader_target(const struct gl_context 
> *ctx, GLenum type)
>return ctx == NULL || ctx->Extensions.ARB_vertex_shader;
> case GL_GEOMETRY_SHADER_ARB:
>return ctx == NULL || _mesa_has_geometry_shaders(ctx);
> +   case GL_TESS_CONTROL_SHADER:
> +  return ctx == NULL || ctx->Extensions.ARB_tessellation_shader;
> +   case GL_TESS_EVALUATION_SHADER:
> +  return ctx == NULL || ctx->Extensions.ARB_tessellation_shader;
> case GL_COMPUTE_SHADER:
>return ctx == NULL || ctx->Extensions.ARB_compute_shader;
> default:
> @@ -423,6 +427,8 @@ detach_shader(struct gl_context *ctx, GLuint program, 
> GLuint shader)
>   /* sanity check - make sure the new list's entries are sensible */
>   for (j = 0; j < shProg->NumShaders; j++) {
>  assert(shProg->Shaders[j]->Type == GL_VERTEX_SHADER ||
> +   shProg->Shaders[j]->Type == GL_TESS_CONTROL_SHADER ||
> +   shProg->Shaders[j]->Type == GL_TESS_EVALUATION_SHADER ||
> shProg->Shaders[j]->Type == GL_GEOMETRY_SHADER ||
> shProg->Shaders[j]->Type == GL_FRAGMENT_SHADER);
>  assert(shProg->Shaders[j]->RefCount > 0);
> @@ -1041,6 +1047,12 @@ print_shader_info(const struct gl_shader_program 
> *shProg)
> if (shProg->_LinkedShaders[MESA_SHADER_GEOMETRY])
>printf("  geom prog %u\n",
>shProg->_LinkedShaders[MESA_SHADER_GEOMETRY]->Program->Id);
> +   if (shProg->_LinkedShaders[MESA_SHADER_TESS_CTRL])
> +  printf("  tesc prog %u\n",
> +  shProg->_LinkedShaders[MESA_SHADER_TESS_CTRL]->Program->Id);
> +   if (shProg->_LinkedShaders[MESA_SHADER_TESS_EVAL])
> +  printf("  tese prog %u\n",
> +  shProg->_LinkedShaders[MESA_SHADER_TESS_EVAL]->Program->Id);
>  }
>  
>  
> @@ -1117,6 +1129,8 @@ void
>  _mesa_use_program(struct gl_context *ctx, struct gl_shader_program *shProg)
>  {
> use_shader_program(ctx, GL_VERTEX_SHADER, shProg, &ctx->Shader);
> +   use_shader_program(ctx, GL_TESS_CONTROL_SHADER, shProg, &ctx->Shader);
> +   use_shader_program(ctx, GL_TESS_EVALUATION_SHADER, shProg, &ctx->Shader);
> use_shader_program(ctx, GL_GEOMETRY_SHADER_ARB, shProg, &ctx->Shader);
> use_shader_program(ctx, GL_FRAGMENT_SHADER, shProg, &ctx->Shader);
> use_shader_program(ctx, GL_COMPUTE_SHADER, shProg, &ctx->Shader);
> @@ -1959,6 +1973,21 @@ _mesa_copy_linked_program_data(gl_shader_stage type,
> case MESA_SHADER_VERTEX:
>dst->UsesClipDistanceOut = src->Vert.UsesClipDistance;
>break;
> +   case MESA_SHADER_TESS_CTRL: {
> +  struct gl_tess_ctrl_program *dst_tcp =
> + (struct gl_tess_ctrl_program *) dst;
> +  dst_tcp->VerticesOut = src->TessCtrl.VerticesOut;
> +   }
> +  break;
> +   case MESA_SHADER_TESS_EVAL: {
> +  struct gl_tess_eval_program *dst_tep =
> + (struct gl_tess_eval_program *) dst;
> +  dst_tep->PrimitiveMode = src->TessEval.PrimitiveMode;
> +  dst_tep->Spacing = src->TessEval.Spacing;
> +  dst_tep->VertexOrder = src->TessEval.VertexOrder;
> +  dst_tep

Re: [Mesa-dev] [RFC PATCH 09/56] mesa: Allow tess stages in glUseProgramStages

2014-09-30 Thread Ian Romanick

On 09/20/2014 06:40 PM, Chris Forbes wrote:
> ---
>  src/mesa/main/pipelineobj.c | 17 +
>  1 file changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
> index 61a5785..c902107 100644
> --- a/src/mesa/main/pipelineobj.c
> +++ b/src/mesa/main/pipelineobj.c
> @@ -243,14 +243,13 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield 
> stages, GLuint program)
>  *
>  * "If stages is not the special value ALL_SHADER_BITS, and has a bit
>  * set that is not recognized, the error INVALID_VALUE is generated."
> -*
> -* NOT YET SUPPORTED:
> -* GL_TESS_CONTROL_SHADER_BIT
> -* GL_TESS_EVALUATION_SHADER_BIT
>  */
> any_valid_stages = GL_VERTEX_SHADER_BIT | GL_FRAGMENT_SHADER_BIT;
> if (_mesa_has_geometry_shaders(ctx))
>any_valid_stages |= GL_GEOMETRY_SHADER_BIT;
> +   if (ctx->Extensions.ARB_tessellation_shader)
> +  any_valid_stages |= GL_TESS_CONTROL_SHADER_BIT |
> +  GL_TESS_EVALUATION_SHADER_BIT;
>  
> if (stages != GL_ALL_SHADER_BITS && (stages & ~any_valid_stages) != 0) {
>_mesa_error(ctx, GL_INVALID_VALUE, "glUseProgramStages(Stages)");
> @@ -326,6 +325,12 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield 
> stages, GLuint program)
>  
> if ((stages & GL_GEOMETRY_SHADER_BIT) != 0)
>_mesa_use_shader_program(ctx, GL_GEOMETRY_SHADER, shProg, pipe);
> +
> +   if ((stages & GL_TESS_CONTROL_SHADER_BIT) != 0)
> +  _mesa_use_shader_program(ctx, GL_TESS_CONTROL_SHADER, shProg, pipe);
> +
> +   if ((stages & GL_TESS_EVALUATION_SHADER_BIT) != 0)
> +  _mesa_use_shader_program(ctx, GL_TESS_EVALUATION_SHADER, shProg, pipe);
>  }
>  
>  /**
> @@ -723,6 +728,8 @@ _mesa_validate_program_pipeline(struct gl_context* ctx,
>  pipe->CurrentProgram[MESA_SHADER_VERTEX]->Name);
>   goto err;
>}
> +
> +  /* XXX tess */

Other places in Mesa use FINISHME.  I haven't gotten far enough in this
series to see if these are fixed, so this comment may be irrelevant.

> }
>  
> /* Section 2.11.11 (Shader Execution), subheading "Validation," of the
> @@ -742,6 +749,8 @@ _mesa_validate_program_pipeline(struct gl_context* ctx,
> && pipe->CurrentProgram[MESA_SHADER_GEOMETRY]) {
>pipe->InfoLog = ralloc_strdup(pipe, "Program lacks a vertex shader");
>goto err;
> +
> +  /* XXX: tess */
> }
>  
> /* Section 2.11.11 (Shader Execution), subheading "Validation," of the
> 

Re: [Mesa-dev] [RFC PATCH 16/56] mesa/program: Add misc tessellation shader support.

2014-09-30 Thread Ian Romanick

On 09/20/2014 06:40 PM, Chris Forbes wrote:
> From: Fabian Bieler 
> 
> ---
>  src/mesa/program/program.c | 44 ++
>  src/mesa/program/program.h | 60 
> +-
>  2 files changed, 103 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/program/program.c b/src/mesa/program/program.c
> index dc030b0..d7c457a 100644
> --- a/src/mesa/program/program.c
> +++ b/src/mesa/program/program.c
> @@ -101,6 +101,14 @@ _mesa_init_program(struct gl_context *ctx)
> _mesa_reference_geomprog(ctx, &ctx->GeometryProgram.Current,
>  NULL);
>  
> +   ctx->TessCtrlProgram.Enabled = GL_FALSE;
> +   _mesa_reference_tesscprog(ctx, &ctx->TessCtrlProgram.Current,
> +NULL);
> +
> +   ctx->TessEvalProgram.Enabled = GL_FALSE;
> +   _mesa_reference_tesseprog(ctx, &ctx->TessEvalProgram.Current,
> +NULL);
> +

Indentation looks off here.  Mixed tabs?

> /* XXX probably move this stuff */
> ctx->ATIFragmentShader.Enabled = GL_FALSE;
> ctx->ATIFragmentShader.Current = ctx->Shared->DefaultFragmentShader;
> @@ -120,6 +128,8 @@ _mesa_free_program_data(struct gl_context *ctx)
> _mesa_reference_fragprog(ctx, &ctx->FragmentProgram.Current, NULL);
> _mesa_delete_shader_cache(ctx, ctx->FragmentProgram.Cache);
> _mesa_reference_geomprog(ctx, &ctx->GeometryProgram.Current, NULL);
> +   _mesa_reference_tesscprog(ctx, &ctx->TessCtrlProgram.Current, NULL);
> +   _mesa_reference_tesseprog(ctx, &ctx->TessEvalProgram.Current, NULL);
>  
> /* XXX probably move this stuff */
> if (ctx->ATIFragmentShader.Current) {
> @@ -152,6 +162,12 @@ _mesa_update_default_objects_program(struct gl_context 
> *ctx)
> _mesa_reference_geomprog(ctx, &ctx->GeometryProgram.Current,
>ctx->Shared->DefaultGeometryProgram);
>  
> +   _mesa_reference_tesscprog(ctx, &ctx->TessCtrlProgram.Current,
> +  ctx->Shared->DefaultTessCtrlProgram);
> +
> +   _mesa_reference_tesseprog(ctx, &ctx->TessEvalProgram.Current,
> +  ctx->Shared->DefaultTessEvalProgram);
> +
> /* XXX probably move this stuff */
> if (ctx->ATIFragmentShader.Current) {
>ctx->ATIFragmentShader.Current->RefCount--;
> @@ -373,6 +389,16 @@ _mesa_new_program(struct gl_context *ctx, GLenum target, 
> GLuint id)
>   CALLOC_STRUCT(gl_geometry_program),
>   target, id);
>break;
> +   case GL_TESS_CONTROL_PROGRAM_NV:
> +  prog = _mesa_init_tess_ctrl_program(ctx,
> +  
> CALLOC_STRUCT(gl_tess_ctrl_program),
> +  target, id);
> +  break;
> +   case GL_TESS_EVALUATION_PROGRAM_NV:
> +  prog = _mesa_init_tess_eval_program(ctx,
> + CALLOC_STRUCT(gl_tess_eval_program),
> + target, id);
> +  break;
> case GL_COMPUTE_PROGRAM_NV:
>prog = _mesa_init_compute_program(ctx,
>  CALLOC_STRUCT(gl_compute_program),
> @@ -590,6 +616,24 @@ _mesa_clone_program(struct gl_context *ctx, const struct 
> gl_program *prog)
>   gpc->UsesStreams = gp->UsesStreams;
>}
>break;
> +   case GL_TESS_CONTROL_PROGRAM_NV:
> +  {
> + const struct gl_tess_ctrl_program *tcp = 
> gl_tess_ctrl_program_const(prog);
> + struct gl_tess_ctrl_program *tcpc = gl_tess_ctrl_program(clone);
> + tcpc->VerticesOut = tcp->VerticesOut;
> + // XXX: tcpc->UsesBarrier = tcp->UseBarrier;

This comment seems odd.  None of the other places mention this missing
field, and why is this field missing?

> +  }
> +  break;
> +   case GL_TESS_EVALUATION_PROGRAM_NV:
> +  {
> + const struct gl_tess_eval_program *tep = 
> gl_tess_eval_program_const(prog);
> + struct gl_tess_eval_program *tepc = gl_tess_eval_program(clone);
> + tepc->PrimitiveMode = tep->PrimitiveMode;
> + tepc->Spacing = tep->Spacing;
> + tepc->VertexOrder = tep->VertexOrder;
> + tepc->PointMode = tep->PointMode;
> +  }
> +  break;
> default:
>_mesa_problem(NULL, "Unexpected target in _mesa_clone_program");
> }
> diff --git a/src/mesa/program/program.h b/src/mesa/program/program.h
> index dd5198a..0216e62 100644
> --- a/src/mesa/program/program.h
> +++ b/src/mesa/program/program.h
> @@ -148,6 +148,24 @@ _mesa_reference_geomprog(struct gl_context *ctx,
> (struct gl_program *) prog);
>  }
>  
> +static inline void
> +_mesa_reference_tesscprog(struct gl_context *ctx,
> + struct gl_tess_ctrl_program **ptr,
> + struct gl_tess_ctrl_program *prog)
> +{
> +   _mesa_reference_program(ctx, (struct gl_program **) ptr,
> +   (st

[Mesa-dev] [PATCH] gallium/util: add util_bitcount64

2014-09-30 Thread Marek Olšák

From: Marek Olšák 

I'll need this in radeonsi.
---
 src/gallium/auxiliary/util/u_math.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_math.h 
b/src/gallium/auxiliary/util/u_math.h
index 39bd40f..48d5c31 100644
--- a/src/gallium/auxiliary/util/u_math.h
+++ b/src/gallium/auxiliary/util/u_math.h
@@ -727,6 +727,14 @@ util_bitcount(unsigned n)
 #endif
 }
 
+
+static INLINE unsigned
+util_bitcount64(uint64_t n)
+{
+   return util_bitcount(n) + util_bitcount(n >> 32);
+}
+
+
 /**
  * Reverse bits in n
  * Algorithm taken from:
-- 
1.9.1

Re: [Mesa-dev] [RFC PATCH 17/56] mesa: Add support for UNIFORM_BLOCK_REFERENCED_BY_TESS_*_SHADER

2014-09-30 Thread Ian Romanick

On 09/20/2014 06:40 PM, Chris Forbes wrote:
> Signed-off-by: Chris Forbes 
> ---
>  src/mesa/main/uniforms.c | 21 +
>  1 file changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c
> index 0d0cbf5..ceeadf4 100644
> --- a/src/mesa/main/uniforms.c
> +++ b/src/mesa/main/uniforms.c
> @@ -1127,6 +1127,18 @@ _mesa_GetActiveUniformBlockiv(GLuint program,
>params[0] = 
> shProg->UniformBlockStageIndex[MESA_SHADER_VERTEX][uniformBlockIndex] != -1;
>return;
>  
> +   case GL_UNIFORM_BLOCK_REFERENCED_BY_TESS_CONTROL_SHADER:
> +  if (!ctx->Extensions.ARB_tessellation_shader)
> + break;
> +  params[0] = 
> shProg->UniformBlockStageIndex[MESA_SHADER_TESS_CTRL][uniformBlockIndex] != 
> -1;
> +  return;
> +
> +   case GL_UNIFORM_BLOCK_REFERENCED_BY_TESS_EVALUATION_SHADER:
> +  if (!ctx->Extensions.ARB_tessellation_shader)
> + break;
> +  params[0] = 
> shProg->UniformBlockStageIndex[MESA_SHADER_TESS_EVAL][uniformBlockIndex] != 
> -1;
> +  return;
> +
> case GL_UNIFORM_BLOCK_REFERENCED_BY_GEOMETRY_SHADER:
>params[0] = 
> shProg->UniformBlockStageIndex[MESA_SHADER_GEOMETRY][uniformBlockIndex] != -1;
>return;
> @@ -1136,11 +1148,12 @@ _mesa_GetActiveUniformBlockiv(GLuint program,
>return;
>  
> default:
> -  _mesa_error(ctx, GL_INVALID_ENUM,
> -   "glGetActiveUniformBlockiv(pname 0x%x (%s))",
> -   pname, _mesa_lookup_enum_by_nr(pname));
> -  return;
> +  break;
> }
> +
> +   _mesa_error(ctx, GL_INVALID_ENUM,
> +   "glGetActiveUniformBlockiv(pname 0x%x (%s))",
> +   pname, _mesa_lookup_enum_by_nr(pname));
>  }

This last hunk seems spurious.  Does some later patch depend on this?

>  void GLAPIENTRY
> 
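
One possible reading of the last hunk, sketched under assumptions: the new tessellation cases `break` out of the switch when the extension is missing, so moving the error report below the switch lets unsupported and unknown pnames share a single error path. The enum values and query_block() below are hypothetical and only mimic that control flow; returning false stands in for raising GL_INVALID_ENUM.

```c
#include <stdbool.h>

/* Hypothetical pnames, not real GL enums. */
enum { QUERY_VERTEX = 1, QUERY_TESS_CONTROL = 2 };

/* Writes *out and returns true on success; returns false where the
 * real code would report an invalid-enum error.
 */
static bool
query_block(int pname, bool has_tess, int *out)
{
   switch (pname) {
   case QUERY_VERTEX:
      *out = 1;
      return true;

   case QUERY_TESS_CONTROL:
      if (!has_tess)
         break;            /* unsupported: handled like an unknown pname */
      *out = 2;
      return true;

   default:
      break;
   }

   /* Single error path shared by unknown and unsupported pnames. */
   return false;
}
```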

Re: [Mesa-dev] [PATCH] gallium/util: add util_bitcount64

2014-09-30 Thread Ilia Mirkin

Perhaps do the same thing as util_bitcount, i.e.

#if defined(PIPE_CC_GCC) && (PIPE_CC_GCC_VERSION >= 304)
  return __builtin_popcountll(n);
#else
...
#endif

Perhaps the gcc version check is no longer necessary, unlikely
anyone's using gcc3.3 or earlier at this point. But whatever.
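
A minimal sketch of that suggestion, assuming a plain __GNUC__ check in place of Gallium's PIPE_CC_GCC macros so the snippet stands alone. The names bitcount64() and bitcount64_portable() are illustrative, not the proposed u_math.h API, and the fallback uses a simple clear-lowest-bit loop rather than two 32-bit util_bitcount() calls.

```c
#include <stdint.h>

/* Portable fallback: clear the lowest set bit until none remain. */
static unsigned
bitcount64_portable(uint64_t n)
{
   unsigned bits = 0;

   while (n) {
      n &= n - 1;
      bits++;
   }
   return bits;
}

static unsigned
bitcount64(uint64_t n)
{
#if defined(__GNUC__)
   /* GCC builtin taking an unsigned long long. */
   return (unsigned) __builtin_popcountll(n);
#else
   return bitcount64_portable(n);
#endif
}
```

Either path should agree for every input, which makes the fallback handy as a reference when testing the builtin path.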

On Tue, Sep 30, 2014 at 12:26 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> I'll need this in radeonsi.
> ---
>  src/gallium/auxiliary/util/u_math.h | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/gallium/auxiliary/util/u_math.h 
> b/src/gallium/auxiliary/util/u_math.h
> index 39bd40f..48d5c31 100644
> --- a/src/gallium/auxiliary/util/u_math.h
> +++ b/src/gallium/auxiliary/util/u_math.h
> @@ -727,6 +727,14 @@ util_bitcount(unsigned n)
>  #endif
>  }
>
> +
> +static INLINE unsigned
> +util_bitcount64(uint64_t n)
> +{
> +   return util_bitcount(n) + util_bitcount(n >> 32);
> +}
> +
> +
>  /**
>   * Reverse bits in n
>   * Algorithm taken from:
> --
> 1.9.1
>

[Mesa-dev] [PATCH] tgsi: fix Semantic.Name assignment in tgsi_transform_input_decl()

2014-09-30 Thread Brian Paul

Assign the sem_name parameter, not TGSI_SEMANTIC_GENERIC.
Fixes polygon stipple regression.
---
 src/gallium/auxiliary/tgsi/tgsi_transform.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_transform.h 
b/src/gallium/auxiliary/tgsi/tgsi_transform.h
index bfcdd56..921aa90 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_transform.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_transform.h
@@ -120,7 +120,7 @@ tgsi_transform_input_decl(struct tgsi_transform_context 
*ctx,
decl.Declaration.File = TGSI_FILE_INPUT;
decl.Declaration.Interpolate = 1;
decl.Declaration.Semantic = 1;
-   decl.Semantic.Name = TGSI_SEMANTIC_GENERIC;
+   decl.Semantic.Name = sem_name;
decl.Semantic.Index = sem_index;
decl.Range.First =
decl.Range.Last = index;
-- 
1.7.10.4

Re: [Mesa-dev] [RFC PATCH 12/56] mesa: Add tessellation shader builtin varyings.

2014-09-30 Thread Ian Romanick

On 09/20/2014 06:40 PM, Chris Forbes wrote:
> From: Fabian Bieler 
> 
> ---
>  src/mesa/main/mtypes.h| 15 ++-
>  src/mesa/program/prog_print.c |  4 
>  2 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 9e989d7..9088e97 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -239,6 +239,8 @@ typedef enum
> VARYING_SLOT_VIEWPORT, /* Appears as VS or GS output */
> VARYING_SLOT_FACE, /* FS only */
> VARYING_SLOT_PNTC, /* FS only */
> +   VARYING_SLOT_TESS_LEVEL_OUTER, /* Appears in both tessellation shaders. */
> +   VARYING_SLOT_TESS_LEVEL_INNER, /* Appears in both tessellation shaders. */
> VARYING_SLOT_VAR0, /* First generic varying slot */
> VARYING_SLOT_MAX = VARYING_SLOT_VAR0 + MAX_VARYING
>  } gl_varying_slot;
> @@ -275,6 +277,8 @@ typedef enum
>  #define VARYING_BIT_VIEWPORT BITFIELD64_BIT(VARYING_SLOT_VIEWPORT)
>  #define VARYING_BIT_FACE BITFIELD64_BIT(VARYING_SLOT_FACE)
>  #define VARYING_BIT_PNTC BITFIELD64_BIT(VARYING_SLOT_PNTC)
> +#define VARYING_BIT_TESS_LEVEL_OUTER 
> BITFIELD64_BIT(VARYING_SLOT_TESS_LEVEL_OUTER)
> +#define VARYING_BIT_TESS_LEVEL_INNER 
> BITFIELD64_BIT(VARYING_SLOT_TESS_LEVEL_INNER)
>  #define VARYING_BIT_VAR(V) BITFIELD64_BIT(VARYING_SLOT_VAR0 + (V))
>  /*@}*/
>  
> @@ -298,6 +302,8 @@ _mesa_varying_slot_in_fs(gl_varying_slot slot)
> case VARYING_SLOT_EDGE:
> case VARYING_SLOT_CLIP_VERTEX:
> case VARYING_SLOT_LAYER:
> +   case VARYING_SLOT_TESS_LEVEL_OUTER:
> +   case VARYING_SLOT_TESS_LEVEL_INNER:
>return GL_FALSE;
> default:
>return GL_TRUE;
> @@ -2140,7 +2146,7 @@ typedef enum
>  * \name Geometry shader system values
>  */
> /*@{*/
> -   SYSTEM_VALUE_INVOCATION_ID,
> +   SYSTEM_VALUE_INVOCATION_ID,  /**< (Also in Tessellation Control shader) */
> /*@}*/
>  
> /**
> @@ -2153,6 +2159,13 @@ typedef enum
> SYSTEM_VALUE_SAMPLE_MASK_IN,
> /*@}*/
>  
> +   /**
> +* \name Tessellation Evaluation shader system values
> +*/
> +   /*@{*/
> +   SYSTEM_VALUE_TESS_COORD,
> +   /*@}*/
> +

This hunk and the previous hunk should get merged with the hunk in patch
19.  I don't think it matters much whether they go to 19 or 19 comes here.

> SYSTEM_VALUE_MAX /**< Number of values */
>  } gl_system_value;
>  
> diff --git a/src/mesa/program/prog_print.c b/src/mesa/program/prog_print.c
> index 475e241..26881e8 100644
> --- a/src/mesa/program/prog_print.c
> +++ b/src/mesa/program/prog_print.c
> @@ -147,6 +147,8 @@ arb_input_attrib_string(GLint index, GLenum progType)
>"fragment.(twenty-one)", /* VARYING_SLOT_VIEWPORT */
>"fragment.(twenty-two)", /* VARYING_SLOT_FACE */
>"fragment.(twenty-three)", /* VARYING_SLOT_PNTC */
> +  "fragment.(twenty-four)", /* VARYING_SLOT_TESS_LEVEL_OUTER */
> +  "fragment.(twenty-five)", /* VARYING_SLOT_TESS_LEVEL_INNER */
>"fragment.varying[0]",
>"fragment.varying[1]",
>"fragment.varying[2]",
> @@ -272,6 +274,8 @@ arb_output_attrib_string(GLint index, GLenum progType)
>"result.(twenty-one)", /* VARYING_SLOT_VIEWPORT */
>"result.(twenty-two)", /* VARYING_SLOT_FACE */
>"result.(twenty-three)", /* VARYING_SLOT_PNTC */
> +  "result.(twenty-four)", /* VARYING_SLOT_TESS_LEVEL_OUTER */
> +  "result.(twenty-five)", /* VARYING_SLOT_TESS_LEVEL_INNER */
>"result.varying[0]",
>"result.varying[1]",
>"result.varying[2]",
> 

Re: [Mesa-dev] [RFC PATCH 21/56] glsl: Add tessellation shader defines and built-in variables.

2014-09-30 Thread Ian Romanick

On 09/20/2014 06:41 PM, Chris Forbes wrote:
> From: Fabian Bieler 
> 
> ---
>  src/glsl/builtin_variables.cpp | 62 
> +-
>  src/glsl/glcpp/glcpp-parse.y   |  3 ++
>  2 files changed, 64 insertions(+), 1 deletion(-)
> 
> diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
> index 5b6f4ae..7ba0fe8 100644
> --- a/src/glsl/builtin_variables.cpp
> +++ b/src/glsl/builtin_variables.cpp
> @@ -343,6 +343,8 @@ public:
> void generate_constants();
> void generate_uniforms();
> void generate_vs_special_vars();
> +   void generate_tcs_special_vars();
> +   void generate_tes_special_vars();
> void generate_gs_special_vars();
> void generate_fs_special_vars();
> void generate_cs_special_vars();
> @@ -842,6 +844,40 @@ builtin_variable_generator::generate_vs_special_vars()
>  
>  
>  /**
> + * Generate variables which only exist in tessellation control shaders.
> + */
> +void
> +builtin_variable_generator::generate_tcs_special_vars()
> +{
> +   add_input(-1, int_t, "gl_PatchVerticesIn");
> +   add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");// XXX: or 
> sysval?
> +   add_system_value(SYSTEM_VALUE_INVOCATION_ID, int_t, "gl_InvocationID");
> +
> +   add_output(VARYING_SLOT_TESS_LEVEL_OUTER,
> +array(float_t, 4), "gl_TessLevelOuter");
> +   add_output(VARYING_SLOT_TESS_LEVEL_INNER,
> +array(float_t, 2), "gl_TessLevelInner");
> +}
> +
> +
> +/**
> + * Generate variables which only exist in tessellation evaluation shaders.
> + */
> +void
> +builtin_variable_generator::generate_tes_special_vars()
> +{
> +   add_input(-1, int_t, "gl_PatchVerticesIn");
> +   add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");// XXX: or 
> sysval?
> +   add_system_value(SYSTEM_VALUE_TESS_COORD, vec3_t, "gl_TessCoord");
> +
> +   add_input(VARYING_SLOT_TESS_LEVEL_OUTER,
> +array(float_t, 4), "gl_TessLevelOuter");
> +   add_input(VARYING_SLOT_TESS_LEVEL_INNER,
> +array(float_t, 2), "gl_TessLevelInner");
> +}
> +
> +
> +/**
>   * Generate variables which only exist in geometry shaders.
>   */
>  void
> @@ -964,6 +1000,9 @@ builtin_variable_generator::add_varying(int slot, const 
> glsl_type *type,
>  const char *name_as_gs_input)
>  {
> switch (state->stage) {
> +   case MESA_SHADER_TESS_CTRL:
> +   case MESA_SHADER_TESS_EVAL:
> +  // XXX: is this correct?
> case MESA_SHADER_GEOMETRY:
>this->per_vertex_in.add_field(slot, type, name);
>/* FALLTHROUGH */
> @@ -1016,13 +1055,28 @@ builtin_variable_generator::generate_varyings()
>}
> }
>  
> +   if (state->stage == MESA_SHADER_TESS_CTRL ||
> +   state->stage == MESA_SHADER_TESS_EVAL) {
> +  const glsl_type *per_vertex_in_type =
> + this->per_vertex_in.construct_interface_instance();
> +  add_variable("gl_in", array(per_vertex_in_type, 
> state->Const.MaxPatchVertices),

This looks wrong, but I believe that it is correct.  Maybe add a spec
quotation?

/* Section 7.1 (Built-In Language Variables) of the GLSL 4.00 spec
 * says:
 *
 *"In the tessellation control language, built-in variables are
 *intrinsically declared as:
 *
 *in gl_PerVertex {
 *vec4 gl_Position;
 *float gl_PointSize;
 *float gl_ClipDistance[];
 *} gl_in[gl_MaxPatchVertices];"
 */

It may also be worth adding a similar quotation to the
MESA_SHADER_GEOMETRY case below.

> +   ir_var_shader_in, -1);
> +   }
> if (state->stage == MESA_SHADER_GEOMETRY) {
>const glsl_type *per_vertex_in_type =
>   this->per_vertex_in.construct_interface_instance();
>add_variable("gl_in", array(per_vertex_in_type, 0),
> ir_var_shader_in, -1);
> }
> -   if (state->stage == MESA_SHADER_VERTEX || state->stage == 
> MESA_SHADER_GEOMETRY) {
> +   if (state->stage == MESA_SHADER_TESS_CTRL) {
> +  const glsl_type *per_vertex_out_type =
> + this->per_vertex_out.construct_interface_instance();
> +  add_variable("gl_out", array(per_vertex_out_type, 0),
> +   ir_var_shader_out, -1);
> +   }
> +   if (state->stage == MESA_SHADER_VERTEX ||
> +   state->stage == MESA_SHADER_TESS_EVAL ||
> +   state->stage == MESA_SHADER_GEOMETRY) {
>const glsl_type *per_vertex_out_type =
>   this->per_vertex_out.construct_interface_instance();
>const glsl_struct_field *fields = 
> per_vertex_out_type->fields.structure;
> @@ -1057,6 +,12 @@ _mesa_glsl_initialize_variables(exec_list 
> *instructions,
> case MESA_SHADER_VERTEX:
>gen.generate_vs_special_vars();
>break;
> +   case MESA_SHADER_TESS_CTRL:
> +  gen.generate_tcs_special_vars();
> +  break;
> +   case MESA_SHADER_TESS_EVAL:
> +  gen.generate_tes_special_vars();
> +  break;

Re: [Mesa-dev] [RFC PATCH 00/56] ARB_tessellation_shader for core mesa

2014-09-30 Thread Ian Romanick

On 09/20/2014 06:40 PM, Chris Forbes wrote:
> This series adds all the driver-independent bits for ARB_tessellation_shader.
> It's not quite finished, and there are still a handful of ugly hacks to
> remove, but I think it's complete enough to start getting some review feedback.

Patches 1, 2, and 4 through 11, 13, 14, 15, 18, and 20 are

Reviewed-by: Ian Romanick 

I agree with Ken's comments about patch 3.

I sent a couple of comments on patches 10, 12 (which also applies to 19),
15, 16, 17, and 21.  I'll try to send either more comments or more R-bs
later this week.


[Mesa-dev] [PATCH 03/13] radeonsi: get fs_write_all from tgsi_shader_info directly

2014-09-30 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 8 ++--
 src/gallium/drivers/radeonsi/si_shader.h | 6 --
 src/gallium/drivers/radeonsi/si_state.c  | 5 +
 3 files changed, 3 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 5c3efd4..e76b969 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1438,11 +1438,6 @@ static void si_llvm_emit_fs_epilogue(struct lp_build_tgsi_context * bld_base)
 
tgsi_parse_token(parse);
 
-   if (parse->FullToken.Token.Type == TGSI_TOKEN_TYPE_PROPERTY &&
-   parse->FullToken.FullProperty.Property.PropertyName ==
-   TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS)
-   shader->fs_write_all = TRUE;
-
if (parse->FullToken.Token.Type != TGSI_TOKEN_TYPE_DECLARATION)
continue;
 
@@ -1499,7 +1494,8 @@ static void si_llvm_emit_fs_epilogue(struct lp_build_tgsi_context * bld_base)
memcpy(last_args, args, sizeof(args));
 
/* Handle FS_COLOR0_WRITES_ALL_CBUFS. */
-   if (shader->fs_write_all && shader->output[i].sid == 0 &&
+   if (shader->selector->info.properties[TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS][0] &&
+shader->output[i].sid == 0 &&
si_shader_ctx->shader->key.ps.nr_cbufs > 1) {
for (int c = 1; c < si_shader_ctx->shader->key.ps.nr_cbufs; c++) {
si_llvm_init_export_args_load(bld_base,
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index 8f5b431..c6026bd 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -124,11 +124,6 @@ struct si_shader_selector {
 
/* PIPE_SHADER_[VERTEX|FRAGMENT|...] */
unsignedtype;
-
-   /* 1 when the shader contains
-* TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS, otherwise it's 0.
-* Used to determine whether we need to include nr_cbufs in the key */
-   unsignedfs_write_all;
 };
 
 union si_shader_key {
@@ -184,7 +179,6 @@ struct si_shader {
 
unsignednparam;
booluses_instanceid;
-   boolfs_write_all;
boolvs_out_misc_write;
boolvs_out_point_size;
boolvs_out_edgeflag;
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 0e2d6c4..eb25606 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2215,7 +2215,7 @@ static INLINE void si_shader_selector_key(struct pipe_context *ctx,
key->vs.gs_used_inputs = sctx->gs_shader->current->gs_used_inputs;
}
} else if (sel->type == PIPE_SHADER_FRAGMENT) {
-   if (sel->fs_write_all)
+   if (sel->info.properties[TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS][0])
key->ps.nr_cbufs = sctx->framebuffer.state.nr_cbufs;
key->ps.export_16bpc = sctx->framebuffer.export_16bpc;
 
@@ -2312,9 +2312,6 @@ static void *si_create_shader_state(struct pipe_context *ctx,
sel->so = state->stream_output;
tgsi_scan_shader(state->tokens, &sel->info);
 
-   if (pipe_shader_type == PIPE_SHADER_FRAGMENT)
-   sel->fs_write_all = sel->info.color0_writes_all_cbufs;
-
r = si_shader_select(ctx, sel);
if (r) {
free(sel);
-- 
1.9.1


[Mesa-dev] [PATCH 02/13] tgsi: simplify shader properties in tgsi_shader_info

2014-09-30 Thread Marek Olšák

From: Marek Olšák 

Use an array of properties indexed by TGSI_PROPERTY_* definitions.
---
 src/gallium/auxiliary/draw/draw_gs.c | 23 -
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c  | 15 +++---
 src/gallium/auxiliary/tgsi/tgsi_scan.c   | 59 ++--
 src/gallium/auxiliary/tgsi/tgsi_scan.h   |  6 +--
 src/gallium/auxiliary/util/u_pstipple.c  |  8 +---
 src/gallium/drivers/llvmpipe/lp_state_fs.c   | 10 +---
 src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c | 24 +++---
 src/gallium/drivers/r300/r300_fs.c   |  8 +---
 src/gallium/drivers/radeonsi/si_shader.c | 53 +++--
 9 files changed, 70 insertions(+), 136 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_gs.c b/src/gallium/auxiliary/draw/draw_gs.c
index 878fcca..0c2f892 100644
--- a/src/gallium/auxiliary/draw/draw_gs.c
+++ b/src/gallium/auxiliary/draw/draw_gs.c
@@ -750,9 +750,6 @@ draw_create_geometry_shader(struct draw_context *draw,
tgsi_scan_shader(state->tokens, &gs->info);
 
/* setup the defaults */
-   gs->input_primitive = PIPE_PRIM_TRIANGLES;
-   gs->output_primitive = PIPE_PRIM_TRIANGLE_STRIP;
-   gs->max_output_vertices = 32;
gs->max_out_prims = 0;
 
 #ifdef HAVE_LLVM
@@ -768,17 +765,15 @@ draw_create_geometry_shader(struct draw_context *draw,
   gs->vector_length = 1;
}
 
-   for (i = 0; i < gs->info.num_properties; ++i) {
-  if (gs->info.properties[i].name ==
-  TGSI_PROPERTY_GS_INPUT_PRIM)
- gs->input_primitive = gs->info.properties[i].data[0];
-  else if (gs->info.properties[i].name ==
-   TGSI_PROPERTY_GS_OUTPUT_PRIM)
- gs->output_primitive = gs->info.properties[i].data[0];
-  else if (gs->info.properties[i].name ==
-   TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES)
- gs->max_output_vertices = gs->info.properties[i].data[0];
-   }
+   gs->input_primitive =
+ gs->info.properties[TGSI_PROPERTY_GS_INPUT_PRIM][0];
+   gs->output_primitive =
+ gs->info.properties[TGSI_PROPERTY_GS_OUTPUT_PRIM][0];
+   gs->max_output_vertices =
+ gs->info.properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES][0];
+   if (!gs->max_output_vertices)
+  gs->max_output_vertices = 32;
+
/* Primitive boundary is bigger than max_output_vertices by one, because
 * the specification says that the geometry shader should exit if the 
 * number of emitted vertices is bigger or equal to max_output_vertices and
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index c0bd7be..2d7f32d 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -3855,8 +3855,8 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
* were forgetting so we're using MAX_VERTEX_VARYING from
* that spec even though we could debug_assert if it's not
* set, but that's a lot uglier. */
-  uint max_output_vertices = 32;
-  uint i = 0;
+  uint max_output_vertices;
+
   /* inputs are always indirect with gs */
   bld.indirect_files |= (1 << TGSI_FILE_INPUT);
   bld.gs_iface = gs_iface;
@@ -3864,12 +3864,11 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
   bld.bld_base.op_actions[TGSI_OPCODE_EMIT].emit = emit_vertex;
   bld.bld_base.op_actions[TGSI_OPCODE_ENDPRIM].emit = end_primitive;
 
-  for (i = 0; i < info->num_properties; ++i) {
- if (info->properties[i].name ==
- TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES) {
-max_output_vertices = info->properties[i].data[0];
- }
-  }
+  max_output_vertices =
+info->properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES][0];
+  if (!max_output_vertices)
+ max_output_vertices = 32;
+
   bld.max_output_vertices_vec =
  lp_build_const_int_vec(gallivm, bld.bld_base.int_bld.type,
 max_output_vertices);
diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c b/src/gallium/auxiliary/tgsi/tgsi_scan.c
index c71bb36..f9d1896 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
@@ -277,13 +277,11 @@ tgsi_scan_shader(const struct tgsi_token *tokens,
  {
 const struct tgsi_full_property *fullprop
= &parse.FullToken.FullProperty;
+unsigned name = fullprop->Property.PropertyName;
 
-info->properties[info->num_properties].name =
-   fullprop->Property.PropertyName;
-memcpy(info->properties[info->num_properties].data,
-   fullprop->u, 8 * sizeof(unsigned));;
-
-++info->num_properties;
+assert(name < Elements(info->properties));
+memcpy(info->properties[name],
+   fullprop->u, 8 * sizeof(unsigned));
  }
  break;
 
@@ -296,35 +294,26 @@ tgsi_scan_shader(const struct tgsi_token *t
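
As an illustration of the commit message above ("Use an array of properties
indexed by TGSI_PROPERTY_* definitions"), here is a minimal standalone sketch
of the idea. It is not the Mesa code: the sketch_* names and the enum values
are hypothetical stand-ins, and only the 8-word row size, the bounds assert,
and the "fall back to 32 when zero" default are taken from the hunks shown.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for TGSI_PROPERTY_*; values are illustrative only. */
enum sketch_property {
   SKETCH_PROPERTY_GS_INPUT_PRIM,
   SKETCH_PROPERTY_GS_OUTPUT_PRIM,
   SKETCH_PROPERTY_GS_MAX_OUTPUT_VERTICES,
   SKETCH_PROPERTY_COUNT
};

struct sketch_shader_info {
   /* One row of 8 data words per property; all zero if the shader never
    * declared that property. */
   unsigned properties[SKETCH_PROPERTY_COUNT][8];
};

/* While scanning, the property name itself selects the row to fill. */
static void sketch_set_property(struct sketch_shader_info *info,
                                unsigned name, const unsigned data[8])
{
   assert(name < SKETCH_PROPERTY_COUNT);
   memcpy(info->properties[name], data, 8 * sizeof(unsigned));
}

/* Consumers index directly instead of looping over a name/data list.
 * A zero entry means "not set", so the old default of 32 is applied here. */
static unsigned sketch_max_output_vertices(const struct sketch_shader_info *info)
{
   unsigned n = info->properties[SKETCH_PROPERTY_GS_MAX_OUTPUT_VERTICES][0];
   return n ? n : 32;
}
```

One consequence of this layout is that a property explicitly set to 0 cannot
be told apart from one that was never declared, which is presumably why the
callers in the patch re-apply their defaults after the lookup.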

[Mesa-dev] [PATCH 04/13] tgsi: remove some not so useful variables from tgsi_shader_info

2014-09-30 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/auxiliary/tgsi/tgsi_scan.c   |  8 
 src/gallium/auxiliary/tgsi/tgsi_scan.h   |  3 ---
 src/gallium/drivers/llvmpipe/lp_state_fs.c   |  4 +++-
 src/gallium/drivers/softpipe/sp_quad_blend.c |  5 ++---
 src/gallium/drivers/softpipe/sp_setup.c  | 12 
 src/gallium/drivers/svga/svga_state_fs.c |  2 +-
 6 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c b/src/gallium/auxiliary/tgsi/tgsi_scan.c
index f9d1896..d68dca8 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
@@ -293,14 +293,6 @@ tgsi_scan_shader(const struct tgsi_token *tokens,
info->uses_kill = (info->opcode_count[TGSI_OPCODE_KILL_IF] ||
   info->opcode_count[TGSI_OPCODE_KILL]);
 
-   /* extract simple properties */
-   info->origin_lower_left =
- info->properties[TGSI_PROPERTY_FS_COORD_ORIGIN][0];
-   info->pixel_center_integer =
- info->properties[TGSI_PROPERTY_FS_COORD_PIXEL_CENTER][0];
-   info->color0_writes_all_cbufs =
- info->properties[TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS][0];
-
/* The dimensions of the IN decleration in geometry shader have
 * to be deduced from the type of the input primitive.
 */
diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.h b/src/gallium/auxiliary/tgsi/tgsi_scan.h
index 0d79e29..934acec 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.h
@@ -76,9 +76,6 @@ struct tgsi_shader_info
boolean uses_vertexid;
boolean uses_primid;
boolean uses_frontface;
-   boolean origin_lower_left;
-   boolean pixel_center_integer;
-   boolean color0_writes_all_cbufs;
boolean writes_viewport_index;
boolean writes_layer;
boolean is_msaa_sampler[PIPE_MAX_SAMPLERS];
diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c b/src/gallium/drivers/llvmpipe/lp_state_fs.c
index 349d85a..cc75266 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_fs.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c
@@ -2323,6 +2323,8 @@ generate_fragment(struct llvmpipe_context *lp,
   LLVMValueRef mask_store = lp_build_array_alloca(gallivm, mask_type,
   num_loop, "mask_store");
   LLVMValueRef color_store[PIPE_MAX_COLOR_BUFS][TGSI_NUM_CHANNELS];
+  boolean pixel_center_integer =
+ shader->info.base.properties[TGSI_PROPERTY_FS_COORD_PIXEL_CENTER][0];
 
   /*
* The shader input interpolation info is not explicitely baked in the
@@ -2333,7 +2335,7 @@ generate_fragment(struct llvmpipe_context *lp,
gallivm,
shader->info.base.num_inputs,
inputs,
-   shader->info.base.pixel_center_integer,
+   pixel_center_integer,
builder, fs_type,
a0_ptr, dadx_ptr, dady_ptr,
x, y);
diff --git a/src/gallium/drivers/softpipe/sp_quad_blend.c b/src/gallium/drivers/softpipe/sp_quad_blend.c
index 6c52c90..d60e508 100644
--- a/src/gallium/drivers/softpipe/sp_quad_blend.c
+++ b/src/gallium/drivers/softpipe/sp_quad_blend.c
@@ -923,9 +923,8 @@ blend_fallback(struct quad_stage *qs,
struct softpipe_context *softpipe = qs->softpipe;
const struct pipe_blend_state *blend = softpipe->blend;
unsigned cbuf;
-   boolean write_all;
-
-   write_all = softpipe->fs_variant->info.color0_writes_all_cbufs;
+   boolean write_all =
+  softpipe->fs_variant->info.properties[TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS][0];
 
for (cbuf = 0; cbuf < softpipe->framebuffer.nr_cbufs; cbuf++) {
   if (softpipe->framebuffer.cbufs[cbuf]) {
diff --git a/src/gallium/drivers/softpipe/sp_setup.c b/src/gallium/drivers/softpipe/sp_setup.c
index 7937e10..989ed9c 100644
--- a/src/gallium/drivers/softpipe/sp_setup.c
+++ b/src/gallium/drivers/softpipe/sp_setup.c
@@ -562,17 +562,21 @@ static void
 setup_fragcoord_coeff(struct setup_context *setup, uint slot)
 {
const struct tgsi_shader_info *fsInfo = &setup->softpipe->fs_variant->info;
+   boolean origin_lower_left =
+ fsInfo->properties[TGSI_PROPERTY_FS_COORD_ORIGIN][0];
+   boolean pixel_center_integer =
+ fsInfo->properties[TGSI_PROPERTY_FS_COORD_PIXEL_CENTER][0];
 
/*X*/
-   setup->coef[slot].a0[0] = fsInfo->pixel_center_integer ? 0.0f : 0.5f;
+   setup->coef[slot].a0[0] = pixel_center_integer ? 0.0f : 0.5f;
setup->coef[slot].dadx[0] = 1.0f;
setup->coef[slot].dady[0] = 0.0f;
/*Y*/
setup->coef[slot].a0[1] =
-  (fsInfo->origin_lower_left ? setup->softpipe->framebuffer.height-1 : 0)
-  + (fsInfo->pixel_center_integer ? 0.0f : 0.5f);
+  (origin_lower_left ? setup->softpipe->framebuffer.height-1 : 0)
+  + (pixel_center_integer ? 0.0f :

[Mesa-dev] [PATCH 01/13] radeonsi: get tgsi_shader_info only once before compilation

2014-09-30 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 25 +++--
 src/gallium/drivers/radeonsi/si_shader.h |  2 ++
 src/gallium/drivers/radeonsi/si_state.c  | 10 +++---
 3 files changed, 16 insertions(+), 21 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 9d2cc80..276ba81 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2805,7 +2805,6 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
 {
struct si_shader_selector *sel = shader->selector;
struct si_shader_context si_shader_ctx;
-   struct tgsi_shader_info shader_info;
struct lp_build_tgsi_context * bld_base;
LLVMModuleRef mod;
int r = 0;
@@ -2826,13 +2825,11 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
radeon_llvm_context_init(&si_shader_ctx.radeon_bld);
bld_base = &si_shader_ctx.radeon_bld.soa.bld_base;
 
-   tgsi_scan_shader(sel->tokens, &shader_info);
-
-   if (shader_info.uses_kill)
+   if (sel->info.uses_kill)
shader->db_shader_control |= S_02880C_KILL_ENABLE(1);
 
-   shader->uses_instanceid = shader_info.uses_instanceid;
-   bld_base->info = &shader_info;
+   shader->uses_instanceid = sel->info.uses_instanceid;
+   bld_base->info = &sel->info;
bld_base->emit_fetch_funcs[TGSI_FILE_CONSTANT] = fetch_constant;
 
bld_base->op_actions[TGSI_OPCODE_TEX] = tex_action;
@@ -2876,16 +2873,16 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
bld_base->emit_fetch_funcs[TGSI_FILE_INPUT] = fetch_input_gs;
bld_base->emit_epilogue = si_llvm_emit_gs_epilogue;
 
-   for (i = 0; i < shader_info.num_properties; i++) {
-   switch (shader_info.properties[i].name) {
+   for (i = 0; i < sel->info.num_properties; i++) {
+   switch (sel->info.properties[i].name) {
case TGSI_PROPERTY_GS_INPUT_PRIM:
-   shader->gs_input_prim = shader_info.properties[i].data[0];
+   shader->gs_input_prim = sel->info.properties[i].data[0];
break;
case TGSI_PROPERTY_GS_OUTPUT_PRIM:
-   shader->gs_output_prim = shader_info.properties[i].data[0];
+   shader->gs_output_prim = sel->info.properties[i].data[0];
break;
case TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES:
-   shader->gs_max_out_vertices = shader_info.properties[i].data[0];
+   shader->gs_max_out_vertices = sel->info.properties[i].data[0];
break;
}
}
@@ -2897,10 +2894,10 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
si_shader_ctx.radeon_bld.load_input = declare_input_fs;
bld_base->emit_epilogue = si_llvm_emit_fs_epilogue;
 
-   for (i = 0; i < shader_info.num_properties; i++) {
-   switch (shader_info.properties[i].name) {
+   for (i = 0; i < sel->info.num_properties; i++) {
+   switch (sel->info.properties[i].name) {
case TGSI_PROPERTY_FS_DEPTH_LAYOUT:
-   switch (shader_info.properties[i].data[0]) {
+   switch (sel->info.properties[i].data[0]) {
case TGSI_FS_DEPTH_LAYOUT_GREATER:
shader->db_shader_control |=
S_02880C_CONSERVATIVE_Z_EXPORT(V_02880C_EXPORT_GREATER_THAN_Z);
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index d8a63df..8f5b431 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -30,6 +30,7 @@
 #define SI_SHADER_H
 
 #include <llvm-c/Core.h> /* LLVMModuleRef */
+#include "tgsi/tgsi_scan.h"
 
 #define SI_SGPR_CONST  0
 #define SI_SGPR_SAMPLER2
@@ -117,6 +118,7 @@ struct si_shader_selector {
 
struct tgsi_token   *tokens;
struct pipe_stream_output_info  so;
+   struct tgsi_shader_info info;
 
unsignednum_shaders;
 
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index ed90f13..0e2d6c4 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -30,7 +30,6 @@
 #include "radeon/r600_cs.h"
 
 #include "tgsi/tgsi_parse.h"
-#include "tgsi/tgsi_scan.h"
 #include "util/u_format.h"
 #include "util/u_format_s3tc.h"
 #include "util/u_framebuffer.h"

[Mesa-dev] [PATCH 07/13] radeonsi: move geometry shader properties from si_shader to si_shader_selector

2014-09-30 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 24 ++--
 src/gallium/drivers/radeonsi/si_shader.h | 10 +-
 src/gallium/drivers/radeonsi/si_state.c  | 25 +++--
 src/gallium/drivers/radeonsi/si_state_draw.c |  8 
 4 files changed, 38 insertions(+), 29 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index c5f13be..6372ccf 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -109,7 +109,7 @@ static struct si_shader_context * si_shader_context(
  * less than 64, so that a 64-bit bitmask of used inputs or outputs can be
  * calculated.
  */
-static unsigned get_unique_index(unsigned semantic_name, unsigned index)
+unsigned si_shader_io_get_unique_index(unsigned semantic_name, unsigned index)
 {
switch (semantic_name) {
case TGSI_SEMANTIC_POSITION:
@@ -160,7 +160,7 @@ static unsigned get_unique_index(unsigned semantic_name, unsigned index)
 static int get_param_index(unsigned semantic_name, unsigned index,
   uint64_t mask)
 {
-   unsigned unique_index = get_unique_index(semantic_name, index);
+   unsigned unique_index = si_shader_io_get_unique_index(semantic_name, index);
int i, param_index = 0;
 
/* If not present... */
@@ -337,13 +337,6 @@ static void declare_input_gs(
struct si_shader *shader = si_shader_ctx->shader;
 
si_store_shader_io_attribs(shader, decl);
-
-   if (decl->Semantic.Name != TGSI_SEMANTIC_PRIMID) {
-   shader->gs_used_inputs |=
-   1llu << get_unique_index(decl->Semantic.Name,
-decl->Semantic.Index);
-   shader->nparam++;
-   }
 }
 
 static LLVMValueRef fetch_input_gs(
@@ -410,7 +403,7 @@ static LLVMValueRef fetch_input_gs(
args[1] = vtx_offset;
args[2] = lp_build_const_int32(gallivm,
   (get_param_index(input->name, input->sid,
-   shader->gs_used_inputs) * 4 +
+   shader->selector->gs_used_inputs) * 4 +
swizzle) * 256);
args[3] = uint->zero;
args[4] = uint->one;  /* OFFEN */
@@ -2304,7 +2297,7 @@ static void si_llvm_emit_vertex(
 */
can_emit = LLVMBuildICmp(gallivm->builder, LLVMIntULE, gs_next_vertex,
 lp_build_const_int32(gallivm,
- shader->gs_max_out_vertices), "");
+ shader->selector->gs_max_out_vertices), "");
kill = lp_build_select(&bld_base->base, can_emit,
   lp_build_const_float(gallivm, 1.0f),
   lp_build_const_float(gallivm, -1.0f));
@@ -2319,7 +2312,7 @@ static void si_llvm_emit_vertex(
LLVMValueRef out_val = LLVMBuildLoad(gallivm->builder, out_ptr[chan], "");
LLVMValueRef voffset =
lp_build_const_int32(gallivm, (i * 4 + chan) *
-shader->gs_max_out_vertices);
+shader->selector->gs_max_out_vertices);
 
voffset = lp_build_add(uint, voffset, gs_next_vertex);
voffset = lp_build_mul_imm(uint, voffset, 4);
@@ -2767,7 +2760,7 @@ static int si_generate_gs_copy_shader(struct si_screen *sscreen,
for (chan = 0; chan < 4; chan++) {
args[2] = lp_build_const_int32(gallivm,
   (i * 4 + chan) *
-  gs->gs_max_out_vertices * 16 * 4);
+  gs->selector->gs_max_out_vertices * 16 * 4);
 
outputs[i].values[chan] =
LLVMBuildBitCast(gallivm->builder,
@@ -2866,11 +2859,6 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
si_shader_ctx.radeon_bld.load_input = declare_input_gs;
bld_base->emit_fetch_funcs[TGSI_FILE_INPUT] = fetch_input_gs;
bld_base->emit_epilogue = si_llvm_emit_gs_epilogue;
-
-   shader->gs_output_prim =
-   sel->info.properties[TGSI_PROPERTY_GS_OUTPUT_PRIM][0];
-   shader->gs_max_out_vertices =
-   sel->info.properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES][0];
break;
case TGSI_PROCESSOR_FRAGMENT:
si_shader_ctx.radeon_bld.load_input = declare_input_fs;
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/rade

[Mesa-dev] [PATCH 12/13] radeonsi: pass the GS shader directly to si_generate_gs_copy_shader

2014-09-30 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 286014c..4e8f80f 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2701,14 +2701,13 @@ int si_compile_llvm(struct si_screen *sscreen, struct si_shader *shader,
 /* Generate code for the hardware VS shader stage to go with a geometry shader */
 static int si_generate_gs_copy_shader(struct si_screen *sscreen,
  struct si_shader_context *si_shader_ctx,
- bool dump)
+ struct si_shader *gs, bool dump)
 {
struct gallivm_state *gallivm = &si_shader_ctx->radeon_bld.gallivm;
struct lp_build_tgsi_context *bld_base = &si_shader_ctx->radeon_bld.soa.bld_base;
struct lp_build_context *base = &bld_base->base;
struct lp_build_context *uint = &bld_base->uint_bld;
struct si_shader *shader = si_shader_ctx->shader;
-   struct si_shader *gs = si_shader_ctx->shader->selector->current;
struct si_shader_output_values *outputs;
LLVMValueRef t_list_ptr, t_list;
LLVMValueRef args[9];
@@ -2910,7 +2909,8 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
shader->gs_copy_shader->selector = shader->selector;
shader->gs_copy_shader->key = shader->key;
si_shader_ctx.shader = shader->gs_copy_shader;
-   if ((r = si_generate_gs_copy_shader(sscreen, &si_shader_ctx, dump))) {
+   if ((r = si_generate_gs_copy_shader(sscreen, &si_shader_ctx,
+   shader, dump))) {
free(shader->gs_copy_shader);
shader->gs_copy_shader = NULL;
goto out;
-- 
1.9.1


[Mesa-dev] [PATCH 10/13] radeonsi: make the vertex shader key smaller

2014-09-30 Thread Marek Olšák

From: Marek Olšák 

We only support 16 vertex attribs, not 32.
---
 src/gallium/drivers/radeonsi/si_shader.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index d9a89e3..c0e5cf4 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -31,6 +31,7 @@
 
 #include <llvm-c/Core.h> /* LLVMModuleRef */
 #include "tgsi/tgsi_scan.h"
+#include "si_state.h"
 
 #define SI_SGPR_CONST  0
 #define SI_SGPR_SAMPLER2
@@ -140,7 +141,7 @@ union si_shader_key {
unsignedalpha_to_one:1;
} ps;
struct {
-   unsignedinstance_divisors[PIPE_MAX_ATTRIBS];
+   unsignedinstance_divisors[SI_NUM_VERTEX_BUFFERS];
/* The mask of "get_unique_index" bits, needed for ES,
 * it describes how the ES->GS ring buffer is laid out. */
uint64_tgs_used_inputs;
-- 
1.9.1
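
To put a number on "make the vertex shader key smaller", the following is a
rough standalone sketch, not the driver's real union. It assumes the two
constants are 32 and 16 as the commit message says ("16 vertex attribs, not
32"); the SKETCH_* and sketch_* names are hypothetical.

```c
#include <stddef.h>

/* Assumed values, taken from the commit message rather than the headers. */
#define SKETCH_PIPE_MAX_ATTRIBS       32
#define SKETCH_SI_NUM_VERTEX_BUFFERS  16

/* The instance-divisor part of the VS key before and after the change. */
struct sketch_vs_key_old {
   unsigned instance_divisors[SKETCH_PIPE_MAX_ATTRIBS];
};

struct sketch_vs_key_new {
   unsigned instance_divisors[SKETCH_SI_NUM_VERTEX_BUFFERS];
};

/* Bytes no longer stored (or compared) for each key. */
static size_t sketch_key_bytes_saved(void)
{
   return sizeof(struct sketch_vs_key_old) - sizeof(struct sketch_vs_key_new);
}
```

With a 4-byte unsigned that is 64 bytes less per key. Whether it matters in
practice depends on how often keys are built and compared, which this sketch
does not measure.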


[Mesa-dev] [PATCH 11/13] radeonsi: set LLVMByValAttribute for all descriptor arrays

2014-09-30 Thread Marek Olšák

From: Marek Olšák 

I hope this is correct.
---
 src/gallium/drivers/radeonsi/si_shader.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 69382bd..286014c 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2391,7 +2391,7 @@ static void create_function(struct si_shader_context *si_shader_ctx)
struct gallivm_state *gallivm = bld_base->base.gallivm;
struct si_shader *shader = si_shader_ctx->shader;
LLVMTypeRef params[SI_NUM_PARAMS], f32, i8, i32, v2i32, v3i32, v16i8, v4i32, v8i32;
-   unsigned i, last_sgpr, num_params;
+   unsigned i, last_array_pointer, last_sgpr, num_params;
 
i8 = LLVMInt8TypeInContext(gallivm->context);
i32 = LLVMInt32TypeInContext(gallivm->context);
@@ -2406,10 +2406,12 @@ static void create_function(struct si_shader_context *si_shader_ctx)
params[SI_PARAM_RW_BUFFERS] = const_array(v16i8, SI_NUM_RW_BUFFERS);
params[SI_PARAM_SAMPLER] = const_array(v4i32, SI_NUM_SAMPLER_STATES);
params[SI_PARAM_RESOURCE] = const_array(v8i32, SI_NUM_SAMPLER_VIEWS);
+   last_array_pointer = SI_PARAM_RESOURCE;
 
switch (si_shader_ctx->type) {
case TGSI_PROCESSOR_VERTEX:
params[SI_PARAM_VERTEX_BUFFER] = const_array(v16i8, SI_NUM_VERTEX_BUFFERS);
+   last_array_pointer = SI_PARAM_VERTEX_BUFFER;
params[SI_PARAM_BASE_VERTEX] = i32;
params[SI_PARAM_START_INSTANCE] = i32;
num_params = SI_PARAM_START_INSTANCE+1;
@@ -2493,18 +2495,13 @@ static void create_function(struct si_shader_context *si_shader_ctx)
 
for (i = 0; i <= last_sgpr; ++i) {
LLVMValueRef P = LLVMGetParam(si_shader_ctx->radeon_bld.main_fn, i);
-   switch (i) {
-   default:
-   LLVMAddAttribute(P, LLVMInRegAttribute);
-   break;
+
/* We tell llvm that array inputs are passed by value to allow Sinking pass
 * to move load. Inputs are constant so this is fine. */
-   case SI_PARAM_CONST:
-   case SI_PARAM_SAMPLER:
-   case SI_PARAM_RESOURCE:
+   if (i <= last_array_pointer)
LLVMAddAttribute(P, LLVMByValAttribute);
-   break;
-   }
+   else
+   LLVMAddAttribute(P, LLVMInRegAttribute);
}
 
if (bld_base->info &&
-- 
1.9.1
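
The rule the patch above switches to can be sketched on its own, without
LLVM. This is only an illustration: the sketch_* names are hypothetical, and
the parameter indices in the usage note are made up. What is taken from the
hunk is the test "i <= last_array_pointer" and the two outcomes (by-value for
descriptor array pointers, in-register for everything else).

```c
/* Hypothetical stand-ins for LLVMByValAttribute and LLVMInRegAttribute. */
enum sketch_param_attr {
   SKETCH_ATTR_BYVAL,
   SKETCH_ATTR_INREG
};

/* Every SGPR parameter up to and including the last descriptor array
 * pointer is marked by-value; the remaining SGPRs are marked in-register.
 * This relies on all array pointers coming first in the parameter list. */
static enum sketch_param_attr
sketch_sgpr_param_attr(unsigned i, unsigned last_array_pointer)
{
   return i <= last_array_pointer ? SKETCH_ATTR_BYVAL : SKETCH_ATTR_INREG;
}
```

For example, if the array pointers occupied parameters 0 through 4 in a
vertex shader, sketch_sgpr_param_attr(4, 4) would give the by-value attribute
and sketch_sgpr_param_attr(5, 4) the in-register one. Compared with the old
switch, which listed only SI_PARAM_CONST, SI_PARAM_SAMPLER and
SI_PARAM_RESOURCE, a range check also covers SI_PARAM_RW_BUFFERS and
SI_PARAM_VERTEX_BUFFER.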


[Mesa-dev] [PATCH 06/13] radeonsi: always compile shaders on demand

2014-09-30 Thread Marek Olšák

From: Marek Olšák 

The first compiled shader is sometimes useless, because the key doesn't match
the key for the draw call where it's used.
---
 src/gallium/drivers/radeonsi/si_state.c | 16 +++-
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index eb25606..da5fcb0 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2305,19 +2305,12 @@ static void *si_create_shader_state(struct pipe_context *ctx,
unsigned pipe_shader_type)
 {
struct si_shader_selector *sel = CALLOC_STRUCT(si_shader_selector);
-   int r;
 
sel->type = pipe_shader_type;
sel->tokens = tgsi_dup_tokens(state->tokens);
sel->so = state->stream_output;
tgsi_scan_shader(state->tokens, &sel->info);
 
-   r = si_shader_select(ctx, sel);
-   if (r) {
-   free(sel);
-   return NULL;
-   }
-
return sel;
 }
 
@@ -2344,10 +2337,7 @@ static void si_bind_vs_shader(struct pipe_context *ctx, void *state)
struct si_context *sctx = (struct si_context *)ctx;
struct si_shader_selector *sel = state;
 
-   if (sctx->vs_shader == sel)
-   return;
-
-   if (!sel || !sel->current)
+   if (sctx->vs_shader == sel || !sel)
return;
 
sctx->vs_shader = sel;
@@ -2373,8 +2363,8 @@ static void si_bind_ps_shader(struct pipe_context *ctx, void *state)
if (sctx->ps_shader == sel)
return;
 
-   /* use dummy shader if supplied shader is corrupt */
-   if (!sel || !sel->current) {
+   /* use a dummy shader if binding a NULL shader */
+   if (!sel) {
if (!sctx->dummy_pixel_shader) {
sctx->dummy_pixel_shader =
util_make_fragment_cloneinput_shader(&sctx->b.b, 0,
-- 
1.9.1
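
The "compile on demand" idea from the commit message above can be sketched
roughly as follows. This is a simplified model and not the driver's
si_shader_select: all sketch_* names and the two key fields are hypothetical,
and "compiling" is reduced to allocating a variant record. It only shows why
nothing needs to be compiled at create time: a variant is looked up by the
key of the draw call and built on a miss.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical draw-time state that affects code generation. */
struct sketch_key {
   unsigned nr_cbufs;
   unsigned flatshade;
};

struct sketch_variant {
   struct sketch_key key;
   struct sketch_variant *next;
};

struct sketch_selector {
   struct sketch_variant *variants; /* everything compiled so far */
   struct sketch_variant *current;  /* NULL until the first draw */
   unsigned num_compiles;
};

/* Pick the variant matching the draw-time key, compiling only on a miss.
 * Returns 0 on success and -1 if allocation fails. */
static int sketch_select(struct sketch_selector *sel,
                         const struct sketch_key *key)
{
   struct sketch_variant *v;

   for (v = sel->variants; v; v = v->next) {
      if (memcmp(&v->key, key, sizeof(*key)) == 0) {
         sel->current = v;
         return 0;
      }
   }

   v = calloc(1, sizeof(*v));
   if (!v)
      return -1;

   v->key = *key;            /* stands in for the real compile step */
   v->next = sel->variants;
   sel->variants = v;
   sel->current = v;
   sel->num_compiles++;
   return 0;
}
```

In this model a freshly created selector has current == NULL, which matches
the bind functions in the patch no longer treating a missing "current" as an
error.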


[Mesa-dev] [PATCH 08/13] radeonsi: remove interp_at_sample from the key, use TGSI_INTERPOLATE_LOC_SAMPLE

2014-09-30 Thread Marek Olšák

From: Marek Olšák 

st/mesa has the same flag in its shader key, so we don't need to handle it
in the driver anymore.

Instead, use TGSI_INTERPOLATE_LOC_SAMPLE, which is what st/mesa sets.
---
 src/gallium/drivers/radeonsi/si_shader.c | 4 ++--
 src/gallium/drivers/radeonsi/si_shader.h | 1 -
 src/gallium/drivers/radeonsi/si_state.c  | 2 --
 3 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 6372ccf..69382bd 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -501,7 +501,7 @@ static void declare_input_fs(
interp_param = 0;
break;
case TGSI_INTERPOLATE_LINEAR:
-   if (si_shader_ctx->shader->key.ps.interp_at_sample)
+   if (decl->Interp.Location == TGSI_INTERPOLATE_LOC_SAMPLE)
interp_param = LLVMGetParam(main_fn, SI_PARAM_LINEAR_SAMPLE);
else if (decl->Interp.Location == TGSI_INTERPOLATE_LOC_CENTROID)
interp_param = LLVMGetParam(main_fn, SI_PARAM_LINEAR_CENTROID);
@@ -515,7 +515,7 @@ static void declare_input_fs(
}
/* fall through to perspective */
case TGSI_INTERPOLATE_PERSPECTIVE:
-   if (si_shader_ctx->shader->key.ps.interp_at_sample)
+   if (decl->Interp.Location == TGSI_INTERPOLATE_LOC_SAMPLE)
interp_param = LLVMGetParam(main_fn, SI_PARAM_PERSP_SAMPLE);
else if (decl->Interp.Location == TGSI_INTERPOLATE_LOC_CENTROID)
interp_param = LLVMGetParam(main_fn, SI_PARAM_PERSP_CENTROID);
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index c46e649..d9a89e3 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -137,7 +137,6 @@ union si_shader_key {
unsignedcolor_two_side:1;
unsignedalpha_func:3;
unsignedflatshade:1;
-   unsignedinterp_at_sample:1;
unsignedalpha_to_one:1;
} ps;
struct {
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 46dbca3..88a50f3 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2221,8 +2221,6 @@ static INLINE void si_shader_selector_key(struct pipe_context *ctx,
if (sctx->queued.named.rasterizer) {
key->ps.color_two_side = sctx->queued.named.rasterizer->two_side;
key->ps.flatshade = sctx->queued.named.rasterizer->flatshade;
-   key->ps.interp_at_sample = sctx->framebuffer.nr_samples > 1 &&
-  sctx->ps_iter_samples == sctx->framebuffer.nr_samples;
 
if (sctx->queued.named.blend) {
key->ps.alpha_to_one = sctx->queued.named.blend->alpha_to_one &&
-- 
1.9.1
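
The selection logic this patch ends up with, keyed on the declaration's own
interpolation location instead of a driver key bit, looks roughly like the
sketch below for the perspective case. The sketch_* and SKETCH_* names are
hypothetical. The sample and centroid branches mirror the hunk; the final
"center" branch is an assumption, since the hunk does not show it.

```c
/* Hypothetical stand-ins for TGSI_INTERPOLATE_LOC_*. */
enum sketch_interp_loc {
   SKETCH_LOC_CENTER,
   SKETCH_LOC_CENTROID,
   SKETCH_LOC_SAMPLE
};

/* Hypothetical stand-ins for the SI_PARAM_PERSP_* parameter slots. */
enum sketch_interp_param {
   SKETCH_PARAM_PERSP_SAMPLE,
   SKETCH_PARAM_PERSP_CENTER,
   SKETCH_PARAM_PERSP_CENTROID
};

/* Choose the interpolation parameter from the per-input location that the
 * state tracker wrote into the declaration. */
static enum sketch_interp_param
sketch_persp_param(enum sketch_interp_loc loc)
{
   if (loc == SKETCH_LOC_SAMPLE)
      return SKETCH_PARAM_PERSP_SAMPLE;
   else if (loc == SKETCH_LOC_CENTROID)
      return SKETCH_PARAM_PERSP_CENTROID;
   else
      return SKETCH_PARAM_PERSP_CENTER; /* assumed default branch */
}
```

Because the decision now depends only on the shader's declarations, the
driver no longer needs a separate variant for per-sample shading, which is
what allows the key bit to be dropped.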


[Mesa-dev] [PATCH 13/13] radeonsi: set number of userdata SGPRs of GS copy shader to 4

2014-09-30 Thread Marek Olšák

From: Marek Olšák 

It only needs the constant buffer with clip planes and read-write resources
for the GS->VS ring and streamout. That's 2 pointers, each taking 2 SGPRs.
---
 src/gallium/drivers/radeonsi/si_shader.c |  9 -
 src/gallium/drivers/radeonsi/si_shader.h | 18 ++
 src/gallium/drivers/radeonsi/si_state_draw.c |  6 +-
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 4e8f80f..8680824 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2402,8 +2402,8 @@ static void create_function(struct si_shader_context *si_shader_ctx)
v8i32 = LLVMVectorType(i32, 8);
v16i8 = LLVMVectorType(i8, 16);
 
-   params[SI_PARAM_CONST] = const_array(v16i8, SI_NUM_CONST_BUFFERS);
params[SI_PARAM_RW_BUFFERS] = const_array(v16i8, SI_NUM_RW_BUFFERS);
+   params[SI_PARAM_CONST] = const_array(v16i8, SI_NUM_CONST_BUFFERS);
params[SI_PARAM_SAMPLER] = const_array(v4i32, SI_NUM_SAMPLER_STATES);
params[SI_PARAM_RESOURCE] = const_array(v8i32, SI_NUM_SAMPLER_VIEWS);
last_array_pointer = SI_PARAM_RESOURCE;
@@ -2415,10 +2415,16 @@ static void create_function(struct si_shader_context 
*si_shader_ctx)
params[SI_PARAM_BASE_VERTEX] = i32;
params[SI_PARAM_START_INSTANCE] = i32;
num_params = SI_PARAM_START_INSTANCE+1;
+
if (shader->key.vs.as_es) {
params[SI_PARAM_ES2GS_OFFSET] = i32;
num_params++;
} else {
+   if (shader->is_gs_copy_shader) {
+   last_array_pointer = SI_PARAM_CONST;
+   num_params = SI_PARAM_CONST+1;
+   }
+
/* The locations of the other parameters are assigned 
dynamically. */
 
/* Streamout SGPRs. */
@@ -2716,6 +2722,7 @@ static int si_generate_gs_copy_shader(struct si_screen 
*sscreen,
outputs = MALLOC(gs->noutput * sizeof(outputs[0]));
 
si_shader_ctx->type = TGSI_PROCESSOR_VERTEX;
+   shader->is_gs_copy_shader = true;
 
radeon_llvm_context_init(&si_shader_ctx->radeon_bld);
 
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index c0e5cf4..11e5ae0 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -33,10 +33,10 @@
 #include "tgsi/tgsi_scan.h"
 #include "si_state.h"
 
-#define SI_SGPR_CONST          0
-#define SI_SGPR_SAMPLER        2
-#define SI_SGPR_RESOURCE       4
-#define SI_SGPR_RW_BUFFERS     6  /* rings (& stream-out, VS only) */
+#define SI_SGPR_RW_BUFFERS     0  /* rings (& stream-out, VS only) */
+#define SI_SGPR_CONST          2
+#define SI_SGPR_SAMPLER        4
+#define SI_SGPR_RESOURCE       6
 #define SI_SGPR_VERTEX_BUFFER  8  /* VS only */
 #define SI_SGPR_BASE_VERTEX    10 /* VS only */
 #define SI_SGPR_START_INSTANCE 11 /* VS only */
@@ -44,13 +44,14 @@
 
 #define SI_VS_NUM_USER_SGPR     12
 #define SI_GS_NUM_USER_SGPR     8
+#define SI_GSCOPY_NUM_USER_SGPR 4
 #define SI_PS_NUM_USER_SGPR     9
 
 /* LLVM function parameter indices */
-#define SI_PARAM_CONST         0
-#define SI_PARAM_SAMPLER       1
-#define SI_PARAM_RESOURCE      2
-#define SI_PARAM_RW_BUFFERS    3
+#define SI_PARAM_RW_BUFFERS    0
+#define SI_PARAM_CONST         1
+#define SI_PARAM_SAMPLER       2
+#define SI_PARAM_RESOURCE      3
 
 /* VS only parameters */
 #define SI_PARAM_VERTEX_BUFFER 4
@@ -183,6 +184,7 @@ struct si_shader {
bool vs_out_layer;
unsigned nr_pos_exports;
unsigned clip_dist_write;
+   bool is_gs_copy_shader;
 };
 
 static inline struct si_shader* si_get_vs_state(struct si_context *sctx)
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 6ad2df0..e8d84a9 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -166,7 +166,11 @@ static void si_shader_vs(struct pipe_context *ctx, struct 
si_shader *shader)
 
vgpr_comp_cnt = shader->uses_instanceid ? 3 : 0;
 
-   num_user_sgprs = SI_VS_NUM_USER_SGPR;
+   if (shader->is_gs_copy_shader)
+   num_user_sgprs = SI_GSCOPY_NUM_USER_SGPR;
+   else
+   num_user_sgprs = SI_VS_NUM_USER_SGPR;
+
num_sgprs = shader->num_sgprs;
if (num_user_sgprs > num_sgprs) {
/* Last 2 reserved SGPRs are used for VCC */
-- 
1.9.1
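
For readers following the register arithmetic: each descriptor-array pointer is a 64-bit address and therefore occupies two consecutive 32-bit user SGPRs, so moving the read-write buffers and constant buffers to the front lets the GS copy shader stop after four registers. A minimal sketch of that layout, reusing the SI_* values from the patch; the two helper functions are hypothetical and are not part of radeonsi:

```c
#include <stdbool.h>

/* User SGPR slots and counts as defined by the patch (si_shader.h). */
#define SI_SGPR_RW_BUFFERS      0
#define SI_SGPR_CONST           2
#define SI_VS_NUM_USER_SGPR     12
#define SI_GSCOPY_NUM_USER_SGPR 4

/* Hypothetical helper: 32-bit SGPRs needed to hold n 64-bit pointers. */
static unsigned sgprs_for_pointers(unsigned num_pointers)
{
   return num_pointers * 2;
}

/* Hypothetical helper mirroring the selection added to si_shader_vs(). */
static unsigned vs_num_user_sgprs(bool is_gs_copy_shader)
{
   return is_gs_copy_shader ? SI_GSCOPY_NUM_USER_SGPR : SI_VS_NUM_USER_SGPR;
}
```

With this ordering the last pointer the copy shader needs (SI_SGPR_CONST) ends at register 3, which is why the count can drop from 12 to 4.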

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 05/13] radeonsi: remove unused variable si_shader::gs_input_prim

2014-09-30 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 2 --
 src/gallium/drivers/radeonsi/si_shader.h | 1 -
 2 files changed, 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index e76b969..c5f13be 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2867,8 +2867,6 @@ int si_shader_create(struct si_screen *sscreen, struct 
si_shader *shader)
bld_base->emit_fetch_funcs[TGSI_FILE_INPUT] = fetch_input_gs;
bld_base->emit_epilogue = si_llvm_emit_gs_epilogue;
 
-   shader->gs_input_prim =
-   sel->info.properties[TGSI_PROPERTY_GS_INPUT_PRIM][0];
shader->gs_output_prim =
sel->info.properties[TGSI_PROPERTY_GS_OUTPUT_PRIM][0];
shader->gs_max_out_vertices =
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index c6026bd..827f79e 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -172,7 +172,6 @@ struct si_shader {
struct si_shader_output output[40];
 
/* geometry shader properties */
-   unsigned gs_input_prim;
unsigned gs_output_prim;
unsigned gs_max_out_vertices;
uint64_t gs_used_inputs; /* mask of "get_unique_index" bits */
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 09/13] radeonsi: don't flush shader caches when building PM4 shader states

2014-09-30 Thread Marek Olšák

From: Marek Olšák 

This is the wrong place to flush caches, to say the least.

I don't think we need to flush the instruction caches if we don't patch
shaders with DMA.
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 2881199..6ad2df0 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -75,8 +75,6 @@ static void si_shader_es(struct pipe_context *ctx, struct 
si_shader *shader)
   S_00B328_VGPR_COMP_CNT(vgpr_comp_cnt));
si_pm4_set_reg(pm4, R_00B32C_SPI_SHADER_PGM_RSRC2_ES,
   S_00B32C_USER_SGPR(num_user_sgprs));
-
-   sctx->b.flags |= R600_CONTEXT_INV_SHADER_CACHE;
 }
 
 static void si_shader_gs(struct pipe_context *ctx, struct si_shader *shader)
@@ -147,8 +145,6 @@ static void si_shader_gs(struct pipe_context *ctx, struct 
si_shader *shader)
   S_00B228_SGPRS((num_sgprs - 1) / 8));
si_pm4_set_reg(pm4, R_00B22C_SPI_SHADER_PGM_RSRC2_GS,
   S_00B22C_USER_SGPR(num_user_sgprs));
-
-   sctx->b.flags |= R600_CONTEXT_INV_SHADER_CACHE;
 }
 
 static void si_shader_vs(struct pipe_context *ctx, struct si_shader *shader)
@@ -223,8 +219,6 @@ static void si_shader_vs(struct pipe_context *ctx, struct 
si_shader *shader)
   S_00B12C_SO_BASE2_EN(!!shader->selector->so.stride[2]) |
   S_00B12C_SO_BASE3_EN(!!shader->selector->so.stride[3]) |
   S_00B12C_SO_EN(!!shader->selector->so.num_outputs));
-
-   sctx->b.flags |= R600_CONTEXT_INV_SHADER_CACHE;
 }
 
 static void si_shader_ps(struct pipe_context *ctx, struct si_shader *shader)
@@ -305,8 +299,6 @@ static void si_shader_ps(struct pipe_context *ctx, struct 
si_shader *shader)
si_pm4_set_reg(pm4, R_00B02C_SPI_SHADER_PGM_RSRC2_PS,
   S_00B02C_EXTRA_LDS_SIZE(shader->lds_size) |
   S_00B02C_USER_SGPR(num_user_sgprs));
-
-   sctx->b.flags |= R600_CONTEXT_INV_SHADER_CACHE;
 }
 
 /*
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 81680] [r600g] Firefox crashes with hardware acceleration turned on

2014-09-30 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=81680

--- Comment #41 from Ernst Sjöstrand  ---
No longer crashes after applying the patch here!

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: relax draw api validation on ES2

2014-09-30 Thread Tapani


On 09/30/2014 06:13 PM, Ian Romanick wrote:

On 09/30/2014 12:28 AM, Tapani Pälli wrote:

Patch fixes failing test in WebGL conformance test
'point-no-attributes' when running Chrome on OpenGL ES.
(Shader program may draw points using constant data in shader.)

No Piglit regressions.

This sounds believable.  Did you also try the ES2 or ES3 conformance
suite?  I could have sworn that we had a bug related to this a long time
ago, and we discovered it using the conformance suite.


Did not check non-web conformance suite but I can give it a try.


Either way, we should get a piglit test too... I think we have a test
for desktop OpenGL (maybe 3.1?), so it shouldn't be too hard to adapt that.


OK, I will write tests (for the existing fixes; there is still a bunch of other 
failures left to fix). So far I've just used the online conformance tests 
to test these changes and Piglit to catch possible regressions.
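
To make the behaviour change concrete, here is a simplified model of the ES2 branch of check_valid_to_render(). The struct and function names are stand-ins rather than Mesa's real types; only the two conditions are taken from the patch quoted in this message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the two pieces of context state the check looks at. */
struct fake_ctx {
   unsigned enabled_arrays;   /* models ctx->Array.VAO->_Enabled */
   const void *current_vp;    /* models ctx->VertexProgram._Current */
};

/* Old rule: needs at least one enabled vertex array AND a vertex program. */
static bool es2_valid_old(const struct fake_ctx *ctx)
{
   if (ctx->enabled_arrays == 0x0 || !ctx->current_vp)
      return false;
   return true;
}

/* New rule: a current vertex program alone is enough. */
static bool es2_valid_new(const struct fake_ctx *ctx)
{
   return ctx->current_vp != NULL;
}
```

A shader that draws points purely from constants has no enabled arrays, so it is rejected by the old rule and accepted by the new one.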



Signed-off-by: Tapani Pälli 
---
  src/mesa/main/api_validate.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index 51a3d1f..9b80600 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -112,9 +112,8 @@ check_valid_to_render(struct gl_context *ctx, const char 
*function)
  
 switch (ctx->API) {

 case API_OPENGLES2:
-  /* For ES2, we can draw if any vertex array is enabled (and we
-   * should always have a vertex program/shader). */
-  if (ctx->Array.VAO->_Enabled == 0x0 || !ctx->VertexProgram._Current)
+  /* For ES2, we can draw if we have a vertex program/shader). */
+  if (!ctx->VertexProgram._Current)
 return GL_FALSE;
break;
  



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium/util: add util_bitcount64

2014-09-30 Thread Tom Stellard

On Tue, Sep 30, 2014 at 12:29:52PM -0400, Ilia Mirkin wrote:
> Perhaps do the same thing as util_bitcount, i.e.
> 
> #if defined(PIPE_CC_GCC) && (PIPE_CC_GCC_VERSION >= 304)
>   return __builtin_popcountll(n);
> #else
> ...
> #endif
> 
> Perhaps the gcc version check is no longer necessary, unlikely
> anyone's using gcc3.3 or earlier at this point. But whatever.
>

I saw a patch from Matt recently that added autoconf checks for a bunch
of different builtin functions, I think we should use those instead.

-Tom

> On Tue, Sep 30, 2014 at 12:26 PM, Marek Olšák  wrote:
> > From: Marek Olšák 
> >
> > I'll need this in radeonsi.
> > ---
> >  src/gallium/auxiliary/util/u_math.h | 8 
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/src/gallium/auxiliary/util/u_math.h 
> > b/src/gallium/auxiliary/util/u_math.h
> > index 39bd40f..48d5c31 100644
> > --- a/src/gallium/auxiliary/util/u_math.h
> > +++ b/src/gallium/auxiliary/util/u_math.h
> > @@ -727,6 +727,14 @@ util_bitcount(unsigned n)
> >  #endif
> >  }
> >
> > +
> > +static INLINE unsigned
> > +util_bitcount64(uint64_t n)
> > +{
> > +   return util_bitcount(n) + util_bitcount(n >> 32);
> > +}
> > +
> > +
> >  /**
> >   * Reverse bits in n
> >   * Algorithm taken from:
> > --
> > 1.9.1
> >
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium/util: add util_bitcount64

2014-09-30 Thread Ilia Mirkin

On Tue, Sep 30, 2014 at 1:14 PM, Tom Stellard  wrote:
> On Tue, Sep 30, 2014 at 12:29:52PM -0400, Ilia Mirkin wrote:
>> Perhaps do the same thing as util_bitcount, i.e.
>>
>> #if defined(PIPE_CC_GCC) && (PIPE_CC_GCC_VERSION >= 304)
>>   return __builtin_popcountll(n);
>> #else
>> ...
>> #endif
>>
>> Perhaps the gcc version check is no longer necessary, unlikely
>> anyone's using gcc3.3 or earlier at this point. But whatever.
>>
>
> I saw a patch from Matt recently that added autoconf checks for a bunch
> of different builtin functions, I think we should use those instead.

That sounds way better, but these version checks are all over u_math.h
-- feels like a separate cleanup, not necessary to saddle this simple
change with working out how autoconf works :) But if Marek wants to do
it, I won't object...

>
> -Tom
>
>> On Tue, Sep 30, 2014 at 12:26 PM, Marek Olšák  wrote:
>> > From: Marek Olšák 
>> >
>> > I'll need this in radeonsi.
>> > ---
>> >  src/gallium/auxiliary/util/u_math.h | 8 
>> >  1 file changed, 8 insertions(+)
>> >
>> > diff --git a/src/gallium/auxiliary/util/u_math.h 
>> > b/src/gallium/auxiliary/util/u_math.h
>> > index 39bd40f..48d5c31 100644
>> > --- a/src/gallium/auxiliary/util/u_math.h
>> > +++ b/src/gallium/auxiliary/util/u_math.h
>> > @@ -727,6 +727,14 @@ util_bitcount(unsigned n)
>> >  #endif
>> >  }
>> >
>> > +
>> > +static INLINE unsigned
>> > +util_bitcount64(uint64_t n)
>> > +{
>> > +   return util_bitcount(n) + util_bitcount(n >> 32);
>> > +}
>> > +
>> > +
>> >  /**
>> >   * Reverse bits in n
>> >   * Algorithm taken from:
>> > --
>> > 1.9.1
>> >
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] tgsi: fix Semantic.Name assignment in tgsi_transform_input_decl()

2014-09-30 Thread Charmaine Lee

Reviewed-by: Charmaine Lee 

From: mesa-dev  on behalf of Brian Paul 

Sent: Tuesday, September 30, 2014 9:31 AM
To: mesa-dev@lists.freedesktop.org
Subject: [Mesa-dev] [PATCH] tgsi: fix Semantic.Name assignment in   
tgsi_transform_input_decl()

Assign the sem_name parameter, not TGSI_SEMANTIC_GENERIC.
Fixes polygon stipple regression.
---
 src/gallium/auxiliary/tgsi/tgsi_transform.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_transform.h 
b/src/gallium/auxiliary/tgsi/tgsi_transform.h
index bfcdd56..921aa90 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_transform.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_transform.h
@@ -120,7 +120,7 @@ tgsi_transform_input_decl(struct tgsi_transform_context 
*ctx,
decl.Declaration.File = TGSI_FILE_INPUT;
decl.Declaration.Interpolate = 1;
decl.Declaration.Semantic = 1;
-   decl.Semantic.Name = TGSI_SEMANTIC_GENERIC;
+   decl.Semantic.Name = sem_name;
decl.Semantic.Index = sem_index;
decl.Range.First =
decl.Range.Last = index;
--
1.7.10.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] llvmpipe: move lp_jit_screen_init() call after allocation of screen object

2014-09-30 Thread Roland Scheidegger

Am 30.09.2014 15:16, schrieb Brian Paul:
> The screen argument isn't actually used by lp_jit_screen_init() at this
> time, but let's move the call so that we pass a valid pointer.
> 
> v2: don't leak screen if lp_jit_screen_init() fails.
> ---
>  src/gallium/drivers/llvmpipe/lp_screen.c |8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
> b/src/gallium/drivers/llvmpipe/lp_screen.c
> index 3025322..a264f99 100644
> --- a/src/gallium/drivers/llvmpipe/lp_screen.c
> +++ b/src/gallium/drivers/llvmpipe/lp_screen.c
> @@ -557,9 +557,6 @@ llvmpipe_create_screen(struct sw_winsys *winsys)
> return NULL;
>  #endif
>  
> -   if (!lp_jit_screen_init(screen))
> -  return NULL;
> -
>  #ifdef DEBUG
> LP_DEBUG = debug_get_flags_option("LP_DEBUG", lp_debug_flags, 0 );
>  #endif
> @@ -570,6 +567,11 @@ llvmpipe_create_screen(struct sw_winsys *winsys)
> if (!screen)
>return NULL;
>  
> +   if (!lp_jit_screen_init(screen)) {
> +  FREE(screen);
> +  return NULL;
> +   }
> +
> screen->winsys = winsys;
>  
> screen->base.destroy = llvmpipe_destroy_screen;
> 

Reviewed-by: Roland Scheidegger 
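
The ordering in v2 follows the usual allocate, initialize, free-on-failure pattern. A self-contained sketch of that pattern, with hypothetical names standing in for the llvmpipe screen and lp_jit_screen_init():

```c
#include <stdbool.h>
#include <stdlib.h>

struct toy_screen {
   int jit_ready;
};

/* Hypothetical init step; fails on request so the error path can be tried. */
static bool toy_jit_init(struct toy_screen *screen, bool should_fail)
{
   if (should_fail)
      return false;
   screen->jit_ready = 1;
   return true;
}

static struct toy_screen *toy_create_screen(bool jit_should_fail)
{
   struct toy_screen *screen = calloc(1, sizeof(*screen));
   if (!screen)
      return NULL;

   /* Init runs only after allocation, so it always sees a valid pointer. */
   if (!toy_jit_init(screen, jit_should_fail)) {
      free(screen);   /* don't leak the screen when init fails */
      return NULL;
   }
   return screen;
}
```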
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965/brw_reg: Make the accumulator register take an explicit width.

2014-09-30 Thread Jason Ekstrand

The big pile of patches I just pushed regresses about 25 piglit tests on
SNB.  This fixes the regressions.

Signed-off-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 12 
 src/mesa/drivers/dri/i965/brw_reg.h|  5 +++--
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  8 
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 9f65b1f..89ac7e2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -636,7 +636,8 @@ fs_visitor::visit(ir_expression *ir)
 if (brw->gen >= 7)
no16("SIMD16 explicit accumulator operands unsupported\n");
 
-struct brw_reg acc = retype(brw_acc_reg(), this->result.type);
+struct brw_reg acc = retype(brw_acc_reg(dispatch_width),
+this->result.type);
 
 emit(MUL(acc, op[0], op[1]));
 emit(MACH(reg_null_d, op[0], op[1]));
@@ -650,7 +651,8 @@ fs_visitor::visit(ir_expression *ir)
   if (brw->gen >= 7)
  no16("SIMD16 explicit accumulator operands unsupported\n");
 
-  struct brw_reg acc = retype(brw_acc_reg(), this->result.type);
+  struct brw_reg acc = retype(brw_acc_reg(dispatch_width),
+  this->result.type);
 
   emit(MUL(acc, op[0], op[1]));
   emit(MACH(this->result, op[0], op[1]));
@@ -665,7 +667,8 @@ fs_visitor::visit(ir_expression *ir)
   if (brw->gen >= 7)
  no16("SIMD16 explicit accumulator operands unsupported\n");
 
-  struct brw_reg acc = retype(brw_acc_reg(), BRW_REGISTER_TYPE_UD);
+  struct brw_reg acc = retype(brw_acc_reg(dispatch_width),
+  BRW_REGISTER_TYPE_UD);
 
   emit(ADDC(reg_null_ud, op[0], op[1]));
   emit(MOV(this->result, fs_reg(acc)));
@@ -675,7 +678,8 @@ fs_visitor::visit(ir_expression *ir)
   if (brw->gen >= 7)
  no16("SIMD16 explicit accumulator operands unsupported\n");
 
-  struct brw_reg acc = retype(brw_acc_reg(), BRW_REGISTER_TYPE_UD);
+  struct brw_reg acc = retype(brw_acc_reg(dispatch_width),
+  BRW_REGISTER_TYPE_UD);
 
   emit(SUBB(reg_null_ud, op[0], op[1]));
   emit(MOV(this->result, fs_reg(acc)));
diff --git a/src/mesa/drivers/dri/i965/brw_reg.h 
b/src/mesa/drivers/dri/i965/brw_reg.h
index 2e110d6..19af0ae 100644
--- a/src/mesa/drivers/dri/i965/brw_reg.h
+++ b/src/mesa/drivers/dri/i965/brw_reg.h
@@ -639,9 +639,10 @@ brw_ip_reg(void)
 }
 
 static inline struct brw_reg
-brw_acc_reg(void)
+brw_acc_reg(unsigned width)
 {
-   return brw_vec8_reg(BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_ACCUMULATOR, 0);
+   return brw_vecn_reg(width, BRW_ARCHITECTURE_REGISTER_FILE,
+   BRW_ARF_ACCUMULATOR, 0);
 }
 
 static inline struct brw_reg
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 9299029..f03cf4f 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1455,7 +1455,7 @@ vec4_visitor::visit(ir_expression *ir)
 else
emit(MUL(result_dst, op[0], op[1]));
  } else {
-struct brw_reg acc = retype(brw_acc_reg(), result_dst.type);
+struct brw_reg acc = retype(brw_acc_reg(8), result_dst.type);
 
 emit(MUL(acc, op[0], op[1]));
 emit(MACH(dst_null_d(), op[0], op[1]));
@@ -1466,7 +1466,7 @@ vec4_visitor::visit(ir_expression *ir)
   }
   break;
case ir_binop_imul_high: {
-  struct brw_reg acc = retype(brw_acc_reg(), result_dst.type);
+  struct brw_reg acc = retype(brw_acc_reg(8), result_dst.type);
 
   emit(MUL(acc, op[0], op[1]));
   emit(MACH(result_dst, op[0], op[1]));
@@ -1478,14 +1478,14 @@ vec4_visitor::visit(ir_expression *ir)
   emit_math(SHADER_OPCODE_INT_QUOTIENT, result_dst, op[0], op[1]);
   break;
case ir_binop_carry: {
-  struct brw_reg acc = retype(brw_acc_reg(), BRW_REGISTER_TYPE_UD);
+  struct brw_reg acc = retype(brw_acc_reg(8), BRW_REGISTER_TYPE_UD);
 
   emit(ADDC(dst_null_ud(), op[0], op[1]));
   emit(MOV(result_dst, src_reg(acc)));
   break;
}
case ir_binop_borrow: {
-  struct brw_reg acc = retype(brw_acc_reg(), BRW_REGISTER_TYPE_UD);
+  struct brw_reg acc = retype(brw_acc_reg(8), BRW_REGISTER_TYPE_UD);
 
   emit(SUBB(dst_null_ud(), op[0], op[1]));
   emit(MOV(result_dst, src_reg(acc)));
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] galahad: fix indirect draw

2014-09-30 Thread sroland

From: Roland Scheidegger 

Need to unwrap the indirect resource, otherwise bad things will happen.

Fixes random crashes and timeouts with piglit's arb_indirect_draw tests.
---
 src/gallium/drivers/galahad/glhd_context.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/galahad/glhd_context.c 
b/src/gallium/drivers/galahad/glhd_context.c
index 79d5495..37ea170 100644
--- a/src/gallium/drivers/galahad/glhd_context.c
+++ b/src/gallium/drivers/galahad/glhd_context.c
@@ -49,7 +49,7 @@ galahad_context_destroy(struct pipe_context *_pipe)
 
 static void
 galahad_context_draw_vbo(struct pipe_context *_pipe,
- const struct pipe_draw_info *info)
+ const struct pipe_draw_info *info)
 {
struct galahad_context *glhd_pipe = galahad_context(_pipe);
struct pipe_context *pipe = glhd_pipe->pipe;
@@ -58,7 +58,14 @@ galahad_context_draw_vbo(struct pipe_context *_pipe,
 * before drawing.
 */
 
-   pipe->draw_vbo(pipe, info);
+   if (info->indirect) {
+  struct pipe_draw_info info_unwrapped = *info;
+  info_unwrapped.indirect = galahad_resource_unwrap(info->indirect);
+  pipe->draw_vbo(pipe, &info_unwrapped);
+   }
+   else {
+  pipe->draw_vbo(pipe, info);
+   }
 }
 
 static struct pipe_query *
-- 
1.9.1
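
The fix is an instance of the general rule for wrapper drivers: any wrapped object reachable from a call's arguments has to be swapped for the underlying driver's object before the call is forwarded. A sketch of that pattern with made-up types (none of these names are the galahad or gallium API):

```c
#include <stddef.h>

struct res { int id; };                  /* resource as the inner driver sees it */
struct wrapped_res {
   struct res base;                      /* what callers of the wrapper hold */
   struct res *inner;                    /* the inner driver's real resource */
};

struct draw_info {
   unsigned count;
   struct res *indirect;                 /* may be NULL */
};

static struct res *unwrap(struct res *r)
{
   /* base is the first member, so the cast back to the wrapper is valid. */
   return r ? ((struct wrapped_res *)r)->inner : NULL;
}

static int last_seen_id;                 /* records what the inner driver got */

static void inner_draw(const struct draw_info *info)
{
   last_seen_id = info->indirect ? info->indirect->id : -1;
}

static void wrapper_draw(const struct draw_info *info)
{
   if (info->indirect) {
      struct draw_info unwrapped = *info;   /* copy: the caller's info is const */
      unwrapped.indirect = unwrap(info->indirect);
      inner_draw(&unwrapped);
   } else {
      inner_draw(info);
   }
}
```

Copying the info struct keeps the caller's data untouched while the inner driver only ever sees its own resource.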

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] galahad: (trivial) handle cubemap arrays

2014-09-30 Thread sroland

From: Roland Scheidegger 

---
 src/gallium/drivers/galahad/glhd_screen.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/gallium/drivers/galahad/glhd_screen.c 
b/src/gallium/drivers/galahad/glhd_screen.c
index 5a91077..11ab1a9 100644
--- a/src/gallium/drivers/galahad/glhd_screen.c
+++ b/src/gallium/drivers/galahad/glhd_screen.c
@@ -176,6 +176,13 @@ galahad_screen_resource_create(struct pipe_screen *_screen,
   glhd_check("%u", templat->height0, == templat->width0);
   glhd_check("%u", templat->depth0,  == 1);
   glhd_check("%u", templat->array_size, == 6);
+   } else if (templat->target == PIPE_TEXTURE_CUBE_ARRAY) {
+  unsigned max_texture_cube_levels = screen->get_param(screen, 
PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS);
+  glhd_check("%u", templat->last_level, < max_texture_cube_levels);
+  glhd_check("%u", templat->width0,  <= (1 << (max_texture_cube_levels - 
1)));
+  glhd_check("%u", templat->height0, == templat->width0);
+  glhd_check("%u", templat->depth0,  == 1);
+  glhd_check("%u", templat->array_size, % 6 == 0);
} else if (templat->target == PIPE_TEXTURE_RECT) {
   unsigned max_texture_2d_levels = screen->get_param(screen, 
PIPE_CAP_MAX_TEXTURE_2D_LEVELS);
   glhd_check("%u", templat->last_level, == 0);
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/brw_reg: Make the accumulator register take an explicit width.

2014-09-30 Thread Matt Turner

Assuming no regressions on other platforms:

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH V3 1/4] mesa: Add new variables in gl_context to store sample layout

2014-09-30 Thread Anuj Phogat

On Mon, Sep 29, 2014 at 7:16 PM, Jordan Justen  wrote:
>
> On 2014-09-29 16:33:33, Anuj Phogat wrote:
> > SampleMap{2,4,8}x variables are used in later patches to implement
> > EXT_framebuffer_multisample_blit_scaled extension.
> >
> > V2: Use integer array instead of a string.
> > Bump up the comment.
> >
> > Signed-off-by: Anuj Phogat 
> > ---
> >  src/mesa/main/mtypes.h | 32 
> >  1 file changed, 32 insertions(+)
> >
> > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> > index 0d50be8..162dc44 100644
> > --- a/src/mesa/main/mtypes.h
> > +++ b/src/mesa/main/mtypes.h
> > @@ -3608,6 +3608,38 @@ struct gl_constants
> > GLint MaxDepthTextureSamples;
> > GLint MaxIntegerSamples;
> >
> > +   /**
> > +* GL_EXT_texture_multisample_blit_scaled implementation assumes that
> > +* samples are laid out in a rectangular grid roughly corresponding to
> > +* sample locations within a pixel. Below SampleMap{2,4,8}x variables
> > +* are used to map indices of rectangular grid to sample numbers within
> > +* a pixel. This mapping of indices to sample numbers must be initialized
> > +* by the driver for the target hardware. For example, if we have the 8X
> > +* MSAA sample number layout (sample positions) for XYZ hardware:
> > +*
> > +*       sample indices layout          sample number layout
> > +*            ---------                      ---------
> > +*            | 0 | 1 |                      | a | b |
> > +*            ---------                      ---------
> > +*            | 2 | 3 |                      | c | d |
> > +*            ---------                      ---------
> > +*            | 4 | 5 |                      | e | f |
> > +*            ---------                      ---------
> > +*            | 6 | 7 |                      | g | h |
> > +*            ---------                      ---------
> > +*
> > +* Where a,b,c,d,e,f,g,h are integers between [0-7].
> > +*
> > +* Then, initialize the SampleMap8x variable for XYZ hardware as shown
> > +* below:
> > +*SampleMap8x = {a, b, c, d, e, f, g, h};
> > +*
> > +* Follow the logic for other sample counts.
> > +*/
> > +   unsigned *SampleMap2x;
> > +   unsigned *SampleMap4x;
> > +   unsigned *SampleMap8x;
>
> Wouldn't uint8_t work given the 0-7 range?
>
Yes, that'll work.
> Also, I thought we could include the bytes directly in the structure,
> and let the driver write the bytes rather than setting a pointer to
> the array.
>
>uint8_t SampleMap2x[2];
>uint8_t SampleMap4x[4];
>uint8_t SampleMap8x[8];
>
I'll make the changes and send out a V4.

> -Jordan
>
> > /** GL_ARB_shader_atomic_counters */
> > GLuint MaxAtomicBufferBindings;
> > GLuint MaxAtomicBufferSize;
> > --
> > 1.9.3
> >
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium/util: add util_bitcount64

2014-09-30 Thread Matt Turner

On Tue, Sep 30, 2014 at 9:26 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> I'll need this in radeonsi.
> ---
>  src/gallium/auxiliary/util/u_math.h | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/gallium/auxiliary/util/u_math.h 
> b/src/gallium/auxiliary/util/u_math.h
> index 39bd40f..48d5c31 100644
> --- a/src/gallium/auxiliary/util/u_math.h
> +++ b/src/gallium/auxiliary/util/u_math.h
> @@ -727,6 +727,14 @@ util_bitcount(unsigned n)
>  #endif
>  }
>
> +
> +static INLINE unsigned
> +util_bitcount64(uint64_t n)
> +{
> +   return util_bitcount(n) + util_bitcount(n >> 32);

There's a __builtin_popcountll that operates on a 64-bit value
directly. You should probably use that instead.  We already use it and
test for it with autoconf -- just check #ifdef
HAVE___BUILTIN_POPCOUNTLL.
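
A sketch of what this could look like, assuming HAVE___BUILTIN_POPCOUNTLL is defined by the build system whenever the builtin is available (as described above). The function name and the fallback loop are illustrative, not the code that was eventually committed:

```c
#include <stdint.h>

static inline unsigned
bitcount64_sketch(uint64_t n)
{
#ifdef HAVE___BUILTIN_POPCOUNTLL
   return (unsigned)__builtin_popcountll(n);
#else
   /* Portable fallback: clear the lowest set bit until none remain. */
   unsigned bits = 0;
   while (n) {
      n &= n - 1;
      bits++;
   }
   return bits;
#endif
}
```

Either branch handles the full 64-bit value directly, so there is no need to split it into two 32-bit halves.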
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 81680] [r600g] Firefox crashes with hardware acceleration turned on

2014-09-30 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=81680

--- Comment #42 from Benjamin Bellec  ---
(In reply to comment #40)
> Created attachment 107124 [details] [review]
> possible fix
> 
> Could you please test this patch?

Tested-by: Benjamin Bellec 

Your patch fixes the crash.
Tested on Evergreen.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC PATCH 05/56] mesa/main: Add tessellation shader state and limits

2014-09-30 Thread Matt Turner

On Tue, Sep 30, 2014 at 8:50 AM, Ian Romanick  wrote:
> On 09/20/2014 07:41 PM, Matt Turner wrote:
>> On Sat, Sep 20, 2014 at 6:40 PM, Chris Forbes  wrote:
>>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
>>> index 79d2e94..c11ad4f 100644
>>> --- a/src/mesa/main/shaderapi.c
>>> +++ b/src/mesa/main/shaderapi.c
>>> @@ -105,6 +105,7 @@ _mesa_get_shader_flags(void)
>>>  void
>>>  _mesa_init_shader_state(struct gl_context *ctx)
>>>  {
>>> +   int i;
>>
>> In context, this declaration looks odd. Move it below the two just
>> after this hunk?
>
> Not in core Mesa where we have to do dumb ol' C89. :(

Move it after the other two variable declarations...

   /* Device drivers may override these to control what kind of instructions
* are generated by the GLSL compiler.
*/
   struct gl_shader_compiler_options options;
   gl_shader_stage sh;
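
For context, C89 requires every declaration in a block to come before the first statement, which is why the new variable has to join the existing declarations rather than sit next to its first use. A small illustration with invented names:

```c
/* Compiles as C89: all declarations precede the first statement. */
static int sum_first_n(const int *values, int n)
{
   int total = 0;
   int i;               /* declared up front, not inside the for () */

   for (i = 0; i < n; i++)
      total += values[i];

   return total;
}
```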
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] i965: Use BDW_MOCS_PTE for renderbuffers.

2014-09-30 Thread Kenneth Graunke

On Tuesday, September 30, 2014 10:33:42 AM Daniel Vetter wrote:
> On Tue, Sep 30, 2014 at 01:15:56AM -0700, Kenneth Graunke wrote:
> > Write-back caching cannot be used for buffers being scanned out by the
> > display engine; surfaces used for scan-out must be write-through or
> > uncached.  I originally chose WT for render targets because it works in
> > all cases.  However, we really want to use write-back caching where
> > possible, as it is more efficient.
> > 
> > Most renderbuffers are not used for scanout - off-screen FBOs certainly
> > are fine, and non-pageflipped backbuffers should be fine as well.  So
> > in most cases WB will work.  However, we don't know what will be used
> > for scan-out, so we instead simply use the PTE value specified by the
> > kernel, as it knows these things.
> > 
> > This matches our MOCS choice on Haswell.
> > 
> > Fixes performance regressions since commit ee4484be3dc827cf15bcf109f5
> > in a microbenchmark (spotted by Eero Tamminen).  Improves performance
> > in GLBenchmark 2.7/EgyptHD by 7.44362% +/- 0.496939% (n=55) on a
> > Broadwell GT2.
> > 
> > Signed-off-by: Kenneth Graunke 
> > Reported-by: Eero Tamminen 
> > Cc: mesa-sta...@lists.freedesktop.org
> > ---
> >  src/mesa/drivers/dri/i965/gen8_surface_state.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > Cc'd to stable because it's a pretty trivial change and provides a sizable
> > boost to performance on new hardware.
> 
> Both patches are Reviewed-by: Daniel Vetter 
> 
> Aside: Not using WT on display can lead to corruption (apparently bdw is
> fairly aggressive with writeback so hard to spot in reality), so imo
> definitely stable material.
> 
> With the hw display crc stuff we now support in the kernel/igt we could
> even write an automated testcase for these corruptions, but probably not
> worth the hassle.
> -Daniel

Well, we should have already been using WT when writing anything that hits the 
display.  The advantage of using the PTE entries' cache mode is that we should 
get WB for most (non-displayed) surfaces, but still get WT for anything 
displayed.

Thanks for the review!

--Ken

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] SandyBridge's 'resinfo' -> returned value for SURFTYPE_BUFFER?

2014-09-30 Thread Matt Turner

On Tue, Sep 30, 2014 at 5:22 AM, Samuel Iglesias Gonsálvez
 wrote:
> Hello,
>
> I am looking at bug 57439 [0] where it shows an error
> in a piglit test [1] related to textureSize() function happening
> in Intel SandyBridge hardware.
>
> According to SNB's PRM documentation (vol4 part1 page 141), the
> returned value for SURFTYPE_BUFFER (the surface type used in the test)
> is not defined in the 'resinfo' message type. For IvyBridge's doc it is
> defined as the buffer size, which is calculated from combined
> Depth/Height/Width values.
>
> As it is not clear that SNB returns the same value as IVB for that
> kind of message and surface type, I am sending this email to ask for a
> clarification :-)

Yes, I can confirm that the internal BSpec says that on Sandybridge, resinfo
for SURFTYPE_BUFFER (and SURFTYPE_STRBUF, same thing?) returns
undefined results in all channels.

[Mesa-dev] [PATCH V4 1/4] mesa: Add new variables in gl_context to store sample layout

2014-09-30 Thread Anuj Phogat

SampleMap{2,4,8}x variables are used in later patches to implement
EXT_framebuffer_multisample_blit_scaled extension.

V2: Use integer array instead of a string.
Bump up the comment.

V3: Use uint8_t type array.

Signed-off-by: Anuj Phogat 
---
 src/mesa/main/mtypes.h | 32 
 1 file changed, 32 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 0d50be8..258531b 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3608,6 +3608,38 @@ struct gl_constants
GLint MaxDepthTextureSamples;
GLint MaxIntegerSamples;
 
+   /**
+* GL_EXT_framebuffer_multisample_blit_scaled implementation assumes that
+* samples are laid out in a rectangular grid roughly corresponding to
+* sample locations within a pixel. Below SampleMap{2,4,8}x variables
+* are used to map indices of rectangular grid to sample numbers within
+* a pixel. This mapping of indices to sample numbers must be initialized
+* by the driver for the target hardware. For example, if we have the 8X
+* MSAA sample number layout (sample positions) for XYZ hardware:
+*
+*   sample indices layout      sample number layout
+*        ---------                  ---------
+*        | 0 | 1 |                  | a | b |
+*        ---------                  ---------
+*        | 2 | 3 |                  | c | d |
+*        ---------                  ---------
+*        | 4 | 5 |                  | e | f |
+*        ---------                  ---------
+*        | 6 | 7 |                  | g | h |
+*        ---------                  ---------
+*
+* Where a,b,c,d,e,f,g,h are integers between [0-7].
+*
+* Then, initialize the SampleMap8x variable for XYZ hardware as shown
+* below:
+*SampleMap8x = {a, b, c, d, e, f, g, h};
+*
+* Follow the logic for other sample counts.
+*/
+   uint8_t SampleMap2x[2];
+   uint8_t SampleMap4x[4];
+   uint8_t SampleMap8x[8];
+
/** GL_ARB_shader_atomic_counters */
GLuint MaxAtomicBufferBindings;
GLuint MaxAtomicBufferSize;
-- 
1.9.3
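
A driver filling in SampleMap{2,4,8}x could sanity-check its table before use. The sketch below assumes that each sample number should appear exactly once in a map; the comment in the patch implies this but does not state it. `sample_map_is_valid()` is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if map[0..count-1] holds every value
 * in 0..count-1 exactly once. Sample counts up to 8 are supported. */
static bool
sample_map_is_valid(const uint8_t *map, unsigned count)
{
   unsigned seen = 0;   /* bit n is set once sample number n has appeared */

   assert(count > 0 && count <= 8);
   for (unsigned i = 0; i < count; i++) {
      if (map[i] >= count)
         return false;              /* sample number out of range */
      if (seen & (1u << map[i]))
         return false;              /* sample number listed twice */
      seen |= 1u << map[i];
   }
   return true;
}
```

For example, {5, 2, 4, 6, 0, 3, 7, 1} passes for count 8, while {0, 1, 1, 3} is rejected for count 4 because sample 1 is listed twice.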


[Mesa-dev] [PATCH V4 2/4] i965: Initialize the SampleMap{2, 4, 8}x variables

2014-09-30 Thread Anuj Phogat

with values specific to Intel hardware.

V2: Define and use gen6_get_sample_map() function to initialize
the variables.

V3: Change the function name to gen6_set_sample_maps() and use
memcpy() to fill in the data.

Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/brw_context.c|  8 
 src/mesa/drivers/dri/i965/brw_context.h|  2 +
 src/mesa/drivers/dri/i965/gen6_multisample_state.c | 45 ++
 3 files changed, 55 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 619f2d5..ebe6a50 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -406,6 +406,14 @@ brw_initialize_context_constants(struct brw_context *brw)
ctx->Const.MaxDepthTextureSamples = max_samples;
ctx->Const.MaxIntegerSamples = max_samples;
 
+   /* gen6_set_sample_maps() sets SampleMap{2,4,8}x variables which are used
+* to map indices of rectangular grid to sample numbers within a pixel.
+* These variables are used by GL_EXT_framebuffer_multisample_blit_scaled
+* extension implementation. For more details see the comment above
+* gen6_set_sample_maps() definition.
+*/
+   gen6_set_sample_maps(ctx);
+
if (brw->gen >= 7)
   ctx->Const.MaxProgramTextureGatherComponents = 4;
else if (brw->gen == 6)
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 5830aa99..e0f2e6b 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1660,6 +1660,8 @@ gen6_get_sample_position(struct gl_context *ctx,
  struct gl_framebuffer *fb,
  GLuint index,
  GLfloat *result);
+void
+gen6_set_sample_maps(struct gl_context *ctx);
 
 /* gen8_multisample_state.c */
 void gen8_emit_3dstate_multisample(struct brw_context *brw, unsigned num_samp);
diff --git a/src/mesa/drivers/dri/i965/gen6_multisample_state.c b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
index 429a590..ee20c08 100644
--- a/src/mesa/drivers/dri/i965/gen6_multisample_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
@@ -57,6 +57,51 @@ gen6_get_sample_position(struct gl_context *ctx,
 }
 
 /**
+ * Sample index layout shows the numbering of slots in a rectangular
+ * grid of samples within a pixel. Sample number layout shows the
+ * rectangular grid of samples roughly corresponding to the real sample
+ * locations within a pixel. Sample number layout matches the sample
+ * index layout in case of 2X and 4X MSAA, but they are different in
+ * case of 8X MSAA.
+ *
+ * 2X MSAA sample index / number layout
+ *           ---------
+ *           | 0 | 1 |
+ *           ---------
+ *
+ * 4X MSAA sample index / number layout
+ *           ---------
+ *           | 0 | 1 |
+ *           ---------
+ *           | 2 | 3 |
+ *           ---------
+ *
+ * 8X MSAA sample index layout    8X MSAA sample number layout
+ *           ---------                      ---------
+ *           | 0 | 1 |                      | 5 | 2 |
+ *           ---------                      ---------
+ *           | 2 | 3 |                      | 4 | 6 |
+ *           ---------                      ---------
+ *           | 4 | 5 |                      | 0 | 3 |
+ *           ---------                      ---------
+ *           | 6 | 7 |                      | 7 | 1 |
+ *           ---------                      ---------
+ *
+ * A sample map is used to map sample indices to sample numbers.
+ */
+void
+gen6_set_sample_maps(struct gl_context *ctx)
+{
+   uint8_t map_2x[2] = {0, 1};
+   uint8_t map_4x[4] = {0, 1, 2, 3};
+   uint8_t map_8x[8] = {5, 2, 4, 6, 0, 3, 7, 1};
+
+   memcpy(ctx->Const.SampleMap2x, map_2x, sizeof(map_2x));
+   memcpy(ctx->Const.SampleMap4x, map_4x, sizeof(map_4x));
+   memcpy(ctx->Const.SampleMap8x, map_8x, sizeof(map_8x));
+}
+
+/**
  * 3DSTATE_MULTISAMPLE
  */
 void
-- 
1.9.3
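
To make the mapping concrete, here is a small lookup sketch using the 8X values from gen6_set_sample_maps() above. It assumes the grid is two slots wide, as drawn in the comment, so that slot index = row * 2 + column. `sample_number_at()` is a hypothetical helper, not something the patch adds.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from gen6_set_sample_maps() in the patch above. */
static const uint8_t map_8x[8] = {5, 2, 4, 6, 0, 3, 7, 1};

/* Hypothetical helper: sample number stored at grid position (col, row),
 * assuming a grid that is two slots wide. */
static uint8_t
sample_number_at(const uint8_t *map, unsigned num_samples,
                 unsigned col, unsigned row)
{
   const unsigned grid_width = 2;
   unsigned index = row * grid_width + col;   /* slot index in the grid */

   assert(col < grid_width && index < num_samples);
   return map[index];
}
```

With the 8X map, the top-left slot (column 0, row 0) holds sample 5 and the bottom-right slot (column 1, row 3) holds sample 1, matching the sample number layout in the comment.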


[Mesa-dev] [PATCH V4 3/4] meta: Implement ext_framebuffer_multisample_blit_scaled extension

2014-09-30 Thread Anuj Phogat

This extension enables doing a multisample buffer resolve and buffer
scaling using a single glBlitFramebuffer() call. Currently, we
have this extension implemented in BLORP which is only used by
SNB and IVB. This patch implements the extension in meta path
which makes it available to Broadwell.

Implementation features:
 - Supports scaled resolves of 2X, 4X and 8X multisample buffers.

 - Avoids unnecessary shader compilations by storing the precompiled
   shaders for each supported sample count.

 - Uses bilinear filtering for both GL_SCALED_RESOLVE_FASTEST_EXT and
   GL_SCALED_RESOLVE_NICEST_EXT filter options. This is an allowed
   behavior in the extension's spec.

 - I tried doing bicubic filtering for GL_SCALED_RESOLVE_NICEST_EXT
   filter. It made the edges in the image look a little smoother, but
   the image gets blurred, so there is no overall quality improvement.
   For now I have dropped the idea of doing different filtering for
   the nicest filter.

V2:
 - Minor changes to simplify the fragment shader.
 - Refactor the code to move i965 specific sample_map computation out
   of Meta. We now use ctx->Const.SampleMap{2,4,8}x variables initialized
   by the driver.
 - Use a simple msaa resolve shader for scaled resolves with scaling
   factor = 1.0.

V3:
 - Make changes to create a string out of ctx->Const.SampleMap{2,4,8}x
   variables and use it in fragment shader.

V4:
 - Make changes to use uint8_t type ctx->Const.SampleMap{2,4,8}x
   variables.

Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/common/meta.h  |   6 ++
 src/mesa/drivers/common/meta_blit.c | 206 +---
 2 files changed, 199 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/common/meta.h b/src/mesa/drivers/common/meta.h
index edc3e8c..2c9517b 100644
--- a/src/mesa/drivers/common/meta.h
+++ b/src/mesa/drivers/common/meta.h
@@ -279,6 +279,12 @@ enum blit_msaa_shader {
BLIT_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_COPY_UINT,
BLIT_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_DEPTH_RESOLVE,
BLIT_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_DEPTH_COPY,
+   BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_SCALED_RESOLVE,
+   BLIT_4X_MSAA_SHADER_2D_MULTISAMPLE_SCALED_RESOLVE,
+   BLIT_8X_MSAA_SHADER_2D_MULTISAMPLE_SCALED_RESOLVE,
+   BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_SCALED_RESOLVE,
+   BLIT_4X_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_SCALED_RESOLVE,
+   BLIT_8X_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_SCALED_RESOLVE,
BLIT_MSAA_SHADER_COUNT,
 };
 
diff --git a/src/mesa/drivers/common/meta_blit.c b/src/mesa/drivers/common/meta_blit.c
index fc9848a..c7ff2b1 100644
--- a/src/mesa/drivers/common/meta_blit.c
+++ b/src/mesa/drivers/common/meta_blit.c
@@ -55,6 +55,179 @@
 #define OFFSET(FIELD) ((void *) offsetof(struct vertex, FIELD))
 
 static void
+setup_glsl_msaa_blit_scaled_shader(struct gl_context *ctx,
+   struct blit_state *blit,
+   struct gl_renderbuffer *src_rb,
+   GLenum target, GLenum filter)
+{
+   GLint loc_src_width, loc_src_height;
+   int i, samples;
+   int shader_offset = 0;
+   void *mem_ctx = ralloc_context(NULL);
+   char *fs_source;
+   char *name, *sample_number;
+   const uint8_t *sample_map;
+   char *sample_map_str = rzalloc_size(mem_ctx, 1);
+   char *sample_map_expr = rzalloc_size(mem_ctx, 1);
+   char *texel_fetch_macro = rzalloc_size(mem_ctx, 1);
+   const char *vs_source;
+   const char *sampler_array_suffix = "";
+   const char *texcoord_type = "vec2";
+   float y_scale;
+   enum blit_msaa_shader shader_index;
+
+   assert(src_rb);
+   samples = MAX2(src_rb->NumSamples, 1);
+   y_scale = samples * 0.5;
+
+   /* We expect only power of 2 samples in source multisample buffer. */
+   assert((samples & (samples - 1)) == 0);
+   while (samples >> (shader_offset + 1)) {
+  shader_offset++;
+   }
+   /* Update the assert if we plan to support more than 8X MSAA. */
+   assert(shader_offset > 0 && shader_offset < 4);
+
+   assert(target == GL_TEXTURE_2D_MULTISAMPLE ||
+  target == GL_TEXTURE_2D_MULTISAMPLE_ARRAY);
+
+   shader_index = BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_SCALED_RESOLVE +
+  shader_offset - 1;
+
+   if (target == GL_TEXTURE_2D_MULTISAMPLE_ARRAY) {
+  shader_index += BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_SCALED_RESOLVE -
+  BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_SCALED_RESOLVE;
+  sampler_array_suffix = "Array";
+  texcoord_type = "vec3";
+   }
+
+   if (blit->msaa_shaders[shader_index]) {
+  _mesa_UseProgram(blit->msaa_shaders[shader_index]);
+  /* Update the uniform values. */
+  loc_src_width =
+ glGetUniformLocation(blit->msaa_shaders[shader_index], "src_width");
+  loc_src_height =
+ glGetUniformLocation(blit->msaa_shaders[shader_index], "src_height");
+  glUniform1f(loc_src_width, src_rb->Width);
+  glUniform1f(loc_src_height, src_rb->Height);
+  return;
+   }
+
+   name = ralloc_asprintf(mem_ctx, "vec4 MSAA sc
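
The shader selection at the start of the (truncated) function above can be sketched on its own. The loop computes log2 of the sample count and uses it to pick one of the six new enum entries. The enum below is a hypothetical stand-in for the entries added to meta.h, and `scaled_resolve_shader_index()` is not a function in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the six enum entries added to meta.h. */
enum scaled_resolve_shader {
   SCALED_RESOLVE_2X,
   SCALED_RESOLVE_4X,
   SCALED_RESOLVE_8X,
   SCALED_RESOLVE_ARRAY_2X,
   SCALED_RESOLVE_ARRAY_4X,
   SCALED_RESOLVE_ARRAY_8X,
};

static enum scaled_resolve_shader
scaled_resolve_shader_index(int samples, bool is_array)
{
   int shader_offset = 0;
   int index;

   /* Only power-of-two sample counts are expected. */
   assert(samples > 0 && (samples & (samples - 1)) == 0);

   /* shader_offset ends up as log2(samples): 2 -> 1, 4 -> 2, 8 -> 3. */
   while (samples >> (shader_offset + 1))
      shader_offset++;
   assert(shader_offset > 0 && shader_offset < 4);

   index = SCALED_RESOLVE_2X + shader_offset - 1;
   if (is_array)
      index += SCALED_RESOLVE_ARRAY_2X - SCALED_RESOLVE_2X;

   return (enum scaled_resolve_shader) index;
}
```

This relies on the 2X, 4X and 8X entries being consecutive and on the array variants following in the same order, which is how the patch lays out the enum.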

[Mesa-dev] [PATCH V4 4/4] i965: Enable EXT_framebuffer_multisample_blit_scaled for gen8

2014-09-30 Thread Anuj Phogat

Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 046d2a1..10fe10e 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -256,8 +256,7 @@ intelInitExtensions(struct gl_context *ctx)
 
   ctx->Extensions.EXT_framebuffer_multisample = true;
   ctx->Extensions.EXT_transform_feedback = true;
-  if (brw->gen < 8)
- ctx->Extensions.EXT_framebuffer_multisample_blit_scaled = true;
+  ctx->Extensions.EXT_framebuffer_multisample_blit_scaled = true;
   ctx->Extensions.ARB_blend_func_extended = !driQueryOptionb(&brw->optionCache, "disable_blend_func_extended");
   ctx->Extensions.ARB_draw_buffers_blend = true;
   ctx->Extensions.ARB_ES3_compatibility = true;
-- 
1.9.3


Re: [Mesa-dev] [PATCH] gallivm: Disable gallivm to fix build with LLVM 3.6

2014-09-30 Thread Mathias Fröhlich


Jose,

On Wednesday, September 24, 2014 12:42:24 Jose Fonseca wrote:
> That said, the way we use these things is still a bit in flux. Mathias 
> has some pending patches.   BTW, Mathias, should I submit your patches 
> for making llvmpipe thread safe?
It was Mesa day for me. I double-checked that Mesa compiles with the different
llvm versions and the latest rebases (no llvm, llvm-3.5, llvm-3.6; I hope
to have caught all configs), which only required marginal rebase changes.
That's what I finally pushed.

Greetings

Mathias

Re: [Mesa-dev] [PATCH 02/13] tgsi: simplify shader properties in tgsi_shader_info

2014-09-30 Thread Roland Scheidegger

On 30.09.2014 18:46, Marek Olšák wrote:
> From: Marek Olšák 
> 
> Use an array of properties indexed by TGSI_PROPERTY_* definitions.
> ---
>  src/gallium/auxiliary/draw/draw_gs.c | 23 -
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c  | 15 +++---
>  src/gallium/auxiliary/tgsi/tgsi_scan.c   | 59 ++--
>  src/gallium/auxiliary/tgsi/tgsi_scan.h   |  6 +--
>  src/gallium/auxiliary/util/u_pstipple.c  |  8 +---
>  src/gallium/drivers/llvmpipe/lp_state_fs.c   | 10 +---
>  src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c | 24 +++---
>  src/gallium/drivers/r300/r300_fs.c   |  8 +---
>  src/gallium/drivers/radeonsi/si_shader.c | 53 +++--
>  9 files changed, 70 insertions(+), 136 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_gs.c b/src/gallium/auxiliary/draw/draw_gs.c
> index 878fcca..0c2f892 100644
> --- a/src/gallium/auxiliary/draw/draw_gs.c
> +++ b/src/gallium/auxiliary/draw/draw_gs.c
> @@ -750,9 +750,6 @@ draw_create_geometry_shader(struct draw_context *draw,
> tgsi_scan_shader(state->tokens, &gs->info);
>  
> /* setup the defaults */
> -   gs->input_primitive = PIPE_PRIM_TRIANGLES;
> -   gs->output_primitive = PIPE_PRIM_TRIANGLE_STRIP;
> -   gs->max_output_vertices = 32;
> gs->max_out_prims = 0;
>  
>  #ifdef HAVE_LLVM
> @@ -768,17 +765,15 @@ draw_create_geometry_shader(struct draw_context *draw,
>gs->vector_length = 1;
> }
>  
> -   for (i = 0; i < gs->info.num_properties; ++i) {
> -  if (gs->info.properties[i].name ==
> -  TGSI_PROPERTY_GS_INPUT_PRIM)
> - gs->input_primitive = gs->info.properties[i].data[0];
> -  else if (gs->info.properties[i].name ==
> -   TGSI_PROPERTY_GS_OUTPUT_PRIM)
> - gs->output_primitive = gs->info.properties[i].data[0];
> -  else if (gs->info.properties[i].name ==
> -   TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES)
> - gs->max_output_vertices = gs->info.properties[i].data[0];
> -   }
> +   gs->input_primitive =
> + gs->info.properties[TGSI_PROPERTY_GS_INPUT_PRIM][0];
> +   gs->output_primitive =
> + gs->info.properties[TGSI_PROPERTY_GS_OUTPUT_PRIM][0];
> +   gs->max_output_vertices =
> + gs->info.properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES][0];
> +   if (!gs->max_output_vertices)
> +  gs->max_output_vertices = 32;
> +
> /* Primitive boundary is bigger than max_output_vertices by one, because
>  * the specification says that the geometry shader should exit if the 
>  * number of emitted vertices is bigger or equal to max_output_vertices and
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> index c0bd7be..2d7f32d 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> @@ -3855,8 +3855,8 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
> * were forgetting so we're using MAX_VERTEX_VARYING from
> * that spec even though we could debug_assert if it's not
> * set, but that's a lot uglier. */
> -  uint max_output_vertices = 32;
> -  uint i = 0;
> +  uint max_output_vertices;
> +
>/* inputs are always indirect with gs */
>bld.indirect_files |= (1 << TGSI_FILE_INPUT);
>bld.gs_iface = gs_iface;
> @@ -3864,12 +3864,11 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
>bld.bld_base.op_actions[TGSI_OPCODE_EMIT].emit = emit_vertex;
>bld.bld_base.op_actions[TGSI_OPCODE_ENDPRIM].emit = end_primitive;
>  
> -  for (i = 0; i < info->num_properties; ++i) {
> - if (info->properties[i].name ==
> - TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES) {
> -max_output_vertices = info->properties[i].data[0];
> - }
> -  }
> +  max_output_vertices =
> +info->properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES][0];
> +  if (!max_output_vertices)
> + max_output_vertices = 32;
> +
>bld.max_output_vertices_vec =
>   lp_build_const_int_vec(gallivm, bld.bld_base.int_bld.type,
>  max_output_vertices);
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c b/src/gallium/auxiliary/tgsi/tgsi_scan.c
> index c71bb36..f9d1896 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
> @@ -277,13 +277,11 @@ tgsi_scan_shader(const struct tgsi_token *tokens,
>   {
>  const struct tgsi_full_property *fullprop
> = &parse.FullToken.FullProperty;
> +unsigned name = fullprop->Property.PropertyName;
>  
> -info->properties[info->num_properties].name =
> -   fullprop->Property.PropertyName;
> -memcpy(info->properties[info->num_properties].data,
> -   fullprop->u, 8 * sizeof(unsigned));;
> -
> -++info->num_
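
The idea in the quoted patch (store each property in a slot indexed by its name, and treat a zero value as "not set") can be sketched independently of the TGSI headers. Every identifier below is hypothetical. The 8 data words per property and the fallback of 32 output vertices are taken from the diff above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical property names standing in for TGSI_PROPERTY_*. */
enum sketch_property {
   SKETCH_PROP_GS_INPUT_PRIM,
   SKETCH_PROP_GS_OUTPUT_PRIM,
   SKETCH_PROP_GS_MAX_OUTPUT_VERTICES,
   SKETCH_PROP_COUNT
};

struct sketch_shader_info {
   /* One slot of 8 data words per property name; zero means "not set". */
   unsigned properties[SKETCH_PROP_COUNT][8];
};

static void
sketch_info_init(struct sketch_shader_info *info)
{
   memset(info, 0, sizeof(*info));
}

/* What the scanner would do for each property token it encounters. */
static void
sketch_record_property(struct sketch_shader_info *info,
                       enum sketch_property name, const unsigned data[8])
{
   assert((unsigned) name < SKETCH_PROP_COUNT);
   memcpy(info->properties[name], data, 8 * sizeof(unsigned));
}

/* Direct lookup with a default, replacing the old linear search. */
static unsigned
sketch_max_output_vertices(const struct sketch_shader_info *info)
{
   unsigned v = info->properties[SKETCH_PROP_GS_MAX_OUTPUT_VERTICES][0];
   return v ? v : 32;
}
```

One consequence of this layout is that a property explicitly set to zero cannot be told apart from an unset one, so it only works for properties where zero is not a meaningful value or is the desired default.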

Re: [Mesa-dev] [PATCH] gallivm: Disable gallivm to fix build with LLVM 3.6

2014-09-30 Thread Brian Paul


On 09/30/2014 01:00 PM, Mathias Fröhlich wrote:
> Jose,
>
> On Wednesday, September 24, 2014 12:42:24 Jose Fonseca wrote:
>> That said, the way we use these things is still a bit in flux. Mathias
>> has some pending patches.   BTW, Mathias, should I submit your patches
>> for making llvmpipe thread safe?
> Mesa day for me. I did double check the mesa compile with different
> llvm versions and the latest rebases (no llvm, llvm-3.5, llvm-3.6 - hope
> to have caught all configs), which did only require marginal rebase changes.
> That's what I pushed finally.


My Linux build is broken:

In file included from draw/draw_context.c:49:0:
./gallivm/lp_bld_init.h:47:4: error: unknown type name 'LLVMMCJITMemoryManagerRef'

$ llvm-config --version
3.2

Yeah, it's a bit old, but it was working until now.

Some of our other automated builds are failing too, probably with newer 
LLVM versions...  Let me investigate.


-Brian


Re: [Mesa-dev] [PATCH] gallivm: Disable gallivm to fix build with LLVM 3.6

2014-09-30 Thread Brian Paul


On 09/30/2014 01:16 PM, Brian Paul wrote:
> On 09/30/2014 01:00 PM, Mathias Fröhlich wrote:
>> Jose,
>>
>> On Wednesday, September 24, 2014 12:42:24 Jose Fonseca wrote:
>>> That said, the way we use these things is still a bit in flux. Mathias
>>> has some pending patches.   BTW, Mathias, should I submit your patches
>>> for making llvmpipe thread safe?
>> Mesa day for me. I did double check the mesa compile with different
>> llvm versions and the latest rebases (no llvm, llvm-3.5, llvm-3.6 - hope
>> to have caught all configs), which did only require marginal rebase
>> changes.
>> That's what I pushed finally.
>
> My Linux build is broken:
>
> In file included from draw/draw_context.c:49:0:
> ./gallivm/lp_bld_init.h:47:4: error: unknown type name 'LLVMMCJITMemoryManagerRef'
>
> $ llvm-config --version
> 3.2
>
> Yeah, it's a bit old, but it was working until now.
>
> Some of our other automated builds are failing too, probably with newer
> LLVM versions...  Let me investigate.


Same failure with LLVM 3.3.1 too.

-Brian



Re: [Mesa-dev] [PATCH] gallivm: Disable gallivm to fix build with LLVM 3.6

2014-09-30 Thread Mathias Fröhlich


Hi,

On Tuesday, September 30, 2014 13:17:31 Brian Paul wrote:
> Same failure with LLVM 3.3.1 too.
Ok, that's what I did not try.
Sorry. I will try to follow up immediately ...

Greetings

Mathias

Re: [Mesa-dev] [PATCH] gallivm: Disable gallivm to fix build with LLVM 3.6

2014-09-30 Thread Brian Paul


On 09/30/2014 01:26 PM, Mathias Fröhlich wrote:
> Hi,
>
> On Tuesday, September 30, 2014 13:17:31 Brian Paul wrote:
>> Same failure with LLVM 3.3.1 too.
> Ok, that's what I did not try.
> Sorry. I will try to follow up immediately ...


Thanks, Mathias.  But I'm about to post a patch that fixes things for 
LLVM 3.2 for me...  Let me know what you think.


-Brian



[Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2

2014-09-30 Thread Brian Paul

Move the USE_MCJIT / HAVE_AVX determination logic to lp_bld.h.  If we
don't have MCJIT, define a dummy LLVMMCJITMemoryManagerRef type to avoid
excessive #ifdef testing elsewhere.
---
 src/gallium/auxiliary/gallivm/lp_bld.h  |   40 +++
 src/gallium/auxiliary/gallivm/lp_bld_init.c |   33 +-
 2 files changed, 41 insertions(+), 32 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld.h b/src/gallium/auxiliary/gallivm/lp_bld.h
index fcf4f16..3d156e8 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld.h
@@ -58,6 +58,46 @@
 #endif
 
 
+/* Only MCJIT is available as of LLVM SVN r216982 */
+#if HAVE_LLVM >= 0x0306
+
+#define USE_MCJIT 1
+#define HAVE_AVX 1
+
+#else
+
+/**
+ * AVX is supported in:
+ * - standard JIT from LLVM 3.2 onwards
+ * - MC-JIT from LLVM 3.1
+ *   - MC-JIT supports limited OSes (MacOSX and Linux)
+ * - standard JIT in LLVM 3.1, with backports
+ */
+#if defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390) || defined(PIPE_ARCH_ARM) || defined(PIPE_ARCH_AARCH64)
+#  define USE_MCJIT 1
+#  define HAVE_AVX 0
+#elif HAVE_LLVM >= 0x0302 || (HAVE_LLVM == 0x0301 && defined(HAVE_JIT_AVX_SUPPORT))
+#  define USE_MCJIT 0
+#  define HAVE_AVX 1
+#elif HAVE_LLVM == 0x0301 && (defined(PIPE_OS_LINUX) || defined(PIPE_OS_APPLE))
+#  define USE_MCJIT 1
+#  define HAVE_AVX 1
+#else
+#  define USE_MCJIT 0
+#  define HAVE_AVX 0
+#endif
+
+#endif /* HAVE_LLVM >= 0x0306 */
+
+
+#if !USE_MCJIT
+/* We won't actually use LLVMMCJITMemoryManagerRef, just create a dummy
+ * typedef to simplify things elsewhere.
+ */
+typedef void *LLVMMCJITMemoryManagerRef;
+#endif
+
+
 /**
  * Redefine these LLVM entrypoints as invalid macros to make sure we
  * don't accidentally use them.  We need to use the functions which
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index 4e4aecb..3be14c2 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -43,37 +43,6 @@
 #include 
 
 
-/* Only MCJIT is available as of LLVM SVN r216982 */
-#if HAVE_LLVM >= 0x0306
-
-#define USE_MCJIT 1
-#define HAVE_AVX 1
-
-#else
-
-/**
- * AVX is supported in:
- * - standard JIT from LLVM 3.2 onwards
- * - MC-JIT from LLVM 3.1
- *   - MC-JIT supports limited OSes (MacOSX and Linux)
- * - standard JIT in LLVM 3.1, with backports
- */
-#if defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390) || defined(PIPE_ARCH_ARM) || defined(PIPE_ARCH_AARCH64)
-#  define USE_MCJIT 1
-#  define HAVE_AVX 0
-#elif HAVE_LLVM >= 0x0302 || (HAVE_LLVM == 0x0301 && defined(HAVE_JIT_AVX_SUPPORT))
-#  define USE_MCJIT 0
-#  define HAVE_AVX 1
-#elif HAVE_LLVM == 0x0301 && (defined(PIPE_OS_LINUX) || defined(PIPE_OS_APPLE))
-#  define USE_MCJIT 1
-#  define HAVE_AVX 1
-#else
-#  define USE_MCJIT 0
-#  define HAVE_AVX 0
-#endif
-
-#endif /* HAVE_LLVM >= 0x0306 */
-
 #if USE_MCJIT
 void LLVMLinkInMCJIT();
 #endif
@@ -219,7 +188,7 @@ gallivm_free_code(struct gallivm_state *gallivm)
assert(!gallivm->engine);
lp_free_generated_code(gallivm->code);
gallivm->code = NULL;
-#if HAVE_LLVM < 0x0306
+#if USE_MCJIT
LLVMDisposeMCJITMemoryManager(gallivm->memorymgr);
gallivm->memorymgr = NULL;
 #endif
-- 
1.7.10.4
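
The dummy-typedef approach can be shown in isolation. In this sketch the version macro is hard-coded to a pre-MCJIT value and is named SKETCH_HAVE_LLVM so it cannot be confused with the real HAVE_LLVM. The version check is reduced to the single 3.6 test, and `struct gallivm_sketch` is a hypothetical stand-in for the real state struct.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed LLVM version for this sketch (3.2), encoded like HAVE_LLVM. */
#define SKETCH_HAVE_LLVM 0x0302

/* Simplified: the real logic also looks at the architecture and OS. */
#if SKETCH_HAVE_LLVM >= 0x0306
#  define USE_MCJIT 1
#else
#  define USE_MCJIT 0
#endif

#if !USE_MCJIT
/* Without MCJIT the type is never really used; a dummy typedef lets
 * structs and prototypes mention it without per-use #ifdefs. With MCJIT
 * the real type would come from LLVM's C headers instead. */
typedef void *LLVMMCJITMemoryManagerRef;
#endif

/* Hypothetical stand-in for a struct that carries the memory manager. */
struct gallivm_sketch {
   LLVMMCJITMemoryManagerRef memorymgr;   /* declared unconditionally */
};

static void
gallivm_sketch_init(struct gallivm_sketch *g)
{
   g->memorymgr = NULL;
}
```

The field can then be declared and cleared everywhere, and only the calls into the MCJIT API itself still need a preprocessor guard.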


Re: [Mesa-dev] [PATCH 2/2] i965: Use BDW_MOCS_PTE for renderbuffers.

2014-09-30 Thread Kristian Høgsberg

On Tue, Sep 30, 2014 at 1:15 AM, Kenneth Graunke  wrote:
> Write-back caching cannot be used for buffers being scanned out by the
> display engine; surfaces used for scan-out must be write-through or
> uncached.  I originally chose WT for render targets because it works in
> all cases.  However, we really want to use write-back caching where
> possible, as it is more efficient.
>
> Most renderbuffers are not used for scanout - off-screen FBOs certainly
> are fine, and non-pageflipped backbuffers should be fine as well.  So
> in most cases WB will work.  However, we don't know what will be used
> for scan-out, so we instead simply use the PTE value specified by the
> kernel, as it knows these things.
>
> This matches our MOCS choice on Haswell.
>
> Fixes performance regressions since commit ee4484be3dc827cf15bcf109f5
> in a microbenchmark (spotted by Eero Tamminen).  Improves performance
> in GLBenchmark 2.7/EgyptHD by 7.44362% +/- 0.496939% (n=55) on a
> Broadwell GT2.
>
> Signed-off-by: Kenneth Graunke 
> Reported-by: Eero Tamminen 
> Cc: mesa-sta...@lists.freedesktop.org

That makes sense, good find from Eero.  I'll update the SKL MOCS accordingly.

Reviewed-by: Kristian Høgsberg 

>  src/mesa/drivers/dri/i965/gen8_surface_state.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Cc'd to stable because it's a pretty trivial change and provides a sizable
> boost to performance on new hardware.
>
> diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> index 40eb2ea..6dd343f 100644
> --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> @@ -377,7 +377,7 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
>   horizontal_alignment(mt) |
>   surface_tiling_mode(tiling);
>
> -   surf[1] = SET_FIELD(BDW_MOCS_WT, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
> +   surf[1] = SET_FIELD(BDW_MOCS_PTE, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
>
> surf[2] = SET_FIELD(width - 1, GEN7_SURFACE_WIDTH) |
>   SET_FIELD(height - 1, GEN7_SURFACE_HEIGHT);
> --
> 2.1.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 02/13] tgsi: simplify shader properties in tgsi_shader_info

2014-09-30 Thread Marek Olšák

On Tue, Sep 30, 2014 at 9:04 PM, Roland Scheidegger  wrote:
> On 30.09.2014 18:46, Marek Olšák wrote:
>> From: Marek Olšák 
>>
>> Use an array of properties indexed by TGSI_PROPERTY_* definitions.
>> ---
>>  src/gallium/auxiliary/draw/draw_gs.c | 23 -
>>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c  | 15 +++---
>>  src/gallium/auxiliary/tgsi/tgsi_scan.c   | 59 ++--
>>  src/gallium/auxiliary/tgsi/tgsi_scan.h   |  6 +--
>>  src/gallium/auxiliary/util/u_pstipple.c  |  8 +---
>>  src/gallium/drivers/llvmpipe/lp_state_fs.c   | 10 +---
>>  src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c | 24 +++---
>>  src/gallium/drivers/r300/r300_fs.c   |  8 +---
>>  src/gallium/drivers/radeonsi/si_shader.c | 53 +++--
>>  9 files changed, 70 insertions(+), 136 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/draw/draw_gs.c b/src/gallium/auxiliary/draw/draw_gs.c
>> index 878fcca..0c2f892 100644
>> --- a/src/gallium/auxiliary/draw/draw_gs.c
>> +++ b/src/gallium/auxiliary/draw/draw_gs.c
>> @@ -750,9 +750,6 @@ draw_create_geometry_shader(struct draw_context *draw,
>> tgsi_scan_shader(state->tokens, &gs->info);
>>
>> /* setup the defaults */
>> -   gs->input_primitive = PIPE_PRIM_TRIANGLES;
>> -   gs->output_primitive = PIPE_PRIM_TRIANGLE_STRIP;
>> -   gs->max_output_vertices = 32;
>> gs->max_out_prims = 0;
>>
>>  #ifdef HAVE_LLVM
>> @@ -768,17 +765,15 @@ draw_create_geometry_shader(struct draw_context *draw,
>>gs->vector_length = 1;
>> }
>>
>> -   for (i = 0; i < gs->info.num_properties; ++i) {
>> -  if (gs->info.properties[i].name ==
>> -  TGSI_PROPERTY_GS_INPUT_PRIM)
>> - gs->input_primitive = gs->info.properties[i].data[0];
>> -  else if (gs->info.properties[i].name ==
>> -   TGSI_PROPERTY_GS_OUTPUT_PRIM)
>> - gs->output_primitive = gs->info.properties[i].data[0];
>> -  else if (gs->info.properties[i].name ==
>> -   TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES)
>> - gs->max_output_vertices = gs->info.properties[i].data[0];
>> -   }
>> +   gs->input_primitive =
>> + gs->info.properties[TGSI_PROPERTY_GS_INPUT_PRIM][0];
>> +   gs->output_primitive =
>> + gs->info.properties[TGSI_PROPERTY_GS_OUTPUT_PRIM][0];
>> +   gs->max_output_vertices =
>> + gs->info.properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES][0];
>> +   if (!gs->max_output_vertices)
>> +  gs->max_output_vertices = 32;
>> +
>> /* Primitive boundary is bigger than max_output_vertices by one, because
>>  * the specification says that the geometry shader should exit if the
>>  * number of emitted vertices is bigger or equal to max_output_vertices and
>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
>> index c0bd7be..2d7f32d 100644
>> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
>> @@ -3855,8 +3855,8 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
>> * were forgetting so we're using MAX_VERTEX_VARYING from
>> * that spec even though we could debug_assert if it's not
>> * set, but that's a lot uglier. */
>> -  uint max_output_vertices = 32;
>> -  uint i = 0;
>> +  uint max_output_vertices;
>> +
>>/* inputs are always indirect with gs */
>>bld.indirect_files |= (1 << TGSI_FILE_INPUT);
>>bld.gs_iface = gs_iface;
>> @@ -3864,12 +3864,11 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
>>bld.bld_base.op_actions[TGSI_OPCODE_EMIT].emit = emit_vertex;
>>bld.bld_base.op_actions[TGSI_OPCODE_ENDPRIM].emit = end_primitive;
>>
>> -  for (i = 0; i < info->num_properties; ++i) {
>> - if (info->properties[i].name ==
>> - TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES) {
>> -max_output_vertices = info->properties[i].data[0];
>> - }
>> -  }
>> +  max_output_vertices =
>> +info->properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES][0];
>> +  if (!max_output_vertices)
>> + max_output_vertices = 32;
>> +
>>bld.max_output_vertices_vec =
>>   lp_build_const_int_vec(gallivm, bld.bld_base.int_bld.type,
>>  max_output_vertices);
>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c b/src/gallium/auxiliary/tgsi/tgsi_scan.c
>> index c71bb36..f9d1896 100644
>> --- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
>> +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
>> @@ -277,13 +277,11 @@ tgsi_scan_shader(const struct tgsi_token *tokens,
>>   {
>>  const struct tgsi_full_property *fullprop
>> = &parse.FullToken.FullProperty;
>> +unsigned name = fullprop->Property.PropertyName;
>>
>> -info->properties[info->num_properties].name =
>> -   fullprop->Property.PropertyName;
>>

Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2

2014-09-30 Thread Mathias Fröhlich


Hi Brian,

On Tuesday, September 30, 2014 13:30:21 Brian Paul wrote:
> Move the USE_MCJIT / HAVE_AVX determination logic to lp_bld.h.  If we
> don't have MCJIT define a dummy LLVMMCJITMemoryManagerRef type to avoid
> excessive #ifdef testing elsewhere.
[...]
> @@ -219,7 +188,7 @@ gallivm_free_code(struct gallivm_state *gallivm)
> assert(!gallivm->engine);
> lp_free_generated_code(gallivm->code);
> gallivm->code = NULL;
> -#if HAVE_LLVM < 0x0306
> +#if USE_MCJIT
We will probably still need the < 0x0306 check:
#if HAVE_LLVM < 0x0306 && USE_MCJIT
since the memory manager handling, in the form implemented for 3.5, went
away with version 3.6.

> LLVMDisposeMCJITMemoryManager(gallivm->memorymgr);
> gallivm->memorymgr = NULL;
>  #endif
> 

Also, we will probably fail to compile the LLVMDisposeMCJITMemoryManager
call under some configurations with MCJIT and older llvm.
So, additionally to what you had, how about the attached one?
I am still trying to verify this change against 3.5 and 3.6.
I am not sure about 3.2 since it did not build out of the box with
my configure line.
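
To make the combined guard above concrete, here is a minimal sketch in C. It mirrors the proposed preprocessor condition (HAVE_LLVM < 0x0306 && USE_MCJIT) as a runtime predicate so the cases can be checked side by side. The function name memorymgr_guard is a hypothetical stand-in and is not part of gallivm; the version encoding (0x0302 for 3.2, 0x0306 for 3.6) follows the HAVE_LLVM values used in the patch.

```c
#include <assert.h>

/*
 * Hypothetical runtime mirror of the preprocessor test
 *    #if HAVE_LLVM < 0x0306 && USE_MCJIT
 * Returns 1 when the MCJIT memory manager calls would be compiled in.
 */
static int
memorymgr_guard(unsigned llvm_version, int use_mcjit)
{
   return use_mcjit && llvm_version < 0x0306;
}
```

Under these assumptions, the old JIT on 3.2 (use_mcjit == 0) and any build against 3.6 both skip the memory manager calls, while MCJIT on 3.5 keeps them.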

Greetings

Mathias

From 915222ba9ed262d4c8deeafd2bfd530bb0a769ba Mon Sep 17 00:00:00 2001
Message-Id: <915222ba9ed262d4c8deeafd2bfd530bb0a769ba.1412109598.git.mathias.froehl...@gmx.net>
From: =?UTF-8?q?Mathias=20Fr=C3=B6hlich?= 
Date: Tue, 30 Sep 2014 22:11:30 +0200
Subject: [PATCH] gallivm: fix build for LLVM 3.2

---
 src/gallium/auxiliary/gallivm/lp_bld.h| 40 +++
 src/gallium/auxiliary/gallivm/lp_bld_init.c   | 37 ++---
 src/gallium/auxiliary/gallivm/lp_bld_misc.cpp |  9 ++
 src/gallium/auxiliary/gallivm/lp_bld_misc.h   |  3 ++
 4 files changed, 54 insertions(+), 35 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld.h b/src/gallium/auxiliary/gallivm/lp_bld.h
index fcf4f16..218f537 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld.h
@@ -58,6 +58,46 @@
 #endif
 
 
+/* Only MCJIT is available as of LLVM SVN r216982 */
+#if HAVE_LLVM >= 0x0306
+
+#define USE_MCJIT 1
+#define HAVE_AVX 1
+
+#else
+
+/**
+ * AVX is supported in:
+ * - standard JIT from LLVM 3.2 onwards
+ * - MC-JIT from LLVM 3.1
+ *   - MC-JIT supports limited OSes (MacOSX and Linux)
+ * - standard JIT in LLVM 3.1, with backports
+ */
+#if defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390) || defined(PIPE_ARCH_ARM) || defined(PIPE_ARCH_AARCH64)
+#  define USE_MCJIT 1
+#  define HAVE_AVX 0
+#elif HAVE_LLVM >= 0x0302 || (HAVE_LLVM == 0x0301 && defined(HAVE_JIT_AVX_SUPPORT))
+#  define USE_MCJIT 0
+#  define HAVE_AVX 1
+#elif HAVE_LLVM == 0x0301 && (defined(PIPE_OS_LINUX) || defined(PIPE_OS_APPLE))
+#  define USE_MCJIT 1
+#  define HAVE_AVX 1
+#else
+#  define USE_MCJIT 0
+#  define HAVE_AVX 0
+#endif
+
+#endif /* HAVE_LLVM >= 0x0306 */
+
+
+#if HAVE_LLVM <= 0x0303
+/* We won't actually use LLVMMCJITMemoryManagerRef, just create a dummy
+ * typedef to simplify things elsewhere.
+ */
+typedef void *LLVMMCJITMemoryManagerRef;
+#endif
+
+
 /**
  * Redefine these LLVM entrypoints as invalid macros to make sure we
  * don't accidentally use them.  We need to use the functions which
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index 4e4aecb..2f5b4ba 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -43,37 +43,6 @@
 #include 
 
 
-/* Only MCJIT is available as of LLVM SVN r216982 */
-#if HAVE_LLVM >= 0x0306
-
-#define USE_MCJIT 1
-#define HAVE_AVX 1
-
-#else
-
-/**
- * AVX is supported in:
- * - standard JIT from LLVM 3.2 onwards
- * - MC-JIT from LLVM 3.1
- *   - MC-JIT supports limited OSes (MacOSX and Linux)
- * - standard JIT in LLVM 3.1, with backports
- */
-#if defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390) || defined(PIPE_ARCH_ARM) || defined(PIPE_ARCH_AARCH64)
-#  define USE_MCJIT 1
-#  define HAVE_AVX 0
-#elif HAVE_LLVM >= 0x0302 || (HAVE_LLVM == 0x0301 && defined(HAVE_JIT_AVX_SUPPORT))
-#  define USE_MCJIT 0
-#  define HAVE_AVX 1
-#elif HAVE_LLVM == 0x0301 && (defined(PIPE_OS_LINUX) || defined(PIPE_OS_APPLE))
-#  define USE_MCJIT 1
-#  define HAVE_AVX 1
-#else
-#  define USE_MCJIT 0
-#  define HAVE_AVX 0
-#endif
-
-#endif /* HAVE_LLVM >= 0x0306 */
-
 #if USE_MCJIT
 void LLVMLinkInMCJIT();
 #endif
@@ -219,10 +188,8 @@ gallivm_free_code(struct gallivm_state *gallivm)
assert(!gallivm->engine);
lp_free_generated_code(gallivm->code);
gallivm->code = NULL;
-#if HAVE_LLVM < 0x0306
-   LLVMDisposeMCJITMemoryManager(gallivm->memorymgr);
+   lp_free_memory_manager(gallivm->memorymgr);
gallivm->memorymgr = NULL;
-#endif
 }
 
 
@@ -317,7 +284,7 @@ init_gallivm_state(struct gallivm_state *gallivm, const char *name,
if (!gallivm->builder)
   goto fail;
 
-#if HAVE_LLVM < 0x0306
+#if USE_MCJIT && HAVE_LLVM < 0x0306
gallivm->memorymgr = lp_get_default_memory_manager();
if (!gallivm->memorymgr)
   goto fail

Re: [Mesa-dev] [PATCH 2/2] galahad: fix indirect draw

2014-09-30 Thread Jose Fonseca

Series looks good. Thanks for looking into this Roland.

It looks like nobody else is using galahad or looking at the warnings.  I wonder 
if it makes sense to keep using/updating it.

Jose


From: srol...@vmware.com 
Sent: 30 September 2014 19:07
To: Jose Fonseca; mesa-dev@lists.freedesktop.org
Cc: Roland Scheidegger
Subject: [PATCH 2/2] galahad: fix indirect draw

From: Roland Scheidegger 

Need to unwrap the indirect resource otherwise bad things will happen.

Fixes random crashes and timeouts with piglit's arb_indirect_draw tests.
---
 src/gallium/drivers/galahad/glhd_context.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/galahad/glhd_context.c b/src/gallium/drivers/galahad/glhd_context.c
index 79d5495..37ea170 100644
--- a/src/gallium/drivers/galahad/glhd_context.c
+++ b/src/gallium/drivers/galahad/glhd_context.c
@@ -49,7 +49,7 @@ galahad_context_destroy(struct pipe_context *_pipe)

 static void
 galahad_context_draw_vbo(struct pipe_context *_pipe,
- const struct pipe_draw_info *info)
+ const struct pipe_draw_info *info)
 {
struct galahad_context *glhd_pipe = galahad_context(_pipe);
struct pipe_context *pipe = glhd_pipe->pipe;
@@ -58,7 +58,14 @@ galahad_context_draw_vbo(struct pipe_context *_pipe,
 * before drawing.
 */

-   pipe->draw_vbo(pipe, info);
+   if (info->indirect) {
+  struct pipe_draw_info info_unwrapped = *info;
+  info_unwrapped.indirect = galahad_resource_unwrap(info->indirect);
+  pipe->draw_vbo(pipe, &info_unwrapped);
+   }
+   else {
+  pipe->draw_vbo(pipe, info);
+   }
 }

 static struct pipe_query *
--
1.9.1
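
The unwrap-before-forwarding pattern in the patch above can be sketched in isolation. This is a toy model, not the galahad code: all toy_* names and struct layouts are hypothetical. It only shows the idea of copying the const draw info, replacing the wrapped resource pointer with the real one, and passing the copy down, while leaving the caller's struct untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper resource: a debug layer wraps the driver's object. */
struct toy_resource {
   struct toy_resource *wrapped;   /* real resource, NULL for a real one */
   int id;
};

struct toy_draw_info {
   unsigned count;
   struct toy_resource *indirect;  /* may be NULL for direct draws */
};

static struct toy_resource *
toy_unwrap(struct toy_resource *res)
{
   return res ? res->wrapped : NULL;
}

/* Stand-in for the real driver: reports which resource it was handed. */
static int
toy_driver_draw(const struct toy_draw_info *info)
{
   return info->indirect ? info->indirect->id : -1;
}

/* The layer's draw entry point: unwrap on a local copy, then forward. */
static int
toy_layer_draw(const struct toy_draw_info *info)
{
   if (info->indirect) {
      struct toy_draw_info unwrapped = *info;
      unwrapped.indirect = toy_unwrap(info->indirect);
      return toy_driver_draw(&unwrapped);
   }
   return toy_driver_draw(info);
}
```

Copying the struct rather than modifying it in place matters because the info pointer is const and still belongs to the caller.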


Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2

2014-09-30 Thread Brian Paul


On 09/30/2014 02:40 PM, Mathias Fröhlich wrote:
>
> Hi Brian,
>
> On Tuesday, September 30, 2014 13:30:21 Brian Paul wrote:
>> Move the USE_MCJIT / HAVE_AVX determination logic to lp_bld.h.  If we
>> don't have MCJIT define a dummy LLVMMCJITMemoryManagerRef type to avoid
>> excessive #ifdef testing elsewhere.
> [...]
>> @@ -219,7 +188,7 @@ gallivm_free_code(struct gallivm_state *gallivm)
>>  assert(!gallivm->engine);
>>  lp_free_generated_code(gallivm->code);
>>  gallivm->code = NULL;
>> -#if HAVE_LLVM < 0x0306
>> +#if USE_MCJIT
> We will probably still need the < 0x0306 check:
> #if HAVE_LLVM < 0x0306 && USE_MCJIT
> since this memorymanager stuff just vanished in the way 3.5 implemented
> this with version 3.6.
>
>>  LLVMDisposeMCJITMemoryManager(gallivm->memorymgr);
>>  gallivm->memorymgr = NULL;
>>   #endif
>>
>
> Also, we will probably fail to compile the LLVMDisposeMCJITMemoryManager
> call under some configurations with MCJIT and older llvm.
> So, additionally to what you had, how about the attached one?
> I am still trying to verify this change against 3.5 and 3.6.
> I am not sure about 3.2 since it did not build out of the box with
> my configure line.


It compiles, but I get a segfault when I try to run anything:


Program received signal SIGSEGV, Segmentation fault.
0x7797a579 in DelegatingJITMemoryManager::setMemoryWritable (this=0x6f5cb0) at gallivm/lp_bld_misc.cpp:165
165  mgr()->setMemoryWritable();
(gdb) where
#0  0x7797a579 in DelegatingJITMemoryManager::setMemoryWritable (this=0x6f5cb0) at gallivm/lp_bld_misc.cpp:165
#1  0x751579cc in (anonymous namespace)::JITEmitter::startFunction (this=0x727b20, F=...) at /build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JITEmitter.cpp:782
#2  0x7567e595 in (anonymous namespace)::Emitter::runOnMachineFunction (this=0x75dd50, MF=...) at /build/buildd/llvm-3.2-3.2/lib/Target/X86/X86CodeEmitter.cpp:145
#3  0x7501bbbf in runOnFunction (F=..., this=0x727980) at /build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1498
#4  llvm::FPPassManager::runOnFunction (this=0x727980, F=...) at /build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1476
#5  0x7501f2bb in llvm::FunctionPassManagerImpl::run (this=0x6fc950, F=...) at /build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1449
#6  0x7501f396 in llvm::FunctionPassManager::run (this=0x6f5d20, F=...) at /build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1379
#7  0x7514e637 in llvm::JIT::jitTheFunction (this=this@entry=0x6fc800, F=F@entry=0x769720, locked=...) at /build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JIT.cpp:645
#8  0x7514ec2f in llvm::JIT::runJITOnFunctionUnlocked (this=this@entry=0x6fc800, F=F@entry=0x769720, locked=...) at /build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JIT.cpp:624
#9  0x7514ed89 in llvm::JIT::getPointerToFunction (this=0x6fc800, F=0x769720) at /build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JIT.cpp:681
#10 0x77941c04 in gallivm_jit_function (gallivm=0x6f5a90, func=0x769720) at gallivm/lp_bld_init.c:586
#11 0x779b54d2 in generate_variant (lp=0x61f750, shader=0x6fdd10, key=0x7fffd9a0) at lp_state_fs.c:2634
#12 0x779b6a77 in llvmpipe_update_fs (lp=0x61f750) at lp_state_fs.c:3166
#13 0x779ac7bb in llvmpipe_update_derived (llvmpipe=0x61f750) at lp_state_derived.c:186
#14 0x77984562 in llvmpipe_draw_vbo (pipe=0x61f750, info=0x7fffdcc0) at lp_draw_arrays.c:70
#15 0x7785c1d3 in cso_draw_vbo (cso=0x6b9260, info=0x7fffdcc0) at cso_cache/cso_context.c:1418
#16 0x7771a373 in st_draw_vbo (ctx=0x77ec4010, prims=0x6ab7c0, nr_prims=2, ib=0x0, index_bounds_valid=1 '\001', min_index=0, max_index=161, tfb_vertcount=0x0, indirect=0x0) at ../../src/mesa/state_tracker/st_draw.c:285
#17 0x776f7e3f in vbo_save_playback_vertex_list (ctx=0x77ec4010, data=0x6ab3ec) at ../../src/mesa/vbo/vbo_save_draw.c:310
#18 0x77524e18 in ext_opcode_execute (ctx=0x77ec4010, node=0x6ab3e8) at ../../src/mesa/main/dlist.c:658
#19 0x7753b5db in execute_list (ctx=0x77ec4010, list=1) at ../../src/mesa/main/dlist.c:7692
#20 0x77541f2b in _mesa_CallList (list=1) at ../../src/mesa/main/dlist.c:9121
#21 0x00402e2d in draw () at gears.c:196
#22 0x76ee1376 in processWindowWorkList (window=0x61b180) at glut_event.c:1307
#23 0x76ee232c in __glutProcessWindowWorkLists () at glut_event.c:1358
#24 glutMainLoop () at glut_event.c:1379
#25 0x004036ea in main (argc=1, argv=0x7fffe658) at gears.c:405

-Brian


Re: [Mesa-dev] [RFC PATCH 05/56] mesa/main: Add tessellation shader state and limits

2014-09-30 Thread Ian Romanick

On 09/30/2014 11:24 AM, Matt Turner wrote:
> On Tue, Sep 30, 2014 at 8:50 AM, Ian Romanick  wrote:
>> On 09/20/2014 07:41 PM, Matt Turner wrote:
>>> On Sat, Sep 20, 2014 at 6:40 PM, Chris Forbes  wrote:
>>>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
>>>> index 79d2e94..c11ad4f 100644
>>>> --- a/src/mesa/main/shaderapi.c
>>>> +++ b/src/mesa/main/shaderapi.c
>>>> @@ -105,6 +105,7 @@ _mesa_get_shader_flags(void)
>>>>  void
>>>>  _mesa_init_shader_state(struct gl_context *ctx)
>>>>  {
>>>> +   int i;
>>>
>>> In context, this declaration looks odd. Move it below the two just
>>> after this hunk?
>>
>> Not in core Mesa where we have to do dumb ol' C89. :(
> 
> Move it after the other two variable declarations...
> 
>/* Device drivers may override these to control what kind of instructions
> * are generated by the GLSL compiler.
> */
>struct gl_shader_compiler_options options;
>gl_shader_stage sh;

Oh... yeah, that's fine.  I misunderstood you before.
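
For readers unfamiliar with the C89 constraint behind this exchange, a small sketch follows. In C89, every declaration in a block has to come before the first statement, so a loop counter cannot be declared at its point of use; it can only be reordered among the other declarations at the top. The function below is a made-up example and is unrelated to shaderapi.c.

```c
#include <assert.h>

/*
 * C89 style: all declarations are grouped at the start of the block,
 * ahead of any statement.  Writing "int i;" just before the loop, or
 * "for (int i = 0; ...)", would need C99 or later.
 */
static int
sum_below(int n)
{
   int total;
   int i;

   total = 0;
   for (i = 0; i < n; i++)
      total += i;

   return total;
}
```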

Re: [Mesa-dev] [PATCH 1/3] i965/vec4: Call opt_algebraic after opt_cse.

2014-09-30 Thread Anuj Phogat

On Sat, Sep 27, 2014 at 12:12 PM, Matt Turner  wrote:
> The next patch adds an algebraic optimization for the pattern
>
>sqrt a, b
>rcp  c, a
>
> and turns it into
>
>sqrt a, b
>rsq  c, b
>
> but many vertex shaders do
>
>a = sqrt(b);
>var1 /= a;
>var2 /= a;
>
> which generates
>
>sqrt a, b
>rcp  c, a
>rcp  d, a
>
> If we apply the algebraic optimization before CSE, we'll end up with
>
>sqrt a, b
>rsq  c, b
>rcp  d, a
>
> Applying CSE combines the RCP instructions, preventing this from
> happening.
>
> No shader-db changes.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 022ed37..e0a3d5f 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1790,8 +1790,8 @@ vec4_visitor::run()
>OPT(dead_code_eliminate);
>OPT(dead_control_flow_eliminate, this);
>OPT(opt_copy_propagation);
> -  OPT(opt_algebraic);
>OPT(opt_cse);
> +  OPT(opt_algebraic);
>OPT(opt_register_coalesce);
> } while (progress);
>
> --
> 1.8.5.5
>


For the series:
Reviewed-by: Anuj Phogat 

Re: [Mesa-dev] [PATCH 1/3] i965/vec4: Call opt_algebraic after opt_cse.

2014-09-30 Thread Ian Romanick

On 09/27/2014 12:12 PM, Matt Turner wrote:
> The next patch adds an algebraic optimization for the pattern
> 
>sqrt a, b
>rcp  c, a
> 
> and turns it into
> 
>sqrt a, b
>rsq  c, b
> 
> but many vertex shaders do
> 
>a = sqrt(b);
>var1 /= a;
>var2 /= a;
> 
> which generates
> 
>sqrt a, b
>rcp  c, a
>rcp  d, a
> 
> If we apply the algebraic optimization before CSE, we'll end up with
> 
>sqrt a, b
>rsq  c, b
>rcp  d, a

Why doesn't a second pass through opt_algebraic turn this into

   rsq  c, b
   rsq  d, b

Seems like this could cause us to miss other optimization opportunities...

> Applying CSE combines the RCP instructions, preventing this from
> happening.
> 
> No shader-db changes.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 022ed37..e0a3d5f 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1790,8 +1790,8 @@ vec4_visitor::run()
>OPT(dead_code_eliminate);
>OPT(dead_control_flow_eliminate, this);
>OPT(opt_copy_propagation);
> -  OPT(opt_algebraic);
>OPT(opt_cse);
> +  OPT(opt_algebraic);
>OPT(opt_register_coalesce);
> } while (progress);
>  


Re: [Mesa-dev] [PATCH 1/3] i965/vec4: Call opt_algebraic after opt_cse.

2014-09-30 Thread Matt Turner

On Tue, Sep 30, 2014 at 2:10 PM, Ian Romanick  wrote:
> On 09/27/2014 12:12 PM, Matt Turner wrote:
>> The next patch adds an algebraic optimization for the pattern
>>
>>sqrt a, b
>>rcp  c, a
>>
>> and turns it into
>>
>>sqrt a, b
>>rsq  c, b
>>
>> but many vertex shaders do
>>
>>a = sqrt(b);
>>var1 /= a;
>>var2 /= a;
>>
>> which generates
>>
>>sqrt a, b
>>rcp  c, a
>>rcp  d, a
>>
>> If we apply the algebraic optimization before CSE, we'll end up with
>>
>>sqrt a, b
>>rsq  c, b
>>rcp  d, a
>
> Why doesn't a second pass through opt_algebraic turn this into

Because the addition in patch #2 just recognizes a consecutive sqrt+rcp pattern.

>rsq  c, b
>rsq  d, b
>
> Seems like this could cause us to miss other optimization opportunities...

This seems pretty sufficient for the collection of shaders in
shader-db -- no regressions, cuts vec4 instructions, and handles 410
sqrt+rcp pairs.
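
The ordering effect discussed in this thread can be reproduced with a toy model. The sketch below is not the i965 code: the toy_* passes are deliberately simplified stand-ins written in C, the peephole only matches an rcp that directly follows its sqrt (as described above for patch #2), and the CSE pass ignores register redefinition. It runs the three-instruction example from the commit message in both orders and counts the rcp instructions left over.

```c
#include <assert.h>

enum toy_op { OP_SQRT, OP_RCP, OP_RSQ, OP_MOV };

struct toy_insn {
   enum toy_op op;
   char dst;
   char src;
};

/* Peephole: "sqrt a, b; rcp c, a" becomes "sqrt a, b; rsq c, b".
 * Only an rcp immediately after the sqrt is recognized. */
static void
toy_opt_algebraic(struct toy_insn *prog, int n)
{
   int i;

   for (i = 0; i + 1 < n; i++) {
      if (prog[i].op == OP_SQRT && prog[i + 1].op == OP_RCP &&
          prog[i + 1].src == prog[i].dst) {
         prog[i + 1].op = OP_RSQ;
         prog[i + 1].src = prog[i].src;
      }
   }
}

/* CSE: an instruction repeating an earlier (op, src) pair is replaced
 * by a MOV from the earlier destination. */
static void
toy_opt_cse(struct toy_insn *prog, int n)
{
   int i, j;

   for (i = 1; i < n; i++) {
      if (prog[i].op == OP_MOV)
         continue;
      for (j = 0; j < i; j++) {
         if (prog[j].op == prog[i].op && prog[j].src == prog[i].src) {
            prog[i].op = OP_MOV;
            prog[i].src = prog[j].dst;
            break;
         }
      }
   }
}

/* Runs both passes on "sqrt a,b; rcp c,a; rcp d,a" and returns how
 * many rcp instructions remain. */
static int
toy_rcp_left(int cse_first)
{
   struct toy_insn prog[3] = {
      { OP_SQRT, 'a', 'b' },
      { OP_RCP,  'c', 'a' },
      { OP_RCP,  'd', 'a' },
   };
   int i, count = 0;

   if (cse_first) {
      toy_opt_cse(prog, 3);
      toy_opt_algebraic(prog, 3);
   } else {
      toy_opt_algebraic(prog, 3);
      toy_opt_cse(prog, 3);
   }

   for (i = 0; i < 3; i++)
      if (prog[i].op == OP_RCP)
         count++;
   return count;
}
```

In this model, running the algebraic pass first leaves one rcp behind (the second rcp is no longer adjacent to a matching pattern and no longer shares an expression with the rsq), while running CSE first leaves none.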

Re: [Mesa-dev] [PATCH 1/5] i965/fs: Extend predicated break pass to predicate WHILE.

2014-09-30 Thread Matt Turner

On Mon, Sep 8, 2014 at 12:21 PM, Matt Turner  wrote:
> Helps a handful of programs in Serious Sam 3 that use do-while loops.
>
> instructions in affected programs: 16114 -> 16075 (-0.24%)
> ---

How about a review?

Re: [Mesa-dev] [PATCH 4/6] i965/fs: Implement SIMD16 integer multiplies on Gen 7.

2014-09-30 Thread Ian Romanick

On 09/28/2014 01:26 PM, Matt Turner wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 22 --
>  1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index e1f5735..e6c34fa 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -634,14 +634,24 @@ fs_visitor::visit(ir_expression *ir)
>  else
> emit(MUL(this->result, op[0], op[1]));
>   } else {
> -if (brw->gen >= 7)
> -   no16("SIMD16 explicit accumulator operands unsupported\n");
> -
>  struct brw_reg acc = retype(brw_acc_reg(), this->result.type);
>  
> -emit(MUL(acc, op[0], op[1]));
> -emit(MACH(reg_null_d, op[0], op[1]));
> -emit(MOV(this->result, fs_reg(acc)));
> +fs_inst *mul = emit(MUL(acc, op[0], op[1]));
> +fs_inst *mach = emit(MACH(reg_null_d, op[0], op[1]));
> +fs_inst *mov = emit(MOV(this->result, fs_reg(acc)));
> +
> +if (brw->gen == 7 && dispatch_width == 16) {
> +   mul->force_uncompressed = true;
> +   mach->force_uncompressed = true;
> +   mov->force_uncompressed = true;
> +
> +   mul = emit(MUL(acc, half(op[0], 1), half(op[1], 1)));
> +   mul->force_sechalf = true;
> +   mach = emit(MACH(reg_null_d, half(op[0], 1), half(op[1], 1)));
> +   mach->force_sechalf = true;
> +   mov = emit(MOV(half(this->result, 1), fs_reg(acc)));
> +   mov->force_sechalf = true;
> +}

Are there a bunch of cases where we double emit things for SIMD16?
Would it make more sense to have a generic function that took a list of
instructions, duplicated them, and did the force_uncompressed /
force_sechalf modification?

>   }
>} else {
>emit(MUL(this->result, op[0], op[1]));
> 
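
As a rough illustration of the generic helper suggested above, here is a sketch in C on toy data. It is not fs_visitor code: struct toy_inst, its half field, and toy_split_simd16 are hypothetical. The helper marks each instruction in a list as force_uncompressed, then appends a copy of the whole list that targets the second half with force_sechalf set, which matches the emission order in the patch (mul, mach, mov for the first half, then the same three for the second half).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified instruction record. */
struct toy_inst {
   int opcode;
   int half;                /* 0 = first 8 channels, 1 = second 8 */
   int force_uncompressed;
   int force_sechalf;
};

/*
 * buf holds n instructions and has room for 2 * n.  The originals
 * become the first-half versions; copies for the second half are
 * appended after them.  Returns the new instruction count.
 */
static size_t
toy_split_simd16(struct toy_inst *buf, size_t n)
{
   size_t i;

   for (i = 0; i < n; i++) {
      buf[i].force_uncompressed = 1;

      buf[n + i] = buf[i];
      buf[n + i].half = 1;
      buf[n + i].force_uncompressed = 0;
      buf[n + i].force_sechalf = 1;
   }
   return 2 * n;
}
```

Whether such a helper pays off depends on how many other places emit the same first-half/second-half pairs, which is the open question in the review.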


Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2

2014-09-30 Thread Jose Fonseca

Brian,

Your patch looks good AFAICT.

Not sure why the crash, and I'm afraid I won't have time to look into it.

I think it might help to '#define USE_MCJIT 1' for now, ie, enable MCJIT for 
all LLVM versions.  We were avoiding it on old LLVM versions, but AFAICT 
there's no longer any reason to avoid it now, and it might simplify getting 
things working again.

If things still don't work, then I think we should revert the recent LLVM 
changes and move them into a branch so we can investigate the issues with old 
LLVM more carefully without blocking builds/tests on master.

Jose 



From: Brian Paul 
Sent: 30 September 2014 21:47
To: Mathias Fröhlich; mesa-dev@lists.freedesktop.org
Cc: Jose Fonseca
Subject: Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2

On 09/30/2014 02:40 PM, Mathias Fröhlich wrote:
>
> Hi Brian,
>
> On Tuesday, September 30, 2014 13:30:21 Brian Paul wrote:
>> Move the USE_MCJIT / HAVE_AVX determination logic to lp_bld.h.  If we
>> don't have MCJIT define a dummy LLVMMCJITMemoryManagerRef type to avoid
>> excessive #ifdef testing elsewhere.
> [...]
>> @@ -219,7 +188,7 @@ gallivm_free_code(struct gallivm_state *gallivm)
>>  assert(!gallivm->engine);
>>  lp_free_generated_code(gallivm->code);
>>  gallivm->code = NULL;
>> -#if HAVE_LLVM < 0x0306
>> +#if USE_MCJIT
> We will probably still need the < 0x0306 check:
> #if HAVE_LLVM < 0x0306 && USE_MCJIT
> since this memorymanager stuff just vanished in the way 3.5 implemented
> this with version 3.6.
>
>>  LLVMDisposeMCJITMemoryManager(gallivm->memorymgr);
>>  gallivm->memorymgr = NULL;
>>   #endif
>>
>
> Also, we will probably fail to compile the LLVMDisposeMCJITMemoryManager
> call under some configurations with MCJIT and older llvm.
> So, additionally to what you had, how about the attached one?
> I am still trying to verify this change against 3.5 and 3.6.
> I am not sure about 3.2 since it did not build out of the box with
> my configure line.

It compiles, but I get a segfault when I try to run anything:


Program received signal SIGSEGV, Segmentation fault.
0x7797a579 in DelegatingJITMemoryManager::setMemoryWritable
(this=0x6f5cb0) at gallivm/lp_bld_misc.cpp:165
165  mgr()->setMemoryWritable();
(gdb) where
#0  0x7797a579 in DelegatingJITMemoryManager::setMemoryWritable
(this=0x6f5cb0) at gallivm/lp_bld_misc.cpp:165
#1  0x751579cc in (anonymous
namespace)::JITEmitter::startFunction (this=0x727b20, F=...) at
/build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JITEmitter.cpp:782
#2  0x7567e595 in (anonymous
namespace)::Emitter::runOnMachineFunction
(this=0x75dd50, MF=...) at
/build/buildd/llvm-3.2-3.2/lib/Target/X86/X86CodeEmitter.cpp:145
#3  0x7501bbbf in runOnFunction (F=..., this=0x727980) at
/build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1498
#4  llvm::FPPassManager::runOnFunction (this=0x727980, F=...) at
/build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1476
#5  0x7501f2bb in llvm::FunctionPassManagerImpl::run
(this=0x6fc950, F=...) at
/build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1449
#6  0x7501f396 in llvm::FunctionPassManager::run (this=0x6f5d20,
F=...) at /build/buildd/llvm-3.2-3.2/lib/VMCore/PassManager.cpp:1379
#7  0x7514e637 in llvm::JIT::jitTheFunction
(this=this@entry=0x6fc800, F=F@entry=0x769720, locked=...) at
/build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JIT.cpp:645
#8  0x7514ec2f in llvm::JIT::runJITOnFunctionUnlocked
(this=this@entry=0x6fc800, F=F@entry=0x769720, locked=...) at
/build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JIT.cpp:624
#9  0x7514ed89 in llvm::JIT::getPointerToFunction
(this=0x6fc800, F=0x769720) at
/build/buildd/llvm-3.2-3.2/lib/ExecutionEngine/JIT/JIT.cpp:681
#10 0x77941c04 in gallivm_jit_function (gallivm=0x6f5a90,
func=0x769720) at gallivm/lp_bld_init.c:586
#11 0x779b54d2 in generate_variant (lp=0x61f750,
shader=0x6fdd10, key=0x7fffd9a0) at lp_state_fs.c:2634
#12 0x779b6a77 in llvmpipe_update_fs (lp=0x61f750) at
lp_state_fs.c:3166
#13 0x779ac7bb in llvmpipe_update_derived (llvmpipe=0x61f750) at
lp_state_derived.c:186
#14 0x77984562 in llvmpipe_draw_vbo (pipe=0x61f750,
info=0x7fffdcc0) at lp_draw_arrays.c:70
#15 0x7785c1d3 in cso_draw_vbo (cso=0x6b9260,
info=0x7fffdcc0) at cso_cache/cso_context.c:1418
#16 0x7771a373 in st_draw_vbo (ctx=0x77ec4010,
prims=0x6ab7c0, nr_prims=2, ib=0x0, index_bounds_valid=1 '\001',
min_index=0, max_index=161, tfb_vertcount=0x0, indirect=0x0) at
../../src/mesa/state_tracker/st_draw.c:285
#17 0x776f7e3f in vbo_save_playback_vertex_list
(ctx=0x77ec4010, data=0x6ab3ec) at
../../src/mesa/vbo/vbo_save_draw.c:310
#18 0x77524e18 in ext_opcode_execute (ctx=0x77ec4010,
node=0x6ab3e8) at ../../src/mesa/main/dlist.c:658
#19 0x7753b5db in execute_list (ctx=0x77ec4010, list=1) at
../../src/mesa/main/dli

Re: [Mesa-dev] [PATCH 0/6] i965/fs: ARB_gpu_shader5 operations SIMD16 support

2014-09-30 Thread Ian Romanick

The first 3 are

Reviewed-by: Ian Romanick 

I sent a question on patch 4 that may affect it and the others.

On 09/28/2014 01:26 PM, Matt Turner wrote:
> [PATCH 1/6] i965/fs: Set MUL source type to W/UW in 64-bit mul macro
> 
>Fixes 64-bit multiplies on Gen8.
> 
> [PATCH 2/6] i965/fs: Don't offset uniform registers in half().
> 
>Bug fix necessary for later patches.
> 
> [PATCH 3/6] i965/fs: Allow SIMD16 borrow/carry/64-bit multiply on Gen
> 
>Don't apply Gen7 restrictions to Gen8.
> 
> [PATCH 4/6] i965/fs: Implement SIMD16 integer multiplies on Gen 7.
> [PATCH 5/6] i965/fs: Implement SIMD16 64-bit integer multiplies on Gen 7.
> [PATCH 6/6] i965/fs: Implement SIMD16 carry/borrow on Gen 7.
> 
>Implements SIMD16 operations on Gen7.


Re: [Mesa-dev] [PATCH 1/3] i965/vec4: Call opt_algebraic after opt_cse.

2014-09-30 Thread Ian Romanick

On 09/30/2014 02:16 PM, Matt Turner wrote:
> On Tue, Sep 30, 2014 at 2:10 PM, Ian Romanick  wrote:
>> On 09/27/2014 12:12 PM, Matt Turner wrote:
>>> The next patch adds an algebraic optimization for the pattern
>>>
>>>sqrt a, b
>>>rcp  c, a
>>>
>>> and turns it into
>>>
>>>sqrt a, b
>>>rsq  c, b
>>>
>>> but many vertex shaders do
>>>
>>>a = sqrt(b);
>>>var1 /= a;
>>>var2 /= a;
>>>
>>> which generates
>>>
>>>sqrt a, b
>>>rcp  c, a
>>>rcp  d, a
>>>
>>> If we apply the algebraic optimization before CSE, we'll end up with
>>>
>>>sqrt a, b
>>>rsq  c, b
>>>rcp  d, a
>>
>> Why doesn't a second pass through opt_algebraic turn this into
> 
> Because the addition in patch #2 just recognizes a consecutive sqrt+rcp 
> pattern.

That makes sense.  Series is

Reviewed-by: Ian Romanick 

>>rsq  c, b
>>rsq  d, b
>>
>> Seems like this could cause us to miss other optimization opportunities...
> 
> This seems pretty sufficient for the collection of shaders in
> shader-db -- no regressions, cuts vec4 instructions, and handles 410
> sqrt+rcp pairs.


Re: [Mesa-dev] [PATCH 1/5] i965/fs: Extend predicated break pass to predicate WHILE.

2014-09-30 Thread Ian Romanick

On 09/25/2014 09:00 AM, Matt Turner wrote:
> On Thu, Sep 25, 2014 at 8:25 AM, Ian Romanick  wrote:
>> How did you test this?  Do we have piglit execution tests that actually
>> hit this path?  I'm assuming you didn't play Serious Sam 3 looking for
>> rendering errors. ;)
> 
> I wrote the patch and initially missed the necessary predicate_inverse
> bit, and saw a bunch of piglit failures (hangs I think?). I checked
> the docs again and realized I needed to flip it, and then all of
> piglit passed.

Heh... okay.

Reviewed-by: Ian Romanick 


Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2

2014-09-30 Thread Mathias Fröhlich


Hi,

On Tuesday, September 30, 2014 21:25:54 Jose Fonseca wrote:
> Your patch looks good AFAICT.
> 
> Not sure why the crash, and I'm afraid I won't have time to look into it.
I am currently looking into that.

> I think it might help to '#define USE_MCJIT 1' for now, ie, enable MCJIT for 
> all LLVM versions.  We were avoiding it on old LLVM versions, but AFAICT 
> there's no longer any reason to avoid it now, and it might simplify getting 
> things working again.
> 
> If things still don't work, then I think we should revert the recent LLVM 
> changes and move them into a branch so we can investigate the issues with old 
> LLVM more carefully without blocking builds/tests on master.

Reverting

commit d90ff351f3a3598834f77b9c0723532b3abd3cd5
Author: Mathias Fröhlich 
Date:   Sun Jul 13 12:49:41 2014 +0200

llvmpipe: Make a llvmpipe OpenGL context thread safe.

is probably enough.
I will just do so if I can't find the reason for the crash now.

Greetings

Mathias

[Mesa-dev] [PATCH] util: add u_lowering

2014-09-30 Thread Rob Clark

From: Rob Clark 

TGSI->TGSI pass, extracted from freedreno.  Currently provides the
following lower support, to help drivers emulate unsupported opcodes
or features:

Individual opcodes:
  DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4, DP3, DPH,
  DP2, DP2A

Also supported, although it is up to the driver to manage its own
shader variants:
 + two-sided-color
 + texture coord saturate (ie. to emulate GL_CLAMP)

All of the lowering operations are opt-in so a driver can pick and
choose what it wants.

Signed-off-by: Rob Clark 
---
 src/gallium/auxiliary/Makefile.sources  |1 +
 src/gallium/auxiliary/util/u_lowering.c | 1571 +++
 src/gallium/auxiliary/util/u_lowering.h |   87 ++
 3 files changed, 1659 insertions(+)
 create mode 100644 src/gallium/auxiliary/util/u_lowering.c
 create mode 100644 src/gallium/auxiliary/util/u_lowering.h

diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources
index 58d8af7..575c315 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -125,6 +125,7 @@ C_SOURCES := \
util/u_keymap.c \
util/u_linear.c \
util/u_linkage.c \
+   util/u_lowering.c \
util/u_network.c \
util/u_math.c \
util/u_mm.c \
diff --git a/src/gallium/auxiliary/util/u_lowering.c b/src/gallium/auxiliary/util/u_lowering.c
new file mode 100644
index 000..fd193bc
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_lowering.c
@@ -0,0 +1,1571 @@
+/*
+ * Copyright (C) 2014 Red Hat
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ * Authors:
+ *Rob Clark 
+ */
+
+#include "tgsi/tgsi_transform.h"
+#include "tgsi/tgsi_scan.h"
+#include "tgsi/tgsi_dump.h"
+
+#include "util/u_debug.h"
+#include "util/u_math.h"
+#include "util/u_lowering.h"
+
+struct u_lowering_context {
+   struct tgsi_transform_context base;
+   const struct u_lowering_config *config;
+   struct tgsi_shader_info *info;
+   unsigned two_side_colors;
+   unsigned two_side_idx[PIPE_MAX_SHADER_INPUTS];
+   unsigned color_base; /* base register for chosen COLOR/BCOLOR's */
+   int face_idx;
+   unsigned numtmp;
+   struct {
+  struct tgsi_full_src_register src;
+  struct tgsi_full_dst_register dst;
+   } tmp[2];
+#define A 0
+#define B 1
+   struct tgsi_full_src_register imm;
+   int emitted_decls;
+   unsigned saturate;
+};
+
+static inline struct u_lowering_context *
+u_lowering_context(struct tgsi_transform_context *tctx)
+{
+   return (struct u_lowering_context *) tctx;
+}
+
+/*
+ * Utility helpers:
+ */
+
+static void
+reg_dst(struct tgsi_full_dst_register *dst,
+const struct tgsi_full_dst_register *orig_dst, unsigned wrmask)
+{
+   *dst = *orig_dst;
+   dst->Register.WriteMask &= wrmask;
+   assert(dst->Register.WriteMask);
+}
+
+static inline void
+get_swiz(unsigned *swiz, const struct tgsi_src_register *src)
+{
+   swiz[0] = src->SwizzleX;
+   swiz[1] = src->SwizzleY;
+   swiz[2] = src->SwizzleZ;
+   swiz[3] = src->SwizzleW;
+}
+
+static void
+reg_src(struct tgsi_full_src_register *src,
+const struct tgsi_full_src_register *orig_src,
+unsigned sx, unsigned sy, unsigned sz, unsigned sw)
+{
+   unsigned swiz[4];
+   get_swiz(swiz, &orig_src->Register);
+   *src = *orig_src;
+   src->Register.SwizzleX = swiz[sx];
+   src->Register.SwizzleY = swiz[sy];
+   src->Register.SwizzleZ = swiz[sz];
+   src->Register.SwizzleW = swiz[sw];
+}
+
+#define TGSI_SWIZZLE__ TGSI_SWIZZLE_X   /* don't-care value! */
+#define SWIZ(x,y,z,w) TGSI_SWIZZLE_ ## x, TGSI_SWIZZLE_ ## y, \
+   TGSI_SWIZZLE_ ## z, TGSI_SWIZZLE_ ## w
+
+/*
+ * if (dst.x aliases src.x) {
+ *   MOV tmpA.x, src.x
+ *   src = tmpA
+ * }
+ * COS dst.x, src.x
+ * SIN dst.y, src.x
+ * MOV dst.zw, imm{0.0, 1.0}
+ */
+static bool
+aliases(const struct tgsi_full_dst_register *dst, unsigned dst_mask,
+const struct tgsi_

Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2

2014-09-30 Thread Brian Paul


On 09/30/2014 03:34 PM, Mathias Fröhlich wrote:


Hi,

On Tuesday, September 30, 2014 21:25:54 Jose Fonseca wrote:

Your patch looks good AFAICT.

Not sure why the crash, and I'm afraid I won't have time to look into it.

I am currently looking into that.


I think it might help to '#define USE_MCJIT 1' for now, i.e., enable MCJIT for 
all LLVM versions.  We were avoiding it on old LLVM versions, but AFAICT 
there's no longer any reason to avoid it now, and it might simplify getting things 
working again.

If things still don't work, then I think we should revert the recent LLVM 
changes and move them into a branch so we can investigate the issues with old LLVM 
more carefully without blocking builds/tests on master.


Reverting

commit d90ff351f3a3598834f77b9c0723532b3abd3cd5
Author: Mathias Fröhlich 
Date:   Sun Jul 13 12:49:41 2014 +0200

 llvmpipe: Make a llvmpipe OpenGL context thread safe.

is probably enough.
I will just do so if I can't find the reason for the crash now.


Yeah, reverting that patch clears up the regression here.  Please go 
ahead and do the revert if you don't think you can solve the problem 
otherwise.  Thanks!


-Brian


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2

2014-09-30 Thread Mathias Fröhlich


Hi Brian,

On Tuesday, September 30, 2014 15:42:21 Brian Paul wrote:
> Yeah, reverting that patch clears up the regression here.  Please go 
> ahead and do the revert if you don't think you can solve the problem 
> otherwise.  Thanks!

I could even reproduce the segfault with the previous patch and 3.5,
which is the one I am currently testing against.
Does the attached patch - based on master - also fix your problems
on 3.3?
... sorry, I am still iterating on a configuration that builds 3.2 or at least 
3.3.

Thanks for your patience!

Mathias

From 321aecdcebd9844568985e1064d6679e04cf6e2a Mon Sep 17 00:00:00 2001
Message-Id: <321aecdcebd9844568985e1064d6679e04cf6e2a.1412113717.git.mathias.froehl...@gmx.net>
From: =?UTF-8?q?Mathias=20Fr=C3=B6hlich?= 
Date: Tue, 30 Sep 2014 22:11:30 +0200
Subject: [PATCH] gallivm: fix build for LLVM 3.2

---
 src/gallium/auxiliary/gallivm/lp_bld.h| 8 
 src/gallium/auxiliary/gallivm/lp_bld_init.c   | 4 +---
 src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 9 +
 src/gallium/auxiliary/gallivm/lp_bld_misc.h   | 3 +++
 4 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld.h b/src/gallium/auxiliary/gallivm/lp_bld.h
index fcf4f16..a01c216 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld.h
@@ -58,6 +58,14 @@
 #endif
 
 
+#if HAVE_LLVM <= 0x0303
+/* We won't actually use LLVMMCJITMemoryManagerRef, just create a dummy
+ * typedef to simplify things elsewhere.
+ */
+typedef void *LLVMMCJITMemoryManagerRef;
+#endif
+
+
 /**
  * Redefine these LLVM entrypoints as invalid macros to make sure we
  * don't accidentally use them.  We need to use the functions which
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index 4e4aecb..8d7a0b6 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -219,10 +219,8 @@ gallivm_free_code(struct gallivm_state *gallivm)
assert(!gallivm->engine);
lp_free_generated_code(gallivm->code);
gallivm->code = NULL;
-#if HAVE_LLVM < 0x0306
-   LLVMDisposeMCJITMemoryManager(gallivm->memorymgr);
+   lp_free_memory_manager(gallivm->memorymgr);
gallivm->memorymgr = NULL;
-#endif
 }
 
 
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
index c173ab6..9c2de2c 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
@@ -554,3 +554,12 @@ lp_get_default_memory_manager()
return 0;
 #endif
 }
+
+extern "C"
+void
+lp_free_memory_manager(LLVMMCJITMemoryManagerRef memorymgr)
+{
+#if HAVE_LLVM < 0x0306
+   delete reinterpret_cast(memorymgr);
+#endif
+}
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.h b/src/gallium/auxiliary/gallivm/lp_bld_misc.h
index 40d3e79..36923aa 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.h
@@ -65,6 +65,9 @@ lp_free_generated_code(struct lp_generated_code *code);
 extern LLVMMCJITMemoryManagerRef
 lp_get_default_memory_manager();
 
+extern void
+lp_free_memory_manager(LLVMMCJITMemoryManagerRef memorymgr);
+
 #ifdef __cplusplus
 }
 #endif
-- 
1.9.3


Re: [Mesa-dev] [PATCH] util: add u_lowering

2014-09-30 Thread Brian Paul


On 09/30/2014 03:38 PM, Rob Clark wrote:

From: Rob Clark 

TGSI->TGSI pass, extracted from freedreno.  Currently provides the
following lower support, to help drivers emulate unsupported opcodes
or features:

Individual opcodes:
   DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4, DP3, DPH,
   DP2, DP2A

Also supported, although it is up to the driver to manage its own
shader variants:
  + two-sided-color
  + texture coord saturate (ie. to emulate GL_CLAMP)

All of the lowering operations are opt-in so a driver can pick and
choose what it wants.

Signed-off-by: Rob Clark 


Hi Rob, a few thoughts:

How about moving this into the src/gallium/auxiliary/tgsi/ directory 
since this is a very TGSI-specific thing?


I think some of my recent changes to the tgsi_transform code would be 
helpful, like emit_epilog/prolog() and the various 
tgsi_transform_opX_inst() helpers.


There's a lot of tricky code trying to determine the exact size of the 
new shader token buffer.  Why not just use a 2x buffer and then realloc 
to the exact size at the end?
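
A rough sketch of the over-allocate-then-trim idea, not code from the patch. The token type, the function name `transform_overalloc`, and the toy expansion rule are all hypothetical, and the 2x bound only holds if the transform never emits more than two tokens per input token:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shader token; the real type is struct tgsi_token. */
typedef unsigned token_t;

/*
 * Write the transformed stream into a buffer sized at twice the input,
 * then shrink the buffer to what was actually written.
 * Returns NULL on allocation failure; *out_count receives the final count.
 */
static token_t *
transform_overalloc(const token_t *in, size_t count, size_t *out_count)
{
   size_t cap = count ? count * 2 : 1;
   token_t *out = malloc(cap * sizeof(*out));
   size_t n = 0, i;

   if (!out)
      return NULL;

   for (i = 0; i < count; i++) {
      /* Toy expansion rule (assumption): odd tokens become two tokens. */
      out[n++] = in[i];
      if (in[i] & 1)
         out[n++] = in[i] + 1;
      assert(n <= cap);
   }

   /* Trim to the exact size; keep the larger buffer if realloc fails. */
   if (n > 0) {
      token_t *exact = realloc(out, n * sizeof(*out));
      if (exact)
         out = exact;
   }

   *out_count = n;
   return out;
}
```

The caller owns the returned buffer and frees it with `free()`. The single `realloc` at the end replaces the up-front size calculation, at the cost of a temporarily larger allocation.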


I think the big if/switch statements could be replaced by some kind of 
table-driven system.
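
One possible shape for such a table, as a minimal sketch only. The opcode enum, the config struct, the callbacks, and the instruction counts they return are invented for illustration and do not match the real TGSI_OPCODE_* values or the patch's data structures:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcode numbering; real TGSI opcodes are TGSI_OPCODE_*. */
enum opcode { OP_MOV, OP_DST, OP_XPD, OP_SCS, OP_COUNT };

struct lower_config {
   unsigned lower[OP_COUNT];   /* non-zero: driver opted in to this lowering */
};

typedef int (*lower_fn)(void);

/* Toy callbacks: each returns a made-up count of emitted instructions. */
static int lower_dst(void) { return 4; }
static int lower_xpd(void) { return 2; }
static int lower_scs(void) { return 3; }

/* Opcodes without an entry stay NULL and are passed through. */
static const lower_fn lower_table[OP_COUNT] = {
   [OP_DST] = lower_dst,
   [OP_XPD] = lower_xpd,
   [OP_SCS] = lower_scs,
};

/* Returns how many instructions the (possibly lowered) opcode expands to. */
static int
lower_instruction(const struct lower_config *cfg, enum opcode op)
{
   assert(cfg);
   if (op < OP_COUNT && cfg->lower[op] && lower_table[op])
      return lower_table[op]();
   return 1;   /* not lowered: emit the instruction unchanged */
}
```

With this layout, adding a new lowering means adding one callback and one table entry rather than touching a switch in several places.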


FWIW, I have a two-sided lighting transformation function too that I 
haven't pushed to master yet.


-Brian



Re: [Mesa-dev] [PATCH 4/6] i965/fs: Implement SIMD16 integer multiplies on Gen 7.

2014-09-30 Thread Matt Turner

On Tue, Sep 30, 2014 at 2:26 PM, Ian Romanick  wrote:
> Are there a bunch of cases where we double emit things for SIMD16?
> Would it make more sense to have a generic function that took a list of
> instructions, duplicated them, and did the force_uncompressed /
> force_sechalf modification?

Not many. Other than these, the only other things are the 3-src
instructions on SNB+IVB, and BFI instructions on Haswell. In those
cases, we can just double emit instructions in the generator.

These (addc, subb, integer multiplies) are weird and have to be
handled in the visitor because they use the accumulator and on Gen7
the accumulator doesn't handle integer data in SIMD16.

I'm going to have to rebase the last three on Jason's changes though,
so I'll resend them.

Re: [Mesa-dev] [PATCH] util: add u_lowering

2014-09-30 Thread Eric Anholt

Rob Clark  writes:

> From: Rob Clark 
>
> TGSI->TGSI pass, extracted from freedreno.  Currently provides the
> following lower support, to help drivers emulate unsupported opcodes
> or features:
>
> Individual opcodes:
>   DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4, DP3, DPH,
>   DP2, DP2A
>
> Also supported, although it is up to the driver to manage its own
> shader variants:
>  + two-sided-color
>  + texture coord saturate (ie. to emulate GL_CLAMP)
>
> All of the lowering operations are opt-in so a driver can pick and
> choose what it wants.

This is very useful to me, as it got me +15 piglit tests, and -70 lines
of code.

I'm not using FRC because I can do it in just as many instructions as my
FLR, and I'm not using LRP because I'm doing the (c + a * (b - c))
version of things, but I'm using all the rest of those opcodes.
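
For reference, the two LRP forms are algebraically the same. A minimal sketch, assuming the TGSI definition dst = src0 * src1 + (1 - src0) * src2, with made-up function names and scalar floats standing in for one channel:

```c
#include <assert.h>

/* LRP as TGSI defines it: a * b + (1 - a) * c. */
static float
lrp_reference(float a, float b, float c)
{
   return a * b + (1.0f - a) * c;
}

/* The rearranged form mentioned above: c + a * (b - c). */
static float
lrp_rearranged(float a, float b, float c)
{
   return c + a * (b - c);
}
```

Expanding a * b + (1 - a) * c gives a * b + c - a * c, which factors to c + a * (b - c). In floating point the two can differ in the last bits for arbitrary inputs, so results are only guaranteed identical when every intermediate value is exactly representable.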

I'm not cloning the tokens for shader variants yet, so I'm still doing
my own texcoord clamping.  However, I do want to use the two-side-color
lowering, so I'll probably start cloning for variants, at which point
I'll use the texcoord bits, too.

However, I'd like to see the code first have a commit that's just a raw
copy, so that git log --follow can track where the code came from.  I've
pushed a branch called "robclark-lowering" to my mesa tree if maybe you
want to ack that starting commit?


Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2

2014-09-30 Thread Mathias Fröhlich


Brian,

at least here, I get a build that runs glxgears which
previously did not run with 3.3, 3.5.
Currently the compile test runs with 3.6.
If this succeeds, ok to push the attached fix
(the same as before but with a more descriptive commit message)?

Greetings
Mathias

From 39a8625423f85327eefdadd3d4068c9d3e26d936 Mon Sep 17 00:00:00 2001
Message-Id: <39a8625423f85327eefdadd3d4068c9d3e26d936.1412114843.git.mathias.froehl...@gmx.net>
From: =?UTF-8?q?Mathias=20Fr=C3=B6hlich?= 
Date: Tue, 30 Sep 2014 22:11:30 +0200
Subject: [PATCH] gallivm: Fix build for LLVM 3.2

Do not rely on LLVMMCJITMemoryManagerRef being available.
The C binding to the memory manager objects only appeared
in LLVM 3.4.
The change is based on an initial patch by Brian Paul.

Signed-off-by: Mathias Froehlich 
---
 src/gallium/auxiliary/gallivm/lp_bld.h| 8 
 src/gallium/auxiliary/gallivm/lp_bld_init.c   | 4 +---
 src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 9 +
 src/gallium/auxiliary/gallivm/lp_bld_misc.h   | 3 +++
 4 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld.h b/src/gallium/auxiliary/gallivm/lp_bld.h
index fcf4f16..a01c216 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld.h
@@ -58,6 +58,14 @@
 #endif
 
 
+#if HAVE_LLVM <= 0x0303
+/* We won't actually use LLVMMCJITMemoryManagerRef, just create a dummy
+ * typedef to simplify things elsewhere.
+ */
+typedef void *LLVMMCJITMemoryManagerRef;
+#endif
+
+
 /**
  * Redefine these LLVM entrypoints as invalid macros to make sure we
  * don't accidentally use them.  We need to use the functions which
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index 4e4aecb..8d7a0b6 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -219,10 +219,8 @@ gallivm_free_code(struct gallivm_state *gallivm)
assert(!gallivm->engine);
lp_free_generated_code(gallivm->code);
gallivm->code = NULL;
-#if HAVE_LLVM < 0x0306
-   LLVMDisposeMCJITMemoryManager(gallivm->memorymgr);
+   lp_free_memory_manager(gallivm->memorymgr);
gallivm->memorymgr = NULL;
-#endif
 }
 
 
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
index c173ab6..9c2de2c 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
@@ -554,3 +554,12 @@ lp_get_default_memory_manager()
return 0;
 #endif
 }
+
+extern "C"
+void
+lp_free_memory_manager(LLVMMCJITMemoryManagerRef memorymgr)
+{
+#if HAVE_LLVM < 0x0306
+   delete reinterpret_cast(memorymgr);
+#endif
+}
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.h b/src/gallium/auxiliary/gallivm/lp_bld_misc.h
index 40d3e79..36923aa 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.h
@@ -65,6 +65,9 @@ lp_free_generated_code(struct lp_generated_code *code);
 extern LLVMMCJITMemoryManagerRef
 lp_get_default_memory_manager();
 
+extern void
+lp_free_memory_manager(LLVMMCJITMemoryManagerRef memorymgr);
+
 #ifdef __cplusplus
 }
 #endif
-- 
1.9.3


Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2

2014-09-30 Thread Brian Paul


On 09/30/2014 04:17 PM, Mathias Fröhlich wrote:


Brian,

at least here, I get a build that runs glxgears which
previously did not run with 3.3, 3.5.
Currently the compile test runs with 3.6.
If this succeeds, ok to push the attached fix
(the same as before but with a more descriptive commit message)?


OK, this patch seems to fix everything here.  Thanks!

Reviewed-by: Brian Paul 
Tested-by: Brian Paul 



Re: [Mesa-dev] [PATCH] gallivm: fix build for LLVM 3.2

2014-09-30 Thread Roland Scheidegger

Am 01.10.2014 00:17, schrieb Mathias Fröhlich:
> 
> Brian,
> 
> at least here, I get a build that runs glxgears which
> previously did not run with 3.3, 3.5.
> Currently the compile test runs with 3.6.
> If this succeeds, ok to push the attached fix
> (the same as before but with a more descriptive commit message)?
> 
> Greetings
> Mathias
> 

Looks good to me too.
Reviewed-by: Roland Scheidegger 


Re: [Mesa-dev] [PATCH] util: add u_lowering

2014-09-30 Thread Eric Anholt

Brian Paul  writes:

> On 09/30/2014 03:38 PM, Rob Clark wrote:
>> From: Rob Clark 
>>
>> TGSI->TGSI pass, extracted from freedreno.  Currently provides the
>> following lower support, to help drivers emulate unsupported opcodes
>> or features:
>>
>> Individual opcodes:
>>DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4, DP3, DPH,
>>DP2, DP2A
>>
>> Also supported, although it is up to the driver to manage its own
>> shader variants:
>>   + two-sided-color
>>   + texture coord saturate (ie. to emulate GL_CLAMP)
>>
>> All of the lowering operations are opt-in so a driver can pick and
>> choose what it wants.
>>
>> Signed-off-by: Rob Clark 
>
> Hi Rob, a few thoughts:
>
> How about moving this into the src/gallium/auxiliary/tgsi/ directory 
> since this is a very TGSI-specific thing?

I happened to be writing a series in parallel with Rob to do the same
thing, and I chose tgsi/.  I've rebased my changes on his freedreno fix,
and I'm going to send them out in this thread now.  I don't really care
which location wins, having good history is all I care about (Also,
reviewing his changes, I found style improvements that I propagated to
mine.  So duplicated work ended up having some value).

> I think some of my recent changes to the tgsi_transform code would be 
> helpful, like emit_epilog/prolog() and the various 
> tgsi_transform_opX_inst() helpers.
>
> There's a lot of tricky code trying to determine the exact size of the 
> new shader token buffer.  Why not just use a 2x buffer and then realloc 
> to the exact size at the end?

2x isn't nearly the lowest growth factor, right?  How massively
overallocated do you go?

All this calculation of array sizes up front is pretty awful, though --
it seems like token buffers ought to be just growing as you append
tokens.  But I guess that would be a change to do to tgsi_transform in
general.
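
A minimal sketch of what append-and-grow could look like, assuming a plain dynamic array. The `token_buf` struct and its helpers are hypothetical and are not part of tgsi_transform:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shader token; the real type is struct tgsi_token. */
typedef unsigned token_t;

struct token_buf {
   token_t *tokens;
   size_t count;      /* tokens written so far */
   size_t capacity;   /* tokens the allocation can hold */
};

/* Append one token, doubling the allocation when it is full.
 * Returns 1 on success, 0 if the buffer could not be grown. */
static int
token_buf_append(struct token_buf *buf, token_t tok)
{
   assert(buf);
   if (buf->count == buf->capacity) {
      size_t new_cap = buf->capacity ? buf->capacity * 2 : 16;
      token_t *grown = realloc(buf->tokens, new_cap * sizeof(*grown));
      if (!grown)
         return 0;   /* old buffer is still valid and owned by buf */
      buf->tokens = grown;
      buf->capacity = new_cap;
   }
   buf->tokens[buf->count++] = tok;
   return 1;
}

/* Release the storage and reset the buffer to its empty state. */
static void
token_buf_free(struct token_buf *buf)
{
   free(buf->tokens);
   buf->tokens = NULL;
   buf->count = buf->capacity = 0;
}
```

Doubling keeps appends amortized constant time, and no lowering pass has to predict its output size before it runs.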

> I think the big if/switch statements could be replaced by some kind of 
> table-driven system.


[Mesa-dev] [PATCH 1/3] gallium: Copy fd_lowering.[ch] to tgsi_lowering.[ch] for code sharing.

2014-09-30 Thread Eric Anholt

Lots of drivers need to transform the weird instructions in TGSI into
reasonable scalar ops, and this code can make those translations
canonical.
---
 src/gallium/auxiliary/tgsi/tgsi_lowering.c | 1573 
 src/gallium/auxiliary/tgsi/tgsi_lowering.h |   89 ++
 2 files changed, 1662 insertions(+)
 create mode 100644 src/gallium/auxiliary/tgsi/tgsi_lowering.c
 create mode 100644 src/gallium/auxiliary/tgsi/tgsi_lowering.h

diff --git a/src/gallium/auxiliary/tgsi/tgsi_lowering.c b/src/gallium/auxiliary/tgsi/tgsi_lowering.c
new file mode 100644
index 0000000..795b537
--- /dev/null
+++ b/src/gallium/auxiliary/tgsi/tgsi_lowering.c
@@ -0,0 +1,1573 @@
+/* -*- mode: C; c-file-style: "k&r"; tab-width 4; indent-tabs-mode: t; -*- */
+
+/*
+ * Copyright (C) 2014 Rob Clark 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ * Authors:
+ *Rob Clark 
+ */
+
+#include "tgsi/tgsi_transform.h"
+#include "tgsi/tgsi_scan.h"
+#include "tgsi/tgsi_dump.h"
+
+#include "util/u_debug.h"
+#include "util/u_math.h"
+
+#include "freedreno_lowering.h"
+
+struct fd_lowering_context {
+   struct tgsi_transform_context base;
+   const struct fd_lowering_config *config;
+   struct tgsi_shader_info *info;
+   unsigned two_side_colors;
+   unsigned two_side_idx[PIPE_MAX_SHADER_INPUTS];
+   unsigned color_base;  /* base register for chosen COLOR/BCOLOR's */
+   int face_idx;
+   unsigned numtmp;
+   struct {
+   struct tgsi_full_src_register src;
+   struct tgsi_full_dst_register dst;
+   } tmp[2];
+#define A 0
+#define B 1
+   struct tgsi_full_src_register imm;
+   int emitted_decls;
+   unsigned saturate;
+};
+
+static inline struct fd_lowering_context *
+fd_lowering_context(struct tgsi_transform_context *tctx)
+{
+   return (struct fd_lowering_context *)tctx;
+}
+
+/*
+ * Utility helpers:
+ */
+
+static void
+reg_dst(struct tgsi_full_dst_register *dst,
+   const struct tgsi_full_dst_register *orig_dst, unsigned wrmask)
+{
+   *dst = *orig_dst;
+   dst->Register.WriteMask &= wrmask;
+   assert(dst->Register.WriteMask);
+}
+
+static inline void
+get_swiz(unsigned *swiz, const struct tgsi_src_register *src)
+{
+   swiz[0] = src->SwizzleX;
+   swiz[1] = src->SwizzleY;
+   swiz[2] = src->SwizzleZ;
+   swiz[3] = src->SwizzleW;
+}
+
+static void
+reg_src(struct tgsi_full_src_register *src,
+   const struct tgsi_full_src_register *orig_src,
+   unsigned sx, unsigned sy, unsigned sz, unsigned sw)
+{
+   unsigned swiz[4];
+   get_swiz(swiz, &orig_src->Register);
+   *src = *orig_src;
+   src->Register.SwizzleX = swiz[sx];
+   src->Register.SwizzleY = swiz[sy];
+   src->Register.SwizzleZ = swiz[sz];
+   src->Register.SwizzleW = swiz[sw];
+}
+
+#define TGSI_SWIZZLE__ TGSI_SWIZZLE_X  /* don't-care value! */
+#define SWIZ(x,y,z,w) TGSI_SWIZZLE_ ## x, TGSI_SWIZZLE_ ## y, \
+   TGSI_SWIZZLE_ ## z, TGSI_SWIZZLE_ ## w
+
+/*
+ * if (dst.x aliases src.x) {
+ *   MOV tmpA.x, src.x
+ *   src = tmpA
+ * }
+ * COS dst.x, src.x
+ * SIN dst.y, src.x
+ * MOV dst.zw, imm{0.0, 1.0}
+ */
+static bool
+aliases(const struct tgsi_full_dst_register *dst, unsigned dst_mask,
+   const struct tgsi_full_src_register *src, unsigned src_mask)
+{
+   if ((dst->Register.File == src->Register.File) &&
+   (dst->Register.Index == src->Register.Index)) {
+   unsigned i, actual_mask = 0;
+   unsigned swiz[4];
+   get_swiz(swiz, &src->Register);
+   for (i = 0; i < 4; i++)
+   if (src_mask & (1 << i))
+   actual_mask |= (1 << swiz[i]);
+   if (actual_mask & dst_mask)
+   return true;
+   }
+   return false;
+}
+
+static void
+create_mov(struct tgsi_transform_context *tctx,
+

[Mesa-dev] [PATCH 3/3] gallium: Rename freedreno parts of tgsi_lowering.[ch].

2014-09-30 Thread Eric Anholt

---
 src/gallium/auxiliary/Makefile.sources |  1 +
 src/gallium/auxiliary/tgsi/tgsi_lowering.c | 50 +++---
 src/gallium/auxiliary/tgsi/tgsi_lowering.h | 12 +++
 3 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources
index 58d8af7..f6621ef 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -75,6 +75,7 @@ C_SOURCES := \
tgsi/tgsi_exec.c \
tgsi/tgsi_info.c \
tgsi/tgsi_iterate.c \
+   tgsi/tgsi_lowering.c \
tgsi/tgsi_parse.c \
tgsi/tgsi_sanity.c \
tgsi/tgsi_scan.c \
diff --git a/src/gallium/auxiliary/tgsi/tgsi_lowering.c b/src/gallium/auxiliary/tgsi/tgsi_lowering.c
index 5627bb5..b6b18db 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_lowering.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_lowering.c
@@ -31,11 +31,11 @@
 #include "util/u_debug.h"
 #include "util/u_math.h"
 
-#include "freedreno_lowering.h"
+#include "tgsi_lowering.h"
 
-struct fd_lowering_context {
+struct tgsi_lowering_context {
struct tgsi_transform_context base;
-   const struct fd_lowering_config *config;
+   const struct tgsi_lowering_config *config;
struct tgsi_shader_info *info;
unsigned two_side_colors;
unsigned two_side_idx[PIPE_MAX_SHADER_INPUTS];
@@ -53,10 +53,10 @@ struct fd_lowering_context {
unsigned saturate;
 };
 
-static inline struct fd_lowering_context *
-fd_lowering_context(struct tgsi_transform_context *tctx)
+static inline struct tgsi_lowering_context *
+tgsi_lowering_context(struct tgsi_transform_context *tctx)
 {
-   return (struct fd_lowering_context *)tctx;
+   return (struct tgsi_lowering_context *)tctx;
 }
 
 /*
@@ -196,7 +196,7 @@ static void
 transform_dst(struct tgsi_transform_context *tctx,
   struct tgsi_full_instruction *inst)
 {
-   struct fd_lowering_context *ctx = fd_lowering_context(tctx);
+   struct tgsi_lowering_context *ctx = tgsi_lowering_context(tctx);
struct tgsi_full_dst_register *dst  = &inst->Dst[0];
struct tgsi_full_src_register *src0 = &inst->Src[0];
struct tgsi_full_src_register *src1 = &inst->Src[1];
@@ -276,7 +276,7 @@ static void
 transform_xpd(struct tgsi_transform_context *tctx,
   struct tgsi_full_instruction *inst)
 {
-   struct fd_lowering_context *ctx = fd_lowering_context(tctx);
+   struct tgsi_lowering_context *ctx = tgsi_lowering_context(tctx);
struct tgsi_full_dst_register *dst  = &inst->Dst[0];
struct tgsi_full_src_register *src0 = &inst->Src[0];
struct tgsi_full_src_register *src1 = &inst->Src[1];
@@ -347,7 +347,7 @@ static void
 transform_scs(struct tgsi_transform_context *tctx,
   struct tgsi_full_instruction *inst)
 {
-   struct fd_lowering_context *ctx = fd_lowering_context(tctx);
+   struct tgsi_lowering_context *ctx = tgsi_lowering_context(tctx);
struct tgsi_full_dst_register *dst = &inst->Dst[0];
struct tgsi_full_src_register *src = &inst->Src[0];
struct tgsi_full_instruction new_inst;
@@ -409,7 +409,7 @@ static void
 transform_lrp(struct tgsi_transform_context *tctx,
   struct tgsi_full_instruction *inst)
 {
-   struct fd_lowering_context *ctx = fd_lowering_context(tctx);
+   struct tgsi_lowering_context *ctx = tgsi_lowering_context(tctx);
struct tgsi_full_dst_register *dst  = &inst->Dst[0];
struct tgsi_full_src_register *src0 = &inst->Src[0];
struct tgsi_full_src_register *src1 = &inst->Src[1];
@@ -475,7 +475,7 @@ static void
 transform_frc(struct tgsi_transform_context *tctx,
   struct tgsi_full_instruction *inst)
 {
-   struct fd_lowering_context *ctx = fd_lowering_context(tctx);
+   struct tgsi_lowering_context *ctx = tgsi_lowering_context(tctx);
struct tgsi_full_dst_register *dst = &inst->Dst[0];
struct tgsi_full_src_register *src = &inst->Src[0];
struct tgsi_full_instruction new_inst;
@@ -519,7 +519,7 @@ static void
 transform_pow(struct tgsi_transform_context *tctx,
   struct tgsi_full_instruction *inst)
 {
-   struct fd_lowering_context *ctx = fd_lowering_context(tctx);
+   struct tgsi_lowering_context *ctx = tgsi_lowering_context(tctx);
struct tgsi_full_dst_register *dst  = &inst->Dst[0];
struct tgsi_full_src_register *src0 = &inst->Src[0];
struct tgsi_full_src_register *src1 = &inst->Src[1];
@@ -579,7 +579,7 @@ static void
 transform_lit(struct tgsi_transform_context *tctx,
   struct tgsi_full_instruction *inst)
 {
-   struct fd_lowering_context *ctx = fd_lowering_context(tctx);
+   struct tgsi_lowering_context *ctx = tgsi_lowering_context(tctx);
struct tgsi_full_dst_register *dst = &inst->Dst[0];
struct tgsi_full_src_register *src = &inst->Src[0];
struct tgsi_full_instruction new_inst;
@@ -690,7 +690,7 @@ static void
 transform_exp(struct tgsi_transform_context *tctx,
   struct tgsi_full_instruction *inst)
 {
-   struct fd_lowering_context *ctx = fd_lowe
