date:20150713

Re: [Mesa-dev] [PATCH] nvc0: fix geometry program revalidation of clipping params

2015-07-13 Thread Samuel Pitoiset

What piglit test does this fix?

On Sat, Jul 11, 2015 at 7:13 PM, Ilia Mirkin  wrote:

> Signed-off-by: Ilia Mirkin 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>
> Even though in practice a geometry program will never be using UCP's,
> we still were revalidating (aka recompiling) the program when more
> clip planes became enabled (which also are used for regular clip
> distances).
>
> This seems like it should have led to massive fail, but I guess you
> don't change the number of clip planes when using geometry shaders.
> But I'm going to put this through a full piglit run just in case
> there's something I'm missing.
>
>  src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
> b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
> index 785e52e..11f2b10 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
> @@ -339,7 +339,7 @@ nvc0_check_program_ucps(struct nvc0_context *nvc0,
>nvc0_vertprog_validate(nvc0);
> else
> if (likely(vp == nvc0->gmtyprog))
> -  nvc0_vertprog_validate(nvc0);
> +  nvc0_gmtyprog_validate(nvc0);
> else
>nvc0_tevlprog_validate(nvc0);
>  }
> --
> 2.3.6
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>



-- 
Best regards,
Samuel Pitoiset.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/5] i965: Move BEGIN_BATCH() into same control flow as ADVANCE_BATCH().

2015-07-13 Thread Iago Toral

I think Chris did not review this one:

Reviewed-by: Iago Toral Quiroga 

Iago

On Fri, 2015-07-10 at 11:44 -0700, Matt Turner wrote:
> BEGIN_BATCH() and ADVANCE_BATCH() will contain "do {" and "} while (0)"
> respectively to allow declaring local variables used by intervening
> OUT_BATCH macros. As such, BEGIN_BATCH() and ADVANCE_BATCH() need to be
> in the same control flow.
> ---
>  src/mesa/drivers/dri/i965/brw_draw.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
> b/src/mesa/drivers/dri/i965/brw_draw.c
> index 69ad4d4..ec13473 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> @@ -261,17 +261,17 @@ static void brw_emit_prim(struct brw_context *brw,
>indirect_flag = 0;
> }
>  
> +   BEGIN_BATCH(brw->gen >= 7 ? 7 : 6);
> +
> if (brw->gen >= 7) {
>if (brw->predicate.state == BRW_PREDICATE_STATE_USE_BIT)
>   predicate_enable = GEN7_3DPRIM_PREDICATE_ENABLE;
>else
>   predicate_enable = 0;
>  
> -  BEGIN_BATCH(7);
>OUT_BATCH(CMD_3D_PRIM << 16 | (7 - 2) | indirect_flag | 
> predicate_enable);
>OUT_BATCH(hw_prim | vertex_access_type);
> } else {
> -  BEGIN_BATCH(6);
>OUT_BATCH(CMD_3D_PRIM << 16 | (6 - 2) |
>  hw_prim << GEN4_3DPRIM_TOPOLOGY_TYPE_SHIFT |
>  vertex_access_type);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] nv50: turn samples counts off during blit

2015-07-13 Thread Samuel Pitoiset

Fixes the following piglit test:
  occlusion_query_meta_no_fragments

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nv50/nv50_surface.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_surface.c 
b/src/gallium/drivers/nouveau/nv50/nv50_surface.c
index dc9852d..66eccc2 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_surface.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_surface.c
@@ -1432,6 +1432,7 @@ static void
 nv50_blit(struct pipe_context *pipe, const struct pipe_blit_info *info)
 {
struct nv50_context *nv50 = nv50_context(pipe);
+   struct nouveau_pushbuf *push = nv50->base.pushbuf;
boolean eng3d = FALSE;
 
if (util_format_is_depth_or_stencil(info->dst.resource->format)) {
@@ -1493,10 +1494,20 @@ nv50_blit(struct pipe_context *pipe, const struct 
pipe_blit_info *info)
 info->src.box.height != -info->dst.box.height))
   eng3d = TRUE;
 
+   if (nv50->screen->num_occlusion_queries_active) {
+  BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
+  PUSH_DATA (push, 0);
+   }
+
if (!eng3d)
   nv50_blit_eng2d(nv50, info);
else
   nv50_blit_3d(nv50, info);
+
+   if (nv50->screen->num_occlusion_queries_active) {
+  BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
+  PUSH_DATA (push, 1);
+   }
 }
 
 static void
-- 
2.4.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] nv50: add nesting support for occlusion queries

2015-07-13 Thread Samuel Pitoiset

This is loosely based on nvc0.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nv50/nv50_query.c  | 27 --
 src/gallium/drivers/nouveau/nv50/nv50_screen.h |  2 ++
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
b/src/gallium/drivers/nouveau/nv50/nv50_query.c
index 81f7474..80d3fd2 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
@@ -49,6 +49,7 @@ struct nv50_query {
uint32_t offset; /* base + i * 32 */
uint8_t state;
boolean is64bit;
+   int nesting; /* only used for occlusion queries */
struct nouveau_mm_allocation *mm;
struct nouveau_fence *fence;
 };
@@ -175,11 +176,16 @@ nv50_query_begin(struct pipe_context *pipe, struct 
pipe_query *pq)
 
switch (q->type) {
case PIPE_QUERY_OCCLUSION_COUNTER:
-  PUSH_SPACE(push, 4);
-  BEGIN_NV04(push, NV50_3D(COUNTER_RESET), 1);
-  PUSH_DATA (push, NV50_3D_COUNTER_RESET_SAMPLECNT);
-  BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
-  PUSH_DATA (push, 1);
+  q->nesting = nv50->screen->num_occlusion_queries_active++;
+  if (q->nesting) {
+ nv50_query_get(push, q, 0x10, 0x0100f002);
+  } else {
+ PUSH_SPACE(push, 4);
+ BEGIN_NV04(push, NV50_3D(COUNTER_RESET), 1);
+ PUSH_DATA (push, NV50_3D_COUNTER_RESET_SAMPLECNT);
+ BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
+ PUSH_DATA (push, 1);
+  }
   break;
case PIPE_QUERY_PRIMITIVES_GENERATED:
   nv50_query_get(push, q, 0x10, 0x06805002);
@@ -223,9 +229,11 @@ nv50_query_end(struct pipe_context *pipe, struct 
pipe_query *pq)
switch (q->type) {
case PIPE_QUERY_OCCLUSION_COUNTER:
   nv50_query_get(push, q, 0, 0x0100f002);
-  PUSH_SPACE(push, 2);
-  BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
-  PUSH_DATA (push, 0);
+  if (--nv50->screen->num_occlusion_queries_active == 0) {
+ PUSH_SPACE(push, 2);
+ BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
+ PUSH_DATA (push, 0);
+  }
   break;
case PIPE_QUERY_PRIMITIVES_GENERATED:
   nv50_query_get(push, q, 0, 0x06805002);
@@ -396,8 +404,7 @@ nv50_render_condition(struct pipe_context *pipe,
   case PIPE_QUERY_OCCLUSION_COUNTER:
   case PIPE_QUERY_OCCLUSION_PREDICATE:
  if (likely(!condition)) {
-/* XXX: Placeholder, handle nesting here if available */
-if (unlikely(false))
+if (unlikely(q->nesting))
cond = wait ? NV50_3D_COND_MODE_NOT_EQUAL :
  NV50_3D_COND_MODE_ALWAYS;
 else
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
index 881051b..3a12a1f 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
@@ -54,6 +54,8 @@ struct nv50_screen {
struct nv50_context *cur_ctx;
struct nv50_graph_state save_state;
 
+   int num_occlusion_queries_active;
+
struct nouveau_bo *code;
struct nouveau_bo *uniforms;
struct nouveau_bo *txc; /* TIC (offset 0) and TSC (65536) */
-- 
2.4.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: free interface_types and replace old hash_table uses

2015-07-13 Thread Iago Toral

On Sat, 2015-07-11 at 10:13 +1000, Timothy Arceri wrote:
> The util/hash_table was intended to be a fast hash table
> replacement for the program/hash_table see 35fd61bd99c1 and 72e55bb6888ff.
> 
> This replaces some more uses of the old hash table and also
> destroys the interface_types hash table when _mesa_glsl_release_types()
> is called which wasn't previously being done.
> ---
>  Was looking at the remaining program/hash_table uses and noticed that
>  interface_types wasnt being freed so thought I'd fix that and replace the
>  hash while I was there.
> 
>  No measurable compile time changes to the public shader-db
> 
>  src/glsl/glsl_types.cpp | 85 
> ++---
>  src/glsl/glsl_types.h   |  2 +-
>  2 files changed, 46 insertions(+), 41 deletions(-)
> 
> diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
> index 281ff51..255bd69 100644
> --- a/src/glsl/glsl_types.cpp
> +++ b/src/glsl/glsl_types.cpp
> @@ -25,7 +25,7 @@
>  #include "main/core.h" /* for Elements, MAX2 */
>  #include "glsl_parser_extras.h"
>  #include "glsl_types.h"
> -#include "program/hash_table.h"
> +#include "util/hash_table.h"
>  
> 
>  mtx_t glsl_type::mutex = _MTX_INITIALIZER_NP;
> @@ -329,14 +329,19 @@ _mesa_glsl_release_types(void)
>  * necessary.
>  */
> if (glsl_type::array_types != NULL) {
> -  hash_table_dtor(glsl_type::array_types);
> +  _mesa_hash_table_destroy(glsl_type::array_types, NULL);
>glsl_type::array_types = NULL;
> }
>  
> if (glsl_type::record_types != NULL) {
> -  hash_table_dtor(glsl_type::record_types);
> +  _mesa_hash_table_destroy(glsl_type::record_types, NULL);
>glsl_type::record_types = NULL;
> }
> +
> +   if (glsl_type::interface_types != NULL) {
> +  _mesa_hash_table_destroy(glsl_type::interface_types, NULL);
> +  glsl_type::interface_types = NULL;
> +   }

I think it is probably best to put the destruction of interface_types in
a separate patch, it is a different issue after all. You can add my
Reviewed-by on that patch.

With that and a couple of other minor nitpicks I mention below fixed,
this is:
Reviewed-by: Iago Toral Quiroga 

>  }
>  
> 
> @@ -648,27 +653,28 @@ glsl_type::get_array_instance(const glsl_type *base, 
> unsigned array_size)
> mtx_lock(&glsl_type::mutex);
>  
> if (array_types == NULL) {
> -  array_types = hash_table_ctor(64, hash_table_string_hash,
> - hash_table_string_compare);
> +  array_types = _mesa_hash_table_create(NULL, _mesa_key_hash_string,
> +_mesa_key_string_equal);
> }
>  
> -   const glsl_type *t = (glsl_type *) hash_table_find(array_types, key);
> -
> -   if (t == NULL) {
> +   const struct hash_entry *entry = _mesa_hash_table_search(array_types, 
> key);
> +   if (entry == NULL) {
>mtx_unlock(&glsl_type::mutex);
> -  t = new glsl_type(base, array_size);
> +  const glsl_type *t = new glsl_type(base, array_size);
>mtx_lock(&glsl_type::mutex);
>  
> -  hash_table_insert(array_types, (void *) t, ralloc_strdup(mem_ctx, 
> key));
> +  entry = _mesa_hash_table_insert(array_types,
> +  ralloc_strdup(mem_ctx, key),
> +  (void *) t);
> }
>  
> -   assert(t->base_type == GLSL_TYPE_ARRAY);
> -   assert(t->length == array_size);
> -   assert(t->fields.array == base);
> +   assert(((glsl_type *)entry->data)->base_type == GLSL_TYPE_ARRAY);
> +   assert(((glsl_type *)entry->data)->length == array_size);
> +   assert(((glsl_type *)entry->data)->fields.array == base);

Other parts of this file put a blank between the type cast and the
variable, so I would add that here (and in all other places where you
cast entry to glsl_type* in this patch).

> mtx_unlock(&glsl_type::mutex);
>  
> -   return t;
> +   return (glsl_type *)entry->data;
>  }
>  
> 
> @@ -722,19 +728,13 @@ glsl_type::record_compare(const glsl_type *b) const
>  }
>  
> 
> -int
> +bool
>  glsl_type::record_key_compare(const void *a, const void *b)
>  {
> const glsl_type *const key1 = (glsl_type *) a;
> const glsl_type *const key2 = (glsl_type *) b;
>  
> -   /* Return zero is the types match (there is zero difference) or non-zero
> -* otherwise.
> -*/
> -   if (strcmp(key1->name, key2->name) != 0)
> -  return 1;
> -
> -   return !key1->record_compare(key2);
> +   return strcmp(key1->name, key2->name) == 0 && key1->record_compare(key2);
>  }
>  
> 
> @@ -772,25 +772,27 @@ glsl_type::get_record_instance(const glsl_struct_field 
> *fields,
> mtx_lock(&glsl_type::mutex);
>  
> if (record_types == NULL) {
> -  record_types = hash_table_ctor(64, record_key_hash, 
> record_key_compare);
> +  record_types = _mesa_hash_table_create(NULL, record_key_hash,
> + record_key_compare);
> }
>  
> -   const glsl_type *t = (glsl_type *) hash_

[Mesa-dev] [PATCH] nv50: limit the maximum number of samplers to 16

2015-07-13 Thread Samuel Pitoiset

NV50_3D_BIND_TSC only allows to bind 16 samplers, and since we don't
want to do anything with NV50_3D_BIND_TSC2, just limit the maximum
number of samplers to 16 like for nvc0.

This fixes dmesg fails with the following piglit test:
 max-samplers

But the test still fails.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 6583a35..46ae0b8 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -286,7 +286,7 @@ nv50_screen_get_shader_param(struct pipe_screen *pscreen, 
unsigned shader,
case PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS:
   /* The chip could handle more sampler views than samplers */
case PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS:
-  return MIN2(32, PIPE_MAX_SAMPLERS);
+  return MIN2(16, PIPE_MAX_SAMPLERS);
case PIPE_SHADER_CAP_DOUBLES:
case PIPE_SHADER_CAP_TGSI_DROUND_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
-- 
2.4.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 61/78] i965/nir/vec4: Add skeleton implementation of nir_emit_texture()

2015-07-13 Thread Alejandro Piñeiro

On 01/07/15 01:37, Jason Ekstrand wrote:
> If we can avoid duplication in the texturing code, that would be
> really nice.  Could we do this as a refactor of the old code and then
> a much smaller NIR function that calls some shared code?  That's what
> we did for FS and it worked ok.  I looked at the layout of your code
> and, after you finish reading instruction inputs, very little of it is
> actually NIR-specific.

Our wip second batch of commits has now a version that refactors the old
code. But I have a doubt. As part of this refactor, I created a new
method to be used by brw_fs and brw_vec4, that returns the ir texture
opcode based on the nir texture opcode. You can take a look to that
method here [1]. The issue with this method is that ir_texture_opcode is
defined at glsl/ir.h inside a #ifdef __cplusplus. So I needed to rename
brw_nir.c to brw_nir.cpp. On a previous early review [2] you mentioned
that it would be better to keep it as brw_nir.c unless we have a good
reason. Is using this enum good reason to do the renaming? If not, the
other option I see is modify glsl/ir.h and move the enum definitions to
the top, outside the ifdef __cplusplus, as other headers (like
glsl/glsl_types.h) have.

BR

[1]
https://github.com/Igalia/mesa/commit/e81ce150ef931e50b6cb1aae0b42a79f448863f5
[2] https://bugs.freedesktop.org/show_bug.cgi?id=89580#c7
> --Jason
>
> On Fri, Jun 26, 2015 at 1:07 AM, Eduardo Lima Mitev  wrote:
>> From: Alejandro Piñeiro 
>>
>> This sets up the basic structure and placeholders for the implementation of
>> texture operations, which will be added incrementally in subsequent patches.
>>
>> Apart from helping split the texture support into smaller patches, this patch
>> is useful to get a simplified picture of the steps involved in the emission 
>> of
>> texture operations.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
>> ---
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 120 
>> -
>>  1 file changed, 119 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> index 08a70b0..c3638a0 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> @@ -1412,7 +1412,125 @@ vec4_visitor::nir_swizzle_result(nir_tex_instr 
>> *instr, dst_reg dest,
>>  void
>>  vec4_visitor::nir_emit_texture(nir_tex_instr *instr)
>>  {
>> -   /* @TODO: Not yet implemented */
>> +   unsigned sampler = instr->sampler_index;
>> +   src_reg sampler_reg = src_reg(sampler);
>> +
>> +   const glsl_type *dest_type;
>> +
>> +   /* Load the texture operation sources */
>> +   for (unsigned i = 0; i < instr->num_srcs; i++) {
>> +  src_reg src = get_nir_src(instr->src[i].src);
>> +
>> +  switch (instr->src[i].src_type) {
>> +  case nir_tex_src_comparitor:
>> + /* @TODO: not yet implemented */
>> + break;
>> +
>> +  case nir_tex_src_coord:
>> + /* @TODO: not yet implemented */
>> + break;
>> +
>> +  case nir_tex_src_ddx:
>> + /* @TODO: not yet implemented */
>> + break;
>> +
>> +  case nir_tex_src_ddy:
>> + /* @TODO: not yet implemented */
>> + break;
>> +
>> +  case nir_tex_src_lod:
>> + /* @TODO: not yet implemented */
>> + break;
>> +
>> +  case nir_tex_src_ms_index:
>> + /* @TODO: not yet implemented */
>> + break;
>> +
>> +  case nir_tex_src_offset:
>> + /* @TODO: not yet implemented */
>> + break;
>> +
>> +  case nir_tex_src_sampler_offset:
>> + /* @TODO: not yet implemented */
>> + break;
>> +
>> +  case nir_tex_src_projector:
>> + unreachable("Should be lowered by do_lower_texture_projection");
>> +
>> +  case nir_tex_src_bias:
>> + unreachable("LOD bias is not valid for vertex shaders.\n");
>> +
>> +  default:
>> + unreachable("unknown texture source");
>> +  }
>> +   }
>> +
>> +   /* Get the texture operation opcode */
>> +   enum opcode opcode = shader_opcode_for_nir_opcode(instr->op);
>> +
>> +   /* Build the instruction */
>> +   enum glsl_base_type dest_base_type =
>> +  brw_glsl_base_type_for_nir_type(instr->dest_type);
>> +
>> +   dest_type =
>> +  glsl_type::get_instance(dest_base_type, 
>> nir_tex_instr_dest_size(instr), 1);
>> +
>> +   vec4_instruction *inst = new(mem_ctx)
>> +  vec4_instruction(opcode, dst_reg(this, dest_type));
>> +
>> +   for (unsigned i = 0; i < 3; i++) {
>> +  if (instr->const_offset[i] != 0) {
>> + inst->offset = brw_texture_offset(instr->const_offset, 3);
>> + break;
>> +  }
>> +   }
>> +
>> +   /* The message header is necessary for:
>> +* - Texel offsets
>> +* - Gather channel selection
>> +* - Sampler indices too large to fit in a 4-bit value.
>> +*/
>> +   inst->header_size =
>> +  (inst->offset != 0 ||
>> +   instr->op == nir_texop_t

Re: [Mesa-dev] [PATCH v2] glsl: avoid compiler's segfault when processing operators with void arguments

2015-07-13 Thread Samuel Iglesias Gonsálvez

On 11/07/15 19:38, Renaud Gaubert wrote:
> This is done by returning an rvalue of type void in the
> ast_function_expression::hir function instead of a void expression.
> 
> This produces (in the case of the ternary) an hir with a call
> to the void returning function and an assignment of a void variable
> which will be optimized out (the assignment) during the optimization
> pass.
> 
> This fix results in having a valid subexpression in the many
> different cases where the subexpressions are functions whose
> return values are void.
> 
> Thus preventing to dereference NULL in the following cases:
>   * binary operator
>   * unary operators
>   * ternary operator
>   * comparison operators (except equal and nequal operator)
> 
> Equal and nequal had to be handled as a special case because
> instead of segfaulting on a forbidden syntax it was now accepting
> expressions with a void return value on either (or both) side of
> the expression.
> 
> Signed-off-by: Renaud Gaubert 
> Reviewed-by: Gabriel Laskar 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85252
> 
> ---
> Piglit tests were sent to the Piglit mailing list:
>   * glsl-1.10 Adds tests on how void functions are handled
> 
>  src/glsl/ast_function.cpp | 9 -
>  src/glsl/ast_to_hir.cpp   | 9 -
>  2 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
> index 92e26bf..ac32723 100644
> --- a/src/glsl/ast_function.cpp
> +++ b/src/glsl/ast_function.cpp
> @@ -1785,7 +1785,14 @@ ast_function_expression::hir(exec_list *instructions,
>/* an error has already been emitted */
>value = ir_rvalue::error_value(ctx);
>} else {
> -  value = generate_call(instructions, sig, &actual_parameters, state);
> +value = generate_call(instructions, sig, &actual_parameters, state);
> +if (!value) {
> +   ir_variable *const tmp = new(ctx) 
> ir_variable(glsl_type::void_type,
> + "void_var",
> + ir_var_temporary);
> +   instructions->push_tail(tmp);
> +   value = new(ctx) ir_dereference_variable(tmp);
> +}

Indention. You used two spaces indention instead of three.

With that fixed,

   Reviewed-by: Samuel Iglesias Gonsálvez 

If you don't have commit rights, I can fix the indention, add the R-b
and push the patch to master (no need of v3 patch). Just tell me to do
so (*).

Thanks for your contribution!

Sam

(*) Also for your piglit patch. If nobody has pushed your piglit patch
yet, I can do it.

>}
>  
>return value;
> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> index 8cb46be..0d0ad2a 100644
> --- a/src/glsl/ast_to_hir.cpp
> +++ b/src/glsl/ast_to_hir.cpp
> @@ -1270,7 +1270,14 @@ ast_expression::do_hir(exec_list *instructions,
> *applied to one operand that can make them match, in which
> *case this conversion is done."
> */
> -  if ((!apply_implicit_conversion(op[0]->type, op[1], state)
> +
> +  if (op[0]->type == glsl_type::void_type || op[1]->type == 
> glsl_type::void_type) {
> + _mesa_glsl_error(& loc, state, "`%s':  wrong operand types: "
> + "no operation `%1$s' exists that takes a left-hand "
> + "operand of type 'void' or a right operand of type "
> + "'void'", (this->oper == ast_equal) ? "==" : "!=");
> + error_emitted = true;
> +  } else if ((!apply_implicit_conversion(op[0]->type, op[1], state)
> && !apply_implicit_conversion(op[1]->type, op[0], state))
>|| (op[0]->type != op[1]->type)) {
>   _mesa_glsl_error(& loc, state, "operands of `%s' must have the same 
> "
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] i965/fs: Don't use the pixel interpolater for centroid interpolation

2015-07-13 Thread Neil Roberts

Chris Forbes  writes:

> Nitpicks aside, I don't think this is a great idea now that you've got
> the SKL PI working.

Can you explain why you don't think this is a good idea? Is it because
it is an optimisation for something that is not known to be a big use
case so carrying around the extra code just adds unnecessary maintenance
burden? I could agree with that so I'm happy to abandon the patch for
now if that's the general consensus.

> I also think it's broken -- you need to arrange to have the centroid
> barycentric coords delivered to the FS thread, which won't be
> happening if this is the *only* use of them. Masked in the tests,
> because they compare with a centroid-qualified input. [I'm assuming
> you don't always get these delivered to the FS in SKL, but no docs
> access...]

The changes to brw_compute_barycentric_interp_modes in the patch ensure
that the centroid barycentric coords are delivered whenever
interpolateAtCentroid is used in a shader. I don't think this is a
problem. At least it seems to work in a simple test without using a
separate varying with the centroid qualifier.

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 18/78] i965: Take is_scalar_shader_stage() method out to allow reuse

2015-07-13 Thread Jason Ekstrand

On Fri, Jul 10, 2015 at 8:53 AM, Eduardo Lima Mitev  wrote:
> On 06/30/2015 06:58 PM, Jason Ekstrand wrote:
>> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
>>> This patch makes public the is_scalar_shader_stage() method in brw_shader, 
>>> and
>>> renames it to brw_compiler_is_scalar_shader_stage(). The plan is to later 
>>> reuse it
>>> in brw_nir, to enable/disable optimization passes depending on the type
>>> of shader stage.
>>
>> I'm not so sure that this is a good plan.  It assumes that whether we
>> are doing a scalar or vec4 compile is based entirely on the shader
>> stage and some static information (possibly based on environment
>> variables).  Ken and I were talking around the office and we may want
>> to use both SIMD4x2 and SIMD8 mode for geometry shaders depending on
>> the number of inputs, etc.  This won't work in the given paradigm.
>>
>
> If I understand correctly, what you propose is having a function to
> dynamically choose the type of shader (scalar vs. vec4) when compiling
> the shader, using not only gen and stage, but also actual application
> data. I think this is a good idea and will allow experimenting with
> different combinations of shaders with real input data.
>
> However, I wonder if we this should be added later after more elaborated
> thoughts on what exactly do we need and where to plug it. I have been
> experimenting with a function following the use case you mentioned,
> choosing shader backend based on inputs to a GS. But honestly it feels
> like a blind guess from my side to what actually you and Ken have in mind.
>
> Current patch basically reuses the function we already have to select
> the shader, so what I propose is to move forward with it for the moment
> (adding the missing MESA_SHADER_COMPUTE) and discuss how to extend it to
> factor in dynamic data; or perhaps you can explain us your proposal with
> a bit more detail.
>
> WDYT?

My primary issue was with having it based on a is_scalar_stage
function at all.  It seems like a better long-term plan would be to
simply pass an is_scalar boolean into those lowering functions.  That
said, I haven't taken a real hard look as to what calls those
functions and whether or not that's actually what we want either.
Make sense?  If there's no better place to make that determination
further up the call chain, then this is fine for now.
--Jason

>>> The new method accepts a brw_compiler instead of a brw_context. This is done
>>> for consistency, since the actual info we need (scalar_vs) is in 
>>> brw_compiler,
>>> and fetching in through brw_content->intelScreen->compiler seems like too
>>> much indirection.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_shader.cpp | 22 ++
>>>  src/mesa/drivers/dri/i965/brw_shader.h   | 13 +
>>>  2 files changed, 19 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_shader.cpp
>>> index 0b53647..3b99046 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
>>> @@ -182,19 +182,6 @@ brw_shader_precompile(struct gl_context *ctx,
>>> return true;
>>>  }
>>>
>>> -static inline bool
>>> -is_scalar_shader_stage(struct brw_context *brw, int stage)
>>> -{
>>> -   switch (stage) {
>>> -   case MESA_SHADER_FRAGMENT:
>>> -  return true;
>>> -   case MESA_SHADER_VERTEX:
>>> -  return brw->intelScreen->compiler->scalar_vs;
>>> -   default:
>>> -  return false;
>>> -   }
>>> -}
>>> -
>>>  static void
>>>  brw_lower_packing_builtins(struct brw_context *brw,
>>> gl_shader_stage shader_type,
>>> @@ -205,7 +192,8 @@ brw_lower_packing_builtins(struct brw_context *brw,
>>> | LOWER_PACK_UNORM_2x16
>>> | LOWER_UNPACK_UNORM_2x16;
>>>
>>> -   if (is_scalar_shader_stage(brw, shader_type)) {
>>> +   if (brw_compiler_is_scalar_shader_stage(brw->intelScreen->compiler,
>>> +   shader_type)) {
>>>ops |= LOWER_UNPACK_UNORM_4x8
>>> | LOWER_UNPACK_SNORM_4x8
>>> | LOWER_PACK_UNORM_4x8
>>> @@ -218,7 +206,8 @@ brw_lower_packing_builtins(struct brw_context *brw,
>>> * lowering is needed. For SOA code, the Half2x16 ops must be
>>> * scalarized.
>>> */
>>> -  if (is_scalar_shader_stage(brw, shader_type)) {
>>> +  if (brw_compiler_is_scalar_shader_stage(brw->intelScreen->compiler,
>>> +  shader_type)) {
>>>   ops |= LOWER_PACK_HALF_2x16_TO_SPLIT
>>>   |  LOWER_UNPACK_HALF_2x16_TO_SPLIT;
>>>}
>>> @@ -294,7 +283,8 @@ process_glsl_ir(struct brw_context *brw,
>>> do {
>>>progress = false;
>>>
>>> -  if (is_scalar_shader_stage(brw, shader->Stage)) {
>>> +  if (brw_compiler_is_scalar_shader_stage(brw->intelScreen->compiler,
>>> +

Re: [Mesa-dev] [PATCH 16/78] i965/nir/vec4: Implement store_output intrinsic

2015-07-13 Thread Jason Ekstrand

On Wed, Jul 8, 2015 at 11:54 PM, Eduardo Lima Mitev  wrote:
> On 06/30/2015 06:51 PM, Jason Ekstrand wrote:
>> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
>>> The index into the output_reg array where to store the destination register 
>>> is
>>> fetched from the nir_outputs map built during nir_setup_outputs stage.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 17 +++--
>>>  1 file changed, 15 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> index 8a2d335..55d4490 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> @@ -520,10 +520,23 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
>>> *instr)
>>> }
>>>
>>> case nir_intrinsic_store_output_indirect:
>>> +  has_indirect = true;
>>>/* fallthrough */
>>> -   case nir_intrinsic_store_output:
>>> -  /* @TODO: Not yet implemented */
>>> +   case nir_intrinsic_store_output: {
>>> +  int offset = instr->const_index[0];
>>> +  int output = nir_outputs[offset];
>>> +
>>> +  src = get_nir_src(instr->src[0], nir_output_types[offset]);
>>> +  dest = dst_reg(src);
>>> +
>>> +  dest.writemask = brw_writemask_for_size(instr->num_components);
>>> +
>>> +  if (has_indirect)
>>> + dest.reladdr = new(mem_ctx) src_reg(get_nir_src(instr->src[1]));
>>> +
>>> +  output_reg[output] = dest;
>>
>> I'm very confused about the amount of indirection going on here.  It
>> seems to me that we should be setting these outputs up in
>> setup_outputs() rather than storring off a map from ints to other ints
>> and setting it up here.  I didn't make this comment on the patch for
>> setup_outputs() because I wanted to wait to see it used before I
>> commented on it.
>>
>> I'm guessing you did it this way because the nir_assign_var_locations
>> is giving you bogus values.  If so, then it might be better to just
>> assign variable locations in setup_outputs() rather than having a
>> remap table.  The whole point of nir_lower_io is to make IO easy for
>> the back-end.  If you need a re-map table, then it's no longer making
>> it easy and we need to think more about what's going on.
>> --Jason
>>
>
> That double indirection felt bad since the beginning, but it was needed
> to store the original variable's location (var->data.location). Let me
> explain:
>
> We are (re)using the plumbering in vec4_visitor to setup URB, so the
> only thing we need to do is to store the out register in "output_reg"
> map at the correct location. And that location is given by the original
> location in the shader (var->data.location).
>
> So, in this case, "nir_assign_var_locations" pass, which constructs
> var->data.driver_location, is not useful to us, except to give us
> consecutive indexes to construct the other map we have, the type map,
> which is needed to carry the correct type from the original variable to
> the output register.

If nir_assign_var_locations isn't doing anything for you, don't call
it.  You'll need to do something with var->data.driver_location.  If
what you really want is var->data.location, then just copy that to
var->data.driver_location when you do nir_setup_outputs.  Or
(depending on how the URB setup works, I don't actually know), put the
actual URB location in var->data.driver_location when you walk the
outputs.

From there, you have two options.  One would be to setup output_reg at
the same time with the correct types right away and emit a MOV when
you get a store_output.  (Copy propagation should clean up the MOV.)
For what it's worth, I don't think the type matters; a URB write just
writes data to something so as long as you don't have a type mismatch
in a MOV, the hardware won't care.

The other option, would be to directly emit the URB write in
store_output.  At the moment, it may be better to take the first
option since that better matches what the FS does right now.  But both
should work fine.

> So, before knowing that I could modify nir_lower_io, my best shot at
> transferring the original variable location was to create this
> nir_outputs map. Now, what I have done is to put that value in
> const_index[1] of the intrinsic instruction, which was previously
> unused. What do you think?
>
> That removes the offset to offset map, but we still need the type map.
>
> About your comment on initializing the register during setup stage, I'm
> a bit confused: the register that we need to store is not available
> during setup stage, because we still don't have local registers allocated.

What do you mean?  Because you don't have the destination of the
output_write intrinsic allocated?  Even if the register has a file of
BAD_FILE, you could still store the type there.  Also, as I said
above, the hardware shouldn't care about the types of data.  As long
as

Re: [Mesa-dev] [PATCH 61/78] i965/nir/vec4: Add skeleton implementation of nir_emit_texture()

2015-07-13 Thread Jason Ekstrand

On Wed, Jul 1, 2015 at 1:07 AM, Alejandro Piñeiro  wrote:
> On 01/07/15 01:37, Jason Ekstrand wrote:
>> If we can avoid duplication in the texturing code, that would be
>> really nice.  Could we do this as a refactor of the old code and then
>> a much smaller NIR function that calls some shared code?
>
> We worked under the assumption that the IR path would be eventually
> removed, so having a nir self-contained version would make that removal
> easier.
>
>>  That's what
>> we did for FS and it worked ok.
>
> Yes, I realized that the fs nir path was reusing code, instead of a new
> implementation. But as the fs nir code was doing some nir to ir
> structure/enum mappings, I assumed that was a first version of the
> texture support, and that it would be modified in the future when ir get
> ridded. As mentioned, I went directly to a nir self-contained version,
> so when ir get ridded, it would not be needed to modify the nir path,
> but just remove the ir one.

In the FS, it's useful for two reasons.  One was to share code between
the old and new visitors.  The other is because we have 3 different
texture paths in the FS for different gens.  In vec4, we only have one
so it's a little less justified.  However, it will make code review
easier.

>> I looked at the layout of your code
>> and, after you finish reading instruction inputs, very little of it is
>> actually NIR-specific.
>
> Yes, current nir texture support is basically using the nir structures
> to get all the info, and then just adapt the old code at vec4_visitor,
> so it would be possible to reuse the code. In any case, as it is right
> now happening with the fs path, probably it would be needed to do some
> mappings between nir to ir structures/enums.
>
> I will work now on a version of the support based on refactor the
> existing code, in order to be used by the nir and ir paths.
>
> BR
>
>> --Jason
>>
>> On Fri, Jun 26, 2015 at 1:07 AM, Eduardo Lima Mitev  wrote:
>>> From: Alejandro Piñeiro 
>>>
>>> This sets up the basic structure and placeholders for the implementation of
>>> texture operations, which will be added incrementally in subsequent patches.
>>>
>>> Apart from helping split the texture support into smaller patches, this 
>>> patch
>>> is useful to get a simplified picture of the steps involved in the emission 
>>> of
>>> texture operations.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 120 
>>> -
>>>  1 file changed, 119 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> index 08a70b0..c3638a0 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> @@ -1412,7 +1412,125 @@ vec4_visitor::nir_swizzle_result(nir_tex_instr 
>>> *instr, dst_reg dest,
>>>  void
>>>  vec4_visitor::nir_emit_texture(nir_tex_instr *instr)
>>>  {
>>> -   /* @TODO: Not yet implemented */
>>> +   unsigned sampler = instr->sampler_index;
>>> +   src_reg sampler_reg = src_reg(sampler);
>>> +
>>> +   const glsl_type *dest_type;
>>> +
>>> +   /* Load the texture operation sources */
>>> +   for (unsigned i = 0; i < instr->num_srcs; i++) {
>>> +  src_reg src = get_nir_src(instr->src[i].src);
>>> +
>>> +  switch (instr->src[i].src_type) {
>>> +  case nir_tex_src_comparitor:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_coord:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_ddx:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_ddy:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_lod:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_ms_index:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_offset:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_sampler_offset:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_projector:
>>> + unreachable("Should be lowered by do_lower_texture_projection");
>>> +
>>> +  case nir_tex_src_bias:
>>> + unreachable("LOD bias is not valid for vertex shaders.\n");
>>> +
>>> +  default:
>>> + unreachable("unknown texture source");
>>> +  }
>>> +   }
>>> +
>>> +   /* Get the texture operation opcode */
>>> +   enum opcode opcode = shader_opcode_for_nir_opcode(instr->op);
>>> +
>>> +   /* Build the instruction */
>>> +   enum glsl_base_type dest_base_type =
>>> +  brw_glsl_base_type_for_nir_type(instr->dest_type);
>>> +
>>> +   dest_type =
>>> +  glsl_type::get_instance(dest_base_type, 
>>> nir_tex_instr_dest_si

Re: [Mesa-dev] [PATCH 61/78] i965/nir/vec4: Add skeleton implementation of nir_emit_texture()

2015-07-13 Thread Jason Ekstrand

On Mon, Jul 13, 2015 at 3:00 AM, Alejandro Piñeiro  wrote:
> On 01/07/15 01:37, Jason Ekstrand wrote:
>> If we can avoid duplication in the texturing code, that would be
>> really nice.  Could we do this as a refactor of the old code and then
>> a much smaller NIR function that calls some shared code?  That's what
>> we did for FS and it worked ok.  I looked at the layout of your code
>> and, after you finish reading instruction inputs, very little of it is
>> actually NIR-specific.
>
> Our wip second batch of commits has now a version that refactors the old
> code. But I have a doubt. As part of this refactor, I created a new
> method to be used by brw_fs and brw_vec4, that returns the ir texture
> opcode based on the nir texture opcode. You can take a look to that
> method here [1]. The issue with this method is that ir_texture_opcode is
> defined at glsl/ir.h inside a #ifdef __cplusplus. So I needed to rename
> brw_nir.c to brw_nir.cpp. On a previous early review [2] you mentioned
> that it would be better to keep it as brw_nir.c unless we have a good
> reason. Is using this enum good reason to do the renaming? If not, the
> other option I see is modify glsl/ir.h and move the enum definitions to
> the top, outside the ifdef __cplusplus, as other headers (like
> glsl/glsl_types.h) have.

Go ahead and make it a cpp file if you have to.  Another option would
be to do the map the other way: ir_opcode -> nir_tex_op and refactor
the code to use the nir_tex_op.  That's probably what we should be
doing in the FS now that the old visitor is gone.
--Jason

> BR
>
> [1]
> https://github.com/Igalia/mesa/commit/e81ce150ef931e50b6cb1aae0b42a79f448863f5
> [2] https://bugs.freedesktop.org/show_bug.cgi?id=89580#c7
>> --Jason
>>
>> On Fri, Jun 26, 2015 at 1:07 AM, Eduardo Lima Mitev  wrote:
>>> From: Alejandro Piñeiro 
>>>
>>> This sets up the basic structure and placeholders for the implementation of
>>> texture operations, which will be added incrementally in subsequent patches.
>>>
>>> Apart from helping split the texture support into smaller patches, this 
>>> patch
>>> is useful to get a simplified picture of the steps involved in the emission 
>>> of
>>> texture operations.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 120 
>>> -
>>>  1 file changed, 119 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> index 08a70b0..c3638a0 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> @@ -1412,7 +1412,125 @@ vec4_visitor::nir_swizzle_result(nir_tex_instr 
>>> *instr, dst_reg dest,
>>>  void
>>>  vec4_visitor::nir_emit_texture(nir_tex_instr *instr)
>>>  {
>>> -   /* @TODO: Not yet implemented */
>>> +   unsigned sampler = instr->sampler_index;
>>> +   src_reg sampler_reg = src_reg(sampler);
>>> +
>>> +   const glsl_type *dest_type;
>>> +
>>> +   /* Load the texture operation sources */
>>> +   for (unsigned i = 0; i < instr->num_srcs; i++) {
>>> +  src_reg src = get_nir_src(instr->src[i].src);
>>> +
>>> +  switch (instr->src[i].src_type) {
>>> +  case nir_tex_src_comparitor:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_coord:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_ddx:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_ddy:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_lod:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_ms_index:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_offset:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_sampler_offset:
>>> + /* @TODO: not yet implemented */
>>> + break;
>>> +
>>> +  case nir_tex_src_projector:
>>> + unreachable("Should be lowered by do_lower_texture_projection");
>>> +
>>> +  case nir_tex_src_bias:
>>> + unreachable("LOD bias is not valid for vertex shaders.\n");
>>> +
>>> +  default:
>>> + unreachable("unknown texture source");
>>> +  }
>>> +   }
>>> +
>>> +   /* Get the texture operation opcode */
>>> +   enum opcode opcode = shader_opcode_for_nir_opcode(instr->op);
>>> +
>>> +   /* Build the instruction */
>>> +   enum glsl_base_type dest_base_type =
>>> +  brw_glsl_base_type_for_nir_type(instr->dest_type);
>>> +
>>> +   dest_type =
>>> +  glsl_type::get_instance(dest_base_type, 
>>> nir_tex_instr_dest_size(instr), 1);
>>> +
>>> +   vec4_instruction *inst = new(mem_ctx)
>>> +  vec4_instruction(opcode, dst_reg(this, dest_type));
>>> +
>>> +   for (un

[Mesa-dev] [PATCH] i965/fs: Make the texturing helpers take NIR opcodes instead of old IR ones

2015-07-13 Thread Jason Ekstrand

Now that the old GLSL IR visitor code is gone, having the remap is silly.
---
 src/mesa/drivers/dri/i965/brw_fs.h   |  12 +--
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  18 +---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 135 ++-
 3 files changed, 75 insertions(+), 90 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 5243079..1690f4a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -109,7 +109,7 @@ public:
void compute_clip_distance(gl_clip_plane *clip_planes);
 
uint32_t gather_channel(int orig_chan, uint32_t sampler);
-   void swizzle_result(ir_texture_opcode op, int dest_components,
+   void swizzle_result(nir_texop op, int dest_components,
fs_reg orig_val, uint32_t sampler);
 
int type_size(const struct glsl_type *type);
@@ -207,28 +207,28 @@ public:
void compute_sample_position(fs_reg dst, fs_reg int_sample_pos);
fs_reg rescale_texcoord(fs_reg coordinate, int coord_components,
bool is_rect, uint32_t sampler, int texunit);
-   fs_inst *emit_texture_gen4(ir_texture_opcode op, fs_reg dst,
+   fs_inst *emit_texture_gen4(nir_texop op, fs_reg dst,
   fs_reg coordinate, int coord_components,
   fs_reg shadow_comp,
   fs_reg lod, fs_reg lod2, int grad_components,
   uint32_t sampler);
-   fs_inst *emit_texture_gen4_simd16(ir_texture_opcode op, fs_reg dst,
+   fs_inst *emit_texture_gen4_simd16(nir_texop op, fs_reg dst,
  fs_reg coordinate, int vector_elements,
  fs_reg shadow_c, fs_reg lod,
  uint32_t sampler);
-   fs_inst *emit_texture_gen5(ir_texture_opcode op, fs_reg dst,
+   fs_inst *emit_texture_gen5(nir_texop op, fs_reg dst,
   fs_reg coordinate, int coord_components,
   fs_reg shadow_comp,
   fs_reg lod, fs_reg lod2, int grad_components,
   fs_reg sample_index, uint32_t sampler,
   bool has_offset);
-   fs_inst *emit_texture_gen7(ir_texture_opcode op, fs_reg dst,
+   fs_inst *emit_texture_gen7(nir_texop op, fs_reg dst,
   fs_reg coordinate, int coord_components,
   fs_reg shadow_comp,
   fs_reg lod, fs_reg lod2, int grad_components,
   fs_reg sample_index, fs_reg mcs, fs_reg sampler,
   fs_reg offset_value);
-   void emit_texture(ir_texture_opcode op,
+   void emit_texture(nir_texop op,
  const glsl_type *dest_type,
  fs_reg coordinate, int components,
  fs_reg shadow_c,
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index caf1300..d8a6f3c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1739,23 +1739,7 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, 
nir_tex_instr *instr)
   glsl_type::get_instance(dest_base_type, nir_tex_instr_dest_size(instr),
   1);
 
-   ir_texture_opcode op;
-   switch (instr->op) {
-   case nir_texop_lod: op = ir_lod; break;
-   case nir_texop_query_levels: op = ir_query_levels; break;
-   case nir_texop_tex: op = ir_tex; break;
-   case nir_texop_tg4: op = ir_tg4; break;
-   case nir_texop_txb: op = ir_txb; break;
-   case nir_texop_txd: op = ir_txd; break;
-   case nir_texop_txf: op = ir_txf; break;
-   case nir_texop_txf_ms: op = ir_txf_ms; break;
-   case nir_texop_txl: op = ir_txl; break;
-   case nir_texop_txs: op = ir_txs; break;
-   default:
-  unreachable("unknown texture opcode");
-   }
-
-   emit_texture(op, dest_type, coordinate, instr->coord_components,
+   emit_texture(instr->op, dest_type, coordinate, instr->coord_components,
 shadow_comparitor, lod, lod2, lod_components, sample_index,
 tex_offset, mcs, gather_component,
 is_cube_array, is_rect, sampler, sampler_reg, texunit);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 94d6a58..c726dcc 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -78,7 +78,7 @@ fs_visitor::emit_vs_system_value(int location)
 }
 
 fs_inst *
-fs_visitor::emit_texture_gen4(ir_texture_opcode op, fs_reg dst,
+fs_visitor::emit_texture_gen4(nir_texop op, fs_reg dst,
   fs_reg coordinate, int coord_components,
   fs_reg shadow_c,
   fs_reg lod, fs_reg dPdy, int grad_components,
@@ -106,13

Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-13 Thread Jose Fonseca


On 12/07/15 01:49, Ilia Mirkin wrote:

On Thu, Jul 2, 2015 at 4:54 PM, Jose Fonseca  wrote:

On 02/07/15 19:45, Ilia Mirkin wrote:


On Thu, Jul 2, 2015 at 2:31 PM, Jose Fonseca  wrote:


On 02/07/15 17:49, Ilia Mirkin wrote:



On Thu, Jul 2, 2015 at 12:40 PM, Jose Fonseca 
wrote:



On 02/07/15 17:24, Ilia Mirkin wrote:




On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca 
wrote:





Ah OK. So I guess tilers will have to disable their render queues for
this one. Which seems like a reasonable trade-off...





I don't see why.

This is a purely SW query. So I don't see why the HW needs to see any
difference.




It just won't have compiled the shaders, I think. I guess this could
force
it.




AFAIK, tiles defer the _rendering_, not the compilation. At least
llvmpipe
compiles everything at draw time.




That said, glretrace already does glReadPixels when dumping state, so
one
way or the other, when inspecting state in qapitrace, everything will
be
flushed and and synched.




But that's too late -- you said the glGetActiveBla would go right
after the draw call. Presumably if you did it right after glReadPixels
it'd end up seeing the state left over from a blit or something?




Fair enough. It's the first thing after glDraw. Forget about
glReadpixels.

I guess just still don't understand what's special about tilers.  But I
don't think it's pertinent now.



What's special about tilers is that they defer renders. Compiling the
program can similarly get deferred because they can. (And sometimes
entire renders get dropped due to clears, etc.) Should it get
deferred? Dunno. I don't even remember if freedreno defers
compilation, and never knew what vc4 did.





Perhaps the API should instead be

glEnable(GL_PROGRAM_SAVE_DUMP)
glProgramDumpDebugInfo(progid, callback)

which would then optionally dump any info associated with that
program. That way it doesn't even have to be internally active (due to
a subsequent blit or who-knows-what). But it would rely on that
program having been previously-drawn-with which would have generated
the relevant data.





Doing this immediately after draw call is no problem at all. I don't
think
it's worth complicating things by allowing a lag between draw and shader
extraction. It just makes things more unreliable which defeats the point.



Would it really complicate things though? Internally, it can never
drop the debug info since a program might later be reused wholesale
and there won't be another compilation, so it has to store the info on
the program object.



Program object might not exist (e.g. when debugging fixed-function).

And the concept of program object looses meaning in the downstream layers
(e.g inside gallium pipe drivers, where TGSI can come from all sort of
utility modules and not just GLSL).


I have little doubts: for this to be feasible, it's imperative this applies
to the immediately validated state.  Our stack has too many layer to do
anything else: it would be complex and buggy.


Jose


Jose,

Were you planning on working on something like this? I could _really_
use it for some bugs I'm tracking down (and failing thus far),
unfortunately the shaders are unreadable and get compiled very far
away from time of use, which makes it harder to track.


No, I'm afraid I don't have the time myself.  It's not directly useful 
to anything I'm working on at the moment.  My goal was only to help come 
up with good design for this, so that if/when somebody couldn't resist 
the urge to scratch this itch, there was a tentative design/plan in 
place already.


Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/fs: Make the texturing helpers take NIR opcodes instead of old IR ones

2015-07-13 Thread Francisco Jerez

Jason Ekstrand  writes:

> Now that the old GLSL IR visitor code is gone, having the remap is silly.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h   |  12 +--
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  18 +---
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 135 
> ++-
>  3 files changed, 75 insertions(+), 90 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> b/src/mesa/drivers/dri/i965/brw_fs.h
> index 5243079..1690f4a 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -109,7 +109,7 @@ public:
> void compute_clip_distance(gl_clip_plane *clip_planes);
>  
> uint32_t gather_channel(int orig_chan, uint32_t sampler);
> -   void swizzle_result(ir_texture_opcode op, int dest_components,
> +   void swizzle_result(nir_texop op, int dest_components,
> fs_reg orig_val, uint32_t sampler);
>  
> int type_size(const struct glsl_type *type);
> @@ -207,28 +207,28 @@ public:
> void compute_sample_position(fs_reg dst, fs_reg int_sample_pos);
> fs_reg rescale_texcoord(fs_reg coordinate, int coord_components,
> bool is_rect, uint32_t sampler, int texunit);
> -   fs_inst *emit_texture_gen4(ir_texture_opcode op, fs_reg dst,
> +   fs_inst *emit_texture_gen4(nir_texop op, fs_reg dst,
>fs_reg coordinate, int coord_components,
>fs_reg shadow_comp,
>fs_reg lod, fs_reg lod2, int grad_components,
>uint32_t sampler);
> -   fs_inst *emit_texture_gen4_simd16(ir_texture_opcode op, fs_reg dst,
> +   fs_inst *emit_texture_gen4_simd16(nir_texop op, fs_reg dst,
>   fs_reg coordinate, int vector_elements,
>   fs_reg shadow_c, fs_reg lod,
>   uint32_t sampler);
> -   fs_inst *emit_texture_gen5(ir_texture_opcode op, fs_reg dst,
> +   fs_inst *emit_texture_gen5(nir_texop op, fs_reg dst,
>fs_reg coordinate, int coord_components,
>fs_reg shadow_comp,
>fs_reg lod, fs_reg lod2, int grad_components,
>fs_reg sample_index, uint32_t sampler,
>bool has_offset);
> -   fs_inst *emit_texture_gen7(ir_texture_opcode op, fs_reg dst,
> +   fs_inst *emit_texture_gen7(nir_texop op, fs_reg dst,
>fs_reg coordinate, int coord_components,
>fs_reg shadow_comp,
>fs_reg lod, fs_reg lod2, int grad_components,
>fs_reg sample_index, fs_reg mcs, fs_reg 
> sampler,
>fs_reg offset_value);
> -   void emit_texture(ir_texture_opcode op,
> +   void emit_texture(nir_texop op,
>   const glsl_type *dest_type,
>   fs_reg coordinate, int components,
>   fs_reg shadow_c,

Hold on, I'm about to send a series that gets rid of the
ir_texture_opcode argument of these gen-specific functions and replaces
it with a backend_instruction opcode.  If you wait a little this change
will involve much less churn.

> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index caf1300..d8a6f3c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -1739,23 +1739,7 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, 
> nir_tex_instr *instr)
>glsl_type::get_instance(dest_base_type, nir_tex_instr_dest_size(instr),
>1);
>  
> -   ir_texture_opcode op;
> -   switch (instr->op) {
> -   case nir_texop_lod: op = ir_lod; break;
> -   case nir_texop_query_levels: op = ir_query_levels; break;
> -   case nir_texop_tex: op = ir_tex; break;
> -   case nir_texop_tg4: op = ir_tg4; break;
> -   case nir_texop_txb: op = ir_txb; break;
> -   case nir_texop_txd: op = ir_txd; break;
> -   case nir_texop_txf: op = ir_txf; break;
> -   case nir_texop_txf_ms: op = ir_txf_ms; break;
> -   case nir_texop_txl: op = ir_txl; break;
> -   case nir_texop_txs: op = ir_txs; break;
> -   default:
> -  unreachable("unknown texture opcode");
> -   }
> -
> -   emit_texture(op, dest_type, coordinate, instr->coord_components,
> +   emit_texture(instr->op, dest_type, coordinate, instr->coord_components,
>  shadow_comparitor, lod, lod2, lod_components, sample_index,
>  tex_offset, mcs, gather_component,
>  is_cube_array, is_rect, sampler, sampler_reg, texunit);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 94d6a58..c726dcc 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/dr

Re: [Mesa-dev] [PATCH 61/78] i965/nir/vec4: Add skeleton implementation of nir_emit_texture()

2015-07-13 Thread Alejandro Piñeiro



On 13/07/15 14:05, Jason Ekstrand wrote:
> On Mon, Jul 13, 2015 at 3:00 AM, Alejandro Piñeiro  
> wrote:
>> On 01/07/15 01:37, Jason Ekstrand wrote:
>>> If we can avoid duplication in the texturing code, that would be
>>> really nice.  Could we do this as a refactor of the old code and then
>>> a much smaller NIR function that calls some shared code?  That's what
>>> we did for FS and it worked ok.  I looked at the layout of your code
>>> and, after you finish reading instruction inputs, very little of it is
>>> actually NIR-specific.
>> Our wip second batch of commits has now a version that refactors the old
>> code. But I have a doubt. As part of this refactor, I created a new
>> method to be used by brw_fs and brw_vec4, that returns the ir texture
>> opcode based on the nir texture opcode. You can take a look to that
>> method here [1]. The issue with this method is that ir_texture_opcode is
>> defined at glsl/ir.h inside a #ifdef __cplusplus. So I needed to rename
>> brw_nir.c to brw_nir.cpp. On a previous early review [2] you mentioned
>> that it would be better to keep it as brw_nir.c unless we have a good
>> reason. Is using this enum good reason to do the renaming? If not, the
>> other option I see is modify glsl/ir.h and move the enum definitions to
>> the top, outside the ifdef __cplusplus, as other headers (like
>> glsl/glsl_types.h) have.
> Go ahead and make it a cpp file if you have to.  

Ok, thanks.

> Another option would
> be to do the map the other way: ir_opcode -> nir_tex_op and refactor
> the code to use the nir_tex_op.  

But even if we do the map in the other way, the helper function would be
using the enum type defined at glsl/ir.h, so we would need to make it a
cpp file in any case. So for now, I will keep things simple, and keep
the current mapping,  and do the change later.

> That's probably what we should be
> doing in the FS now that the old visitor is gone.
> --Jason
>
>> BR
>>
>> [1]
>> https://github.com/Igalia/mesa/commit/e81ce150ef931e50b6cb1aae0b42a79f448863f5
>> [2] https://bugs.freedesktop.org/show_bug.cgi?id=89580#c7
>>> --Jason
>>>
>>> On Fri, Jun 26, 2015 at 1:07 AM, Eduardo Lima Mitev  
>>> wrote:
 From: Alejandro Piñeiro 

 This sets up the basic structure and placeholders for the implementation of
 texture operations, which will be added incrementally in subsequent 
 patches.

 Apart from helping split the texture support into smaller patches, this 
 patch
 is useful to get a simplified picture of the steps involved in the 
 emission of
 texture operations.

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
 ---
  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 120 
 -
  1 file changed, 119 insertions(+), 1 deletion(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
 index 08a70b0..c3638a0 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
 @@ -1412,7 +1412,125 @@ vec4_visitor::nir_swizzle_result(nir_tex_instr 
 *instr, dst_reg dest,
  void
  vec4_visitor::nir_emit_texture(nir_tex_instr *instr)
  {
 -   /* @TODO: Not yet implemented */
 +   unsigned sampler = instr->sampler_index;
 +   src_reg sampler_reg = src_reg(sampler);
 +
 +   const glsl_type *dest_type;
 +
 +   /* Load the texture operation sources */
 +   for (unsigned i = 0; i < instr->num_srcs; i++) {
 +  src_reg src = get_nir_src(instr->src[i].src);
 +
 +  switch (instr->src[i].src_type) {
 +  case nir_tex_src_comparitor:
 + /* @TODO: not yet implemented */
 + break;
 +
 +  case nir_tex_src_coord:
 + /* @TODO: not yet implemented */
 + break;
 +
 +  case nir_tex_src_ddx:
 + /* @TODO: not yet implemented */
 + break;
 +
 +  case nir_tex_src_ddy:
 + /* @TODO: not yet implemented */
 + break;
 +
 +  case nir_tex_src_lod:
 + /* @TODO: not yet implemented */
 + break;
 +
 +  case nir_tex_src_ms_index:
 + /* @TODO: not yet implemented */
 + break;
 +
 +  case nir_tex_src_offset:
 + /* @TODO: not yet implemented */
 + break;
 +
 +  case nir_tex_src_sampler_offset:
 + /* @TODO: not yet implemented */
 + break;
 +
 +  case nir_tex_src_projector:
 + unreachable("Should be lowered by do_lower_texture_projection");
 +
 +  case nir_tex_src_bias:
 + unreachable("LOD bias is not valid for vertex shaders.\n");
 +
 +  default:
 + unreachable("unknown texture source");
 +  }
 +   }
 +
 +   /* Get the texture operation opcode */
 +   enum

Re: [Mesa-dev] [PATCH 61/78] i965/nir/vec4: Add skeleton implementation of nir_emit_texture()

2015-07-13 Thread Jason Ekstrand

On Jul 13, 2015 7:14 AM, "Alejandro Piñeiro"  wrote:
>
>
>
> On 13/07/15 14:05, Jason Ekstrand wrote:
> > On Mon, Jul 13, 2015 at 3:00 AM, Alejandro Piñeiro 
wrote:
> >> On 01/07/15 01:37, Jason Ekstrand wrote:
> >>> If we can avoid duplication in the texturing code, that would be
> >>> really nice.  Could we do this as a refactor of the old code and then
> >>> a much smaller NIR function that calls some shared code?  That's what
> >>> we did for FS and it worked ok.  I looked at the layout of your code
> >>> and, after you finish reading instruction inputs, very little of it is
> >>> actually NIR-specific.
> >> Our wip second batch of commits has now a version that refactors the
old
> >> code. But I have a doubt. As part of this refactor, I created a new
> >> method to be used by brw_fs and brw_vec4, that returns the ir texture
> >> opcode based on the nir texture opcode. You can take a look to that
> >> method here [1]. The issue with this method is that ir_texture_opcode
is
> >> defined at glsl/ir.h inside a #ifdef __cplusplus. So I needed to rename
> >> brw_nir.c to brw_nir.cpp. On a previous early review [2] you mentioned
> >> that it would be better to keep it as brw_nir.c unless we have a good
> >> reason. Is using this enum good reason to do the renaming? If not, the
> >> other option I see is modify glsl/ir.h and move the enum definitions to
> >> the top, outside the ifdef __cplusplus, as other headers (like
> >> glsl/glsl_types.h) have.
> > Go ahead and make it a cpp file if you have to.
>
> Ok, thanks.
>
> > Another option would
> > be to do the map the other way: ir_opcode -> nir_tex_op and refactor
> > the code to use the nir_tex_op.
>
> But even if we do the map in the other way, the helper function would be
> using the enum type defined at glsl/ir.h, so we would need to make it a
> cpp file in any case. So for now, I will keep things simple, and keep
> the current mapping,  and do the change later.

The difference is that, with the patch I sent a few hours ago, you only
need to do the conversion one place so there's no need for the helper.

> > That's probably what we should be
> > doing in the FS now that the old visitor is gone.
> > --Jason
> >
> >> BR
> >>
> >> [1]
> >>
https://github.com/Igalia/mesa/commit/e81ce150ef931e50b6cb1aae0b42a79f448863f5
> >> [2] https://bugs.freedesktop.org/show_bug.cgi?id=89580#c7
> >>> --Jason
> >>>
> >>> On Fri, Jun 26, 2015 at 1:07 AM, Eduardo Lima Mitev 
wrote:
>  From: Alejandro Piñeiro 
> 
>  This sets up the basic structure and placeholders for the
implementation of
>  texture operations, which will be added incrementally in subsequent
patches.
> 
>  Apart from helping split the texture support into smaller patches,
this patch
>  is useful to get a simplified picture of the steps involved in the
emission of
>  texture operations.
> 
>  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
>  ---
>   src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 120
-
>   1 file changed, 119 insertions(+), 1 deletion(-)
> 
>  diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>  index 08a70b0..c3638a0 100644
>  --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>  +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>  @@ -1412,7 +1412,125 @@
vec4_visitor::nir_swizzle_result(nir_tex_instr *instr, dst_reg dest,
>   void
>   vec4_visitor::nir_emit_texture(nir_tex_instr *instr)
>   {
>  -   /* @TODO: Not yet implemented */
>  +   unsigned sampler = instr->sampler_index;
>  +   src_reg sampler_reg = src_reg(sampler);
>  +
>  +   const glsl_type *dest_type;
>  +
>  +   /* Load the texture operation sources */
>  +   for (unsigned i = 0; i < instr->num_srcs; i++) {
>  +  src_reg src = get_nir_src(instr->src[i].src);
>  +
>  +  switch (instr->src[i].src_type) {
>  +  case nir_tex_src_comparitor:
>  + /* @TODO: not yet implemented */
>  + break;
>  +
>  +  case nir_tex_src_coord:
>  + /* @TODO: not yet implemented */
>  + break;
>  +
>  +  case nir_tex_src_ddx:
>  + /* @TODO: not yet implemented */
>  + break;
>  +
>  +  case nir_tex_src_ddy:
>  + /* @TODO: not yet implemented */
>  + break;
>  +
>  +  case nir_tex_src_lod:
>  + /* @TODO: not yet implemented */
>  + break;
>  +
>  +  case nir_tex_src_ms_index:
>  + /* @TODO: not yet implemented */
>  + break;
>  +
>  +  case nir_tex_src_offset:
>  + /* @TODO: not yet implemented */
>  + break;
>  +
>  +  case nir_tex_src_sampler_offset:
>  + /* @TODO: not yet implemented */
>  + break;
>  +
>  +  case nir_tex_src_projector:

Re: [Mesa-dev] [PATCH] i965/fs: Make the texturing helpers take NIR opcodes instead of old IR ones

2015-07-13 Thread Jason Ekstrand

On Jul 13, 2015 6:04 AM, "Francisco Jerez"  wrote:
>
> Jason Ekstrand  writes:
>
> > Now that the old GLSL IR visitor code is gone, having the remap is
silly.
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs.h   |  12 +--
> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  18 +---
> >  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 135
++-
> >  3 files changed, 75 insertions(+), 90 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.h
b/src/mesa/drivers/dri/i965/brw_fs.h
> > index 5243079..1690f4a 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.h
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> > @@ -109,7 +109,7 @@ public:
> > void compute_clip_distance(gl_clip_plane *clip_planes);
> >
> > uint32_t gather_channel(int orig_chan, uint32_t sampler);
> > -   void swizzle_result(ir_texture_opcode op, int dest_components,
> > +   void swizzle_result(nir_texop op, int dest_components,
> > fs_reg orig_val, uint32_t sampler);
> >
> > int type_size(const struct glsl_type *type);
> > @@ -207,28 +207,28 @@ public:
> > void compute_sample_position(fs_reg dst, fs_reg int_sample_pos);
> > fs_reg rescale_texcoord(fs_reg coordinate, int coord_components,
> > bool is_rect, uint32_t sampler, int
texunit);
> > -   fs_inst *emit_texture_gen4(ir_texture_opcode op, fs_reg dst,
> > +   fs_inst *emit_texture_gen4(nir_texop op, fs_reg dst,
> >fs_reg coordinate, int coord_components,
> >fs_reg shadow_comp,
> >fs_reg lod, fs_reg lod2, int
grad_components,
> >uint32_t sampler);
> > -   fs_inst *emit_texture_gen4_simd16(ir_texture_opcode op, fs_reg dst,
> > +   fs_inst *emit_texture_gen4_simd16(nir_texop op, fs_reg dst,
> >   fs_reg coordinate, int
vector_elements,
> >   fs_reg shadow_c, fs_reg lod,
> >   uint32_t sampler);
> > -   fs_inst *emit_texture_gen5(ir_texture_opcode op, fs_reg dst,
> > +   fs_inst *emit_texture_gen5(nir_texop op, fs_reg dst,
> >fs_reg coordinate, int coord_components,
> >fs_reg shadow_comp,
> >fs_reg lod, fs_reg lod2, int
grad_components,
> >fs_reg sample_index, uint32_t sampler,
> >bool has_offset);
> > -   fs_inst *emit_texture_gen7(ir_texture_opcode op, fs_reg dst,
> > +   fs_inst *emit_texture_gen7(nir_texop op, fs_reg dst,
> >fs_reg coordinate, int coord_components,
> >fs_reg shadow_comp,
> >fs_reg lod, fs_reg lod2, int
grad_components,
> >fs_reg sample_index, fs_reg mcs, fs_reg
sampler,
> >fs_reg offset_value);
> > -   void emit_texture(ir_texture_opcode op,
> > +   void emit_texture(nir_texop op,
> >   const glsl_type *dest_type,
> >   fs_reg coordinate, int components,
> >   fs_reg shadow_c,
>
> Hold on, I'm about to send a series that gets rid of the
> ir_texture_opcode argument of these gen-specific functions and replaces
> it with a backend_instruction opcode.  If you wait a little this change
> will involve much less churn.

Fine with me.

> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > index caf1300..d8a6f3c 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > @@ -1739,23 +1739,7 @@ fs_visitor::nir_emit_texture(const fs_builder
&bld, nir_tex_instr *instr)
> >glsl_type::get_instance(dest_base_type,
nir_tex_instr_dest_size(instr),
> >1);
> >
> > -   ir_texture_opcode op;
> > -   switch (instr->op) {
> > -   case nir_texop_lod: op = ir_lod; break;
> > -   case nir_texop_query_levels: op = ir_query_levels; break;
> > -   case nir_texop_tex: op = ir_tex; break;
> > -   case nir_texop_tg4: op = ir_tg4; break;
> > -   case nir_texop_txb: op = ir_txb; break;
> > -   case nir_texop_txd: op = ir_txd; break;
> > -   case nir_texop_txf: op = ir_txf; break;
> > -   case nir_texop_txf_ms: op = ir_txf_ms; break;
> > -   case nir_texop_txl: op = ir_txl; break;
> > -   case nir_texop_txs: op = ir_txs; break;
> > -   default:
> > -  unreachable("unknown texture opcode");
> > -   }
> > -
> > -   emit_texture(op, dest_type, coordinate, instr->coord_components,
> > +   emit_texture(instr->op, dest_type, coordinate,
instr->coord_components,
> >  shadow_comparitor, lod, lod2, lod_components,
sample_index,
> >  tex_offset, mcs, gather_component,
> >  is_cube_array, is_rect, sampler, sampler_reg,

Re: [Mesa-dev] [RFC] gallium: add interface for writable shader images

2015-07-13 Thread Jose Fonseca


On 09/07/15 22:05, Marek Olšák wrote:

I'd like to discuss one more thing that will affect whether image
slots will be global (shared by all shaders) or not.

Which image unit an image uniform uses is not a compile-time thing,
but it's specified later using glUniform1i. That means we need a
per-shader table that maps image uniforms to global image units. One
possible solution is to add this pipe_context function:

void (*set_shader_image_mapping)(
struct pipe_context *, unsigned shader,
unsigned start_decl_index, unsigned count,
unsigned *decl_to_slot_mapping);

This is only required if the shader image slots are global and not
per-shader. (if they are per-shader, st/mesa can reorder the slots for
each shader independently just like it already does for textures and
UBOs) Shader storage buffer objects suffer from the same issue. Atomic
counters don't.

Therefore, image slots must be per-shader (like sampler slots) to
avoid this craziness and keep things simple.


Sounds OK to me too.

D3D11's UAV binding points are global, but that can be easily 
accomodated by binding the same UAVs array on all stages.


Jose


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 61/78] i965/nir/vec4: Add skeleton implementation of nir_emit_texture()

2015-07-13 Thread Jason Ekstrand

On Jul 13, 2015 7:17 AM, "Jason Ekstrand"  wrote:
>
>
> On Jul 13, 2015 7:14 AM, "Alejandro Piñeiro"  wrote:
> >
> >
> >
> > On 13/07/15 14:05, Jason Ekstrand wrote:
> > > On Mon, Jul 13, 2015 at 3:00 AM, Alejandro Piñeiro <
apinhe...@igalia.com> wrote:
> > >> On 01/07/15 01:37, Jason Ekstrand wrote:
> > >>> If we can avoid duplication in the texturing code, that would be
> > >>> really nice.  Could we do this as a refactor of the old code and
then
> > >>> a much smaller NIR function that calls some shared code?  That's
what
> > >>> we did for FS and it worked ok.  I looked at the layout of your code
> > >>> and, after you finish reading instruction inputs, very little of it
is
> > >>> actually NIR-specific.
> > >> Our wip second batch of commits has now a version that refactors the
old
> > >> code. But I have a doubt. As part of this refactor, I created a new
> > >> method to be used by brw_fs and brw_vec4, that returns the ir texture
> > >> opcode based on the nir texture opcode. You can take a look to that
> > >> method here [1]. The issue with this method is that
ir_texture_opcode is
> > >> defined at glsl/ir.h inside a #ifdef __cplusplus. So I needed to
rename
> > >> brw_nir.c to brw_nir.cpp. On a previous early review [2] you
mentioned
> > >> that it would be better to keep it as brw_nir.c unless we have a good
> > >> reason. Is using this enum good reason to do the renaming? If not,
the
> > >> other option I see is modify glsl/ir.h and move the enum definitions
to
> > >> the top, outside the ifdef __cplusplus, as other headers (like
> > >> glsl/glsl_types.h) have.
> > > Go ahead and make it a cpp file if you have to.
> >
> > Ok, thanks.
> >
> > > Another option would
> > > be to do the map the other way: ir_opcode -> nir_tex_op and refactor
> > > the code to use the nir_tex_op.
> >
> > But even if we do the map in the other way, the helper function would be
> > using the enum type defined at glsl/ir.h, so we would need to make it a
> > cpp file in any case. So for now, I will keep things simple, and keep
> > the current mapping,  and do the change later.
>
> The difference is that, with the patch I sent a few hours ago, you only
need to do the conversion one place so there's no need for the helper.

For whatever it's worth I'm also OK with just duplicating the 10/lines of
code to do the conversion and not bothering with the helper.

> > > That's probably what we should be
> > > doing in the FS now that the old visitor is gone.
> > > --Jason
> > >
> > >> BR
> > >>
> > >> [1]
> > >>
https://github.com/Igalia/mesa/commit/e81ce150ef931e50b6cb1aae0b42a79f448863f5
> > >> [2] https://bugs.freedesktop.org/show_bug.cgi?id=89580#c7
> > >>> --Jason
> > >>>
> > >>> On Fri, Jun 26, 2015 at 1:07 AM, Eduardo Lima Mitev <
el...@igalia.com> wrote:
> >  From: Alejandro Piñeiro 
> > 
> >  This sets up the basic structure and placeholders for the
implementation of
> >  texture operations, which will be added incrementally in
subsequent patches.
> > 
> >  Apart from helping split the texture support into smaller patches,
this patch
> >  is useful to get a simplified picture of the steps involved in the
emission of
> >  texture operations.
> > 
> >  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
> >  ---
> >   src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 120
-
> >   1 file changed, 119 insertions(+), 1 deletion(-)
> > 
> >  diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> >  index 08a70b0..c3638a0 100644
> >  --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> >  +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> >  @@ -1412,7 +1412,125 @@
vec4_visitor::nir_swizzle_result(nir_tex_instr *instr, dst_reg dest,
> >   void
> >   vec4_visitor::nir_emit_texture(nir_tex_instr *instr)
> >   {
> >  -   /* @TODO: Not yet implemented */
> >  +   unsigned sampler = instr->sampler_index;
> >  +   src_reg sampler_reg = src_reg(sampler);
> >  +
> >  +   const glsl_type *dest_type;
> >  +
> >  +   /* Load the texture operation sources */
> >  +   for (unsigned i = 0; i < instr->num_srcs; i++) {
> >  +  src_reg src = get_nir_src(instr->src[i].src);
> >  +
> >  +  switch (instr->src[i].src_type) {
> >  +  case nir_tex_src_comparitor:
> >  + /* @TODO: not yet implemented */
> >  + break;
> >  +
> >  +  case nir_tex_src_coord:
> >  + /* @TODO: not yet implemented */
> >  + break;
> >  +
> >  +  case nir_tex_src_ddx:
> >  + /* @TODO: not yet implemented */
> >  + break;
> >  +
> >  +  case nir_tex_src_ddy:
> >  + /* @TODO: not yet implemented */
> >  + break;
> >  +
> >  +  case nir_tex_src_lod:
> >  + /* @TODO: not yet implemented */
> >  +

Re: [Mesa-dev] [PATCH 3/4] radeon, r200: allow hyperz for radeon DRM module v2

2015-07-13 Thread Alex Deucher

On Sat, Jul 11, 2015 at 8:48 AM, Roland Scheidegger  wrote:
> Am 11.07.2015 um 11:25 schrieb Marek Olšák:
>> On Fri, Jul 10, 2015 at 11:02 PM, Roland Scheidegger  
>> wrote:
>>> Am 10.07.2015 um 19:41 schrieb Emil Velikov:
 On 10 July 2015 at 13:18, Roland Scheidegger  wrote:
> Am 10.07.2015 um 05:44 schrieb Michel Dänzer:
>> On 10.07.2015 05:13, Emil Velikov wrote:
>>> The original code only half considered hyperz as an option. As per
>>> previous commit "major != 2 cannot occur" we can simply things, and
>>> allow users to set the option if they choose to do so.
>>>
>>> Signed-off-by: Emil Velikov 
>>> ---
>>>  src/mesa/drivers/dri/r200/r200_context.c | 10 ++
>>>  src/mesa/drivers/dri/radeon/radeon_context.c |  9 ++---
>>>  2 files changed, 4 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/r200/r200_context.c 
>>> b/src/mesa/drivers/dri/r200/r200_context.c
>>> index 40cc50a..2a42ab3 100644
>>> --- a/src/mesa/drivers/dri/r200/r200_context.c
>>> +++ b/src/mesa/drivers/dri/r200/r200_context.c
>>> @@ -225,14 +225,8 @@ GLboolean r200CreateContext( gl_api api,
>>> rmesa->radeon.initialMaxAnisotropy = 
>>> driQueryOptionf(&rmesa->radeon.optionCache,
>>>  
>>> "def_max_anisotropy");
>>>
>>> -   if ( sPriv->drm_version.major == 1
>>> -   && driQueryOptionb( &rmesa->radeon.optionCache, "hyperz" ) ) {
>>> -  if ( sPriv->drm_version.minor < 13 )
>>> - fprintf( stderr, "DRM version 1.%d too old to support HyperZ, "
>>> -  "disabling.\n", sPriv->drm_version.minor );
>>> -  else
>>> - rmesa->using_hyperz = GL_TRUE;
>>> -   }
>>
>> This code only set rmesa->using_hyperz = GL_TRUE if
>> sPriv->drm_version.major == 1. It was disabled for KMS in commit
>> e541845959761e9f47d14ade6b58a32db04ef7e4 ("r200: Fix piglit paths 
>> test.").
>>
>>
>>> +   if (driQueryOptionb( &rmesa->radeon.optionCache, "hyperz"))
>>> +  rmesa->using_hyperz = GL_TRUE;
>>
>> This enables it again for KMS. Maybe that's okay though, especially if
>> the driconf option is disabled by default.
>
>
> Oh you're right. The reason given though why it was disabled looks bogus
> to me ("Piglit doesn't like HyperZ warning so disable it for kms." ???),
> and I can't see why that would have only applied to r200, not r100. So
> it should be fine. (Of course, you will get more failures with that
> enabled with piglit, some things just plain won't work, but that was
> just the case with UMS too, and the reason why it never was enabled by
> default.)
>
 Yes without Roland's knowledge if hyperz is supposed to work for KMS
 the current code is quite ambiguous. If you guys prefer I can simply
 rip out the whole thing, then again hyperz is disabled by default so
 no harm should follow with this patch.

 I don't mind either way.
 Emil

>>>
>>> I'd say keep the option (for both drivers) for now.
>>> I've got some r200 in a dust bin actually, haven't touched it in years...
>>> I think it should work in the same way as it does on r100.
>>> The depth buffer metadata (i.e. the bits saying if a tile is compressed
>>> or not etc.) is really fixed onchip cache, and there's no attempt to
>>> grab or restore that data when the depth buffer is changed, which
>>> obviously isn't quite right... The other limitation is that you cannot
>>> read or write the depth buffer directly (well you can but you get back
>>> garbage - not being able to do a glReadPixel on the depth buffer alone
>>> will cause lots of piglit failures). UMS could also do fast z clear,
>>> this is pretty simple as the command would just set the on-chip tile
>>> bits to the "cleared" state, but this never made it to KMS - the rest of
>>> hyperz should work without this, however, though I'm not entirely sure.
>>> IIRC the r300 has actually pretty much the same hw limitations, however
>>> the driver there fixes these issues (though I think that chip had some
>>> nicer way of fast z clear).
>>
>> Not really, there is just a packet to clear the on-chip tile bits and
>> we always clear the whole surface.
> Yes but for r100/r200 I don't know if such a packet exists. The kernel
> did all the effort for figuring out the actual on-chip addresses to
> clear (of course, with the shared z buffer of late, that way actually
> multiple apps ran with hyperz enabled in parallel correctly, as long as
> they were placed "far enough apart" on the screen (so the tiles wouldn't
> intersect).
> It is possible though such a packet exists on r100/r200 too, and we just
> never figured it out :-).

If there wasn't a special packet, I think you could just write to some
registers to get the same effect.

Alex

>
> Roland
>
>>
>> For reading back zbuffers, a full-sc

Re: [Mesa-dev] [PATCH] nv50: limit the maximum number of samplers to 16

2015-07-13 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

On Mon, Jul 13, 2015 at 5:24 AM, Samuel Pitoiset
 wrote:
> NV50_3D_BIND_TSC only allows to bind 16 samplers, and since we don't
> want to do anything with NV50_3D_BIND_TSC2, just limit the maximum
> number of samplers to 16 like for nvc0.
>
> This fixes dmesg fails with the following piglit test:
>  max-samplers
>
> But the test still fails.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> index 6583a35..46ae0b8 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> @@ -286,7 +286,7 @@ nv50_screen_get_shader_param(struct pipe_screen *pscreen, 
> unsigned shader,
> case PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS:
>/* The chip could handle more sampler views than samplers */
> case PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS:
> -  return MIN2(32, PIPE_MAX_SAMPLERS);
> +  return MIN2(16, PIPE_MAX_SAMPLERS);
> case PIPE_SHADER_CAP_DOUBLES:
> case PIPE_SHADER_CAP_TGSI_DROUND_SUPPORTED:
> case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
> --
> 2.4.5
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] nv50: add nesting support for occlusion queries

2015-07-13 Thread Ilia Mirkin

This doesn't fix up nv50_query_result. And there's also all the rotate
logic in nvc0 which seems to think it's required. Quite frankly I
never fully grokked all that stuff, but it seems highly relevant to
this.

On Mon, Jul 13, 2015 at 4:49 AM, Samuel Pitoiset
 wrote:
> This is loosely based on nvc0.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_query.c  | 27 
> --
>  src/gallium/drivers/nouveau/nv50/nv50_screen.h |  2 ++
>  2 files changed, 19 insertions(+), 10 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_query.c
> index 81f7474..80d3fd2 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
> @@ -49,6 +49,7 @@ struct nv50_query {
> uint32_t offset; /* base + i * 32 */
> uint8_t state;
> boolean is64bit;
> +   int nesting; /* only used for occlusion queries */
> struct nouveau_mm_allocation *mm;
> struct nouveau_fence *fence;
>  };
> @@ -175,11 +176,16 @@ nv50_query_begin(struct pipe_context *pipe, struct 
> pipe_query *pq)
>
> switch (q->type) {
> case PIPE_QUERY_OCCLUSION_COUNTER:
> -  PUSH_SPACE(push, 4);
> -  BEGIN_NV04(push, NV50_3D(COUNTER_RESET), 1);
> -  PUSH_DATA (push, NV50_3D_COUNTER_RESET_SAMPLECNT);
> -  BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
> -  PUSH_DATA (push, 1);
> +  q->nesting = nv50->screen->num_occlusion_queries_active++;
> +  if (q->nesting) {
> + nv50_query_get(push, q, 0x10, 0x0100f002);
> +  } else {
> + PUSH_SPACE(push, 4);
> + BEGIN_NV04(push, NV50_3D(COUNTER_RESET), 1);
> + PUSH_DATA (push, NV50_3D_COUNTER_RESET_SAMPLECNT);
> + BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
> + PUSH_DATA (push, 1);
> +  }
>break;
> case PIPE_QUERY_PRIMITIVES_GENERATED:
>nv50_query_get(push, q, 0x10, 0x06805002);
> @@ -223,9 +229,11 @@ nv50_query_end(struct pipe_context *pipe, struct 
> pipe_query *pq)
> switch (q->type) {
> case PIPE_QUERY_OCCLUSION_COUNTER:
>nv50_query_get(push, q, 0, 0x0100f002);
> -  PUSH_SPACE(push, 2);
> -  BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
> -  PUSH_DATA (push, 0);
> +  if (--nv50->screen->num_occlusion_queries_active == 0) {
> + PUSH_SPACE(push, 2);
> + BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
> + PUSH_DATA (push, 0);
> +  }
>break;
> case PIPE_QUERY_PRIMITIVES_GENERATED:
>nv50_query_get(push, q, 0, 0x06805002);
> @@ -396,8 +404,7 @@ nv50_render_condition(struct pipe_context *pipe,
>case PIPE_QUERY_OCCLUSION_COUNTER:
>case PIPE_QUERY_OCCLUSION_PREDICATE:
>   if (likely(!condition)) {
> -/* XXX: Placeholder, handle nesting here if available */
> -if (unlikely(false))
> +if (unlikely(q->nesting))
> cond = wait ? NV50_3D_COND_MODE_NOT_EQUAL :
>   NV50_3D_COND_MODE_ALWAYS;
>  else
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h 
> b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
> index 881051b..3a12a1f 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
> @@ -54,6 +54,8 @@ struct nv50_screen {
> struct nv50_context *cur_ctx;
> struct nv50_graph_state save_state;
>
> +   int num_occlusion_queries_active;
> +
> struct nouveau_bo *code;
> struct nouveau_bo *uniforms;
> struct nouveau_bo *txc; /* TIC (offset 0) and TSC (65536) */
> --
> 2.4.5
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/5] applegl: Provide requirements of _SET_DrawBuffers

2015-07-13 Thread Emil Velikov

On 18 June 2015 at 06:53, Julien Isorce  wrote:
> From: Jon TURNEY 
>
> _SET_DrawBuffers requires driDispatchRemapTable,
> so we need to link with libmesa for remap.c.
> libmesa requires the C++ linker.
>
> Also need to arrange to call _mesa_init_remap_table()
> to initialize the remap table.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90311
> Signed-off-by: Jon TURNEY 
> ---
>  src/glx/Makefile.am | 5 -
>  src/glx/apple/apple_glapi.c | 3 +++
>  2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am
> index 6e50e09..de24496 100644
> --- a/src/glx/Makefile.am
> +++ b/src/glx/Makefile.am
> @@ -139,7 +139,10 @@ libglx_la_SOURCES += \
> applegl_glx.c
>
>  SUBDIRS += apple
> -libglx_la_LIBADD += $(builddir)/apple/libappleglx.la
> +libglx_la_LIBADD += \
> +   $(builddir)/apple/libappleglx.la \
> +   $(top_builddir)/src/mesa/libmesa.la
> +nodist_EXTRA_lib@GL_LIB@_la_SOURCES = dummy.cpp
>  endif
>
>  GL_LIBS = \
> diff --git a/src/glx/apple/apple_glapi.c b/src/glx/apple/apple_glapi.c
> index 4d19f7f..849044b 100644
> --- a/src/glx/apple/apple_glapi.c
> +++ b/src/glx/apple/apple_glapi.c
> @@ -39,6 +39,7 @@
>  #include 
>
>  #include "main/glheader.h"
> +#include "main/remap.h"
>  #include "glapi.h"
>  #include "glapitable.h"
>  #include "main/dispatch.h"
> @@ -54,6 +55,8 @@ static void _apple_glapi_create_table(void) {
>  if (__applegl_api)
>  return;
>
> +_mesa_init_remap_table();
> +
>  __ogl_framework_api = 
> _glapi_create_table_from_handle(apple_cgl_get_dl_handle(), "gl");
>  assert(__ogl_framework_api);
>
So as mentioned previously glx (libGL) should interact with the
dispatch directly, rather than have any knowledge in the
implementation detail (which lies in mesa).

src/glx/indirect_glx.c is a nice example, where __glXNewIndirectAPI
(autogenerated from
mapi/glapi/gen/glX_proto_send.py) is used to create the dispatch table
and then set it with _glapi_set_dispatch().

AppleDRI uses similar approach
_glapi_create_table_from_handle(generated from
mapi/glapi/gen/gl_gentable.py), but there are things that could be
improved/fixed.

 - __ogl_framework_api does not need to be over a thousand entries
long. There are six overrides + __ogl_framework_api->Scissor used.
So tweaking the generator to directly produce __applegl_api and(?) a
slimmed down __ogl_framework_api sounds like a good idea to me.

Further on one could also:
 - Drop the multiple forward declarations of __ogl_framework_api.
 - Wrap/implement appledri around __GLXDRI(display|screen) shoving
storing things like dl_handle, thus allowing things to be torn down on
exit. Namely I'm suggesting to cleanup and test Jon's earlier work.
 - apple_glapi_set_dispatch is called "too late" - at
applegl_bind_context. According to the docs/spec one should be able to
fetch the function pointers without any context let alone a currently
bound one.
 - As a follow up from the above struct
glx_context_vtable::get_proc_address could be nuked, making the struct
fit nicely into 32/64 byte cache :-)

Obviously if you can find an alternative way (but not as hacky as this
one) that'll be great.

Cheers,
Emil

P.S. Why is there no appledriproto package, similar to xf86driproto
and friends ? Having appledri{,str}.h duplicated seems fragile and
ill-advised.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/4] r200: fix fbo rendering by disabling optimized texture format chooser

2015-07-13 Thread Emil Velikov

Hi Roland,

Can I put forward a couple of trivial requrest

On 11 July 2015 at 19:39,   wrote:
> From: Roland Scheidegger 
>
> It is rather unfortunate that we don't know if a texture is going to be used
> as a rt later, and we lack the means to do something about a format chosen
> which we can't render to directly, so disable this and always chose renderable
> format for rgba8 textures.
> This addresses an issue raised on (old) bug 51658
Please use the full bugzilla link.

> with gnome-shell, don't know
> if that's still applicable but it might fix other things as well.
> ---
>  src/mesa/drivers/dri/radeon/radeon_texture.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/radeon/radeon_texture.c 
> b/src/mesa/drivers/dri/radeon/radeon_texture.c
> index edfd48b..95496b1 100644
> --- a/src/mesa/drivers/dri/radeon/radeon_texture.c
> +++ b/src/mesa/drivers/dri/radeon/radeon_texture.c
> @@ -224,7 +224,19 @@ static mesa_format 
> radeonChooseTexFormat(radeonContextPtr rmesa,
> const GLuint ui = 1;
> const GLubyte littleEndian = *((const GLubyte *)&ui);
>
> -   if (fbo)
> +
> +   /* Unfortunately, regardless the fbo flag, we might still be asked to
> +* attach a texture to a fbo later, which then won't succeed if we 
> chose
> +* one which isn't renderable. And unlike more exotic formats, apps 
> aren't
> +* really prepared for the incomplete framebuffer this results in 
> (they'd
> +* have to retry with same internalFormat even, just different
> +* srcFormat/srcType, which can't really be expected anyway).
> +* Ideally, we'd defer format selection until later (if the texture is
> +* used as a rt it's likely there's never data uploaded to it before 
> attached
> +* to a fbo), but this isn't really possible, so for now just always 
> use
> +* a renderable format.
> +*/
> +   if (1||fbo)
Some spaces around || will greatly improve readability.

Cheers
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/4] radeon: fix some potential big endian issues

2015-07-13 Thread Emil Velikov

On 11 July 2015 at 19:39,   wrote:

> @@ -186,17 +172,24 @@ static inline void emit_cb_setup(struct r100_context 
> *r100,
>
>  /* XXX others?  BE/LE? */
Drop the BE/LE part of the comment ?

>  switch (mesa_format) {
> +/* le */
>  case MESA_FORMAT_B8G8R8A8_UNORM:
>  case MESA_FORMAT_B8G8R8X8_UNORM:
> +/* be */
> +case MESA_FORMAT_A8R8G8B8_UNORM:
> +case MESA_FORMAT_X8R8G8B8_UNORM:
Something like the following might be easier for you/others X months
down the line.

+/* The former format of each pair is for LE while latter for BE systems. */
 switch (mesa_format) {
 case MESA_FORMAT_B8G8R8A8_UNORM:
+case MESA_FORMAT_A8R8G8B8_UNORM:
 case MESA_FORMAT_B8G8R8X8_UNORM:
+case MESA_FORMAT_X8R8G8B8_UNORM:

Same suggestion applies for 4/4.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50: limit the maximum number of samplers to 16

2015-07-13 Thread Emil Velikov

On 13 July 2015 at 10:24, Samuel Pitoiset  wrote:
> NV50_3D_BIND_TSC only allows to bind 16 samplers, and since we don't
> want to do anything with NV50_3D_BIND_TSC2
Is there some widely know issue with it ? mesa & envytools does not
mention anything (based of a quick scan).

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 61/78] i965/nir/vec4: Add skeleton implementation of nir_emit_texture()

2015-07-13 Thread Alejandro Piñeiro



On 13/07/15 16:26, Jason Ekstrand wrote:
>
>
>
> > > > Go ahead and make it a cpp file if you have to.
> > >
> > > Ok, thanks.
> > >
> > > > Another option would
> > > > be to do the map the other way: ir_opcode -> nir_tex_op and refactor
> > > > the code to use the nir_tex_op.
> > >
> > > But even if we do the map in the other way, the helper function
> would be
> > > using the enum type defined at glsl/ir.h, so we would need to make
> it a
> > > cpp file in any case. So for now, I will keep things simple, and keep
> > > the current mapping,  and do the change later.
> >
> > The difference is that, with the patch I sent a few hours ago, you
> only need to do the conversion one place so there's no need for the
> helper.
>
> For whatever it's worth I'm also OK with just duplicating the 10/lines
> of code to do the conversion and not bothering with the helper.
>

I just realized that today Francisco Jerez mentioned that is working to
remove the ir->nir texture op mapping on FS, replacing it with backend
instruction opcode:
http://lists.freedesktop.org/archives/mesa-dev/2015-July/088696.html

So probably it would be better to wait for his change, and avoid the
need of that mapping on the vec4 path too. That would remove the need of
the helper function and the cpp renaming.

That sounds ok or should we go on independently of Francisco's work?

BR

-- 
Alejandro Piñeiro (apinhe...@igalia.com)

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50: limit the maximum number of samplers to 16

2015-07-13 Thread Ilia Mirkin

On Mon, Jul 13, 2015 at 12:08 PM, Emil Velikov  wrote:
> On 13 July 2015 at 10:24, Samuel Pitoiset  wrote:
>> NV50_3D_BIND_TSC only allows to bind 16 samplers, and since we don't
>> want to do anything with NV50_3D_BIND_TSC2
> Is there some widely know issue with it ? mesa & envytools does not
> mention anything (based of a quick scan).

It's for an extra 2 system samplers (not sure what for, really). I've
never seen the blob touch them.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50: limit the maximum number of samplers to 16

2015-07-13 Thread samuel.pitoiset




On 13/07/2015 18:08, Emil Velikov wrote:

Is there some widely know issue with it ? mesa & envytools does not
mention anything (based of a quick scan).


There is no know issue with NV50_3D_BIND_TSC2, but Ilia doesn't seem to 
want to use it. :-)
That's why I followed his suggestion by reducing the max. number of 
samplers to 16 like for nvc0.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] i965/fs: Don't use the pixel interpolater for centroid interpolation

2015-07-13 Thread Chris Forbes

Oh, never mind - I see there's another hunk that my mailer had folded away
for some reason. I'm happy that it's correct now :)
On Jul 13, 2015 23:33, "Neil Roberts"  wrote:

> Chris Forbes  writes:
>
> > Nitpicks aside, I don't think this is a great idea now that you've got
> > the SKL PI working.
>
> Can you explain why you don't think this is a good idea? Is it because
> it is an optimisation for something that is not known to be a big use
> case so carrying around the extra code just adds unnecessary maintenance
> burden? I could agree with that so I'm happy to abandon the patch for
> now if that's the general consensus.
>
> > I also think it's broken -- you need to arrange to have the centroid
> > barycentric coords delivered to the FS thread, which won't be
> > happening if this is the *only* use of them. Masked in the tests,
> > because they compare with a centroid-qualified input. [I'm assuming
> > you don't always get these delivered to the FS in SKL, but no docs
> > access...]
>
> The changes to brw_compute_barycentric_interp_modes in the patch ensure
> that the centroid barycentric coords are delivered whenever
> interpolateAtCentroid is used in a shader. I don't think this is a
> problem. At least it seems to work in a simple test without using a
> separate varying with the centroid qualifier.
>
> Regards,
> - Neil
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] nv50: add nesting support for occlusion queries

2015-07-13 Thread Samuel Pitoiset

This is loosely based on nvc0.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nv50/nv50_query.c  | 29 --
 src/gallium/drivers/nouveau/nv50/nv50_screen.h |  2 ++
 2 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
b/src/gallium/drivers/nouveau/nv50/nv50_query.c
index 81f7474..a5b95c1 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
@@ -49,6 +49,7 @@ struct nv50_query {
uint32_t offset; /* base + i * 32 */
uint8_t state;
boolean is64bit;
+   int nesting; /* only used for occlusion queries */
struct nouveau_mm_allocation *mm;
struct nouveau_fence *fence;
 };
@@ -175,11 +176,16 @@ nv50_query_begin(struct pipe_context *pipe, struct 
pipe_query *pq)
 
switch (q->type) {
case PIPE_QUERY_OCCLUSION_COUNTER:
-  PUSH_SPACE(push, 4);
-  BEGIN_NV04(push, NV50_3D(COUNTER_RESET), 1);
-  PUSH_DATA (push, NV50_3D_COUNTER_RESET_SAMPLECNT);
-  BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
-  PUSH_DATA (push, 1);
+  q->nesting = nv50->screen->num_occlusion_queries_active++;
+  if (q->nesting) {
+ nv50_query_get(push, q, 0x10, 0x0100f002);
+  } else {
+ PUSH_SPACE(push, 4);
+ BEGIN_NV04(push, NV50_3D(COUNTER_RESET), 1);
+ PUSH_DATA (push, NV50_3D_COUNTER_RESET_SAMPLECNT);
+ BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
+ PUSH_DATA (push, 1);
+  }
   break;
case PIPE_QUERY_PRIMITIVES_GENERATED:
   nv50_query_get(push, q, 0x10, 0x06805002);
@@ -223,9 +229,11 @@ nv50_query_end(struct pipe_context *pipe, struct 
pipe_query *pq)
switch (q->type) {
case PIPE_QUERY_OCCLUSION_COUNTER:
   nv50_query_get(push, q, 0, 0x0100f002);
-  PUSH_SPACE(push, 2);
-  BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
-  PUSH_DATA (push, 0);
+  if (--nv50->screen->num_occlusion_queries_active == 0) {
+ PUSH_SPACE(push, 2);
+ BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
+ PUSH_DATA (push, 0);
+  }
   break;
case PIPE_QUERY_PRIMITIVES_GENERATED:
   nv50_query_get(push, q, 0, 0x06805002);
@@ -319,7 +327,7 @@ nv50_query_result(struct pipe_context *pipe, struct 
pipe_query *pq,
   res8[0] = TRUE;
   break;
case PIPE_QUERY_OCCLUSION_COUNTER: /* u32 sequence, u32 count, u64 time */
-  res64[0] = q->data[1];
+  res64[0] = q->data[1] - q->data[5];
   break;
case PIPE_QUERY_PRIMITIVES_GENERATED: /* u64 count, u64 time */
case PIPE_QUERY_PRIMITIVES_EMITTED: /* u64 count, u64 time */
@@ -396,8 +404,7 @@ nv50_render_condition(struct pipe_context *pipe,
   case PIPE_QUERY_OCCLUSION_COUNTER:
   case PIPE_QUERY_OCCLUSION_PREDICATE:
  if (likely(!condition)) {
-/* XXX: Placeholder, handle nesting here if available */
-if (unlikely(false))
+if (unlikely(q->nesting))
cond = wait ? NV50_3D_COND_MODE_NOT_EQUAL :
  NV50_3D_COND_MODE_ALWAYS;
 else
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
index 881051b..3a12a1f 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
@@ -54,6 +54,8 @@ struct nv50_screen {
struct nv50_context *cur_ctx;
struct nv50_graph_state save_state;
 
+   int num_occlusion_queries_active;
+
struct nouveau_bo *code;
struct nouveau_bo *uniforms;
struct nouveau_bo *txc; /* TIC (offset 0) and TSC (65536) */
-- 
2.4.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] nv50: turn samples counts off during blit

2015-07-13 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

On Mon, Jul 13, 2015 at 4:49 AM, Samuel Pitoiset
 wrote:
> Fixes the following piglit test:
>   occlusion_query_meta_no_fragments
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_surface.c | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_surface.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_surface.c
> index dc9852d..66eccc2 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_surface.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_surface.c
> @@ -1432,6 +1432,7 @@ static void
>  nv50_blit(struct pipe_context *pipe, const struct pipe_blit_info *info)
>  {
> struct nv50_context *nv50 = nv50_context(pipe);
> +   struct nouveau_pushbuf *push = nv50->base.pushbuf;
> boolean eng3d = FALSE;
>
> if (util_format_is_depth_or_stencil(info->dst.resource->format)) {
> @@ -1493,10 +1494,20 @@ nv50_blit(struct pipe_context *pipe, const struct 
> pipe_blit_info *info)
>  info->src.box.height != -info->dst.box.height))
>eng3d = TRUE;
>
> +   if (nv50->screen->num_occlusion_queries_active) {
> +  BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
> +  PUSH_DATA (push, 0);
> +   }
> +
> if (!eng3d)
>nv50_blit_eng2d(nv50, info);
> else
>nv50_blit_3d(nv50, info);
> +
> +   if (nv50->screen->num_occlusion_queries_active) {
> +  BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
> +  PUSH_DATA (push, 1);
> +   }
>  }
>
>  static void
> --
> 2.4.5
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] nv50: add nesting support for occlusion queries

2015-07-13 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

On Mon, Jul 13, 2015 at 12:49 PM, Samuel Pitoiset
 wrote:
> This is loosely based on nvc0.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_query.c  | 29 
> --
>  src/gallium/drivers/nouveau/nv50/nv50_screen.h |  2 ++
>  2 files changed, 20 insertions(+), 11 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_query.c
> index 81f7474..a5b95c1 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
> @@ -49,6 +49,7 @@ struct nv50_query {
> uint32_t offset; /* base + i * 32 */
> uint8_t state;
> boolean is64bit;
> +   int nesting; /* only used for occlusion queries */
> struct nouveau_mm_allocation *mm;
> struct nouveau_fence *fence;
>  };
> @@ -175,11 +176,16 @@ nv50_query_begin(struct pipe_context *pipe, struct 
> pipe_query *pq)
>
> switch (q->type) {
> case PIPE_QUERY_OCCLUSION_COUNTER:
> -  PUSH_SPACE(push, 4);
> -  BEGIN_NV04(push, NV50_3D(COUNTER_RESET), 1);
> -  PUSH_DATA (push, NV50_3D_COUNTER_RESET_SAMPLECNT);
> -  BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
> -  PUSH_DATA (push, 1);
> +  q->nesting = nv50->screen->num_occlusion_queries_active++;
> +  if (q->nesting) {
> + nv50_query_get(push, q, 0x10, 0x0100f002);
> +  } else {
> + PUSH_SPACE(push, 4);
> + BEGIN_NV04(push, NV50_3D(COUNTER_RESET), 1);
> + PUSH_DATA (push, NV50_3D_COUNTER_RESET_SAMPLECNT);
> + BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
> + PUSH_DATA (push, 1);
> +  }
>break;
> case PIPE_QUERY_PRIMITIVES_GENERATED:
>nv50_query_get(push, q, 0x10, 0x06805002);
> @@ -223,9 +229,11 @@ nv50_query_end(struct pipe_context *pipe, struct 
> pipe_query *pq)
> switch (q->type) {
> case PIPE_QUERY_OCCLUSION_COUNTER:
>nv50_query_get(push, q, 0, 0x0100f002);
> -  PUSH_SPACE(push, 2);
> -  BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
> -  PUSH_DATA (push, 0);
> +  if (--nv50->screen->num_occlusion_queries_active == 0) {
> + PUSH_SPACE(push, 2);
> + BEGIN_NV04(push, NV50_3D(SAMPLECNT_ENABLE), 1);
> + PUSH_DATA (push, 0);
> +  }
>break;
> case PIPE_QUERY_PRIMITIVES_GENERATED:
>nv50_query_get(push, q, 0, 0x06805002);
> @@ -319,7 +327,7 @@ nv50_query_result(struct pipe_context *pipe, struct 
> pipe_query *pq,
>res8[0] = TRUE;
>break;
> case PIPE_QUERY_OCCLUSION_COUNTER: /* u32 sequence, u32 count, u64 time */
> -  res64[0] = q->data[1];
> +  res64[0] = q->data[1] - q->data[5];
>break;
> case PIPE_QUERY_PRIMITIVES_GENERATED: /* u64 count, u64 time */
> case PIPE_QUERY_PRIMITIVES_EMITTED: /* u64 count, u64 time */
> @@ -396,8 +404,7 @@ nv50_render_condition(struct pipe_context *pipe,
>case PIPE_QUERY_OCCLUSION_COUNTER:
>case PIPE_QUERY_OCCLUSION_PREDICATE:
>   if (likely(!condition)) {
> -/* XXX: Placeholder, handle nesting here if available */
> -if (unlikely(false))
> +if (unlikely(q->nesting))
> cond = wait ? NV50_3D_COND_MODE_NOT_EQUAL :
>   NV50_3D_COND_MODE_ALWAYS;
>  else
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h 
> b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
> index 881051b..3a12a1f 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
> @@ -54,6 +54,8 @@ struct nv50_screen {
> struct nv50_context *cur_ctx;
> struct nv50_graph_state save_state;
>
> +   int num_occlusion_queries_active;
> +
> struct nouveau_bo *code;
> struct nouveau_bo *uniforms;
> struct nouveau_bo *txc; /* TIC (offset 0) and TSC (65536) */
> --
> 2.4.5
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/3] i965/bdw: Fix setting the instancing state for the SGVS element

2015-07-13 Thread Neil Roberts

When gl_VertexID or gl_InstanceID is used a 3DSTATE_VF_SGVS
instruction is sent to create a sort of element to store the generated
values. The last instruction in this chunk of code looks like it was
trying to set the instancing state for the element using the
3DSTATE_VF_INSTANCING instruction. However it was sending
brw->vb.nr_buffers instead of the element index. This instruction is
supposed to take an element index and that is how it is used further
down in the function so the previous code looks wrong. Perhaps
previously the number of buffers coincidentally matched the number of
enabled elements so the value was generally correct anyway. In a
subsequent patch I want to change a bit how it chooses the SGVS
element index so this needs to be fixed.

Cc: "10.6 10.5" 
---
 src/mesa/drivers/dri/i965/gen8_draw_upload.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c 
b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
index 1af90ec..f7d9952 100644
--- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
@@ -74,7 +74,7 @@ gen8_emit_vertices(struct brw_context *brw)
 
   BEGIN_BATCH(3);
   OUT_BATCH(_3DSTATE_VF_INSTANCING << 16 | (3 - 2));
-  OUT_BATCH(brw->vb.nr_buffers | GEN8_VF_INSTANCING_ENABLE);
+  OUT_BATCH(vue | GEN8_VF_INSTANCING_ENABLE);
   OUT_BATCH(0);
   ADVANCE_BATCH();
} else {
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 0/3] i965: Fix problems with gl_Vertex/InstanceID and glPolygonMode

2015-07-13 Thread Neil Roberts

This series fixes these two related bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=84677
https://bugs.freedesktop.org/show_bug.cgi?id=91292

The first bug has had a fix for a while but until now I wasn't able to
figure out what to do on BDW. I had previously made a fix for the
second bug but as pointed out by the original bug reporter the two
bugs interact and the fixes would conflict. This series has a v2 of
the patch so that the two fixes work together.

I've posted a patch to make the Piglit testing more thorough here:

http://patchwork.freedesktop.org/patch/54320/

In the v1 for patch 3 Chris Forbes had a comment that it would be
better to move the edge flag element within brw->vb.* instead of while
emitting the state. I think this would be a good idea except it won't
help in this case now that it works with the SGVS element as well
because there is no input for that at all in brw->vb.*.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/3 v2] i965/bdw: Fix 3DSTATE_VF_INSTANCING when the edge flag is used

2015-07-13 Thread Neil Roberts

When the edge flag element is enabled then the elements are slightly
reordered so that the edge flag is always the last one. This was
confusing the code to upload the 3DSTATE_VF_INSTANCING state because
that is uploaded with a separate loop which has an instruction for
each element. The indices used in these instructions weren't taking
into account the reordering so the state would be incorrect.

v2: Use nr_elements instead of brw->vb.nr_enabled so that it will cope
when gl_VertexID is used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91292
Cc: "10.6 10.5" 
---
 src/mesa/drivers/dri/i965/gen8_draw_upload.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c 
b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
index 2bac5ff..1b48643 100644
--- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
@@ -246,13 +246,24 @@ gen8_emit_vertices(struct brw_context *brw)
}
ADVANCE_BATCH();
 
-   for (unsigned i = 0; i < brw->vb.nr_enabled; i++) {
+   for (unsigned i = 0, j = 0; i < brw->vb.nr_enabled; i++) {
   const struct brw_vertex_element *input = brw->vb.enabled[i];
   const struct brw_vertex_buffer *buffer = &brw->vb.buffers[input->buffer];
+  unsigned element_index;
+
+  /* The edge flag element is reordered to be the last one in the code
+   * above so we need to compensate for that in the element indices used
+   * below.
+   */
+  if (input == gen6_edgeflag_input)
+ element_index = nr_elements - 1;
+  else
+ element_index = j++;
 
   BEGIN_BATCH(3);
   OUT_BATCH(_3DSTATE_VF_INSTANCING << 16 | (3 - 2));
-  OUT_BATCH(i | (buffer->step_rate ? GEN8_VF_INSTANCING_ENABLE : 0));
+  OUT_BATCH(element_index |
+(buffer->step_rate ? GEN8_VF_INSTANCING_ENABLE : 0));
   OUT_BATCH(buffer->step_rate);
   ADVANCE_BATCH();
}
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/3 v2] i965: Swap the order of the vertex ID and edge flag attributes

2015-07-13 Thread Neil Roberts

The edge flag data on Gen6+ is passed through the fixed function hardware as
an extra attribute. According to the PRM it must be the last valid
VERTEX_ELEMENT structure. However if the vertex ID is also used then another
extra element is added to source the VID. This made it so the vertex ID is in
the wrong register in the vertex shader and the edge attribute is no longer in
the last element.

v2: Also implement for BDW+

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84677
Cc: "10.6 10.5" 
---
 src/mesa/drivers/dri/i965/brw_draw_upload.c  | 30 +++
 src/mesa/drivers/dri/i965/gen8_draw_upload.c | 56 +---
 2 files changed, 57 insertions(+), 29 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index 320e40e..67a8dfd 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -787,21 +787,6 @@ static void brw_emit_vertices(struct brw_context *brw)
 ((i * 4) << BRW_VE1_DST_OFFSET_SHIFT));
}
 
-   if (brw->gen >= 6 && gen6_edgeflag_input) {
-  uint32_t format =
- brw_get_vertex_surface_type(brw, gen6_edgeflag_input->glarray);
-
-  OUT_BATCH((gen6_edgeflag_input->buffer << GEN6_VE0_INDEX_SHIFT) |
-GEN6_VE0_VALID |
-GEN6_VE0_EDGE_FLAG_ENABLE |
-(format << BRW_VE0_FORMAT_SHIFT) |
-(gen6_edgeflag_input->offset << BRW_VE0_SRC_OFFSET_SHIFT));
-  OUT_BATCH((BRW_VE1_COMPONENT_STORE_SRC << BRW_VE1_COMPONENT_0_SHIFT) |
-(BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_1_SHIFT) |
-(BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_2_SHIFT) |
-(BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_3_SHIFT));
-   }
-
if (brw->vs.prog_data->uses_vertexid || brw->vs.prog_data->uses_instanceid) 
{
   uint32_t dw0 = 0, dw1 = 0;
   uint32_t comp0 = BRW_VE1_COMPONENT_STORE_0;
@@ -842,6 +827,21 @@ static void brw_emit_vertices(struct brw_context *brw)
   OUT_BATCH(dw1);
}
 
+   if (brw->gen >= 6 && gen6_edgeflag_input) {
+  uint32_t format =
+ brw_get_vertex_surface_type(brw, gen6_edgeflag_input->glarray);
+
+  OUT_BATCH((gen6_edgeflag_input->buffer << GEN6_VE0_INDEX_SHIFT) |
+GEN6_VE0_VALID |
+GEN6_VE0_EDGE_FLAG_ENABLE |
+(format << BRW_VE0_FORMAT_SHIFT) |
+(gen6_edgeflag_input->offset << BRW_VE0_SRC_OFFSET_SHIFT));
+  OUT_BATCH((BRW_VE1_COMPONENT_STORE_SRC << BRW_VE1_COMPONENT_0_SHIFT) |
+(BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_1_SHIFT) |
+(BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_2_SHIFT) |
+(BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_3_SHIFT));
+   }
+
ADVANCE_BATCH();
 }
 
diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c 
b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
index f7d9952..2bac5ff 100644
--- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
@@ -40,16 +40,25 @@ gen8_emit_vertices(struct brw_context *brw)
 {
struct gl_context *ctx = &brw->ctx;
uint32_t mocs_wb = brw->gen >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB;
+   bool uses_edge_flag;
 
brw_prepare_vertices(brw);
brw_prepare_shader_draw_parameters(brw);
 
+   uses_edge_flag = (ctx->Polygon.FrontMode != GL_FILL ||
+ ctx->Polygon.BackMode != GL_FILL);
+
if (brw->vs.prog_data->uses_vertexid || brw->vs.prog_data->uses_instanceid) 
{
   unsigned vue = brw->vb.nr_enabled;
 
-  WARN_ONCE(brw->vs.prog_data->inputs_read & VERT_BIT_EDGEFLAG,
-"Using VID/IID with edgeflags, need to reorder the "
-"vertex attributes");
+  /* The element for the edge flags must always be last, so we have to
+   * insert the SGVS before it in that case.
+   */
+  if (uses_edge_flag) {
+ assert(vue > 0);
+ vue--;
+  }
+
   WARN_ONCE(vue >= 33,
 "Trying to insert VID/IID past 33rd vertex element, "
 "need to reorder the vertex attrbutes.");
@@ -138,7 +147,18 @@ gen8_emit_vertices(struct brw_context *brw)
   ADVANCE_BATCH();
}
 
-   unsigned nr_elements = brw->vb.nr_enabled + 
brw->vs.prog_data->uses_vertexid;
+   /* Normally we don't need an element for the SGVS attribute because the
+* 3DSTATE_VF_SGVS instruction lets you store the generated attribute in an
+* element that is past the list in 3DSTATE_VERTEX_ELEMENTS. However if the
+* vertex ID is used then it needs an element for the base vertex buffer.
+* Additionally if there is an edge flag element then the SGVS can't be
+* inserted past that so we need a dummy element to ensure that the edge
+* flag is the last one.
+*/
+   bool needs_sgvs_element = (brw->vs.prog_data->uses_vertexid ||
+  (brw->vs.prog_data->use

Re: [Mesa-dev] [PATCH] nvc0: fix geometry program revalidation of clipping params

2015-07-13 Thread Ilia Mirkin

Any one which, after using a geometry shader, enables an extra clip
distance. i.e. none.

On Mon, Jul 13, 2015 at 4:16 AM, Samuel Pitoiset
 wrote:
> What piglit test does this fix?
>
> On Sat, Jul 11, 2015 at 7:13 PM, Ilia Mirkin  wrote:
>>
>> Signed-off-by: Ilia Mirkin 
>> Cc: mesa-sta...@lists.freedesktop.org
>> ---
>>
>> Even though in practice a geometry program will never be using UCP's,
>> we still were revalidating (aka recompiling) the program when more
>> clip planes became enabled (which also are used for regular clip
>> distances).
>>
>> This seems like it should have led to massive fail, but I guess you
>> don't change the number of clip planes when using geometry shaders.
>> But I'm going to put this through a full piglit run just in case
>> there's something I'm missing.
>>
>>  src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
>> b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
>> index 785e52e..11f2b10 100644
>> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
>> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
>> @@ -339,7 +339,7 @@ nvc0_check_program_ucps(struct nvc0_context *nvc0,
>>nvc0_vertprog_validate(nvc0);
>> else
>> if (likely(vp == nvc0->gmtyprog))
>> -  nvc0_vertprog_validate(nvc0);
>> +  nvc0_gmtyprog_validate(nvc0);
>> else
>>nvc0_tevlprog_validate(nvc0);
>>  }
>> --
>> 2.3.6
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
>
>
> --
> Best regards,
> Samuel Pitoiset.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nvc0: fix geometry program revalidation of clipping params

2015-07-13 Thread Ilia Mirkin

This was, btw, introduced in commit 3a8ae6ac243b (nvc0: adapt to new
clip state). Back then there was no real geometry support yet.

On Mon, Jul 13, 2015 at 2:05 PM, Ilia Mirkin  wrote:
> Any one which, after using a geometry shader, enables an extra clip
> distance. i.e. none.
>
> On Mon, Jul 13, 2015 at 4:16 AM, Samuel Pitoiset
>  wrote:
>> What piglit test does this fix?
>>
>> On Sat, Jul 11, 2015 at 7:13 PM, Ilia Mirkin  wrote:
>>>
>>> Signed-off-by: Ilia Mirkin 
>>> Cc: mesa-sta...@lists.freedesktop.org
>>> ---
>>>
>>> Even though in practice a geometry program will never be using UCP's,
>>> we still were revalidating (aka recompiling) the program when more
>>> clip planes became enabled (which also are used for regular clip
>>> distances).
>>>
>>> This seems like it should have led to massive fail, but I guess you
>>> don't change the number of clip planes when using geometry shaders.
>>> But I'm going to put this through a full piglit run just in case
>>> there's something I'm missing.
>>>
>>>  src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
>>> b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
>>> index 785e52e..11f2b10 100644
>>> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
>>> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
>>> @@ -339,7 +339,7 @@ nvc0_check_program_ucps(struct nvc0_context *nvc0,
>>>nvc0_vertprog_validate(nvc0);
>>> else
>>> if (likely(vp == nvc0->gmtyprog))
>>> -  nvc0_vertprog_validate(nvc0);
>>> +  nvc0_gmtyprog_validate(nvc0);
>>> else
>>>nvc0_tevlprog_validate(nvc0);
>>>  }
>>> --
>>> 2.3.6
>>>
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>>
>>
>>
>> --
>> Best regards,
>> Samuel Pitoiset.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50: limit the maximum number of samplers to 16

2015-07-13 Thread Emil Velikov

On 13 July 2015 at 17:31, Ilia Mirkin  wrote:
> On Mon, Jul 13, 2015 at 12:08 PM, Emil Velikov  
> wrote:
>> On 13 July 2015 at 10:24, Samuel Pitoiset  wrote:
>>> NV50_3D_BIND_TSC only allows to bind 16 samplers, and since we don't
>>> want to do anything with NV50_3D_BIND_TSC2
>> Is there some widely know issue with it ? mesa & envytools does not
>> mention anything (based of a quick scan).
>
> It's for an extra 2 system samplers (not sure what for, really). I've
> never seen the blob touch them.
I guess that they were expecting for the minimum to be bumped with
OpenGL 4.x (from 16 with OpenGL 3.0).

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50: limit the maximum number of samplers to 16

2015-07-13 Thread Emil Velikov

On 13 July 2015 at 17:32, samuel.pitoiset  wrote:
> On 13/07/2015 18:08, Emil Velikov wrote:
>>
>> Is there some widely know issue with it ? mesa & envytools does not
>> mention anything (based of a quick scan).
>
>
> There is no know issue with NV50_3D_BIND_TSC2, but Ilia doesn't seem to want
> to use it. :-)
> That's why I followed his suggestion by reducing the max. number of samplers
> to 16 like for nvc0.
>
Ack, thanks for the confirmation. Currently it reads like a warning
sign "stay away from XXX" - so if you want to you can soften if a bit.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50: limit the maximum number of samplers to 16

2015-07-13 Thread Ilia Mirkin

On Mon, Jul 13, 2015 at 2:28 PM, Emil Velikov  wrote:
> On 13 July 2015 at 17:32, samuel.pitoiset  wrote:
>> On 13/07/2015 18:08, Emil Velikov wrote:
>>>
>>> Is there some widely know issue with it ? mesa & envytools does not
>>> mention anything (based of a quick scan).
>>
>>
>> There is no know issue with NV50_3D_BIND_TSC2, but Ilia doesn't seem to want
>> to use it. :-)
>> That's why I followed his suggestion by reducing the max. number of samplers
>> to 16 like for nvc0.
>>
> Ack, thanks for the confirmation. Currently it reads like a warning
> sign "stay away from XXX" - so if you want to you can soften if a bit.

"stay away from BIND_TSC2" is right. It's very much not to be used for
regular applications from what I can tell.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50: limit the maximum number of samplers to 16

2015-07-13 Thread Emil Velikov

On 13 July 2015 at 19:30, Ilia Mirkin  wrote:
> On Mon, Jul 13, 2015 at 2:28 PM, Emil Velikov  
> wrote:
>> On 13 July 2015 at 17:32, samuel.pitoiset  wrote:
>>> On 13/07/2015 18:08, Emil Velikov wrote:

 Is there some widely know issue with it ? mesa & envytools does not
 mention anything (based of a quick scan).
>>>
>>>
>>> There is no know issue with NV50_3D_BIND_TSC2, but Ilia doesn't seem to want
>>> to use it. :-)
>>> That's why I followed his suggestion by reducing the max. number of samplers
>>> to 16 like for nvc0.
>>>
>> Ack, thanks for the confirmation. Currently it reads like a warning
>> sign "stay away from XXX" - so if you want to you can soften if a bit.
>
> "stay away from BIND_TSC2" is right. It's very much not to be used for
> regular applications from what I can tell.
I guess I'm having a dull moment and cannot justify the "hostility"
towards it. Is it solely based upon "the blob does not use it so we
shouldn't either" or is there something else ? Perhaps they cannot be
bothered to hook it up. Should that be a discouragement towards using
it in nouveau ?

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50: limit the maximum number of samplers to 16

2015-07-13 Thread Ilia Mirkin

On Mon, Jul 13, 2015 at 2:44 PM, Emil Velikov  wrote:
> On 13 July 2015 at 19:30, Ilia Mirkin  wrote:
>> On Mon, Jul 13, 2015 at 2:28 PM, Emil Velikov  
>> wrote:
>>> On 13 July 2015 at 17:32, samuel.pitoiset  wrote:
 On 13/07/2015 18:08, Emil Velikov wrote:
>
> Is there some widely know issue with it ? mesa & envytools does not
> mention anything (based of a quick scan).


 There is no know issue with NV50_3D_BIND_TSC2, but Ilia doesn't seem to 
 want
 to use it. :-)
 That's why I followed his suggestion by reducing the max. number of 
 samplers
 to 16 like for nvc0.

>>> Ack, thanks for the confirmation. Currently it reads like a warning
>>> sign "stay away from XXX" - so if you want to you can soften if a bit.
>>
>> "stay away from BIND_TSC2" is right. It's very much not to be used for
>> regular applications from what I can tell.
> I guess I'm having a dull moment and cannot justify the "hostility"
> towards it. Is it solely based upon "the blob does not use it so we
> shouldn't either" or is there something else ? Perhaps they cannot be
> bothered to hook it up. Should that be a discouragement towards using
> it in nouveau ?

It's an extra 2 (not 16 as the original patch thought!) samplers.
Whole separate system for setting them. Seems like they may have been
separate for a reason. There's no API reason to expose them either.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 00/11] Bye bye __NOT_HAVE_DRM_H

2015-07-13 Thread Emil Velikov

On 8 July 2015 at 18:07, Emil Velikov  wrote:
> Hi all,
>
> As double-negatives are not too appealing in the English language I've
> decided to nuke the common ifndef __NOT_HAVE_DRM_H.
>
> The first three patches are trivial bugfix + cleanups which I've noticed
> while browsing through, 04-09 remove all occurrences of the lastly macro
> (HAVE_LIBDRM is used instead). With the last two dropping kms_swrast
> from the scons/android build.
>
> Note: I'm not 100% sure if the scons build sets the HAVE_LIBDRM macro,
> despite that it builds just fine. If anyone can confirm that will be
> appreciated.
>
For anyone willing to take a look - please focus on 1 & 4, everything
else is dead obvious :-)

[PATCH 01/11] configure.ac: do not set HAVE_DRI(23) when libdrm is missing
[PATCH 04/11] dri_interface: drop __NOT_HAVE_DRM_H magic

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/4] i965: Split batch emission from relocation functions.

2015-07-13 Thread Matt Turner

So that everything writing to the batch between BEGIN_BATCH() and
ADVANCE_BATCH() goes through OUT_BATCH.

Reviewed-by: Chris Wilson 
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 30 ++-
 src/mesa/drivers/dri/i965/intel_batchbuffer.h | 34 ++-
 2 files changed, 30 insertions(+), 34 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index d93ee6e..f82958f 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -393,11 +393,11 @@ _intel_batchbuffer_flush(struct brw_context *brw,
 
 /*  This is the only way buffers get added to the validate list.
  */
-bool
-intel_batchbuffer_emit_reloc(struct brw_context *brw,
- drm_intel_bo *buffer,
- uint32_t read_domains, uint32_t write_domain,
-uint32_t delta)
+uint32_t
+intel_batchbuffer_reloc(struct brw_context *brw,
+drm_intel_bo *buffer,
+uint32_t read_domains, uint32_t write_domain,
+uint32_t delta)
 {
int ret;
 
@@ -411,16 +411,14 @@ intel_batchbuffer_emit_reloc(struct brw_context *brw,
 * case the buffer doesn't move and we can short-circuit the relocation
 * processing in the kernel
 */
-   intel_batchbuffer_emit_dword(brw, buffer->offset64 + delta);
-
-   return true;
+   return buffer->offset64 + delta;
 }
 
-bool
-intel_batchbuffer_emit_reloc64(struct brw_context *brw,
-   drm_intel_bo *buffer,
-   uint32_t read_domains, uint32_t write_domain,
-  uint32_t delta)
+uint64_t
+intel_batchbuffer_reloc64(struct brw_context *brw,
+  drm_intel_bo *buffer,
+  uint32_t read_domains, uint32_t write_domain,
+  uint32_t delta)
 {
int ret = drm_intel_bo_emit_reloc(brw->batch.bo, 4*brw->batch.used,
  buffer, delta,
@@ -432,11 +430,7 @@ intel_batchbuffer_emit_reloc64(struct brw_context *brw,
 * case the buffer doesn't move and we can short-circuit the relocation
 * processing in the kernel
 */
-   uint64_t offset = buffer->offset64 + delta;
-   intel_batchbuffer_emit_dword(brw, offset);
-   intel_batchbuffer_emit_dword(brw, offset >> 32);
-
-   return true;
+   return buffer->offset64 + delta;
 }
 
 
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
index e58eae4..c0456f3 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
@@ -57,16 +57,16 @@ void intel_batchbuffer_data(struct brw_context *brw,
 const void *data, GLuint bytes,
 enum brw_gpu_ring ring);
 
-bool intel_batchbuffer_emit_reloc(struct brw_context *brw,
-   drm_intel_bo *buffer,
-  uint32_t read_domains,
-  uint32_t write_domain,
-  uint32_t offset);
-bool intel_batchbuffer_emit_reloc64(struct brw_context *brw,
-drm_intel_bo *buffer,
-uint32_t read_domains,
-uint32_t write_domain,
-uint32_t offset);
+uint32_t intel_batchbuffer_reloc(struct brw_context *brw,
+ drm_intel_bo *buffer,
+ uint32_t read_domains,
+ uint32_t write_domain,
+ uint32_t offset);
+uint64_t intel_batchbuffer_reloc64(struct brw_context *brw,
+   drm_intel_bo *buffer,
+   uint32_t read_domains,
+   uint32_t write_domain,
+   uint32_t offset);
 static inline uint32_t float_as_int(float f)
 {
union {
@@ -164,14 +164,16 @@ intel_batchbuffer_advance(struct brw_context *brw)
 #define BEGIN_BATCH_BLT(n) intel_batchbuffer_begin(brw, n, BLT_RING)
 #define OUT_BATCH(d) intel_batchbuffer_emit_dword(brw, d)
 #define OUT_BATCH_F(f) intel_batchbuffer_emit_float(brw, f)
-#define OUT_RELOC(buf, read_domains, write_domain, delta) do { \
-   intel_batchbuffer_emit_reloc(brw, buf,  \
-   read_domains, write_domain, delta); \
-} while (0)
+#define OUT_RELOC(buf, read_domains, write_domain, delta)  \
+   OUT_BATCH(intel_batchbuffer_reloc(brw, buf, read_domains, write_domain, \
+ delta))
 
 /* Handle 48-bit address relocations for Gen8+ */
-#define OUT_RELOC64(buf, read_domains, write_domain, delta) do {

[Mesa-dev] [PATCH 1/4] i965: Move BEGIN_BATCH() into same control flow as ADVANCE_BATCH().

2015-07-13 Thread Matt Turner

BEGIN_BATCH() and ADVANCE_BATCH() will contain "do {" and "} while (0)"
respectively to allow declaring local variables used by intervening
OUT_BATCH macros. As such, BEGIN_BATCH() and ADVANCE_BATCH() need to be
in the same control flow.

Reviewed-by: Iago Toral Quiroga 
---
 src/mesa/drivers/dri/i965/brw_draw.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 69ad4d4..ec13473 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -261,17 +261,17 @@ static void brw_emit_prim(struct brw_context *brw,
   indirect_flag = 0;
}
 
+   BEGIN_BATCH(brw->gen >= 7 ? 7 : 6);
+
if (brw->gen >= 7) {
   if (brw->predicate.state == BRW_PREDICATE_STATE_USE_BIT)
  predicate_enable = GEN7_3DPRIM_PREDICATE_ENABLE;
   else
  predicate_enable = 0;
 
-  BEGIN_BATCH(7);
   OUT_BATCH(CMD_3D_PRIM << 16 | (7 - 2) | indirect_flag | 
predicate_enable);
   OUT_BATCH(hw_prim | vertex_access_type);
} else {
-  BEGIN_BATCH(6);
   OUT_BATCH(CMD_3D_PRIM << 16 | (6 - 2) |
 hw_prim << GEN4_3DPRIM_TOPOLOGY_TYPE_SHIFT |
 vertex_access_type);
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/4] i965: Add and use USED_BATCH macro.

2015-07-13 Thread Matt Turner

The next patch will replace the .used field with an on-demand
calculation of batchbuffer usage.
---
 src/mesa/drivers/dri/i965/brw_blorp.cpp |  4 ++--
 src/mesa/drivers/dri/i965/brw_performance_monitor.c |  6 +++---
 src/mesa/drivers/dri/i965/brw_state_batch.c |  4 ++--
 src/mesa/drivers/dri/i965/brw_urb.c |  4 ++--
 src/mesa/drivers/dri/i965/intel_batchbuffer.c   | 20 ++--
 src/mesa/drivers/dri/i965/intel_batchbuffer.h   |  9 ++---
 6 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp.cpp
index 2ccfae1..eac1f00 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
@@ -226,7 +226,7 @@ retry:
intel_batchbuffer_require_space(brw, estimated_max_batch_usage, 
RENDER_RING);
intel_batchbuffer_save_state(brw);
drm_intel_bo *saved_bo = brw->batch.bo;
-   uint32_t saved_used = brw->batch.used;
+   uint32_t saved_used = USED_BATCH(brw->batch);
uint32_t saved_state_batch_offset = brw->batch.state_batch_offset;
 
switch (brw->gen) {
@@ -245,7 +245,7 @@ retry:
 * reserved enough space that a wrap will never happen.
 */
assert(brw->batch.bo == saved_bo);
-   assert((brw->batch.used - saved_used) * 4 +
+   assert((USED_BATCH(brw->batch) - saved_used) * 4 +
   (saved_state_batch_offset - brw->batch.state_batch_offset) <
   estimated_max_batch_usage);
/* Shut up compiler warnings on release build */
diff --git a/src/mesa/drivers/dri/i965/brw_performance_monitor.c 
b/src/mesa/drivers/dri/i965/brw_performance_monitor.c
index 0a12375..7e90e8a 100644
--- a/src/mesa/drivers/dri/i965/brw_performance_monitor.c
+++ b/src/mesa/drivers/dri/i965/brw_performance_monitor.c
@@ -710,7 +710,7 @@ emit_mi_report_perf_count(struct brw_context *brw,
/* Make sure the commands to take a snapshot fits in a single batch. */
intel_batchbuffer_require_space(brw, MI_REPORT_PERF_COUNT_BATCH_DWORDS * 4,
RENDER_RING);
-   int batch_used = brw->batch.used;
+   int batch_used = USED_BATCH(brw->batch);
 
/* Reports apparently don't always get written unless we flush first. */
brw_emit_mi_flush(brw);
@@ -754,7 +754,7 @@ emit_mi_report_perf_count(struct brw_context *brw,
brw_emit_mi_flush(brw);
 
(void) batch_used;
-   assert(brw->batch.used - batch_used <= MI_REPORT_PERF_COUNT_BATCH_DWORDS * 
4);
+   assert(USED_BATCH(brw->batch) - batch_used <= 
MI_REPORT_PERF_COUNT_BATCH_DWORDS * 4);
 }
 
 /**
@@ -1386,7 +1386,7 @@ void
 brw_perf_monitor_new_batch(struct brw_context *brw)
 {
assert(brw->batch.ring == RENDER_RING);
-   assert(brw->gen < 6 || brw->batch.used == 0);
+   assert(brw->gen < 6 || USED_BATCH(brw->batch) == 0);
 
if (brw->perfmon.oa_users == 0)
   return;
diff --git a/src/mesa/drivers/dri/i965/brw_state_batch.c 
b/src/mesa/drivers/dri/i965/brw_state_batch.c
index a405a80..d79e0ea 100644
--- a/src/mesa/drivers/dri/i965/brw_state_batch.c
+++ b/src/mesa/drivers/dri/i965/brw_state_batch.c
@@ -87,7 +87,7 @@ brw_annotate_aub(struct brw_context *brw)
drm_intel_aub_annotation annotations[annotation_count];
int a = 0;
make_annotation(&annotations[a++], AUB_TRACE_TYPE_BATCH, 0,
-   4*brw->batch.used);
+   4 * USED_BATCH(brw->batch));
for (int i = brw->state_batch_count; i-- > 0; ) {
   uint32_t type = brw->state_batch_list[i].type;
   uint32_t start_offset = brw->state_batch_list[i].offset;
@@ -136,7 +136,7 @@ __brw_state_batch(struct brw_context *brw,
 * space, then flush and try again.
 */
if (batch->state_batch_offset < size ||
-   offset < 4*batch->used + batch->reserved_space) {
+   offset < 4 * USED_BATCH(*batch) + batch->reserved_space) {
   intel_batchbuffer_flush(brw);
   offset = ROUND_DOWN_TO(batch->state_batch_offset - size, alignment);
}
diff --git a/src/mesa/drivers/dri/i965/brw_urb.c 
b/src/mesa/drivers/dri/i965/brw_urb.c
index 6fcf1b0..8fc06ba 100644
--- a/src/mesa/drivers/dri/i965/brw_urb.c
+++ b/src/mesa/drivers/dri/i965/brw_urb.c
@@ -249,8 +249,8 @@ void brw_upload_urb_fence(struct brw_context *brw)
uf.bits1.cs_fence  = brw->urb.size;
 
/* erratum: URB_FENCE must not cross a 64byte cacheline */
-   if ((brw->batch.used & 15) > 12) {
-  int pad = 16 - (brw->batch.used & 15);
+   if ((USED_BATCH(brw->batch) & 15) > 12) {
+  int pad = 16 - (USED_BATCH(brw->batch) & 15);
   do
 brw->batch.map[brw->batch.used++] = MI_NOOP;
   while (--pad);
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index f82958f..628a7b7 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -94,7 +94,7 @@ intel_batchbuffer_reset_to_saved(struct brw_context *brw)
drm_intel_gem_bo_clear_relocs(brw->batch.bo, brw->b

[Mesa-dev] [PATCH 4/4] i965: Optimize batchbuffer macros.

2015-07-13 Thread Matt Turner

Previously OUT_BATCH was just a macro around an inline function which
does

   brw->batch.map[brw->batch.used++] = dword;

When making consecutive calls to intel_batchbuffer_emit_dword() the
compiler isn't able to recognize that we're writing consecutive memory
locations or that it doesn't need to write batch.used back to memory
each time.

We can avoid both of these problems by making a local pointer to the
next location in the batch in BEGIN_BATCH().

Cuts 18k from the .text size.

   text data  bss  dec  hex  filename
4946956   19515226192  5168300   4edcac  i965_dri.so before
4928956   19515226192  5150300   4e965c  i965_dri.so after

This series (including commit c0433948) improves performance of Synmark
OglBatch7 by 8.01389% +/- 0.63922% (n=83) on Ivybridge.
---
Again, Chris makes an excellent suggestion that in turn improves performance!

 src/mesa/drivers/dri/i965/brw_context.h   |  5 ++-
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 12 --
 src/mesa/drivers/dri/i965/brw_urb.c   |  2 +-
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 19 -
 src/mesa/drivers/dri/i965/intel_batchbuffer.h | 55 ++-
 src/mesa/drivers/dri/i965/intel_blit.c| 19 -
 6 files changed, 70 insertions(+), 42 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 44d1aea..34a49b2 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -873,7 +873,8 @@ struct intel_batchbuffer {
 #ifdef DEBUG
uint16_t emit, total;
 #endif
-   uint16_t used, reserved_space;
+   uint16_t reserved_space;
+   uint32_t *map_next;
uint32_t *map;
uint32_t *cpu_map;
 #define BATCH_SZ (8192*sizeof(uint32_t))
@@ -883,7 +884,7 @@ struct intel_batchbuffer {
bool needs_sol_reset;
 
struct {
-  uint16_t used;
+  uint32_t *map_next;
   int reloc_count;
} saved;
 };
diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index 320e40e..7cb091d 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -604,14 +604,15 @@ brw_prepare_shader_draw_parameters(struct brw_context 
*brw)
 /**
  * Emit a VERTEX_BUFFER_STATE entry (part of 3DSTATE_VERTEX_BUFFERS).
  */
-static void
+static uint32_t *
 emit_vertex_buffer_state(struct brw_context *brw,
  unsigned buffer_nr,
  drm_intel_bo *bo,
  unsigned bo_ending_address,
  unsigned bo_offset,
  unsigned stride,
- unsigned step_rate)
+ unsigned step_rate,
+ uint32_t *__map)
 {
struct gl_context *ctx = &brw->ctx;
uint32_t dw0;
@@ -643,7 +644,10 @@ emit_vertex_buffer_state(struct brw_context *brw,
   OUT_BATCH(0);
}
OUT_BATCH(step_rate);
+
+   return __map;
 }
+#define EMIT_VERTEX_BUFFER_STATE(...) __map = 
emit_vertex_buffer_state(__VA_ARGS__, __map)
 
 static void brw_emit_vertices(struct brw_context *brw)
 {
@@ -704,14 +708,14 @@ static void brw_emit_vertices(struct brw_context *brw)
   OUT_BATCH((_3DSTATE_VERTEX_BUFFERS << 16) | (4 * nr_buffers - 1));
   for (i = 0; i < brw->vb.nr_buffers; i++) {
 struct brw_vertex_buffer *buffer = &brw->vb.buffers[i];
- emit_vertex_buffer_state(brw, i, buffer->bo, buffer->bo->size - 1,
+ EMIT_VERTEX_BUFFER_STATE(brw, i, buffer->bo, buffer->bo->size - 1,
   buffer->offset, buffer->stride,
   buffer->step_rate);
 
   }
 
   if (brw->vs.prog_data->uses_vertexid) {
- emit_vertex_buffer_state(brw, brw->vb.nr_buffers,
+ EMIT_VERTEX_BUFFER_STATE(brw, brw->vb.nr_buffers,
   brw->draw.draw_params_bo,
   brw->draw.draw_params_bo->size - 1,
   brw->draw.draw_params_offset,
diff --git a/src/mesa/drivers/dri/i965/brw_urb.c 
b/src/mesa/drivers/dri/i965/brw_urb.c
index 8fc06ba..6078c38 100644
--- a/src/mesa/drivers/dri/i965/brw_urb.c
+++ b/src/mesa/drivers/dri/i965/brw_urb.c
@@ -252,7 +252,7 @@ void brw_upload_urb_fence(struct brw_context *brw)
if ((USED_BATCH(brw->batch) & 15) > 12) {
   int pad = 16 - (USED_BATCH(brw->batch) & 15);
   do
-brw->batch.map[brw->batch.used++] = MI_NOOP;
+ *brw->batch.map_next++ = MI_NOOP;
   while (--pad);
}
 
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 628a7b7..088ffd2 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -48,6 +48,7 @@ intel_batchbuffer_init(struct brw_context *brw)
if (!brw->has_llc) {
   brw->batch.cpu_map = malloc(BATCH_SZ);
   b

Re: [Mesa-dev] [PATCH 4/5] i965: Split batch emission from relocation functions.

2015-07-13 Thread Matt Turner

On Sat, Jul 11, 2015 at 11:08 AM, Chris Wilson  wrote:
> On Fri, Jul 10, 2015 at 11:44:58AM -0700, Matt Turner wrote:
>> So that everything writing to the batch between BEGIN_BATCH() and
>> ADVANCE_BATCH() goes through OUT_BATCH.
>
> Reviewed-by: Chris Wilson 
>
>> +#define OUT_RELOC64(buf, read_domains, write_domain, delta) do {\
>> +   uint64_t reloc64 = intel_batchbuffer_reloc64(brw, buf, read_domains, \
>> +write_domain, delta);   \
>> +   OUT_BATCH(reloc64);  \
>> +   OUT_BATCH(reloc64 >> 32);\
>>  } while (0)
>
> Is the compiler smart enough to convert this to movq?

It is not, but I'm not totally sure that's better -- an unaligned movq
might cross a cache boundary which is apparently pretty expensive.

> The pointer emission variant at least makes it trivial to use
>*((uint64_t *)map)++ = reloc64
> if so desired.

I'll do some performance tests on top of my v3 series.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 5/5] i965: Optimize batchbuffer macros.

2015-07-13 Thread Matt Turner

On Sat, Jul 11, 2015 at 11:02 AM, Chris Wilson  wrote:
> On Fri, Jul 10, 2015 at 11:44:59AM -0700, Matt Turner wrote:
>> Previously OUT_BATCH was just a macro around an inline function which
>> does
>>
>>brw->batch.map[brw->batch.used++] = dword;
>>
>> When making consecutive calls to intel_batchbuffer_emit_dword() the
>> compiler isn't able to recognize that we're writing consecutive memory
>> locations or that it doesn't need to write batch.used back to memory
>> each time.
>>
>> We can avoid both of these problems by making a local pointer to the
>> next location in the batch in BEGIN_BATCH(), indexing it with a local
>> variable, and incrementing batch.used once in ADVANCE_BATCH().
>>
>> Cuts 18k from the .text size.
>>
>>text data  bss  dec  hex  filename
>> 4946956   19515226192  5168300   4edcac  i965_dri.so before
>> 4928588   19515226192  5149932   4e94ec  i965_dri.so after
>>
>> This series (including commit c0433948) improves performance of Synmark
>> OglBatch7 by 3.64514% +/- 0.298131% (n=282) on Ivybridge.
>> ---
>> That -4.19005% +/- 1.15188% (n=30) regression on Ivybridge is now a
>> performance improvement! Thanks Chris for the help!
>>
>>  src/mesa/drivers/dri/i965/intel_batchbuffer.c |  8 ++---
>>  src/mesa/drivers/dri/i965/intel_batchbuffer.h | 52 
>> +++
>>  2 files changed, 41 insertions(+), 19 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
>> b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
>> index f82958f..93f2872 100644
>> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
>> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
>> @@ -395,13 +395,13 @@ _intel_batchbuffer_flush(struct brw_context *brw,
>>   */
>>  uint32_t
>>  intel_batchbuffer_reloc(struct brw_context *brw,
>> -drm_intel_bo *buffer,
>> +drm_intel_bo *buffer, uint32_t offset,
>>  uint32_t read_domains, uint32_t write_domain,
>>  uint32_t delta)
>>  {
>> int ret;
>>
>> -   ret = drm_intel_bo_emit_reloc(brw->batch.bo, 4*brw->batch.used,
>> +   ret = drm_intel_bo_emit_reloc(brw->batch.bo, offset,
>>buffer, delta,
>>read_domains, write_domain);
>> assert(ret == 0);
>> @@ -416,11 +416,11 @@ intel_batchbuffer_reloc(struct brw_context *brw,
>>
>>  uint64_t
>>  intel_batchbuffer_reloc64(struct brw_context *brw,
>> -  drm_intel_bo *buffer,
>> +  drm_intel_bo *buffer, uint32_t offset,
>>uint32_t read_domains, uint32_t write_domain,
>>uint32_t delta)
>>  {
>> -   int ret = drm_intel_bo_emit_reloc(brw->batch.bo, 4*brw->batch.used,
>> +   int ret = drm_intel_bo_emit_reloc(brw->batch.bo, offset,
>>   buffer, delta,
>>   read_domains, write_domain);
>> assert(ret == 0);
>> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h 
>> b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
>> index c0456f3..6342c97 100644
>> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
>> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
>> @@ -59,14 +59,16 @@ void intel_batchbuffer_data(struct brw_context *brw,
>>
>>  uint32_t intel_batchbuffer_reloc(struct brw_context *brw,
>>   drm_intel_bo *buffer,
>> + uint32_t offset,
>>   uint32_t read_domains,
>>   uint32_t write_domain,
>> - uint32_t offset);
>> + uint32_t delta);
>>  uint64_t intel_batchbuffer_reloc64(struct brw_context *brw,
>> drm_intel_bo *buffer,
>> +   uint32_t offset,
>> uint32_t read_domains,
>> uint32_t write_domain,
>> -   uint32_t offset);
>> +   uint32_t delta);
>>  static inline uint32_t float_as_int(float f)
>>  {
>> union {
>> @@ -160,23 +162,43 @@ intel_batchbuffer_advance(struct brw_context *brw)
>>  #endif
>>  }
>>
>> -#define BEGIN_BATCH(n) intel_batchbuffer_begin(brw, n, RENDER_RING)
>> -#define BEGIN_BATCH_BLT(n) intel_batchbuffer_begin(brw, n, BLT_RING)
>> -#define OUT_BATCH(d) intel_batchbuffer_emit_dword(brw, d)
>> -#define OUT_BATCH_F(f) intel_batchbuffer_emit_float(brw, f)
>> -#define OUT_RELOC(buf, read_domains, write_domain, delta)  \
>> -   OUT_BATCH(intel_batchbuffer_reloc(brw, buf, read_domains, write_domain, \
>> - delta))
>> +#define BEGIN_BATCH(n) do {\
>> +   intel_batchbuffer_begin(brw, (n), RENDER_RING); \
>> +   uint32_t *__map = &brw->batch.map[brw

[Mesa-dev] [PATCH] i965: Fix 32 bit build warnings in intel_get_yf_ys_bo_size()

2015-07-13 Thread Anuj Phogat

Along with fixing the type of pitch parameter, patch also changes
the types of few local variables and function return type.

Warnings fixed are:
intel_mipmap_tree.c:671:7: warning: passing argument 3 of
'intel_get_yf_ys_bo_size' from incompatible pointer type

intel_mipmap_tree.c:563:1: note: expected 'uint64_t *' but
argument is of type 'long unsigned int *'

Reported-by: Kenneth Graunke 
Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index fb896a9..1529651 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -559,14 +559,14 @@ intel_lower_compressed_format(struct brw_context *brw, 
mesa_format format)
 }
 
 /* This function computes Yf/Ys tiled bo size, alignment and pitch. */
-static uint64_t
+static unsigned long
 intel_get_yf_ys_bo_size(struct intel_mipmap_tree *mt, unsigned *alignment,
-uint64_t *pitch)
+unsigned long *pitch)
 {
const uint32_t bpp = mt->cpp * 8;
const uint32_t aspect_ratio = (bpp == 16 || bpp == 64) ? 2 : 1;
uint32_t tile_width, tile_height;
-   uint64_t stride, size, aligned_y;
+   unsigned long stride, size, aligned_y;
 
assert(mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE);
 
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] [v3] i965: Split out gen8 push constant state upload

2015-07-13 Thread Dylan Baker

On Fri, Jul 10, 2015 at 02:24:42PM -0700, Ben Widawsky wrote:
> On Fri, Jul 10, 2015 at 12:03:54PM -0700, Matt Turner wrote:
> > I don't think putting Intel-internal links in the commit message is a good 
> > idea.
> > 
> > Ken's made similar comments to me.
> > 
> > Also, so much off the wall commentary...
> 
> Maybe my definition of "off the wall" is different than yours. The only thing
> off the wall to me, was the bit about missing Mark. It was *some* off the wall
> commentary.
> 
> That aside though, I think the internal links is a good point and thing to
> discuss... I've had a couple of cases already where I, or Neil benefited from
> the Jenkins links being there to try to figure out some later regression. I 
> can
> sympathize with not having internal links in the history since it isn't
> accessible to anyone. Earlier, I would have fought somewhat strongly for the
> links, except that when Mark moved servers he didn't preserve the old links, 
> so
> that made me feel like it's a lot more transient than I initially felt.
> 
> However, I think it's really valuable for us to have them in the patches,
> especially for review by some of the internal folks - like isn't it great to 
> see
> for yourself that I ran it? I suppose I can discard the URLs before pushing. 
> The
> cases I mentioned above would have benefited just as well having the links on
> the list and not in the commit history (albeit a bit harder to find). Any
> opposition to that?

It's worth pointing out in this discussion that old results are pruned
after so many days. It's simply not practical for us to store all of the
results forever.

Dylan


signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965: Mark constant static data as const.

2015-07-13 Thread Matt Turner

---
 src/mesa/drivers/dri/i965/brw_curbe.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_draw_upload.c | 44 ++---
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_curbe.c 
b/src/mesa/drivers/dri/i965/brw_curbe.c
index befd7a9..a149ce3 100644
--- a/src/mesa/drivers/dri/i965/brw_curbe.c
+++ b/src/mesa/drivers/dri/i965/brw_curbe.c
@@ -176,7 +176,7 @@ void brw_upload_cs_urb_state(struct brw_context *brw)
ADVANCE_BATCH();
 }
 
-static GLfloat fixed_plane[6][4] = {
+static const GLfloat fixed_plane[6][4] = {
{ 0,0,   -1, 1 },
{ 0,0,1, 1 },
{ 0,   -1,0, 1 },
diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index 7cb091d..c95f0c3 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -40,7 +40,7 @@
 #include "intel_batchbuffer.h"
 #include "intel_buffer_objects.h"
 
-static GLuint double_types[5] = {
+static const GLuint double_types[5] = {
0,
BRW_SURFACEFORMAT_R64_FLOAT,
BRW_SURFACEFORMAT_R64G64_FLOAT,
@@ -48,7 +48,7 @@ static GLuint double_types[5] = {
BRW_SURFACEFORMAT_R64G64B64A64_FLOAT
 };
 
-static GLuint float_types[5] = {
+static const GLuint float_types[5] = {
0,
BRW_SURFACEFORMAT_R32_FLOAT,
BRW_SURFACEFORMAT_R32G32_FLOAT,
@@ -56,7 +56,7 @@ static GLuint float_types[5] = {
BRW_SURFACEFORMAT_R32G32B32A32_FLOAT
 };
 
-static GLuint half_float_types[5] = {
+static const GLuint half_float_types[5] = {
0,
BRW_SURFACEFORMAT_R16_FLOAT,
BRW_SURFACEFORMAT_R16G16_FLOAT,
@@ -64,7 +64,7 @@ static GLuint half_float_types[5] = {
BRW_SURFACEFORMAT_R16G16B16A16_FLOAT
 };
 
-static GLuint fixed_point_types[5] = {
+static const GLuint fixed_point_types[5] = {
0,
BRW_SURFACEFORMAT_R32_SFIXED,
BRW_SURFACEFORMAT_R32G32_SFIXED,
@@ -72,7 +72,7 @@ static GLuint fixed_point_types[5] = {
BRW_SURFACEFORMAT_R32G32B32A32_SFIXED,
 };
 
-static GLuint uint_types_direct[5] = {
+static const GLuint uint_types_direct[5] = {
0,
BRW_SURFACEFORMAT_R32_UINT,
BRW_SURFACEFORMAT_R32G32_UINT,
@@ -80,7 +80,7 @@ static GLuint uint_types_direct[5] = {
BRW_SURFACEFORMAT_R32G32B32A32_UINT
 };
 
-static GLuint uint_types_norm[5] = {
+static const GLuint uint_types_norm[5] = {
0,
BRW_SURFACEFORMAT_R32_UNORM,
BRW_SURFACEFORMAT_R32G32_UNORM,
@@ -88,7 +88,7 @@ static GLuint uint_types_norm[5] = {
BRW_SURFACEFORMAT_R32G32B32A32_UNORM
 };
 
-static GLuint uint_types_scale[5] = {
+static const GLuint uint_types_scale[5] = {
0,
BRW_SURFACEFORMAT_R32_USCALED,
BRW_SURFACEFORMAT_R32G32_USCALED,
@@ -96,7 +96,7 @@ static GLuint uint_types_scale[5] = {
BRW_SURFACEFORMAT_R32G32B32A32_USCALED
 };
 
-static GLuint int_types_direct[5] = {
+static const GLuint int_types_direct[5] = {
0,
BRW_SURFACEFORMAT_R32_SINT,
BRW_SURFACEFORMAT_R32G32_SINT,
@@ -104,7 +104,7 @@ static GLuint int_types_direct[5] = {
BRW_SURFACEFORMAT_R32G32B32A32_SINT
 };
 
-static GLuint int_types_norm[5] = {
+static const GLuint int_types_norm[5] = {
0,
BRW_SURFACEFORMAT_R32_SNORM,
BRW_SURFACEFORMAT_R32G32_SNORM,
@@ -112,7 +112,7 @@ static GLuint int_types_norm[5] = {
BRW_SURFACEFORMAT_R32G32B32A32_SNORM
 };
 
-static GLuint int_types_scale[5] = {
+static const GLuint int_types_scale[5] = {
0,
BRW_SURFACEFORMAT_R32_SSCALED,
BRW_SURFACEFORMAT_R32G32_SSCALED,
@@ -120,7 +120,7 @@ static GLuint int_types_scale[5] = {
BRW_SURFACEFORMAT_R32G32B32A32_SSCALED
 };
 
-static GLuint ushort_types_direct[5] = {
+static const GLuint ushort_types_direct[5] = {
0,
BRW_SURFACEFORMAT_R16_UINT,
BRW_SURFACEFORMAT_R16G16_UINT,
@@ -128,7 +128,7 @@ static GLuint ushort_types_direct[5] = {
BRW_SURFACEFORMAT_R16G16B16A16_UINT
 };
 
-static GLuint ushort_types_norm[5] = {
+static const GLuint ushort_types_norm[5] = {
0,
BRW_SURFACEFORMAT_R16_UNORM,
BRW_SURFACEFORMAT_R16G16_UNORM,
@@ -136,7 +136,7 @@ static GLuint ushort_types_norm[5] = {
BRW_SURFACEFORMAT_R16G16B16A16_UNORM
 };
 
-static GLuint ushort_types_scale[5] = {
+static const GLuint ushort_types_scale[5] = {
0,
BRW_SURFACEFORMAT_R16_USCALED,
BRW_SURFACEFORMAT_R16G16_USCALED,
@@ -144,7 +144,7 @@ static GLuint ushort_types_scale[5] = {
BRW_SURFACEFORMAT_R16G16B16A16_USCALED
 };
 
-static GLuint short_types_direct[5] = {
+static const GLuint short_types_direct[5] = {
0,
BRW_SURFACEFORMAT_R16_SINT,
BRW_SURFACEFORMAT_R16G16_SINT,
@@ -152,7 +152,7 @@ static GLuint short_types_direct[5] = {
BRW_SURFACEFORMAT_R16G16B16A16_SINT
 };
 
-static GLuint short_types_norm[5] = {
+static const GLuint short_types_norm[5] = {
0,
BRW_SURFACEFORMAT_R16_SNORM,
BRW_SURFACEFORMAT_R16G16_SNORM,
@@ -160,7 +160,7 @@ static GLuint short_types_norm[5] = {
BRW_SURFACEFORMAT_R16G16B16A16_SNORM
 };
 
-static GLuint short_types_scale[5] = {
+static const GL

[Mesa-dev] [PATCH 11/13] program: Avoid double promotion.

2015-07-13 Thread Matt Turner

---
 src/mesa/program/prog_execute.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/src/mesa/program/prog_execute.c b/src/mesa/program/prog_execute.c
index 77274e2..2c52d0d 100644
--- a/src/mesa/program/prog_execute.c
+++ b/src/mesa/program/prog_execute.c
@@ -623,7 +623,7 @@ _mesa_execute_program(struct gl_context * ctx,
 GLfloat a[4], result[4];
 fetch_vector1(&inst->SrcReg[0], machine, a);
 result[0] = result[1] = result[2] = result[3]
-   = (GLfloat) cos(a[0]);
+   = cosf(a[0]);
 store_vector4(inst, machine, result);
  }
  break;
@@ -776,7 +776,7 @@ _mesa_execute_program(struct gl_context * ctx,
 if (inst->SrcReg[0].File != PROGRAM_UNDEFINED) {
GLfloat a[4];
fetch_vector1(&inst->SrcReg[0], machine, a);
-   cond = (a[0] != 0.0);
+   cond = (a[0] != 0.0F);
 }
 else {
cond = eval_condition(machine, inst);
@@ -834,7 +834,7 @@ _mesa_execute_program(struct gl_context * ctx,
val = -FLT_MAX;
 }
 else {
-   val = (float)(log(a[0]) * 1.442695F);
+   val = logf(a[0]) * 1.442695F;
 }
 result[0] = result[1] = result[2] = result[3] = val;
 store_vector4(inst, machine, result);
@@ -853,10 +853,10 @@ _mesa_execute_program(struct gl_context * ctx,
 result[1] = a[0];
 /* XXX we could probably just use pow() here */
 if (a[0] > 0.0F) {
-   if (a[1] == 0.0 && a[3] == 0.0)
+   if (a[1] == 0.0F && a[3] == 0.0F)
   result[2] = 1.0F;
else
-  result[2] = (GLfloat) pow(a[1], a[3]);
+  result[2] = powf(a[1], a[3]);
 }
 else {
result[2] = 0.0F;
@@ -886,12 +886,12 @@ _mesa_execute_program(struct gl_context * ctx,
   int exponent;
   GLfloat mantissa = frexpf(t[0], &exponent);
   q[0] = (GLfloat) (exponent - 1);
-  q[1] = (GLfloat) (2.0 * mantissa); /* map [.5, 1) -> [1, 2) 
*/
+  q[1] = 2.0F * mantissa; /* map [.5, 1) -> [1, 2) */
 
  /* The fast LOG2 macro doesn't meet the precision
   * requirements.
   */
-  q[2] = (float)(log(t[0]) * 1.442695F);
+  q[2] = logf(t[0]) * 1.442695F;
}
 }
 else {
@@ -1051,7 +1051,7 @@ _mesa_execute_program(struct gl_context * ctx,
 fetch_vector1(&inst->SrcReg[0], machine, a);
 fetch_vector1(&inst->SrcReg[1], machine, b);
 result[0] = result[1] = result[2] = result[3]
-   = (GLfloat) pow(a[0], b[0]);
+   = powf(a[0], b[0]);
 store_vector4(inst, machine, result);
  }
  break;
@@ -1095,10 +1095,10 @@ _mesa_execute_program(struct gl_context * ctx,
  {
 GLfloat a[4], result[4];
 fetch_vector1(&inst->SrcReg[0], machine, a);
-result[0] = (GLfloat) cos(a[0]);
-result[1] = (GLfloat) sin(a[0]);
-result[2] = 0.0;/* undefined! */
-result[3] = 0.0;/* undefined! */
+result[0] = cosf(a[0]);
+result[1] = sinf(a[0]);
+result[2] = 0.0F;/* undefined! */
+result[3] = 0.0F;/* undefined! */
 store_vector4(inst, machine, result);
  }
  break;
@@ -1161,7 +1161,7 @@ _mesa_execute_program(struct gl_context * ctx,
 GLfloat a[4], result[4];
 fetch_vector1(&inst->SrcReg[0], machine, a);
 result[0] = result[1] = result[2] = result[3]
-   = (GLfloat) sin(a[0]);
+   = sinf(a[0]);
 store_vector4(inst, machine, result);
  }
  break;
@@ -1360,7 +1360,7 @@ _mesa_execute_program(struct gl_context * ctx,
  * zero, we'd probably be fine except for an assert in
  * IROUND_POS() which gets triggered by the inf values created.
  */
-if (texcoord[3] != 0.0) {
+if (texcoord[3] != 0.0F) {
texcoord[0] /= texcoord[3];
texcoord[1] /= texcoord[3];
texcoord[2] /= texcoord[3];
@@ -1380,7 +1380,7 @@ _mesa_execute_program(struct gl_context * ctx,
 
 fetch_vector4(&inst->SrcReg[0], machine, texcoord);
 if (inst->TexSrcTarget != TEXTURE_CUBE_INDEX &&
-texcoord[3] != 0.0) {
+texcoord[3] != 0.0F) {
texcoord[0] /= texcoord[3];
texcoord[1] /= texcoord[3];
texcoord[2] /= texcoord[3];
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.fr

[Mesa-dev] [PATCH 04/13] i965: Use float calculations when double is unnecessary.

2015-07-13 Thread Matt Turner

Literals without an f/F suffix are of type double, and implicit
conversion rules specify that the float in (float op double) be
converted to a double before the operation is performed. I believe float
execution was intended (in nearly all cases) or is sufficient (in the
case of gen7_urb.c).

Removes a lot of float <-> double conversion instructions and replaces
many double instructions with float instructions which are cheaper.

   text data  bss  dec  hex  filename
4928659   19516026192  5150011   4e953b  i965_dri.so before
4928315   19515226192  5149659   4e93db  i965_dri.so after
---
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp   | 22 +++---
 src/mesa/drivers/dri/i965/brw_fs.cpp   |  4 ++--
 src/mesa/drivers/dri/i965/brw_meta_fast_clear.c|  4 ++--
 src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c  |  4 ++--
 src/mesa/drivers/dri/i965/brw_misc_state.c |  4 ++--
 src/mesa/drivers/dri/i965/brw_sampler_state.c  |  4 ++--
 src/mesa/drivers/dri/i965/brw_sf_state.c   |  9 +
 src/mesa/drivers/dri/i965/brw_state_cache.c|  2 +-
 src/mesa/drivers/dri/i965/brw_util.h   |  4 ++--
 src/mesa/drivers/dri/i965/gen6_multisample_state.c |  4 ++--
 src/mesa/drivers/dri/i965/gen6_sf_state.c  |  2 +-
 src/mesa/drivers/dri/i965/gen7_sf_state.c  |  2 +-
 src/mesa/drivers/dri/i965/gen7_urb.c   |  2 +-
 src/mesa/drivers/dri/i965/gen8_sf_state.c  |  2 +-
 14 files changed, 35 insertions(+), 34 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index 1561b59..205c905 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -1285,8 +1285,8 @@ brw_blorp_blit_program::translate_dst_to_src()
   /* Round the float coordinates down to nearest integer */
   emit_rndd(Xp_f, X_f);
   emit_rndd(Yp_f, Y_f);
-  emit_mul(X_f, Xp_f, brw_imm_f(1 / key->x_scale));
-  emit_mul(Y_f, Yp_f, brw_imm_f(1 / key->y_scale));
+  emit_mul(X_f, Xp_f, brw_imm_f(1.0f / key->x_scale));
+  emit_mul(Y_f, Yp_f, brw_imm_f(1.0f / key->y_scale));
   SWAP_XY_AND_XPYP();
} else if (!key->bilinear_filter) {
   /* Round the float coordinates down to nearest integer by moving to
@@ -1442,7 +1442,7 @@ brw_blorp_blit_program::manual_blend_average(unsigned 
num_samples)
   for (int j = 0; j < 4; ++j) {
  emit_mul(offset(texture_data[0], 2*j),
  offset(vec8(texture_data[0]), 2*j),
- brw_imm_f(1.0/num_samples));
+ brw_imm_f(1.0f / num_samples));
   }
}
 
@@ -1475,9 +1475,9 @@ brw_blorp_blit_program::manual_blend_bilinear(unsigned 
num_samples)
 
   /* Compute pixel coordinates */
   emit_add(vec16(x_sample_coords), Xp_f,
-  brw_imm_f((float)(i & 0x1) * (1.0 / key->x_scale)));
+  brw_imm_f((float)(i & 0x1) * (1.0f / key->x_scale)));
   emit_add(vec16(y_sample_coords), Yp_f,
-  brw_imm_f((float)((i >> 1) & 0x1) * (1.0 / key->y_scale)));
+  brw_imm_f((float)((i >> 1) & 0x1) * (1.0f / key->y_scale)));
   emit_mov(vec16(X), x_sample_coords);
   emit_mov(vec16(Y), y_sample_coords);
 
@@ -1789,7 +1789,7 @@ brw_blorp_coord_transform_params::setup(GLfloat src0, 
GLfloat src1,
* so 0.5 provides the necessary correction.
*/
   multiplier = scale;
-  offset = src0 + (-dst0 + 0.5) * scale;
+  offset = src0 + (-dst0 + 0.5f) * scale;
} else {
   /* When mirroring X we need:
*   src_x - src_x0 = dst_x1 - dst_x - 0.5
@@ -1797,7 +1797,7 @@ brw_blorp_coord_transform_params::setup(GLfloat src0, 
GLfloat src1,
*   src_x = src_x0 + (dst_x1 -dst_x - 0.5) * scale
*/
   multiplier = -scale;
-  offset = src0 + (dst1 - 0.5) * scale;
+  offset = src0 + (dst1 - 0.5f) * scale;
}
 }
 
@@ -1952,8 +1952,8 @@ brw_blorp_blit_params::brw_blorp_blit_params(struct 
brw_context *brw,
/* Scaling factors used for bilinear filtering in multisample scaled
 * blits.
 */
-   wm_prog_key.x_scale = 2.0;
-   wm_prog_key.y_scale = src_mt->num_samples / 2.0;
+   wm_prog_key.x_scale = 2.0f;
+   wm_prog_key.y_scale = src_mt->num_samples / 2.0f;
 
if (filter == GL_LINEAR && src.num_samples <= 1 && dst.num_samples <= 1)
   wm_prog_key.bilinear_filter = true;
@@ -2000,9 +2000,9 @@ brw_blorp_blit_params::brw_blorp_blit_params(struct 
brw_context *brw,
x1 = wm_push_consts.dst_x1 = roundf(dst_x1);
y1 = wm_push_consts.dst_y1 = roundf(dst_y1);
wm_push_consts.rect_grid_x1 = (minify(src_mt->logical_width0, src_level) *
-  wm_prog_key.x_scale - 1.0);
+  wm_prog_key.x_scale - 1.0f);
wm_push_consts.rect_grid_y1 = (minify(src_mt->logical_height0, src_level) *
-  wm_prog_key.y_scale - 1.0);
+

[Mesa-dev] [PATCH 06/13] gallium/auxiliary: Avoid double promotion.

2015-07-13 Thread Matt Turner

---
 src/gallium/auxiliary/util/u_format_rgb9e5.h | 2 +-
 src/gallium/auxiliary/util/u_math.h  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_format_rgb9e5.h 
b/src/gallium/auxiliary/util/u_format_rgb9e5.h
index 7a01f7f..d1ace3f 100644
--- a/src/gallium/auxiliary/util/u_format_rgb9e5.h
+++ b/src/gallium/auxiliary/util/u_format_rgb9e5.h
@@ -75,7 +75,7 @@ typedef union {
 
 static INLINE float rgb9e5_ClampRange(float x)
 {
-   if (x > 0.0) {
+   if (x > 0.0f) {
   if (x >= MAX_RGB9E5) {
  return MAX_RGB9E5;
   } else {
diff --git a/src/gallium/auxiliary/util/u_math.h 
b/src/gallium/auxiliary/util/u_math.h
index 3b4040f..9c3cb6a 100644
--- a/src/gallium/auxiliary/util/u_math.h
+++ b/src/gallium/auxiliary/util/u_math.h
@@ -240,7 +240,7 @@ util_iround(float f)
 static INLINE boolean
 util_is_approx(float a, float b, float tol)
 {
-   return fabs(b - a) <= tol;
+   return fabsf(b - a) <= tol;
 }
 
 
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 03/13] gallium/auxiliary: Use exp2(x) instead of pow(2.0, x).

2015-07-13 Thread Matt Turner

---
 src/gallium/auxiliary/util/u_format_rgb9e5.h | 6 +++---
 src/gallium/auxiliary/util/u_math.c  | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_format_rgb9e5.h 
b/src/gallium/auxiliary/util/u_format_rgb9e5.h
index c2a3f6f..7a01f7f 100644
--- a/src/gallium/auxiliary/util/u_format_rgb9e5.h
+++ b/src/gallium/auxiliary/util/u_format_rgb9e5.h
@@ -115,8 +115,8 @@ static INLINE unsigned float3_to_rgb9e5(const float rgb[3])
exp_shared = MAX2(-RGB9E5_EXP_BIAS-1, rgb9e5_FloorLog2(maxrgb)) + 1 + 
RGB9E5_EXP_BIAS;
assert(exp_shared <= RGB9E5_MAX_VALID_BIASED_EXP);
assert(exp_shared >= 0);
-   /* This pow function could be replaced by a table. */
-   denom = pow(2, exp_shared - RGB9E5_EXP_BIAS - RGB9E5_MANTISSA_BITS);
+   /* This exp2 function could be replaced by a table. */
+   denom = exp2(exp_shared - RGB9E5_EXP_BIAS - RGB9E5_MANTISSA_BITS);
 
maxm = (int) floor(maxrgb / denom + 0.5);
if (maxm == MAX_RGB9E5_MANTISSA+1) {
@@ -154,7 +154,7 @@ static INLINE void rgb9e5_to_float3(unsigned rgb, float 
retval[3])
 
v.raw = rgb;
exponent = v.field.biasedexponent - RGB9E5_EXP_BIAS - RGB9E5_MANTISSA_BITS;
-   scale = (float) pow(2, exponent);
+   scale = exp2f(exponent);
 
retval[0] = v.field.r * scale;
retval[1] = v.field.g * scale;
diff --git a/src/gallium/auxiliary/util/u_math.c 
b/src/gallium/auxiliary/util/u_math.c
index ae9e951..c58af91 100644
--- a/src/gallium/auxiliary/util/u_math.c
+++ b/src/gallium/auxiliary/util/u_math.c
@@ -48,7 +48,7 @@ init_pow2_table(void)
 {
int i;
for (i = 0; i < POW2_TABLE_SIZE; i++)
-  pow2_table[i] = (float) pow(2.0, (i - POW2_TABLE_OFFSET) / 
POW2_TABLE_SCALE);
+  pow2_table[i] = exp2f((i - POW2_TABLE_OFFSET) / POW2_TABLE_SCALE);
 }
 
 
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 02/13] program: Use exp2(x) instead of pow(2.0, x).

2015-07-13 Thread Matt Turner

---
 src/mesa/program/prog_execute.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/program/prog_execute.c b/src/mesa/program/prog_execute.c
index 46260b5..77274e2 100644
--- a/src/mesa/program/prog_execute.c
+++ b/src/mesa/program/prog_execute.c
@@ -723,7 +723,7 @@ _mesa_execute_program(struct gl_context * ctx,
 * result.z = result.x * APPX(result.y)
 * We do what the ARB extension says.
 */
-   q[2] = (GLfloat) pow(2.0, t[0]);
+   q[2] = exp2f(t[0]);
 }
 q[1] = t[0] - floor_t0;
 q[3] = 1.0F;
@@ -734,7 +734,7 @@ _mesa_execute_program(struct gl_context * ctx,
  {
 GLfloat a[4], result[4], val;
 fetch_vector1(&inst->SrcReg[0], machine, a);
-val = (GLfloat) pow(2.0, a[0]);
+val = exp2f(a[0]);
 /*
 if (IS_INF_OR_NAN(val))
val = 1.0e10;
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 07/13] util: Avoid double promition.

2015-07-13 Thread Matt Turner

---
 src/util/register_allocate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/util/register_allocate.c b/src/util/register_allocate.c
index 2ad8c3c..95be20f 100644
--- a/src/util/register_allocate.c
+++ b/src/util/register_allocate.c
@@ -648,7 +648,7 @@ ra_get_best_spill_node(struct ra_graph *g)
   float cost = g->nodes[n].spill_cost;
   float benefit;
 
-  if (cost <= 0.0)
+  if (cost <= 0.0f)
 continue;
 
   if (g->nodes[n].in_stack)
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 08/13] vbo: Avoid double promotion.

2015-07-13 Thread Matt Turner

---
 src/mesa/vbo/vbo_context.c| 6 +++---
 src/mesa/vbo/vbo_exec_array.c | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/vbo/vbo_context.c b/src/mesa/vbo/vbo_context.c
index fd1ffe2..e3eb286 100644
--- a/src/mesa/vbo/vbo_context.c
+++ b/src/mesa/vbo/vbo_context.c
@@ -37,9 +37,9 @@
 
 static GLuint check_size( const GLfloat *attr )
 {
-   if (attr[3] != 1.0) return 4;
-   if (attr[2] != 0.0) return 3;
-   if (attr[1] != 0.0) return 2;
+   if (attr[3] != 1.0F) return 4;
+   if (attr[2] != 0.0F) return 3;
+   if (attr[1] != 0.0F) return 2;
return 1;   
 }
 
diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
index 72b8206..b73aa97 100644
--- a/src/mesa/vbo/vbo_exec_array.c
+++ b/src/mesa/vbo/vbo_exec_array.c
@@ -255,7 +255,7 @@ check_array_data(struct gl_context *ctx, struct 
gl_client_array *array,
 GLint k;
 for (k = 0; k < array->Size; k++) {
if (IS_INF_OR_NAN(f[k]) ||
-   f[k] >= 1.0e20 || f[k] <= -1.0e10) {
+   f[k] >= 1.0e20F || f[k] <= -1.0e10F) {
   printf("Bad array data:\n");
   printf("  Element[%u].%u = %f\n", j, k, f[k]);
   printf("  Array %u at %p\n", attrib, (void* ) array);
@@ -263,7 +263,7 @@ check_array_data(struct gl_context *ctx, struct 
gl_client_array *array,
 array->Type, array->Size, array->Stride);
   printf("  Address/offset %p in Buffer Object %u\n",
 array->Ptr, array->BufferObj->Name);
-  f[k] = 1.0; /* XXX replace the bad value! */
+  f[k] = 1.0F; /* XXX replace the bad value! */
}
/*assert(!IS_INF_OR_NAN(f[k]));*/
 }
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 05/13] nir: Avoid double promition.

2015-07-13 Thread Matt Turner

---
 src/glsl/nir/nir_opcodes.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/nir/nir_opcodes.py b/src/glsl/nir/nir_opcodes.py
index 56e96d9..df5b7e2 100644
--- a/src/glsl/nir/nir_opcodes.py
+++ b/src/glsl/nir/nir_opcodes.py
@@ -474,10 +474,10 @@ else
 """)
 
 opcode("ldexp", 0, tfloat, [0, 0], [tfloat, tint], "", """
-dst = ldexp(src0, src1);
+dst = ldexpf(src0, src1);
 /* flush denormals to zero. */
 if (!isnormal(dst))
-   dst = copysign(0.0f, src0);
+   dst = copysignf(0.0f, src0);
 """)
 
 # Combines the first component of each input to make a 2-component vector.
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 10/13] swrast: Avoid double promotion.

2015-07-13 Thread Matt Turner

---
 src/mesa/swrast/s_aaline.c| 28 ++--
 src/mesa/swrast/s_aalinetemp.h|  4 ++--
 src/mesa/swrast/s_atifragshader.c |  4 ++--
 src/mesa/swrast/s_copypix.c   |  6 +++---
 src/mesa/swrast/s_drawpix.c   | 12 ++--
 src/mesa/swrast/s_fragprog.c  |  4 ++--
 src/mesa/swrast/s_lines.c |  4 ++--
 src/mesa/swrast/s_points.c| 10 +-
 src/mesa/swrast/s_span.c  | 10 +-
 src/mesa/swrast/s_texcombine.c|  6 +++---
 src/mesa/swrast/s_texfilter.c |  8 
 src/mesa/swrast/s_tritemp.h   |  2 +-
 src/mesa/swrast/s_zoom.c  |  2 +-
 13 files changed, 50 insertions(+), 50 deletions(-)

diff --git a/src/mesa/swrast/s_aaline.c b/src/mesa/swrast/s_aaline.c
index f3258e8..de5b42b 100644
--- a/src/mesa/swrast/s_aaline.c
+++ b/src/mesa/swrast/s_aaline.c
@@ -116,11 +116,11 @@ compute_plane(GLfloat x0, GLfloat y0, GLfloat x1, GLfloat 
y1,
const GLfloat b = pz * py;
const GLfloat c = px * px + py * py;
const GLfloat d = -(a * x0 + b * y0 + c * z0);
-   if (a == 0.0 && b == 0.0 && c == 0.0 && d == 0.0) {
-  plane[0] = 0.0;
-  plane[1] = 0.0;
-  plane[2] = 1.0;
-  plane[3] = 0.0;
+   if (a == 0.0F && b == 0.0F && c == 0.0F && d == 0.0F) {
+  plane[0] = 0.0F;
+  plane[1] = 0.0F;
+  plane[2] = 1.0F;
+  plane[3] = 0.0F;
}
else {
   plane[0] = a;
@@ -135,9 +135,9 @@ compute_plane(GLfloat x0, GLfloat y0, GLfloat x1, GLfloat 
y1,
 static inline void
 constant_plane(GLfloat value, GLfloat plane[4])
 {
-   plane[0] = 0.0;
-   plane[1] = 0.0;
-   plane[2] = -1.0;
+   plane[0] = 0.0F;
+   plane[1] = 0.0F;
+   plane[2] = -1.0F;
plane[3] = value;
 }
 
@@ -160,8 +160,8 @@ static inline GLfloat
 solve_plane_recip(GLfloat x, GLfloat y, const GLfloat plane[4])
 {
const GLfloat denom = plane[3] + plane[0] * x + plane[1] * y;
-   if (denom == 0.0)
-  return 0.0;
+   if (denom == 0.0F)
+  return 0.0F;
else
   return -plane[2] / denom;
 }
@@ -374,7 +374,7 @@ segment(struct gl_context *ctx,
   if (x0 < x1) {
  xLeft = x0 - line->halfWidth;
  xRight = x1 + line->halfWidth;
- if (line->dy >= 0.0) {
+ if (line->dy >= 0.0F) {
 yBot = y0 - 3.0F * line->halfWidth;
 yTop = y0 + line->halfWidth;
  }
@@ -386,7 +386,7 @@ segment(struct gl_context *ctx,
   else {
  xLeft = x1 - line->halfWidth;
  xRight = x0 + line->halfWidth;
- if (line->dy <= 0.0) {
+ if (line->dy <= 0.0F) {
 yBot = y1 - 3.0F * line->halfWidth;
 yTop = y1 + line->halfWidth;
  }
@@ -420,7 +420,7 @@ segment(struct gl_context *ctx,
   if (y0 < y1) {
  yBot = y0 - line->halfWidth;
  yTop = y1 + line->halfWidth;
- if (line->dx >= 0.0) {
+ if (line->dx >= 0.0F) {
 xLeft = x0 - 3.0F * line->halfWidth;
 xRight = x0 + line->halfWidth;
  }
@@ -432,7 +432,7 @@ segment(struct gl_context *ctx,
   else {
  yBot = y1 - line->halfWidth;
  yTop = y0 + line->halfWidth;
- if (line->dx <= 0.0) {
+ if (line->dx <= 0.0F) {
 xLeft = x1 - 3.0F * line->halfWidth;
 xRight = x1 + line->halfWidth;
  }
diff --git a/src/mesa/swrast/s_aalinetemp.h b/src/mesa/swrast/s_aalinetemp.h
index f1d078f..bebb131 100644
--- a/src/mesa/swrast/s_aalinetemp.h
+++ b/src/mesa/swrast/s_aalinetemp.h
@@ -44,7 +44,7 @@ NAME(plot)(struct gl_context *ctx, struct LineInfo *line, int 
ix, int iy)
 
(void) swrast;
 
-   if (coverage == 0.0)
+   if (coverage == 0.0F)
   return;
 
line->span.end++;
@@ -123,7 +123,7 @@ NAME(line)(struct gl_context *ctx, const SWvertex *v0, 
const SWvertex *v1)
  ctx->Const.MinLineWidthAA,
  ctx->Const.MaxLineWidthAA);
 
-   if (line.len == 0.0 || IS_INF_OR_NAN(line.len))
+   if (line.len == 0.0F || IS_INF_OR_NAN(line.len))
   return;
 
INIT_SPAN(line.span, GL_LINE);
diff --git a/src/mesa/swrast/s_atifragshader.c 
b/src/mesa/swrast/s_atifragshader.c
index 9e029db..2974dee 100644
--- a/src/mesa/swrast/s_atifragshader.c
+++ b/src/mesa/swrast/s_atifragshader.c
@@ -436,13 +436,13 @@ execute_shader(struct gl_context *ctx, const struct 
ati_fragment_shader *shader,
 for (i = 0; i < 3; i++) {
dst[optype][i] =
   (src[optype][2][i] >
-   0.5) ? src[optype][0][i] : src[optype][1][i];
+   0.5F) ? src[optype][0][i] : src[optype][1][i];
 }
  }
  else {
 dst[optype][3] =
(src[optype][2][3] >
-0.5) ? src[optype][0][3] : src[optype][1][3];
+0.5F) ? src[optype][0][3] : src[optype][1][3];
  }
  break;
 
diff --g

[Mesa-dev] [PATCH 12/13] mesa/math: Avoid double promotion.

2015-07-13 Thread Matt Turner

---
 src/mesa/math/m_clip_tmp.h | 20 ++---
 src/mesa/math/m_matrix.c   | 70 +++---
 src/mesa/math/m_norm_tmp.h |  2 +-
 3 files changed, 46 insertions(+), 46 deletions(-)

diff --git a/src/mesa/math/m_clip_tmp.h b/src/mesa/math/m_clip_tmp.h
index e289be7..60c0004 100644
--- a/src/mesa/math/m_clip_tmp.h
+++ b/src/mesa/math/m_clip_tmp.h
@@ -194,13 +194,13 @@ static GLvector4f * TAG(cliptest_points3)( GLvector4f 
*clip_vec,
STRIDE_LOOP {
   const GLfloat cx = from[0], cy = from[1], cz = from[2];
   GLubyte mask = 0;
-  if (cx >  1.0)   mask |= CLIP_RIGHT_BIT;
-  else if (cx < -1.0)  mask |= CLIP_LEFT_BIT;
-  if (cy >  1.0)   mask |= CLIP_TOP_BIT;
-  else if (cy < -1.0)  mask |= CLIP_BOTTOM_BIT;
+  if (cx >  1.0F)   mask |= CLIP_RIGHT_BIT;
+  else if (cx < -1.0F)  mask |= CLIP_LEFT_BIT;
+  if (cy >  1.0F)   mask |= CLIP_TOP_BIT;
+  else if (cy < -1.0F)  mask |= CLIP_BOTTOM_BIT;
   if (viewport_z_clip) {
-if (cz >  1.0)   mask |= CLIP_FAR_BIT;
-else if (cz < -1.0)  mask |= CLIP_NEAR_BIT;
+if (cz >  1.0F)   mask |= CLIP_FAR_BIT;
+else if (cz < -1.0F)  mask |= CLIP_NEAR_BIT;
   }
   clipMask[i] = mask;
   tmpOrMask |= mask;
@@ -230,10 +230,10 @@ static GLvector4f * TAG(cliptest_points2)( GLvector4f 
*clip_vec,
STRIDE_LOOP {
   const GLfloat cx = from[0], cy = from[1];
   GLubyte mask = 0;
-  if (cx >  1.0)   mask |= CLIP_RIGHT_BIT;
-  else if (cx < -1.0)  mask |= CLIP_LEFT_BIT;
-  if (cy >  1.0)   mask |= CLIP_TOP_BIT;
-  else if (cy < -1.0)  mask |= CLIP_BOTTOM_BIT;
+  if (cx >  1.0F)   mask |= CLIP_RIGHT_BIT;
+  else if (cx < -1.0F)  mask |= CLIP_LEFT_BIT;
+  if (cy >  1.0F)   mask |= CLIP_TOP_BIT;
+  else if (cy < -1.0F)  mask |= CLIP_BOTTOM_BIT;
   clipMask[i] = mask;
   tmpOrMask |= mask;
   tmpAndMask &= mask;
diff --git a/src/mesa/math/m_matrix.c b/src/mesa/math/m_matrix.c
index 6a42c6c..6522200 100644
--- a/src/mesa/math/m_matrix.c
+++ b/src/mesa/math/m_matrix.c
@@ -380,7 +380,7 @@ static GLboolean invert_matrix_general( GLmatrix *mat )
if (fabsf(r3[0])>fabsf(r2[0])) SWAP_ROWS(r3, r2);
if (fabsf(r2[0])>fabsf(r1[0])) SWAP_ROWS(r2, r1);
if (fabsf(r1[0])>fabsf(r0[0])) SWAP_ROWS(r1, r0);
-   if (0.0 == r0[0])  return GL_FALSE;
+   if (0.0F == r0[0])  return GL_FALSE;
 
/* eliminate first variable */
m1 = r1[0]/r0[0]; m2 = r2[0]/r0[0]; m3 = r3[0]/r0[0];
@@ -388,31 +388,31 @@ static GLboolean invert_matrix_general( GLmatrix *mat )
s = r0[2]; r1[2] -= m1 * s; r2[2] -= m2 * s; r3[2] -= m3 * s;
s = r0[3]; r1[3] -= m1 * s; r2[3] -= m2 * s; r3[3] -= m3 * s;
s = r0[4];
-   if (s != 0.0) { r1[4] -= m1 * s; r2[4] -= m2 * s; r3[4] -= m3 * s; }
+   if (s != 0.0F) { r1[4] -= m1 * s; r2[4] -= m2 * s; r3[4] -= m3 * s; }
s = r0[5];
-   if (s != 0.0) { r1[5] -= m1 * s; r2[5] -= m2 * s; r3[5] -= m3 * s; }
+   if (s != 0.0F) { r1[5] -= m1 * s; r2[5] -= m2 * s; r3[5] -= m3 * s; }
s = r0[6];
-   if (s != 0.0) { r1[6] -= m1 * s; r2[6] -= m2 * s; r3[6] -= m3 * s; }
+   if (s != 0.0F) { r1[6] -= m1 * s; r2[6] -= m2 * s; r3[6] -= m3 * s; }
s = r0[7];
-   if (s != 0.0) { r1[7] -= m1 * s; r2[7] -= m2 * s; r3[7] -= m3 * s; }
+   if (s != 0.0F) { r1[7] -= m1 * s; r2[7] -= m2 * s; r3[7] -= m3 * s; }
 
/* choose pivot - or die */
if (fabsf(r3[1])>fabsf(r2[1])) SWAP_ROWS(r3, r2);
if (fabsf(r2[1])>fabsf(r1[1])) SWAP_ROWS(r2, r1);
-   if (0.0 == r1[1])  return GL_FALSE;
+   if (0.0F == r1[1])  return GL_FALSE;
 
/* eliminate second variable */
m2 = r2[1]/r1[1]; m3 = r3[1]/r1[1];
r2[2] -= m2 * r1[2]; r3[2] -= m3 * r1[2];
r2[3] -= m2 * r1[3]; r3[3] -= m3 * r1[3];
-   s = r1[4]; if (0.0 != s) { r2[4] -= m2 * s; r3[4] -= m3 * s; }
-   s = r1[5]; if (0.0 != s) { r2[5] -= m2 * s; r3[5] -= m3 * s; }
-   s = r1[6]; if (0.0 != s) { r2[6] -= m2 * s; r3[6] -= m3 * s; }
-   s = r1[7]; if (0.0 != s) { r2[7] -= m2 * s; r3[7] -= m3 * s; }
+   s = r1[4]; if (0.0F != s) { r2[4] -= m2 * s; r3[4] -= m3 * s; }
+   s = r1[5]; if (0.0F != s) { r2[5] -= m2 * s; r3[5] -= m3 * s; }
+   s = r1[6]; if (0.0F != s) { r2[6] -= m2 * s; r3[6] -= m3 * s; }
+   s = r1[7]; if (0.0F != s) { r2[7] -= m2 * s; r3[7] -= m3 * s; }
 
/* choose pivot - or die */
if (fabsf(r3[2])>fabsf(r2[2])) SWAP_ROWS(r3, r2);
-   if (0.0 == r2[2])  return GL_FALSE;
+   if (0.0F == r2[2])  return GL_FALSE;
 
/* eliminate third variable */
m3 = r3[2]/r2[2];
@@ -421,7 +421,7 @@ static GLboolean invert_matrix_general( GLmatrix *mat )
r3[7] -= m3 * r2[7];
 
/* last check */
-   if (0.0 == r3[3]) return GL_FALSE;
+   if (0.0F == r3[3]) return GL_FALSE;
 
s = 1.0F/r3[3]; /* now back substitute row 3 */
r3[4] *= s; r3[5] *= s; r3[6] *= s; r3[7] *= s;
@@ -490,26 +490,26 @@ static GLboolean invert_matrix_3d_general( GLmatrix *mat )
 */
pos = neg = 0.0;

[Mesa-dev] [PATCH 09/13] tnl: Avoid double promotion.

2015-07-13 Thread Matt Turner

There are a couple of unrelated changes in t_vb_lighttmp.h that I hope
you'll excuse -- there's a block of code that's duplicated modulo a few
trivial differences that I took the liberty of fixing.
---
 src/mesa/tnl/t_draw.c   |  2 +-
 src/mesa/tnl/t_rasterpos.c  |  6 +++---
 src/mesa/tnl/t_vb_fog.c |  6 +++---
 src/mesa/tnl/t_vb_light.c   | 16 
 src/mesa/tnl/t_vb_lighttmp.h| 16 +++-
 src/mesa/tnl/t_vb_normals.c |  4 ++--
 src/mesa/tnl/t_vertex_generic.c |  2 +-
 7 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/src/mesa/tnl/t_draw.c b/src/mesa/tnl/t_draw.c
index 6adf1dc..713c6a9 100644
--- a/src/mesa/tnl/t_draw.c
+++ b/src/mesa/tnl/t_draw.c
@@ -257,7 +257,7 @@ static GLboolean *_tnl_import_edgeflag( struct gl_context 
*ctx,
GLuint i;
 
for (i = 0; i < count; i++) {
-  *bptr++ = ((GLfloat *)ptr)[0] == 1.0;
+  *bptr++ = ((GLfloat *)ptr)[0] == 1.0F;
   ptr += stride;
}
 
diff --git a/src/mesa/tnl/t_rasterpos.c b/src/mesa/tnl/t_rasterpos.c
index 7ef50ea..4bd9ac8 100644
--- a/src/mesa/tnl/t_rasterpos.c
+++ b/src/mesa/tnl/t_rasterpos.c
@@ -148,7 +148,7 @@ shade_rastpos(struct gl_context *ctx,
 SUB_3V(VP, light->_Position, vertex);
  /* d = length(VP) */
 d = (GLfloat) LEN_3FV( VP );
-if (d > 1.0e-6) {
+if (d > 1.0e-6F) {
 /* normalize VP */
GLfloat invd = 1.0F / d;
SELF_SCALE_SCALAR_3V(VP, invd);
@@ -172,7 +172,7 @@ shade_rastpos(struct gl_context *ctx,
 }
   }
 
-  if (attenuation < 1e-3)
+  if (attenuation < 1e-3F)
 continue;
 
   n_dot_VP = DOT3( normal, VP );
@@ -219,7 +219,7 @@ shade_rastpos(struct gl_context *ctx,
shine = ctx->Light.Material.Attrib[MAT_ATTRIB_FRONT_SHININESS][0];
spec_coef = powf(n_dot_h, shine);
 
-   if (spec_coef > 1.0e-10) {
+   if (spec_coef > 1.0e-10F) {
if (ctx->Light.Model.ColorControl==GL_SEPARATE_SPECULAR_COLOR) {
   ACC_SCALE_SCALAR_3V( specularContrib, spec_coef,
light->_MatSpecular[0]);
diff --git a/src/mesa/tnl/t_vb_fog.c b/src/mesa/tnl/t_vb_fog.c
index 1ca72f8..5489ed6 100644
--- a/src/mesa/tnl/t_vb_fog.c
+++ b/src/mesa/tnl/t_vb_fog.c
@@ -45,8 +45,8 @@ struct fog_stage_data {
 #define FOG_STAGE_DATA(stage) ((struct fog_stage_data *)stage->privatePtr)
 
 #define FOG_EXP_TABLE_SIZE 256
-#define FOG_MAX (10.0)
-#define EXP_FOG_MAX .0006595
+#define FOG_MAX (10.0F)
+#define EXP_FOG_MAX .0006595F
 #define FOG_INCR (FOG_MAX/FOG_EXP_TABLE_SIZE)
 static GLfloat exp_table[FOG_EXP_TABLE_SIZE];
 static GLfloat inited = 0;
@@ -54,7 +54,7 @@ static GLfloat inited = 0;
 #if 1
 #define NEG_EXP( result, narg )
\
 do {   \
-   GLfloat f = (GLfloat) (narg * (1.0/FOG_INCR));  \
+   GLfloat f = (GLfloat) (narg * (1.0F / FOG_INCR));   \
GLint k = (GLint) f;
\
if (k > FOG_EXP_TABLE_SIZE-2)   \
   result = (GLfloat) EXP_FOG_MAX;  \
diff --git a/src/mesa/tnl/t_vb_light.c b/src/mesa/tnl/t_vb_light.c
index dbd57fa..df9073e 100644
--- a/src/mesa/tnl/t_vb_light.c
+++ b/src/mesa/tnl/t_vb_light.c
@@ -137,23 +137,23 @@ validate_shine_table( struct gl_context *ctx, GLuint 
side, GLfloat shininess )
break;
 
   m = s->tab;
-  m[0] = 0.0;
-  if (shininess == 0.0) {
+  m[0] = 0.0F;
+  if (shininess == 0.0F) {
 for (j = 1 ; j <= SHINE_TABLE_SIZE ; j++)
-   m[j] = 1.0;
+   m[j] = 1.0F;
   }
   else {
 for (j = 1 ; j < SHINE_TABLE_SIZE ; j++) {
 GLdouble t, x = j / (GLfloat) (SHINE_TABLE_SIZE - 1);
-if (x < 0.005) /* underflow check */
-   x = 0.005;
+if (x < 0.005F) /* underflow check */
+   x = 0.005F;
 t = pow(x, shininess);
-   if (t > 1e-20)
+   if (t > 1e-20F)
   m[j] = (GLfloat) t;
else
-  m[j] = 0.0;
+  m[j] = 0.0F;
 }
-m[SHINE_TABLE_SIZE] = 1.0;
+m[SHINE_TABLE_SIZE] = 1.0F;
   }
 
   s->shininess = shininess;
diff --git a/src/mesa/tnl/t_vb_lighttmp.h b/src/mesa/tnl/t_vb_lighttmp.h
index f8786ac..3aebcd4 100644
--- a/src/mesa/tnl/t_vb_lighttmp.h
+++ b/src/mesa/tnl/t_vb_lighttmp.h
@@ -112,7 +112,7 @@ static void TAG(light_rgba_spec)( struct gl_context *ctx,
 GLint side;
 GLfloat contrib[3];
 GLfloat attenuation;
-GLfloat VP[3];  /* unit vector from vertex to light */
+GLfloat VP[3];  /* unit vector from vertex to light */
 GLfloat n_dot_VP;   /* n dot VP */
 GLfloat *h;
 
@@ -129,7 +129,7 @@ static void TAG(lig

[Mesa-dev] [PATCH 00/13] Avoid double promotion

2015-07-13 Thread Matt Turner

In the process of looking at the assembly generated from the OUT_BATCH
series, I noticed a bunch of float <-> double round trips. Mostly this
arises because unsuffixed floating-point literals are of type double,
and (float op double) is a double operation in which the float is
implicitly converted to double. This leads to some awful code for
simple things like

   float x;
   ...
   x += 1.0;

x is converted to double, 1.0 is added, and then x is converted back to
a float. :(

Most of the series is just adding f/F to floating-point literals (I
tried to use the suffix the surrounding code used) but there are also
some changes like s/sin/sinf/.

gcc has a warning flag (-Wdouble-promotion) that identifies places where
floats are implicitly cast to doubles (that I only found after I found
and fixed a bunch of things by looking at assembly) but it's not really
useful in general because float arguments are always cast to double
when passed as arguments to varargs functions like printf (why?), and
it warns about that, generating a lot of noise.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 01/13] mesa: Use floats for viewport bounds.

2015-07-13 Thread Matt Turner

ARB_viewport_array specifies that DEPTH_RANGE consists of double-
precision parameters (corresponding commit d4dc35987), and a preparatory
commit (6340e609a) added _mesa_get_viewport_xform() which returned
double-precision scale[3] and translate[3] vectors, even though X, Y,
Width, and Height were still floats.

All users of _mesa_get_viewport_xform() immediately convert the double
scale and translation vectors into floats (which were floats originally,
but were converted to doubles in _mesa_get_viewport_xform(), sigh).

i965 at least cannot consume doubles (see SF_CLIP_VIEWPORT). If we want
to pass doubles to hardware, we should have a different function that
does that.
---
 src/mesa/drivers/dri/i915/i915_state.c  |  2 +-
 src/mesa/drivers/dri/i965/brw_sf_state.c|  2 +-
 src/mesa/drivers/dri/i965/gen6_viewport_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen7_viewport_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen8_viewport_state.c |  2 +-
 src/mesa/drivers/dri/r200/r200_state.c  |  2 +-
 src/mesa/drivers/dri/radeon/radeon_state.c  |  2 +-
 src/mesa/main/viewport.c| 14 +++---
 src/mesa/main/viewport.h|  2 +-
 src/mesa/math/m_matrix.c|  4 ++--
 src/mesa/math/m_matrix.h|  4 ++--
 src/mesa/state_tracker/st_atom_viewport.c   |  2 +-
 src/mesa/tnl/t_context.c|  2 +-
 src/mesa/tnl/t_rasterpos.c  |  2 +-
 14 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i915/i915_state.c 
b/src/mesa/drivers/dri/i915/i915_state.c
index 5f10b84..4c83073 100644
--- a/src/mesa/drivers/dri/i915/i915_state.c
+++ b/src/mesa/drivers/dri/i915/i915_state.c
@@ -402,7 +402,7 @@ void
 intelCalcViewport(struct gl_context * ctx)
 {
struct intel_context *intel = intel_context(ctx);
-   double scale[3], translate[3];
+   float scale[3], translate[3];
 
_mesa_get_viewport_xform(ctx, 0, scale, translate);
 
diff --git a/src/mesa/drivers/dri/i965/brw_sf_state.c 
b/src/mesa/drivers/dri/i965/brw_sf_state.c
index 5d98922..3be6e4a 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_state.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_state.c
@@ -45,7 +45,7 @@ static void upload_sf_vp(struct brw_context *brw)
struct gl_context *ctx = &brw->ctx;
struct brw_sf_viewport *sfv;
GLfloat y_scale, y_bias;
-   double scale[3], translate[3];
+   float scale[3], translate[3];
const bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
 
sfv = brw_state_batch(brw, AUB_TRACE_SF_VP_STATE,
diff --git a/src/mesa/drivers/dri/i965/gen6_viewport_state.c 
b/src/mesa/drivers/dri/i965/gen6_viewport_state.c
index 7c8d884..11b9a36 100644
--- a/src/mesa/drivers/dri/i965/gen6_viewport_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_viewport_state.c
@@ -101,7 +101,7 @@ gen6_upload_sf_vp(struct brw_context *brw)
}
 
for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
-  double scale[3], translate[3];
+  float scale[3], translate[3];
 
   /* _NEW_VIEWPORT */
   _mesa_get_viewport_xform(ctx, i, scale, translate);
diff --git a/src/mesa/drivers/dri/i965/gen7_viewport_state.c 
b/src/mesa/drivers/dri/i965/gen7_viewport_state.c
index b655205..c75dc99 100644
--- a/src/mesa/drivers/dri/i965/gen7_viewport_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_viewport_state.c
@@ -53,7 +53,7 @@ gen7_upload_sf_clip_viewport(struct brw_context *brw)
}
 
for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
-  double scale[3], translate[3];
+  float scale[3], translate[3];
   _mesa_get_viewport_xform(ctx, i, scale, translate);
 
   /* According to the "Vertex X,Y Clamping and Quantization" section of
diff --git a/src/mesa/drivers/dri/i965/gen8_viewport_state.c 
b/src/mesa/drivers/dri/i965/gen8_viewport_state.c
index 2d8eeb1..2692ad5 100644
--- a/src/mesa/drivers/dri/i965/gen8_viewport_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_viewport_state.c
@@ -53,7 +53,7 @@ gen8_upload_sf_clip_viewport(struct brw_context *brw)
}
 
for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
-  double scale[3], translate[3];
+  float scale[3], translate[3];
   _mesa_get_viewport_xform(ctx, i, scale, translate);
 
   /* _NEW_VIEWPORT: Viewport Matrix Elements */
diff --git a/src/mesa/drivers/dri/r200/r200_state.c 
b/src/mesa/drivers/dri/r200/r200_state.c
index 6fe70b5..68485dd 100644
--- a/src/mesa/drivers/dri/r200/r200_state.c
+++ b/src/mesa/drivers/dri/r200/r200_state.c
@@ -1546,7 +1546,7 @@ void r200UpdateWindow( struct gl_context *ctx )
GLfloat xoffset = 0;
GLfloat yoffset = dPriv ? (GLfloat) dPriv->h : 0;
const GLboolean render_to_fbo = (ctx->DrawBuffer ? 
_mesa_is_user_fbo(ctx->DrawBuffer) : 0);
-   double scale[3], translate[3];
+   float scale[3], translate[3];
GLfloat y_scale, y_bias;
 
if (render_to_fbo) {
diff --git a/src/mesa/drivers/dri/radeon/radeon_state.c 
b/src/mesa/drivers/dri/

[Mesa-dev] [PATCH 13/13] mesa: Avoid double promotion.

2015-07-13 Thread Matt Turner

---
 src/mesa/main/ffvertex_prog.c | 10 +-
 src/mesa/main/fog.c   |  2 +-
 src/mesa/main/get.c   |  2 +-
 src/mesa/main/light.c | 30 +++---
 src/mesa/main/lines.c |  4 ++--
 src/mesa/main/multisample.c   |  4 ++--
 src/mesa/main/pack.c  | 14 +++---
 src/mesa/main/pixel.c |  4 ++--
 src/mesa/main/pixeltransfer.c |  8 
 src/mesa/main/points.c|  8 
 src/mesa/main/readpix.c   |  4 ++--
 src/mesa/main/samplerobj.c|  2 +-
 src/mesa/main/texparam.c  |  2 +-
 src/mesa/swrast_setup/ss_tritmp.h |  4 ++--
 14 files changed, 49 insertions(+), 49 deletions(-)

diff --git a/src/mesa/main/ffvertex_prog.c b/src/mesa/main/ffvertex_prog.c
index 70adaf8..95b428d 100644
--- a/src/mesa/main/ffvertex_prog.c
+++ b/src/mesa/main/ffvertex_prog.c
@@ -189,15 +189,15 @@ static void make_state_key( struct gl_context *ctx, 
struct state_key *key )
 if (light->Enabled) {
key->unit[i].light_enabled = 1;
 
-   if (light->EyePosition[3] == 0.0)
+   if (light->EyePosition[3] == 0.0F)
   key->unit[i].light_eyepos3_is_zero = 1;
 
-   if (light->SpotCutoff == 180.0)
+   if (light->SpotCutoff == 180.0F)
   key->unit[i].light_spotcutoff_is_180 = 1;
 
-   if (light->ConstantAttenuation != 1.0 ||
-   light->LinearAttenuation != 0.0 ||
-   light->QuadraticAttenuation != 0.0)
+   if (light->ConstantAttenuation != 1.0F ||
+   light->LinearAttenuation != 0.0F ||
+   light->QuadraticAttenuation != 0.0F)
   key->unit[i].light_attenuated = 1;
 }
   }
diff --git a/src/mesa/main/fog.c b/src/mesa/main/fog.c
index 3bce289..45f343d 100644
--- a/src/mesa/main/fog.c
+++ b/src/mesa/main/fog.c
@@ -115,7 +115,7 @@ _mesa_Fogfv( GLenum pname, const GLfloat *params )
 ctx->Fog.Mode = m;
 break;
   case GL_FOG_DENSITY:
-if (*params<0.0) {
+if (*params<0.0F) {
_mesa_error( ctx, GL_INVALID_VALUE, "glFog" );
 return;
 }
diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 3d6d639..785a9b5 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -626,7 +626,7 @@ find_custom_value(struct gl_context *ctx, const struct 
value_desc *d, union valu
   break;
 
case GL_EDGE_FLAG:
-  v->value_bool = ctx->Current.Attrib[VERT_ATTRIB_EDGEFLAG][0] == 1.0;
+  v->value_bool = ctx->Current.Attrib[VERT_ATTRIB_EDGEFLAG][0] == 1.0F;
   break;
 
case GL_READ_BUFFER:
diff --git a/src/mesa/main/light.c b/src/mesa/main/light.c
index 4021dbe..fe2ce8c 100644
--- a/src/mesa/main/light.c
+++ b/src/mesa/main/light.c
@@ -143,7 +143,7 @@ _mesa_light(struct gl_context *ctx, GLuint lnum, GLenum 
pname, const GLfloat *pa
   COPY_3V(light->SpotDirection, params);
   break;
case GL_SPOT_EXPONENT:
-  assert(params[0] >= 0.0);
+  assert(params[0] >= 0.0F);
   assert(params[0] <= ctx->Const.MaxSpotExponent);
   if (light->SpotExponent == params[0])
 return;
@@ -151,12 +151,12 @@ _mesa_light(struct gl_context *ctx, GLuint lnum, GLenum 
pname, const GLfloat *pa
   light->SpotExponent = params[0];
   break;
case GL_SPOT_CUTOFF:
-  assert(params[0] == 180.0 || (params[0] >= 0.0 && params[0] <= 90.0));
+  assert(params[0] == 180.0F || (params[0] >= 0.0F && params[0] <= 90.0F));
   if (light->SpotCutoff == params[0])
  return;
   FLUSH_VERTICES(ctx, _NEW_LIGHT);
   light->SpotCutoff = params[0];
-  light->_CosCutoff = (GLfloat) (cos(light->SpotCutoff * M_PI / 180.0));
+  light->_CosCutoff = (cosf(light->SpotCutoff * M_PI / 180.0));
   if (light->_CosCutoff < 0)
  light->_CosCutoff = 0;
   if (light->SpotCutoff != 180.0F)
@@ -165,21 +165,21 @@ _mesa_light(struct gl_context *ctx, GLuint lnum, GLenum 
pname, const GLfloat *pa
  light->_Flags &= ~LIGHT_SPOT;
   break;
case GL_CONSTANT_ATTENUATION:
-  assert(params[0] >= 0.0);
+  assert(params[0] >= 0.0F);
   if (light->ConstantAttenuation == params[0])
 return;
   FLUSH_VERTICES(ctx, _NEW_LIGHT);
   light->ConstantAttenuation = params[0];
   break;
case GL_LINEAR_ATTENUATION:
-  assert(params[0] >= 0.0);
+  assert(params[0] >= 0.0F);
   if (light->LinearAttenuation == params[0])
 return;
   FLUSH_VERTICES(ctx, _NEW_LIGHT);
   light->LinearAttenuation = params[0];
   break;
case GL_QUADRATIC_ATTENUATION:
-  assert(params[0] >= 0.0);
+  assert(params[0] >= 0.0F);
   if (light->QuadraticAttenuation == params[0])
 return;
   FLUSH_VERTICES(ctx, _NEW_LIGHT);
@@ -238,31 +238,31 @@ _mesa_Lightfv( GLenum light, GLenum pname, const GLfloat 
*params )
   params = temp;
   break;
case GL_SPOT_EXPONENT:
-  if (params[0]

Re: [Mesa-dev] [PATCH 61/78] i965/nir/vec4: Add skeleton implementation of nir_emit_texture()

2015-07-13 Thread Jason Ekstrand

On Mon, Jul 13, 2015 at 9:09 AM, Alejandro Piñeiro  wrote:
>
>
> On 13/07/15 16:26, Jason Ekstrand wrote:
>>
>>
>>
>> > > > Go ahead and make it a cpp file if you have to.
>> > >
>> > > Ok, thanks.
>> > >
>> > > > Another option would
>> > > > be to do the map the other way: ir_opcode -> nir_tex_op and refactor
>> > > > the code to use the nir_tex_op.
>> > >
>> > > But even if we do the map in the other way, the helper function
>> would be
>> > > using the enum type defined at glsl/ir.h, so we would need to make
>> it a
>> > > cpp file in any case. So for now, I will keep things simple, and keep
>> > > the current mapping,  and do the change later.
>> >
>> > The difference is that, with the patch I sent a few hours ago, you
>> only need to do the conversion one place so there's no need for the
>> helper.
>>
>> For whatever it's worth I'm also OK with just duplicating the 10/lines
>> of code to do the conversion and not bothering with the helper.
>>
>
> I just realized that today Francisco Jerez mentioned that is working to
> remove the ir->nir texture op mapping on FS, replacing it with backend
> instruction opcode:
> http://lists.freedesktop.org/archives/mesa-dev/2015-July/088696.html
>
> So probably it would be better to wait for his change, and avoid the
> need of that mapping on the vec4 path too. That would remove the need of
> the helper function and the cpp renaming.
>
> That sounds ok or should we go on independently of Francisco's work?

I think it's probably best to just move foward with no assumptions of
code-sharing between vec4 and FS.
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 03/13] gallium/auxiliary: Use exp2(x) instead of pow(2.0, x).

2015-07-13 Thread Roland Scheidegger

Did you replace 2 of them by exp2f but one by exp2f on purpose?

I don't think we can use exp2/exp2f in gallium. This requires msvc 2013
(all of exp2, exp2f, powf are c99, powf is supported by older msvc but
the others are not). I guess though could throw some wrappers into
c99_math.h.

Roland

Am 14.07.2015 um 01:22 schrieb Matt Turner:
> ---
>  src/gallium/auxiliary/util/u_format_rgb9e5.h | 6 +++---
>  src/gallium/auxiliary/util/u_math.c  | 2 +-
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/util/u_format_rgb9e5.h 
> b/src/gallium/auxiliary/util/u_format_rgb9e5.h
> index c2a3f6f..7a01f7f 100644
> --- a/src/gallium/auxiliary/util/u_format_rgb9e5.h
> +++ b/src/gallium/auxiliary/util/u_format_rgb9e5.h
> @@ -115,8 +115,8 @@ static INLINE unsigned float3_to_rgb9e5(const float 
> rgb[3])
> exp_shared = MAX2(-RGB9E5_EXP_BIAS-1, rgb9e5_FloorLog2(maxrgb)) + 1 + 
> RGB9E5_EXP_BIAS;
> assert(exp_shared <= RGB9E5_MAX_VALID_BIASED_EXP);
> assert(exp_shared >= 0);
> -   /* This pow function could be replaced by a table. */
> -   denom = pow(2, exp_shared - RGB9E5_EXP_BIAS - RGB9E5_MANTISSA_BITS);
> +   /* This exp2 function could be replaced by a table. */
> +   denom = exp2(exp_shared - RGB9E5_EXP_BIAS - RGB9E5_MANTISSA_BITS);
>  
> maxm = (int) floor(maxrgb / denom + 0.5);
> if (maxm == MAX_RGB9E5_MANTISSA+1) {
> @@ -154,7 +154,7 @@ static INLINE void rgb9e5_to_float3(unsigned rgb, float 
> retval[3])
>  
> v.raw = rgb;
> exponent = v.field.biasedexponent - RGB9E5_EXP_BIAS - 
> RGB9E5_MANTISSA_BITS;
> -   scale = (float) pow(2, exponent);
> +   scale = exp2f(exponent);
>  
> retval[0] = v.field.r * scale;
> retval[1] = v.field.g * scale;
> diff --git a/src/gallium/auxiliary/util/u_math.c 
> b/src/gallium/auxiliary/util/u_math.c
> index ae9e951..c58af91 100644
> --- a/src/gallium/auxiliary/util/u_math.c
> +++ b/src/gallium/auxiliary/util/u_math.c
> @@ -48,7 +48,7 @@ init_pow2_table(void)
>  {
> int i;
> for (i = 0; i < POW2_TABLE_SIZE; i++)
> -  pow2_table[i] = (float) pow(2.0, (i - POW2_TABLE_OFFSET) / 
> POW2_TABLE_SCALE);
> +  pow2_table[i] = exp2f((i - POW2_TABLE_OFFSET) / POW2_TABLE_SCALE);
>  }
>  
>  
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 09/13] mesa: overhaul the glGetCompressedTexImage code

2015-07-13 Thread Brian Paul

Same idea as the previous patch.
---
 src/mesa/main/texgetimage.c | 345 +---
 1 file changed, 196 insertions(+), 149 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index b74517a..bcdcfe8 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -1369,109 +1369,161 @@ _mesa_GetTextureImage(GLuint texture, GLint level, 
GLenum format, GLenum type,
 
 
 /**
- * Do error checking for a glGetCompressedTexImage() call.
- * \return GL_TRUE if any error, GL_FALSE if no errors.
+ * Do error checking for getting compressed texture images.
+ * \return true if any error, false if no errors.
  */
-static GLboolean
+static bool
 getcompressedteximage_error_check(struct gl_context *ctx,
-  struct gl_texture_image *texImage,
-  GLenum target,
-  GLint level, GLsizei clientMemSize,
-  GLvoid *img, bool dsa)
+  struct gl_texture_object *texObj,
+  GLenum target, GLint level,
+  GLsizei clientMemSize, GLvoid *img,
+  const char *caller)
 {
-   const GLint maxLevels = _mesa_max_texture_levels(ctx, target);
-   GLuint compressedSize, dimensions;
-   const char *suffix = dsa ? "ture" : "";
+   struct gl_texture_image *texImage;
+   GLint maxLevels;
+   const bool dsa = (target == 0);
 
-   assert(texImage);
+   if (!texObj || texObj->Target == 0) {
+  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(invalid texture)", caller);
+  return true;
+   }
+
+   if (dsa) {
+  target = texObj->Target;
+   }
 
if (!legal_getteximage_target(ctx, target, dsa)) {
-  _mesa_error(ctx, GL_INVALID_ENUM,
-  "glGetCompressedTex%sImage(target=%s)", suffix,
-  _mesa_lookup_enum_by_nr(target));
-  return GL_TRUE;
+  _mesa_error(ctx, GL_INVALID_ENUM, "%s(target=%s)", caller,
+  _mesa_lookup_enum_by_nr(texObj->Target));
+  return true;
}
 
+   maxLevels = _mesa_max_texture_levels(ctx, target);
assert(maxLevels != 0);
if (level < 0 || level >= maxLevels) {
   _mesa_error(ctx, GL_INVALID_VALUE,
-  "glGetCompressedTex%sImage(bad level = %d)", suffix, level);
-  return GL_TRUE;
+  "%s(bad level = %d)", caller, level);
+  return true;
+   }
+
+   texImage = _mesa_select_tex_image(texObj, target, level);
+   if (!texImage) {
+  /* non-existant texture image */
+  _mesa_error(ctx, GL_INVALID_VALUE, "%s(level)", caller);
+  return true;
}
 
if (!_mesa_is_format_compressed(texImage->TexFormat)) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
-  "glGetCompressedTex%sImage(texture is not compressed)",
-  suffix);
-  return GL_TRUE;
+  "%s(texture is not compressed)", caller);
+  return true;
+   }
+
+   return false;
+}
+
+
+static GLsizei
+packed_compressed_size(GLuint dimensions, mesa_format format,
+   GLsizei width, GLsizei height, GLsizei depth,
+   const struct gl_pixelstore_attrib *packing)
+{
+   struct compressed_pixelstore st;
+   GLsizei totalBytes;
+
+   _mesa_compute_compressed_pixelstore(dimensions, format,
+   width, height, depth,
+   packing, &st);
+   totalBytes =
+  (st.CopySlices - 1) * st.TotalRowsPerSlice * st.TotalBytesPerRow +
+  st.SkipBytes +
+  (st.CopyRowsPerSlice - 1) * st.TotalBytesPerRow +
+  st.CopyBytesPerRow;
+
+   return totalBytes;
+}
+
+
+/**
+ * Common helper for all glGetCompressed-teximage functions.
+ */
+static void
+get_compressed_texture_image(struct gl_context *ctx,
+ struct gl_texture_object *texObj,
+ GLenum target, GLint level,
+ GLint xoffset, GLint yoffset, GLint zoffset,
+ GLsizei width, GLsizei height, GLint depth,
+ GLsizei bufSize, GLvoid *pixels,
+ const char *caller)
+{
+   struct gl_texture_image *texImage;
+   unsigned firstFace, numFaces, i, imageStride;
+   GLsizei totalBytes;
+   GLuint dimensions;
+
+   FLUSH_VERTICES(ctx, 0);
+
+   if (getcompressedteximage_error_check(ctx, texObj, target, level,
+ bufSize, pixels, caller)) {
+  return;
+   }
+
+   texImage = _mesa_select_tex_image(texObj, target, level);
+   assert(texImage);  /* should have been error checked already */
+
+   /* check sub-image dimensions */
+   if (width == WHOLE_WIDTH && height == WHOLE_HEIGHT && depth == WHOLE_DEPTH) 
{
+  /* getting whole image */
+  width = texImage->Width;
+  height = texImage->Height;
+  depth = texImage->Dept

[Mesa-dev] [PATCH 02/13] meta: add offset, width, height parameters to decompress_texture_image()

2015-07-13 Thread Brian Paul

In preparation for decompressing texture sub images.
---
 src/mesa/drivers/common/meta.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 54c3d5a..34a8e4b 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -2966,15 +2966,14 @@ static bool
 decompress_texture_image(struct gl_context *ctx,
  struct gl_texture_image *texImage,
  GLuint slice,
+ GLint xoffset, GLint yoffset,
+ GLsizei width, GLsizei height,
  GLenum destFormat, GLenum destType,
  GLvoid *dest)
 {
struct decompress_state *decompress = &ctx->Meta->Decompress;
struct decompress_fbo_state *decompress_fbo;
struct gl_texture_object *texObj = texImage->TexObject;
-   const GLint width = texImage->Width;
-   const GLint height = texImage->Height;
-   const GLint depth = texImage->Height;
const GLenum target = texObj->Target;
GLenum rbFormat;
GLenum faceTarget;
@@ -3093,7 +3092,7 @@ decompress_texture_image(struct gl_context *ctx,
memset(verts, 0, sizeof(verts));
 
_mesa_meta_setup_texture_coords(faceTarget, slice,
-   0, 0, width, height,
+   xoffset, yoffset, width, height,
texImage->Width, texImage->Height,
texImage->Depth,
verts[0].tex,
@@ -3224,7 +3223,8 @@ _mesa_meta_GetTexImage(struct gl_context *ctx,
  else {
 dst = pixels;
  }
- result = decompress_texture_image(ctx, texImage, slice,
+ result = decompress_texture_image(ctx, texImage, slice, 0, 0,
+   texImage->Width, texImage->Height,
format, type, dst);
  if (!result)
 break;
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 11/13] mesa: add API dispatch for GL_ARB_get_texture_sub_image

2015-07-13 Thread Brian Paul

This adds the new glGetTextureSubImage() and
glGetCompressedTextureSubImage() functions.  Also update the
dispatch sanity test program.
---
 src/mapi/glapi/gen/ARB_get_texture_sub_image.xml | 42 
 src/mapi/glapi/gen/Makefile.am   |  1 +
 src/mapi/glapi/gen/gl_API.xml|  3 ++
 src/mesa/main/tests/dispatch_sanity.cpp  |  5 +++
 4 files changed, 51 insertions(+)
 create mode 100644 src/mapi/glapi/gen/ARB_get_texture_sub_image.xml

diff --git a/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml 
b/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml
new file mode 100644
index 000..6acc92d
--- /dev/null
+++ b/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml
@@ -0,0 +1,42 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+{
+
+
+
+
diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index 5b163b0..170898c 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -135,6 +135,7 @@ API_XML = \
ARB_framebuffer_object.xml \
ARB_geometry_shader4.xml \
ARB_get_program_binary.xml \
+   ARB_get_texture_sub_image.xml \
ARB_gpu_shader_fp64.xml \
ARB_gpu_shader5.xml \
ARB_instanced_arrays.xml \
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 2f33075..9cc2c3a 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -12890,4 +12890,7 @@
 
 http://www.w3.org/2001/XInclude"/>
 
+http://www.w3.org/2001/XInclude"/>
+
+
 
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index 800720b..cc89acb 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -930,6 +930,11 @@ const struct function common_desktop_functions_possible[] 
= {
 
/* GL_EXT_polygon_offset_clamp */
{ "glPolygonOffsetClampEXT", 11, -1 },
+
+   /* GL_ARB_get_texture_sub_image */
+   { "glGetTextureSubImage", 12, -1 },
+   { "glGetCompressedTextureSubImage", 12, -1 },
+
{ NULL, 0, -1 }
 };
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 01/13] meta: handle subimages in _mesa_meta_setup_texture_coords()

2015-07-13 Thread Brian Paul

v2: fix depth, total_depth mix-up in meta.h, per Laura Ekstrand.
---
 src/mesa/drivers/common/meta.c | 88 +-
 src/mesa/drivers/common/meta.h |  6 +-
 src/mesa/drivers/common/meta_generate_mipmap.c |  4 +-
 3 files changed, 65 insertions(+), 33 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 9a75019..54c3d5a 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -2449,30 +2449,53 @@ _mesa_meta_Bitmap(struct gl_context *ctx,
 
 /**
  * Compute the texture coordinates for the four vertices of a quad for
- * drawing a 2D texture image or slice of a cube/3D texture.
+ * drawing a 2D texture image or slice of a cube/3D texture.  The offset
+ * and width, height specify a sub-region of the 2D image.
+ *
  * \param faceTarget  GL_TEXTURE_1D/2D/3D or cube face name
  * \param slice  slice of a 1D/2D array texture or 3D texture
- * \param width  width of the texture image
- * \param height  height of the texture image
+ * \param xoffset  X position of sub texture
+ * \param yoffset  Y position of sub texture
+ * \param width  width of the sub texture image
+ * \param height  height of the sub texture image
+ * \param total_width  total width of the texture image
+ * \param total_height  total height of the texture image
+ * \param total_depth  total depth of the texture image
  * \param coords0/1/2/3  returns the computed texcoords
  */
 void
 _mesa_meta_setup_texture_coords(GLenum faceTarget,
 GLint slice,
+GLint xoffset,
+GLint yoffset,
 GLint width,
 GLint height,
-GLint depth,
+GLint total_width,
+GLint total_height,
+GLint total_depth,
 GLfloat coords0[4],
 GLfloat coords1[4],
 GLfloat coords2[4],
 GLfloat coords3[4])
 {
-   static const GLfloat st[4][2] = {
-  {0.0f, 0.0f}, {1.0f, 0.0f}, {1.0f, 1.0f}, {0.0f, 1.0f}
-   };
+   float st[4][2];
GLuint i;
+   const float s0 = (float) xoffset / (float) total_width;
+   const float s1 = (float) (xoffset + width) / (float) total_width;
+   const float t0 = (float) yoffset / (float) total_height;
+   const float t1 = (float) (yoffset + height) / (float) total_height;
GLfloat r;
 
+   /* setup the reference texcoords */
+   st[0][0] = s0;
+   st[0][1] = t0;
+   st[1][0] = s1;
+   st[1][1] = t0;
+   st[2][0] = s1;
+   st[2][1] = t1;
+   st[3][0] = s0;
+   st[3][1] = t1;
+
if (faceTarget == GL_TEXTURE_CUBE_MAP_ARRAY)
   faceTarget = GL_TEXTURE_CUBE_MAP_POSITIVE_X + slice % 6;
 
@@ -2489,52 +2512,52 @@ _mesa_meta_setup_texture_coords(GLenum faceTarget,
case GL_TEXTURE_3D:
case GL_TEXTURE_2D_ARRAY:
   if (faceTarget == GL_TEXTURE_3D) {
- assert(slice < depth);
- assert(depth >= 1);
- r = (slice + 0.5f) / depth;
+ assert(slice < total_depth);
+ assert(total_depth >= 1);
+ r = (slice + 0.5f) / total_depth;
   }
   else if (faceTarget == GL_TEXTURE_2D_ARRAY)
  r = (float) slice;
   else
  r = 0.0F;
-  coords0[0] = 0.0F; /* s */
-  coords0[1] = 0.0F; /* t */
+  coords0[0] = st[0][0]; /* s */
+  coords0[1] = st[0][1]; /* t */
   coords0[2] = r; /* r */
-  coords1[0] = 1.0F;
-  coords1[1] = 0.0F;
+  coords1[0] = st[1][0];
+  coords1[1] = st[1][1];
   coords1[2] = r;
-  coords2[0] = 1.0F;
-  coords2[1] = 1.0F;
+  coords2[0] = st[2][0];
+  coords2[1] = st[2][1];
   coords2[2] = r;
-  coords3[0] = 0.0F;
-  coords3[1] = 1.0F;
+  coords3[0] = st[3][0];
+  coords3[1] = st[3][1];
   coords3[2] = r;
   break;
case GL_TEXTURE_RECTANGLE_ARB:
-  coords0[0] = 0.0F; /* s */
-  coords0[1] = 0.0F; /* t */
+  coords0[0] = (float) xoffset; /* s */
+  coords0[1] = (float) yoffset; /* t */
   coords0[2] = 0.0F; /* r */
-  coords1[0] = (float) width;
-  coords1[1] = 0.0F;
+  coords1[0] = (float) (xoffset + width);
+  coords1[1] = (float) yoffset;
   coords1[2] = 0.0F;
-  coords2[0] = (float) width;
-  coords2[1] = (float) height;
+  coords2[0] = (float) (xoffset + width);
+  coords2[1] = (float) (yoffset + height);
   coords2[2] = 0.0F;
-  coords3[0] = 0.0F;
-  coords3[1] = (float) height;
+  coords3[0] = (float) xoffset;
+  coords3[1] = (float) (yoffset + height);
   coords3[2] = 0.0F;
   break;
case GL_TEXTURE_1D_ARRAY:
-  coords0[0] = 0.0F; /* s */
+  coords0[0] = st[0][0]; /* s */
   coords0[1] = (float) slice; /* t */
   coords0[2] = 0.0F; /* r */
-  coords

[Mesa-dev] [PATCH 10/13] mesa: add new _mesa_Get[Compressed]TextureSubImage() functions

2015-07-13 Thread Brian Paul

Simple implementations in terms of get_[compressed_]texture_image().
---
 src/mesa/main/texgetimage.c | 38 ++
 src/mesa/main/texgetimage.h | 15 +++
 2 files changed, 53 insertions(+)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index bcdcfe8..07da5f4 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -1367,6 +1367,22 @@ _mesa_GetTextureImage(GLuint texture, GLint level, 
GLenum format, GLenum type,
 }
 
 
+void GLAPIENTRY
+_mesa_GetTextureSubImage(GLuint texture, GLint level,
+ GLint xoffset, GLint yoffset, GLint zoffset,
+ GLsizei width, GLsizei height, GLsizei depth,
+ GLenum format, GLenum type, GLsizei bufSize,
+ void *pixels)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   struct gl_texture_object *texObj = _mesa_lookup_texture(ctx, texture);
+
+   get_texture_image(ctx, texObj, 0, level,
+ xoffset, yoffset, zoffset, width, height, depth,
+ format, type, bufSize, pixels, "glGetTextureSubImage");
+}
+
+
 
 /**
  * Do error checking for getting compressed texture images.
@@ -1423,6 +1439,10 @@ getcompressedteximage_error_check(struct gl_context *ctx,
 }
 
 
+/**
+ * Compute the number of bytes which will be written when retrieving
+ * a sub-region of a compressed texture.
+ */
 static GLsizei
 packed_compressed_size(GLuint dimensions, mesa_format format,
GLsizei width, GLsizei height, GLsizei depth,
@@ -1637,3 +1657,21 @@ _mesa_GetCompressedTextureImage(GLuint texture, GLint 
level,
 bufSize, pixels,
 "glGetCompressedTextureImage");
 }
+
+
+void APIENTRY
+_mesa_GetCompressedTextureSubImage(GLuint texture, GLint level,
+   GLint xoffset, GLint yoffset,
+   GLint zoffset, GLsizei width,
+   GLsizei height, GLsizei depth,
+   GLsizei bufSize, void *pixels)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   struct gl_texture_object *texObj = _mesa_lookup_texture(ctx, texture);
+
+   get_compressed_texture_image(ctx, texObj, 0, level,
+xoffset, yoffset, zoffset,
+width, height, depth,
+bufSize, pixels,
+"glGetCompressedTextureImage");
+}
diff --git a/src/mesa/main/texgetimage.h b/src/mesa/main/texgetimage.h
index 040495a..63c75eb 100644
--- a/src/mesa/main/texgetimage.h
+++ b/src/mesa/main/texgetimage.h
@@ -71,6 +71,14 @@ _mesa_GetTextureImage(GLuint texture, GLint level, GLenum 
format,
   GLenum type, GLsizei bufSize, GLvoid *pixels);
 
 extern void GLAPIENTRY
+_mesa_GetTextureSubImage(GLuint texture, GLint level,
+ GLint xoffset, GLint yoffset, GLint zoffset,
+ GLsizei width, GLsizei height, GLsizei depth,
+ GLenum format, GLenum type, GLsizei bufSize,
+ void *pixels);
+
+
+extern void GLAPIENTRY
 _mesa_GetCompressedTexImage(GLenum target, GLint lod, GLvoid *img);
 
 extern void GLAPIENTRY
@@ -81,4 +89,11 @@ extern void GLAPIENTRY
 _mesa_GetCompressedTextureImage(GLuint texture, GLint level, GLsizei bufSize,
 GLvoid *pixels);
 
+extern void APIENTRY
+_mesa_GetCompressedTextureSubImage(GLuint texture, GLint level,
+   GLint xoffset, GLint yoffset,
+   GLint zoffset, GLsizei width,
+   GLsizei height, GLsizei depth,
+   GLsizei bufSize, void *pixels);
+
 #endif /* TEXGETIMAGE_H */
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 06/13] mesa: replace Driver.GetCompressedTexImage() w/ GetCompressedTexSubImage()

2015-07-13 Thread Brian Paul

For now, pass offsets of zero and width/height/depth equal to the
whole image.
---
 src/mesa/drivers/common/driverfuncs.c  |  2 +-
 src/mesa/main/dd.h |  9 ++---
 src/mesa/main/texgetimage.c| 28 
 src/mesa/main/texgetimage.h|  9 ++---
 src/mesa/state_tracker/st_cb_texture.c |  2 +-
 5 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/src/mesa/drivers/common/driverfuncs.c 
b/src/mesa/drivers/common/driverfuncs.c
index ce99620..6fe42b1 100644
--- a/src/mesa/drivers/common/driverfuncs.c
+++ b/src/mesa/drivers/common/driverfuncs.c
@@ -101,7 +101,7 @@ _mesa_init_driver_functions(struct dd_function_table 
*driver)
driver->TestProxyTexImage = _mesa_test_proxy_teximage;
driver->CompressedTexImage = _mesa_store_compressed_teximage;
driver->CompressedTexSubImage = _mesa_store_compressed_texsubimage;
-   driver->GetCompressedTexImage = _mesa_GetCompressedTexImage_sw;
+   driver->GetCompressedTexSubImage = _mesa_GetCompressedTexSubImage_sw;
driver->BindTexture = NULL;
driver->NewTextureObject = _mesa_new_texture_object;
driver->DeleteTexture = _mesa_delete_texture_object;
diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index 754abd5..0a54e7a 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -335,9 +335,12 @@ struct dd_function_table {
/**
 * Called by glGetCompressedTexImage.
 */
-   void (*GetCompressedTexImage)(struct gl_context *ctx,
- struct gl_texture_image *texImage,
- GLvoid *data);
+   void (*GetCompressedTexSubImage)(struct gl_context *ctx,
+struct gl_texture_image *texImage,
+GLint xoffset, GLint yoffset,
+GLint zoffset, GLsizei width,
+GLint height, GLint depth,
+GLvoid *data);
/*@}*/
 
/**
diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index 30a7d06..fb3c2c8 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -772,13 +772,16 @@ _mesa_GetTexSubImage_sw(struct gl_context *ctx,
 
 
 /**
- * This is the software fallback for Driver.GetCompressedTexImage().
+ * This is the software fallback for Driver.GetCompressedTexSubImage().
  * All error checking will have been done before this routine is called.
  */
 void
-_mesa_GetCompressedTexImage_sw(struct gl_context *ctx,
-   struct gl_texture_image *texImage,
-   GLvoid *img)
+_mesa_GetCompressedTexSubImage_sw(struct gl_context *ctx,
+  struct gl_texture_image *texImage,
+  GLint xoffset, GLint yoffset,
+  GLint zoffset, GLsizei width,
+  GLint height, GLint depth,
+  GLvoid *img)
 {
const GLuint dimensions =
   _mesa_get_texture_dimensions(texImage->TexObject->Target);
@@ -787,10 +790,8 @@ _mesa_GetCompressedTexImage_sw(struct gl_context *ctx,
GLubyte *dest;
 
_mesa_compute_compressed_pixelstore(dimensions, texImage->TexFormat,
-   texImage->Width, texImage->Height,
-   texImage->Depth,
-   &ctx->Pack,
-   &store);
+   width, height, depth,
+   &ctx->Pack, &store);
 
if (_mesa_is_bufferobj(ctx->Pack.BufferObj)) {
   /* pack texture image into a PBO */
@@ -816,8 +817,8 @@ _mesa_GetCompressedTexImage_sw(struct gl_context *ctx,
   GLubyte *src;
 
   /* map src texture buffer */
-  ctx->Driver.MapTextureImage(ctx, texImage, slice,
-  0, 0, texImage->Width, texImage->Height,
+  ctx->Driver.MapTextureImage(ctx, texImage, zoffset + slice,
+  xoffset, yoffset, width, height,
   GL_MAP_READ_BIT, &src, &srcRowStride);
 
   if (src) {
@@ -828,7 +829,7 @@ _mesa_GetCompressedTexImage_sw(struct gl_context *ctx,
 src += srcRowStride;
  }
 
- ctx->Driver.UnmapTextureImage(ctx, texImage, slice);
+ ctx->Driver.UnmapTextureImage(ctx, texImage, zoffset + slice);
 
  /* Advance to next slice */
  dest += store.TotalBytesPerRow * (store.TotalRowsPerSlice - 
store.CopyRowsPerSlice);
@@ -1323,7 +1324,10 @@ _mesa_get_compressed_texture_image(struct gl_context 
*ctx,
 
_mesa_lock_texture(ctx, texObj);
{
-  ctx->Driver.GetCompressedTexImage(ctx, texImage, pixels);
+  ctx->Driver.GetCompressedTexSubImage(ctx, texImage,
+   0, 0, 0,
+   texImage->Wi

[Mesa-dev] [PATCH 03/13] mesa: replace Driver.GetTexImage with GetTexSubImage()

2015-07-13 Thread Brian Paul

The new driver hook has x/y/zoffset and width/height/depth parameters
for the new glGetTextureSubImage() function.

The meta code and gallium state tracker are updated to handle the
new parameters.

Callers to Driver.GetTexSubImage() pass in offsets=0 and sizes equal
to the whole texture size.
---
 src/mesa/drivers/common/driverfuncs.c  |  2 +-
 src/mesa/drivers/common/meta.c | 22 --
 src/mesa/drivers/common/meta.h |  8 +---
 src/mesa/main/dd.h | 10 ++
 src/mesa/main/debug.c  |  4 +++-
 src/mesa/main/mipmap.c |  9 ++---
 src/mesa/main/texgetimage.c| 15 ++-
 src/mesa/main/texgetimage.h|  9 +
 src/mesa/state_tracker/st_cb_texture.c | 33 -
 9 files changed, 68 insertions(+), 44 deletions(-)

diff --git a/src/mesa/drivers/common/driverfuncs.c 
b/src/mesa/drivers/common/driverfuncs.c
index 71c1a76..ce99620 100644
--- a/src/mesa/drivers/common/driverfuncs.c
+++ b/src/mesa/drivers/common/driverfuncs.c
@@ -94,7 +94,7 @@ _mesa_init_driver_functions(struct dd_function_table *driver)
driver->QuerySamplesForFormat = _mesa_query_samples_for_format;
driver->TexImage = _mesa_store_teximage;
driver->TexSubImage = _mesa_store_texsubimage;
-   driver->GetTexImage = _mesa_meta_GetTexImage;
+   driver->GetTexSubImage = _mesa_meta_GetTexSubImage;
driver->ClearTexSubImage = _mesa_meta_ClearTexSubImage;
driver->CopyTexSubImage = _mesa_meta_CopyTexSubImage;
driver->GenerateMipmap = _mesa_meta_GenerateMipmap;
diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 34a8e4b..12045eb 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -3196,15 +3196,17 @@ decompress_texture_image(struct gl_context *ctx,
  * from core Mesa.
  */
 void
-_mesa_meta_GetTexImage(struct gl_context *ctx,
-   GLenum format, GLenum type, GLvoid *pixels,
-   struct gl_texture_image *texImage)
+_mesa_meta_GetTexSubImage(struct gl_context *ctx,
+  GLint xoffset, GLint yoffset, GLint zoffset,
+  GLsizei width, GLsizei height, GLsizei depth,
+  GLenum format, GLenum type, GLvoid *pixels,
+  struct gl_texture_image *texImage)
 {
if (_mesa_is_format_compressed(texImage->TexFormat)) {
   GLuint slice;
   bool result = true;
 
-  for (slice = 0; slice < texImage->Depth; slice++) {
+  for (slice = 0; slice < depth; slice++) {
  void *dst;
  if (texImage->TexObject->Target == GL_TEXTURE_2D_ARRAY
  || texImage->TexObject->Target == GL_TEXTURE_CUBE_MAP_ARRAY) {
@@ -3216,15 +3218,14 @@ _mesa_meta_GetTexImage(struct gl_context *ctx,
 struct gl_pixelstore_attrib packing = ctx->Pack;
 packing.SkipPixels = 0;
 packing.SkipRows = 0;
-dst = _mesa_image_address3d(&packing, pixels, texImage->Width,
-texImage->Height, format, type,
-slice, 0, 0);
+dst = _mesa_image_address3d(&packing, pixels, width, height,
+format, type, slice, 0, 0);
  }
  else {
 dst = pixels;
  }
- result = decompress_texture_image(ctx, texImage, slice, 0, 0,
-   texImage->Width, texImage->Height,
+ result = decompress_texture_image(ctx, texImage, slice,
+   xoffset, yoffset, width, height,
format, type, dst);
  if (!result)
 break;
@@ -3234,7 +3235,8 @@ _mesa_meta_GetTexImage(struct gl_context *ctx,
  return;
}
 
-   _mesa_GetTexImage_sw(ctx, format, type, pixels, texImage);
+   _mesa_GetTexSubImage_sw(ctx, xoffset, yoffset, zoffset,
+   width, height, depth, format, type, pixels, 
texImage);
 }
 
 
diff --git a/src/mesa/drivers/common/meta.h b/src/mesa/drivers/common/meta.h
index f5b74c4..fe43915 100644
--- a/src/mesa/drivers/common/meta.h
+++ b/src/mesa/drivers/common/meta.h
@@ -560,9 +560,11 @@ _mesa_meta_ClearTexSubImage(struct gl_context *ctx,
 const GLvoid *clearValue);
 
 extern void
-_mesa_meta_GetTexImage(struct gl_context *ctx,
-   GLenum format, GLenum type, GLvoid *pixels,
-   struct gl_texture_image *texImage);
+_mesa_meta_GetTexSubImage(struct gl_context *ctx,
+  GLint xoffset, GLint yoffset, GLint zoffset,
+  GLsizei width, GLsizei height, GLsizei depth,
+  GLenum format, GLenum type, GLvoid *pixels,
+  struct gl_texture_image *texImage);
 
 extern void
 _mesa_meta_DrawTex(struct gl_context *

[Mesa-dev] [PATCH 04/13] mesa: plumb offset/size parameters through GetTexSubImage code

2015-07-13 Thread Brian Paul

Needed for GL_ARB_get_texture_sub_image.  But at this point, the
offsets are always zero and the sizes match the whole texture image.

v2: Fixes, suggestions from Laura Ekstrand:
* Fix calls to ctx->Driver.UnmapTextureImage() to pass the correct
  slice value.
* Added comments and assertions to check zoffset+depth<=tex->Depth before
  the 'img' loops.
* Added a new zoffset==0 assert in get_tex_memcpy().
---
 src/mesa/main/texgetimage.c | 137 ++--
 1 file changed, 80 insertions(+), 57 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index 01f1a15..d17dd52 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -75,12 +75,11 @@ type_needs_clamping(GLenum type)
  */
 static void
 get_tex_depth(struct gl_context *ctx, GLuint dimensions,
+  GLint xoffset, GLint yoffset, GLint zoffset,
+  GLsizei width, GLsizei height, GLint depth,
   GLenum format, GLenum type, GLvoid *pixels,
   struct gl_texture_image *texImage)
 {
-   const GLint width = texImage->Width;
-   GLint height = texImage->Height;
-   GLint depth = texImage->Depth;
GLint img, row;
GLfloat *depthRow = malloc(width * sizeof(GLfloat));
 
@@ -94,14 +93,15 @@ get_tex_depth(struct gl_context *ctx, GLuint dimensions,
   height = 1;
}
 
+   assert(zoffset + depth <= texImage->Depth);
for (img = 0; img < depth; img++) {
   GLubyte *srcMap;
   GLint srcRowStride;
 
   /* map src texture buffer */
-  ctx->Driver.MapTextureImage(ctx, texImage, img,
-  0, 0, width, height, GL_MAP_READ_BIT,
-  &srcMap, &srcRowStride);
+  ctx->Driver.MapTextureImage(ctx, texImage, zoffset + img,
+  xoffset, yoffset, width, height,
+  GL_MAP_READ_BIT, &srcMap, &srcRowStride);
 
   if (srcMap) {
  for (row = 0; row < height; row++) {
@@ -113,7 +113,7 @@ get_tex_depth(struct gl_context *ctx, GLuint dimensions,
 _mesa_pack_depth_span(ctx, width, dest, type, depthRow, 
&ctx->Pack);
  }
 
- ctx->Driver.UnmapTextureImage(ctx, texImage, img);
+ ctx->Driver.UnmapTextureImage(ctx, texImage, zoffset + img);
   }
   else {
  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glGetTexImage");
@@ -130,26 +130,26 @@ get_tex_depth(struct gl_context *ctx, GLuint dimensions,
  */
 static void
 get_tex_depth_stencil(struct gl_context *ctx, GLuint dimensions,
+  GLint xoffset, GLint yoffset, GLint zoffset,
+  GLsizei width, GLsizei height, GLint depth,
   GLenum format, GLenum type, GLvoid *pixels,
   struct gl_texture_image *texImage)
 {
-   const GLint width = texImage->Width;
-   const GLint height = texImage->Height;
-   const GLint depth = texImage->Depth;
GLint img, row;
 
assert(format == GL_DEPTH_STENCIL);
assert(type == GL_UNSIGNED_INT_24_8 ||
   type == GL_FLOAT_32_UNSIGNED_INT_24_8_REV);
 
+   assert(zoffset + depth <= texImage->Depth);
for (img = 0; img < depth; img++) {
   GLubyte *srcMap;
   GLint rowstride;
 
   /* map src texture buffer */
-  ctx->Driver.MapTextureImage(ctx, texImage, img,
-  0, 0, width, height, GL_MAP_READ_BIT,
-  &srcMap, &rowstride);
+  ctx->Driver.MapTextureImage(ctx, texImage, zoffset + img,
+  xoffset, yoffset, width, height,
+  GL_MAP_READ_BIT, &srcMap, &rowstride);
 
   if (srcMap) {
  for (row = 0; row < height; row++) {
@@ -166,7 +166,7 @@ get_tex_depth_stencil(struct gl_context *ctx, GLuint 
dimensions,
 }
  }
 
- ctx->Driver.UnmapTextureImage(ctx, texImage, img);
+ ctx->Driver.UnmapTextureImage(ctx, texImage, zoffset + img);
   }
   else {
  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glGetTexImage");
@@ -180,12 +180,11 @@ get_tex_depth_stencil(struct gl_context *ctx, GLuint 
dimensions,
  */
 static void
 get_tex_stencil(struct gl_context *ctx, GLuint dimensions,
+GLint xoffset, GLint yoffset, GLint zoffset,
+GLsizei width, GLsizei height, GLint depth,
 GLenum format, GLenum type, GLvoid *pixels,
 struct gl_texture_image *texImage)
 {
-   const GLint width = texImage->Width;
-   const GLint height = texImage->Height;
-   const GLint depth = texImage->Depth;
GLint img, row;
 
assert(format == GL_STENCIL_INDEX);
@@ -195,8 +194,9 @@ get_tex_stencil(struct gl_context *ctx, GLuint dimensions,
   GLint rowstride;
 
   /* map src texture buffer */
-  ctx->Driver.MapTextureImage(ctx, texImage, img,
-  0, 0, width, height, GL_MAP_READ_BIT,
+  ctx->Driver.MapTextureImage(ctx, texImage, z

[Mesa-dev] [PATCH 07/13] mesa: 80-column wrapping in texgetimage.c

2015-07-13 Thread Brian Paul

---
 src/mesa/main/texgetimage.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index fb3c2c8..e180a4c 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -449,7 +449,8 @@ get_tex_rgba_uncompressed(struct gl_context *ctx, GLuint 
dimensions,
   rebaseSwizzle[1] = MESA_FORMAT_SWIZZLE_ZERO;
   rebaseSwizzle[2] = MESA_FORMAT_SWIZZLE_ZERO;
   rebaseSwizzle[3] = MESA_FORMAT_SWIZZLE_W;
-} else if (texImage->_BaseFormat != 
_mesa_get_format_base_format(texFormat)) {
+} else if (texImage->_BaseFormat !=
+   _mesa_get_format_base_format(texFormat)) {
   needsRebase =
  _mesa_compute_rgba2base2rgba_component_mapping(texImage->_BaseFormat,
 rebaseSwizzle);
@@ -531,8 +532,8 @@ get_tex_rgba_uncompressed(struct gl_context *ctx, GLuint 
dimensions,
  /* If we had to rebase, we have already handled that */
  needsRebase = false;
 
- /* If we were lucky and our RGBA conversion matches the dst format, 
then
-  * we are done.
+ /* If we were lucky and our RGBA conversion matches the dst format,
+  * then we are done.
   */
  if (!need_convert)
 goto do_swap;
@@ -832,7 +833,8 @@ _mesa_GetCompressedTexSubImage_sw(struct gl_context *ctx,
  ctx->Driver.UnmapTextureImage(ctx, texImage, zoffset + slice);
 
  /* Advance to next slice */
- dest += store.TotalBytesPerRow * (store.TotalRowsPerSlice - 
store.CopyRowsPerSlice);
+ dest += store.TotalBytesPerRow * (store.TotalRowsPerSlice -
+   store.CopyRowsPerSlice);
 
   } else {
  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glGetCompresssedTexImage");
@@ -953,7 +955,8 @@ getteximage_error_check(struct gl_context *ctx,
   "glGetTex%sImage(format mismatch)", suffix);
   return GL_TRUE;
}
-   else if (!_mesa_is_stencil_format(format) && 
_mesa_is_enum_format_integer(format) !=
+   else if (!_mesa_is_stencil_format(format) &&
+_mesa_is_enum_format_integer(format) !=
 _mesa_is_format_integer(texImage->TexFormat)) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
   "glGetTex%sImage(format mismatch)", suffix);
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 05/13] mesa: make _mesa_get_[compressed_]texture_image() static

2015-07-13 Thread Brian Paul

These functions are only called from teximage.c
---
 src/mesa/main/texgetimage.c | 24 
 src/mesa/main/texgetimage.h |  7 ---
 2 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index d17dd52..30a7d06 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -1004,12 +1004,12 @@ getteximage_error_check(struct gl_context *ctx,
  * \param dsa True when the caller is an ARB_direct_state_access function,
  *false otherwise
  */
-void
-_mesa_get_texture_image(struct gl_context *ctx,
-struct gl_texture_object *texObj,
-struct gl_texture_image *texImage, GLenum target,
-GLint level, GLenum format, GLenum type,
-GLsizei bufSize, GLvoid *pixels, bool dsa)
+static void
+get_texture_image(struct gl_context *ctx,
+  struct gl_texture_object *texObj,
+  struct gl_texture_image *texImage, GLenum target,
+  GLint level, GLenum format, GLenum type,
+  GLsizei bufSize, GLvoid *pixels, bool dsa)
 {
assert(texObj);
assert(texImage);
@@ -1103,8 +1103,8 @@ _mesa_GetnTexImageARB(GLenum target, GLint level, GLenum 
format,
if (!texImage)
   return;
 
-   _mesa_get_texture_image(ctx, texObj, texImage, target, level, format, type,
-   bufSize, pixels, false);
+   get_texture_image(ctx, texObj, texImage, target, level, format, type,
+ bufSize, pixels, false);
 }
 
 
@@ -1179,8 +1179,8 @@ _mesa_GetTextureImage(GLuint texture, GLint level, GLenum 
format,
  texImage = texObj->Image[i][level];
  assert(texImage);
 
- _mesa_get_texture_image(ctx, texObj, texImage, texObj->Target, level,
- format, type, bufSize, pixels, true);
+ get_texture_image(ctx, texObj, texImage, texObj->Target, level,
+   format, type, bufSize, pixels, true);
 
  image_stride = _mesa_image_image_stride(&ctx->Pack, texImage->Width,
  texImage->Height, format,
@@ -1194,8 +1194,8 @@ _mesa_GetTextureImage(GLuint texture, GLint level, GLenum 
format,
   if (!texImage)
  return;
 
-  _mesa_get_texture_image(ctx, texObj, texImage, texObj->Target, level,
-  format, type, bufSize, pixels, true);
+  get_texture_image(ctx, texObj, texImage, texObj->Target, level,
+format, type, bufSize, pixels, true);
}
 }
 
diff --git a/src/mesa/main/texgetimage.h b/src/mesa/main/texgetimage.h
index 611b1bd..5124851 100644
--- a/src/mesa/main/texgetimage.h
+++ b/src/mesa/main/texgetimage.h
@@ -49,13 +49,6 @@ _mesa_GetCompressedTexImage_sw(struct gl_context *ctx,
GLvoid *data);
 
 extern void
-_mesa_get_texture_image(struct gl_context *ctx,
-struct gl_texture_object *texObj,
-struct gl_texture_image *texImage, GLenum target,
-GLint level, GLenum format, GLenum type,
-GLsizei bufSize, GLvoid *pixels, bool dsa);
-
-extern void
 _mesa_get_compressed_texture_image( struct gl_context *ctx,
 struct gl_texture_object *texObj,
 struct gl_texture_image *texImage,
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 12/13] mesa: enable GL_ARB_get_texture_sub_image for all drivers

2015-07-13 Thread Brian Paul

---
 src/mesa/main/extensions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index 4176a69..d2df4fb 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -121,6 +121,7 @@ static const struct extension extension_table[] = {
{ "GL_ARB_framebuffer_object",  o(ARB_framebuffer_object),  
GL, 2005 },
{ "GL_ARB_framebuffer_sRGB",o(EXT_framebuffer_sRGB),
GL, 1998 },
{ "GL_ARB_get_program_binary",  o(dummy_true),  
GL, 2010 },
+   { "GL_ARB_get_texture_sub_image",   o(dummy_true),  
GL, 2014 },
{ "GL_ARB_gpu_shader5", o(ARB_gpu_shader5), 
GLC,2010 },
{ "GL_ARB_gpu_shader_fp64", o(ARB_gpu_shader_fp64), 
GLC,2010 },
{ "GL_ARB_half_float_pixel",o(dummy_true),  
GL, 2003 },
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 08/13] mesa: overhaul the glGetTexImage code

2015-07-13 Thread Brian Paul

1. Reorganize the error checking code.
2. Lay groundwork for getting sub images.
3. Implement _mesa_GetnTexImageARB(), _mesa_GetTexImage() and
   _mesa_GetTextureImage() all in terms of get_texture_image().
---
 src/mesa/main/texgetimage.c | 539 +---
 1 file changed, 352 insertions(+), 187 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index e180a4c..b74517a 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -49,6 +49,13 @@
 #include "format_utils.h"
 #include "pixeltransfer.h"
 
+
+/** Some magic numbers used when getting whole texture image */
+#define WHOLE_WIDTH -123
+#define WHOLE_HEIGHT -123
+#define WHOLE_DEPTH -123
+
+
 /**
  * Can the given type represent negative values?
  */
@@ -891,27 +898,59 @@ legal_getteximage_target(struct gl_context *ctx, GLenum 
target, bool dsa)
 
 
 /**
- * Do error checking for a glGetTex(ture)Image() call.
- * \return GL_TRUE if any error, GL_FALSE if no errors.
+ * Do error checking for all (non-compressed) get-texture-image functions.
+ * \param target  user-provided texture target, or 0 if called from DSA 
function
+ * \return true if any error, false if no errors.
  */
-static GLboolean
+static bool
 getteximage_error_check(struct gl_context *ctx,
-struct gl_texture_image *texImage,
+struct gl_texture_object *texObj,
 GLenum target, GLint level,
 GLenum format, GLenum type, GLsizei clientMemSize,
-GLvoid *pixels, bool dsa)
+GLvoid *pixels, const char *caller)
 {
-   const GLint maxLevels = _mesa_max_texture_levels(ctx, target);
-   const GLuint dimensions = (target == GL_TEXTURE_3D) ? 3 : 2;
-   GLenum baseFormat;
-   const char *suffix = dsa ? "ture" : "";
+   struct gl_texture_image *texImage;
+   GLenum baseFormat, err;
+   GLint maxLevels;
+   const bool dsa = (target == 0);
 
-   assert(texImage);
-   assert(maxLevels != 0);
+   if (!texObj || texObj->Target == 0) {
+  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(invalid texture)", caller);
+  return true;
+   }
+
+   if (dsa) {
+  target = texObj->Target;
+   }
+
+   if (!legal_getteximage_target(ctx, target, dsa)) {
+  _mesa_error(ctx, GL_INVALID_ENUM, "%s(target=%s)", caller,
+  _mesa_lookup_enum_by_nr(texObj->Target));
+  return true;
+   }
+
+   err = _mesa_error_check_format_and_type(ctx, format, type);
+   if (err != GL_NO_ERROR) {
+  _mesa_error(ctx, err, "%s(format/type)", caller);
+  return true;
+   }
+
+   maxLevels = _mesa_max_texture_levels(ctx, target);
if (level < 0 || level >= maxLevels) {
   _mesa_error(ctx, GL_INVALID_VALUE,
-  "glGetTex%sImage(level out of range)", suffix);
-  return GL_TRUE;
+  "%s(level = %d)", caller, level);
+  return true;
+   }
+
+   if (target == GL_TEXTURE_CUBE_MAP) {
+  target = GL_TEXTURE_CUBE_MAP_POSITIVE_X;
+   }
+
+   texImage = _mesa_select_tex_image(texObj, target, level);
+   if (!texImage) {
+  /* non-existant texture image */
+  _mesa_error(ctx, GL_INVALID_OPERATION, "%s", caller);
+  return true;
}
 
/*
@@ -927,247 +966,325 @@ getteximage_error_check(struct gl_context *ctx,
if (_mesa_is_color_format(format)
&& !_mesa_is_color_format(baseFormat)) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
-  "glGetTex%sImage(format mismatch)", suffix);
-  return GL_TRUE;
+  "%s(format mismatch)", caller);
+  return true;
}
else if (_mesa_is_depth_format(format)
 && !_mesa_is_depth_format(baseFormat)
 && !_mesa_is_depthstencil_format(baseFormat)) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
-  "glGetTex%sImage(format mismatch)", suffix);
-  return GL_TRUE;
+  "%s(format mismatch)", caller);
+  return true;
}
else if (_mesa_is_stencil_format(format)
 && !ctx->Extensions.ARB_texture_stencil8) {
   _mesa_error(ctx, GL_INVALID_ENUM,
-  "glGetTex%sImage(format=GL_STENCIL_INDEX)", suffix);
-  return GL_TRUE;
+  "%s(format=GL_STENCIL_INDEX)", caller);
+  return true;
}
else if (_mesa_is_ycbcr_format(format)
 && !_mesa_is_ycbcr_format(baseFormat)) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
-  "glGetTex%sImage(format mismatch)", suffix);
-  return GL_TRUE;
+  "%s(format mismatch)", caller);
+  return true;
}
else if (_mesa_is_depthstencil_format(format)
 && !_mesa_is_depthstencil_format(baseFormat)) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
-  "glGetTex%sImage(format mismatch)", suffix);
-  return GL_TRUE;
+  "%s(format mismatch)", caller);
+  return true;
}
else if (!_mesa_is_stencil_format(forma

[Mesa-dev] [PATCH 13/13] docs: document that GL_ARB_get_texture_sub_image is completed

2015-07-13 Thread Brian Paul

---
 docs/GL3.txt  | 2 +-
 docs/relnotes/10.7.0.html | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 33a282e..2144257 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -202,7 +202,7 @@ GL 4.5, GLSL 4.50:
   - Sampler object DONE
   - Program Pipeline objectDONE
   - Query object   DONE (will require 
changes when GL_ARB_query_buffer_object lands)
-  GL_ARB_get_texture_sub_image started (Brian Paul)
+  GL_ARB_get_texture_sub_image DONE (all drivers)
   GL_ARB_shader_texture_image_samples  not started
   GL_ARB_texture_barrier   DONE (nv50, nvc0, r600, 
radeonsi)
   GL_KHR_context_flush_control DONE (all - but needs 
GLX/EXT extension to be useful)
diff --git a/docs/relnotes/10.7.0.html b/docs/relnotes/10.7.0.html
index 42ea807..4533e78 100644
--- a/docs/relnotes/10.7.0.html
+++ b/docs/relnotes/10.7.0.html
@@ -44,6 +44,7 @@ Note: some of the new features are only available with 
certain drivers.
 
 
 
+GL_ARB_get_texture_sub_image for all drivers
 GL_AMD_vertex_shader_viewport_index on radeonsi
 GL_ARB_fragment_layer_viewport on radeonsi
 GL_ARB_framebuffer_no_attachments on i965
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 90264] [Regression, bisected] Tooltip corruption in Chrome

2015-07-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=90264

Shawn Starr  changed:

   What|Removed |Added

 CC||shawn.st...@rogers.com

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 03/13] gallium/auxiliary: Use exp2(x) instead of pow(2.0, x).

2015-07-13 Thread Matt Turner

On Mon, Jul 13, 2015 at 5:13 PM, Roland Scheidegger  wrote:
> Did you replace 2 of them by exp2f but one by exp2f on purpose?

Yes, two of them are converting their results to floats so they were
converted to exp2f().

Upon inspection, the argument of the one exp2 is just an integer...
can't we convert that to (double)(1 << arg)?

> I don't think we can use exp2/exp2f in gallium. This requires msvc 2013
> (all of exp2, exp2f, powf are c99, powf is supported by older msvc but
> the others are not). I guess though could throw some wrappers into
> c99_math.h.

Yeah, if we need to put wrappers in c99_math.h I can do that.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 11/13] mesa: add API dispatch for GL_ARB_get_texture_sub_image

2015-07-13 Thread Ilia Mirkin

On Jul 13, 2015 9:21 PM, "Brian Paul"  wrote:
>
> This adds the new glGetTextureSubImage() and
> glGetCompressedTextureSubImage() functions.  Also update the
> dispatch sanity test program.
> ---
>  src/mapi/glapi/gen/ARB_get_texture_sub_image.xml | 42

>  src/mapi/glapi/gen/Makefile.am   |  1 +
>  src/mapi/glapi/gen/gl_API.xml|  3 ++
>  src/mesa/main/tests/dispatch_sanity.cpp  |  5 +++
>  4 files changed, 51 insertions(+)
>  create mode 100644 src/mapi/glapi/gen/ARB_get_texture_sub_image.xml
>
> diff --git a/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml
b/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml
> new file mode 100644
> index 000..6acc92d
> --- /dev/null
> +++ b/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml
> @@ -0,0 +1,42 @@
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +{

I'm surprised this parsed ok...

> +
> +
> +
> +
> diff --git a/src/mapi/glapi/gen/Makefile.am
b/src/mapi/glapi/gen/Makefile.am
> index 5b163b0..170898c 100644
> --- a/src/mapi/glapi/gen/Makefile.am
> +++ b/src/mapi/glapi/gen/Makefile.am
> @@ -135,6 +135,7 @@ API_XML = \
> ARB_framebuffer_object.xml \
> ARB_geometry_shader4.xml \
> ARB_get_program_binary.xml \
> +   ARB_get_texture_sub_image.xml \
> ARB_gpu_shader_fp64.xml \
> ARB_gpu_shader5.xml \
> ARB_instanced_arrays.xml \
> diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
> index 2f33075..9cc2c3a 100644
> --- a/src/mapi/glapi/gen/gl_API.xml
> +++ b/src/mapi/glapi/gen/gl_API.xml
> @@ -12890,4 +12890,7 @@
>
>  http://www.w3.org/2001/XInclude"/>
>
> +http://www.w3.org/2001/XInclude"/>

I believe these are usually included in ext number order.

> +
> +
>  
> diff --git a/src/mesa/main/tests/dispatch_sanity.cpp
b/src/mesa/main/tests/dispatch_sanity.cpp
> index 800720b..cc89acb 100644
> --- a/src/mesa/main/tests/dispatch_sanity.cpp
> +++ b/src/mesa/main/tests/dispatch_sanity.cpp
> @@ -930,6 +930,11 @@ const struct function
common_desktop_functions_possible[] = {
>
> /* GL_EXT_polygon_offset_clamp */
> { "glPolygonOffsetClampEXT", 11, -1 },
> +
> +   /* GL_ARB_get_texture_sub_image */
> +   { "glGetTextureSubImage", 12, -1 },
> +   { "glGetCompressedTextureSubImage", 12, -1 },
> +
> { NULL, 0, -1 }
>  };
>
> --
> 1.9.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: Fix 32 bit build warnings in intel_get_yf_ys_bo_size()

2015-07-13 Thread Kenneth Graunke

On Monday, July 13, 2015 03:35:16 PM Anuj Phogat wrote:
> Along with fixing the type of pitch parameter, patch also changes
> the types of few local variables and function return type.
> 
> Warnings fixed are:
> intel_mipmap_tree.c:671:7: warning: passing argument 3 of
> 'intel_get_yf_ys_bo_size' from incompatible pointer type
> 
> intel_mipmap_tree.c:563:1: note: expected 'uint64_t *' but
> argument is of type 'long unsigned int *'
> 
> Reported-by: Kenneth Graunke 
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index fb896a9..1529651 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -559,14 +559,14 @@ intel_lower_compressed_format(struct brw_context *brw, 
> mesa_format format)
>  }
>  
>  /* This function computes Yf/Ys tiled bo size, alignment and pitch. */
> -static uint64_t
> +static unsigned long
>  intel_get_yf_ys_bo_size(struct intel_mipmap_tree *mt, unsigned *alignment,
> -uint64_t *pitch)
> +unsigned long *pitch)
>  {
> const uint32_t bpp = mt->cpp * 8;
> const uint32_t aspect_ratio = (bpp == 16 || bpp == 64) ? 2 : 1;
> uint32_t tile_width, tile_height;
> -   uint64_t stride, size, aligned_y;
> +   unsigned long stride, size, aligned_y;
>  
> assert(mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE);

This looks good to me.  I don't think anything should overflow, though I
didn't check thoroughly.  Thanks Anuj :)

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: free interface_types and replace old hash_table uses

2015-07-13 Thread Kenneth Graunke

On Monday, July 13, 2015 11:21:05 AM Iago Toral wrote:
> On Sat, 2015-07-11 at 10:13 +1000, Timothy Arceri wrote:
> > @@ -648,27 +653,28 @@ glsl_type::get_array_instance(const glsl_type *base, 
> > unsigned array_size)
> > mtx_lock(&glsl_type::mutex);
> >  
> > if (array_types == NULL) {
> > -  array_types = hash_table_ctor(64, hash_table_string_hash,
> > -   hash_table_string_compare);
> > +  array_types = _mesa_hash_table_create(NULL, _mesa_key_hash_string,
> > +_mesa_key_string_equal);
> > }
> >  
> > -   const glsl_type *t = (glsl_type *) hash_table_find(array_types, key);
> > -
> > -   if (t == NULL) {
> > +   const struct hash_entry *entry = _mesa_hash_table_search(array_types, 
> > key);
> > +   if (entry == NULL) {
> >mtx_unlock(&glsl_type::mutex);
> > -  t = new glsl_type(base, array_size);
> > +  const glsl_type *t = new glsl_type(base, array_size);
> >mtx_lock(&glsl_type::mutex);
> >  
> > -  hash_table_insert(array_types, (void *) t, ralloc_strdup(mem_ctx, 
> > key));
> > +  entry = _mesa_hash_table_insert(array_types,
> > +  ralloc_strdup(mem_ctx, key),
> > +  (void *) t);
> > }
> >  
> > -   assert(t->base_type == GLSL_TYPE_ARRAY);
> > -   assert(t->length == array_size);
> > -   assert(t->fields.array == base);
> > +   assert(((glsl_type *)entry->data)->base_type == GLSL_TYPE_ARRAY);
> > +   assert(((glsl_type *)entry->data)->length == array_size);
> > +   assert(((glsl_type *)entry->data)->fields.array == base);
> 
> Other parts of this file put a blank between the type cast and the
> variable, so I would add that here (and in all other places where you
> cast entry to glsl_type* in this patch).

Or...why not continue to have a local variable t, and just set t =
entry->data?  Then these could all stay the same, and there would be
less casting.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: free interface_types and replace old hash_table uses

2015-07-13 Thread Timothy Arceri

On Mon, 2015-07-13 at 22:19 -0700, Kenneth Graunke wrote:
> On Monday, July 13, 2015 11:21:05 AM Iago Toral wrote:
> > On Sat, 2015-07-11 at 10:13 +1000, Timothy Arceri wrote:
> > > @@ -648,27 +653,28 @@ glsl_type::get_array_instance(const glsl_type 
> > > *base, unsigned array_size)
> > > mtx_lock(&glsl_type::mutex);
> > >  
> > > if (array_types == NULL) {
> > > -  array_types = hash_table_ctor(64, hash_table_string_hash,
> > > - hash_table_string_compare);
> > > +  array_types = _mesa_hash_table_create(NULL, 
> > > _mesa_key_hash_string,
> > > +_mesa_key_string_equal);
> > > }
> > >  
> > > -   const glsl_type *t = (glsl_type *) hash_table_find(array_types, 
> > > key);
> > > -
> > > -   if (t == NULL) {
> > > +   const struct hash_entry *entry = 
> > > _mesa_hash_table_search(array_types, key);
> > > +   if (entry == NULL) {
> > >mtx_unlock(&glsl_type::mutex);
> > > -  t = new glsl_type(base, array_size);
> > > +  const glsl_type *t = new glsl_type(base, array_size);
> > >mtx_lock(&glsl_type::mutex);
> > >  
> > > -  hash_table_insert(array_types, (void *) t, ralloc_strdup(mem_ctx, 
> > > key));
> > > +  entry = _mesa_hash_table_insert(array_types,
> > > +  ralloc_strdup(mem_ctx, key),
> > > +  (void *) t);
> > > }
> > >  
> > > -   assert(t->base_type == GLSL_TYPE_ARRAY);
> > > -   assert(t->length == array_size);
> > > -   assert(t->fields.array == base);
> > > +   assert(((glsl_type *)entry->data)->base_type == GLSL_TYPE_ARRAY);
> > > +   assert(((glsl_type *)entry->data)->length == array_size);
> > > +   assert(((glsl_type *)entry->data)->fields.array == base);
> > 
> > Other parts of this file put a blank between the type cast and the
> > variable, so I would add that here (and in all other places where you
> > cast entry to glsl_type* in this patch).
> 
> Or...why not continue to have a local variable t, and just set t =
> entry->data?  Then these could all stay the same, and there would be
> less casting.

I did have it like that but in my opinion it just looked messy. 2 extra lines
vs 3 extra casts all of which are in asserts(), it felt like I was making the
surrounding code worse for the sake of the asserts.

I've pushed this already so I guess it doesn't matter now anyway.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC] loader: libudev vs sysfs vs libdrm

2015-07-13 Thread Julien Isorce

>> - libdrm: used as a last resource fall-back after the above two. the
>> sole option used by *BSD, MacOS and Android.

Hi,

Unless I am missing something, libdrm is not used on MacOS.

On osx, EGL_PLATFORM=x11 eglGetDisplay(NULL) calls dri2_initialize_x11
which fallback to dri2_initialize_x11_swrast.

Though libdrm can be built and then EGL_PLATFORM=drm eglGetDisplay(NULL)
calls dri2_initialize_drm which fallback to swrast .
But not sure how gbm surfaces are useful.

And there is no kernel side for drm on darwin right ?

Speaking about egl, on linux+x11, the driver name is retrieved with
xcb_dri2_connect_driver_name (except swrast which is harcoded) and then
loaded with dri2_open_driver (= dlopen of %s_dri.so)
So this mechanism is not either libudev, sysfs or libdrm. Am I missing
something (or is it part of sysfs) ? Or the X server internally will use
one of these 3 ?

Also any plan to support eglQueryDevicesEXT (
https://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_device_enumeration.txt)
?

On 9 July 2015 at 22:00, Emil Velikov  wrote:

> On 09/07/15 15:40, Axel Davy wrote:
> > On 09/07/2015 15:33, Emil Velikov wrote :
> >>
> >> Yakes forgot about that one. From a quick look we can (I know it feels
> >> durty) use readlink(
> >> /sys/dev/char/$(major):$(minor)).
> >>
> >> Everything seems to be there - bus type and exact location of the
> >> device on the bus. It's limited to sysfs users, but they are more than
> >> the libudev ones (perhaps not by much). How does that sound ?
> >>
> >> Cheers,
> >> Emil
> >>
> > The idea behind using udev for the tag was that arbitrary tag could be
> > used,
> > especially to identify usb devices.
> >
> So avoid breaking peoples' setup the above example can be used to
> produce the very same tag(s). Regardles if it's a device on the pci or
> usb bus. Speaking of usb, are there any devices supported with upstream
> mesa ?
>
> -Emil
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

92 matches

Mail list logo