Re: [Mesa-dev] [PATCH 09/22] i965/fs: add lowering x2d step for IVB/VLV

2017-01-16 Thread Samuel Iglesias Gonsálvez
On Fri, 2017-01-13 at 14:36 -0800, Matt Turner wrote:
> On Thu, Jan 5, 2017 at 5:07 AM, Samuel Iglesias Gonsálvez
>  wrote:
> > From: "Juan A. Suarez Romero" 
> > 
> > On Ivybridge/Valleyview, when converting a float (F) to a double
> > precision float (DF), the hardware automatically duplicates the
> > source
> > horizontal stride, hence converting only the values in odd
> > positions.
> > 
> > This commit adds a new lowering step, exclusively for IVB/VLV,
> > where the
> > sources are first copied into a temporary register with stride 2, and
> > then converted from this temporary register. Thus, we do not lose
> > any
> > value.
> 
> Curro explained how he thinks the hardware works to me. I'll try to
> reproduce that description here.
> 
> The FPU channels are 32-bits wide on IVB/BYT. Normally, for example
> when operating on 8 float channels, the FPU is given a channel of the
> source register to operate on, and each FPU channel produces a value
> which is written to the channels of the destination.
> 
> But when operating on doubles, each *pair* of FPU channels operates
> on
> one (double-precision) value. Unfortunately the hardware designers
> didn't seem to update the input and output logic, so for instance
> every pair of float channels from the source region are given as
> input
> to the FPU, even though only the low (or even numbered) channel will
> be used. This is why it appears that the hardware doubles the stride,
> but it's really just ignoring all of the odd channels.
> 
> A similar thing happens on output. The output elements are 64-bits
> (even if the output type is float), and so a destination stride of 1
> means the writes are strided by 64-bits. This explains the strange
> looking behavior you discovered of an instruction like mov(8) gX<1>F
> gY<8,8,1>DF.
> 
> With that understanding, we actually can read consecutive float
> channels and convert them to doubles in one instruction -- by using a
> <1,2,0> region. Each float channel is read twice, and the second read
> will be ignored by the FPU.
> 
> So we can replace this patch with the one I have attached. A nice
> side
> effect of this is that we can simplify VEC4_OPCODE_TO_DOUBLE.
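
The region arithmetic described above can be checked with a small model.
This is only a sketch under stated assumptions: region_element() and
floats_converted() are hypothetical helper names, not Mesa functions, and the
"only the even channel of each pair is used" rule is taken from the
description in this thread rather than from hardware documentation.

```c
#include <assert.h>

/* Index of the source element that execution channel `ch` reads from a
 * <vstride,width,hstride> region, counted in elements of the source type.
 */
static unsigned
region_element(unsigned ch, unsigned vstride, unsigned width, unsigned hstride)
{
   assert(width != 0);
   return (ch / width) * vstride + (ch % width) * hstride;
}

/* Assumption from the thread: for F->DF on IVB/BYT, double number i is fed
 * by the channel pair (2i, 2i+1) and only the even channel is used.
 * Stores in out[i] the float element that ends up converted to double i.
 */
static void
floats_converted(unsigned vstride, unsigned width, unsigned hstride,
                 unsigned num_doubles, unsigned *out)
{
   for (unsigned i = 0; i < num_doubles; i++)
      out[i] = region_element(2 * i, vstride, width, hstride);
}
```

With an <8,8,1> region the model picks floats 0, 2, 4, 6, which matches the
apparent stride doubling. With <1,2,0> every float is read twice and the
model picks floats 0, 1, 2, 3, so no value is skipped.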

Oh, thanks a lot for this explanation! It helps us a lot to understand
how IvyBridge works :-)

Thanks for the patch, I will apply it to our -rc2 branch.

Sam

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 17/22] i965/vec4: fix register_coalesce() for partial writes

2017-01-16 Thread Samuel Iglesias Gonsálvez
On Fri, 2017-01-13 at 15:46 -0800, Matt Turner wrote:
> On Thu, Jan 5, 2017 at 5:07 AM, Samuel Iglesias Gonsálvez
>  wrote:
> > From: "Juan A. Suarez Romero" 
> > 
> > When lowering double_to_single() we added a final mov() that puts
> > 32-bit
> 
> I can't confirm that this patch is necessary in the current
> i965-fp64-gen7-ivb-scalar-vec4-rc2 branch. It passes Jenkins with it
> reverted.
> 

Right. We are going to run some tests locally and see if it is actually
needed.

Sam


Re: [Mesa-dev] [PATCH 14/22] i965/vec4: fix double_to_single() for IVB/VLV

2017-01-16 Thread Samuel Iglesias Gonsálvez
On Fri, 2017-01-13 at 14:40 -0800, Matt Turner wrote:
> On Thu, Jan 5, 2017 at 5:07 AM, Samuel Iglesias Gonsálvez
>  wrote:
> > From: "Juan A. Suarez Romero" 
> > 
> > In the generator we must generate slightly different code for
> > Ivybridge/Valleyview, because of the way the stride works in
> > this hardware.
> > ---
> >  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 26
> > +---
> >  1 file changed, 23 insertions(+), 3 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> > b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> > index 0eaa91b..a68e14c 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> > @@ -1936,13 +1936,28 @@ generate_code(struct brw_codegen *p,
> > 
> >   brw_set_default_access_mode(p, BRW_ALIGN_1);
> > 
> > - dst.hstride = BRW_HORIZONTAL_STRIDE_2;
> > + /* When converting from DF->F, we set destination's
> > stride as 2 as an
> > +  * aligment requirement. But in IVB/VLV, each DF
> > implicitly writes
> 
> Typo: alignment
> 
> > +  * two floats, being the first one the converted value.
> > So we don't
> > +  * need to explicitly set stride 2, but 1.
> > +  */
> > + if (devinfo->gen == 7 && !devinfo->is_haswell)
> > +dst.hstride = BRW_HORIZONTAL_STRIDE_1;
> > + else
> > +dst.hstride = BRW_HORIZONTAL_STRIDE_2;
> > +
> >   dst.width = BRW_WIDTH_4;
> >   src[0].vstride = BRW_VERTICAL_STRIDE_4;
> >   src[0].width = BRW_WIDTH_4;
> >   brw_MOV(p, dst, src[0]);
> > 
> >   struct brw_reg dst_as_src = dst;
> > + /* As we have set horizontal stride 1 instead of 2 in
> > IVB/VLV, we
> > +  * need to fix it here to have the expected value.
> > +  */
> > + if (devinfo->gen == 7 && !devinfo->is_haswell)
> > +dst_as_src.hstride = BRW_HORIZONTAL_STRIDE_2;
> > +
> >   dst.hstride = BRW_HORIZONTAL_STRIDE_1;
> >   dst.width = BRW_WIDTH_8;
> >   brw_MOV(p, dst, dst_as_src);
> > @@ -1965,8 +1980,13 @@ generate_code(struct brw_codegen *p,
> >   src[0].width = BRW_WIDTH_4;
> >   brw_MOV(p, tmp, src[0]);
> > 
> > - tmp.vstride = BRW_VERTICAL_STRIDE_8;
> > - tmp.hstride = BRW_HORIZONTAL_STRIDE_2;
> > + if (devinfo->gen == 7 && !devinfo->is_haswell) {
> > +tmp.vstride = BRW_VERTICAL_STRIDE_4;
> > +tmp.hstride = BRW_HORIZONTAL_STRIDE_1;
> > + } else {
> > +tmp.vstride = BRW_VERTICAL_STRIDE_8;
> > +tmp.hstride = BRW_HORIZONTAL_STRIDE_2;
> > + }
> 
> With the patch I sent to replace 09/22, there should be no changes
> needed to VEC4_OPCODE_TO_DOUBLE. :)
> 
> Please change double_to_single() to VEC4_OPCODE_FROM_DOUBLE in the
> title.
> 

OK, thanks!

Sam


[Mesa-dev] [PATCH v2] anv: increase ANV_MAX_STATE_SIZE_LOG2 limit to 1 MB

2017-01-16 Thread Samuel Iglesias Gonsálvez
Fixes crash in dEQP-VK.ubo.random.all_shared_buffer.48 due to a
fragment shader code bigger than 128 kB.

This patch increases the allocation size limit to 1 MB.

v2:
- Increase it to 1 MB (Jason)
- Increase device->instruction_block_pool allocation size in
  anv_device.c (Jason)

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/vulkan/anv_device.c  | 2 +-
 src/intel/vulkan/anv_private.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 6349537d172..f80a36a9400 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -948,7 +948,7 @@ VkResult anv_CreateDevice(
anv_state_pool_init(&device->dynamic_state_pool,
&device->dynamic_state_block_pool);
 
-   anv_block_pool_init(&device->instruction_block_pool, device, 128 * 1024);
+   anv_block_pool_init(&device->instruction_block_pool, device, 1024 * 1024);
anv_state_pool_init(&device->instruction_state_pool,
&device->instruction_block_pool);
 
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 17b72368819..75f2bde66a8 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -388,7 +388,7 @@ struct anv_fixed_size_state_pool {
 };
 
 #define ANV_MIN_STATE_SIZE_LOG2 6
-#define ANV_MAX_STATE_SIZE_LOG2 17
+#define ANV_MAX_STATE_SIZE_LOG2 20
 
 #define ANV_STATE_BUCKETS (ANV_MAX_STATE_SIZE_LOG2 - ANV_MIN_STATE_SIZE_LOG2 + 1)
 
-- 
2.11.0
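
As a quick check of the numbers in the patch above, the sketch below works
out what the LOG2 limits mean in bytes and in bucket count. The helper names
max_state_size() and state_buckets() are made up for this illustration; only
the values 6, 17 and 20 and the bucket formula come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as ANV_MIN_STATE_SIZE_LOG2 in the patch. */
#define MIN_STATE_SIZE_LOG2 6

/* Largest state size, in bytes, for a given ANV_MAX_STATE_SIZE_LOG2. */
static uint32_t
max_state_size(unsigned max_log2)
{
   assert(max_log2 < 32);
   return UINT32_C(1) << max_log2;
}

/* Mirrors the ANV_STATE_BUCKETS formula: one bucket per power of two
 * between the minimum and maximum size, both inclusive.
 */
static unsigned
state_buckets(unsigned max_log2)
{
   assert(max_log2 >= MIN_STATE_SIZE_LOG2);
   return max_log2 - MIN_STATE_SIZE_LOG2 + 1;
}
```

Going from 17 to 20 raises the limit from 128 * 1024 to 1024 * 1024 bytes and
the bucket count from 12 to 15.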



Re: [Mesa-dev] [PATCH] i965: Enable OpenGL 4.5 on Haswell.

2017-01-16 Thread Juan A. Suarez Romero
On Fri, 2017-01-13 at 22:53 -0800, Kenneth Graunke wrote:
> Everything is in place and the test results look solid.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/intel_extensions.c | 2 +-
>  src/mesa/drivers/dri/i965/intel_screen.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 


Awesome! 


J.A.

> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
> b/src/mesa/drivers/dri/i965/intel_extensions.c
> index 7a40ebaa424..b674b2f494c 100644
> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> @@ -137,7 +137,7 @@ intelInitExtensions(struct gl_context *ctx)
> if (brw->gen >= 8)
>ctx->Const.GLSLVersion = 450;
> else if (brw->is_haswell && can_do_pipelined_register_writes(brw->screen))
> -  ctx->Const.GLSLVersion = 420;
> +  ctx->Const.GLSLVersion = 450;
> else if (brw->gen >= 6)
>ctx->Const.GLSLVersion = 330;
> else
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> b/src/mesa/drivers/dri/i965/intel_screen.c
> index a8d401cdffa..6ae211da3a1 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -1539,7 +1539,7 @@ set_max_gl_versions(struct intel_screen *screen)
>break;
> case 7:
>dri_screen->max_gl_core_version = screen->devinfo.is_haswell &&
> - can_do_pipelined_register_writes(screen) ? 42 : 33;
> + can_do_pipelined_register_writes(screen) ? 45 : 33;
>dri_screen->max_gl_compat_version = 30;
>dri_screen->max_gl_es1_version = 11;
>dri_screen->max_gl_es2_version = screen->devinfo.is_haswell ? 31 : 30;


Re: [Mesa-dev] [PATCH 2/2] gallium/hud: disable queries during HUD draw calls

2017-01-16 Thread Nicolai Hähnle

For the series:

Reviewed-by: Nicolai Hähnle 

On 16.01.2017 02:59, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/auxiliary/hud/hud_context.c  | 10 ++
 src/gallium/auxiliary/hud/hud_driver_query.c | 18 +-
 src/gallium/auxiliary/hud/hud_private.h  |  2 ++
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index 9f067f3..fd9a7bc 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -655,20 +655,30 @@ hud_draw(struct hud_context *hud, struct pipe_resource 
*tex)
cso_set_rasterizer(cso, &hud->rasterizer_aa_lines);
LIST_FOR_EACH_ENTRY(pane, &hud->pane_list, head) {
   if (pane)
  hud_pane_draw_colored_objects(hud, pane);
}

cso_restore_state(cso);
cso_restore_constant_buffer_slot0(cso, PIPE_SHADER_VERTEX);

pipe_surface_reference(&surf, NULL);
+
+   /* Start queries. */
+   hud_batch_query_begin(hud->batch_query);
+
+   LIST_FOR_EACH_ENTRY(pane, &hud->pane_list, head) {
+  LIST_FOR_EACH_ENTRY(gr, &pane->graph_list, head) {
+ if (gr->begin_query)
+gr->begin_query(gr);
+  }
+   }
 }

 static void
 fixup_bytes(enum pipe_driver_query_type type, int position, uint64_t *exp10)
 {
if (type == PIPE_DRIVER_QUERY_TYPE_BYTES && position % 3 == 0)
   *exp10 = (*exp10 / 1000) * 1024;
 }

 /**
diff --git a/src/gallium/auxiliary/hud/hud_driver_query.c 
b/src/gallium/auxiliary/hud/hud_driver_query.c
index d80b8ed..6a97dbd 100644
--- a/src/gallium/auxiliary/hud/hud_driver_query.c
+++ b/src/gallium/auxiliary/hud/hud_driver_query.c
@@ -109,22 +109,29 @@ hud_batch_query_update(struct hud_batch_query_context *bq)
  bq->query_types);

   if (!bq->query[bq->head]) {
  fprintf(stderr,
  "gallium_hud: create_batch_query failed. You may have "
  "selected too many or incompatible queries.\n");
  bq->failed = TRUE;
  return;
   }
}
+}
+
+void
+hud_batch_query_begin(struct hud_batch_query_context *bq)
+{
+   if (!bq || bq->failed || !bq->query[bq->head])
+  return;

-   if (!pipe->begin_query(pipe, bq->query[bq->head])) {
+   if (!bq->pipe->begin_query(bq->pipe, bq->query[bq->head])) {
   fprintf(stderr,
   "gallium_hud: could not begin batch query. You may have "
   "selected too many or incompatible queries.\n");
   bq->failed = TRUE;
}
 }

 static boolean
 batch_query_add(struct hud_batch_query_context **pbq,
 struct pipe_context *pipe, unsigned query_type,
@@ -270,21 +277,29 @@ query_new_value_normal(struct query_info *info)
}
 }
 break;
  }
   }
}
else {
   /* initialize */
   info->query[info->head] = pipe->create_query(pipe, info->query_type, 0);
}
+}
+
+static void
+begin_query(struct hud_graph *gr)
+{
+   struct query_info *info = gr->query_data;
+   struct pipe_context *pipe = info->pipe;

+   assert(!info->batch);
if (info->query[info->head])
   pipe->begin_query(pipe, info->query[info->head]);
 }

 static void
 query_new_value(struct hud_graph *gr)
 {
struct query_info *info = gr->query_data;
uint64_t now = os_time_get();

@@ -367,20 +382,21 @@ hud_pipe_query_install(struct hud_batch_query_context 
**pbq,

info = gr->query_data;
info->pipe = pipe;
info->result_type = result_type;

if (flags & PIPE_DRIVER_QUERY_FLAG_BATCH) {
   if (!batch_query_add(pbq, pipe, query_type, &info->result_index))
  goto fail_info;
   info->batch = *pbq;
} else {
+  gr->begin_query = begin_query;
   info->query_type = query_type;
   info->result_index = result_index;
}

hud_graph_set_dump_file(gr);

hud_pane_add_graph(pane, gr);
pane->type = type; /* must be set before updating the max_value */

if (pane->max_value < max_value)
diff --git a/src/gallium/auxiliary/hud/hud_private.h 
b/src/gallium/auxiliary/hud/hud_private.h
index d719e5f..b23439e 100644
--- a/src/gallium/auxiliary/hud/hud_private.h
+++ b/src/gallium/auxiliary/hud/hud_private.h
@@ -34,20 +34,21 @@
 struct hud_graph {
/* initialized by common code */
struct list_head head;
struct hud_pane *pane;
float color[3];
float *vertices; /* ring buffer of vertices */

/* name and query */
char name[128];
void *query_data;
+   void (*begin_query)(struct hud_graph *gr);
void (*query_new_value)(struct hud_graph *gr);
void (*free_query_data)(void *ptr); /**< do not use ordinary free() */

/* mutable variables */
unsigned num_vertices;
unsigned index; /* vertex index being updated */
uint64_t current_value;
FILE *fd;
 };

@@ -96,20 +97,21 @@ void hud_pipe_query_install(struct hud_batch_query_context 
**pbq,
 struct hud_pane *pane, str

Re: [Mesa-dev] [PATCH 6/6] gallium/radeon: add GPU-shaders-busy HUD query

2017-01-16 Thread Nicolai Hähnle

This series is:

Reviewed-by: Nicolai Hähnle 

On 16.01.2017 03:00, Marek Olšák wrote:

From: Marek Olšák 

It should be close to the GPU load, but it can be much lower if something
is stalling shader execution (e.g. CP DMA).
---
 src/gallium/drivers/radeon/r600_gpu_load.c| 16 
 src/gallium/drivers/radeon/r600_pipe_common.h |  4 
 src/gallium/drivers/radeon/r600_query.c   | 11 ++-
 src/gallium/drivers/radeon/r600_query.h   |  1 +
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_gpu_load.c 
b/src/gallium/drivers/radeon/r600_gpu_load.c
index 764d9b5..e3488b3 100644
--- a/src/gallium/drivers/radeon/r600_gpu_load.c
+++ b/src/gallium/drivers/radeon/r600_gpu_load.c
@@ -35,29 +35,35 @@
  */

 #include "r600_pipe_common.h"
 #include "os/os_time.h"

 /* For good accuracy at 1000 fps or lower. This will be inaccurate for higher
  * fps (there are too few samples per frame). */
 #define SAMPLES_PER_SEC 1

 #define GRBM_STATUS       0x8010
+#define SPI_BUSY(x)       (((x) >> 22) & 0x1)
 #define GUI_ACTIVE(x)     (((x) >> 31) & 0x1)

 static void r600_update_grbm_counters(struct r600_common_screen *rscreen,
  union r600_grbm_counters *counters)
 {
uint32_t value = 0;

rscreen->ws->read_registers(rscreen->ws, GRBM_STATUS, 1, &value);

+   if (SPI_BUSY(value))
+   p_atomic_inc(&counters->named.spi_busy);
+   else
+   p_atomic_inc(&counters->named.spi_idle);
+
if (GUI_ACTIVE(value))
p_atomic_inc(&counters->named.gui_busy);
else
p_atomic_inc(&counters->named.gui_idle);
 }

 static PIPE_THREAD_ROUTINE(r600_gpu_load_thread, param)
 {
struct r600_common_screen *rscreen = (struct r600_common_screen*)param;
const int period_us = 100 / SAMPLES_PER_SEC;
@@ -137,19 +143,29 @@ static unsigned r600_end_counter(struct 
r600_common_screen *rscreen,

memset(&counters, 0, sizeof(counters));
r600_update_grbm_counters(rscreen, &counters);
return counters.array[busy_index] ? 100 : 0;
}
 }

 #define BUSY_INDEX(rscreen, field) (&rscreen->grbm_counters.named.field##_busy - \
                                     rscreen->grbm_counters.array)

+uint64_t r600_begin_counter_spi(struct r600_common_screen *rscreen)
+{
+   return r600_read_counter(rscreen, BUSY_INDEX(rscreen, spi));
+}
+
+unsigned r600_end_counter_spi(struct r600_common_screen *rscreen, uint64_t 
begin)
+{
+   return r600_end_counter(rscreen, begin, BUSY_INDEX(rscreen, spi));
+}
+
 uint64_t r600_begin_counter_gui(struct r600_common_screen *rscreen)
 {
return r600_read_counter(rscreen, BUSY_INDEX(rscreen, gui));
 }

 unsigned r600_end_counter_gui(struct r600_common_screen *rscreen, uint64_t 
begin)
 {
return r600_end_counter(rscreen, begin, BUSY_INDEX(rscreen, gui));
 }
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 9f69298..97e9441 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -347,20 +347,22 @@ struct r600_surface {
unsigned db_stencil_base;   /* EG and later */
unsigned db_stencil_info;   /* EG and later */
unsigned db_prefetch_limit; /* R600 only */
unsigned db_htile_surface;
unsigned db_htile_data_base;
unsigned db_preload_control;/* EG and later */
 };

 union r600_grbm_counters {
struct {
+   unsigned spi_busy;
+   unsigned spi_idle;
unsigned gui_busy;
unsigned gui_idle;
} named;
unsigned array[0];
 };

 struct r600_common_screen {
struct pipe_screen  b;
struct radeon_winsys*ws;
enum radeon_family  family;
@@ -739,20 +741,22 @@ struct pipe_resource *r600_resource_create_common(struct 
pipe_screen *screen,
 const char *r600_get_llvm_processor_name(enum radeon_family family);
 void r600_need_dma_space(struct r600_common_context *ctx, unsigned num_dw,
 struct r600_resource *dst, struct r600_resource *src);
 void radeon_save_cs(struct radeon_winsys *ws, struct radeon_winsys_cs *cs,
struct radeon_saved_cs *saved);
 void radeon_clear_saved_cs(struct radeon_saved_cs *saved);
 bool r600_check_device_reset(struct r600_common_context *rctx);

 /* r600_gpu_load.c */
 void r600_gpu_load_kill_thread(struct r600_common_screen *rscreen);
+uint64_t r600_begin_counter_spi(struct r600_common_screen *rscreen);
+unsigned r600_end_counter_spi(struct r600_common_screen *rscreen, uint64_t 
begin);
 uint64_t r600_begin_counter_gui(struct r600_common_screen *rscreen);
 unsigned r600_end_counter_gui(struct r600_common_screen *rscreen, uint64_t 
begin);

 /* r600_perfcounters.c */
 void r600_per
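
To illustrate how the busy/idle counters from the patch above could turn into
a percentage, here is a minimal sketch. The two bit macros are copied from the
patch. The struct busy_idle type, sample_status() and busy_percent() are
hypothetical, and the percentage formula is an assumption, since the part of
r600_end_counter() that computes it is not shown in this mail.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as defined in the patch. */
#define SPI_BUSY(x)   (((x) >> 22) & 0x1)
#define GUI_ACTIVE(x) (((x) >> 31) & 0x1)

/* Hypothetical pair of counters for one hardware block. */
struct busy_idle {
   unsigned busy;
   unsigned idle;
};

/* Classify one sampled GRBM_STATUS value for both blocks. */
static void
sample_status(uint32_t value, struct busy_idle *spi, struct busy_idle *gui)
{
   if (SPI_BUSY(value))
      spi->busy++;
   else
      spi->idle++;

   if (GUI_ACTIVE(value))
      gui->busy++;
   else
      gui->idle++;
}

/* Assumed formula: share of busy samples, in percent. */
static unsigned
busy_percent(const struct busy_idle *c)
{
   unsigned total = c->busy + c->idle;
   return total ? c->busy * 100 / total : 0;
}
```

If the GUI bit is set in three of four samples but the SPI bit in only one,
the sketch reports 75 for the GPU load and 25 for shaders busy, which is the
kind of gap the commit message attributes to stalled shader execution.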

[Mesa-dev] [PATCH 05/27] i965: Replace open coded with intel_miptree_get_image_offset()

2017-01-16 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_pixel_read.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c 
b/src/mesa/drivers/dri/i965/intel_pixel_read.c
index 2563897..ace94a0 100644
--- a/src/mesa/drivers/dri/i965/intel_pixel_read.c
+++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c
@@ -47,6 +47,19 @@
 
 #define FILE_DEBUG_FLAG DEBUG_PIXEL
 
+static void
+adjust_image_offset(const struct intel_renderbuffer *irb,
+int *xoffset, int *yoffset)
+{
+   unsigned x;
+   unsigned y;
+   intel_miptree_get_image_offset(irb->mt, irb->mt_level, irb->mt_layer,
+  &x, &y);
+
+   *xoffset += x;
+   *yoffset += y;
+}
+
 /**
  * \brief A fast path for glReadPixels
  *
@@ -153,8 +166,7 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx,
   return false;
}
 
-   xoffset += irb->mt->level[irb->mt_level].slice[irb->mt_layer].x_offset;
-   yoffset += irb->mt->level[irb->mt_level].slice[irb->mt_layer].y_offset;
+   adjust_image_offset(irb, &xoffset, &yoffset);
 
dst_pitch = _mesa_image_row_stride(pack, width, format, type);
 
-- 
2.5.5



[Mesa-dev] [PATCH 13/27] i965/gen6: Calculate hiz offset on demand

2017-01-16 Thread Topi Pohjolainen
This is kept in i965 on purpose. It can be moved to ISL if it
is needed in Vulkan.

Pointers to miptrees are given solely for verification purposes.
These will be dropped in following patches.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_tex_layout.c| 44 +++
 src/mesa/drivers/dri/i965/gen6_depth_state.c  | 18 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 11 +++
 3 files changed, 69 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
b/src/mesa/drivers/dri/i965/brw_tex_layout.c
index 80b341a..6f1c228 100644
--- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
+++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
@@ -353,6 +353,50 @@ brw_stencil_all_slices_at_each_lod_offset(const struct 
isl_surf *surf,
return level_y * two_rows_interleaved_pitch + level_x * 64;
 }
 
+uint32_t
+brw_get_mipmap_total_width(unsigned w0, unsigned num_levels, unsigned halign)
+{
+   /* If there is not level two, no adjustment is needed. */
+   if (num_levels < 2)
+  return ALIGN(w0, halign);
+
+   const uint32_t w1 = ALIGN(minify(w0, 1), halign);
+   const uint32_t w2 = minify(w0, 2);
+
+   /* Levels one and two sit side-by-side below level zero. Due to alignment
+* of level one levels one and two may require more space than level zero.
+*/
+   return ALIGN(MAX2(w0, w1 + w2), halign);
+}
+
+uint32_t
+brw_hiz_all_slices_at_each_lod_offset(
+   const struct isl_extent4d *phys_level0_sa,
+   enum isl_surf_dim dim, unsigned num_levels,
+   enum isl_format format,
+   const struct intel_mipmap_tree *mt,
+   unsigned level)
+{
+   assert(mt->array_layout == ALL_SLICES_AT_EACH_LOD);
+
+   const uint32_t cpp = isl_format_get_layout(format)->bpb / 8;
+   const uint32_t halign = 128 / cpp;
+   const uint32_t valign = 32;
+   const uint32_t level_x = all_slices_at_each_lod_x_offset(
+   phys_level0_sa->width, halign, level);
+   const uint32_t level_y = all_slices_at_each_lod_y_offset(
+   phys_level0_sa, dim, valign, level);
+   const uint32_t pitch = brw_get_mipmap_total_width(
+ phys_level0_sa->width, num_levels, halign) * cpp;
+
+   assert(level_x == mt->level[level].level_x);
+   assert(level_y == mt->level[level].level_y);
+   assert(pitch == mt->pitch);
+   assert(cpp == mt->cpp);
+
+   return level_y * pitch + level_x / halign * 4096;
+}
+
 static void
 brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
 {
diff --git a/src/mesa/drivers/dri/i965/gen6_depth_state.c 
b/src/mesa/drivers/dri/i965/gen6_depth_state.c
index 80cb890..05565de 100644
--- a/src/mesa/drivers/dri/i965/gen6_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_depth_state.c
@@ -165,10 +165,20 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw,
 
  assert(hiz_mt->array_layout == ALL_SLICES_AT_EACH_LOD);
 
- const uint32_t offset = intel_miptree_get_aligned_offset(
-hiz_mt,
-hiz_mt->level[lod].level_x,
-hiz_mt->level[lod].level_y);
+ struct isl_surf temp_surf;
+ intel_miptree_get_isl_surf(brw, mt, &temp_surf);
+
+ /* Main and hiz surfaces agree on the base level dimensions and
+  * format. Therefore one can calculate against the main surface.
+  */
+ const uint32_t offset = brw_hiz_all_slices_at_each_lod_offset(
+&temp_surf.phys_level0_sa, temp_surf.dim, temp_surf.levels,
+temp_surf.format, hiz_mt, lod);
+
+ assert(offset == intel_miptree_get_aligned_offset(
+ hiz_mt,
+ hiz_mt->level[lod].level_x,
+ hiz_mt->level[lod].level_y));
 
 BEGIN_BATCH(3);
 OUT_BATCH((_3DSTATE_HIER_DEPTH_BUFFER << 16) | (3 - 2));
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index e51872f..11c61c2 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -984,10 +984,21 @@ brw_miptree_get_vertical_slice_pitch(const struct 
brw_context *brw,
  unsigned level);
 
 uint32_t
+brw_get_mipmap_total_width(unsigned w0, unsigned num_levels, unsigned halign);
+
+uint32_t
 brw_stencil_all_slices_at_each_lod_offset(const struct isl_surf *surf,
   const struct intel_mipmap_tree *mt,
   uint32_t level);
 
+uint32_t
+brw_hiz_all_slices_at_each_lod_offset(
+   const struct isl_extent4d *phys_level0_sa, 
+   enum isl_surf_dim dim, unsigned num_levels,
+   enum isl_format format,
+   const struct intel_mipmap_tree *mt,
+   unsigned level);
+
 void
 brw_miptree_layout(struct brw_context *brw,
struct intel_mipmap_tree *mt,
-- 
2.5.5
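
The width calculation in brw_get_mipmap_total_width() above can be tried
outside the driver with the sketch below. It follows the logic of the patch,
but align_pot() and minify_dim() are local stand-ins written for this example,
on the assumption that Mesa's ALIGN() rounds up to a power-of-two alignment
and minify() shifts right and clamps to 1.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to a power-of-two alignment (stand-in for ALIGN()). */
static uint32_t
align_pot(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Size of a dimension at the given mip level (stand-in for minify()). */
static uint32_t
minify_dim(uint32_t value, unsigned levels)
{
   uint32_t v = value >> levels;
   return v > 1 ? v : 1;
}

/* Same steps as brw_get_mipmap_total_width() in the patch. */
static uint32_t
mipmap_total_width(uint32_t w0, unsigned num_levels, uint32_t halign)
{
   if (num_levels < 2)
      return align_pot(w0, halign);

   const uint32_t w1 = align_pot(minify_dim(w0, 1), halign);
   const uint32_t w2 = minify_dim(w0, 2);

   /* Levels one and two sit side by side below level zero. */
   const uint32_t widest = w0 > w1 + w2 ? w0 : w1 + w2;
   return align_pot(widest, halign);
}
```

For w0 = 16 and halign = 16, a single level needs a width of 16, while three
levels need 32, because the aligned level one (16) plus level two (4) is wider
than level zero.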


[Mesa-dev] [PATCH 11/27] i965/hiz/gen6: Stop setting false qpitch

2017-01-16 Thread Topi Pohjolainen
which is not applicable for "all slices at each lod". The current
logic leads one to believe it has some purpose. When the miptree
layout is calculated, brw_miptree_layout_texture_array() sets
the qpitch unconditionally but later ignores it altogether
for ALL_SLICES_AT_EACH_LOD.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 606d4c2..825c6a0 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1901,7 +1901,13 @@ intel_hiz_miptree_buf_create(struct brw_context *brw,
buf->aux_base.bo = buf->mt->bo;
buf->aux_base.size = buf->mt->total_height * buf->mt->pitch;
buf->aux_base.pitch = buf->mt->pitch;
-   buf->aux_base.qpitch = buf->mt->qpitch;
+
+   /* On gen6 hiz is unconditionally laid out packing all slices
+* at each level-of-detail (LOD). This means there is no valid qpitch
+* setting. In fact, this is ignored when hardware is setup - there is no
+* hardware qpitch setting of hiz on gen6.
+*/
+   buf->aux_base.qpitch = 0;
 
return buf;
 }
-- 
2.5.5



[Mesa-dev] [PATCH 02/27] i965/miptree: Remove redundant check for null texture

2017-01-16 Thread Topi Pohjolainen
The exact same check exists earlier in brw_miptree_layout(), which
intel_miptree_create_layout() in turn calls unconditionally.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 25f8f39..9488bec 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -628,13 +628,8 @@ miptree_create(struct brw_context *brw,
 first_level, last_level, width0,
 height0, depth0, num_samples,
 layout_flags);
-   /*
-* pitch == 0 || height == 0  indicates the null texture
-*/
-   if (!mt || !mt->total_width || !mt->total_height) {
-  intel_miptree_release(&mt);
+   if (!mt)
   return NULL;
-   }
 
if (mt->tiling == (I915_TILING_Y | I915_TILING_X))
   mt->tiling = I915_TILING_Y;
-- 
2.5.5



[Mesa-dev] [PATCH 03/27] i965: Remove check for hiz on earlier gens than SNB

2017-01-16 Thread Topi Pohjolainen
The only caller, brw_workaround_depthstencil_alignment(), returns
early for gen6+.

While at it, reduce scope for brw_get_depthstencil_tile_masks() as
well.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_context.h|  6 --
 src/mesa/drivers/dri/i965/brw_misc_state.c | 18 ++
 2 files changed, 2 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index ff3f861..4176853 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1279,12 +1279,6 @@ brw_meta_resolve_color(struct brw_context *brw,
 /*==
  * brw_misc_state.c
  */
-void brw_get_depthstencil_tile_masks(struct intel_mipmap_tree *depth_mt,
- uint32_t depth_level,
- uint32_t depth_layer,
- struct intel_mipmap_tree *stencil_mt,
- uint32_t *out_tile_mask_x,
- uint32_t *out_tile_mask_y);
 void brw_workaround_depthstencil_alignment(struct brw_context *brw,
GLbitfield clear_mask);
 
diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 40a8d07..616c0df 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -165,7 +165,7 @@ brw_depthbuffer_format(struct brw_context *brw)
  * packet.  If the 3 buffers don't agree on the drawing offset ANDed with this
  * mask, then we're in trouble.
  */
-void
+static void
 brw_get_depthstencil_tile_masks(struct intel_mipmap_tree *depth_mt,
 uint32_t depth_level,
 uint32_t depth_layer,
@@ -179,21 +179,7 @@ brw_get_depthstencil_tile_masks(struct intel_mipmap_tree 
*depth_mt,
   intel_get_tile_masks(depth_mt->tiling, depth_mt->tr_mode,
depth_mt->cpp,
&tile_mask_x, &tile_mask_y);
-
-  if (intel_miptree_level_has_hiz(depth_mt, depth_level)) {
- uint32_t hiz_tile_mask_x, hiz_tile_mask_y;
- intel_get_tile_masks(depth_mt->hiz_buf->mt->tiling,
-  depth_mt->hiz_buf->mt->tr_mode,
-  depth_mt->hiz_buf->mt->cpp,
-  &hiz_tile_mask_x,
-  &hiz_tile_mask_y);
-
- /* Each HiZ row represents 2 rows of pixels */
- hiz_tile_mask_y = hiz_tile_mask_y << 1 | 1;
-
- tile_mask_x |= hiz_tile_mask_x;
- tile_mask_y |= hiz_tile_mask_y;
-  }
+  assert(!intel_miptree_level_has_hiz(depth_mt, depth_level));
}
 
if (stencil_mt) {
-- 
2.5.5



[Mesa-dev] [PATCH 04/27] i965/gen6: Remove check for stencil format

2017-01-16 Thread Topi Pohjolainen
There is no alternative.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/gen6_depth_state.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_depth_state.c 
b/src/mesa/drivers/dri/i965/gen6_depth_state.c
index 3f14006..cb0ed25 100644
--- a/src/mesa/drivers/dri/i965/gen6_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_depth_state.c
@@ -191,20 +191,14 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw,
  uint32_t offset = 0;
 
  if (stencil_mt->array_layout == ALL_SLICES_AT_EACH_LOD) {
-if (stencil_mt->format == MESA_FORMAT_S_UINT8) {
-   /* Note: we can't compute the stencil offset using
-* intel_region_get_aligned_offset(), because stencil_region
-* claims that the region is untiled even though it's W tiled.
-*/
-   offset =
-  stencil_mt->level[lod].level_y * stencil_mt->pitch +
-  stencil_mt->level[lod].level_x * 64;
-} else {
-   offset = intel_miptree_get_aligned_offset(
-   stencil_mt,
-   stencil_mt->level[lod].level_x,
-   stencil_mt->level[lod].level_y);
-}
+assert(stencil_mt->format == MESA_FORMAT_S_UINT8);
+
+/* Note: we can't compute the stencil offset using
+ * intel_region_get_aligned_offset(), because stencil_region
+ * claims that the region is untiled even though it's W tiled.
+ */
+offset = stencil_mt->level[lod].level_y * stencil_mt->pitch +
+ stencil_mt->level[lod].level_x * 64;
  }
 
 BEGIN_BATCH(3);
-- 
2.5.5



[Mesa-dev] i965: Use ISL for auxiliary buffer layout

2017-01-16 Thread Topi Pohjolainen
This series is a step towards using ISL instead of current
intel_mipnap_tree/brw_tex_layout logic in i965. First 11 patches 
simplify the workspace for the rest and more functional changes.

The next seven patches introduce simple on-demand calculators for
stencil and hiz offsets on gen6. Gen6 hardware doesn't support
mipmaps for stencil or hiz, and the driver works around this by offsetting
the hiz/stencil surfaces to the desired level-of-detail. The current logic
relies on a pre-computed offset table, which is replaced by helper
functions that calculate the offsets when needed. This drops the
dependency on intel_mipmap_tree.
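
To illustrate what "calculating the offsets when needed" could look like, here is a minimal standalone sketch of the per-level x/y start position for a 2D array surface in the all-slices-at-each-LOD layout. It follows the loop added in patch 12, but the function names and the local minify()/align_up() helpers are only assumptions made for this example, not the driver's API, and the 3D case is left out.

```c
#include <assert.h>

/* Hypothetical stand-in for Mesa's minify(): halve per level, never below 1. */
static unsigned minify(unsigned value, unsigned levels)
{
   const unsigned v = value >> levels;
   return v > 0 ? v : 1;
}

/* Hypothetical stand-in for ALIGN(); alignment must be a power of two. */
static unsigned align_up(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Row at which `level` starts. Level 1 lies below level 0, level 2 sits
 * beside level 1, and levels 3+ are stacked below level 2.
 */
unsigned lod_y_offset(unsigned height0, unsigned array_len,
                      unsigned valign, unsigned level)
{
   unsigned y = 0;

   for (unsigned i = 1; i <= level; ++i) {
      if (i == 2)
         continue; /* level 2 shares its starting row with level 1 */

      /* Add the rows taken by all slices of the previous level. */
      y += array_len * align_up(minify(height0, i - 1), valign);
   }

   return y;
}

/* Column at which `level` starts: levels 2+ sit to the right of level 1. */
unsigned lod_x_offset(unsigned width0, unsigned halign, unsigned level)
{
   return level >= 2 ? align_up(minify(width0, 1), halign) : 0;
}
```

With a 256-row, two-slice surface and a vertical alignment of 64, levels 1 and 2 both start at row 512 and level 3 starts at row 640, without any stored table.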

Patch 22 prepares ISL to provide a large enough auxiliary size for
MCS on gen7 and gen8. I started seeing intermittent failures (about once
in 10 consecutive runs) with
"texelFetch fs sampler2DMSArray 4 98x1x9-98x129x9 -auto -fbo" on
IVB, HSW and BDW, but not on SKL. After some debugging I tracked
this down to SKL using a vertical alignment of four, which requires 4608
bytes (so the driver allocated two pages). Earlier gens used a vertical
alignment of one and only allocated one page. More details are in the
patch.
I included patch 21 for consistency's sake, as it seems to me we are
missing something of that sort.
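
The one-page versus two-page difference is plain page rounding. A small sketch, assuming 4096-byte pages (the helper name and the constant are made up for this example):

```c
#include <assert.h>

/* Assumed page size for this sketch only. */
#define SKETCH_PAGE_SIZE 4096ul

/* Number of whole pages needed to back `size_bytes` bytes. */
unsigned long pages_for_size(unsigned long size_bytes)
{
   assert(size_bytes > 0);
   return (size_bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

A 4608-byte surface spills into a second page, while anything up to 4096 bytes fits in one.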

Patch 23 starts using ISL for MCS and the rest for HIZ.


I tried to make the individual steps as small as possible, adding
temporary asserts that check that the newly added calculations match
the current ones.

Topi Pohjolainen (27):
  i965/meta: Remove unused brw_get_rb_for_slice()
  i965/miptree: Remove redundant check for null texture
  i965: Remove check for hiz on earlier gens than SNB
  i965/gen6: Remove check for stencil format
  i965: Replace open coded with intel_miptree_get_image_offset()
  i965/blorp/gen6: Simplify hiz surface setup
  i965/gen6: Simplify hiz surface setup
  i965/blorp/gen6: Remove dead code in hiz setup
  i965/gen6: Remove dead code in hiz surface setup
  i965/blorp/gen6: Drop unnecessary stencil/hiz surf dimension adjust
  i965/hiz/gen6: Stop setting false qpitch
  i965/gen6: Calculate stencil offset on demand
  i965/gen6: Calculate hiz offset on demand
  i965/blorp/gen6: Use on-demand stencil/hiz offset resolvers
  i965/gen6: Drop miptrees in depth/stencil offset resolvers
  i965/blorp/gen6: Set aux pitch directly
  i965/gen6/hiz: Add direct buffer size resolver
  i965/gen6: Allocate hiz directly without miptree
  i965/miptree: Refactor aux surface allocation
  i965/miptree: Refactor ISL aux usage resolver
  intel/isl/gen7: Add CCS alignment restrictions
  intel/isl: Apply render target alignment constraints for MCS
  i965/miptree: Use ISL for MCS layouts
  i965/miptree: Drop MIPTREE_LAYOUT_ACCELERATED_UPLOAD in mcs init
  i965/miptree/gen7+: Use ISL for HIZ layouts
  i965/blorp: Use hiz surface instead of creating copy
  i965: Use stored hiz surface instead of creating copy

 src/intel/isl/isl_gen7.c |  55 +++
 src/intel/isl/isl_gen8.c |  16 +
 src/intel/isl/isl_gen9.c |  16 +
 src/mesa/drivers/dri/i965/brw_blorp.c| 108 ++---
 src/mesa/drivers/dri/i965/brw_context.h  |   6 -
 src/mesa/drivers/dri/i965/brw_meta_util.c|  44 --
 src/mesa/drivers/dri/i965/brw_meta_util.h|   5 -
 src/mesa/drivers/dri/i965/brw_misc_state.c   |  23 +-
 src/mesa/drivers/dri/i965/brw_tex_layout.c   | 131 ++
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  13 +-
 src/mesa/drivers/dri/i965/gen6_depth_state.c |  45 +-
 src/mesa/drivers/dri/i965/gen7_misc_state.c  |   5 +-
 src/mesa/drivers/dri/i965/gen8_depth_state.c |   6 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 548 ++-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h|  45 +-
 src/mesa/drivers/dri/i965/intel_pixel_read.c |  16 +-
 16 files changed, 472 insertions(+), 610 deletions(-)

-- 
2.5.5



[Mesa-dev] [PATCH 01/27] i965/meta: Remove unused brw_get_rb_for_slice()

2017-01-16 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_meta_util.c | 44 ---
 src/mesa/drivers/dri/i965/brw_meta_util.h |  5 
 2 files changed, 49 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.c 
b/src/mesa/drivers/dri/i965/brw_meta_util.c
index 6d6b692..07a160f 100644
--- a/src/mesa/drivers/dri/i965/brw_meta_util.c
+++ b/src/mesa/drivers/dri/i965/brw_meta_util.c
@@ -267,50 +267,6 @@ brw_meta_mirror_clip_and_scissor(const struct gl_context 
*ctx,
 }
 
 /**
- * Creates a new named renderbuffer that wraps the first slice
- * of an existing miptree.
- *
- * Clobbers the current renderbuffer binding (ctx->CurrentRenderbuffer).
- */
-struct gl_renderbuffer *
-brw_get_rb_for_slice(struct brw_context *brw,
- struct intel_mipmap_tree *mt,
- unsigned level, unsigned layer, bool flat)
-{
-   struct gl_context *ctx = &brw->ctx;
-   struct gl_renderbuffer *rb = ctx->Driver.NewRenderbuffer(ctx, 0xDEADBEEF);
-   struct intel_renderbuffer *irb = intel_renderbuffer(rb);
-
-   rb->RefCount = 1;
-   rb->Format = mt->format;
-   rb->_BaseFormat = _mesa_get_format_base_format(mt->format);
-
-   /* Program takes care of msaa and mip-level access manually for stencil.
-* The surface is also treated as Y-tiled instead of as W-tiled calling for
-* twice the width and half the height in dimensions.
-*/
-   if (flat) {
-  const unsigned halign_stencil = 8;
-
-  rb->NumSamples = 0;
-  rb->Width = ALIGN(mt->total_width, halign_stencil) * 2;
-  rb->Height = (mt->total_height / mt->physical_depth0) / 2;
-  irb->mt_level = 0;
-   } else {
-  rb->NumSamples = mt->num_samples;
-  rb->Width = mt->logical_width0;
-  rb->Height = mt->logical_height0;
-  irb->mt_level = level;
-   }
-
-   irb->mt_layer = layer;
-
-   intel_miptree_reference(&irb->mt, mt);
-
-   return rb;
-}
-
-/**
  * Determine if fast color clear supports the given clear color.
  *
  * Fast color clear can only clear to color values of 1.0 or 0.0.  At the
diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.h 
b/src/mesa/drivers/dri/i965/brw_meta_util.h
index 93bc72c..207a54b 100644
--- a/src/mesa/drivers/dri/i965/brw_meta_util.h
+++ b/src/mesa/drivers/dri/i965/brw_meta_util.h
@@ -57,11 +57,6 @@ brw_is_color_fast_clear_compatible(struct brw_context *brw,
const struct intel_mipmap_tree *mt,
const union gl_color_union *color);
 
-struct gl_renderbuffer *brw_get_rb_for_slice(struct brw_context *brw,
- struct intel_mipmap_tree *mt,
- unsigned level, unsigned layer,
- bool flat);
-
 #ifdef __cplusplus
 }
 #endif
-- 
2.5.5



[Mesa-dev] [PATCH 12/27] i965/gen6: Calculate stencil offset on demand

2017-01-16 Thread Topi Pohjolainen
This is kept in i965 on purpose. It can be moved to ISL if it
is needed in Vulkan.

Pointers to miptrees are given solely for verification purposes.
These will be dropped in the following patches.
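
As a rough standalone illustration of the final byte-offset step in this patch: the level position is combined with half of the hardware row pitch, since the stencil buffer stores two rows interleaved. The function name and parameters below are assumptions for this sketch; the factor of 64 per x unit is taken from the existing offset calculation.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of a level, given its x/y position within the surface and
 * the row pitch programmed into the hardware (2x the width-based pitch).
 */
uint32_t stencil_lod_byte_offset(uint32_t level_x, uint32_t level_y,
                                 uint32_t hw_row_pitch)
{
   assert(hw_row_pitch % 2 == 0);

   /* Offsets are computed as if the rows were not interleaved. */
   const uint32_t interleaved_pitch = hw_row_pitch / 2;

   return level_y * interleaved_pitch + level_x * 64;
}
```

For example, a level at x = 64, y = 512 with a hardware pitch of 256 lands at byte offset 512 * 128 + 64 * 64 = 69632.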

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_tex_layout.c| 65 +++
 src/mesa/drivers/dri/i965/gen6_depth_state.c  | 14 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  5 +++
 3 files changed, 78 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
b/src/mesa/drivers/dri/i965/brw_tex_layout.c
index 768f8a8..80b341a 100644
--- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
+++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
@@ -288,6 +288,71 @@ gen9_miptree_layout_1d(struct intel_mipmap_tree *mt)
}
 }
 
+static unsigned
+all_slices_at_each_lod_x_offset(unsigned w0, unsigned align, unsigned level)
+{
+   const unsigned w = level >= 2 ? minify(w0, 1) : 0;
+   return ALIGN(w, align);
+}
+
+static unsigned
+all_slices_at_each_lod_y_offset(const struct isl_extent4d *phys_level0_sa,
+enum isl_surf_dim dim, unsigned align,
+unsigned level)
+{
+   unsigned y = 0;
+
+   /* Add vertical space taken by lower levels one by one. Levels one and two
+* are side-by-side just below level zero. Levels three and greater are
+* stacked one after another below level two.
+*/
+   for (unsigned i = 1; i <= level; ++i) {
+  const unsigned d = dim == ISL_SURF_DIM_3D ?
+ minify(phys_level0_sa->depth, i - 1) :
+ phys_level0_sa->array_len;
+
+  /* Level two sits beside level one and adds no vertical space. */
+  if (i != 2) {
+ const unsigned h = minify(phys_level0_sa->height, i - 1);
+ y += d * ALIGN(h, align);
+  }
+   }
+
+   return y;
+}
+
+uint32_t
+brw_stencil_all_slices_at_each_lod_offset(const struct isl_surf *surf,
+  const struct intel_mipmap_tree *mt,
+  unsigned level)
+{
+   assert(mt->array_layout == ALL_SLICES_AT_EACH_LOD);
+
+   const unsigned halign = 64;
+   const unsigned valign = 64;
+   const unsigned level_x = all_slices_at_each_lod_x_offset(
+  surf->phys_level0_sa.width, halign, level);
+   const unsigned level_y = all_slices_at_each_lod_y_offset(
+  &surf->phys_level0_sa, surf->dim, valign, level);
+
+   assert(level_x == mt->level[level].level_x);
+   assert(level_y == mt->level[level].level_y);
+
+   /* From Vol 2a, 11.5.6.2.1 3DSTATE_STENCIL_BUFFER, field "Surface Pitch":
+*The pitch must be set to 2x the value computed based on width, as
+*the stencil buffer is stored with two rows interleaved.
+*
+* While ISL surface stores the pitch expected by hardware, the offset
+* into individual slices needs to be calculated as if rows are
+* interleaved.
+*/
+   const unsigned two_rows_interleaved_pitch = surf->row_pitch / 2;
+
+   assert(two_rows_interleaved_pitch == mt->pitch);
+
+   return level_y * two_rows_interleaved_pitch + level_x * 64;
+}
+
 static void
 brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
 {
diff --git a/src/mesa/drivers/dri/i965/gen6_depth_state.c 
b/src/mesa/drivers/dri/i965/gen6_depth_state.c
index cda66e8..80cb890 100644
--- a/src/mesa/drivers/dri/i965/gen6_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_depth_state.c
@@ -189,15 +189,17 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw,
   if (separate_stencil) {
  uint32_t offset = 0;
 
+ struct isl_surf temp_surf;
+ intel_miptree_get_isl_surf(brw, stencil_mt, &temp_surf);
+
  if (stencil_mt->array_layout == ALL_SLICES_AT_EACH_LOD) {
 assert(stencil_mt->format == MESA_FORMAT_S_UINT8);
+offset = brw_stencil_all_slices_at_each_lod_offset(
+&temp_surf, stencil_mt, lod);
 
-/* Note: we can't compute the stencil offset using
- * intel_region_get_aligned_offset(), because stencil_region
- * claims that the region is untiled even though it's W tiled.
- */
-offset = stencil_mt->level[lod].level_y * stencil_mt->pitch +
- stencil_mt->level[lod].level_x * 64;
+assert(offset ==
+   stencil_mt->level[lod].level_y * stencil_mt->pitch +
+   stencil_mt->level[lod].level_x * 64);
  }
 
 BEGIN_BATCH(3);
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 476c46b..e51872f 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -983,6 +983,11 @@ brw_miptree_get_vertical_slice_pitch(const struct 
brw_context *brw,
  const struct intel_mipmap_tree *mt,
  unsigned level);
 
+uint32_t
+brw_

[Mesa-dev] [PATCH 08/27] i965/blorp/gen6: Remove dead code in hiz setup

2017-01-16 Thread Topi Pohjolainen
As the comment for intel_miptree_hiz_buffer::mt states, hiz_mt
only exists for gen6. In addition, intel_hiz_miptree_buf_create()
uses MIPTREE_LAYOUT_FORCE_ALL_SLICE_AT_LOD unconditionally.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 9c4b8fa..ecf27a1 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -250,15 +250,15 @@ blorp_surf_for_miptree(struct brw_context *brw,
 
  struct intel_mipmap_tree *hiz_mt = mt->hiz_buf->mt;
  if (hiz_mt) {
-if (brw->gen == 6 &&
-hiz_mt->array_layout == ALL_SLICES_AT_EACH_LOD) {
-   /* gen6 requires the HiZ buffer to be manually offset to the
-* right location.  We could fixup the surf but it doesn't
-* matter since most of those fields don't matter.
-*/
-   apply_gen6_stencil_hiz_offset(aux_surf, hiz_mt, *level,
- &surf->aux_addr.offset);
-}
+assert(brw->gen == 6 &&
+   hiz_mt->array_layout == ALL_SLICES_AT_EACH_LOD);
+
+/* gen6 requires the HiZ buffer to be manually offset to the
+ * right location.  We could fixup the surf but it doesn't
+ * matter since most of those fields don't matter.
+ */
+apply_gen6_stencil_hiz_offset(aux_surf, hiz_mt, *level,
+  &surf->aux_addr.offset);
 assert(hiz_mt->pitch == aux_surf->row_pitch);
  }
   }
-- 
2.5.5



[Mesa-dev] [PATCH 06/27] i965/blorp/gen6: Simplify hiz surface setup

2017-01-16 Thread Topi Pohjolainen
In intel_hiz_miptree_buf_create() intel_miptree_aux_buffer::bo
is unconditionally initialised to point to the same buffer
object as hiz_mt does. Also intel_miptree_aux_buffer::offset
is initialised to zero (calloc()).
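
The zero offset relies on calloc() returning zero-filled memory. A minimal sketch of that property, using a made-up struct rather than the real intel_miptree_aux_buffer:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for an aux buffer descriptor. */
struct sketch_aux_buffer {
   void *bo;
   uint32_t offset;
   uint32_t pitch;
};

/* Allocate a descriptor; calloc() zero-fills it, so the integer fields
 * start out as 0 without any explicit assignment.
 */
struct sketch_aux_buffer *sketch_aux_buffer_create(void)
{
   return calloc(1, sizeof(struct sketch_aux_buffer));
}

/* Read the offset of an already allocated descriptor. */
uint32_t sketch_aux_buffer_offset(const struct sketch_aux_buffer *buf)
{
   assert(buf != NULL);
   return buf->offset;
}
```

Because of this, an explicit `offset = 0` branch carries no extra information once the descriptor has been allocated this way.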

This will make the following patches significantly simpler to read.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 8d58616..9c4b8fa 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -244,9 +244,12 @@ blorp_surf_for_miptree(struct brw_context *brw,
  surf->aux_addr.offset = mt->mcs_buf->offset;
   } else {
  assert(surf->aux_usage == ISL_AUX_USAGE_HIZ);
+
+ surf->aux_addr.buffer = mt->hiz_buf->aux_base.bo;
+ surf->aux_addr.offset = mt->hiz_buf->aux_base.offset;
+
  struct intel_mipmap_tree *hiz_mt = mt->hiz_buf->mt;
  if (hiz_mt) {
-surf->aux_addr.buffer = hiz_mt->bo;
 if (brw->gen == 6 &&
 hiz_mt->array_layout == ALL_SLICES_AT_EACH_LOD) {
/* gen6 requires the HiZ buffer to be manually offset to the
@@ -255,13 +258,8 @@ blorp_surf_for_miptree(struct brw_context *brw,
 */
apply_gen6_stencil_hiz_offset(aux_surf, hiz_mt, *level,
  &surf->aux_addr.offset);
-} else {
-   surf->aux_addr.offset = 0;
 }
 assert(hiz_mt->pitch == aux_surf->row_pitch);
- } else {
-surf->aux_addr.buffer = mt->hiz_buf->aux_base.bo;
-surf->aux_addr.offset = mt->hiz_buf->aux_base.offset;
  }
   }
} else {
-- 
2.5.5



[Mesa-dev] [PATCH 16/27] i965/blorp/gen6: Set aux pitch directly

2017-01-16 Thread Topi Pohjolainen
This drops the dependency on intel_miptree_get_aux_isl_surf().

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 6cc8b2e..655a961 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -237,12 +237,6 @@ blorp_surf_for_miptree(struct brw_context *brw,
 
 /* gen6 requires the HiZ buffer to be manually offset to the
  * right location. 
- * In depth state setup only surf->aux_surf.row_pitch gets
- * consulted. Otherwise surf->aux_surf is ignored and there is
- * no need to adjust it.  See blorp_emit_depth_stencil_config().
- *
- * surf->aux_surf.row_pitch in turn is set by
- * intel_miptree_get_aux_isl_surf().
  */
 surf->aux_addr.offset = brw_hiz_all_slices_at_each_lod_offset(
&surf->surf->phys_level0_sa, surf->surf->dim,
@@ -253,7 +247,13 @@ blorp_surf_for_miptree(struct brw_context *brw,
   hiz_mt,
   hiz_mt->level[*level].level_x,
   hiz_mt->level[*level].level_y));
-assert(hiz_mt->pitch == aux_surf->row_pitch);
+assert(mt->hiz_buf->aux_base.pitch == hiz_mt->pitch);
+
+/* In depth state setup only surf->aux_surf.row_pitch gets
+ * consulted. Otherwise surf->aux_surf is ignored and there is
+ * no need to adjust it.  See blorp_emit_depth_stencil_config().
+ */
+aux_surf->row_pitch = mt->hiz_buf->aux_base.pitch;
  }
   }
} else {
-- 
2.5.5



[Mesa-dev] [PATCH 19/27] i965/miptree: Refactor aux surface allocation

2017-01-16 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 67 ---
 1 file changed, 41 insertions(+), 26 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 3914506..cdfd49d 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1509,6 +1509,43 @@ intel_mcs_miptree_buf_create(struct brw_context *brw,
return buf;
 }
 
+static struct intel_miptree_aux_buffer *
+intel_alloc_aux_buffer(struct brw_context *brw,
+   const char *name,
+   const struct isl_surf *main_surf,
+   const struct isl_surf *aux_surf,
+   uint32_t alloc_flags,
+   struct intel_mipmap_tree *mt)
+{
+   struct intel_miptree_aux_buffer *buf = calloc(sizeof(*buf), 1);
+   if (!buf)
+  return false;
+
+   buf->size = aux_surf->size;
+   buf->pitch = aux_surf->row_pitch;
+   buf->qpitch = isl_surf_get_array_pitch_sa_rows(aux_surf);
+
+   uint32_t tiling = I915_TILING_Y;
+   unsigned long pitch;
+
+   /* ISL has a stricter set of alignment rules than the drm allocator.
+* Therefore one can pass the ISL dimensions in terms of bytes instead of
+* trying to recalculate based on different format block sizes.
+*/
+   buf->bo = drm_intel_bo_alloc_tiled(brw->bufmgr, name,
+  buf->pitch, buf->size / buf->pitch,
+  1, &tiling, &pitch, alloc_flags);
+   if (buf->bo) {
+  assert(pitch == buf->pitch);
+  assert(tiling == I915_TILING_Y);
+   } else {
+  free(buf);
+  return false;
+   }
+
+   return buf;
+}
+
 static bool
 intel_miptree_alloc_mcs(struct brw_context *brw,
 struct intel_mipmap_tree *mt,
@@ -1590,14 +1627,6 @@ intel_miptree_alloc_non_msrt_mcs(struct brw_context *brw,
assert(temp_ccs_surf.size &&
   (temp_ccs_surf.size % temp_ccs_surf.row_pitch == 0));
 
-   struct intel_miptree_aux_buffer *buf = calloc(sizeof(*buf), 1);
-   if (!buf)
-  return false;
-
-   buf->size = temp_ccs_surf.size;
-   buf->pitch = temp_ccs_surf.row_pitch;
-   buf->qpitch = isl_surf_get_array_pitch_sa_rows(&temp_ccs_surf);
-
/* In case of compression mcs buffer needs to be initialised requiring the
 * buffer to be immediately mapped to cpu space for writing. Therefore do
 * not use the gpu access flag which can cause an unnecessary delay if the
@@ -1605,25 +1634,11 @@ intel_miptree_alloc_non_msrt_mcs(struct brw_context 
*brw,
 */
const uint32_t alloc_flags =
   is_lossless_compressed ? 0 : BO_ALLOC_FOR_RENDER;
-   uint32_t tiling = I915_TILING_Y;
-   unsigned long pitch;
-
-   /* ISL has stricter set of alignment rules then the drm allocator.
-* Therefore one can pass the ISL dimensions in terms of bytes instead of
-* trying to recalculate based on different format block sizes.
-*/
-   buf->bo = drm_intel_bo_alloc_tiled(brw->bufmgr, "ccs-miptree",
-  buf->pitch, buf->size / buf->pitch,
-  1, &tiling, &pitch, alloc_flags);
-   if (buf->bo) {
-  assert(pitch == buf->pitch);
-  assert(tiling == I915_TILING_Y);
-   } else {
-  free(buf);
+   mt->mcs_buf = intel_alloc_aux_buffer(brw, "ccs-miptree",
+&temp_main_surf, &temp_ccs_surf,
+alloc_flags, mt);
+   if (!mt->mcs_buf)
   return false;
-   }
-
-   mt->mcs_buf = buf;
 
/* From Gen9 onwards single-sampled (non-msrt) auxiliary buffers are
 * used for lossless compression which requires similar initialisation
-- 
2.5.5



[Mesa-dev] [PATCH 18/27] i965/gen6: Allocate hiz directly without miptree

2017-01-16 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 13 +---
 src/mesa/drivers/dri/i965/gen6_depth_state.c  |  8 -
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 43 +--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  5 
 4 files changed, 15 insertions(+), 54 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 655a961..72ead04 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -230,11 +230,7 @@ blorp_surf_for_miptree(struct brw_context *brw,
  surf->aux_addr.buffer = mt->hiz_buf->aux_base.bo;
  surf->aux_addr.offset = mt->hiz_buf->aux_base.offset;
 
- struct intel_mipmap_tree *hiz_mt = mt->hiz_buf->mt;
- if (hiz_mt) {
-assert(brw->gen == 6 &&
-   hiz_mt->array_layout == ALL_SLICES_AT_EACH_LOD);
-
+ if (brw->gen == 6) {
 /* gen6 requires the HiZ buffer to be manually offset to the
  * right location. 
  */
@@ -242,13 +238,6 @@ blorp_surf_for_miptree(struct brw_context *brw,
&surf->surf->phys_level0_sa, surf->surf->dim,
surf->surf->levels, surf->surf->format, *level);
 
-assert(surf->aux_addr.offset ==
-   intel_miptree_get_aligned_offset(
-  hiz_mt,
-  hiz_mt->level[*level].level_x,
-  hiz_mt->level[*level].level_y));
-assert(mt->hiz_buf->aux_base.pitch == hiz_mt->pitch);
-
 /* In depth state setup only surf->aux_surf.row_pitch gets
  * consulted. Otherwise surf->aux_surf is ignored and there is
  * no need to adjust it.  See blorp_emit_depth_stencil_config().
diff --git a/src/mesa/drivers/dri/i965/gen6_depth_state.c 
b/src/mesa/drivers/dri/i965/gen6_depth_state.c
index 78683b4..355e37b 100644
--- a/src/mesa/drivers/dri/i965/gen6_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_depth_state.c
@@ -161,9 +161,6 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw,
   /* Emit hiz buffer. */
   if (hiz) {
  assert(depth_mt);
- struct intel_mipmap_tree *hiz_mt = depth_mt->hiz_buf->mt;
-
- assert(hiz_mt->array_layout == ALL_SLICES_AT_EACH_LOD);
 
  struct isl_surf temp_surf;
  intel_miptree_get_isl_surf(brw, mt, &temp_surf);
@@ -175,11 +172,6 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw,
 &temp_surf.phys_level0_sa, temp_surf.dim, temp_surf.levels,
 temp_surf.format, lod);
 
- assert(offset == intel_miptree_get_aligned_offset(
- hiz_mt,
- hiz_mt->level[lod].level_x,
- hiz_mt->level[lod].level_y));
-
 BEGIN_BATCH(3);
 OUT_BATCH((_3DSTATE_HIER_DEPTH_BUFFER << 16) | (3 - 2));
 OUT_BATCH(depth_mt->hiz_buf->aux_base.pitch - 1);
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 8b40cfa..3914506 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -946,10 +946,7 @@ intel_miptree_hiz_buffer_free(struct 
intel_miptree_hiz_buffer *hiz_buf)
if (hiz_buf == NULL)
   return;
 
-   if (hiz_buf->mt)
-  intel_miptree_release(&hiz_buf->mt);
-   else
-  drm_intel_bo_unreference(hiz_buf->aux_base.bo);
+   drm_intel_bo_unreference(hiz_buf->aux_base.bo);
 
free(hiz_buf);
 }
@@ -1874,30 +1871,9 @@ intel_hiz_miptree_buf_create(struct brw_context *brw,
  struct intel_mipmap_tree *mt)
 {
struct intel_miptree_hiz_buffer *buf = calloc(sizeof(*buf), 1);
-   uint32_t layout_flags = MIPTREE_LAYOUT_ACCELERATED_UPLOAD;
-
-   if (brw->gen == 6)
-  layout_flags |= MIPTREE_LAYOUT_FORCE_ALL_SLICE_AT_LOD;
-
if (!buf)
   return NULL;
 
-   layout_flags |= MIPTREE_LAYOUT_TILING_ANY;
-   buf->mt = intel_miptree_create(brw,
-  mt->target,
-  mt->format,
-  mt->first_level,
-  mt->last_level,
-  mt->logical_width0,
-  mt->logical_height0,
-  mt->logical_depth0,
-  mt->num_samples,
-  layout_flags);
-   if (!buf->mt) {
-  free(buf);
-  return NULL;
-   }
-
const uint32_t format = translate_tex_format(brw, mt->format, false);
const unsigned cpp = isl_format_get_layout(format)->bpb / 8;
const unsigned halign = 128 / cpp;
@@ -1914,13 +1890,9 @@ intel_hiz_miptree_buf_create(struct brw_context *brw,
const unsigned total_w = brw_get_mipmap_total_width(
   phys_level0_sa.width, mt->last_level + 1, halign);
 
-   buf->aux_base.bo 

[Mesa-dev] [PATCH 17/27] i965/gen6/hiz: Add direct buffer size resolver

2017-01-16 Thread Topi Pohjolainen
The apparent hack of unconditionally adding two rows to cube
maps is taken directly from align_cube().
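
A minimal sketch of just that last step, assuming the total height in rows has already been computed. The function name and the local align_up() helper are hypothetical and exist only for this example:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for ALIGN(); alignment must be a power of two. */
static unsigned align_up(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Final row count: align the total height, then add the two padding
 * rows for cube maps.
 */
unsigned hiz_total_rows(unsigned total_h, unsigned valign, bool is_cube)
{
   const unsigned aligned = align_up(total_h, valign);
   return is_cube ? aligned + 2 : aligned;
}
```

With a vertical alignment of 32, a 100-row surface becomes 128 rows, or 130 rows for a cube map.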

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_tex_layout.c| 39 +++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 23 ++--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  5 
 3 files changed, 65 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
b/src/mesa/drivers/dri/i965/brw_tex_layout.c
index 0c1d952..12efc6a 100644
--- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
+++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
@@ -380,6 +380,45 @@ brw_hiz_all_slices_at_each_lod_offset(
return level_y * pitch + level_x / halign * 4096;
 }
 
+uint32_t
+gen6_get_hiz_total_height(const struct isl_extent4d *phys_level0_sa,
+  enum isl_surf_dim dim, isl_surf_usage_flags_t usage,
+  unsigned last_level)
+{
+   const unsigned valign = 32;
+   const unsigned second_level_y = all_slices_at_each_lod_y_offset(
+  phys_level0_sa, dim, valign, 1);
+
+   /* Second level would be just below first, and its start position is equal
+* to the aligned size needed for the first.
+*/
+   if (last_level == 0)
+  return second_level_y;
+
+   const unsigned last_level_y = all_slices_at_each_lod_y_offset(
+phys_level0_sa, dim, valign, last_level);
+   const unsigned second_level_h =
+  phys_level0_sa->array_len *
+  ALIGN(minify(phys_level0_sa->height, 1), valign);
+   const unsigned last_level_h =
+  phys_level0_sa->array_len *
+  ALIGN(minify(phys_level0_sa->height, last_level), valign);
+
+   /* Choose the taller of the two: end of the second or end of the last. */
+   const unsigned total_h = MAX2(second_level_y + second_level_h,
+ last_level_y + last_level_h);
+
+   /* The 965's sampler lays cachelines out according to how accesses
+* in the texture surfaces run, so they may be "vertical" through
+* memory.  As a result, the docs say in Surface Padding Requirements:
+* Sampling Engine Surfaces that two extra rows of padding are required.
+*/
+   if (usage & ISL_SURF_USAGE_CUBE_BIT)
+  return (ALIGN(total_h, valign) + 2);
+
+   return ALIGN(total_h, valign);
+}
+
 static void
 brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
 {
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 825c6a0..8b40cfa 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1898,9 +1898,28 @@ intel_hiz_miptree_buf_create(struct brw_context *brw,
   return NULL;
}
 
+   const uint32_t format = translate_tex_format(brw, mt->format, false);
+   const unsigned cpp = isl_format_get_layout(format)->bpb / 8;
+   const unsigned halign = 128 / cpp;
+   const enum isl_surf_dim dim = get_isl_surf_dim(mt->target);
+   const struct isl_extent4d phys_level0_sa = {
+ { mt->physical_width0 }, 
+ { mt->physical_height0 },  
+ { dim == ISL_SURF_DIM_3D ? mt->physical_depth0 : 1 },
+ { dim == ISL_SURF_DIM_3D ? 1 : mt->physical_depth0 } };
+   const isl_surf_usage_flags_t usage =
+  _mesa_is_cube_map_texture(mt->target) ? ISL_SURF_USAGE_CUBE_BIT : 0;
+   const unsigned total_h = gen6_get_hiz_total_height(
+  &phys_level0_sa, dim, usage, mt->last_level);
+   const unsigned total_w = brw_get_mipmap_total_width(
+  phys_level0_sa.width, mt->last_level + 1, halign);
+
buf->aux_base.bo = buf->mt->bo;
-   buf->aux_base.size = buf->mt->total_height * buf->mt->pitch;
-   buf->aux_base.pitch = buf->mt->pitch;
+   buf->aux_base.pitch = total_w * cpp;
+   buf->aux_base.size = total_h * buf->aux_base.pitch;
+
+   assert(buf->aux_base.pitch == buf->mt->pitch);
+   assert(buf->aux_base.size == buf->mt->total_height * buf->mt->pitch);
 
/* On gen6 hiz is unconditionally laid out packing all slices
 * at each level-of-detail (LOD). This means there is no valid qpitch
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index d77431a..152ae10 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -996,6 +996,11 @@ brw_hiz_all_slices_at_each_lod_offset(
enum isl_surf_dim dim, unsigned num_levels,
enum isl_format format, unsigned level);
 
+uint32_t
+gen6_get_hiz_total_height(const struct isl_extent4d *phys_level0_sa, 
+  enum isl_surf_dim dim, isl_surf_usage_flags_t usage,
+  unsigned last_level);
+
 void
 brw_miptree_layout(struct brw_context *brw,
struct intel_mipmap_tree *mt,
-- 
2.5.5


[Mesa-dev] [PATCH 24/27] i965/miptree: Drop MIPTREE_LAYOUT_ACCELERATED_UPLOAD in mcs init

2017-01-16 Thread Topi Pohjolainen
The flag is dropped because the buffers get unconditionally initialised
by CPU writes.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 9462e98..20b57d1 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1519,7 +1519,12 @@ intel_miptree_alloc_mcs(struct brw_context *brw,
assert(temp_mcs_surf.size &&
   (temp_mcs_surf.size % temp_mcs_surf.row_pitch == 0));
 
-   const uint32_t alloc_flags = BO_ALLOC_FOR_RENDER;
+   /* Buffer needs to be initialised requiring the buffer to be immediately
+* mapped to cpu space for writing. Therefore do not use the gpu access
+* flag which can cause an unnecessary delay if the backing pages happened
+* to be just used by the GPU.
+*/
+   const uint32_t alloc_flags = 0;
mt->mcs_buf = intel_alloc_aux_buffer(brw, "mcs-miptree",
 &temp_main_surf, &temp_mcs_surf,
 alloc_flags, mt);
-- 
2.5.5



[Mesa-dev] [PATCH 23/27] i965/miptree: Use ISL for MCS layouts

2017-01-16 Thread Topi Pohjolainen
This changes the size of the auxiliary buffer on gen7 and gen8
for arrayed surfaces. The current i965 logic uses a qpitch height per
slice, whereas ISL knows that MSAA surfaces are never mipmapped and
that a more compact layout is sufficient.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c|   6 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |   7 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 102 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h|   2 +
 4 files changed, 29 insertions(+), 88 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index e6bb120..e76a541 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -173,7 +173,11 @@ blorp_surf_for_miptree(struct brw_context *brw,
surf->aux_usage = intel_miptree_get_aux_isl_usage(brw, mt);
 
struct isl_surf *aux_surf = &tmp_surfs[1];
-   intel_miptree_get_aux_isl_surf(brw, mt, surf->aux_usage, aux_surf);
+
+   if (mt->mcs_buf)
+  *aux_surf = mt->mcs_buf->surf;
+   else
+  intel_miptree_get_aux_isl_surf(brw, mt, surf->aux_usage, aux_surf);
 
if (surf->aux_usage != ISL_AUX_USAGE_NONE) {
   if (surf->aux_usage == ISL_AUX_USAGE_HIZ) {
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index ef7347c..97ca600 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -140,14 +140,17 @@ brw_emit_surface_state(struct brw_context *brw,
if ((mt->mcs_buf || intel_miptree_sample_with_hiz(brw, mt)) &&
!(flags & INTEL_AUX_BUFFER_DISABLED)) {
   aux_usage = intel_miptree_get_aux_isl_usage(brw, mt);
-  intel_miptree_get_aux_isl_surf(brw, mt, aux_usage, &aux_surf_s);
-  aux_surf = &aux_surf_s;
 
   if (mt->mcs_buf) {
+ aux_surf = &mt->mcs_buf->surf;
+
  assert(mt->mcs_buf->offset == 0);
  aux_bo = mt->mcs_buf->bo;
  aux_offset = mt->mcs_buf->bo->offset64 + mt->mcs_buf->offset;
   } else {
+ intel_miptree_get_aux_isl_surf(brw, mt, aux_usage, &aux_surf_s);
+ aux_surf = &aux_surf_s;
+
  aux_bo = mt->hiz_buf->aux_base.bo;
  aux_offset = mt->hiz_buf->aux_base.bo->offset64;
   }
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 8643ef8..9462e98 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1460,56 +1460,6 @@ intel_miptree_init_mcs(struct brw_context *brw,
 }
 
 static struct intel_miptree_aux_buffer *
-intel_mcs_miptree_buf_create(struct brw_context *brw,
- struct intel_mipmap_tree *mt,
- mesa_format format,
- unsigned mcs_width,
- unsigned mcs_height,
- uint32_t layout_flags)
-{
-   struct intel_miptree_aux_buffer *buf = calloc(sizeof(*buf), 1);
-   struct intel_mipmap_tree *temp_mt;
-
-   if (!buf)
-  return NULL;
-
-   /* From the Ivy Bridge PRM, Vol4 Part1 p76, "MCS Base Address":
-*
-* "The MCS surface must be stored as Tile Y."
-*/
-   layout_flags |= MIPTREE_LAYOUT_TILING_Y;
-   temp_mt = miptree_create(brw,
-mt->target,
-format,
-mt->first_level,
-mt->last_level,
-mcs_width,
-mcs_height,
-mt->logical_depth0,
-0 /* num_samples */,
-layout_flags);
-   if (!temp_mt) {
-  free(buf);
-  return NULL;
-   }
-
-   buf->bo = temp_mt->bo;
-   buf->offset = temp_mt->offset;
-   buf->size = temp_mt->total_height * temp_mt->pitch;
-   buf->pitch = temp_mt->pitch;
-   buf->qpitch = temp_mt->qpitch;
-
-   /* Just hang on to the BO which backs the AUX buffer; the rest of the miptree
-* structure should go away. We use miptree create simply as a means to make
-* sure all the constraints for the buffer are satisfied.
-*/
-   drm_intel_bo_reference(temp_mt->bo);
-   intel_miptree_release(&temp_mt);
-
-   return buf;
-}
-
-static struct intel_miptree_aux_buffer *
 intel_alloc_aux_buffer(struct brw_context *brw,
const char *name,
const struct isl_surf *main_surf,
@@ -1543,6 +1493,8 @@ intel_alloc_aux_buffer(struct brw_context *brw,
   return false;
}
 
+   buf->surf = *aux_surf;
+
return buf;
 }
 
@@ -1555,42 +1507,22 @@ intel_miptree_alloc_mcs(struct brw_context *brw,
assert(mt->mcs_buf == NULL);
assert((mt->aux_disable & INTEL_AUX_DISABLE_MCS) == 0);
 
-   /* Choose the correct format for the MCS buffer.  All that really matters
-* is that we allocate the right buffer 

[Mesa-dev] [PATCH 07/27] i965/gen6: Simplify hiz surface setup

2017-01-16 Thread Topi Pohjolainen
In intel_hiz_miptree_buf_create() intel_miptree_aux_buffer::bo
is unconditionally initialised to point to the same buffer
object as hiz_mt does. The same goes for
intel_miptree_aux_buffer::pitch/qpitch.

This will make the following patches simpler to read.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_misc_state.c| 5 ++---
 src/mesa/drivers/dri/i965/gen6_depth_state.c  | 4 ++--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 ++---
 3 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 616c0df..af050a0 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -631,11 +631,10 @@ brw_emit_depth_stencil_hiz(struct brw_context *brw,
   /* Emit hiz buffer. */
   if (hiz) {
  assert(depth_mt);
- struct intel_mipmap_tree *hiz_mt = depth_mt->hiz_buf->mt;
 BEGIN_BATCH(3);
 OUT_BATCH((_3DSTATE_HIER_DEPTH_BUFFER << 16) | (3 - 2));
-OUT_BATCH(hiz_mt->pitch - 1);
-OUT_RELOC(hiz_mt->bo,
+OUT_BATCH(depth_mt->hiz_buf->aux_base.pitch - 1);
+OUT_RELOC(depth_mt->hiz_buf->aux_base.bo,
   I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
   brw->depthstencil.hiz_offset);
 ADVANCE_BATCH();
diff --git a/src/mesa/drivers/dri/i965/gen6_depth_state.c b/src/mesa/drivers/dri/i965/gen6_depth_state.c
index cb0ed25..0ff2407 100644
--- a/src/mesa/drivers/dri/i965/gen6_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_depth_state.c
@@ -173,8 +173,8 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw,
 
 BEGIN_BATCH(3);
 OUT_BATCH((_3DSTATE_HIER_DEPTH_BUFFER << 16) | (3 - 2));
-OUT_BATCH(hiz_mt->pitch - 1);
-OUT_RELOC(hiz_mt->bo,
+OUT_BATCH(depth_mt->hiz_buf->aux_base.pitch - 1);
+OUT_RELOC(depth_mt->hiz_buf->aux_base.bo,
   I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
   offset);
 ADVANCE_BATCH();
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 9488bec..606d4c2 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3423,13 +3423,8 @@ intel_miptree_get_aux_isl_surf(struct brw_context *brw,
  unreachable("Invalid MCS miptree");
   }
} else if (mt->hiz_buf) {
-  if (mt->hiz_buf->mt) {
- aux_pitch = mt->hiz_buf->mt->pitch;
- aux_qpitch = mt->hiz_buf->mt->qpitch;
-  } else {
- aux_pitch = mt->hiz_buf->aux_base.pitch;
- aux_qpitch = mt->hiz_buf->aux_base.qpitch;
-  }
+  aux_pitch = mt->hiz_buf->aux_base.pitch;
+  aux_qpitch = mt->hiz_buf->aux_base.qpitch;
 
   *usage = ISL_AUX_USAGE_HIZ;
} else {
-- 
2.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 26/27] i965/blorp: Use hiz surface instead of creating copy

2017-01-16 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 25 -
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  6 ++
 2 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index 8ecbd0e..f511683 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -114,7 +114,7 @@ blorp_surf_for_miptree(struct brw_context *brw,
uint32_t safe_aux_usage,
unsigned *level,
unsigned start_layer, unsigned num_layers,
-   struct isl_surf tmp_surfs[2])
+   struct isl_surf tmp_surfs[1])
 {
if (mt->msaa_layout == INTEL_MSAA_LAYOUT_UMS ||
mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS) {
@@ -172,12 +172,12 @@ blorp_surf_for_miptree(struct brw_context *brw,
 
surf->aux_usage = intel_miptree_get_aux_isl_usage(brw, mt);
 
-   struct isl_surf *aux_surf = &tmp_surfs[1];
+   struct isl_surf *aux_surf = NULL;
 
if (mt->mcs_buf)
-  *aux_surf = mt->mcs_buf->surf;
-   else
-  intel_miptree_get_aux_isl_surf(brw, mt, surf->aux_usage, aux_surf);
+  aux_surf = &mt->mcs_buf->surf;
+   else if (mt->hiz_buf)
+  aux_surf = &mt->hiz_buf->surf;
 
if (surf->aux_usage != ISL_AUX_USAGE_NONE) {
   if (surf->aux_usage == ISL_AUX_USAGE_HIZ) {
@@ -248,7 +248,6 @@ blorp_surf_for_miptree(struct brw_context *brw,
  * consulted. Otherwise surf->aux_surf is ignored and there is
  * no need to adjust it.  See blorp_emit_depth_stencil_config().
  */
-aux_surf->row_pitch = mt->hiz_buf->pitch;
  }
   }
} else {
@@ -389,12 +388,12 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
  (1 << ISL_AUX_USAGE_CCS_D);
}
 
-   struct isl_surf tmp_surfs[4];
+   struct isl_surf tmp_surfs[2];
struct blorp_surf src_surf, dst_surf;
blorp_surf_for_miptree(brw, &src_surf, src_mt, false, src_usage_flags,
   &src_level, src_layer, 1, &tmp_surfs[0]);
blorp_surf_for_miptree(brw, &dst_surf, dst_mt, true, dst_usage_flags,
-  &dst_level, dst_layer, 1, &tmp_surfs[2]);
+  &dst_level, dst_layer, 1, &tmp_surfs[1]);
 
struct isl_swizzle src_isl_swizzle = {
   .r = swizzle_to_scs(GET_SWZ(src_swizzle, 0)),
@@ -434,7 +433,7 @@ brw_blorp_copy_miptrees(struct brw_context *brw,
dst_mt->num_samples, _mesa_get_format_name(dst_mt->format), dst_mt,
dst_level, dst_layer, dst_x, dst_y);
 
-   struct isl_surf tmp_surfs[4];
+   struct isl_surf tmp_surfs[2];
struct blorp_surf src_surf, dst_surf;
blorp_surf_for_miptree(brw, &src_surf, src_mt, false,
   (1 << ISL_AUX_USAGE_MCS) |
@@ -443,7 +442,7 @@ brw_blorp_copy_miptrees(struct brw_context *brw,
blorp_surf_for_miptree(brw, &dst_surf, dst_mt, true,
   (1 << ISL_AUX_USAGE_MCS) |
   (1 << ISL_AUX_USAGE_CCS_E),
-  &dst_level, dst_layer, 1, &tmp_surfs[2]);
+  &dst_level, dst_layer, 1, &tmp_surfs[1]);
 
struct blorp_batch batch;
blorp_batch_init(&brw->blorp, &batch, brw, 0);
@@ -849,7 +848,7 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
const unsigned num_layers = fb->MaxNumLayers ? irb->layer_count : 1;
 
/* We can't setup the blorp_surf until we've allocated the MCS above */
-   struct isl_surf isl_tmp[2];
+   struct isl_surf isl_tmp[1];
struct blorp_surf surf;
unsigned level = irb->mt_level;
blorp_surf_for_miptree(brw, &surf, irb->mt, true,
@@ -936,7 +935,7 @@ brw_blorp_resolve_color(struct brw_context *brw, struct intel_mipmap_tree *mt,
 
const mesa_format format = _mesa_get_srgb_format_linear(mt->format);
 
-   struct isl_surf isl_tmp[2];
+   struct isl_surf isl_tmp[1];
struct blorp_surf surf;
blorp_surf_for_miptree(brw, &surf, mt, true,
   (1 << ISL_AUX_USAGE_CCS_E) |
@@ -970,7 +969,7 @@ gen6_blorp_hiz_exec(struct brw_context *brw, struct intel_mipmap_tree *mt,
 {
assert(intel_miptree_level_has_hiz(mt, level));
 
-   struct isl_surf isl_tmp[2];
+   struct isl_surf isl_tmp[1];
struct blorp_surf surf;
blorp_surf_for_miptree(brw, &surf, mt, true, (1 << ISL_AUX_USAGE_HIZ),
   &level, layer, 1, isl_tmp);
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 09afe92..95a674b 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1662,6 +1662,12 @@ intel_hiz_miptree_buf_create(struct brw_context *brw,
 */
buf->qpitch = 0;
 
+   /* Blorp depth state setup relies on ISL surface. Fortunately only
+* ::row_pitch gets consulted while the rest gets ign

[Mesa-dev] [PATCH 20/27] i965/miptree: Refactor ISL aux usage resolver

2017-01-16 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c|  4 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  3 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 47 +++-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h|  9 -
 4 files changed, 41 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index 72ead04..e6bb120 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -170,8 +170,10 @@ blorp_surf_for_miptree(struct brw_context *brw,
   *level = 0;
}
 
+   surf->aux_usage = intel_miptree_get_aux_isl_usage(brw, mt);
+
struct isl_surf *aux_surf = &tmp_surfs[1];
-   intel_miptree_get_aux_isl_surf(brw, mt, aux_surf, &surf->aux_usage);
+   intel_miptree_get_aux_isl_surf(brw, mt, surf->aux_usage, aux_surf);
 
if (surf->aux_usage != ISL_AUX_USAGE_NONE) {
   if (surf->aux_usage == ISL_AUX_USAGE_HIZ) {
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 4566696..ef7347c 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -139,7 +139,8 @@ brw_emit_surface_state(struct brw_context *brw,
enum isl_aux_usage aux_usage = ISL_AUX_USAGE_NONE;
if ((mt->mcs_buf || intel_miptree_sample_with_hiz(brw, mt)) &&
!(flags & INTEL_AUX_BUFFER_DISABLED)) {
-  intel_miptree_get_aux_isl_surf(brw, mt, &aux_surf_s, &aux_usage);
+  aux_usage = intel_miptree_get_aux_isl_usage(brw, mt);
+  intel_miptree_get_aux_isl_surf(brw, mt, aux_usage, &aux_surf_s);
   aux_surf = &aux_surf_s;
 
   if (mt->mcs_buf) {
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index cdfd49d..8643ef8 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3421,6 +3421,32 @@ intel_miptree_get_isl_surf(struct brw_context *brw,
   surf->usage |= ISL_SURF_USAGE_CUBE_BIT;
 }
 
+enum isl_aux_usage
+intel_miptree_get_aux_isl_usage(const struct brw_context *brw,
+const struct intel_mipmap_tree *mt)
+{
+   if (mt->hiz_buf)
+  return ISL_AUX_USAGE_HIZ;
+
+   if (!mt->mcs_buf)
+  return ISL_AUX_USAGE_NONE;
+
+   if (mt->num_samples > 1) {
+  assert(mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS);
+  return ISL_AUX_USAGE_MCS;
+   }
+
+   if (intel_miptree_is_lossless_compressed(brw, mt)) {
+  assert(brw->gen >= 9);
+  return ISL_AUX_USAGE_CCS_E;
+   }
+
+   if ((mt->aux_disable & INTEL_AUX_DISABLE_CCS) == 0)
+  return ISL_AUX_USAGE_CCS_D;
+
+   unreachable("Invalid MCS miptree");
+}
+
 /* WARNING: THE SURFACE CREATED BY THIS FUNCTION IS NOT COMPLETE AND CANNOT BE
  * USED FOR ANY REAL CALCULATIONS.  THE ONLY VALID USE OF SUCH A SURFACE IS TO
  * PASS IT INTO isl_surf_fill_state.
@@ -3428,32 +3454,17 @@ intel_miptree_get_isl_surf(struct brw_context *brw,
 void
 intel_miptree_get_aux_isl_surf(struct brw_context *brw,
const struct intel_mipmap_tree *mt,
-   struct isl_surf *surf,
-   enum isl_aux_usage *usage)
+   enum isl_aux_usage usage,
+   struct isl_surf *surf)
 {
uint32_t aux_pitch, aux_qpitch;
if (mt->mcs_buf) {
   aux_pitch = mt->mcs_buf->pitch;
   aux_qpitch = mt->mcs_buf->qpitch;
-
-  if (mt->num_samples > 1) {
- assert(mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS);
- *usage = ISL_AUX_USAGE_MCS;
-  } else if (intel_miptree_is_lossless_compressed(brw, mt)) {
- assert(brw->gen >= 9);
- *usage = ISL_AUX_USAGE_CCS_E;
-  } else if ((mt->aux_disable & INTEL_AUX_DISABLE_CCS) == 0) {
- *usage = ISL_AUX_USAGE_CCS_D;
-  } else {
- unreachable("Invalid MCS miptree");
-  }
} else if (mt->hiz_buf) {
   aux_pitch = mt->hiz_buf->aux_base.pitch;
   aux_qpitch = mt->hiz_buf->aux_base.qpitch;
-
-  *usage = ISL_AUX_USAGE_HIZ;
} else {
-  *usage = ISL_AUX_USAGE_NONE;
   return;
}
 
@@ -3461,7 +3472,7 @@ intel_miptree_get_aux_isl_surf(struct brw_context *brw,
intel_miptree_get_isl_surf(brw, mt, surf);
 
/* Figure out the format and tiling of the auxiliary surface */
-   switch (*usage) {
+   switch (usage) {
case ISL_AUX_USAGE_NONE:
   unreachable("Invalid auxiliary usage");
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 4c428cb..f3c8268 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -786,11 +786,16 @@ void
 intel_miptree_get_isl_surf(struct brw_context *brw,
const struct intel_mipmap_tree *mt,
   

[Mesa-dev] [PATCH 14/27] i965/blorp/gen6: Use on-demand stencil/hiz offset resolvers

2017-01-16 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index 2001cf3..632f5f3 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -160,8 +160,14 @@ blorp_surf_for_miptree(struct brw_context *brw,
* consulted. Otherwise surf is ignored and there is no need to adjust
* it any further.  See blorp_emit_depth_stencil_config().
*/
-  surf->addr.offset += (mt->level[*level].level_y * mt->pitch +
-mt->level[*level].level_x * 64);
+  surf->addr.offset += brw_stencil_all_slices_at_each_lod_offset(
+  surf->surf, mt, *level);
+
+  assert(brw_stencil_all_slices_at_each_lod_offset(
+surf->surf, mt, *level) ==
+ mt->level[*level].level_y * mt->pitch +
+ mt->level[*level].level_x * 64);
+
   *level = 0;
}
 
@@ -239,9 +245,14 @@ blorp_surf_for_miptree(struct brw_context *brw,
  * surf->aux_surf.row_pitch in turn is set by
  * intel_miptree_get_aux_isl_surf().
  */
-surf->aux_addr.offset = intel_miptree_get_aligned_offset(hiz_mt,
-   hiz_mt->level[*level].level_x,
-   hiz_mt->level[*level].level_y);
+surf->aux_addr.offset = brw_hiz_all_slices_at_each_lod_offset(
+   &surf->surf->phys_level0_sa, surf->surf->dim,
+   surf->surf->levels, surf->surf->format, hiz_mt, *level);
+assert(surf->aux_addr.offset ==
+   intel_miptree_get_aligned_offset(
+  hiz_mt,
+  hiz_mt->level[*level].level_x,
+  hiz_mt->level[*level].level_y));
 assert(hiz_mt->pitch == aux_surf->row_pitch);
  }
   }
-- 
2.5.5


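As a reading aid for the assertion added in PATCH 14/27 above, the legacy
expression that the new stencil resolver is checked against can be sketched
in isolation. This is only a minimal sketch, not driver code: struct
level_pos and legacy_stencil_offset() are hypothetical names, the pitch
argument plays the role of mt->pitch, and the multiplier of 64 is taken
as-is from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for mt->level[level].level_x / level_y. */
struct level_pos {
   uint32_t level_x;
   uint32_t level_y;
};

/* Mirrors the right-hand side of the assert in the patch:
 *
 *    mt->level[*level].level_y * mt->pitch +
 *    mt->level[*level].level_x * 64
 *
 * `pitch` stands in for mt->pitch.
 */
static uint32_t
legacy_stencil_offset(struct level_pos pos, uint32_t pitch)
{
   return pos.level_y * pitch + pos.level_x * 64;
}
```

With a pitch of 128, a level placed at (level_x = 2, level_y = 3) would
land 3 * 128 + 2 * 64 = 512 bytes into the buffer under this expression.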

[Mesa-dev] [PATCH 09/27] i965/gen6: Remove dead code in hiz surface setup

2017-01-16 Thread Topi Pohjolainen
In intel_hiz_miptree_buf_create() the miptree is unconditionally
created with MIPTREE_LAYOUT_FORCE_ALL_SLICE_AT_LOD.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/gen6_depth_state.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_depth_state.c b/src/mesa/drivers/dri/i965/gen6_depth_state.c
index 0ff2407..cda66e8 100644
--- a/src/mesa/drivers/dri/i965/gen6_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_depth_state.c
@@ -162,14 +162,13 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw,
   if (hiz) {
  assert(depth_mt);
  struct intel_mipmap_tree *hiz_mt = depth_mt->hiz_buf->mt;
- uint32_t offset = 0;
 
- if (hiz_mt->array_layout == ALL_SLICES_AT_EACH_LOD) {
-offset = intel_miptree_get_aligned_offset(
-hiz_mt,
-hiz_mt->level[lod].level_x,
-hiz_mt->level[lod].level_y);
- }
+ assert(hiz_mt->array_layout == ALL_SLICES_AT_EACH_LOD);
+
+ const uint32_t offset = intel_miptree_get_aligned_offset(
+hiz_mt,
+hiz_mt->level[lod].level_x,
+hiz_mt->level[lod].level_y);
 
 BEGIN_BATCH(3);
 OUT_BATCH((_3DSTATE_HIER_DEPTH_BUFFER << 16) | (3 - 2));
-- 
2.5.5



[Mesa-dev] [PATCH 27/27] i965: Use stored hiz surface instead of creating copy

2017-01-16 Thread Topi Pohjolainen
Now the last user of intel_miptree_get_aux_isl_surf() is gone.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  5 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 77 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h|  6 --
 3 files changed, 2 insertions(+), 86 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 160f16d..0711328 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -134,7 +134,7 @@ brw_emit_surface_state(struct brw_context *brw,
union isl_color_value clear_color = { .u32 = { 0, 0, 0, 0 } };
 
drm_intel_bo *aux_bo;
-   struct isl_surf *aux_surf = NULL, aux_surf_s;
+   struct isl_surf *aux_surf = NULL;
uint64_t aux_offset = 0;
enum isl_aux_usage aux_usage = ISL_AUX_USAGE_NONE;
if ((mt->mcs_buf || intel_miptree_sample_with_hiz(brw, mt)) &&
@@ -148,8 +148,7 @@ brw_emit_surface_state(struct brw_context *brw,
  aux_bo = mt->mcs_buf->bo;
  aux_offset = mt->mcs_buf->bo->offset64 + mt->mcs_buf->offset;
   } else {
- intel_miptree_get_aux_isl_surf(brw, mt, aux_usage, &aux_surf_s);
- aux_surf = &aux_surf_s;
+ aux_surf = &mt->hiz_buf->surf;
 
  aux_bo = mt->hiz_buf->bo;
  aux_offset = mt->hiz_buf->bo->offset64;
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 95a674b..4c51661 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3210,83 +3210,6 @@ intel_miptree_get_aux_isl_usage(const struct brw_context *brw,
unreachable("Invalid MCS miptree");
 }
 
-/* WARNING: THE SURFACE CREATED BY THIS FUNCTION IS NOT COMPLETE AND CANNOT BE
- * USED FOR ANY REAL CALCULATIONS.  THE ONLY VALID USE OF SUCH A SURFACE IS TO
- * PASS IT INTO isl_surf_fill_state.
- */
-void
-intel_miptree_get_aux_isl_surf(struct brw_context *brw,
-   const struct intel_mipmap_tree *mt,
-   enum isl_aux_usage usage,
-   struct isl_surf *surf)
-{
-   uint32_t aux_pitch, aux_qpitch;
-   if (mt->mcs_buf) {
-  aux_pitch = mt->mcs_buf->pitch;
-  aux_qpitch = mt->mcs_buf->qpitch;
-   } else if (mt->hiz_buf) {
-  aux_pitch = mt->hiz_buf->pitch;
-  aux_qpitch = mt->hiz_buf->qpitch;
-   } else {
-  return;
-   }
-
-   /* Start with a copy of the original surface. */
-   intel_miptree_get_isl_surf(brw, mt, surf);
-
-   /* Figure out the format and tiling of the auxiliary surface */
-   switch (usage) {
-   case ISL_AUX_USAGE_NONE:
-  unreachable("Invalid auxiliary usage");
-
-   case ISL_AUX_USAGE_HIZ:
-  isl_surf_get_hiz_surf(&brw->isl_dev, surf, surf);
-  break;
-
-   case ISL_AUX_USAGE_MCS:
-  /*
-   * From the SKL PRM:
-   *"When Auxiliary Surface Mode is set to AUX_CCS_D or AUX_CCS_E,
-   *HALIGN 16 must be used."
-   */
-  if (brw->gen >= 9)
- assert(mt->halign == 16);
-
-  isl_surf_get_mcs_surf(&brw->isl_dev, surf, surf);
-  break;
-
-   case ISL_AUX_USAGE_CCS_D:
-   case ISL_AUX_USAGE_CCS_E:
-  /*
-   * From the BDW PRM, Volume 2d, page 260 (RENDER_SURFACE_STATE):
-   *
-   *"When MCS is enabled for non-MSRT, HALIGN_16 must be used"
-   *
-   * From the hardware spec for GEN9:
-   *
-   *"When Auxiliary Surface Mode is set to AUX_CCS_D or AUX_CCS_E,
-   *HALIGN 16 must be used."
-   */
-  assert(mt->num_samples <= 1);
-  if (brw->gen >= 8)
- assert(mt->halign == 16);
-
-  isl_surf_get_ccs_surf(&brw->isl_dev, surf, surf);
-  break;
-   }
-
-   /* We want the pitch of the actual aux buffer. */
-   surf->row_pitch = aux_pitch;
-
-   /* Auxiliary surfaces in ISL have compressed formats and array_pitch_el_rows
-* is in elements.  This doesn't match intel_mipmap_tree::qpitch which is
-* in elements of the primary color surface so we have to divide by the
-* compression block height.
-*/
-   surf->array_pitch_el_rows =
-  aux_qpitch / isl_format_get_layout(surf->format)->bh;
-}
-
 union isl_color_value
 intel_miptree_get_isl_clear_color(struct brw_context *brw,
   const struct intel_mipmap_tree *mt)
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index b0fb3bb..3f82e3f 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -784,12 +784,6 @@ enum isl_aux_usage
 intel_miptree_get_aux_isl_usage(const struct brw_context *brw,
 const struct intel_mipmap_tree *mt);
 
-void
-intel_miptree_get_aux_isl_surf(struct brw_context *brw,
-   const struct intel_mipmap_tree *mt,

[Mesa-dev] [PATCH 15/27] i965/gen6: Drop miptrees in depth/stencil offset resolvers

2017-01-16 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c |  8 
 src/mesa/drivers/dri/i965/brw_tex_layout.c| 19 +--
 src/mesa/drivers/dri/i965/gen6_depth_state.c  |  4 ++--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  5 +
 4 files changed, 8 insertions(+), 28 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index 632f5f3..6cc8b2e 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -161,10 +161,9 @@ blorp_surf_for_miptree(struct brw_context *brw,
* it any further.  See blorp_emit_depth_stencil_config().
*/
   surf->addr.offset += brw_stencil_all_slices_at_each_lod_offset(
-  surf->surf, mt, *level);
+  surf->surf, *level);
 
-  assert(brw_stencil_all_slices_at_each_lod_offset(
-surf->surf, mt, *level) ==
+  assert(brw_stencil_all_slices_at_each_lod_offset(surf->surf, *level) ==
  mt->level[*level].level_y * mt->pitch +
  mt->level[*level].level_x * 64);
 
@@ -247,7 +246,8 @@ blorp_surf_for_miptree(struct brw_context *brw,
  */
 surf->aux_addr.offset = brw_hiz_all_slices_at_each_lod_offset(
&surf->surf->phys_level0_sa, surf->surf->dim,
-   surf->surf->levels, surf->surf->format, hiz_mt, *level);
+   surf->surf->levels, surf->surf->format, *level);
+
 assert(surf->aux_addr.offset ==
intel_miptree_get_aligned_offset(
   hiz_mt,
diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c b/src/mesa/drivers/dri/i965/brw_tex_layout.c
index 6f1c228..0c1d952 100644
--- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
+++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
@@ -323,11 +323,8 @@ all_slices_at_each_lod_y_offset(const struct isl_extent4d *phys_level0_sa,
 
 uint32_t
 brw_stencil_all_slices_at_each_lod_offset(const struct isl_surf *surf,
-  const struct intel_mipmap_tree *mt,
   unsigned level)
 {
-   assert(mt->array_layout == ALL_SLICES_AT_EACH_LOD);
-
const unsigned halign = 64;
const unsigned valign = 64;
const unsigned level_x = all_slices_at_each_lod_x_offset(
@@ -335,9 +332,6 @@ brw_stencil_all_slices_at_each_lod_offset(const struct isl_surf *surf,
const unsigned level_y = all_slices_at_each_lod_y_offset(
   &surf->phys_level0_sa, surf->dim, valign, level);
 
-   assert(level_x == mt->level[level].level_x);
-   assert(level_y == mt->level[level].level_y);
-
/* From Vol 2a, 11.5.6.2.1 3DSTATE_STENCIL_BUFFER, field "Surface Pitch":
 *The pitch must be set to 2x the value computed based on width, as
 *the stencil buffer is stored with two rows interleaved.
@@ -348,8 +342,6 @@ brw_stencil_all_slices_at_each_lod_offset(const struct isl_surf *surf,
 */
const unsigned two_rows_interleaved_pitch = surf->row_pitch / 2;
 
-   assert(two_rows_interleaved_pitch == mt->pitch);
-
return level_y * two_rows_interleaved_pitch + level_x * 64;
 }
 
@@ -373,12 +365,8 @@ uint32_t
 brw_hiz_all_slices_at_each_lod_offset(
const struct isl_extent4d *phys_level0_sa,
enum isl_surf_dim dim, unsigned num_levels,
-   enum isl_format format,
-   const struct intel_mipmap_tree *mt,
-   unsigned level)
+   enum isl_format format, unsigned level)
 {
-   assert(mt->array_layout == ALL_SLICES_AT_EACH_LOD);
-
const uint32_t cpp = isl_format_get_layout(format)->bpb / 8;
const uint32_t halign = 128 / cpp;
const uint32_t valign = 32;
@@ -389,11 +377,6 @@ brw_hiz_all_slices_at_each_lod_offset(
const uint32_t pitch = brw_get_mipmap_total_width(
  phys_level0_sa->width, num_levels, halign) * cpp;
 
-   assert(level_x == mt->level[level].level_x);
-   assert(level_y == mt->level[level].level_y);
-   assert(pitch == mt->pitch);
-   assert(cpp == mt->cpp);
-
return level_y * pitch + level_x / halign * 4096;
 }
 
diff --git a/src/mesa/drivers/dri/i965/gen6_depth_state.c b/src/mesa/drivers/dri/i965/gen6_depth_state.c
index 05565de..78683b4 100644
--- a/src/mesa/drivers/dri/i965/gen6_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_depth_state.c
@@ -173,7 +173,7 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw,
   */
  const uint32_t offset = brw_hiz_all_slices_at_each_lod_offset(
 &temp_surf.phys_level0_sa, temp_surf.dim, temp_surf.levels,
-temp_surf.format, hiz_mt, lod);
+temp_surf.format, lod);
 
  assert(offset == intel_miptree_get_aligned_offset(
  hiz_mt,
@@ -205,7 +205,7 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw,
  if (stencil_mt->array_layout == ALL_SLICES_AT_EACH_LOD) {
 assert(stencil_mt->format == MESA_FORMAT_S_UINT8);
 off

[Mesa-dev] [PATCH 10/27] i965/blorp/gen6: Drop unnecessary stencil/hiz surf dimension adjust

2017-01-16 Thread Topi Pohjolainen
Hardware state setup only needs offset and pitch and ignores the
rest.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 57 ---
 1 file changed, 20 insertions(+), 37 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index ecf27a1..2001cf3 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -107,36 +107,6 @@ brw_blorp_init(struct brw_context *brw)
 }
 
 static void
-apply_gen6_stencil_hiz_offset(struct isl_surf *surf,
-  struct intel_mipmap_tree *mt,
-  uint32_t lod,
-  uint32_t *offset)
-{
-   assert(mt->array_layout == ALL_SLICES_AT_EACH_LOD);
-
-   if (mt->format == MESA_FORMAT_S_UINT8) {
-  /* Note: we can't compute the stencil offset using
-   * intel_miptree_get_aligned_offset(), because the miptree
-   * claims that the region is untiled even though it's W tiled.
-   */
-  *offset = mt->level[lod].level_y * mt->pitch +
-mt->level[lod].level_x * 64;
-   } else {
-  *offset = intel_miptree_get_aligned_offset(mt,
- mt->level[lod].level_x,
- mt->level[lod].level_y);
-   }
-
-   surf->logical_level0_px.width = minify(surf->logical_level0_px.width, lod);
-   surf->logical_level0_px.height = minify(surf->logical_level0_px.height, lod);
-   surf->phys_level0_sa.width = minify(surf->phys_level0_sa.width, lod);
-   surf->phys_level0_sa.height = minify(surf->phys_level0_sa.height, lod);
-   surf->levels = 1;
-   surf->array_pitch_el_rows =
-  ALIGN(surf->phys_level0_sa.height, surf->image_alignment_el.height);
-}
-
-static void
 blorp_surf_for_miptree(struct brw_context *brw,
struct blorp_surf *surf,
struct intel_mipmap_tree *mt,
@@ -181,10 +151,17 @@ blorp_surf_for_miptree(struct brw_context *brw,
* hacks inside the i965 driver.
*
* See also gen6_depth_stencil_state.c
+   *
+   * Note: we can't compute the stencil offset using
+   * intel_miptree_get_aligned_offset(), because the miptree
+   * claims that the region is untiled even though it's W tiled.
+   *
+   * In stencil state setup only surf->row_pitch and surf->addr get
+   * consulted. Otherwise surf is ignored and there is no need to adjust
+   * it any further.  See blorp_emit_depth_stencil_config().
*/
-  uint32_t offset;
-  apply_gen6_stencil_hiz_offset(&tmp_surfs[0], mt, *level, &offset);
-  surf->addr.offset += offset;
+  surf->addr.offset += (mt->level[*level].level_y * mt->pitch +
+mt->level[*level].level_x * 64);
   *level = 0;
}
 
@@ -254,11 +231,17 @@ blorp_surf_for_miptree(struct brw_context *brw,
hiz_mt->array_layout == ALL_SLICES_AT_EACH_LOD);
 
 /* gen6 requires the HiZ buffer to be manually offset to the
- * right location.  We could fixup the surf but it doesn't
- * matter since most of those fields don't matter.
+ * right location. 
+ * In depth state setup only surf->aux_surf.row_pitch gets
+ * consulted. Otherwise surf->aux_surf is ignored and there is
+ * no need to adjust it.  See blorp_emit_depth_stencil_config().
+ *
+ * surf->aux_surf.row_pitch in turn is set by
+ * intel_miptree_get_aux_isl_surf().
  */
-apply_gen6_stencil_hiz_offset(aux_surf, hiz_mt, *level,
-  &surf->aux_addr.offset);
+surf->aux_addr.offset = intel_miptree_get_aligned_offset(hiz_mt,
+   hiz_mt->level[*level].level_x,
+   hiz_mt->level[*level].level_y);
 assert(hiz_mt->pitch == aux_surf->row_pitch);
  }
   }
-- 
2.5.5


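For readers following the HiZ side of this series: the closed-form offset
that PATCH 15/27 computes in brw_hiz_all_slices_at_each_lod_offset() ends
in the expression "level_y * pitch + level_x / halign * 4096", with
halign = 128 / cpp. The sketch below isolates just that last step. It is
not driver code: hiz_level_offset() is a hypothetical name, the caller is
assumed to already know level_x, level_y and pitch, and reading 4096 as
"one tile of 128 bytes by 32 rows" is an interpretation rather than
something the patches state.

```c
#include <stdint.h>

/* Final step of the HiZ offset calculation, as written in the patch:
 *
 *    return level_y * pitch + level_x / halign * 4096;
 *
 * level_x is divided by halign first (integer division), so every
 * halign elements in x advance the offset by 4096 bytes.
 */
static uint32_t
hiz_level_offset(uint32_t level_x, uint32_t level_y,
                 uint32_t pitch, uint32_t cpp)
{
   const uint32_t halign = 128 / cpp;

   return level_y * pitch + (level_x / halign) * 4096;
}
```

For cpp = 4 this gives halign = 32, so a level at (level_x = 64,
level_y = 32) with a pitch of 512 bytes resolves to
32 * 512 + 2 * 4096 = 24576.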

[Mesa-dev] [PATCH 22/27] intel/isl: Apply render target alignment constraints for MCS

2017-01-16 Thread Topi Pohjolainen
Otherwise the i965 driver will start experiencing failures with
"texelFetch fs sampler2DMSArray 4 98x1x9-98x129x9 -auto -fbo"
on IVB, HSW and BDW once it starts to use ISL to calculate the
layout.

The test is fine on SKL because one already hits:

  if (isl_format_is_compressed(info->format)) {
  /* On Gen9, the meaning of RENDER_SURFACE_STATE's
   * SurfaceHorizontalAlignment and SurfaceVerticalAlignment changed for
   * compressed formats. They now indicate a multiple of the compression
   * block.  For example, if the compression mode is ETC2 then HALIGN_4
   * indicates a horizontal alignment of 16 pixels.
   *
   * To avoid wasting memory, choose the smallest alignment possible:
   * HALIGN_4 and VALIGN_4.
   */
  *image_align_el = isl_extent3d(4, 4, 1);

in isl_gen9_choose_image_alignment_el().

However, the same fix is applied on Gen9 for consistency's sake.

Signed-off-by: Topi Pohjolainen 
---
 src/intel/isl/isl_gen7.c | 16 
 src/intel/isl/isl_gen8.c | 16 
 src/intel/isl/isl_gen9.c | 16 
 3 files changed, 48 insertions(+)

diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
index b1874c1..776d3bf 100644
--- a/src/intel/isl/isl_gen7.c
+++ b/src/intel/isl/isl_gen7.c
@@ -442,6 +442,22 @@ isl_gen7_choose_image_alignment_el(const struct isl_device *dev,
   }
}
 
+   if (fmtl->txc == ISL_TXC_MCS) {
+  assert(tiling == ISL_TILING_Y0);
+
+  /*
+   * Ivy Bridge PRM Vol 2, Part 1, "11.7 MCS Buffer for Render Target(s)":
+   *
+   * Height, width, and layout of MCS buffer in this case must match with
+   * Render Target height, width, and layout. MCS buffer is tiledY.
+   *
+   * To avoid wasting memory, choose the smallest alignment possible:
+   * HALIGN_4 and VALIGN_4.
+   */
+  *image_align_el = isl_extent3d(4, 4, 1);
+  return;
+   }
+
*image_align_el = (struct isl_extent3d) {
   .w = gen7_choose_halign_el(dev, info),
   .h = gen7_choose_valign_el(dev, info, tiling),
diff --git a/src/intel/isl/isl_gen8.c b/src/intel/isl/isl_gen8.c
index 81c69dc..c4f8f53 100644
--- a/src/intel/isl/isl_gen8.c
+++ b/src/intel/isl/isl_gen8.c
@@ -205,6 +205,22 @@ isl_gen8_choose_image_alignment_el(const struct isl_device *dev,
   return;
}
 
+   if (fmtl->txc == ISL_TXC_MCS) {
+  assert(tiling == ISL_TILING_Y0);
+
+  /*
+   * Broadwell PRM Vol 7, "MCS Buffer for Render Target(s)":
+   *
+   * Height, width, and layout of MCS buffer in this case must match with
+   * Render Target height, width, and layout. MCS buffer is tiledY. 
+   *
+   * To avoid wasting memory, choose the smallest alignment possible:
+   * HALIGN_4 and VALIGN_4.
+   */
+  *image_align_el = isl_extent3d(4, 4, 1);
+  return;
+   }
+
/* The below text from the Broadwell PRM provides some insight into the
 * hardware's requirements for LOD alignment.  From the Broadwell PRM >>
 * Volume 5: Memory Views >> Surface Layout >> 2D Surfaces:
diff --git a/src/intel/isl/isl_gen9.c b/src/intel/isl/isl_gen9.c
index e5d0f95..8709235 100644
--- a/src/intel/isl/isl_gen9.c
+++ b/src/intel/isl/isl_gen9.c
@@ -119,6 +119,22 @@ isl_gen9_choose_image_alignment_el(const struct isl_device 
*dev,
   return;
}
 
+   if (fmtl->txc == ISL_TXC_MCS) {
+  assert(tiling == ISL_TILING_Y0);
+
+  /*
+   * Skylake PRM Vol 7, "MCS Buffer for Render Target(s)":
+   *
+   * Height, width, and layout of MCS buffer in this case must match with
+   * Render Target height, width, and layout. MCS buffer is tiledY. 
+   *
+   * To avoid wasting memory, choose the smallest alignment possible:
+   * HALIGN_4 and VALIGN_4.
+   */
+  *image_align_el = isl_extent3d(4, 4, 1);
+  return;
+   }
+
/* This BSpec text provides some insight into the hardware's alignment
 * requirements [Skylake BSpec > Memory Views > Common Surface Formats >
 * Surface Layout and Tiling > 2D Surfaces]:
-- 
2.5.5
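
As a rough illustration of what the 4x4 element alignment chosen in this
patch means for layout, the sketch below rounds a level's dimensions up to a
given alignment. The type sketch_extent and the helpers sketch_align_u32() and
sketch_align_extent() are made up for this example and are not part of ISL.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an element extent; not the ISL type. */
struct sketch_extent {
   uint32_t w;
   uint32_t h;
};

/* Round v up to the next multiple of a, where a is a power of two. */
static uint32_t
sketch_align_u32(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Pad both dimensions of a level to the chosen image alignment. */
static struct sketch_extent
sketch_align_extent(struct sketch_extent size, struct sketch_extent align)
{
   struct sketch_extent padded = {
      .w = sketch_align_u32(size.w, align.w),
      .h = sketch_align_u32(size.h, align.h),
   };
   return padded;
}
```

With a 4x4 alignment, a 13x6 level would be padded to 16x8 elements, while a
level that is already a multiple of 4 in both directions is left unchanged.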

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 21/27] intel/isl/gen7: Add CCS alignment restrictions

2017-01-16 Thread Topi Pohjolainen
Gen8 and Gen9 are already more heavily constrained, as one
applies arrayed/mipmapped alignment even for non-arrayed and
non-mipmapped surfaces.

Signed-off-by: Topi Pohjolainen 
---
 src/intel/isl/isl_gen7.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
index 18687b5..b1874c1 100644
--- a/src/intel/isl/isl_gen7.c
+++ b/src/intel/isl/isl_gen7.c
@@ -403,6 +403,45 @@ isl_gen7_choose_image_alignment_el(const struct isl_device 
*dev,
/* IVB+ does not support combined depthstencil. */
assert(!isl_surf_usage_is_depth_and_stencil(info->usage));
 
+   const struct isl_format_layout *fmtl = isl_format_get_layout(info->format);
+   if (fmtl->txc == ISL_TXC_CCS) {
+  assert(tiling == ISL_TILING_CCS);
+
+  /*
+   * Ivybridge PRM Vol 2, Part 1, "11.7 MCS Buffer for Render Target(s)":
+   *
+   *  The following table describes the RT alignment
+   *                   Pixels   Lines
+   *  TiledY RT CL
+   *  bpp
+   *  32               8        4
+   *  64               4        4
+   *  128              2        4
+   */
+  switch (fmtl->format) {
+  case ISL_FORMAT_GEN7_CCS_32BPP_X:
+ *image_align_el = isl_extent3d(8, 4, 1);
+ return;
+  case ISL_FORMAT_GEN7_CCS_64BPP_X:
+ *image_align_el = isl_extent3d(4, 4, 1);
+ return;
+  case ISL_FORMAT_GEN7_CCS_128BPP_X:
+ *image_align_el = isl_extent3d(2, 4, 1);
+ return;
+  case ISL_FORMAT_GEN7_CCS_32BPP_Y:
+ *image_align_el = isl_extent3d(8, 4, 1);
+ return;
+  case ISL_FORMAT_GEN7_CCS_64BPP_Y:
+ *image_align_el = isl_extent3d(4, 4, 1);
+ return;
+  case ISL_FORMAT_GEN7_CCS_128BPP_Y:
+ *image_align_el = isl_extent3d(2, 4, 1);
+ return;
+  default:
+ unreachable("Invalid CCS format");
+  }
+   }
+
*image_align_el = (struct isl_extent3d) {
   .w = gen7_choose_halign_el(dev, info),
   .h = gen7_choose_valign_el(dev, info, tiling),
-- 
2.5.5
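
The three rows of the table quoted in this patch (32 bpp: 8x4, 64 bpp: 4x4,
128 bpp: 2x4) all span 256 bits per line, so the pixel alignment can also be
derived from the bits per pixel. The sketch below only restates those three
rows; the names sketch_align and sketch_ccs_align_for_bpp() are hypothetical,
and the 256-bit observation is read off the table rather than stated by the
PRM excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical alignment pair in pixels (w) and lines (h). */
struct sketch_align {
   uint32_t w;
   uint32_t h;
};

/*
 * Alignment for the three table rows:
 * 32 bpp -> 8x4, 64 bpp -> 4x4, 128 bpp -> 2x4.
 * Each row covers 256 bits per line, hence w = 256 / bpp.
 */
static struct sketch_align
sketch_ccs_align_for_bpp(uint32_t bpp)
{
   assert(bpp == 32 || bpp == 64 || bpp == 128);

   struct sketch_align a = { .w = 256 / bpp, .h = 4 };
   return a;
}
```

A switch over the formats, as in the patch, is more explicit; the division is
only meant to show the pattern behind the numbers.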



[Mesa-dev] [PATCH 25/27] i965/miptree/gen7+: Use ISL for HIZ layouts

2017-01-16 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c|   6 +-
 src/mesa/drivers/dri/i965/brw_misc_state.c   |   4 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |   4 +-
 src/mesa/drivers/dri/i965/gen6_depth_state.c |   4 +-
 src/mesa/drivers/dri/i965/gen7_misc_state.c  |   5 +-
 src/mesa/drivers/dri/i965/gen8_depth_state.c |   6 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 248 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h|  11 +-
 8 files changed, 49 insertions(+), 239 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index e76a541..8ecbd0e 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -233,8 +233,8 @@ blorp_surf_for_miptree(struct brw_context *brw,
   } else {
  assert(surf->aux_usage == ISL_AUX_USAGE_HIZ);
 
- surf->aux_addr.buffer = mt->hiz_buf->aux_base.bo;
- surf->aux_addr.offset = mt->hiz_buf->aux_base.offset;
+ surf->aux_addr.buffer = mt->hiz_buf->bo;
+ surf->aux_addr.offset = mt->hiz_buf->offset;
 
  if (brw->gen == 6) {
 /* gen6 requires the HiZ buffer to be manually offset to the
@@ -248,7 +248,7 @@ blorp_surf_for_miptree(struct brw_context *brw,
  * consulted. Otherwise surf->aux_surf is ignored and there is
  * no need to adjust it.  See blorp_emit_depth_stencil_config().
  */
-aux_surf->row_pitch = mt->hiz_buf->aux_base.pitch;
+aux_surf->row_pitch = mt->hiz_buf->pitch;
  }
   }
} else {
diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
b/src/mesa/drivers/dri/i965/brw_misc_state.c
index af050a0..08842d0 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -633,8 +633,8 @@ brw_emit_depth_stencil_hiz(struct brw_context *brw,
  assert(depth_mt);
 BEGIN_BATCH(3);
 OUT_BATCH((_3DSTATE_HIER_DEPTH_BUFFER << 16) | (3 - 2));
-OUT_BATCH(depth_mt->hiz_buf->aux_base.pitch - 1);
-OUT_RELOC(depth_mt->hiz_buf->aux_base.bo,
+OUT_BATCH(depth_mt->hiz_buf->pitch - 1);
+OUT_RELOC(depth_mt->hiz_buf->bo,
   I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
   brw->depthstencil.hiz_offset);
 ADVANCE_BATCH();
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 97ca600..160f16d 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -151,8 +151,8 @@ brw_emit_surface_state(struct brw_context *brw,
  intel_miptree_get_aux_isl_surf(brw, mt, aux_usage, &aux_surf_s);
  aux_surf = &aux_surf_s;
 
- aux_bo = mt->hiz_buf->aux_base.bo;
- aux_offset = mt->hiz_buf->aux_base.bo->offset64;
+ aux_bo = mt->hiz_buf->bo;
+ aux_offset = mt->hiz_buf->bo->offset64;
   }
 
   /* We only really need a clear color if we also have an auxiliary
diff --git a/src/mesa/drivers/dri/i965/gen6_depth_state.c 
b/src/mesa/drivers/dri/i965/gen6_depth_state.c
index 355e37b..692f07a 100644
--- a/src/mesa/drivers/dri/i965/gen6_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_depth_state.c
@@ -174,8 +174,8 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw,
 
 BEGIN_BATCH(3);
 OUT_BATCH((_3DSTATE_HIER_DEPTH_BUFFER << 16) | (3 - 2));
-OUT_BATCH(depth_mt->hiz_buf->aux_base.pitch - 1);
-OUT_RELOC(depth_mt->hiz_buf->aux_base.bo,
+OUT_BATCH(depth_mt->hiz_buf->pitch - 1);
+OUT_RELOC(depth_mt->hiz_buf->bo,
   I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
   offset);
 ADVANCE_BATCH();
diff --git a/src/mesa/drivers/dri/i965/gen7_misc_state.c 
b/src/mesa/drivers/dri/i965/gen7_misc_state.c
index af9be66..8e87222 100644
--- a/src/mesa/drivers/dri/i965/gen7_misc_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_misc_state.c
@@ -146,13 +146,12 @@ gen7_emit_depth_stencil_hiz(struct brw_context *brw,
   ADVANCE_BATCH();
} else {
   assert(depth_mt);
-  struct intel_miptree_hiz_buffer *hiz_buf = depth_mt->hiz_buf;
 
   BEGIN_BATCH(3);
   OUT_BATCH(GEN7_3DSTATE_HIER_DEPTH_BUFFER << 16 | (3 - 2));
   OUT_BATCH((mocs << 25) |
-(hiz_buf->aux_base.pitch - 1));
-  OUT_RELOC(hiz_buf->aux_base.bo,
+(depth_mt->hiz_buf->pitch - 1));
+  OUT_RELOC(depth_mt->hiz_buf->bo,
 I915_GEM_DOMAIN_RENDER,
 I915_GEM_DOMAIN_RENDER,
 0);
diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c 
b/src/mesa/drivers/dri/i965/gen8_depth_state.c
index 14689f4..7c9a698 100644
--- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
@@ -93,10 +93,10 @@ emit_depth_packets(struct brw_co

Re: [Mesa-dev] [PATCH 17/22] i965/vec4: fix register_coalesce() for partial writes

2017-01-16 Thread Samuel Iglesias Gonsálvez
On Mon, 2017-01-16 at 09:01 +0100, Samuel Iglesias Gonsálvez wrote:
> On Fri, 2017-01-13 at 15:46 -0800, Matt Turner wrote:
> > On Thu, Jan 5, 2017 at 5:07 AM, Samuel Iglesias Gonsálvez
> >  wrote:
> > > From: "Juan A. Suarez Romero" 
> > > 
> > > When lowering double_to_single() we added a final mov() that puts
> > > 32-bit
> > 
> > I can't confirm that this patch is necessary in the current
> > i965-fp64-gen7-ivb-scalar-vec4-rc2 branch. It passes Jenkins with
> > it
> > reverted.
> > 
> 
> Right. We are going to run some tests locally and see if it is
> actually
> needed.
> 

OK, it is not needed. I removed it locally.

BTW, there were no comments on patches 18 and 19. So, if you can take a
look at them to see if we need to fix them, it would be great. Our idea
is to send the v2 of the series tomorrow :-)

Thanks for all the reviews!

Sam

> Sam


[Mesa-dev] [Bug 97102] [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr

2017-01-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97102

--- Comment #12 from Jan Ziak <0xe2.0x9a.0...@gmail.com> ---
(In reply to Bruce Cherniak from comment #11)
> As Tim suggests, pruning empty nodes is probably the best solution for the
> crash.
> 
> For performance, however, I'm not sure how many cores to expose in your
> case.  cpuinfo shows that there are 4 threads across 2 cores, which we
> detect as 2 cores, with 2 hyperthreads.  Due to the way OpenSWR loads the
> processor, we have found that not using the hyperthreads as OpenSWR workers
> yields the best performance.  This may or may not be the case with your
> processor.
> 
> Something you can try is to set the environment variable
> KNOB_MAX_THREADS_PER_CORE=0.  This will allow OpenSWR to use all 4 threads.
> 
> Please report back on how this affects performance.

In terms of performance, an AMD dual-core x86 module is close to two separate
x86 cores:

- Kaveri/Steamroller module: 1 instruction fetch unit, 2 instruction decoders,
2 integer cores, 1 AVX core, 1 L1i cache, 2 L1d caches

- Two separate cores: 2 instruction {fetch,decode} units, 2 {integer,AVX} cores,
2 L1{i,d} caches

In my experience, the statement that an x86 module is close to 2 separate cores
is generally true. Many programs (gcc (make -j4), ...) scale nearly as well on
a module as they do on two separate x86 cores.



# export LIBGL_ALWAYS_SOFTWARE=1
# export GALLIUM_DRIVER=swr
# glxgears
350.080 FPS

# KNOB_MAX_THREADS_PER_CORE=0 glxgears
615.980 FPS



Unigine Sanctuary 1.6.3 1024x768_windowed:

Default: 0.166578 FPS
KNOB_MAX_THREADS_PER_CORE=0: 0.440662 FPS
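
For reference, the relative gain can be worked out from the numbers above.
The helper below is only a sketch; the name sketch_speedup() is made up for
this example.

```c
#include <assert.h>

/* Ratio of the frame rate with the knob set to the default frame rate. */
static double
sketch_speedup(double fps_default, double fps_with_knob)
{
   assert(fps_default > 0.0);
   return fps_with_knob / fps_default;
}
```

Called as sketch_speedup(350.080, 615.980), this gives roughly 1.76 for
glxgears, and sketch_speedup(0.166578, 0.440662) gives roughly 2.65 for
Unigine Sanctuary.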

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] mesa/main: Fix FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE for NONE attachment type

2017-01-16 Thread Alejandro Piñeiro
On 16/01/17 05:13, Iago Toral wrote:
> On Fri, 2017-01-13 at 12:15 -0200, Alejandro Piñeiro wrote:
>> When the attachment type is NONE (att->Type),
>> FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE should be NONE too.
>>
>> Note that technically, the current behaviour follows the spec. From
>> OpenGL 4.5 spec, Section 9.2.3 "Framebuffer Object Queries":
>>
>>"If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE, then
>> either no framebuffer is bound to target; or the default
>> framebuffer is bound, attachment is DEPTH or STENCIL, and the
>> number of depth or stencil bits, respectively, is zero."
>>
>> Reading this paragraph literally, for the default framebuffer, NONE
>> should only be returned if attachment is DEPTH or STENCIL and the
>> corresponding buffer has not been allocated.
>>
>> But it doesn't make much sense to return FRAMEBUFFER_DEFAULT if
>> the attachment type is NONE. For example, this can happen if the
>> attachment is FRONT_RIGHT in monoscopic mode, as that attachment
>> is only available in stereo mode.
> Makes sense to me, assuming this is not causing regressions anywhere:

I will run a full CI check before pushing, just in case.

> Reviewed-by: Iago Toral Quiroga 

Thanks!

>
> That said, we should file a bug with Khronos so they fix the text
> accordingly.

Yes, that was my idea. I wanted a second opinion first.

>> With the current behaviour, defensive querying of the object type
>> would not work properly. So you could query the object type checking
>> for NONE, get FRAMEBUFFER_DEFAULT, and then get an INVALID_OPERATION
>> when requesting other pnames (like RED_SIZE), as the real attachment
>> type is NONE.
>>  
>> This fixes:
>> GL45-CTS.direct_state_access.framebuffers_get_attachment_parameters
>> ---
>>
>> CCing Iago Toral as he implemented the current OBJECT_TYPE computation
>> in commit cf4399.
>>
>> FWIW, that commit mentions the following test:
>> dEQP-GLES3.functional.state_query.fbo.framebuffer_attachment_x_size_initial
>>
>> And I confirmed that the test doesn't regress with this change.
>>
>>  src/mesa/main/fbobject.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
>> index 044bd63..7c1b035 100644
>> --- a/src/mesa/main/fbobject.c
>> +++ b/src/mesa/main/fbobject.c
>> @@ -3756,7 +3756,7 @@
>> _mesa_get_framebuffer_attachment_parameter(struct gl_context *ctx,
>> *  stencil bits, respectively, is zero."
>> */
>>*params = (_mesa_is_winsys_fbo(buffer) &&
>> - ((attachment != GL_DEPTH && attachment !=
>> GL_STENCIL) ||
>> + ((attachment != GL_DEPTH && attachment !=
>> GL_STENCIL) &&
>>(att->Type != GL_NONE)))
>>   ? GL_FRAMEBUFFER_DEFAULT : att->Type;
>>return;
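
To make the effect of replacing || with && in the quoted hunk easier to see,
here is a minimal standalone model of the patched condition. The enum values
and the function sketch_object_type() are placeholders invented for this
sketch; they are not the GL enums or the Mesa function.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder values for this sketch; these are not the GL enum values. */
enum sketch_type {
   SKETCH_NONE,
   SKETCH_FRAMEBUFFER_DEFAULT,
   SKETCH_RENDERBUFFER,
};

/*
 * Model of the patched expression: FRAMEBUFFER_DEFAULT is reported only for
 * a window-system framebuffer whose attachment is neither DEPTH nor STENCIL
 * and whose attachment type is not NONE. Otherwise the attachment type
 * itself is reported.
 */
static enum sketch_type
sketch_object_type(bool is_winsys_fbo, bool is_depth_or_stencil,
                   enum sketch_type att_type)
{
   if (is_winsys_fbo && !is_depth_or_stencil && att_type != SKETCH_NONE)
      return SKETCH_FRAMEBUFFER_DEFAULT;

   return att_type;
}
```

In this model, a FRONT_RIGHT attachment with type NONE on a monoscopic
window-system framebuffer reports NONE, whereas the previous || form would
have reported FRAMEBUFFER_DEFAULT for it.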



Re: [Mesa-dev] [PATCH] nvc0: allow TK1 (NVEA) queries to work

2017-01-16 Thread Pierre Moreau
On 07:13 pm - Jan 15 2017, Samuel Pitoiset wrote:
> 
> 
> On 01/14/2017 02:35 AM, Ilia Mirkin wrote:
> > The NVEA 3D class is numerically larger than the NVF0 3D class. The TK1
> > chip uses the SM35 ISA and likely has the same hw counters. Allow these
> > to be used like on all the other supported chips.
> 
> This actually needs more testing. Perf counters are pretty different for
> each generation. The kernel used for reading the counters will work though,
> but the configuration has to be double checked.
> 
> More comments inline.
> 
> > 
> > Signed-off-by: Ilia Mirkin 
> > ---
> >  src/gallium/drivers/nouveau/nvc0/nvc0_query.c |  4 ++--
> >  .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c   |  3 +++
> >  src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c   | 19 
> > ++-
> >  3 files changed, 15 insertions(+), 11 deletions(-)
> > 
> > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c 
> > b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
> > index 8b9e6b6..6bf2285 100644
> > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
> > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
> > @@ -205,7 +205,7 @@ nvc0_screen_get_driver_query_group_info(struct 
> > pipe_screen *pscreen,
> > 
> > if (screen->base.drm->version >= 0x01000101) {
> >if (screen->compute) {
> > - if (screen->base.class_3d <= NVF0_3D_CLASS) {
> > + if (screen->base.class_3d < GM107_3D_CLASS) {
> >  count += 2;
> >   }
> >}
> > @@ -229,7 +229,7 @@ nvc0_screen_get_driver_query_group_info(struct 
> > pipe_screen *pscreen,
> > } else
> > if (id == NVC0_HW_METRIC_QUERY_GROUP) {
> >if (screen->compute) {
> > -  if (screen->base.class_3d <= NVF0_3D_CLASS) {
> > +  if (screen->base.class_3d < GM107_3D_CLASS) {
> 
> Oops, I forgot to expose these groups when I added Maxwell support. These
> groups are only used for AMD_performance_monitor. Presumably this ext
> currently doesn't expose the counters on Maxwell.
> 
> We should enable them in a separate patch just before this one (for
> maxwell).

Do you need me to test anything on Maxwell?

Pierre
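
For anyone following along, the reason the comparison had to change can be
shown in a few lines. The class values below are assumptions made for this
sketch and should be checked against the nouveau headers; the only fact taken
from the patch description is that the NVEA class is numerically larger than
the NVF0 class. The sketch_ prefixed names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Class values assumed for illustration; check the nouveau headers. */
#define SKETCH_NVE4_3D_CLASS  0xa097u
#define SKETCH_NVF0_3D_CLASS  0xa197u
#define SKETCH_NVEA_3D_CLASS  0xa297u
#define SKETCH_GM107_3D_CLASS 0xb097u

/* Check used before the patch: misses NVEA because NVEA > NVF0. */
static bool
sketch_old_check(uint32_t class_3d)
{
   return class_3d <= SKETCH_NVF0_3D_CLASS;
}

/* Check used by the patch: everything below Maxwell, NVEA included. */
static bool
sketch_new_check(uint32_t class_3d)
{
   return class_3d < SKETCH_GM107_3D_CLASS;
}
```

Under these assumed values, sketch_old_check() rejects the TK1 class while
sketch_new_check() accepts it, and both still accept NVE4 and NVF0.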

> 
> Otherwise, looks good.
> 
> >  info->name = "Performance metrics";
> >  info->max_active_queries = 4; /* A metric uses at least 2 
> > queries */
> >  info->num_queries = nvc0_hw_metric_get_num_queries(screen);
> > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c 
> > b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
> > index 089af61..494f2dd 100644
> > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
> > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
> > @@ -403,6 +403,7 @@ nvc0_hw_metric_get_queries(struct nvc0_screen *screen)
> > case GM200_3D_CLASS:
> > case GM107_3D_CLASS:
> >return sm50_hw_metric_queries;
> > +   case NVEA_3D_CLASS:
> > case NVF0_3D_CLASS:
> >return sm35_hw_metric_queries;
> > case NVE4_3D_CLASS:
> > @@ -425,6 +426,7 @@ nvc0_hw_metric_get_num_queries(struct nvc0_screen 
> > *screen)
> > case GM200_3D_CLASS:
> > case GM107_3D_CLASS:
> >return ARRAY_SIZE(sm50_hw_metric_queries);
> > +   case NVEA_3D_CLASS:
> > case NVF0_3D_CLASS:
> >return ARRAY_SIZE(sm35_hw_metric_queries);
> > case NVE4_3D_CLASS:
> > @@ -684,6 +686,7 @@ nvc0_hw_metric_get_query_result(struct nvc0_context 
> > *nvc0,
> > switch (screen->base.class_3d) {
> > case GM200_3D_CLASS:
> > case GM107_3D_CLASS:
> > +   case NVEA_3D_CLASS:
> > case NVF0_3D_CLASS:
> >value = sm35_hw_metric_calc_result(hq, res64);
> >break;
> > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c 
> > b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
> > index df5723d..440e5d3 100644
> > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
> > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
> > @@ -2239,6 +2239,7 @@ nvc0_hw_sm_get_queries(struct nvc0_screen *screen)
> >return sm52_hw_sm_queries;
> > case GM107_3D_CLASS:
> >return sm50_hw_sm_queries;
> > +   case NVEA_3D_CLASS:
> > case NVF0_3D_CLASS:
> >return sm35_hw_sm_queries;
> > case NVE4_3D_CLASS:
> > @@ -2262,6 +2263,7 @@ nvc0_hw_sm_get_num_queries(struct nvc0_screen *screen)
> >return ARRAY_SIZE(sm52_hw_sm_queries);
> > case GM107_3D_CLASS:
> >return ARRAY_SIZE(sm50_hw_sm_queries);
> > +   case NVEA_3D_CLASS:
> > case NVF0_3D_CLASS:
> >return ARRAY_SIZE(sm35_hw_sm_queries);
> > case NVE4_3D_CLASS:
> > @@ -2475,15 +2477,14 @@ nvc0_hw_sm_get_program(struct nvc0_screen *screen)
> >prog->code_size = sizeof(gm107_read_hw_sm_counters_code);
> >prog->num_gprs = 14;
> > } else
> > -   if (screen->base.class_3d == NVE4_3D_CLASS ||
> > -   screen->base.class_3d == NVF0_3D_CLASS) {
> > -  if (screen->base.class_3d == NVE4_3D_CLASS) {
> > - prog->cod

Re: [Mesa-dev] [PATCH] mesa/main: fix version/extension checks in _mesa_ClampColor

2017-01-16 Thread Nicolai Hähnle
Emil, I'm going to follow up with a patch to try to fix the reported 
i915 regression, but feel free to drop the patch from this thread from 
mesa-stable entirely. It's not an important fix.


Cheers,
Nicolai

On 13.01.2017 21:23, Mark Janes wrote:

This patch regressed i915 systems:

https://bugs.freedesktop.org/show_bug.cgi?id=99401

Please don't apply to stable until the bug is resolved.

Nicolai Hähnle  writes:


From: Nicolai Hähnle 

Add a proper check for feature support, and raise an invalid enum for
GL_CLAMP_VERTEX/FRAGMENT_COLOR unconditionally in core profiles, since
those enums were explicitly removed after the extension was promoted
to core functionality (not in the profile sense) with OpenGL 3.0.

This matches the behavior of the AMD closed source driver and fixes
GL45-CTS.gtf30.GL3Tests.half_float.half_float_textures.

Cc: "12.0 13.0" 
---
 src/mesa/main/blend.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/blend.c b/src/mesa/main/blend.c
index 0322799..955fda1 100644
--- a/src/mesa/main/blend.c
+++ b/src/mesa/main/blend.c
@@ -854,40 +854,44 @@ _mesa_ColorMaski( GLuint buf, GLboolean red, GLboolean 
green,
FLUSH_VERTICES(ctx, _NEW_COLOR);
COPY_4UBV(ctx->Color.ColorMask[buf], tmp);
 }


 void GLAPIENTRY
 _mesa_ClampColor(GLenum target, GLenum clamp)
 {
GET_CURRENT_CONTEXT(ctx);

+   /* Check for both the extension and the GL version, since the Intel driver
+* does not advertise the extension in core profiles.
+*/
+   if (ctx->Version <= 30 && !ctx->Extensions.ARB_color_buffer_float) {
+  _mesa_error(ctx, GL_INVALID_OPERATION, "glClampColor()");
+  return;
+   }
+
if (clamp != GL_TRUE && clamp != GL_FALSE && clamp != GL_FIXED_ONLY_ARB) {
   _mesa_error(ctx, GL_INVALID_ENUM, "glClampColorARB(clamp)");
   return;
}

switch (target) {
case GL_CLAMP_VERTEX_COLOR_ARB:
-  if (ctx->API == API_OPENGL_CORE &&
-  !ctx->Extensions.ARB_color_buffer_float) {
+  if (ctx->API == API_OPENGL_CORE)
  goto invalid_enum;
-  }
   FLUSH_VERTICES(ctx, _NEW_LIGHT);
   ctx->Light.ClampVertexColor = clamp;
   _mesa_update_clamp_vertex_color(ctx, ctx->DrawBuffer);
   break;
case GL_CLAMP_FRAGMENT_COLOR_ARB:
-  if (ctx->API == API_OPENGL_CORE &&
-  !ctx->Extensions.ARB_color_buffer_float) {
+  if (ctx->API == API_OPENGL_CORE)
  goto invalid_enum;
-  }
   FLUSH_VERTICES(ctx, _NEW_FRAG_CLAMP);
   ctx->Color.ClampFragmentColor = clamp;
   _mesa_update_clamp_fragment_color(ctx, ctx->DrawBuffer);
   break;
case GL_CLAMP_READ_COLOR_ARB:
   ctx->Color.ClampReadColor = clamp;
   break;
default:
   goto invalid_enum;
}
--
2.7.4
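
The order of the checks in the patched _mesa_ClampColor() can be summarised
with a small standalone model. Everything named sketch_ here is hypothetical,
and the model assumes that the invalid_enum label, whose body is not part of
the quoted hunk, raises GL_INVALID_ENUM.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder enums for this sketch; these are not the GL values. */
enum sketch_error {
   SKETCH_NO_ERROR,
   SKETCH_INVALID_OPERATION,
   SKETCH_INVALID_ENUM,
};

enum sketch_target {
   SKETCH_CLAMP_VERTEX_COLOR,
   SKETCH_CLAMP_FRAGMENT_COLOR,
   SKETCH_CLAMP_READ_COLOR,
   SKETCH_OTHER_TARGET,
};

struct sketch_ctx {
   int version;                  /* e.g. 30 for GL 3.0 */
   bool is_core_profile;
   bool has_color_buffer_float;  /* ARB_color_buffer_float advertised */
};

/* Returns the error the patched function would raise, if any. */
static enum sketch_error
sketch_clamp_color_error(const struct sketch_ctx *ctx,
                         enum sketch_target target, bool clamp_is_valid)
{
   /* Feature check first: neither a new enough GL nor the extension. */
   if (ctx->version <= 30 && !ctx->has_color_buffer_float)
      return SKETCH_INVALID_OPERATION;

   if (!clamp_is_valid)
      return SKETCH_INVALID_ENUM;

   switch (target) {
   case SKETCH_CLAMP_VERTEX_COLOR:
   case SKETCH_CLAMP_FRAGMENT_COLOR:
      /* Removed from core profiles regardless of the extension. */
      return ctx->is_core_profile ? SKETCH_INVALID_ENUM : SKETCH_NO_ERROR;
   case SKETCH_CLAMP_READ_COLOR:
      return SKETCH_NO_ERROR;
   default:
      return SKETCH_INVALID_ENUM;
   }
}
```

The point of the model is that the vertex and fragment targets are rejected
in core profiles unconditionally, while the read target stays valid.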



Re: [Mesa-dev] [PATCH 04/27] i965/gen6: Remove check for stencil format

2017-01-16 Thread Samuel Iglesias Gonsálvez
On Mon, 2017-01-16 at 11:13 +0200, Topi Pohjolainen wrote:
> There are is no alternative.
> 

There is no alternative.

Reviewed-by: Samuel Iglesias Gonsálvez 

> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/gen6_depth_state.c | 22 
> --
>  1 file changed, 8 insertions(+), 14 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen6_depth_state.c
> b/src/mesa/drivers/dri/i965/gen6_depth_state.c
> index 3f14006..cb0ed25 100644
> --- a/src/mesa/drivers/dri/i965/gen6_depth_state.c
> +++ b/src/mesa/drivers/dri/i965/gen6_depth_state.c
> @@ -191,20 +191,14 @@ gen6_emit_depth_stencil_hiz(struct brw_context
> *brw,
>   uint32_t offset = 0;
>  
>   if (stencil_mt->array_layout == ALL_SLICES_AT_EACH_LOD) {
> -if (stencil_mt->format == MESA_FORMAT_S_UINT8) {
> -   /* Note: we can't compute the stencil offset using
> -* intel_region_get_aligned_offset(), because
> stencil_region
> -* claims that the region is untiled even though it's
> W tiled.
> -*/
> -   offset =
> -  stencil_mt->level[lod].level_y * stencil_mt->pitch 
> +
> -  stencil_mt->level[lod].level_x * 64;
> -} else {
> -   offset = intel_miptree_get_aligned_offset(
> -   stencil_mt,
> -   stencil_mt->level[lod].level_x,
> -   stencil_mt->level[lod].level_y);
> -}
> +assert(stencil_mt->format == MESA_FORMAT_S_UINT8);
> +
> +/* Note: we can't compute the stencil offset using
> + * intel_region_get_aligned_offset(), because
> stencil_region
> + * claims that the region is untiled even though it's W
> tiled.
> + */
> +offset = stencil_mt->level[lod].level_y * stencil_mt-
> >pitch +
> + stencil_mt->level[lod].level_x * 64;
>   }
>  
>    BEGIN_BATCH(3);
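
The offset computation that remains after this patch is plain arithmetic and
can be tried in isolation. The function name sketch_stencil_offset() and the
sample numbers are made up for this sketch; only the formula comes from the
quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte offset of a stencil miplevel, following the formula in the patch:
 * level_y rows of `pitch` bytes plus level_x scaled by 64.
 */
static uint32_t
sketch_stencil_offset(uint32_t level_x, uint32_t level_y, uint32_t pitch)
{
   return level_y * pitch + level_x * 64;
}
```

For example, with level_x = 2, level_y = 3 and a pitch of 512 bytes, the
result is 3 * 512 + 2 * 64 = 1664.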


Re: [Mesa-dev] [PATCH 03/27] i965: Remove check for hiz on earlier gens than SNB

2017-01-16 Thread Samuel Iglesias Gonsálvez
Reviewed-by: Samuel Iglesias Gonsálvez 

On Mon, 2017-01-16 at 11:13 +0200, Topi Pohjolainen wrote:
> The only caller, brw_workaround_depthstencil_alignment(), returns
> early for gen6+.
> 
> While at it, reduce scope for brw_get_depthstencil_tile_masks() as
> well.
> 
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_context.h|  6 --
>  src/mesa/drivers/dri/i965/brw_misc_state.c | 18 ++
>  2 files changed, 2 insertions(+), 22 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> b/src/mesa/drivers/dri/i965/brw_context.h
> index ff3f861..4176853 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -1279,12 +1279,6 @@ brw_meta_resolve_color(struct brw_context
> *brw,
>  /*==
> 
>   * brw_misc_state.c
>   */
> -void brw_get_depthstencil_tile_masks(struct intel_mipmap_tree
> *depth_mt,
> - uint32_t depth_level,
> - uint32_t depth_layer,
> - struct intel_mipmap_tree
> *stencil_mt,
> - uint32_t *out_tile_mask_x,
> - uint32_t *out_tile_mask_y);
>  void brw_workaround_depthstencil_alignment(struct brw_context *brw,
> GLbitfield clear_mask);
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c
> b/src/mesa/drivers/dri/i965/brw_misc_state.c
> index 40a8d07..616c0df 100644
> --- a/src/mesa/drivers/dri/i965/brw_misc_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
> @@ -165,7 +165,7 @@ brw_depthbuffer_format(struct brw_context *brw)
>   * packet.  If the 3 buffers don't agree on the drawing offset ANDed
> with this
>   * mask, then we're in trouble.
>   */
> -void
> +static void
>  brw_get_depthstencil_tile_masks(struct intel_mipmap_tree *depth_mt,
>  uint32_t depth_level,
>  uint32_t depth_layer,
> @@ -179,21 +179,7 @@ brw_get_depthstencil_tile_masks(struct
> intel_mipmap_tree *depth_mt,
>    intel_get_tile_masks(depth_mt->tiling, depth_mt->tr_mode,
> depth_mt->cpp,
> &tile_mask_x, &tile_mask_y);
> -
> -  if (intel_miptree_level_has_hiz(depth_mt, depth_level)) {
> - uint32_t hiz_tile_mask_x, hiz_tile_mask_y;
> - intel_get_tile_masks(depth_mt->hiz_buf->mt->tiling,
> -  depth_mt->hiz_buf->mt->tr_mode,
> -  depth_mt->hiz_buf->mt->cpp,
> -  &hiz_tile_mask_x,
> -  &hiz_tile_mask_y);
> -
> - /* Each HiZ row represents 2 rows of pixels */
> - hiz_tile_mask_y = hiz_tile_mask_y << 1 | 1;
> -
> - tile_mask_x |= hiz_tile_mask_x;
> - tile_mask_y |= hiz_tile_mask_y;
> -  }
> +  assert(!intel_miptree_level_has_hiz(depth_mt, depth_level));
> }
>  
> if (stencil_mt) {
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/27] i965/meta: Remove unused brw_get_rb_for_slice()

2017-01-16 Thread Samuel Iglesias Gonsálvez
Reviewed-by: Samuel Iglesias Gonsálvez 

On Mon, 2017-01-16 at 11:13 +0200, Topi Pohjolainen wrote:
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_meta_util.c | 44 ---
> 
>  src/mesa/drivers/dri/i965/brw_meta_util.h |  5 
>  2 files changed, 49 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.c
> b/src/mesa/drivers/dri/i965/brw_meta_util.c
> index 6d6b692..07a160f 100644
> --- a/src/mesa/drivers/dri/i965/brw_meta_util.c
> +++ b/src/mesa/drivers/dri/i965/brw_meta_util.c
> @@ -267,50 +267,6 @@ brw_meta_mirror_clip_and_scissor(const struct
> gl_context *ctx,
>  }
>  
>  /**
> - * Creates a new named renderbuffer that wraps the first slice
> - * of an existing miptree.
> - *
> - * Clobbers the current renderbuffer binding (ctx-
> >CurrentRenderbuffer).
> - */
> -struct gl_renderbuffer *
> -brw_get_rb_for_slice(struct brw_context *brw,
> - struct intel_mipmap_tree *mt,
> - unsigned level, unsigned layer, bool flat)
> -{
> -   struct gl_context *ctx = &brw->ctx;
> -   struct gl_renderbuffer *rb = ctx->Driver.NewRenderbuffer(ctx,
> 0xDEADBEEF);
> -   struct intel_renderbuffer *irb = intel_renderbuffer(rb);
> -
> -   rb->RefCount = 1;
> -   rb->Format = mt->format;
> -   rb->_BaseFormat = _mesa_get_format_base_format(mt->format);
> -
> -   /* Program takes care of msaa and mip-level access manually for
> stencil.
> -* The surface is also treated as Y-tiled instead of as W-tiled
> calling for
> -* twice the width and half the height in dimensions.
> -*/
> -   if (flat) {
> -  const unsigned halign_stencil = 8;
> -
> -  rb->NumSamples = 0;
> -  rb->Width = ALIGN(mt->total_width, halign_stencil) * 2;
> -  rb->Height = (mt->total_height / mt->physical_depth0) / 2;
> -  irb->mt_level = 0;
> -   } else {
> -  rb->NumSamples = mt->num_samples;
> -  rb->Width = mt->logical_width0;
> -  rb->Height = mt->logical_height0;
> -  irb->mt_level = level;
> -   }
> -
> -   irb->mt_layer = layer;
> -
> -   intel_miptree_reference(&irb->mt, mt);
> -
> -   return rb;
> -}
> -
> -/**
>   * Determine if fast color clear supports the given clear color.
>   *
>   * Fast color clear can only clear to color values of 1.0 or
> 0.0.  At the
> diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.h
> b/src/mesa/drivers/dri/i965/brw_meta_util.h
> index 93bc72c..207a54b 100644
> --- a/src/mesa/drivers/dri/i965/brw_meta_util.h
> +++ b/src/mesa/drivers/dri/i965/brw_meta_util.h
> @@ -57,11 +57,6 @@ brw_is_color_fast_clear_compatible(struct
> brw_context *brw,
> const struct intel_mipmap_tree
> *mt,
> const union gl_color_union
> *color);
>  
> -struct gl_renderbuffer *brw_get_rb_for_slice(struct brw_context
> *brw,
> - struct
> intel_mipmap_tree *mt,
> - unsigned level,
> unsigned layer,
> - bool flat);
> -
>  #ifdef __cplusplus
>  }
>  #endif
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 02/27] i965/miptree: Remove redundant check for null texture

2017-01-16 Thread Samuel Iglesias Gonsálvez
Reviewed-by: Samuel Iglesias Gonsálvez 

On Mon, 2017-01-16 at 11:13 +0200, Topi Pohjolainen wrote:
> There is the exact same check earlier in brw_miptree_layout(), which
> intel_miptree_create_layout() in turn calls unconditionally.
> 
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 +--
>  1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 25f8f39..9488bec 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -628,13 +628,8 @@ miptree_create(struct brw_context *brw,
>  first_level, last_level, width0,
>  height0, depth0, num_samples,
>  layout_flags);
> -   /*
> -* pitch == 0 || height == 0  indicates the null texture
> -*/
> -   if (!mt || !mt->total_width || !mt->total_height) {
> -  intel_miptree_release(&mt);
> +   if (!mt)
>    return NULL;
> -   }
>  
> if (mt->tiling == (I915_TILING_Y | I915_TILING_X))
>    mt->tiling = I915_TILING_Y;
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nir/spirv/glsl450: rewrite atan2 to deal with denorms / infinities

2017-01-16 Thread Juan A. Suarez Romero
Rewrite atan2(y,x) to cover (+/-)INF values.

Also, in case either 'y' or 'x' is a denorm value, flush it to 0 at the
very beginning.

The reason is that otherwise the hardware flushes the value in some of
the steps but not in others, so some steps end up handling a denorm
value and others a 0. This causes wrong results.

By doing it at the very beginning, we ensure that the same value (a 0)
is always used in all the steps.

This fixes several test cases in Vulkan CTS
(dEQP-VK.glsl.builtin.precision.atan2.*)
---
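
As a plain C reference for the two ideas in this patch, the sketch below
flushes denormal inputs to zero up front and then resolves the infinite-input
cases with the same table as the NIR code. All sketch_ names are made up for
this example. Checking y before x is an assumption, since the part of the
patch that combines the cases is not shown here.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>

#define SKETCH_PI 3.14159265358979323846f

/* Replace a denormal input (and -0) by +0; other values pass through. */
static float
sketch_flush_denorm(float v)
{
   return (v > -FLT_MIN && v < FLT_MIN) ? 0.0f : v;
}

/* Sign of v as -1, 0 or +1, in the spirit of nir_fsign. */
static float
sketch_sign(float v)
{
   return (float)((v > 0.0f) - (v < 0.0f));
}

/*
 * Handle the infinite-input cases. Returns true and stores the angle in
 * *result when y or x is infinite; returns false when the regular
 * atan(y/x) path is needed.
 */
static bool
sketch_atan2_inf_case(float y, float x, float *result)
{
   if (y == INFINITY || y == -INFINITY) {
      float angle = (x == INFINITY) ? SKETCH_PI / 4.0f
                  : (x == -INFINITY) ? 3.0f * SKETCH_PI / 4.0f
                  : SKETCH_PI / 2.0f;
      *result = sketch_sign(y) * angle;
      return true;
   }

   if (x == INFINITY || x == -INFINITY) {
      *result = sketch_sign(y) * ((x == INFINITY) ? 0.0f : SKETCH_PI);
      return true;
   }

   return false;
}
```

A caller would apply sketch_flush_denorm() to both inputs first, so that every
later step sees the same value, and only fall back to the atan(y/x) path when
sketch_atan2_inf_case() returns false.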
 src/compiler/spirv/vtn_glsl450.c | 68 ++--
 1 file changed, 58 insertions(+), 10 deletions(-)

diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index 0d32fdd..d2743a8 100644
--- a/src/compiler/spirv/vtn_glsl450.c
+++ b/src/compiler/spirv/vtn_glsl450.c
@@ -299,31 +299,79 @@ build_atan(nir_builder *b, nir_ssa_def *y_over_x)
return nir_fmul(b, tmp, nir_fsign(b, y_over_x));
 }
 
+/*
+ * Computes atan2(y,x)
+ *
+ * If any of the parameters is a denorm value, it is flushed to 0 at the very
+ * beginning to avoid precision errors
+ */
 static nir_ssa_def *
 build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x)
 {
nir_ssa_def *zero = nir_imm_float(b, 0.0f);
-
-   /* If |x| >= 1.0e-8 * |y|: */
+   nir_ssa_def *inf = nir_imm_float(b, INFINITY);
+   nir_ssa_def *minus_inf = nir_imm_float(b, -INFINITY);
+   nir_ssa_def *m_3_pi_4 = nir_fmul(b, nir_imm_float(b, 3.0f),
+   nir_imm_float(b, M_PI_4f));
+
+   nir_ssa_def *denorm_y = nir_bcsel(b, nir_feq(b, nir_fmov(b, nir_fabs(b, y)),
+   zero),
+zero,
+y);
+   nir_ssa_def *denorm_x = nir_bcsel(b, nir_feq(b, nir_fmov(b, nir_fabs(b, x)),
+   zero),
+zero,
+x);
+
+   /* if y == +-INF */
+   nir_ssa_def *y_is_inf = nir_feq(b, nir_fabs(b, y), inf);
+
+   /* if x == +-INF */
+   nir_ssa_def *x_is_inf = nir_feq(b, nir_fabs(b, x), inf);
+
+   /* Case: y is +-INF */
+   nir_ssa_def *y_is_inf_then =
+  nir_fmul(b, nir_fsign(b, y),
+  nir_bcsel(b, nir_feq(b, x, inf),
+   nir_imm_float(b, M_PI_4f),
+   nir_bcsel(b, nir_feq(b, x, minus_inf),
+m_3_pi_4,
+nir_imm_float(b, M_PI_2f))));
+
+   /* Case: x is +-INF */
+   nir_ssa_def *x_is_inf_then =
+  nir_fmul(b, nir_fsign(b, y),
+  nir_bcsel(b, nir_feq(b, x, inf),
+   zero,
+   nir_imm_float(b, M_PIf)));
+
+   /* If x > 0 */
nir_ssa_def *condition =
-  nir_fge(b, nir_fabs(b, x),
-  nir_fmul(b, nir_imm_float(b, 1.0e-8f), nir_fabs(b, y)));
+  nir_fne(b, denorm_x, zero);
 
/* Then...call atan(y/x) and fix it up: */
-   nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, y, x));
+   nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, denorm_y, denorm_x));
+
nir_ssa_def *r_then =
-  nir_bcsel(b, nir_flt(b, x, zero),
+  nir_bcsel(b, nir_flt(b, denorm_x, zero),
nir_fadd(b, atan1,
-   nir_bcsel(b, nir_fge(b, y, zero),
+   nir_bcsel(b, nir_fge(b, denorm_y, zero),
 nir_imm_float(b, M_PIf),
 nir_imm_float(b, -M_PIf))),
atan1);
 
/* Else... */
nir_ssa_def *r_else =
-  nir_fmul(b, nir_fsign(b, y), nir_imm_float(b, M_PI_2f));
-
-   return nir_bcsel(b, condition, r_then, r_else);
+  nir_fmul(b, nir_fsign(b, denorm_y), nir_imm_float(b, M_PI_2f));
+
+   /* Everything together */
+   return nir_bcsel(b, y_is_inf,
+   y_is_inf_then,
+   nir_bcsel(b, x_is_inf,
+x_is_inf_then,
+nir_bcsel(b, condition,
+ r_then,
+ r_else)));
 }
 
 static nir_ssa_def *
-- 
2.9.3



Re: [Mesa-dev] [PATCH 05/27] i965: Replace open coded with intel_miptree_get_image_offset()

2017-01-16 Thread Samuel Iglesias Gonsálvez
Reviewed-by: Samuel Iglesias Gonsálvez 

On Mon, 2017-01-16 at 11:13 +0200, Topi Pohjolainen wrote:
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/intel_pixel_read.c | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c
> b/src/mesa/drivers/dri/i965/intel_pixel_read.c
> index 2563897..ace94a0 100644
> --- a/src/mesa/drivers/dri/i965/intel_pixel_read.c
> +++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c
> @@ -47,6 +47,19 @@
>  
>  #define FILE_DEBUG_FLAG DEBUG_PIXEL
>  
> +static void
> +adjust_image_offset(const struct intel_renderbuffer *irb,
> +int *xoffset, int *yoffset)
> +{
> +   unsigned x;
> +   unsigned y;
> +   intel_miptree_get_image_offset(irb->mt, irb->mt_level, irb-
> >mt_layer,
> +  &x, &y);
> +
> +   *xoffset += x;
> +   *yoffset += y;
> +}
> +
>  /**
>   * \brief A fast path for glReadPixels
>   *
> @@ -153,8 +166,7 @@ intel_readpixels_tiled_memcpy(struct gl_context *
> ctx,
>    return false;
> }
>  
> -   xoffset += irb->mt->level[irb->mt_level].slice[irb-
> >mt_layer].x_offset;
> -   yoffset += irb->mt->level[irb->mt_level].slice[irb-
> >mt_layer].y_offset;
> +   adjust_image_offset(irb, &xoffset, &yoffset);
>  
> dst_pitch = _mesa_image_row_stride(pack, width, format, type);
>  


[Mesa-dev] [Bug 98833] [REGRESSION, bisected] Wayland revert commit breaks fullscreen frame updates

2017-01-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98833

Eero Tamminen  changed:

   What|Removed |Added

 Attachment #128240|0   |1
is obsolete||

--- Comment #12 from Eero Tamminen  ---
Created attachment 128981
  --> https://bugs.freedesktop.org/attachment.cgi?id=128981&action=edit
Patch to add frame swap delay option to weston-simple-egl

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


[Mesa-dev] [Bug 98833] [REGRESSION, bisected] Wayland revert commit breaks fullscreen frame updates

2017-01-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98833

--- Comment #13 from Eero Tamminen  ---
(In reply to Pekka Paalanen from comment #11)
> Hi, sorry, just back from holidays. The patch looks fine so would be nice to
> have that on wayland-devel@ mailing list if you didn't send it already. I
> think it'd be an ok addition to simple-egl.

I'm not on wayland list, and rather not subscribe on one more.  I don't need
attributions, so feel free to apply it as you see fit.


> Does it matter if the delay is before or after swapbuffers? I would slightly
> prefer it before swapbuffers so it would emulate time spent in drawing.

As expected, it doesn't matter.  Attached is updated patch.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


[Mesa-dev] [PATCH] anv: set UAV coherence required bit when needed

2017-01-16 Thread Iago Toral Quiroga
This mirrors what we do in the OpenGL driver (the comment is copied from there).

This is required to ensure that we execute the fragment shader stage when
side-effects (such as image or ssbo stores) are present but there are no
color writes.

I found this while writing a test to check rendering to a framebuffer
without attachments where the fragment shader does not produce any
color outputs but writes to an image via imageStore(). Without this patch
the fragment shader does not execute and the image is not written,
which is not correct.
---
 src/intel/vulkan/genX_pipeline.c | 51 
 1 file changed, 51 insertions(+)

diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 90968b4..6a927d5 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -1248,6 +1248,26 @@ emit_3dstate_ps(struct anv_pipeline *pipeline,
}
 }
 
+static inline bool
+has_color_buffer_write_enabled(const struct anv_pipeline *pipeline)
+{
+   const struct anv_shader_bin *shader_bin =
+  pipeline->shaders[MESA_SHADER_FRAGMENT];
+   if (!shader_bin)
+  return false;
+
+   const struct anv_pipeline_bind_map *bind_map = &shader_bin->bind_map;
+   for (int i = 0; i < bind_map->surface_count; i++) {
+  if (bind_map->surface_to_descriptor[i].set !=
+  ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS)
+ continue;
+  if (bind_map->surface_to_descriptor[i].index != UINT8_MAX)
+ return true;
+   }
+
+   return false;
+}
+
 #if GEN_GEN >= 8
 static void
 emit_3dstate_ps_extra(struct anv_pipeline *pipeline,
@@ -1278,6 +1298,37 @@ emit_3dstate_ps_extra(struct anv_pipeline *pipeline,
   ps.PixelShaderKillsPixel = subpass->has_ds_self_dep ||
  wm_prog_data->uses_kill;
 
+  /* The stricter cross-primitive coherency guarantees that the hardware
+   * gives us with the "Accesses UAV" bit set for at least one shader stage
+   * and the "UAV coherency required" bit set on the 3DPRIMITIVE command are
+   * redundant within the current image, atomic counter and SSBO GL APIs,
+   * which all have very loose ordering and coherency requirements and
+   * generally rely on the application to insert explicit barriers when a
+   * shader invocation is expected to see the memory writes performed by the
+   * invocations of some previous primitive.  Regardless of the value of
+   * "UAV coherency required", the "Accesses UAV" bits will implicitly cause
+   * an in most cases useless DC flush when the lowermost stage with the bit
+   * set finishes execution.
+   *
+   * It would be nice to disable it, but in some cases we can't because on
+   * Gen8+ it also has an influence on rasterization via the PS UAV-only
+   * signal (which could be set independently from the coherency mechanism
+   * in the 3DSTATE_WM command on Gen7), and because in some cases it will
+   * determine whether the hardware skips execution of the fragment shader
+   * or not via the ThreadDispatchEnable signal.  However if we know that
+   * GEN8_PS_BLEND_HAS_WRITEABLE_RT is going to be set and
+   * GEN8_PSX_PIXEL_SHADER_NO_RT_WRITE is not set it shouldn't make any
+   * difference so we may just disable it here.
+   *
+   * Gen8 hardware tries to compute ThreadDispatchEnable for us but doesn't
+   * take into account KillPixels when no depth or stencil writes are
+   * enabled. In order for occlusion queries to work correctly with no
+   * attachments, we need to force-enable here.
+   */
+  if ((wm_prog_data->has_side_effects || wm_prog_data->uses_kill) &&
+  !has_color_buffer_write_enabled(pipeline))
+ ps.PixelShaderHasUAV = true;
+
 #if GEN_GEN >= 9
   ps.PixelShaderPullsBary= wm_prog_data->pulls_bary;
   ps.InputCoverageMaskState  = wm_prog_data->uses_sample_mask ?
-- 
2.7.4

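To spell out the condition the patch adds, here is a minimal standalone sketch. The struct and function names (`sketch_fs_info`, `sketch_needs_uav_bit`) are invented for illustration and do not exist in anv; only the boolean logic is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical summary of the fragment shader state the patch looks at. */
struct sketch_fs_info {
   bool has_side_effects;   /* e.g. image or SSBO stores */
   bool uses_kill;          /* the shader may discard pixels */
   bool has_color_write;    /* at least one color attachment is written */
};

/* The UAV bit is forced on only when the shader has an observable effect
 * (side effects or discard) and there is no color write that would keep the
 * fragment shader stage enabled anyway.
 */
bool sketch_needs_uav_bit(const struct sketch_fs_info *fs)
{
   return (fs->has_side_effects || fs->uses_kill) && !fs->has_color_write;
}
```

With this shape, a shader that only does an imageStore() into a framebuffer without attachments gets the bit, while the same shader with a color output does not.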


Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD

2017-01-16 Thread Emil Velikov
Hi Vladislav,

On 14 January 2017 at 01:50, Vladislav Egorov  wrote:
> 14.01.2017 01:45, Timothy Arceri wrote:
>>
>> I'm asking for a chance to test before we jump in, its probably not a
>> big deal and I may even still be able to reduce my use of hashing but
>> it would be nice to be given a few days to test and even explore
>> alternatives before jumping on this implementation.
>
> A very quick and very dirty simple benchmark. I took shader-cache from
> github, branch shader-cache39. Then I've applied my preprocessor patch on
> top (because shader-cache still uses preprocessor even if the shader is
> cached and it was painful to see preprocessor taking more than half of the
> whole time). Then I've compiled it with openssl and with the Emil's patch.
> Full run on shader-db (300Mb+ of shaders) with shader-cache warmed up. It
> takes 78s, spends in libcrypto 0.27%. With OpenBSD SHA1 it runs
> approximately the same time, spends 0.53% in SHA1Transform() and other SHA1*
> functions. Subtest - 46Mb of shaders from Total War: Attila - 3.10s (for
> some reason, the cache works much faster on smaller subsets than on full
> shader-db). 1.08% were spent in libcrypto, 1.04% in
> sha1_block_data_order_avx2(). With OpenBSD 3.07s - 2.27% in SHA1Transform()
> and other SHA1* functions.
>
Did you mean "shader-db" with the "shader-cache" references above ? I
cannot find any projects with the latter name on github.

> Overall not that significant in context of shader-cache, but as expected, on
> Haswell it's twice slower than OpenSSL's AVX2 implementation.

If I understood you correctly you're saying that despite this
implementation being slower the actual total runtime remains
unchanged.
In the Attila case we even get a minimal improvement, although I'm
inclined to discard that as within the error margin.

That sounds interesting, but in all honesty I don't believe it's worth
worrying unless we have usecase(s) where the runtime difference is
noticeable.

Thanks
Emil


Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD

2017-01-16 Thread Emil Velikov
On 14 January 2017 at 06:25, Jonathan Gray  wrote:
> On Fri, Jan 13, 2017 at 04:51:31PM +, Emil Velikov wrote:
>> From: Emil Velikov 
>>
>> At the moment we support 5+ different implementations each with varying
>> amount of bugs - from thread safety problems [1], to outright broken
>> implementation(s) [2]
>>
>> In order to accommodate these we have 150+ lines of configure script and
>> two extra configure toggles, whilst an actual implementation is
>> ~200loc and our current compat wrapping ~250.
>>
>> Let's not forget that different people use different code paths, which
>> effectively makes it harder to test and debug since the default
>> implementation is automatically detected.
>>
>> To minimise all these lovely experiences, import the "100% Public
>> Domain" OpenBSD sha1 implementation. Clearly document any changes needed
>> to get building correctly, since many/most of those can be upstreamed
>> making future syncs easier.
>
> I had feared that this would somehow collide with the symbols
> in libc but it seems to build and run xorg/glxgears at least
> on broadwell with i965.
>
Amazing, thanks for testing !

> Patches for OpenBSD go to tech@ and you should look at how portable
> openssh and libressl handle systems that lack functions like
> explicit_bzero, autoconf detects systems that lack functions or are
> known to have broken implementations and alternate versions are
> provided.  Damien Miller described how this is handled for ssh in
> https://www.openbsd.org/papers/portability.pdf
> https://www.openbsd.org/papers/auug2005-portability/
>
> The attribute could also be checked in autoconf as is already done
> for various other attributes.
>
> Other parts seem odd, posix defines size_t as being in sys/types.h
> not stddef.h for example.
>
> u_int* are bsd types which predate c99 types, I could see an
> argument being made for changing the types there but it
> would likely have to cover all the other hashes as well,
> not just sha1.
>
Thanks for the tips Jonathan. I'll follow suit when/if we get this patch merged.

To answer your question - I've opted for the C99 variants
(headers/types) since we build mesa/gallium on non-POSIX platforms.
With the former being almost universally supported these days.

Thanks
Emil


Re: [Mesa-dev] [PATCH] st/va: delay calling begin_frame until we have all parameters

2017-01-16 Thread Nayan Deshmukh
Hi Christian,

Please push this patch.

There are a couple of patches [1] which are not yet reviewed. They are
trivial and are tested by Andy. Please have a look at them.

Regards,
Nayan

[1] https://lists.freedesktop.org/archives/mesa-dev/2017-January/140395.html

On Fri, Jan 13, 2017 at 11:17 PM, Andy Furniss  wrote:

> Nayan Deshmukh wrote:
>
>> On Fri, Jan 13, 2017 at 9:54 PM, Andy Furniss 
>> wrote:
>>
>
> Would be interesting to see if you see the same with this vid
>>> which easily shows the corruption.
>>>
>>> https://drive.google.com/drive/folders/0BxP5-S1t9VEEbkR4dWhT
>>> UFozV2s?usp=sharing
>>>
>>> Looks bad --hwdec-vaapi with or without --vo=vaapi
>>>
>>> with --hwdec=vaapi and --vo=vaapi I see the corruption. But without
>> --vo=vaapi it uses VAAPI EGL interop and leads to this error
>> unsupported VA image format unknown
>>
>
> Ok and thanks for looking into the bugzilla bug.
>
> I don't know why you get egl interop - I get "normal" opengl and don't
> know how to force mpv to try egl.
>
>


Re: [Mesa-dev] [PATCH] gallivm: correctly manage MCJIT at run-time

2017-01-16 Thread Emil Velikov
On 14 January 2017 at 08:46, Jose Fonseca  wrote:
> I suspect this might break builds with LLVM 3.6 or higher.
>
> The LLVMLinkInJIT must be inside #if ... #endif, and it must not be expanded
> when HAVE_LLVM >= 0x0306, since LLVMLinkInJIT(),
>
> That is, when HAVE_LLVM >= 0x0306:
> - USE_MCJIT should be static const
> - no point calling GALLIVM_MCJIT
> - must not have any LLVMLinkInJIT() call around, regardless of whether it's
> called or not.
>
>
> And this code doesn't make sense:
>
>  if (USE_MCJIT)
> LLVMLinkInMCJIT();
>  else
> LLVMLinkInJIT();
>
> If these functions are meant to force the static linking of external
> libraries, putting any control flow around it is just misleading.  It gives
> the illusion that if we don't call these functions nothing will happen which
> is definitely not true.
>
>
>> As an added bonus might even solve the issue Wu Zhen is hitting :-)
>
> I'm not entirely sure I understand Wu Zhen's problem.
>
> If android doesn't have MCJIT then I think the right fix is merely
>
> #if defined(ANDROID)
> #define USE_MCJIT 0
> #endif
>
> ...
>
>
>   // Link MCJIT when USE_MCJIT is a runtime option or defined as 1
> #if !defined(USE_MCJIT) || USE_MCJIT
>   LLVMLinkInMCJIT();
> #endif
>
>   // Link old JIT when USE_MCJIT is a runtime option or defined as 0
> #if !defined(USE_MCJIT) || !USE_MCJIT
>   LLVMLinkInJIT();
> #endif
>
> That is, any logic to decide whether to call or not LLVMLinkIn* must be done
> with _build_ time C preprocessor logic.
>
I might be the only one here, but having FOO (USE_MCJIT in this case)
as define or variable depending on $heuristics reads a bit iffy.

That aside - if I understood you correctly:
 - One of LLVMLinkIn{MC,}JIT might be missing on some versions of LLVM.
In that case having a guard called "USE" sounds like a misnomer.
Providing a static inline as needed might be cleaner ?
 - If LLVMLinkInJIT/LLVMLinkInMCJIT are solely for linking purposes,
it will be better (imho) to add a small comment and still have them
honour the user selection, since we don't want things to explode if an
older/newer LLVM adds specific code in said functions.

How does that sound ?
Emil
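
One way to read the "static inline as needed" idea is sketched below. This is only an illustration with stand-in names: `SKETCH_LLVM_VERSION`, `sketch_link_in_mcjit`, `sketch_link_in_jit` and `sketch_link_jits` are hypothetical and are not the LLVM or gallivm symbols. The point is that the choice of which link-in functions exist is made at build time, so the caller needs no run-time control flow around them.

```c
#include <stdbool.h>

/* Assumed version macro for this sketch; a real build would get the
 * equivalent from the build system.
 */
#ifndef SKETCH_LLVM_VERSION
#define SKETCH_LLVM_VERSION 0x0306
#endif

/* Flags that record which stand-in link functions really did something. */
bool sketch_mcjit_linked;
bool sketch_old_jit_linked;

/* Stand-in for the MCJIT link-in function. */
void sketch_link_in_mcjit(void)
{
   sketch_mcjit_linked = true;
}

#if SKETCH_LLVM_VERSION < 0x0306
/* Stand-in for the old JIT link-in function, present on older versions. */
void sketch_link_in_jit(void)
{
   sketch_old_jit_linked = true;
}
#else
/* The old JIT is gone: provide a no-op so callers need no #ifdef. */
static inline void sketch_link_in_jit(void)
{
}
#endif

/* Both calls are unconditional; availability was decided at build time. */
void sketch_link_jits(void)
{
   sketch_link_in_mcjit();
   sketch_link_in_jit();
}
```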


Re: [Mesa-dev] [PATCH 2/3] radv: rename global extension properties structs

2017-01-16 Thread Emil Velikov
On 14 January 2017 at 02:31, Andres Rodriguez  wrote:
> On Fri, Jan 13, 2017 at 8:13 PM, Emil Velikov 
> wrote:
>>
>> On 13 January 2017 at 23:44, Andres Rodriguez  wrote:
>> > All extension arrays are global, but only one of them refers to instance
>> > extensions.
>> >
>> > The device extension array refers to extensions that are common across
>> > all physical devices. This disctinction will be more imporant once we
>> Typos: "distinction" and "important"
>>
>> > have dynamic extension support for devices.
>> >
>> I think that this and 3/3 are a very good idea, but since RADV supports
>> only one device I'm not sure that they're applicable, yet.
>> Not too familiar with the RADV code so I might be off there.
>
>
> Besides differences in HW functionality, another use for this feature would
> be to expose an extension only if the software stack supports it.
>
Guess I was drooling too much over someone adding multiple devices
support for radv ;-)

> Eg. something like:
>
> if (libdrm_version >= x && drm_version >= y)
> register_extension(...)
>
> This will come into play with some of the other patches on amd-gfx that
> you've helped me review :)
>
Yw. As you get to the respective work - please don't base it on libdrm
version. Please check that the kernel module is new enough either via
a) a module version check or b) -EINVAL as returned by the module
input validation. The former seems to be used by radeon/amdgpu userspace
while the latter by the i915 one.

Thanks
Emil
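
For option a), the comparison itself is small. The sketch below assumes the kernel module version has already been queried (with libdrm that would typically be drmGetVersion(), which reports version_major and version_minor); the struct and function names are made up for this example.

```c
#include <stdbool.h>

/* Hypothetical holder for the version reported by the kernel module. */
struct sketch_kmod_version {
   int major;
   int minor;
};

/* True if 'have' is at least major.minor. */
bool sketch_kmod_at_least(struct sketch_kmod_version have, int major, int minor)
{
   if (have.major != major)
      return have.major > major;
   return have.minor >= minor;
}
```

An extension would then be registered only when this returns true for the version that introduced the required kernel feature.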


Re: [Mesa-dev] [PATCH] util: import sha1 implementation from OpenBSD

2017-01-16 Thread Vladislav Egorov



16.01.2017 16:13, Emil Velikov wrote:
> Hi Vladislav,
>
> On 14 January 2017 at 01:50, Vladislav Egorov  wrote:
>> 14.01.2017 01:45, Timothy Arceri wrote:
>>>
>>> I'm asking for a chance to test before we jump in, its probably not a
>>> big deal and I may even still be able to reduce my use of hashing but
>>> it would be nice to be given a few days to test and even explore
>>> alternatives before jumping on this implementation.
>>
>> A very quick and very dirty simple benchmark. I took shader-cache from
>> github, branch shader-cache39. Then I've applied my preprocessor patch on
>> top (because shader-cache still uses preprocessor even if the shader is
>> cached and it was painful to see preprocessor taking more than half of the
>> whole time). Then I've compiled it with openssl and with the Emil's patch.
>> Full run on shader-db (300Mb+ of shaders) with shader-cache warmed up. It
>> takes 78s, spends in libcrypto 0.27%. With OpenBSD SHA1 it runs
>> approximately the same time, spends 0.53% in SHA1Transform() and other SHA1*
>> functions. Subtest - 46Mb of shaders from Total War: Attila - 3.10s (for
>> some reason, the cache works much faster on smaller subsets than on full
>> shader-db). 1.08% were spent in libcrypto, 1.04% in
>> sha1_block_data_order_avx2(). With OpenBSD 3.07s - 2.27% in SHA1Transform()
>> and other SHA1* functions.
>>
> Did you mean "shader-db" with the "shader-cache" references above ? I
> cannot find any projects with the latter name on github.

I meant Timothy's github: https://github.com/tarceri/Mesa

>> Overall not that significant in context of shader-cache, but as expected, on
>> Haswell it's twice slower than OpenSSL's AVX2 implementation.
>
> If I understood you correctly you're saying that despite this
> implementation being slower the actual total runtime remains
> unchanged.
> In the Attila case we even get a minimal improvement, although I'm
> inclined to discard that as within the error margin.
>
> That sounds interesting, but in all honesty I don't believe it's worth
> worrying unless we have usecase(s) where the runtime difference is
> noticeable.
>
> Thanks
> Emil


Sure, it's just noise, it should be slower by 1%. When using hot shader 
cache it would be at most 1% difference, and at most ~2% difference on 
CPUs with hardware SHA like Goldmont. I am not sure about Vulkan. But 
yeah, it doesn't seem like there is something to worry about so far.



[Mesa-dev] [PATCH] radeonsi: make fix_fetch 64-bit

2017-01-16 Thread Marek Olšák
From: Marek Olšák 

v2: add u_bit_consecutive64
---
 src/gallium/drivers/radeonsi/si_shader.c| 4 ++--
 src/gallium/drivers/radeonsi/si_shader.h| 4 ++--
 src/gallium/drivers/radeonsi/si_state.c | 6 +++---
 src/gallium/drivers/radeonsi/si_state.h | 2 +-
 src/gallium/drivers/radeonsi/si_state_shaders.c | 2 +-
 src/util/bitscan.h  | 9 +
 6 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 6f0f414..dfba9d4 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -426,21 +426,21 @@ static void declare_input_vs(
"llvm.SI.vs.load.input", ctx->v4f32, args, 3,
LP_FUNC_ATTR_READNONE);
 
/* Break up the vec4 into individual components */
for (chan = 0; chan < 4; chan++) {
LLVMValueRef llvm_chan = lp_build_const_int32(gallivm, chan);
out[chan] = LLVMBuildExtractElement(gallivm->builder,
input, llvm_chan, "");
}
 
-   fix_fetch = (ctx->shader->key.mono.vs.fix_fetch >> (2 * input_index)) & 3;
+   fix_fetch = (ctx->shader->key.mono.vs.fix_fetch >> (4 * input_index)) & 0xf;
if (fix_fetch) {
/* The hardware returns an unsigned value; convert it to a
 * signed one.
 */
LLVMValueRef tmp = out[3];
LLVMValueRef c30 = LLVMConstInt(ctx->i32, 30, 0);
 
/* First, recover the sign-extended signed integer value. */
if (fix_fetch == SI_FIX_FETCH_A2_SSCALED)
tmp = LLVMBuildFPToUI(gallivm->builder, tmp, ctx->i32, 
"");
@@ -6578,21 +6578,21 @@ static void si_dump_shader_key(unsigned shader, struct 
si_shader_key *key,
switch (shader) {
case PIPE_SHADER_VERTEX:
fprintf(f, "  part.vs.prolog.instance_divisors = {");
for (i = 0; i < 
ARRAY_SIZE(key->part.vs.prolog.instance_divisors); i++)
fprintf(f, !i ? "%u" : ", %u",
key->part.vs.prolog.instance_divisors[i]);
fprintf(f, "}\n");
fprintf(f, "  part.vs.epilog.export_prim_id = %u\n", 
key->part.vs.epilog.export_prim_id);
fprintf(f, "  as_es = %u\n", key->as_es);
fprintf(f, "  as_ls = %u\n", key->as_ls);
-   fprintf(f, "  mono.vs.fix_fetch = 0x%x\n", key->mono.vs.fix_fetch);
+   fprintf(f, "  mono.vs.fix_fetch = 0x%"PRIx64"\n", key->mono.vs.fix_fetch);
break;
 
case PIPE_SHADER_TESS_CTRL:
fprintf(f, "  part.tcs.epilog.prim_mode = %u\n", 
key->part.tcs.epilog.prim_mode);
fprintf(f, "  mono.tcs.inputs_to_copy = 0x%"PRIx64"\n", 
key->mono.tcs.inputs_to_copy);
break;
 
case PIPE_SHADER_TESS_EVAL:
fprintf(f, "  part.tes.epilog.export_prim_id = %u\n", 
key->part.tes.epilog.export_prim_id);
fprintf(f, "  as_es = %u\n", key->as_es);
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index 1b5dec2..89f9628 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -418,22 +418,22 @@ struct si_shader_key {
 
/* These two are initially set according to the NEXT_SHADER property,
 * or guessed if the property doesn't seem correct.
 */
unsigned as_es:1; /* export shader */
unsigned as_ls:1; /* local shader */
 
/* Flags for monolithic compilation only. */
union {
struct {
-   /* One pair of bits for every input: SI_FIX_FETCH_* enums. */
-   uint32_tfix_fetch;
+   /* One nibble for every input: SI_FIX_FETCH_* enums. */
+   uint64_tfix_fetch;
} vs;
struct {
uint64_tinputs_to_copy; /* for fixed-func TCS */
} tcs;
} mono;
 
/* Optimization flags for asynchronous compilation only. */
union {
struct {
uint64_tkill_outputs; /* "get_unique_index" 
bits */
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 6e7d8da..fa78a56 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3356,26 +3356,26 @@ static void *si_create_vertex_elements(struct 
pipe_context *ctx,
   
S_008F0C_DST_SEL_W(si_map_swizzle(desc->swizzle[3])) |
   S_008F0C_NUM_FORMAT(num_format) |
   S_008F0C_DATA_FORMAT(data_format);
v->format_

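The nibble layout this patch switches to can be sketched in isolation as follows. The helper names are hypothetical and are not part of radeonsi; only the "4 bits per input, shift by 4 * input_index, mask with 0xf" scheme comes from the patch. With 4 bits per input, a 32-bit field would cover only 8 inputs, while a 64-bit field has room for 16.

```c
#include <stdint.h>

/* Hypothetical helper: store a 4-bit value for one input in the packed field. */
uint64_t sketch_set_fix_fetch(uint64_t packed, unsigned input_index,
                              unsigned value)
{
   const unsigned shift = 4 * input_index;

   /* The cast matters: shifting a 32-bit constant by 32 or more bits
    * would be undefined.
    */
   packed &= ~((uint64_t)0xf << shift);
   return packed | ((uint64_t)(value & 0xf) << shift);
}

/* Hypothetical helper: read back the 4-bit value for one input. */
unsigned sketch_get_fix_fetch(uint64_t packed, unsigned input_index)
{
   return (unsigned)((packed >> (4 * input_index)) & 0xf);
}
```
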
Re: [Mesa-dev] [PATCH] radeonsi: make fix_fetch 64-bit

2017-01-16 Thread Marek Olšák
On Mon, Jan 16, 2017 at 3:00 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> v2: add u_bit_consecutive64
> ---
>  src/gallium/drivers/radeonsi/si_shader.c| 4 ++--
>  src/gallium/drivers/radeonsi/si_shader.h| 4 ++--
>  src/gallium/drivers/radeonsi/si_state.c | 6 +++---
>  src/gallium/drivers/radeonsi/si_state.h | 2 +-
>  src/gallium/drivers/radeonsi/si_state_shaders.c | 2 +-
>  src/util/bitscan.h  | 9 +
>  6 files changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
> b/src/gallium/drivers/radeonsi/si_shader.c
> index 6f0f414..dfba9d4 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -426,21 +426,21 @@ static void declare_input_vs(
> "llvm.SI.vs.load.input", ctx->v4f32, args, 3,
> LP_FUNC_ATTR_READNONE);
>
> /* Break up the vec4 into individual components */
> for (chan = 0; chan < 4; chan++) {
> LLVMValueRef llvm_chan = lp_build_const_int32(gallivm, chan);
> out[chan] = LLVMBuildExtractElement(gallivm->builder,
> input, llvm_chan, "");
> }
>
> -   fix_fetch = (ctx->shader->key.mono.vs.fix_fetch >> (2 * input_index)) & 3;
> +   fix_fetch = (ctx->shader->key.mono.vs.fix_fetch >> (4 * input_index)) & 0xf;
> if (fix_fetch) {
> /* The hardware returns an unsigned value; convert it to a
>  * signed one.
>  */
> LLVMValueRef tmp = out[3];
> LLVMValueRef c30 = LLVMConstInt(ctx->i32, 30, 0);
>
> /* First, recover the sign-extended signed integer value. */
> if (fix_fetch == SI_FIX_FETCH_A2_SSCALED)
> tmp = LLVMBuildFPToUI(gallivm->builder, tmp, 
> ctx->i32, "");
> @@ -6578,21 +6578,21 @@ static void si_dump_shader_key(unsigned shader, 
> struct si_shader_key *key,
> switch (shader) {
> case PIPE_SHADER_VERTEX:
> fprintf(f, "  part.vs.prolog.instance_divisors = {");
> for (i = 0; i < 
> ARRAY_SIZE(key->part.vs.prolog.instance_divisors); i++)
> fprintf(f, !i ? "%u" : ", %u",
> key->part.vs.prolog.instance_divisors[i]);
> fprintf(f, "}\n");
> fprintf(f, "  part.vs.epilog.export_prim_id = %u\n", 
> key->part.vs.epilog.export_prim_id);
> fprintf(f, "  as_es = %u\n", key->as_es);
> fprintf(f, "  as_ls = %u\n", key->as_ls);
> -   fprintf(f, "  mono.vs.fix_fetch = 0x%x\n", key->mono.vs.fix_fetch);
> +   fprintf(f, "  mono.vs.fix_fetch = 0x%"PRIx64"\n", key->mono.vs.fix_fetch);
> break;
>
> case PIPE_SHADER_TESS_CTRL:
> fprintf(f, "  part.tcs.epilog.prim_mode = %u\n", 
> key->part.tcs.epilog.prim_mode);
> fprintf(f, "  mono.tcs.inputs_to_copy = 0x%"PRIx64"\n", 
> key->mono.tcs.inputs_to_copy);
> break;
>
> case PIPE_SHADER_TESS_EVAL:
> fprintf(f, "  part.tes.epilog.export_prim_id = %u\n", 
> key->part.tes.epilog.export_prim_id);
> fprintf(f, "  as_es = %u\n", key->as_es);
> diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
> b/src/gallium/drivers/radeonsi/si_shader.h
> index 1b5dec2..89f9628 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.h
> +++ b/src/gallium/drivers/radeonsi/si_shader.h
> @@ -418,22 +418,22 @@ struct si_shader_key {
>
> /* These two are initially set according to the NEXT_SHADER property,
>  * or guessed if the property doesn't seem correct.
>  */
> unsigned as_es:1; /* export shader */
> unsigned as_ls:1; /* local shader */
>
> /* Flags for monolithic compilation only. */
> union {
> struct {
> -   /* One pair of bits for every input: SI_FIX_FETCH_* enums. */
> -   uint32_tfix_fetch;
> +   /* One nibble for every input: SI_FIX_FETCH_* enums. */
> +   uint64_tfix_fetch;
> } vs;
> struct {
> uint64_tinputs_to_copy; /* for fixed-func TCS 
> */
> } tcs;
> } mono;
>
> /* Optimization flags for asynchronous compilation only. */
> union {
> struct {
> uint64_tkill_outputs; /* "get_unique_index" 
> bits */
> diff --git a/src/gallium/drivers/radeonsi/si_state.c 
> b/src/gallium/drivers/radeonsi/si_state.c
> index 6e7d8da..fa78a56 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -3356,26 +3356,26 @@ static void *si_create_vertex_elements(struct 
> pipe_con

Re: [Mesa-dev] [PATCH 2/3] radv: rename global extension properties structs

2017-01-16 Thread Bas Nieuwenhuizen
On Mon, Jan 16, 2017 at 2:51 PM, Emil Velikov  wrote:
> On 14 January 2017 at 02:31, Andres Rodriguez  wrote:
>> On Fri, Jan 13, 2017 at 8:13 PM, Emil Velikov 
>> wrote:
>>>
>>> On 13 January 2017 at 23:44, Andres Rodriguez  wrote:
>>> > All extension arrays are global, but only one of them refers to instance
>>> > extensions.
>>> >
>>> > The device extension array refers to extensions that are common across
>>> > all physical devices. This disctinction will be more imporant once we
>>> Typos: "distinction" and "important"
>>>
>>> > have dynamic extension support for devices.
>>> >
>>> I think that this and 3/3 are very good idea, but since RADV supports
>>> only one device I'm not sure that they're applicable, yet.
>>> Not too familiar with the RADV code so I might be off there.
>>
>>
>> Besides differences in HW functionality, another use for this feature would
>> be to expose an extension only if the software stack supports it.
>>
> Guess I was drooling too much over someone adding multiple devices
> support for radv ;-)
>
>> Eg. something like:
>>
>> if (libdrm_version >= x && drm_version >= y)
>> register_extension(...)
>>
>> This will come into play with some of the other patches on amd-gfx that
>> you've helped me review :)
>>
> Yw. As you get to the respective work - please don't base it on libdrm
> version. Please check that the kernel module is new enough either via
> a) a module version check or b) -EINVAL as returned by the module
> input validation. The former seems to be used by radeon/amdgpu userspace
> while the latter by the i915 one.

Using the kernel exported DRM version seems to be common practice for
radeonsi and the amdgpu and radeon winsyses though? That version has
been increased in both kernel drivers to indicate new features too.
libdrm version will probably need to be compile time though anyway.
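
For illustration, a minimal sketch of the kind of run-time check being
discussed. The struct and the function name are hypothetical; with libdrm
the major/minor pair would typically be read via drmGetVersion(). The
policy shown (major must match, minor gates features) is an assumption
for the sketch, not something this thread settles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the version pair reported by the kernel driver. */
struct drm_kernel_version {
   int major;
   int minor;
};

/* Return true if the running kernel driver is at least req_major.req_minor.
 * Assumption: a different major means an incompatible interface, so the
 * major has to match exactly and only the minor is compared with >=. */
static inline bool
kernel_supports(const struct drm_kernel_version *v, int req_major, int req_minor)
{
   assert(v != NULL);
   return v->major == req_major && v->minor >= req_minor;
}
```

An extension would then be registered only when kernel_supports() returns
true for the minor version that introduced the required feature.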

- Bas
>
> Thanks
> Emil
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] gallium: correctly manage libsensors link flags

2017-01-16 Thread Emil Velikov
On 8 December 2016 at 17:58, Emil Velikov  wrote:
> From: Emil Velikov 
>
> We should be using LIBS rather than the LDFLAGS variable. Furthermore
> try to keep the linking to the final stage, rather than an intermediate
> static library.
>
> Cc: Steven Toth 
> Signed-off-by: Emil Velikov 
> ---
> Steven please double-check things on your end.
Humble ping on this and 2/2 ? Barring any objections I'll be pushing
these in a day or two.

Emil


Re: [Mesa-dev] [PATCH 1/5] configure: forbid static EGL/GBM

2017-01-16 Thread Emil Velikov
On 7 December 2016 at 13:24, Emil Velikov  wrote:
> From: Emil Velikov 
>
> Both libraries implicitly require shared GLAPI which in itself mandates
> shared libraries.
>
> Stop pretending that one can use it and error out at configure stage.
>
Humble ping on the series ? Barring any objections I'll be landing
these in the next few days.

Emil


Re: [Mesa-dev] [PATCH 1/7] glx: remove always false ifdef GLX_NO_STATIC_EXTENSION_FUNCTIONS

2017-01-16 Thread Emil Velikov
On 5 December 2016 at 19:52, Emil Velikov  wrote:
> From: Emil Velikov 
>
> Quick search through git history (of both mesa and xserver) shows no
> instances where this was ever set.
>
Some of the series is short on reviews - 1, 2, 4, 5, 6. Barring any
objections I'll be merging the lot in the next few days.

-Emil


Re: [Mesa-dev] [PATCH v2 1/2] egl/wayland: use the destroy_window_callback for swrast

2017-01-16 Thread Emil Velikov
On 28 November 2016 at 18:25, Emil Velikov  wrote:
> From: Emil Velikov 
>
> As described in commit 690ead4a135 ("egl/wayland-egl: Fix for segfault
> in dri2_wl_destroy_surface.") if we attempt to destroy an EGL surface
> attached to an already destroyed Wayland window we'll get a segfault.
>
> v2: set the correct callback alongside the window->private. (Dan)
>
Humble poke on this and 2/2 ?

Emil


Re: [Mesa-dev] [PATCH] gallivm: correctly manage MCJIT at run-time

2017-01-16 Thread Jose Fonseca

On 16/01/17 13:46, Emil Velikov wrote:

On 14 January 2017 at 08:46, Jose Fonseca  wrote:

I suspect this might break builds with LLVM 3.6 or higher.

The LLVMLinkInJIT must be inside #if ... #endif, and it must not be expanded
when HAVE_LLVM >= 0x0306, since LLVMLinkInJIT(),

That is, when HAVE_LLVM >= 0x0306:
- USE_MCJIT should be static const
- no point calling GALLIVM_MCJIT
- must not have any LLVMLinkInJIT() call around, regardless of whether it's
called or not.


And this code doesn't make sense:

   if (USE_MCJIT)
      LLVMLinkInMCJIT();
   else
      LLVMLinkInJIT();

If these functions are meant to force the static linking of external
libraries, putting any control flow around it is just misleading.  It gives
the illusion that if we don't call these functions nothing will happen, which
is definitely not true.



As an added bonus might even solve the issue Wu Zhen is hitting :-)


I'm not entirely sure I understand Wu Zhen's problem.

If android doesn't have MCJIT then I think the right fix is merely

#if defined(ANDROID)
#define USE_MCJIT 0
#endif

...


  // Link MCJIT when USE_MCJIT is a runtime option or defined as 1
#if !defined(USE_MCJIT) || USE_MCJIT
  LLVMLinkInMCJIT();
#endif

  // Link old JIT when USE_MCJIT is a runtime option or defined as 0
#if !defined(USE_MCJIT) || !USE_MCJIT
  LLVMLinkInJIT();
#endif

That is, any logic to decide whether or not to call LLVMLinkIn* must be done
with _build_ time C-preprocessor logic.


I might be the only one here, but having FOO (USE_MCJIT in this case)
as define or variable depending on $heuristics reads a bit iffy.


Good point.  I'm attaching a patch that addresses this.


That aside - if I understood you correctly:
 - One of LLVMLinkIn{MC,}JIT might be missing on some versions of LLVM.
In that case having a guard called "USE" sounds like a misnomer.
Providing a static inline as needed might be cleaner ?


I didn't use an inline but I separated the define and variable more clearly.
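
To make the split concrete, here is a minimal standalone sketch of the
idea: a build-time define that, when present, pins a constant, and
otherwise a run-time variable that can be set from the environment.
parse_bool_option() and resolve_use_mcjit() are hypothetical names for
this sketch, and reading GALLIVM_MCJIT through plain getenv() is an
assumption; the real code may use a different helper.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Build-time knob: leave USE_MCJIT undefined to allow a run-time choice,
 * or define it to 0/1 to pin the choice at compile time. */
#if defined(USE_MCJIT)
static const bool use_mcjit = USE_MCJIT;
#else
static bool use_mcjit = false;
#endif

/* Hypothetical helper: interpret an environment-style string as a boolean.
 * NULL or empty yields the default; "0", "false" and "no" mean false. */
static inline bool
parse_bool_option(const char *value, bool dflt)
{
   if (value == NULL || value[0] == '\0')
      return dflt;
   return strcmp(value, "0") != 0 &&
          strcmp(value, "false") != 0 &&
          strcmp(value, "no") != 0;
}

/* Only write the variable when the build left the choice open. */
static inline bool
resolve_use_mcjit(void)
{
#if !defined(USE_MCJIT)
   use_mcjit = parse_bool_option(getenv("GALLIVM_MCJIT"), false);
#endif
   return use_mcjit;
}
```

The point of the pattern is that the preprocessor symbol is only ever
tested with #if, while ordinary code only ever reads the variable.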


 - If LLVMLinkInJIT/LLVMLinkInMCJIT are solely for linking purposes,
it will be better (imho) to add a small comment and still have them
honour the user selection.
With the latter, since we don't want things to explode as older/newer
LLVM adds specific code in said functions.


That's bordering paranoia.  The semantics of LLVMLinkIn* are clear. 
Yes, upstream can break them as they can break the semantics of any 
other function we use.  It makes no sense to trust upstream to keep 
backwards compatibility for some functions and not others.


And I still think that guarding the LLVMLinkIn* calls with run-time checks is
more evil (because it's misleading) than good.




How does that sound ?
Emil


This patch does nothing for Android, but it should now be trivial for 
Zhen Wu to rebase his patch.


Jose

commit 411eb361e284ec26f875daa84de4a181dcea9288
Author: Jose Fonseca 
Date:   Mon Jan 16 14:19:36 2017 +

gallivm: Cleanup USE_MCJIT.

Split USE_MCJIT macro dual nature into a separate constant time define
and a run-time variable.

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index d1b2369..ada823b 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -48,8 +48,12 @@
 #  define USE_MCJIT 1
 #elif defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390) || defined(PIPE_ARCH_ARM) || defined(PIPE_ARCH_AARCH64)
 #  define USE_MCJIT 1
+#endif
+
+#if defined(USE_MCJIT)
+static const bool use_mcjit = USE_MCJIT;
 #else
-static bool USE_MCJIT = 0;
+static bool use_mcjit = FALSE;
 #endif
 
 
@@ -190,7 +194,7 @@ gallivm_free_ir(struct gallivm_state *gallivm)
 
FREE(gallivm->module_name);
 
-   if (!USE_MCJIT) {
+   if (!use_mcjit) {
   /* Don't free the TargetData, it's owned by the exec engine */
} else {
   if (gallivm->target) {
@@ -248,7 +252,7 @@ init_gallivm_engine(struct gallivm_state *gallivm)
 gallivm->module,
 gallivm->memorymgr,
 (unsigned) optlevel,
-USE_MCJIT,
+use_mcjit,
 &error);
   if (ret) {
  _debug_printf("%s\n", error);
@@ -257,7 +261,7 @@ init_gallivm_engine(struct gallivm_state *gallivm)
   }
}
 
-   if (!USE_MCJIT) {
+   if (!use_mcjit) {
   gallivm->target = LLVMGetExecutionEngineTargetData(gallivm->engine);
   if (!gallivm->target)
  goto fail;
@@ -336,7 +340,7 @@ init_gallivm_state(struct gallivm_state *gallivm, const char *name,
 * complete when MC-JIT is created. So defer the MC-JIT engine creation for
 * now.
 */
-   if (!USE_MCJIT) {
+   if (!use_mcjit) {
   if (!init_gallivm_engine(gallivm)) {
  goto fail;
   }
@@ -395,10 +399,16 @@ lp_build_init(void)
if (gallivm_i

Re: [Mesa-dev] [PATCH v2 1/2] egl/wayland: use the destroy_window_callback for swrast

2017-01-16 Thread Daniel Stone
Hi Emil,

On 16 January 2017 at 14:25, Emil Velikov  wrote:
> On 28 November 2016 at 18:25, Emil Velikov  wrote:
>> As described in commit 690ead4a135 ("egl/wayland-egl: Fix for segfault
>> in dri2_wl_destroy_surface.") if we attempt to destroy an EGL surface
>> attached to an already destroyed Wayland window we'll get a segfault.
>>
>> v2: set the correct callback alongside the window->private. (Dan)
>>
> Humble poke on this and 2/2 ?

Both are:
Reviewed-by: Daniel Stone 

Cheers,
Daniel


[Mesa-dev] [Bug 97967] glsl/tests/cache-test regression

2017-01-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97967

--- Comment #4 from Emil Velikov  ---
Fwiw I've run into an identical issue whilst importing a SHA1 implementation.

The issue here is a buggy SHA1 implementation. If you print the result of
_mesa_sha1_compute and compare it across SHA1 implementations, you'll see it
first hand.
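
A minimal sketch of a helper for that comparison, assuming the digest is
the usual 20 raw bytes. sha1_to_hex() is a hypothetical name used only
here; it formats a digest as hex so the output of two implementations can
be diffed by eye.

```c
#include <stddef.h>

#define SHA1_DIGEST_LEN 20

/* Hypothetical helper: write a 20-byte SHA-1 digest as 40 lowercase hex
 * characters plus a terminating NUL. `out` must hold at least 41 bytes. */
static inline void
sha1_to_hex(const unsigned char digest[SHA1_DIGEST_LEN], char out[41])
{
   static const char hex[] = "0123456789abcdef";

   for (size_t i = 0; i < SHA1_DIGEST_LEN; i++) {
      out[2 * i]     = hex[digest[i] >> 4];    /* high nibble */
      out[2 * i + 1] = hex[digest[i] & 0xf];   /* low nibble */
   }
   out[40] = '\0';
}
```

A handy reference point when checking an implementation: the SHA-1 of
empty input is da39a3ee5e6b4b0d3255bfef95601890afd80709.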

Thus for this bug the cache_test.c comment "For this test, we force this
signature to land in the same directory as the original blob first written to
the cache," is not true, leading to the interesting experience.

Fwiw I've imported OpenBSD's implementation [1]. It works great, saves us a bit
of code, configure script and toggles.

[1] https://patchwork.freedesktop.org/patch/133113/

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v2] anv: increase ANV_MAX_STATE_SIZE_LOG2 limit to 1 MB

2017-01-16 Thread Jason Ekstrand
Rb

On Jan 16, 2017 12:15 AM, "Samuel Iglesias Gonsálvez" 
wrote:

> Fixes crash in dEQP-VK.ubo.random.all_shared_buffer.48 due to a
> fragment shader code bigger than 128 kB.
>
> This patch increases the allocation size limit to 1 MB.
>
> v2:
> - Increase it to 1 MB (Jason)
> - Increase device->instruction_block_pool allocation size in
>   anv_device.c (Jason)
>
> Signed-off-by: Samuel Iglesias Gonsálvez 
> ---
>  src/intel/vulkan/anv_device.c  | 2 +-
>  src/intel/vulkan/anv_private.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 6349537d172..f80a36a9400 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -948,7 +948,7 @@ VkResult anv_CreateDevice(
> anv_state_pool_init(&device->dynamic_state_pool,
> &device->dynamic_state_block_pool);
>
> -   anv_block_pool_init(&device->instruction_block_pool, device, 128 * 1024);
> +   anv_block_pool_init(&device->instruction_block_pool, device, 1024 * 1024);
> anv_state_pool_init(&device->instruction_state_pool,
> &device->instruction_block_pool);
>
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> index 17b72368819..75f2bde66a8 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -388,7 +388,7 @@ struct anv_fixed_size_state_pool {
>  };
>
>  #define ANV_MIN_STATE_SIZE_LOG2 6
> -#define ANV_MAX_STATE_SIZE_LOG2 17
> +#define ANV_MAX_STATE_SIZE_LOG2 20
>
>  #define ANV_STATE_BUCKETS (ANV_MAX_STATE_SIZE_LOG2 - ANV_MIN_STATE_SIZE_LOG2 + 1)
>
> --
> 2.11.0
>
>
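
For reference, the arithmetic behind the new limit: 1 << 17 is 128 kB,
1 << 20 is 1 MB, and the bucket count becomes 20 - 6 + 1 = 15. The sketch
below reuses the three defines from the patch; state_bucket_for_size() is
a hypothetical helper written for illustration and is not claimed to match
the driver's own lookup.

```c
#include <assert.h>
#include <stdint.h>

#define ANV_MIN_STATE_SIZE_LOG2 6
#define ANV_MAX_STATE_SIZE_LOG2 20
#define ANV_STATE_BUCKETS (ANV_MAX_STATE_SIZE_LOG2 - ANV_MIN_STATE_SIZE_LOG2 + 1)

/* Hypothetical helper: index of the smallest power-of-two bucket that can
 * hold `size` bytes. Bucket 0 holds 1 << ANV_MIN_STATE_SIZE_LOG2 bytes. */
static inline unsigned
state_bucket_for_size(uint32_t size)
{
   unsigned size_log2 = ANV_MIN_STATE_SIZE_LOG2;

   /* Anything above the maximum has no bucket. */
   assert(size <= ((uint32_t)1 << ANV_MAX_STATE_SIZE_LOG2));

   while (((uint32_t)1 << size_log2) < size)
      size_log2++;

   return size_log2 - ANV_MIN_STATE_SIZE_LOG2;
}
```

With the old limit of 17, a shader just over 128 kB would trip the
assertion in this sketch; with 20 it lands in bucket 12 (256 kB).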


[Mesa-dev] [PATCH 03/19] i965: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

The latter can contain stale generated files which, as-is, we'll end up
using.

Fixes: bfd17c76c12 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" 
Cc: Kenneth Graunke 
Signed-off-by: Emil Velikov 
---
Strictly speaking not introduced with the above commit.
---
 src/mesa/drivers/dri/i965/Makefile.am | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/Makefile.am 
b/src/mesa/drivers/dri/i965/Makefile.am
index 92cb5b5ba0..1fb7485ffa 100644
--- a/src/mesa/drivers/dri/i965/Makefile.am
+++ b/src/mesa/drivers/dri/i965/Makefile.am
@@ -30,15 +30,15 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
+   -I$(top_builddir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/intel/server \
-I$(top_srcdir)/src/gtest/include \
-   -I$(top_srcdir)/src/compiler/nir \
-   -I$(top_srcdir)/src/intel \
-I$(top_builddir)/src/compiler/glsl \
-I$(top_builddir)/src/compiler/nir \
+   -I$(top_srcdir)/src/compiler/nir \
-I$(top_builddir)/src/intel \
-   -I$(top_builddir)/src/mesa/drivers/dri/common \
+   -I$(top_srcdir)/src/intel \
$(DEFINES) \
$(VISIBILITY_CFLAGS) \
$(INTEL_CFLAGS)
-- 
2.11.0



[Mesa-dev] [PATCH 02/19] freedreno: automake: correctly set MKDIR_GEN

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Fixes: 4610e5ef28e "freedreno/ir3: fix sin/cos"
Cc: "12.0 13.0" 
Cc: Rob Clark 
Cc: Nicolas Dechesne 
Reported-by: Nicolas Dechesne 
Signed-off-by: Emil Velikov 
---
 src/gallium/drivers/freedreno/Makefile.am | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/freedreno/Makefile.am 
b/src/gallium/drivers/freedreno/Makefile.am
index e5c344d700..128c7fb599 100644
--- a/src/gallium/drivers/freedreno/Makefile.am
+++ b/src/gallium/drivers/freedreno/Makefile.am
@@ -9,6 +9,7 @@ AM_CFLAGS = \
$(GALLIUM_DRIVER_CFLAGS) \
$(FREEDRENO_CFLAGS)
 
+MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)
 ir3/ir3_nir_trig.c: ir3/ir3_nir_trig.py 
$(top_srcdir)/src/compiler/nir/nir_algebraic.py
$(MKDIR_GEN)
$(AM_V_GEN) PYTHONPATH=$(top_srcdir)/src/compiler/nir $(PYTHON2) 
$(PYTHON_FLAGS) $(srcdir)/ir3/ir3_nir_trig.py > $@ || ($(RM) $@; false)
-- 
2.11.0



[Mesa-dev] [PATCH 01/19] i965: automake: correctly set MKDIR_GEN

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Otherwise we might end up w/o the respective folder (depending on
autotools version) and fail at build time.

Fixes: bfd17c76c12 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" 
Cc: Kenneth Graunke 
Signed-off-by: Emil Velikov 
---
Worth setting in configure and/or using @MKDIR_GEN@ + AC_SUBST ?
---
 src/mesa/drivers/dri/i965/Makefile.am | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/Makefile.am 
b/src/mesa/drivers/dri/i965/Makefile.am
index 4b009770ab..92cb5b5ba0 100644
--- a/src/mesa/drivers/dri/i965/Makefile.am
+++ b/src/mesa/drivers/dri/i965/Makefile.am
@@ -45,6 +45,7 @@ AM_CFLAGS = \
 
 AM_CXXFLAGS = $(AM_CFLAGS)
 
+MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)
 brw_nir_trig_workarounds.c: brw_nir_trig_workarounds.py 
$(top_srcdir)/src/compiler/nir/nir_algebraic.py
$(MKDIR_GEN)
$(AM_V_GEN) PYTHONPATH=$(top_srcdir)/src/compiler/nir $(PYTHON2) 
$(PYTHON_FLAGS) $(srcdir)/brw_nir_trig_workarounds.py > $@ || ($(RM) $@; false)
-- 
2.11.0



[Mesa-dev] [PATCH 07/19] clover: automake: remove -I$(srcdir)

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Already implicitly handled by the build system.

Signed-off-by: Emil Velikov 
---
 src/gallium/state_trackers/clover/Makefile.am | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/clover/Makefile.am 
b/src/gallium/state_trackers/clover/Makefile.am
index 8abcfec2e3..a657e5b88a 100644
--- a/src/gallium/state_trackers/clover/Makefile.am
+++ b/src/gallium/state_trackers/clover/Makefile.am
@@ -7,8 +7,7 @@ AM_CPPFLAGS = \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/drivers \
-I$(top_srcdir)/src/gallium/auxiliary \
-   -I$(top_srcdir)/src/gallium/winsys \
-   -I$(srcdir)
+   -I$(top_srcdir)/src/gallium/winsys
 
 if HAVE_CLOVER_ICD
 AM_CPPFLAGS += -DHAVE_CLOVER_ICD
-- 
2.11.0



[Mesa-dev] [PATCH 04/19] i915: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Cc: "12.0 13.0" 
Signed-off-by: Emil Velikov 
---
 src/mesa/drivers/dri/i915/Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i915/Makefile.am 
b/src/mesa/drivers/dri/i915/Makefile.am
index 822f74c230..11b7341c73 100644
--- a/src/mesa/drivers/dri/i915/Makefile.am
+++ b/src/mesa/drivers/dri/i915/Makefile.am
@@ -30,9 +30,9 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
+   -I$(top_builddir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/intel/server \
-   -I$(top_builddir)/src/mesa/drivers/dri/common \
$(DEFINES) \
$(VISIBILITY_CFLAGS) \
$(INTEL_CFLAGS)
-- 
2.11.0



[Mesa-dev] [PATCH 06/19] clover: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Cc: "12.0 13.0" 
Cc: Aaron Watry 
Cc: Francisco Jerez 
Signed-off-by: Emil Velikov 
---
 src/gallium/state_trackers/clover/Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/clover/Makefile.am 
b/src/gallium/state_trackers/clover/Makefile.am
index d0b191464d..8abcfec2e3 100644
--- a/src/gallium/state_trackers/clover/Makefile.am
+++ b/src/gallium/state_trackers/clover/Makefile.am
@@ -2,12 +2,12 @@ include Makefile.sources
 
 AM_CPPFLAGS = \
-I$(top_srcdir)/include \
+   -I$(top_builddir)/src \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/drivers \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/gallium/winsys \
-   -I$(top_builddir)/src \
-I$(srcdir)
 
 if HAVE_CLOVER_ICD
-- 
2.11.0



[Mesa-dev] [PATCH 08/19] st/dri: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Cc: "12.0 13.0" 
Signed-off-by: Emil Velikov 
---
 src/gallium/state_trackers/dri/Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/dri/Makefile.am 
b/src/gallium/state_trackers/dri/Makefile.am
index 74bccaa641..61a1cabeb8 100644
--- a/src/gallium/state_trackers/dri/Makefile.am
+++ b/src/gallium/state_trackers/dri/Makefile.am
@@ -28,8 +28,8 @@ AM_CPPFLAGS = \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa \
-   -I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_builddir)/src/mesa/drivers/dri/common \
+   -I$(top_srcdir)/src/mesa/drivers/dri/common \
$(GALLIUM_CFLAGS) \
$(LIBDRM_CFLAGS) \
$(VISIBILITY_CFLAGS)
-- 
2.11.0



[Mesa-dev] [PATCH 09/19] d3dadapter9: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Cc: "12.0 13.0" 
Cc: Axel Davy 
Signed-off-by: Emil Velikov 
---
 src/gallium/targets/d3dadapter9/Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/targets/d3dadapter9/Makefile.am 
b/src/gallium/targets/d3dadapter9/Makefile.am
index a3d2416c31..b78fb721a1 100644
--- a/src/gallium/targets/d3dadapter9/Makefile.am
+++ b/src/gallium/targets/d3dadapter9/Makefile.am
@@ -27,8 +27,8 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/loader \
-I$(top_srcdir)/src/mapi/ \
-I$(top_srcdir)/src/mesa/ \
-   -I$(top_srcdir)/src/mesa/drivers/dri/common/ \
-I$(top_builddir)/src/mesa/drivers/dri/common/ \
+   -I$(top_srcdir)/src/mesa/drivers/dri/common/ \
-I$(top_srcdir)/src/gallium/winsys \
-I$(top_srcdir)/src/gallium/state_trackers/nine \
$(GALLIUM_TARGET_CFLAGS) \
-- 
2.11.0



[Mesa-dev] [PATCH 12/19] glx/windows: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Cc: "12.0 13.0" 
Cc: Jon Turney 
Signed-off-by: Emil Velikov 
---
 src/glx/windows/Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glx/windows/Makefile.am b/src/glx/windows/Makefile.am
index 9806988236..6de3cf226b 100644
--- a/src/glx/windows/Makefile.am
+++ b/src/glx/windows/Makefile.am
@@ -24,8 +24,8 @@ libwindowsglx_la_CFLAGS = \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/glx \
-I$(top_srcdir)/src/mapi \
-   -I$(top_srcdir)/src/mapi/glapi \
-I$(top_builddir)/src/mapi/glapi \
+   -I$(top_srcdir)/src/mapi/glapi \
$(VISIBILITY_CFLAGS) \
$(SHARED_GLAPI_CFLAGS) \
$(DEFINES) \
-- 
2.11.0



[Mesa-dev] [PATCH 13/19] loader: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Cc: "12.0 13.0" 
Signed-off-by: Emil Velikov 
---
 src/loader/Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am
index ba2e65c371..eff85af42d 100644
--- a/src/loader/Makefile.am
+++ b/src/loader/Makefile.am
@@ -39,8 +39,8 @@ libloader_la_LIBADD =
 
 if HAVE_DRICOMMON
 libloader_la_CPPFLAGS += \
-   -I$(top_srcdir)/src/mesa/drivers/dri/common/ \
-I$(top_builddir)/src/mesa/drivers/dri/common/ \
+   -I$(top_srcdir)/src/mesa/drivers/dri/common/ \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/mapi/ \
-DUSE_DRICONF
-- 
2.11.0



[Mesa-dev] [PATCH 11/19] glx/apple: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Cc: "12.0 13.0" 
Cc: Jeremy Huddleston Sequoia 
Signed-off-by: Emil Velikov 
---
 src/glx/apple/Makefile.am | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glx/apple/Makefile.am b/src/glx/apple/Makefile.am
index 2cbff9ea90..ca74aa7b99 100644
--- a/src/glx/apple/Makefile.am
+++ b/src/glx/apple/Makefile.am
@@ -6,11 +6,11 @@ AM_CFLAGS = \
-I$(top_srcdir)/src \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src/glx \
-   -I$(top_srcdir)/src/mesa \
-I$(top_builddir)/src/mesa \
+   -I$(top_srcdir)/src/mesa \
-I$(top_srcdir)/src/mapi \
-   -I$(top_srcdir)/src/mapi/glapi \
-I$(top_builddir)/src/mapi/glapi \
+   -I$(top_srcdir)/src/mapi/glapi \
$(VISIBILITY_CFLAGS) \
$(SHARED_GLAPI_CFLAGS) \
$(DEFINES) \
-- 
2.11.0



[Mesa-dev] [PATCH 05/19] egl: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Cc: "12.0 13.0" 
Signed-off-by: Emil Velikov 
---
 src/egl/Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
index 7c5abd2114..407c69a992 100644
--- a/src/egl/Makefile.am
+++ b/src/egl/Makefile.am
@@ -97,8 +97,8 @@ AM_CFLAGS += \
-I$(top_srcdir)/src/egl/drivers/dri2 \
-I$(top_srcdir)/src/gbm/backends/dri \
-I$(top_srcdir)/src/egl/wayland/wayland-egl \
-   -I$(top_srcdir)/src/egl/wayland/wayland-drm \
-I$(top_builddir)/src/egl/wayland/wayland-drm \
+   -I$(top_srcdir)/src/egl/wayland/wayland-drm \
-DDEFAULT_DRIVER_DIR=\"$(DRI_DRIVER_SEARCH_DIR)\" \
-D_EGL_BUILT_IN_DRIVER_DRI2
 
-- 
2.11.0



[Mesa-dev] [PATCH 16/19] dri/swrast: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Cc: "12.0 13.0" 
Signed-off-by: Emil Velikov 
---
 src/mesa/drivers/dri/swrast/Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/swrast/Makefile.am 
b/src/mesa/drivers/dri/swrast/Makefile.am
index 9d21d9ea4d..a82e580f1d 100644
--- a/src/mesa/drivers/dri/swrast/Makefile.am
+++ b/src/mesa/drivers/dri/swrast/Makefile.am
@@ -30,8 +30,8 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
-   -I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_builddir)/src/mesa/drivers/dri/common \
+   -I$(top_srcdir)/src/mesa/drivers/dri/common \
$(LIBDRM_CFLAGS) \
$(DEFINES) \
$(VISIBILITY_CFLAGS)
-- 
2.11.0



[Mesa-dev] [PATCH 15/19] radeon, r200: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Cc: "12.0 13.0" 
Signed-off-by: Emil Velikov 
---
 src/mesa/drivers/dri/r200/Makefile.am   | 2 +-
 src/mesa/drivers/dri/radeon/Makefile.am | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/r200/Makefile.am 
b/src/mesa/drivers/dri/r200/Makefile.am
index 137d3c85a6..1094343d60 100644
--- a/src/mesa/drivers/dri/r200/Makefile.am
+++ b/src/mesa/drivers/dri/r200/Makefile.am
@@ -34,9 +34,9 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
+   -I$(top_builddir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/r200/server \
-   -I$(top_builddir)/src/mesa/drivers/dri/common \
$(DEFINES) \
$(VISIBILITY_CFLAGS) \
$(RADEON_CFLAGS)
diff --git a/src/mesa/drivers/dri/radeon/Makefile.am 
b/src/mesa/drivers/dri/radeon/Makefile.am
index b236aa6b5e..176ec797ef 100644
--- a/src/mesa/drivers/dri/radeon/Makefile.am
+++ b/src/mesa/drivers/dri/radeon/Makefile.am
@@ -35,9 +35,9 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
+   -I$(top_builddir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/radeon/server \
-   -I$(top_builddir)/src/mesa/drivers/dri/common \
$(DEFINES) \
$(VISIBILITY_CFLAGS) \
$(RADEON_CFLAGS)
-- 
2.11.0



[Mesa-dev] [PATCH 10/19] glx: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Cc: "12.0 13.0" 
Signed-off-by: Emil Velikov 
---
 src/glx/Makefile.am | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am
index 5884e33f80..79d416abca 100644
--- a/src/glx/Makefile.am
+++ b/src/glx/Makefile.am
@@ -37,10 +37,10 @@ AM_CFLAGS = \
-I$(top_srcdir)/include/GL/internal \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/loader \
-   -I$(top_srcdir)/src/mapi \
-   -I$(top_srcdir)/src/mapi/glapi \
-I$(top_builddir)/src/mapi \
+   -I$(top_srcdir)/src/mapi \
-I$(top_builddir)/src/mapi/glapi \
+   -I$(top_srcdir)/src/mapi/glapi \
$(VISIBILITY_CFLAGS) \
$(SHARED_GLAPI_CFLAGS) \
$(EXTRA_DEFINES_XF86VIDMODE) \
-- 
2.11.0



[Mesa-dev] [PATCH 18/19] mesa/tests: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Cc: "12.0 13.0" 
Signed-off-by: Emil Velikov 
---
 src/mesa/main/tests/Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/tests/Makefile.am b/src/mesa/main/tests/Makefile.am
index d6977e20e8..18f750e4d4 100644
--- a/src/mesa/main/tests/Makefile.am
+++ b/src/mesa/main/tests/Makefile.am
@@ -4,8 +4,8 @@ AM_CPPFLAGS = \
-I$(top_srcdir)/src/gtest/include \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/mapi \
-   -I$(top_srcdir)/src/mesa \
-I$(top_builddir)/src/mesa \
+   -I$(top_srcdir)/src/mesa \
-I$(top_srcdir)/include \
$(DEFINES) $(INCLUDE_DIRS)
 
-- 
2.11.0



[Mesa-dev] [PATCH 19/19] i915, i965: automake: remove NA include directive

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

The path in question (... dri/intel/server) was removed years ago.

Signed-off-by: Emil Velikov 
---
 src/mesa/drivers/dri/i915/Makefile.am | 1 -
 src/mesa/drivers/dri/i965/Makefile.am | 1 -
 2 files changed, 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i915/Makefile.am 
b/src/mesa/drivers/dri/i915/Makefile.am
index 11b7341c73..e85fb9d548 100644
--- a/src/mesa/drivers/dri/i915/Makefile.am
+++ b/src/mesa/drivers/dri/i915/Makefile.am
@@ -32,7 +32,6 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_builddir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-   -I$(top_srcdir)/src/mesa/drivers/dri/intel/server \
$(DEFINES) \
$(VISIBILITY_CFLAGS) \
$(INTEL_CFLAGS)
diff --git a/src/mesa/drivers/dri/i965/Makefile.am 
b/src/mesa/drivers/dri/i965/Makefile.am
index 1fb7485ffa..b87e19a4a8 100644
--- a/src/mesa/drivers/dri/i965/Makefile.am
+++ b/src/mesa/drivers/dri/i965/Makefile.am
@@ -32,7 +32,6 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_builddir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-   -I$(top_srcdir)/src/mesa/drivers/dri/intel/server \
-I$(top_srcdir)/src/gtest/include \
-I$(top_builddir)/src/compiler/glsl \
-I$(top_builddir)/src/compiler/nir \
-- 
2.11.0



[Mesa-dev] [PATCH 14/19] mapi: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Cc: "12.0 13.0" 
Signed-off-by: Emil Velikov 
---
 src/mapi/Makefile.am | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mapi/Makefile.am b/src/mapi/Makefile.am
index 5013e9af5e..7ebe14f520 100644
--- a/src/mapi/Makefile.am
+++ b/src/mapi/Makefile.am
@@ -46,8 +46,8 @@ AM_CPPFLAGS = 
\
$(SELINUX_CFLAGS)   \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src \
-   -I$(top_srcdir)/src/mapi\
-   -I$(top_builddir)/src/mapi
+   -I$(top_builddir)/src/mapi  \
+   -I$(top_srcdir)/src/mapi
 
 include Makefile.sources
 
-- 
2.11.0



[Mesa-dev] [PATCH 17/19] dri/osmesa: automake: include builddir prior to srcdir

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Cc: "12.0 13.0" 
Signed-off-by: Emil Velikov 
---
 src/mesa/drivers/osmesa/Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/osmesa/Makefile.am 
b/src/mesa/drivers/osmesa/Makefile.am
index 5525687c5b..2c8d4668b1 100644
--- a/src/mesa/drivers/osmesa/Makefile.am
+++ b/src/mesa/drivers/osmesa/Makefile.am
@@ -28,8 +28,8 @@ AM_CPPFLAGS = \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
-   -I$(top_srcdir)/src/mapi \
-I$(top_builddir)/src/mapi \
+   -I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa/ \
$(DEFINES)
 AM_CFLAGS = $(PTHREAD_CFLAGS) \
-- 
2.11.0



Re: [Mesa-dev] [PATCH] radeonsi: make fix_fetch 64-bit

2017-01-16 Thread Nicolai Hähnle

On 16.01.2017 15:04, Marek Olšák wrote:

On Mon, Jan 16, 2017 at 3:00 PM, Marek Olšák  wrote:

From: Marek Olšák 

v2: add u_bit_consecutive64
---
 src/gallium/drivers/radeonsi/si_shader.c| 4 ++--
 src/gallium/drivers/radeonsi/si_shader.h| 4 ++--
 src/gallium/drivers/radeonsi/si_state.c | 6 +++---
 src/gallium/drivers/radeonsi/si_state.h | 2 +-
 src/gallium/drivers/radeonsi/si_state_shaders.c | 2 +-
 src/util/bitscan.h  | 9 +
 6 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 6f0f414..dfba9d4 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -426,21 +426,21 @@ static void declare_input_vs(
"llvm.SI.vs.load.input", ctx->v4f32, args, 3,
LP_FUNC_ATTR_READNONE);

/* Break up the vec4 into individual components */
for (chan = 0; chan < 4; chan++) {
LLVMValueRef llvm_chan = lp_build_const_int32(gallivm, chan);
out[chan] = LLVMBuildExtractElement(gallivm->builder,
input, llvm_chan, "");
}

-   fix_fetch = (ctx->shader->key.mono.vs.fix_fetch >> (2 * input_index)) & 
3;
+   fix_fetch = (ctx->shader->key.mono.vs.fix_fetch >> (4 * input_index)) & 
0xf;
if (fix_fetch) {
/* The hardware returns an unsigned value; convert it to a
 * signed one.
 */
LLVMValueRef tmp = out[3];
LLVMValueRef c30 = LLVMConstInt(ctx->i32, 30, 0);

/* First, recover the sign-extended signed integer value. */
if (fix_fetch == SI_FIX_FETCH_A2_SSCALED)
tmp = LLVMBuildFPToUI(gallivm->builder, tmp, ctx->i32, 
"");
@@ -6578,21 +6578,21 @@ static void si_dump_shader_key(unsigned shader, struct 
si_shader_key *key,
switch (shader) {
case PIPE_SHADER_VERTEX:
fprintf(f, "  part.vs.prolog.instance_divisors = {");
for (i = 0; i < 
ARRAY_SIZE(key->part.vs.prolog.instance_divisors); i++)
fprintf(f, !i ? "%u" : ", %u",
key->part.vs.prolog.instance_divisors[i]);
fprintf(f, "}\n");
fprintf(f, "  part.vs.epilog.export_prim_id = %u\n", 
key->part.vs.epilog.export_prim_id);
fprintf(f, "  as_es = %u\n", key->as_es);
fprintf(f, "  as_ls = %u\n", key->as_ls);
-   fprintf(f, "  mono.vs.fix_fetch = 0x%x\n", 
key->mono.vs.fix_fetch);
+   fprintf(f, "  mono.vs.fix_fetch = 0x%"PRIx64"\n", 
key->mono.vs.fix_fetch);
break;

case PIPE_SHADER_TESS_CTRL:
fprintf(f, "  part.tcs.epilog.prim_mode = %u\n", 
key->part.tcs.epilog.prim_mode);
fprintf(f, "  mono.tcs.inputs_to_copy = 0x%"PRIx64"\n", 
key->mono.tcs.inputs_to_copy);
break;

case PIPE_SHADER_TESS_EVAL:
fprintf(f, "  part.tes.epilog.export_prim_id = %u\n", 
key->part.tes.epilog.export_prim_id);
fprintf(f, "  as_es = %u\n", key->as_es);
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index 1b5dec2..89f9628 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -418,22 +418,22 @@ struct si_shader_key {

/* These two are initially set according to the NEXT_SHADER property,
 * or guessed if the property doesn't seem correct.
 */
unsigned as_es:1; /* export shader */
unsigned as_ls:1; /* local shader */

/* Flags for monolithic compilation only. */
union {
struct {
-   /* One pair of bits for every input: SI_FIX_FETCH_* 
enums. */
-   uint32_tfix_fetch;
+   /* One nibble for every input: SI_FIX_FETCH_* enums. */
+   uint64_tfix_fetch;
} vs;
struct {
uint64_tinputs_to_copy; /* for fixed-func TCS */
} tcs;
} mono;

/* Optimization flags for asynchronous compilation only. */
union {
struct {
uint64_tkill_outputs; /* "get_unique_index" 
bits */
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 6e7d8da..fa78a56 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3356,26 +3356,26 @@ static void *si_create_vertex_elements(struct 
pipe_context *ctx,
   
S_008F0C_DST_SEL_W(si_map_swizzle(desc->swizzle[3])) |
   S_008F0C_NUM_FORMAT(num_format) |

Re: [Mesa-dev] [PATCH 1/3] gallium: add TGSI_PROPERTY_MUL_ZERO_WINS

2017-01-16 Thread Nicolai Hähnle
I guess doing this makes sense even while the GL extension discussion is 
stalled because of these wacky hardware differences.


Reviewed-by: Nicolai Hähnle 

On 15.01.2017 19:36, Ilia Mirkin wrote:

This will be useful for proper D3D9 emulation, where this behavior is
expected by some shaders.

Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/tgsi/tgsi_strings.c  |  3 ++-
 src/gallium/docs/source/tgsi.rst   | 14 --
 src/gallium/include/pipe/p_shader_tokens.h |  1 +
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c 
b/src/gallium/auxiliary/tgsi/tgsi_strings.c
index 536a4c8..cebc1b4 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
@@ -148,7 +148,8 @@ const char *tgsi_property_names[TGSI_PROPERTY_COUNT] =
"NEXT_SHADER",
"CS_FIXED_BLOCK_WIDTH",
"CS_FIXED_BLOCK_HEIGHT",
-   "CS_FIXED_BLOCK_DEPTH"
+   "CS_FIXED_BLOCK_DEPTH",
+   "MUL_ZERO_WINS",
 };

 const char *tgsi_return_type_names[TGSI_RETURN_TYPE_COUNT] =
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 4d7ec90..4e71ea6 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -3538,13 +3538,23 @@ Which shader stage will MOST LIKELY follow after this 
shader when the shader
 is bound. This is only a hint to the driver and doesn't have to be precise.
 Only set for VS and TES.

-TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH / HEIGHT / DEPTH
-"""
+CS_FIXED_BLOCK_WIDTH / HEIGHT / DEPTH
+"

 Threads per block in each dimension, if known at compile time. If the block 
size
 is known all three should be at least 1. If it is unknown they should all be 
set
 to 0 or not set.

+MUL_ZERO_WINS
+"
+
+The MUL TGSI operation (FP32 multiplication) will return 0 if either
+of the operands are equal to 0. That means that 0 * Inf = 0. This
+should be set the same way for an entire pipeline. If there is a
+mismatch between shaders, then it is unspecified whether this behavior
+will be enabled.
+
+
 Texture Sampling and Texture Formats
 

diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index f9b658d..27f842c 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -290,6 +290,7 @@ enum tgsi_property_name {
TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH,
TGSI_PROPERTY_CS_FIXED_BLOCK_HEIGHT,
TGSI_PROPERTY_CS_FIXED_BLOCK_DEPTH,
+   TGSI_PROPERTY_MUL_ZERO_WINS,
TGSI_PROPERTY_COUNT,
 };
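
The semantics described in the tgsi.rst hunk above can be sketched in
plain C. This is only an illustration of the documented behavior, not
code from the patch; mul_zero_wins() and mul_ieee() are hypothetical
helper names, and a real driver would use a hardware multiply mode
rather than a branch.

```c
#include <math.h>   /* only the INFINITY macro is used; no libm calls */

/* Hypothetical helper, not part of Mesa: FP32 multiply where zero wins. */
static float mul_zero_wins(float a, float b)
{
    /* If either operand compares equal to zero (+0.0 or -0.0), the result
     * is 0 regardless of the other operand, so 0 * Inf gives 0. */
    if (a == 0.0f || b == 0.0f)
        return 0.0f;
    return a * b;
}

/* Plain IEEE 754 multiply for comparison: 0 * Inf is NaN here. */
static float mul_ieee(float a, float b)
{
    return a * b;
}
```

With these helpers, mul_zero_wins(0.0f, INFINITY) is 0, while
mul_ieee(0.0f, INFINITY) is NaN.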





Re: [Mesa-dev] [PATCH 1/3] gallium: add TGSI_PROPERTY_MUL_ZERO_WINS

2017-01-16 Thread Ilia Mirkin
Yeah, Axel also asked for a cap. I tend to agree. I just didn't want
to have two outstanding changes to add caps, since they'd conflict
with each other. (My advanced blend series also adds a cap for
FBFETCH.) Once that lands, I can resend adding a cap for this.

On Mon, Jan 16, 2017 at 10:51 AM, Roland Scheidegger  wrote:
> I think you'd also want a cap bit - I don't think it's reasonable to
> expect all drivers to implement this (e.g. I really don't feel like
> doing that for llvmpipe, since there is obviously no way to do it
> natively), and I think it's better for it to be the fault of the st and
> not the driver if the bit isn't supported and no one feels like doing
> workarounds...
>
> Roland
>
> Am 15.01.2017 um 19:36 schrieb Ilia Mirkin:
>> This will be useful for proper D3D9 emulation, where this behavior is
>> expected by some shaders.
>>
>> Signed-off-by: Ilia Mirkin 
>> ---
>>  src/gallium/auxiliary/tgsi/tgsi_strings.c  |  3 ++-
>>  src/gallium/docs/source/tgsi.rst   | 14 --
>>  src/gallium/include/pipe/p_shader_tokens.h |  1 +
>>  3 files changed, 15 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c 
>> b/src/gallium/auxiliary/tgsi/tgsi_strings.c
>> index 536a4c8..cebc1b4 100644
>> --- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
>> +++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
>> @@ -148,7 +148,8 @@ const char *tgsi_property_names[TGSI_PROPERTY_COUNT] =
>> "NEXT_SHADER",
>> "CS_FIXED_BLOCK_WIDTH",
>> "CS_FIXED_BLOCK_HEIGHT",
>> -   "CS_FIXED_BLOCK_DEPTH"
>> +   "CS_FIXED_BLOCK_DEPTH",
>> +   "MUL_ZERO_WINS",
>>  };
>>
>>  const char *tgsi_return_type_names[TGSI_RETURN_TYPE_COUNT] =
>> diff --git a/src/gallium/docs/source/tgsi.rst 
>> b/src/gallium/docs/source/tgsi.rst
>> index 4d7ec90..4e71ea6 100644
>> --- a/src/gallium/docs/source/tgsi.rst
>> +++ b/src/gallium/docs/source/tgsi.rst
>> @@ -3538,13 +3538,23 @@ Which shader stage will MOST LIKELY follow after 
>> this shader when the shader
>>  is bound. This is only a hint to the driver and doesn't have to be precise.
>>  Only set for VS and TES.
>>
>> -TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH / HEIGHT / DEPTH
>> -"""
>> +CS_FIXED_BLOCK_WIDTH / HEIGHT / DEPTH
>> +"
>>
>>  Threads per block in each dimension, if known at compile time. If the block 
>> size
>>  is known all three should be at least 1. If it is unknown they should all 
>> be set
>>  to 0 or not set.
>>
>> +MUL_ZERO_WINS
>> +"
>> +
>> +The MUL TGSI operation (FP32 multiplication) will return 0 if either
>> +of the operands are equal to 0. That means that 0 * Inf = 0. This
>> +should be set the same way for an entire pipeline. If there is a
>> +mismatch between shaders, then it is unspecified whether this behavior
>> +will be enabled.
>> +
>> +
>>  Texture Sampling and Texture Formats
>>  
>>
>> diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
>> b/src/gallium/include/pipe/p_shader_tokens.h
>> index f9b658d..27f842c 100644
>> --- a/src/gallium/include/pipe/p_shader_tokens.h
>> +++ b/src/gallium/include/pipe/p_shader_tokens.h
>> @@ -290,6 +290,7 @@ enum tgsi_property_name {
>> TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH,
>> TGSI_PROPERTY_CS_FIXED_BLOCK_HEIGHT,
>> TGSI_PROPERTY_CS_FIXED_BLOCK_DEPTH,
>> +   TGSI_PROPERTY_MUL_ZERO_WINS,
>> TGSI_PROPERTY_COUNT,
>>  };
>>
>>
>


Re: [Mesa-dev] [PATCH 2/2] configure: remove unused AC_SUBST variables

2017-01-16 Thread Nicolai Hähnle

Not a build system guru, but both patches make sense to me, so:

Reviewed-by: Nicolai Hähnle 

On 08.12.2016 18:58, Emil Velikov wrote:

From: Emil Velikov 

Signed-off-by: Emil Velikov 
---
 configure.ac | 10 --
 1 file changed, 10 deletions(-)

diff --git a/configure.ac b/configure.ac
index 2007098..b0bce9d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2068,7 +2068,6 @@ AC_SUBST([GBM_PC_LIB_PRIV])
 dnl
 dnl EGL configuration
 dnl
-EGL_CLIENT_APIS=""

 if test "x$enable_egl" = xyes; then
 EGL_LIB_DEPS="$DLOPEN_LIBS $SELINUX_LIBS $PTHREAD_LIBS"
@@ -2295,15 +2294,6 @@ dnl Gallium configuration
 dnl
 AM_CONDITIONAL(HAVE_GALLIUM, test -n "$with_gallium_drivers")

-case "x$enable_opengl$enable_gles1$enable_gles2" in
-x*yes*)
-EGL_CLIENT_APIS="$EGL_CLIENT_APIS "'$(GL_LIB)'
-;;
-esac
-
-AC_SUBST([VG_LIB_DEPS])
-AC_SUBST([EGL_CLIENT_APIS])
-
 # libEGL wants to default to the first platform specified in
 # ./configure.  parse that here.
 if test "x$platforms" != "x"; then




[Mesa-dev] [PATCH] radeonsi: fix R600_DEBUG=nooptvariant

2017-01-16 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 9967837..9d30b90 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -1129,7 +1129,7 @@ static int si_shader_select_with_key(struct si_screen 
*sscreen,
struct si_shader *current = state->current;
struct si_shader *iter, *shader = NULL;
 
-   if (unlikely(sscreen->b.chip_class & DBG_NO_OPT_VARIANT)) {
+   if (unlikely(sscreen->b.debug_flags & DBG_NO_OPT_VARIANT)) {
memset(&key->opt, 0, sizeof(key->opt));
}
 
-- 
2.7.4



Re: [Mesa-dev] [PATCH 1/3] gallium: add TGSI_PROPERTY_MUL_ZERO_WINS

2017-01-16 Thread Roland Scheidegger
I think you'd also want a cap bit - I don't think it's reasonable to
expect all drivers to implement this (e.g. I really don't feel like
doing that for llvmpipe, since there is obviously no way to do it
natively), and I think it's better for it to be the fault of the st and
not the driver if the bit isn't supported and no one feels like doing
workarounds...

Roland

Am 15.01.2017 um 19:36 schrieb Ilia Mirkin:
> This will be useful for proper D3D9 emulation, where this behavior is
> expected by some shaders.
> 
> Signed-off-by: Ilia Mirkin 
> ---
>  src/gallium/auxiliary/tgsi/tgsi_strings.c  |  3 ++-
>  src/gallium/docs/source/tgsi.rst   | 14 --
>  src/gallium/include/pipe/p_shader_tokens.h |  1 +
>  3 files changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c 
> b/src/gallium/auxiliary/tgsi/tgsi_strings.c
> index 536a4c8..cebc1b4 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
> @@ -148,7 +148,8 @@ const char *tgsi_property_names[TGSI_PROPERTY_COUNT] =
> "NEXT_SHADER",
> "CS_FIXED_BLOCK_WIDTH",
> "CS_FIXED_BLOCK_HEIGHT",
> -   "CS_FIXED_BLOCK_DEPTH"
> +   "CS_FIXED_BLOCK_DEPTH",
> +   "MUL_ZERO_WINS",
>  };
>  
>  const char *tgsi_return_type_names[TGSI_RETURN_TYPE_COUNT] =
> diff --git a/src/gallium/docs/source/tgsi.rst 
> b/src/gallium/docs/source/tgsi.rst
> index 4d7ec90..4e71ea6 100644
> --- a/src/gallium/docs/source/tgsi.rst
> +++ b/src/gallium/docs/source/tgsi.rst
> @@ -3538,13 +3538,23 @@ Which shader stage will MOST LIKELY follow after this 
> shader when the shader
>  is bound. This is only a hint to the driver and doesn't have to be precise.
>  Only set for VS and TES.
>  
> -TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH / HEIGHT / DEPTH
> -"""
> +CS_FIXED_BLOCK_WIDTH / HEIGHT / DEPTH
> +"
>  
>  Threads per block in each dimension, if known at compile time. If the block 
> size
>  is known all three should be at least 1. If it is unknown they should all be 
> set
>  to 0 or not set.
>  
> +MUL_ZERO_WINS
> +"
> +
> +The MUL TGSI operation (FP32 multiplication) will return 0 if either
> +of the operands are equal to 0. That means that 0 * Inf = 0. This
> +should be set the same way for an entire pipeline. If there is a
> +mismatch between shaders, then it is unspecified whether this behavior
> +will be enabled.
> +
> +
>  Texture Sampling and Texture Formats
>  
>  
> diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
> b/src/gallium/include/pipe/p_shader_tokens.h
> index f9b658d..27f842c 100644
> --- a/src/gallium/include/pipe/p_shader_tokens.h
> +++ b/src/gallium/include/pipe/p_shader_tokens.h
> @@ -290,6 +290,7 @@ enum tgsi_property_name {
> TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH,
> TGSI_PROPERTY_CS_FIXED_BLOCK_HEIGHT,
> TGSI_PROPERTY_CS_FIXED_BLOCK_DEPTH,
> +   TGSI_PROPERTY_MUL_ZERO_WINS,
> TGSI_PROPERTY_COUNT,
>  };
>  
> 



Re: [Mesa-dev] [PATCH] nv50, nvc0: disable depth offsets when there is no depth buffer

2017-01-16 Thread Roland Scheidegger
I'm pretty sure it's undefined in GL (because there's no defined minimum
resolvable difference in the depth buffer format). IIRC this is stated
somewhere, but I can't remember where.
d3d10 has a definition which doesn't make much sense, as it still says
to use the unorm formula in this case, for which "r is the minimum
representable value > 0 in the depth-buffer format converted to
float32":
https://msdn.microsoft.com/en-us/library/windows/desktop/cc308048(v=vs.85).aspx
The scale part would still work fine, though.
I'm not sure it's really the driver's job to hack around these, though...

Roland




Am 15.01.2017 um 21:52 schrieb Ilia Mirkin:
> While I can find no support for this in the GL spec, this is apparently
> what D3D9 wants. Also appears to fix a very long-standing bug in Tomb
> Raider: Underworld and Deus Ex: Human Revolution (probably based on the
> same engines).
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91247
> References: https://github.com/iXit/Mesa-3D/issues/224
> Signed-off-by: Ilia Mirkin 
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_state.c  |  4 
>  src/gallium/drivers/nouveau/nv50/nv50_state_validate.c | 17 +
>  src/gallium/drivers/nouveau/nv50/nv50_stateobj.h   |  2 +-
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c  |  4 
>  src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c |  5 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h   |  2 +-
>  6 files changed, 24 insertions(+), 10 deletions(-)
> 
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> index 99d70d1..e66257a 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> @@ -301,10 +301,6 @@ nv50_rasterizer_state_create(struct pipe_context *pipe,
>  
> SB_BEGIN_3D(so, POLYGON_STIPPLE_ENABLE, 1);
> SB_DATA(so, cso->poly_stipple_enable);
> -   SB_BEGIN_3D(so, POLYGON_OFFSET_POINT_ENABLE, 3);
> -   SB_DATA(so, cso->offset_point);
> -   SB_DATA(so, cso->offset_line);
> -   SB_DATA(so, cso->offset_tri);
>  
> if (cso->offset_point || cso->offset_line || cso->offset_tri) {
>SB_BEGIN_3D(so, POLYGON_OFFSET_FACTOR, 1);
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
> index c6f0363..0db13d9 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
> @@ -347,6 +347,22 @@ nv50_validate_derived_2(struct nv50_context *nv50)
>  }
>  
>  static void
> +nv50_validate_rast_fb(struct nv50_context *nv50)
> +{
> +   struct nouveau_pushbuf *push = nv50->base.pushbuf;
> +   struct pipe_framebuffer_state *fb = &nv50->framebuffer;
> +   struct pipe_rasterizer_state *rast = &nv50->rast->pipe;
> +
> +   if (!rast)
> +  return;
> +
> +   BEGIN_NV04(push, NV50_3D(POLYGON_OFFSET_POINT_ENABLE), 3);
> +   PUSH_DATA (push, rast->offset_point * !!fb->zsbuf);
> +   PUSH_DATA (push, rast->offset_line * !!fb->zsbuf);
> +   PUSH_DATA (push, rast->offset_tri * !!fb->zsbuf);
> +}
> +
> +static void
>  nv50_validate_clip(struct nv50_context *nv50)
>  {
> struct nouveau_pushbuf *push = nv50->base.pushbuf;
> @@ -515,6 +531,7 @@ validate_list_3d[] = {
>  { nv50_validate_derived_rs,NV50_NEW_3D_FRAGPROG | 
> NV50_NEW_3D_RASTERIZER |
> NV50_NEW_3D_VERTPROG | 
> NV50_NEW_3D_GMTYPROG },
>  { nv50_validate_derived_2, NV50_NEW_3D_ZSA | NV50_NEW_3D_FRAMEBUFFER 
> },
> +{ nv50_validate_rast_fb,   NV50_NEW_3D_RASTERIZER | 
> NV50_NEW_3D_FRAMEBUFFER },
>  { nv50_validate_clip,  NV50_NEW_3D_CLIP | NV50_NEW_3D_RASTERIZER 
> |
> NV50_NEW_3D_VERTPROG | 
> NV50_NEW_3D_GMTYPROG },
>  { nv50_constbufs_validate, NV50_NEW_3D_CONSTBUF },
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_stateobj.h 
> b/src/gallium/drivers/nouveau/nv50/nv50_stateobj.h
> index 579da9a..a5af115 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_stateobj.h
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_stateobj.h
> @@ -25,7 +25,7 @@ struct nv50_blend_stateobj {
>  struct nv50_rasterizer_stateobj {
> struct pipe_rasterizer_state pipe;
> int size;
> -   uint32_t state[49];
> +   uint32_t state[45];
>  };
>  
>  struct nv50_zsa_stateobj {
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> index bba3

[Mesa-dev] [PATCH 1/4] glsl: split DIV_TO_MUL_RCP into single- and double-precision flags

2017-01-16 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/compiler/glsl/ir_optimization.h  |  4 +++-
 src/compiler/glsl/lower_instructions.cpp | 19 +++
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index 0d6c4e6..01e5270 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -30,7 +30,7 @@
 
 /* Operations for lower_instructions() */
 #define SUB_TO_ADD_NEG 0x01
-#define DIV_TO_MUL_RCP 0x02
+#define FDIV_TO_MUL_RCP0x02
 #define EXP_TO_EXP20x04
 #define POW_TO_EXP20x08
 #define LOG_TO_LOG20x10
@@ -49,6 +49,8 @@
 #define FIND_LSB_TO_FLOAT_CAST0x2
 #define FIND_MSB_TO_FLOAT_CAST0x4
 #define IMUL_HIGH_TO_MUL  0x8
+#define DDIV_TO_MUL_RCP   0x10
+#define DIV_TO_MUL_RCP(FDIV_TO_MUL_RCP | DDIV_TO_MUL_RCP)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_instructions.cpp 
b/src/compiler/glsl/lower_instructions.cpp
index 9fc83d1..729cb13 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -54,8 +54,8 @@
  * want to recognize add(op0, neg(op1)) or the other way around to
  * produce a subtract anyway.
  *
- * DIV_TO_MUL_RCP and INT_DIV_TO_MUL_RCP:
- * --
+ * FDIV_TO_MUL_RCP, DDIV_TO_MUL_RCP, and INT_DIV_TO_MUL_RCP:
+ * -
  * Breaks an ir_binop_div expression down to op0 * (rcp(op1)).
  *
  * Many GPUs don't have a divide instruction (945 and 965 included),
@@ -63,9 +63,11 @@
  * reciprocal.  By breaking the operation down, constant reciprocals
  * can get constant folded.
  *
- * DIV_TO_MUL_RCP only lowers floating point division; INT_DIV_TO_MUL_RCP
- * handles the integer case, converting to and from floating point so that
- * RCP is possible.
+ * FDIV_TO_MUL_RCP only lowers single-precision floating point division;
+ * DDIV_TO_MUL_RCP only lowers double-precision floating point division.
+ * DIV_TO_MUL_RCP is a convenience macro that sets both flags.
+ * INT_DIV_TO_MUL_RCP handles the integer case, converting to and from floating
+ * point so that RCP is possible.
  *
  * EXP_TO_EXP2 and LOG_TO_LOG2:
  * 
@@ -326,7 +328,8 @@ lower_instructions_visitor::mod_to_floor(ir_expression *ir)
/* Don't generate new IR that would need to be lowered in an additional
 * pass.
 */
-   if (lowering(DIV_TO_MUL_RCP) && (ir->type->is_float() || 
ir->type->is_double()))
+   if ((lowering(FDIV_TO_MUL_RCP) && ir->type->is_float()) ||
+   (lowering(DDIV_TO_MUL_RCP) && ir->type->is_double()))
   div_to_mul_rcp(div_expr);
 
ir_expression *const floor_expr =
@@ -1599,8 +1602,8 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
case ir_binop_div:
   if (ir->operands[1]->type->is_integer() && lowering(INT_DIV_TO_MUL_RCP))
 int_div_to_mul_rcp(ir);
-  else if ((ir->operands[1]->type->is_float() ||
-ir->operands[1]->type->is_double()) && 
lowering(DIV_TO_MUL_RCP))
+  else if ((ir->operands[1]->type->is_float() && 
lowering(FDIV_TO_MUL_RCP)) ||
+   (ir->operands[1]->type->is_double() && 
lowering(DDIV_TO_MUL_RCP)))
 div_to_mul_rcp(ir);
   break;
 
-- 
2.7.4
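
As a rough illustration of what this split buys, the per-type check in
visit_leave() can be modeled in standalone C. This is a sketch under
assumptions: the SKETCH_* names, the bit values and should_lower_div()
are made up for the example and are not the definitions in
ir_optimization.h.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real flags live in ir_optimization.h. */
enum {
    SKETCH_FDIV_TO_MUL_RCP = 0x1,
    SKETCH_DDIV_TO_MUL_RCP = 0x2,
    /* Convenience value that sets both flags, like DIV_TO_MUL_RCP. */
    SKETCH_DIV_TO_MUL_RCP  = SKETCH_FDIV_TO_MUL_RCP | SKETCH_DDIV_TO_MUL_RCP
};

/* Hypothetical stand-in for the operand base type. */
enum sketch_base_type { SKETCH_FLOAT, SKETCH_DOUBLE, SKETCH_INT };

/* Returns true if a division of the given type should become op0 * rcp(op1). */
static bool should_lower_div(unsigned lower, enum sketch_base_type type)
{
    return (type == SKETCH_FLOAT  && (lower & SKETCH_FDIV_TO_MUL_RCP)) ||
           (type == SKETCH_DOUBLE && (lower & SKETCH_DDIV_TO_MUL_RCP));
}
```

A caller that passes only SKETCH_FDIV_TO_MUL_RCP keeps double-precision
divisions intact, which is the case the rest of this series relies on.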



[Mesa-dev] [PATCH 0/4] radeonsi: add DDIV for double division instead of RCP+MUL

2017-01-16 Thread Nicolai Hähnle
Hi all,

This series fixes one of the last remaining CTS failures for radeonsi,
GL45-CTS.gpu_shader_fp64.built_in_functions.

Specifically, that test checks that mod(13.375, 13.375) == 0.0. As part of
the lowering of modulo, we compute 13.375 / 13.375, which is of course 1.0.
Unfortunately, when each of the steps in 13.375 * (1.0 / 13.375) is
computed with double precision and rounded to nearest, the result is the
largest number strictly smaller than 1.0. The floor of that is 0.0, so
mod returns 13.375 instead of 0.0, which is a very wrong result.

With this series, we keep the division as a division, and let LLVM produce
the correct code for it. The resulting shader code is actually faster
than the RCP+MUL sequence, at least as long as the denominator isn't
re-used (not that anybody really cares about the performance of double
precision in OpenGL, but whatever). Please review!

Thanks,
Nicolai
--
 src/compiler/glsl/ir_optimization.h  |  4 +++-
 src/compiler/glsl/lower_instructions.cpp | 19 ++
 .../auxiliary/gallivm/lp_bld_tgsi_action.c   |  2 ++
 src/gallium/auxiliary/tgsi/tgsi_info.c   |  2 ++
 src/gallium/docs/source/screen.rst   |  2 ++
 src/gallium/docs/source/tgsi.rst |  9 +
 .../drivers/freedreno/freedreno_screen.c |  3 ++-
 src/gallium/drivers/i915/i915_screen.c   |  1 +
 src/gallium/drivers/ilo/ilo_screen.c |  1 +
 src/gallium/drivers/llvmpipe/lp_screen.c |  1 +
 .../drivers/nouveau/nv30/nv30_screen.c   |  1 +
 .../drivers/nouveau/nv50/nv50_screen.c   |  1 +
 .../drivers/nouveau/nvc0/nvc0_screen.c   |  1 +
 src/gallium/drivers/r300/r300_screen.c   |  1 +
 src/gallium/drivers/r600/r600_pipe.c |  3 ++-
 src/gallium/drivers/radeonsi/si_pipe.c   |  3 ++-
 src/gallium/drivers/softpipe/sp_screen.c |  1 +
 src/gallium/drivers/svga/svga_screen.c   |  1 +
 src/gallium/drivers/swr/swr_screen.cpp   |  1 +
 src/gallium/drivers/vc4/vc4_screen.c |  1 +
 src/gallium/drivers/virgl/virgl_screen.c |  1 +
 src/gallium/include/pipe/p_defines.h |  1 +
 src/gallium/include/pipe/p_shader_tokens.h   |  5 -
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp   | 11 +-
 24 files changed, 57 insertions(+), 19 deletions(-)
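
The two lowerings of mod discussed above can be compared side by side
with a small standalone C sketch. It assumes mod is lowered as
x - y * floor(x / y); the helper names are hypothetical, and
floor_small() is a simplified floor that is only valid for values that
fit in a long long (used here to avoid depending on libm).

```c
/* Simplified floor() for values representable as long long. */
static double floor_small(double v)
{
    long long t = (long long)v;          /* truncates toward zero */
    return ((double)t > v) ? (double)(t - 1) : (double)t;
}

/* mod(x, y) = x - y * floor(x / y), keeping the division as a division. */
static double mod_with_div(double x, double y)
{
    return x - y * floor_small(x / y);
}

/* Same formula, but x / y is replaced by x * (1.0 / y). */
static double mod_with_rcp(double x, double y)
{
    double rcp = 1.0 / y;                /* first rounding step */
    return x - y * floor_small(x * rcp); /* second rounding step */
}
```

mod_with_div(13.375, 13.375) is exactly 0.0, because x / x is exactly
1.0 in IEEE arithmetic. Going by the description above, mod_with_rcp()
with the same arguments is expected to give 13.375 when each step is
rounded to nearest; the sketch does not assert that.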



[Mesa-dev] [PATCH 2/4] tgsi: add DDIV instruction

2017-01-16 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Double-precision division, to allow more precision than a DRCP + DMUL
sequence.
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 2 ++
 src/gallium/auxiliary/tgsi/tgsi_info.c | 2 ++
 src/gallium/docs/source/tgsi.rst   | 9 +
 src/gallium/include/pipe/p_shader_tokens.h | 5 -
 4 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
index 91e959f..937170f 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
@@ -1355,6 +1355,7 @@ lp_set_default_actions(struct lp_build_tgsi_context * 
bld_base)
bld_base->op_actions[TGSI_OPCODE_DMAX].emit = fmax_emit;
bld_base->op_actions[TGSI_OPCODE_DMIN].emit = fmin_emit;
bld_base->op_actions[TGSI_OPCODE_DMUL].emit = mul_emit;
+   bld_base->op_actions[TGSI_OPCODE_DDIV].emit = fdiv_emit;
 
bld_base->op_actions[TGSI_OPCODE_D2F].emit = d2f_emit;
bld_base->op_actions[TGSI_OPCODE_D2I].emit = d2i_emit;
@@ -2623,6 +2624,7 @@ lp_set_default_actions_cpu(
bld_base->op_actions[TGSI_OPCODE_DSLT].emit = dslt_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_DSNE].emit = dsne_emit_cpu;
 
+   bld_base->op_actions[TGSI_OPCODE_DDIV].emit = div_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_DRSQ].emit = drecip_sqrt_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_DSQRT].emit = dsqrt_emit_cpu;
 
diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c 
b/src/gallium/auxiliary/tgsi/tgsi_info.c
index a339ec2..3bec561 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -287,6 +287,7 @@ static const struct tgsi_opcode_info 
opcode_info[TGSI_OPCODE_LAST] =
{ 1, 2, 0, 0, 0, 0, 0, COMP, "U64DIV", TGSI_OPCODE_U64DIV },
{ 1, 2, 0, 0, 0, 0, 0, COMP, "I64MOD", TGSI_OPCODE_I64MOD },
{ 1, 2, 0, 0, 0, 0, 0, COMP, "U64MOD", TGSI_OPCODE_U64MOD },
+   { 1, 2, 0, 0, 0, 0, 0, COMP, "DDIV", TGSI_OPCODE_DDIV },
 };
 
 const struct tgsi_opcode_info *
@@ -417,6 +418,7 @@ tgsi_opcode_infer_type( uint opcode )
case TGSI_OPCODE_DNEG:
case TGSI_OPCODE_DMUL:
case TGSI_OPCODE_DMAX:
+   case TGSI_OPCODE_DDIV:
case TGSI_OPCODE_DMIN:
case TGSI_OPCODE_DRCP:
case TGSI_OPCODE_DSQRT:
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index d2d30b4..3e2d0e9 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -2005,6 +2005,15 @@ Perform a * b + c with no intermediate rounding step.
   dst.zw = src0.zw \times src1.zw + src2.zw
 
 
+.. opcode:: DDIV - Divide
+
+.. math::
+
+  dst.xy = \frac{src0.xy}{src1.xy}
+
+  dst.zw = \frac{src0.zw}{src1.zw}
+
+
 .. opcode:: DRCP - Reciprocal
 
 .. math::
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index 3384035..a867d13 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -601,7 +601,10 @@ struct tgsi_property_data {
 #define TGSI_OPCODE_U64DIV  245
 #define TGSI_OPCODE_I64MOD  246
 #define TGSI_OPCODE_U64MOD  247
-#define TGSI_OPCODE_LAST248
+
+#define TGSI_OPCODE_DDIV248
+
+#define TGSI_OPCODE_LAST249
 
 /**
  * Opcode is the operation code to execute. A given operation defines the
-- 
2.7.4
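
The dst.xy / dst.zw notation in the tgsi.rst hunk above means that each
double occupies a pair of 32-bit channels. A minimal C model of that
layout and of the DDIV result, assuming an 8-byte double; struct
sketch_reg and the helpers are hypothetical and not Mesa types:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 4 x 32-bit register; not a Mesa type. */
struct sketch_reg {
    uint32_t chan[4];                    /* x, y, z, w */
};

/* Store one double in channels xy and another in channels zw. */
static void reg_set_doubles(struct sketch_reg *r, double xy, double zw)
{
    memcpy(&r->chan[0], &xy, sizeof xy);
    memcpy(&r->chan[2], &zw, sizeof zw);
}

/* Read back the double held in pair 0 (xy) or pair 1 (zw). */
static double reg_get_double(const struct sketch_reg *r, int pair)
{
    double d;
    memcpy(&d, &r->chan[pair * 2], sizeof d);
    return d;
}

/* dst.xy = src0.xy / src1.xy; dst.zw = src0.zw / src1.zw */
static struct sketch_reg sketch_ddiv(const struct sketch_reg *src0,
                                     const struct sketch_reg *src1)
{
    struct sketch_reg dst;
    reg_set_doubles(&dst,
                    reg_get_double(src0, 0) / reg_get_double(src1, 0),
                    reg_get_double(src0, 1) / reg_get_double(src1, 1));
    return dst;
}
```

For example, dividing a register holding (6.0, 1.0) by one holding
(3.0, 4.0) leaves 2.0 in xy and 0.25 in zw.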



[Mesa-dev] [PATCH 3/4] gallium: add PIPE_CAP_TGSI_DDIV capability

2017-01-16 Thread Nicolai Hähnle
From: Nicolai Hähnle 

For drivers to indicate that they don't want double-precision divides
to be lowered into rcp+mul.
---
 src/gallium/docs/source/screen.rst   | 2 ++
 src/gallium/drivers/freedreno/freedreno_screen.c | 3 ++-
 src/gallium/drivers/i915/i915_screen.c   | 1 +
 src/gallium/drivers/ilo/ilo_screen.c | 1 +
 src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
 src/gallium/drivers/r300/r300_screen.c   | 1 +
 src/gallium/drivers/r600/r600_pipe.c | 3 ++-
 src/gallium/drivers/radeonsi/si_pipe.c   | 3 ++-
 src/gallium/drivers/softpipe/sp_screen.c | 1 +
 src/gallium/drivers/svga/svga_screen.c   | 1 +
 src/gallium/drivers/swr/swr_screen.cpp   | 1 +
 src/gallium/drivers/vc4/vc4_screen.c | 1 +
 src/gallium/drivers/virgl/virgl_screen.c | 1 +
 src/gallium/include/pipe/p_defines.h | 1 +
 17 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index 000551a..4d79104 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -369,6 +369,8 @@ The integer capabilities:
 * ``PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY``: Tell the GLSL compiler to use
   the minimum amount of optimizations just to be able to do all the linking
   and lowering.
+* ``PIPE_CAP_TGSI_DDIV``: Indicates that the TGSI DDIV instruction for double
+  division is supported and preferred over a DRCP+DMUL sequence.
 
 
 .. _pipe_capf:
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index 2ff89eb..d6fdb77 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -295,7 +295,8 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
-case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
+   case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
+   case PIPE_CAP_TGSI_DDIV:
return 0;
 
case PIPE_CAP_MAX_VIEWPORTS:
diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index 18578c0..8791105 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -297,6 +297,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
cap)
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
case PIPE_CAP_NATIVE_FENCE_FD:
case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
+   case PIPE_CAP_TGSI_DDIV:
   return 0;
 
case PIPE_CAP_MAX_VIEWPORTS:
diff --git a/src/gallium/drivers/ilo/ilo_screen.c b/src/gallium/drivers/ilo/ilo_screen.c
index 20a0e8d..916c1c9 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -520,6 +520,7 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
case PIPE_CAP_NATIVE_FENCE_FD:
case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
+   case PIPE_CAP_TGSI_DDIV:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 4501df4..3248c4d 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -341,6 +341,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
case PIPE_CAP_NATIVE_FENCE_FD:
case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
+   case PIPE_CAP_TGSI_DDIV:
   return 0;
}
/* should only get here on unhandled cases */
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index 19df068..586a85c 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -206,6 +206,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
case PIPE_CAP_NATIVE_FENCE_FD:
case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
+   case PIPE_CAP_TGSI_DDIV:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 5637001..3cf5724 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -258,6 +258,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
case PIPE_CAP_NATIVE_FENCE_FD:
case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
+   case PIPE_CAP_TGSI_DDIV:
   return 0;
 
case PIPE_CAP_

[Mesa-dev] [PATCH 4/4] st/glsl_to_tgsi: use DDIV if the driver requests it

2017-01-16 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Fixes GL45-CTS.gpu_shader_fp64.built_in_functions.
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 9356707..d1059a9 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -955,7 +955,7 @@ glsl_to_tgsi_visitor::get_opcode(unsigned op,
   case3fid(MUL, UMUL, DMUL);
   case3fid(MAD, UMAD, DMAD);
   case3fid(FMA, UMAD, DFMA);
-  case3(DIV, IDIV, UDIV);
+  case4d(DIV, IDIV, UDIV, DDIV);
   case4d(MAX, IMAX, UMAX, DMAX);
   case4d(MIN, IMIN, UMIN, DMIN);
   case2iu(MOD, UMOD);
@@ -1710,10 +1710,7 @@ glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, st_src_reg *op)
   emit_asm(ir, TGSI_OPCODE_MUL, result_dst, op[0], op[1]);
   break;
case ir_binop_div:
-  if (result_dst.type == GLSL_TYPE_FLOAT || result_dst.type == GLSL_TYPE_DOUBLE)
- assert(!"not reached: should be handled by ir_div_to_mul_rcp");
-  else
- emit_asm(ir, TGSI_OPCODE_DIV, result_dst, op[0], op[1]);
+  emit_asm(ir, TGSI_OPCODE_DIV, result_dst, op[0], op[1]);
   break;
case ir_binop_mod:
   if (result_dst.type == GLSL_TYPE_FLOAT)
@@ -6904,7 +6901,9 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
   do_mat_op_to_vec(ir);
   lower_instructions(ir,
  MOD_TO_FLOOR |
- DIV_TO_MUL_RCP |
+ FDIV_TO_MUL_RCP |
+ (pscreen->get_param(pscreen, PIPE_CAP_TGSI_DDIV)
+  ? 0 : DDIV_TO_MUL_RCP) |
  EXP_TO_EXP2 |
  LOG_TO_LOG2 |
  LDEXP_TO_ARITH |
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
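
As a rough illustration of the gating in the patch above, the sketch below builds a lowering mask in which the double-division lowering is requested only when the driver does not report DDIV support. The flag values and the helper name `select_lowering_flags` are hypothetical and exist only for this example; they are not Mesa's actual definitions, and the real code queries the capability through `pscreen->get_param()`.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration; not Mesa's real bit assignments. */
enum {
   SKETCH_MOD_TO_FLOOR    = 1 << 0,
   SKETCH_FDIV_TO_MUL_RCP = 1 << 1,
   SKETCH_DDIV_TO_MUL_RCP = 1 << 2,
};

/* Build the lowering mask. Float division is always lowered to MUL+RCP;
 * double division is lowered only when the driver lacks a native DDIV. */
static unsigned
select_lowering_flags(bool driver_has_ddiv)
{
   unsigned flags = SKETCH_MOD_TO_FLOOR | SKETCH_FDIV_TO_MUL_RCP;

   if (!driver_has_ddiv)
      flags |= SKETCH_DDIV_TO_MUL_RCP;

   return flags;
}
```

With this shape, a driver that returns 0 for the capability keeps the old DRCP+DMUL behaviour, while a driver that returns 1 receives the division unlowered.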


[Mesa-dev] [PATCH] nouveau: remove always false argument in nouveau_fence_new()

2017-01-16 Thread Emil Velikov
There is no point in keeping the extra argument: every caller passes
false, so it has been effectively unused since the function was introduced.

Cc: Ilia Mirkin 
Signed-off-by: Emil Velikov 
---
 src/gallium/drivers/nouveau/nouveau_fence.c| 8 ++--
 src/gallium/drivers/nouveau/nouveau_fence.h| 3 +--
 src/gallium/drivers/nouveau/nv30/nv30_screen.c | 2 +-
 src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +-
 5 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_fence.c b/src/gallium/drivers/nouveau/nouveau_fence.c
index 691553ae7e..6c299cbc6a 100644
--- a/src/gallium/drivers/nouveau/nouveau_fence.c
+++ b/src/gallium/drivers/nouveau/nouveau_fence.c
@@ -30,8 +30,7 @@
 #endif
 
 bool
-nouveau_fence_new(struct nouveau_screen *screen, struct nouveau_fence **fence,
-  bool emit)
+nouveau_fence_new(struct nouveau_screen *screen, struct nouveau_fence **fence)
 {
*fence = CALLOC_STRUCT(nouveau_fence);
if (!*fence)
@@ -41,9 +40,6 @@ nouveau_fence_new(struct nouveau_screen *screen, struct nouveau_fence **fence,
(*fence)->ref = 1;
LIST_INITHEAD(&(*fence)->work);
 
-   if (emit)
-  nouveau_fence_emit(*fence);
-
return true;
 }
 
@@ -242,7 +238,7 @@ nouveau_fence_next(struct nouveau_screen *screen)
 
nouveau_fence_ref(NULL, &screen->fence.current);
 
-   nouveau_fence_new(screen, &screen->fence.current, false);
+   nouveau_fence_new(screen, &screen->fence.current);
 }
 
 void
diff --git a/src/gallium/drivers/nouveau/nouveau_fence.h b/src/gallium/drivers/nouveau/nouveau_fence.h
index f10016da82..e14572bce8 100644
--- a/src/gallium/drivers/nouveau/nouveau_fence.h
+++ b/src/gallium/drivers/nouveau/nouveau_fence.h
@@ -32,8 +32,7 @@ struct nouveau_fence {
 void nouveau_fence_emit(struct nouveau_fence *);
 void nouveau_fence_del(struct nouveau_fence *);
 
-bool nouveau_fence_new(struct nouveau_screen *, struct nouveau_fence **,
-   bool emit);
+bool nouveau_fence_new(struct nouveau_screen *, struct nouveau_fence **);
 bool nouveau_fence_work(struct nouveau_fence *, void (*)(void *), void *);
 void nouveau_fence_update(struct nouveau_screen *, bool flushed);
 void nouveau_fence_next(struct nouveau_screen *);
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index 19df068e39..96a6cfdf33 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -734,6 +734,6 @@ nv30_screen_create(struct nouveau_device *dev)
 
nouveau_pushbuf_kick(push, push->channel);
 
-   nouveau_fence_new(&screen->base, &screen->base.fence.current, false);
+   nouveau_fence_new(&screen->base, &screen->base.fence.current);
return &screen->base;
 }
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 56370014bc..322ad59b17 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -1016,7 +1016,7 @@ nv50_screen_create(struct nouveau_device *dev)
   goto fail;
}
 
-   nouveau_fence_new(&screen->base, &screen->base.fence.current, false);
+   nouveau_fence_new(&screen->base, &screen->base.fence.current);
 
return &screen->base;
 
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index b6e4c6cfe9..8d1b7bed2d 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -1225,7 +1225,7 @@ nvc0_screen_create(struct nouveau_device *dev)
if (!nvc0_blitter_create(screen))
   goto fail;
 
-   nouveau_fence_new(&screen->base, &screen->base.fence.current, false);
+   nouveau_fence_new(&screen->base, &screen->base.fence.current);
 
return &screen->base;
 
-- 
2.11.0
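
For readers skimming the diff above, here is a minimal sketch of the resulting shape: the constructor only allocates and initialises the fence, and emission stays a separate, explicit call. The `sketch_fence` type and both function names are hypothetical stand-ins, not the nouveau driver's real structures, and plain `calloc` is used in place of `CALLOC_STRUCT`.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct nouveau_fence; only the fields needed here. */
struct sketch_fence {
   int ref;
   bool emitted;
};

/* Allocate a zeroed fence holding one reference. Nothing is emitted here. */
static bool
sketch_fence_new(struct sketch_fence **fence)
{
   *fence = calloc(1, sizeof(**fence));
   if (!*fence)
      return false;

   (*fence)->ref = 1;
   return true;
}

/* Emission is a separate step for callers that actually want it. */
static void
sketch_fence_emit(struct sketch_fence *fence)
{
   fence->emitted = true;
}
```

Keeping the two steps apart means the constructor needs no boolean parameter at all, which is the simplification the patch makes.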



[Mesa-dev] [PATCH 2/4] graw: provide static inline draw_create_with_llvm_context()

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Signed-off-by: Emil Velikov 
---
 src/gallium/auxiliary/draw/draw_context.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/auxiliary/draw/draw_context.h b/src/gallium/auxiliary/draw/draw_context.h
index 145fc2ed46..d6b85e20cf 100644
--- a/src/gallium/auxiliary/draw/draw_context.h
+++ b/src/gallium/auxiliary/draw/draw_context.h
@@ -68,6 +68,10 @@ struct draw_context *draw_create( struct pipe_context *pipe );
 #if HAVE_LLVM
 struct draw_context *draw_create_with_llvm_context(struct pipe_context *pipe,
void *context);
+#else
+static inline struct draw_context *
+draw_create_with_llvm_context(struct pipe_context *pipe,
+  void *context) { return NULL; }
 #endif
 
 struct draw_context *draw_create_no_llvm(struct pipe_context *pipe);
-- 
2.11.0
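
The pattern used in the patch above, a real prototype when the feature is built in and a static inline stub otherwise, can be sketched in isolation as follows. The macro `SKETCH_HAVE_BACKEND` and the names `sketch_context` and `sketch_create_with_backend` are hypothetical and chosen only for this example.

```c
#include <stddef.h>

/* Hypothetical feature switch; defaults to "disabled" for this sketch. */
#ifndef SKETCH_HAVE_BACKEND
#define SKETCH_HAVE_BACKEND 0
#endif

struct sketch_context;

#if SKETCH_HAVE_BACKEND
/* The real implementation would live in a separate translation unit. */
struct sketch_context *sketch_create_with_backend(void *backend);
#else
/* Stub: callers compile unchanged and simply get NULL back. */
static inline struct sketch_context *
sketch_create_with_backend(void *backend)
{
   (void)backend;
   return NULL;
}
#endif
```

The benefit is that call sites need no `#if` guards of their own; they only have to handle a NULL return, which they presumably must do anyway.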



[Mesa-dev] [PATCH 4/4] graw: trivial coding style fixes

2017-01-16 Thread Emil Velikov
From: Emil Velikov 

Remove trailing whitespace and fix brace usage and placement.

Signed-off-by: Emil Velikov 
---
 src/gallium/auxiliary/draw/draw_vs.c | 47 +++-
 1 file changed, 20 insertions(+), 27 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_vs.c b/src/gallium/auxiliary/draw/draw_vs.c
index 48a1a34ee6..0def8462eb 100644
--- a/src/gallium/auxiliary/draw/draw_vs.c
+++ b/src/gallium/auxiliary/draw/draw_vs.c
@@ -55,20 +55,16 @@ draw_create_vertex_shader(struct draw_context *draw,
 {
struct draw_vertex_shader *vs = NULL;
 
-   if (draw->dump_vs) {
+   if (draw->dump_vs)
   tgsi_dump(shader->tokens, 0);
-   }
 
-   if (draw->pt.middle.llvm) {
+   if (draw->pt.middle.llvm)
   vs = draw_create_vs_llvm(draw, shader);
-   }
 
-   if (!vs) {
+   if (!vs)
   vs = draw_create_vs_exec( draw, shader );
-   }
 
-   if (vs)
-   {
+   if (vs) {
   uint i;
   bool found_clipvertex = FALSE;
   vs->position_output = -1;
@@ -105,9 +101,8 @@ draw_bind_vertex_shader(struct draw_context *draw,
 struct draw_vertex_shader *dvs)
 {
draw_do_flush( draw, DRAW_FLUSH_STATE_CHANGE );
-   
-   if (dvs) 
-   {
+
+   if (dvs) {
   draw->vs.vertex_shader = dvs;
   draw->vs.num_vs_outputs = dvs->info.num_outputs;
   draw->vs.position_output = dvs->position_output;
@@ -132,7 +127,7 @@ draw_delete_vertex_shader(struct draw_context *draw,
 {
unsigned i;
 
-   for (i = 0; i < dvs->nr_variants; i++) 
+   for (i = 0; i < dvs->nr_variants; i++)
   dvs->variant[i]->destroy( dvs->variant[i] );
 
dvs->nr_variants = 0;
@@ -142,7 +137,7 @@ draw_delete_vertex_shader(struct draw_context *draw,
 
 
 
-boolean 
+boolean
 draw_vs_init( struct draw_context *draw )
 {
draw->dump_vs = debug_get_option_gallium_dump_vs();
@@ -154,11 +149,11 @@ draw_vs_init( struct draw_context *draw )
}
 
draw->vs.emit_cache = translate_cache_create();
-   if (!draw->vs.emit_cache) 
+   if (!draw->vs.emit_cache)
   return FALSE;
-  
+
draw->vs.fetch_cache = translate_cache_create();
-   if (!draw->vs.fetch_cache) 
+   if (!draw->vs.fetch_cache)
   return FALSE;
 
return TRUE;
@@ -185,19 +180,19 @@ draw_vs_lookup_variant( struct draw_vertex_shader *vs,
struct draw_vs_variant *variant;
unsigned i;
 
-   /* Lookup existing variant: 
+   /* Lookup existing variant:
 */
for (i = 0; i < vs->nr_variants; i++)
   if (draw_vs_variant_key_compare(key, &vs->variant[i]->key) == 0)
  return vs->variant[i];
-   
-   /* Else have to create a new one: 
+
+   /* Else have to create a new one:
 */
variant = vs->create_variant( vs, key );
if (!variant)
   return NULL;
 
-   /* Add it to our list, could be smarter: 
+   /* Add it to our list, could be smarter:
 */
if (vs->nr_variants < ARRAY_SIZE(vs->variant)) {
   vs->variant[vs->nr_variants++] = variant;
@@ -209,7 +204,7 @@ draw_vs_lookup_variant( struct draw_vertex_shader *vs,
   vs->variant[vs->last_variant] = variant;
}
 
-   /* Done 
+   /* Done
 */
return variant;
 }
@@ -220,12 +215,11 @@ draw_vs_get_fetch( struct draw_context *draw,
struct translate_key *key )
 {
if (!draw->vs.fetch ||
-   translate_key_compare(&draw->vs.fetch->key, key) != 0) 
-   {
+   translate_key_compare(&draw->vs.fetch->key, key) != 0) {
   translate_key_sanitize(key);
   draw->vs.fetch = translate_cache_find(draw->vs.fetch_cache, key);
}
-   
+
return draw->vs.fetch;
 }
 
@@ -234,12 +228,11 @@ draw_vs_get_emit( struct draw_context *draw,
   struct translate_key *key )
 {
if (!draw->vs.emit ||
-   translate_key_compare(&draw->vs.emit->key, key) != 0) 
-   {
+   translate_key_compare(&draw->vs.emit->key, key) != 0) {
   translate_key_sanitize(key);
   draw->vs.emit = translate_cache_find(draw->vs.emit_cache, key);
}
-   
+
return draw->vs.emit;
 }
 
-- 
2.11.0


