[Mesa-dev] [PATCH] i965: Support the mesa_no_error driconf option.

2017-07-22 Thread Kenneth Graunke
This allows us to override contexts to use no_error functionality
even if the applications themselves do not.
---
 src/mesa/drivers/dri/i965/brw_context.c  | 3 +++
 src/mesa/drivers/dri/i965/intel_screen.c | 1 +
 2 files changed, 4 insertions(+)

Tested by running mesa_no_error=true ./bin/tex-errors in Piglit, and
noticing that it fails because we no longer raise errors :)
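
For readers following along, here is a minimal sketch of what the option amounts to. It is not the driver code: `ctx_sketch`, `apply_no_error_option()` and `should_validate()` are hypothetical stand-ins, and the only real-world detail assumed is the bit value 0x8, which is what the KHR_no_error extension assigns to GL_CONTEXT_FLAG_NO_ERROR_BIT_KHR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as GL_CONTEXT_FLAG_NO_ERROR_BIT_KHR; redefined locally so the
 * sketch builds without GL headers. */
#define NO_ERROR_BIT_SKETCH UINT32_C(0x00000008)

/* Hypothetical stand-in for the context's constant flags. */
struct ctx_sketch {
   uint32_t context_flags;
};

/* Force the no-error bit on when the configuration option is set,
 * regardless of what the application asked for at context creation. */
static void
apply_no_error_option(struct ctx_sketch *ctx, bool option_enabled)
{
   if (option_enabled)
      ctx->context_flags |= NO_ERROR_BIT_SKETCH;
}

/* Hypothetical query: API entry points would skip validation (and so stop
 * raising GL errors) once the bit is set. */
static bool
should_validate(const struct ctx_sketch *ctx)
{
   return (ctx->context_flags & NO_ERROR_BIT_SKETCH) == 0;
}
```

This is also why an error-checking Piglit test is expected to fail under the option: nothing raises the errors it looks for.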

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index b412ecff901..d0b22d4342b 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -745,6 +745,9 @@ brw_process_driconf_options(struct brw_context *brw)
   brw->has_separate_stencil = false;
}
 
+   if (driQueryOptionb(options, "mesa_no_error"))
+  ctx->Const.ContextFlags |= GL_CONTEXT_FLAG_NO_ERROR_BIT_KHR;
+
if (driQueryOptionb(options, "always_flush_batch")) {
   fprintf(stderr, "flushing batchbuffer before/after each draw call\n");
   brw->always_flush_batch = true;
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index 994513189b9..5adb8ef1f63 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -65,6 +65,7 @@ DRI_CONF_BEGIN
DRI_CONF_ENUM(1, "Enable reuse of all sizes of buffer objects")
 DRI_CONF_DESC_END
   DRI_CONF_OPT_END
+  DRI_CONF_MESA_NO_ERROR("false")
DRI_CONF_SECTION_END
 
DRI_CONF_SECTION_QUALITY
-- 
2.13.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] st/dri2: Return invalid modifier when no driver support

2017-07-22 Thread Daniel Stone
Hi Emil,

On 21 July 2017 at 15:23, Emil Velikov wrote:
> I think the key issue is that the resource_get_handle() calls lack
> proper error handling.
>
> There's a single case, that was fixed not too long ago, yet everywhere
> else we consider that resource_get_handle() can fail.
> Can we address that instead? Only dri2_query_image seems to be buggy.

We can address that as well, but not instead. Drivers which aren't
aware of modifiers will return success (as they know how to export
KMS-type handles), but not initialise the modifier field (because they
aren't aware of it). So we can check for failure, but the case I was
seeing here is that resource_get_handle() was succeeding - else we
wouldn't have had a buffer to query modifiers on in the first place -
but failing to either give the correct modifier or INVALID.
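
The failure mode described above can be sketched as follows. This is an illustration under stated assumptions, not the st/dri2 code: `handle_sketch`, `legacy_get_handle()`, `modern_get_handle()` and `query_modifier()` are hypothetical names, and the sentinel is assumed to have the same value as DRM_FORMAT_MOD_INVALID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as DRM_FORMAT_MOD_INVALID; redefined so the sketch builds
 * without the DRM headers. */
#define MOD_INVALID_SKETCH UINT64_C(0x00ffffffffffffff)

/* Hypothetical stand-in for the exported handle. */
struct handle_sketch {
   uint32_t kms_handle;
   uint64_t modifier;
};

/* A driver that predates modifiers: succeeds, never touches ->modifier. */
static bool
legacy_get_handle(struct handle_sketch *h)
{
   h->kms_handle = 42;
   return true;
}

/* A modifier-aware driver fills in both fields. */
static bool
modern_get_handle(struct handle_sketch *h)
{
   h->kms_handle = 43;
   h->modifier = 0; /* linear layout */
   return true;
}

/* Caller side: seed the field with the sentinel, then still check for
 * failure. Both guards are needed; neither replaces the other. */
static bool
query_modifier(bool (*get_handle)(struct handle_sketch *), uint64_t *out)
{
   assert(out != NULL);

   struct handle_sketch h = { .kms_handle = 0, .modifier = MOD_INVALID_SKETCH };
   if (!get_handle(&h))
      return false;

   *out = h.modifier;
   return true;
}
```

With the seed in place, a modifier-unaware driver yields the invalid sentinel rather than whatever happened to be in the field.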

Cheers,
Daniel


Re: [Mesa-dev] [PATCH v2 07/32] i965/miptree: Make aux_state work in terms of logical layers

2017-07-22 Thread Pohjolainen, Topi
On Fri, Jul 21, 2017 at 09:47:52AM -0700, Jason Ekstrand wrote:
> This commit changes layer_range_length to return logical layers and also
> changes the way we allocate the aux_state field to not allocate extra
> layers for MCS.  This will be important as we're about to start doing
> significantly more detailed tracking of MCS state.
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 42 +--
>  1 file changed, 21 insertions(+), 21 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index a5e71a1..cd36f5b 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -637,6 +637,22 @@ intel_lower_compressed_format(struct brw_context *brw, mesa_format format)
>  }
>  
>  static unsigned
> +get_num_logical_layers(const struct intel_mipmap_tree *mt, unsigned level)
> +{
> +   if (mt->surf.size > 0) {
> +  if (mt->surf.dim == ISL_SURF_DIM_3D)
> + return minify(mt->surf.logical_level0_px.depth, level);
> +  else
> + return mt->surf.logical_level0_px.array_len;
> +   } else if (mt->surf.samples > 1) {
> +  assert(level == 0);
> +  return mt->logical_depth0;
> +   } else {
> +  return mt->level[level].depth;

You can drop this now :)

> +   }
> +}
> +
> +static unsigned
>  get_num_phys_layers(const struct isl_surf *surf, unsigned level)
>  {
> /* In case of physical dimensions one needs to consider also the layout.
> @@ -677,12 +693,8 @@ create_aux_state_map(struct intel_mipmap_tree *mt,
> const uint32_t levels = mt->last_level + 1;
>  
> uint32_t total_slices = 0;
> -   for (uint32_t level = 0; level < levels; level++) {
> -  if (mt->surf.size > 0)
> - total_slices += get_num_phys_layers(&mt->surf, level);
> -  else
> - total_slices += mt->level[level].depth;
> -   }
> +   for (uint32_t level = 0; level < levels; level++)
> +  total_slices += get_num_logical_layers(mt, level);
>  
> const size_t per_level_array_size = levels * sizeof(enum isl_aux_state *);
>  
> @@ -700,14 +712,8 @@ create_aux_state_map(struct intel_mipmap_tree *mt,
> enum isl_aux_state *s = data + per_level_array_size;
> for (uint32_t level = 0; level < levels; level++) {
>per_level_arr[level] = s;
> -
> -  unsigned level_depth;
> -  if (mt->surf.size > 0)
> - level_depth = get_num_phys_layers(&mt->surf, level);
> -  else
> - level_depth = mt->level[level].depth;
> -  
> -  for (uint32_t a = 0; a < level_depth; a++)
> +  const unsigned level_layers = get_num_logical_layers(mt, level);
> +  for (uint32_t a = 0; a < level_layers; a++)
>   *(s++) = initial;
> }
> assert((void *)s == data + total_size);
> @@ -2533,13 +2539,7 @@ miptree_layer_range_length(const struct intel_mipmap_tree *mt, uint32_t level,
> uint32_t start_layer, uint32_t num_layers)
>  {
> assert(level <= mt->last_level);
> -   uint32_t total_num_layers;
> -
> -   if (mt->surf.size > 0)
> -  total_num_layers = get_num_phys_layers(&mt->surf, level);
> -   else 
> -  total_num_layers = mt->level[level].depth;
> -
> +   uint32_t total_num_layers = get_num_logical_layers(mt, level);
> assert(start_layer < total_num_layers);
> if (num_layers == INTEL_REMAINING_LAYERS)
>num_layers = total_num_layers - start_layer;
> -- 
> 2.5.0.400.gff86faf
> 
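
As a rough illustration of the counting the quoted patch switches to, the sketch below sums logical layers per miplevel for a simplified surface. `surf_sketch` and the helper names are hypothetical; only the rule itself (minified depth for 3D, array length otherwise) is taken from the patch, and the legacy non-ISL branches are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified surface description. */
struct surf_sketch {
   bool is_3d;
   uint32_t depth0;    /* level-0 depth, used for 3D surfaces */
   uint32_t array_len; /* array length, used otherwise */
   uint32_t levels;    /* number of miplevels */
};

/* Halve per level, never dropping below 1. */
static uint32_t
minify_sketch(uint32_t value, uint32_t level)
{
   uint32_t v = value >> level;
   return v > 1 ? v : 1;
}

/* Logical layers at one level: minified depth for 3D, array length otherwise. */
static uint32_t
num_logical_layers(const struct surf_sketch *s, uint32_t level)
{
   assert(level < s->levels);
   return s->is_3d ? minify_sketch(s->depth0, level) : s->array_len;
}

/* Number of per-slice state entries an aux-state map would need. */
static uint32_t
total_aux_state_slots(const struct surf_sketch *s)
{
   uint32_t total = 0;
   for (uint32_t level = 0; level < s->levels; level++)
      total += num_logical_layers(s, level);
   return total;
}
```

For a 3D surface of depth 8 with 4 levels this gives 8 + 4 + 2 + 1 = 15 entries; a 6-layer array with 3 levels gives 18.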


Re: [Mesa-dev] [PATCH v2 19/32] i965/miptree: Take an aux_usage in prepare/finish

2017-07-22 Thread Pohjolainen, Topi
On Fri, Jul 21, 2017 at 09:35:21AM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c |  25 +++---
>  src/mesa/drivers/dri/i965/brw_clear.c |   3 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 107 +++---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |   5 +-
>  4 files changed, 80 insertions(+), 60 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
> index 3197d66..493202c 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -317,17 +317,16 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
>  */
> if (src_aux_usage == ISL_AUX_USAGE_HIZ)
>src_aux_usage = ISL_AUX_USAGE_NONE;
> -   const bool src_aux_supported = src_aux_usage != ISL_AUX_USAGE_NONE;
> const bool src_clear_supported =
> -  src_aux_supported && (src_mt->format == src_format);
> +  src_aux_usage != ISL_AUX_USAGE_NONE && (src_mt->format == src_format);

The right-hand side has parentheses and the left-hand side does not. Either
style is fine, but both sides should be the same.

> intel_miptree_prepare_access(brw, src_mt, src_level, 1, src_layer, 1,
> -src_aux_supported, src_clear_supported);
> +src_aux_usage, src_clear_supported);
>  
> enum isl_aux_usage dst_aux_usage =
>intel_miptree_render_aux_usage(brw, dst_mt, encode_srgb);
> -   const bool dst_aux_supported = dst_aux_usage != ISL_AUX_USAGE_NONE;
> +   const bool dst_clear_supported = dst_aux_usage != ISL_AUX_USAGE_NONE;
> intel_miptree_prepare_access(brw, dst_mt, dst_level, 1, dst_layer, 1,
> -dst_aux_supported, dst_aux_supported);
> +dst_aux_usage, dst_clear_supported);
>  
> struct isl_surf tmp_surfs[2];
> struct blorp_surf src_surf, dst_surf;
> @@ -356,7 +355,7 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> blorp_batch_finish(&batch);
>  
> intel_miptree_finish_write(brw, dst_mt, dst_level, dst_layer, 1,
> -  dst_aux_supported);
> +  dst_aux_usage);
>  }
>  
>  void
> @@ -417,11 +416,9 @@ brw_blorp_copy_miptrees(struct brw_context *brw,
> }
>  
> intel_miptree_prepare_access(brw, src_mt, src_level, 1, src_layer, 1,
> -src_aux_usage != ISL_AUX_USAGE_NONE,
> -src_clear_supported);
> +src_aux_usage, src_clear_supported);
> intel_miptree_prepare_access(brw, dst_mt, dst_level, 1, dst_layer, 1,
> -dst_aux_usage != ISL_AUX_USAGE_NONE,
> -dst_clear_supported);
> +dst_aux_usage, dst_clear_supported);
>  
> struct isl_surf tmp_surfs[2];
> struct blorp_surf src_surf, dst_surf;
> @@ -438,7 +435,7 @@ brw_blorp_copy_miptrees(struct brw_context *brw,
> blorp_batch_finish(&batch);
>  
> intel_miptree_finish_write(brw, dst_mt, dst_level, dst_layer, 1,
> -  dst_aux_usage != ISL_AUX_USAGE_NONE);
> +  dst_aux_usage);
>  }
>  
>  static struct intel_mipmap_tree *
> @@ -1032,7 +1029,8 @@ brw_blorp_clear_depth_stencil(struct brw_context *brw,
>stencil_mask = ctx->Stencil.WriteMask[0] & 0xff;
>  
>intel_miptree_prepare_access(brw, stencil_mt, level, 1,
> -   start_layer, num_layers, false, false);
> +   start_layer, num_layers,
> +   ISL_AUX_USAGE_NONE, false);
>  
>unsigned stencil_level = level;
>blorp_surf_for_miptree(brw, &stencil_surf, stencil_mt,
> @@ -1059,7 +1057,8 @@ brw_blorp_clear_depth_stencil(struct brw_context *brw,
>  
> if (stencil_mask) {
>intel_miptree_finish_write(brw, stencil_mt, level,
> - start_layer, num_layers, false);
> + start_layer, num_layers,
> + ISL_AUX_USAGE_NONE);
> }
>  }
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/i965/brw_clear.c
> index c310d25..a0a359d 100644
> --- a/src/mesa/drivers/dri/i965/brw_clear.c
> +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> @@ -164,7 +164,8 @@ brw_fast_clear_depth(struct gl_context *ctx)
>  */
> if (mt->fast_clear_color.f32[0] != ctx->Depth.Clear) {
>intel_miptree_prepare_access(brw, mt, 0, INTEL_REMAINING_LEVELS,
> -   0, INTEL_REMAINING_LAYERS, true, false);
> +   0, INTEL_REMAINING_LAYERS,
> +   ISL_AUX_USAGE_HIZ, false);
>mt->fast_clear_color.f32[0] = ctx->Depth.Clear;
> }
>  
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c

Re: [Mesa-dev] [PATCH 1/7] i965/bufmgr: Explicitly wait instead of using I915_GEM_SET_DOMAIN.

2017-07-22 Thread Chris Wilson
Quoting Kenneth Graunke (2017-07-22 00:17:41)
> With the advent of asynchronous maps, domain tracking doesn't make a
> whole lot of sense.  Buffers can be in use on both the CPU and GPU at
> the same time.  In order to avoid blocking, we stopped using set_domain
> for asynchronous mappings, which means that the kernel's tracking has
> lies.  We can't properly track it in userspace either, as the kernel
> can change domains on us spontaneously (for example, when un-swapping).
> 
> I915_GEM_SET_DOMAIN combines two aspects: cache flushing, and waiting
> for a buffer to be idle.  In order to stop using it, we'll need to do
> clflushing ourselves, and use I915_GEM_WAIT to wait on buffers.
> 
> For cache-coherent buffers (most on LLC systems), we don't need to do
> any clflushing - the CPU and GPU views are coherent.  For non-coherent
> buffers (most on non-LLC systems), we currently only use the CPU for
> read-only maps, and we explicitly clflush when necessary.
> 
> One downside to this is that we'll stall unnecessarily if we do a
> read-only mapping of a buffer that the GPU is reading.  I believe
> that's pretty uncommon.

Pretty please add some of the warning comments about GTT from earlier.
For bonus paranoia, the *gtt_map trick as well.  :)

You should be able to demonstrate the issue if you do
*brw_bo_map_gtt(bo) = 1;
assert(*brw_bo_map_cpu(bo) == 1);

Can we add unit tests to this file?
-Chris


Re: [Mesa-dev] [PATCH 3/7] i965/bufmgr: Allocate BO pages outside of the kernel's locking.

2017-07-22 Thread Chris Wilson
Quoting Kenneth Graunke (2017-07-22 00:17:43)
> Suggested by Chris Wilson.
> ---
>  src/mesa/drivers/dri/i965/brw_bufmgr.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> index 1669d26e990..78a4626d430 100644
> --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> @@ -382,6 +382,19 @@ retry:
>  
>if (bo_set_tiling_internal(bo, tiling_mode, stride))
>   goto err_free;
> +
> +  /* Calling set_domain() will allocate pages for the BO outside of the
> +   * struct mutex lock in the kernel, which is more efficient than waiting
> +   * to create them during the first execbuf that uses the BO.
> +   */
> +  struct drm_i915_gem_set_domain sd = {
> + .handle = bo->gem_handle,
> + .read_domains = I915_GEM_DOMAIN_CPU,
> + .write_domain = I915_GEM_DOMAIN_CPU,

We can pass .write_domain = 0 here. Should be no different as the bo is
created in the CPU write domain and already marked as dirty. I think it
is a better reflection of intent that we are just pulling in the pages.
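
A sketch of the suggested form, for illustration only. `set_domain_sketch` is a local mirror of struct drm_i915_gem_set_domain so that it builds without the kernel uapi headers, `DOMAIN_CPU_SKETCH` is assumed to match I915_GEM_DOMAIN_CPU, and `make_prefault_request()` is a hypothetical helper; the ioctl call itself is left out.

```c
#include <stdint.h>

/* Local mirror of struct drm_i915_gem_set_domain (handle, read_domains,
 * write_domain), redefined so no kernel headers are needed. */
struct set_domain_sketch {
   uint32_t handle;
   uint32_t read_domains;
   uint32_t write_domain;
};

/* Assumed to match I915_GEM_DOMAIN_CPU. */
#define DOMAIN_CPU_SKETCH UINT32_C(0x00000001)

/* Hypothetical helper: build a request that only pulls in the pages,
 * without claiming a CPU write. */
static struct set_domain_sketch
make_prefault_request(uint32_t gem_handle)
{
   struct set_domain_sketch sd = {
      .handle = gem_handle,
      .read_domains = DOMAIN_CPU_SKETCH,
      /* .write_domain is omitted, so C zero-initialises it. */
   };
   return sd;
}
```

Leaving the member out of the designated initialiser has the same effect as writing `.write_domain = 0`; spelling it out may state the intent more clearly.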
-Chris


Re: [Mesa-dev] [PATCH 7/7] i965: Drop non-LLC lunacy in the program cache code.

2017-07-22 Thread Chris Wilson
Quoting Kenneth Graunke (2017-07-22 00:17:47)
> The non-LLC story was a horror show.  We uploaded data via pwrite
> (drm_intel_bo_subdata), which would stall if the cache BO was in
> use (being read) by the GPU.  Obviously, we wanted to avoid that.
> So, we tried to detect whether the buffer was busy, and if so, we'd
> allocate a new BO, map the old one read-only (hopefully not stalling),
> copy all shaders compiled since the dawn of time to the new buffer,
> upload our new one, toss the old BO, and let the state upload code
> know that our program cache BO changed.  This was a lot of extra data
> copying, and flagging BRW_NEW_PROGRAM_CACHE would also cause a new
> STATE_BASE_ADDRESS to be emitted, stalling the entire pipeline.
> 
> Not only that, but our rudimentary busy tracking consisted of a flag
> set at execbuf time, and not cleared until we threw out the program
> cache BO.  So, the first shader upload after any drawing would hit this
> "abandon the cache and start over" copying path.
> 
> This is largely unnecessary - it's just ancient and crufty code.  We can
> use the same persistent mapping paths on all platforms.  On non-ancient
> kernels, this will use a write combining map, which should be reasonably
> fast.
> 
> One aspect that is worse: we do occasionally grow the program cache BO,
> and copy the old contents to the newer BO.  This will suffer from UC
> readback performance now.  To mitigate this, we use the MOVNTDQA based
> streaming memcpy on platforms with SSE 4.1 (all Gen7+ atoms).  Gen4-5
> are unfortunately going to be penalized.
> 
> v2: Add MOVNTDQA path, rebase on other map flag changes.

Don't forget cache->bo_used_by_gpu!

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 2acebaa820..a4a04daaa9 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -373,7 +373,6 @@ struct brw_cache {
GLuint size, n_items;
 
uint32_t next_offset;
-   bool bo_used_by_gpu;
 };
 
 /* Considered adding a member to this struct to document which flags
diff --git a/src/mesa/drivers/dri/i965/brw_program_cache.c b/src/mesa/drivers/dri/i965/brw_program_cache.c
index 8d83af71d3..4dcfd5234d 100644
--- a/src/mesa/drivers/dri/i965/brw_program_cache.c
+++ b/src/mesa/drivers/dri/i965/brw_program_cache.c
@@ -238,7 +238,6 @@ brw_cache_new_bo(struct brw_cache *cache, uint32_t new_size)
brw_bo_unreference(cache->bo);
cache->bo = new_bo;
cache->map = map;
-   cache->bo_used_by_gpu = false;
 
/* Since we have a new BO in place, we need to signal the units
 * that depend on it (state base address on gen5+, or unit state before).
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 4461a59b80..e2f208a3d1 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -465,12 +465,6 @@ brw_finish_batch(struct brw_context *brw)
   PIPE_CONTROL_CS_STALL);
   }
}
-
-   /* Mark that the current program cache BO has been used by the GPU.
-* It will be reallocated if we need to put new programs in for the
-* next batch.
-*/
-   brw->cache.bo_used_by_gpu = true;
 }
 
 static void


Re: [Mesa-dev] [PATCH 28/32] intel/isl/format: Add an srgb_to_linear helper

2017-07-22 Thread Pohjolainen, Topi
On Wed, Jul 19, 2017 at 02:01:54PM -0700, Jason Ekstrand wrote:
> ---
>  src/intel/isl/gen_format_layout.py | 46 +-
>  src/intel/isl/isl.h                |  8 +++
>  2 files changed, 53 insertions(+), 1 deletion(-)
> 
> diff --git a/src/intel/isl/gen_format_layout.py b/src/intel/isl/gen_format_layout.py
> index aa4e2d8..0ca42db 100644
> --- a/src/intel/isl/gen_format_layout.py
> +++ b/src/intel/isl/gen_format_layout.py
> @@ -88,6 +88,19 @@ isl_format_layouts[] = {
>  
>  % endfor
>  };
> +
> +enum isl_format
> +isl_format_srgb_to_linear(enum isl_format format)
> +{
> +switch (format) {
> +% for srgb, rgb in srgb_to_linear_map:
> +case ISL_FORMAT_${srgb}:
> +return ISL_FORMAT_${rgb};
> +%endfor
> +default:
> +return format;
> +}
> +}
>  """)
>  
>  
> @@ -167,6 +180,34 @@ def reader(csvfile):
>  if line and not line[0].startswith('#'):
>  yield line
>  
> +def get_srgb_to_linear_map(formats):
> +"""Compute a map from sRGB to linear formats.
> +
> +This function uses some probably somewhat fragile string munging to do
> +the conversion.  However, we do assert that, if it's SRGB, the munging
> +succeeded so that gives some safety.
> +"""
> +names = {f.name for f in formats}
> +for fmt in formats:
> +if fmt.colorspace != 'SRGB':
> +continue
> +
> +replacements = [
> +('_SRGB',   ''),
> +('SRGB','RGB'),
> +('U8SRGB',  'FLT16'),
> +]
> +
> +found = False;
> +for rep in replacements:
> +rgb_name = fmt.name.replace(rep[0], rep[1])
> +if rgb_name in names:
> +found = True
> +yield fmt.name, rgb_name
> +break;
> +
> +# We should have found a format name
> +assert found
>  
>  def main():
>  """Main function."""
> @@ -183,11 +224,14 @@ def main():
>  # problem: Unicode can be rendered even if the shell calling this script
>  # doesn't.
>  with open(args.out, 'wb') as f:
> +formats = [Format(l) for l in reader(args.csv)]
>  try:
>  # This basically does lazy evaluation and initialization, which
>  # saves on memory and startup overhead.
>  f.write(TEMPLATE.render(
> -formats=(Format(l) for l in reader(args.csv))))
> +formats = formats,
> +srgb_to_linear_map  = list(get_srgb_to_linear_map(formats)),
> +))
>  except Exception:
>  # In the even there's an error this imports some helpers from mako
>  # to print a useful stack trace and prints it, then exits with
> diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> index 68bfcee..bc68e58 100644
> --- a/src/intel/isl/isl.h
> +++ b/src/intel/isl/isl.h
> @@ -1495,6 +1495,14 @@ isl_format_block_is_1x1x1(enum isl_format fmt)
>  }
>  
>  static inline bool
> +isl_format_is_srgb(enum isl_format fmt)
> +{
> +   return isl_format_layouts[fmt].colorspace == ISL_COLORSPACE_SRGB;
> +}

This threw me for a little while: it goes beyond what the subject says, but it
is really needed in the last patch.

> +
> +enum isl_format isl_format_srgb_to_linear(enum isl_format fmt);
> +
> +static inline bool
>  isl_format_is_rgb(enum isl_format fmt)
>  {
> return isl_format_layouts[fmt].channels.r.bits > 0 &&
> -- 
> 2.5.0.400.gff86faf
> 


Re: [Mesa-dev] [PATCH 7/7] i965: Drop non-LLC lunacy in the program cache code.

2017-07-22 Thread Chris Wilson
Quoting Kenneth Graunke (2017-07-22 00:17:47)
>  src/mesa/drivers/dri/i965/brw_program_cache.c | 83 +++
>  1 file changed, 21 insertions(+), 62 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_program_cache.c b/src/mesa/drivers/dri/i965/brw_program_cache.c
> index 04682bef34c..8d83af71d3a 100644
> --- a/src/mesa/drivers/dri/i965/brw_program_cache.c
> +++ b/src/mesa/drivers/dri/i965/brw_program_cache.c
> @@ -45,6 +45,8 @@
>   */
>  
>  #include "main/imports.h"
> +#include "main/streaming-load-memcpy.h"
> +#include "x86/common_x86_asm.h"
>  #include "intel_batchbuffer.h"
>  #include "brw_state.h"
>  #include "brw_wm.h"
> @@ -214,32 +216,28 @@ brw_cache_new_bo(struct brw_cache *cache, uint32_t new_size)
>  {
> struct brw_context *brw = cache->brw;
> struct brw_bo *new_bo;
> -   void *llc_map;
>  
> new_bo = brw_bo_alloc(brw->bufmgr, "program cache", new_size, 64);
> if (can_do_exec_capture(brw->screen))
>new_bo->kflags = EXEC_OBJECT_CAPTURE;
> -   if (brw->has_llc) {
> -  llc_map = brw_bo_map(brw, new_bo, MAP_READ | MAP_WRITE |
> -MAP_ASYNC | MAP_PERSISTENT);
> -   }
> +
> +   void *map = brw_bo_map(brw, new_bo, MAP_READ | MAP_WRITE |
> +   MAP_ASYNC | MAP_PERSISTENT);
>  
> /* Copy any existing data that needs to be saved. */
> if (cache->next_offset != 0) {
> -  if (brw->has_llc) {
> - memcpy(llc_map, cache->map, cache->next_offset);
> -  } else {
> - void *map = brw_bo_map(brw, cache->bo, MAP_READ);
> - brw_bo_subdata(new_bo, 0, cache->next_offset, map);
> - brw_bo_unmap(cache->bo);
> -  }
> +#ifdef USE_SSE41
> +  if (!cache->bo->cache_coherent && cpu_has_sse4_1)
> + _mesa_streaming_load_memcpy(map, cache->map, cache->next_offset);
> +  else
> +#endif
> + memcpy(map, cache->map, cache->next_offset);

Considering the prevalence of sse4.1, another candidate is
brw_get_buffer_subdata(); we could use a WC map there as well.

diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
index e932badaaf..705ade8021 100644
--- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c
+++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
@@ -32,7 +32,9 @@
 #include "main/imports.h"
 #include "main/mtypes.h"
 #include "main/macros.h"
+#include "main/streaming-load-memcpy.h"
 #include "main/bufferobj.h"
+#include "x86/common_x86_asm.h"
 
 #include "brw_context.h"
 #include "intel_blit.h"
@@ -337,16 +339,24 @@ brw_get_buffer_subdata(struct gl_context *ctx,
   intel_batchbuffer_flush(brw);
}
 
-   void *map = brw_bo_map(brw, intel_obj->buffer, MAP_READ);
-
-   if (unlikely(!map)) {
-  _mesa_error_no_memory(__func__);
-  return;
+   if (!intel_obj->buffer->cache_coherent && cpu_has_sse4_1) {
+  void *map = brw_bo_map(brw, intel_obj->buffer, MAP_READ | MAP_COHERENT);
+  if (unlikely(!map)) {
+ _mesa_error_no_memory(__func__);
+ return;
+  }
+  _mesa_streaming_load_memcpy(data, map + offset, size);
+  brw_bo_unmap(intel_obj->buffer);
+   } else {
+  void *map = brw_bo_map(brw, intel_obj->buffer, MAP_READ);
+  if (unlikely(!map)) {
+ _mesa_error_no_memory(__func__);
+ return;
+  }
+  memcpy(data, map + offset, size);
+  brw_bo_unmap(intel_obj->buffer);
}
 
-   memcpy(data, map + offset, size);
-   brw_bo_unmap(intel_obj->buffer);
-
mark_buffer_inactive(intel_obj);
 }
 

I didn't see any other obvious candidates; for the query readback objects I
suggest using snooping instead.
-Chris


Re: [Mesa-dev] [PATCH 00/32] i965: Enabale CCS_E for sRGB render buffers

2017-07-22 Thread Pohjolainen, Topi
On Wed, Jul 19, 2017 at 02:01:26PM -0700, Jason Ekstrand wrote:
> Gen9 hardware has this annoying little corner where CCS_E is not allowed
> for any sRGB formats.  This is fixed on gen10 but on gen9 there's nothing
> we can do; it just doesn't work.  The old approach to working around this
> was to just disable CCS_E the moment we saw sRGB.  This is bad because GLX
> gives out sRGB-capable visuals by default and you can easily get one
> through EGL as well even if you never enable sRGB encode.  This means that
> users who have sRGB visuals and don't care about sRGB encode are getting
> unnecessarily punished.  This isn't a huge problem today because you also
> can't do CCS_E on X-tiled images but it will be a problem the moment we
> start seeing the Y-tiling modifier through the window system.
> 
> Also, I think sRGB + CCS was plain broken for the less likely case of
> rendering to a texture-backed framebuffer.  Our tracking for sRGB was based
> on piles of sRGBEnabled checks that I'm not at all sure added up to correct
> code.  When trying to better test the CCS_E modifier, I patched waffle to
> start using modifiers whenever the GBM back-end was in-use.  When I ran
> piglit with these waffle patches and my old CCS_E series, our pass rate was
> under 50%.  I think part of that was due to bugs with sRGB and part of it
> was due to not having a plan for falling back to CCS_D once the CCS_E
> modifier gets used.  This series is that plan.
> 
> The first 5 patches are a couple of bugfixes and the removal of a couple of
> bogus restrictions.  In particular, we were disabling CCS_E on all
> renderbuffers for no good reason.  Patch 3 fixes bugs exposed by patch 2
> related to glBlitFramebuffers with both color and depth bits specified.
> 
> The next 5 add a partial resolve pass for MCS and hook it up so that we can
> handle clear colors with texture views correctly.
> 
> The last 22 patches rework things so that we can properly fall back to
> CCS_D whenever we can't render with CCS_E.  This requires adding a seventh
> value to the isl_aux_state enum to describe a "partial clear" state which
> is the state a CCS_E image is in when it's been fast-cleared and then
> rendered to using CCS_D.  Tracking this additional state allows us to turn
> on CCS_E even when we have an sRGB visual and then just silently resolve if
> we ever need to render with sRGB encode enabled.  If they just turn on sRGB
> encode and leave it on, then they don't get a resolve because our tracking
> code knows that you can do CCS_D rendering on a CCS_E surface that is in
> the CLEAR state and the end result is the PARTIAL_CLEAR state.
> 
> We need to land this series before we flip on the CCS_E modifier and I'd
> like to land it in time for the 17.2 release if we can.
> 
> Happy Reviewing!

There were only small comments here and there, most of which you have already
covered. I looked at your wip/i965-srgb-ccs branch: patch 15 got updated, but
patch 24 looked the same as on the list. I'd like to have a look at 24 once
you have it. Otherwise, the series is:

Reviewed-by: Topi Pohjolainen 

> 
> Jason Ekstrand (32):
>   i965/surface_state: Use the minified depth for number of image layers
>   i965/blorp: Use the renderbuffer format for clears
>   i965/blorp: Do flushes around depth resolves
>   i965/miptree: Stop setting FOR_SCANOUT for renderbuffers
>   i965/miptree: Remove some unneeded restrictions
>   intel/blorp: Add a partial resolve pass for MCS
>   i965/miptree: Make layer_range_length return locical layers
>   i965/miptree: Tighten up finish_mcs_write
>   i965/miptree: Add support for partially resolving MCS
>   i965/miptree: Partially resolve MCS for texture views
>   i965/miptree: Add a helper for getting the aux usage for texturing
>   i965/miptree: Rework prepare/finish_render to be in terms of aux_usage
>   i965/blorp: Do prepare/finsh manually
>   i965/blorp: Use texture/render_aux_usage for blits
>   i965/blorp: Be more accurate about aux usage in blorp_copy
>   i965/blorp: Use render_aux_usage for color clears
>   i965/blorp: Use prepare/finish_depth for depth clears
>   i965/miptree: Refactor some things to use mt->aux_usage
>   i965/miptree: Take an aux_usage in prepare/finish
>   intel/isl: Add an aux state for "partial clear"
>   i965/miptree: Use ISL_AUX_STATE_PARTIAL_CLEAR for CCS_D
>   i965/miptree: Allow for accessing a CCS_E image as CCS_D
>   i965/miptree: Use miptree range helpers in has_color_unresolved
>   i965/miptree: Take an isl_format in prepare_texture
>   i965/surface_state: Take an isl_aux_usage in emit_surface_state
>   i965/surface_state: Get the aux usage from the miptree code
>   intel/isl/format: Dedent the template in gen_format_layout.py
>   intel/isl/format: Add an srgb_to_linear helper
>   i965: Weaken the texture view rules for formats slightly
>   intel/blorp: Allow blorp_copy on sRGB formats
>   intel/isl: Add a helper for determining if a color is 0/1
>   i965: Enable regular fast-clears (CCS_D) on gen9+
> 
>  sr

Re: [Mesa-dev] [PATCH 4/5] clover/llvm: Use -cl-std and device version to select language defaults

2017-07-22 Thread Pierre Moreau
Hi Aaron,

On 2017-07-21 — 23:19, Aaron Watry wrote:
> According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by:
>  1) If you have -cl-std=CL1.1+ use the version specified
>  2) If not, use the highest 1.x version that the device supports

According to that same part of the spec, clBuildProgram and clCompileProgram
should fail if the specified CL C version is strictly greater than the version
the device supports. You could add a check in `get_language_version()` to
compare `ver` and `device_version`, and throw a `build_error()` exception if
`ver > device_version`.
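
A minimal sketch of that comparison, written in C for brevity rather than as clover code. It assumes both versions arrive as "major.minor" strings (for example "1.2"); `parse_cl_version()` and `cl_std_exceeds_device()` are hypothetical helpers, and treating an unparsable string as a rejection is a choice made for this sketch, not something the spec text above settles. In clover the rejection would be the `build_error()` throw.

```c
#include <stdbool.h>
#include <stdio.h>

/* Parse "major.minor" into an integer that orders correctly
 * (1.2 -> 102, 2.0 -> 200). Returns -1 on malformed input. */
static int
parse_cl_version(const char *s)
{
   int major, minor;
   char trailing;

   /* Exactly two fields must match; a third means trailing garbage. */
   if (sscanf(s, "%d.%d%c", &major, &minor, &trailing) != 2)
      return -1;
   if (major < 0 || minor < 0 || minor > 99)
      return -1;
   return major * 100 + minor;
}

/* True when the build should be rejected: the requested -cl-std version is
 * strictly greater than what the device supports, or either string is
 * unparsable (an assumption of this sketch). */
static bool
cl_std_exceeds_device(const char *requested, const char *device)
{
   int req = parse_cl_version(requested);
   int dev = parse_cl_version(device);

   if (req < 0 || dev < 0)
      return true;
   return req > dev;
}
```

Comparing parsed numbers rather than the raw strings avoids lexicographic surprises such as "1.10" sorting before "1.2".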

I have two more comments further down.

> Curiously, there is no valid value for -cl-std=CL1.0
> 
> Signed-off-by: Aaron Watry 
> ---
>  .../state_trackers/clover/llvm/invocation.cpp  | 48 --
>  1 file changed, 45 insertions(+), 3 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 364aaf1517..92d72e5b73 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -93,6 +93,48 @@ namespace {
>return ctx;
> }
>  
> +   clang::LangStandard::Kind
> +   get_language_from_version_str(const std::string &version_str,
> + bool is_opt = false) {
> +   /**
> +* Per CL 2.0 spec, section 5.8.4.5:
> +* If it's an option, use the value directly.
> +* If it's a device version, clamp to max 1.x version, a.k.a. 1.2
> +*/
> +   if (version_str == "1.1")
> +  return clang::LangStandard::lang_opencl11;
> +   if (version_str == "1.2")
> +  return clang::LangStandard::lang_opencl12;
> +   if (version_str == "2.0"){
> +  if (is_opt) return clang::LangStandard::lang_opencl20;
> +  else return clang::LangStandard::lang_opencl12;
> +   }
> +
> +   /*
> +* At this point, it's not a recognized language version option or
> +* 1.1+ device version, which just leaves 1.0 as a possible device
> +* version (or an invalid version string).
> +*/
> +   return clang::LangStandard::lang_opencl10;
> +  }
> +
> +   clang::LangStandard::Kind
> +   get_language_version(const std::vector<std::string> &opts,
> +const std::string &device_version) {
> +
> +  const std::string search = "-cl-std=CL";
> +
> +   for(auto opt: opts){
> +   auto pos = opt.find(search);
> +   if (pos == 0){
> +   auto ver = opt.substr(pos+search.size());
> +   return get_language_from_version_str(ver, true);
> +   }
> +   }
> +
> +   return get_language_from_version_str(device_version);
> +}
> +
> std::unique_ptr<clang::CompilerInstance>
> create_compiler_instance(const target &target,
>  const std::vector<std::string> &opts,
> @@ -129,7 +171,7 @@ namespace {
>compat::set_lang_defaults(c->getInvocation(), c->getLangOpts(),
>  compat::ik_opencl, ::llvm::Triple(target.triple),
>  c->getPreprocessorOpts(),
> -clang::LangStandard::lang_opencl11);
> +get_language_version(opts, device_version));
>  
>c->createDiagnostics(new clang::TextDiagnosticPrinter(
>*new raw_string_ostream(r_log),
> @@ -211,7 +253,7 @@ clover::llvm::compile_program(const std::string &source,
>  
> auto ctx = create_context(r_log);
> auto c = create_compiler_instance(target, tokenize(opts + " input.cl"),
> - r_log);
> + device_version, r_log);

This should be part of patch 3 as that patch doesn't build otherwise.

> auto mod = compile(*ctx, *c, "input.cl", source, headers, target, opts,
>r_log);
>  
> @@ -280,7 +322,7 @@ clover::llvm::link_program(const std::vector 
> &modules,
> erase_if(equals("-create-library"), options);
>  
> auto ctx = create_context(r_log);
> -   auto c = create_compiler_instance(target, options, r_log);
> +   auto c = create_compiler_instance(target, options, device_version, r_log);

Same here, this should be in patch 3.

Thank you,
Pierre

> auto mod = link(*ctx, *c, modules, r_log);
>  
> optimize(*mod, c->getCodeGenOpts().OptimizationLevel, !create_library);
> -- 
> 2.11.0
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev




Re: [Mesa-dev] A few clover fixes for both CTS and eventual 1.2 support

2017-07-22 Thread Pierre Moreau
With the comments in patch 4 taken care of, this series is

Reviewed-by: Pierre Moreau 


On 2017-07-21 — 23:19, Aaron Watry wrote:
> The first patch is one I've been sitting on for a few weeks while
> I've tried to chase down other issues with clover/llvm/libclc. It
> fixes at least one CTS test that I know of for CL 1.2.
> 
> The other 4 patches move the device version declaration to core/device
> and then use that along with the -cl-std option to determine which
> OpenCL language version to enable in clang.
> 
> I've done a full piglit run before/after, and there are no changes for me
> on radeonsi/pitcairn if the device is left at CL 1.1.
> 
> When I bump my platform/device versions to 1.2, the clang instance has
> been confirmed to enable 1.2 language features (like the static keyword
> required in test/cl/program/execute/static.cl, which goes skip->pass).
> 
> Anyway, happy reviewing.
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev




Re: [Mesa-dev] [PATCH dri3proto v2] Add modifier/multi-plane requests, bump to v1.1

2017-07-22 Thread Daniel Stone
Hi Michel,

On 21 July 2017 at 18:32, Michel Dänzer  wrote:
> On 20/07/17 01:08 PM, Daniel Stone wrote:
>> DRI3 version 1.1 adds support for explicit format modifiers, including
>> multi-planar buffers.
>
> Adding mesa-dev, Nicolai and Marek.
>
> We just ran into an issue which might mean that there's still something
> missing in this v2 proposal:
>
> The context is DRI3 PRIME render offloading of glxgears (not useful in
> practice, but bear with me). The display GPU is Raven Ridge, which requires
> that the stride even of linear textures is a multiple of 256 bytes. The
> renderer GPU is Polaris12, which still supports smaller alignment of the
> stride. With the glxgears window of width 300, the renderer GPU driver
> chooses a stride of 304 (* 4 / 256 = 4.75), whereas the display GPU would
> require 320 (* 4 / 256 = 5). This cannot work.

The obvious answer is just to increase padding on external/winsys
surfaces? Increasing it for all allocations would probably be a
non-starter, but winsys surfaces are rare enough that you could
probably afford to take the hit, I guess.
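
To make the arithmetic in Michel's example concrete, here is a minimal
sketch of aligning a linear stride. The helper names (align_up,
stride_bytes, shared_stride_bytes) are made up for illustration, and the
64-byte renderer alignment is an assumption inferred from the 304-pixel
stride quoted above; none of this is taken from an actual driver. It
needs C++17 for std::lcm.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>

// Round value up to the next multiple of alignment (alignment must be > 0).
constexpr uint32_t align_up(uint32_t value, uint32_t alignment) {
   return (value + alignment - 1) / alignment * alignment;
}

// Stride in bytes of a linear surface: width in pixels, bytes per pixel,
// and the stride alignment (in bytes) that one GPU requires.
constexpr uint32_t stride_bytes(uint32_t width_px, uint32_t cpp,
                                uint32_t alignment) {
   return align_up(width_px * cpp, alignment);
}

// Smallest stride that satisfies the alignment of both GPUs
// (hypothetical helper: aligns to the least common multiple).
inline uint32_t shared_stride_bytes(uint32_t width_px, uint32_t cpp,
                                    uint32_t align_a, uint32_t align_b) {
   return align_up(width_px * cpp, std::lcm(align_a, align_b));
}
```

With a 300-pixel-wide, 4-bytes-per-pixel window, a 64-byte alignment
gives 1216 bytes (304 pixels), which is not a multiple of 256, while a
256-byte alignment gives 1280 bytes (320 pixels). A stride chosen with
both constraints in hand would also come out at 1280 bytes.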

> I see two basic approaches to solve this:
>
> 1. A protocol request for the client to retrieve the display
>GPU constraints on the stride (and possibly other parameters) for a
>given format and modifier.

+ corresponding new EGL request and new GBM/KMS API :\

> 2. A protocol request which allows the creation of a pixmap with
>given format and modifier. The renderer GPU driver needs to pass in
>the stride it would choose, then the display GPU driver can choose a
>stride satisfying the constraints on both sides.

Heh, that sounds familiar - DRI2!

> Maybe there are other possible approaches I'm missing? Other comments?

I don't have any great solution off the top of my head, but I'd be
inclined to bundle stride in with placement. TTBOMK (from having
looked at radv), buffers shared cross-GPU also need to be allocated
from a separate externally-visible memory heap. And at the moment,
lacking placement information at allocation time (at least for EGL
allocations, via DRIImage), that happens via transparent migration at
import time I think. Placement restrictions would probably also
involve communicating base address alignment requirements.

Given that, I'm fairly inclined to punt those until we have the grand
glorious allocator, rather than trying to add it to EGL/GBM
separately. The modifiers stuff was a fairly obvious augmentation -
EGL already had no-modifier format import but no query as to which
formats it would accept, and modifiers are a logical extension of
format - but adding the other restrictions is a bigger step forward.

Cheers,
Daniel


[Mesa-dev] [Bug 101876] SIGSEGV when launching Steam

2017-07-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101876

Bug ID: 101876
   Summary: SIGSEGV when launching Steam
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: 0xe2.0x9a.0...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

4.5 (Core Profile) Mesa 17.2.0-devel (git-fd199fe4a8)
AMD Radeon (TM) R9 390 Series (AMD HAWAII / DRM 3.15.0 / 4.12.2+, LLVM 4.0.1)

(gdb) bt
#0  0x0039 in ?? ()
#1  0xf66ac912 in _mesa_hash_table_search (ht=0x5865bf70, key=0x5827a528)
at
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/util/hash_table.c:246
#2  0xf65f340f in st_framebuffer_iface_remove (stfbi=0x5827a528)
at
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/mesa/state_tracker/st_manager.c:545
#3  st_api_destroy_drawable (stapi=0xf6c3c860 , stfbi=0x5827a528)
at
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/mesa/state_tracker/st_manager.c:564
#4  0xf673431c in dri_destroy_buffer (dPriv=0x5827a1d8)
at
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/gallium/state_trackers/dri/dri_drawable.c:186
#5  0xf6732fee in dri_put_drawable (pdp=0x5827a1d8)
at
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/mesa/drivers/dri/common/dri_util.c:642
#6  0xf67330d5 in dri_put_drawable (pdp=0x5827a1d8)
at
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/mesa/drivers/dri/common/dri_util.c:696
#7  driDestroyDrawable (pdp=0x5827a1d8)
at
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/mesa/drivers/dri/common/dri_util.c:695
#8  0xf6f2c5dd in loader_dri3_drawable_fini (draw=0x5827a478)
at
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/loader/loader_dri3_helper.c:115
#9  0xf6f27b6f in dri3_destroy_drawable (base=0x5827a458)
at
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/glx/dri3_glx.c:372
#10 0xf6f20fdb in driReleaseDrawables (gc=0x57ed2998)
at
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/glx/dri_common.c:452
#11 0xf6f27f6e in dri3_bind_context (context=0x57ed2998, old=0x57ed2998,
draw=33554471, read=33554471)
at
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/glx/dri3_glx.c:229
#12 0xf6efb949 in MakeContextCurrent (dpy=0x57cd4f88, draw=33554471,
read=33554471, gc_user=0x57ed2998)
at
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/glx/glxcurrent.c:228
#13 0xf6efbb56 in glXMakeCurrent (dpy=0x57cd4f88, draw=33554471, gc=0x57ed2998)
at
/var/tmp/portage/media-libs/mesa-/work/mesa-/src/glx/glxcurrent.c:274

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 101876] SIGSEGV when launching Steam

2017-07-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101876

--- Comment #1 from Christoph Haag  ---
This is at fault:
https://cgit.freedesktop.org/mesa/mesa/commit/?id=5124bf982393114862f44ee62fa361027faa7c29

It's also on the list:
https://lists.freedesktop.org/archives/mesa-dev/2017-July/164038.html

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] AMD Tahiti: A question about PFP firmware

2017-07-22 Thread Gustaw Smolarczyk
Hello,

While looking at the extension list on my Tahiti GPU (SI) I have found
that ARB_draw_parameters is missing. Looking at the code, it requires
ME firmware version >= 87 and PFP firmware version >= 121. While the
first is satisfied, the second is not [1]. I believe I use the newest
TAHITI firmware from the linux-firmware repository [2] (I use the
lower-case tahiti_{me,pfp}.bin files). I use the amdgpu kernel driver
instead of radeon (not sure if that matters since both of them should
use the same firmware files); well, if I didn't the versions would be
0 since only amdgpu winsys queries them.

Is there maybe some unreleased firmware file with a newer version for my
hardware? Or is the check a nice way of saying "if you want hardware
accelerated MDI, buy newer hardware"?

[1] https://pastebin.com/BpYiJj0Z
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/radeon

Regards,
Gustaw

P.S. I tried to ask on IRC, but my message to the #dri-devel channel
(on irc.freenode.net) was rejected. Is there something I was missing?


[Mesa-dev] [PATCH v2 24/32] i965/miptree: Take an isl_format in prepare_texture

2017-07-22 Thread Jason Ekstrand
This will be a bit more convenient momentarily.  It's also more correct
because it makes prepare_texture take sRGB into account.

Cc: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_draw.c  |  6 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 17 +++--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  2 +-
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index b77b44e..20ff99f 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -383,8 +383,12 @@ brw_predraw_resolve_inputs(struct brw_context *brw)
   if (!tex_obj || !tex_obj->mt)
 continue;
 
+  struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, i);
+  enum isl_format view_format =
+ translate_tex_format(brw, tex_obj->_Format, sampler->sRGBDecode);
+
   bool aux_supported;
-  intel_miptree_prepare_texture(brw, tex_obj->mt, tex_obj->_Format,
+  intel_miptree_prepare_texture(brw, tex_obj->mt, view_format,
 &aux_supported);
 
   if (!aux_supported && brw->gen >= 9 &&
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index e802aff..0db05a7 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -2464,18 +2464,15 @@ intel_miptree_set_aux_state(struct brw_context *brw,
 static bool
 can_texture_with_ccs(struct brw_context *brw,
  struct intel_mipmap_tree *mt,
- mesa_format view_format)
+ enum isl_format view_format)
 {
if (mt->aux_usage != ISL_AUX_USAGE_CCS_E)
   return false;
 
-   enum isl_format isl_mt_format = brw_isl_format_for_mesa_format(mt->format);
-   enum isl_format isl_view_format = 
brw_isl_format_for_mesa_format(view_format);
-
if (!isl_formats_are_ccs_e_compatible(&brw->screen->devinfo,
- isl_mt_format, isl_view_format)) {
+ mt->surf.format, view_format)) {
   perf_debug("Incompatible sampling format (%s) for rbc (%s)\n",
- _mesa_get_format_name(view_format),
+ isl_format_get_layout(view_format)->name,
  _mesa_get_format_name(mt->format));
   return false;
}
@@ -2513,7 +2510,7 @@ intel_miptree_texture_aux_usage(struct brw_context *brw,
 static void
 intel_miptree_prepare_texture_slices(struct brw_context *brw,
  struct intel_mipmap_tree *mt,
- mesa_format view_format,
+ enum isl_format view_format,
  uint32_t start_level, uint32_t num_levels,
  uint32_t start_layer, uint32_t num_layers,
  bool *aux_supported_out)
@@ -2526,7 +2523,7 @@ intel_miptree_prepare_texture_slices(struct brw_context 
*brw,
 * the sampler.  If we have a texture view, we would have to perform the
 * clear color conversion manually.  Just disable clear color.
 */
-   if (mt->format != view_format)
+   if (mt->surf.format != view_format)
   clear_supported = false;
 
intel_miptree_prepare_access(brw, mt, start_level, num_levels,
@@ -2539,7 +2536,7 @@ intel_miptree_prepare_texture_slices(struct brw_context 
*brw,
 void
 intel_miptree_prepare_texture(struct brw_context *brw,
   struct intel_mipmap_tree *mt,
-  mesa_format view_format,
+  enum isl_format view_format,
   bool *aux_supported_out)
 {
intel_miptree_prepare_texture_slices(brw, mt, view_format,
@@ -2563,7 +2560,7 @@ intel_miptree_prepare_fb_fetch(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,
uint32_t start_layer, uint32_t num_layers)
 {
-   intel_miptree_prepare_texture_slices(brw, mt, mt->format, level, 1,
+   intel_miptree_prepare_texture_slices(brw, mt, mt->surf.format, level, 1,
 start_layer, num_layers, NULL);
 }
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 5737808..bc49b6b 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -618,7 +618,7 @@ intel_miptree_texture_aux_usage(struct brw_context *brw,
 void
 intel_miptree_prepare_texture(struct brw_context *brw,
   struct intel_mipmap_tree *mt,
-  mesa_format view_format,
+  enum isl_format view_format,
   bool *aux_supported_out);
 void
 intel_miptree_prepare_image(struct brw_con

Re: [Mesa-dev] [PATCH] i965: Support the mesa_no_error driconf option.

2017-07-22 Thread Matt Turner
Reviewed-by: Matt Turner 


[Mesa-dev] [PATCH 1/2] svga: rework the FS white fragments code

2017-07-22 Thread Brian Paul
When we forcibly write white to FS outputs (for XOR mode emulation)
we were using a temp register.  But that's not really necessary.
This also fixes the case of writing white to multiple color buffers.

Subsequent changes will build on this.
---
 src/gallium/drivers/svga/svga_state_fs.c| 11 
 src/gallium/drivers/svga/svga_tgsi_vgpu10.c | 43 +++--
 2 files changed, 21 insertions(+), 33 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_state_fs.c 
b/src/gallium/drivers/svga/svga_state_fs.c
index 07a3614..bf45216 100644
--- a/src/gallium/drivers/svga/svga_state_fs.c
+++ b/src/gallium/drivers/svga/svga_state_fs.c
@@ -232,9 +232,7 @@ make_fs_key(const struct svga_context *svga,
 *   
 * SVGA_NEW_BLEND
 */
-   if (svga->curr.blend->need_white_fragments) {
-  key->fs.white_fragments = 1;
-   }
+   key->fs.white_fragments = svga->curr.blend->need_white_fragments;
 
 #ifdef DEBUG
/*
@@ -349,9 +347,10 @@ make_fs_key(const struct svga_context *svga,
   }
}
 
-   /* SVGA_NEW_FRAME_BUFFER */
-   if (fs->base.info.properties[TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS]) {
-  /* Replicate color0 output to N colorbuffers */
+   /* SVGA_NEW_FRAME_BUFFER | SVGA_NEW_BLEND */
+   if (fs->base.info.properties[TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS] ||
+   svga->curr.blend->need_white_fragments) {
+  /* Replicate color0 output (or white) to N colorbuffers */
   key->fs.write_color0_to_n_cbufs = svga->curr.framebuffer.nr_cbufs;
}
 
diff --git a/src/gallium/drivers/svga/svga_tgsi_vgpu10.c 
b/src/gallium/drivers/svga/svga_tgsi_vgpu10.c
index d9b76c2..9f5cd4b 100644
--- a/src/gallium/drivers/svga/svga_tgsi_vgpu10.c
+++ b/src/gallium/drivers/svga/svga_tgsi_vgpu10.c
@@ -2707,7 +2707,6 @@ emit_temporaries_declaration(struct 
svga_shader_emitter_v10 *emit)
}
else if (emit->unit == PIPE_SHADER_FRAGMENT) {
   if (emit->key.fs.alpha_func != SVGA3D_CMP_ALWAYS ||
-  emit->key.fs.white_fragments ||
   emit->key.fs.write_color0_to_n_cbufs > 1) {
  /* Allocate a temp to hold the output color */
  emit->fs.color_tmp_index = total_temps;
@@ -6414,11 +6413,9 @@ emit_alpha_test_instructions(struct 
svga_shader_emitter_v10 *emit,
emit_src_register(emit, &tmp_src_x);
end_emit_instruction(emit);
 
-   /* If we don't need to broadcast the color below or set fragments to
-* white, emit final color here.
+   /* If we don't need to broadcast the color below, emit the final color here.
 */
-   if (emit->key.fs.write_color0_to_n_cbufs <= 1 &&
-   !emit->key.fs.white_fragments) {
+   if (emit->key.fs.write_color0_to_n_cbufs <= 1) {
   /* MOV output.color, tempcolor */
   emit_instruction_op1(emit, VGPU10_OPCODE_MOV, &color_dst,
&color_src, FALSE); /* XXX saturate? */
@@ -6429,23 +6426,6 @@ emit_alpha_test_instructions(struct 
svga_shader_emitter_v10 *emit,
 
 
 /**
- * When we need to emit white for all fragments (for emulating XOR logicop
- * mode), this function copies white into the temporary color output register.
- */
-static void
-emit_set_color_white(struct svga_shader_emitter_v10 *emit,
- unsigned fs_color_tmp_index)
-{
-   struct tgsi_full_dst_register color_dst =
-  make_dst_temp_reg(fs_color_tmp_index);
-   struct tgsi_full_src_register white =
-  make_immediate_reg_float(emit, 1.0f);
-
-   emit_instruction_op1(emit, VGPU10_OPCODE_MOV, &color_dst, &white, FALSE);
-}
-
-
-/**
  * Emit instructions for writing a single color output to multiple
  * color buffers.
  * This is used when the TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS (or
@@ -6460,8 +6440,17 @@ emit_broadcast_color_instructions(struct 
svga_shader_emitter_v10 *emit,
 {
const unsigned n = emit->key.fs.write_color0_to_n_cbufs;
unsigned i;
-   struct tgsi_full_src_register color_src =
-  make_src_temp_reg(fs_color_tmp_index);
+   struct tgsi_full_src_register color_src;
+
+   if (emit->key.fs.white_fragments) {
+  /* set all color outputs to white */
+  color_src = make_immediate_reg_float(emit, 1.0f);
+   }
+   else {
+  /* set all color outputs to TEMP[fs_color_tmp_index] */
+  assert(fs_color_tmp_index != INVALID_INDEX);
+  color_src = make_src_temp_reg(fs_color_tmp_index);
+   }
 
assert(emit->unit == PIPE_SHADER_FRAGMENT);
 
@@ -6497,6 +6486,9 @@ emit_post_helpers(struct svga_shader_emitter_v10 *emit)
else if (emit->unit == PIPE_SHADER_FRAGMENT) {
   const unsigned fs_color_tmp_index = emit->fs.color_tmp_index;
 
+  assert(!(emit->key.fs.white_fragments &&
+   emit->key.fs.write_color0_to_n_cbufs == 0));
+
   /* We no longer want emit_dst_register() to substitute the
* temporary fragment color register for the real color output.
*/
@@ -6505,9 +6497,6 @@ emit_post_helpers(struct svga_shader_emitter_v10 *emit)
   if (emit->key.fs.alpha_func != SVGA3D_CMP_ALWAYS) {
  emit_alpha_tes

[Mesa-dev] [PATCH 2/2] svga: implement MSAA alpha_to_one feature

2017-07-22 Thread Brian Paul
The device doesn't directly support this feature, so we implement it with
additional shader code which sets the w component of the color output(s)
to 1.0 (or max_int or max_uint).

Fixes 16 Piglit ext_framebuffer_multisample/*alpha-to-one* tests.
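
As a rough illustration of the "1.0 (or max_int or max_uint)" choice,
the sketch below picks the alpha replacement for one render target from
two bitmasks like the int/uint render target masks added to the shader
key. The function name alpha_one_bits and the assumption of 32-bit
channels are hypothetical; the real shader-emission code is in the
patch itself and may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Return the 32-bit pattern to write into the w component of color
// output `rt` (rt must be < 32). Pure signed-integer targets get the
// largest int32, pure unsigned-integer targets get the largest uint32,
// and everything else gets the bit pattern of 1.0f.
inline uint32_t alpha_one_bits(unsigned rt, uint32_t int_mask,
                               uint32_t uint_mask) {
   const uint32_t bit = 1u << rt;
   if (int_mask & bit)
      return static_cast<uint32_t>(std::numeric_limits<int32_t>::max());
   if (uint_mask & bit)
      return std::numeric_limits<uint32_t>::max();

   const float one = 1.0f;
   uint32_t bits;
   std::memcpy(&bits, &one, sizeof bits);   // type-pun without UB
   return bits;
}
```

The point of keeping the masks in the key is that the "one" value
depends on the bound render target formats, so a shader variant is
needed whenever that set changes.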
---
 src/gallium/drivers/svga/svga_context.h |  1 +
 src/gallium/drivers/svga/svga_pipe_blend.c  |  1 +
 src/gallium/drivers/svga/svga_shader.h  |  3 ++
 src/gallium/drivers/svga/svga_state_fs.c| 18 +
 src/gallium/drivers/svga/svga_tgsi_vgpu10.c | 61 -
 5 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/svga/svga_context.h 
b/src/gallium/drivers/svga/svga_context.h
index d0306c0..0d695a6 100644
--- a/src/gallium/drivers/svga/svga_context.h
+++ b/src/gallium/drivers/svga/svga_context.h
@@ -106,6 +106,7 @@ struct svga_blend_state {
unsigned need_white_fragments:1;
unsigned independent_blend_enable:1;
unsigned alpha_to_coverage:1;
+   unsigned alpha_to_one:1;
unsigned blend_color_alpha:1;  /**< set blend color to alpha value */
 
/** Per-render target state */
diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c 
b/src/gallium/drivers/svga/svga_pipe_blend.c
index 408e175..a29fbd3 100644
--- a/src/gallium/drivers/svga/svga_pipe_blend.c
+++ b/src/gallium/drivers/svga/svga_pipe_blend.c
@@ -331,6 +331,7 @@ svga_create_blend_state(struct pipe_context *pipe,
blend->independent_blend_enable = templ->independent_blend_enable;
 
blend->alpha_to_coverage = templ->alpha_to_coverage;
+   blend->alpha_to_one = templ->alpha_to_one;
 
if (svga_have_vgpu10(svga)) {
   define_blend_state_object(svga, blend);
diff --git a/src/gallium/drivers/svga/svga_shader.h 
b/src/gallium/drivers/svga/svga_shader.h
index a594d12..53c9e22 100644
--- a/src/gallium/drivers/svga/svga_shader.h
+++ b/src/gallium/drivers/svga/svga_shader.h
@@ -77,11 +77,14 @@ struct svga_compile_key
   unsigned light_twoside:1;
   unsigned front_ccw:1;
   unsigned white_fragments:1;
+  unsigned alpha_to_one:1;
   unsigned flatshade:1;
   unsigned pstipple:1;
   unsigned alpha_func:4;  /**< SVGA3D_CMP_x */
   unsigned write_color0_to_n_cbufs:4;
   unsigned aa_point:1;
+  unsigned int_render_target_mask:8;
+  unsigned uint_render_target_mask:8;
   int aa_point_coord_index;
   float alpha_ref;
} fs;
diff --git a/src/gallium/drivers/svga/svga_state_fs.c 
b/src/gallium/drivers/svga/svga_state_fs.c
index bf45216..327364c 100644
--- a/src/gallium/drivers/svga/svga_state_fs.c
+++ b/src/gallium/drivers/svga/svga_state_fs.c
@@ -25,6 +25,7 @@
 
 #include "util/u_inlines.h"
 #include "pipe/p_defines.h"
+#include "util/u_format.h"
 #include "util/u_math.h"
 #include "util/u_memory.h"
 #include "util/u_bitmask.h"
@@ -234,6 +235,8 @@ make_fs_key(const struct svga_context *svga,
 */
key->fs.white_fragments = svga->curr.blend->need_white_fragments;
 
+   key->fs.alpha_to_one = svga->curr.blend->alpha_to_one;
+
 #ifdef DEBUG
/*
 * We expect a consistent set of samplers and sampler views.
@@ -354,6 +357,21 @@ make_fs_key(const struct svga_context *svga,
   key->fs.write_color0_to_n_cbufs = svga->curr.framebuffer.nr_cbufs;
}
 
+   /* SVGA_NEW_FRAME_BUFFER
+* Determine which render targets are int/uint/float.
+*/
+   const struct pipe_framebuffer_state *fb = &svga->curr.framebuffer;
+   for (i = 0; i < fb->nr_cbufs; i++) {
+  const enum pipe_format f =
+ fb->cbufs[i] ? fb->cbufs[i]->format : PIPE_FORMAT_NONE;
+  if (util_format_is_pure_sint(f)) {
+ key->fs.int_render_target_mask |= 1 << i;
+  }
+  else if (util_format_is_pure_uint(f)) {
+ key->fs.uint_render_target_mask |= 1 << i;
+  }
+   }
+
return PIPE_OK;
 }
 
diff --git a/src/gallium/drivers/svga/svga_tgsi_vgpu10.c 
b/src/gallium/drivers/svga/svga_tgsi_vgpu10.c
index 9f5cd4b..8984ce5 100644
--- a/src/gallium/drivers/svga/svga_tgsi_vgpu10.c
+++ b/src/gallium/drivers/svga/svga_tgsi_vgpu10.c
@@ -167,8 +167,8 @@ struct svga_shader_emitter_v10
 
/* For fragment shaders only */
struct {
-  /* apha test */
   unsigned color_out_index[PIPE_MAX_COLOR_BUFS];  /**< the real color 
output regs */
+  unsigned num_color_outputs;
   unsigned color_tmp_index;  /**< fake/temp color output reg */
   unsigned alpha_ref_index;  /**< immediate constant for alpha ref */
 
@@ -2499,6 +2499,9 @@ emit_output_declarations(struct svga_shader_emitter_v10 
*emit)
 
 emit->fs.color_out_index[semantic_index] = index;
 
+emit->fs.num_color_outputs = MAX2(emit->fs.num_color_outputs,
+  index + 1);
+
 /* The semantic index is the shader's color output/buffer index */
 emit_output_declaration(emit,
 VGPU10_OPCODE_DCL_OUTPUT, semantic_index,
@@ -2521,6 +2524,9 @@ emit_output_declarations(struct svga_shader_emitter_v10 
*emit)
 

Re: [Mesa-dev] [PATCH 1/5] clover/memory: Copy data when creating buffers with CL_MEM_USE_HOST_PTR

2017-07-22 Thread Jan Vesely
On Fri, 2017-07-21 at 23:19 -0500, Aaron Watry wrote:
> Fixes: OpenCL CTS test/conformance/buffers/buffer_copy

A similar patch was pushed in 2013:
56647c5d8f8e60269f0a3277e3caa7ee57d1fe6a
"clover: Append buffers that use CL_MEM_USE_HOST_PTR."

Grigory (added to CC) reverted the change and implemented the userptr
mechanism in:
f972b223c4cb4ec58a9451cbac5d120ac9deb336
"clover: try userptr for CL_MEM_USE_HOST_PTR"

The buffer-flags piglit still passes, but it maps even
CL_MEM_USE_HOST_PTR buffers. I'm not sure what the CTS does; we might
need a synchronization point after kernels finish execution. I couldn't
find the part of the spec that defines accessing CL_MEM_USE_HOST_PTR
buffers without clEnqueueMapBuffer or clEnqueueReadBuffer.
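
For reference, the effect of the one-line change quoted below can be
sketched with a stand-in class. fake_memory_obj is hypothetical and only
mirrors the flag test in the constructor; the two flag values are the
ones defined in CL/cl.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Same values as CL_MEM_USE_HOST_PTR and CL_MEM_COPY_HOST_PTR in CL/cl.h.
constexpr uint64_t MEM_USE_HOST_PTR  = 1u << 3;
constexpr uint64_t MEM_COPY_HOST_PTR = 1u << 5;

// Hypothetical stand-in for clover's memory_obj: it snapshots the host
// data at creation time when either flag is set.
struct fake_memory_obj {
   std::string data;

   fake_memory_obj(uint64_t flags, std::size_t size, const void *host_ptr) {
      if (flags & (MEM_COPY_HOST_PTR | MEM_USE_HOST_PTR))
         data.append(static_cast<const char *>(host_ptr), size);
   }
};
```

Note that this is a one-time copy: anything the host writes through the
pointer afterwards is not reflected in the snapshot, which is why a
synchronization point may still be needed.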


Jan


> 
> Signed-off-by: Aaron Watry 
> CC: Francisco Jerez 
> ---
>  src/gallium/state_trackers/clover/core/memory.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/state_trackers/clover/core/memory.cpp 
> b/src/gallium/state_trackers/clover/core/memory.cpp
> index b852e6896f..912d74830a 100644
> --- a/src/gallium/state_trackers/clover/core/memory.cpp
> +++ b/src/gallium/state_trackers/clover/core/memory.cpp
> @@ -30,7 +30,7 @@ memory_obj::memory_obj(clover::context &ctx, cl_mem_flags 
> flags,
> size_t size, void *host_ptr) :
> context(ctx), _flags(flags),
> _size(size), _host_ptr(host_ptr) {
> -   if (flags & CL_MEM_COPY_HOST_PTR)
> +   if (flags & (CL_MEM_COPY_HOST_PTR | CL_MEM_USE_HOST_PTR))
>data.append((char *)host_ptr, size);
>  }
>  

-- 
Jan Vesely 



Re: [Mesa-dev] [PATCH 2/5] clover/device: Move device version into core/device.cpp

2017-07-22 Thread Jan Vesely
On Fri, 2017-07-21 at 23:19 -0500, Aaron Watry wrote:
> The device version is the maximum CL version that the device supports.
> 
> Eventually, this will query the pipe_driver itself, but for now move it
> a bit closer to its eventual destination.

Ideally we'd check the supported extensions, but the extensions required
by CLC 1.1 are hardcoded just a few lines below.

Reviewed-by: Jan Vesely 

Jan

> 
> Signed-off-by: Aaron Watry 
> ---
>  src/gallium/state_trackers/clover/api/device.cpp  | 4 ++--
>  src/gallium/state_trackers/clover/core/device.cpp | 5 +
>  src/gallium/state_trackers/clover/core/device.hpp | 1 +
>  3 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/clover/api/device.cpp 
> b/src/gallium/state_trackers/clover/api/device.cpp
> index 0b33350bb2..18ed2f059f 100644
> --- a/src/gallium/state_trackers/clover/api/device.cpp
> +++ b/src/gallium/state_trackers/clover/api/device.cpp
> @@ -314,7 +314,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
>break;
>  
> case CL_DEVICE_VERSION:
> -  buf.as_string() = "OpenCL 1.1 Mesa " PACKAGE_VERSION
> +  buf.as_string() = "OpenCL " + dev.device_version() + " Mesa " 
> PACKAGE_VERSION
>  #ifdef MESA_GIT_SHA1
>  " (" MESA_GIT_SHA1 ")"
>  #endif
> @@ -368,7 +368,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
>break;
>  
> case CL_DEVICE_OPENCL_C_VERSION:
> -  buf.as_string() = "OpenCL C 1.1 ";
> +  buf.as_string() = "OpenCL C " + dev.device_version() + " ";
>break;
>  
> case CL_DEVICE_PRINTF_BUFFER_SIZE:
> diff --git a/src/gallium/state_trackers/clover/core/device.cpp 
> b/src/gallium/state_trackers/clover/core/device.cpp
> index 2ad9e49cf8..0277495506 100644
> --- a/src/gallium/state_trackers/clover/core/device.cpp
> +++ b/src/gallium/state_trackers/clover/core/device.cpp
> @@ -240,3 +240,8 @@ enum pipe_endian
>  device::endianness() const {
> return (enum pipe_endian)pipe->get_param(pipe, PIPE_CAP_ENDIANNESS);
>  }
> +
> +std::string
> +device::device_version() const {
> +return "1.1";
> +}
> diff --git a/src/gallium/state_trackers/clover/core/device.hpp 
> b/src/gallium/state_trackers/clover/core/device.hpp
> index 7b3353df34..3cf7e20be5 100644
> --- a/src/gallium/state_trackers/clover/core/device.hpp
> +++ b/src/gallium/state_trackers/clover/core/device.hpp
> @@ -74,6 +74,7 @@ namespace clover {
>cl_uint address_bits() const;
>std::string device_name() const;
>std::string vendor_name() const;
> +  std::string device_version() const;
>enum pipe_shader_ir ir_format() const;
>std::string ir_target() const;
>enum pipe_endian endianness() const;

-- 
Jan Vesely 



Re: [Mesa-dev] [PATCH 4/5] clover/llvm: Use -cl-std and device version to select language defaults

2017-07-22 Thread Jan Vesely
On Sat, 2017-07-22 at 13:20 +0200, Pierre Moreau wrote:
> Hi Aaron,
> 
> On 2017-07-21 — 23:19, Aaron Watry wrote:
> > According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by:
> >  1) If you have -cl-std=CL1.1+ use the version specified
> >  2) If not, use the highest 1.x version that the device supports
> 
> According to that same part of the spec, clBuildProgram and clCompileProgram
> should fail if the specified CL C version is strictly greater than the version
> the device supports. You could add a check in `get_language_version()` to
> compare `ver` and `device_version`, and throw a `build_error()` exception if
> `ver > device_version`.

These should also be using the CLC version rather than the device CL
version. OCL allows OCL 1.0 devices to support CLC 1.1 language features
(this also requires splitting these values in 3/5).
I'm not sure how realistic that is in clover. It does not look like any
clover-supported device would be unable to provide at least OCL 1.0
runtime features. So feel free to ignore this (but make a note if you
choose to do so).
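
A minimal sketch of the selection rule with Pierre's suggested check,
kept free of clang so it can be read on its own. The names
parse_version and select_clc_version are hypothetical, versions are
plain (major, minor) pairs instead of clang::LangStandard::Kind, and
std::runtime_error stands in for whatever build error clover would
actually raise.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using version = std::pair<int, int>;   // (major, minor)

// Parse "major.minor"; throws std::invalid_argument on malformed input.
inline version parse_version(const std::string &s) {
   const auto dot = s.find('.');
   if (dot == std::string::npos)
      throw std::invalid_argument("bad version: " + s);
   return { std::stoi(s.substr(0, dot)), std::stoi(s.substr(dot + 1)) };
}

// An explicit -cl-std=CLx.y wins, but must not exceed what the device
// supports. Without the option, use the device's CL C version clamped
// to the highest 1.x version (1.2).
inline version select_clc_version(const std::vector<std::string> &opts,
                                  const std::string &device_clc_version) {
   const std::string prefix = "-cl-std=CL";
   const version dev = parse_version(device_clc_version);

   for (const auto &opt : opts) {
      if (opt.compare(0, prefix.size(), prefix) == 0) {
         const version req = parse_version(opt.substr(prefix.size()));
         if (req > dev)
            throw std::runtime_error("requested CL C version not supported");
         return req;
      }
   }
   return std::min(dev, version{1, 2});
}
```

Comparing parsed pairs rather than raw strings keeps the "strictly
greater than" test well defined, and passing the CL C version instead of
the device version is what would let a 1.0 device accept CLC 1.1 source.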

Jan

> 
> I have two more comments further down.
> 
> > Curiously, there is no valid value for -cl-std=CL1.0
> > 
> > Signed-off-by: Aaron Watry 
> > ---
> >  .../state_trackers/clover/llvm/invocation.cpp  | 48 
> > --
> >  1 file changed, 45 insertions(+), 3 deletions(-)
> > 
> > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> > b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > index 364aaf1517..92d72e5b73 100644
> > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > @@ -93,6 +93,48 @@ namespace {
> >return ctx;
> > }
> >  
> > +   clang::LangStandard::Kind
> > +   get_language_from_version_str(const std::string &version_str,
> > + bool is_opt = false) {
> > +   /**
> > +* Per CL 2.0 spec, section 5.8.4.5:
> > +* If it's an option, use the value directly.
> > +* If it's a device version, clamp to max 1.x version, a.k.a. 1.2
> > +*/
> > +   if (version_str == "1.1")
> > +  return clang::LangStandard::lang_opencl11;
> > +   if (version_str == "1.2")
> > +  return clang::LangStandard::lang_opencl12;
> > +   if (version_str == "2.0"){
> > +  if (is_opt) return clang::LangStandard::lang_opencl20;
> > +  else return clang::LangStandard::lang_opencl12;
> > +   }
> > +
> > +   /*
> > +* At this point, it's not a recognized language version option or
> > +* 1.1+ device version, which just leaves 1.0 as a possible device
> > +* version (or an invalid version string).
> > +*/
> > +   return clang::LangStandard::lang_opencl10;
> > +  }
> > +
> > +   clang::LangStandard::Kind
> > +   get_language_version(const std::vector<std::string> &opts,
> > +const std::string &device_version) {
> > +
> > +  const std::string search = "-cl-std=CL";
> > +
> > +   for(auto opt: opts){
> > +   auto pos = opt.find(search);
> > +   if (pos == 0){
> > +   auto ver = opt.substr(pos+search.size());
> > +   return get_language_from_version_str(ver, true);
> > +   }
> > +   }
> > +
> > +   return get_language_from_version_str(device_version);
> > +}
> > +
> > > std::unique_ptr<clang::CompilerInstance>
> > > create_compiler_instance(const target &target,
> > >  const std::vector<std::string> &opts,
> > @@ -129,7 +171,7 @@ namespace {
> >compat::set_lang_defaults(c->getInvocation(), c->getLangOpts(),
> >  compat::ik_opencl, 
> > ::llvm::Triple(target.triple),
> >  c->getPreprocessorOpts(),
> > -clang::LangStandard::lang_opencl11);
> > +get_language_version(opts, 
> > device_version));
> >  
> >c->createDiagnostics(new clang::TextDiagnosticPrinter(
> >*new raw_string_ostream(r_log),
> > @@ -211,7 +253,7 @@ clover::llvm::compile_program(const std::string &source,
> >  
> > auto ctx = create_context(r_log);
> > auto c = create_compiler_instance(target, tokenize(opts + " input.cl"),
> > - r_log);
> > + device_version, r_log);
> 
> This should be part of patch 3 as that patch doesn't build otherwise.
> 
> > auto mod = compile(*ctx, *c, "input.cl", source, headers, target, opts,
> >r_log);
> >  
> > @@ -280,7 +322,7 @@ clover::llvm::link_program(const std::vector 
> > &modules,
> > erase_if(equals("-create-library"), options);
> >  
> > auto ctx = create_context(r_log);
> > -   auto c = create_compiler_instance(target, options, r_log);
> > +   auto c = create_compiler_instance(target, options, device_version, 
> > r_log);
> 
> Same here, this

Re: [Mesa-dev] [PATCH 5/5] clover/llvm: Make __OPENCL_VERSION__ dynamic

2017-07-22 Thread Jan Vesely
On Fri, 2017-07-21 at 23:19 -0500, Aaron Watry wrote:
> Base it on the active language version
> 
> Signed-off-by: Aaron Watry 
> ---
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 92d72e5b73..b562babf91 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -202,7 +202,8 @@ namespace {
>c.getPreprocessorOpts().Includes.push_back("clc/clc.h");
>  
>// Add definition for the OpenCL version
> -  c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=110");
> +  c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=" +
> +  std::to_string(c.getLangOpts().OpenCLVersion));

Similar to the previous patch: this should use the device OCL version
rather than the language version. Moreover, I don't think the value of
this macro should be affected by the -cl-std= build parameter.
__OPENCL_C_VERSION__ was added to cover both of the above behaviours.

Jan
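
The distinction raised in this review can be sketched as follows. This is a minimal illustration, not clover code: the helper names (version_to_macro_value, find_cl_std, version_macro_defs) are hypothetical, and it assumes __OPENCL_VERSION__ follows the device version while __OPENCL_C_VERSION__ follows -cl-std= when that option is present. It deliberately ignores the clamp to 1.2 that the previous patch applies to 2.0 devices.

```cpp
#include <exception>
#include <string>
#include <vector>

// Hypothetical helper: turn "1.2" into the integer form used by the
// OpenCL version macros (major * 100 + minor * 10, so "1.2" becomes 120).
static int version_to_macro_value(const std::string &version) {
   const auto dot = version.find('.');
   if (dot == std::string::npos)
      return 100; // assumption: treat malformed input as 1.0
   try {
      const int major = std::stoi(version.substr(0, dot));
      const int minor = std::stoi(version.substr(dot + 1));
      return major * 100 + minor * 10;
   } catch (const std::exception &) {
      return 100; // assumption: treat malformed input as 1.0
   }
}

// Hypothetical helper: return the "x.y" part of a -cl-std=CLx.y option,
// or an empty string when no such option was passed.
static std::string find_cl_std(const std::vector<std::string> &opts) {
   const std::string prefix = "-cl-std=CL";
   for (const auto &opt : opts) {
      if (opt.compare(0, prefix.size(), prefix) == 0)
         return opt.substr(prefix.size());
   }
   return "";
}

// Build the two macro definitions: the first depends only on the device,
// the second on the build option (falling back to the device version).
static std::vector<std::string>
version_macro_defs(const std::string &device_version,
                   const std::vector<std::string> &opts) {
   const std::string cl_std = find_cl_std(opts);
   const std::string lang = cl_std.empty() ? device_version : cl_std;
   return {
      "__OPENCL_VERSION__=" +
         std::to_string(version_to_macro_value(device_version)),
      "__OPENCL_C_VERSION__=" +
         std::to_string(version_to_macro_value(lang)),
   };
}
```

With a 1.2 device and -cl-std=CL1.1, only the second definition changes, which is the behaviour the review asks for.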

>  
>// clc.h requires that this macro be defined:
>
> c.getPreprocessorOpts().addMacroDef("cl_clang_storage_class_specifiers");

-- 
Jan Vesely 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 101876] SIGSEGV when launching Steam

2017-07-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101876

Mike Lothian  changed:

What |Removed |Added
-----+--------+--------------------
CC   |        |m...@fireburn.co.uk

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 7/7] i965: Drop non-LLC lunacy in the program cache code.

2017-07-22 Thread Kenneth Graunke
On Saturday, July 22, 2017 2:14:28 AM PDT Chris Wilson wrote:
> Quoting Kenneth Graunke (2017-07-22 00:17:47)
> > The non-LLC story was a horror show.  We uploaded data via pwrite
> > (drm_intel_bo_subdata), which would stall if the cache BO was in
> > use (being read) by the GPU.  Obviously, we wanted to avoid that.
> > So, we tried to detect whether the buffer was busy, and if so, we'd
> > allocate a new BO, map the old one read-only (hopefully not stalling),
> > copy all shaders compiled since the dawn of time to the new buffer,
> > upload our new one, toss the old BO, and let the state upload code
> > know that our program cache BO changed.  This was a lot of extra data
> > copying, and flagging BRW_NEW_PROGRAM_CACHE would also cause a new
> > STATE_BASE_ADDRESS to be emitted, stalling the entire pipeline.
> > 
> > > Not only that, but our rudimentary busy tracking consisted of a flag
> > set at execbuf time, and not cleared until we threw out the program
> > cache BO.  So, the first shader upload after any drawing would hit this
> > "abandon the cache and start over" copying path.
> > 
> > This is largely unnecessary - it's just ancient and crufty code.  We can
> > use the same persistent mapping paths on all platforms.  On non-ancient
> > kernels, this will use a write combining map, which should be reasonably
> > fast.
> > 
> > One aspect that is worse: we do occasionally grow the program cache BO,
> > and copy the old contents to the newer BO.  This will suffer from UC
> > readback performance now.  To mitigate this, we use the MOVNTDQA based
> > streaming memcpy on platforms with SSE 4.1 (all Gen7+ atoms).  Gen4-5
> > are unfortunately going to be penalized.
> > 
> > v2: Add MOVNTDQA path, rebase on other map flag changes.
> 
> Don't forget cache->bo_used_by_gpu!

Right!  fixed locally, thanks!




Re: [Mesa-dev] [PATCH 3/7] i965/bufmgr: Allocate BO pages outside of the kernel's locking.

2017-07-22 Thread Kenneth Graunke
On Saturday, July 22, 2017 2:09:43 AM PDT Chris Wilson wrote:
> Quoting Kenneth Graunke (2017-07-22 00:17:43)
> > Suggested by Chris Wilson.
> > ---
> >  src/mesa/drivers/dri/i965/brw_bufmgr.c | 13 +
> >  1 file changed, 13 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
> > b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> > index 1669d26e990..78a4626d430 100644
> > --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> > +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> > @@ -382,6 +382,19 @@ retry:
> >  
> >if (bo_set_tiling_internal(bo, tiling_mode, stride))
> >   goto err_free;
> > +
> > +  /* Calling set_domain() will allocate pages for the BO outside of the
> > +   * struct mutex lock in the kernel, which is more efficient than 
> > waiting
> > +   * to create them during the first execbuf that uses the BO.
> > +   */
> > +  struct drm_i915_gem_set_domain sd = {
> > + .handle = bo->gem_handle,
> > + .read_domains = I915_GEM_DOMAIN_CPU,
> > + .write_domain = I915_GEM_DOMAIN_CPU,
> 
> We can pass .write_domain = 0 here. Should be no different as the bo is
> created in the CPU write domain and already marked as dirty. I think it
> is a better reflection of intent that we are just pulling in the pages.
> -Chris

Fixed this locally, thanks!
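
For reference, the suggested change can be sketched like this. It is only an illustration: set_domain_args is a local stand-in that mirrors the three fields of the kernel's struct drm_i915_gem_set_domain used in the patch, GEM_DOMAIN_CPU mirrors I915_GEM_DOMAIN_CPU, and make_prefault_args is a hypothetical helper. A real driver would include the i915 uAPI header and pass the struct to the set-domain ioctl instead.

```cpp
#include <cstdint>

// Stand-in for struct drm_i915_gem_set_domain, redeclared here only so the
// sketch compiles without kernel headers (assumption: same three fields).
struct set_domain_args {
   uint32_t handle;
   uint32_t read_domains;
   uint32_t write_domain;
};

// Mirrors the value of I915_GEM_DOMAIN_CPU in the i915 uAPI header.
constexpr uint32_t GEM_DOMAIN_CPU = 0x00000001;

// Hypothetical helper: arguments for a set_domain call whose only purpose
// is to pull in the BO's pages.  Leaving write_domain at 0 expresses that
// nothing is about to be written through this domain.
static set_domain_args make_prefault_args(uint32_t gem_handle) {
   set_domain_args sd = {};
   sd.handle = gem_handle;
   sd.read_domains = GEM_DOMAIN_CPU;
   sd.write_domain = 0;
   return sd;
}
```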




[Mesa-dev] [PATCH] intel/isl: Tighten up restrictions for CCS on gen7

2017-07-22 Thread Jason Ekstrand
It may technically be possible to enable some sort of fast-clear support
for at least the base slice of a 2D array texture on gen7.  However,
it's not documented to work, we've never tried to do it in GL, and we
have no idea what the hardware does if you turn on CCS_D with arrayed
rendering.  Let's just play it safe and disallow it for now.  If someone
really cares that much about gen7 performance, they can come along and
try to get it working later.
---
 src/intel/isl/isl.c | 34 --
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 9cf5821..5465496 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1749,9 +1749,30 @@ isl_surf_get_ccs_surf(const struct isl_device *dev,
if (surf->usage & ISL_SURF_USAGE_DISABLE_AUX_BIT)
   return false;
 
+   /* The PRM doesn't say this explicitly, but fast-clears don't appear to
+* work for 3D textures until gen9 where the layout of 3D textures changes
+* to match 2D array textures.
+*/
if (ISL_DEV_GEN(dev) <= 8 && surf->dim != ISL_SURF_DIM_2D)
   return false;
 
+   /* From the HSW PRM Volume 7: 3D-Media-GPGPU, page 652 (Color Clear of
+* Non-MultiSampler Render Target Restrictions):
+*
+*"Support is for non-mip-mapped and non-array surface types only."
+*
+* This restriction is lifted on gen8+.  Technically, it may be possible to
+* create a CCS for an arrayed or mipmapped image and only enable CCS_D
+* when rendering to the base slice.  However, there is no documentation
+* to tell us what the hardware would do in that case or what it does if
+* you walk off the base slice.  (Does it ignore CCS or does it start
+* scribbling over random memory?)  We play it safe and just follow the
+* docs and don't allow CCS_D for arrayed or mip-mapped surfaces.
+*/
+   if (ISL_DEV_GEN(dev) <= 7 &&
+   (surf->levels > 1 || surf->logical_level0_px.array_len > 1))
+  return false;
+
if (isl_format_is_compressed(surf->format))
   return false;
 
@@ -1789,21 +1810,14 @@ isl_surf_get_ccs_surf(const struct isl_device *dev,
   return false;
}
 
-   /* Multi-LOD and multi-layer CCS isn't supported on gen7. */
-   const uint8_t levels = ISL_DEV_GEN(dev) <= 7 ? 1 : surf->levels;
-   const uint32_t array_len = ISL_DEV_GEN(dev) <= 7 ?
-  1 : surf->logical_level0_px.array_len;
-   const uint32_t depth = ISL_DEV_GEN(dev) <= 7 ?
-  1 : surf->logical_level0_px.depth;
-
return isl_surf_init(dev, ccs_surf,
 .dim = surf->dim,
 .format = ccs_format,
 .width = surf->logical_level0_px.width,
 .height = surf->logical_level0_px.height,
-.depth = depth,
-.levels = levels,
-.array_len = array_len,
+.depth = surf->logical_level0_px.depth,
+.levels = surf->levels,
+.array_len = surf->logical_level0_px.array_len,
 .samples = 1,
 .row_pitch = row_pitch,
 .usage = ISL_SURF_USAGE_CCS_BIT,
-- 
2.5.0.400.gff86faf
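
The two generation checks in this patch can be read as one predicate. The sketch below is a simplified illustration and not the ISL API: surf_info and ccs_allowed are hypothetical names, and the real isl_surf_get_ccs_surf() performs further checks (aux disable bit, compressed formats, and so on) that are left out here.

```cpp
#include <cstdint>

enum class surf_dim { dim_1d, dim_2d, dim_3d };

// Hypothetical, trimmed-down view of the surface fields the checks read.
struct surf_info {
   surf_dim dim;
   uint32_t levels;    // number of miplevels
   uint32_t array_len; // number of array layers
};

// Returns true when neither generation restriction from the patch applies.
static bool ccs_allowed(int gen, const surf_info &surf) {
   // Gen8 and earlier: only 2D surfaces.
   if (gen <= 8 && surf.dim != surf_dim::dim_2d)
      return false;

   // Gen7 and earlier: no mipmapped and no arrayed surfaces.
   if (gen <= 7 && (surf.levels > 1 || surf.array_len > 1))
      return false;

   return true;
}
```

Under these assumptions, an arrayed 2D surface is rejected on gen7 but accepted on gen8, and a 3D surface is accepted only from gen9 onwards.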



Re: [Mesa-dev] [PATCH] mesa: add compressed_tex_sub_image_{error, no_error} helpers

2017-07-22 Thread Timothy Arceri

On 21/07/17 19:44, Samuel Pitoiset wrote:

On 07/21/2017 11:19 AM, Timothy Arceri wrote:
I wasn't too worried about this because more than just no_error gets 
in-lined away, for example the dsa conditions and also the dim == 3. 
The resulting output shouldn't be overly large. What do you think?


Usually, when a helper function marked as ALWAYS_INLINE has to be 
inlined more than one time, we declare two more helpers (XXX_error() and 
XXX_no_error()), this is also why I wrote this patch.


Sure I'm just saying that in this instance it's not so bad, the code 
getting inlined will be rather small and in most cases will end up 
unique to the caller.


However, we probably should just keep things consistent as you say. I'm
sure it's not gaining us much, so:


Reviewed-by: Timothy Arceri 
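
The convention discussed here can be sketched as follows. This is a toy illustration with hypothetical names (sub_image, api_entry_1d, SKETCH_ALWAYS_INLINE), not the Mesa code: a force-inlined worker takes a constant no_error flag, two thin wrappers fix that flag, and the API entry points call the wrappers so the worker body is instantiated only twice.

```cpp
#if defined(__GNUC__)
#define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define SKETCH_ALWAYS_INLINE inline
#endif

// Worker: with no_error known at compile time, the validation branch can
// be dropped entirely from the no_error instantiation.
static SKETCH_ALWAYS_INLINE int
sub_image(int width, bool no_error)
{
   if (!no_error && width < 0)
      return -1;      // stand-in for raising a GL error
   return width * 2;  // stand-in for the real work
}

// The two wrappers are ordinary functions, so the worker is inlined here
// and not again in every entry point.
static int sub_image_error(int width)    { return sub_image(width, false); }
static int sub_image_no_error(int width) { return sub_image(width, true); }

// Entry points only choose which wrapper to call.
int api_entry_1d(int width)          { return sub_image_error(width); }
int api_entry_1d_no_error(int width) { return sub_image_no_error(width); }
```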





On 21/07/17 18:43, Samuel Pitoiset wrote:

To avoid inlining compressed_tex_sub_image() a bunch of times.

Signed-off-by: Samuel Pitoiset 
---
 src/mesa/main/teximage.c | 101 ++-
 1 file changed, 65 insertions(+), 36 deletions(-)

diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index 38feb3fea4..c8aa2803e7 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -4997,6 +4997,31 @@ compressed_tex_sub_image(unsigned dim, GLenum target, GLuint texture,
 }
 }
+static void
+compressed_tex_sub_image_error(unsigned dim, GLenum target, GLuint texture,
+   GLint level, GLint xoffset, GLint yoffset,
+   GLint zoffset, GLsizei width, GLsizei height,
+   GLsizei depth, GLenum format, GLsizei imageSize,
+   const GLvoid *data, bool dsa,
+   const char *caller)
+{
+   compressed_tex_sub_image(dim, target, texture, level, xoffset, yoffset,
+zoffset, width, height, depth, format, imageSize,
+data, dsa, false, caller);
+}
+
+static void
+compressed_tex_sub_image_no_error(unsigned dim, GLenum target, GLuint texture,
+  GLint level, GLint xoffset, GLint yoffset,
+  GLint zoffset, GLsizei width, GLsizei height,
+  GLsizei depth, GLenum format, GLsizei imageSize,
+  const GLvoid *data, bool dsa,
+  const char *caller)
+{
+   compressed_tex_sub_image(dim, target, texture, level, xoffset, yoffset,
+zoffset, width, height, depth, format, imageSize,
+data, dsa, true, caller);
+}
 void GLAPIENTRY
 _mesa_CompressedTexSubImage1D_no_error(GLenum target, GLint level,
@@ -5004,9 +5029,9 @@ _mesa_CompressedTexSubImage1D_no_error(GLenum target, GLint level,
 GLenum format, GLsizei imageSize,
 const GLvoid *data)
 {
-   compressed_tex_sub_image(1, target, 0, level, xoffset, 0, 0, width,
-1, 1, format, imageSize, data, false, true,
-"glCompressedTexSubImage1D");
+   compressed_tex_sub_image_no_error(1, target, 0, level, xoffset, 0, 0, width,
+ 1, 1, format, imageSize, data, false,
+ "glCompressedTexSubImage1D");
 }
@@ -5015,9 +5040,9 @@ _mesa_CompressedTexSubImage1D(GLenum target, GLint level, GLint xoffset,
 GLsizei width, GLenum format,
 GLsizei imageSize, const GLvoid *data)
 {
-   compressed_tex_sub_image(1, target, 0, level, xoffset, 0, 0, width, 1, 1,
-format, imageSize, data, false, false,
-"glCompressedTexSubImage1D");
+   compressed_tex_sub_image_error(1, target, 0, level, xoffset, 0, 0, width, 1,
+  1, format, imageSize, data, false,
+  "glCompressedTexSubImage1D");
 }
@@ -5027,9 +5052,9 @@ _mesa_CompressedTextureSubImage1D_no_error(GLuint texture, GLint level,
 GLenum format, GLsizei imageSize,
 const GLvoid *data)
 {
-   compressed_tex_sub_image(1, 0, texture, level, xoffset, 0, 0, width, 1, 1,
-format, imageSize, data, true, true,
-"glCompressedTextureSubImage1D");
+   compressed_tex_sub_image_no_error(1, 0, texture, level, xoffset, 0, 0, width,
+ 1, 1, format, imageSize, data, true,
+ "glCompressedTextureSubImage1D");
 }
@@ -5038,9 +5063,9 @@ _mesa_CompressedTextureSubImage1D(GLuint texture, GLint level, GLint xoffset,
 GLsizei width, GLenum format,
  

Re: [Mesa-dev] [PATCH 026/101] mesa: add KHR_no_error support to gl{Create, Gen}VertexArrays()

2017-07-22 Thread Timothy Arceri

1-26:

Reviewed-by: Timothy Arceri 


[Mesa-dev] [AppVeyor] mesa master #4999 failed

2017-07-22 Thread AppVeyor



Build mesa 4999 failed


Commit 3e57e9494c by Jason Ekstrand on 6/22/2017 4:35 AM:

i965: Enable regular fast-clears (CCS_D) on gen9+

The set of formats which supports CCS_E is actually fairly small on
gen9.  However, everything that supports fast-clears on gen8 also
supports fast-clears on gen9+.  The one very annoying exception is
that blending is broken for non-0/1 clear colors with sRGB formats.
In order to solve that problem, we do a resolve to get rid of the
clear color.  Another option would be to just not fast-clear with
non-0/1 clear colors however non-0/1 + blending + sRGB is uncommon
enough that this shouldn't be a significant performance problem.

This appears to help gl_manhattan31_off by about 2%.

Reviewed-by: Topi Pohjolainen 




[Mesa-dev] [AppVeyor] mesa master #5000 completed

2017-07-22 Thread AppVeyor


Build mesa 5000 completed



Commit 6874b953f6 by Jason Ekstrand on 7/11/2017 6:07 PM:

anv/image: zalloc image views

This allows us to avoid some extra zeroing.

Reviewed-by: Lionel Landwerlin 




Re: [Mesa-dev] Mesa 17.2.0 release plan

2017-07-22 Thread Jason Ekstrand
On Fri, Jul 21, 2017 at 11:00 AM, Emil Velikov 
wrote:

> On 20 July 2017 at 06:09, Jason Ekstrand  wrote:
> > On Mon, Jul 17, 2017 at 7:54 AM, Emil Velikov 
> > wrote:
> >>
> >> On 16 July 2017 at 06:35, Jason Ekstrand  wrote:
> >> > On Fri, Jul 7, 2017 at 11:07 AM, Emil Velikov <emil.l.veli...@gmail.com>
> >> > wrote:
> >> >>
> >> >> Hi all,
> >> >>
> >> >> As you may have noticed, for a little while now we've had the release
> >> >> plan on the mesa3d.org website [1].
> >> >>
> >> >> Here is the current tentative schedule.
> >> >>
> >> >>  Jul 21 2017 - Feature freeze/Release candidate 1
> >> >>  Jul 28 2017 - Release candidate 2
> >> >>  Aug 04 2017 - Release candidate 3
> >> >>  Aug 11 2017 - Release candidate 4/final release
> >> >>
> >> >> This gives us approximately 2 weeks to get new features in.
> >> >>
> >> >> As always, please let me know of must-have features that you'd like
> >> >> to merge before the branch point.
> >> >
> >> >
> >> > Here's my wishlist:
> >> >
> >> >  1) I'd like to land the rest of Topi's work to convert the i965 driver
> >> > over to ISL.  I really don't want to have a release where it's half old
> >> > miptree code and half ISL.  That's going to make back-porting a real pain.
> >
> >
> > The depth/stencil bits are basically ready to land and I think Topi will
> > be sending the color bits later today.
>

Landed.


> >>
> >> >  2) I've got a bunch of fixes to the miptree code that are currently WIP
> >> > that I'd like to see go in.  If they don't make the branch point, I'll
> >> > probably just CC the lot to stable so it's not a huge deal.
> >
> >
> > These hit the list today.  I expect topi to review fairly quickly.
>

Landed.


> >>
> >> >  3) As many of the new Vulkan extensions as we can land.  But don't
> >> > block the release on any of those.
> >
> >
> > I think everything I had any hope of landing here has landed.
> >
> >>
> >> It's a bit lengthy list considering we have a bit less than a week.
> >>
> >> I've pushed my Vulkan fixes so you can merge the generator rework.
> >> I'll try to lend a hand for the rest.
> >
> >
> > If the branch got delayed by a couple of days to let the miptree churn
> > land, I wouldn't mind at all.  Otherwise, I think there's a decent chance
> > we'll be CCing the whole thing to stable because we really don't want 17.2
> > to be half ISL and half old code.
> >
> Taking into account the above and the few outstanding build bits,
> let's give it another couple of days.
> That will be Sunday late afternoon. I'll do a final shout-out on IRC
> an hour or two in advance.
>

I've gotten everything in that I was hoping for.  Thanks for the short
delay!

--Jason