Re: [Mesa-dev] [PATCH 0/2] Nir: Allow CSE of SSBO loads

2015-10-26 Thread Iago Toral
On Fri, 2015-10-23 at 09:26 -0700, Jason Ekstrand wrote:
> On Thu, Oct 22, 2015 at 11:13 PM, Iago Toral  wrote:
> > On Thu, 2015-10-22 at 09:09 -0700, Jason Ekstrand wrote:
> >> On Thu, Oct 22, 2015 at 4:21 AM, Iago Toral Quiroga  
> >> wrote:
> >> > I implemented this first as a separate optimization pass in GLSL IR [1], 
> >> > but
> >> > Curro pointed out that this being pretty much a restricted form of a CSE 
> >> > pass
> >> > it would probably make more sense to do it inside CSE (and we no longer 
> >> > have
> >> > a CSE pass in GLSL IR).
> >> >
> >> > Unlike other things we CSE in NIR, in the case of SSBO loads we need to 
> >> > make
> >> > sure that we invalidate previous entries in the set in the presence of
> >> > conflicting instructions (i.e. SSBO writes to the same block and offset) 
> >> > or
> >> > in the presence of memory barriers.
> >> >
> >> > If this is accepted I intend to extend this to also cover image reads, 
> >> > which
> >> > follow similar behavior.
> >> >
> >> > No regressions observed in piglit or dEQP's SSBO functional tests.
> >> >
> >> > [1] 
> >> > http://lists.freedesktop.org/archives/mesa-dev/2015-October/097718.html
> >>
> >> I think you've gotten enough NAK's that I don't need to chime in
> >> there.  Unfortunately, solving this in general is something of a
> >> research project that both Connor and I have been thinking about for
> >> quite some time.  I've been thinking off-and-on about how to add a
> >> proper memory model to lower_vars_to_ssa for almost a year now and
> >> still haven't come up with a good way to do it.  I don't know whether
> >> SSBO's would be simpler or not.  We need a proper memory model for
> >> both lower_vars_to_ssa and SSBO load/stores (and shared local
> >> variables) but it's a substantial research project.
> >>
> >> This isn't to say that you couldn't do it.  Just know what you're taking 
> >> on. ;-)
> >
> > Yeah, it does not make sense that I try to do this, you guys have
> > clearly given this much more thought than me and know much better how a
> > solution for this would fit in NIR than me.
> >
> >> That said, here's a suggestion for something that we *could* write
> >> today, wouldn't be very hard, and wold solve a decent number of cases.
> >>
> >> For each block:
> >>
> >> 1) Create a new instruction set (don't use anything from any previous 
> >> blocks)
> >> 2) call add_or_rewrite on all ssbo load operations
> >> 3) If you ever see a barrier or ssbo store, destroy the entire
> >> instruction set and start again.
> >
> > Yep, this is what I was thinking for the load-combine pass that Connor
> > suggested. However, I think that in this case we do not need to destroy
> > the entire set when we find a store, only for memory barriers, right? I
> > mean, there should be nothing preventing us from checking the
> > offset/block of the store and compare it with the offset/block of the
> > loads in the set to decide which ones we need to remove (like I was
> > doing in my last patch)
> 
> That's where you get into the "special casing" I mentioned below.  If
> you have an direct store, you would have to throw away any indirect
> loads 

Yes.

> and then insert a fake direct load for the given offset.

Actually, what I am doing is a bit different:

When I see stores, I also insert them in the hash table (but I never
rewrite stores). Then, when I see see a load, I check for a match, if I
have it, I use it, if not, I check if I have a store to the same offset,
and If I do, I just use that, no need to fake anything. Of course, if I
do this, in order to check if I have a compatible store I have to
traverse the hash table looking for a match, but I think that should be
okay in this case, since that only has load/store operations and only
the ones in the current block, so I think it should be okay.

Does this seem like a reasonable alternative?

>   If you
> have an indirect store, you would have to throw away everything and
> then insert a fake indirect load for the given offset.  So, yes, you
> can do it, but it'll take a little more work.

Yeah, mostly because there are also atomics to consider and then you
also have to check that the stores write to all the components we read
before we reuse them, etc.

>   You'll also probably
> want to use a different hashing function than we use for CSE because
> you need to be able to handle both real and fake loads and have them
> compare/hash the same.
>
> Does that make sense?
> --Jason
> 
> >> This is something you could put together fairly quickly and would
> >> handle a fair number of cases.  With a little special casing, you may
> >> also be able to handle store and then an immediate load of the same
> >> value or duplicate stores.  Anything much more complex than that is
> >> going to take a lot more thought.
> >
> > Yes, I'll give this a try next. Thanks for all the comments and
> > suggestions!
> >
> > Iago
> >
> 


___
mesa-dev mailing list
mesa-dev@

[Mesa-dev] [Bug 92645] kodi vdpau interop fails since mesa, meta: move gl_texture_object::TargetIndex initializations

2015-10-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92645

Michel Dänzer  changed:

   What|Removed |Added

 CC||bri...@vmware.com
  Component|Drivers/Gallium/radeonsi|Mesa core
   Assignee|dri-devel@lists.freedesktop |mesa-dev@lists.freedesktop.
   |.org|org
 QA Contact|dri-devel@lists.freedesktop |mesa-dev@lists.freedesktop.
   |.org|org

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] radeonsi: allow copying between compatible compressed and uncompressed formats

2015-10-26 Thread Michel Dänzer
On 26.10.2015 02:25, Marek Olšák wrote:
> From: Marek Olšák 
> 
> which is where a block in src maps to a pixel in dst and vice versa.
> e.g. DXT1 <-> R32G32_UINT
>  DXT5 <-> R32G32B32A32_UINT
> ---
>  src/gallium/drivers/radeonsi/si_blit.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
> b/src/gallium/drivers/radeonsi/si_blit.c
> index a226436..cd2add4 100644
> --- a/src/gallium/drivers/radeonsi/si_blit.c
> +++ b/src/gallium/drivers/radeonsi/si_blit.c
> @@ -507,7 +507,7 @@ void si_resource_copy_region(struct pipe_context *ctx,
>   util_blitter_default_dst_texture(&dst_templ, dst, dst_level, dstz);
>   util_blitter_default_src_texture(&src_templ, src, src_level);
>  
> - if (util_format_is_compressed(src->format) &&
> + if (util_format_is_compressed(src->format) ||
>   util_format_is_compressed(dst->format)) {
>   unsigned blocksize = util_format_get_blocksize(src->format);
>  
> 

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallivm: disable f16c when not using AVX

2015-10-26 Thread Jose Fonseca

On 23/10/15 22:26, srol...@vmware.com wrote:

From: Roland Scheidegger 

f16c intrinsic can only be emitted when AVX is used. So when we disable AVX
due to forcing 128bit vectors we must not use this intrinsic (depending on
llvm version, this worked previously because llvm used AVX even when we didn't
tell it to, however I've seen this fail with llvm 3.3 since
718249843b915decf8fccec92e466ac1a6219934 which seems to have the side effect
of disabling avx in llvm albeit it only touches sse flags really).


Good catch.


Possibly one day should actually try to use avx even with 128bit vectors...


In the past we needed to override util_cpu_caps.has_avx on AVX capable 
machines but where old-JIT code.  But that's no longer the case: the min 
supported LLVM version is 3.3, which supports AVX both with MCJIT and 
old-JIT.



There, the only point of this code is to enable a developer to test SSE2 
code paths on a AVX capable machine.


There's no other reason for someone to go out of his way to override 
LP_NATIVE_VECTOR_WIDTH of 256 with 128.



So maybe it's worth to make this comment clear: the sole point is to 
enable SSE2 testing on AVX machines, and all avx flags, and flags which 
depend on avx, need to be masked out.



BTW the "For simulating less capable machines" code needs to be updated 
too (it's missing has_avx2=0).




---
  src/gallium/auxiliary/gallivm/lp_bld_init.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c 
b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index 017d075..e6eede8 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -427,6 +427,7 @@ lp_build_init(void)
 */
util_cpu_caps.has_avx = 0;
util_cpu_caps.has_avx2 = 0;
+  util_cpu_caps.has_f16c = 0;
 }

  #ifdef PIPE_ARCH_PPC_64



Reviewed-by: Jose Fonseca 

Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa: add additional checks for uniform location query

2015-10-26 Thread Tapani Pälli
Patch adds additional check to make sure we don't return locations for
structures or arrays of structures.

From page 79 of the OpenGL 4.2 spec:
"A valid name cannot be a structure, an array of structures, or any
portion of a single vector or a matrix."

No Piglit or CTS regressions observed.

Signed-off-by: Tapani Pälli 
---
 src/mesa/main/shader_query.cpp | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 8182d3d..b0707a4 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -808,6 +808,16 @@ program_resource_location(struct gl_shader_program *shProg,
   if (RESOURCE_UNI(res)->builtin)
  return -1;
 
+ /* From page 79 of the OpenGL 4.2 spec:
+  *
+  * "A valid name cannot be a structure, an array of structures, or any
+  * portion of a single vector or a matrix."
+  */
+  if (RESOURCE_UNI(res)->type->is_record() ||
+  (RESOURCE_UNI(res)->type->is_array() &&
+   RESOURCE_UNI(res)->type->fields.array->is_record()))
+ return -1;
+
   /* From the GL_ARB_uniform_buffer_object spec:
*
* "The value -1 will be returned if  does not correspond to an
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92629] lp_bld_debug.cpp:141: undefined reference to `llvm::sys::getProcessTriple()'

2015-10-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92629

Jose Fonseca  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |NOTOURBUG

--- Comment #1 from Jose Fonseca  ---
Thanks Vinson.

I can repro this on Ubuntu 15.10 doing

  CC=clang CXX=clang++ scons lp_test_arit

If I do

  nm -u build/linux-x86_64-debug/gallium/auxiliary/gallivm/lp_bld_debug.os

with G++ I get

  U _ZN4llvm3sys16getProcessTripleB5cxx11Ev

where as with Clang I get

  _ZN4llvm3sys16getProcessTripleEv

If I run both through c++filt I get with g++

  U llvm::sys::getProcessTriple[abi:cxx11]()

whereas with clang I get

  U llvm::sys::getProcessTriple()

In other words, clang is not emitting cxx11 abi symbols, when g++ is.

This seems to be a known problem:

  http://allanmcrae.com/2015/06/the-case-of-gcc-5-1-and-the-two-c-abis/
 
https://www.reddit.com/r/cpp/comments/3b2glr/why_clang_cant_use_the_new_gcc_5_cxx11_abi/

And there's even a LLVM/clang bug filed:

  https://llvm.org/bugs/show_bug.cgi?id=23529

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: add additional checks for uniform location query

2015-10-26 Thread Samuel Iglesias Gonsálvez
Reviewed-by: Samuel Iglesias Gonsálvez 

Thanks!

Sam

On 26/10/15 10:22, Tapani Pälli wrote:
> Patch adds additional check to make sure we don't return locations for
> structures or arrays of structures.
> 
> From page 79 of the OpenGL 4.2 spec:
> "A valid name cannot be a structure, an array of structures, or any
> portion of a single vector or a matrix."
> 
> No Piglit or CTS regressions observed.
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/shader_query.cpp | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
> index 8182d3d..b0707a4 100644
> --- a/src/mesa/main/shader_query.cpp
> +++ b/src/mesa/main/shader_query.cpp
> @@ -808,6 +808,16 @@ program_resource_location(struct gl_shader_program 
> *shProg,
>if (RESOURCE_UNI(res)->builtin)
>   return -1;
>  
> + /* From page 79 of the OpenGL 4.2 spec:
> +  *
> +  * "A valid name cannot be a structure, an array of structures, or 
> any
> +  * portion of a single vector or a matrix."
> +  */
> +  if (RESOURCE_UNI(res)->type->is_record() ||
> +  (RESOURCE_UNI(res)->type->is_array() &&
> +   RESOURCE_UNI(res)->type->fields.array->is_record()))
> + return -1;
> +
>/* From the GL_ARB_uniform_buffer_object spec:
> *
> * "The value -1 will be returned if  does not correspond to 
> an
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] mesa: Draw indirect is not allowed if the default VAO is bound.

2015-10-26 Thread Marta Lofstedt
From: Marta Lofstedt 

From OpenGL ES 3.1 specification, section 10.5:
"DrawArraysIndirect requires that all data sourced for the
command, including the DrawArraysIndirectCommand
structure,  be in buffer objects,  and may not be called when
the default vertex array object is bound."

Signed-off-by: Marta Lofstedt 
---
 src/mesa/main/api_validate.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index a46c194..40a2f43 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -698,6 +698,18 @@ valid_draw_indirect(struct gl_context *ctx,
 {
const GLsizeiptr end = (GLsizeiptr)indirect + size;
 
+   /* OpenGL ES 3.1 spec. section 10.5:
+*
+*  "DrawArraysIndirect requires that all data sourced for the
+*  command, including the DrawArraysIndirectCommand
+*  structure,  be in buffer objects,  and may not be called when
+*  the default vertex array object is bound."
+*/
+   if (ctx->Array.VAO == ctx->Array.DefaultVAO) {
+  _mesa_error(ctx, GL_INVALID_OPERATION, "(no VAO bound)");
+  return GL_FALSE;
+   }
+
if (!_mesa_valid_prim_mode(ctx, mode, name))
   return GL_FALSE;
 
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] radeonsi: properly check if DCC is enabled and allocated

2015-10-26 Thread Marek Olšák
On Sat, Oct 24, 2015 at 11:13 PM, Nicolai Hähnle  wrote:
> On 24.10.2015 17:49, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> ---
>>   src/gallium/drivers/radeon/r600_texture.c | 2 +-
>>   src/gallium/drivers/radeonsi/cik_sdma.c   | 2 +-
>>   src/gallium/drivers/radeonsi/si_blit.c| 6 +++---
>>   src/gallium/drivers/radeonsi/si_dma.c | 2 +-
>>   src/gallium/drivers/radeonsi/si_state.c   | 4 ++--
>>   5 files changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeon/r600_texture.c
>> b/src/gallium/drivers/radeon/r600_texture.c
>> index f7a11a2..40075ae 100644
>> --- a/src/gallium/drivers/radeon/r600_texture.c
>> +++ b/src/gallium/drivers/radeon/r600_texture.c
>> @@ -1367,7 +1367,7 @@ void evergreen_do_fast_color_clear(struct
>> r600_common_context *rctx,
>> continue;
>> }
>>
>> -   if (tex->surface.dcc_enabled) {
>> +   if (tex->dcc_buffer) {
>> uint32_t reset_value;
>> bool clear_words_needed;
>>
>> diff --git a/src/gallium/drivers/radeonsi/cik_sdma.c
>> b/src/gallium/drivers/radeonsi/cik_sdma.c
>> index 25fd09a..e53af1d 100644
>> --- a/src/gallium/drivers/radeonsi/cik_sdma.c
>> +++ b/src/gallium/drivers/radeonsi/cik_sdma.c
>> @@ -243,7 +243,7 @@ void cik_sdma_copy(struct pipe_context *ctx,
>> if (src->format != dst->format ||
>> rdst->surface.nsamples > 1 || rsrc->surface.nsamples > 1 ||
>> (rdst->dirty_level_mask | rdst->stencil_dirty_level_mask) & (1
>> << dst_level) ||
>> -   rdst->surface.dcc_enabled || rsrc->surface.dcc_enabled) {
>> +   rdst->dcc_buffer || rsrc->dcc_buffer) {
>> goto fallback;
>> }
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_blit.c
>> b/src/gallium/drivers/radeonsi/si_blit.c
>> index a226436..302b75c 100644
>> --- a/src/gallium/drivers/radeonsi/si_blit.c
>> +++ b/src/gallium/drivers/radeonsi/si_blit.c
>> @@ -326,7 +326,7 @@ void si_decompress_color_textures(struct si_context
>> *sctx,
>> assert(view);
>>
>> tex = (struct r600_texture *)view->texture;
>> -   assert(tex->cmask.size || tex->fmask.size ||
>> tex->surface.dcc_enabled);
>> +   assert(tex->cmask.size || tex->fmask.size ||
>> tex->dcc_buffer);
>>
>> si_blit_decompress_color(&sctx->b.b, tex,
>>  view->u.tex.first_level,
>> view->u.tex.last_level,
>> @@ -455,7 +455,7 @@ static void si_decompress_subresource(struct
>> pipe_context *ctx,
>> si_blit_decompress_depth_in_place(sctx, rtex,
>> true,
>>   level, level,
>>   first_layer,
>> last_layer);
>> -   } else if (rtex->fmask.size || rtex->cmask.size ||
>> rtex->surface.dcc_enabled) {
>> +   } else if (rtex->fmask.size || rtex->cmask.size ||
>> rtex->dcc_buffer) {
>> si_blit_decompress_color(ctx, rtex, level, level,
>>  first_layer, last_layer);
>> }
>> @@ -676,7 +676,7 @@ static bool do_hardware_msaa_resolve(struct
>> pipe_context *ctx,
>> dst->surface.level[info->dst.level].mode >=
>> RADEON_SURF_MODE_1D &&
>> !(dst->surface.flags & RADEON_SURF_SCANOUT) &&
>> (!dst->cmask.size || !dst->dirty_level_mask) && /* dst cannot
>> be fast-cleared */
>> -   !dst->surface.dcc_enabled) {
>> +   !dst->dcc_buffer) {
>> si_blitter_begin(ctx, SI_COLOR_RESOLVE |
>>  (info->render_condition_enable ? 0 :
>> SI_DISABLE_RENDER_COND));
>> util_blitter_custom_resolve_color(sctx->blitter,
>> diff --git a/src/gallium/drivers/radeonsi/si_dma.c
>> b/src/gallium/drivers/radeonsi/si_dma.c
>> index 73c026c..581e89f 100644
>> --- a/src/gallium/drivers/radeonsi/si_dma.c
>> +++ b/src/gallium/drivers/radeonsi/si_dma.c
>> @@ -249,7 +249,7 @@ void si_dma_copy(struct pipe_context *ctx,
>> (rdst->dirty_level_mask | rdst->stencil_dirty_level_mask) & (1
>> << dst_level) ||
>> rdst->cmask.size || rdst->fmask.size ||
>> rsrc->cmask.size || rsrc->fmask.size ||
>> -   rdst->surface.dcc_enabled || rsrc->surface.dcc_enabled) {
>> +   rdst->dcc_buffer || rsrc->dcc_buffer) {
>> goto fallback;
>> }
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_state.c
>> b/src/gallium/drivers/radeonsi/si_state.c
>> index c87f661..18b6405 100644
>> --- a/src/gallium/drivers/radeonsi/si_state.c
>> +++ b/src/gallium/drivers/radeonsi/si_state.c
>> @@ -1926,7 +1926,7 @@ static void si_initialize_color_surface(struct
>> si_context *sctx,
>> surf->cb_color_info = color_info;
>> surf->cb_color_attrib = color_attrib;
>>
>> -   if (sctx->b.chip_class >= VI && rtex->surface.dcc_enabled) {
>> +

Re: [Mesa-dev] [PATCH] mesa: add additional checks for uniform location query

2015-10-26 Thread Timothy Arceri
On Mon, 2015-10-26 at 11:22 +0200, Tapani Pälli wrote:
> Patch adds additional check to make sure we don't return locations
> for
> structures or arrays of structures.
> 
> From page 79 of the OpenGL 4.2 spec:
> "A valid name cannot be a structure, an array of structures, or
> any
> portion of a single vector or a matrix."
> 
> No Piglit or CTS regressions observed.
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/shader_query.cpp | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/src/mesa/main/shader_query.cpp
> b/src/mesa/main/shader_query.cpp
> index 8182d3d..b0707a4 100644
> --- a/src/mesa/main/shader_query.cpp
> +++ b/src/mesa/main/shader_query.cpp
> @@ -808,6 +808,16 @@ program_resource_location(struct
> gl_shader_program *shProg,
>if (RESOURCE_UNI(res)->builtin)
>   return -1;
>  
> + /* From page 79 of the OpenGL 4.2 spec:
> +  *
> +  * "A valid name cannot be a structure, an array of
> structures, or any
> +  * portion of a single vector or a matrix."
> +  */
> +  if (RESOURCE_UNI(res)->type->is_record() ||
> +  (RESOURCE_UNI(res)->type->is_array() &&
> +   RESOURCE_UNI(res)->type->fields.array->is_record()))
> + return -1;
> +

This could just be RESOURCE_UNI(res)->type->without_array()
->is_record() then it would also work for arrays of arrays.

>/* From the GL_ARB_uniform_buffer_object spec:
> *
> * "The value -1 will be returned if  does not
> correspond to an
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] mesa: Draw Indirect return wrong error code on unalinged

2015-10-26 Thread Marta Lofstedt
From: Marta Lofstedt 

From OpenGL 4.4 specification, section 10.4 and
Open GL Es 3.1 section 10.5:
"An INVALID_VALUE error is generated if indirect is not a multiple
of the size, in basic machine units, of uint."

However, the current code follow the ARB_draw_indirect:
https://www.opengl.org/registry/specs/ARB/draw_indirect.txt
"INVALID_OPERATION is generated by DrawArraysIndirect and
DrawElementsIndirect if commands source data beyond the end
of a buffer object or if  is not word aligned."

V2: After discussions on the list, it was suggested to
only keep the INVALID_VALUE error.

Signed-off-by: Marta Lofstedt 
---
 src/mesa/main/api_validate.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index 40a2f43..19b6a73 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -714,12 +714,14 @@ valid_draw_indirect(struct gl_context *ctx,
   return GL_FALSE;
 
 
-   /* From the ARB_draw_indirect specification:
-* "An INVALID_OPERATION error is generated [...] if  is no
-*  word aligned."
+   /* From OpenGL version 4.4. section 10.5
+* and OpenGL ES 3.1, section 10.6:
+*
+*  "An INVALID_VALUE error is generated if indirect is not a
+*   multiple of the size, in basic machine units, of uint."
 */
if ((GLsizeiptr)indirect & (sizeof(GLuint) - 1)) {
-  _mesa_error(ctx, GL_INVALID_OPERATION,
+  _mesa_error(ctx, GL_INVALID_VALUE,
   "%s(indirect is not aligned)", name);
   return GL_FALSE;
}
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: add additional checks for uniform location query

2015-10-26 Thread Tapani Pälli



On 10/26/2015 12:11 PM, Timothy Arceri wrote:

On Mon, 2015-10-26 at 11:22 +0200, Tapani Pälli wrote:

Patch adds additional check to make sure we don't return locations
for
structures or arrays of structures.

 From page 79 of the OpenGL 4.2 spec:
 "A valid name cannot be a structure, an array of structures, or
any
 portion of a single vector or a matrix."

No Piglit or CTS regressions observed.

Signed-off-by: Tapani Pälli 
---
  src/mesa/main/shader_query.cpp | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/src/mesa/main/shader_query.cpp
b/src/mesa/main/shader_query.cpp
index 8182d3d..b0707a4 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -808,6 +808,16 @@ program_resource_location(struct
gl_shader_program *shProg,
if (RESOURCE_UNI(res)->builtin)
   return -1;

+ /* From page 79 of the OpenGL 4.2 spec:
+  *
+  * "A valid name cannot be a structure, an array of
structures, or any
+  * portion of a single vector or a matrix."
+  */
+  if (RESOURCE_UNI(res)->type->is_record() ||
+  (RESOURCE_UNI(res)->type->is_array() &&
+   RESOURCE_UNI(res)->type->fields.array->is_record()))
+ return -1;
+


This could just be RESOURCE_UNI(res)->type->without_array()
->is_record() then it would also work for arrays of arrays.


True, this is simpler way to say the same thing, I'll change this.




/* From the GL_ARB_uniform_buffer_object spec:
 *
 * "The value -1 will be returned if  does not
correspond to an

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] main: Remove interface block array index for doing the name comparison

2015-10-26 Thread Samuel Iglesias Gonsalvez
From ARB_program_query_interface spec:

"uint GetProgramResourceIndex(uint program, enum programInterface,
   const char *name);
 [...]
 If  exactly matches the name string of one of the active resources
 for , the index of the matched resource is returned.
 Additionally, if  would exactly match the name string of an active
 resource if "[0]" were appended to , the index of the matched
 resource is returned. [...]"

"A string provided to GetProgramResourceLocation or
 GetProgramResourceLocationIndex is considered to match an active variable
 if:
[...]
   * if the string identifies the base name of an active array, where the
 string would exactly match the name of the variable if the suffix
 "[0]" were appended to the string;
[...]
"

But this is only happening on those ARB_program_query_interface's queries.
For the rest of specs we need to keep old behavior. For that reason,
arb_program_interface_query boolean is added to the affected functions.

Fixes the following two dEQP-GLES31 tests:

dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array
dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array_single_element

v2:
- Add AoA support (Timothy)
- Apply it too for GetUniformLocation(), GetUniformName() and others
  because ARB_program_interface_query says that they are equivalent
  to GetProgramResourceLocation() and GetProgramResourceName() (Tapani)

Signed-off-by: Samuel Iglesias Gonsalvez 
Cc: Tapani Pälli 
---
 src/mesa/main/shader_query.cpp | 43 +-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index b0707a4..a0514b0 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -543,8 +543,49 @@ _mesa_program_resource_find_name(struct gl_shader_program 
*shProg,
   /* Resource basename. */
   const char *rname = _mesa_program_resource_name(res);
   unsigned baselen = strlen(rname);
+  unsigned baselen_without_array_index = baselen;
+  const char *rname_last_square_bracket = strrchr(rname, '[');
+  bool found = false;
+  bool rname_has_array_index_zero = false;
+  /* From ARB_program_interface_query spec:
+   *
+   * "uint GetProgramResourceIndex(uint program, enum programInterface,
+   *   const char *name);
+   *  [...]
+   *  If  exactly matches the name string of one of the active
+   *  resources for , the index of the matched resource 
is
+   *  returned. Additionally, if  would exactly match the name string
+   *  of an active resource if "[0]" were appended to , the index of
+   *  the matched resource is returned. [...]"
+   *
+   * "A string provided to GetProgramResourceLocation or
+   * GetProgramResourceLocationIndex is considered to match an active 
variable
+   * if:
+   *
+   *  * the string exactly matches the name of the active variable;
+   *
+   *  * if the string identifies the base name of an active array, where 
the
+   *string would exactly match the name of the variable if the suffix
+   *"[0]" were appended to the string; [...]"
+   */
+  /* Remove array's index from interface block name comparison only if
+   * array's index is zero and the resulting string length is the same
+   * than the provided name's length.
+   */
+  if (rname_last_square_bracket) {
+ baselen_without_array_index -= strlen(rname_last_square_bracket);
+ rname_has_array_index_zero =
+(strncmp(rname_last_square_bracket, "[0]\0", 4) == 0) &&
+(baselen_without_array_index == strlen(name));
+  }
+
+  if (strncmp(rname, name, baselen) == 0)
+ found = true;
+  else if (rname_has_array_index_zero &&
+   strncmp(rname, name, baselen_without_array_index) == 0)
+ found = true;
 
-  if (strncmp(rname, name, baselen) == 0) {
+  if (found) {
  switch (programInterface) {
  case GL_UNIFORM_BLOCK:
  case GL_SHADER_STORAGE_BLOCK:
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] winsys/amdgpu: remove the dcc_enable surface flag

2015-10-26 Thread Marek Olšák
From: Marek Olšák 

dcc_size is sufficient and doesn't need a further comment in my opinion.
---
 src/gallium/drivers/radeon/r600_texture.c  |  3 +--
 src/gallium/drivers/radeon/radeon_winsys.h |  1 -
 src/gallium/winsys/amdgpu/drm/amdgpu_surface.c | 13 ++---
 3 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 789c66f..edfdfe3 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -641,9 +641,8 @@ r600_texture_create_object(struct pipe_screen *screen,
return NULL;
}
}
-   if (rtex->surface.dcc_enabled) {
+   if (rtex->surface.dcc_size)
vi_texture_alloc_dcc_separate(rscreen, rtex);
-   }
}
 
/* Now create the backing buffer. */
diff --git a/src/gallium/drivers/radeon/radeon_winsys.h 
b/src/gallium/drivers/radeon/radeon_winsys.h
index 0178643..8bf1e15 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -371,7 +371,6 @@ struct radeon_surf {
 
 uint64_tdcc_size;
 uint64_tdcc_alignment;
-booldcc_enabled;
 };
 
 struct radeon_bo_list_item {
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
index b442174..3006bd1 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
@@ -251,7 +251,7 @@ static int compute_level(struct amdgpu_winsys *ws,
 
surf->bo_size = surf_level->offset + AddrSurfInfoOut->surfSize;
 
-   if (surf->dcc_enabled) {
+   if (AddrSurfInfoIn->flags.dccCompatible) {
   AddrDccIn->colorSurfSize = AddrSurfInfoOut->surfSize;
   AddrDccIn->tileMode = AddrSurfInfoOut->tileMode;
   AddrDccIn->tileInfo = *AddrSurfInfoOut->pTileInfo;
@@ -267,10 +267,11 @@ static int compute_level(struct amdgpu_winsys *ws,
  surf->dcc_size = surf_level->dcc_offset + AddrDccOut->dccRamSize;
  surf->dcc_alignment = MAX2(surf->dcc_alignment, 
AddrDccOut->dccRamBaseAlign);
   } else {
- surf->dcc_enabled = false;
+ surf->dcc_size = 0;
  surf_level->dcc_offset = 0;
   }
} else {
+  surf->dcc_size = 0;
   surf_level->dcc_offset = 0;
}
 
@@ -354,10 +355,6 @@ static int amdgpu_surface_init(struct radeon_winsys *rws,
AddrDccIn.numSamples = AddrSurfInfoIn.numSamples = surf->nsamples;
AddrSurfInfoIn.tileIndex = -1;
 
-   surf->dcc_enabled =  !(surf->flags & RADEON_SURF_Z_OR_SBUFFER) &&
-!(surf->flags & RADEON_SURF_SCANOUT) &&
-!compressed && AddrDccIn.numSamples <= 1;
-
/* Set the micro tile type. */
if (surf->flags & RADEON_SURF_SCANOUT)
   AddrSurfInfoIn.tileType = ADDR_DISPLAYABLE;
@@ -373,7 +370,9 @@ static int amdgpu_surface_init(struct radeon_winsys *rws,
AddrSurfInfoIn.flags.display = (surf->flags & RADEON_SURF_SCANOUT) != 0;
AddrSurfInfoIn.flags.pow2Pad = surf->last_level > 0;
AddrSurfInfoIn.flags.degrade4Space = 1;
-   AddrSurfInfoIn.flags.dccCompatible = surf->dcc_enabled;
+   AddrSurfInfoIn.flags.dccCompatible = !(surf->flags & 
RADEON_SURF_Z_OR_SBUFFER) &&
+!(surf->flags & RADEON_SURF_SCANOUT) &&
+!compressed && AddrDccIn.numSamples <= 
1;
 
/* This disables incorrect calculations (hacks) in addrlib. */
AddrSurfInfoIn.flags.noStencil = 1;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] main: Remove interface block array index for doing the name comparison

2015-10-26 Thread Samuel Iglesias Gonsálvez


On 26/10/15 11:38, Samuel Iglesias Gonsalvez wrote:
> From ARB_program_query_interface spec:
> 
> "uint GetProgramResourceIndex(uint program, enum programInterface,
>const char *name);
>  [...]
>  If  exactly matches the name string of one of the active resources
>  for , the index of the matched resource is returned.
>  Additionally, if  would exactly match the name string of an active
>  resource if "[0]" were appended to , the index of the matched
>  resource is returned. [...]"
> 
> "A string provided to GetProgramResourceLocation or
>  GetProgramResourceLocationIndex is considered to match an active variable
>  if:
> [...]
>* if the string identifies the base name of an active array, where the
>  string would exactly match the name of the variable if the suffix
>  "[0]" were appended to the string;
> [...]
> "
> 
> But this is only happening on those ARB_program_query_interface's queries.
> For the rest of specs we need to keep old behavior. For that reason,
> arb_program_interface_query boolean is added to the affected functions.
> 

Forgot to remove this paragraph ^ from commit log. I will do it before
pushing it to master or before sending a v3 (if needed).

Sam

> Fixes the following two dEQP-GLES31 tests:
> 
> dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array
> dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array_single_element
> 
> v2:
> - Add AoA support (Timothy)
> - Apply it too for GetUniformLocation(), GetUniformName() and others
>   because ARB_program_interface_query says that they are equivalent
>   to GetProgramResourceLocation() and GetProgramResourceName() (Tapani)
> 
> Signed-off-by: Samuel Iglesias Gonsalvez 
> Cc: Tapani Pälli 
> ---
>  src/mesa/main/shader_query.cpp | 43 
> +-
>  1 file changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
> index b0707a4..a0514b0 100644
> --- a/src/mesa/main/shader_query.cpp
> +++ b/src/mesa/main/shader_query.cpp
> @@ -543,8 +543,49 @@ _mesa_program_resource_find_name(struct 
> gl_shader_program *shProg,
>/* Resource basename. */
>const char *rname = _mesa_program_resource_name(res);
>unsigned baselen = strlen(rname);
> +  unsigned baselen_without_array_index = baselen;
> +  const char *rname_last_square_bracket = strrchr(rname, '[');
> +  bool found = false;
> +  bool rname_has_array_index_zero = false;
> +  /* From ARB_program_interface_query spec:
> +   *
> +   * "uint GetProgramResourceIndex(uint program, enum programInterface,
> +   *   const char *name);
> +   *  [...]
> +   *  If  exactly matches the name string of one of the active
> +   *  resources for , the index of the matched 
> resource is
> +   *  returned. Additionally, if  would exactly match the name 
> string
> +   *  of an active resource if "[0]" were appended to , the index 
> of
> +   *  the matched resource is returned. [...]"
> +   *
> +   * "A string provided to GetProgramResourceLocation or
> +   * GetProgramResourceLocationIndex is considered to match an active 
> variable
> +   * if:
> +   *
> +   *  * the string exactly matches the name of the active variable;
> +   *
> +   *  * if the string identifies the base name of an active array, where 
> the
> +   *string would exactly match the name of the variable if the suffix
> +   *"[0]" were appended to the string; [...]"
> +   */
> +  /* Remove array's index from interface block name comparison only if
> +   * array's index is zero and the resulting string length is the same
> +   * than the provided name's length.
> +   */
> +  if (rname_last_square_bracket) {
> + baselen_without_array_index -= strlen(rname_last_square_bracket);
> + rname_has_array_index_zero =
> +(strncmp(rname_last_square_bracket, "[0]\0", 4) == 0) &&
> +(baselen_without_array_index == strlen(name));
> +  }
> +
> +  if (strncmp(rname, name, baselen) == 0)
> + found = true;
> +  else if (rname_has_array_index_zero &&
> +   strncmp(rname, name, baselen_without_array_index) == 0)
> + found = true;
>  
> -  if (strncmp(rname, name, baselen) == 0) {
> +  if (found) {
>   switch (programInterface) {
>   case GL_UNIFORM_BLOCK:
>   case GL_SHADER_STORAGE_BLOCK:
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] mesa: Draw indirect is not allowed when xfb is active and unpaused

2015-10-26 Thread Marta Lofstedt
From: Marta Lofstedt 

OpenGL ES 3.1 specification, section 10.5:
"An INVALID_OPERATION error is generated if
transform feedback is active and not paused."

Signed-off-by: Marta Lofstedt 
---
 src/mesa/main/api_validate.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index fa6c1b5..303d5e8 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -710,6 +710,16 @@ valid_draw_indirect(struct gl_context *ctx,
   return GL_FALSE;
}
 
+   /* OpenGL ES 3.1 specification, section 10.5:
+*
+*  "An INVALID_OPERATION error is generated if
+*  transform feedback is active and not paused."
+*/
+   if (_mesa_is_gles31(ctx) && _mesa_is_xfb_active_and_unpaused(ctx)) {
+  _mesa_error(ctx, GL_INVALID_OPERATION,
+  "%s(TransformFeedback is active but not paused)", name);
+   }
+
/*
 * OpenGL ES 3.1 spec. section 10.5:
 * "An INVALID_OPERATION error is generated if zero is bound to
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] main: Remove interface block array index for doing the name comparison

2015-10-26 Thread Tapani Pälli



On 10/26/2015 12:45 PM, Samuel Iglesias Gonsálvez wrote:



On 26/10/15 11:38, Samuel Iglesias Gonsalvez wrote:

 From ARB_program_query_interface spec:

"uint GetProgramResourceIndex(uint program, enum programInterface,
const char *name);
  [...]
  If  exactly matches the name string of one of the active resources
  for , the index of the matched resource is returned.
  Additionally, if  would exactly match the name string of an active
  resource if "[0]" were appended to , the index of the matched
  resource is returned. [...]"

"A string provided to GetProgramResourceLocation or
  GetProgramResourceLocationIndex is considered to match an active variable
  if:
[...]
* if the string identifies the base name of an active array, where the
  string would exactly match the name of the variable if the suffix
  "[0]" were appended to the string;
[...]
"

But this is only happening on those ARB_program_query_interface's queries.
For the rest of specs we need to keep old behavior. For that reason,
arb_program_interface_query boolean is added to the affected functions.



Forgot to remove this paragraph ^ from commit log. I will do it before
pushing it to master or before sending a v3 (if needed).


Was just about to comment on this, no booleans required. The problem 
here is that find_name function was broken for names which contained the 
array brackets.


This patch (with changes to commit message) is

Reviewed-by: Tapani Pälli 




Sam


Fixes the following two dEQP-GLES31 tests:

dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array
dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array_single_element

v2:
- Add AoA support (Timothy)
- Apply it too for GetUniformLocation(), GetUniformName() and others
   because ARB_program_interface_query says that they are equivalent
   to GetProgramResourceLocation() and GetProgramResourceName() (Tapani)

Signed-off-by: Samuel Iglesias Gonsalvez 
Cc: Tapani Pälli 
---
  src/mesa/main/shader_query.cpp | 43 +-
  1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index b0707a4..a0514b0 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -543,8 +543,49 @@ _mesa_program_resource_find_name(struct gl_shader_program 
*shProg,
/* Resource basename. */
const char *rname = _mesa_program_resource_name(res);
unsigned baselen = strlen(rname);
+  unsigned baselen_without_array_index = baselen;
+  const char *rname_last_square_bracket = strrchr(rname, '[');
+  bool found = false;
+  bool rname_has_array_index_zero = false;
+  /* From ARB_program_interface_query spec:
+   *
+   * "uint GetProgramResourceIndex(uint program, enum programInterface,
+   *   const char *name);
+   *  [...]
+   *  If  exactly matches the name string of one of the active
+   *  resources for , the index of the matched resource 
is
+   *  returned. Additionally, if  would exactly match the name string
+   *  of an active resource if "[0]" were appended to , the index of
+   *  the matched resource is returned. [...]"
+   *
+   * "A string provided to GetProgramResourceLocation or
+   * GetProgramResourceLocationIndex is considered to match an active 
variable
+   * if:
+   *
+   *  * the string exactly matches the name of the active variable;
+   *
+   *  * if the string identifies the base name of an active array, where 
the
+   *string would exactly match the name of the variable if the suffix
+   *"[0]" were appended to the string; [...]"
+   */
+  /* Remove array's index from interface block name comparison only if
+   * array's index is zero and the resulting string length is the same
+   * than the provided name's length.
+   */
+  if (rname_last_square_bracket) {
+ baselen_without_array_index -= strlen(rname_last_square_bracket);
+ rname_has_array_index_zero =
+(strncmp(rname_last_square_bracket, "[0]\0", 4) == 0) &&
+(baselen_without_array_index == strlen(name));
+  }
+
+  if (strncmp(rname, name, baselen) == 0)
+ found = true;
+  else if (rname_has_array_index_zero &&
+   strncmp(rname, name, baselen_without_array_index) == 0)
+ found = true;

-  if (strncmp(rname, name, baselen) == 0) {
+  if (found) {
   switch (programInterface) {
   case GL_UNIFORM_BLOCK:
   case GL_SHADER_STORAGE_BLOCK:


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/9] i965: Set annotation_info's mem_ctx.

2015-10-26 Thread Pohjolainen, Topi
On Wed, Oct 21, 2015 at 03:58:12PM -0700, Matt Turner wrote:
> It was being memset to 0 previously.
> ---
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 2 +-
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 2 +-
>  src/mesa/drivers/dri/i965/intel_asm_annotation.c | 3 +++
>  3 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index aed4adb..8ab57f7 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -2187,7 +2187,7 @@ fs_generator::generate_code(const cfg_t *cfg, int 
> dispatch_width)
>  
>dump_assembly(p->store, annotation.ann_count, annotation.ann,
>  p->devinfo);
> -  ralloc_free(annotation.ann);
> +  ralloc_free(annotation.mem_ctx);

I had to check how "annotation.ann" gets allocated - that is by reralloc()
against "annotation.mem_ctx".

Reviewed-by: Topi Pohjolainen 

> }
>  
> compiler->shader_debug_log(log_data,
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> index a84f6c4..6ac8591 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> @@ -1659,7 +1659,7 @@ vec4_generator::generate_code(const cfg_t *cfg, const 
> nir_shader *nir)
>  
>dump_assembly(p->store, annotation.ann_count, annotation.ann,
>  p->devinfo);
> -  ralloc_free(annotation.ann);
> +  ralloc_free(annotation.mem_ctx);
> }
>  
> compiler->shader_debug_log(log_data,
> diff --git a/src/mesa/drivers/dri/i965/intel_asm_annotation.c 
> b/src/mesa/drivers/dri/i965/intel_asm_annotation.c
> index b3d6324..f87a9bb 100644
> --- a/src/mesa/drivers/dri/i965/intel_asm_annotation.c
> +++ b/src/mesa/drivers/dri/i965/intel_asm_annotation.c
> @@ -86,6 +86,9 @@ void annotate(const struct brw_device_info *devinfo,
>struct annotation_info *annotation, const struct cfg_t *cfg,
>struct backend_instruction *inst, unsigned offset)
>  {
> +   if (annotation->mem_ctx == NULL)
> +  annotation->mem_ctx = ralloc_context(NULL);
> +
> if (annotation->ann_size <= annotation->ann_count) {
>int old_size = annotation->ann_size;
>annotation->ann_size = MAX2(1024, annotation->ann_size * 2);
> -- 
> 2.4.9
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/9] i965: Combine assembly annotations if possible.

2015-10-26 Thread Pohjolainen, Topi
On Wed, Oct 21, 2015 at 03:58:13PM -0700, Matt Turner wrote:
> Often annotations are identical between sets of consecutive
> instructions. We can perhaps avoid some memory allocations by reusing
> the previous annotation.
> ---
>  src/mesa/drivers/dri/i965/intel_asm_annotation.c | 19 ++-
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_asm_annotation.c 
> b/src/mesa/drivers/dri/i965/intel_asm_annotation.c
> index f87a9bb..58830db 100644
> --- a/src/mesa/drivers/dri/i965/intel_asm_annotation.c
> +++ b/src/mesa/drivers/dri/i965/intel_asm_annotation.c
> @@ -112,6 +112,20 @@ void annotate(const struct brw_device_info *devinfo,
>ann->block_start = cfg->blocks[annotation->cur_block];
> }
>  
> +   if (bblock_end(cfg->blocks[annotation->cur_block]) == inst) {
> +  ann->block_end = cfg->blocks[annotation->cur_block];
> +  annotation->cur_block++;
> +   }
> +
> +   /* Merge this annotation with the previous if possible. */
> +   struct annotation *prev = &annotation->ann[annotation->ann_count - 2];

What guarantees that annotation->ann_count is always at least two at this
point?

> +   if (ann->ir == prev->ir &&
> +   ann->annotation == prev->annotation &&
> +   ann->block_start == NULL) {
> +  annotation->ann_count--;
> +  return;
> +   }
> +
> /* There is no hardware DO instruction on Gen6+, so since DO always
>  * starts a basic block, we need to set the .block_start of the next
>  * instruction's annotation with a pointer to the bblock started by
> @@ -123,11 +137,6 @@ void annotate(const struct brw_device_info *devinfo,
> if (devinfo->gen >= 6 && inst->opcode == BRW_OPCODE_DO) {
>annotation->ann_count--;
> }
> -
> -   if (bblock_end(cfg->blocks[annotation->cur_block]) == inst) {
> -  ann->block_end = cfg->blocks[annotation->cur_block];
> -  annotation->cur_block++;
> -   }
>  }
>  
>  void
> -- 
> 2.4.9
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/9] i965: Add annotation_insert_error() and support for printing errors.

2015-10-26 Thread Pohjolainen, Topi
On Wed, Oct 21, 2015 at 03:58:14PM -0700, Matt Turner wrote:
> Will allow annotations to contain error messages (indicating an
> instruction violates a rule for instance) that are printed after the
> disassembly of the block.
> ---
>  src/mesa/drivers/dri/i965/intel_asm_annotation.c | 60 
> 
>  src/mesa/drivers/dri/i965/intel_asm_annotation.h |  7 +++
>  2 files changed, 67 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_asm_annotation.c 
> b/src/mesa/drivers/dri/i965/intel_asm_annotation.c
> index 58830db..eaee386 100644
> --- a/src/mesa/drivers/dri/i965/intel_asm_annotation.c
> +++ b/src/mesa/drivers/dri/i965/intel_asm_annotation.c
> @@ -69,6 +69,10 @@ dump_assembly(void *assembly, int num_annotations, struct 
> annotation *annotation
>  
>brw_disassemble(devinfo, assembly, start_offset, end_offset, stderr);
>  
> +  if (annotation[i].error) {
> + fputs(annotation[i].error, stderr);
> +  }
> +
>if (annotation[i].block_end) {
>   fprintf(stderr, "   END B%d", annotation[i].block_end->num);
>   foreach_list_typed(struct bblock_link, successor_link, link,
> @@ -152,3 +156,59 @@ annotation_finalize(struct annotation_info *annotation,
> }
> annotation->ann[annotation->ann_count].offset = next_inst_offset;
>  }
> +
> +void
> +annotation_insert_error(struct annotation_info *annotation, unsigned offset,
> +const char *error)
> +{
> +   struct annotation *ann = NULL;
> +
> +   if (!annotation->ann_count)
> +  return;
> +
> +   /* We may have to split an annotation, so ensure we have enough space
> +* allocated for that case up front.
> +*/
> +   if (annotation->ann_size <= annotation->ann_count) {
> +  int old_size = annotation->ann_size;
> +  annotation->ann_size = MAX2(1024, annotation->ann_size * 2);
> +  annotation->ann = reralloc(annotation->mem_ctx, annotation->ann,
> + struct annotation, annotation->ann_size);
> +  if (!annotation->ann)
> + return;
> +
> +  memset(annotation->ann + old_size, 0,
> + (annotation->ann_size - old_size) * sizeof(struct annotation));
> +   }

This same block can already be found in "annotate()". Would it made sense to
refactor and re-use?

> +
> +   for (int i = 0; i <= annotation->ann_count; i++) {
> +  if (annotation->ann[i].offset <= offset)
> + continue;
> +
> +  struct annotation *cur = &annotation->ann[i - 1];
> +  struct annotation *next = &annotation->ann[i];
> +  ann = cur;
> +
> +  if (offset + sizeof(brw_inst) != next->offset) {
> + memmove(next, cur,
> + (annotation->ann_count - i + 2) * sizeof(struct 
> annotation));

I guess the same question here as in patch five, does "i >= 2" always hold?

> + cur->error = NULL;
> + cur->error_length = 0;
> + cur->block_end = NULL;
> + next->offset = offset + sizeof(brw_inst);
> + next->block_start = NULL;
> + annotation->ann_count++;
> +  }
> +  break;
> +   }
> +
> +   assume(ann != NULL);
> +
> +   ralloc_asprintf_rewrite_tail(&ann->error, &ann->error_length, error);
> +
> +   /* FIXME: ralloc_vasprintf_rewrite_tail() allocates memory out of the
> +* null context. We have to reparent the it if we want it to be freed
> +* with the rest of the annotation context.
> +*/
> +   ralloc_steal(annotation->mem_ctx, ann->error);
> +}
> diff --git a/src/mesa/drivers/dri/i965/intel_asm_annotation.h 
> b/src/mesa/drivers/dri/i965/intel_asm_annotation.h
> index 6c72326..662a4b4 100644
> --- a/src/mesa/drivers/dri/i965/intel_asm_annotation.h
> +++ b/src/mesa/drivers/dri/i965/intel_asm_annotation.h
> @@ -37,6 +37,9 @@ struct cfg_t;
>  struct annotation {
> int offset;
>  
> +   size_t error_length;
> +   char *error;
> +
> /* Pointers to the basic block in the CFG if the instruction group starts
>  * or ends a basic block.
>  */
> @@ -69,6 +72,10 @@ annotate(const struct brw_device_info *devinfo,
>  void
>  annotation_finalize(struct annotation_info *annotation, unsigned offset);
>  
> +void
> +annotation_insert_error(struct annotation_info *annotation, unsigned offset,
> +const char *error);
> +
>  #ifdef __cplusplus
>  } /* extern "C" */
>  #endif
> -- 
> 2.4.9
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC 1/2] gallium: add renderonly driver

2015-10-26 Thread Daniel Stone
Hi Lucas,

On 19 October 2015 at 10:25, Lucas Stach  wrote:
> Am Sonntag, den 18.10.2015, 21:41 +0200 schrieb Christian Gmeiner:
>> 2015-10-16 15:31 GMT+02:00 Thierry Reding :
>> > Gallium driver is rather complicated, but that's due to the fact that
>> > graphics drivers are complex beasts.
>> >
>> > That said, I think for some areas it might be beneficial to have helpers
>> > to reduce the amount of duplication. However I think at this point in
>> > time we haven't had enough real-world exposure for this kind of driver
>> > to know what the requirements are. For that reason I think it is
>> > premature to use a generic midlayer such as this. Yes, I know that the
>> > alternative is roughly 2000 lines of code per driver, but on one hand
>> > that's nothing compared to the amount of code required by a proper GPU
>> > driver and on the other hand this will (ideally) be temporary until we
>> > get a better picture of where things need to go. At which point it may
>> > become more obvious how we can solve the boilerplate problem while at
>> > the same time avoiding the restrictions imposed by the midlayer.
>>
>> For the moment I have no problem to go that way, but I am not sure how long
>> we should wait to collect the needed requirements. Do you know of any other
>> renderonly gpu coming in the near future (with opensource drivers)?

Well, everything on ARM, and any multi-GPU/hybrid/Prime situation on Intel.

>> I am really no expert in this area but the most interesting part for a
>> renderonly
>> GPU midlayer is "how to display the rendering result?".
>> So we need to support different tiling formats and if a intermediate buffer 
>> is
>> needed (for IP combination where GPU or scanout format are not compatible).
>>
>> For etnaviv we need to support all this cases (with the GC2000+ coming the 
>> next
>> months we can render into the scanout buffer directly - same as tegra).

Luckily etnaviv+imx-drm in the current generation seems to be a
glaring special case. Especially with higher resolutions becoming the
norm (1080p being non-negotiable and 4K working its way into even the
smallest mobile systems, e.g. Mali-450 on Amlogic, where the 6xx+
family is around 4 years old), intermediate copies just aren't viable
anymore.

I'd suggest keeping this as a total special case hidden internally,
quite apart from the general attempt to resolve the disjoint
GPU/scanout issue.

>> I know that it is quite hard to design a good api/midlayer/whatever to be 
>> future
>> proof for the next 1-2 years without (bigger) refactoring work. But at
>> some point
>> we need to jump into the cold water, start to design it and make use of it.
>> I am open to any discussion about software architecture and requirements.
>
> The problems I see with this series are not technical/implementation
> issues, but IMHO this whole device fusing is wrong from an architectural
> point of view.

Absolutely agreed.

> This series implements a way to fuse multiple DRM devices into a single
> EGL device, solely on the base that current EGL users expect that a
> single device is able to render and scanout at the same time. This is a
> flawed premise on itself and this series actively tries to keep the
> illusion that this is true, even for SoC and prime systems. While doing
> so it is putting a lot device specific knowledge into the implementation
> of the midlayer/helpers.
>
> IMHO we should not try to keep that illusion, but work out a way for
> client applications to deal with the situation of having multiple
> scanout/render devices in a easy way.
>
> I mean having render/scanout on different devices isn't exactly news, in
> fact we already have to deal with that for the various prime laptops.
> Currently all the complexity of those setups is hidden in the X server,
> but that one will go away and we have to find a way to make it easy for
> the various wayland compositors to deal with that.

So far we agree ...

> My current proposal would be as follows:
>
> 1. Implement etnaviv/nouveau on tegra as EGL devices that are only able
> to render into offscreen surfaces.
>
> 2. Implement an easy way for compositors to discover EGL devices and
> their capabilities. (There are already patches for EGL device and
> related stuff on the ML)

Indeed. The EGL_EXT_device_base series does need a respin for the new
libdrm device-discovery API, but it would still be nice to get review
here. Having this at least allows us the correct primitives (in the
sense of the driver having all required information), in that gbm has
a handle to the KMS/presentation device, and EGL then has a handle to
the EGL/render device.

> 3. Implement a generic KMS EGL device that is only able to prove
> scanout. Have EGL_EXT_stream_consumer_egloutput [1] working on top of
> that.
>
> 4. Implement EGL_KHR_stream_producer_eglsurface for the render only EGL
> devices.

At this point I violently disagree. My view on EGLStreams is roughly
the same as that of STREAMS (or OpenMAX). I can see the

Re: [Mesa-dev] [PATCH v2 1/7] nvc0: fix crash when nv50_miptree_from_handle fails

2015-10-26 Thread Julien Isorce
On 25 October 2015 at 21:38, Samuel Pitoiset 
wrote:

>
> Do you need someone to push this patch?
>

Yes please
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/va: pass picture desc to begin and decode

2015-10-26 Thread Julien Isorce
Hi Christian,

Would you mind to push this patch ? Thx

Cheers
Julien

On 23 October 2015 at 14:33, Christian König 
wrote:

> On 23.10.2015 14:25, Julien Isorce wrote:
>
>> At least vl_mpeg12_decoder uses the picture
>> desc in begin_frame and decode_bitstream.
>>
>> https://bugs.freedesktop.org/show_bug.cgi?id=92634
>>
>> Signed-off-by: Julien Isorce 
>>
>
> Reviewed-by: Christian König 
>
> ---
>>   src/gallium/state_trackers/va/picture.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/state_trackers/va/picture.c
>> b/src/gallium/state_trackers/va/picture.c
>> index 9b94b39..a0d530b 100644
>> --- a/src/gallium/state_trackers/va/picture.c
>> +++ b/src/gallium/state_trackers/va/picture.c
>> @@ -58,7 +58,7 @@ vlVaBeginPicture(VADriverContextP ctx, VAContextID
>> context_id, VASurfaceID rende
>> return VA_STATUS_ERROR_INVALID_SURFACE;
>>context->target = surf->buffer;
>> -   context->decoder->begin_frame(context->decoder, context->target,
>> NULL);
>> +   context->decoder->begin_frame(context->decoder, context->target,
>> &context->desc.base);
>>return VA_STATUS_SUCCESS;
>>   }
>> @@ -517,7 +517,7 @@ handleVASliceDataBufferType(vlVaContext *context,
>> vlVaBuffer *buf)
>>  buffers[num_buffers] = buf->data;
>>  sizes[num_buffers] = buf->size;
>>  ++num_buffers;
>> -   context->decoder->decode_bitstream(context->decoder, context->target,
>> NULL,
>> +   context->decoder->decode_bitstream(context->decoder, context->target,
>> &context->desc.base,
>> num_buffers, (const void * const*)buffers, sizes);
>>   }
>>
>>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92278] Black screen in War Thunder

2015-10-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92278

Tapani Pälli  changed:

   What|Removed |Added

 Attachment #119191|0   |1
is obsolete||
 CC||lem...@gmail.com

--- Comment #4 from Tapani Pälli  ---
Created attachment 119200
  --> https://bugs.freedesktop.org/attachment.cgi?id=119200&action=edit
hack that fixes issue

This is slightly cleaner but still just a hack, it looks like we would need
full GL_EXT_direct_state_access and also GL_EXT_depth_bounds_test extension.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] glsl: keep track of intra-stage indices for atomics

2015-10-26 Thread Samuel Iglesias Gonsálvez
Hello Timothy,

I don't see anything wrong with this patch but I needed to review older
emails [0] to have a better understanding of why you were using an
intra-stage index.

So, from my side, I give my R-b but with a minor change (see below):

   Reviewed-by: Samuel Iglesias Gonsálvez 

Maybe you want to wait for Curro's opinion before pushing it to master.

Thanks,

Sam

[0]
http://lists.freedesktop.org/archives/mesa-dev/2015-September/095708.html and
the rest of emails in that thread.

On 04/10/15 09:45, Timothy Arceri wrote:
> This is more optimal as it means we no longer have to upload the same set
> of Atomic Buffer Object surfaces to all stages in the program.
> 
> This also fixes a bug where since commit c0cd5b var->data.binding was
> being used as a replacement for atomic buffer index, but they don't have
> to be the same value they just happened to end up the same when binding is 0.
> 
> Cc: Francisco Jerez 
> Cc: Ilia Mirkin 
> Cc: Alejandro Piñeiro 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90175
> ---
>  src/glsl/ir_uniform.h|  8 -
>  src/glsl/link_atomics.cpp| 43 
> ++--
>  src/glsl/nir/glsl_to_nir.cpp |  2 --
>  src/glsl/nir/nir.h   |  4 +--
>  src/glsl/nir/nir_lower_atomics.c | 25 +++---
>  src/mesa/drivers/dri/i965/brw_context.h  |  2 +-
>  src/mesa/drivers/dri/i965/brw_gs_surface_state.c |  4 +--
>  src/mesa/drivers/dri/i965/brw_nir.c  |  6 ++--
>  src/mesa/drivers/dri/i965/brw_shader.cpp |  4 +--
>  src/mesa/drivers/dri/i965/brw_vs_surface_state.c |  4 +--
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 37 ++--
>  src/mesa/main/mtypes.h   |  5 ++-
>  12 files changed, 103 insertions(+), 41 deletions(-)
> 
> diff --git a/src/glsl/ir_uniform.h b/src/glsl/ir_uniform.h
> index 50fe76b..2265caf 100644
> --- a/src/glsl/ir_uniform.h
> +++ b/src/glsl/ir_uniform.h
> @@ -71,7 +71,9 @@ struct gl_uniform_driver_storage {
>  
>  struct gl_opaque_uniform_index {
> /**
> -* Base opaque uniform index
> +* == For types other than atomics ==
> +*
> +* This is the base opaque uniform index
>  *
>  * If \c gl_uniform_storage::base_type is an opaque type, this
>  * represents its uniform index.  If \c
> @@ -80,6 +82,10 @@ struct gl_opaque_uniform_index {
>  * gl_uniform_storage::array_elements - 1, inclusive.
>  *
>  * Note that the index may be different in each shader stage.
> +*
> +* == For atomics ==
> +*
> +* This is the intra-stage stage index of gl_shader::AtomicBuffers[].
>  */
> uint8_t index;
>  
> diff --git a/src/glsl/link_atomics.cpp b/src/glsl/link_atomics.cpp
> index 100d03c..3073de0 100644
> --- a/src/glsl/link_atomics.cpp
> +++ b/src/glsl/link_atomics.cpp
> @@ -167,6 +167,7 @@ link_assign_atomic_counter_resources(struct gl_context 
> *ctx,
>   struct gl_shader_program *prog)
>  {
> unsigned num_buffers;
> +   unsigned num_atomic_buffers[MESA_SHADER_STAGES] = {};
> active_atomic_buffer *abs =
>find_active_atomic_counters(ctx, prog, &num_buffers);
>  
> @@ -211,13 +212,49 @@ link_assign_atomic_counter_resources(struct gl_context 
> *ctx,
>}
>  
>/* Assign stage-specific fields. */
> -  for (unsigned j = 0; j < MESA_SHADER_STAGES; ++j)
> - mab.StageReferences[j] =
> -(ab.stage_references[j] ? GL_TRUE : GL_FALSE);
> +  for (unsigned j = 0; j < MESA_SHADER_STAGES; ++j) {
> + if (ab.stage_references[j]) {
> +mab.StageReferences[j] = GL_TRUE;
> +num_atomic_buffers[j]++;
> + } else {
> +mab.StageReferences[j] = GL_FALSE;
> + }
> +  }
>  
>i++;
> }
>  
> +   /* Store a list pointers to atomic buffers per stage and store the index
> +* to this intra-stage buffer list along with the uniform.
> +*/
> +   for (unsigned j = 0; j < MESA_SHADER_STAGES; ++j) {
> +  if (prog->_LinkedShaders[j] && num_atomic_buffers[j] > 0) {
> + prog->_LinkedShaders[j]->NumAtomicBuffers = num_atomic_buffers[j];
> + prog->_LinkedShaders[j]->AtomicBuffers =
> +rzalloc_array(prog, gl_active_atomic_buffer *,
> +  num_atomic_buffers[j]);
> +
> + unsigned intra_stage_idx = 0;
> + for (unsigned i = 0; i < num_buffers; i++) {
> +struct gl_active_atomic_buffer *atomic_buffer =
> +   &prog->AtomicBuffers[i];
> +if (atomic_buffer->StageReferences[j]) {
> +   prog->_LinkedShaders[j]->AtomicBuffers[intra_stage_idx] =
> +  atomic_buffer;
> +
> +   for (unsigned u = 0; u < atomic_buffer->NumUniforms; u++) {
> +  
> prog->UniformStorage[atomic_buffer->Uniforms[u]].opaque[j].index =
> +

Re: [Mesa-dev] [PATCH 2/2] glsl: keep track of intra-stage indices for atomics

2015-10-26 Thread Francisco Jerez
Timothy Arceri  writes:

> This is more optimal as it means we no longer have to upload the same set
> of Atomic Buffer Object surfaces to all stages in the program.
>
> This also fixes a bug where since commit c0cd5b var->data.binding was
> being used as a replacement for atomic buffer index, but they don't have
> to be the same value they just happened to end up the same when binding is 0.
>
> Cc: Francisco Jerez 
> Cc: Ilia Mirkin 
> Cc: Alejandro Piñeiro 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90175
> ---
>  src/glsl/ir_uniform.h|  8 -
>  src/glsl/link_atomics.cpp| 43 
> ++--
>  src/glsl/nir/glsl_to_nir.cpp |  2 --
>  src/glsl/nir/nir.h   |  4 +--
>  src/glsl/nir/nir_lower_atomics.c | 25 +++---
>  src/mesa/drivers/dri/i965/brw_context.h  |  2 +-
>  src/mesa/drivers/dri/i965/brw_gs_surface_state.c |  4 +--
>  src/mesa/drivers/dri/i965/brw_nir.c  |  6 ++--
>  src/mesa/drivers/dri/i965/brw_shader.cpp |  4 +--
>  src/mesa/drivers/dri/i965/brw_vs_surface_state.c |  4 +--
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 37 ++--
>  src/mesa/main/mtypes.h   |  5 ++-
>  12 files changed, 103 insertions(+), 41 deletions(-)
>
> diff --git a/src/glsl/ir_uniform.h b/src/glsl/ir_uniform.h
> index 50fe76b..2265caf 100644
> --- a/src/glsl/ir_uniform.h
> +++ b/src/glsl/ir_uniform.h
> @@ -71,7 +71,9 @@ struct gl_uniform_driver_storage {
>  
>  struct gl_opaque_uniform_index {
> /**
> -* Base opaque uniform index
> +* == For types other than atomics ==
> +*
> +* This is the base opaque uniform index
>  *
>  * If \c gl_uniform_storage::base_type is an opaque type, this
>  * represents its uniform index.  If \c
> @@ -80,6 +82,10 @@ struct gl_opaque_uniform_index {
>  * gl_uniform_storage::array_elements - 1, inclusive.
>  *
>  * Note that the index may be different in each shader stage.
> +*
> +* == For atomics ==
> +*
> +* This is the intra-stage stage index of gl_shader::AtomicBuffers[].

AFAICT this "index" field has the same semantics for atomics and
non-atomics, it's an intra-stage surface index regardless of whether
it's an atomic, sampler or image, the difference is just which array it
indexes into so I doubt it makes sense to split the comment like this.

With that and Samuel's comment taken into account this patch is:

Reviewed-by: Francisco Jerez 

>  */
> uint8_t index;
>  
> diff --git a/src/glsl/link_atomics.cpp b/src/glsl/link_atomics.cpp
> index 100d03c..3073de0 100644
> --- a/src/glsl/link_atomics.cpp
> +++ b/src/glsl/link_atomics.cpp
> @@ -167,6 +167,7 @@ link_assign_atomic_counter_resources(struct gl_context 
> *ctx,
>   struct gl_shader_program *prog)
>  {
> unsigned num_buffers;
> +   unsigned num_atomic_buffers[MESA_SHADER_STAGES] = {};
> active_atomic_buffer *abs =
>find_active_atomic_counters(ctx, prog, &num_buffers);
>  
> @@ -211,13 +212,49 @@ link_assign_atomic_counter_resources(struct gl_context 
> *ctx,
>}
>  
>/* Assign stage-specific fields. */
> -  for (unsigned j = 0; j < MESA_SHADER_STAGES; ++j)
> - mab.StageReferences[j] =
> -(ab.stage_references[j] ? GL_TRUE : GL_FALSE);
> +  for (unsigned j = 0; j < MESA_SHADER_STAGES; ++j) {
> + if (ab.stage_references[j]) {
> +mab.StageReferences[j] = GL_TRUE;
> +num_atomic_buffers[j]++;
> + } else {
> +mab.StageReferences[j] = GL_FALSE;
> + }
> +  }
>  
>i++;
> }
>  
> +   /* Store a list pointers to atomic buffers per stage and store the index
> +* to this intra-stage buffer list along with the uniform.
> +*/
> +   for (unsigned j = 0; j < MESA_SHADER_STAGES; ++j) {
> +  if (prog->_LinkedShaders[j] && num_atomic_buffers[j] > 0) {
> + prog->_LinkedShaders[j]->NumAtomicBuffers = num_atomic_buffers[j];
> + prog->_LinkedShaders[j]->AtomicBuffers =
> +rzalloc_array(prog, gl_active_atomic_buffer *,
> +  num_atomic_buffers[j]);
> +
> + unsigned intra_stage_idx = 0;
> + for (unsigned i = 0; i < num_buffers; i++) {
> +struct gl_active_atomic_buffer *atomic_buffer =
> +   &prog->AtomicBuffers[i];
> +if (atomic_buffer->StageReferences[j]) {
> +   prog->_LinkedShaders[j]->AtomicBuffers[intra_stage_idx] =
> +  atomic_buffer;
> +
> +   for (unsigned u = 0; u < atomic_buffer->NumUniforms; u++) {
> +  
> prog->UniformStorage[atomic_buffer->Uniforms[u]].opaque[j].index =
> + intra_stage_idx;
> +  
> prog->UniformStorage[atomic_buffer->Uniforms[u]].opaque[j].active =
> + tru

Re: [Mesa-dev] [PATCH] st/va: pass picture desc to begin and decode

2015-10-26 Thread Christian König

Done, but you should apply for an account.

Best regards,
Christian.

On 26.10.2015 13:46, Julien Isorce wrote:

Hi Christian,

Would you mind to push this patch ? Thx

Cheers
Julien

On 23 October 2015 at 14:33, Christian König > wrote:


On 23.10.2015 14:25, Julien Isorce wrote:

At least vl_mpeg12_decoder uses the picture
desc in begin_frame and decode_bitstream.

https://bugs.freedesktop.org/show_bug.cgi?id=92634

Signed-off-by: Julien Isorce mailto:j.iso...@samsung.com>>


Reviewed-by: Christian König mailto:christian.koe...@amd.com>>

---
  src/gallium/state_trackers/va/picture.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c
b/src/gallium/state_trackers/va/picture.c
index 9b94b39..a0d530b 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -58,7 +58,7 @@ vlVaBeginPicture(VADriverContextP ctx,
VAContextID context_id, VASurfaceID rende
return VA_STATUS_ERROR_INVALID_SURFACE;
   context->target = surf->buffer;
-  context->decoder->begin_frame(context->decoder,
context->target, NULL);
+  context->decoder->begin_frame(context->decoder,
context->target, &context->desc.base);
   return VA_STATUS_SUCCESS;
  }
@@ -517,7 +517,7 @@ handleVASliceDataBufferType(vlVaContext
*context, vlVaBuffer *buf)
 buffers[num_buffers] = buf->data;
 sizes[num_buffers] = buf->size;
 ++num_buffers;
-  context->decoder->decode_bitstream(context->decoder,
context->target, NULL,
+  context->decoder->decode_bitstream(context->decoder,
context->target, &context->desc.base,
num_buffers, (const void * const*)buffers, sizes);
  }


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org 
http://lists.freedesktop.org/mailman/listinfo/mesa-dev




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallivm: disable f16c when not using AVX

2015-10-26 Thread Roland Scheidegger
Am 26.10.2015 um 10:02 schrieb Jose Fonseca:
> On 23/10/15 22:26, srol...@vmware.com wrote:
>> From: Roland Scheidegger 
>>
>> f16c intrinsic can only be emitted when AVX is used. So when we
>> disable AVX
>> due to forcing 128bit vectors we must not use this intrinsic
>> (depending on
>> llvm version, this worked previously because llvm used AVX even when
>> we didn't
>> tell it to, however I've seen this fail with llvm 3.3 since
>> 718249843b915decf8fccec92e466ac1a6219934 which seems to have the side
>> effect
>> of disabling avx in llvm albeit it only touches sse flags really).
> 
> Good catch.
> 
>> Possibly one day should actually try to use avx even with 128bit
>> vectors...
> 
> In the past we needed to override util_cpu_caps.has_avx on AVX capable
> machines but where old-JIT code.  But that's no longer the case: the min
> supported LLVM version is 3.3, which supports AVX both with MCJIT and
> old-JIT.
> 
> 
> There, the only point of this code is to enable a developer to test SSE2
> code paths on a AVX capable machine.
> 
> There's no other reason for someone to go out of his way to override
> LP_NATIVE_VECTOR_WIDTH of 256 with 128.
> 
> 
> So maybe it's worth to make this comment clear: the sole point is to
> enable SSE2 testing on AVX machines, and all avx flags, and flags which
> depend on avx, need to be masked out.

Well that's not quite true. Forcing 128bit wide vectors will get you
faster shader compiles and less llvm memory usage. And in some odd cases
the compiled shaders aren't even slower. Disabling AVX on top of that
doesn't really change much there though things might be minimally slower
(of course, if you hit the things which actually depend on avx, like
f16c, that's a different story).
Though it's not really exposed much as a feature, I find it's at least
as interesting for development to figure out why shaders using 256bit
vectors are scaling appropriately or not compared to 128bit rather than
SSE2 emulation. And for the former case it makes sense to leave avx
enabled. You are right though that the initial idea was to essentially
force llvm for the compiled shader to look like it was compiled on a
less capable machine (albeit since we're setting the cpu type,
instruction scheduling will still be different).

> 
> 
> BTW the "For simulating less capable machines" code needs to be updated
> too (it's missing has_avx2=0).
Ok I'll add that (plus sse42)...

Roland

> 
> 
>> ---
>>   src/gallium/auxiliary/gallivm/lp_bld_init.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c
>> b/src/gallium/auxiliary/gallivm/lp_bld_init.c
>> index 017d075..e6eede8 100644
>> --- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
>> @@ -427,6 +427,7 @@ lp_build_init(void)
>>  */
>> util_cpu_caps.has_avx = 0;
>> util_cpu_caps.has_avx2 = 0;
>> +  util_cpu_caps.has_f16c = 0;
>>  }
>>
>>   #ifdef PIPE_ARCH_PPC_64
>>
> 
> Reviewed-by: Jose Fonseca 
> 
> Jose

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/7] nir: support to clone shaders (v2)

2015-10-26 Thread Rob Clark
Signed-off-by: Rob Clark 
---
 src/glsl/Makefile.sources |   1 +
 src/glsl/nir/nir.c|   8 +
 src/glsl/nir/nir.h|   2 +
 src/glsl/nir/nir_clone.c  | 949 ++
 4 files changed, 960 insertions(+)
 create mode 100644 src/glsl/nir/nir_clone.c

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index ca87036..25e3801 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -26,6 +26,7 @@ NIR_FILES = \
nir/nir.h \
nir/nir_array.h \
nir/nir_builder.h \
+   nir/nir_clone.c \
nir/nir_constant_expressions.h \
nir/nir_control_flow.c \
nir/nir_control_flow.h \
diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index 0cbe4e1..2defa36 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -316,6 +316,14 @@ nir_block_create(nir_shader *shader)
block->predecessors = _mesa_set_create(block, _mesa_hash_pointer,
   _mesa_key_pointer_equal);
block->imm_dom = NULL;
+   /* XXX maybe it would be worth it to defer allocation?  This
+* way it doesn't get allocated for shader ref's that never run
+* nir_calc_dominance?  For example, state-tracker creates an
+* initial IR, clones that, runs appropriate lowering pass, passes
+* to driver which does common lowering/opt, and then stores ref
+* which is later used to do state specific lowering and futher
+* opt.  Do any of the references not need dominance metadata?
+*/
block->dom_frontier = _mesa_set_create(block, _mesa_hash_pointer,
   _mesa_key_pointer_equal);
 
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index fb60340..efcbf62 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1897,6 +1897,8 @@ void nir_index_blocks(nir_function_impl *impl);
 void nir_print_shader(nir_shader *shader, FILE *fp);
 void nir_print_instr(const nir_instr *instr, FILE *fp);
 
+nir_shader * nir_shader_clone(void *mem_ctx, const nir_shader *s);
+
 #ifdef DEBUG
 void nir_validate_shader(nir_shader *shader);
 #else
diff --git a/src/glsl/nir/nir_clone.c b/src/glsl/nir/nir_clone.c
new file mode 100644
index 000..5c1739e
--- /dev/null
+++ b/src/glsl/nir/nir_clone.c
@@ -0,0 +1,949 @@
+/*
+ * Copyright © 2015 Red Hat
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "nir.h"
+#include "nir_control_flow_private.h"
+
+// TODO move these:
+#define exec_list_head(type, list, node) \
+   exec_node_data(type, exec_list_get_head(list), node)
+#define exec_list_next(type, nodeptr, node) \
+   exec_node_data(type, exec_node_get_next(nodeptr), node)
+
+/* Secret Decoder Ring:
+ *   clone_foo():
+ *Allocate and clone a foo.
+ *   __clone_foo():
+ *Clone body of foo (ie. parent class, embedded struct,
+ *etc)
+ *   __clone_foo_p2():
+ *clone body of foo, pass 2.. since in first pass we
+ *can have fwd references to embedded structs, some
+ *ptrs (and things that depend on them) must be
+ *resolved in 2nd pass
+ */
+
+typedef struct {
+   /* maps orig ptr -> cloned ptr: */
+   struct hash_table *ptr_table;
+   /* list of unresolved ssa src's: */
+   struct list_head ssa_src_list;
+
+   /* memctx for new toplevel shader object: */
+   void *mem_ctx;
+   /* new shader object, used as memctx for just about everything else: */
+   nir_shader *ns;
+} clone_state;
+
+typedef void *(*clone_fxn)(clone_state *state, const void *ptr);
+
+static void
+init_clone_state(clone_state *state, void *mem_ctx)
+{
+   state->ptr_table = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
+  _mesa_key_pointer_equal);
+   list_inithead(&state->ssa_src_list);
+   state->mem_ctx = mem_ctx;
+}
+
+static void
+free_clone_state(clone_state *state)
+{
+   _mesa_hash_table_destroy(state->ptr_table, N

[Mesa-dev] [PATCH 4.5/7] nir: add list_head to nir_src::ssa

2015-10-26 Thread Rob Clark
From: Rob Clark 

This will be used by clone.  Possibly useful elsewhere.  The list link
will only be valid in ssa case, it fits in the padding in the union
left from the larger nir_reg_src.

Signed-off-by: Rob Clark 
---
 src/glsl/nir/nir.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 2d9c94c..fb60340 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -528,7 +528,12 @@ typedef struct nir_src {
 
union {
   nir_reg_src reg;
-  nir_ssa_def *ssa;
+  struct {
+ /* used in clone to track unresolved ssa src's: */
+ struct list_head link;
+
+ nir_ssa_def *ssa;
+  };
};
 
bool is_ssa;
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 7/7] nir: add helper macros for running NIR passes (v2)

2015-10-26 Thread Rob Clark
From: Rob Clark 

Convenient place to put in some extra sanity checking, without making
things messy for the drivers running the passes.

v2: don't use GNU C ({}) extension

Signed-off-by: Rob Clark 
---
 src/glsl/nir/nir.h  |  38 +++
 src/mesa/drivers/dri/i965/brw_nir.c | 126 
 2 files changed, 81 insertions(+), 83 deletions(-)

diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index a0585f1..047c0ea 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1945,10 +1945,48 @@ nir_shader_mutable(nir_shader *shader)
 
 #ifdef DEBUG
 void nir_validate_shader(nir_shader *shader);
+static inline bool
+__nir_test_clone(void)
+{
+   static int enable = -1;
+   if (enable == -1) {
+  enable = (getenv("NIR_TEST_CLONE") == NULL) ? 0 : 1;
+   }
+   return enable == 1;
+}
 #else
 static inline void nir_validate_shader(nir_shader *shader) { (void) shader; }
+static inline bool __nir_test_clone(void) { return false; }
 #endif /* DEBUG */
 
+
+#define NIR_PASS(pass, nir, ...) do {   \
+  assert(nir_shader_is_mutable(nir));   \
+  pass((nir), ##__VA_ARGS__);   \
+  nir_validate_shader(nir); \
+  if (__nir_test_clone()) { \
+ void *mem_ctx = ralloc_parent(nir);\
+ nir_shader *ns = nir_shader_clone(mem_ctx, (nir)); \
+ nir_shader_unref(nir); \
+ nir_validate_shader(ns);   \
+ (nir) = ns;\
+  } \
+   } while (0)
+
+#define NIR_PASS_PROGRESS(progress, pass, nir, ...) do {\
+  assert(nir_shader_is_mutable(nir));   \
+  (progress) |= pass((nir), ##__VA_ARGS__); \
+  nir_validate_shader(nir); \
+  if (__nir_test_clone()) { \
+ void *mem_ctx = ralloc_parent(nir);\
+ nir_shader *ns = nir_shader_clone(mem_ctx, (nir)); \
+ nir_shader_unref(nir); \
+ nir_validate_shader(ns);   \
+ (nir) = ns;\
+  } \
+   } while (0)
+
+
 void nir_calc_dominance_impl(nir_function_impl *impl);
 void nir_calc_dominance(nir_shader *shader);
 
diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index 8f09165..17f2244 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -136,47 +136,34 @@ brw_nir_lower_outputs(nir_shader *nir, bool is_scalar)
}
 }
 
-static void
+static nir_shader *
 nir_optimize(nir_shader *nir, bool is_scalar)
 {
bool progress;
do {
   progress = false;
-  nir_lower_vars_to_ssa(nir);
-  nir_validate_shader(nir);
 
-  if (is_scalar) {
- nir_lower_alu_to_scalar(nir);
- nir_validate_shader(nir);
-  }
+  NIR_PASS(nir_lower_vars_to_ssa, nir);
 
-  progress |= nir_copy_prop(nir);
-  nir_validate_shader(nir);
+  if (is_scalar)
+ NIR_PASS(nir_lower_alu_to_scalar, nir);
 
-  if (is_scalar) {
- nir_lower_phis_to_scalar(nir);
- nir_validate_shader(nir);
-  }
+  NIR_PASS_PROGRESS(progress, nir_copy_prop, nir);
+
+  if (is_scalar)
+ NIR_PASS(nir_lower_phis_to_scalar, nir);
 
-  progress |= nir_copy_prop(nir);
-  nir_validate_shader(nir);
-  progress |= nir_opt_dce(nir);
-  nir_validate_shader(nir);
-  progress |= nir_opt_cse(nir);
-  nir_validate_shader(nir);
-  progress |= nir_opt_peephole_select(nir);
-  nir_validate_shader(nir);
-  progress |= nir_opt_algebraic(nir);
-  nir_validate_shader(nir);
-  progress |= nir_opt_constant_folding(nir);
-  nir_validate_shader(nir);
-  progress |= nir_opt_dead_cf(nir);
-  nir_validate_shader(nir);
-  progress |= nir_opt_remove_phis(nir);
-  nir_validate_shader(nir);
-  progress |= nir_opt_undef(nir);
-  nir_validate_shader(nir);
+  NIR_PASS_PROGRESS(progress, nir_copy_prop, nir);
+  NIR_PASS_PROGRESS(progress, nir_opt_dce, nir);
+  NIR_PASS_PROGRESS(progress, nir_opt_cse, nir);
+  NIR_PASS_PROGRESS(progress, nir_opt_peephole_select, nir);
+  NIR_PASS_PROGRESS(progress, nir_opt_algebraic, nir);
+  NIR_PASS_PROGRESS(progress, nir_opt_constant_folding, nir);
+  NIR_PASS_PROGRESS(progress, nir_opt_dead_cf, nir);
+  NIR_PASS_PROGRESS(progress, nir_opt_remove_phis, nir);
+  NIR_PASS_PROGRESS(progress, nir_opt_undef, nir);
} while (progress);
+   return nir;
 }
 
 nir_shader *
@@ -204,75 +191,52 @@ brw_create_nir(struct brw_context *b

Re: [Mesa-dev] [PATCH] gallivm: disable f16c when not using AVX

2015-10-26 Thread Jose Fonseca

On 26/10/15 14:58, Roland Scheidegger wrote:

Am 26.10.2015 um 10:02 schrieb Jose Fonseca:

On 23/10/15 22:26, srol...@vmware.com wrote:

From: Roland Scheidegger 

f16c intrinsic can only be emitted when AVX is used. So when we
disable AVX
due to forcing 128bit vectors we must not use this intrinsic
(depending on
llvm version, this worked previously because llvm used AVX even when
we didn't
tell it to, however I've seen this fail with llvm 3.3 since
718249843b915decf8fccec92e466ac1a6219934 which seems to have the side
effect
of disabling avx in llvm albeit it only touches sse flags really).


Good catch.


Possibly one day should actually try to use avx even with 128bit
vectors...


In the past we needed to override util_cpu_caps.has_avx on AVX capable
machines but where old-JIT code.  But that's no longer the case: the min
supported LLVM version is 3.3, which supports AVX both with MCJIT and
old-JIT.


There, the only point of this code is to enable a developer to test SSE2
code paths on a AVX capable machine.

There's no other reason for someone to go out of his way to override
LP_NATIVE_VECTOR_WIDTH of 256 with 128.


So maybe it's worth to make this comment clear: the sole point is to
enable SSE2 testing on AVX machines, and all avx flags, and flags which
depend on avx, need to be masked out.


Well that's not quite true. Forcing 128bit wide vectors will get you
faster shader compiles and less llvm memory usage. And in some odd cases
the compiled shaders aren't even slower. Disabling AVX on top of that
doesn't really change much there though things might be minimally slower
(of course, if you hit the things which actually depend on avx, like
f16c, that's a different story).

.


Though it's not really exposed much as a feature, I find it's atleast
as interesting for development to figure out why shaders using 256bit
vectors are scaling appropriately or not compared to 128bit rather than
SSE2 emulation. And for the former case it makes sense to leave avx
enabled. You are right though that the initial idea was to essentially
force llvm for the compiled shader to look like it was compiled on a
less capable machine (albeit since we're setting the cpu type,
instruction scheduling will still be different).


So it sounds LP_NATIVE_VECTOR_WIDTH is not expressive enough for all 
development test cases.  We probably want another env var to fake SSE2 
machines etc, and let LP_NATIVE_VECTOR_WIDTH to be something orthogonal 
(that will however default to 128/256 based on the machine features).


Jose

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/7] nir: support to clone shaders (v2)

2015-10-26 Thread Rob Clark
On Mon, Oct 26, 2015 at 11:27 AM, Rob Clark  wrote:
> Signed-off-by: Rob Clark 

fyi, branch which is using this at:

https://github.com/freedreno/mesa/commits/wip-nir

At this point it seems to be surviving piglit runs w/ NIR_TEST_CLONE=1
w/ freedreno.  I did briefly try piglit runs for i965, but it locked
up my laptop hard (and didn't feel like debugging that sort of issue
on my main laptop..  I'll setup piglit on a different machine when I
get a chance)

Couple small cosemetic things to clean up still, but seems at least
functional now.

BR,
-R

> ---
>  src/glsl/Makefile.sources |   1 +
>  src/glsl/nir/nir.c|   8 +
>  src/glsl/nir/nir.h|   2 +
>  src/glsl/nir/nir_clone.c  | 949 
> ++
>  4 files changed, 960 insertions(+)
>  create mode 100644 src/glsl/nir/nir_clone.c
>
> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index ca87036..25e3801 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -26,6 +26,7 @@ NIR_FILES = \
> nir/nir.h \
> nir/nir_array.h \
> nir/nir_builder.h \
> +   nir/nir_clone.c \
> nir/nir_constant_expressions.h \
> nir/nir_control_flow.c \
> nir/nir_control_flow.h \
> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> index 0cbe4e1..2defa36 100644
> --- a/src/glsl/nir/nir.c
> +++ b/src/glsl/nir/nir.c
> @@ -316,6 +316,14 @@ nir_block_create(nir_shader *shader)
> block->predecessors = _mesa_set_create(block, _mesa_hash_pointer,
>_mesa_key_pointer_equal);
> block->imm_dom = NULL;
> +   /* XXX maybe it would be worth it to defer allocation?  This
> +* way it doesn't get allocated for shader ref's that never run
> +* nir_calc_dominance?  For example, state-tracker creates an
> +* initial IR, clones that, runs appropriate lowering pass, passes
> +* to driver which does common lowering/opt, and then stores ref
> +* which is later used to do state specific lowering and futher
> +* opt.  Do any of the references not need dominance metadata?
> +*/
> block->dom_frontier = _mesa_set_create(block, _mesa_hash_pointer,
>_mesa_key_pointer_equal);
>
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index fb60340..efcbf62 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -1897,6 +1897,8 @@ void nir_index_blocks(nir_function_impl *impl);
>  void nir_print_shader(nir_shader *shader, FILE *fp);
>  void nir_print_instr(const nir_instr *instr, FILE *fp);
>
> +nir_shader * nir_shader_clone(void *mem_ctx, const nir_shader *s);
> +
>  #ifdef DEBUG
>  void nir_validate_shader(nir_shader *shader);
>  #else
> diff --git a/src/glsl/nir/nir_clone.c b/src/glsl/nir/nir_clone.c
> new file mode 100644
> index 000..5c1739e
> --- /dev/null
> +++ b/src/glsl/nir/nir_clone.c
> @@ -0,0 +1,949 @@
> +/*
> + * Copyright © 2015 Red Hat
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "nir.h"
> +#include "nir_control_flow_private.h"
> +
> +// TODO move these:
> +#define exec_list_head(type, list, node) \
> +   exec_node_data(type, exec_list_get_head(list), node)
> +#define exec_list_next(type, nodeptr, node) \
> +   exec_node_data(type, exec_node_get_next(nodeptr), node)
> +
> +/* Secret Decoder Ring:
> + *   clone_foo():
> + *Allocate and clone a foo.
> + *   __clone_foo():
> + *Clone body of foo (ie. parent class, embedded struct,
> + *etc)
> + *   __clone_foo_p2():
> + *clone body of foo, pass 2.. since in first pass we
> + *can have fwd references to embedded structs, some
> + *ptrs (and things that depend on them) must be
> + *resolved in 2nd pass
> + */
> +
> +typedef struct {
> +   /* maps orig ptr -> cloned ptr: */
> +   struct h

Re: [Mesa-dev] [PATCH] gallivm: disable f16c when not using AVX

2015-10-26 Thread Roland Scheidegger
Am 26.10.2015 um 16:33 schrieb Jose Fonseca:
> On 26/10/15 14:58, Roland Scheidegger wrote:
>> Am 26.10.2015 um 10:02 schrieb Jose Fonseca:
>>> On 23/10/15 22:26, srol...@vmware.com wrote:
 From: Roland Scheidegger 

 f16c intrinsic can only be emitted when AVX is used. So when we
 disable AVX
 due to forcing 128bit vectors we must not use this intrinsic
 (depending on
 llvm version, this worked previously because llvm used AVX even when
 we didn't
 tell it to, however I've seen this fail with llvm 3.3 since
 718249843b915decf8fccec92e466ac1a6219934 which seems to have the side
 effect
 of disabling avx in llvm albeit it only touches sse flags really).
>>>
>>> Good catch.
>>>
 Possibly one day should actually try to use avx even with 128bit
 vectors...
>>>
>>> In the past we needed to override util_cpu_caps.has_avx on AVX capable
>>> machines but where old-JIT code.  But that's no longer the case: the min
>>> supported LLVM version is 3.3, which supports AVX both with MCJIT and
>>> old-JIT.
>>>
>>>
>>> There, the only point of this code is to enable a developer to test SSE2
>>> code paths on a AVX capable machine.
>>>
>>> There's no other reason for someone to go out of his way to override
>>> LP_NATIVE_VECTOR_WIDTH of 256 with 128.
>>>
>>>
>>> So maybe it's worth to make this comment clear: the sole point is to
>>> enable SSE2 testing on AVX machines, and all avx flags, and flags which
>>> depend on avx, need to be masked out.
>>
>> Well that's not quite true. Forcing 128bit wide vectors will get you
>> faster shader compiles and less llvm memory usage. And in some odd cases
>> the compiled shaders aren't even slower. Disabling AVX on top of that
>> doesn't really change much there though things might be minimally slower
>> (of course, if you hit the things which actually depend on avx, like
>> f16c, that's a different story).
> .
> 
>> Though it's not really exposed much as a feature, I find it's atleast
>> as interesting for development to figure out why shaders using 256bit
>> vectors are scaling appropriately or not compared to 128bit rather than
>> SSE2 emulation. And for the former case it makes sense to leave avx
>> enabled. You are right though that the initial idea was to essentially
>> force llvm for the compiled shader to look like it was compiled on a
>> less capable machine (albeit since we're setting the cpu type,
>> instruction scheduling will still be different).
> 
> So it sounds LP_NATIVE_VECTOR_WIDTH is not expressive enough for all
> development test cases.  We probably want another env var to fake SSE2
> machines etc, and let LP_NATIVE_VECTOR_WIDTH to be something orthogonal
> (that will however default to 128/256 based on the machine features).
Yes, ideally it would work like that. That said, since it's really meant
for development mostly, hacking around to what is needed seemed to work
well enough until now...

Roland


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] clover: fix tgsi compiler crash with invalid src

2015-10-26 Thread Serge Martin
---
 src/gallium/state_trackers/clover/tgsi/compiler.cpp | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/gallium/state_trackers/clover/tgsi/compiler.cpp 
b/src/gallium/state_trackers/clover/tgsi/compiler.cpp
index 54cb747..4d05666 100644
--- a/src/gallium/state_trackers/clover/tgsi/compiler.cpp
+++ b/src/gallium/state_trackers/clover/tgsi/compiler.cpp
@@ -97,6 +97,11 @@ namespace {
 module
 clover::compile_program_tgsi(const std::string &source, std::string &r_log) {
const size_t body_pos = source.find("COMP\n");
+   if (body_pos == std::string::npos) {
+  r_log = "invalid source";
+  throw compile_error();
+   }
+
const char *body = &source[body_pos];
module m;
 
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] clover: fix tgsi compiler crash with invalid src

2015-10-26 Thread Francisco Jerez
Serge Martin  writes:

> ---
>  src/gallium/state_trackers/clover/tgsi/compiler.cpp | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/gallium/state_trackers/clover/tgsi/compiler.cpp 
> b/src/gallium/state_trackers/clover/tgsi/compiler.cpp
> index 54cb747..4d05666 100644
> --- a/src/gallium/state_trackers/clover/tgsi/compiler.cpp
> +++ b/src/gallium/state_trackers/clover/tgsi/compiler.cpp
> @@ -97,6 +97,11 @@ namespace {
>  module
>  clover::compile_program_tgsi(const std::string &source, std::string &r_log) {
> const size_t body_pos = source.find("COMP\n");
> +   if (body_pos == std::string::npos) {
> +  r_log = "invalid source";
> +  throw compile_error();
> +   }
> +
> const char *body = &source[body_pos];
> module m;

Reviewed-by: Francisco Jerez 

>  
> -- 
> 2.4.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 77449] Tracker bug for all bugs related to Steam titles

2015-10-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=77449
Bug 77449 depends on bug 86720, which changed state.

Bug 86720 Summary: [radeon] Europa Universalis 4 freezing during game start 
(10.3.3+, still broken on 11.0.2)
https://bugs.freedesktop.org/show_bug.cgi?id=86720

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92645] kodi vdpau interop fails since mesa, meta: move gl_texture_object::TargetIndex initializations

2015-10-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92645

Lee Donaghy  changed:

   What|Removed |Added

 CC||lee295...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC 18/21] mesa/extensions: Remove extra memsets on gl_extensions

2015-10-26 Thread Emil Velikov
On 23 October 2015 at 19:46, Jordan Justen  wrote:
> On 2015-10-22 03:32:58, Emil Velikov wrote:
>> On 19 October 2015 at 23:44, Nanley Chery  wrote:
>> > From: Nanley Chery 
>> >
>> > Aside from those modified in this commit, all gl_extensions structs are
>> > zero-initialized by default. There is therefore no need to memset the
>> > structs to 0. Also, remove the open-coded memset in
>> > _mesa_init_extensions().
>> >
>> > Signed-off-by: Nanley Chery 
>> > ---
>> >  src/mesa/main/extensions.c | 18 --
>> >  1 file changed, 4 insertions(+), 14 deletions(-)
>> >
>> > diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
>> > index 365e7ed..6cd2b27 100644
>> > --- a/src/mesa/main/extensions.c
>> > +++ b/src/mesa/main/extensions.c
>> > @@ -37,8 +37,8 @@
>> >  #include "macros.h"
>> >  #include "mtypes.h"
>> >
>> > -struct gl_extensions _mesa_extension_override_enables;
>> > -struct gl_extensions _mesa_extension_override_disables;
>> > +struct gl_extensions _mesa_extension_override_enables = {0};
>> > +struct gl_extensions _mesa_extension_override_disables = {0};
>> Side note: once ARB_compute_shader lands in for i965 we can even make
>> these local/static.
>
> I added these variables specifically so a driver could check to see
> what extensions were overridden early in context creation. Making them
> static would undo that feature...
>
> The alternative before was to add a custom environment variable. (See
> 10e03b44)
>
What I meant was that this feature feels like a (layering?) violation.
Although I do realise why you've added it, esp. since
arb_compute_shader is a large and complex beast. One can only imagine
what will happen if for every interaction between implemented
extension FOO and non-implemented (wip) extension BAR we had a similar
code around. I'm not sure it'll be pretty :)

That said I'm not suggesting that we touch it. Not at least
arb_compute_shader is complete.

Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] nir: add couple array length fields

2015-10-26 Thread Jason Ekstrand
On Sat, Oct 24, 2015 at 10:07 AM, Rob Clark  wrote:
> From: Rob Clark 
>
> This will simplify things somewhat in clone.
>
> Signed-off-by: Rob Clark 
> ---
>  src/glsl/nir/glsl_to_nir.cpp |  6 ++
>  src/glsl/nir/nir.h   | 11 +++
>  2 files changed, 17 insertions(+)
>
> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
> index 9b50a93..8f83012 100644
> --- a/src/glsl/nir/glsl_to_nir.cpp
> +++ b/src/glsl/nir/glsl_to_nir.cpp
> @@ -238,6 +238,8 @@ constant_copy(ir_constant *ir, void *mem_ctx)
>
> unsigned total_elems = ir->type->components();
> unsigned i;
> +
> +   ret->num_elements = 0;
> switch (ir->type->base_type) {
> case GLSL_TYPE_UINT:
>for (i = 0; i < total_elems; i++)
> @@ -262,6 +264,8 @@ constant_copy(ir_constant *ir, void *mem_ctx)
> case GLSL_TYPE_STRUCT:
>ret->elements = ralloc_array(mem_ctx, nir_constant *,
> ir->type->length);
> +  ret->num_elements = ir->type->length;
> +
>i = 0;
>foreach_in_list(ir_constant, field, &ir->components) {
>   ret->elements[i] = constant_copy(field, mem_ctx);
> @@ -272,6 +276,7 @@ constant_copy(ir_constant *ir, void *mem_ctx)
> case GLSL_TYPE_ARRAY:
>ret->elements = ralloc_array(mem_ctx, nir_constant *,
> ir->type->length);
> +  ret->num_elements = ir->type->length;
>
>for (i = 0; i < ir->type->length; i++)
>   ret->elements[i] = constant_copy(ir->array_elements[i], mem_ctx);
> @@ -293,6 +298,7 @@ nir_visitor::visit(ir_variable *ir)
>
> if (ir->is_interface_instance() && ir->get_max_ifc_array_access() != 
> NULL) {
>unsigned size = ir->get_interface_type()->length;
> +  var->num_max_ifc_array_access = size;
>var->max_ifc_array_access = ralloc_array(var, unsigned, size);
>memcpy(var->max_ifc_array_access, ir->get_max_ifc_array_access(),
>   size * sizeof(unsigned));
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index a09c2a6..2d9c94c 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -112,6 +112,11 @@ typedef struct nir_constant {
>  */
> union nir_constant_data value;
>
> +   /* we could get this from the var->type but makes clone *much* easier to
> +* not have to care about the type.
> +*/
> +   unsigned num_elements;
> +
> /* Array elements / Structure Fields */
> struct nir_constant **elements;
>  } nir_constant;
> @@ -148,6 +153,12 @@ typedef struct {
>  */
> char *name;
>
> +   /* we could figure this out from interface_type but it isn't exposed
> +* cleanly outside of c++ and just having the length here simplifies
> +* clone:
> +*/
> +   unsigned num_max_ifc_array_access;

Can we just get rid of max_ifc_array_access instead?  It's something
that GLSL uses for some optimizations somewhere but NIR has never made
any use of it whatsoever and none of the backends care.  The only
reason it's here is because Connor copied and pasted ir_variable into
NIR variable.
--Jason

> +
> /**
>  * For variables which satisfy the is_interface_instance() predicate, this
>  * points to an array of integers such that if the ith member of the
> --
> 2.5.0
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/7] nir: some small cleanups

2015-10-26 Thread Jason Ekstrand
On Sat, Oct 24, 2015 at 10:07 AM, Rob Clark  wrote:
> The various cf nodes all get allocated w/ shader as their ralloc_parent,
> so lets make this more explicit.  Plus couple other corrections/
> clarifications.
>
> Signed-off-by: Rob Clark 

Reviewed-by: Jason Ekstrand 

> ---
>  src/glsl/nir/nir.c | 18 +-
>  src/glsl/nir/nir.h | 10 +-
>  2 files changed, 14 insertions(+), 14 deletions(-)
>
> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> index caa9ffc..300e43f 100644
> --- a/src/glsl/nir/nir.c
> +++ b/src/glsl/nir/nir.c
> @@ -306,9 +306,9 @@ nir_function_impl_create(nir_function_overload *overload)
>  }
>
>  nir_block *
> -nir_block_create(void *mem_ctx)
> +nir_block_create(nir_shader *shader)
>  {
> -   nir_block *block = ralloc(mem_ctx, nir_block);
> +   nir_block *block = ralloc(shader, nir_block);
>
> cf_init(&block->cf_node, nir_cf_node_block);
>
> @@ -334,19 +334,19 @@ src_init(nir_src *src)
>  }
>
>  nir_if *
> -nir_if_create(void *mem_ctx)
> +nir_if_create(nir_shader *shader)
>  {
> -   nir_if *if_stmt = ralloc(mem_ctx, nir_if);
> +   nir_if *if_stmt = ralloc(shader, nir_if);
>
> cf_init(&if_stmt->cf_node, nir_cf_node_if);
> src_init(&if_stmt->condition);
>
> -   nir_block *then = nir_block_create(mem_ctx);
> +   nir_block *then = nir_block_create(shader);
> exec_list_make_empty(&if_stmt->then_list);
> exec_list_push_tail(&if_stmt->then_list, &then->cf_node.node);
> then->cf_node.parent = &if_stmt->cf_node;
>
> -   nir_block *else_stmt = nir_block_create(mem_ctx);
> +   nir_block *else_stmt = nir_block_create(shader);
> exec_list_make_empty(&if_stmt->else_list);
> exec_list_push_tail(&if_stmt->else_list, &else_stmt->cf_node.node);
> else_stmt->cf_node.parent = &if_stmt->cf_node;
> @@ -355,13 +355,13 @@ nir_if_create(void *mem_ctx)
>  }
>
>  nir_loop *
> -nir_loop_create(void *mem_ctx)
> +nir_loop_create(nir_shader *shader)
>  {
> -   nir_loop *loop = ralloc(mem_ctx, nir_loop);
> +   nir_loop *loop = ralloc(shader, nir_loop);
>
> cf_init(&loop->cf_node, nir_cf_node_loop);
>
> -   nir_block *body = nir_block_create(mem_ctx);
> +   nir_block *body = nir_block_create(shader);
> exec_list_make_empty(&loop->body);
> exec_list_push_tail(&loop->body, &body->cf_node.node);
> body->cf_node.parent = &loop->cf_node;
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index 79c0666..097d65b 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -395,10 +395,10 @@ typedef struct {
>  */
> bool is_packed;
>
> -   /** set of nir_instr's where this register is used (read from) */
> +   /** set of nir_src's where this register is used (read from) */
> struct list_head uses;
>
> -   /** set of nir_instr's where this register is defined (written to) */
> +   /** set of nir_dest's where this register is defined (written to) */
> struct list_head defs;
>
> /** set of nir_if's where this register is used as a condition */
> @@ -1622,9 +1622,9 @@ nir_function_overload 
> *nir_function_overload_create(nir_function *func);
>
>  nir_function_impl *nir_function_impl_create(nir_function_overload *func);
>
> -nir_block *nir_block_create(void *mem_ctx);
> -nir_if *nir_if_create(void *mem_ctx);
> -nir_loop *nir_loop_create(void *mem_ctx);
> +nir_block *nir_block_create(nir_shader *shader);
> +nir_if *nir_if_create(nir_shader *shader);
> +nir_loop *nir_loop_create(nir_shader *shader);
>
>  nir_function_impl *nir_cf_node_get_function(nir_cf_node *node);
>
> --
> 2.5.0
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/7] nir: add nir_var_all enum

2015-10-26 Thread Jason Ekstrand
On Sat, Oct 24, 2015 at 10:07 AM, Rob Clark  wrote:
> Otherwise, passing -1 gets you:
>
>   error: invalid conversion from 'int' to 'nir_variable_mode' [-fpermissive]

Yeah, I've never really liked that.  Another option (which I think I'd
marginally prefer) would be to make nir_lower_io take a bitfield of
which variable types you want to lower.  Then you would pass in (1 <<
nir_var_foo) for a specific one and ~0u for all.  That way you could
also lower two at a time if you wanted.
--Jason

>
> Signed-off-by: Rob Clark 
> ---
>  src/glsl/nir/nir.c  | 4 
>  src/glsl/nir/nir.h  | 1 +
>  src/glsl/nir/nir_lower_io.c | 2 +-
>  src/mesa/drivers/dri/i965/brw_nir.c | 3 ++-
>  4 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> index 793bdaf..caa9ffc 100644
> --- a/src/glsl/nir/nir.c
> +++ b/src/glsl/nir/nir.c
> @@ -107,6 +107,10 @@ void
>  nir_shader_add_variable(nir_shader *shader, nir_variable *var)
>  {
> switch (var->data.mode) {
> +   case nir_var_all:
> +  assert(!"invalid mode");
> +  break;
> +
> case nir_var_local:
>assert(!"nir_shader_add_variable cannot be used for local variables");
>break;
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index e3777f9..79c0666 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -82,6 +82,7 @@ typedef struct {
>  } nir_state_slot;
>
>  typedef enum {
> +   nir_var_all = -1,
> nir_var_shader_in,
> nir_var_shader_out,
> nir_var_global,
> diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
> index 688b48f..b4ed857 100644
> --- a/src/glsl/nir/nir_lower_io.c
> +++ b/src/glsl/nir/nir_lower_io.c
> @@ -176,7 +176,7 @@ nir_lower_io_block(nir_block *block, void *void_state)
>
>nir_variable_mode mode = intrin->variables[0]->var->data.mode;
>
> -  if (state->mode != -1 && state->mode != mode)
> +  if (state->mode != nir_var_all && state->mode != mode)
>   continue;
>
>switch (intrin->intrinsic) {
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
> b/src/mesa/drivers/dri/i965/brw_nir.c
> index 1b4dace..8f09165 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> @@ -235,7 +235,8 @@ brw_create_nir(struct brw_context *brw,
> nir_assign_var_locations(&nir->uniforms,
>  &nir->num_uniforms,
>  is_scalar ? type_size_scalar : type_size_vec4);
> -   nir_lower_io(nir, -1, is_scalar ? type_size_scalar : type_size_vec4);
> +   nir_lower_io(nir, nir_var_all,
> +is_scalar ? type_size_scalar : type_size_vec4);
> nir_validate_shader(nir);
>
> nir_remove_dead_variables(nir);
> --
> 2.5.0
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4.5/7] nir: add list_head to nir_src::ssa

2015-10-26 Thread Jason Ekstrand
On Mon, Oct 26, 2015 at 8:26 AM, Rob Clark  wrote:
> From: Rob Clark 
>
> This will be used by clone.  Possibly useful elsewhere.  The list link
> will only be valid in ssa case, it fits in the padding in the union
> left from the larger nir_reg_src.

We already have a use_link.  It seems perfectly reasonable to me to
overload it in the middle of a copy operation.
--Jason

> Signed-off-by: Rob Clark 
> ---
>  src/glsl/nir/nir.h | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index 2d9c94c..fb60340 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -528,7 +528,12 @@ typedef struct nir_src {
>
> union {
>nir_reg_src reg;
> -  nir_ssa_def *ssa;
> +  struct {
> + /* used in clone to track unresolved ssa src's: */
> + struct list_head link;
> +
> + nir_ssa_def *ssa;
> +  };
> };
>
> bool is_ssa;
> --
> 2.5.0
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/5] i965/fs: Properly check for PAD in fragment shaders with > 16 varyings.

2015-10-26 Thread Kenneth Graunke
Commit 268008f98c3810b9f276df985dc93efc0c49f33e changed unused VUE map
slots to be initialized with BRW_VARYING_SLOT_PAD, not COUNT.  I missed
updating this.  It also means that commit message was wrong, as some
code *did* rely slots being initialized to COUNT.

This may fix a bug with SSO programs with > 16 FS input varyings.
I think we probably just emitted extra pointless code, but probably
didn't break anything.  We might also just have no tests for that.

Signed-off-by: Kenneth Graunke 
Cc: Chris Forbes 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 75f1a92..2eef7af 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1456,10 +1456,7 @@ fs_visitor::calculate_urb_setup()
  for (int slot = first_slot; slot < prev_stage_vue_map.num_slots;
   slot++) {
 int varying = prev_stage_vue_map.slot_to_varying[slot];
-/* Note that varying == BRW_VARYING_SLOT_COUNT when a slot is
- * unused.
- */
-if (varying != BRW_VARYING_SLOT_COUNT &&
+if (varying != BRW_VARYING_SLOT_PAD &&
 (nir->info.inputs_read & BRW_FS_VARYING_INPUT_MASK &
  BITFIELD64_BIT(varying))) {
prog_data->urb_setup[varying] = slot - first_slot;
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/5] glsl: Mark gl_ViewportIndex and gl_Layer varyings as flat.

2015-10-26 Thread Kenneth Graunke
Integer varyings need to be flat qualified - all others were already.
I think we just missed this.  Presumably some hardware passes this via
sideband and ignores attribute interpolation, so no one has noticed.

Signed-off-by: Kenneth Graunke 
Cc: Chris Forbes 
---
 src/glsl/builtin_variables.cpp | 39 ++-
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index a6ad105..07aacd4 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -887,16 +887,22 @@ builtin_variable_generator::generate_uniforms()
 void
 builtin_variable_generator::generate_vs_special_vars()
 {
+   ir_variable *var;
+
if (state->is_version(130, 300))
   add_system_value(SYSTEM_VALUE_VERTEX_ID, int_t, "gl_VertexID");
if (state->ARB_draw_instanced_enable)
   add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceIDARB");
if (state->ARB_draw_instanced_enable || state->is_version(140, 300))
   add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceID");
-   if (state->AMD_vertex_shader_layer_enable)
-  add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
-   if (state->AMD_vertex_shader_viewport_index_enable)
-  add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
+   if (state->AMD_vertex_shader_layer_enable) {
+  var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
+  var->data.interpolation = INTERP_QUALIFIER_FLAT;
+   }
+   if (state->AMD_vertex_shader_viewport_index_enable) {
+  var = add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
+  var->data.interpolation = INTERP_QUALIFIER_FLAT;
+   }
if (compatibility) {
   add_input(VERT_ATTRIB_POS, vec4_t, "gl_Vertex");
   add_input(VERT_ATTRIB_NORMAL, vec3_t, "gl_Normal");
@@ -954,11 +960,16 @@ builtin_variable_generator::generate_tes_special_vars()
 void
 builtin_variable_generator::generate_gs_special_vars()
 {
-   add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
-   if (state->is_version(410, 0) || state->ARB_viewport_array_enable)
-  add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
+   ir_variable *var;
+
+   var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
+   var->data.interpolation = INTERP_QUALIFIER_FLAT;
+   if (state->is_version(410, 0) || state->ARB_viewport_array_enable) {
+  var = add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
+  var->data.interpolation = INTERP_QUALIFIER_FLAT;
+   }
if (state->is_version(400, 0) || state->ARB_gpu_shader5_enable)
-  add_system_value(SYSTEM_VALUE_INVOCATION_ID, int_t, "gl_InvocationID");
+  var = add_system_value(SYSTEM_VALUE_INVOCATION_ID, int_t, 
"gl_InvocationID");
 
/* Although gl_PrimitiveID appears in tessellation control and tessellation
 * evaluation shaders, it has a different function there than it has in
@@ -970,7 +981,6 @@ builtin_variable_generator::generate_gs_special_vars()
 * the specific case of gl_PrimitiveIDIn.  So we don't need to treat
 * gl_PrimitiveIDIn as an {ARB,EXT}_geometry_shader4-only variable.
 */
-   ir_variable *var;
var = add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveIDIn");
var->data.interpolation = INTERP_QUALIFIER_FLAT;
var = add_output(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");
@@ -984,14 +994,15 @@ builtin_variable_generator::generate_gs_special_vars()
 void
 builtin_variable_generator::generate_fs_special_vars()
 {
+   ir_variable *var;
+
add_input(VARYING_SLOT_POS, vec4_t, "gl_FragCoord");
add_input(VARYING_SLOT_FACE, bool_t, "gl_FrontFacing");
if (state->is_version(120, 100))
   add_input(VARYING_SLOT_PNTC, vec2_t, "gl_PointCoord");
 
if (state->is_version(150, 0)) {
-  ir_variable *var =
- add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");
+  var = add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");
   var->data.interpolation = INTERP_QUALIFIER_FLAT;
}
 
@@ -1043,8 +1054,10 @@ builtin_variable_generator::generate_fs_special_vars()
}
 
if (state->is_version(430, 0) || state->ARB_fragment_layer_viewport_enable) 
{
-  add_input(VARYING_SLOT_LAYER, int_t, "gl_Layer");
-  add_input(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
+  var = add_input(VARYING_SLOT_LAYER, int_t, "gl_Layer");
+  var->data.interpolation = INTERP_QUALIFIER_FLAT;
+  var = add_input(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
+  var->data.interpolation = INTERP_QUALIFIER_FLAT;
}
 }
 
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] i965: Implement ARB_fragment_layer_viewport.

2015-10-26 Thread Kenneth Graunke
Normally, we could read gl_Layer from bits 26:16 of R0.0.  However, the
specification requires that bogus out-of-range 32-bit values written by
previous stages need to appear in the fragment shader as-written.

Instead, we pass in the full 32-bit value from the VUE header as an
extra flat-shaded varying.  We have the SF override the value to 0
when the previous stage didn't actually write a value (it's actually
defined to return 0).

Signed-off-by: Kenneth Graunke 
Cc: Chris Forbes 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |  7 ++-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  8 +++
 src/mesa/drivers/dri/i965/gen6_sf_state.c| 31 
 src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
 4 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 2eef7af..5a82b04 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1442,6 +1442,9 @@ fs_visitor::calculate_urb_setup()
 }
  }
   } else {
+ bool include_vue_header =
+nir->info.inputs_read & (VARYING_BIT_LAYER | VARYING_BIT_VIEWPORT);
+
  /* We have enough input varyings that the SF/SBE pipeline stage can't
   * arbitrarily rearrange them to suit our whim; we have to put them
   * in an order that matches the output of the previous pipeline stage
@@ -1451,7 +1454,9 @@ fs_visitor::calculate_urb_setup()
  brw_compute_vue_map(devinfo, &prev_stage_vue_map,
  key->input_slots_valid,
  nir->info.separate_shader);
- int first_slot = 2 * BRW_SF_URB_ENTRY_READ_OFFSET;
+ int first_slot =
+include_vue_header ? 0 : 2 * BRW_SF_URB_ENTRY_READ_OFFSET;
+
  assert(prev_stage_vue_map.num_slots <= first_slot + 32);
  for (int slot = first_slot; slot < prev_stage_vue_map.num_slots;
   slot++) {
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 4950ba4..9c1f95c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -71,6 +71,14 @@ fs_visitor::nir_setup_inputs()
  var->data.origin_upper_left);
  emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, bld.dispatch_width(),
input, reg), 0xF);
+  } else if (var->data.location == VARYING_SLOT_LAYER) {
+ struct brw_reg reg = suboffset(interp_reg(VARYING_SLOT_LAYER, 1), 3);
+ reg.type = BRW_REGISTER_TYPE_D;
+ bld.emit(FS_OPCODE_CINTERP, retype(input, BRW_REGISTER_TYPE_D), reg);
+  } else if (var->data.location == VARYING_SLOT_VIEWPORT) {
+ struct brw_reg reg = suboffset(interp_reg(VARYING_SLOT_VIEWPORT, 2), 
3);
+ reg.type = BRW_REGISTER_TYPE_D;
+ bld.emit(FS_OPCODE_CINTERP, retype(input, BRW_REGISTER_TYPE_D), reg);
   } else {
  emit_general_interpolation(input, var->name, var->type,
 (glsl_interp_qualifier) 
var->data.interpolation,
diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c 
b/src/mesa/drivers/dri/i965/gen6_sf_state.c
index 0c8c053..2634e6b 100644
--- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
@@ -60,6 +60,23 @@ get_attr_override(const struct brw_vue_map *vue_map, int 
urb_entry_read_offset,
/* Find the VUE slot for this attribute. */
int slot = vue_map->varying_to_slot[fs_attr];
 
+   /* Viewport and Layer are stored in the VUE header.  We need to override
+* them to zero if earlier stages didn't write them, as GL requires that
+* they read back as zero when not explicitly set.
+*/
+   if (fs_attr == VARYING_SLOT_VIEWPORT || fs_attr == VARYING_SLOT_LAYER) {
+  unsigned override =
+ ATTRIBUTE_0_OVERRIDE_X | ATTRIBUTE_0_OVERRIDE_W |
+ ATTRIBUTE_CONST_ << ATTRIBUTE_0_CONST_SOURCE_SHIFT;
+
+  if (!(vue_map->slots_valid & VARYING_BIT_LAYER))
+ override |= ATTRIBUTE_0_OVERRIDE_Y;
+  if (!(vue_map->slots_valid & VARYING_BIT_VIEWPORT))
+ override |= ATTRIBUTE_0_OVERRIDE_Z;
+
+  return override;
+   }
+
/* If there was only a back color written but not front, use back
 * as the color instead of undefined
 */
@@ -169,6 +186,20 @@ calculate_attr_overrides(const struct brw_context *brw,
 
*urb_entry_read_offset = BRW_SF_URB_ENTRY_READ_OFFSET;
 
+   /* BRW_NEW_FRAGMENT_PROGRAM
+*
+* If the fragment shader reads VARYING_SLOT_LAYER, then we need to pass in
+* the full vertex header.  Otherwise, we can program the SF to start
+* reading at an offset of 1 (2 varying slots) to skip unnecessary data:
+* - VARYING_SLOT_PSIZ and BRW_VARYING_SLOT_NDC on gen4-5
+* - VARYING_SLOT_{PSIZ,LAYER} and VARYING_SLOT_POS on gen6+
+*/
+
+   bool fs_needs_vue_h

[Mesa-dev] [PATCH 1/5] i965: Update stale comment about unused VUE map slots.

2015-10-26 Thread Kenneth Graunke
I changed this from COUNT to PAD in commit 268008f98c3810b9f276df985dc93ef.

Signed-off-by: Kenneth Graunke 
Cc: Chris Forbes 
---
 src/mesa/drivers/dri/i965/brw_compiler.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
b/src/mesa/drivers/dri/i965/brw_compiler.h
index bb2fbd4..91eabaf 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.h
+++ b/src/mesa/drivers/dri/i965/brw_compiler.h
@@ -441,9 +441,7 @@ struct brw_vue_map {
 * directly correspond to a gl_varying_slot, the value comes from
 * brw_varying_slot.
 *
-* For slots that are not in use, the value is BRW_VARYING_SLOT_COUNT (this
-* simplifies code that uses the value stored in slot_to_varying to
-* create a bit mask).
+* For slots that are not in use, the value is BRW_VARYING_SLOT_PAD.
 */
signed char slot_to_varying[BRW_VARYING_SLOT_COUNT];
 
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/5] i965: Make calculate_attr_overrides return the URB read offset.

2015-10-26 Thread Kenneth Graunke
Traditionally, we've hardcoded "URB Entry Read Offset" to 1 (which
represents 2 vec4 varying slots) to skip over the 8 DWord VUE header.

In order to support ARB_fragment_layer_viewport, we'll need to read
from that header.  This patch adds the basic plumbing necessary to
calculate a value dynamically and hook it up in the SBE packets.

Signed-off-by: Kenneth Graunke 
Cc: Chris Forbes 
---
 src/mesa/drivers/dri/i965/brw_state.h |  3 ++-
 src/mesa/drivers/dri/i965/gen6_sf_state.c | 13 -
 src/mesa/drivers/dri/i965/gen7_sf_state.c |  5 +++--
 src/mesa/drivers/dri/i965/gen8_sf_state.c |  6 --
 4 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index dc2b941..56ebb3e 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -345,7 +345,8 @@ calculate_attr_overrides(const struct brw_context *brw,
  uint16_t *attr_overrides,
  uint32_t *point_sprite_enables,
  uint32_t *flat_enables,
- uint32_t *urb_entry_read_length);
+ uint32_t *urb_entry_read_length,
+ uint32_t *urb_entry_read_offset);
 
 /* gen6_surface_state.c */
 void gen6_init_vtable_surface_functions(struct brw_context *brw);
diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c 
b/src/mesa/drivers/dri/i965/gen6_sf_state.c
index 4068f28..0c8c053 100644
--- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
@@ -159,14 +159,16 @@ calculate_attr_overrides(const struct brw_context *brw,
  uint16_t *attr_overrides,
  uint32_t *point_sprite_enables,
  uint32_t *flat_enables,
- uint32_t *urb_entry_read_length)
+ uint32_t *urb_entry_read_length,
+ uint32_t *urb_entry_read_offset)
 {
-   const int urb_entry_read_offset = BRW_SF_URB_ENTRY_READ_OFFSET;
uint32_t max_source_attr = 0;
 
*point_sprite_enables = 0;
*flat_enables = 0;
 
+   *urb_entry_read_offset = BRW_SF_URB_ENTRY_READ_OFFSET;
+
/* _NEW_LIGHT */
bool shade_model_flat = brw->ctx.Light.ShadeModel == GL_FLAT;
 
@@ -228,7 +230,7 @@ calculate_attr_overrides(const struct brw_context *brw,
   /* BRW_NEW_VUE_MAP_GEOM_OUT | _NEW_LIGHT | _NEW_PROGRAM */
   uint16_t attr_override = point_sprite ? 0 :
  get_attr_override(&brw->vue_map_geom_out,
-  urb_entry_read_offset, attr,
+  *urb_entry_read_offset, attr,
brw->ctx.VertexProgram._TwoSideEnabled,
&max_source_attr);
 
@@ -276,7 +278,6 @@ upload_sf_state(struct brw_context *brw)
bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
const bool multisampled_fbo = _mesa_geometric_samples(ctx->DrawBuffer) > 1;
 
-   const int urb_entry_read_offset = BRW_SF_URB_ENTRY_READ_OFFSET;
float point_size;
uint16_t attr_overrides[16];
uint32_t point_sprite_origin;
@@ -411,8 +412,10 @@ upload_sf_state(struct brw_context *brw)
 * _NEW_POINT | _NEW_LIGHT | _NEW_PROGRAM | BRW_NEW_FS_PROG_DATA
 */
uint32_t urb_entry_read_length;
+   uint32_t urb_entry_read_offset;
calculate_attr_overrides(brw, attr_overrides, &point_sprite_enables,
-&flat_enables, &urb_entry_read_length);
+&flat_enables, &urb_entry_read_length,
+&urb_entry_read_offset);
dw1 |= (urb_entry_read_length << GEN6_SF_URB_ENTRY_READ_LENGTH_SHIFT |
urb_entry_read_offset << GEN6_SF_URB_ENTRY_READ_OFFSET_SHIFT);
 
diff --git a/src/mesa/drivers/dri/i965/gen7_sf_state.c 
b/src/mesa/drivers/dri/i965/gen7_sf_state.c
index 698b3d4..b1f13ac 100644
--- a/src/mesa/drivers/dri/i965/gen7_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sf_state.c
@@ -40,7 +40,6 @@ upload_sbe_state(struct brw_context *brw)
uint32_t point_sprite_enables;
uint32_t flat_enables;
int i;
-   const int urb_entry_read_offset = BRW_SF_URB_ENTRY_READ_OFFSET;
uint16_t attr_overrides[16];
/* _NEW_BUFFERS */
bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
@@ -65,8 +64,10 @@ upload_sbe_state(struct brw_context *brw)
 * _NEW_POINT | _NEW_LIGHT | _NEW_PROGRAM | BRW_NEW_FS_PROG_DATA
 */
uint32_t urb_entry_read_length;
+   uint32_t urb_entry_read_offset;
calculate_attr_overrides(brw, attr_overrides, &point_sprite_enables,
-&flat_enables, &urb_entry_read_length);
+&flat_enables, &urb_entry_read_length,
+&urb_entry_read_offset);
dw1 |= urb_entry_read_length << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT |
   urb_entry_read_offset << GEN7_SBE_URB_ENTRY_READ_OFFSET_SHIFT;
 
diff --git a/src/mesa/

Re: [Mesa-dev] [PATCH 4/7] nir: add couple array length fields

2015-10-26 Thread Rob Clark
On Mon, Oct 26, 2015 at 1:52 PM, Jason Ekstrand  wrote:
> On Sat, Oct 24, 2015 at 10:07 AM, Rob Clark  wrote:
>> From: Rob Clark 
>>
>> This will simplify things somewhat in clone.
>>
>> Signed-off-by: Rob Clark 
>> ---
>>  src/glsl/nir/glsl_to_nir.cpp |  6 ++
>>  src/glsl/nir/nir.h   | 11 +++
>>  2 files changed, 17 insertions(+)
>>
>> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
>> index 9b50a93..8f83012 100644
>> --- a/src/glsl/nir/glsl_to_nir.cpp
>> +++ b/src/glsl/nir/glsl_to_nir.cpp
>> @@ -238,6 +238,8 @@ constant_copy(ir_constant *ir, void *mem_ctx)
>>
>> unsigned total_elems = ir->type->components();
>> unsigned i;
>> +
>> +   ret->num_elements = 0;
>> switch (ir->type->base_type) {
>> case GLSL_TYPE_UINT:
>>for (i = 0; i < total_elems; i++)
>> @@ -262,6 +264,8 @@ constant_copy(ir_constant *ir, void *mem_ctx)
>> case GLSL_TYPE_STRUCT:
>>ret->elements = ralloc_array(mem_ctx, nir_constant *,
>> ir->type->length);
>> +  ret->num_elements = ir->type->length;
>> +
>>i = 0;
>>foreach_in_list(ir_constant, field, &ir->components) {
>>   ret->elements[i] = constant_copy(field, mem_ctx);
>> @@ -272,6 +276,7 @@ constant_copy(ir_constant *ir, void *mem_ctx)
>> case GLSL_TYPE_ARRAY:
>>ret->elements = ralloc_array(mem_ctx, nir_constant *,
>> ir->type->length);
>> +  ret->num_elements = ir->type->length;
>>
>>for (i = 0; i < ir->type->length; i++)
>>   ret->elements[i] = constant_copy(ir->array_elements[i], mem_ctx);
>> @@ -293,6 +298,7 @@ nir_visitor::visit(ir_variable *ir)
>>
>> if (ir->is_interface_instance() && ir->get_max_ifc_array_access() != 
>> NULL) {
>>unsigned size = ir->get_interface_type()->length;
>> +  var->num_max_ifc_array_access = size;
>>var->max_ifc_array_access = ralloc_array(var, unsigned, size);
>>memcpy(var->max_ifc_array_access, ir->get_max_ifc_array_access(),
>>   size * sizeof(unsigned));
>> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>> index a09c2a6..2d9c94c 100644
>> --- a/src/glsl/nir/nir.h
>> +++ b/src/glsl/nir/nir.h
>> @@ -112,6 +112,11 @@ typedef struct nir_constant {
>>  */
>> union nir_constant_data value;
>>
>> +   /* we could get this from the var->type but makes clone *much* easier to
>> +* not have to care about the type.
>> +*/
>> +   unsigned num_elements;
>> +
>> /* Array elements / Structure Fields */
>> struct nir_constant **elements;
>>  } nir_constant;
>> @@ -148,6 +153,12 @@ typedef struct {
>>  */
>> char *name;
>>
>> +   /* we could figure this out from interface_type but it isn't exposed
>> +* cleanly outside of c++ and just having the length here simplifies
>> +* clone:
>> +*/
>> +   unsigned num_max_ifc_array_access;
>
> Can we just get rid of max_ifc_array_access instead?  It's something
> that GLSL uses for some optimizations somewhere but NIR has never made
> any use of it whatsoever and none of the backends care.  The only
> reason it's here is because Connor copied and pasted ir_variable into
> NIR variable.

I only kept it because I assumed at some point someone would want it
(ie. porting opt's done in IR over to NIR, etc).  But no problem for
me to drop it.

BR,
-R

> --Jason
>
>> +
>> /**
>>  * For variables which satisfy the is_interface_instance() predicate, 
>> this
>>  * points to an array of integers such that if the ith member of the
>> --
>> 2.5.0
>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Set correct field for indirect align16 addrimm.

2015-10-26 Thread Matt Turner
This has been wrong since the initial import of the i965 driver.
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index ebd811f..df48590 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -410,7 +410,7 @@ brw_set_src0(struct brw_codegen *p, brw_inst *inst, struct 
brw_reg reg)
  if (brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_1) {
 brw_inst_set_src0_ia1_addr_imm(devinfo, inst, 
reg.dw1.bits.indirect_offset);
 } else {
-brw_inst_set_src0_ia_subreg_nr(devinfo, inst, 
reg.dw1.bits.indirect_offset);
+brw_inst_set_src0_ia16_addr_imm(devinfo, inst, 
reg.dw1.bits.indirect_offset);
 }
   }
 
-- 
2.4.9

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/7] nir: add nir_var_all enum

2015-10-26 Thread Rob Clark
On Mon, Oct 26, 2015 at 1:56 PM, Jason Ekstrand  wrote:
> On Sat, Oct 24, 2015 at 10:07 AM, Rob Clark  wrote:
>> Otherwise, passing -1 gets you:
>>
>>   error: invalid conversion from 'int' to 'nir_variable_mode' [-fpermissive]
>
> Yeah, I've never really liked that.  Another option (which I think I'd
> marginally prefer) would be to make nir_lower_io take a bitfield of
> which variable types you want to lower.  Then you would pass in (1 <<
> nir_var_foo) for a specific one and ~0u for all.  That way you could
> also lower two at a time if you wanted.

or make enum nir_variable_mode itself a bitmask (and keep the enum
typed parameter).. if nothing else, that way gdb will display it in a
nicer way

BR,
-R

> --Jason
>
>>
>> Signed-off-by: Rob Clark 
>> ---
>>  src/glsl/nir/nir.c  | 4 
>>  src/glsl/nir/nir.h  | 1 +
>>  src/glsl/nir/nir_lower_io.c | 2 +-
>>  src/mesa/drivers/dri/i965/brw_nir.c | 3 ++-
>>  4 files changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
>> index 793bdaf..caa9ffc 100644
>> --- a/src/glsl/nir/nir.c
>> +++ b/src/glsl/nir/nir.c
>> @@ -107,6 +107,10 @@ void
>>  nir_shader_add_variable(nir_shader *shader, nir_variable *var)
>>  {
>> switch (var->data.mode) {
>> +   case nir_var_all:
>> +  assert(!"invalid mode");
>> +  break;
>> +
>> case nir_var_local:
>>assert(!"nir_shader_add_variable cannot be used for local variables");
>>break;
>> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>> index e3777f9..79c0666 100644
>> --- a/src/glsl/nir/nir.h
>> +++ b/src/glsl/nir/nir.h
>> @@ -82,6 +82,7 @@ typedef struct {
>>  } nir_state_slot;
>>
>>  typedef enum {
>> +   nir_var_all = -1,
>> nir_var_shader_in,
>> nir_var_shader_out,
>> nir_var_global,
>> diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
>> index 688b48f..b4ed857 100644
>> --- a/src/glsl/nir/nir_lower_io.c
>> +++ b/src/glsl/nir/nir_lower_io.c
>> @@ -176,7 +176,7 @@ nir_lower_io_block(nir_block *block, void *void_state)
>>
>>nir_variable_mode mode = intrin->variables[0]->var->data.mode;
>>
>> -  if (state->mode != -1 && state->mode != mode)
>> +  if (state->mode != nir_var_all && state->mode != mode)
>>   continue;
>>
>>switch (intrin->intrinsic) {
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
>> b/src/mesa/drivers/dri/i965/brw_nir.c
>> index 1b4dace..8f09165 100644
>> --- a/src/mesa/drivers/dri/i965/brw_nir.c
>> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
>> @@ -235,7 +235,8 @@ brw_create_nir(struct brw_context *brw,
>> nir_assign_var_locations(&nir->uniforms,
>>  &nir->num_uniforms,
>>  is_scalar ? type_size_scalar : type_size_vec4);
>> -   nir_lower_io(nir, -1, is_scalar ? type_size_scalar : type_size_vec4);
>> +   nir_lower_io(nir, nir_var_all,
>> +is_scalar ? type_size_scalar : type_size_vec4);
>> nir_validate_shader(nir);
>>
>> nir_remove_dead_variables(nir);
>> --
>> 2.5.0
>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/5] glsl: Mark gl_ViewportIndex and gl_Layer varyings as flat.

2015-10-26 Thread Ilia Mirkin
On Mon, Oct 26, 2015 at 2:03 PM, Kenneth Graunke  wrote:
> Integer varyings need to be flat qualified - all others were already.
> I think we just missed this.  Presumably some hardware passes this via
> sideband and ignores attribute interpolation, so no one has noticed.
>
> Signed-off-by: Kenneth Graunke 
> Cc: Chris Forbes 
> ---
>  src/glsl/builtin_variables.cpp | 39 ++-
>  1 file changed, 26 insertions(+), 13 deletions(-)
>
> diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
> index a6ad105..07aacd4 100644
> --- a/src/glsl/builtin_variables.cpp
> +++ b/src/glsl/builtin_variables.cpp
> @@ -887,16 +887,22 @@ builtin_variable_generator::generate_uniforms()
>  void
>  builtin_variable_generator::generate_vs_special_vars()
>  {
> +   ir_variable *var;
> +
> if (state->is_version(130, 300))
>add_system_value(SYSTEM_VALUE_VERTEX_ID, int_t, "gl_VertexID");
> if (state->ARB_draw_instanced_enable)
>add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceIDARB");
> if (state->ARB_draw_instanced_enable || state->is_version(140, 300))
>add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceID");
> -   if (state->AMD_vertex_shader_layer_enable)
> -  add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> -   if (state->AMD_vertex_shader_viewport_index_enable)
> -  add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> +   if (state->AMD_vertex_shader_layer_enable) {
> +  var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> +  var->data.interpolation = INTERP_QUALIFIER_FLAT;

Not that I *strongly* prefer this, but what we'd do in nouveau/codegen
code (and is IMHO more concise):

add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer")
   ->data.interpolation = INTERP_QUALIFIER_FLAT;

Alternatively, you could just solve this in add_output which should
automatically set the flat interpolation for int types...

FTR I don't strongly prefer either of the above to your solution, just
wanted to point out some alternatives in case you like them more.

> +   }
> +   if (state->AMD_vertex_shader_viewport_index_enable) {
> +  var = add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> +  var->data.interpolation = INTERP_QUALIFIER_FLAT;
> +   }
> if (compatibility) {
>add_input(VERT_ATTRIB_POS, vec4_t, "gl_Vertex");
>add_input(VERT_ATTRIB_NORMAL, vec3_t, "gl_Normal");
> @@ -954,11 +960,16 @@ builtin_variable_generator::generate_tes_special_vars()
>  void
>  builtin_variable_generator::generate_gs_special_vars()
>  {
> -   add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> -   if (state->is_version(410, 0) || state->ARB_viewport_array_enable)
> -  add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> +   ir_variable *var;
> +
> +   var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> +   var->data.interpolation = INTERP_QUALIFIER_FLAT;
> +   if (state->is_version(410, 0) || state->ARB_viewport_array_enable) {
> +  var = add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> +  var->data.interpolation = INTERP_QUALIFIER_FLAT;
> +   }
> if (state->is_version(400, 0) || state->ARB_gpu_shader5_enable)
> -  add_system_value(SYSTEM_VALUE_INVOCATION_ID, int_t, "gl_InvocationID");
> +  var = add_system_value(SYSTEM_VALUE_INVOCATION_ID, int_t, 
> "gl_InvocationID");
>
> /* Although gl_PrimitiveID appears in tessellation control and 
> tessellation
>  * evaluation shaders, it has a different function there than it has in
> @@ -970,7 +981,6 @@ builtin_variable_generator::generate_gs_special_vars()
>  * the specific case of gl_PrimitiveIDIn.  So we don't need to treat
>  * gl_PrimitiveIDIn as an {ARB,EXT}_geometry_shader4-only variable.
>  */
> -   ir_variable *var;
> var = add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveIDIn");
> var->data.interpolation = INTERP_QUALIFIER_FLAT;
> var = add_output(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");
> @@ -984,14 +994,15 @@ builtin_variable_generator::generate_gs_special_vars()
>  void
>  builtin_variable_generator::generate_fs_special_vars()
>  {
> +   ir_variable *var;
> +
> add_input(VARYING_SLOT_POS, vec4_t, "gl_FragCoord");
> add_input(VARYING_SLOT_FACE, bool_t, "gl_FrontFacing");
> if (state->is_version(120, 100))
>add_input(VARYING_SLOT_PNTC, vec2_t, "gl_PointCoord");
>
> if (state->is_version(150, 0)) {
> -  ir_variable *var =
> - add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");
> +  var = add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");
>var->data.interpolation = INTERP_QUALIFIER_FLAT;
> }
>
> @@ -1043,8 +1054,10 @@ builtin_variable_generator::generate_fs_special_vars()
> }
>
> if (state->is_version(430, 0) || 
> state->ARB_fragment_layer_viewport_enable) {
> -  add_input(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> -  add_input(VARY

Re: [Mesa-dev] [PATCH 4.5/7] nir: add list_head to nir_src::ssa

2015-10-26 Thread Rob Clark
On Mon, Oct 26, 2015 at 2:01 PM, Jason Ekstrand  wrote:
> On Mon, Oct 26, 2015 at 8:26 AM, Rob Clark  wrote:
>> From: Rob Clark 
>>
>> This will be used by clone.  Possibly useful elsewhere.  The list link
>> will only be valid in ssa case, it fits in the padding in the union
>> left from the larger nir_reg_src.
>
> We already have a use_link.  It seems perfectly reasonable to me to
> overload it in the middle of a copy operation.

I guess, only since we resolve src->ssa ptrs before
nir_add_defs_uses(), that could work..

BR,
-R


> --Jason
>
>> Signed-off-by: Rob Clark 
>> ---
>>  src/glsl/nir/nir.h | 7 ++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>> index 2d9c94c..fb60340 100644
>> --- a/src/glsl/nir/nir.h
>> +++ b/src/glsl/nir/nir.h
>> @@ -528,7 +528,12 @@ typedef struct nir_src {
>>
>> union {
>>nir_reg_src reg;
>> -  nir_ssa_def *ssa;
>> +  struct {
>> + /* used in clone to track unresolved ssa src's: */
>> + struct list_head link;
>> +
>> + nir_ssa_def *ssa;
>> +  };
>> };
>>
>> bool is_ssa;
>> --
>> 2.5.0
>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v6 2/2] mesa/teximage: accept ASTC formats for 3D texture specification

2015-10-26 Thread Neil Roberts
Nanley Chery  writes:

> +  /* Throw an INVALID_OPERATION error if the target is
> +   * TEXTURE_CUBE_MAP_ARRAY and the format is not ASTC.
> +   */
> +  if (target_can_be_compresed &&
> +  ctx->Extensions.KHR_texture_compression_astc_ldr &&
> +  layout != MESA_FORMAT_LAYOUT_ASTC)
>   return write_error(error, GL_INVALID_OPERATION);

This seems to cause regressions in the following Piglit tests for Gen9:

piglit.spec.ext_texture_compression_s3tc.getteximage-targets cube_array s3tc
piglit.spec.arb_texture_cube_map_array.fbo-generatemipmap-cubemap array 
s3tc_dxt1

I think the spec for the extension is based on the GLES spec. Perhaps
the intention was to add ASTC support for cube-map array textures
whereas previously in GLES no compressed formats were supported for this
target. However in regular GL presumably S3TC is supposed to be
supported. As it stands this patch disables cube-map array textures for
any formats other than ASTC whenever the ASTC extension is available, so
it disables S3TC on Gen9.

Maybe this section should be limited to GLES?

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v6 2/2] mesa/teximage: accept ASTC formats for 3D texture specification

2015-10-26 Thread Ilia Mirkin
On Mon, Oct 26, 2015 at 2:15 PM, Neil Roberts  wrote:
> Nanley Chery  writes:
>
>> +  /* Throw an INVALID_OPERATION error if the target is
>> +   * TEXTURE_CUBE_MAP_ARRAY and the format is not ASTC.
>> +   */
>> +  if (target_can_be_compresed &&
>> +  ctx->Extensions.KHR_texture_compression_astc_ldr &&
>> +  layout != MESA_FORMAT_LAYOUT_ASTC)
>>   return write_error(error, GL_INVALID_OPERATION);
>
> This seems to cause regressions in the following Piglit tests for Gen9:
>
> piglit.spec.ext_texture_compression_s3tc.getteximage-targets cube_array s3tc
> piglit.spec.arb_texture_cube_map_array.fbo-generatemipmap-cubemap array 
> s3tc_dxt1
>
> I think the spec for the extension is based on the GLES spec. Perhaps
> the intention was to add ASTC support for cube-map array textures
> whereas previously in GLES no compressed formats were supported for this
> target. However in regular GL presumably S3TC is supposed to be
> supported. As it stands this patch disables cube-map array textures for
> any formats other than ASTC whenever the ASTC extension is available, so
> it disables S3TC on Gen9.
>
> Maybe this section should be limited to GLES?

I'm having trouble parsing what you're saying, but just want to
confirm that s3tc is definitely supposed to work on cubemaps, at least
in desktop GL.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] winsys/amdgpu: remove the dcc_enable surface flag

2015-10-26 Thread Nicolai Hähnle

On 26.10.2015 11:41, Marek Olšák wrote:

From: Marek Olšák 

dcc_size is sufficient and doesn't need a further comment in my opinion.
---
  src/gallium/drivers/radeon/r600_texture.c  |  3 +--
  src/gallium/drivers/radeon/radeon_winsys.h |  1 -
  src/gallium/winsys/amdgpu/drm/amdgpu_surface.c | 13 ++---
  3 files changed, 7 insertions(+), 10 deletions(-)


Agreed, this is an even better solution.

Reviewed-by: Nicolai Hähnle 



diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 789c66f..edfdfe3 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -641,9 +641,8 @@ r600_texture_create_object(struct pipe_screen *screen,
return NULL;
}
}
-   if (rtex->surface.dcc_enabled) {
+   if (rtex->surface.dcc_size)
vi_texture_alloc_dcc_separate(rscreen, rtex);
-   }
}

/* Now create the backing buffer. */
diff --git a/src/gallium/drivers/radeon/radeon_winsys.h 
b/src/gallium/drivers/radeon/radeon_winsys.h
index 0178643..8bf1e15 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -371,7 +371,6 @@ struct radeon_surf {

  uint64_tdcc_size;
  uint64_tdcc_alignment;
-booldcc_enabled;
  };

  struct radeon_bo_list_item {
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
index b442174..3006bd1 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
@@ -251,7 +251,7 @@ static int compute_level(struct amdgpu_winsys *ws,

 surf->bo_size = surf_level->offset + AddrSurfInfoOut->surfSize;

-   if (surf->dcc_enabled) {
+   if (AddrSurfInfoIn->flags.dccCompatible) {
AddrDccIn->colorSurfSize = AddrSurfInfoOut->surfSize;
AddrDccIn->tileMode = AddrSurfInfoOut->tileMode;
AddrDccIn->tileInfo = *AddrSurfInfoOut->pTileInfo;
@@ -267,10 +267,11 @@ static int compute_level(struct amdgpu_winsys *ws,
   surf->dcc_size = surf_level->dcc_offset + AddrDccOut->dccRamSize;
   surf->dcc_alignment = MAX2(surf->dcc_alignment, 
AddrDccOut->dccRamBaseAlign);
} else {
- surf->dcc_enabled = false;
+ surf->dcc_size = 0;
   surf_level->dcc_offset = 0;
}
 } else {
+  surf->dcc_size = 0;
surf_level->dcc_offset = 0;
 }

@@ -354,10 +355,6 @@ static int amdgpu_surface_init(struct radeon_winsys *rws,
 AddrDccIn.numSamples = AddrSurfInfoIn.numSamples = surf->nsamples;
 AddrSurfInfoIn.tileIndex = -1;

-   surf->dcc_enabled =  !(surf->flags & RADEON_SURF_Z_OR_SBUFFER) &&
-!(surf->flags & RADEON_SURF_SCANOUT) &&
-!compressed && AddrDccIn.numSamples <= 1;
-
 /* Set the micro tile type. */
 if (surf->flags & RADEON_SURF_SCANOUT)
AddrSurfInfoIn.tileType = ADDR_DISPLAYABLE;
@@ -373,7 +370,9 @@ static int amdgpu_surface_init(struct radeon_winsys *rws,
 AddrSurfInfoIn.flags.display = (surf->flags & RADEON_SURF_SCANOUT) != 0;
 AddrSurfInfoIn.flags.pow2Pad = surf->last_level > 0;
 AddrSurfInfoIn.flags.degrade4Space = 1;
-   AddrSurfInfoIn.flags.dccCompatible = surf->dcc_enabled;
+   AddrSurfInfoIn.flags.dccCompatible = !(surf->flags & RADEON_SURF_Z_OR_SBUFFER) 
&&
+!(surf->flags & RADEON_SURF_SCANOUT) &&
+!compressed && AddrDccIn.numSamples <= 
1;

 /* This disables incorrect calculations (hacks) in addrlib. */
 AddrSurfInfoIn.flags.noStencil = 1;



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/7] nir: support to clone shaders (v2)

2015-10-26 Thread Jason Ekstrand
On Mon, Oct 26, 2015 at 8:27 AM, Rob Clark  wrote:
> Signed-off-by: Rob Clark 
> ---
>  src/glsl/Makefile.sources |   1 +
>  src/glsl/nir/nir.c|   8 +
>  src/glsl/nir/nir.h|   2 +
>  src/glsl/nir/nir_clone.c  | 949 
> ++
>  4 files changed, 960 insertions(+)
>  create mode 100644 src/glsl/nir/nir_clone.c
>
> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index ca87036..25e3801 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -26,6 +26,7 @@ NIR_FILES = \
> nir/nir.h \
> nir/nir_array.h \
> nir/nir_builder.h \
> +   nir/nir_clone.c \
> nir/nir_constant_expressions.h \
> nir/nir_control_flow.c \
> nir/nir_control_flow.h \
> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> index 0cbe4e1..2defa36 100644
> --- a/src/glsl/nir/nir.c
> +++ b/src/glsl/nir/nir.c
> @@ -316,6 +316,14 @@ nir_block_create(nir_shader *shader)
> block->predecessors = _mesa_set_create(block, _mesa_hash_pointer,
>_mesa_key_pointer_equal);
> block->imm_dom = NULL;
> +   /* XXX maybe it would be worth it to defer allocation?  This
> +* way it doesn't get allocated for shader ref's that never run
> +* nir_calc_dominance?  For example, state-tracker creates an
> +* initial IR, clones that, runs appropriate lowering pass, passes
> +* to driver which does common lowering/opt, and then stores ref
> +* which is later used to do state specific lowering and futher
> +* opt.  Do any of the references not need dominance metadata?
> +*/

Yes, we should defer this.  The dominance frontier is a product of the
dominance calculations (as are the dominance arrays) so we should
allocate it there.

> block->dom_frontier = _mesa_set_create(block, _mesa_hash_pointer,
>_mesa_key_pointer_equal);
>
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index fb60340..efcbf62 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -1897,6 +1897,8 @@ void nir_index_blocks(nir_function_impl *impl);
>  void nir_print_shader(nir_shader *shader, FILE *fp);
>  void nir_print_instr(const nir_instr *instr, FILE *fp);
>
> +nir_shader * nir_shader_clone(void *mem_ctx, const nir_shader *s);
> +
>  #ifdef DEBUG
>  void nir_validate_shader(nir_shader *shader);
>  #else
> diff --git a/src/glsl/nir/nir_clone.c b/src/glsl/nir/nir_clone.c
> new file mode 100644
> index 000..5c1739e
> --- /dev/null
> +++ b/src/glsl/nir/nir_clone.c
> @@ -0,0 +1,949 @@
> +/*
> + * Copyright © 2015 Red Hat
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "nir.h"
> +#include "nir_control_flow_private.h"
> +
> +// TODO move these:
> +#define exec_list_head(type, list, node) \
> +   exec_node_data(type, exec_list_get_head(list), node)
> +#define exec_list_next(type, nodeptr, node) \
> +   exec_node_data(type, exec_node_get_next(nodeptr), node)
> +
> +/* Secret Decoder Ring:
> + *   clone_foo():
> + *Allocate and clone a foo.
> + *   __clone_foo():
> + *Clone body of foo (ie. parent class, embedded struct,
> + *etc)
> + *   __clone_foo_p2():
> + *clone body of foo, pass 2.. since in first pass we
> + *can have fwd references to embedded structs, some
> + *ptrs (and things that depend on them) must be
> + *resolved in 2nd pass
> + */
> +
> +typedef struct {
> +   /* maps orig ptr -> cloned ptr: */
> +   struct hash_table *ptr_table;
> +   /* list of unresolved ssa src's: */
> +   struct list_head ssa_src_list;
> +
> +   /* memctx for new toplevel shader object: */
> +   void *mem_ctx;
> +   /* new shader object, used as memctx for just about everything else: */
> +   nir_shader *ns;
> +} clone_state;
> +
> +typedef void *(*

Re: [Mesa-dev] [PATCH v6 2/2] mesa/teximage: accept ASTC formats for 3D texture specification

2015-10-26 Thread Roland Scheidegger
Am 26.10.2015 um 19:31 schrieb Ilia Mirkin:
> On Mon, Oct 26, 2015 at 2:15 PM, Neil Roberts  wrote:
>> Nanley Chery  writes:
>>
>>> +  /* Throw an INVALID_OPERATION error if the target is
>>> +   * TEXTURE_CUBE_MAP_ARRAY and the format is not ASTC.
>>> +   */
>>> +  if (target_can_be_compresed &&
>>> +  ctx->Extensions.KHR_texture_compression_astc_ldr &&
>>> +  layout != MESA_FORMAT_LAYOUT_ASTC)
>>>   return write_error(error, GL_INVALID_OPERATION);
>>
>> This seems to cause regressions in the following Piglit tests for Gen9:
>>
>> piglit.spec.ext_texture_compression_s3tc.getteximage-targets cube_array s3tc
>> piglit.spec.arb_texture_cube_map_array.fbo-generatemipmap-cubemap array 
>> s3tc_dxt1
>>
>> I think the spec for the extension is based on the GLES spec. Perhaps
>> the intention was to add ASTC support for cube-map array textures
>> whereas previously in GLES no compressed formats were supported for this
>> target. However in regular GL presumably S3TC is supposed to be
>> supported. As it stands this patch disables cube-map array textures for
>> any formats other than ASTC whenever the ASTC extension is available, so
>> it disables S3TC on Gen9.
>>
>> Maybe this section should be limited to GLES?
> 
> I'm having trouble parsing what you're saying, but just want to
> confirm that s3tc is definitely supposed to work on cubemaps, at least
> in desktop GL.
> 

FWIW compressed textures (all formats) not working with cube map
_arrays_ is imho a spec bug in both GL and GLES (in GLES it's just more
explicit with newest spec as it has a table listing all the compressed
formats NOT working for cube map arrays). Seems to be a result of the
compressed formats being developed when cube map arrays weren't arround
thus most of the wording in the spec now only allows them for 2d, 2d
array, cube maps, but not cube map arrays, which totally doesn't make
sense. s3tc itself (in contrast to rgtc etc.) doesn't say anything wrt
to cubemap arrays (spec was updated for 2d arrays, but not cubemap
arrays, thus you could theoretically conclude one way or another, albeit
as I said not allowing them makes no sense imho just like it doesn't for
the other formats).
I've filed bugs for this at khronos bugtracker and it looks like at
least for gles it's going to get fixed (no word on GL, unfortunately).
Of course, binary blobs ignored this bit since forever already, and mesa
was changed to do the same too. Thus imho every test expecting cube map
targets to fail for compressed formats should be ignored.

Roland

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] nir: add couple array length fields

2015-10-26 Thread Jason Ekstrand
On Mon, Oct 26, 2015 at 11:06 AM, Rob Clark  wrote:
> On Mon, Oct 26, 2015 at 1:52 PM, Jason Ekstrand  wrote:
>> On Sat, Oct 24, 2015 at 10:07 AM, Rob Clark  wrote:
>>> From: Rob Clark 
>>>
>>> This will simplify things somewhat in clone.
>>>
>>> Signed-off-by: Rob Clark 
>>> ---
>>>  src/glsl/nir/glsl_to_nir.cpp |  6 ++
>>>  src/glsl/nir/nir.h   | 11 +++
>>>  2 files changed, 17 insertions(+)
>>>
>>> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
>>> index 9b50a93..8f83012 100644
>>> --- a/src/glsl/nir/glsl_to_nir.cpp
>>> +++ b/src/glsl/nir/glsl_to_nir.cpp
>>> @@ -238,6 +238,8 @@ constant_copy(ir_constant *ir, void *mem_ctx)
>>>
>>> unsigned total_elems = ir->type->components();
>>> unsigned i;
>>> +
>>> +   ret->num_elements = 0;
>>> switch (ir->type->base_type) {
>>> case GLSL_TYPE_UINT:
>>>for (i = 0; i < total_elems; i++)
>>> @@ -262,6 +264,8 @@ constant_copy(ir_constant *ir, void *mem_ctx)
>>> case GLSL_TYPE_STRUCT:
>>>ret->elements = ralloc_array(mem_ctx, nir_constant *,
>>> ir->type->length);
>>> +  ret->num_elements = ir->type->length;
>>> +
>>>i = 0;
>>>foreach_in_list(ir_constant, field, &ir->components) {
>>>   ret->elements[i] = constant_copy(field, mem_ctx);
>>> @@ -272,6 +276,7 @@ constant_copy(ir_constant *ir, void *mem_ctx)
>>> case GLSL_TYPE_ARRAY:
>>>ret->elements = ralloc_array(mem_ctx, nir_constant *,
>>> ir->type->length);
>>> +  ret->num_elements = ir->type->length;
>>>
>>>for (i = 0; i < ir->type->length; i++)
>>>   ret->elements[i] = constant_copy(ir->array_elements[i], mem_ctx);
>>> @@ -293,6 +298,7 @@ nir_visitor::visit(ir_variable *ir)
>>>
>>> if (ir->is_interface_instance() && ir->get_max_ifc_array_access() != 
>>> NULL) {
>>>unsigned size = ir->get_interface_type()->length;
>>> +  var->num_max_ifc_array_access = size;
>>>var->max_ifc_array_access = ralloc_array(var, unsigned, size);
>>>memcpy(var->max_ifc_array_access, ir->get_max_ifc_array_access(),
>>>   size * sizeof(unsigned));
>>> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>>> index a09c2a6..2d9c94c 100644
>>> --- a/src/glsl/nir/nir.h
>>> +++ b/src/glsl/nir/nir.h
>>> @@ -112,6 +112,11 @@ typedef struct nir_constant {
>>>  */
>>> union nir_constant_data value;
>>>
>>> +   /* we could get this from the var->type but makes clone *much* easier to
>>> +* not have to care about the type.
>>> +*/
>>> +   unsigned num_elements;
>>> +
>>> /* Array elements / Structure Fields */
>>> struct nir_constant **elements;
>>>  } nir_constant;
>>> @@ -148,6 +153,12 @@ typedef struct {
>>>  */
>>> char *name;
>>>
>>> +   /* we could figure this out from interface_type but it isn't exposed
>>> +* cleanly outside of c++ and just having the length here simplifies
>>> +* clone:
>>> +*/
>>> +   unsigned num_max_ifc_array_access;
>>
>> Can we just get rid of max_ifc_array_access instead?  It's something
>> that GLSL uses for some optimizations somewhere but NIR has never made
>> any use of it whatsoever and none of the backends care.  The only
>> reason it's here is because Connor copied and pasted ir_variable into
>> NIR variable.
>
> I only kept it because I assumed at some point someone would want it
> (ie. porting opt's done in IR over to NIR, etc).  But no problem for
> me to drop it.

That's a really bad reason to keep dead code... We'll add something
back in if/when we need it.

> BR,
> -R
>
>> --Jason
>>
>>> +
>>> /**
>>>  * For variables which satisfy the is_interface_instance() predicate, 
>>> this
>>>  * points to an array of integers such that if the ith member of the
>>> --
>>> 2.5.0
>>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/2] Nir: Allow CSE of SSBO loads

2015-10-26 Thread Jason Ekstrand
On Mon, Oct 26, 2015 at 12:19 AM, Iago Toral  wrote:
> On Fri, 2015-10-23 at 09:26 -0700, Jason Ekstrand wrote:
>> On Thu, Oct 22, 2015 at 11:13 PM, Iago Toral  wrote:
>> > On Thu, 2015-10-22 at 09:09 -0700, Jason Ekstrand wrote:
>> >> On Thu, Oct 22, 2015 at 4:21 AM, Iago Toral Quiroga  
>> >> wrote:
>> >> > I implemented this first as a separate optimization pass in GLSL IR 
>> >> > [1], but
>> >> > Curro pointed out that this being pretty much a restricted form of a 
>> >> > CSE pass
>> >> > it would probably make more sense to do it inside CSE (and we no longer 
>> >> > have
>> >> > a CSE pass in GLSL IR).
>> >> >
>> >> > Unlike other things we CSE in NIR, in the case of SSBO loads we need to 
>> >> > make
>> >> > sure that we invalidate previous entries in the set in the presence of
>> >> > conflicting instructions (i.e. SSBO writes to the same block and 
>> >> > offset) or
>> >> > in the presence of memory barriers.
>> >> >
>> >> > If this is accepted I intend to extend this to also cover image reads, 
>> >> > which
>> >> > follow similar behavior.
>> >> >
>> >> > No regressions observed in piglit or dEQP's SSBO functional tests.
>> >> >
>> >> > [1] 
>> >> > http://lists.freedesktop.org/archives/mesa-dev/2015-October/097718.html
>> >>
>> >> I think you've gotten enough NAK's that I don't need to chime in
>> >> there.  Unfortunately, solving this in general is something of a
>> >> research project that both Connor and I have been thinking about for
>> >> quite some time.  I've been thinking off-and-on about how to add a
>> >> proper memory model to lower_vars_to_ssa for almost a year now and
>> >> still haven't come up with a good way to do it.  I don't know whether
>> >> SSBO's would be simpler or not.  We need a proper memory model for
>> >> both lower_vars_to_ssa and SSBO load/stores (and shared local
>> >> variables) but it's a substantial research project.
>> >>
>> >> This isn't to say that you couldn't do it.  Just know what you're taking 
>> >> on. ;-)
>> >
>> > Yeah, it does not make sense that I try to do this, you guys have
>> > clearly given this much more thought than me and know much better how a
>> > solution for this would fit in NIR than me.
>> >
>> >> That said, here's a suggestion for something that we *could* write
>> >> today, wouldn't be very hard, and wold solve a decent number of cases.
>> >>
>> >> For each block:
>> >>
>> >> 1) Create a new instruction set (don't use anything from any previous 
>> >> blocks)
>> >> 2) call add_or_rewrite on all ssbo load operations
>> >> 3) If you ever see a barrier or ssbo store, destroy the entire
>> >> instruction set and start again.
>> >
>> > Yep, this is what I was thinking for the load-combine pass that Connor
>> > suggested. However, I think that in this case we do not need to destroy
>> > the entire set when we find a store, only for memory barriers, right? I
>> > mean, there should be nothing preventing us from checking the
>> > offset/block of the store and compare it with the offset/block of the
>> > loads in the set to decide which ones we need to remove (like I was
>> > doing in my last patch)
>>
>> That's where you get into the "special casing" I mentioned below.  If
>> you have an direct store, you would have to throw away any indirect
>> loads
>
> Yes.
>
>> and then insert a fake direct load for the given offset.
>
> Actually, what I am doing is a bit different:
>
> When I see stores, I also insert them in the hash table (but I never
> rewrite stores). Then, when I see see a load, I check for a match, if I
> have it, I use it, if not, I check if I have a store to the same offset,
> and If I do, I just use that, no need to fake anything. Of course, if I
> do this, in order to check if I have a compatible store I have to
> traverse the hash table looking for a match, but I think that should be
> okay in this case, since that only has load/store operations and only
> the ones in the current block, so I think it should be okay.
>
> Does this seem like a reasonable alternative?

Yeah, that's fine.

>>   If you
>> have an indirect store, you would have to throw away everything and
>> then insert a fake indirect load for the given offset.  So, yes, you
>> can do it, but it'll take a little more work.
>
> Yeah, mostly because there are also atomics to consider and then you
> also have to check that the stores write to all the components we read
> before we reuse them, etc.
>
>>   You'll also probably
>> want to use a different hashing function than we use for CSE because
>> you need to be able to handle both real and fake loads and have them
>> compare/hash the same.
>>
>> Does that make sense?
>> --Jason
>>
>> >> This is something you could put together fairly quickly and would
>> >> handle a fair number of cases.  With a little special casing, you may
>> >> also be able to handle store and then an immediate load of the same
>> >> value or duplicate stores.  Anything much more complex than that is
>> >> going to take a lot more thought

Re: [Mesa-dev] [PATCH v6 2/2] mesa/teximage: accept ASTC formats for 3D texture specification

2015-10-26 Thread Nanley Chery
On Mon, Oct 26, 2015 at 11:15 AM, Neil Roberts  wrote:

> Nanley Chery  writes:
>
> > +  /* Throw an INVALID_OPERATION error if the target is
> > +   * TEXTURE_CUBE_MAP_ARRAY and the format is not ASTC.
> > +   */
> > +  if (target_can_be_compresed &&
> > +  ctx->Extensions.KHR_texture_compression_astc_ldr &&
> > +  layout != MESA_FORMAT_LAYOUT_ASTC)
> >   return write_error(error, GL_INVALID_OPERATION);
>
> This seems to cause regressions in the following Piglit tests for Gen9:
>
> piglit.spec.ext_texture_compression_s3tc.getteximage-targets cube_array
> s3tc
> piglit.spec.arb_texture_cube_map_array.fbo-generatemipmap-cubemap array
> s3tc_dxt1
>
> I think the spec for the extension is based on the GLES spec. Perhaps
> the intention was to add ASTC support for cube-map array textures
> whereas previously in GLES no compressed formats were supported for this
> target. However in regular GL presumably S3TC is supposed to be
> supported. As it stands this patch disables cube-map array textures for
> any formats other than ASTC whenever the ASTC extension is available, so
> it disables S3TC on Gen9.
>
> Maybe this section should be limited to GLES?
>
>
Limiting this to GLES sounds like a good solution. Note that there is
similar logic for the
TEXTURE_3D target further down.

Nanley


> Regards,
> - Neil
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v6 2/2] mesa/teximage: accept ASTC formats for 3D texture specification

2015-10-26 Thread Nanley Chery
On Mon, Oct 26, 2015 at 11:53 AM, Roland Scheidegger 
wrote:

> Am 26.10.2015 um 19:31 schrieb Ilia Mirkin:
> > On Mon, Oct 26, 2015 at 2:15 PM, Neil Roberts 
> wrote:
> >> Nanley Chery  writes:
> >>
> >>> +  /* Throw an INVALID_OPERATION error if the target is
> >>> +   * TEXTURE_CUBE_MAP_ARRAY and the format is not ASTC.
> >>> +   */
> >>> +  if (target_can_be_compresed &&
> >>> +  ctx->Extensions.KHR_texture_compression_astc_ldr &&
> >>> +  layout != MESA_FORMAT_LAYOUT_ASTC)
> >>>   return write_error(error, GL_INVALID_OPERATION);
> >>
> >> This seems to cause regressions in the following Piglit tests for Gen9:
> >>
> >> piglit.spec.ext_texture_compression_s3tc.getteximage-targets cube_array
> s3tc
> >> piglit.spec.arb_texture_cube_map_array.fbo-generatemipmap-cubemap array
> s3tc_dxt1
> >>
> >> I think the spec for the extension is based on the GLES spec. Perhaps
> >> the intention was to add ASTC support for cube-map array textures
> >> whereas previously in GLES no compressed formats were supported for this
> >> target. However in regular GL presumably S3TC is supposed to be
> >> supported. As it stands this patch disables cube-map array textures for
> >> any formats other than ASTC whenever the ASTC extension is available, so
> >> it disables S3TC on Gen9.
> >>
> >> Maybe this section should be limited to GLES?
> >
> > I'm having trouble parsing what you're saying, but just want to
> > confirm that s3tc is definitely supposed to work on cubemaps, at least
> > in desktop GL.
> >
>
> FWIW compressed textures (all formats) not working with cube map
> _arrays_ is imho a spec bug in both GL and GLES (in GLES it's just more
> explicit with newest spec as it has a table listing all the compressed
> formats NOT working for cube map arrays). Seems to be a result of the
> compressed formats being developed when cube map arrays weren't arround
> thus most of the wording in the spec now only allows them for 2d, 2d
> array, cube maps, but not cube map arrays, which totally doesn't make
> sense. s3tc itself (in contrast to rgtc etc.) doesn't say anything wrt
> to cubemap arrays (spec was updated for 2d arrays, but not cubemap
> arrays, thus you could theoretically conclude one way or another, albeit
> as I said not allowing them makes no sense imho just like it doesn't for
> the other formats).
>
I've filed bugs for this at khronos bugtracker and it looks like at
> least for gles it's going to get fixed (no word on GL, unfortunately).
> Of course, binary blobs ignored this bit since forever already, and mesa
> was changed to do the same too. Thus imho every test expecting cube map
> targets to fail for compressed formats should be ignored.
>
>
Thank you for filing the spec bugs.

Nanley
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/7] nir: support to clone shaders (v2)

2015-10-26 Thread Rob Clark
On Mon, Oct 26, 2015 at 2:44 PM, Jason Ekstrand  wrote:
> On Mon, Oct 26, 2015 at 8:27 AM, Rob Clark  wrote:
>> Signed-off-by: Rob Clark 
>> ---
>>  src/glsl/Makefile.sources |   1 +
>>  src/glsl/nir/nir.c|   8 +
>>  src/glsl/nir/nir.h|   2 +
>>  src/glsl/nir/nir_clone.c  | 949 
>> ++
>>  4 files changed, 960 insertions(+)
>>  create mode 100644 src/glsl/nir/nir_clone.c
>>
>> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
>> index ca87036..25e3801 100644
>> --- a/src/glsl/Makefile.sources
>> +++ b/src/glsl/Makefile.sources
>> @@ -26,6 +26,7 @@ NIR_FILES = \
>> nir/nir.h \
>> nir/nir_array.h \
>> nir/nir_builder.h \
>> +   nir/nir_clone.c \
>> nir/nir_constant_expressions.h \
>> nir/nir_control_flow.c \
>> nir/nir_control_flow.h \
>> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
>> index 0cbe4e1..2defa36 100644
>> --- a/src/glsl/nir/nir.c
>> +++ b/src/glsl/nir/nir.c
>> @@ -316,6 +316,14 @@ nir_block_create(nir_shader *shader)
>> block->predecessors = _mesa_set_create(block, _mesa_hash_pointer,
>>_mesa_key_pointer_equal);
>> block->imm_dom = NULL;
>> +   /* XXX maybe it would be worth it to defer allocation?  This
>> +* way it doesn't get allocated for shader ref's that never run
>> +* nir_calc_dominance?  For example, state-tracker creates an
>> +* initial IR, clones that, runs appropriate lowering pass, passes
>> +* to driver which does common lowering/opt, and then stores ref
>> +* which is later used to do state specific lowering and futher
>> +* opt.  Do any of the references not need dominance metadata?
>> +*/
>
> Yes, we should defer this.  The dominance frontier is a product of the
> dominance calculations (as are the dominance arrays) so we should
> allocate it there.
>
>> block->dom_frontier = _mesa_set_create(block, _mesa_hash_pointer,
>>_mesa_key_pointer_equal);
>>
>> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>> index fb60340..efcbf62 100644
>> --- a/src/glsl/nir/nir.h
>> +++ b/src/glsl/nir/nir.h
>> @@ -1897,6 +1897,8 @@ void nir_index_blocks(nir_function_impl *impl);
>>  void nir_print_shader(nir_shader *shader, FILE *fp);
>>  void nir_print_instr(const nir_instr *instr, FILE *fp);
>>
>> +nir_shader * nir_shader_clone(void *mem_ctx, const nir_shader *s);
>> +
>>  #ifdef DEBUG
>>  void nir_validate_shader(nir_shader *shader);
>>  #else
>> diff --git a/src/glsl/nir/nir_clone.c b/src/glsl/nir/nir_clone.c
>> new file mode 100644
>> index 000..5c1739e
>> --- /dev/null
>> +++ b/src/glsl/nir/nir_clone.c
>> @@ -0,0 +1,949 @@
>> +/*
>> + * Copyright © 2015 Red Hat
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the 
>> "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>> OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
>> DEALINGS
>> + * IN THE SOFTWARE.
>> + */
>> +
>> +#include "nir.h"
>> +#include "nir_control_flow_private.h"
>> +
>> +// TODO move these:
>> +#define exec_list_head(type, list, node) \
>> +   exec_node_data(type, exec_list_get_head(list), node)
>> +#define exec_list_next(type, nodeptr, node) \
>> +   exec_node_data(type, exec_node_get_next(nodeptr), node)
>> +
>> +/* Secret Decoder Ring:
>> + *   clone_foo():
>> + *Allocate and clone a foo.
>> + *   __clone_foo():
>> + *Clone body of foo (ie. parent class, embedded struct,
>> + *etc)
>> + *   __clone_foo_p2():
>> + *clone body of foo, pass 2.. since in first pass we
>> + *can have fwd references to embedded structs, some
>> + *ptrs (and things that depend on them) must be
>> + *resolved in 2nd pass
>> + */
>> +
>> +typedef struct {
>> +   /* maps orig ptr -> cloned ptr: */
>> +   struct hash_table *ptr_table;
>> +   /* list of unresolved ssa src's: */
>> +   struct list_head ssa_src_list;
>> +
>> +   /* memctx for ne

Re: [Mesa-dev] [PATCH] r600: Fix special negative immediate constants when using ABS modifier.

2015-10-26 Thread Nicolai Hähnle

Hi Ivan,

On 25.10.2015 02:00, Ivan Kalvachev wrote:

Some constants (like 1.0 and 0.5) could be inlined as immediate inputs
without using their literal value. The r600_bytecode_special_constants()
function emulates the negative of these constants by using NEG modifier.

However some shaders define -1.0 constant and want to use it as 1.0.
They do so by using ABS modifier. But r600_bytecode_special_constants()
set NEG in addition to ABS. Since NEG modifier have priority over ABS one,
we get -|1.0| as result, instead of |1.0|.

The patch simply prevents the additional switching of NEG when ABS is set.


Nice catch. Is there a simple test case (e.g. in piglit) that exposes 
the incorrect behavior?



Signed-off-by: Ivan Kalvachev 
---
  src/gallium/drivers/r600/r600_asm.c| 9 +
  src/gallium/drivers/r600/r600_shader.c | 2 +-
  2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_asm.c
b/src/gallium/drivers/r600/r600_asm.c
index bc69806..8fc622c 100644
--- a/src/gallium/drivers/r600/r600_asm.c
+++ b/src/gallium/drivers/r600/r600_asm.c
@@ -635,8 +635,9 @@ static int replace_gpr_with_pv_ps(struct r600_bytecode *bc,
 return 0;
  }

-void r600_bytecode_special_constants(uint32_t value, unsigned *sel,
unsigned *neg)
+void r600_bytecode_special_constants(uint32_t value, unsigned *sel,
unsigned *neg, unsigned abs)
  {
+


Please remove the extra whitespace line.

Cheers,
Nicolai


 switch(value) {
 case 0:
 *sel = V_SQ_ALU_SRC_0;
@@ -655,11 +656,11 @@ void r600_bytecode_special_constants(uint32_t
value, unsigned *sel, unsigned *ne
 break;
 case 0xBF80: /* -1.0f */
 *sel = V_SQ_ALU_SRC_1;
-   *neg ^= 1;
+   *neg ^= !abs;
 break;
 case 0xBF00: /* -0.5f */
 *sel = V_SQ_ALU_SRC_0_5;
-   *neg ^= 1;
+   *neg ^= !abs;
 break;
 default:
 *sel = V_SQ_ALU_SRC_LITERAL;
@@ -1208,7 +1209,7 @@ int r600_bytecode_add_alu_type(struct r600_bytecode *bc,
 }
 if (nalu->src[i].sel == V_SQ_ALU_SRC_LITERAL)
 r600_bytecode_special_constants(nalu->src[i].value,
-   &nalu->src[i].sel, &nalu->src[i].neg);
+   &nalu->src[i].sel, &nalu->src[i].neg,
nalu->src[i].abs);
 }
 if (nalu->dst.sel >= bc->ngpr) {
 bc->ngpr = nalu->dst.sel + 1;
diff --git a/src/gallium/drivers/r600/r600_shader.c
b/src/gallium/drivers/r600/r600_shader.c
index 8efe902..50c0329 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -1008,7 +1008,7 @@ static void tgsi_src(struct r600_shader_ctx *ctx,
 (tgsi_src->Register.SwizzleX ==
tgsi_src->Register.SwizzleW)) {

 index = tgsi_src->Register.Index * 4 +
tgsi_src->Register.SwizzleX;
-
r600_bytecode_special_constants(ctx->literals[index], &r600_src->sel,
&r600_src->neg);
+
r600_bytecode_special_constants(ctx->literals[index], &r600_src->sel,
&r600_src->neg, r600_src->abs);
 if (r600_src->sel != V_SQ_ALU_SRC_LITERAL)
 return;
 }



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC 09/21] mesa: Generate a helper function for each extension

2015-10-26 Thread Chad Versace
On Fri 23 Oct 2015, Nanley Chery wrote:
> On Thu, Oct 22, 2015 at 11:30 AM, Chad Versace 
> wrote:
> 
> > On Mon 19 Oct 2015, Nanley Chery wrote:
> > > From: Nanley Chery 
> > >
> > > Generate functions which determine if an extension is supported in the
> > > current context. Initially, enums were going to be explicitly used with
> > > _mesa_extension_supported(). The idea to embed the function and enums
> > > into generated helper functions was suggested by Kristian Høgsberg.
> > >
> > > For performance, the function body no longer uses
> > > _mesa_extension_supported() and, as suggested by Chad Versace, the
> > > functions are also declared static inline.
> > >
> > > Signed-off-by: Nanley Chery 
> > > ---
> > >  src/mesa/main/context.h|  1 +
> > >  src/mesa/main/extensions.c | 22 +-
> > >  src/mesa/main/extensions.h | 39 +++
> > >  3 files changed, 41 insertions(+), 21 deletions(-)
> >
> > [...]


> > > @@ -55,6 +55,45 @@ _mesa_get_extension_count(struct gl_context *ctx);
> > >  extern const GLubyte *
> > >  _mesa_get_enabled_extension(struct gl_context *ctx, GLuint index);
> > >
> > > +
> > > +/**
> > > + * \brief An element of the \c extension_table.
> > > + */
> > > +struct extension {
> >
> > In addition to Marek's comment, the struct should be prefixed too.
> >
> >
> After rereading the coding style guidelines, I've found that it only says
> that
> functions need to be prefixed. Should these two suggestions be added?

Personally, I would avoid placing such a rule in the style guide,
because there is no hard, absolute rule that dictates when a symbol
should be prefixed. For example, many of the global symbols in
src/util/*.h are not prefixed.

If you examine the symbols in src/mesa/main/mtypes.h, though, nearly
everything is prefixed. And I view src/mesa/main/extensions.h as a file
that belongs in the same family as mtypes.h.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC 19/21] mesa: Replace gl_extensions::EXT_texture3D with ::dummy_true

2015-10-26 Thread Chad Versace
On Mon 19 Oct 2015, Nanley Chery wrote:
> From: Nanley Chery 
> 
> Mesa unconditionally sets this driver flag to true in
> _mesa_init_extensions(). There is therefore no need for
> the driver to communicate support for this extension.
> Replace the driver capability flag with ::dummy_true.
> 
> Signed-off-by: Nanley Chery 
> ---
>  src/glsl/glsl_parser_extras.cpp | 2 +-
>  src/glsl/standalone_scaffolding.cpp | 1 -
>  src/mesa/main/extensions.c  | 2 --
>  src/mesa/main/extensions_table.h| 4 ++--
>  src/mesa/main/mtypes.h  | 1 -
>  5 files changed, 3 insertions(+), 7 deletions(-)

This patch is
Reviewed-by: Chad Versace 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/5] i965: Implement ARB_fragment_layer_viewport.

2015-10-26 Thread Chris Forbes
For the series

Reviewed-by: Chris Forbes 
On Oct 27, 2015 7:03 AM, "Kenneth Graunke"  wrote:

> Normally, we could read gl_Layer from bits 26:16 of R0.0.  However, the
> specification requires that bogus out-of-range 32-bit values written by
> previous stages need to appear in the fragment shader as-written.
>
> Instead, we pass in the full 32-bit value from the VUE header as an
> extra flat-shaded varying.  We have the SF override the value to 0
> when the previous stage didn't actually write a value (it's actually
> defined to return 0).
>
> Signed-off-by: Kenneth Graunke 
> Cc: Chris Forbes 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp |  7 ++-
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  8 +++
>  src/mesa/drivers/dri/i965/gen6_sf_state.c| 31
> 
>  src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
>  4 files changed, 46 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 2eef7af..5a82b04 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -1442,6 +1442,9 @@ fs_visitor::calculate_urb_setup()
>  }
>   }
>} else {
> + bool include_vue_header =
> +nir->info.inputs_read & (VARYING_BIT_LAYER |
> VARYING_BIT_VIEWPORT);
> +
>   /* We have enough input varyings that the SF/SBE pipeline stage
> can't
>* arbitrarily rearrange them to suit our whim; we have to put
> them
>* in an order that matches the output of the previous pipeline
> stage
> @@ -1451,7 +1454,9 @@ fs_visitor::calculate_urb_setup()
>   brw_compute_vue_map(devinfo, &prev_stage_vue_map,
>   key->input_slots_valid,
>   nir->info.separate_shader);
> - int first_slot = 2 * BRW_SF_URB_ENTRY_READ_OFFSET;
> + int first_slot =
> +include_vue_header ? 0 : 2 * BRW_SF_URB_ENTRY_READ_OFFSET;
> +
>   assert(prev_stage_vue_map.num_slots <= first_slot + 32);
>   for (int slot = first_slot; slot < prev_stage_vue_map.num_slots;
>slot++) {
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 4950ba4..9c1f95c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -71,6 +71,14 @@ fs_visitor::nir_setup_inputs()
>   var->data.origin_upper_left);
>   emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, bld.dispatch_width(),
> input, reg), 0xF);
> +  } else if (var->data.location == VARYING_SLOT_LAYER) {
> + struct brw_reg reg = suboffset(interp_reg(VARYING_SLOT_LAYER,
> 1), 3);
> + reg.type = BRW_REGISTER_TYPE_D;
> + bld.emit(FS_OPCODE_CINTERP, retype(input, BRW_REGISTER_TYPE_D),
> reg);
> +  } else if (var->data.location == VARYING_SLOT_VIEWPORT) {
> + struct brw_reg reg = suboffset(interp_reg(VARYING_SLOT_VIEWPORT,
> 2), 3);
> + reg.type = BRW_REGISTER_TYPE_D;
> + bld.emit(FS_OPCODE_CINTERP, retype(input, BRW_REGISTER_TYPE_D),
> reg);
>} else {
>   emit_general_interpolation(input, var->name, var->type,
>  (glsl_interp_qualifier)
> var->data.interpolation,
> diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c
> b/src/mesa/drivers/dri/i965/gen6_sf_state.c
> index 0c8c053..2634e6b 100644
> --- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
> +++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
> @@ -60,6 +60,23 @@ get_attr_override(const struct brw_vue_map *vue_map,
> int urb_entry_read_offset,
> /* Find the VUE slot for this attribute. */
> int slot = vue_map->varying_to_slot[fs_attr];
>
> +   /* Viewport and Layer are stored in the VUE header.  We need to
> override
> +* them to zero if earlier stages didn't write them, as GL requires
> that
> +* they read back as zero when not explicitly set.
> +*/
> +   if (fs_attr == VARYING_SLOT_VIEWPORT || fs_attr == VARYING_SLOT_LAYER)
> {
> +  unsigned override =
> + ATTRIBUTE_0_OVERRIDE_X | ATTRIBUTE_0_OVERRIDE_W |
> + ATTRIBUTE_CONST_ << ATTRIBUTE_0_CONST_SOURCE_SHIFT;
> +
> +  if (!(vue_map->slots_valid & VARYING_BIT_LAYER))
> + override |= ATTRIBUTE_0_OVERRIDE_Y;
> +  if (!(vue_map->slots_valid & VARYING_BIT_VIEWPORT))
> + override |= ATTRIBUTE_0_OVERRIDE_Z;
> +
> +  return override;
> +   }
> +
> /* If there was only a back color written but not front, use back
>  * as the color instead of undefined
>  */
> @@ -169,6 +186,20 @@ calculate_attr_overrides(const struct brw_context
> *brw,
>
> *urb_entry_read_offset = BRW_SF_URB_ENTRY_READ_OFFSET;
>
> +   /* BRW_NEW_FRAGMENT_PROGRAM
> +*
> +* If the fragment shader reads VARYING_SLOT_LAYER, then we need to
> pass in

[Mesa-dev] [Bug 92278] Black screen in War Thunder

2015-10-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92278

--- Comment #5 from Sven Arvidsson  ---
The patch works fine here, awesome work! Fantastic to have it fixed so quickly!

The developer of the game is Gaijin Entertainment, http://gaijinent.com

I could probably open a support ticket with them, but chances are they're not
interested. The FAQ for Linux only mentions "install proprietary drivers" which
is quite hard to do on Intel.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC 15/21] mesa: Fix EXT_texture_sRGB functionality leaks

2015-10-26 Thread Nanley Chery
On Fri, Oct 23, 2015 at 3:01 PM, Nanley Chery  wrote:

>
>
> On Thu, Oct 22, 2015 at 12:15 PM, Chad Versace 
> wrote:
>
>> On Mon 19 Oct 2015, Nanley Chery wrote:
>> > From: Nanley Chery 
>> >
>> > Stop leaks into the following contexts:
>> >* GLES in _mesa_base_tex_format() and lookup_view_class().
>> >* Pre-1.1 GL legacy contexts in all uses.
>> >
>> > Stop allowing compressed sRGB formats as valid formats in GLES3
>> > contexts. I realized this was happening when CTS failures occured after
>> > fixing the extension functionality leak with the helper function.
>>
>> Do you mean that this patch *fixes* CTS failures? If so, which CTS for
>> which GLES version?  There are so many CTS's in the world :(
>>
>
> I meant that when I first modified _mesa_base_tex_format() to use the
> helper
> function, I had failures in piglit, dEQP, and the Khronos CTS. "failures
> in every
> test-suite" probably would've been better than "CTS failures" here. It was
> because of those failures, that I realized that we were incorrectly
> allowing
> formats like GL_COMPRESSED_SRGB_EXT in GLES contexts. This motivated
> the split of the switch statement you see in the diff of
> _mesa_base_tex_format().
>

It seems that the note about what inspired this change is confusing
(and likely unnecessary). I'll leave it out in the v2.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/7] nir: support to clone shaders (v2)

2015-10-26 Thread Jason Ekstrand
On Mon, Oct 26, 2015 at 12:53 PM, Rob Clark  wrote:
> On Mon, Oct 26, 2015 at 2:44 PM, Jason Ekstrand  wrote:
>> On Mon, Oct 26, 2015 at 8:27 AM, Rob Clark  wrote:
>>> Signed-off-by: Rob Clark 
>>> ---
>>>  src/glsl/Makefile.sources |   1 +
>>>  src/glsl/nir/nir.c|   8 +
>>>  src/glsl/nir/nir.h|   2 +
>>>  src/glsl/nir/nir_clone.c  | 949 
>>> ++
>>>  4 files changed, 960 insertions(+)
>>>  create mode 100644 src/glsl/nir/nir_clone.c
>>>
>>> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
>>> index ca87036..25e3801 100644
>>> --- a/src/glsl/Makefile.sources
>>> +++ b/src/glsl/Makefile.sources
>>> @@ -26,6 +26,7 @@ NIR_FILES = \
>>> nir/nir.h \
>>> nir/nir_array.h \
>>> nir/nir_builder.h \
>>> +   nir/nir_clone.c \
>>> nir/nir_constant_expressions.h \
>>> nir/nir_control_flow.c \
>>> nir/nir_control_flow.h \
>>> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
>>> index 0cbe4e1..2defa36 100644
>>> --- a/src/glsl/nir/nir.c
>>> +++ b/src/glsl/nir/nir.c
>>> @@ -316,6 +316,14 @@ nir_block_create(nir_shader *shader)
>>> block->predecessors = _mesa_set_create(block, _mesa_hash_pointer,
>>>_mesa_key_pointer_equal);
>>> block->imm_dom = NULL;
>>> +   /* XXX maybe it would be worth it to defer allocation?  This
>>> +* way it doesn't get allocated for shader ref's that never run
>>> +* nir_calc_dominance?  For example, state-tracker creates an
>>> +* initial IR, clones that, runs appropriate lowering pass, passes
>>> +* to driver which does common lowering/opt, and then stores ref
>>> +* which is later used to do state specific lowering and futher
>>> +* opt.  Do any of the references not need dominance metadata?
>>> +*/
>>
>> Yes, we should defer this.  The dominance frontier is a product of the
>> dominance calculations (as are the dominance arrays) so we should
>> allocate it there.
>>
>>> block->dom_frontier = _mesa_set_create(block, _mesa_hash_pointer,
>>>_mesa_key_pointer_equal);
>>>
>>> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>>> index fb60340..efcbf62 100644
>>> --- a/src/glsl/nir/nir.h
>>> +++ b/src/glsl/nir/nir.h
>>> @@ -1897,6 +1897,8 @@ void nir_index_blocks(nir_function_impl *impl);
>>>  void nir_print_shader(nir_shader *shader, FILE *fp);
>>>  void nir_print_instr(const nir_instr *instr, FILE *fp);
>>>
>>> +nir_shader * nir_shader_clone(void *mem_ctx, const nir_shader *s);
>>> +
>>>  #ifdef DEBUG
>>>  void nir_validate_shader(nir_shader *shader);
>>>  #else
>>> diff --git a/src/glsl/nir/nir_clone.c b/src/glsl/nir/nir_clone.c
>>> new file mode 100644
>>> index 000..5c1739e
>>> --- /dev/null
>>> +++ b/src/glsl/nir/nir_clone.c
>>> @@ -0,0 +1,949 @@
>>> +/*
>>> + * Copyright © 2015 Red Hat
>>> + *
>>> + * Permission is hereby granted, free of charge, to any person obtaining a
>>> + * copy of this software and associated documentation files (the 
>>> "Software"),
>>> + * to deal in the Software without restriction, including without 
>>> limitation
>>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>>> + * and/or sell copies of the Software, and to permit persons to whom the
>>> + * Software is furnished to do so, subject to the following conditions:
>>> + *
>>> + * The above copyright notice and this permission notice (including the 
>>> next
>>> + * paragraph) shall be included in all copies or substantial portions of 
>>> the
>>> + * Software.
>>> + *
>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>>> OR
>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
>>> OTHER
>>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
>>> DEALINGS
>>> + * IN THE SOFTWARE.
>>> + */
>>> +
>>> +#include "nir.h"
>>> +#include "nir_control_flow_private.h"
>>> +
>>> +// TODO move these:
>>> +#define exec_list_head(type, list, node) \
>>> +   exec_node_data(type, exec_list_get_head(list), node)
>>> +#define exec_list_next(type, nodeptr, node) \
>>> +   exec_node_data(type, exec_node_get_next(nodeptr), node)
>>> +
>>> +/* Secret Decoder Ring:
>>> + *   clone_foo():
>>> + *Allocate and clone a foo.
>>> + *   __clone_foo():
>>> + *Clone body of foo (ie. parent class, embedded struct,
>>> + *etc)
>>> + *   __clone_foo_p2():
>>> + *clone body of foo, pass 2.. since in first pass we
>>> + *can have fwd references to embedded structs, some
>>> + *ptrs (and things that depend on them) must be
>>> + *resolved in 2nd pass
>>> + */
>>> +
>>> +typedef struct {
>>>

Re: [Mesa-dev] [PATCH v2] mesa: Draw Indirect return wrong error code on unalinged

2015-10-26 Thread Ian Romanick
Reviewed-by: Ian Romanick 

On 10/26/2015 03:22 AM, Marta Lofstedt wrote:
> From: Marta Lofstedt 
> 
> From OpenGL 4.4 specification, section 10.4 and
> Open GL Es 3.1 section 10.5:
> "An INVALID_VALUE error is generated if indirect is not a multiple
> of the size, in basic machine units, of uint."
> 
> However, the current code follow the ARB_draw_indirect:
> https://www.opengl.org/registry/specs/ARB/draw_indirect.txt
> "INVALID_OPERATION is generated by DrawArraysIndirect and
> DrawElementsIndirect if commands source data beyond the end
> of a buffer object or if  is not word aligned."
> 
> V2: After discussions on the list, it was suggested to
> only keep the INVALID_VALUE error.
> 
> Signed-off-by: Marta Lofstedt 
> ---
>  src/mesa/main/api_validate.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
> index 40a2f43..19b6a73 100644
> --- a/src/mesa/main/api_validate.c
> +++ b/src/mesa/main/api_validate.c
> @@ -714,12 +714,14 @@ valid_draw_indirect(struct gl_context *ctx,
>return GL_FALSE;
>  
>  
> -   /* From the ARB_draw_indirect specification:
> -* "An INVALID_OPERATION error is generated [...] if  is no
> -*  word aligned."
> +   /* From OpenGL version 4.4. section 10.5
> +* and OpenGL ES 3.1, section 10.6:
> +*
> +*  "An INVALID_VALUE error is generated if indirect is not a
> +*   multiple of the size, in basic machine units, of uint."
>  */
> if ((GLsizeiptr)indirect & (sizeof(GLuint) - 1)) {
> -  _mesa_error(ctx, GL_INVALID_OPERATION,
> +  _mesa_error(ctx, GL_INVALID_VALUE,
>"%s(indirect is not aligned)", name);
>return GL_FALSE;
> }
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] mesa: Draw indirect is not allowed when xfb is active and unpaused

2015-10-26 Thread Ian Romanick
Reviewed-by: Ian Romanick 

On 10/26/2015 03:50 AM, Marta Lofstedt wrote:
> From: Marta Lofstedt 
> 
> OpenGL ES 3.1 specification, section 10.5:
> "An INVALID_OPERATION error is generated if
> transform feedback is active and not paused."
> 
> Signed-off-by: Marta Lofstedt 
> ---
>  src/mesa/main/api_validate.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
> index fa6c1b5..303d5e8 100644
> --- a/src/mesa/main/api_validate.c
> +++ b/src/mesa/main/api_validate.c
> @@ -710,6 +710,16 @@ valid_draw_indirect(struct gl_context *ctx,
>return GL_FALSE;
> }
>  
> +   /* OpenGL ES 3.1 specification, section 10.5:
> +*
> +*  "An INVALID_OPERATION error is generated if
> +*  transform feedback is active and not paused."
> +*/
> +   if (_mesa_is_gles31(ctx) && _mesa_is_xfb_active_and_unpaused(ctx)) {
> +  _mesa_error(ctx, GL_INVALID_OPERATION,
> +  "%s(TransformFeedback is active but not paused)", name);
> +   }
> +
> /*
>  * OpenGL ES 3.1 spec. section 10.5:
>  * "An INVALID_OPERATION error is generated if zero is bound to
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH] mesa: Enable ASTC in GLES' [NUM_]COMPRESSED_TEXTURE_FORMATS queries

2015-10-26 Thread Ian Romanick
Reviewed-by: Ian Romanick 

On 10/21/2015 03:06 PM, Nanley Chery wrote:
> From: Nanley Chery 
> 
> In OpenGL ES, the COMPRESSED_TEXTURE_FORMATS query returns the set of
> supported specific compressed formats. Since ASTC formats fit within
> that category, include them in the set and update the
> NUM_COMPRESSED_TEXTURE_FORMATS query as well.
> 
> This enables GLES2-based ASTC dEQP tests to run. See the Bugzilla for
> more info.
> 
> Cc: "11.0" 

I think we were mistaken about this... ASTC isn't actually in 11.0, is it?

> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92193
> Reported-by: Tapani Pälli 
> Suggested-by: Ian Romanick 
> Signed-off-by: Nanley Chery 
> ---
>  src/mesa/main/texcompress.c | 85 
> +
>  1 file changed, 63 insertions(+), 22 deletions(-)
> 
> diff --git a/src/mesa/main/texcompress.c b/src/mesa/main/texcompress.c
> index 84973d3..9d22586 100644
> --- a/src/mesa/main/texcompress.c
> +++ b/src/mesa/main/texcompress.c
> @@ -243,28 +243,6 @@ _mesa_gl_compressed_format_base_format(GLenum format)
>   *what GL_NUM_COMPRESSED_TEXTURE_FORMATS and
>   *GL_COMPRESSED_TEXTURE_FORMATS return."
>   *
> - * The KHR_texture_compression_astc_hdr spec says:
> - *
> - *"Interactions with OpenGL 4.2
> - *
> - *OpenGL 4.2 supports the feature that compressed textures can be
> - *compressed online, by passing the compressed texture format enum as
> - *the internal format when uploading a texture using TexImage1D,
> - *TexImage2D or TexImage3D (see Section 3.9.3, Texture Image
> - *Specification, subsection Encoding of Special Internal Formats).
> - *
> - *Due to the complexity of the ASTC compression algorithm, it is not
> - *usually suitable for online use, and therefore ASTC support will be
> - *limited to pre-compressed textures only. Where on-device 
> compression
> - *is required, a domain-specific limited compressor will typically
> - *be used, and this is therefore not suitable for implementation in
> - *the driver.
> - *
> - *In particular, the ASTC format specifiers will not be added to
> - *Table 3.14, and thus will not be accepted by the TexImage*D
> - *functions, and will not be returned by the (already deprecated)
> - *COMPRESSED_TEXTURE_FORMATS query."
> - *
>   * There is no formal spec for GL_ATI_texture_compression_3dc.  Since the
>   * formats added by this extension are luminance-alpha formats, it is
>   * reasonable to expect them to follow the same rules as
> @@ -396,6 +374,69 @@ _mesa_get_compressed_formats(struct gl_context *ctx, 
> GLint *formats)
>   n += 10;
>}
> }
> +
> +   /* The KHR_texture_compression_astc_hdr spec says:
> +*
> +*"Interactions with OpenGL 4.2
> +*
> +*OpenGL 4.2 supports the feature that compressed textures can be
> +*compressed online, by passing the compressed texture format 
> enum as
> +*the internal format when uploading a texture using TexImage1D,
> +*TexImage2D or TexImage3D (see Section 3.9.3, Texture Image
> +*Specification, subsection Encoding of Special Internal Formats).
> +*
> +*Due to the complexity of the ASTC compression algorithm, it is 
> not
> +*usually suitable for online use, and therefore ASTC support 
> will be
> +*limited to pre-compressed textures only. Where on-device 
> compression
> +*is required, a domain-specific limited compressor will typically
> +*be used, and this is therefore not suitable for implementation 
> in
> +*the driver.
> +*
> +*In particular, the ASTC format specifiers will not be added to
> +*Table 3.14, and thus will not be accepted by the TexImage*D
> +*functions, and will not be returned by the (already deprecated)
> +*COMPRESSED_TEXTURE_FORMATS query."
> +*
> +* The ES and the desktop specs diverge here. In OpenGL ES, the 
> COMPRESSED_TEXTURE_FORMATS
> +* query returns the set of supported specific compressed formats.
> +*/
> +   if (ctx->API == API_OPENGLES2 &&
> +   ctx->Extensions.KHR_texture_compression_astc_ldr) {
> +  if (formats) {
> + formats[n++] = GL_COMPRESSED_RGBA_ASTC_4x4_KHR;
> + formats[n++] = GL_COMPRESSED_RGBA_ASTC_5x4_KHR;
> + formats[n++] = GL_COMPRESSED_RGBA_ASTC_5x5_KHR;
> + formats[n++] = GL_COMPRESSED_RGBA_ASTC_6x5_KHR;
> + formats[n++] = GL_COMPRESSED_RGBA_ASTC_6x6_KHR;
> + formats[n++] = GL_COMPRESSED_RGBA_ASTC_8x5_KHR;
> + formats[n++] = GL_COMPRESSED_RGBA_ASTC_8x6_KHR;
> + formats[n++] = GL_COMPRESSED_RGBA_ASTC_8x8_KHR;
> + formats[n++] = GL_COMPRESSED_RGBA_ASTC_10x5_KHR;
> + formats[n++] = GL_COMPRESSED_RGBA_ASTC_10x6_KHR;
> + formats[n++] = GL_COMPRESSED_RGBA_ASTC

Re: [Mesa-dev] [PATCH 3/5] glsl: Mark gl_ViewportIndex and gl_Layer varyings as flat.

2015-10-26 Thread Kenneth Graunke
On Monday, October 26, 2015 02:10:29 PM Ilia Mirkin wrote:
> On Mon, Oct 26, 2015 at 2:03 PM, Kenneth Graunke  
> wrote:
> > Integer varyings need to be flat qualified - all others were already.
> > I think we just missed this.  Presumably some hardware passes this via
> > sideband and ignores attribute interpolation, so no one has noticed.
> >
> > Signed-off-by: Kenneth Graunke 
> > Cc: Chris Forbes 
> > ---
> >  src/glsl/builtin_variables.cpp | 39 ++-
> >  1 file changed, 26 insertions(+), 13 deletions(-)
> >
> > diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
> > index a6ad105..07aacd4 100644
> > --- a/src/glsl/builtin_variables.cpp
> > +++ b/src/glsl/builtin_variables.cpp
> > @@ -887,16 +887,22 @@ builtin_variable_generator::generate_uniforms()
> >  void
> >  builtin_variable_generator::generate_vs_special_vars()
> >  {
> > +   ir_variable *var;
> > +
> > if (state->is_version(130, 300))
> >add_system_value(SYSTEM_VALUE_VERTEX_ID, int_t, "gl_VertexID");
> > if (state->ARB_draw_instanced_enable)
> >add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, 
> > "gl_InstanceIDARB");
> > if (state->ARB_draw_instanced_enable || state->is_version(140, 300))
> >add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceID");
> > -   if (state->AMD_vertex_shader_layer_enable)
> > -  add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> > -   if (state->AMD_vertex_shader_viewport_index_enable)
> > -  add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> > +   if (state->AMD_vertex_shader_layer_enable) {
> > +  var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> > +  var->data.interpolation = INTERP_QUALIFIER_FLAT;
> 
> Not that I *strongly* prefer this, but what we'd do in nouveau/codegen
> code (and is IMHO more concise):
> 
> add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer")
>->data.interpolation = INTERP_QUALIFIER_FLAT;
> 
> Alternatively, you could just solve this in add_output which should
> automatically set the flat interpolation for int types...
> 
> FTR I don't strongly prefer either of the above to your solution, just
> wanted to point out some alternatives in case you like them more.

Yeah, I've never been a huge fan of using -> on pointers directly
returned by function calls...but that's mostly just personal preference.

I considered having add_input/add_output automatically detect ivec*
types and automatically set things to FLAT.  The nice thing about that
is that would prevent future mistakes.  However, I think ivec* typed
VS inputs/FS outputs shouldn't be marked FLAT, so we might need a
parameter (or special vs_input/fs_output functions)...which seemed
potentially ugly.

I suppose adding a non-optional interp qualifier parameter would also
prevent future mistakes, at the cost of making the common case wordy.

In the end I figured I'd do this for now and let somebody more
inspired tidy it up...


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH] mesa: Enable ASTC in GLES' [NUM_]COMPRESSED_TEXTURE_FORMATS queries

2015-10-26 Thread Nanley Chery
On Mon, Oct 26, 2015 at 4:30 PM, Ian Romanick  wrote:

> Reviewed-by: Ian Romanick 
>
> On 10/21/2015 03:06 PM, Nanley Chery wrote:
> > From: Nanley Chery 
> >
> > In OpenGL ES, the COMPRESSED_TEXTURE_FORMATS query returns the set of
> > supported specific compressed formats. Since ASTC formats fit within
> > that category, include them in the set and update the
> > NUM_COMPRESSED_TEXTURE_FORMATS query as well.
> >
> > This enables GLES2-based ASTC dEQP tests to run. See the Bugzilla for
> > more info.
> >
> > Cc: "11.0" 
>
> I think we were mistaken about this... ASTC isn't actually in 11.0, is it?
>
>
You're right. It is not in 11.0.


> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92193
> > Reported-by: Tapani Pälli 
> > Suggested-by: Ian Romanick 
> > Signed-off-by: Nanley Chery 
> > ---
> >  src/mesa/main/texcompress.c | 85
> +
> >  1 file changed, 63 insertions(+), 22 deletions(-)
> >
> > diff --git a/src/mesa/main/texcompress.c b/src/mesa/main/texcompress.c
> > index 84973d3..9d22586 100644
> > --- a/src/mesa/main/texcompress.c
> > +++ b/src/mesa/main/texcompress.c
> > @@ -243,28 +243,6 @@ _mesa_gl_compressed_format_base_format(GLenum
> format)
> >   *what GL_NUM_COMPRESSED_TEXTURE_FORMATS and
> >   *GL_COMPRESSED_TEXTURE_FORMATS return."
> >   *
> > - * The KHR_texture_compression_astc_hdr spec says:
> > - *
> > - *"Interactions with OpenGL 4.2
> > - *
> > - *OpenGL 4.2 supports the feature that compressed textures can
> be
> > - *compressed online, by passing the compressed texture format
> enum as
> > - *the internal format when uploading a texture using TexImage1D,
> > - *TexImage2D or TexImage3D (see Section 3.9.3, Texture Image
> > - *Specification, subsection Encoding of Special Internal
> Formats).
> > - *
> > - *Due to the complexity of the ASTC compression algorithm, it
> is not
> > - *usually suitable for online use, and therefore ASTC support
> will be
> > - *limited to pre-compressed textures only. Where on-device
> compression
> > - *is required, a domain-specific limited compressor will
> typically
> > - *be used, and this is therefore not suitable for
> implementation in
> > - *the driver.
> > - *
> > - *In particular, the ASTC format specifiers will not be added to
> > - *Table 3.14, and thus will not be accepted by the TexImage*D
> > - *functions, and will not be returned by the (already
> deprecated)
> > - *COMPRESSED_TEXTURE_FORMATS query."
> > - *
> >   * There is no formal spec for GL_ATI_texture_compression_3dc.  Since
> the
> >   * formats added by this extension are luminance-alpha formats, it is
> >   * reasonable to expect them to follow the same rules as
> > @@ -396,6 +374,69 @@ _mesa_get_compressed_formats(struct gl_context
> *ctx, GLint *formats)
> >   n += 10;
> >}
> > }
> > +
> > +   /* The KHR_texture_compression_astc_hdr spec says:
> > +*
> > +*"Interactions with OpenGL 4.2
> > +*
> > +*OpenGL 4.2 supports the feature that compressed textures
> can be
> > +*compressed online, by passing the compressed texture
> format enum as
> > +*the internal format when uploading a texture using
> TexImage1D,
> > +*TexImage2D or TexImage3D (see Section 3.9.3, Texture Image
> > +*Specification, subsection Encoding of Special Internal
> Formats).
> > +*
> > +*Due to the complexity of the ASTC compression algorithm,
> it is not
> > +*usually suitable for online use, and therefore ASTC
> support will be
> > +*limited to pre-compressed textures only. Where on-device
> compression
> > +*is required, a domain-specific limited compressor will
> typically
> > +*be used, and this is therefore not suitable for
> implementation in
> > +*the driver.
> > +*
> > +*In particular, the ASTC format specifiers will not be
> added to
> > +*Table 3.14, and thus will not be accepted by the TexImage*D
> > +*functions, and will not be returned by the (already
> deprecated)
> > +*COMPRESSED_TEXTURE_FORMATS query."
> > +*
> > +* The ES and the desktop specs diverge here. In OpenGL ES, the
> COMPRESSED_TEXTURE_FORMATS
> > +* query returns the set of supported specific compressed formats.
> > +*/
> > +   if (ctx->API == API_OPENGLES2 &&
> > +   ctx->Extensions.KHR_texture_compression_astc_ldr) {
> > +  if (formats) {
> > + formats[n++] = GL_COMPRESSED_RGBA_ASTC_4x4_KHR;
> > + formats[n++] = GL_COMPRESSED_RGBA_ASTC_5x4_KHR;
> > + formats[n++] = GL_COMPRESSED_RGBA_ASTC_5x5_KHR;
> > + formats[n++] = GL_COMPRESSED_RGBA_ASTC_6x5_KHR;
> > + formats[n++] = GL_COMPRESSED_RGBA_ASTC_6x6_KHR;
> > + formats[n++] = GL_COMPRESSED_RGBA_ASTC

[Mesa-dev] [Bug 92552] [softpipe] piglit egl-create-context-valid-flag-forward-compatible-gl regression

2015-10-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92552

--- Comment #8 from Ian Romanick  ---
(In reply to Boyan Ding from comment #4)
> (In reply to Matthew Waters from comment #2)
> > Created attachment 119006 [details] [review] [review]
> > egl: distnguish between unsupported api vs capabilities for 
> > EGL_CONTEXT_FLAGS
> > 
> > This fixes that issue for me.
> 
> Well, your patch does seem to eliminate the problem, but there is a subtle
> difference between how things works now and before, which I think may be
> questionable.
> 
> The spec of EGL_KHR_create_context says:
> requesting a forward-compatible context for OpenGL versions less
> than 3.0 will generate an error
> 
> The code now takes "OpenGL version" above as version requested with EGL (1.0
> by default). However, the OpenGL version actually provided can be up to 3.0.

I'm not really sure what you're asking.  If the create-context call (either EGL
or GLX) does not specify a version, it is as though it specified 1.0.  It
follows that not specifying a version and requesting a forward-compatible
context should generate an error.

I think the GLX spec is pretty clear that BadMatch is generated.  It is also
quite clear that the error is based on the value of the explicitly or
implicitly requested API version.

  * If attributes GLX_CONTEXT_MAJOR_VERSION_ARB and
GLX_CONTEXT_MINOR_VERSION_ARB, when considered together with
attributes GLX_CONTEXT_FORWARD_COMPATIBLE_BIT_ARB and
GLX_RENDER_TYPE, specify an OpenGL version and feature set that
are not defined, BadMatch is generated.

The defined versions of OpenGL at the time of writing are OpenGL
1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.0, 2.1, 3.0, 3.1, and 3.2.
Feature deprecation was introduced with OpenGL 3.0, so
forward-compatible contexts may only be requested for OpenGL 3.0
and above. Thus, examples of invalid combinations of attributes
include:

  - Major version < 1 or > 3
  - Major version == 1 and minor version < 0 or > 5
  - Major version == 2 and minor version < 0 or > 1
  - Major version == 3 and minor version > 2
  - Forward-compatible flag set and major version < 3
  - Color index rendering and major version >= 3

Because the purpose of forward-compatible contexts is to allow
application development on a specific OpenGL version with the
knowledge that the app will run on a future version, context
creation will fail if GLX_CONTEXT_FORWARD_COMPATIBLE_BIT_ARB is
set and the context version returned cannot implement exactly
the requested version.

That is at least what I intended to implement in the GLX code.

The language in the EGL extension is almost identical:

  * If an OpenGL context is requested and the values for attributes
EGL_CONTEXT_MAJOR_VERSION_KHR and EGL_CONTEXT_MINOR_VERSION_KHR,
when considered together with the value for attribute
EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR, specify an OpenGL
version and feature set that are not defined, than an
EGL_BAD_MATCH error is generated.

The defined versions of OpenGL at the time of writing are OpenGL
1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.0, 2.1, 3.0, 3.1, 3.2, 4.0, 4.1,
4.2, and 4.3. Feature deprecation was introduced with OpenGL
3.0, so forward-compatible contexts may only be requested for
OpenGL 3.0 and above. Thus, examples of invalid combinations of
attributes include:

  - Major version < 1 or > 4
  - Major version == 1 and minor version < 0 or > 5
  - Major version == 2 and minor version < 0 or > 1
  - Major version == 3 and minor version < 0 or > 2
  - Major version == 4 and minor version < 0 or > 3
  - Forward-compatible flag set and major version < 3

Because the purpose of forward-compatible contexts is to allow
application development on a specific OpenGL version with the
knowledge that the app will run on a future version, context
creation will fail if
EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR is set and the
context version returned cannot implement exactly the requested
version.


> In such case, your patch will fail to create forward-compatible context,
> giving the "appropriate" error, while mesa before 86ccb2a1 will succeed.
> Though both behaviors pass piglit.
> 
> Any ideas?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92552] [softpipe] piglit egl-create-context-valid-flag-forward-compatible-gl regression

2015-10-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92552

--- Comment #9 from Ian Romanick  ---
Comment on attachment 119006
  --> https://bugs.freedesktop.org/attachment.cgi?id=119006
egl: distnguish between unsupported api vs capabilities for EGL_CONTEXT_FLAGS

Review of attachment 119006:
-

There are a few things wrong with this patch.  First, you can't check the
version when you see EGL_CONTEXT_FLAGS in the attribs list because the
application could specify the requested version after flags.  This should not
generate an error:

EGL_CONTEXT_FLAGS, EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR,
EGL_CONTEXT_CLIENT_VERSION, 3,
0

Second, it is not correct to generate EGL_BAD_MATCH for
EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR when EGL_EXT_create_context_robustness
is not supported.  Think of it this way... if we deleted all support and
knowledge of EGL_EXT_create_context_robustness, what error would be generated? 
Default case in the switch statement says EGL_BAD_ATTRIBUTE.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/5] glsl: Mark gl_ViewportIndex and gl_Layer varyings as flat.

2015-10-26 Thread Ian Romanick
On 10/26/2015 11:03 AM, Kenneth Graunke wrote:
> Integer varyings need to be flat qualified - all others were already.
> I think we just missed this.  Presumably some hardware passes this via
> sideband and ignores attribute interpolation, so no one has noticed.
> 
> Signed-off-by: Kenneth Graunke 
> Cc: Chris Forbes 
> ---
>  src/glsl/builtin_variables.cpp | 39 ++-
>  1 file changed, 26 insertions(+), 13 deletions(-)
> 
> diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
> index a6ad105..07aacd4 100644
> --- a/src/glsl/builtin_variables.cpp
> +++ b/src/glsl/builtin_variables.cpp
> @@ -887,16 +887,22 @@ builtin_variable_generator::generate_uniforms()
>  void
>  builtin_variable_generator::generate_vs_special_vars()
>  {
> +   ir_variable *var;
> +
> if (state->is_version(130, 300))
>add_system_value(SYSTEM_VALUE_VERTEX_ID, int_t, "gl_VertexID");
> if (state->ARB_draw_instanced_enable)
>add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceIDARB");
> if (state->ARB_draw_instanced_enable || state->is_version(140, 300))
>add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceID");
> -   if (state->AMD_vertex_shader_layer_enable)
> -  add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> -   if (state->AMD_vertex_shader_viewport_index_enable)
> -  add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> +   if (state->AMD_vertex_shader_layer_enable) {
> +  var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> +  var->data.interpolation = INTERP_QUALIFIER_FLAT;
> +   }
> +   if (state->AMD_vertex_shader_viewport_index_enable) {
> +  var = add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> +  var->data.interpolation = INTERP_QUALIFIER_FLAT;
> +   }
> if (compatibility) {
>add_input(VERT_ATTRIB_POS, vec4_t, "gl_Vertex");
>add_input(VERT_ATTRIB_NORMAL, vec3_t, "gl_Normal");
> @@ -954,11 +960,16 @@ builtin_variable_generator::generate_tes_special_vars()
>  void
>  builtin_variable_generator::generate_gs_special_vars()
>  {
> -   add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> -   if (state->is_version(410, 0) || state->ARB_viewport_array_enable)
> -  add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> +   ir_variable *var;
> +
> +   var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> +   var->data.interpolation = INTERP_QUALIFIER_FLAT;
> +   if (state->is_version(410, 0) || state->ARB_viewport_array_enable) {
> +  var = add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> +  var->data.interpolation = INTERP_QUALIFIER_FLAT;
> +   }
> if (state->is_version(400, 0) || state->ARB_gpu_shader5_enable)
> -  add_system_value(SYSTEM_VALUE_INVOCATION_ID, int_t, "gl_InvocationID");
> +  var = add_system_value(SYSTEM_VALUE_INVOCATION_ID, int_t, 
> "gl_InvocationID");

Shouldn't this set var->data.interpolation?  It looks like you only made
half the change.

>  
> /* Although gl_PrimitiveID appears in tessellation control and 
> tessellation
>  * evaluation shaders, it has a different function there than it has in
> @@ -970,7 +981,6 @@ builtin_variable_generator::generate_gs_special_vars()
>  * the specific case of gl_PrimitiveIDIn.  So we don't need to treat
>  * gl_PrimitiveIDIn as an {ARB,EXT}_geometry_shader4-only variable.
>  */
> -   ir_variable *var;
> var = add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveIDIn");
> var->data.interpolation = INTERP_QUALIFIER_FLAT;
> var = add_output(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");
> @@ -984,14 +994,15 @@ builtin_variable_generator::generate_gs_special_vars()
>  void
>  builtin_variable_generator::generate_fs_special_vars()
>  {
> +   ir_variable *var;
> +
> add_input(VARYING_SLOT_POS, vec4_t, "gl_FragCoord");
> add_input(VARYING_SLOT_FACE, bool_t, "gl_FrontFacing");
> if (state->is_version(120, 100))
>add_input(VARYING_SLOT_PNTC, vec2_t, "gl_PointCoord");
>  
> if (state->is_version(150, 0)) {
> -  ir_variable *var =
> - add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");
> +  var = add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");
>var->data.interpolation = INTERP_QUALIFIER_FLAT;
> }
>  
> @@ -1043,8 +1054,10 @@ builtin_variable_generator::generate_fs_special_vars()
> }
>  
> if (state->is_version(430, 0) || 
> state->ARB_fragment_layer_viewport_enable) {
> -  add_input(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> -  add_input(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> +  var = add_input(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> +  var->data.interpolation = INTERP_QUALIFIER_FLAT;
> +  var = add_input(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> +  var->data.interpolation = INTERP_QUALIFIER_FLAT;
> }
>  }
>  
> 

___
mesa-d

Re: [Mesa-dev] [PATCH] i965: Set correct field for indirect align16 addrimm.

2015-10-26 Thread Ian Romanick
On 10/26/2015 11:10 AM, Matt Turner wrote:
> This has been wrong since the initial import of the i965 driver.

That is awesome.  Other than using the assembly validator, is there any
way to trigger this?  I assume you discovered this bug by inspection...

> ---
>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index ebd811f..df48590 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -410,7 +410,7 @@ brw_set_src0(struct brw_codegen *p, brw_inst *inst, 
> struct brw_reg reg)
>   if (brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_1) {
>  brw_inst_set_src0_ia1_addr_imm(devinfo, inst, 
> reg.dw1.bits.indirect_offset);
>} else {
> -brw_inst_set_src0_ia_subreg_nr(devinfo, inst, 
> reg.dw1.bits.indirect_offset);
> +brw_inst_set_src0_ia16_addr_imm(devinfo, inst, 
> reg.dw1.bits.indirect_offset);
>}
>}
>  
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/7] nir: support to clone shaders (v2)

2015-10-26 Thread Rob Clark
On Mon, Oct 26, 2015 at 6:56 PM, Jason Ekstrand  wrote:
> On Mon, Oct 26, 2015 at 12:53 PM, Rob Clark  wrote:
>> On Mon, Oct 26, 2015 at 2:44 PM, Jason Ekstrand  wrote:
>>> On Mon, Oct 26, 2015 at 8:27 AM, Rob Clark  wrote:

 +static void *
 +clone_intrinsic(clone_state *state, const void *ptr)
 +{
 +   const nir_intrinsic_instr *itr = ptr;
 +   unsigned num_srcs = nir_intrinsic_infos[itr->intrinsic].num_srcs;
 +
 +   nir_intrinsic_instr *nitr =
 +  ralloc_size(state->ns,
 +  sizeof(nir_intrinsic_instr) + num_srcs * 
 sizeof(nir_src));
 +   store_ptr(state, nitr, itr);
 +
 +   __clone_instr(state, &nitr->instr, &itr->instr);
 +
 +   nitr->intrinsic = itr->intrinsic;
 +
 +   if (nir_intrinsic_infos[itr->intrinsic].has_dest)
 +  __clone_dst(state, &nitr->dest, &itr->dest);
 +
 +   nitr->num_components = itr->num_components;
 +   memcpy(nitr->const_index, itr->const_index, sizeof(nitr->const_index));
 +
 +   for (unsigned i = 0; i < ARRAY_SIZE(nitr->variables); i++) {
 +  nitr->variables[i] = clone_ptr(state, itr->variables[i], 
 clone_deref_var);
>>>
>>> Derefs don't need to go in the remap table.  They need to use the
>>> variable from it and need to remap any indirect sources, but the deref
>>> doesn't need to go in the table.
>>
>> that goes for all the various places deref's can appear?  (Ie. call,
>> tex instr, etc)?  They are all private to the instruction?  I got a
>> bit lost/impatient in following glsl_to_nir around so I went for the
>> safe thing..
>
> Yes, that's correct.  A deref is always only local to the instruction.
> This is why I suggested passing the instruction into clone_deref
> above.  If you don't need to use clone_ptr on it, then it doesn't
> require passing extra stuff through.

oh, and I see, and nir_validate actually checks some of this stuff..

(possibly wouldn't hurt for it to validate parent memctx a bit more
thorough.. it took a bit of digging to figure out what should be
parent memctx of what and whether I could completely rely on that..
but other topic..)

I guess things that nir_validate asserts, I'd be more comfortable
relying on, vs letting things get broken until someone bothers to run
a debug build w/ NIR_TEST_CLONE=1 ;-)

>> (That said, in the grand scheme of things I don't think there are so
>> many deref objects, so probably doesn't make much difference)
>
> Sure, but it simplifies things if you just create it on the spot and
> pass the instruction through.

maybe.. otoh less things that are handled differently and require
thought to understand why they are handled in a particular way in
nir_clone, vs other fields/ptrs/etc, makes the code easier to
understand and keep up to date.. or at least that is part of my
reasoning about some things..

 +   }
 +
 +   for (unsigned i = 0; i < num_srcs; i++) {
 +  __clone_src(state, &nitr->src[i], &itr->src[i]);
 +   }
 +
 +   return nitr;
 +}
 +

[snip]

 +
 +static void
 +__clone_cf_node_p2(clone_state *state, nir_cf_node *ncf, const 
 nir_cf_node *cf)
 +{
 +   switch (cf->type) {
 +   case nir_cf_node_block:
 +  __clone_block_p2(state, nir_cf_node_as_block(ncf), 
 nir_cf_node_as_block(cf));
 +  break;
 +   case nir_cf_node_if:
 +  __clone_if_p2(state, nir_cf_node_as_if(ncf), nir_cf_node_as_if(cf));
 +  break;
 +   case nir_cf_node_loop:
 +  __clone_loop_p2(state, nir_cf_node_as_loop(ncf), 
 nir_cf_node_as_loop(cf));
 +  break;
 +   case nir_cf_node_function:
 +  __clone_function_impl_p2(state, nir_cf_node_as_function(ncf),
 +   nir_cf_node_as_function(cf));
>>>
>>> This whole song-and-dance isn't needed.  We already have
>>> nir_foreach_block which will do all of this for you.  To handle the
>>> ifs, we have nir_block_get_following_if().  See also
>>> nir_live_variables for details.
>>
>> but it won't co-iterate two lists at the same time for me ;-)
>
> Sure, but I'm pretty sure you never use it given my comment on
> clone_block_p2 :-)

well, partially.. I still need some fixup in nir_if too, but given
that I simplified things to just keep a list of nir_src's to fixup, I
also don't need the origin nir_if there currently.  (The earlier
version of the patch had more complete ir __clone_foo_v2()'s to
traverse the entire graph to avoid keeping a list of nir_src's to
fixup.)

At any rate, in either case I need to do my own custom traversal to
get at the nir_if's, regardless of whether I need to co-traverse the
original-foo's.  Leaving the co-traversal in place at least simplifies
things later if any other place needs to do a 2nd-pass ptr lookup, if
the need arises in the future.

Maybe I should just throw in some some assert(nfoo == clone_ptr(foo))
so we have some use for the co-traversal until then ;-)

BR,
-R
___

Re: [Mesa-dev] [PATCH 3/5] glsl: Mark gl_ViewportIndex and gl_Layer varyings as flat.

2015-10-26 Thread Kenneth Graunke
On Monday, October 26, 2015 05:02:07 PM Ian Romanick wrote:
> On 10/26/2015 11:03 AM, Kenneth Graunke wrote:
> > Integer varyings need to be flat qualified - all others were already.
> > I think we just missed this.  Presumably some hardware passes this via
> > sideband and ignores attribute interpolation, so no one has noticed.
> > 
> > Signed-off-by: Kenneth Graunke 
> > Cc: Chris Forbes 
> > ---
> >  src/glsl/builtin_variables.cpp | 39 ++-
> >  1 file changed, 26 insertions(+), 13 deletions(-)
> > 
> > diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
> > index a6ad105..07aacd4 100644
> > --- a/src/glsl/builtin_variables.cpp
> > +++ b/src/glsl/builtin_variables.cpp
> > @@ -887,16 +887,22 @@ builtin_variable_generator::generate_uniforms()
> >  void
> >  builtin_variable_generator::generate_vs_special_vars()
> >  {
> > +   ir_variable *var;
> > +
> > if (state->is_version(130, 300))
> >add_system_value(SYSTEM_VALUE_VERTEX_ID, int_t, "gl_VertexID");
> > if (state->ARB_draw_instanced_enable)
> >add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, 
> > "gl_InstanceIDARB");
> > if (state->ARB_draw_instanced_enable || state->is_version(140, 300))
> >add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceID");
> > -   if (state->AMD_vertex_shader_layer_enable)
> > -  add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> > -   if (state->AMD_vertex_shader_viewport_index_enable)
> > -  add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> > +   if (state->AMD_vertex_shader_layer_enable) {
> > +  var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> > +  var->data.interpolation = INTERP_QUALIFIER_FLAT;
> > +   }
> > +   if (state->AMD_vertex_shader_viewport_index_enable) {
> > +  var = add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> > +  var->data.interpolation = INTERP_QUALIFIER_FLAT;
> > +   }
> > if (compatibility) {
> >add_input(VERT_ATTRIB_POS, vec4_t, "gl_Vertex");
> >add_input(VERT_ATTRIB_NORMAL, vec3_t, "gl_Normal");
> > @@ -954,11 +960,16 @@ 
> > builtin_variable_generator::generate_tes_special_vars()
> >  void
> >  builtin_variable_generator::generate_gs_special_vars()
> >  {
> > -   add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> > -   if (state->is_version(410, 0) || state->ARB_viewport_array_enable)
> > -  add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> > +   ir_variable *var;
> > +
> > +   var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");
> > +   var->data.interpolation = INTERP_QUALIFIER_FLAT;
> > +   if (state->is_version(410, 0) || state->ARB_viewport_array_enable) {
> > +  var = add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");
> > +  var->data.interpolation = INTERP_QUALIFIER_FLAT;
> > +   }
> > if (state->is_version(400, 0) || state->ARB_gpu_shader5_enable)
> > -  add_system_value(SYSTEM_VALUE_INVOCATION_ID, int_t, 
> > "gl_InvocationID");
> > +  var = add_system_value(SYSTEM_VALUE_INVOCATION_ID, int_t, 
> > "gl_InvocationID");
> 
> Shouldn't this set var->data.interpolation?  It looks like you only made
> half the change.

I actually meant to leave that alone, as I don't think it makes sense to
set interpolation modes on system values (they aren't varyings!).  Good
catch, though, I got a bit overzealous in my "var" adding :)

I've dropped that one line of diff locally.  Thanks!


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Set correct field for indirect align16 addrimm.

2015-10-26 Thread Matt Turner
On Mon, Oct 26, 2015 at 5:07 PM, Ian Romanick  wrote:
> On 10/26/2015 11:10 AM, Matt Turner wrote:
>> This has been wrong since the initial import of the i965 driver.
>
> That is awesome.  Other than using the assembly validator, is there any
> way to trigger this?  I assume you discovered this bug by inspection...

I don't think so. As far as I know we never use indirect addressing in
align16 mode. In fact, the disassembler doesn't even have support for
it. :)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] st/mesa: implement ARB_copy_image

2015-10-26 Thread Brian Paul
Looks good to me.  Yeah, it is kind of crazy.  I think softpipe/llvmpipe 
just need one of my previous patches, plus minor updates.  I'll take 
care of that.


Just one minor nit below.

Oh, and you could probably mention support for GL_ARB_copy_image in the 
release notes file and GL3.txt.


Reviewed-by: Brian Paul 


On 10/25/2015 11:25 AM, Marek Olšák wrote:

From: Marek Olšák 

I wonder if the craziness was worth it.
---
  src/mesa/Makefile.sources|   2 +
  src/mesa/state_tracker/st_cb_copyimage.c | 609 +++
  src/mesa/state_tracker/st_cb_copyimage.h |  33 ++
  src/mesa/state_tracker/st_cb_texture.c   |  51 ---
  src/mesa/state_tracker/st_context.c  |   2 +
  src/mesa/state_tracker/st_extensions.c   |   1 +
  6 files changed, 647 insertions(+), 51 deletions(-)
  create mode 100644 src/mesa/state_tracker/st_cb_copyimage.c
  create mode 100644 src/mesa/state_tracker/st_cb_copyimage.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 4bcaa62..de0e330 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -423,6 +423,8 @@ STATETRACKER_FILES = \
state_tracker/st_cb_clear.h \
state_tracker/st_cb_condrender.c \
state_tracker/st_cb_condrender.h \
+   state_tracker/st_cb_copyimage.c \
+   state_tracker/st_cb_copyimage.h \
state_tracker/st_cb_drawpixels.c \
state_tracker/st_cb_drawpixels.h \
state_tracker/st_cb_drawpixels_shader.c \
diff --git a/src/mesa/state_tracker/st_cb_copyimage.c 
b/src/mesa/state_tracker/st_cb_copyimage.c
new file mode 100644
index 000..1740aaf
--- /dev/null
+++ b/src/mesa/state_tracker/st_cb_copyimage.c
@@ -0,0 +1,609 @@
+/*
+ * Copyright 2015 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * on the rights to use, copy, modify, merge, publish, distribute, sub
+ * license, and/or sell copies of the Software, and to permit persons to whom
+ * the Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include "state_tracker/st_context.h"
+#include "state_tracker/st_cb_copyimage.h"
+#include "state_tracker/st_cb_fbo.h"
+#include "state_tracker/st_texture.h"
+
+#include "util/u_box.h"
+#include "util/u_format.h"
+#include "util/u_inlines.h"
+
+
+/**
+ * Return an equivalent canonical format without "X" channels.
+ *
+ * Copying between incompatible formats is easier when the format is
+ * canonicalized, meaning that it is in a standard form.
+ *
+ * The returned format has the same component sizes and swizzles as
+ * the source format, the type is changed to UINT or UNORM, depending on
+ * which one has the most swizzle combinations in their group.
+ *
+ * If it's not an array format, return a memcpy-equivalent array format.
+ *
+ * The key feature is that swizzled versions of formats of the same
+ * component size always return the same component type.
+ *
+ * X returns A.
+ * Luminance, intensity, alpha, depth, stencil, and 8-bit and 16-bit packed
+ * formats are not supported. (same as ARB_copy_image)
+ */
+static enum pipe_format
+get_canonical_format(enum pipe_format format)
+{
+   const struct util_format_description *desc =
+  util_format_description(format);
+
+   /* Packed formats. Return the equivalent array format. */
+   if (format == PIPE_FORMAT_R11G11B10_FLOAT ||
+   format == PIPE_FORMAT_R9G9B9E5_FLOAT)
+  return get_canonical_format(PIPE_FORMAT_R8G8B8A8_UINT);
+
+   if (desc->nr_channels == 4 &&
+   desc->channel[0].size == 10 &&
+   desc->channel[1].size == 10 &&
+   desc->channel[2].size == 10 &&
+   desc->channel[3].size == 2) {
+  if (desc->swizzle[0] == UTIL_FORMAT_SWIZZLE_X &&
+  desc->swizzle[1] == UTIL_FORMAT_SWIZZLE_Y &&
+  desc->swizzle[2] == UTIL_FORMAT_SWIZZLE_Z)
+ return get_canonical_format(PIPE_FORMAT_R8G8B8A8_UINT);
+
+  return PIPE_FORMAT_NONE;
+   }
+
+#define RETURN_FOR_SWIZZLE1(x, format) \
+   if (desc->swizzle[0] == UTIL_FORMAT_SWIZZLE_##x) \
+  return format
+
+#define RETURN_FOR_SWIZZLE2(x, y, format) \
+   if (desc->swizzle[0] == UTIL_FORMAT

Re: [Mesa-dev] [PATCH 2/3] gallium: add PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS

2015-10-26 Thread Brian Paul


Reviewed-by: Brian Paul 


On 10/25/2015 11:25 AM, Marek Olšák wrote:

From: Marek Olšák 

For ARB_copy_image.
---
  src/gallium/docs/source/screen.rst   | 4 +++-
  src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
  src/gallium/drivers/i915/i915_screen.c   | 1 +
  src/gallium/drivers/ilo/ilo_screen.c | 1 +
  src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
  src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
  src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
  src/gallium/drivers/r300/r300_screen.c   | 1 +
  src/gallium/drivers/r600/r600_pipe.c | 1 +
  src/gallium/drivers/radeonsi/si_pipe.c   | 1 +
  src/gallium/drivers/softpipe/sp_screen.c | 1 +
  src/gallium/drivers/svga/svga_screen.c   | 1 +
  src/gallium/drivers/vc4/vc4_screen.c | 1 +
  src/gallium/include/pipe/p_defines.h | 1 +
  15 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index 151afb2..91fdb43 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -278,7 +278,9 @@ The integer capabilities:
in the shader.
  * ``PIPE_CAP_SHAREABLE_SHADERS``: Whether shader CSOs can be used by any
pipe_context.
-
+* ``PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS``:
+  Whether copying between compressed and plain formats is supported where
+  a compressed block is copied to/from a plain pixel of the same size.


  .. _pipe_capf:
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index 50d140f..9f8c332 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -238,6 +238,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_TGSI_TXQS:
case PIPE_CAP_FORCE_PERSAMPLE_INTERP:
case PIPE_CAP_SHAREABLE_SHADERS:
+   case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
return 0;

case PIPE_CAP_MAX_VIEWPORTS:
diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index 5812af6..2d2fd37 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -252,6 +252,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
cap)
 case PIPE_CAP_TGSI_TXQS:
 case PIPE_CAP_FORCE_PERSAMPLE_INTERP:
 case PIPE_CAP_SHAREABLE_SHADERS:
+   case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
return 0;

 case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS:
diff --git a/src/gallium/drivers/ilo/ilo_screen.c 
b/src/gallium/drivers/ilo/ilo_screen.c
index e1a7dc5..888f7aa 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -474,6 +474,7 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap 
param)
 case PIPE_CAP_TGSI_TXQS:
 case PIPE_CAP_FORCE_PERSAMPLE_INTERP:
 case PIPE_CAP_SHAREABLE_SHADERS:
+   case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
return 0;

 case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index e2ed267..d1c50ae 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -299,6 +299,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
 case PIPE_CAP_TGSI_TXQS:
 case PIPE_CAP_FORCE_PERSAMPLE_INTERP:
 case PIPE_CAP_SHAREABLE_SHADERS:
+   case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
return 0;
 }
 /* should only get here on unhandled cases */
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c 
b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index 0330164..bdecb0a 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -172,6 +172,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
 case PIPE_CAP_TGSI_TXQS:
 case PIPE_CAP_FORCE_PERSAMPLE_INTERP:
 case PIPE_CAP_SHAREABLE_SHADERS:
+   case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
return 0;

 case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index ec51d00..2c8884e 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -217,6 +217,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
 case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
 case PIPE_CAP_FORCE_PERSAMPLE_INTERP:
 case PIPE_CAP_SHAREABLE_SHADERS:
+   case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
return 0;

 case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc

Re: [Mesa-dev] [PATCH 3/3] st/mesa: implement ARB_copy_image

2015-10-26 Thread Ilia Mirkin
On Sun, Oct 25, 2015 at 1:25 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> I wonder if the craziness was worth it.
> ---
>  src/mesa/Makefile.sources|   2 +
>  src/mesa/state_tracker/st_cb_copyimage.c | 609 
> +++
>  src/mesa/state_tracker/st_cb_copyimage.h |  33 ++
>  src/mesa/state_tracker/st_cb_texture.c   |  51 ---
>  src/mesa/state_tracker/st_context.c  |   2 +
>  src/mesa/state_tracker/st_extensions.c   |   1 +
>  6 files changed, 647 insertions(+), 51 deletions(-)
>  create mode 100644 src/mesa/state_tracker/st_cb_copyimage.c
>  create mode 100644 src/mesa/state_tracker/st_cb_copyimage.h
>
> diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
> index 4bcaa62..de0e330 100644
> --- a/src/mesa/Makefile.sources
> +++ b/src/mesa/Makefile.sources
> @@ -423,6 +423,8 @@ STATETRACKER_FILES = \
> state_tracker/st_cb_clear.h \
> state_tracker/st_cb_condrender.c \
> state_tracker/st_cb_condrender.h \
> +   state_tracker/st_cb_copyimage.c \
> +   state_tracker/st_cb_copyimage.h \
> state_tracker/st_cb_drawpixels.c \
> state_tracker/st_cb_drawpixels.h \
> state_tracker/st_cb_drawpixels_shader.c \
> diff --git a/src/mesa/state_tracker/st_cb_copyimage.c 
> b/src/mesa/state_tracker/st_cb_copyimage.c
> new file mode 100644
> index 000..1740aaf
> --- /dev/null
> +++ b/src/mesa/state_tracker/st_cb_copyimage.c
> @@ -0,0 +1,609 @@
> +/*
> + * Copyright 2015 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * on the rights to use, copy, modify, merge, publish, distribute, sub
> + * license, and/or sell copies of the Software, and to permit persons to whom
> + * the Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
> + * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
> + * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
> + * USE OR OTHER DEALINGS IN THE SOFTWARE.
> + *
> + */
> +
> +#include "state_tracker/st_context.h"
> +#include "state_tracker/st_cb_copyimage.h"
> +#include "state_tracker/st_cb_fbo.h"
> +#include "state_tracker/st_texture.h"
> +
> +#include "util/u_box.h"
> +#include "util/u_format.h"
> +#include "util/u_inlines.h"
> +
> +
> +/**
> + * Return an equivalent canonical format without "X" channels.
> + *
> + * Copying between incompatible formats is easier when the format is
> + * canonicalized, meaning that it is in a standard form.
> + *
> + * The returned format has the same component sizes and swizzles as
> + * the source format, the type is changed to UINT or UNORM, depending on
> + * which one has the most swizzle combinations in their group.
> + *
> + * If it's not an array format, return a memcpy-equivalent array format.
> + *
> + * The key feature is that swizzled versions of formats of the same
> + * component size always return the same component type.
> + *
> + * X returns A.
> + * Luminance, intensity, alpha, depth, stencil, and 8-bit and 16-bit packed
> + * formats are not supported. (same as ARB_copy_image)
> + */
> +static enum pipe_format
> +get_canonical_format(enum pipe_format format)
> +{
> +   const struct util_format_description *desc =
> +  util_format_description(format);
> +
> +   /* Packed formats. Return the equivalent array format. */
> +   if (format == PIPE_FORMAT_R11G11B10_FLOAT ||
> +   format == PIPE_FORMAT_R9G9B9E5_FLOAT)
> +  return get_canonical_format(PIPE_FORMAT_R8G8B8A8_UINT);
> +
> +   if (desc->nr_channels == 4 &&
> +   desc->channel[0].size == 10 &&
> +   desc->channel[1].size == 10 &&
> +   desc->channel[2].size == 10 &&
> +   desc->channel[3].size == 2) {
> +  if (desc->swizzle[0] == UTIL_FORMAT_SWIZZLE_X &&
> +  desc->swizzle[1] == UTIL_FORMAT_SWIZZLE_Y &&
> +  desc->swizzle[2] == UTIL_FORMAT_SWIZZLE_Z)
> + return get_canonical_format(PIPE_FORMAT_R8G8B8A8_UINT);
> +
> +  return PIPE_FORMAT_NONE;
> +   }
> +
> +#define RETURN_FOR_SWIZZLE1(x, format) \
> +   if (desc->swizzle[0] == UTIL_FORMAT_SWIZZLE_##x) \
> +  return format
> +
> +#define RETURN_FOR_SWIZZLE2(x, y, format) \
> +   if (desc->swizzle[0] == UTIL_FORMAT_SWIZZLE_##x && \
> +   desc->swizzle[1] == UTIL_FORMAT_SWIZZLE_##y) \
> +  re

[Mesa-dev] [Bug 92552] [softpipe] piglit egl-create-context-valid-flag-forward-compatible-gl regression

2015-10-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92552

--- Comment #10 from Matthew Waters  ---
(In reply to Ian Romanick from comment #9)
> Comment on attachment 119006 [details] [review]
> egl: distnguish between unsupported api vs capabilities for EGL_CONTEXT_FLAGS
> 
> Review of attachment 119006 [details] [review]:
> -
> 
> There are a few things wrong with this patch.  First, you can't check the
> version when you see EGL_CONTEXT_FLAGS in the attribs list because the
> application could specify the requested version after flags.  This should
> not generate an error:
> 
> EGL_CONTEXT_FLAGS, EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR,
> EGL_CONTEXT_CLIENT_VERSION, 3,
> 0

Right, will fix.

> Second, it is not correct to generate EGL_BAD_MATCH for
> EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR when
> EGL_EXT_create_context_robustness is not supported.  Think of it this way...
> if we deleted all support and knowledge of
> EGL_EXT_create_context_robustness, what error would be generated?  Default
> case in the switch statement says EGL_BAD_ATTRIBUTE.

Not sure what you're trying to say here.

Couple of interpretations:
1. Unknown bit in EGL_CONTEXT_FLAGS -> specs are pretty clear that
EGL_BAD_ATTRIBUTE should be returned
2. EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR is only defined for OpenGL contexts
(not ES) -> taken care of previously and will return EGL_BAD_ATTRIBUTE.
3. Removing support for EGL_EXT_create_context_robustness is not the same as
removing support for GL_ARB_robustness.  My question then is, do we have a way
we can check for GL_ARB_robustness from the egl code or is it always (not?)
supported?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev