Re: [Mesa-dev] [PATCH 0/9] Skip automatic execsize for instructions with a width of 4

2016-03-09 Thread Iago Toral
On Wed, 2016-03-09 at 09:32 +0200, Pohjolainen, Topi wrote:
> On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias Gonsálvez wrote:
> > Hello,
> > 
> > There is only one patch from this series that has been reviewed (patch
> > 1).
> > 
> > Our plans is to start sending patches for adding fp64 support to i965
> > driver in the coming weeks but they depend on these patches.
> > 
> > Can someone take a look at them? ;)
> > 
> > Sam
> > 
> > 
> > On Thu, 2015-12-17 at 14:44 +0100, Samuel Iglesias Gonsálvez wrote:
> > > Hello,
> > > 
> > > This patch series is a updated version of the one Iago sent last
> > > week [0] that includes patches for gen6 too, as suggested by Jason.
> > > 
> > > We checked the gen9 code paths that work with a horizontal width of 4
> > > and we think there won't be any regression on gen9... but we don't
> > > have any gen9 machine to run piglit with these patches. Can someone
> > > check it?
> > > 
> > > Please read the original cover letter [0] for more information.
> > > 
> > > Sam
> > > 
> > > [0] http://lists.freedesktop.org/archives/mesa-dev/2015-December/102746.html
> > > 
> > > Iago Toral Quiroga (5):
> > >   i965/eu: set correct execution size in brw_NOP
> > >   i965/fs: set execution size for SEND messages in
> > > generate_uniform_pull_constant_load_gen7
> 
> Then about the other change. I like it being explicitly set instead of just
> inheriting the size from the previous instruction.
> 
> @@ -1248,6 +1248,7 @@ 
> fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst,
>brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
>brw_set_default_mask_control(p, BRW_MASK_DISABLE);
>brw_inst *send = brw_next_insn(p, BRW_OPCODE_SEND);
> +  brw_inst_set_exec_size(devinfo, send, dst.width);
> 
> But I'm seeing other occurrences of BRW_OPCODE_SEND as well. For example,
> there are such instructions generated in generate_urb_read/write() which are
> not addressed. Don't we end up there with doubles as well needing the same
> treatment?

Probably those other SENDs always operate with a width of 8/16 (the
default) and don't need to be fixed for other cases because they don't
happen. Notice that we are only trying to fix the cases of width = 4
here, which are the conflicting ones, so if some paths never execute
instructions with a width of 4 we don't need to do anything about them.

What we have done to identify the cases that need fixing was adding this
assertion in brw_set_dest (brw_eu_emit.c):

   assert(dest.width != BRW_EXECUTE_4 ||
          brw_inst_exec_size(devinfo, inst) == dest.width);

That catches any instruction with a width of 4 (the ones that need
fixing) that does not have the correct execsize set. We ran this through
piglit and the dEQP functional tests for all the generations we mentioned
and fixed the cases that broke this assertion one by one.
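
As a standalone illustration of that check (a minimal sketch, not Mesa code:
the enum values and the helper name below are stand-ins assumed for the
example), the condition can be modelled like this:

```cpp
#include <cassert>

// Stand-in for the exec-size encodings. The names mirror Mesa's
// BRW_EXECUTE_* constants, but the values are assumed for illustration.
enum ExecSize {
   EXECUTE_1 = 0,
   EXECUTE_2 = 1,
   EXECUTE_4 = 2,
   EXECUTE_8 = 3,
   EXECUTE_16 = 4
};

// Hypothetical helper: returns true when an instruction would pass the
// assertion. Only width-4 destinations are constrained; any other width
// passes regardless of the exec size currently set on the instruction.
inline bool width4_exec_size_ok(ExecSize dest_width, ExecSize inst_exec_size)
{
   return dest_width != EXECUTE_4 || inst_exec_size == dest_width;
}

// Small self-check covering the three interesting cases.
inline void check_width4_examples()
{
   assert(width4_exec_size_ok(EXECUTE_8, EXECUTE_16));  // not width 4: ignored
   assert(width4_exec_size_ok(EXECUTE_4, EXECUTE_4));   // width 4, already set
   assert(!width4_exec_size_ok(EXECUTE_4, EXECUTE_8));  // width 4, still default
}
```

The last case is the one the assertion is meant to catch: a width-4
destination whose instruction still carries the default exec size.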

Iago

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/9] Skip automatic execsize for instructions with a width of 4

2016-03-09 Thread Iago Toral
On Wed, 2016-03-09 at 09:26 +0200, Pohjolainen, Topi wrote:
> On Wed, Mar 09, 2016 at 09:07:44AM +0200, Pohjolainen, Topi wrote:
> > On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias Gonsálvez wrote:
> > > Hello,
> > > 
> > > There is only one patch from this series that has been reviewed (patch
> > > 1).
> > > 
> > > Our plans is to start sending patches for adding fp64 support to i965
> > > driver in the coming weeks but they depend on these patches.
> > > 
> > > Can someone take a look at them? ;)
> > > 
> > > Sam
> > > 
> > > 
> > > On Thu, 2015-12-17 at 14:44 +0100, Samuel Iglesias Gonsálvez wrote:
> > > > Hello,
> > > > 
> > > > This patch series is a updated version of the one Iago sent last
> > > > week [0] that includes patches for gen6 too, as suggested by Jason.
> > > > 
> > > > We checked the gen9 code paths that work with a horizontal width of 4
> > > > and we think there won't be any regression on gen9... but we don't
> > > > have any gen9 machine to run piglit with these patches. Can someone
> > > > check it?
> > > > 
> > > > Please read the original cover letter [0] for more information.
> > > > 
> > > > Sam
> > > > 
> > > > [0] http://lists.freedesktop.org/archives/mesa-dev/2015-December/102746.html
> > > > 
> > > > Iago Toral Quiroga (5):
> > > >   i965/eu: set correct execution size in brw_NOP
> > > >   i965/fs: set execution size for SEND messages in
> > > > generate_uniform_pull_constant_load_gen7
> > 
> > I don't have the series in my mailbox anymore, so I'll comment here.
> > There is:
> > 
> >brw_set_dest(p, send, dst);
> > @@ -1279,6 +1280,7 @@ 
> > fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst,
> >/* dst = send(payload, a0.0 | ) */
> >brw_inst *insn = brw_send_indirect_message(
> >   p, BRW_SFID_SAMPLER, dst, src, addr);
> > +  brw_inst_set_exec_size(devinfo, insn, dst.width);
> > 
> > I wonder if we should modify brw_send_indirect_message() instead? It already
> > calls brw_inst_set_exec_size() itself:
> > 
> >if (dst.width < BRW_EXECUTE_8)
> >   brw_inst_set_exec_size(devinfo, send, dst.width);
> 
> Actually you set this yourself in the next patch of the series. Is the
> previous in the caller side really needed after this?

Good catch! I'll give it a quick test through piglit with the assertion
I mentioned in my previous reply to see if we can drop this hunk. That's
the only way to be certain :)

Iago





Re: [Mesa-dev] [PATCH 0/9] Skip automatic execsize for instructions with a width of 4

2016-03-09 Thread Iago Toral
On Wed, 2016-03-09 at 09:54 +0200, Pohjolainen, Topi wrote:
> On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias Gonsálvez wrote:
> > Hello,
> > 
> > There is only one patch from this series that has been reviewed (patch
> > 1).
> > 
> > Our plans is to start sending patches for adding fp64 support to i965
> > driver in the coming weeks but they depend on these patches.
> > 
> > Can someone take a look at them? ;)
> > 
> > Sam
> > 
> > 
> > On Thu, 2015-12-17 at 14:44 +0100, Samuel Iglesias Gonsálvez wrote:
> > > Hello,
> > > 
> > > This patch series is a updated version of the one Iago sent last
> > > week [0] that includes patches for gen6 too, as suggested by Jason.
> > > 
> > > We checked the gen9 code paths that work with a horizontal width of 4
> > > and we think there won't be any regression on gen9... but we don't
> > > have any gen9 machine to run piglit with these patches. Can someone
> > > check it?
> 
> I rebased it and ran it through the test system, gen9 seems to be fine, I
> only got one regression, and that was on old g965:

Awesome! Would it be possible to run that test on g965 with the attached
change? If this is a regression caused by our code it should break at
the assert introduced with it.

> /tmp/build_root/m64/lib/piglit/bin/ext_framebuffer_multisample-accuracy 
> all_samples srgb depthstencil -auto -fbo
> Pixels that should be unlit
>   count = 236444
>   RMS error = 0.025355
> Pixels that should be totally lit
>   count = 13308
>   Perfect output
> The error threshold for unlit and totally lit pixels test is 0.016650
> Pixels that should be partially lit
>   count = 12392
>   RMS error = 0.273876
> The error threshold for partially lit pixels is 0.333000
> Samples = 0, Result = fail
> 
> 
> But I'm not sure if this is caused by your patches.

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 6f11f59..625447f 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -203,6 +203,7 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst, struct brw_reg dest)
     * or 16 (SIMD16), as that's normally correct.  However, when dealing with
     * small registers, we automatically reduce it to match the register size.
     */
+   assert(dest.width != BRW_EXECUTE_4 || brw_inst_exec_size(devinfo, inst) == dest.width);
    if (dest.width < BRW_EXECUTE_8)
       brw_inst_set_exec_size(devinfo, inst, dest.width);
 }


Re: [Mesa-dev] [PATCH] nv50/ir: Check for valid insn instead of defs size

2016-03-09 Thread Pierre Moreau
On 06:10 PM - Mar 08 2016, Ilia Mirkin wrote:
> Patch is fine, description is wrong (or at least inaccurate).
> 
> The real issue is that function arguments have defs, but no defining
> instruction. As a result, there's nothing to do when allocating
> registers. This has nothing to do with $r0, but it does have something
> to do with the fact that nv50 compute makes use of function arguments
> for compute programs.

That's what I meant to write, but reading it again, it is confusing. I'll
rewrite it.
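
To illustrate the distinction being discussed, here is a minimal sketch. The
types below are simplified stand-ins written for this example, not the real
nv50_ir classes, and `make_function_argument` is a hypothetical helper. It
models a value that has a definition but no defining instruction:

```cpp
#include <cassert>
#include <vector>

// Simplified stand-ins (assumed for illustration only).
struct Instruction { int op; };

// A definition slot; insn stays null when the value is a function argument.
struct ValueDef { Instruction *insn; };

struct LValue {
   std::vector<ValueDef> defs;

   // Returns the defining instruction, or nullptr if there is none.
   Instruction *getInsn() const
   {
      return defs.empty() ? nullptr : defs.front().insn;
   }
};

// The check before the patch: "has at least one definition".
inline bool old_check(const LValue &v) { return v.defs.size() > 0; }

// The check after the patch: "has a defining instruction".
inline bool new_check(const LValue &v) { return v.getInsn() != nullptr; }

// Hypothetical helper: builds a value the way a function argument looks,
// with one definition that is not attached to any instruction.
inline LValue make_function_argument()
{
   LValue v;
   v.defs.push_back(ValueDef{nullptr});
   return v;
}
```

For such a value the old check passes while the new one does not, which is
the case where dereferencing the result of getInsn() would go wrong.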

Pierre

> 
>   -ilia
> 
> On Wed, Feb 24, 2016 at 8:03 PM, Pierre Moreau  wrote:
> > On Tesla cards, the first register $r0 contains the thread id; later
> > generations use a specialised register for it. In order to prevent the
> > register from being given to anyone, thus losing the thread id
> > information, an lvalue is created to represent $r0 and is passed as an
> > argument to the `main` function.
> >
> > However, since the inputs and outputs of a function are stored as value
> > definitions, a definition is added onto the previously created lvalue
> > without it being associated with an instruction. Therefore, checking the
> > number of definitions of an lvalue does not ensure that it is associated
> > with an instruction.
> >
> > Fixes a nullptr dereference in the register allocation pass, while running
> > compute kernels that do not use $r0.
> >
> > Signed-off-by: Pierre Moreau 
> > ---
> >  src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
> > b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> > index d877c25..500ab89 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> > @@ -853,7 +853,7 @@ isShortRegOp(Instruction *insn)
> >  static bool
> >  isShortRegVal(LValue *lval)
> >  {
> > -   if (lval->defs.size() == 0)
> > +   if (lval->getInsn() == NULL)
> >return false;
> > for (Value::DefCIterator def = lval->defs.begin();
> >  def != lval->defs.end(); ++def)
> > @@ -1467,7 +1467,7 @@ GCRA::allocateRegisters(ArrayList& insns)
> >   nodes[i].init(regs, lval);
> >   RIG.insert(&nodes[i]);
> >
> > - if (lval->inFile(FILE_GPR) && lval->defs.size() > 0 &&
> > + if (lval->inFile(FILE_GPR) && lval->getInsn() != NULL &&
> >   prog->getTarget()->getChipset() < 0xc0) {
> >  Instruction *insn = lval->getInsn();
> >  if (insn->op == OP_MAD || insn->op == OP_SAD)
> > --
> > 2.7.1
> >




Re: [Mesa-dev] [PATCH 0/9] Skip automatic execsize for instructions with a width of 4

2016-03-09 Thread Pohjolainen, Topi
On Wed, Mar 09, 2016 at 09:36:42AM +0100, Iago Toral wrote:
> On Wed, 2016-03-09 at 09:54 +0200, Pohjolainen, Topi wrote:
> > On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias Gonsálvez wrote:
> > > Hello,
> > > 
> > > There is only one patch from this series that has been reviewed (patch
> > > 1).
> > > 
> > > Our plans is to start sending patches for adding fp64 support to i965
> > > driver in the coming weeks but they depend on these patches.
> > > 
> > > Can someone take a look at them? ;)
> > > 
> > > Sam
> > > 
> > > 
> > > On Thu, 2015-12-17 at 14:44 +0100, Samuel Iglesias Gonsálvez wrote:
> > > > Hello,
> > > > 
> > > > This patch series is a updated version of the one Iago sent last
> > > > week [0] that includes patches for gen6 too, as suggested by Jason.
> > > > 
> > > > We checked the gen9 code paths that work with a horizontal width of 4
> > > > and we think there won't be any regression on gen9... but we don't
> > > > have any gen9 machine to run piglit with these patches. Can someone
> > > > check it?
> > 
> > I rebased it and ran it through the test system, gen9 seems to be fine, I
> > only got one regression, and that was on old g965:
> 
> Awesome! Would it be possible to run that test on g965 with the attached
> change? If this is a regression caused by our code it should break at
> the assert introduced with it.
> 
> > /tmp/build_root/m64/lib/piglit/bin/ext_framebuffer_multisample-accuracy 
> > all_samples srgb depthstencil -auto -fbo
> > Pixels that should be unlit
> >   count = 236444
> >   RMS error = 0.025355
> > Pixels that should be totally lit
> >   count = 13308
> >   Perfect output
> > The error threshold for unlit and totally lit pixels test is 0.016650
> > Pixels that should be partially lit
> >   count = 12392
> >   RMS error = 0.273876
> > The error threshold for partially lit pixels is 0.333000
> > Samples = 0, Result = fail
> > 
> > 
> > But I'm not sure if this is caused by your patches.
> 

> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index 6f11f59..625447f 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -203,6 +203,7 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst, 
> struct brw_reg dest)
>  * or 16 (SIMD16), as that's normally correct.  However, when dealing with
>  * small registers, we automatically reduce it to match the register size.
>  */
> +   assert(dest.width != BRW_EXECUTE_4 || brw_inst_exec_size(devinfo, inst) 
> == dest.width);
> if (dest.width < BRW_EXECUTE_8)
>brw_inst_set_exec_size(devinfo, inst, dest.width);
>  }

Hmm, on top of your series this looks like:

   /* Generators should set a default exec_size of either 8 (SIMD4x2 or SIMD8)
    * or 16 (SIMD16), as that's normally correct.  However, when dealing with
    * small registers, we automatically reduce it to match the register size.
    *
    * In platforms that support fp64 we can emit instructions with a width of
    * 4 that need two SIMD8 registers and an exec_size of 8 or 16. In these
    * cases we need to make sure that these instructions have their exec sizes
    * set properly when they are emitted and we can't rely on this code to fix
    * it.
    */
   bool fix_exec_size;
   if (devinfo->gen >= 6)
      fix_exec_size = dest.width < BRW_EXECUTE_4;
   else
      fix_exec_size = dest.width < BRW_EXECUTE_8;

   if (fix_exec_size)
      brw_inst_set_exec_size(devinfo, inst, dest.width);

Do you want the assertion before or after fixing?
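
To make the effect of that condition concrete, here is a minimal standalone
sketch. It is not Mesa code: the enum values and the helper name are
assumptions made for the example, modelled on the snippet quoted above.

```cpp
#include <cassert>

// Stand-in for the exec-size encodings (values assumed for illustration).
enum ExecSize {
   EXECUTE_1 = 0,
   EXECUTE_2 = 1,
   EXECUTE_4 = 2,
   EXECUTE_8 = 3,
   EXECUTE_16 = 4
};

// Hypothetical helper mirroring the fix_exec_size decision: returns true
// when the destination width would still be copied into the exec size
// automatically for the given hardware generation.
inline bool should_fix_exec_size(int gen, ExecSize dest_width)
{
   if (gen >= 6)
      return dest_width < EXECUTE_4;
   return dest_width < EXECUTE_8;
}

// Self-check of the cases that differ between generations.
inline void check_fix_exec_size_examples()
{
   assert(!should_fix_exec_size(7, EXECUTE_4));  // gen6+: width 4 left alone
   assert(should_fix_exec_size(5, EXECUTE_4));   // older gens: still shrunk
   assert(should_fix_exec_size(7, EXECUTE_2));   // narrower widths: shrunk
   assert(!should_fix_exec_size(5, EXECUTE_8));  // width 8: never touched
}
```

Under these assumptions, a width-4 destination on gen6 and later is the only
case that is no longer adjusted automatically, so the emitting code has to
set the exec size itself there.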


[Mesa-dev] [PATCH] nouveau: Fix clang reserved-user-defined-literal error.

2016-03-09 Thread Vinson Lee
  CXX  codegen/nv50_ir.lo
In file included from codegen/nv50_ir.cpp:28:
./nouveau_debug.h:19:30: error: invalid suffix on literal; C++11 requires a space between literal and identifier [-Wreserved-user-defined-literal]
   fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
                             ^

Signed-off-by: Vinson Lee 
---
 src/gallium/drivers/nouveau/nouveau_debug.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_debug.h 
b/src/gallium/drivers/nouveau/nouveau_debug.h
index d17df81..546a4ad 100644
--- a/src/gallium/drivers/nouveau/nouveau_debug.h
+++ b/src/gallium/drivers/nouveau/nouveau_debug.h
@@ -16,7 +16,7 @@
 #define NOUVEAU_DEBUG 0
 
 #define NOUVEAU_ERR(fmt, args...) \
-   fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
+   fprintf(stderr, "%s:%d - " fmt, __FUNCTION__, __LINE__, ##args)
 
 #define NOUVEAU_DBG(ch, args...)   \
if ((NOUVEAU_DEBUG) & (NOUVEAU_DEBUG_##ch))\
-- 
2.7.0
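
For readers unfamiliar with the diagnostic, here is a minimal sketch of the
pattern. The macro below is hypothetical, modelled on NOUVEAU_ERR, and it
writes into a buffer with fixed function name and line values so the result
is deterministic. In C++11 an identifier glued directly to a closing quote
is lexed as a user-defined literal suffix, so the space before `fmt` is what
lets the adjacent string literals concatenate as intended:

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical macro for illustration. Note the space between the string
// literal and `fmt`: writing "%s:%d - "fmt would make the lexer treat `fmt`
// as a literal suffix in C++11 mode instead of expanding the macro argument.
// "func" and 42 stand in for __FUNCTION__ and __LINE__.
#define FORMAT_ERR(buf, fmt, ...) \
   std::snprintf(buf, sizeof(buf), "%s:%d - " fmt, "func", 42, __VA_ARGS__)

// Formats one message and checks that the prefix and the caller's format
// string were concatenated into a single format.
inline bool prefix_concatenation_works()
{
   char buf[64];
   FORMAT_ERR(buf, "value=%d", 7);
   return std::strcmp(buf, "func:42 - value=7") == 0;
}
```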



[Mesa-dev] [PATCH] r600/sb: Do not distribute neg in expr_handler::fold_assoc() when folding multiplications.

2016-03-09 Thread Xavier B
From: xavier 

Previously it was doing this transformation for a Trine 3 shader:
 MUL R6.x.12,R13.x.23, 0.5|3f00
-MULADD R4.x.12,-R6.x.12, 2|4000, 1|3f80
+MULADD R4.x.12,-R13.x.23, -1|bf80, 1|3f80

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94412
Signed-off-by: Xavier Bouchoux 
---
 src/gallium/drivers/r600/sb/sb_expr.cpp | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/sb/sb_expr.cpp 
b/src/gallium/drivers/r600/sb/sb_expr.cpp
index 556a05d..3dd3a48 100644
--- a/src/gallium/drivers/r600/sb/sb_expr.cpp
+++ b/src/gallium/drivers/r600/sb/sb_expr.cpp
@@ -598,9 +598,13 @@ bool expr_handler::fold_assoc(alu_node *n) {
 
unsigned op = n->bc.op;
bool allow_neg = false, cur_neg = false;
+   bool distribute_neg = false;
 
switch(op) {
case ALU_OP2_ADD:
+   distribute_neg = true;
+   allow_neg = true;
+   break;
case ALU_OP2_MUL:
case ALU_OP2_MUL_IEEE:
allow_neg = true;
@@ -632,7 +636,7 @@ bool expr_handler::fold_assoc(alu_node *n) {
if (v1->is_const()) {
literal arg = v1->get_const_value();
apply_alu_src_mod(a->bc, 1, arg);
-   if (cur_neg)
+   if (cur_neg && distribute_neg)
arg.f = -arg.f;
 
if (a == n)
@@ -660,7 +664,7 @@ bool expr_handler::fold_assoc(alu_node *n) {
if (v0->is_const()) {
literal arg = v0->get_const_value();
apply_alu_src_mod(a->bc, 0, arg);
-   if (cur_neg)
+   if (cur_neg && distribute_neg)
arg.f = -arg.f;
 
if (last_arg == 0) {
-- 
2.7.2
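
A small worked example of the arithmetic behind the fix, as a hedged sketch:
the function names are invented for illustration and plain floats stand in
for the shader registers. It evaluates the original MUL/MULADD pair from the
commit message, the incorrect fold that also negates the constant, and a
fold that keeps the negation on the source only:

```cpp
#include <cassert>

// Original sequence: t = x * 0.5; r = (-t) * 2 + 1.
inline float original_sequence(float x)
{
   float t = x * 0.5f;
   return (-t) * 2.0f + 1.0f;
}

// Incorrect fold: the negation stays on the source AND is applied to the
// folded constant, so it is counted twice: (-x) * (-(0.5 * 2)) + 1.
inline float fold_with_distributed_neg(float x)
{
   return (-x) * (-(0.5f * 2.0f)) + 1.0f;
}

// Fold that keeps the negation on the source only: (-x) * (0.5 * 2) + 1.
inline float fold_keeping_neg_on_source(float x)
{
   return (-x) * (0.5f * 2.0f) + 1.0f;
}

// Self-check with x = 3: the original gives -2, the bad fold gives 4.
inline void check_fold_examples()
{
   assert(original_sequence(3.0f) == -2.0f);
   assert(fold_keeping_neg_on_source(3.0f) == original_sequence(3.0f));
   assert(fold_with_distributed_neg(3.0f) == 4.0f);
}
```

For additions the situation is different, since -(a + c1) + c2 equals
-a + (c2 - c1): there the sign does have to be carried onto the constant,
which matches the patch enabling distribution only for ALU_OP2_ADD.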



Re: [Mesa-dev] [PATCH 0/9] Skip automatic execsize for instructions with a width of 4

2016-03-09 Thread Iago Toral
On Wed, 2016-03-09 at 10:53 +0200, Pohjolainen, Topi wrote:
> On Wed, Mar 09, 2016 at 09:36:42AM +0100, Iago Toral wrote:
> > On Wed, 2016-03-09 at 09:54 +0200, Pohjolainen, Topi wrote:
> > > On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias Gonsálvez wrote:
> > > > Hello,
> > > > 
> > > > There is only one patch from this series that has been reviewed (patch
> > > > 1).
> > > > 
> > > > Our plans is to start sending patches for adding fp64 support to i965
> > > > driver in the coming weeks but they depend on these patches.
> > > > 
> > > > Can someone take a look at them? ;)
> > > > 
> > > > Sam
> > > > 
> > > > 
> > > > On Thu, 2015-12-17 at 14:44 +0100, Samuel Iglesias Gonsálvez wrote:
> > > > > Hello,
> > > > > 
> > > > > This patch series is a updated version of the one Iago sent last
> > > > > week [0] that includes patches for gen6 too, as suggested by Jason.
> > > > > 
> > > > > We checked the gen9 code paths that work with a horizontal width of 4
> > > > > and we think there won't be any regression on gen9... but we don't
> > > > > have any gen9 machine to run piglit with these patches. Can someone
> > > > > check it?
> > > 
> > > I rebased it and ran it through the test system, gen9 seems to be fine, I
> > > only got one regression, and that was on old g965:
> > 
> > Awesome! Would it be possible to run that test on g965 with the attached
> > change? If this is a regression caused by our code it should break at
> > the assert introduced with it.
> > 
> > > /tmp/build_root/m64/lib/piglit/bin/ext_framebuffer_multisample-accuracy 
> > > all_samples srgb depthstencil -auto -fbo
> > > Pixels that should be unlit
> > >   count = 236444
> > >   RMS error = 0.025355
> > > Pixels that should be totally lit
> > >   count = 13308
> > >   Perfect output
> > > The error threshold for unlit and totally lit pixels test is 0.016650
> > > Pixels that should be partially lit
> > >   count = 12392
> > >   RMS error = 0.273876
> > > The error threshold for partially lit pixels is 0.333000
> > > Samples = 0, Result = fail
> > > 
> > > 
> > > But I'm not sure if this is caused by your patches.
> > 
> 
> > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> > b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > index 6f11f59..625447f 100644
> > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > @@ -203,6 +203,7 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst, 
> > struct brw_reg dest)
> >  * or 16 (SIMD16), as that's normally correct.  However, when dealing 
> > with
> >  * small registers, we automatically reduce it to match the register 
> > size.
> >  */
> > +   assert(dest.width != BRW_EXECUTE_4 || brw_inst_exec_size(devinfo, inst) 
> > == dest.width);
> > if (dest.width < BRW_EXECUTE_8)
> >brw_inst_set_exec_size(devinfo, inst, dest.width);
> >  }
> 
> Hmm, on top of your series this looks:
> 
>/* Generators should set a default exec_size of either 8 (SIMD4x2 or SIMD8)
> * or 16 (SIMD16), as that's normally correct.  However, when dealing with
> * small registers, we automatically reduce it to match the register size.
> *
> * In platforms that support fp64 we can emit instructions with a width of
> * 4 that need two SIMD8 registers and an exec_size of 8 or 16. In these
> * cases we need to make sure that these instructions have their exec sizes
> * set properly when they are emitted and we can't rely on this code to fix
> * it.
> */
>bool fix_exec_size;
>if (devinfo->gen >= 6)
>   fix_exec_size = dest.width < BRW_EXECUTE_4;
>else
>   fix_exec_size = dest.width < BRW_EXECUTE_8;
> 
>if (fix_exec_size)
>   brw_inst_set_exec_size(devinfo, inst, dest.width);
> 
> Do you want the assertion before or after fixing?
> 

Before, you can put it right after that comment. Thanks!

Iago



Re: [Mesa-dev] [PATCH] i965: Remove useless IR self-destruct backend_shader method.

2016-03-09 Thread Kenneth Graunke
On Tuesday, March 8, 2016 5:35:30 PM PST Francisco Jerez wrote:
> From the point it's constructed the CFG contains the only existing
> copy of the program IR, and it never becomes invalid.  Calling
> backend_shader::invalidate_cfg would have destroyed the program
> structure irrecoverably -- We weren't calling it at all for a good
> reason.
> ---
>  src/mesa/drivers/dri/i965/brw_shader.cpp | 7 ---
>  src/mesa/drivers/dri/i965/brw_shader.h   | 1 -
>  2 files changed, 8 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/
dri/i965/brw_shader.cpp
> index dfe6afc..21977a2 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -1046,13 +1046,6 @@ backend_shader::calculate_cfg()
> cfg = new(mem_ctx) cfg_t(&this->instructions);
>  }
>  
> -void
> -backend_shader::invalidate_cfg()
> -{
> -   ralloc_free(this->cfg);
> -   this->cfg = NULL;
> -}
> -
>  /**
>   * Sets up the starting offsets for the groups of binding table entries
>   * commong to all pipeline stages.
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.h b/src/mesa/drivers/dri/
i965/brw_shader.h
> index 82374a4..15bed78 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.h
> +++ b/src/mesa/drivers/dri/i965/brw_shader.h
> @@ -217,7 +217,6 @@ public:
> virtual void dump_instructions(const char *name);
>  
> void calculate_cfg();
> -   void invalidate_cfg();
>  
> virtual void invalidate_live_intervals() = 0;
>  };
> 

Yep.  This was useful back in the old days, when we stored the program
as a flat list of instructions, and kept the CFG as a data structure off
to the side.  We'd have to invalidate it and recompute it.

But then Matt made everything preserve the CFG, and made it essential.
So invalidating it is indeed just going to blow things up :)

Reviewed-by: Kenneth Graunke 




Re: [Mesa-dev] [PATCH 0/9] Skip automatic execsize for instructions with a width of 4

2016-03-09 Thread Pohjolainen, Topi
On Wed, Mar 09, 2016 at 10:03:08AM +0100, Iago Toral wrote:
> On Wed, 2016-03-09 at 10:53 +0200, Pohjolainen, Topi wrote:
> > On Wed, Mar 09, 2016 at 09:36:42AM +0100, Iago Toral wrote:
> > > On Wed, 2016-03-09 at 09:54 +0200, Pohjolainen, Topi wrote:
> > > > On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias Gonsálvez wrote:
> > > > > Hello,
> > > > > 
> > > > > There is only one patch from this series that has been reviewed (patch
> > > > > 1).
> > > > > 
> > > > > Our plans is to start sending patches for adding fp64 support to i965
> > > > > driver in the coming weeks but they depend on these patches.
> > > > > 
> > > > > Can someone take a look at them? ;)
> > > > > 
> > > > > Sam
> > > > > 
> > > > > 
> > > > > On Thu, 2015-12-17 at 14:44 +0100, Samuel Iglesias Gonsálvez wrote:
> > > > > > Hello,
> > > > > > 
> > > > > > This patch series is a updated version of the one Iago sent last
> > > > > > week [0] that includes patches for gen6 too, as suggested by Jason.
> > > > > > 
> > > > > > We checked the gen9 code paths that work with a horizontal width of 
> > > > > > 4
> > > > > > and we think there won't be any regression on gen9... but we don't
> > > > > > have any gen9 machine to run piglit with these patches. Can someone
> > > > > > check it?
> > > > 
> > > > I rebased it and ran it through the test system, gen9 seems to be fine, 
> > > > I
> > > > only got one regression, and that was on old g965:
> > > 
> > > Awesome! Would it be possible to run that test on g965 with the attached
> > > change? If this is a regression caused by our code it should break at
> > > the assert introduced with it.
> > > 
> > > > /tmp/build_root/m64/lib/piglit/bin/ext_framebuffer_multisample-accuracy 
> > > > all_samples srgb depthstencil -auto -fbo
> > > > Pixels that should be unlit
> > > >   count = 236444
> > > >   RMS error = 0.025355
> > > > Pixels that should be totally lit
> > > >   count = 13308
> > > >   Perfect output
> > > > The error threshold for unlit and totally lit pixels test is 0.016650
> > > > Pixels that should be partially lit
> > > >   count = 12392
> > > >   RMS error = 0.273876
> > > > The error threshold for partially lit pixels is 0.333000
> > > > Samples = 0, Result = fail
> > > > 
> > > > 
> > > > But I'm not sure if this is caused by your patches.
> > > 
> > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> > > b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > index 6f11f59..625447f 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > @@ -203,6 +203,7 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst, 
> > > struct brw_reg dest)
> > >  * or 16 (SIMD16), as that's normally correct.  However, when dealing 
> > > with
> > >  * small registers, we automatically reduce it to match the register 
> > > size.
> > >  */
> > > +   assert(dest.width != BRW_EXECUTE_4 || brw_inst_exec_size(devinfo, 
> > > inst) == dest.width);
> > > if (dest.width < BRW_EXECUTE_8)
> > >brw_inst_set_exec_size(devinfo, inst, dest.width);
> > >  }
> > 
> > Hmm, on top of your series this looks:
> > 
> >/* Generators should set a default exec_size of either 8 (SIMD4x2 or 
> > SIMD8)
> > * or 16 (SIMD16), as that's normally correct.  However, when dealing 
> > with
> > * small registers, we automatically reduce it to match the register 
> > size.
> > *
> > * In platforms that support fp64 we can emit instructions with a width 
> > of
> > * 4 that need two SIMD8 registers and an exec_size of 8 or 16. In these
> > * cases we need to make sure that these instructions have their exec 
> > sizes
> > * set properly when they are emitted and we can't rely on this code to 
> > fix
> > * it.
> > */
> >bool fix_exec_size;
> >if (devinfo->gen >= 6)
> >   fix_exec_size = dest.width < BRW_EXECUTE_4;
> >else
> >   fix_exec_size = dest.width < BRW_EXECUTE_8;
> > 
> >if (fix_exec_size)
> >   brw_inst_set_exec_size(devinfo, inst, dest.width);
> > 
> > Do you want the assertion before or after fixing?
> > 
> 
> Before, you can put it right after that comment. Thanks!

That is what I thought. Hold on, I'll give it a spin.


Re: [Mesa-dev] [PATCH 0/9] Skip automatic execsize for instructions with a width of 4

2016-03-09 Thread Iago Toral
On Wed, 2016-03-09 at 09:24 +0100, Iago Toral wrote:
> On Wed, 2016-03-09 at 09:26 +0200, Pohjolainen, Topi wrote:
> > On Wed, Mar 09, 2016 at 09:07:44AM +0200, Pohjolainen, Topi wrote:
> > > On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias Gonsálvez wrote:
> > > > Hello,
> > > > 
> > > > There is only one patch from this series that has been reviewed (patch
> > > > 1).
> > > > 
> > > > Our plans is to start sending patches for adding fp64 support to i965
> > > > driver in the coming weeks but they depend on these patches.
> > > > 
> > > > Can someone take a look at them? ;)
> > > > 
> > > > Sam
> > > > 
> > > > 
> > > > On Thu, 2015-12-17 at 14:44 +0100, Samuel Iglesias Gonsálvez wrote:
> > > > > Hello,
> > > > > 
> > > > > This patch series is a updated version of the one Iago sent last
> > > > > week [0] that includes patches for gen6 too, as suggested by Jason.
> > > > > 
> > > > > We checked the gen9 code paths that work with a horizontal width of 4
> > > > > and we think there won't be any regression on gen9... but we don't
> > > > > have any gen9 machine to run piglit with these patches. Can someone
> > > > > check it?
> > > > > 
> > > > > Please read the original cover letter [0] for more information.
> > > > > 
> > > > > Sam
> > > > > 
> > > > > [0] http://lists.freedesktop.org/archives/mesa-dev/2015-December/102746.html
> > > > > 
> > > > > Iago Toral Quiroga (5):
> > > > >   i965/eu: set correct execution size in brw_NOP
> > > > >   i965/fs: set execution size for SEND messages in
> > > > > generate_uniform_pull_constant_load_gen7
> > > 
> > > I don't have the series in my mailbox anymore, so I'll comment here. 
> > > There is:
> > > 
> > >brw_set_dest(p, send, dst);
> > > @@ -1279,6 +1280,7 @@ 
> > > fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst,
> > >/* dst = send(payload, a0.0 | ) */
> > >brw_inst *insn = brw_send_indirect_message(
> > >   p, BRW_SFID_SAMPLER, dst, src, addr);
> > > +  brw_inst_set_exec_size(devinfo, insn, dst.width);
> > > 
> > > I wonder if we should modify brw_send_indirect_message() instead? It 
> > > already
> > > calls brw_inst_set_exec_size() itself:
> > > 
> > >if (dst.width < BRW_EXECUTE_8)
> > >   brw_inst_set_exec_size(devinfo, send, dst.width);
> > 
> > Actually you set this yourself in the next patch of the series. Is the
> > previous in the caller side really needed after this?
> 
> Good catch! I'll give it a quick test through piglit with the assertion
> I mentioned in my previous reply to see if we can drop this hunk, that's
> the only way to be certain :)

Piglit seems to be happy dropping this hunk on IVB, so I think it is a
safe change.

Iago

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: dont allow undefined array sizes in ES

2016-03-09 Thread Kenneth Graunke
On Tuesday, March 8, 2016 8:35:41 PM PST Timothy Arceri wrote:
> This applies the rule to empty declarations.
> 
> Fixes:
> dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_vertex
> dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_fragment
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
> index d755a11..8918981 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -4223,6 +4223,17 @@ ast_declarator_list::hir(exec_list *instructions,
>type_name);
>} else {
>   if (decl_type->base_type == GLSL_TYPE_ARRAY) {
> +/* From Section 13.22 (Array Declarations) of the GLSL ES 3.2
> + * spec:
> + *
> + *"... any declaration that leaves the size undefined is
> + *disallowed as this would add complexity and there are no
> + *use-cases."
> + */
> +if (state->es_shader && decl_type->is_unsized_array())
> +   _mesa_glsl_error(&loc, state, "array size must be explicitly "
> +"or implicitly defined");

Usual coding style is to add braces around multi-line branches.

Reviewed-by: Kenneth Graunke 

> +
>  /* From Section 4.12 (Empty Declarations) of the GLSL 4.5 spec:
>   *
>   *"The combinations of types and qualifiers that cause
> 




Re: [Mesa-dev] [PATCH 0/9] Skip automatic execsize for instructions with a width of 4

2016-03-09 Thread Iago Toral
On Wed, 2016-03-09 at 09:32 +0200, Pohjolainen, Topi wrote:
> On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias Gonsálvez wrote:
> > Hello,
> > 
> > There is only one patch from this series that has been reviewed (patch
> > 1).
> > 
> > Our plan is to start sending patches for adding fp64 support to i965
> > driver in the coming weeks but they depend on these patches.
> > 
> > Can someone take a look at them? ;)
> > 
> > Sam
> > 
> > 
> > On Thu, 2015-12-17 at 14:44 +0100, Samuel Iglesias Gonsálvez wrote:
> > > Hello,
> > > 
> > > This patch series is an updated version of the one Iago sent last
> > > week [0] that includes patches for gen6 too, as suggested by Jason.
> > > 
> > > We checked the gen9 code paths that work with a horizontal width of 4
> > > and we think there won't be any regression on gen9... but we don't
> > > have any gen9 machine to run piglit with these patches. Can someone
> > > check it?
> > > 
> > > Please read the original cover letter [0] for more information.
> > > 
> > > Sam
> > > 
> > > [0] http://lists.freedesktop.org/archives/mesa-dev/2015-December/102746.html
> > > 
> > > Iago Toral Quiroga (5):
> > >   i965/eu: set correct execution size in brw_NOP
> > >   i965/fs: set execution size for SEND messages in
> > > generate_uniform_pull_constant_load_gen7
> 
> Then about the other change. I like it being explicitly set instead of just
> inheriting the size from the previous instruction.

I'll do that, thanks!

> @@ -1248,6 +1248,7 @@ fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst,
>brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
>brw_set_default_mask_control(p, BRW_MASK_DISABLE);
>brw_inst *send = brw_next_insn(p, BRW_OPCODE_SEND);
> +  brw_inst_set_exec_size(devinfo, send, dst.width);
> 
> But I'm seeing other occurrences of BRW_OPCODE_SEND as well. For example, there
> are such instructions generated in generate_urb_read/write() which are not
> addressed. Don't we end up there with doubles as well needing the same
> treatment?


Re: [Mesa-dev] [PATCH 0/9] Skip automatic execsize for instructions with a width of 4

2016-03-09 Thread Pohjolainen, Topi
On Wed, Mar 09, 2016 at 11:05:17AM +0200, Pohjolainen, Topi wrote:
> On Wed, Mar 09, 2016 at 10:03:08AM +0100, Iago Toral wrote:
> > On Wed, 2016-03-09 at 10:53 +0200, Pohjolainen, Topi wrote:
> > > On Wed, Mar 09, 2016 at 09:36:42AM +0100, Iago Toral wrote:
> > > > On Wed, 2016-03-09 at 09:54 +0200, Pohjolainen, Topi wrote:
> > > > > On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias Gonsálvez wrote:
> > > > > > Hello,
> > > > > > 
> > > > > > There is only one patch from this series that has been reviewed (patch
> > > > > > 1).
> > > > > > 
> > > > > > Our plan is to start sending patches for adding fp64 support to i965
> > > > > > driver in the coming weeks but they depend on these patches.
> > > > > > 
> > > > > > Can someone take a look at them? ;)
> > > > > > 
> > > > > > Sam
> > > > > > 
> > > > > > 
> > > > > > On Thu, 2015-12-17 at 14:44 +0100, Samuel Iglesias Gonsálvez wrote:
> > > > > > > Hello,
> > > > > > > 
> > > > > > > This patch series is an updated version of the one Iago sent last
> > > > > > > week [0] that includes patches for gen6 too, as suggested by Jason.
> > > > > > > 
> > > > > > > We checked the gen9 code paths that work with a horizontal width of 4
> > > > > > > and we think there won't be any regression on gen9... but we don't
> > > > > > > have any gen9 machine to run piglit with these patches. Can someone
> > > > > > > check it?
> > > > > 
> > > > > I rebased it and ran it through the test system, gen9 seems to be fine, I
> > > > > only got one regression, and that was on old g965:
> > > > 
> > > > Awesome! Would it be possible to run that test on g965 with the attached
> > > > change? If this is a regression caused by our code it should break at
> > > > the assert introduced with it.
> > > > 
> > > > > /tmp/build_root/m64/lib/piglit/bin/ext_framebuffer_multisample-accuracy
> > > > >  all_samples srgb depthstencil -auto -fbo
> > > > > Pixels that should be unlit
> > > > >   count = 236444
> > > > >   RMS error = 0.025355
> > > > > Pixels that should be totally lit
> > > > >   count = 13308
> > > > >   Perfect output
> > > > > The error threshold for unlit and totally lit pixels test is 0.016650
> > > > > Pixels that should be partially lit
> > > > >   count = 12392
> > > > >   RMS error = 0.273876
> > > > > The error threshold for partially lit pixels is 0.333000
> > > > > Samples = 0, Result = fail
> > > > > 
> > > > > 
> > > > > But I'm not sure if this is caused by your patches.
> > > > > ___
> > > > > mesa-dev mailing list
> > > > > mesa-dev@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > > > 
> > > 
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > index 6f11f59..625447f 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > @@ -203,6 +203,7 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst, struct brw_reg dest)
> > > >  * or 16 (SIMD16), as that's normally correct.  However, when dealing with
> > > >  * small registers, we automatically reduce it to match the register size.
> > > >  */
> > > > +   assert(dest.width != BRW_EXECUTE_4 || brw_inst_exec_size(devinfo, inst) == dest.width);
> > > > if (dest.width < BRW_EXECUTE_8)
> > > >brw_inst_set_exec_size(devinfo, inst, dest.width);
> > > >  }
> > > 
> > > Hmm, on top of your series this looks:
> > > 
> > >/* Generators should set a default exec_size of either 8 (SIMD4x2 or SIMD8)
> > > * or 16 (SIMD16), as that's normally correct.  However, when dealing with
> > > * small registers, we automatically reduce it to match the register size.
> > > *
> > > * In platforms that support fp64 we can emit instructions with a width of
> > > * 4 that need two SIMD8 registers and an exec_size of 8 or 16. In these
> > > * cases we need to make sure that these instructions have their exec sizes
> > > * set properly when they are emitted and we can't rely on this code to fix
> > > * it.
> > > */
> > >bool fix_exec_size;
> > >if (devinfo->gen >= 6)
> > >   fix_exec_size = dest.width < BRW_EXECUTE_4;
> > >else
> > >   fix_exec_size = dest.width < BRW_EXECUTE_8;
> > > 
> > >if (fix_exec_size)
> > >   brw_inst_set_exec_size(devinfo, inst, dest.width);
> > > 
> > > Do you want the assertion before or after fixing?
> > > 
> > 
> > Before, you can put it right after that comment. Thanks!
> 
> That is what I thought. Hold on, I'll give it a spin.

Okay, now the system got really mad, I have some 12000 regressions on
g45, ilk and g965.

And for the test discussed above we hit the assert:

/tmp/build_root/m64/lib/piglit/bin/ext_framebuffer_multisampl

Re: [Mesa-dev] [PATCH 0/9] Skip automatic execsize for instructions with a width of 4

2016-03-09 Thread Iago Toral
On Wed, 2016-03-09 at 11:42 +0200, Pohjolainen, Topi wrote:
> On Wed, Mar 09, 2016 at 11:05:17AM +0200, Pohjolainen, Topi wrote:
> > On Wed, Mar 09, 2016 at 10:03:08AM +0100, Iago Toral wrote:
> > > On Wed, 2016-03-09 at 10:53 +0200, Pohjolainen, Topi wrote:
> > > > On Wed, Mar 09, 2016 at 09:36:42AM +0100, Iago Toral wrote:
> > > > > On Wed, 2016-03-09 at 09:54 +0200, Pohjolainen, Topi wrote:
> > > > > > On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias Gonsálvez wrote:
> > > > > > > Hello,
> > > > > > > 
> > > > > > > There is only one patch from this series that has been reviewed (patch
> > > > > > > 1).
> > > > > > > 
> > > > > > > Our plan is to start sending patches for adding fp64 support to i965
> > > > > > > driver in the coming weeks but they depend on these patches.
> > > > > > > 
> > > > > > > Can someone take a look at them? ;)
> > > > > > > 
> > > > > > > Sam
> > > > > > > 
> > > > > > > 
> > > > > > > On Thu, 2015-12-17 at 14:44 +0100, Samuel Iglesias Gonsálvez wrote:
> > > > > > > > Hello,
> > > > > > > > 
> > > > > > > > This patch series is an updated version of the one Iago sent last
> > > > > > > > week [0] that includes patches for gen6 too, as suggested by Jason.
> > > > > > > > 
> > > > > > > > We checked the gen9 code paths that work with a horizontal width of 4
> > > > > > > > and we think there won't be any regression on gen9... but we don't
> > > > > > > > have any gen9 machine to run piglit with these patches. Can someone
> > > > > > > > check it?
> > > > > > 
> > > > > > I rebased it and ran it through the test system, gen9 seems to be fine, I
> > > > > > only got one regression, and that was on old g965:
> > > > > 
> > > > > Awesome! Would it be possible to run that test on g965 with the attached
> > > > > change? If this is a regression caused by our code it should break at
> > > > > the assert introduced with it.
> > > > > 
> > > > > > /tmp/build_root/m64/lib/piglit/bin/ext_framebuffer_multisample-accuracy
> > > > > >  all_samples srgb depthstencil -auto -fbo
> > > > > > Pixels that should be unlit
> > > > > >   count = 236444
> > > > > >   RMS error = 0.025355
> > > > > > Pixels that should be totally lit
> > > > > >   count = 13308
> > > > > >   Perfect output
> > > > > > The error threshold for unlit and totally lit pixels test is 0.016650
> > > > > > Pixels that should be partially lit
> > > > > >   count = 12392
> > > > > >   RMS error = 0.273876
> > > > > > The error threshold for partially lit pixels is 0.333000
> > > > > > Samples = 0, Result = fail
> > > > > > 
> > > > > > 
> > > > > > But I'm not sure if this is caused by your patches.
> > > > > > ___
> > > > > > mesa-dev mailing list
> > > > > > mesa-dev@lists.freedesktop.org
> > > > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > > > > 
> > > > 
> > > > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > > index 6f11f59..625447f 100644
> > > > > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > > @@ -203,6 +203,7 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst, struct brw_reg dest)
> > > > >  * or 16 (SIMD16), as that's normally correct.  However, when dealing with
> > > > >  * small registers, we automatically reduce it to match the register size.
> > > > >  */
> > > > > +   assert(dest.width != BRW_EXECUTE_4 || brw_inst_exec_size(devinfo, inst) == dest.width);
> > > > > if (dest.width < BRW_EXECUTE_8)
> > > > >brw_inst_set_exec_size(devinfo, inst, dest.width);
> > > > >  }
> > > > 
> > > > Hmm, on top of your series this looks:
> > > > 
> > > >/* Generators should set a default exec_size of either 8 (SIMD4x2 or SIMD8)
> > > > * or 16 (SIMD16), as that's normally correct.  However, when dealing with
> > > > * small registers, we automatically reduce it to match the register size.
> > > > *
> > > > * In platforms that support fp64 we can emit instructions with a width of
> > > > * 4 that need two SIMD8 registers and an exec_size of 8 or 16. In these
> > > > * cases we need to make sure that these instructions have their exec sizes
> > > > * set properly when they are emitted and we can't rely on this code to fix
> > > > * it.
> > > > */
> > > >bool fix_exec_size;
> > > >if (devinfo->gen >= 6)
> > > >   fix_exec_size = dest.width < BRW_EXECUTE_4;
> > > >else
> > > >   fix_exec_size = dest.width < BRW_EXECUTE_8;
> > > > 
> > > >if (fix_exec_size)
> > > >   brw_inst_set_exec_size(devinfo, inst, dest.width);
> > > > 
> > > > Do you want the assertion before or after f

Re: [Mesa-dev] [PATCH] nouveau: Fix clang reserved-user-defined-literal error.

2016-03-09 Thread Samuel Pitoiset

Nouveau doesn't use C++11 except in the codegen part.
How did you hit that issue? Pretty sure that you forced C++11, right?

I can't reproduce that compilation error with clang 3.9 btw.

On 03/09/2016 09:57 AM, Vinson Lee wrote:

   CXX  codegen/nv50_ir.lo
In file included from codegen/nv50_ir.cpp:28:
./nouveau_debug.h:19:30: error: invalid suffix on literal; C++11 requires a space between literal and identifier [-Wreserved-user-defined-literal]
fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
  ^

Signed-off-by: Vinson Lee 
---
  src/gallium/drivers/nouveau/nouveau_debug.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_debug.h b/src/gallium/drivers/nouveau/nouveau_debug.h
index d17df81..546a4ad 100644
--- a/src/gallium/drivers/nouveau/nouveau_debug.h
+++ b/src/gallium/drivers/nouveau/nouveau_debug.h
@@ -16,7 +16,7 @@
  #define NOUVEAU_DEBUG 0

  #define NOUVEAU_ERR(fmt, args...) \
-   fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
+   fprintf(stderr, "%s:%d - " fmt, __FUNCTION__, __LINE__, ##args)

  #define NOUVEAU_DBG(ch, args...)   \
 if ((NOUVEAU_DEBUG) & (NOUVEAU_DEBUG_##ch))\
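
For reference, a minimal C sketch of why the added space matters. In C++11, a
string literal immediately followed by an identifier is lexed as a single
user-defined-literal token, so in "..."fmt the macro parameter is no longer
seen as a separate token. With the space, fmt is its own token and the adjacent
string literals are concatenated after expansion. SKETCH_ERR and format_error
are hypothetical names, and the sketch formats into a buffer instead of stderr
so that the result can be inspected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for NOUVEAU_ERR. The space between the prefix
 * literal and fmt keeps them separate tokens, so the two string literals
 * are concatenated after macro expansion. At least one variadic argument
 * is required by this form. */
#define SKETCH_ERR(buf, size, fmt, ...) \
   snprintf(buf, size, "%s:%d - " fmt, __func__, __LINE__, __VA_ARGS__)

/* Formats "<function>:<line> - code <n>\n" into buf and returns its length. */
static size_t format_error(char *buf, size_t size, int code)
{
   assert(size > 0);
   SKETCH_ERR(buf, size, "code %d\n", code);
   return strlen(buf);
}
```

The spaced form is valid in both C and C++11, which is why it is the safer
spelling for a header shared between the two.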



--
-Samuel


[Mesa-dev] [GSoC2016] Interested in implementing "Soft" double precision floating point support

2016-03-09 Thread tournier.elie
Hi everyone.

My name is Elie TOURNIER, I am enrolled in a French Engineering school
(Telecom Physique Strasbourg) specialized in Medical ICT.
I'm interested in implementing "Soft" double precision floating point
support [1].
Taking this subject seems to be a good way to get my feet wet in the Mesa
code and discover how some of its components work.

I am writing to introduce myself and also to gather information that will
help this project succeed.

I would like to know more about the following things to understand your
requirements:
1- "*Each double precision value would be stored in a uvec2*" The IEEE
double precision floating point standard representation requires 64 bits:
1 for the sign, 11 for the exponent and the remaining 52 for the fraction [2].
-> How must a double precision value be stored?
2- Where can I find the GL_ARB_gpu_shader_fp64 documentation?
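
To make the first question concrete, here is a minimal C sketch of one
possible layout, assuming an IEEE 754 binary64 double and assuming that the
low 32 bits go in the first component and the high 32 bits in the second.
The word ordering, struct sketch_uvec2 and the helper names are hypothetical
and only serve the illustration; the project idea does not specify them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a GLSL uvec2: x holds the low 32 bits and
 * y the high 32 bits (the ordering is an assumption of this sketch). */
struct sketch_uvec2 {
   uint32_t x;
   uint32_t y;
};

/* Copies the bit pattern of a double into two 32-bit words. */
static struct sketch_uvec2 pack_double(double d)
{
   uint64_t bits;
   struct sketch_uvec2 v;

   assert(sizeof(d) == sizeof(bits));
   memcpy(&bits, &d, sizeof(bits));
   v.x = (uint32_t)(bits & 0xffffffffu);
   v.y = (uint32_t)(bits >> 32);
   return v;
}

/* 1 sign bit: bit 31 of the high word. */
static uint32_t sign_of(struct sketch_uvec2 v)
{
   return v.y >> 31;
}

/* 11 exponent bits: bits 20..30 of the high word (biased by 1023). */
static uint32_t exponent_of(struct sketch_uvec2 v)
{
   return (v.y >> 20) & 0x7ffu;
}

/* 52 fraction bits: low 20 bits of the high word plus the whole low word. */
static uint64_t fraction_of(struct sketch_uvec2 v)
{
   return ((uint64_t)(v.y & 0xfffffu) << 32) | v.x;
}
```

With this layout the sign and exponent can be read from the second component
alone, while the fraction spans both components.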


This is my first exposure to Mesa. Please excuse me if I am asking basic
questions.

Please point me to the right resources so that I can better understand the
project. I would also be happy to fix a bug to familiarize myself with the
source code. Any suggestions on bugs that are relevant to the project will
be of great help.

Regards,
Elie

[1]
http://www.x.org/wiki/SummerOfCodeIdeas/#softdoubleprecisionfloatingpointsupport
[2] http://steve.hollasch.net/cgindex/coding/ieeefloat.html#storage

PS: If you have any questions, please don't hesitate to contact me.


Re: [Mesa-dev] [PATCH] nvc0: add support for TGSI FMA ops

2016-03-09 Thread Samuel Pitoiset

Reviewed-by: Samuel Pitoiset 

On 03/09/2016 07:06 AM, Ilia Mirkin wrote:

This will allow the nouveau backend to not try and split up ops that are
fused in GLSL.

Signed-off-by: Ilia Mirkin 
---
  src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 5 +
  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c| 3 ++-
  2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 8683722..b06d86a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -525,6 +525,7 @@ nv50_ir::DataType Instruction::inferSrcType() const
 case TGSI_OPCODE_DRCP:
 case TGSI_OPCODE_DSQRT:
 case TGSI_OPCODE_DMAD:
+   case TGSI_OPCODE_DFMA:
 case TGSI_OPCODE_DFRAC:
 case TGSI_OPCODE_DRSQ:
 case TGSI_OPCODE_DTRUNC:
@@ -624,6 +625,7 @@ static nv50_ir::operation translateOpcode(uint opcode)
 NV50_IR_OPCODE_CASE(SLT, SET);
 NV50_IR_OPCODE_CASE(SGE, SET);
 NV50_IR_OPCODE_CASE(MAD, MAD);
+   NV50_IR_OPCODE_CASE(FMA, FMA);
 NV50_IR_OPCODE_CASE(SUB, SUB);

 NV50_IR_OPCODE_CASE(FLR, FLOOR);
@@ -723,6 +725,7 @@ static nv50_ir::operation translateOpcode(uint opcode)
 NV50_IR_OPCODE_CASE(DRCP, RCP);
 NV50_IR_OPCODE_CASE(DSQRT, SQRT);
 NV50_IR_OPCODE_CASE(DMAD, MAD);
+   NV50_IR_OPCODE_CASE(DFMA, FMA);
 NV50_IR_OPCODE_CASE(D2I, CVT);
 NV50_IR_OPCODE_CASE(D2U, CVT);
 NV50_IR_OPCODE_CASE(I2D, CVT);
@@ -2672,6 +2675,7 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn)
 case TGSI_OPCODE_MAD:
 case TGSI_OPCODE_UMAD:
 case TGSI_OPCODE_SAD:
+   case TGSI_OPCODE_FMA:
FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
   src0 = fetchSrc(0, c);
   src1 = fetchSrc(1, c);
@@ -3395,6 +3399,7 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn)
}
break;
 case TGSI_OPCODE_DMAD:
+   case TGSI_OPCODE_DFMA:
FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
   src0 = getSSA(8);
   src1 = getSSA(8);
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index 37620ea..eb2bff5 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -333,8 +333,9 @@ nvc0_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
return 1;
 case PIPE_SHADER_CAP_TGSI_DROUND_SUPPORTED:
return 1;
-   case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
 case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
+  return 1;
+   case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
 case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 0;
 case PIPE_SHADER_CAP_MAX_SHADER_BUFFERS:



--
-Samuel


[Mesa-dev] [PATCH] glsl: report correct number of allowed vertex inputs and fragment outputs

2016-03-09 Thread Iago Toral Quiroga
Before, we would always report 16 for both and we would only fail if either
one exceeded 16. Now we fail if the maximum for each is exceeded, even if
it is smaller than 16, and we report the correct maximum.

Also, expand the size of to_assign[] to 32. There is code at the top
of the function handling max_index up to 32, so this just makes the
code more consistent.
---
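
A minimal C sketch of the behavior change, for illustration only. The names
old_check, new_check and SKETCH_SLOTS are hypothetical. The sketch only
models the limit test: the old test compared against the fixed array size
of 16, while the new one compares against the per-stage maximum and asserts
that this maximum fits in the backing array.

```c
#include <assert.h>
#include <stdbool.h>

/* Capacity of the backing array, mirroring to_assign[32] in the patch. */
#define SKETCH_SLOTS 32

/* Old behavior: accept a new entry while fewer than 16 are in use,
 * regardless of the real per-stage maximum. */
static bool old_check(unsigned num_attr)
{
   return num_attr < 16;
}

/* New behavior: accept a new entry only while fewer than max_index are
 * in use; max_index itself must not exceed the array capacity. */
static bool new_check(unsigned num_attr, unsigned max_index)
{
   assert(max_index <= SKETCH_SLOTS);
   return num_attr < max_index;
}
```

For a stage with a maximum of 8, a ninth entry passes old_check(8) but is
rejected by new_check(8, 8), which is the case the commit message describes.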
 src/compiler/glsl/linker.cpp | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 4cec107..76b700d 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -2417,7 +2417,8 @@ assign_attribute_or_color_locations(gl_shader_program *prog,
 /* Reversed because we want a descending order sort below. */
 return r->slots - l->slots;
   }
-   } to_assign[16];
+   } to_assign[32];
+   assert(max_index <= 32);
 
unsigned num_attr = 0;
 
@@ -2625,11 +2626,11 @@ assign_attribute_or_color_locations(gl_shader_program *prog,
 continue;
   }
 
-  if (num_attr >= ARRAY_SIZE(to_assign)) {
+  if (num_attr >= max_index) {
  linker_error(prog, "too many %s (max %u)",
   target_index == MESA_SHADER_VERTEX ?
   "vertex shader inputs" : "fragment shader outputs",
-  (unsigned)ARRAY_SIZE(to_assign));
+  max_index);
  return false;
   }
   to_assign[num_attr].slots = slots;
-- 
1.9.1



Re: [Mesa-dev] [PATCH 17/26] gallium/radeon: disable CMASK on handle export if sharing doesn't allow it

2016-03-09 Thread Marek Olšák
On Wed, Mar 9, 2016 at 7:18 AM, Nicolai Hähnle  wrote:
> On 08.03.2016 14:35, Marek Olšák wrote:
>>
>> On Tue, Mar 8, 2016 at 4:41 AM, Michel Dänzer  wrote:
>>>
>>> On 03.03.2016 01:36, Marek Olšák wrote:

>>>> From: Marek Olšák 
>>>>
>>>> The disabling of CMASK is simple, but notifying all contexts about it is
>>>> not:
>>>> - The screen must have a list of all contexts.
>>>> - Each context must have a monotonic counter that is incremented only when
>>>>   the screen wants to re-emit framebuffer states.
>>>> - Each context must check in draw_vbo if the counter has been changed and
>>>>   re-emit the framebuffer state accordingly.
>>>
>>>
>>> The list seems a bit overkill. How about having dirty_fb_counter in the
>>> screen and last_dirty_fb_counter in the context, incrementing the former
>>> in r600_dirty_all_framebuffer_states and emitting the framebuffer state
>>> if the two counters don't match?
>>
>>
>> Thanks. The updated patch is attached. Please review.
>
>
> There is an unneeded empty line in this hunk:
>
> @@ -260,6 +265,31 @@ static void r600_eliminate_fast_color_clear(struct r600_common_screen *rscreen,
> pipe_mutex_unlock(rscreen->aux_context_lock);
>  }
>
> +static void r600_texture_disable_cmask(struct r600_common_screen *rscreen,
> +  struct r600_texture *rtex)
> +{
> +
> +   if (!rtex->cmask.size)
> +   return;
>
> Slightly further down, I believe the pipe_resource_reference should be
> unconditional:
>
> +   if (rtex->cmask_buffer != &rtex->resource)
> +   pipe_resource_reference((struct pipe_resource**)&rtex->cmask_buffer, NULL);

It's how cmask_buffer is unreferenced everywhere. If the texture had
multiple samples, cmask_buffer would be part of the resource and so
the code would release the resource, which would be undesirable. This
is not a possible scenario yet, because texture_get_handle doesn't
support MSAA surfaces, but it matches the unref code elsewhere.
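
As an aside, the counter scheme Michel suggested earlier in this thread
(dirty_fb_counter in the screen, last_dirty_fb_counter in the context) can be
sketched in C roughly as follows. This is a simplified, single-threaded
illustration and not the attached patch: the struct and function names are
hypothetical, and a real driver would need atomic accesses because several
contexts share one screen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical screen: one counter shared by all contexts. */
struct sketch_screen {
   unsigned dirty_fb_counter;
};

/* Hypothetical context: remembers the last counter value it acted on. */
struct sketch_context {
   struct sketch_screen *screen;
   unsigned last_dirty_fb_counter;
   unsigned fb_emits; /* how many times the framebuffer state was re-emitted */
};

/* Called by the screen when every context must re-emit its framebuffer state. */
static void sketch_dirty_all_framebuffer_states(struct sketch_screen *screen)
{
   screen->dirty_fb_counter++;
}

/* Called at draw time; returns true if the framebuffer state was re-emitted. */
static bool sketch_draw(struct sketch_context *ctx)
{
   unsigned counter;

   assert(ctx->screen != NULL);
   counter = ctx->screen->dirty_fb_counter;
   if (counter == ctx->last_dirty_fb_counter)
      return false;

   ctx->last_dirty_fb_counter = counter;
   ctx->fb_emits++;
   return true;
}
```

The point of the design is that the screen never needs a list of contexts:
each context notices the change by itself on its next draw.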

Marek


Re: [Mesa-dev] [PATCH 20/26] radeonsi: disable DCC on handle export if expecting write access

2016-03-09 Thread Marek Olšák
On Wed, Mar 9, 2016 at 7:19 AM, Nicolai Hähnle  wrote:
> On 02.03.2016 11:36, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> This should be okay except that sampler views and images are not re-set.
>> ---
>>   src/gallium/drivers/radeon/r600_pipe_common.h |  3 +++
>>   src/gallium/drivers/radeon/r600_texture.c | 33 +++
>>   src/gallium/drivers/radeonsi/si_blit.c| 12 ++
>>   3 files changed, 48 insertions(+)
>>
>> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h
>> index 6e65742..43218f1 100644
>> --- a/src/gallium/drivers/radeon/r600_pipe_common.h
>> +++ b/src/gallium/drivers/radeon/r600_pipe_common.h
>> @@ -481,6 +481,9 @@ struct r600_common_context {
>>   unsigned first_layer, unsigned last_layer,
>>   unsigned first_sample, unsigned last_sample);
>>
>> +   void (*decompress_dcc)(struct pipe_context *ctx,
>> +  struct r600_texture *rtex);
>> +
>> /* Reallocate the buffer and update all resource bindings where
>>  * the buffer is bound, including all resource descriptors. */
>> void (*invalidate_buffer)(struct pipe_context *ctx, struct pipe_resource *buf);
>> diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c
>> index 4424ca3..d42d807 100644
>> --- a/src/gallium/drivers/radeon/r600_texture.c
>> +++ b/src/gallium/drivers/radeon/r600_texture.c
>> @@ -296,6 +296,31 @@ static void r600_texture_disable_cmask(struct r600_common_screen *rscreen,
>> r600_dirty_all_framebuffer_states(rscreen);
>>   }
>>
>> +static void r600_texture_disable_dcc(struct r600_common_screen *rscreen,
>> +struct r600_texture *rtex)
>> +{
>> +   struct r600_common_context *rctx =
>> +   (struct r600_common_context *)rscreen->aux_context;
>> +
>> +   if (!rtex->dcc_offset)
>> +   return;
>> +
>> +   /* Decompress DCC. */
>> +   pipe_mutex_lock(rscreen->aux_context_lock);
>> +   rctx->decompress_dcc(&rctx->b, rtex);
>> +   rctx->b.flush(&rctx->b, NULL, 0);
>> +   pipe_mutex_unlock(rscreen->aux_context_lock);
>> +
>> +   /* Disable DCC. */
>> +   rtex->dcc_offset = 0;
>> +   rtex->cb_color_info &= ~VI_S_028C70_DCC_ENABLE(1);
>> +
>> +   /* Notify all contexts about the change. */
>> +   r600_dirty_all_framebuffer_states(rscreen);
>> +
>> +   /* TODO: re-set all sampler views and images, but how? */
>> +}
>> +
>>   static boolean r600_texture_get_handle(struct pipe_screen* screen,
>>struct pipe_resource *resource,
>>struct winsys_handle *whandle,
>> @@ -318,6 +343,13 @@ static boolean r600_texture_get_handle(struct pipe_screen* screen,
>> res->external_usage = usage;
>>
>> if (resource->target != PIPE_BUFFER) {
>> +   /* Since shader image stores don't support DCC on VI,
>> +* disable it for external clients that want write
>> +* access.
>> +*/
>> +   if (usage & PIPE_HANDLE_USAGE_WRITE)
>> +   r600_texture_disable_dcc(rscreen, rtex);
>
>
> Have you considered keeping DCC enabled when the user sets the explicit
> flush flag and having flush_resource decompress for writably-shared
> resources?

DCC decompression is a very costly operation and it's better to avoid
it if possible. Currently, DCC is only supported with non-displayable
surfaces, but all users of flush_resource (DRI2/3) only get
displayable surfaces. Thus, the driver doesn't have to worry about
flush_resource with DCC.

>
> Thinking about this brings up a more general question about the intended
> semantics of the explicit flush flag. If process A creates and exports a
> texture with explicit flush and process B imports it, is process B expected
> to explicitly flush the texture for *its* changes to the texture to become
> visible in program A?

If the process A state tracker calls the explicit flush, it should set
the flag in resource_get_handle. If the process B state tracker calls
the explicit flush, it should set the flag in resource_from_handle.
The explicit flush flag only describes the usage for the driver
receiving it.

>
> If I understand your current implementation correctly, changes in process B
> do *not* currently need explicit flush (because DCC is disabled and process
> B will never allocate a CMASK). The question is whether this is just a happy
> coincidence of the current implementation or whether it is actually
> something we want to promise for the future.

The happy coincidence is that the explicit flush is only done on
displayable surfaces (no DCC there).

Note that this series only makes nece

Re: [Mesa-dev] [PATCH 5/5] st/dri: implement the GL interop DRI extension (v2)

2016-03-09 Thread Marek Olšák
On Wed, Mar 9, 2016 at 4:28 AM, Michel Dänzer  wrote:
> On 09.03.2016 07:52, Marek Olšák wrote:
>> From: Marek Olšák 
>>
>> v2: - set interop_version
>> - simplify the offset_after macro
>
> [...]
>
>> @@ -1417,6 +1422,254 @@ static const __DRIrobustnessExtension dri2Robustness = {
>> .base = { __DRI2_ROBUSTNESS, 1 }
>>  };
>>
>> +#define offset_after(type, member) \
>> +   offsetof(type, member) + sizeof(((type*)0)->member)
>
> Does this compile with clang?

Honestly I have no idea. Why wouldn't it?
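
For reference, a minimal C sketch of how a macro of this shape can be used to
check whether a caller-provided structure is large enough to contain a given
member. The outer parentheses are an addition of this sketch (the macro in the
patch has none), and struct sketch_info and covers_interop_version are
hypothetical names that are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the first byte after "member". Fully parenthesized so the
 * result can be used safely inside a larger expression. The sizeof
 * operand is not evaluated, so the null pointer is never dereferenced. */
#define OFFSET_AFTER(type, member) \
   (offsetof(type, member) + sizeof(((type *)0)->member))

/* Hypothetical structure that is only allowed to grow at the end. */
struct sketch_info {
   uint32_t vendor_id;
   uint32_t device_id;
   uint32_t interop_version;
};

/* Returns nonzero if a caller-provided structure of caller_size bytes
 * is large enough to hold every member up to interop_version. */
static int covers_interop_version(size_t caller_size)
{
   /* Sanity check: with no padding, a member ends where the next begins. */
   assert(OFFSET_AFTER(struct sketch_info, vendor_id) ==
          offsetof(struct sketch_info, device_id));
   return caller_size >= OFFSET_AFTER(struct sketch_info, interop_version);
}
```

Without the outer parentheses, an expression such as
2 * offset_after(type, member) would multiply only the offsetof term.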

Marek


[Mesa-dev] [PATCH 1/5] include/GL: add mesa_glinterop.h for OpenGL-OpenCL interop (v5)

2016-03-09 Thread Marek Olšák
From: Marek Olšák 

v2: - use "enum" to define stuff
v3: - more comments, define MESA_GLINTEROP_UNSUPPORTED
v4: - add mesa_glinterop_device_info::interop_version
- more comments
- remove #define MESA_GLINTEROP_VERSION
- use const for "in"
v5: pass the structure sizes via function parameters
---
 include/GL/mesa_glinterop.h | 278 
 1 file changed, 278 insertions(+)
 create mode 100644 include/GL/mesa_glinterop.h

diff --git a/include/GL/mesa_glinterop.h b/include/GL/mesa_glinterop.h
new file mode 100644
index 000..1fcc162
--- /dev/null
+++ b/include/GL/mesa_glinterop.h
@@ -0,0 +1,278 @@
+/*
+ * Mesa 3-D graphics library
+ *
+ * Copyright 2016 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/* Mesa OpenGL inter-driver interoperability interface designed for but not
+ * limited to OpenCL.
+ *
+ * This is a driver-agnostic, backward-compatible interface. The structures
+ * are only allowed to grow. They can never shrink and their members can
+ * never be removed, renamed, or redefined.
+ *
+ * The interface doesn't return a lot of static texture parameters like
+ * width, height, etc. It mainly returns mutable buffer and texture view
+ * parameters that can't be part of the texture allocation (because they are
+ * mutable). If drivers want to return more data or want to return static
+ * allocation parameters, they can do it in one of these two ways:
+ * - attaching the data to the DMABUF handle in a driver-specific way
+ * - passing the data via "out_driver_data" in the "in" structure.
+ *
+ * Mesa is expected to do a lot of error checking on behalf of OpenCL, such
+ * as checking the target, miplevel, and texture completeness.
+ *
+ * OpenCL, on the other hand, needs to check if the display+context combo
+ * is compatible with the OpenCL driver by querying the device information.
+ * It also needs to check if the texture internal format and channel ordering
+ * (returned in a driver-specific way) is supported by OpenCL, among other
+ * things.
+ */
+
+#ifndef MESA_GLINTEROP_H
+#define MESA_GLINTEROP_H
+
+#include 
+#include 
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/** Returned error codes. */
+enum {
+   MESA_GLINTEROP_SUCCESS = 0,
+   MESA_GLINTEROP_OUT_OF_RESOURCES,
+   MESA_GLINTEROP_OUT_OF_HOST_MEMORY,
+   MESA_GLINTEROP_INVALID_OPERATION,
+   MESA_GLINTEROP_INVALID_VALUE,
+   MESA_GLINTEROP_INVALID_DISPLAY,
+   MESA_GLINTEROP_INVALID_CONTEXT,
+   MESA_GLINTEROP_INVALID_TARGET,
+   MESA_GLINTEROP_INVALID_OBJECT,
+   MESA_GLINTEROP_INVALID_MIP_LEVEL,
+   MESA_GLINTEROP_UNSUPPORTED
+};
+
+/** Access flags. */
+enum {
+   MESA_GLINTEROP_ACCESS_READ_WRITE = 0,
+   MESA_GLINTEROP_ACCESS_READ_ONLY,
+   MESA_GLINTEROP_ACCESS_WRITE_ONLY
+};
+
+
+/**
+ * Device information returned by Mesa.
+ */
+typedef struct _mesa_glinterop_device_info {
+   /* PCI location */
+   uint32_t pci_segment_group;
+   uint32_t pci_bus;
+   uint32_t pci_device;
+   uint32_t pci_function;
+
+   /* Device identification */
+   uint32_t vendor_id;
+   uint32_t device_id;
+
+   /* The interop version determines what behavior the caller should expect
+    * out of all functions.
+    *
+    * Interop version 1:
+    * - mesa_glinterop_export_in is not read beyond "out_driver_data"
+    * - mesa_glinterop_export_out is not written beyond "view_numlayers"
+    * - mesa_glinterop_device_info is not written beyond "interop_version"
+    */
+   uint32_t interop_version;
+} mesa_glinterop_device_info;
+
+
+/**
+ * Input parameters to Mesa interop export functions.
+ */
+typedef struct _mesa_glinterop_export_in {
+   /* One of the following:
+    * - GL_TEXTURE_BUFFER
+    * - GL_TEXTURE_1D
+    * - GL_TEXTURE_2D
+    * - GL_TEXTURE_3D
+    * - GL_TEXTURE_RECTANGLE
+    * - GL_TEXTURE_1D_ARRAY
+    * - GL_TEXTURE_2D_ARRAY
+    * - GL_TEXTURE_CUBE_MAP_ARRAY
+    * - GL_TEXTURE_CUBE_MAP
+    * - GL_TEXTURE_CUBE_MAP_POSI

[Mesa-dev] [PATCH 3/5] egl: implement EGL part of interop interface (v3)

2016-03-09 Thread Marek Olšák
From: Marek Olšák 

v2: - use const
v3: - add in/out_size parameters
---
 src/egl/drivers/dri2/egl_dri2.c | 37 
 src/egl/drivers/dri2/egl_dri2.h |  1 +
 src/egl/main/eglapi.c   | 76 +
 src/egl/main/eglapi.h   | 12 +++
 4 files changed, 126 insertions(+)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 8f50f0c..3522dfa 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -44,6 +44,7 @@
 #endif
 #include 
 #include 
+#include "GL/mesa_glinterop.h"
 #include 
 #include 
 
@@ -736,6 +737,8 @@ dri2_create_screen(_EGLDisplay *disp)
   if (strcmp(extensions[i]->name, __DRI2_RENDERER_QUERY) == 0) {
  dri2_dpy->rendererQuery = (__DRI2rendererQueryExtension *) 
extensions[i];
   }
+  if (strcmp(extensions[i]->name, __DRI2_INTEROP) == 0)
+ dri2_dpy->interop = (__DRI2interopExtension *) extensions[i];
}
 
dri2_setup_screen(disp);
@@ -2512,6 +2515,38 @@ dri2_server_wait_sync(_EGLDriver *drv, _EGLDisplay *dpy, 
_EGLSync *sync)
return EGL_TRUE;
 }
 
+static int
+dri2_interop_query_device_info(_EGLDisplay *dpy, _EGLContext *ctx,
+   unsigned out_size,
+   mesa_glinterop_device_info *out)
+{
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
+   struct dri2_egl_context *dri2_ctx = dri2_egl_context(ctx);
+
+   if (!dri2_dpy->interop)
+  return MESA_GLINTEROP_UNSUPPORTED;
+
+   return dri2_dpy->interop->query_device_info(dri2_ctx->dri_context,
+   out_size, out);
+}
+
+static int
+dri2_interop_export_object(_EGLDisplay *dpy, _EGLContext *ctx,
+   unsigned in_size,
+   const mesa_glinterop_export_in *in,
+   unsigned out_size,
+   mesa_glinterop_export_out *out)
+{
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
+   struct dri2_egl_context *dri2_ctx = dri2_egl_context(ctx);
+
+   if (!dri2_dpy->interop)
+  return MESA_GLINTEROP_UNSUPPORTED;
+
+   return dri2_dpy->interop->export_object(dri2_ctx->dri_context,
+   in_size, in, out_size, out);
+}
+
 static void
 dri2_unload(_EGLDriver *drv)
 {
@@ -2622,6 +2657,8 @@ _eglBuiltInDriverDRI2(const char *args)
dri2_drv->base.API.ClientWaitSyncKHR = dri2_client_wait_sync;
dri2_drv->base.API.WaitSyncKHR = dri2_server_wait_sync;
dri2_drv->base.API.DestroySyncKHR = dri2_destroy_sync;
+   dri2_drv->base.API.GLInteropQueryDeviceInfo = 
dri2_interop_query_device_info;
+   dri2_drv->base.API.GLInteropExportObject = dri2_interop_export_object;
 
dri2_drv->base.Name = "DRI2";
dri2_drv->base.Unload = dri2_unload;
diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index 52ad92b..d83bc1e 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -174,6 +174,7 @@ struct dri2_egl_display
const __DRI2configQueryExtension *config;
const __DRI2fenceExtension *fence;
const __DRI2rendererQueryExtension *rendererQuery;
+   const __DRI2interopExtension *interop;
int   fd;
 
int   own_device;
diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 32f6823..e4ec5c0 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -88,6 +88,7 @@
 #include 
 #include "c99_compat.h"
 #include "c11/threads.h"
+#include "GL/mesa_glinterop.h"
 #include "eglcompiler.h"
 
 #include "eglglobals.h"
@@ -1905,3 +1906,78 @@ eglGetProcAddress(const char *procname)
 
RETURN_EGL_SUCCESS(NULL, ret);
 }
+
+static int
+_eglLockDisplayInterop(EGLDisplay dpy, EGLContext context,
+   _EGLDisplay **disp, _EGLDriver **drv,
+   _EGLContext **ctx)
+{
+
+   *disp = _eglLockDisplay(dpy);
+   if (!*disp || !(*disp)->Initialized || !(*disp)->Driver) {
+  if (*disp)
+ _eglUnlockDisplay(*disp);
+  return MESA_GLINTEROP_INVALID_DISPLAY;
+   }
+
+   *drv = (*disp)->Driver;
+
+   *ctx = _eglLookupContext(context, *disp);
+   if (!*ctx ||
+   ((*ctx)->ClientAPI != EGL_OPENGL_API &&
+(*ctx)->ClientAPI != EGL_OPENGL_ES_API)) {
+  _eglUnlockDisplay(*disp);
+  return MESA_GLINTEROP_INVALID_CONTEXT;
+   }
+
+   return MESA_GLINTEROP_SUCCESS;
+}
+
+GLAPI int GLAPIENTRY
+MesaGLInteropEGLQueryDeviceInfo(EGLDisplay dpy, EGLContext context,
+unsigned out_size,
+mesa_glinterop_device_info *out)
+{
+   _EGLDisplay *disp;
+   _EGLDriver *drv;
+   _EGLContext *ctx;
+   int ret;
+
+   ret = _eglLockDisplayInterop(dpy, context, &disp, &drv, &ctx);
+   if (ret != MESA_GLINTEROP_SUCCESS)
+  return ret;
+
+   if (drv->API.GLInteropQueryDeviceInfo)
+  ret = drv->API.GLInteropQueryDeviceInfo(disp, ctx, out_size, out);
+  

[Mesa-dev] [PATCH 2/5] dri_interface: add interface for GL interop with other APIs (v3)

2016-03-09 Thread Marek Olšák
From: Marek Olšák 

v2: - use const
v3: - add in/out_size parameters
---
 include/GL/internal/dri_interface.h | 29 +
 1 file changed, 29 insertions(+)

diff --git a/include/GL/internal/dri_interface.h 
b/include/GL/internal/dri_interface.h
index 2b49a29..c549adb 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -79,6 +79,7 @@ typedef struct __DRIdri2LoaderExtensionRec
__DRIdri2LoaderExtension;
 typedef struct __DRI2flushExtensionRec __DRI2flushExtension;
 typedef struct __DRI2throttleExtensionRec  __DRI2throttleExtension;
 typedef struct __DRI2fenceExtensionRec  __DRI2fenceExtension;
+typedef struct __DRI2interopExtensionRec   __DRI2interopExtension;
 
 
 typedef struct __DRIimageLoaderExtensionRec __DRIimageLoaderExtension;
@@ -392,6 +393,34 @@ struct __DRI2fenceExtensionRec {
 };
 
 
+/**
+ * Extension for API interop.
+ * See GL/mesa_glinterop.h.
+ */
+
+#define __DRI2_INTEROP "DRI2_Interop"
+#define __DRI2_INTEROP_VERSION 1
+
+typedef struct _mesa_glinterop_device_info mesa_glinterop_device_info;
+typedef struct _mesa_glinterop_export_in mesa_glinterop_export_in;
+typedef struct _mesa_glinterop_export_out mesa_glinterop_export_out;
+
+struct __DRI2interopExtensionRec {
+   __DRIextension base;
+
+   /** Same as MesaGLInterop*QueryDeviceInfo. */
+   int (*query_device_info)(__DRIcontext *ctx,
+unsigned out_size,
+mesa_glinterop_device_info *out);
+
+   /** Same as MesaGLInterop*ExportObject. */
+   int (*export_object)(__DRIcontext *ctx,
+unsigned in_size,
+const mesa_glinterop_export_in *in,
+unsigned out_size,
+mesa_glinterop_export_out *out);
+};
+
 /*@}*/
 
 /**
-- 
2.5.0
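
For readers following the series: the EGL and GLX patches find this extension by comparing names with strcmp. A minimal sketch of that lookup follows. The demo_* names are hypothetical, and the NULL-terminated array is an assumption for illustration, not something the patch text shows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an extension header: a name plus a version. */
struct demo_extension {
   const char *name;
   int version;
};

/* Walks a NULL-terminated array of extension pointers (an assumption made
 * for this sketch) and returns the entry whose name matches, or NULL. */
static const struct demo_extension *
demo_find_extension(const struct demo_extension *const *extensions,
                    const char *name)
{
   assert(extensions != NULL && name != NULL);

   for (size_t i = 0; extensions[i] != NULL; i++) {
      if (strcmp(extensions[i]->name, name) == 0)
         return extensions[i];
   }
   return NULL;
}
```

A caller that gets NULL back would treat the feature as unsupported, which is what the MESA_GLINTEROP_UNSUPPORTED return in the EGL patch corresponds to.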

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] st/dri: implement the GL interop DRI extension (v3)

2016-03-09 Thread Marek Olšák
From: Marek Olšák 

v2: - set interop_version
- simplify the offset_after macro
v3: - add in/out_size parameters
---
 src/gallium/state_trackers/dri/dri2.c | 258 ++
 1 file changed, 258 insertions(+)

diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 7f7fbc4..f6b64e1 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -30,14 +30,19 @@
 
 #include 
 #include 
+#include "GL/mesa_glinterop.h"
 #include "util/u_memory.h"
 #include "util/u_inlines.h"
 #include "util/u_format.h"
 #include "util/u_debug.h"
 #include "state_tracker/drm_driver.h"
+#include "state_tracker/st_cb_bufferobjects.h"
+#include "state_tracker/st_cb_fbo.h"
+#include "state_tracker/st_cb_texture.h"
 #include "state_tracker/st_texture.h"
 #include "state_tracker/st_context.h"
 #include "pipe-loader/pipe_loader.h"
+#include "main/bufferobj.h"
 #include "main/texobj.h"
 
 #include "dri_screen.h"
@@ -1417,6 +1422,257 @@ static const __DRIrobustnessExtension dri2Robustness = {
.base = { __DRI2_ROBUSTNESS, 1 }
 };
 
+#define offset_after(type, member) \
+   offsetof(type, member) + sizeof(((type*)0)->member)
+
+static int
+dri2_interop_query_device_info(__DRIcontext *_ctx,
+   unsigned out_size,
+   mesa_glinterop_device_info *out)
+{
+   struct pipe_screen *screen = dri_context(_ctx)->st->pipe->screen;
+
+   if (out_size < offset_after(mesa_glinterop_device_info, interop_version))
+  return MESA_GLINTEROP_INVALID_VALUE;
+
+   out->pci_segment_group = screen->get_param(screen, PIPE_CAP_PCI_GROUP);
+   out->pci_bus = screen->get_param(screen, PIPE_CAP_PCI_BUS);
+   out->pci_device = screen->get_param(screen, PIPE_CAP_PCI_DEVICE);
+   out->pci_function = screen->get_param(screen, PIPE_CAP_PCI_FUNCTION);
+
+   out->vendor_id = screen->get_param(screen, PIPE_CAP_VENDOR_ID);
+   out->device_id = screen->get_param(screen, PIPE_CAP_DEVICE_ID);
+
+   out->interop_version = 1;
+
+   return MESA_GLINTEROP_SUCCESS;
+}
+
+static int
+dri2_interop_export_object(__DRIcontext *_ctx,
+   unsigned in_size,
+   const mesa_glinterop_export_in *in,
+   unsigned out_size,
+   mesa_glinterop_export_out *out)
+{
+   struct st_context_iface *st = dri_context(_ctx)->st;
+   struct pipe_screen *screen = st->pipe->screen;
+   struct gl_context *ctx = ((struct st_context *)st)->ctx;
+   struct pipe_resource *res = NULL;
+   struct winsys_handle whandle;
+   unsigned target, usage;
+   boolean success;
+
+   /* Check structure sizes first. */
+   if (in_size < offset_after(mesa_glinterop_export_in, out_driver_data))
+  return MESA_GLINTEROP_INVALID_VALUE;
+
+   if (out_size < offset_after(mesa_glinterop_export_out, view_numlayers))
+  return MESA_GLINTEROP_INVALID_VALUE;
+
+   /* Validate the target. */
+   switch (in->target) {
+   case GL_TEXTURE_BUFFER:
+   case GL_TEXTURE_1D:
+   case GL_TEXTURE_2D:
+   case GL_TEXTURE_3D:
+   case GL_TEXTURE_RECTANGLE:
+   case GL_TEXTURE_1D_ARRAY:
+   case GL_TEXTURE_2D_ARRAY:
+   case GL_TEXTURE_CUBE_MAP_ARRAY:
+   case GL_TEXTURE_CUBE_MAP:
+   case GL_TEXTURE_2D_MULTISAMPLE:
+   case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
+   case GL_TEXTURE_EXTERNAL_OES:
+   case GL_RENDERBUFFER:
+   case GL_ARRAY_BUFFER:
+  target = in->target;
+  break;
+   case GL_TEXTURE_CUBE_MAP_POSITIVE_X:
+   case GL_TEXTURE_CUBE_MAP_NEGATIVE_X:
+   case GL_TEXTURE_CUBE_MAP_POSITIVE_Y:
+   case GL_TEXTURE_CUBE_MAP_NEGATIVE_Y:
+   case GL_TEXTURE_CUBE_MAP_POSITIVE_Z:
+   case GL_TEXTURE_CUBE_MAP_NEGATIVE_Z:
+  target = GL_TEXTURE_CUBE_MAP;
+  break;
+   default:
+  return MESA_GLINTEROP_INVALID_TARGET;
+   }
+
+   /* Validate the simple case of miplevel. */
+   if ((target == GL_RENDERBUFFER || target == GL_ARRAY_BUFFER) &&
+   in->miplevel != 0)
+  return MESA_GLINTEROP_INVALID_MIP_LEVEL;
+
+   /* Validate the OpenGL object and get pipe_resource. */
+   mtx_lock(&ctx->Shared->Mutex);
+
+   if (target == GL_ARRAY_BUFFER) {
+  /* Buffer objects.
+   *
+   * The error checking is based on the documentation of
+   * clCreateFromGLBuffer from OpenCL 2.0 SDK.
+   */
+  struct gl_buffer_object *buf = _mesa_lookup_bufferobj(ctx, in->obj);
+
+  /* From OpenCL 2.0 SDK, clCreateFromGLBuffer:
+   *  "CL_INVALID_GL_OBJECT if bufobj is not a GL buffer object or is
+   *   a GL buffer object but does not have an existing data store or
+   *   the size of the buffer is 0."
+   */
+  if (!buf || buf->Size == 0) {
+ mtx_unlock(&ctx->Shared->Mutex);
+ return MESA_GLINTEROP_INVALID_OBJECT;
+  }
+
+  res = st_buffer_object(buf)->buffer;
+  if (!res) {
+ /* this shouldn't happen */
+ mtx_unlock(&ctx->Shared->Mutex);
+ return MESA_GLINTEROP_INVALID_OBJECT;
+  }
+
+  
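
The in_size/out_size checks at the top of both functions above are what let the structures grow without breaking older callers. A minimal sketch of the idea, assuming a hypothetical struct and demo_* names (only the offset_after idea comes from the patch; the macro here is a parenthesized variant):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Parenthesized variant of the patch's offset_after macro: the byte offset
 * just past `member` inside `type`. */
#define DEMO_OFFSET_AFTER(type, member) \
   (offsetof(type, member) + sizeof(((type *)0)->member))

enum { DEMO_SUCCESS = 0, DEMO_INVALID_VALUE };

/* Hypothetical output struct. "future_field" stands for a member added by a
 * later interface version that a version-1 implementation must not touch. */
struct demo_device_info {
   uint32_t vendor_id;
   uint32_t device_id;
   uint32_t interop_version;
   uint32_t future_field;
};

/* Version-1 implementation: it needs room up to and including
 * interop_version and writes nothing past it. */
static int
demo_query_device_info(unsigned out_size, struct demo_device_info *out)
{
   assert(out != NULL);

   if (out_size < DEMO_OFFSET_AFTER(struct demo_device_info, interop_version))
      return DEMO_INVALID_VALUE;

   out->vendor_id = 0x1234;   /* placeholder values for the sketch */
   out->device_id = 0x0001;
   out->interop_version = 1;
   return DEMO_SUCCESS;
}
```

A caller built against a larger struct passes sizeof of its own struct and then reads interop_version to learn how much of it was actually filled in.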

[Mesa-dev] [PATCH 4/5] glx: implement GLX part of interop interface (v3)

2016-03-09 Thread Marek Olšák
From: Marek Olšák 

v2: - use const
v3: - add in/out_size parameters
---
 src/glx/Makefile.am  |   1 +
 src/glx/dri2_glx.c   |  11 +++--
 src/glx/dri2_priv.h  |  19 
 src/glx/dri3_glx.c   |   5 +++
 src/glx/dri3_priv.h  |  13 ++
 src/glx/dri_common_interop.c | 100 +++
 src/glx/glxclient.h  |  12 ++
 src/glx/glxcmds.c|  57 
 8 files changed, 212 insertions(+), 6 deletions(-)
 create mode 100644 src/glx/dri_common_interop.c

diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am
index 0092545..d65fb81 100644
--- a/src/glx/Makefile.am
+++ b/src/glx/Makefile.am
@@ -113,6 +113,7 @@ libglx_la_SOURCES += \
dri_common.c \
dri_common.h \
dri_common_query_renderer.c \
+   dri_common_interop.c \
xfont.c
 endif
 
diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index 7710349..cc162f2 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -77,12 +77,6 @@ struct dri2_display
const __DRIextension *loader_extensions[4];
 };
 
-struct dri2_context
-{
-   struct glx_context base;
-   __DRIcontext *driContext;
-};
-
 struct dri2_drawable
 {
__GLXDRIdrawable base;
@@ -1061,6 +1055,8 @@ static const struct glx_context_vtable 
dri2_context_vtable = {
.bind_tex_image  = dri2_bind_tex_image,
.release_tex_image   = dri2_release_tex_image,
.get_proc_address= NULL,
+   .interop_query_device_info = dri2_interop_query_device_info,
+   .interop_export_object = dri2_interop_export_object
 };
 
 static void
@@ -1145,6 +1141,9 @@ dri2BindExtensions(struct dri2_screen *psc, struct 
glx_display * priv,
  psc->rendererQuery = (__DRI2rendererQueryExtension *) extensions[i];
  __glXEnableDirectExtension(&psc->base, "GLX_MESA_query_renderer");
   }
+
+  if (strcmp(extensions[i]->name, __DRI2_INTEROP) == 0)
+psc->interop = (__DRI2interopExtension*)extensions[i];
}
 }
 
diff --git a/src/glx/dri2_priv.h b/src/glx/dri2_priv.h
index b93d158..7947740 100644
--- a/src/glx/dri2_priv.h
+++ b/src/glx/dri2_priv.h
@@ -43,6 +43,7 @@ struct dri2_screen {
const __DRItexBufferExtension *texBuffer;
const __DRI2throttleExtension *throttle;
const __DRI2rendererQueryExtension *rendererQuery;
+   const __DRI2interopExtension *interop;
const __DRIconfig **driver_configs;
 
void *driver;
@@ -51,6 +52,12 @@ struct dri2_screen {
int show_fps_interval;
 };
 
+struct dri2_context
+{
+   struct glx_context base;
+   __DRIcontext *driContext;
+};
+
 _X_HIDDEN int
 dri2_query_renderer_integer(struct glx_screen *base, int attribute,
 unsigned int *value);
@@ -58,3 +65,15 @@ dri2_query_renderer_integer(struct glx_screen *base, int 
attribute,
 _X_HIDDEN int
 dri2_query_renderer_string(struct glx_screen *base, int attribute,
const char **value);
+
+_X_HIDDEN int
+dri2_interop_query_device_info(struct glx_context *ctx,
+   unsigned out_size,
+   mesa_glinterop_device_info *out);
+
+_X_HIDDEN int
+dri2_interop_export_object(struct glx_context *ctx,
+   unsigned in_size,
+   const mesa_glinterop_export_in *in,
+   unsigned out_size,
+   mesa_glinterop_export_out *out);
diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 6054ffc..6729357 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -638,6 +638,8 @@ static const struct glx_context_vtable dri3_context_vtable 
= {
.bind_tex_image  = dri3_bind_tex_image,
.release_tex_image   = dri3_release_tex_image,
.get_proc_address= NULL,
+   .interop_query_device_info = dri3_interop_query_device_info,
+   .interop_export_object = dri3_interop_export_object
 };
 
 /** dri3_bind_extensions
@@ -704,6 +706,9 @@ dri3_bind_extensions(struct dri3_screen *psc, struct 
glx_display * priv,
  psc->rendererQuery = (__DRI2rendererQueryExtension *) extensions[i];
  __glXEnableDirectExtension(&psc->base, "GLX_MESA_query_renderer");
   }
+
+  if (strcmp(extensions[i]->name, __DRI2_INTEROP) == 0)
+psc->interop = (__DRI2interopExtension*)extensions[i];
}
 }
 
diff --git a/src/glx/dri3_priv.h b/src/glx/dri3_priv.h
index 56a6330..45a982f 100644
--- a/src/glx/dri3_priv.h
+++ b/src/glx/dri3_priv.h
@@ -96,6 +96,7 @@ struct dri3_screen {
const __DRI2configQueryExtension *config;
const __DRItexBufferExtension *texBuffer;
const __DRI2rendererQueryExtension *rendererQuery;
+   const __DRI2interopExtension *interop;
const __DRIconfig **driver_configs;
 
void *driver;
@@ -131,3 +132,15 @@ dri3_query_renderer_integer(struct glx_screen *base, int 
attribute,
 _X_HIDDEN int
 dri3_query_renderer_string(struct glx_screen *base, int attribute,
const char **value);
+
+_X_HIDDEN int
+dri3

Re: [Mesa-dev] [PATCH] nouveau: Fix clang reserved-user-defined-literal error.

2016-03-09 Thread Pierre Moreau
I did hit that issue as well, but I have C++11 forced on my SPIR-V branch.

I guess adding the whitespace will still result in code that works with older
C++ version, so the fix can still be accepted even if we do not plan to switch
to C++11 by default.
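
For context on why the space matters: in C++11 a string literal immediately followed by an identifier is read as a single token with a user-defined-literal suffix, so fmt is never expanded. With the space, fmt is a separate token and the adjacent literals are concatenated as before, in C and in every C++ version. A minimal C sketch, where DEMO_PREFIX and demo_format are hypothetical names and the function name and line number are fixed to keep the result predictable:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the prefix used by NOUVEAU_ERR. The space before
 * `fmt` keeps it a separate token, so it is macro-expanded and the two
 * adjacent string literals are then concatenated by the compiler. */
#define DEMO_PREFIX(fmt) "%s:%d - " fmt

/* Formats a message the way the macro would, into a caller-supplied buffer.
 * Returns snprintf's result. */
static int
demo_format(char *buf, size_t size, int value)
{
   assert(buf != NULL && size > 0);
   return snprintf(buf, size, DEMO_PREFIX("value=%d"), "demo", 19, value);
}
```

Calling demo_format with value 7 should leave "demo:19 - value=7" in the buffer.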

Pierre


On 11:16 AM - Mar 09 2016, Samuel Pitoiset wrote:
> Nouveau doesn't use c++11 except the codegen part.
> How do you hit that issue? Pretty sure that you forced c++11, right?
> 
> I can't reproduce that compilation error with clang 3.9 btw.
> 
> On 03/09/2016 09:57 AM, Vinson Lee wrote:
> >   CXX  codegen/nv50_ir.lo
> >In file included from codegen/nv50_ir.cpp:28:
> >./nouveau_debug.h:19:30: error: invalid suffix on literal; C++11 requires a 
> >space between literal and identifier
> >   [-Wreserved-user-defined-literal]
> >fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
> >  ^
> >
> >Signed-off-by: Vinson Lee 
> >---
> >  src/gallium/drivers/nouveau/nouveau_debug.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >diff --git a/src/gallium/drivers/nouveau/nouveau_debug.h 
> >b/src/gallium/drivers/nouveau/nouveau_debug.h
> >index d17df81..546a4ad 100644
> >--- a/src/gallium/drivers/nouveau/nouveau_debug.h
> >+++ b/src/gallium/drivers/nouveau/nouveau_debug.h
> >@@ -16,7 +16,7 @@
> >  #define NOUVEAU_DEBUG 0
> >
> >  #define NOUVEAU_ERR(fmt, args...) \
> >-   fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
> >+   fprintf(stderr, "%s:%d - " fmt, __FUNCTION__, __LINE__, ##args)
> >
> >  #define NOUVEAU_DBG(ch, args...)   \
> > if ((NOUVEAU_DEBUG) & (NOUVEAU_DEBUG_##ch))\
> >
> 
> -- 
> -Samuel




Re: [Mesa-dev] [PATCH 17/26] gallium/radeon: disable CMASK on handle export if sharing doesn't allow it

2016-03-09 Thread Nicolai Hähnle



On 09.03.2016 05:56, Marek Olšák wrote:
> On Wed, Mar 9, 2016 at 7:18 AM, Nicolai Hähnle wrote:
> > On 08.03.2016 14:35, Marek Olšák wrote:
> > > On Tue, Mar 8, 2016 at 4:41 AM, Michel Dänzer wrote:
> > > > On 03.03.2016 01:36, Marek Olšák wrote:
> > > > > From: Marek Olšák
> > > > >
> > > > > The disabling of CMASK is simple, but notifying all contexts about
> > > > > it is not:
> > > > > - The screen must have a list of all contexts.
> > > > > - Each context must have a monotonic counter that is incremented
> > > > >   only when the screen wants to re-emit framebuffer states.
> > > > > - Each context must check in draw_vbo if the counter has been
> > > > >   changed and re-emit the framebuffer state accordingly.
> > > >
> > > > The list seems a bit overkill. How about having dirty_fb_counter in the
> > > > screen and last_dirty_fb_counter in the context, incrementing the former
> > > > in r600_dirty_all_framebuffer_states and emitting the framebuffer state
> > > > if the two counters don't match?
> > >
> > > Thanks. The updated patch is attached. Please review.
> >
> > There is an unneeded empty line in this hunk:
> >
> > @@ -260,6 +265,31 @@ static void r600_eliminate_fast_color_clear(struct r600_common_screen *rscreen,
> >  pipe_mutex_unlock(rscreen->aux_context_lock);
> >   }
> >
> > +static void r600_texture_disable_cmask(struct r600_common_screen *rscreen,
> > +  struct r600_texture *rtex)
> > +{
> > +
> > +   if (!rtex->cmask.size)
> > +   return;
> >
> > Slightly further down, I believe the pipe_resource_reference should be
> > unconditional:
> >
> > +   if (rtex->cmask_buffer != &rtex->resource)
> > +   pipe_resource_reference((struct pipe_resource**)&rtex->cmask_buffer, NULL);
>
> It's how cmask_buffer is unreferenced everywhere. If the texture had
> multiple samples, cmask_buffer would be part of the resource and so
> the code would release the resource, which would be undesirable.

Ah, in that case cmask_buffer doesn't actually own a reference count - I
see. That's a bit surprising, but orthogonal to this patch, so go ahead
and add my R-b with only Michel's and my other comment.

Cheers,
Nicolai

> This is not a possible scenario yet, because texture_get_handle doesn't
> support MSAA surfaces, but it matches the unref code elsewhere.
>
> Marek
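
The two-counter scheme Michel suggests in the quoted text can be sketched as below. This is only an illustration under assumptions: the demo_* types and functions are hypothetical, only the counter names come from the thread, and the sketch ignores the locking or atomics a real multi-context driver would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical screen: one counter shared by all contexts. */
struct demo_screen {
   unsigned dirty_fb_counter;
};

/* Hypothetical context: remembers the last counter value it acted on. */
struct demo_context {
   struct demo_screen *screen;
   unsigned last_dirty_fb_counter;
   unsigned fb_emits;   /* how often this context re-emitted its state */
};

/* Plays the role of r600_dirty_all_framebuffer_states: bump the screen
 * counter instead of walking a list of contexts. */
static void
demo_dirty_all_framebuffer_states(struct demo_screen *screen)
{
   assert(screen != NULL);
   screen->dirty_fb_counter++;
}

/* Plays the role of the check at draw time: re-emit the framebuffer state
 * only if the screen counter moved since this context last looked. */
static bool
demo_draw(struct demo_context *ctx)
{
   assert(ctx != NULL && ctx->screen != NULL);

   unsigned counter = ctx->screen->dirty_fb_counter;
   if (counter == ctx->last_dirty_fb_counter)
      return false;

   ctx->last_dirty_fb_counter = counter;
   ctx->fb_emits++;
   return true;
}
```

Each context catches up on its own next draw, so the screen never needs to know which contexts exist.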




Re: [Mesa-dev] [PATCH] nouveau: Fix clang reserved-user-defined-literal error.

2016-03-09 Thread Samuel Pitoiset



On 03/09/2016 01:46 PM, Pierre Moreau wrote:
> I did hit that issue as well, but I have C++11 forced on my SPIR-V branch.
>
> I guess adding the whitespace will still result in code that works with older
> C++ version, so the fix can still be accepted even if we do not plan to switch
> to C++11 by default.

Sure, the patch looks fine, but I wonder how he did hit that issue. :-)

Anyway, if this doesn't break compilation without c++11, this patch is:

Reviewed-by: Samuel Pitoiset

> Pierre
>
>
> On 11:16 AM - Mar 09 2016, Samuel Pitoiset wrote:
> > Nouveau doesn't use c++11 except the codegen part.
> > How do you hit that issue? Pretty sure that you forced c++11, right?
> >
> > I can't reproduce that compilation error with clang 3.9 btw.
> >
> > On 03/09/2016 09:57 AM, Vinson Lee wrote:
> > >   CXX  codegen/nv50_ir.lo
> > > In file included from codegen/nv50_ir.cpp:28:
> > > ./nouveau_debug.h:19:30: error: invalid suffix on literal; C++11 requires a
> > > space between literal and identifier
> > >    [-Wreserved-user-defined-literal]
> > > fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
> > >   ^
> > >
> > > Signed-off-by: Vinson Lee
> > > ---
> > >   src/gallium/drivers/nouveau/nouveau_debug.h | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/src/gallium/drivers/nouveau/nouveau_debug.h
> > > b/src/gallium/drivers/nouveau/nouveau_debug.h
> > > index d17df81..546a4ad 100644
> > > --- a/src/gallium/drivers/nouveau/nouveau_debug.h
> > > +++ b/src/gallium/drivers/nouveau/nouveau_debug.h
> > > @@ -16,7 +16,7 @@
> > >   #define NOUVEAU_DEBUG 0
> > >
> > >   #define NOUVEAU_ERR(fmt, args...) \
> > > -   fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
> > > +   fprintf(stderr, "%s:%d - " fmt, __FUNCTION__, __LINE__, ##args)
> > >
> > >   #define NOUVEAU_DBG(ch, args...)   \
> > >  if ((NOUVEAU_DEBUG) & (NOUVEAU_DEBUG_##ch))\
> > >
> >
> > --
> > -Samuel


--
-Samuel


Re: [Mesa-dev] [GSoC2016] Interested in implementing "Soft" double precision floating point support

2016-03-09 Thread Emil Velikov
Hello Elie,

On 9 March 2016 at 10:25, tournier.elie  wrote:
> Hi everyone.
>
> My name is Elie TOURNIER, I am enrolled in a French Engineering school
> (Telecom Physique Strasbourg) specialized in Medical ICT.
> I'm interested in implementing "Soft" double precision floating point
> support [1].
> Taking on this subject seems to be a good way to get my feet wet in the Mesa
> code and discover how some of its components work.
>
> I am writing to introduce myself and also to gather information that will
> help this project succeed.
>
> I would like to know more about the following things to understand your
> requirements:
> 1- "Each double precision value would be stored in a uvec2" The IEEE double
> precision floating point standard representation requires 64 bits: 1 for the
> sign, 11 for the exponent and the remaining 52 for the fraction [2].
> -> How must a double precision value be stored?
As one cannot assume the presence of doubles, one will need to return
the value as two "unsigned" values. uvec2 is the GLSL data type that
represents that. I believe one should have at least a basic understanding
of GLSL for this project.

https://www.opengl.org/wiki/Data_Type_(GLSL)
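
To make the uvec2 idea concrete, here is a minimal C sketch of splitting a 64-bit double into two 32-bit unsigned words and reading the sign, exponent and fraction back out. It assumes IEEE 754 binary64 doubles on the host; struct demo_uvec2 and the demo_* helpers are hypothetical names, and putting the low word in x and the high word in y is an assumption of this sketch, not something the project idea specifies.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a GLSL uvec2: x holds the low 32 bits and y the
 * high 32 bits of the double (word order is an assumption of this sketch). */
struct demo_uvec2 {
   uint32_t x;
   uint32_t y;
};

/* Copies the raw bits of a double into two 32-bit words. */
static struct demo_uvec2
demo_pack_double(double d)
{
   uint64_t bits;
   struct demo_uvec2 v;

   assert(sizeof(bits) == sizeof(d));
   memcpy(&bits, &d, sizeof(bits));
   v.x = (uint32_t)(bits & 0xffffffffu);
   v.y = (uint32_t)(bits >> 32);
   return v;
}

/* Bit 63 of the double is bit 31 of the high word. */
static unsigned
demo_sign(struct demo_uvec2 v)
{
   return v.y >> 31;
}

/* The 11 exponent bits sit just below the sign bit in the high word. */
static unsigned
demo_exponent(struct demo_uvec2 v)
{
   return (v.y >> 20) & 0x7ffu;
}

/* The 52 fraction bits are the low 20 bits of y followed by all of x. */
static uint64_t
demo_fraction(struct demo_uvec2 v)
{
   return ((uint64_t)(v.y & 0xfffffu) << 32) | v.x;
}
```

For example, 1.0 packs to y = 0x3ff00000 and x = 0, that is sign 0, biased exponent 1023 and fraction 0. A soft-fp64 implementation would do its arithmetic on fields extracted this way using integer operations only.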

> 2- Where can I find GL_ARB_gpu_shader_fp64 documentation?
>
https://www.opengl.org/registry/specs/ARB/gpu_shader_fp64.txt

>
> This is my first exposure to Mesa. Please excuse me if I am asking basic
> questions.
>
> Please point me to the right resources so that I can better understand the
> project. I would also be happy to fix a bug to familiarize myself  with the
> source code. Any suggestions on bugs that are relevant to the project will
> be of great help.
>
First I would suggest that you rebuild mesa master based on your
distribution's recommended method. You should be comfortable doing
this and using the final package.

Skim through the git history, see if any of that makes sense, is too
easy/hard, etc.

Past that you should demonstrate your ability to understand code and
use git, and get to know the procedure for creating, submitting and
reviewing patches. See http://mesa3d.org/devinfo.html for some of
the bits.

On the question of "what patches" - I would start with something easy.
From resolving build warnings to tackling/skimming through the bugs
lists - depending on your hardware. I believe we ought to have a list
of simple contributions somewhere, but I cannot find it atm.

https://bugs.freedesktop.org/buglist.cgi?action=wrap&bug_status=NEW&component=Drivers%2FDRI%2Fnouveau
https://bugs.freedesktop.org/buglist.cgi?action=wrap&bug_status=NEW&component=Drivers%2FDRI%2Fi965

Last but not least, do come around to #dri-devel on FreeNode. Try to
get a feel for what people are talking about and ping prospective
mentors.

I believe that's enough information to keep you busy for a bit; if not,
do follow up, be that here or on IRC.

Regards,
Emil


Re: [Mesa-dev] [PATCH] egl: clean up typedef madness in the backend API

2016-03-09 Thread Marek Olšák
Ping

On Thu, Mar 3, 2016 at 8:35 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> let's use the dd.h format
> ---
>  src/egl/main/eglapi.h   | 280 
> ++--
>  src/egl/main/eglfallbacks.c |  30 ++---
>  2 files changed, 155 insertions(+), 155 deletions(-)
>
> diff --git a/src/egl/main/eglapi.h b/src/egl/main/eglapi.h
> index 6c54c7c..3f6d3c2 100644
> --- a/src/egl/main/eglapi.h
> +++ b/src/egl/main/eglapi.h
> @@ -41,153 +41,153 @@ extern "C" {
>   */
>  typedef void (*_EGLProc)(void);
>
> -
> -/**
> - * Typedefs for all EGL API entrypoint functions.
> - */
> -
> -/* driver funcs */
> -typedef EGLBoolean (*Initialize_t)(_EGLDriver *, _EGLDisplay *dpy);
> -typedef EGLBoolean (*Terminate_t)(_EGLDriver *, _EGLDisplay *dpy);
> -
> -/* config funcs */
> -typedef EGLBoolean (*GetConfigs_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> EGLConfig *configs, EGLint config_size, EGLint *num_config);
> -typedef EGLBoolean (*ChooseConfig_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> const EGLint *attrib_list, EGLConfig *configs, EGLint config_size, EGLint 
> *num_config);
> -typedef EGLBoolean (*GetConfigAttrib_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLConfig *config, EGLint attribute, EGLint *value);
> -
> -/* context funcs */
> -typedef _EGLContext *(*CreateContext_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLConfig *config, _EGLContext *share_list, const EGLint *attrib_list);
> -typedef EGLBoolean (*DestroyContext_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLContext *ctx);
> -/* this is the only function (other than Initialize) that may be called with 
> an uninitialized display */
> -typedef EGLBoolean (*MakeCurrent_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLSurface *draw, _EGLSurface *read, _EGLContext *ctx);
> -typedef EGLBoolean (*QueryContext_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLContext *ctx, EGLint attribute, EGLint *value);
> -
> -/* surface funcs */
> -typedef _EGLSurface *(*CreateWindowSurface_t)(_EGLDriver *drv, _EGLDisplay 
> *dpy, _EGLConfig *config, void *native_window, const EGLint *attrib_list);
> -typedef _EGLSurface *(*CreatePixmapSurface_t)(_EGLDriver *drv, _EGLDisplay 
> *dpy, _EGLConfig *config, void *native_pixmap, const EGLint *attrib_list);
> -typedef _EGLSurface *(*CreatePbufferSurface_t)(_EGLDriver *drv, _EGLDisplay 
> *dpy, _EGLConfig *config, const EGLint *attrib_list);
> -typedef EGLBoolean (*DestroySurface_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLSurface *surface);
> -typedef EGLBoolean (*QuerySurface_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLSurface *surface, EGLint attribute, EGLint *value);
> -typedef EGLBoolean (*SurfaceAttrib_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLSurface *surface, EGLint attribute, EGLint value);
> -typedef EGLBoolean (*BindTexImage_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLSurface *surface, EGLint buffer);
> -typedef EGLBoolean (*ReleaseTexImage_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLSurface *surface, EGLint buffer);
> -typedef EGLBoolean (*SwapInterval_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLSurface *surf, EGLint interval);
> -typedef EGLBoolean (*SwapBuffers_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLSurface *draw);
> -typedef EGLBoolean (*CopyBuffers_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLSurface *surface, void *native_pixmap_target);
> -
> -/* misc funcs */
> -typedef EGLBoolean (*WaitClient_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLContext *ctx);
> -typedef EGLBoolean (*WaitNative_t)(_EGLDriver *drv, _EGLDisplay *dpy, EGLint 
> engine);
> -
> -/* this function may be called from multiple threads at the same time */
> -typedef _EGLProc (*GetProcAddress_t)(_EGLDriver *drv, const char *procname);
> -
> -
> -
> -typedef _EGLSurface *(*CreatePbufferFromClientBuffer_t)(_EGLDriver *drv, 
> _EGLDisplay *dpy, EGLenum buftype, EGLClientBuffer buffer, _EGLConfig 
> *config, const EGLint *attrib_list);
> -
> -
> -typedef _EGLImage *(*CreateImageKHR_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLContext *ctx, EGLenum target, EGLClientBuffer buffer, const EGLint 
> *attr_list);
> -typedef EGLBoolean (*DestroyImageKHR_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLImage *image);
> -
> -
> -typedef _EGLSync *(*CreateSyncKHR_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> EGLenum type, const EGLint *attrib_list, const EGLAttrib *attrib_list64);
> -typedef EGLBoolean (*DestroySyncKHR_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLSync *sync);
> -typedef EGLint (*ClientWaitSyncKHR_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLSync *sync, EGLint flags, EGLTime timeout);
> -typedef EGLint (*WaitSyncKHR_t)(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSync 
> *sync);
> -typedef EGLBoolean (*SignalSyncKHR_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLSync *sync, EGLenum mode);
> -typedef EGLBoolean (*GetSyncAttrib_t)(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLSync *sync, EGLint attribute, EGLAttrib *value);
> -
> -
> -typedef EGLBoolean (*SwapBuffersRegionNOK_t)(_EGLDriver *drv, _EGLDisplay 
> *disp, _EGLSurface *surf, EGLint

Re: [Mesa-dev] [PATCH 05/10] radeonsi: ignore PIPE_BIND_LINEAR in si_is_format_supported

2016-03-09 Thread Marek Olšák
On Wed, Mar 9, 2016 at 4:54 AM, Michel Dänzer  wrote:
> On 08.03.2016 21:21, Christian König wrote:
>> From: Christian König 
>>
>> Linear layout should work for all formats as well.
>
> The hardware actually doesn't support linear e.g. for compressed formats
> or depth/stencil formats.

The driver ignores the flag for compressed formats. It doesn't ignore
the flag for depth/stencil formats, but it does support binding a
linear depth/stencil buffer as a color buffer.

That said, I think PIPE_BIND_LINEAR should not be an allowed parameter
of is_format_supported (unless we have a very good reason to allow it).

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: dont allow undefined array sizes in ES

2016-03-09 Thread Iago Toral
On Tue, 2016-03-08 at 20:35 +1100, Timothy Arceri wrote:
> This applies the rule to empty declarations.
> 
> Fixes:
> dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_vertex
> dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_fragment
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/src/compiler/glsl/ast_to_hir.cpp 
> b/src/compiler/glsl/ast_to_hir.cpp
> index d755a11..8918981 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -4223,6 +4223,17 @@ ast_declarator_list::hir(exec_list *instructions,
>type_name);
>} else {
>   if (decl_type->base_type == GLSL_TYPE_ARRAY) {
> +/* From Section 13.22 (Array Declarations) of the GLSL ES 3.2
> + * spec:
> + *
> + *"... any declaration that leaves the size undefined is
> + *disallowed as this would add complexity and there are no
> + *use-cases."
> + */
> +if (state->es_shader && decl_type->is_unsized_array())
> +   _mesa_glsl_error(&loc, state, "array size must be explicitly "
> +"or implicitly defined");

What about unsized arrays in SSBOs? Unsized arrays are allowed as the
last element in a SSBO declaration. This is a special case because the
size of the array is implicitly set by the size of the underlying buffer
object.
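
To illustrate the special case: the length of such an array is not known at compile time but derived from the size of the bound buffer. A minimal sketch in C, assuming the array's byte offset inside the block and its element stride are already known from the block layout (the helper name is hypothetical, not a Mesa function):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a Mesa function: number of elements of an
 * unsized array that is the last member of a buffer block, derived from
 * the size of the buffer bound at draw time. */
static size_t
unsized_array_length(size_t buffer_size, size_t array_offset, size_t stride)
{
   assert(stride > 0);

   /* A buffer too small to reach the array holds zero elements. */
   if (buffer_size <= array_offset)
      return 0;

   /* Only whole elements count; a trailing partial element is ignored. */
   return (buffer_size - array_offset) / stride;
}
```

So the size is "implicitly defined" in the sense of the error message above, and an ES-only check would presumably need to let this case through.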

Iago

>  /* From Section 4.12 (Empty Declarations) of the GLSL 4.5 spec:
>   *
>   *"The combinations of types and qualifiers that cause




Re: [Mesa-dev] [PATCH] i965/cfg: Remove redundant #pragma once.

2016-03-09 Thread Iago Toral
On Tue, 2016-03-08 at 17:42 -0800, Francisco Jerez wrote:
> brw_cfg.h already has include guards, remove the "#pragma once" which
> is redundant and non-standard.

FWIW, I think using both #pragma once and include guards is a way to
keep portability while still getting the performance advantage of
#pragma once where it is supported.

Also it seems that we do the same thing in many other files...

> ---
>  src/mesa/drivers/dri/i965/brw_cfg.h | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h 
> b/src/mesa/drivers/dri/i965/brw_cfg.h
> index 405020b..a2ca6b1 100644
> --- a/src/mesa/drivers/dri/i965/brw_cfg.h
> +++ b/src/mesa/drivers/dri/i965/brw_cfg.h
> @@ -25,7 +25,6 @@
>   *
>   */
>  
> -#pragma once
>  #ifndef BRW_CFG_H
>  #define BRW_CFG_H
>  




Re: [Mesa-dev] [PATCH] egl: clean up typedef madness in the backend API

2016-03-09 Thread Emil Velikov
On 3 March 2016 at 19:35, Marek Olšák  wrote:
> From: Marek Olšák 
>
> let's use the dd.h format
Personally I don't see it as madness, then again I'm fine with either approach.

Fwiw
Reviewed-by: Emil Velikov 

-Emil


Re: [Mesa-dev] [PATCH 1/4] glcpp: Implicitly resolve version after the first non-space/hash token.

2016-03-09 Thread Jon Turney

On 05/03/2016 03:33, Kenneth Graunke wrote:

> We resolved the implicit version directive when processing control lines,
> such as #ifdef, to ensure any built-in macros exist.  However, we failed
> to resolve it when handling ordinary text.
>
> [...]
>
> diff --git a/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected 
> b/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected
> new file mode 100644
> index 000..2872090
> --- /dev/null
> +++ b/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected
> @@ -0,0 +1,3 @@
> +0:1(3): preprocessor error: #version must appear on the first line
> +
> +


This last test fails in glcpp-test-cr-lf for me (See attached).

Can you just confirm that it passes for you, before I start looking into 
why it might fail just for me...?


= Testing with \\r line terminators (old Mac format) =
== Testing for correctness ==
[...]

147/147 tests returned correct results

PASS
= Testing with \\r\\n line terminators (DOS format) =
== Testing for correctness ==
[...]
Testing subtest-cr-lf/146-version-first-hash.c... > 
/jhbuild/x86_64-pc-cygwin/build/mesa/mesa/src/compiler/glsl/glcpp/tests/subtest-cr-lf/146-version-first-hash.c.out
 (subtest-cr-lf/146-version-first-hash.c.expected) FAIL
--- subtest-cr-lf/146-version-first-hash.c.expected 2016-03-09 
13:39:45.679154000 +
+++ 
/jhbuild/x86_64-pc-cygwin/build/mesa/mesa/src/compiler/glsl/glcpp/tests/subtest-cr-lf/146-version-first-hash.c.out
  2016-03-09 13:40:15.043069600 +
@@ -1,3 +1,3 @@
-0:1(3): preprocessor error: #version must appear on the first line
+0:1(4): preprocessor error: #version must appear on the first line
 
 

146/147 tests returned correct results

FAIL
= Testing with \\n\\r (bizarre, but allowed by GLSL spec.) =
== Testing for correctness ==
[...]
Testing subtest-lf-cr/146-version-first-hash.c... > 
/jhbuild/x86_64-pc-cygwin/build/mesa/mesa/src/compiler/glsl/glcpp/tests/subtest-lf-cr/146-version-first-hash.c.out
 (subtest-lf-cr/146-version-first-hash.c.expected) FAIL
--- subtest-lf-cr/146-version-first-hash.c.expected 2016-03-09 
13:40:32.955390600 +
+++ 
/jhbuild/x86_64-pc-cygwin/build/mesa/mesa/src/compiler/glsl/glcpp/tests/subtest-lf-cr/146-version-first-hash.c.out
  2016-03-09 13:41:04.620827800 +
@@ -1,3 +1,3 @@
-0:1(3): preprocessor error: #version must appear on the first line
+0:1(4): preprocessor error: #version must appear on the first line
 
 

146/147 tests returned correct results

FAIL

1/3 tests returned correct results


Re: [Mesa-dev] [PATCH] glcpp: Fix locations when encounting "#".

2016-03-09 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Tue, 2016-03-08 at 19:09 -0800, Kenneth Graunke wrote:
> We were failing to reset our location tracking when encountering a
> NEWLINE in the <HASH> state.  Rip the code from the <*>{NEWLINE} rule,
> which handles this properly.
> 
> Also, update 146-version-first-hash.c to have proper expectations.
> When I introduced the test, I didn't verify that the line/column
> numbers were correct, and it turns out they varied based on the type
> of newline ending.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94447
> Signed-off-by: Kenneth Graunke 
> ---
>  src/compiler/glsl/glcpp/glcpp-lex.l | 3 +++
>  src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected | 2 +-
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/src/compiler/glsl/glcpp/glcpp-lex.l 
> b/src/compiler/glsl/glcpp/glcpp-lex.l
> index 071918e..d09441a 100644
> --- a/src/compiler/glsl/glcpp/glcpp-lex.l
> +++ b/src/compiler/glsl/glcpp/glcpp-lex.l
> @@ -320,6 +320,9 @@ HEXADECIMAL_INTEGER   0[xX][0-9a-fA-F]+[uU]?
>  
>  <HASH>{NEWLINE} {
>   BEGIN INITIAL;
> + yyextra->space_tokens = 0;
> + yylineno++;
> + yycolumn = 0;
>   RETURN_TOKEN_NEVER_SKIP (NEWLINE);
>  }
>  
> diff --git a/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected 
> b/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected
> index 2872090..e8e4497 100644
> --- a/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected
> +++ b/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected
> @@ -1,3 +1,3 @@
> -0:1(3): preprocessor error: #version must appear on the first line
> +0:2(1): preprocessor error: #version must appear on the first line
>  
> 
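
The idea behind the fix, bumping the line and resetting the column every time a newline is consumed no matter how that newline is spelled, can be sketched outside of flex. The following is only an illustration under assumed names (`example_loc`, `example_locate`); it is not the glcpp lexer, and it counts columns from 1:

```c
#include <stddef.h>

struct example_loc {
   int line;     /* 1-based line number */
   int column;   /* 1-based column of the character at the target index */
};

/* Hypothetical helper, not glcpp code: location of the character at index
 * `target`, treating "\n", "\r", "\r\n" and "\n\r" each as one terminator. */
static struct example_loc
example_locate(const char *text, size_t target)
{
   struct example_loc loc = { 1, 1 };
   size_t i = 0;

   while (i < target && text[i] != '\0') {
      char c = text[i];

      if (c == '\n' || c == '\r') {
         char next = text[i + 1];

         /* The second half of a two-character terminator is the *other*
          * newline character; swallow it so it is not counted twice. */
         if ((next == '\n' || next == '\r') && next != c)
            i++;

         loc.line++;
         loc.column = 1;
      } else {
         loc.column++;
      }
      i++;
   }
   return loc;
}
```

With this scheme a token that follows a lone "#" line gets the same line and column for every terminator style, which is the property the CR-LF and LF-CR subtests check.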




Re: [Mesa-dev] [PATCH 1/5] include/GL: add mesa_glinterop.h for OpenGL-OpenCL interop (v3)

2016-03-09 Thread Emil Velikov
On 8 March 2016 at 22:29, Marek Olšák  wrote:

> Actually, I don't see how the version number would make it any better
> for the structures, but returning the version number by
> QueryDeviceInfo would be useful for the caller to know what to expect
> if Mesa version < caller version. The sizes are still useful if Mesa
> version > caller version.
>
If any of this is an issue, then the whole DRI model just won't work ;-)

I'm thinking that the following should work. Please let me know if I'm
losing the plot.

The caller sets up the structure and sets the version of the interface it
provides. The callee first checks whether it can work with the provided
version, then proceeds as planned.
Passing around multiple sizes is ugly and error prone to a point. One
gets the order wrong (swaps in_size and out_size) or just passes the same
sizeof(struct foo) in both places.
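
A rough sketch of that flow, to make the proposal concrete. All names here (`example_device_info`, `example_query_device_info`, the field names and the dummy values) are made up for illustration and are not the mesa_glinterop.h API:

```c
#include <stdint.h>

/* Newest interface version this (hypothetical) callee implements. */
#define EXAMPLE_INFO_VERSION 2

struct example_device_info {
   uint32_t version;     /* in:  version the caller was built against
                          * out: version the callee actually filled in */
   uint32_t pci_bus;     /* present since version 1 */
   uint32_t vendor_id;   /* present since version 2 */
};

/* Returns 0 on success, -1 if the caller's version cannot be served. */
static int
example_query_device_info(struct example_device_info *out)
{
   uint32_t version;

   /* Callee checks first whether it can work with the caller's version. */
   if (out->version < 1)
      return -1;

   /* Serve the older of the two versions. */
   version = out->version < EXAMPLE_INFO_VERSION ? out->version
                                                 : EXAMPLE_INFO_VERSION;

   out->pci_bus = 3;             /* dummy value */

   /* Only touch fields that exist in the version being served: an older
    * caller may have allocated a smaller structure. */
   if (version >= 2)
      out->vendor_id = 0x1002;   /* dummy value */

   out->version = version;
   return 0;
}
```

A single version field covers both directions: an old caller is never handed fields it does not know about, and a new caller can read back which version an older callee actually filled in.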

Thanks
Emil


Re: [Mesa-dev] [PATCH] i965: Set a proper _BaseFormat for window system renderbuffers in ES.

2016-03-09 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Tue, 2016-03-08 at 20:50 -0800, Kenneth Graunke wrote:
> intel_alloc_private_renderbuffer_storage did:
> 
>rb->_BaseFormat = _mesa_base_fbo_format(ctx, internalFormat);
> 
> Unfortunately, internalFormat was usually an unsized format (such as
> GL_DEPTH_COMPONENT).  In OpenGL ES, _mesa_base_fbo_format() refuses to
> accept unsized formats, and returns 0 rather than a real base format.
> 
> This meant that we ended up with a completely bogus rb->_BaseFormat for
> window system buffers on OpenGL ES.  All other renderbuffer allocation
> functions in intel_fbo.c instead use the mesa_format, and do:
> 
>rb->_BaseFormat = _mesa_get_format_base_format(...);
> 
> We can do likewise, using rb->Format.  This appears to work just fine.
> 
> dEQP-GLES3.functional.state_query.fbo.framebuffer_attachment_x_size_initial
> failed, as it tried to perform a GL_FRAMEBUFFER_ATTACHMENT_DEPTH_SIZE query
> on the window system depth buffer.  That query relies on a proper
> rb->_BaseFormat being set, so it broke because rb->_BaseFormat was 0 due
> to the above bug.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94458
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/intel_fbo.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c 
> b/src/mesa/drivers/dri/i965/intel_fbo.c
> index 3a4a53a..b7b6796 100644
> --- a/src/mesa/drivers/dri/i965/intel_fbo.c
> +++ b/src/mesa/drivers/dri/i965/intel_fbo.c
> @@ -289,7 +289,7 @@ intel_alloc_private_renderbuffer_storage(struct 
> gl_context * ctx, struct gl_rend
> rb->NumSamples = intel_quantize_num_samples(screen, rb->NumSamples);
> rb->Width = width;
> rb->Height = height;
> -   rb->_BaseFormat = _mesa_base_fbo_format(ctx, internalFormat);
> +   rb->_BaseFormat = _mesa_get_format_base_format(rb->Format);
>  
> intel_miptree_release(&irb->mt);
>  




Re: [Mesa-dev] [PATCH 01/10] gallium/winsys/drm: add offset to struct winsys_handle

2016-03-09 Thread Marek Olšák
For patches 1-4:

Reviewed-by: Marek Olšák 

Marek

On Tue, Mar 8, 2016 at 1:21 PM, Christian König  wrote:
> From: Christian König 
>
> We are going to need this for EGL_EXT_image_dma_buf_import.
>
> Signed-off-by: Christian König 
> ---
>  src/gallium/include/state_tracker/drm_driver.h| 5 +
>  src/gallium/state_trackers/dri/dri2.c | 2 ++
>  src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 1 +
>  src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 1 +
>  src/gallium/winsys/svga/drm/vmw_screen_dri.c  | 1 +
>  src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 3 +++
>  6 files changed, 13 insertions(+)
>
> diff --git a/src/gallium/include/state_tracker/drm_driver.h 
> b/src/gallium/include/state_tracker/drm_driver.h
> index 959a762..d81da89 100644
> --- a/src/gallium/include/state_tracker/drm_driver.h
> +++ b/src/gallium/include/state_tracker/drm_driver.h
> @@ -35,6 +35,11 @@ struct winsys_handle
>  * Output for texture_get_handle.
>  */
> unsigned stride;
> +   /**
> +* Input to texture_from_handle.
> +* Output for texture_get_handle.
> +*/
> +   unsigned offset;
>  };
>
>
> diff --git a/src/gallium/state_trackers/dri/dri2.c 
> b/src/gallium/state_trackers/dri/dri2.c
> index a11a6cb..4349775 100644
> --- a/src/gallium/state_trackers/dri/dri2.c
> +++ b/src/gallium/state_trackers/dri/dri2.c
> @@ -533,6 +533,7 @@ dri2_allocate_textures(struct dri_context *ctx,
>   templ.bind = bind;
>   whandle.handle = buf->name;
>   whandle.stride = buf->pitch;
> + whandle.offset = 0;
>   if (screen->can_share_buffer)
>  whandle.type = DRM_API_HANDLE_TYPE_SHARED;
>   else
> @@ -754,6 +755,7 @@ dri2_create_image_from_winsys(__DRIscreen *_screen,
> templ.array_size = 1;
>
> whandle->stride = pitch * util_format_get_blocksize(pf);
> +   whandle->offset = 0;
>
> img->texture = 
> screen->base.screen->resource_from_handle(screen->base.screen,
>   &templ, whandle);
> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c 
> b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
> index 59a801b..41996f3 100644
> --- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
> +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
> @@ -650,6 +650,7 @@ static boolean amdgpu_bo_get_handle(struct pb_buffer 
> *buffer,
>return FALSE;
>
> whandle->stride = stride;
> +   whandle->offset = 0;
> bo->is_shared = true;
> return TRUE;
>  }
> diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c 
> b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
> index 7e9ed0c..3df0a35 100644
> --- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
> +++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
> @@ -1049,6 +1049,7 @@ static boolean radeon_winsys_bo_get_handle(struct 
> pb_buffer *buffer,
>  }
>
>  whandle->stride = stride;
> +whandle->offset = 0;
>  return TRUE;
>  }
>
> diff --git a/src/gallium/winsys/svga/drm/vmw_screen_dri.c 
> b/src/gallium/winsys/svga/drm/vmw_screen_dri.c
> index 01bb0e2..baa22a9 100644
> --- a/src/gallium/winsys/svga/drm/vmw_screen_dri.c
> +++ b/src/gallium/winsys/svga/drm/vmw_screen_dri.c
> @@ -357,6 +357,7 @@ vmw_drm_surface_get_handle(struct svga_winsys_screen *sws,
>  vsrf = vmw_svga_winsys_surface(surface);
>  whandle->handle = vsrf->sid;
>  whandle->stride = stride;
> +whandle->offset = 0;
>
>  switch (whandle->type) {
>  case DRM_API_HANDLE_TYPE_SHARED:
> diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c 
> b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> index 1e85971..9aaee88 100644
> --- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> +++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> @@ -309,17 +309,20 @@ kms_sw_displaytarget_get_handle(struct sw_winsys 
> *winsys,
> case DRM_API_HANDLE_TYPE_KMS:
>whandle->handle = kms_sw_dt->handle;
>whandle->stride = kms_sw_dt->stride;
> +  whandle->offset = 0;
>return TRUE;
> case DRM_API_HANDLE_TYPE_FD:
>if (!drmPrimeHandleToFD(kms_sw->fd, kms_sw_dt->handle,
>   DRM_CLOEXEC, (int*)&whandle->handle)) {
>   whandle->stride = kms_sw_dt->stride;
> + whandle->offset = 0;
>   return TRUE;
>}
>/* fallthrough */
> default:
>whandle->handle = 0;
>whandle->stride = 0;
> +  whandle->offset = 0;
>return FALSE;
> }
>  }
> --
> 2.5.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
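
As a rough illustration of how a consumer might combine the new field with the existing stride, assuming the offset is a byte offset from the start of the buffer to the first row (which is how EGL_EXT_image_dma_buf_import expresses plane offsets). The struct and function names below are hypothetical and only mirror the two layout fields:

```c
#include <stddef.h>

/* Hypothetical mirror of the two layout fields discussed in the patch. */
struct example_handle {
   unsigned stride;   /* bytes between the starts of consecutive rows */
   unsigned offset;   /* bytes from the start of the buffer to row 0 */
};

/* Byte position of pixel (x, y) inside the buffer for a linear layout. */
static size_t
example_pixel_byte_offset(const struct example_handle *h,
                          unsigned x, unsigned y, unsigned bytes_per_pixel)
{
   return (size_t)h->offset
        + (size_t)y * h->stride
        + (size_t)x * bytes_per_pixel;
}
```

With offset fixed at 0, as every winsys in this patch sets it, the result is the familiar y * stride + x * bpp; a non-zero offset only matters once an image is imported from somewhere other than the start of its buffer.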


Re: [Mesa-dev] [PATCH 1/5] include/GL: add mesa_glinterop.h for OpenGL-OpenCL interop (v3)

2016-03-09 Thread Emil Velikov
On 8 March 2016 at 15:39, Marek Olšák  wrote:
> On Thu, Mar 3, 2016 at 11:56 PM, Emil Velikov  
> wrote:
>> Hi Marek,
>>
>> A small question, and a few trivial suggestions. Hopefully I'm not too
>> late for the party.
>>
>> On 3 March 2016 at 19:46, Marek Olšák  wrote:
>>
>>> +typedef struct _mesa_glinterop_device_info {
>>> +   uint32_t size; /* size of this structure */
>>> +
>> I believe Michel suggested a similar thing: Wouldn't it be better to
>> use a version one just like we do for the DRI extensions ? Many other
>> interfaces also use version, some with a combination of size, but this
>> is the first one in my experience that does only size.
>>
>>
>>> +typedef struct _mesa_glinterop_export_in {
>>
>>> +   /* Size of memory pointed to by out_driver_data. */
>>> +   uint32_t out_driver_data_size;
>>> +
>>> +   /* If the caller wants to query driver-specific data about the OpenGL
>>> +* object, this should point to the memory where that data will be 
>>> stored.
>>> +*/
>>> +   void *out_driver_data;
>> I take it that the structure and format of this data will be
>> internal/implementation specific, correct ? As on each side there will
>
> Yes.
>
>> be some sanity checking, wouldn't to be better to have size (version
>> and/or other) within that structure format.
>
> Since amdgpu isn't going to use this feature, I don't care too much about it.
>
> Having the size outside of the driver-specific structure seems safer.
>
Trying to future-proof things does not work nicely, most of the time.
Imho it should be added when there is a user for it.

>>
>> IMHO it's worth mentioning any of that, plus some information about
>> the lifetime expectancy of the data. Thus it's perfectly clear to the
>> user how to manage/use it.
>
> The data pointer should only be used for querying stuff from Mesa. The
> same rules as for the "out" pointer apply.
>
I think there is some misunderstanding here. I wasn't asking "Who is
going to use this data ?", but "Can they use the pointer reliably, or
should they copy the data from it before using it. Copy, because the
opposite end will discard/free the block shortly after the call". I've
seen some people referring to this as lifetime expectancy, not sure if
it's the correct terminology to use here.

Would be nice to add a comment about this in the API.
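
For illustration, one contract that makes the lifetime question moot is "the callee copies into memory the caller owns and never hands out its own pointers". A minimal sketch under that assumption; the function name and the dummy payload are hypothetical, only the out_driver_data / out_driver_data_size pair is taken from the proposed header:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical callee, not the mesa_glinterop.h API.
 *
 * Contract: out_driver_data is owned by the caller. The callee copies at
 * most out_driver_data_size bytes into it during the call and keeps no
 * reference afterwards, so the caller may use or free it at any time.
 * Returns the number of bytes written. */
static uint32_t
example_export_driver_data(void *out_driver_data,
                           uint32_t out_driver_data_size)
{
   /* Stand-in for driver-private data that only lives during the call. */
   const uint8_t private_data[4] = { 1, 2, 3, 4 };
   uint32_t n = sizeof(private_data);

   if (out_driver_data == NULL)
      return 0;

   /* Never write past what the caller said it allocated. */
   if (n > out_driver_data_size)
      n = out_driver_data_size;

   memcpy(out_driver_data, private_data, n);
   return n;
}
```

A comment along the lines of the one above the function is the kind of note that would answer the question in the header itself.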

>>
>>
>>> +GLAPI int GLAPIENTRY
>>> +MesaGLInteropGLXExportObject(Display *dpy, GLXContext context,
>>> + mesa_glinterop_export_in *in,
>>> + mesa_glinterop_export_out *out);
>> Annotating EGL/GLX display and context as const is very uncommon,
>> although we should do that for 'in'. Shouldn't we ?
>
> We can do that, yes.
>
Thanks.

Emil


[Mesa-dev] (no subject)

2016-03-09 Thread Emil Velikov
On 9 March 2016 at 01:28, Dongwon Kim  wrote:
> This patch enables an EGL extension, EGL_KHR_reusable_sync.
> This new extension basically provides a way for multiple APIs or
> threads to be excuted synchronously via a "reusable sync"
"executed"

> primitive shared by those threads/API calls.
>
> This was implemented based on the specification at
>
> https://www.khronos.org/registry/egl/extensions/KHR/EGL_KHR_reusable_sync.txt
>
FWIW, I almost nuked the infrastructure for this extension yesterday.
Guess I'll put that patch on hold.

Out of curiosity how did you test the implementation ? We don't have
any piglit tests for it - care to send a few :-)

Thanks
Emil


[Mesa-dev] [Bug 94383] build error on i386 when enabling swr

2016-03-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94383

--- Comment #5 from Emil Velikov  ---
Thanks for the nice check Tim.

Looks like a bug on our end - the configure check should honour the FLAGS, thus
we'll warn/error and one won't be able to build swr with -march=pentium3.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] scons: build osmesa swrast and gallium

2016-03-09 Thread Roland Scheidegger
On 09.03.2016 at 08:41, Andreas Fänger wrote:
>> -Original Message- From: Roland Scheidegger Sent:
>> Tuesday, 8 March 2016 18:26 Subject: Re: [Mesa-dev] [PATCH] scons:
>> build osmesa swrast and gallium
>> 
>> Not that I really care what you can or can't build (and I won't
>> comment on build changes), what are those features lacking in
>> llvmpipe, beside from anisotropic filtering (which I always
>> considered essentially useless for a software renderer, albeit
>> interesting if you're curious about the math involved)? Last time I
>> checked llvmpipe/softpipe had a much more robust feature set 
>> (especially when it comes to non-legacy GL features), for starters
>> I'll just mention working derivatives which is usually the first
>> thing people still using classic swrast are hitting bugs on...
>> 
> 
> We are using osmesa for rendering single images on a server. No
> shaders at the moment, only texturing (fixed function pipeline). For
> us the quality of the images is the most important criterion (and
> rendering speed, of course); therefore anisotropic filtering is
> absolutely necessary in order to achieve good looking images. The
> same with anti-aliasing: Currently we are using GL_POLYGON_SMOOTH
> but this is also missing in gallium. We need an antialiasing method
> that is not simply blurring the whole image or textures but affects
> only the edges of the polygons.

Ah ok. Gallium supports polygon smooth but the sw rasterization drivers
do not; it isn't supported by quite a lot of hw drivers either (I
wasn't even aware swrast did), albeit of course hw drivers typically
support msaa (the chances of llvmpipe getting support for msaa one day are
probably a lot higher than for polygon_smooth, albeit maybe the latter
could use mostly the same code as the former...).
Anisotropic would be interesting to implement, but it was just something
easy to skip (since no apis really require it) - there are just not many
developers working on llvmpipe...
Just don't get your hopes up for better support of modern GL features
for classic swrast.

Roland



[Mesa-dev] radeonsi external audio + powerconsumption

2016-03-09 Thread Jarkko Korpi
I have an ATI R9 290 and I have been using an external amplifier that puts the
sound into speakers. I have this connection: R9 290 --> HDMI --> speakers
and DVI --> PC screen.

xrandr (which broke my config again) sees this external amplifier as a display,
which it is not. KDE/Mint/upstream should fix this so it would be more correct.

The HDMI channel is just sending audio data to the amplifier. And it feels like
the card is using more clock cycles to provide the picture and the audio than it
should.

Is there any way to prevent this in the future?


Re: [Mesa-dev] [PATCH 0/9] Skip automatic execsize for instructions with a width of 4

2016-03-09 Thread Pohjolainen, Topi
On Wed, Mar 09, 2016 at 11:16:41AM +0100, Iago Toral wrote:
> On Wed, 2016-03-09 at 11:42 +0200, Pohjolainen, Topi wrote:
> > On Wed, Mar 09, 2016 at 11:05:17AM +0200, Pohjolainen, Topi wrote:
> > > On Wed, Mar 09, 2016 at 10:03:08AM +0100, Iago Toral wrote:
> > > > On Wed, 2016-03-09 at 10:53 +0200, Pohjolainen, Topi wrote:
> > > > > On Wed, Mar 09, 2016 at 09:36:42AM +0100, Iago Toral wrote:
> > > > > > On Wed, 2016-03-09 at 09:54 +0200, Pohjolainen, Topi wrote:
> > > > > > > On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias 
> > > > > > > Gons?lvez wrote:
> > > > > > > > Hello,
> > > > > > > > 
> > > > > > > > There is only one patch from this series that has been reviewed 
> > > > > > > > (patch
> > > > > > > > 1).
> > > > > > > > 
> > > > > > > > Our plans is to start sending patches for adding fp64 support 
> > > > > > > > to i965
> > > > > > > > driver in the coming weeks but they depend on these patches.
> > > > > > > > 
> > > > > > > > Can someone take a look at them? ;)
> > > > > > > > 
> > > > > > > > Sam
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On Thu, 2015-12-17 at 14:44 +0100, Samuel Iglesias Gonsálvez 
> > > > > > > > wrote:
> > > > > > > > > Hello,
> > > > > > > > > 
> > > > > > > > > This patch series is a updated version of the one Iago sent 
> > > > > > > > > last
> > > > > > > > > week [0] that includes patches for gen6 too, as suggested by 
> > > > > > > > > Jason.
> > > > > > > > > 
> > > > > > > > > We checked the gen9 code paths that work with a horizontal 
> > > > > > > > > width of 4
> > > > > > > > > and we think there won't be any regression on gen9... but we 
> > > > > > > > > don't
> > > > > > > > > have any gen9 machine to run piglit with these patches. Can 
> > > > > > > > > someone
> > > > > > > > > check it?
> > > > > > > 
> > > > > > > I rebased it and ran it through the test system, gen9 seems to be 
> > > > > > > fine, I
> > > > > > > only got one regression, and that was on old g965:
> > > > > > 
> > > > > > Awesome! Would it be possible to run that test on g965 with the 
> > > > > > attached
> > > > > > change? If this is a regression caused by our code it should break 
> > > > > > at
> > > > > > the assert introduced with it.
> > > > > > 
> > > > > > > /tmp/build_root/m64/lib/piglit/bin/ext_framebuffer_multisample-accuracy
> > > > > > >  all_samples srgb depthstencil -auto -fbo
> > > > > > > Pixels that should be unlit
> > > > > > >   count = 236444
> > > > > > >   RMS error = 0.025355
> > > > > > > Pixels that should be totally lit
> > > > > > >   count = 13308
> > > > > > >   Perfect output
> > > > > > > The error threshold for unlit and totally lit pixels test is 
> > > > > > > 0.016650
> > > > > > > Pixels that should be partially lit
> > > > > > >   count = 12392
> > > > > > >   RMS error = 0.273876
> > > > > > > The error threshold for partially lit pixels is 0.333000
> > > > > > > Samples = 0, Result = fail
> > > > > > > 
> > > > > > > 
> > > > > > > But I'm not sure if this is caused by your patches.
> > > > > > > ___
> > > > > > > mesa-dev mailing list
> > > > > > > mesa-dev@lists.freedesktop.org
> > > > > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > > > > > 
> > > > > 
> > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> > > > > > b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > > > index 6f11f59..625447f 100644
> > > > > > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > > > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > > > @@ -203,6 +203,7 @@ brw_set_dest(struct brw_codegen *p, brw_inst 
> > > > > > *inst, struct brw_reg dest)
> > > > > >  * or 16 (SIMD16), as that's normally correct.  However, when 
> > > > > > dealing with
> > > > > >  * small registers, we automatically reduce it to match the 
> > > > > > register size.
> > > > > >  */
> > > > > > +   assert(dest.width != BRW_EXECUTE_4 || 
> > > > > > brw_inst_exec_size(devinfo, inst) == dest.width);
> > > > > > if (dest.width < BRW_EXECUTE_8)
> > > > > >brw_inst_set_exec_size(devinfo, inst, dest.width);
> > > > > >  }
> > > > > 
> > > > > Hmm, on top of your series this looks:
> > > > > 
> > > > >/* Generators should set a default exec_size of either 8 (SIMD4x2 
> > > > > or SIMD8)
> > > > > * or 16 (SIMD16), as that's normally correct.  However, when 
> > > > > dealing with
> > > > > * small registers, we automatically reduce it to match the 
> > > > > register size.
> > > > > *
> > > > > * In platforms that support fp64 we can emit instructions with a 
> > > > > width of
> > > > > * 4 that need two SIMD8 registers and an exec_size of 8 or 16. In 
> > > > > these
> > > > > * cases we need to make sure that these instructions have their 
> > > > > exec sizes
> > > > > * set properly when they are emitted and we can't rely on this 
> > > > > code to fix
> > > > > * it.
> > > > > */
> > > > >bool fix_exec_size;
> > > > >if (d

Re: [Mesa-dev] radeonsi external audio + powerconsumption

2016-03-09 Thread Christian König

That configuration can't work correctly.

With HDMI the audio data is interleaved in the video vertical and 
horizontal sync periods. So if you don't have video you don't have any 
way to submit audio either.
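
To put a number on how much room those periods leave, here is a small sketch that computes the blanking budget of a video mode. The struct and function names are made up for this example; the figures in the usage note are the 1920x1080 mode (148.5 MHz, 2200 x 1125 total) that shows up in the xrandr output later in this thread:

```c
#include <stdint.h>

/* Timing of one video mode; field names are made up for this sketch. */
struct example_mode {
   uint64_t pixel_clock_hz;
   uint32_t hactive, htotal;   /* visible and total pixels per line */
   uint32_t vactive, vtotal;   /* visible and total lines per frame */
};

/* Pixel periods per frame that fall outside the visible area, i.e. in the
 * horizontal and vertical blanking intervals. */
static uint64_t
example_blanking_pixels(const struct example_mode *m)
{
   return (uint64_t)m->htotal * m->vtotal
        - (uint64_t)m->hactive * m->vactive;
}

/* Frames per second, rounded down. */
static uint64_t
example_refresh_hz(const struct example_mode *m)
{
   return m->pixel_clock_hz / ((uint64_t)m->htotal * m->vtotal);
}
```

For that mode the result is 401400 blanking pixel periods out of 2475000 per frame, roughly 16%, at 60 frames per second. This is only an upper bound on the room available for audio, since not all of the blanking interval can carry data, but it shows why the link has to be running a video mode at all.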


What the driver probably does is provide a dummy video signal so that 
audio can be transferred anyway. Most likely just a clone of the desktop 
picture.


Please provide your xrandr --verbose output.

Regards,
Christian.

On 09.03.2016 at 17:31, Jarkko Korpi wrote:
> I have an ATI R9 290 and I have been using an external amplifier that puts the 
> sound into speakers. I have this connection: R9 290 --> HDMI --> 
> speakers
>
> and DVI --> PC screen.
>
> xrandr (which broke my config again) sees this external amplifier as a 
> display, which it is not. KDE/Mint/upstream should fix this so it would 
> be more correct.
>
>
> The HDMI channel is just sending audio data to the amplifier. And it feels 
> like the card is using more clock cycles to provide the picture and 
> the audio than it should.
>
>
> Is there any way to prevent this in the future?




[Mesa-dev] [Bug 94383] build error on i386 when enabling swr

2016-03-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94383

--- Comment #6 from Tim Rowley  ---
Won't you run into the same problem with 64-bit builds in your system?  I'd
imagine that they're probably configured with -march=core2 for the widest
target platform, which will run into the same issue with swr's direct avx
intrinsic usage.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


[Mesa-dev] [Bug 94383] build error on i386 when enabling swr

2016-03-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94383

--- Comment #7 from Ilia Mirkin  ---
(In reply to Emil Velikov from comment #5)
> Thanks for the nice check Tim.
> 
> Looks like a bug on our end - the configure check should honour the FLAGS,
> thus we'll warn/error and one won't be able to build swr with
> -march=pentium3.

Other way around - you want to ignore the user's request and build with AVX
anyways. The code in question should only be loaded if AVX (or AVX2) are
detected at runtime.
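
A minimal sketch of that kind of runtime gate, assuming GCC or Clang on x86, where `__builtin_cpu_supports` is available. The backend functions are placeholders and this is not how swr is actually wired up; in a real build the AVX code would sit in its own translation unit compiled with -mavx, so the rest of the library can keep the user's -march:

```c
#include <stddef.h>

/* Returns 1 if the running CPU reports AVX, 0 otherwise. */
static int
example_cpu_has_avx(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
   __builtin_cpu_init();
   return __builtin_cpu_supports("avx") ? 1 : 0;
#else
   return 0;   /* unknown compiler or architecture: assume no AVX */
#endif
}

typedef const char *(*example_backend_fn)(void);

/* Placeholder backends; the AVX one would live in a file built with -mavx. */
static const char *example_backend_avx(void)     { return "avx"; }
static const char *example_backend_generic(void) { return "generic"; }

/* Pick the backend once, based on what the CPU reports at runtime. */
static example_backend_fn
example_select_backend(void)
{
   return example_cpu_has_avx() ? example_backend_avx
                                : example_backend_generic;
}
```

The caller only ever goes through the returned function pointer, so AVX instructions are never executed on a CPU that does not report them, regardless of the flags the AVX file itself was built with.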

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] i965: Remove useless IR self-destruct backend_shader method.

2016-03-09 Thread Matt Turner
On Tue, Mar 8, 2016 at 5:35 PM, Francisco Jerez  wrote:
> From the point it's constructed the CFG contains the only existing
> copy of the program IR, and it never becomes invalid.  Calling
> backend_shader::invalidate_cfg would have destroyed the program
> structure irrecoverably -- We weren't calling it at all for a good
> reason.
> ---

Reviewed-by: Matt Turner 


Re: [Mesa-dev] radeonsi external audio + powerconsumption

2016-03-09 Thread Jarkko Korpi
Makes sense. I just broke my xrandr config so that it doesn't work correctly;
it happens from time to time and it's really irritating to fix. What I need
is basically clone mode. But KDE doesn't see screens that are offline, and if
I turn the amplifier on, it shuts down my PC screen. Before, I had an issue
where I was on one desktop but most of the running applications were on the
other desktop. This was easier to fix: mostly just turning the amplifier on
and off, and it started to work correctly.

xrandr --verbose
Screen 0: minimum 8 x 8, current 1920 x 1080, maximum 32767 x 32767
HDMI1 connected 1920x1080+0+0 (0x47) normal (normal left inverted right x axis 
y axis) 531mm x 298mm
Identifier: 0x43
Timestamp:  307470
Subpixel:   unknown
Gamma:  1.0:1.0:1.0
Brightness: 1.0
Clones:
CRTC:   0
CRTCs:  0 1 2
Transform:  1.00 0.00 0.00
0.00 1.00 0.00
0.00 0.00 1.00
   filter: 
EDID: 
000009d1a7784554
1b18010380351e782eba45a159559d28
0d5054a56b80810081c08180a9c0b300
d1c001010101023a801871382d40582c
4500132a211e00ff00333745
30323032323031390a2000fd0032
4c1e5311000a20202020202000fc
0042656e5120474c32343530480a006c
aspect ratio: Automatic 
supported: Automatic, 4:3, 16:9
Broadcast RGB: Automatic 
supported: Automatic, Full, Limited 16:235
audio: auto 
supported: force-dvi, off, auto, on
  1920x1080 (0x47)  148.5MHz +HSync +VSync *current +preferred
h: width  1920 start 2008 end 2052 total 2200 skew0 clock   67.5KHz
v: height 1080 start 1084 end 1089 total 1125   clock   60.0Hz
  1680x1050 (0xc7)  119.0MHz +HSync -VSync
h: width  1680 start 1728 end 1760 total 1840 skew0 clock   64.7KHz
v: height 1050 start 1053 end 1059 total 1080   clock   59.9Hz
  1600x900 (0xc8)  108.0MHz +HSync +VSync
h: width  1600 start 1624 end 1704 total 1800 skew0 clock   60.0KHz
v: height  900 start  901 end  904 total 1000   clock   60.0Hz
  1280x1024 (0xc9)  135.0MHz +HSync +VSync
h: width  1280 start 1296 end 1440 total 1688 skew0 clock   80.0KHz
v: height 1024 start 1025 end 1028 total 1066   clock   75.0Hz
  1280x1024 (0xca)  108.0MHz +HSync +VSync
h: width  1280 start 1328 end 1440 total 1688 skew0 clock   64.0KHz
v: height 1024 start 1025 end 1028 total 1066   clock   60.0Hz
  1280x800 (0xcb)   71.0MHz +HSync -VSync
h: width  1280 start 1328 end 1360 total 1440 skew0 clock   49.3KHz
v: height  800 start  803 end  809 total  823   clock   59.9Hz
  1152x864 (0xcc)  108.0MHz +HSync +VSync
h: width  1152 start 1216 end 1344 total 1600 skew0 clock   67.5KHz
v: height  864 start  865 end  868 total  900   clock   75.0Hz
  1280x720 (0xcd)   74.2MHz +HSync +VSync
h: width  1280 start 1390 end 1430 total 1650 skew0 clock   45.0KHz
v: height  720 start  725 end  730 total  750   clock   60.0Hz
  1024x768 (0xce)   78.8MHz +HSync +VSync
h: width  1024 start 1040 end 1136 total 1312 skew0 clock   60.1KHz
v: height  768 start  769 end  772 total  800   clock   75.1Hz
  1024x768 (0xcf)   65.0MHz -HSync -VSync
h: width  1024 start 1048 end 1184 total 1344 skew0 clock   48.4KHz
v: height  768 start  771 end  777 total  806   clock   60.0Hz
  832x624 (0xd0)   57.3MHz -HSync -VSync
h: width   832 start  864 end  928 total 1152 skew0 clock   49.7KHz
v: height  624 start  625 end  628 total  667   clock   74.6Hz
  800x600 (0xd1)   49.5MHz +HSync +VSync
h: width   800 start  816 end  896 total 1056 skew0 clock   46.9KHz
v: height  600 start  601 end  604 total  625   clock   75.0Hz
  800x600 (0xd2)   40.0MHz +HSync +VSync
h: width   800 start  840 end  968 total 1056 skew0 clock   37.9KHz
v: height  600 start  601 end  605 total  628   clock   60.3Hz
  640x480 (0xd3)   31.5MHz -HSync -VSync
h: width   640 start  656 end  720 total  840 skew0 clock   37.5KHz
v: height  480 start  481 end  484 total  500   clock   75.0Hz
  640x480 (0xd4)   25.2MHz -HSync -VSync
h: width   640 start  656 end  752 total  800 skew0 clock   31.5KHz
v: height  480 start  490 end  492 total  525   clock   60.0Hz
  720x400 (0xd5)   28.3MHz -HSync +VSync
h: width   720 start  738 end  846 total  900 skew0 clock   31.5KHz
v: height  400 start  412 end  414 total  449   clock   70.1Hz
HDMI2 disconnected (normal left inverted right x axis y axi

[Mesa-dev] [Bug 94383] build error on i386 when enabling swr

2016-03-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94383

--- Comment #8 from Ilia Mirkin  ---
FWIW this is how we do it for some sse4 stuff:

src/mesa/Makefile.am:

libmesa_sse41_la_SOURCES = \
main/streaming-load-memcpy.c \
main/streaming-load-memcpy.h \
main/sse_minmax.c \
main/sse_minmax.h
libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) $(SSE41_CFLAGS)

Presumably a similar approach should work for swr?



[Mesa-dev] [Bug 94383] build error on i386 when enabling swr

2016-03-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94383

--- Comment #9 from Fabio Pedretti  ---
(In reply to Tim Rowley from comment #6)
> Won't you run into the same problem with 64-bit builds in your system?  I'd
> imagine that they're probably configured with -march=core2 for the widest
> target platform, which will run into the same issue with swr's direct avx
> intrinsic usage.

The arch tuning is only applied on 32 bit, the 64 bit already defaults to a
more recent CPU.

(In reply to Ilia Mirkin from comment #7)
> (In reply to Emil Velikov from comment #5)
> > Thanks for the nice check Tim.
> > 
> > Looks like a bug on our end - the configure check should honour the FLAGS,
> > thus we'll warn/error and one won't be able to build swr with
> > -march=pentium3.
> 
> Other way around - you want to ignore the user's request and build with AVX
> anyways. The code in question should only be loaded if AVX (or AVX2) are
> detected at runtime.

This makes more sense than requiring the pentium3 optimization to be
disabled, as the rationale for arch tuning on 32 bit is to tune for a newer
CPU (pentium3) rather than the default (i586 or something). Otherwise we'll
get AVX on swr and standard i586 on all other code.



Re: [Mesa-dev] your mail

2016-03-09 Thread dw kim
On Wed, Mar 09, 2016 at 04:02:06PM +, Emil Velikov wrote:
> On 9 March 2016 at 01:28, Dongwon Kim  wrote:
> > This patch enables an EGL extension, EGL_KHR_reusable_sync.
> > This new extension basically provides a way for multiple APIs or
> > threads to be excuted synchronously via a "reusable sync"
> "executed"
> 
> > primitive shared by those threads/API calls.
> >
> > This was implemented based on the specification at
> >
> > https://www.khronos.org/registry/egl/extensions/KHR/EGL_KHR_reusable_sync.txt
> >
> Fiww, I almost nuked the infrastructure for this extension yesterday.
> Guess I'll put that patch on hold.
> 
> Out of curiosity how did you test the implementation ? We don't have
> any piglit tests for it - care to send a few :-)
> 
> Thanks
> Emil

I used Google's dEQP to verify the basic requirements and error handling, and
used our own test routines to verify the client-wait and signaling mechanisms
and timeouts in a multi-threaded environment. I am not sure I can share these
specific test routines because I don't own them, but I am going to try to add
equivalent tests to piglit.


[Mesa-dev] [Bug 94383] build error on i386 when enabling swr

2016-03-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94383

--- Comment #10 from Emil Velikov  ---
(In reply to Ilia Mirkin from comment #7)

> Other way around - you want to ignore the user's request and build with AVX
> anyways. The code in question should only be loaded if AVX (or AVX2) are
> detected at runtime.

Eek, brain freeze moment. You're right.



Re: [Mesa-dev] radeonsi external audio + powerconsumption

2016-03-09 Thread Christian König
The only offhand "hack" I can see is to open up an HDMI connector,
short the hot plug detection pin and then connect that as the "output
device" to the amplifier.


It should fix your issues with the hot plug detection. You most likely 
still won't get a valid EDID, but that can be overridden easily.


On the other hand what's the issue with connecting the PC Monitor to the 
amplifier using a HDMI->DVI cable?


Regards,
Christian.

On 09.03.2016 at 17:44, Jarkko Korpi wrote:
Makes sense. I just broke my xrandr config so that it doesn't work
correctly; it happens from time to time and it's really irritating to
fix. What I need is basically clone mode. But KDE doesn't see screens
that are offline, and if I turn the amplifier on, it shuts down my PC
screen. Before, I had an issue where I was on one desktop but most of
the running applications were on the other desktop. This was easier to
fix: mostly just turning the amplifier on and off, and it started to
work correctly.



Re: [Mesa-dev] [PATCH 1/5] include/GL: add mesa_glinterop.h for OpenGL-OpenCL interop (v3)

2016-03-09 Thread Marek Olšák
On Wed, Mar 9, 2016 at 4:31 PM, Emil Velikov  wrote:
> On 8 March 2016 at 22:29, Marek Olšák  wrote:
>
>> Actually, I don't see how the version number would make it any better
>> for the structures, but returning the version number by
>> QueryDeviceInfo would be useful for the caller to know what to expect
>> if Mesa version < caller version. The sizes are still useful if Mesa
>> version > caller version.
>>
> If any of this is an issue, then the whole DRI model just won't work ;-)

The DRI extension versions only determine the number of function
callbacks, not function parameters and return values. This interop
thing is a lot more complicated than that, since it allows the "in"
and "out" structures to grow, and different rules apply to each. Also,
the implementation of DRI extensions allocates the structures, while
in the interop the user allocates the structures. It's a totally
different model.

>
> I'm thinking that the following should work. Please let me know if I'm
> loosing the plot.
>
> Caller sets the structure and sets version of the interface it
> provides. Then callee first checks if it can work with the provider
> version. then proceeds as planned.

The callee must always proceed even if the version is too low or too
high. The caller knows the Mesa interop version from QueryDeviceInfo
(defined in v4 and later) and knows what it should expect. Mesa only
checks the sizes, but otherwise doesn't care about the caller's
version.

> Passing around multiple sizes is ugly and error prone to a point. One
> gets the order wrong (swaps in_size and out_size) or just the same
> sizeof(struct foo) in both places.

The order of parameters in v5 is: in_size, *in, out_size, *out
If the implementation swaps in_size and out_size, it's not my problem.
Testing should show pretty quickly that it's broken.
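
To make the size-based scheme concrete, here is a rough sketch of a callee
that accepts (in_size, *in, out_size, *out) and copies only as many bytes as
both sides know about. The structures, field names and placeholder values are
invented for illustration and are not the ones in mesa_glinterop.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structures as the callee currently defines them. */
struct example_in {
    uint32_t target;
    uint32_t miplevel;
};

struct example_out {
    uint32_t format;
    uint32_t stride;
    uint32_t offset; /* imagine this field was appended in a later revision */
};

/* Returns the number of bytes written to 'out'. */
int example_export(uint32_t in_size, const void *in,
                   uint32_t out_size, void *out)
{
    struct example_in local_in;
    struct example_out local_out;
    size_t n;

    assert(in != NULL && out != NULL);

    /* Input fields the caller is too old to provide read as zero. */
    memset(&local_in, 0, sizeof(local_in));
    n = in_size < sizeof(local_in) ? in_size : sizeof(local_in);
    memcpy(&local_in, in, n);

    /* Placeholder results; a real implementation would query the object. */
    local_out.format = 42u;
    local_out.stride = 256u;
    local_out.offset = local_in.miplevel * 4096u;

    /* Never write past what the caller's structure can hold. */
    n = out_size < sizeof(local_out) ? out_size : sizeof(local_out);
    memcpy(out, &local_out, n);
    return (int)n;
}
```

An older caller whose "out" structure ends after stride simply passes a
smaller out_size and never sees offset; a newer caller gets all three fields.
This only works if fields are appended and never reordered.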

Marek


Re: [Mesa-dev] [PATCH 1/5] include/GL: add mesa_glinterop.h for OpenGL-OpenCL interop (v3)

2016-03-09 Thread Marek Olšák
On Wed, Mar 9, 2016 at 4:47 PM, Emil Velikov  wrote:
> On 8 March 2016 at 15:39, Marek Olšák  wrote:
>> On Thu, Mar 3, 2016 at 11:56 PM, Emil Velikov  
>> wrote:
>>> Hi Marek,
>>>
>>> A small question, and a few trivial suggestions. Hopefully I'm not too
>>> late for the party.
>>>
>>> On 3 March 2016 at 19:46, Marek Olšák  wrote:
>>>
>>>> +typedef struct _mesa_glinterop_device_info {
>>>> +   uint32_t size; /* size of this structure */
>>>> +
>>> I believe Michel suggested a similar thing: Wouldn't it be better to
>>> use a version one just like we do for the DRI extensions ? Many other
>>> interfaces also use version, some with a combination of size, but this
>>> is the first one in my experience that does only size.
>>>
>>>
>>>> +typedef struct _mesa_glinterop_export_in {
>>>
>>>> +   /* Size of memory pointed to by out_driver_data. */
>>>> +   uint32_t out_driver_data_size;
>>>> +
>>>> +   /* If the caller wants to query driver-specific data about the OpenGL
>>>> +    * object, this should point to the memory where that data will be stored.
>>>> +    */
>>>> +   void *out_driver_data;
>>> I take it that the structure and format of this data will be
>>> internal/implementation specific, correct ? As on each side there will
>>
>> Yes.
>>
>>> be some sanity checking, wouldn't to be better to have size (version
>>> and/or other) within that structure format.
>>
>> Since amdgpu isn't going to use this feature, I don't care too much about it.
>>
>> Having the size outside of the driver-specific structure seems safer.
>>
> Trying future proof things does not work nicely, most of the time.
> Imho it should be added as there is a user for it.

I agree, but:
1) Intel want it, so there is a future user.
2) One of our OpenCL guys and I have agreed to keep it in case we need
it in the future too.

>
>>>
>>> IMHO it's worth mentioning any of that, plus some information about
>>> the lifetime expectancy of the data. Thus it's perfectly clear to the
>>> user how to manage/use it.
>>
>> The data pointer should only be used for querying stuff from Mesa. The
>> same rules as for the "out" pointer apply.
>>
> I think there is some misunderstanding here. I wasn't asking "Who is
> going to use this data ?", but "Can they use the pointer reliably, or
> should they copy the data from it before using it. Copy, because the
> opposite end will discard/free the block shortly after the call". I've
> seen some people referring to this as lifetime expectancy, not sure if
> it's the correct terminology to use here.

I'm not sure I fully understand. It can be a local variable in the
OpenCL stack or a variable in a long-living OpenCL object. Mesa/GL can
only write data to it inside the interop call.

Marek


Re: [Mesa-dev] [PATCH 1/5] include/GL: add mesa_glinterop.h for OpenGL-OpenCL interop (v3)

2016-03-09 Thread Emil Velikov
On 9 March 2016 at 17:11, Marek Olšák  wrote:
> On Wed, Mar 9, 2016 at 4:31 PM, Emil Velikov  wrote:
>> On 8 March 2016 at 22:29, Marek Olšák  wrote:
>>
>>> Actually, I don't see how the version number would make it any better
>>> for the structures, but returning the version number by
>>> QueryDeviceInfo would be useful for the caller to know what to expect
>>> if Mesa version < caller version. The sizes are still useful if Mesa
>>> version > caller version.
>>>
>> If any of this is an issue, then the whole DRI model just won't work ;-)
>
> The DRI extension versions only determine the number of function
> callbacks, not function parameters and return values. This interop
> thing is a lot more complicated than that, since it allows the "in"
> and "out" structures to grow, and different rules apply to each.
The fact that the DRI extension version is used only to determine the
presence of certain functions is an implementation detail, imho.

If you look at the struct as a whole, it doesn't matter what the
contents (apart from the version field) are. For all one cares, there
could be none?

> Also,
> the implementation of DRI extensions allocates the structures, while
> in the interop the user allocates the structures. It's a totally
> different model.
>
True, it's a slightly different model. The following should still work ?

Library A: allocates memory for the struct, and sets the maximum version
it currently supports
Library B: retrieves the 'empty' struct. Checks which version is lower
(its own or the one in the struct) and only populates fields up to that
version.
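
A rough sketch of that version handshake, for comparison with the size-based
approach. Everything here (struct, fields, version numbers, values) is
hypothetical; the struct is shown as a newer caller's header would declare
it, while the callee only knows versions 1 and 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_CALLEE_VERSION 2u /* highest version "library B" implements */

struct example_info {
    /* in: highest version the caller understands;
     * out: version the callee actually filled in */
    uint32_t version;
    uint32_t vendor_id;    /* since v1 */
    uint32_t device_id;    /* since v1 */
    uint32_t bus;          /* since v2 */
    uint32_t feature_bits; /* since v3, which this callee predates */
};

/* "Library B": fill in fields up to the lower of the two versions. */
int example_query_info(struct example_info *info)
{
    uint32_t v;

    assert(info != NULL);
    if (info->version == 0)
        return -1; /* caller did not set a version */

    v = info->version < EXAMPLE_CALLEE_VERSION ? info->version
                                                : EXAMPLE_CALLEE_VERSION;
    if (v >= 1) {
        info->vendor_id = 0x1234u; /* placeholder values */
        info->device_id = 0x5678u;
    }
    if (v >= 2)
        info->bus = 3u;

    info->version = v; /* tell the caller which fields are valid */
    return 0;
}
```

The caller ("library A") sets version before the call and afterwards reads
only the fields covered by the version that comes back.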

>>
>> I'm thinking that the following should work. Please let me know if I'm
>> loosing the plot.
>>
>> Caller sets the structure and sets version of the interface it
>> provides. Then callee first checks if it can work with the provider
>> version. then proceeds as planned.
>
> The callee must always proceed even if the version is too low or too
> high.
What do you mean with "proceed" here ?

Are you saying that even if the callee can work with up-to v2, it
should somehow populate the v3 fields ?
Shouldn't it use the lowest version of the two and only handle those fields ?

> The caller knows the Mesa interop version from QueryDeviceInfo
> (defined in v4 and later) and knows what it should expect. Mesa only
> checks the sizes, but otherwise doesn't care about the caller's
> version.
>
>> Passing around multiple sizes is ugly and error prone to a point. One
>> gets the order wrong (swaps in_size and out_size) or just the same
>> sizeof(struct foo) in both places.
>
> The order of parameters in v5 is: in_size, *in, out_size, *out

> If the implementation swaps in_size and out_size, it's not my problem.
No offence, but you realise that this sounds rather funny, right ?
In the context of designing interfaces, of course.

> Testing should show pretty quickly that it's broken.
>
Definitely. I just hope we see fewer cases where proper testing occurs
months after things are out.

Thanks
Emil


Re: [Mesa-dev] [PATCH 1/5] include/GL: add mesa_glinterop.h for OpenGL-OpenCL interop (v3)

2016-03-09 Thread Emil Velikov
On 9 March 2016 at 17:28, Marek Olšák  wrote:
> On Wed, Mar 9, 2016 at 4:47 PM, Emil Velikov  wrote:
>> On 8 March 2016 at 15:39, Marek Olšák  wrote:
>>> On Thu, Mar 3, 2016 at 11:56 PM, Emil Velikov  
>>> wrote:
>>>> Hi Marek,
>>>>
>>>> A small question, and a few trivial suggestions. Hopefully I'm not too
>>>> late for the party.
>>>>
>>>> On 3 March 2016 at 19:46, Marek Olšák  wrote:
>>>>
>>>>> +typedef struct _mesa_glinterop_device_info {
>>>>> +   uint32_t size; /* size of this structure */
>>>>> +
>>>> I believe Michel suggested a similar thing: Wouldn't it be better to
>>>> use a version one just like we do for the DRI extensions ? Many other
>>>> interfaces also use version, some with a combination of size, but this
>>>> is the first one in my experience that does only size.
>>>>
>>>>
>>>>> +typedef struct _mesa_glinterop_export_in {
>>>>
>>>>> +   /* Size of memory pointed to by out_driver_data. */
>>>>> +   uint32_t out_driver_data_size;
>>>>> +
>>>>> +   /* If the caller wants to query driver-specific data about the OpenGL
>>>>> +    * object, this should point to the memory where that data will be stored.
>>>>> +    */
>>>>> +   void *out_driver_data;
>>>> I take it that the structure and format of this data will be
>>>> internal/implementation specific, correct ? As on each side there will
>>>
>>> Yes.
>>>
>>>> be some sanity checking, wouldn't to be better to have size (version
>>>> and/or other) within that structure format.
>>>
>>> Since amdgpu isn't going to use this feature, I don't care too much about 
>>> it.
>>>
>>> Having the size outside of the driver-specific structure seems safer.
>>>
>> Trying future proof things does not work nicely, most of the time.
>> Imho it should be added as there is a user for it.
>
> I agree, but:
> 1) Intel want it, so there is a future user.
> 2) One of our OpenCL guys and I have agreed to keep it in case we need
> in the future too.
>
Don't mean to sound snarky - two sentences, each containing the word
"future" :-)
There was a song somewhere called "Tomorrow never comes".

>>

>>>> IMHO it's worth mentioning any of that, plus some information about
>>>> the lifetime expectancy of the data. Thus it's perfectly clear to the
>>>> user how to manage/use it.
>>>
>>> The data pointer should only be used for querying stuff from Mesa. The
>>> same rules as for the "out" pointer apply.
>>>
>> I think there is some misunderstanding here. I wasn't asking "Who is
>> going to use this data ?", but "Can they use the pointer reliably, or
>> should they copy the data from it before using it. Copy, because the
>> opposite end will discard/free the block shortly after the call". I've
>> seen some people referring to this as lifetime expectancy, not sure if
>> it's the correct terminology to use here.
>
> I'm not sure I fully understand. It can be a local variable in the
> OpenCL stack or a variable in a long-living OpenCL object. Mesa/GL can
> only write data to it inside the interop call.
>
It was a misunderstanding from my end. Sorry for the noise.

Thanks again, for keeping up with my question/suggestions.

Emil


Re: [Mesa-dev] [PATCH 1/5] i965: Query and store GPU properties from kernel

2016-03-09 Thread Ben Widawsky
On Mon, Mar 07, 2016 at 10:16:41PM -0800, Matt Turner wrote:
> On Mon, Mar 7, 2016 at 5:39 PM, Ben Widawsky
>  wrote:
> > Certain products are not uniquely identifiable based on device id alone. The
> > kernel exports an interface to help deal with this. This patch merely 
> > introduces
> > the consumer of the interface and makes sure nothing breaks.
> >
> > It is also possible to use these values for programming GPGPU mode, and I 
> > plan
> > to do that as well.
> >
> > The interface was introduced in libdrm 2.4.60, which is already required, 
> > so it
> > should all be fine.
> >
> > Signed-off-by: Ben Widawsky 
> > ---
> >  src/mesa/drivers/dri/i965/intel_screen.c | 21 +
> >  src/mesa/drivers/dri/i965/intel_screen.h | 12 +++-
> >  2 files changed, 32 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> > b/src/mesa/drivers/dri/i965/intel_screen.c
> > index ee7c1d7..343b497 100644
> > --- a/src/mesa/drivers/dri/i965/intel_screen.c
> > +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> > @@ -1082,6 +1082,7 @@ static bool
> >  intel_init_bufmgr(struct intel_screen *intelScreen)
> >  {
> > __DRIscreen *spriv = intelScreen->driScrnPriv;
> > +   bool devid_override = getenv("INTEL_DEVID_OVERRIDE") != NULL;
> >
> > intelScreen->no_hw = getenv("INTEL_NO_HW") != NULL;
> >
> > @@ -1099,6 +1100,26 @@ intel_init_bufmgr(struct intel_screen *intelScreen)
> >return false;
> > }
> >
> > +   intelScreen->subslice_total = -1;
> > +   intelScreen->eu_total = -1;
> > +
> > +   /* Everything below this is for real hardware only */
> > +   if (intelScreen->no_hw || devid_override)
> > +  return true;
> > +
> > +   intel_get_param(spriv, I915_PARAM_SUBSLICE_TOTAL,
> > +   &intelScreen->subslice_total);
> > +   intel_get_param(spriv, I915_PARAM_EU_TOTAL, &intelScreen->eu_total);
> > +
> > +   /* Without this information, we cannot get the right Braswell 
> > brandstrings,
> > +* and we have to use conservative numbers for GPGPU on many platforms, 
> > but
> > +* otherwise, things will just work.
> > +*/
> > +   if (intelScreen->subslice_total == -1 ||
> > +   intelScreen->eu_total == -1)
> 
> I think this condition will fit on one line.
> 
> > +  _mesa_warning(NULL,
> > +"Kernel 4.1 required to properly query GPU 
> > properties.\n");
> > +
> > return true;
> >  }
> >
> > diff --git a/src/mesa/drivers/dri/i965/intel_screen.h 
> > b/src/mesa/drivers/dri/i965/intel_screen.h
> > index 3a5f22c..695ed50 100644
> > --- a/src/mesa/drivers/dri/i965/intel_screen.h
> > +++ b/src/mesa/drivers/dri/i965/intel_screen.h
> > @@ -81,7 +81,17 @@ struct intel_screen
> >  * I915_PARAM_CMD_PARSER_VERSION parameter
> >  */
> > int cmd_parser_version;
> > - };
> > +
> > +   /**
> > +* Best effort attempt to get system information. Needed for GPGPU, and 
> > brand
> > +* strings (sigh)
> 
> The comment doesn't really describe the fields. Maybe
> 
> /**
>  * Number of subslices reported by the I915_PARAM_SUBSLICE_TOTAL parameter
>  */
> int subslice_total;
> 
> /**
>  * Number of EUs reported by the I915_PARAM_EU_TOTAL parameter
>  */
> int eu_total;
> 
> (Might have to linewrap the comments, not sure)
> 
> > +* I915_PARAM_SUBSLICE_TOTAL, and I915_PARAM_EU_TOTAL
> > +*/
> > +   struct {
> 
> Do these need to be together in a struct?
> 

I like the idea of using the anonymous struct to logically group them - though
looking back now, I think it'd be cool to put all the params we get from the
kernel in a struct (easy debug to just dump struct contents in gdb). Anyway, I
don't feel strongly.

> > +  int subslice_total;
> > +  int eu_total;
> > +   };
> > +};

I took all of your suggestions. Thanks.
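
For readers following along, a small sketch of the pattern discussed in this
thread: two kernel-provided values grouped in an anonymous struct, defaulting
to -1 when the query is unsupported. It assumes a C11 compiler (anonymous
struct members). The struct, enum, callback type and the two fake "kernels"
are hypothetical stand-ins and not the i965 or libdrm API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct example_screen {
    int cmd_parser_version;

    /* C11 anonymous struct: members are reached as screen->eu_total. */
    struct {
        int subslice_total;
        int eu_total;
    };
};

enum { EXAMPLE_PARAM_SUBSLICE_TOTAL = 1, EXAMPLE_PARAM_EU_TOTAL = 2 };

/* Hypothetical query hook: returns false and leaves *value untouched when
 * the kernel does not know the parameter. */
typedef bool (*example_get_param_fn)(int param, int *value);

/* Returns true only if both values were reported. */
bool example_init_gpu_info(struct example_screen *screen,
                           example_get_param_fn get_param)
{
    assert(screen != NULL && get_param != NULL);

    screen->subslice_total = -1;
    screen->eu_total = -1;

    get_param(EXAMPLE_PARAM_SUBSLICE_TOTAL, &screen->subslice_total);
    get_param(EXAMPLE_PARAM_EU_TOTAL, &screen->eu_total);

    return screen->subslice_total != -1 && screen->eu_total != -1;
}

/* Fake "old kernel" that supports neither parameter. */
bool example_old_kernel(int param, int *value)
{
    (void)param;
    (void)value;
    return false;
}

/* Fake "new kernel" reporting made-up counts. */
bool example_new_kernel(int param, int *value)
{
    if (param == EXAMPLE_PARAM_SUBSLICE_TOTAL) {
        *value = 3;
        return true;
    }
    if (param == EXAMPLE_PARAM_EU_TOTAL) {
        *value = 24;
        return true;
    }
    return false;
}
```

A caller would warn, rather than fail, when the function returns false, which
matches the "best effort" behaviour described in the patch.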

-- 
Ben Widawsky, Intel Open Source Technology Center


[Mesa-dev] GLX extension for vendor name lookup in libglvnd

2016-03-09 Thread Kyle Brenneman
The current implementation of libglvnd uses a new X extension called 
x11glvnd to look up a vendor name for each screen and to find a screen 
number for a GLXDrawable.


But, Adam Jackson pointed out that a GLX extension could do the same job 
more cleanly: Looking up a vendor name is just querying a per-screen 
string, which GLXQueryServerString does. Looking up a screen number for 
a drawable could work by adding a GLX_SCREEN attribute to the 
GLXGetDrawableAttributes reply.


Based on that idea, I've written up a rough draft of a GLX extension 
spec. Any comments, questions, or suggestions are welcome, of course.


-Kyle


Name

EXT_libglvnd

Name Strings

GLX_EXT_libglvnd

Contact

Kyle Brenneman, NVIDIA, kbrenneman at nvidia.com

Contributors

Kyle Brenneman
Adam Jackson

Status

XXX - Not complete yet

Version

Last Modified Date: March 8, 2016
Revision: 1

Number

???

Dependencies

GLX version 1.3 is required.

This specification is written against the wording of the GLX 1.4
Specification.

Overview

This extension allows the vendor-neutral GLX client library, libglvnd, to
determine which vendor-specific driver is needed to support a given GLX
drawable or X11 screen.

This GLX extension is not intended to be used directly by applications.
Instead, it is intended to be used by the GLX client library.

IP Status

No known IP claims.

New Procedures and Functions

None

New Types

None

New Tokens

Accepted by the <name> parameter of glXQueryServerString:

GLX_VENDOR_NAMES_EXT0x

Additions to Chapter 3 of the GLX 1.4 Specification
(Functions and Errors)

[Modify Section 3.3.2, GLX Versioning]

[Replace the 2nd sentence of the 5th paragraph with the following]

"The possible values for <name> and the format of the strings is the same
as for glXGetClientString. <name> may also be GLX_VENDOR_NAMES_EXT."

[Add the following paragraph to the end of the section]

"If <name> is GLX_VENDOR_NAMES_EXT, then the returned string is a
space-separated sequence of vendor names. The names are in order of
preference, with the most preferred vendor first."


[Modify Section 3.3.6, Querying Attributes]

[Replace the 2nd sentence of the 1st paragraph with the following]

"<attribute> must be set to one of GLX_WIDTH, GLX_HEIGHT,
GLX_PRESERVED_CONTENTS, GLX_LARGEST_PBUFFER, GLX_FBCONFIG_ID, or
GLX_SCREEN"

[Add the following paragraph just before the last of the section]

"If <attribute> is GLX_SCREEN, then <value> will be the screen number that
the drawable was created on."

GLX Protocol

This extension does not add any new requests. The GLX_VENDOR_NAMES_EXT enum
is used with the existing glXQueryServerString request, and GLX_SCREEN is
added to the attributes in the glXGetDrawableAttributes reply.

Errors

None

Issues
1)  Should GLX_VENDOR_NAMES_EXT contain a single vendor name or a list of
names?

Allowing multiple names would allow for multiple client-side drivers
that work with a single server-side driver. With only a single name,
selecting between multiple client drivers would require some form of
additional configuration.

2)  How are vendor names defined and interpreted?

The vendor names for a screen are defined based on the server's GLX
implementation. Typically, a server will simply send the name of the
driver that controls the screen.

The GLX client library is responsible for translating the vendor name
to a vendor library name. The details of the translation are part of
the interface between the vendor library and the GLX client library,
and so are not defined in this specification.

3)  What order should the vendor names be returned in?

The GLX client library will try to load and use each vendor name, in
the order that the server lists them. It will stop when it finds the
first vendor that works.
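
The lookup order described in issue 3 can be sketched in C. This is only an
illustration under stated assumptions: pick_vendor(), vendor_try_fn and
vendor_is() are hypothetical names, not part of libglvnd or GLX, and the
callback stands in for whatever "load and use the vendor library" means in a
real client library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: returns nonzero if the named vendor can be used. */
typedef int (*vendor_try_fn)(const char *name, void *user);

/* Sample callback for illustration: accepts exactly one vendor name. */
static int
vendor_is(const char *name, void *user)
{
   return strcmp(name, (const char *)user) == 0;
}

/*
 * Walk a space-separated vendor list in order and copy the first name
 * accepted by try_vendor into out. Returns 1 on success, 0 if no vendor
 * worked. Names that do not fit in out (including the NUL) are skipped.
 */
static int
pick_vendor(const char *names, vendor_try_fn try_vendor, void *user,
            char *out, size_t out_size)
{
   assert(names && try_vendor && out && out_size > 0);

   const char *p = names;
   while (*p) {
      p += strspn(p, " ");             /* skip separators */
      size_t len = strcspn(p, " ");    /* length of the current name */
      if (len == 0)
         break;                        /* only trailing spaces were left */
      if (len < out_size) {
         memcpy(out, p, len);
         out[len] = '\0';
         if (try_vendor(out, user))
            return 1;                  /* stop at the first vendor that works */
      }
      p += len;
   }
   out[0] = '\0';
   return 0;
}
```

A caller would pass the string returned for GLX_VENDOR_NAMES_EXT as names and
supply its own callback; the sample vendor_is() callback is only there to make
the sketch testable.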

Revision History

1. 8 March 2016
- Initial draft.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] include/GL: add mesa_glinterop.h for OpenGL-OpenCL interop (v3)

2016-03-09 Thread Marek Olšák
On Wed, Mar 9, 2016 at 6:58 PM, Emil Velikov wrote:
> On 9 March 2016 at 17:28, Marek Olšák wrote:
>> On Wed, Mar 9, 2016 at 4:47 PM, Emil Velikov wrote:
>>> On 8 March 2016 at 15:39, Marek Olšák wrote:
>>>> On Thu, Mar 3, 2016 at 11:56 PM, Emil Velikov wrote:
>>>>> Hi Marek,
>>>>>
>>>>> A small question, and a few trivial suggestions. Hopefully I'm not too
>>>>> late for the party.
>>>>>
>>>>> On 3 March 2016 at 19:46, Marek Olšák wrote:
>>>>>
>>>>>> +typedef struct _mesa_glinterop_device_info {
>>>>>> +   uint32_t size; /* size of this structure */
>>>>>> +
>>>>> I believe Michel suggested a similar thing: Wouldn't it be better to
>>>>> use a version one just like we do for the DRI extensions ? Many other
>>>>> interfaces also use version, some with a combination of size, but this
>>>>> is the first one in my experience that does only size.
>>>>>
>>>>>
>>>>>> +typedef struct _mesa_glinterop_export_in {
>>>>>
>>>>>> +   /* Size of memory pointed to by out_driver_data. */
>>>>>> +   uint32_t out_driver_data_size;
>>>>>> +
>>>>>> +   /* If the caller wants to query driver-specific data about the OpenGL
>>>>>> +* object, this should point to the memory where that data will be stored.
>>>>>> +*/
>>>>>> +   void *out_driver_data;
>>>>> I take it that the structure and format of this data will be
>>>>> internal/implementation specific, correct ? As on each side there will
>>>>
>>>> Yes.
>>>>
>>>>> be some sanity checking, wouldn't it be better to have size (version
>>>>> and/or other) within that structure format.
>>>>
>>>> Since amdgpu isn't going to use this feature, I don't care too much about it.
>>>>
>>>> Having the size outside of the driver-specific structure seems safer.

>>> Trying to future-proof things does not work nicely, most of the time.
>>> Imho it should be added once there is a user for it.
>>
>> I agree, but:
>> 1) Intel want it, so there is a future user.
>> 2) One of our OpenCL guys and I have agreed to keep it in case we need
>> it in the future too.
>>
> Don't mean to sound snarky - two sentences, each containing the word
> "future" :-)
> There was a song somewhere called "Tomorrow never comes".

OK. If Intel say they don't want it, I'll remove it.

Marek


[Mesa-dev] [PATCH] swrast: fix possible null dereference

2016-03-09 Thread Lars Hamre
Fixes a possible null dereference.

NOTE: this is my first time contributing, please let me know if I
  should be doing anything differently, thanks!

Signed-off-by: Lars Hamre 
---
 src/mesa/swrast/s_triangle.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/mesa/swrast/s_triangle.c b/src/mesa/swrast/s_triangle.c
index 876a74b..9225974 100644
--- a/src/mesa/swrast/s_triangle.c
+++ b/src/mesa/swrast/s_triangle.c
@@ -781,7 +781,7 @@ fast_persp_span(struct gl_context *ctx, SWspan *span,
   }
   break;
}
-
+
assert(span->arrayMask & SPAN_RGBA);
_swrast_write_rgba_span(ctx, span);

@@ -1063,8 +1063,8 @@ _swrast_choose_triangle( struct gl_context *ctx )
  swImg = swrast_texture_image_const(texImg);

  format = texImg ? texImg->TexFormat : MESA_FORMAT_NONE;
- minFilter = texObj2D ? samp->MinFilter : GL_NONE;
- magFilter = texObj2D ? samp->MagFilter : GL_NONE;
+ minFilter = (texObj2D && samp) ? samp->MinFilter : GL_NONE;
+ magFilter = (texObj2D && samp) ? samp->MagFilter : GL_NONE;
  envMode = ctx->Texture.Unit[0].EnvMode;

  /* First see if we can use an optimized 2-D texture function */
@@ -1073,6 +1073,7 @@ _swrast_choose_triangle( struct gl_context *ctx )
  && !ctx->ATIFragmentShader._Enabled
  && ctx->Texture._MaxEnabledTexImageUnit == 0
  && ctx->Texture.Unit[0]._Current->Target == GL_TEXTURE_2D
+ && samp
  && samp->WrapS == GL_REPEAT
  && samp->WrapT == GL_REPEAT
  && texObj2D->_Swizzle == SWIZZLE_NOOP
--
2.5.0



Re: [Mesa-dev] [PATCH 1/5] include/GL: add mesa_glinterop.h for OpenGL-OpenCL interop (v3)

2016-03-09 Thread Marek Olšák
On Wed, Mar 9, 2016 at 6:51 PM, Emil Velikov wrote:
> On 9 March 2016 at 17:11, Marek Olšák wrote:
>> On Wed, Mar 9, 2016 at 4:31 PM, Emil Velikov wrote:
>>> On 8 March 2016 at 22:29, Marek Olšák wrote:
>>>
>>>> Actually, I don't see how the version number would make it any better
>>>> for the structures, but returning the version number by
>>>> QueryDeviceInfo would be useful for the caller to know what to expect
>>>> if Mesa version < caller version. The sizes are still useful if Mesa
>>>> version > caller version.

>>> If any of this is an issue, then the whole DRI model just won't work ;-)
>>
>> The DRI extension versions only determine the number of function
>> callbacks, not function parameters and return values. This interop
>> thing is a lot more complicated than that, since it allows the "in"
>> and "out" structures to grow, and different rules apply to each.
> The fact that the DRI extension version is used only to determine the
> presence of certain functions is an implementation detail imho.
>
> If you look at the struct as a whole, it doesn't matter what the
> contents (past the version field) are. For all one cares there could be
> none ?
>
>> Also,
>> the implementation of DRI extensions allocates the structures, while
>> in the interop the user allocates the structures. It's a totally
>> different model.
>>
> True, it's a slightly different model. The following should still work ?
>
> Library A: allocates memory for the struct, and set the currently
> maximum supported version
> Library B: retrieves the 'empty' struct. Checks which version is lower
> (self and the one in the struct) and only populates for up-to that
> version.
>
>>>
>>> I'm thinking that the following should work. Please let me know if I'm
>>> losing the plot.
>>>
>>> Caller sets the structure and sets version of the interface it
>>> provides. Then callee first checks if it can work with the provider
>>> version, then proceeds as planned.
>>
>> The callee must always proceed even if the version is too low or too
>> high.
> What do you mean with "proceed" here ?
>
> Are you saying that even if the callee can work with up-to v2, it
> should somehow populate the v3 fields ?
> Shouldn't it use the lowest version of the two and only handle those fields ?

A v1 callee can only read and write v1 fields.
A v2 callee can only read and write v2 fields, but will only write v1
fields if the size doesn't include v2 fields.
I could go on.

The callee doesn't care about the caller's version at all.
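
The size-based rule above can be sketched in C as follows. This is an
illustration only: struct example_out, FIELD_FITS and example_fill() are
hypothetical and are not taken from mesa_glinterop.h. It assumes the caller
stores sizeof() of its own view of the structure in the size field, as in the
v3 patch quoted earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "out" structure; field names are illustrative only. */
struct example_out {
   uint32_t size;      /* set by the caller to sizeof(its struct) */
   uint32_t v1_value;  /* present since v1 */
   uint32_t v2_value;  /* appended in v2 */
};

/* True if a field that ends at byte offset `end` fits in the caller's struct. */
#define FIELD_FITS(out, end) ((out)->size >= (end))

/*
 * A "v2 callee": it always writes the v1 fields, and writes the v2 fields
 * only when the caller-provided size covers them. It never looks at a
 * caller version number. Returns 0 if the struct is too small even for v1.
 */
static int
example_fill(struct example_out *out)
{
   assert(out);

   if (!FIELD_FITS(out, offsetof(struct example_out, v1_value) +
                        sizeof(out->v1_value)))
      return 0;
   out->v1_value = 1;

   if (FIELD_FITS(out, offsetof(struct example_out, v2_value) +
                       sizeof(out->v2_value)))
      out->v2_value = 2;

   return 1;
}
```

A caller built against v1 would pass a smaller size and simply never see
v2_value touched, which is why the callee does not need the caller's version.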

>
>> The caller knows the Mesa interop version from QueryDeviceInfo
>> (defined in v4 and later) and knows what it should expect. Mesa only
>> checks the sizes, but otherwise doesn't care about the caller's
>> version.
>>
>>> Passing around multiple sizes is ugly and error prone to a point. One
>>> gets the order wrong (swaps in_size and out_size) or just the same
>>> sizeof(struct foo) in both places.
>>
>> The order of parameters in v5 is: in_size, *in, out_size, *out
>
>> If the implementation swaps in_size and out_size, it's not my problem.
> No offence, but you realise that this sounds rather funny, right ?
> In the context of designing interfaces, of course.

It's not my problem for the same reason that swapping src and dst in
memcpy is not glibc's problem. We have to draw the line somewhere. :)

Marek


Re: [Mesa-dev] [PATCH 5/5] i965/chv: Display proper branding

2016-03-09 Thread Ben Widawsky
On Mon, Mar 07, 2016 at 10:11:11PM -0800, Matt Turner wrote:
> On Mon, Mar 7, 2016 at 5:39 PM, Ben Widawsky wrote:
> > "Braswell" is a Cherryview based *thing*. It unfortunately requires extra
> > information to determine its marketing name. Unlike all previous products, and
> > hopefully all future ones, there is no unique 1:1 mapping of PCI device ID to
> > brand string.
> >
> > I put up a fight about adding any complexity to our GL renderer string code for
> > a very long time. However, a wise man made a comment to me that I couldn't argue
> > with: if a user installs Windows on their hardware, the brand string should be
> > the same as what we display in Linux. The Windows driver apparently does this
> > check, so we should too.
> >
> > Note that I did manage to find a good use for this info anyway in the compute
> > shader thread counts.
> >
> > Cc: Kaveh Nasri 
> > Signed-off-by: Ben Widawsky 
> > ---
> >  include/pci_ids/i965_pci_ids.h   |  4 ++--
> >  src/mesa/drivers/dri/i965/brw_context.c  | 33 ++++++++++++++++++++++++++++++---
> >  src/mesa/drivers/dri/i965/brw_context.h  |  3 ++-
> >  src/mesa/drivers/dri/i965/intel_screen.c |  2 +-
> >  4 files changed, 35 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
> > index bdfbefe..d783e39 100644
> > --- a/include/pci_ids/i965_pci_ids.h
> > +++ b/include/pci_ids/i965_pci_ids.h
> > @@ -156,8 +156,8 @@ CHIPSET(0x5932, kbl_gt4, "Intel(R) Kabylake GT4")
> >  CHIPSET(0x593A, kbl_gt4, "Intel(R) Kabylake GT4")
> >  CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")
> >  CHIPSET(0x593D, kbl_gt4, "Intel(R) Kabylake GT4")
> > -CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherryview)")
> > -CHIPSET(0x22B1, chv, "Intel(R) HD Graphics (Cherryview)")
> > +CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
> > +CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overriden in brw_get_renderer_string */
> 
> Typo: Overridden
> 
> >  CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
> >  CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
> >  CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)")
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
> > index df0f6bb..f57184f 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.c
> > +++ b/src/mesa/drivers/dri/i965/brw_context.c
> > @@ -77,13 +77,27 @@
> >
> >  const char *const brw_vendor_string = "Intel Open Source Technology Center";
> >
> > +static const char *
> > +get_bsw_model(const struct intel_screen *intelScreen)
> > +{
> > +   switch (intelScreen->eu_total) {
> > +   case 16:
> > +  return "405";
> > +   case 12:
> > +  return "400";
> > +   default:
> 
> I think this is safe to just mark unreachable(), right?
> 

No. If somehow the query from the kernel fails, we get nothing. That might be
some inexplicable IOCTL fail, or some bug in the kernel. In this case, I'd like
mesa to keep running, since by and large, nobody really cares about brand
strings except irrelevant people.

Note that in a previous patch, we fall back to sane defaults under that failure
condition.

> > +  return "   ";
> > +   }
> > +}
> > +
> >  const char *
> > -brw_get_renderer_string(unsigned deviceID)
> > +brw_get_renderer_string(const struct intel_screen *intelScreen)
> >  {
> > const char *chipset;
> > static char buffer[128];
> 
> Not your fault, but driGetRendererString() into this static buffer
> isn't thread-safe. I ran into a similar problem in EGL with
> shader-db's run.c last year.
> 

Do you want me to fix this up? AFAICS, I didn't actually make anything less
threadsafe.

> > +   char *bsw = NULL;
> 
> Thought the initialization wasn't necessary at first, but indeed it is
> if you want to unconditionally call free().
> 
> >
> > -   switch (deviceID) {
> > +   switch (intelScreen->deviceID) {
> >  #undef CHIPSET
> >  #define CHIPSET(id, symbol, str) case id: chipset = str; break;
> >  #include "pci_ids/i965_pci_ids.h"
> > @@ -92,7 +106,20 @@ brw_get_renderer_string(unsigned deviceID)
> >break;
> > }
> >
> > +   /* Braswell branding is funny, so we have to fix it up here */
> > +   if (intelScreen->deviceID == 0x22B1) {
> > +  char *needle;
> > +
> > +  bsw = strdup(chipset);
> > +  needle = strstr(bsw, "XXX");
> 
> Could declare char *needle here and initialize on one line if you wanted.
> 
> > +  if (needle) {
> > + strncpy(needle, get_bsw_model(intelScreen), strlen("XXX"));
> 
> Don't actually need (or want) any of the features of strncpy. Should
> just use memcpy.
> 
> > + chipset = bsw;
> > +  }
> > +   }
> > +
> > (void) driGetRendererString(buffer, chipset, 0);
> > +   free(bsw);
> > return buffer;
> >  }
> >
> > @@ -107,7 +134,7 @@ intel_get_string(struct gl_context * ctx, GLenum name)
> >
> > case GL_RENDER
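
The review comment about preferring memcpy over strncpy for the "XXX"
placeholder can be sketched in isolation. This is not the patch's code:
fill_model() is a hypothetical helper, and it uses malloc plus memcpy rather
than strdup only to stay within ISO C. It assumes, as the patch does, that the
model string is exactly as long as the placeholder.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of `tmpl` with the first "XXX" replaced by
 * `model`, which must be exactly three characters long. If `tmpl` has no
 * placeholder the copy is returned unchanged. Returns NULL on allocation
 * failure. The caller frees the result.
 */
static char *
fill_model(const char *tmpl, const char *model)
{
   assert(tmpl && model && strlen(model) == 3);

   size_t n = strlen(tmpl) + 1;      /* include the terminating NUL */
   char *copy = malloc(n);
   if (!copy)
      return NULL;
   memcpy(copy, tmpl, n);

   char *needle = strstr(copy, "XXX");
   if (needle)
      memcpy(needle, model, 3);      /* overwrite in place, no NUL written */

   return copy;
}
```

Because exactly three bytes are overwritten and the rest of the string is left
alone, none of strncpy's padding or truncation behaviour is needed here.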

Re: [Mesa-dev] GLX extension for vendor name lookup in libglvnd

2016-03-09 Thread Adam Jackson
On Wed, 2016-03-09 at 11:15 -0700, Kyle Brenneman wrote:
> The current implementation of libglvnd uses a new X extension called 
> x11glvnd to look up a vendor name for each screen and to find a screen 
> number for a GLXDrawable.
> 
> But, Adam Jackson pointed out that a GLX extension could do the same job 
> more cleanly: Looking up a vendor name is just querying a per-screen 
> string, which GLXQueryServerString does. Looking up a screen number for 
> a drawable could work by adding a GLX_SCREEN attribute to the 
> GLXGetDrawableAttributes reply.
> 
> Based on that idea, I've written up a rough draft of a GLX extension 
> spec. Any comments, questions, or suggestions are welcome, of course.

Argh, you beat me to it, I'd written almost exactly the same thing. I
just an update to my serverstring branch on github implementing what
I'd spec'd, details below...

> New Tokens
> 
>  Accepted by the <name> parameter of glXQueryServerString:
> 
>  GLX_VENDOR_NAMES_EXT0x

Perhaps easier than getting an enum allocated here, I'd appended this
string to the end of the response for GLX_VERSION, in the form

    glvnd:

where list is comma-separated, since that part of the string is already
"vendor-specific info".

Agreed with your rationale in the Issues section. I'd also had:

1) Do we need to define the interaction with GLX_SGIX_pbuffer?

   UNRESOLVED.  Xorg uses the same code paths for the 1.3 and
   pbuffer versions of GetDrawableAttributes, but extra attributes
   are probably harmless.

2) Do we want to add GLX_SCREEN to the list of fbconfig attributes
   as well?

   UNRESOLVED.  glvnd does not need that information, but it would
   be a natural orthogonality, and GLX_SGIX_fbconfig mentions it
   though GLX 1.3 does not.

- ajax


Re: [Mesa-dev] GLX extension for vendor name lookup in libglvnd

2016-03-09 Thread Kyle Brenneman

On 03/09/2016 12:21 PM, Adam Jackson wrote:
> On Wed, 2016-03-09 at 11:15 -0700, Kyle Brenneman wrote:
>> The current implementation of libglvnd uses a new X extension called
>> x11glvnd to look up a vendor name for each screen and to find a screen
>> number for a GLXDrawable.
>>
>> But, Adam Jackson pointed out that a GLX extension could do the same job
>> more cleanly: Looking up a vendor name is just querying a per-screen
>> string, which GLXQueryServerString does. Looking up a screen number for
>> a drawable could work by adding a GLX_SCREEN attribute to the
>> GLXGetDrawableAttributes reply.
>>
>> Based on that idea, I've written up a rough draft of a GLX extension
>> spec. Any comments, questions, or suggestions are welcome, of course.
>
> Argh, you beat me to it, I'd written almost exactly the same thing. I
> just an update to my serverstring branch on github implementing what
> I'd spec'd, details below...

Ah, sorry about that. I should have mentioned that I was working on it.

>> New Tokens
>>
>>   Accepted by the <name> parameter of glXQueryServerString:
>>
>>   GLX_VENDOR_NAMES_EXT0x
>
> Perhaps easier than getting an enum allocated here, I'd appended this
> string to the end of the response for GLX_VERSION, in the form
>
>     glvnd:<list>
>
> where list is comma-separated, since that part of the string is already
> "vendor-specific info".

That could work, although I would expect "vendor-specific info" to mean
"random, arbitrary, and probably not machine-parsable". I'd be hesitant
to try to impose a structure on something that's never had any structure
before.

> Agreed with your rationale in the Issues section. I'd also had:
>
> 1) Do we need to define the interaction with GLX_SGIX_pbuffer?
>
>    UNRESOLVED.  Xorg uses the same code paths for the 1.3 and
>    pbuffer versions of GetDrawableAttributes, but extra attributes
>    are probably harmless.

We probably don't need to -- as you say, extra attributes are likely
harmless. I'd guess that any system that supports libglvnd is going to
support at least GLX 1.3, so using glXQueryDrawable to look up the
screen number seems reasonable.

> 2) Do we want to add GLX_SCREEN to the list of fbconfig attributes
>    as well?
>
>    UNRESOLVED.  glvnd does not need that information, but it would
>    be a natural orthogonality, and GLX_SGIX_fbconfig mentions it
>    though GLX 1.3 does not.

Possibly, but that wouldn't change the protocol at all. The screen
number is included in the glXGetFBConfigs request, so it wouldn't make
sense to add it to the reply as well. It would be up to the client to
keep track of it instead.

> - ajax




Re: [Mesa-dev] [PATCH] nouveau: Fix clang reserved-user-defined-literal error.

2016-03-09 Thread Vinson Lee
On Wed, Mar 9, 2016 at 5:25 AM, Samuel Pitoiset wrote:
>
>
> On 03/09/2016 01:46 PM, Pierre Moreau wrote:
>>
>> I did hit that issue as well, but I have C++11 forced on my SPIR-V branch.
>>
>> I guess adding the whitespace will still result in code that works with older
>> C++ versions, so the fix can still be accepted even if we do not plan to switch
>> to C++11 by default.
>>
>
> Sure, the patch looks fine, but I wonder how he did hit that issue. :-)
>
> Anyway, if this doesn't break compilation without c++11, this patch is:
>
> Reviewed-by: Samuel Pitoiset 
>
>
>> Pierre
>>
>>
>> On 11:16 AM - Mar 09 2016, Samuel Pitoiset wrote:
>>>
>>> Nouveau doesn't use c++11 except the codegen part.
>>> How do you hit that issue? Pretty sure that you forced c++11, right?
>>>
>>> I can't reproduce that compilation error with clang 3.9 btw.
>>>
>>> On 03/09/2016 09:57 AM, Vinson Lee wrote:

>>>>   CXX  codegen/nv50_ir.lo
>>>> In file included from codegen/nv50_ir.cpp:28:
>>>> ./nouveau_debug.h:19:30: error: invalid suffix on literal; C++11
>>>> requires a space between literal and identifier
>>>> [-Wreserved-user-defined-literal]
>>>> fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
>>>>   ^
>>>>
>>>> Signed-off-by: Vinson Lee
>>>> ---
>>>>   src/gallium/drivers/nouveau/nouveau_debug.h | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/src/gallium/drivers/nouveau/nouveau_debug.h
>>>> b/src/gallium/drivers/nouveau/nouveau_debug.h
>>>> index d17df81..546a4ad 100644
>>>> --- a/src/gallium/drivers/nouveau/nouveau_debug.h
>>>> +++ b/src/gallium/drivers/nouveau/nouveau_debug.h
>>>> @@ -16,7 +16,7 @@
>>>>   #define NOUVEAU_DEBUG 0
>>>>
>>>>   #define NOUVEAU_ERR(fmt, args...) \
>>>> -   fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
>>>> +   fprintf(stderr, "%s:%d - " fmt, __FUNCTION__, __LINE__, ##args)
>>>>
>>>>   #define NOUVEAU_DBG(ch, args...)   \
>>>>  if ((NOUVEAU_DEBUG) & (NOUVEAU_DEBUG_##ch))\

>>>
>>> --
>>> -Samuel
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
> --
> -Samuel

Building swr also seems to be adding -std=c++11 to the nouveau portion
of the build. Can you try a clang build with this configure statement?

./autogen.sh --with-dri-drivers= --with-gallium-drivers=nouveau,swr
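
For reference, the pattern the patch fixes can be shown with a small C sketch.
EXAMPLE_ERR and example_report() are hypothetical names, and the macro formats
into a buffer with snprintf instead of writing to stderr so that the result can
be inspected. It also uses the standard __VA_ARGS__ and __func__ rather than the
GNU "args..." and __FUNCTION__ forms used in nouveau_debug.h. The only point is
the space between the string literal and fmt: without it, a C++11 compiler
treats fmt as a user-defined literal suffix, while with it the same text is
accepted by C and by older and newer C++.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Adjacent string literals are concatenated, so "%s:%d - " fmt becomes a
 * single format string once `fmt` expands to a literal. Note the space
 * before `fmt`.
 */
#define EXAMPLE_ERR(buf, size, fmt, ...) \
   snprintf((buf), (size), "%s:%d - " fmt, __func__, __LINE__, __VA_ARGS__)

/* Formats "<function>:<line> - bad value <value>" into buf. */
static int
example_report(char *buf, size_t size, int value)
{
   assert(buf && size > 0);
   return EXAMPLE_ERR(buf, size, "bad value %d", value);
}
```

The line number in the output depends on where the macro is expanded, so only
the function name prefix and the message suffix are stable.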


Re: [Mesa-dev] [PATCH 5/5] i965/chv: Display proper branding

2016-03-09 Thread Matt Turner
On Wed, Mar 9, 2016 at 10:36 AM, Ben Widawsky wrote:
> On Mon, Mar 07, 2016 at 10:11:11PM -0800, Matt Turner wrote:
>> On Mon, Mar 7, 2016 at 5:39 PM, Ben Widawsky wrote:
>> > "Braswell" is a Cherryview based *thing*. It unfortunately requires extra
>> > information to determine its marketing name. Unlike all previous products, and
>> > hopefully all future ones, there is no unique 1:1 mapping of PCI device ID to
>> > brand string.
>> >
>> > I put up a fight about adding any complexity to our GL renderer string code for
>> > a very long time. However, a wise man made a comment to me that I couldn't argue
>> > with: if a user installs Windows on their hardware, the brand string should be
>> > the same as what we display in Linux. The Windows driver apparently does this
>> > check, so we should too.
>> >
>> > Note that I did manage to find a good use for this info anyway in the compute
>> > shader thread counts.
>> >
>> > Cc: Kaveh Nasri 
>> > Signed-off-by: Ben Widawsky 
>> > ---
>> >  include/pci_ids/i965_pci_ids.h   |  4 ++--
>> >  src/mesa/drivers/dri/i965/brw_context.c  | 33 ++++++++++++++++++++++++++++++---
>> >  src/mesa/drivers/dri/i965/brw_context.h  |  3 ++-
>> >  src/mesa/drivers/dri/i965/intel_screen.c |  2 +-
>> >  4 files changed, 35 insertions(+), 7 deletions(-)
>> >
>> > diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
>> > index bdfbefe..d783e39 100644
>> > --- a/include/pci_ids/i965_pci_ids.h
>> > +++ b/include/pci_ids/i965_pci_ids.h
>> > @@ -156,8 +156,8 @@ CHIPSET(0x5932, kbl_gt4, "Intel(R) Kabylake GT4")
>> >  CHIPSET(0x593A, kbl_gt4, "Intel(R) Kabylake GT4")
>> >  CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")
>> >  CHIPSET(0x593D, kbl_gt4, "Intel(R) Kabylake GT4")
>> > -CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherryview)")
>> > -CHIPSET(0x22B1, chv, "Intel(R) HD Graphics (Cherryview)")
>> > +CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
>> > +CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overriden in brw_get_renderer_string */
>>
>> Typo: Overridden
>>
>> >  CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
>> >  CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
>> >  CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)")
>> > diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
>> > index df0f6bb..f57184f 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_context.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_context.c
>> > @@ -77,13 +77,27 @@
>> >
>> >  const char *const brw_vendor_string = "Intel Open Source Technology Center";
>> >
>> > +static const char *
>> > +get_bsw_model(const struct intel_screen *intelScreen)
>> > +{
>> > +   switch (intelScreen->eu_total) {
>> > +   case 16:
>> > +  return "405";
>> > +   case 12:
>> > +  return "400";
>> > +   default:
>>
>> I think this is safe to just mark unreachable(), right?
>>
>
> No. If somehow the query from the kernel fails, we get nothing. That might be
> some inexplicable IOCTL fail, or some bug in the kernel. In this case, I'd like
> mesa to keep running, since by and large, nobody really cares about brand
> strings except irrelevant people.
>
> Note that in a previous patch, we fall back to sane defaults under that failure
> condition.

Fine by me.

>> > +  return "   ";
>> > +   }
>> > +}
>> > +
>> >  const char *
>> > -brw_get_renderer_string(unsigned deviceID)
>> > +brw_get_renderer_string(const struct intel_screen *intelScreen)
>> >  {
>> > const char *chipset;
>> > static char buffer[128];
>>
>> Not your fault, but driGetRendererString() into this static buffer
>> isn't thread-safe. I ran into a similar problem in EGL with
>> shader-db's run.c last year.
>>
>
> Do you want me to fix this up? AFAICS, I didn't actually make anything less
> threadsafe.

Nope. It was just something I noticed.

>> > +   char *bsw = NULL;
>>
>> Thought the initialization wasn't necessary at first, but indeed it is
>> if you want to unconditionally call free().
>>
>> >
>> > -   switch (deviceID) {
>> > +   switch (intelScreen->deviceID) {
>> >  #undef CHIPSET
>> >  #define CHIPSET(id, symbol, str) case id: chipset = str; break;
>> >  #include "pci_ids/i965_pci_ids.h"
>> > @@ -92,7 +106,20 @@ brw_get_renderer_string(unsigned deviceID)
>> >break;
>> > }
>> >
>> > +   /* Braswell branding is funny, so we have to fix it up here */
>> > +   if (intelScreen->deviceID == 0x22B1) {
>> > +  char *needle;
>> > +
>> > +  bsw = strdup(chipset);
>> > +  needle = strstr(bsw, "XXX");
>>
>> Could declare char *needle here and initialize on one line if you wanted.
>>
>> > +  if (needle) {
>> > + strncpy(needle, get_bsw_model(intelScreen), strlen("XXX"));
>>
>> Don't actually need (or want) any of the features of strncpy. Should
>> just use memcpy.
>>
>> > + chipset = b

Re: [Mesa-dev] [PATCH] gallium/swr: remove use of BYTE from swr driver

2016-03-09 Thread Cherniak, Bruce
Reviewed-by: Bruce Cherniak 




On 3/8/16, 11:50 AM, "mesa-dev on behalf of Tim Rowley" wrote:

>Remove use of a win32-style type leaked from the swr rasterizer.
>---
> src/gallium/drivers/swr/swr_memory.h    | 8 ++++----
> src/gallium/drivers/swr/swr_scratch.cpp | 8 ++++----
> src/gallium/drivers/swr/swr_screen.cpp  | 4 ++--
> src/gallium/drivers/swr/swr_state.cpp   | 8 ++++----
> 4 files changed, 14 insertions(+), 14 deletions(-)
>
>diff --git a/src/gallium/drivers/swr/swr_memory.h b/src/gallium/drivers/swr/swr_memory.h
>index d116781..65fc169 100644
>--- a/src/gallium/drivers/swr/swr_memory.h
>+++ b/src/gallium/drivers/swr/swr_memory.h
>@@ -28,14 +28,14 @@ void LoadHotTile(
> SWR_FORMAT dstFormat,
> SWR_RENDERTARGET_ATTACHMENT renderTargetIndex,
> UINT x, UINT y, uint32_t renderTargetArrayIndex,
>-BYTE *pDstHotTile);
>+uint8_t *pDstHotTile);
> 
> void StoreHotTile(
> SWR_SURFACE_STATE *pDstSurface,
> SWR_FORMAT srcFormat,
> SWR_RENDERTARGET_ATTACHMENT renderTargetIndex,
> UINT x, UINT y, uint32_t renderTargetArrayIndex,
>-BYTE *pSrcHotTile);
>+uint8_t *pSrcHotTile);
> 
> void StoreHotTileClear(
> SWR_SURFACE_STATE *pDstSurface,
>@@ -49,7 +49,7 @@ swr_LoadHotTile(HANDLE hPrivateContext,
> SWR_FORMAT dstFormat,
> SWR_RENDERTARGET_ATTACHMENT renderTargetIndex,
> UINT x, UINT y,
>-uint32_t renderTargetArrayIndex, BYTE* pDstHotTile)
>+uint32_t renderTargetArrayIndex, uint8_t* pDstHotTile)
> {
>// Grab source surface state from private context
>swr_draw_context *pDC = (swr_draw_context*)hPrivateContext;
>@@ -63,7 +63,7 @@ swr_StoreHotTile(HANDLE hPrivateContext,
>  SWR_FORMAT srcFormat,
>  SWR_RENDERTARGET_ATTACHMENT renderTargetIndex,
>  UINT x, UINT y,
>- uint32_t renderTargetArrayIndex, BYTE* pSrcHotTile)
>+ uint32_t renderTargetArrayIndex, uint8_t* pSrcHotTile)
> {
>// Grab destination surface state from private context
>swr_draw_context *pDC = (swr_draw_context*)hPrivateContext;
>diff --git a/src/gallium/drivers/swr/swr_scratch.cpp b/src/gallium/drivers/swr/swr_scratch.cpp
>index e6c448c..28eb2ac 100644
>--- a/src/gallium/drivers/swr/swr_scratch.cpp
>+++ b/src/gallium/drivers/swr/swr_scratch.cpp
>@@ -58,14 +58,14 @@ swr_copy_to_scratch_space(struct swr_context *ctx,
>  }
> 
>  if (!space->base) {
>-space->base = (BYTE *)align_malloc(space->current_size, 4);
>+space->base = (uint8_t *)align_malloc(space->current_size, 4);
> space->head = (void *)space->base;
>  }
>   }
> 
>   /* Wrap */
>-  if (((BYTE *)space->head + size)
>-  >= ((BYTE *)space->base + space->current_size)) {
>+  if (((uint8_t *)space->head + size)
>+  >= ((uint8_t *)space->base + space->current_size)) {
>  /*
>   * TODO XXX: Should add a fence on wrap.  Assumption is that
>   * current_space >> size, and there are at least MAX_DRAWS_IN_FLIGHT
>@@ -78,7 +78,7 @@ swr_copy_to_scratch_space(struct swr_context *ctx,
>   }
> 
>   ptr = space->head;
>-  space->head = (BYTE *)space->head + size;
>+  space->head = (uint8_t *)space->head + size;
>}
> 
>/* Copy user_buffer to scratch */
>diff --git a/src/gallium/drivers/swr/swr_screen.cpp b/src/gallium/drivers/swr/swr_screen.cpp
>index f0d48cd..89de17a 100644
>--- a/src/gallium/drivers/swr/swr_screen.cpp
>+++ b/src/gallium/drivers/swr/swr_screen.cpp
>@@ -537,7 +537,7 @@ swr_texture_layout(struct swr_screen *screen,
>res->swr.pitch = res->row_stride[0];
> 
>if (allocate) {
>-  res->swr.pBaseAddress = (BYTE *)_aligned_malloc(total_size, 64);
>+  res->swr.pBaseAddress = (uint8_t *)_aligned_malloc(total_size, 64);
> 
>   if (res->has_depth && res->has_stencil) {
>  SWR_FORMAT_INFO finfo = GetFormatInfo(res->secondary.format);
>@@ -550,7 +550,7 @@ swr_texture_layout(struct swr_screen *screen,
>  res->secondary.numSamples = (1 << pt->nr_samples);
>  res->secondary.pitch = res->alignedWidth * finfo.Bpp;
> 
>- res->secondary.pBaseAddress = (BYTE *)_aligned_malloc(
>+ res->secondary.pBaseAddress = (uint8_t *)_aligned_malloc(
> res->alignedHeight * res->secondary.pitch, 64);
>   }
>}
>diff --git a/src/gallium/drivers/swr/swr_state.cpp b/src/gallium/drivers/swr/swr_state.cpp
>index 49035b5..706bf10 100644
>--- a/src/gallium/drivers/swr/swr_state.cpp
>+++ b/src/gallium/drivers/swr/swr_state.cpp
>@@ -1032,12 +1032,12 @@ swr_update_derived(struct swr_context *ctx,
>  pDC->num_constantsVS[i] = cb->buffer_size;
>  if (cb->buffer)
> pDC->constantVS[i] =
>-   (const float *)((const BYTE *)cb->buffer + cb->buffer_offset);
>+   (const float *)((const uint8_t *)cb->buffer + cb->buffer_offset);
>  else {
>   

[Mesa-dev] [PATCH] radeonsi: Lazily re-set sampler views after disabling DCC

2016-03-09 Thread Bas Nieuwenhuizen
Clear DCC flags if necessary when binding a new sampler_view. Also
rebind all sampler views so that the sampler views that were already
bound are also up to date.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/gallium/drivers/radeon/r600_texture.c |  2 --
 src/gallium/drivers/radeonsi/si_descriptors.c | 22 +++++++++++++++++++---
 src/gallium/drivers/radeonsi/si_state.h   |  1 +
 src/gallium/drivers/radeonsi/si_state_draw.c  |  1 +
 4 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c
index 1a8822c..07118fc 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -307,14 +307,12 @@ static void r600_texture_disable_dcc(struct r600_common_screen *rscreen,
/* Disable DCC. */
rtex->dcc_offset = 0;
rtex->cb_color_info &= ~VI_S_028C70_DCC_ENABLE(1);
 
/* Notify all contexts about the change. */
r600_dirty_all_framebuffer_states(rscreen);
-
-   /* TODO: re-set all sampler views and images, but how? */
 }
 
 static boolean r600_texture_get_handle(struct pipe_screen* screen,
   struct pipe_resource *resource,
   struct winsys_handle *whandle,
unsigned usage)
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
index 37b9d68..5838e24 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -182,18 +182,22 @@ static void si_sampler_views_begin_new_cs(struct si_context *sctx,
 }
 
 static void si_set_sampler_view(struct si_context *sctx,
struct si_sampler_views *views,
unsigned slot, struct pipe_sampler_view *view)
 {
-   if (views->views[slot] == view)
+   struct si_sampler_view *rview = (struct si_sampler_view*)view;
+
+   if (view && G_008F28_COMPRESSION_EN(rview->state[6]) &&
+   ((struct r600_texture*)rview->base.texture)->dcc_offset == 0) {
+   rview->state[6] &= C_008F28_COMPRESSION_EN &
+  C_008F28_ALPHA_IS_ON_MSB;
+   } else if (views->views[slot] == view)
return;
 
if (view) {
-   struct si_sampler_view *rview =
-   (struct si_sampler_view*)view;
struct r600_texture *rtex = (struct r600_texture 
*)view->texture;
 
si_sampler_view_add_buffer(sctx, view->texture);
 
pipe_sampler_view_reference(&views->views[slot], view);
memcpy(views->desc.list + slot * 16, rview->state, 8*4);
@@ -267,12 +271,24 @@ static void si_set_sampler_views(struct pipe_context *ctx,
samplers->depth_texture_mask &= ~(1 << slot);
samplers->compressed_colortex_mask &= ~(1 << slot);
}
}
 }
 
+void si_reset_sampler_views(struct si_context *sctx) {
+   unsigned shader, sampler;
+
+   for (shader = 0; shader < SI_NUM_SHADERS; ++shader) {
+   struct si_sampler_views *views = &sctx->samplers[shader].views;
+   for (sampler = 0; sampler < SI_NUM_SAMPLERS; ++sampler) {
+   si_set_sampler_view(sctx, views, sampler,
+   views->views[sampler]);
+   }
+   }
+}
+
 /* SAMPLER STATES */
 
 static void si_bind_sampler_states(struct pipe_context *ctx, unsigned shader,
unsigned start, unsigned count, void 
**states)
 {
struct si_context *sctx = (struct si_context *)ctx;
diff --git a/src/gallium/drivers/radeonsi/si_state.h 
b/src/gallium/drivers/radeonsi/si_state.h
index fb16d0f..dab94e5 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -243,12 +243,13 @@ void si_set_ring_buffer(struct pipe_context *ctx, uint 
shader, uint slot,
bool add_tid, bool swizzle,
unsigned element_size, unsigned index_stride, uint64_t 
offset);
 void si_init_all_descriptors(struct si_context *sctx);
 bool si_upload_shader_descriptors(struct si_context *sctx);
 void si_release_all_descriptors(struct si_context *sctx);
 void si_all_descriptors_begin_new_cs(struct si_context *sctx);
+void si_reset_sampler_views(struct si_context *sctx);
 void si_upload_const_buffer(struct si_context *sctx, struct r600_resource 
**rbuffer,
const uint8_t *ptr, unsigned size, uint32_t 
*const_offset);
 void si_shader_change_notify(struct si_context *sctx);
 void si_emit_shader_userdata(struct si_context *sctx, struct r600_atom *atom);
 
 /* si_state.c */
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 5d094c7..585148d 100644
--- a/src/ga

Re: [Mesa-dev] [PATCH] i965/cfg: Remove redundant #pragma once.

2016-03-09 Thread Francisco Jerez
Iago Toral  writes:

> On Tue, 2016-03-08 at 17:42 -0800, Francisco Jerez wrote:
>> brw_cfg.h already has include guards, remove the "#pragma once" which
>> is redundant and non-standard.
>
> FWIW, I think using both #pragma once and include guards is a way to
> keep portability while still getting the performance advantage of
> #pragma once where it is supported.
>
It's highly unlikely to make any significant difference on any
reasonably modern compiler.  I cannot measure any change in compilation
time locally from my cleanup.

> Also it seems that we do the same thing in many other files...
>
Really?  I'm not aware of any other file where we use both.

>> ---
>>  src/mesa/drivers/dri/i965/brw_cfg.h | 1 -
>>  1 file changed, 1 deletion(-)
>> 
>> diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h 
>> b/src/mesa/drivers/dri/i965/brw_cfg.h
>> index 405020b..a2ca6b1 100644
>> --- a/src/mesa/drivers/dri/i965/brw_cfg.h
>> +++ b/src/mesa/drivers/dri/i965/brw_cfg.h
>> @@ -25,7 +25,6 @@
>>   *
>>   */
>>  
>> -#pragma once
>>  #ifndef BRW_CFG_H
>>  #define BRW_CFG_H
>>  
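
To illustrate why the include guard alone is sufficient, here is a minimal sketch that simulates a header being included twice in one translation unit. The DEMO_CFG_H guard and the demo_* names are hypothetical and are not taken from brw_cfg.h.

```c
#include <assert.h>

/* First "inclusion" of the hypothetical header: the guard macro is not yet
 * defined, so the body is compiled and the macro gets defined. */
#ifndef DEMO_CFG_H
#define DEMO_CFG_H

struct demo_block {
   int start_ip;
   int end_ip;
};

/* Number of instructions covered by the block (inclusive range). */
static inline int demo_block_length(const struct demo_block *b)
{
   assert(b->end_ip >= b->start_ip);
   return b->end_ip - b->start_ip + 1;
}

#endif /* DEMO_CFG_H */

/* Second "inclusion": DEMO_CFG_H is already defined, so the preprocessor
 * skips the body and struct demo_block is not redefined. */
#ifndef DEMO_CFG_H
#define DEMO_CFG_H
struct demo_block { int start_ip; int end_ip; };
#endif /* DEMO_CFG_H */
```

The guard works on any conforming preprocessor, which is why dropping the non-standard pragma should not change behaviour.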


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] swrast: fix possible null dereference

2016-03-09 Thread Anuj Phogat
On Wed, Mar 9, 2016 at 10:21 AM, Lars Hamre  wrote:
> Fixes a possible null dereference.
>
> NOTE: this is my first time contributing, please let me know if I
>   should be doing anything differently, thanks!
Welcome to mesa-dev Lars.

>
> Signed-off-by: Lars Hamre 
> ---
>  src/mesa/swrast/s_triangle.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/swrast/s_triangle.c b/src/mesa/swrast/s_triangle.c
> index 876a74b..9225974 100644
> --- a/src/mesa/swrast/s_triangle.c
> +++ b/src/mesa/swrast/s_triangle.c
> @@ -781,7 +781,7 @@ fast_persp_span(struct gl_context *ctx, SWspan *span,
>}
>break;
> }
> -
> +
Drop this spurious change.
> assert(span->arrayMask & SPAN_RGBA);
> _swrast_write_rgba_span(ctx, span);
>
> @@ -1063,8 +1063,8 @@ _swrast_choose_triangle( struct gl_context *ctx )
>   swImg = swrast_texture_image_const(texImg);
>
>   format = texImg ? texImg->TexFormat : MESA_FORMAT_NONE;
> - minFilter = texObj2D ? samp->MinFilter : GL_NONE;
> - magFilter = texObj2D ? samp->MagFilter : GL_NONE;
> + minFilter = (texObj2D && samp) ? samp->MinFilter : GL_NONE;
> + magFilter = (texObj2D && samp) ? samp->MagFilter : GL_NONE;
>   envMode = ctx->Texture.Unit[0].EnvMode;
>
>   /* First see if we can use an optimized 2-D texture function */
> @@ -1073,6 +1073,7 @@ _swrast_choose_triangle( struct gl_context *ctx )
>   && !ctx->ATIFragmentShader._Enabled
>   && ctx->Texture._MaxEnabledTexImageUnit == 0
>   && ctx->Texture.Unit[0]._Current->Target == GL_TEXTURE_2D
> + && samp
>   && samp->WrapS == GL_REPEAT
>   && samp->WrapT == GL_REPEAT
>   && texObj2D->_Swizzle == SWIZZLE_NOOP
> --
> 2.5.0
>

Rest LGTM. With the suggested change, patch is:

Reviewed-by: Anuj Phogat 
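
The guarded ternary from the patch can be sketched in isolation as follows. This is a simplified illustration: demo_sampler, demo_min_filter and the DEMO_FILTER_* values are hypothetical stand-ins, not the swrast types.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FILTER_NONE    0
#define DEMO_FILTER_NEAREST 0x2600
#define DEMO_FILTER_LINEAR  0x2601

/* Hypothetical stand-in for the sampler object used by the rasterizer. */
struct demo_sampler {
   int min_filter;
   int mag_filter;
};

/* Only dereference samp when both the texture object and the sampler exist;
 * otherwise fall back to "no filter", as the patch does with GL_NONE. */
static int demo_min_filter(const void *tex_obj, const struct demo_sampler *samp)
{
   return (tex_obj && samp) ? samp->min_filter : DEMO_FILTER_NONE;
}
```

Checking both pointers in the same condition keeps the fallback value in one place rather than scattering NULL tests through the caller.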


Re: [Mesa-dev] [PATCH 09/10] i965: Add and use is_scheduling_barrier() function.

2016-03-09 Thread Francisco Jerez
Ian Romanick  writes:

> On 03/08/2016 04:47 PM, Francisco Jerez wrote:
>> Matt Turner  writes:
>> 
>>> On Fri, Mar 4, 2016 at 8:49 PM, Francisco Jerez  
>>> wrote:
>>>> Matt Turner writes:
>>>>
>>>>> Though there is a lot of overlap with has_side_effects(), these do mean
>>>>> different things.
>>>>
>>>> Can we do it the other way around and implement is_scheduling_barrier()
>>>> in terms of has_side_effects()?  has_side_effects() seems like the more
>>>> fundamental of the two and because is_scheduling_barrier() is specific
>>>> to the scheduler it would make more sense to keep it static inline in
>>>> brw_schedule_instructions.cpp for the sake of encapsulation.
>>>>
>>>> AFAIUI is_scheduling_barrier() is merely a makeshift approximation at
>>>> missing memory dependency analysis, and in the long term is the wrong
>>>> question to ask (IMHO the right question is "does this instruction have
>>>> an execution dependency with respect to this other?", which implies that
>>>> either of the two instructions has some sort of side effect, but not the
>>>> converse).  has_side_effects() OTOH has a well-defined answer that can
>>>> be answered by looking at the semantics of the instruction alone,
>>>> independent from scheduling heuristics and surrounding compiler
>>>> infrastructure.
>>>>
>>>> I think for the moment I'd make is_scheduling_barrier return true if the
>>>> instruction has side effects, except where you have the guarantee that
>>>> the side-effectful instruction won't have a (non-dataflow related)
>>>> execution dependency with any other instruction of the program, which is
>>>> currently only the case for non-EOT FB_WRITE -- pretty much as Ken had
>>>> open-coded it in his scheduling changes except for the non-EOT part.
>>>
>>> I think your suggestion is to keep has_side_effects() the same and to
>>> wrap it with a static is_scheduling_barrier(). What we have is
>>>
>>> bool
>>> backend_instruction::has_side_effects() const
>>> {
>>>switch (opcode) {
>>>[list of instructions]
>>>   return true;
>>>default:
>>>   return false;
>>>}
>>> }
>>>
>>> bool
>>> fs_inst::has_side_effects() const
>>> {
>>>return this->eot || backend_instruction::has_side_effects();
>>> }
>>>
>>> And then in the scheduler,
>>>
>>> if ((inst->opcode == FS_OPCODE_PLACEHOLDER_HALT ||
>>>  inst->has_side_effects()) &&
>>> inst->opcode != FS_OPCODE_FB_WRITE)
>>>add_barrier_deps(n);
>>>
>>> where the FS_OPCODE_FB_WRITE check was added recently to avoid
>>> treating it as a barrier. That seems pretty dirty, because
>>> has_side_effects() returns true for that opcode, so we're just hacking
>>> around that. I noted in my revert that it also had the effect of
>>> making an FB_WRITE with EOT not a barrier.
>>>
>>> So, your suggestion is to add another layer on top of that, that
>>> checks inst->eot?
>>>
>>> We'd have something like
>>>
>>> static inline bool is_scheduling_barrier(fs_inst *inst)
>>> {
>>>return inst->opcode == FS_OPCODE_PLACEHOLDER_HALT ||
>>>   (inst->has_side_effects() &&
>>>inst->opcode != FS_OPCODE_FB_WRITE) ||
>>>   inst->eot;
>>> }
>>>
>>> where, tracing through the execution, fs_inst::has_side_effects would
>>> return true because inst->eot, but since opcode is FB_WRITE that
>>> expression would evaluate to false. But then because inst->eot, it'd
>>> evaluate to true.
>>>
>>> Doesn't that seem *really* hacky?
>>>
>>> Can I go ahead and get someone else's opinion? I doubt you're going to 
>>> agree.
>> 
>> Oh, I definitely agree, it seems a bit of a hack to me -- It seems hacky
>> because the whole scheduling barrier business IS a hack.  Instead of
>> implementing actual memory dependency analysis we want to do the
>> following (at least for the time being) which I think we all agree is a
>> hack:
>> 
>>  Treat any instruction with side-effects as a scheduling barrier except
>>  where we have the guarantee that the side-effects of the instruction
>>  won't have any influence on any other instruction in the program.
>> 
>> That's currently only the case for non-EOT FB_WRITE (and there's nothing
>> special about FB_WRITE that makes it different from other side-effectful
>> instructions other than the fact that we currently don't implement
>> framebuffer reads, so it's pretty much coincidental that they currently
>> have no influence on anything else).
>
> FB reads will be necessary for GL_KHR_blend_equation_advanced which is
> needed for OpenGL ES 3.2.
>
Yeah, exactly, that's part of the reason I regard this as a temporary
hack -- We will most likely end up reverting the special handling of
FB_WRITEs (and treat it as a regular side-effectful instruction during
scheduling) when framebuffer reads are implemented because the
assumption this is based on will break down.
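
The rule described above (every side-effectful instruction is a barrier, with non-EOT FB_WRITE as the only exception) can be sketched roughly as follows. This is a simplified C illustration, not the i965 code: demo_inst, the DEMO_OP_* opcodes and both helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced opcode set. */
enum demo_opcode {
   DEMO_OP_ADD,
   DEMO_OP_UNTYPED_WRITE,
   DEMO_OP_FB_WRITE,
   DEMO_OP_PLACEHOLDER_HALT
};

struct demo_inst {
   enum demo_opcode opcode;
   bool eot; /* end-of-thread message */
};

/* Answerable from the semantics of the instruction alone. */
static bool demo_has_side_effects(const struct demo_inst *inst)
{
   if (inst->eot)
      return true;

   switch (inst->opcode) {
   case DEMO_OP_UNTYPED_WRITE:
   case DEMO_OP_FB_WRITE:
      return true;
   default:
      return false;
   }
}

/* Scheduler-specific wrapper: side effects imply a barrier unless the
 * instruction is known not to influence any other instruction. In this
 * sketch that exception is only a non-EOT FB_WRITE. */
static bool demo_is_scheduling_barrier(const struct demo_inst *inst)
{
   if (inst->opcode == DEMO_OP_PLACEHOLDER_HALT)
      return true;
   if (!demo_has_side_effects(inst))
      return false;
   return !(inst->opcode == DEMO_OP_FB_WRITE && !inst->eot);
}
```

Keeping the exception inside the scheduler-side helper means that only one line has to go away if framebuffer reads invalidate the assumption.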

> Implementing full memory dependency tracking is hard.

It's a pretty much straightforward extension of the use/def chains
analysis pass I've been working on.  I don't think it will be 

Re: [Mesa-dev] [PATCH] r600/sb: Do not distribute neg in expr_handler::fold_assoc() when folding multiplications.

2016-03-09 Thread Glenn Kennard

On Wed, 09 Mar 2016 09:58:48 +0100, Xavier B  wrote:


From: xavier 

Previously it was doing this transformation for a Trine 3 shader:
 MUL R6.x.12,R13.x.23, 0.5|3f00
-MULADD R4.x.12,-R6.x.12, 2|4000, 1|3f80
+MULADD R4.x.12,-R13.x.23, -1|bf80, 1|3f80

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94412
Signed-off-by: Xavier Bouchoux 
---
 src/gallium/drivers/r600/sb/sb_expr.cpp | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/sb/sb_expr.cpp 
b/src/gallium/drivers/r600/sb/sb_expr.cpp
index 556a05d..3dd3a48 100644
--- a/src/gallium/drivers/r600/sb/sb_expr.cpp
+++ b/src/gallium/drivers/r600/sb/sb_expr.cpp
@@ -598,9 +598,13 @@ bool expr_handler::fold_assoc(alu_node *n) {
unsigned op = n->bc.op;
bool allow_neg = false, cur_neg = false;
+   bool distribute_neg = false;
switch(op) {
case ALU_OP2_ADD:
+   distribute_neg = true;



+   allow_neg = true;


I'm not sure this change belongs in this patch, or even if it's correct.


+   break;
case ALU_OP2_MUL:
case ALU_OP2_MUL_IEEE:
allow_neg = true;
@@ -632,7 +636,7 @@ bool expr_handler::fold_assoc(alu_node *n) {
if (v1->is_const()) {
literal arg = v1->get_const_value();
apply_alu_src_mod(a->bc, 1, arg);
-   if (cur_neg)
+   if (cur_neg && distribute_neg)
arg.f = -arg.f;
if (a == n)
@@ -660,7 +664,7 @@ bool expr_handler::fold_assoc(alu_node *n) {
if (v0->is_const()) {
literal arg = v0->get_const_value();
apply_alu_src_mod(a->bc, 0, arg);
-   if (cur_neg)
+   if (cur_neg && distribute_neg)
arg.f = -arg.f;
if (last_arg == 0) {



With the allow_neg change removed, patch is
Reviewed-by: Glenn Kennard 
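
The arithmetic behind the bug can be checked with a small sketch using the instruction sequence from the commit message. The function names are hypothetical, and the "fixed" form (-R13 * 1 + 1) is an assumption about what the folded result should be, not output captured from the compiler.

```c
#include <assert.h>

/* Unfolded sequence from the commit message:
 *   MUL    R6 = R13 * 0.5
 *   MULADD R4 = -R6 * 2 + 1
 */
static float demo_unfolded(float r13)
{
   float r6 = r13 * 0.5f;
   return -r6 * 2.0f + 1.0f;
}

/* Folding while also negating the constant, as in the reported output:
 *   MULADD R4 = -R13 * -1 + 1
 * The negation is applied twice, so the sign of the product flips. */
static float demo_folded_distributed(float r13)
{
   return -r13 * -1.0f + 1.0f;
}

/* Folding that keeps the negation on the source operand only
 * (assumed intended result): MULADD R4 = -R13 * 1 + 1 */
static float demo_folded_fixed(float r13)
{
   return -r13 * 1.0f + 1.0f;
}
```

For a sum, -(a + c) equals (-a) + (-c), so the constant has to be negated as well; for a product, -(a * c) equals (-a) * c, so negating the constant too cancels the sign.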


Re: [Mesa-dev] [PATCH] nouveau: Fix clang reserved-user-defined-literal error.

2016-03-09 Thread Samuel Pitoiset



On 03/09/2016 09:28 PM, Vinson Lee wrote:

On Wed, Mar 9, 2016 at 5:25 AM, Samuel Pitoiset
 wrote:



On 03/09/2016 01:46 PM, Pierre Moreau wrote:


I did hit that issue as well, but I have C++11 forced on my SPIR-V branch.

I guess adding the whitespace will still result in code that works with
older C++ versions, so the fix can still be accepted even if we do not plan
to switch to C++11 by default.



Sure, the patch looks fine, but I wonder how he hit that issue. :-)

Anyway, if this doesn't break compilation without c++11, this patch is:

Reviewed-by: Samuel Pitoiset 



Pierre


On 11:16 AM - Mar 09 2016, Samuel Pitoiset wrote:


Nouveau doesn't use c++11 except the codegen part.
How do you hit that issue? Pretty sure that you forced c++11, right?

I can't reproduce that compilation error with clang 3.9 btw.

On 03/09/2016 09:57 AM, Vinson Lee wrote:


CXX  codegen/nv50_ir.lo
In file included from codegen/nv50_ir.cpp:28:
./nouveau_debug.h:19:30: error: invalid suffix on literal; C++11
requires a space between literal and identifier
[-Wreserved-user-defined-literal]
 fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
   ^

Signed-off-by: Vinson Lee 
---
   src/gallium/drivers/nouveau/nouveau_debug.h | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_debug.h
b/src/gallium/drivers/nouveau/nouveau_debug.h
index d17df81..546a4ad 100644
--- a/src/gallium/drivers/nouveau/nouveau_debug.h
+++ b/src/gallium/drivers/nouveau/nouveau_debug.h
@@ -16,7 +16,7 @@
   #define NOUVEAU_DEBUG 0

   #define NOUVEAU_ERR(fmt, args...) \
-   fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
+   fprintf(stderr, "%s:%d - " fmt, __FUNCTION__, __LINE__, ##args)

   #define NOUVEAU_DBG(ch, args...)   \
  if ((NOUVEAU_DEBUG) & (NOUVEAU_DEBUG_##ch))\



--
-Samuel



--
-Samuel


Building swr also seems to be adding -std=c++11 to the nouveau portion
of the build. Can you try a clang build with this configure statement?

./autogen.sh --with-dri-drivers= --with-gallium-drivers=nouveau,swr


Yes, I'll do.
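
For reference, the pattern at issue can be sketched as below. DEMO_FORMAT is a hypothetical macro, not NOUVEAU_ERR: it writes into a buffer and uses fixed prefix values ("demo", 42) instead of __FUNCTION__ and __LINE__ so that the result is deterministic.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The space between the string literal and the macro parameter `fmt` keeps
 * a C++11 compiler from reading `"..."fmt` as a literal followed by a
 * user-defined-literal suffix. In C the adjacent string literals are
 * concatenated with or without the space. */
#define DEMO_FORMAT(buf, size, fmt, ...) \
   snprintf((buf), (size), "%s:%d - " fmt, "demo", 42, __VA_ARGS__)
```

The added whitespace does not change the resulting string, so the same header should build under both older and newer language standards.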





Re: [Mesa-dev] [GSoC2016] Interested in implementing "Soft" double precision floating point support

2016-03-09 Thread Ian Romanick
On 03/09/2016 02:25 AM, tournier.elie wrote:
> Hi everyone.
> 
> My name is Elie TOURNIER, I am enrolled in a French Engineering school
> (Telecom Physique Strasbourg) specialized in Medical ICT.
> I'm interested in implementing "Soft" double precision floating point
> support [1].
> Taking this subject seems to be a good way to get my feet wet in the Mesa
> code and discover how some of its components work.
> 
> I am writing to introduce myself and also to gather information that
> will be valuable for the success of this project.
> 
> I would like to know more about the following things to understand your
> requirements :
> 1- "Each double precision value would be stored in a uvec2". The IEEE
> double precision floating point representation requires 64 bits:
> 1 for the sign, 11 for the exponent and the remaining 52 for the
> fraction [2].
> -> How must a double precision value be stored?

As Emil mentioned, in GLSL 1.30 a uvec2 consists of two 32-bit
unsigned integers.  Each double precision value would be stored in a uvec2.
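
As a rough illustration of that storage scheme, the C sketch below splits a double into two 32-bit words and extracts the fields. The demo_uvec2 type and the choice of putting the low word in x and the high word in y are assumptions made for the example; the project description does not fix the word order. It also assumes the platform's double is IEEE 754 binary64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C stand-in for a GLSL uvec2.
 * Assumption: x holds bits 0..31, y holds bits 32..63. */
struct demo_uvec2 {
   uint32_t x;
   uint32_t y;
};

/* Copy the bit pattern of a double into the two-word representation. */
static struct demo_uvec2 demo_pack_double(double d)
{
   uint64_t bits;
   struct demo_uvec2 v;

   assert(sizeof(bits) == sizeof(d));
   memcpy(&bits, &d, sizeof(bits));
   v.x = (uint32_t)(bits & 0xffffffffu);
   v.y = (uint32_t)(bits >> 32);
   return v;
}

/* 1 sign bit: bit 63, i.e. bit 31 of the high word. */
static uint32_t demo_sign(struct demo_uvec2 v)
{
   return v.y >> 31;
}

/* 11 exponent bits: bits 52..62, i.e. bits 20..30 of the high word. */
static uint32_t demo_exponent(struct demo_uvec2 v)
{
   return (v.y >> 20) & 0x7ffu;
}

/* Upper 20 of the 52 fraction bits; the lower 32 are the whole low word. */
static uint32_t demo_fraction_hi(struct demo_uvec2 v)
{
   return v.y & 0xfffffu;
}
```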

> 2- Where can I find GL_ARB_gpu_shader_fp64 documentation?
> 
> 
> This is my first exposure to Mesa. Please excuse me if I am asking basic
> questions.

For this particular project, you wouldn't need Mesa at all for quite
some time.  All of the initial project should be done in "raw" GLSL
1.30, and any OpenGL implementation capable of GLSL 1.30 can be used.
You would implement (and test!) a library of functions like 'uvec2
addDouble(uvec2 a, uvec2 b)' that would provide all of the required
double precision operations.

The set of required functions should be pretty small.  I think:

 - add
 - negate
 - absolute value
 - multiply
 - reciprocal
 - convert to single precision
 - convert from single precision
 - pow (maybe?)
 - exp (maybe?)
 - log (maybe?)

I think everything else could be implemented using those functions.
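
The two simplest entries in that list, negate and absolute value, only touch the sign bit, so they make a convenient first test of the representation. The sketch below is written in C rather than GLSL, and the demo_* names and the word order (x low, y high) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C stand-in for a GLSL uvec2 holding one double.
 * Assumption: x holds bits 0..31, y holds bits 32..63. */
struct demo_uvec2 {
   uint32_t x;
   uint32_t y;
};

static struct demo_uvec2 demo_from_double(double d)
{
   uint64_t bits;
   struct demo_uvec2 v;

   assert(sizeof(bits) == sizeof(d));
   memcpy(&bits, &d, sizeof(bits));
   v.x = (uint32_t)(bits & 0xffffffffu);
   v.y = (uint32_t)(bits >> 32);
   return v;
}

static double demo_to_double(struct demo_uvec2 v)
{
   uint64_t bits = ((uint64_t)v.y << 32) | v.x;
   double d;

   memcpy(&d, &bits, sizeof(d));
   return d;
}

/* Negate: flip the sign bit (bit 31 of the high word). */
static struct demo_uvec2 demo_negate(struct demo_uvec2 v)
{
   v.y ^= 0x80000000u;
   return v;
}

/* Absolute value: clear the sign bit. */
static struct demo_uvec2 demo_abs(struct demo_uvec2 v)
{
   v.y &= 0x7fffffffu;
   return v;
}
```

Comparing each soft operation against the host's native double arithmetic, as the conversions here allow, is one way to test the library.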

Like I mentioned in the project description, there are quite a few
existing C implementations of these functions.  Finding one of those
that you can understand and that has a compatible license is probably
the best place to start.

> Please point me to the right resources so that I can better understand
> the project. I would also be happy to fix a bug to familiarize myself 
> with the source code. Any suggestions on bugs that are relevant to the
> project will be of great help.
> 
> Regards,
> Elie
> 
> [1]
> http://www.x.org/wiki/SummerOfCodeIdeas/#softdoubleprecisionfloatingpointsupport
> [2] http://steve.hollasch.net/cgindex/coding/ieeefloat.html#storage
> 
> PS: If you have any questions, please don't hesitate to contact me.
> 
> 



[Mesa-dev] [PATCH 2/4] r600g: update compressed_colortex_masks when a cmask is created or disabled

2016-03-09 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/r600/r600_state_common.c | 30 
 1 file changed, 30 insertions(+)

diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index e3314bb..40ceb8d 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -693,6 +693,26 @@ static void r600_set_sampler_views(struct pipe_context 
*pipe, unsigned shader,
}
 }
 
+static void r600_update_compressed_colortex_mask(struct r600_samplerview_state 
*views)
+{
+   uint32_t mask = views->enabled_mask;
+
+   while (mask) {
+   unsigned i = u_bit_scan(&mask);
+   struct pipe_resource *res = views->views[i]->base.texture;
+
+   if (res && res->target != PIPE_BUFFER) {
+   struct r600_texture *rtex = (struct r600_texture *)res;
+
+   if (rtex->cmask.size) {
+   views->compressed_colortex_mask |= 1 << i;
+   } else {
+   views->compressed_colortex_mask &= ~(1 << i);
+   }
+   }
+   }
+}
+
 static void r600_set_viewport_states(struct pipe_context *ctx,
  unsigned start_slot,
  unsigned num_viewports,
@@ -1457,6 +1477,16 @@ static bool r600_update_derived_state(struct 
r600_context *rctx)
 
if (!rctx->blitter->running) {
unsigned i;
+   unsigned counter;
+
+   counter = 
p_atomic_read(&rctx->screen->b.compressed_colortex_counter);
+   if (counter != rctx->b.last_compressed_colortex_counter) {
+   rctx->b.last_compressed_colortex_counter = counter;
+
+   for (i = 0; i < PIPE_SHADER_TYPES; ++i) {
+   
r600_update_compressed_colortex_mask(&rctx->samplers[i].views);
+   }
+   }
 
/* Decompress textures if needed. */
for (i = 0; i < PIPE_SHADER_TYPES; i++) {
-- 
2.5.0
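
The mask update loop in this patch can be sketched without the driver types as follows. demo_views and demo_bit_scan are hypothetical; demo_bit_scan is a simplified stand-in for u_bit_scan() and is assumed to return the index of the lowest set bit and clear it.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_VIEWS 32

/* Hypothetical, reduced view state: one cmask size per slot. */
struct demo_views {
   uint32_t enabled_mask;
   uint32_t compressed_colortex_mask;
   unsigned cmask_size[DEMO_MAX_VIEWS];
};

/* Return the index of the lowest set bit and clear it in *mask. */
static unsigned demo_bit_scan(uint32_t *mask)
{
   unsigned i = 0;

   assert(*mask != 0);
   while (!(*mask & (1u << i)))
      i++;
   *mask &= ~(1u << i);
   return i;
}

/* Recompute the "compressed" bit for every enabled slot; bits of slots
 * that are not enabled are left untouched. */
static void demo_update_compressed_mask(struct demo_views *views)
{
   uint32_t mask = views->enabled_mask;

   while (mask) {
      unsigned i = demo_bit_scan(&mask);

      if (views->cmask_size[i])
         views->compressed_colortex_mask |= 1u << i;
      else
         views->compressed_colortex_mask &= ~(1u << i);
   }
}
```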



[Mesa-dev] [PATCH 1/4] gallium/radeon: notify all contexts when cmasks are enabled/disabled

2016-03-09 Thread Nicolai Hähnle
From: Nicolai Hähnle 

There is an annoying corner case that I stumbled across while looking into
piglit's 
arb_shader_image_load_store/execution/load-from-cleared-image.shader_test
(which can be easily adapted to demonstrate the bug without the
ARB_shader_image_load_store extension).

When we bind a texture and then clear it using glClear (by attaching it
to the current framebuffer) for the first time, we allocate a separate
cmask for the texture to do fast clear, but the corresponding bit in
compressed_colortex_mask is not set. Subsequent rendering will use
incorrect data.

Conversely, when a currently bound texture with an existing cmask is
exported leading to that cmask being disabled, the compressed_colortex_mask
bit will remain set, leading to an assertion later on in debug builds.

Since iterating through all contexts and/or remembering where every
texture is bound would be costly, and cmask enable/disable should be
rare, we will maintain a global counter to signal contexts that they
must update their compressed_colortex_masks.

This patch introduces the global counter, and subsequent patches will
do the mask update.
---
 src/gallium/drivers/radeon/r600_pipe_common.h | 7 +++
 src/gallium/drivers/radeon/r600_texture.c | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index d20069e..cf8dcf7 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -335,6 +335,12 @@ struct r600_common_screen {
 */
unsigneddirty_fb_counter;
 
+   /* Atomically increment this counter when an existing texture's
+* metadata is enabled or disabled in a way that requires changing
+* contexts' compressed texture binding masks.
+*/
+   unsignedcompressed_colortex_counter;
+
void (*query_opaque_metadata)(struct r600_common_screen *rscreen,
  struct r600_texture *rtex,
  struct radeon_bo_metadata *md);
@@ -406,6 +412,7 @@ struct r600_common_context {
unsignedinitial_gfx_cs_size;
unsignedgpu_reset_counter;
unsignedlast_dirty_fb_counter;
+   unsignedlast_compressed_colortex_counter;
 
struct u_upload_mgr *uploader;
struct u_suballocator   *allocator_so_filled_size;
diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 1a8822c..6b2d909 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -287,6 +287,7 @@ static void r600_texture_disable_cmask(struct 
r600_common_screen *rscreen,
 
/* Notify all contexts about the change. */
r600_dirty_all_framebuffer_states(rscreen);
+   p_atomic_inc(&rscreen->compressed_colortex_counter);
 }
 
 static void r600_texture_disable_dcc(struct r600_common_screen *rscreen,
@@ -603,6 +604,8 @@ static void r600_texture_alloc_cmask_separate(struct 
r600_common_screen *rscreen
rtex->cb_color_info |= SI_S_028C70_FAST_CLEAR(1);
else
rtex->cb_color_info |= EG_S_028C70_FAST_CLEAR(1);
+
+   p_atomic_inc(&rscreen->compressed_colortex_counter);
 }
 
 static unsigned r600_texture_get_htile_size(struct r600_common_screen *rscreen,
-- 
2.5.0
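
The counter scheme from the commit message can be sketched as follows. This is a minimal illustration using C11 <stdatomic.h> in place of Mesa's p_atomic_* helpers; the demo_* types and functions are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Shared by all contexts created from the same screen. */
struct demo_screen {
   atomic_uint compressed_colortex_counter;
};

/* Per-context copy of the last counter value that was acted upon. */
struct demo_context {
   struct demo_screen *screen;
   unsigned last_compressed_colortex_counter;
};

/* Called whenever a cmask is enabled or disabled on an existing texture. */
static void demo_notify_cmask_change(struct demo_screen *screen)
{
   atomic_fetch_add(&screen->compressed_colortex_counter, 1);
}

/* Called before drawing: returns true if the context must recompute its
 * compressed texture masks, and records the counter value it has seen. */
static bool demo_context_needs_mask_update(struct demo_context *ctx)
{
   unsigned counter = atomic_load(&ctx->screen->compressed_colortex_counter);

   if (counter == ctx->last_compressed_colortex_counter)
      return false;

   ctx->last_compressed_colortex_counter = counter;
   return true;
}
```

Each context pays one atomic read per check, and no list of bindings per texture has to be maintained. This fits the stated expectation that cmask enable and disable events are rare.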



[Mesa-dev] [PATCH 4/4] radeonsi: update compressed_colortex_masks when a cmask is created or disabled

2016-03-09 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/radeonsi/si_blit.c|  9 ++
 src/gallium/drivers/radeonsi/si_descriptors.c | 43 +--
 src/gallium/drivers/radeonsi/si_state.h   |  1 +
 3 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index e17343f..f9a6de4 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -352,9 +352,18 @@ si_decompress_color_textures(struct si_context *sctx,
 
 void si_decompress_textures(struct si_context *sctx)
 {
+   unsigned compressed_colortex_counter;
+
if (sctx->blitter->running)
return;
 
+   /* Update the compressed_colortex_mask if necessary. */
+   compressed_colortex_counter = 
p_atomic_read(&sctx->screen->b.compressed_colortex_counter);
+   if (compressed_colortex_counter != 
sctx->b.last_compressed_colortex_counter) {
+   sctx->b.last_compressed_colortex_counter = 
compressed_colortex_counter;
+   si_update_compressed_colortex_masks(sctx);
+   }
+
/* Flush depth textures which need to be flushed. */
for (int i = 0; i < SI_NUM_SHADERS; i++) {
if (sctx->samplers[i].depth_texture_mask) {
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 37b9d68..9aa4877 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -224,6 +224,12 @@ static void si_set_sampler_view(struct si_context *sctx,
views->desc.list_dirty = true;
 }
 
+static bool is_compressed_colortex(struct r600_texture *rtex)
+{
+   return rtex->cmask.size || rtex->fmask.size ||
+  (rtex->dcc_offset && rtex->dirty_level_mask);
+}
+
 static void si_set_sampler_views(struct pipe_context *ctx,
 unsigned shader, unsigned start,
  unsigned count,
@@ -257,8 +263,7 @@ static void si_set_sampler_views(struct pipe_context *ctx,
} else {
samplers->depth_texture_mask &= ~(1 << slot);
}
-   if (rtex->cmask.size || rtex->fmask.size ||
-   (rtex->dcc_offset && rtex->dirty_level_mask)) {
+   if (is_compressed_colortex(rtex)) {
samplers->compressed_colortex_mask |= 1 << slot;
} else {
samplers->compressed_colortex_mask &= ~(1 << 
slot);
@@ -270,6 +275,27 @@ static void si_set_sampler_views(struct pipe_context *ctx,
}
 }
 
+static void
+si_samplers_update_compressed_colortex_mask(struct si_textures_info *samplers)
+{
+   uint64_t mask = samplers->views.desc.enabled_mask;
+
+   while (mask) {
+   int i = u_bit_scan64(&mask);
+   struct pipe_resource *res = samplers->views.views[i]->texture;
+
+   if (res && res->target != PIPE_BUFFER) {
+   struct r600_texture *rtex = (struct r600_texture *)res;
+
+   if (is_compressed_colortex(rtex)) {
+   samplers->compressed_colortex_mask |= 1 << i;
+   } else {
+   samplers->compressed_colortex_mask &= ~(1 << i);
+   }
+   }
+   }
+}
+
 /* SAMPLER STATES */
 
 static void si_bind_sampler_states(struct pipe_context *ctx, unsigned shader,
@@ -762,6 +788,19 @@ static void si_desc_reset_buffer_offset(struct 
pipe_context *ctx,
  S_008F04_BASE_ADDRESS_HI(va >> 32);
 }
 
+/* TEXTURE METADATA ENABLE/DISABLE */
+
+/* CMASK can be enabled (for fast clear) and disabled (for texture export)
+ * while the texture is bound, possibly by a different context. In that case,
+ * call this function to update compressed_colortex_masks.
+ */
+void si_update_compressed_colortex_masks(struct si_context *sctx)
+{
+   for (int i = 0; i < SI_NUM_SHADERS; ++i) {
+   si_samplers_update_compressed_colortex_mask(&sctx->samplers[i]);
+   }
+}
+
 /* BUFFER DISCARD/INVALIDATION */
 
 /* Reallocate a buffer a update all resource bindings where the buffer is
diff --git a/src/gallium/drivers/radeonsi/si_state.h 
b/src/gallium/drivers/radeonsi/si_state.h
index fb16d0f..60c34f1 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -249,6 +249,7 @@ void si_all_descriptors_begin_new_cs(struct si_context 
*sctx);
 void si_upload_const_buffer(struct si_context *sctx, struct r600_resource 
**rbuffer,
const uint8_t *ptr, unsigned size, uint32_t 
*const_offset);
 void si_shader_change_notify(struct si_context *sctx);
+void si_update_compressed_colortex_masks(struct si_context *sctx);
 void si_emit_shader_userdata(struct si_contex

[Mesa-dev] [PATCH 3/4] radeonsi: move si_decompress_textures to si_blit.c

2016-03-09 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Since it is all about calling into blitter functions, it makes more
sense here. This change also reduces the size of the interfaces between
.c files.
---
 src/gallium/drivers/radeonsi/si_blit.c   | 26 ++
 src/gallium/drivers/radeonsi/si_pipe.h   |  5 +
 src/gallium/drivers/radeonsi/si_state_draw.c | 15 ---
 3 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index d9c4f8f..e17343f 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -241,8 +241,9 @@ static void si_blit_decompress_depth_in_place(struct 
si_context *sctx,
si_mark_atom_dirty(sctx, &sctx->db_render_state);
 }
 
-void si_flush_depth_textures(struct si_context *sctx,
-struct si_textures_info *textures)
+static void
+si_flush_depth_textures(struct si_context *sctx,
+   struct si_textures_info *textures)
 {
unsigned i;
unsigned mask = textures->depth_texture_mask;
@@ -323,8 +324,9 @@ static void si_blit_decompress_color(struct pipe_context 
*ctx,
}
 }
 
-void si_decompress_color_textures(struct si_context *sctx,
- struct si_textures_info *textures)
+static void
+si_decompress_color_textures(struct si_context *sctx,
+struct si_textures_info *textures)
 {
unsigned i;
unsigned mask = textures->compressed_colortex_mask;
@@ -348,6 +350,22 @@ void si_decompress_color_textures(struct si_context *sctx,
}
 }
 
+void si_decompress_textures(struct si_context *sctx)
+{
+   if (sctx->blitter->running)
+   return;
+
+   /* Flush depth textures which need to be flushed. */
+   for (int i = 0; i < SI_NUM_SHADERS; i++) {
+   if (sctx->samplers[i].depth_texture_mask) {
+   si_flush_depth_textures(sctx, &sctx->samplers[i]);
+   }
+   if (sctx->samplers[i].compressed_colortex_mask) {
+   si_decompress_color_textures(sctx, &sctx->samplers[i]);
+   }
+   }
+}
+
 static void si_clear(struct pipe_context *ctx, unsigned buffers,
 const union pipe_color_union *color,
 double depth, unsigned stencil)
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 736307b..0fef5f7 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -333,10 +333,7 @@ void cik_sdma_copy(struct pipe_context *ctx,
 
 /* si_blit.c */
 void si_init_blit_functions(struct si_context *sctx);
-void si_flush_depth_textures(struct si_context *sctx,
-struct si_textures_info *textures);
-void si_decompress_color_textures(struct si_context *sctx,
- struct si_textures_info *textures);
+void si_decompress_textures(struct si_context *sctx);
 void si_resource_copy_region(struct pipe_context *ctx,
 struct pipe_resource *dst,
 unsigned dst_level,
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 5d094c7..84b850a 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -33,21 +33,6 @@
 #include "util/u_upload_mgr.h"
 #include "util/u_prim.h"
 
-static void si_decompress_textures(struct si_context *sctx)
-{
-   if (!sctx->blitter->running) {
-   /* Flush depth textures which need to be flushed. */
-   for (int i = 0; i < SI_NUM_SHADERS; i++) {
-   if (sctx->samplers[i].depth_texture_mask) {
-   si_flush_depth_textures(sctx, 
&sctx->samplers[i]);
-   }
-   if (sctx->samplers[i].compressed_colortex_mask) {
-   si_decompress_color_textures(sctx, 
&sctx->samplers[i]);
-   }
-   }
-   }
-}
-
 static unsigned si_conv_pipe_prim(unsigned mode)
 {
 static const unsigned prim_conv[] = {
-- 
2.5.0



[Mesa-dev] [PATCH 4/6] nvc0: add MP performance counters for SM35 (GK110)

2016-03-09 Thread Samuel Pitoiset
Because compute support is not enabled by default for these chipsets,
NVF0_COMPUTE=1 needs to be used, along with GALLIUM_HUD to enable
performance counters.

Signed-off-by: Samuel Pitoiset 
---
 .../drivers/nouveau/nvc0/nvc0_query_hw_sm.c| 216 +++--
 .../drivers/nouveau/nvc0/nvc0_query_hw_sm.h|   2 +
 .../drivers/nouveau/nvc0/nve4_compute.xml.h|   4 +
 3 files changed, 205 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
index 98d6840..08f508c 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
@@ -70,6 +70,7 @@ struct {
    _Q(LOCAL_LD_TRANSACTIONS,        "local_load_transactions"),
    _Q(LOCAL_ST,                     "local_store"),
    _Q(LOCAL_ST_TRANSACTIONS,        "local_store_transactions"),
+   _Q(NOT_PRED_OFF_INST_EXECUTED,   "not_predicated_off_thread_inst_executed"),
    _Q(PROF_TRIGGER_0,               "prof_trigger_00"),
    _Q(PROF_TRIGGER_1,               "prof_trigger_01"),
    _Q(PROF_TRIGGER_2,               "prof_trigger_02"),
@@ -84,6 +85,7 @@ struct {
    _Q(SHARED_ST_REPLAY,             "shared_store_replay"),
    _Q(SM_CTA_LAUNCHED,              "sm_cta_launched"),
    _Q(THREADS_LAUNCHED,             "threads_launched"),
+   _Q(TH_INST_EXECUTED,             "thread_inst_executed"),
    _Q(TH_INST_EXECUTED_0,           "thread_inst_executed_0"),
    _Q(TH_INST_EXECUTED_1,           "thread_inst_executed_1"),
    _Q(TH_INST_EXECUTED_2,           "thread_inst_executed_2"),
@@ -195,6 +197,49 @@ static const uint64_t nve4_read_hw_sm_counters_code[] =
0x80001de7ULL
 };
 
+static const uint64_t nvf0_read_hw_sm_counters_code[] =
+{
+   /* Same kernel as GK104 */
+   0x0880808080808080ULL,
+   0x8640109c0022ULL,
+   0x8640019c0032ULL,
+   0x8640021c0002ULL,
+   0x8640029c0006ULL,
+   0x8640031c000aULL,
+   0x8640039c000eULL,
+   0x8640041c0012ULL,
+   0x08ac1080108c8080ULL,
+   0x8640049c0016ULL,
+   0x8640051c001aULL,
+   0x8640059c001eULL,
+   0xdb201c007f9c201eULL,
+   0x64c03c1c002aULL,
+   0xc0020a1c3021ULL,
+   0x64c03c9c002eULL,
+   0x0810a0808010b810ULL,
+   0xc001041c3025ULL,
+   0x1820003cULL,
+   0xdb201c007f9c243eULL,
+   0xc1c0301c2021ULL,
+   0xc1c0081c2431ULL,
+   0xc1c0021c2435ULL,
+   0xe080069c2026ULL,
+   0x08b010b010b010a0ULL,
+   0xe080061c2022ULL,
+   0xe4c03c00051c0032ULL,
+   0xe084041c282aULL,
+   0xe4c03c00059c0036ULL,
+   0xe08040007f9c2c2eULL,
+   0xe084049c3032ULL,
+   0xfe80001c2800ULL,
+   0x08b81080b010ULL,
+   0x64c03c00011c0002ULL,
+   0xe08040007f9c3436ULL,
+   0xfe8020043010ULL,
+   0xfc80281c3000ULL,
+   0x181c003cULL,
+};
+
 /* For simplicity, we will allocate as many group slots as we allocate counter
  * slots. This means that a single counter which wants to source from 2 groups
  * will have to be declared as using 2 counter slots. This shouldn't really be
@@ -682,6 +727,121 @@ static const struct nvc0_hw_sm_query_cfg *sm30_hw_sm_queries[] =
&sm30_warps_launched,
 };
 
+/*  Compute capability 3.5 (GK110/GK208)  */
+static const struct nvc0_hw_sm_query_cfg
+sm35_atom_cas_count =
+{
+   .type = NVC0_HW_SM_QUERY_ATOM_CAS_COUNT,
+   .ctr[0]   = _CA(0x0001, B6, UNK1A, 0x0014),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm35_atom_count =
+{
+   .type = NVC0_HW_SM_QUERY_ATOM_COUNT,
+   .ctr[0]   = _CA(0x0001, B6, UNK1A, 0x0010),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm35_gred_count =
+{
+   .type = NVC0_HW_SM_QUERY_GRED_COUNT,
+   .ctr[0]   = _CA(0x0001, B6, UNK1A, 0x0018),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm35_not_pred_off_inst_executed =
+{
+   .type = NVC0_HW_SM_QUERY_NOT_PRED_OFF_INST_EXECUTED,
+   .ctr[0]   = _CA(0x003f, B6, UNK14, 0x29062080),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm35_shared_ld_replay =
+{
+   .type = NVC0_HW_SM_QUERY_SHARED_LD_REPLAY,
+   .ctr[0]   = _CB(0x, LOGOP, UNK13, 0x0018),
+   .ctr[1]   = _CB(0x, LOGOP, REPLAY, 0x0151),
+   .num_counters = 2,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm35_shared_st_replay =
+{
+   .type = NVC0_HW_SM_QUERY_SHARED_ST_REPLAY,
+   .ctr[0]   = _CB(0x, LOGOP, UNK

[Mesa-dev] [PATCH 0/6] *nvc0: MP perf counters improvements & SM35 support

2016-03-09 Thread Samuel Pitoiset
Hi,

This series reworks the MP perf counters and the driver metrics infrastructure,
and it adds compute-related perf counters on GK110 (SM35).

This has been tested on GF119, GK104 and GK208.
No regressions with the HUD and with AMD_performance_monitor.

Please review,
Thanks.

Samuel Pitoiset (6):
  nvc0: rework the MP counters infrastructure
  nvc0: rework the driver metrics infrastructure
  nvc0: explode config of Kepler hardware SM events
  nvc0: add MP performance counters for SM35 (GK110)
  nvc0: add driver metrics for SM35 (GK110)
  nvc0: expose SM35 perf counters to AMD_performance_monitor

 src/gallium/drivers/nouveau/nvc0/nvc0_query.c  |   28 +-
 .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c|  316 +++---
 .../drivers/nouveau/nvc0/nvc0_query_hw_metric.h|   23 +-
 .../drivers/nouveau/nvc0/nvc0_query_hw_sm.c| 1146 +++-
 .../drivers/nouveau/nvc0/nvc0_query_hw_sm.h|   81 +-
 .../drivers/nouveau/nvc0/nve4_compute.xml.h|4 +
 6 files changed, 1099 insertions(+), 499 deletions(-)

-- 
2.7.1



[Mesa-dev] [PATCH 5/6] nvc0: add driver metrics for SM35 (GK110)

2016-03-09 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c| 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
index a01ab3f..20ad558 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
@@ -300,6 +300,20 @@ static const struct nvc0_hw_metric_query_cfg *sm30_hw_metric_queries[] =
&sm30_shared_replay_overhead,
 };
 
+/*  Compute capability 3.5 (GK110)  */
+static const struct nvc0_hw_metric_query_cfg *sm35_hw_metric_queries[] =
+{
+   &sm30_achieved_occupancy,
+   &sm30_inst_issued,
+   &sm30_inst_per_wrap,
+   &sm30_inst_replay_overhead,
+   &sm30_issued_ipc,
+   &sm30_inst_issued,
+   &sm30_issue_slot_utilization,
+   &sm30_ipc,
+   &sm30_shared_replay_overhead,
+};
+
 #undef _SM
 
 static inline const struct nvc0_hw_metric_query_cfg **
@@ -308,6 +322,8 @@ nvc0_hw_metric_get_queries(struct nvc0_screen *screen)
struct nouveau_device *dev = screen->base.device;
 
switch (screen->base.class_3d) {
+   case NVF0_3D_CLASS:
+  return sm35_hw_metric_queries;
case NVE4_3D_CLASS:
   return sm30_hw_metric_queries;
default:
@@ -325,6 +341,8 @@ nvc0_hw_metric_get_num_queries(struct nvc0_screen *screen)
struct nouveau_device *dev = screen->base.device;
 
switch (screen->base.class_3d) {
+   case NVF0_3D_CLASS:
+  return ARRAY_SIZE(sm35_hw_metric_queries);
case NVE4_3D_CLASS:
   return ARRAY_SIZE(sm30_hw_metric_queries);
default:
@@ -558,6 +576,7 @@ nvc0_hw_metric_get_query_result(struct nvc0_context *nvc0,
}
 
switch (screen->base.class_3d) {
+   case NVF0_3D_CLASS:
case NVE4_3D_CLASS:
   value = sm30_hw_metric_calc_result(hq, res64);
   break;
@@ -629,7 +648,8 @@ nvc0_hw_metric_get_driver_query_info(struct nvc0_screen *screen, unsigned id,
 
if (id < count) {
   if (screen->compute) {
- if (screen->base.class_3d <= NVE4_3D_CLASS) {
+ if (screen->base.class_3d <= NVF0_3D_CLASS &&
+ screen->base.class_3d != NVEA_3D_CLASS) {
 const struct nvc0_hw_metric_query_cfg **queries =
nvc0_hw_metric_get_queries(screen);
 
-- 
2.7.1



[Mesa-dev] [PATCH 6/6] nvc0: expose SM35 perf counters to AMD_performance_monitor

2016-03-09 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_query.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
index 6836432..5cbc66e 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
@@ -204,7 +204,8 @@ nvc0_screen_get_driver_query_group_info(struct pipe_screen *pscreen,
 
if (screen->base.drm->version >= 0x01000101) {
   if (screen->compute) {
- if (screen->base.class_3d <= NVE4_3D_CLASS) {
+ if (screen->base.class_3d <= NVF0_3D_CLASS &&
+ screen->base.class_3d != NVEA_3D_CLASS) {
 count += 2;
  }
   }
@@ -230,7 +231,8 @@ nvc0_screen_get_driver_query_group_info(struct pipe_screen *pscreen,
} else
if (id == NVC0_HW_METRIC_QUERY_GROUP) {
   if (screen->compute) {
-  if (screen->base.class_3d <= NVE4_3D_CLASS) {
+  if (screen->base.class_3d <= NVF0_3D_CLASS &&
+  screen->base.class_3d != NVEA_3D_CLASS) {
 info->name = "Performance metrics";
 info->max_active_queries = 1;
 info->num_queries = nvc0_hw_metric_get_num_queries(screen);
-- 
2.7.1



[Mesa-dev] [PATCH 3/6] nvc0: explode config of Kepler hardware SM events

2016-03-09 Thread Samuel Pitoiset
This is really verbose but most of the configuration will be reused
for SM35 (GK110).

Signed-off-by: Samuel Pitoiset 
---
 .../drivers/nouveau/nvc0/nvc0_query_hw_sm.c| 555 ++---
 1 file changed, 477 insertions(+), 78 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
index 1cbcae1..98d6840 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
@@ -219,64 +219,472 @@ struct nvc0_hw_sm_query_cfg
uint8_t norm[2]; /* normalization num,denom */
 };
 
-#define _Q1A(n, f, m, g, s, nu, dn) { NVC0_HW_SM_QUERY_##n, { { f, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m, 0, NVE4_COMPUTE_MP_PM_A_SIGSEL_##g, 0, s }, {}, {}, {} }, 1, { nu, dn } }
-#define _Q1B(n, f, m, g, s, nu, dn) { NVC0_HW_SM_QUERY_##n, { { f, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m, 1, NVE4_COMPUTE_MP_PM_B_SIGSEL_##g, 0, s }, {}, {}, {} }, 1, { nu, dn } }
+#define _CA(f, m, g, s) { f, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m, 0, NVE4_COMPUTE_MP_PM_A_SIGSEL_##g, 0, s }
+#define _CB(f, m, g, s) { f, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m, 1, NVE4_COMPUTE_MP_PM_B_SIGSEL_##g, 0, s }
+#define _Q(n, c) [NVE4_HW_SM_QUERY_##n] = c
+
+/*  Compute capability 3.0 (GK104:GK110)  */
+static const struct nvc0_hw_sm_query_cfg
+sm30_active_cycles =
+{
+   .type = NVC0_HW_SM_QUERY_ACTIVE_CYCLES,
+   .ctr[0]   = _CB(0x0001, B6, WARP, 0x),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_active_warps =
+{
+   .type = NVC0_HW_SM_QUERY_ACTIVE_WARPS,
+   .ctr[0]   = _CB(0x003f, B6, WARP, 0x31483104),
+   .num_counters = 1,
+   .norm = { 2, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_atom_cas_count =
+{
+   .type = NVC0_HW_SM_QUERY_ATOM_CAS_COUNT,
+   .ctr[0]   = _CA(0x0001, B6, BRANCH, 0x4),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_atom_count =
+{
+   .type = NVC0_HW_SM_QUERY_ATOM_COUNT,
+   .ctr[0]   = _CA(0x0001, B6, BRANCH, 0x),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_branch =
+{
+   .type = NVC0_HW_SM_QUERY_BRANCH,
+   .ctr[0]   = _CA(0x0001, B6, BRANCH, 0x000c),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_divergent_branch =
+{
+   .type = NVC0_HW_SM_QUERY_DIVERGENT_BRANCH,
+   .ctr[0]   = _CA(0x0001, B6, BRANCH, 0x0010),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_gld_request =
+{
+   .type = NVC0_HW_SM_QUERY_GLD_REQUEST,
+   .ctr[0]   = _CA(0x0001, B6, LDST, 0x0010),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_gld_mem_div_replay =
+{
+   .type = NVC0_HW_SM_QUERY_GLD_MEM_DIV_REPLAY,
+   .ctr[0]   = _CB(0x0001, B6, REPLAY, 0x0010),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_gst_transactions =
+{
+   .type = NVC0_HW_SM_QUERY_GST_TRANSACTIONS,
+   .ctr[0]   = _CB(0x0001, B6, MEM, 0x0004),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_gst_mem_div_replay =
+{
+   .type = NVC0_HW_SM_QUERY_GST_MEM_DIV_REPLAY,
+   .ctr[0]   = _CB(0x0001, B6, REPLAY, 0x0014),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_gred_count =
+{
+   .type = NVC0_HW_SM_QUERY_GRED_COUNT,
+   .ctr[0]   = _CA(0x0001, B6, BRANCH, 0x0008),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_gst_request =
+{
+   .type = NVC0_HW_SM_QUERY_GST_REQUEST,
+   .ctr[0]   = _CA(0x0001, B6, LDST, 0x0014),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_inst_executed =
+{
+   .type = NVC0_HW_SM_QUERY_INST_EXECUTED,
+   .ctr[0]   = _CA(0x0003, B6, EXEC, 0x0398),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_inst_issued1 =
+{
+   .type = NVC0_HW_SM_QUERY_INST_ISSUED1,
+   .ctr[0]   = _CA(0x0001, B6, ISSUE, 0x0004),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_inst_issued2 =
+{
+   .type = NVC0_HW_SM_QUERY_INST_ISSUED2,
+   .ctr[0]   = _CA(0x0001, B6, ISSUE, 0x0008),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_l1_gld_hit =
+{
+   .type = NVC0_HW_SM_QUERY_L1_GLD_HIT,
+   .ctr[0]   = _CB(0x0001, B6, L1, 0x0010),
+   .num_counters = 1,
+ 

Re: [Mesa-dev] [PATCH] glsl: dont allow undefined array sizes in ES

2016-03-09 Thread Timothy Arceri
On Wed, 2016-03-09 at 16:04 +0100, Iago Toral wrote:
> On Tue, 2016-03-08 at 20:35 +1100, Timothy Arceri wrote:
> > This applies the rule to empty declarations.
> > 
> > Fixes:
> > dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_vertex
> > dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_fragment
> > ---
> >  src/compiler/glsl/ast_to_hir.cpp | 11 +++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
> > index d755a11..8918981 100644
> > --- a/src/compiler/glsl/ast_to_hir.cpp
> > +++ b/src/compiler/glsl/ast_to_hir.cpp
> > @@ -4223,6 +4223,17 @@ ast_declarator_list::hir(exec_list *instructions,
> >    type_name);
> >    } else {
> >   if (decl_type->base_type == GLSL_TYPE_ARRAY) {
> > +/* From Section 13.22 (Array Declarations) of the GLSL ES 3.2
> > + * spec:
> > + *
> > + *"... any declaration that leaves the size undefined is
> > + *disallowed as this would add complexity and there are no
> > + *use-cases."
> > + */
> > +if (state->es_shader && decl_type->is_unsized_array())
> > +   _mesa_glsl_error(&loc, state, "array size must be explicitly "
> > +"or implicitly defined");
> 
> What about unsized arrays in SSBOs? Unsized arrays are allowed as the
> last element in an SSBO declaration. This is a special case because the
> size of the array is implicitly set by the size of the underlying
> buffer object.

This only applies to empty declarations, e.g. int[]; so it shouldn't
have any impact on valid SSBOs.
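
To illustrate the distinction made above, here is a minimal toy model in
C. It is not Mesa code: struct demo_decl and
demo_empty_unsized_array_error() are hypothetical names, and the sketch
assumes only what the thread states, namely that the new check sits on
the empty-declaration path and fires for unsized arrays in ES shaders.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a declaration (not a Mesa type). */
struct demo_decl {
   bool es_shader;            /* true for a GLSL ES shader */
   bool is_unsized_array;     /* type was written as T[] */
   unsigned num_declarators;  /* 0 for an empty declaration such as "int[];" */
};

/* Returns true when the toy model would report
 * "array size must be explicitly or implicitly defined". */
static bool
demo_empty_unsized_array_error(const struct demo_decl *d)
{
   /* Declarations that name a variable (for example the trailing member
    * of an SSBO) never reach the empty-declaration path. */
   if (d->num_declarators != 0)
      return false;

   return d->es_shader && d->is_unsized_array;
}
```

In this model an empty "int[];" in an ES shader is rejected, while a
named unsized array is left to whatever rules apply on its own path.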

> 
> Iago
> 
> >  /* From Section 4.12 (Empty Declarations) of the GLSL 4.5 spec:
> >   *
> >   *"The combinations of types and qualifiers that cause
> 
> 


[Mesa-dev] [PATCH 1/6] nvc0: rework the MP counters infrastructure

2016-03-09 Thread Samuel Pitoiset
This mainly improves how we define the different lists of queries.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_query.c  |  16 +-
 .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c|   2 +-
 .../drivers/nouveau/nvc0/nvc0_query_hw_sm.c| 415 +++--
 .../drivers/nouveau/nvc0/nvc0_query_hw_sm.h|  79 ++--
 4 files changed, 244 insertions(+), 268 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
index d2acce7..f9f2bbe 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
@@ -204,10 +204,7 @@ nvc0_screen_get_driver_query_group_info(struct pipe_screen *pscreen,
 
if (screen->base.drm->version >= 0x01000101) {
   if (screen->compute) {
- if (screen->base.class_3d == NVE4_3D_CLASS) {
-count += 2;
- } else
- if (screen->base.class_3d < NVE4_3D_CLASS) {
+ if (screen->base.class_3d <= NVE4_3D_CLASS) {
 count += 2;
  }
   }
@@ -227,15 +224,8 @@ nvc0_screen_get_driver_query_group_info(struct pipe_screen *pscreen,
   * currently only used by AMD_performance_monitor.
   */
  info->max_active_queries = 1;
-
- if (screen->base.class_3d == NVE4_3D_CLASS) {
-info->num_queries = NVE4_HW_SM_QUERY_COUNT;
-return 1;
- } else
- if (screen->base.class_3d < NVE4_3D_CLASS) {
-info->num_queries = NVC0_HW_SM_QUERY_COUNT;
-return 1;
- }
+ info->num_queries = nvc0_hw_sm_get_num_queries(screen);
+ return 1;
   }
} else
if (id == NVC0_HW_METRIC_QUERY_GROUP) {
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
index 7a64b69..c108551 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
@@ -172,7 +172,7 @@ static const char *nve4_hw_metric_names[] =
"metric-shared_replay_overhead",
 };
 
-#define _SM(n) NVE4_HW_SM_QUERY(NVE4_HW_SM_QUERY_ ##n)
+#define _SM(n) NVC0_HW_SM_QUERY(NVC0_HW_SM_QUERY_ ##n)
 #define _M(n, c) [NVE4_HW_METRIC_QUERY_##n] = c
 
 /*  Compute capability 3.0 (GK104/GK106/GK107)  */
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
index f5f9bb3..1cbcae1 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
@@ -30,59 +30,85 @@
 #include "nvc0/nve4_compute.xml.h"
 #include "nvc0/nvc0_compute.xml.h"
 
-/* === PERFORMANCE MONITORING COUNTERS for NVE4+ === */
-
 /* NOTE: intentionally using the same names as NV */
-static const char *nve4_hw_sm_query_names[] =
-{
-   /* MP counters */
-   "active_cycles",
-   "active_warps",
-   "atom_cas_count",
-   "atom_count",
-   "branch",
-   "divergent_branch",
-   "gld_request",
-   "global_ld_mem_divergence_replays",
-   "global_store_transaction",
-   "global_st_mem_divergence_replays",
-   "gred_count",
-   "gst_request",
-   "inst_executed",
-   "inst_issued1",
-   "inst_issued2",
-   "l1_global_load_hit",
-   "l1_global_load_miss",
-   "__l1_global_load_transactions",
-   "__l1_global_store_transactions",
-   "l1_local_load_hit",
-   "l1_local_load_miss",
-   "l1_local_store_hit",
-   "l1_local_store_miss",
-   "l1_shared_load_transactions",
-   "l1_shared_store_transactions",
-   "local_load",
-   "local_load_transactions",
-   "local_store",
-   "local_store_transactions",
-   "prof_trigger_00",
-   "prof_trigger_01",
-   "prof_trigger_02",
-   "prof_trigger_03",
-   "prof_trigger_04",
-   "prof_trigger_05",
-   "prof_trigger_06",
-   "prof_trigger_07",
-   "shared_load",
-   "shared_load_replay",
-   "shared_store",
-   "shared_store_replay",
-   "sm_cta_launched",
-   "threads_launched",
-   "uncached_global_load_transaction",
-   "warps_launched",
+#define _Q(t, n) { NVC0_HW_SM_QUERY_##t, n }
+struct {
+   unsigned type;
+   const char *name;
+} nvc0_hw_sm_queries[] = {
+   _Q(ACTIVE_CYCLES,        "active_cycles"),
+   _Q(ACTIVE_WARPS,         "active_warps"),
+   _Q(ATOM_CAS_COUNT,       "atom_cas_count"),
+   _Q(ATOM_COUNT,           "atom_count"),
+   _Q(BRANCH,               "branch"),
+   _Q(DIVERGENT_BRANCH,     "divergent_branch"),
+   _Q(GLD_REQUEST,          "gld_request"),
+   _Q(GLD_MEM_DIV_REPLAY,   "global_ld_mem_divergence_replays"),
+   _Q(GST_TRANSACTIONS,     "global_store_transaction"),
+   _Q(GST_MEM_DIV_REPLAY,   "global_st_mem_divergence_replays"),

Re: [Mesa-dev] [PATCH] swrast: fix possible null dereference

2016-03-09 Thread Ian Romanick
On 03/09/2016 10:21 AM, Lars Hamre wrote:
> Fixes a possible null dereference.
> 
> NOTE: this is my first time contributing, please let me know if I
>   should be doing anything differently, thanks!
> 
> Signed-off-by: Lars Hamre 
> ---
>  src/mesa/swrast/s_triangle.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/swrast/s_triangle.c b/src/mesa/swrast/s_triangle.c
> index 876a74b..9225974 100644
> --- a/src/mesa/swrast/s_triangle.c
> +++ b/src/mesa/swrast/s_triangle.c
> @@ -781,7 +781,7 @@ fast_persp_span(struct gl_context *ctx, SWspan *span,
>}
>break;
> }
> -
> +
> assert(span->arrayMask & SPAN_RGBA);
> _swrast_write_rgba_span(ctx, span);
> 
> @@ -1063,8 +1063,8 @@ _swrast_choose_triangle( struct gl_context *ctx )
>   swImg = swrast_texture_image_const(texImg);
> 
>   format = texImg ? texImg->TexFormat : MESA_FORMAT_NONE;
> - minFilter = texObj2D ? samp->MinFilter : GL_NONE;
> - magFilter = texObj2D ? samp->MagFilter : GL_NONE;
> + minFilter = (texObj2D && samp) ? samp->MinFilter : GL_NONE;
> + magFilter = (texObj2D && samp) ? samp->MagFilter : GL_NONE;

NAK this hunk.  If texObj2D is not NULL, samp is also not NULL.

>   envMode = ctx->Texture.Unit[0].EnvMode;
> 
>   /* First see if we can use an optimized 2-D texture function */
> @@ -1073,6 +1073,7 @@ _swrast_choose_triangle( struct gl_context *ctx )
>   && !ctx->ATIFragmentShader._Enabled
>   && ctx->Texture._MaxEnabledTexImageUnit == 0
>   && ctx->Texture.Unit[0]._Current->Target == GL_TEXTURE_2D
> + && samp

I think the 'ctx->Texture.Unit[0]._Current->Target == GL_TEXTURE_2D'
implicitly ensures that samp cannot be NULL.  Have you been able to
cause a NULL dereference in this code path or is this just based on
speculation?

>   && samp->WrapS == GL_REPEAT
>   && samp->WrapT == GL_REPEAT
>   && texObj2D->_Swizzle == SWIZZLE_NOOP
> --
> 2.5.0
> 



[Mesa-dev] [PATCH 2/6] nvc0: rework the driver metrics infrastructure

2016-03-09 Thread Samuel Pitoiset
This follows the same design as MP perf counters.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_query.c  |  10 +-
 .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c| 296 -
 .../drivers/nouveau/nvc0/nvc0_query_hw_metric.h|  23 +-
 3 files changed, 172 insertions(+), 157 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
index f9f2bbe..6836432 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
@@ -230,16 +230,10 @@ nvc0_screen_get_driver_query_group_info(struct pipe_screen *pscreen,
} else
if (id == NVC0_HW_METRIC_QUERY_GROUP) {
   if (screen->compute) {
-  if (screen->base.class_3d == NVE4_3D_CLASS) {
+  if (screen->base.class_3d <= NVE4_3D_CLASS) {
 info->name = "Performance metrics";
 info->max_active_queries = 1;
-info->num_queries = NVE4_HW_METRIC_QUERY_COUNT;
-return 1;
- } else
- if (screen->base.class_3d < NVE4_3D_CLASS) {
-info->name = "Performance metrics";
-info->max_active_queries = 1;
-info->num_queries = NVC0_HW_METRIC_QUERY_COUNT;
+info->num_queries = nvc0_hw_metric_get_num_queries(screen);
 return 1;
  }
   }
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
index c108551..a01ab3f 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c
@@ -24,32 +24,51 @@
 #include "nvc0/nvc0_query_hw_metric.h"
 #include "nvc0/nvc0_query_hw_sm.h"
 
-/* === PERFORMANCE MONITORING METRICS for NVC0:NVE4 === */
-static const char *nvc0_hw_metric_names[] =
-{
-   "metric-achieved_occupancy",
-   "metric-branch_efficiency",
-   "metric-inst_issued",
-   "metric-inst_per_wrap",
-   "metric-inst_replay_overhead",
-   "metric-issued_ipc",
-   "metric-issue_slots",
-   "metric-issue_slot_utilization",
-   "metric-ipc",
+#define _Q(t,n) { NVC0_HW_METRIC_QUERY_##t, n }
+struct {
+   unsigned type;
+   const char *name;
+} nvc0_hw_metric_queries[] = {
+   _Q(ACHIEVED_OCCUPANCY,       "metric-achieved_occupancy"),
+   _Q(BRANCH_EFFICIENCY,        "metric-branch_efficiency"),
+   _Q(INST_ISSUED,              "metric-inst_issued"),
+   _Q(INST_PER_WRAP,            "metric-inst_per_wrap"),
+   _Q(INST_REPLAY_OVERHEAD,     "metric-inst_replay_overhead"),
+   _Q(ISSUED_IPC,               "metric-issued_ipc"),
+   _Q(ISSUE_SLOTS,              "metric-issue_slots"),
+   _Q(ISSUE_SLOT_UTILIZATION,   "metric-issue_slot_utilization"),
+   _Q(IPC,                      "metric-ipc"),
+   _Q(SHARED_REPLAY_OVERHEAD,   "metric-shared_replay_overhead"),
 };
 
+#undef _Q
+
+static inline const char *
+nvc0_hw_metric_query_get_name(unsigned query_type)
+{
+   unsigned i;
+
+   for (i = 0; i < ARRAY_SIZE(nvc0_hw_metric_queries); i++) {
+  if (nvc0_hw_metric_queries[i].type == query_type)
+ return nvc0_hw_metric_queries[i].name;
+   }
+   assert(0);
+   return NULL;
+}
+
 struct nvc0_hw_metric_query_cfg {
+   unsigned type;
uint32_t queries[8];
uint32_t num_queries;
 };
 
 #define _SM(n) NVC0_HW_SM_QUERY(NVC0_HW_SM_QUERY_ ##n)
-#define _M(n, c) [NVC0_HW_METRIC_QUERY_##n] = c
 
 /*  Compute capability 2.0 (GF100/GF110)  */
 static const struct nvc0_hw_metric_query_cfg
 sm20_achieved_occupancy =
 {
+   .type= NVC0_HW_METRIC_QUERY_ACHIEVED_OCCUPANCY,
.queries[0]  = _SM(ACTIVE_WARPS),
.queries[1]  = _SM(ACTIVE_CYCLES),
.num_queries = 2,
@@ -58,6 +77,7 @@ sm20_achieved_occupancy =
 static const struct nvc0_hw_metric_query_cfg
 sm20_branch_efficiency =
 {
+   .type= NVC0_HW_METRIC_QUERY_BRANCH_EFFICIENCY,
.queries[0]  = _SM(BRANCH),
.queries[1]  = _SM(DIVERGENT_BRANCH),
.num_queries = 2,
@@ -66,6 +86,7 @@ sm20_branch_efficiency =
 static const struct nvc0_hw_metric_query_cfg
 sm20_inst_per_wrap =
 {
+   .type= NVC0_HW_METRIC_QUERY_INST_PER_WRAP,
.queries[0]  = _SM(INST_EXECUTED),
.queries[1]  = _SM(WARPS_LAUNCHED),
.num_queries = 2,
@@ -74,6 +95,7 @@ sm20_inst_per_wrap =
 static const struct nvc0_hw_metric_query_cfg
 sm20_inst_replay_overhead =
 {
+   .type= NVC0_HW_METRIC_QUERY_INST_REPLAY_OVERHEAD,
.queries[0]  = _SM(INST_ISSUED),
.queries[1]  = _SM(INST_EXECUTED),
.num_queries = 2,
@@ -82,6 +104,16 @@ sm20_inst_replay_overhead =
 static const struct nvc0_hw_metric_query_cfg
 sm20_issued_ipc =
 {
+   .type= NVC0_HW_METRIC_QUERY_ISSUED_IPC,
+   .queries[0]  = _SM(INST_ISSUED),
+   .queries[1]  = _SM(ACTIVE

Re: [Mesa-dev] [PATCH] swrast: fix possible null dereference

2016-03-09 Thread Lars Hamre
I have not been able to force a NULL dereference; this is based on
analyzing the code.
Yes, that is implicitly true, but if at some point the implicit
relationship is broken, I would rather not have a NULL dereference.

If you do not agree, I am fine deferring to your judgement!
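
One possible middle ground, sketched below as a toy model in C rather
than as the actual swrast code, is to state the implicit relationship
with an assert instead of adding a runtime branch. All demo_* names are
hypothetical, and the sketch assumes the sampler is derived from the
texture object, which is the relationship described in the review.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; DEMO_NEAREST reuses the numeric value of
 * GL_NEAREST purely for illustration. */
enum { DEMO_NONE = 0, DEMO_NEAREST = 0x2600 };

struct demo_sampler { int min_filter; };
struct demo_texobj { struct demo_sampler sampler; };

/* The sampler comes from the texture object, so it is non-NULL
 * whenever the texture object is non-NULL. */
static const struct demo_sampler *
demo_get_sampler(const struct demo_texobj *tex)
{
   return tex ? &tex->sampler : NULL;
}

static int
demo_min_filter(const struct demo_texobj *tex)
{
   const struct demo_sampler *samp = demo_get_sampler(tex);

   /* Document the invariant: a texture object implies a sampler.
    * A debug build fails loudly here if that ever stops being true. */
   assert(!tex || samp);

   return tex ? samp->min_filter : DEMO_NONE;
}
```

With this shape, a future change that breaks the relationship shows up
as an assertion failure in debug builds rather than as a silently
different filter choice.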

On Wed, Mar 9, 2016 at 6:23 PM, Ian Romanick  wrote:

> On 03/09/2016 10:21 AM, Lars Hamre wrote:
> > Fixes a possible null dereference.
> >
> > NOTE: this is my first time contributing, please let me know if I
> >   should be doing anything differently, thanks!
> >
> > Signed-off-by: Lars Hamre 
> > ---
> >  src/mesa/swrast/s_triangle.c | 7 ---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/mesa/swrast/s_triangle.c b/src/mesa/swrast/s_triangle.c
> > index 876a74b..9225974 100644
> > --- a/src/mesa/swrast/s_triangle.c
> > +++ b/src/mesa/swrast/s_triangle.c
> > @@ -781,7 +781,7 @@ fast_persp_span(struct gl_context *ctx, SWspan *span,
> >}
> >break;
> > }
> > -
> > +
> > assert(span->arrayMask & SPAN_RGBA);
> > _swrast_write_rgba_span(ctx, span);
> >
> > @@ -1063,8 +1063,8 @@ _swrast_choose_triangle( struct gl_context *ctx )
> >   swImg = swrast_texture_image_const(texImg);
> >
> >   format = texImg ? texImg->TexFormat : MESA_FORMAT_NONE;
> > - minFilter = texObj2D ? samp->MinFilter : GL_NONE;
> > - magFilter = texObj2D ? samp->MagFilter : GL_NONE;
> > + minFilter = (texObj2D && samp) ? samp->MinFilter : GL_NONE;
> > + magFilter = (texObj2D && samp) ? samp->MagFilter : GL_NONE;
>
> NAK this hunk.  If texObj2D is not NULL, samp is also not NULL.
>
> >   envMode = ctx->Texture.Unit[0].EnvMode;
> >
> >   /* First see if we can use an optimized 2-D texture function */
> > @@ -1073,6 +1073,7 @@ _swrast_choose_triangle( struct gl_context *ctx )
> >   && !ctx->ATIFragmentShader._Enabled
> >   && ctx->Texture._MaxEnabledTexImageUnit == 0
> >   && ctx->Texture.Unit[0]._Current->Target == GL_TEXTURE_2D
> > + && samp
>
> I think the 'ctx->Texture.Unit[0]._Current->Target == GL_TEXTURE_2D'
> implicitly ensures that samp cannot be NULL.  Have you been able to
> cause a NULL dereference in this code path or is this just based on
> speculation?
>
> >   && samp->WrapS == GL_REPEAT
> >   && samp->WrapT == GL_REPEAT
> >   && texObj2D->_Swizzle == SWIZZLE_NOOP
> > --
> > 2.5.0
> >
>
>


Re: [Mesa-dev] [PATCH] glsl: report correct number of allowed vertex inputs and fragment outputs

2016-03-09 Thread Timothy Arceri
On Wed, 2016-03-09 at 11:48 +0100, Iago Toral Quiroga wrote:
> Before we would always report 16 for both and we would only fail if
> either one exceeded 16. Now we fail if the maximum for each is
> exceeded, even if it is smaller than 16 and we report the correct
> maximum.
> 
> Also, expand the size of to_assign[] to 32. There is code at the top
> of the function handling max_index up to 32, so this just makes the
> code more consistent.

Looks good.

Reviewed-by: Timothy Arceri 

> ---
>  src/compiler/glsl/linker.cpp | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
> index 4cec107..76b700d 100644
> --- a/src/compiler/glsl/linker.cpp
> +++ b/src/compiler/glsl/linker.cpp
> @@ -2417,7 +2417,8 @@ assign_attribute_or_color_locations(gl_shader_program *prog,
>    /* Reversed because we want a descending order sort below. */
>    return r->slots - l->slots;
>    }
> -   } to_assign[16];
> +   } to_assign[32];
> +   assert(max_index <= 32);
>  
> unsigned num_attr = 0;
>  
> @@ -2625,11 +2626,11 @@ assign_attribute_or_color_locations(gl_shader_program *prog,
>    continue;
>    }
>  
> -  if (num_attr >= ARRAY_SIZE(to_assign)) {
> +  if (num_attr >= max_index) {
>   linker_error(prog, "too many %s (max %u)",
>    target_index == MESA_SHADER_VERTEX ?
>    "vertex shader inputs" : "fragment shader outputs",
> -  (unsigned)ARRAY_SIZE(to_assign));
> +  max_index);
>   return false;
>    }
>    to_assign[num_attr].slots = slots;
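
The idea behind the hunks above can be sketched in isolation: the
storage is sized for the largest case (32), while the error check uses
the per-stage limit. This is a minimal C sketch under that assumption;
struct slot_list, slot_list_add() and slot_list_fill() are hypothetical
names, not linker.cpp code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define DEMO_MAX_SLOTS 32   /* storage capacity, mirroring to_assign[32] */

struct slot_list {
   unsigned slots[DEMO_MAX_SLOTS];
   unsigned count;
};

/* Append one entry. Fails when the stage limit max_index is reached,
 * which may be smaller than the capacity of the array. */
static bool
slot_list_add(struct slot_list *list, unsigned max_index, unsigned slots)
{
   /* The limit must never exceed the storage we actually have. */
   assert(max_index <= ARRAY_SIZE(list->slots));

   if (list->count >= max_index)
      return false;   /* caller reports "too many ... (max %u)" */

   list->slots[list->count++] = slots;
   return true;
}

/* Try to add n single-slot entries and return how many were accepted. */
static unsigned
slot_list_fill(struct slot_list *list, unsigned max_index, unsigned n)
{
   unsigned accepted = 0;

   for (unsigned i = 0; i < n; i++) {
      if (slot_list_add(list, max_index, 1))
         accepted++;
   }
   return accepted;
}
```

With a limit of 8, a request for 20 entries stops at 8 even though the
array could hold 32, so the reported maximum matches the real limit.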


Re: [Mesa-dev] [PATCH] nouveau: Fix clang reserved-user-defined-literal error.

2016-03-09 Thread Samuel Pitoiset



On 03/09/2016 11:27 PM, Samuel Pitoiset wrote:



On 03/09/2016 09:28 PM, Vinson Lee wrote:

On Wed, Mar 9, 2016 at 5:25 AM, Samuel Pitoiset wrote:



On 03/09/2016 01:46 PM, Pierre Moreau wrote:


I did hit that issue as well, but I have C++11 forced on my SPIR-V
branch.

I guess adding the whitespace will still result in code that works with
older C++ versions, so the fix can still be accepted even if we do not
plan to switch to C++11 by default.


Sure, the patch looks fine, but I wonder how he did hit that issue. :-)

Anyway, if this doesn't break compilation without c++11, this patch is:

Reviewed-by: Samuel Pitoiset 



Pierre


On 11:16 AM - Mar 09 2016, Samuel Pitoiset wrote:


Nouveau doesn't use C++11 except in the codegen part.
How did you hit that issue? Pretty sure that you forced C++11, right?

I can't reproduce that compilation error with clang 3.9 btw.

On 03/09/2016 09:57 AM, Vinson Lee wrote:


CXX  codegen/nv50_ir.lo
In file included from codegen/nv50_ir.cpp:28:
./nouveau_debug.h:19:30: error: invalid suffix on literal; C++11 requires a space between literal and identifier [-Wreserved-user-defined-literal]
 fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
   ^

Signed-off-by: Vinson Lee 
---
   src/gallium/drivers/nouveau/nouveau_debug.h | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_debug.h
b/src/gallium/drivers/nouveau/nouveau_debug.h
index d17df81..546a4ad 100644
--- a/src/gallium/drivers/nouveau/nouveau_debug.h
+++ b/src/gallium/drivers/nouveau/nouveau_debug.h
@@ -16,7 +16,7 @@
   #define NOUVEAU_DEBUG 0

   #define NOUVEAU_ERR(fmt, args...) \
-   fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
+   fprintf(stderr, "%s:%d - " fmt, __FUNCTION__, __LINE__, ##args)

   #define NOUVEAU_DBG(ch, args...)   \
  if ((NOUVEAU_DEBUG) & (NOUVEAU_DEBUG_##ch))\



--
-Samuel



--
-Samuel


Building swr also seems to be adding -std=c++11 to the nouveau portion
of the build. Can you try a clang build with this configure statement?

./autogen.sh --with-dri-drivers= --with-gallium-drivers=nouveau,swr


Yes, I'll do that.


Yes, you are right, swr adds -std=c++11.
Feel free to push the patch.

Thanks.




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
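
The fix discussed in this thread relies on adjacent string literals being concatenated when whitespace separates the literal from the macro parameter. The sketch below is not the nouveau macro itself: EXAMPLE_ERR and format_example are hypothetical names, it writes into a buffer with snprintf instead of printing to stderr so the result can be inspected, and it uses the GNU `##__VA_ARGS__` extension (accepted by gcc and clang) in place of the named `args...` form. With the space after the literal, it should build as C and also as C++11.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical macro for illustration only. The space between the string
 * literal and fmt is what keeps a C++11 compiler from parsing fmt as a
 * user-defined literal suffix. */
#define EXAMPLE_ERR(buf, size, fmt, ...) \
   snprintf(buf, size, "%s - " fmt, __func__, ##__VA_ARGS__)

/* Formats "<function name> - bad value <value>" into buf and returns the
 * number of characters snprintf would have written. */
static int
format_example(char *buf, size_t size, int value)
{
   return EXAMPLE_ERR(buf, size, "bad value %d", value);
}
```

Calling format_example(buf, sizeof(buf), 7) with a large enough buffer leaves the function name, the separator and the formatted message in buf as a single string.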


Re: [Mesa-dev] [PATCH 1/4] gallium/radeon: notify all contexts when cmasks are enabled/disabled

2016-03-09 Thread Bas Nieuwenhuizen
FWIW The series is
Reviewed-by: Bas Nieuwenhuizen 

- Bas

On Thu, Mar 10, 2016 at 12:07 AM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> There is an annoying corner case that I stumbled across while looking into
> piglit's 
> arb_shader_image_load_store/execution/load-from-cleared-image.shader_test
> (which can be easily adapted to demonstrate the bug without the
> ARB_shader_image_load_store extension).
>
> When we bind a texture and then clear it using glClear (by attaching it
> to the current framebuffer) for the first time, we allocate a separate
> cmask for the texture to do fast clear, but the corresponding bit in
> compressed_colortex_mask is not set. Subsequent rendering will use
> incorrect data.
>
> Conversely, when a currently bound texture with an existing cmask is
> exported leading to that cmask being disabled, the compressed_colortex_mask
> bit will remain set, leading to an assertion later on in debug builds.
>
> Since iterating through all contexts and/or remembering where every
> texture is bound would be costly, and cmask enable/disable should be
> rare, we will maintain a global counter to signal contexts that they
> must update their compressed_colortex_masks.
>
> This patch introduces the global counter, and subsequent patches will
> do the mask update.
> ---
>  src/gallium/drivers/radeon/r600_pipe_common.h | 7 +++++++
>  src/gallium/drivers/radeon/r600_texture.c | 3 +++
>  2 files changed, 10 insertions(+)
>
> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
> b/src/gallium/drivers/radeon/r600_pipe_common.h
> index d20069e..cf8dcf7 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.h
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.h
> @@ -335,6 +335,12 @@ struct r600_common_screen {
>  */
> unsigned dirty_fb_counter;
>
> +   /* Atomically increment this counter when an existing texture's
> +* metadata is enabled or disabled in a way that requires changing
> +* contexts' compressed texture binding masks.
> +*/
> +   unsigned compressed_colortex_counter;
> +
> void (*query_opaque_metadata)(struct r600_common_screen *rscreen,
>   struct r600_texture *rtex,
>   struct radeon_bo_metadata *md);
> @@ -406,6 +412,7 @@ struct r600_common_context {
> unsigned initial_gfx_cs_size;
> unsigned gpu_reset_counter;
> unsigned last_dirty_fb_counter;
> +   unsigned last_compressed_colortex_counter;
>
> struct u_upload_mgr *uploader;
> struct u_suballocator   *allocator_so_filled_size;
> diff --git a/src/gallium/drivers/radeon/r600_texture.c 
> b/src/gallium/drivers/radeon/r600_texture.c
> index 1a8822c..6b2d909 100644
> --- a/src/gallium/drivers/radeon/r600_texture.c
> +++ b/src/gallium/drivers/radeon/r600_texture.c
> @@ -287,6 +287,7 @@ static void r600_texture_disable_cmask(struct 
> r600_common_screen *rscreen,
>
> /* Notify all contexts about the change. */
> r600_dirty_all_framebuffer_states(rscreen);
> +   p_atomic_inc(&rscreen->compressed_colortex_counter);
>  }
>
>  static void r600_texture_disable_dcc(struct r600_common_screen *rscreen,
> @@ -603,6 +604,8 @@ static void r600_texture_alloc_cmask_separate(struct 
> r600_common_screen *rscreen
> rtex->cb_color_info |= SI_S_028C70_FAST_CLEAR(1);
> else
> rtex->cb_color_info |= EG_S_028C70_FAST_CLEAR(1);
> +
> +   p_atomic_inc(&rscreen->compressed_colortex_counter);
>  }
>
>  static unsigned r600_texture_get_htile_size(struct r600_common_screen 
> *rscreen,
> --
> 2.5.0
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
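
The counter scheme described in the commit message above can be sketched roughly as follows. This is only an illustration under stated assumptions: the example_* names are hypothetical, C11 <stdatomic.h> stands in for Mesa's p_atomic_inc, and the context-side check is a guess at what the "subsequent patches" do, since they are not shown in this message.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the screen: one counter shared by all contexts. */
struct example_screen {
   atomic_uint compressed_colortex_counter;
};

/* Hypothetical stand-in for a context: remembers the last value it saw. */
struct example_context {
   struct example_screen *screen;
   unsigned last_compressed_colortex_counter;
   unsigned mask_updates; /* counts how often the masks were recomputed */
};

/* Called whenever a cmask is enabled or disabled on an existing texture. */
static void
example_screen_cmask_changed(struct example_screen *screen)
{
   atomic_fetch_add(&screen->compressed_colortex_counter, 1);
}

/* Called by a context before drawing. Returns true if the context noticed
 * a change and had to recompute its compressed texture masks. */
static bool
example_context_sync(struct example_context *ctx)
{
   unsigned counter = atomic_load(&ctx->screen->compressed_colortex_counter);

   if (counter == ctx->last_compressed_colortex_counter)
      return false;

   ctx->last_compressed_colortex_counter = counter;
   ctx->mask_updates++; /* placeholder for the real mask update */
   return true;
}
```

The point of the design is that the screen never needs to know which contexts exist or where a texture is bound: each context pays one counter comparison per check and only does the expensive update when the counter has moved.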


Re: [Mesa-dev] [PATCH 4/6] nvc0: add MP performance counters for SM35 (GK110)

2016-03-09 Thread Ilia Mirkin
On Wed, Mar 9, 2016 at 6:23 PM, Samuel Pitoiset
 wrote:
> + if (screen->base.class_3d <= NVF0_3D_CLASS &&
> + screen->base.class_3d != NVEA_3D_CLASS) {

Why? NVEA should be the same as NVF0 I think... and actually
NVEA_3D_CLASS is 0xa297, while the NVF0 one is 0xa197...

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/6] nvc0: expose SM35 perf counters to AMD_performance_monitor

2016-03-09 Thread Ilia Mirkin
On Wed, Mar 9, 2016 at 6:23 PM, Samuel Pitoiset
 wrote:
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/nvc0/nvc0_query.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
> index 6836432..5cbc66e 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
> @@ -204,7 +204,8 @@ nvc0_screen_get_driver_query_group_info(struct 
> pipe_screen *pscreen,
>
> if (screen->base.drm->version >= 0x01000101) {
>if (screen->compute) {
> - if (screen->base.class_3d <= NVE4_3D_CLASS) {
> + if (screen->base.class_3d <= NVF0_3D_CLASS &&
> + screen->base.class_3d != NVEA_3D_CLASS) {
>  count += 2;
>   }
>}
> @@ -230,7 +231,8 @@ nvc0_screen_get_driver_query_group_info(struct 
> pipe_screen *pscreen,
> } else
> if (id == NVC0_HW_METRIC_QUERY_GROUP) {
>if (screen->compute) {
> -  if (screen->base.class_3d <= NVE4_3D_CLASS) {
> +  if (screen->base.class_3d <= NVF0_3D_CLASS &&
> +  screen->base.class_3d != NVE4_3D_CLASS) {

4's do tend to look a lot like A's...

With the unnecessary attempt to filter out NVEA_3D_CLASS removed (it is
already filtered out because it's > NVF0_3D_CLASS), this whole series
is

Acked-by: Ilia Mirkin 

>  info->name = "Performance metrics";
>  info->max_active_queries = 1;
>  info->num_queries = nvc0_hw_metric_get_num_queries(screen);
> --
> 2.7.1
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
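
The two review comments above can be checked with a small sketch. The EX_* constants and function names are hypothetical; the NVF0 and NVEA values are the ones quoted in the previous message, while the NVE4 value (0xa097) is an assumption not stated in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_NVE4_3D_CLASS 0xa097 /* assumed value, not given in the thread */
#define EX_NVF0_3D_CLASS 0xa197
#define EX_NVEA_3D_CLASS 0xa297

/* Condition as written in the first hunk of the patch. */
static bool
ex_has_counters_patch(uint32_t class_3d)
{
   return class_3d <= EX_NVF0_3D_CLASS && class_3d != EX_NVEA_3D_CLASS;
}

/* Condition with the redundant NVEA test dropped. */
static bool
ex_has_counters_simplified(uint32_t class_3d)
{
   return class_3d <= EX_NVF0_3D_CLASS;
}

/* Condition as written in the second hunk, which names NVE4 instead of
 * NVEA and therefore excludes NVE4 itself. */
static bool
ex_has_metrics_typo(uint32_t class_3d)
{
   return class_3d <= EX_NVF0_3D_CLASS && class_3d != EX_NVE4_3D_CLASS;
}
```

Because 0xa297 is greater than 0xa197, the first two functions agree for every class value, which is the redundancy pointed out in the review. The third one differs only for NVE4, which, under the assumed value, the simplified check would accept.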


Re: [Mesa-dev] [GSoC2016] Interested in implementing "Soft" double precision floating point support

2016-03-09 Thread Roland Scheidegger
On 09.03.2016 at 23:51, Ian Romanick wrote:
> On 03/09/2016 02:25 AM, tournier.elie wrote:
>> Hi everyone.
>>
>> My name is Elie TOURNIER, I am enrolled in a French Engineering school
>> (Telecom Physique Strasbourg) specialized in Medical ICT.
>> I'm interested in implementing "Soft" double precision floating point
>> support [1].
>> Taking this subject seems to be a good way to get my feet wet in the Mesa
>> code and discover how some of its components work.
>>
>> I am writing to introduce myself and also to gather information that
>> will help this project succeed.
>>
>> I would like to know more about the following things to understand your
>> requirements:
>> 1- "Each double precision value would be stored in a uvec2" The IEEE
>> double precision floating point representation requires 64 bits:
>> 1 for the sign, 11 for the exponent and 52 for the fraction [2].
>> -> How must a double precision value be stored?
> 
> As Emil mentioned, in GLSL 1.30 a uvec2 consists of two 32-bit
> unsigned integers.  Each double precision value would be stored in a uvec2.
> 
>> 2- Where can I find GL_ARB_gpu_shader_fp64 documentation?
>>
>>
>> This is my first exposure to Mesa. Please excuse me if I am asking basic
>> questions.
> 
> For this particular project, you wouldn't need Mesa at all for quite
> some time.  All of the initial project should be done in "raw" GLSL
> 1.30, and any OpenGL implementation capable of GLSL 1.30 can be used.
> You would implement (and test!) a library of functions like 'uvec2
> addDouble(uvec2 a, uvec2 b)' that would provide all of the required
> double precision operations.
> 
> The set of required functions should be pretty small.  I think:
> 
>  - add
>  - negate
>  - absolute value
>  - multiply
>  - reciprocal
>  - convert to single precision
>  - convert from single precision
>  - pow (maybe?)
>  - exp (maybe?)
>  - log (maybe?)

I don't think you need exp/log. At least GLSL doesn't require it, though
the project description isn't clear about it.
(All hardware I know of, with exactly one exception (that would be Intel
graphics...), implements pow as log2/mul/exp2 even for f32 anyway.)
I think you do need sqrt (or rsqrt), though, plus some functions for
rounding and comparison operations. Maybe min/max too (although if you
have comparisons you can of course emulate them).

Roland


> 
> I think everything else could be implemented using those functions.
> 
> Like I mentioned in the project description, there are quite a few
> existing C implementations of these functions.  Finding one of those
> that you can understand and that has a compatible license is probably
> the best place to start.
> 
>> Please point me to the right resources so that I can better understand
>> the project. I would also be happy to fix a bug to familiarize myself 
>> with the source code. Any suggestions on bugs that are relevant to the
>> project will be of great help.
>>
>> Regards,
>> Elie
>>
>> [1]
>> http://www.x.org/wiki/SummerOfCodeIdeas/#softdoubleprecisionfloatingpointsupport
>> [2] http://steve.hollasch.net/cgindex/coding/ieeefloat.html#storage
>>
>> PS: If you have any questions, please don't hesitate to contact me.
>>
>>

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
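
To make the "double stored in a uvec2" idea from this thread concrete, here is a minimal sketch in C rather than GLSL. Everything named ex_* is hypothetical, and the choice of putting the low 32 bits in x and the high 32 bits in y is an assumption, since the thread does not fix a component order. It also assumes the host double is an IEEE 754 binary64 value. Only negate and absolute value are shown, since they are simple sign-bit operations; add and multiply need real work.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a GLSL uvec2: x holds the low word, y the high word
 * (assumed layout). */
struct ex_uvec2 {
   uint32_t x;
   uint32_t y;
};

/* Split the 64-bit pattern of a host double into two 32-bit words. */
static struct ex_uvec2
ex_pack_double(double d)
{
   uint64_t bits;
   memcpy(&bits, &d, sizeof(bits));
   struct ex_uvec2 v = { (uint32_t)bits, (uint32_t)(bits >> 32) };
   return v;
}

/* Reassemble a host double from the two words. */
static double
ex_unpack_double(struct ex_uvec2 v)
{
   uint64_t bits = ((uint64_t)v.y << 32) | v.x;
   double d;
   memcpy(&d, &bits, sizeof(d));
   return d;
}

/* Bit 31 of the high word is the sign bit. */
static uint32_t
ex_sign(struct ex_uvec2 v)
{
   return v.y >> 31;
}

/* Bits 20..30 of the high word are the 11-bit biased exponent. */
static uint32_t
ex_exponent(struct ex_uvec2 v)
{
   return (v.y >> 20) & 0x7ffu;
}

/* Negation flips the sign bit and leaves everything else untouched. */
static struct ex_uvec2
ex_negate(struct ex_uvec2 v)
{
   v.y ^= 0x80000000u;
   return v;
}

/* Absolute value clears the sign bit. */
static struct ex_uvec2
ex_abs(struct ex_uvec2 v)
{
   v.y &= 0x7fffffffu;
   return v;
}
```

The pack and unpack helpers only exist to compare results against the host's native double arithmetic when testing; in a shader the value would live in the uvec2 the whole time.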


[Mesa-dev] GSOC 2016

2016-03-09 Thread Harisu fanyui
Hello developers,

My name is Harisu Fanyui and I am a second-year computer engineering
student at the University of Buea in Cameroon. I am interested in the
project "Soft" double precision floating point support listed on the
ideas page. As the project calls for, I have been programming in the C
programming language and am still learning and practicing it. I am also
very comfortable working in the Linux environment.

I am also good at digital and numerical analysis, which I think will be
very useful for me in carrying out this project.

Can someone tell me how to get started?

Thanks.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

