from:"Mike Stroyan"

[Mesa-dev] [PATCH] Only change and restore viewport 0 in mesa meta mode

2015-06-26 Thread Mike Stroyan

The meta code was setting a default depth range for all viewports
and 'restoring' all viewports to depth range values saved from viewport 0.
---
 src/mesa/drivers/common/meta.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 214a68a..9a75019 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -728,7 +728,7 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
   save->DepthNear = ctx->ViewportArray[0].Near;
   save->DepthFar = ctx->ViewportArray[0].Far;
   /* set depth range to default */
-  _mesa_DepthRange(0.0, 1.0);
+  _mesa_set_depth_range(ctx, 0, 0.0, 1.0);
}
 
if (state & MESA_META_CLAMP_FRAGMENT_COLOR) {
@@ -1129,7 +1129,7 @@ _mesa_meta_end(struct gl_context *ctx)
  _mesa_set_viewport(ctx, 0, save->ViewportX, save->ViewportY,
 save->ViewportW, save->ViewportH);
   }
-  _mesa_DepthRange(save->DepthNear, save->DepthFar);
+  _mesa_set_depth_range(ctx, 0, save->DepthNear, save->DepthFar);
}
 
if (state & MESA_META_CLAMP_FRAGMENT_COLOR &&
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] Only change and restore viewport 0 in mesa meta mode

2015-06-26 Thread Mike Stroyan

There isn't any bugzilla entry for this yet.  I just saw it in the source
code so far rather than in a misbehaving program.
Perhaps piglit could use a few tests for whether meta operations damage
context attributes.
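
A rough sketch of the kind of check suggested above, using a toy context rather than Mesa's real gl_context. Every toy_* name here is made up for illustration. The only behavior taken from the patch description is that the non-indexed call writes all viewports, while the save and restore values come from viewport 0.

```c
#include <stdbool.h>

#define TOY_MAX_VIEWPORTS 16

struct toy_viewport {
   double near_val;
   double far_val;
};

struct toy_context {
   struct toy_viewport viewports[TOY_MAX_VIEWPORTS];
};

/* Non-indexed setter: writes every viewport, like the call the patch removes. */
static void
toy_depth_range_all(struct toy_context *ctx, double near_val, double far_val)
{
   for (int i = 0; i < TOY_MAX_VIEWPORTS; i++) {
      ctx->viewports[i].near_val = near_val;
      ctx->viewports[i].far_val = far_val;
   }
}

/* Indexed setter: writes a single viewport only. */
static void
toy_set_depth_range(struct toy_context *ctx, int idx,
                    double near_val, double far_val)
{
   ctx->viewports[idx].near_val = near_val;
   ctx->viewports[idx].far_val = far_val;
}

/* Stand-in for a meta begin/end pair: save the range of viewport 0,
 * set the default range, then restore the saved values. */
static void
toy_meta_op(struct toy_context *ctx, bool indexed)
{
   double saved_near = ctx->viewports[0].near_val;
   double saved_far = ctx->viewports[0].far_val;

   if (indexed) {
      toy_set_depth_range(ctx, 0, 0.0, 1.0);
      /* ... meta drawing would happen here ... */
      toy_set_depth_range(ctx, 0, saved_near, saved_far);
   } else {
      toy_depth_range_all(ctx, 0.0, 1.0);
      /* ... meta drawing would happen here ... */
      toy_depth_range_all(ctx, saved_near, saved_far);
   }
}

/* Returns true if viewport idx still holds the expected range. */
static bool
toy_viewport_intact(const struct toy_context *ctx, int idx,
                    double near_val, double far_val)
{
   return ctx->viewports[idx].near_val == near_val &&
          ctx->viewports[idx].far_val == far_val;
}
```

A real piglit test would presumably set a distinct range on viewport 1 through the GL API, trigger a meta path, and read the range back. The toy version only shows why the non-indexed restore loses that state.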

On Fri, Jun 26, 2015 at 3:26 PM, Kenneth Graunke 
wrote:

> On Friday, June 26, 2015 03:15:46 PM Mike Stroyan wrote:
> > The meta code was setting a default depth range for all viewports
> > and 'restoring' all viewports to depth range values saved from viewport
> 0.
> > ---
> >  src/mesa/drivers/common/meta.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/mesa/drivers/common/meta.c
> b/src/mesa/drivers/common/meta.c
> > index 214a68a..9a75019 100644
> > --- a/src/mesa/drivers/common/meta.c
> > +++ b/src/mesa/drivers/common/meta.c
> > @@ -728,7 +728,7 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield
> state)
> >save->DepthNear = ctx->ViewportArray[0].Near;
> >save->DepthFar = ctx->ViewportArray[0].Far;
> >/* set depth range to default */
> > -  _mesa_DepthRange(0.0, 1.0);
> > +  _mesa_set_depth_range(ctx, 0, 0.0, 1.0);
> > }
> >
> > if (state & MESA_META_CLAMP_FRAGMENT_COLOR) {
> > @@ -1129,7 +1129,7 @@ _mesa_meta_end(struct gl_context *ctx)
> >   _mesa_set_viewport(ctx, 0, save->ViewportX, save->ViewportY,
> >  save->ViewportW, save->ViewportH);
> >}
> > -  _mesa_DepthRange(save->DepthNear, save->DepthFar);
> > +  _mesa_set_depth_range(ctx, 0, save->DepthNear, save->DepthFar);
> > }
> >
> > if (state & MESA_META_CLAMP_FRAGMENT_COLOR &&
> >
>
> Good catch - this code predates GL_ARB_viewport_array, and really ought
> to only change viewport 0.  Thanks, Mike!
>
> Cc: "10.6 10.5" 
> Reviewed-by: Kenneth Graunke 
>
> Is there a bugzilla entry related to this patch?
>
> I'll plan to push this tonight/tomorrow unless someone else objects.
>



-- 

 Mike Stroyan - Software Architect
 LunarG, Inc.  - The Graphics Experts
 Cell:  (970) 219-7905
 Email: m...@lunarg.com
 Website: http://www.lunarg.com

[Mesa-dev] [PATCH] i965: allocate at least 1 BLEND_STATE element

2015-07-01 Thread Mike Stroyan

When there are no color buffer render targets, gen6 and gen7 still
use the first BLEND_STATE element to determine alpha test.
gen6_upload_blend_state was allocating zero elements when
ctx->Color.AlphaEnabled was false.
That left _3DSTATE_CC_STATE_POINTERS or _3DSTATE_BLEND_STATE_POINTERS
pointing to random data from some previous brw_state_batch().
That sometimes suppressed depth rendering when those bits
happened to mean COMPAREFUNC_NEVER.
This produced flickering shadows for dota2 reborn.
---
 src/mesa/drivers/dri/i965/gen6_cc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_cc.c b/src/mesa/drivers/dri/i965/gen6_cc.c
index 2bfa271..2b76e24 100644
--- a/src/mesa/drivers/dri/i965/gen6_cc.c
+++ b/src/mesa/drivers/dri/i965/gen6_cc.c
@@ -51,7 +51,7 @@ gen6_upload_blend_state(struct brw_context *brw)
 * with render target 0, which will reference BLEND_STATE[0] for
 * alpha test enable.
 */
-   if (nr_draw_buffers == 0 && ctx->Color.AlphaEnabled)
+   if (nr_draw_buffers == 0)
   nr_draw_buffers = 1;
 
size = sizeof(*blend) * nr_draw_buffers;
-- 
2.1.0
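
For reference, the effect of the one-line change on the element count can be sketched in isolation. The toy_* function names are hypothetical; the two conditions are copied from the removed and added lines of the hunk.

```c
#include <stdbool.h>

/* Element count before the patch: padded to 1 only when alpha test is on. */
static int
toy_blend_count_old(int nr_draw_buffers, bool alpha_enabled)
{
   if (nr_draw_buffers == 0 && alpha_enabled)
      nr_draw_buffers = 1;
   return nr_draw_buffers;
}

/* Element count after the patch: never less than 1. */
static int
toy_blend_count_new(int nr_draw_buffers)
{
   if (nr_draw_buffers == 0)
      nr_draw_buffers = 1;
   return nr_draw_buffers;
}
```

With no draw buffers and alpha test disabled, the old count is zero, so the size computed from it is zero and nothing valid is written for BLEND_STATE[0].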


Re: [Mesa-dev] [PATCH] i965: allocate at least 1 BLEND_STATE element

2015-07-01 Thread Mike Stroyan

Fixes: (Flickering shadows in unreleased title trace)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80500
<https://bugs.freedesktop.org/show_bug.cgi?id=91173>

On Wed, Jul 1, 2015 at 10:16 AM, Mike Stroyan  wrote:

> When there are no color buffer render targets, gen6 and gen7 still
> use the first BLEND_STATE element to determine alpha test.
> gen6_upload_blend_state was allocating zero elements when
> ctx->Color.AlphaEnabled was false.
> That left _3DSTATE_CC_STATE_POINTERS or _3DSTATE_BLEND_STATE_POINTERS
> pointing to random data from some previous brw_state_batch().
> That sometimes suppressed depth rendering when those bits
> happened to mean COMPAREFUNC_NEVER.
> This produced flickering shadows for dota2 reborn.
> ---
>  src/mesa/drivers/dri/i965/gen6_cc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen6_cc.c
> b/src/mesa/drivers/dri/i965/gen6_cc.c
> index 2bfa271..2b76e24 100644
> --- a/src/mesa/drivers/dri/i965/gen6_cc.c
> +++ b/src/mesa/drivers/dri/i965/gen6_cc.c
> @@ -51,7 +51,7 @@ gen6_upload_blend_state(struct brw_context *brw)
>  * with render target 0, which will reference BLEND_STATE[0] for
>  * alpha test enable.
>  */
> -   if (nr_draw_buffers == 0 && ctx->Color.AlphaEnabled)
> +   if (nr_draw_buffers == 0)
>nr_draw_buffers = 1;
>
> size = sizeof(*blend) * nr_draw_buffers;
> --
> 2.1.0
>
>



Re: [Mesa-dev] [PATCH] i965: allocate at least 1 BLEND_STATE element

2015-07-02 Thread Mike Stroyan

I had actually made the change to gen8_upload_blend_state, but after
reading through the gen8 PRM a few times I decided to back it out.
It does seem that the initial gen8 BLEND_STATE DWord can disable alpha test.
Of course, new hardware features may not always behave as described.
In that case allowances will need to be made.
Or, we could always fill in the extra eight bytes for the first BLEND_STATE
entry.
Given that the structure is 64 byte aligned, that 'extra' data is almost
always free.
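
A back-of-the-envelope sketch of the "almost always free" remark. The sizes are assumptions read off this thread (one shared DWord plus one DWord pair per color target), and it is further assumed that the 64-byte alignment of each allocation is what determines the space actually consumed. All names are illustrative.

```c
#include <stddef.h>

#define TOY_BLEND_ALIGNMENT  64u
#define TOY_COMMON_BYTES      4u  /* one shared DWord (assumed from the thread) */
#define TOY_PER_TARGET_BYTES  8u  /* one DWord pair per color target (assumed) */

/* Round value up to the next multiple of alignment. */
static size_t
toy_align(size_t value, size_t alignment)
{
   return (value + alignment - 1) / alignment * alignment;
}

/* Space consumed by a BLEND_STATE with nr_targets entries, under the
 * assumptions above. */
static size_t
toy_gen8_blend_footprint(unsigned nr_targets)
{
   size_t size = TOY_COMMON_BYTES + (size_t)nr_targets * TOY_PER_TARGET_BYTES;
   return toy_align(size, TOY_BLEND_ALIGNMENT);
}
```

Under those assumptions, zero entries and one entry both round up to a single 64-byte unit, so always writing the first entry costs no extra space in the case discussed here.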

On Thu, Jul 2, 2015 at 1:45 AM, Kenneth Graunke 
wrote:

> On Wednesday, July 01, 2015 10:16:28 AM Mike Stroyan wrote:
> > When there are no color buffer render targets, gen6 and gen7 still
> > use the first BLEND_STATE element to determine alpha test.
> > gen6_upload_blend_state was allocating zero elements when
> > ctx->Color.AlphaEnabled was false.
> > That left _3DSTATE_CC_STATE_POINTERS or _3DSTATE_BLEND_STATE_POINTERS
> > pointing to random data from some previous brw_state_batch().
> > That sometimes suppressed depth rendering when those bits
> > happened to mean COMPAREFUNC_NEVER.
> > This produced flickering shadows for dota2 reborn.
> > ---
> >  src/mesa/drivers/dri/i965/gen6_cc.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/gen6_cc.c
> b/src/mesa/drivers/dri/i965/gen6_cc.c
> > index 2bfa271..2b76e24 100644
> > --- a/src/mesa/drivers/dri/i965/gen6_cc.c
> > +++ b/src/mesa/drivers/dri/i965/gen6_cc.c
> > @@ -51,7 +51,7 @@ gen6_upload_blend_state(struct brw_context *brw)
> >  * with render target 0, which will reference BLEND_STATE[0] for
> >  * alpha test enable.
> >  */
> > -   if (nr_draw_buffers == 0 && ctx->Color.AlphaEnabled)
> > +   if (nr_draw_buffers == 0)
> >nr_draw_buffers = 1;
> >
> > size = sizeof(*blend) * nr_draw_buffers;
> >
>
> Great catch!
>
> Reviewed-by: Kenneth Graunke 
>
> And pushed:
>9d408a4..fe2b748  master -> master
>
> I think we ought to change gen8_blend_state.c as well, but I'm not quite
> sure what change to make.  Either we should make the same change you did
> here, or delete the whole "We need at least 1 BLEND_STATE written"
> block.
>
> On Gen8+, it looks like the alpha test and other functions that might
> discard pixels are all in the shared/common DWord, and the per-color
> target DWord pairs look relatively harmless.  I suppose the null RT
> would still refer to BLEND_STATE[0]...so it might still be worth
> emitting one.  Any thoughts?
>




Re: [Mesa-dev] [PATCH 12/13] mesa/math: Avoid double promotion.

2015-07-14 Thread Mike Stroyan

.. but I
> guess that is a different topic.
>
> In any case, as it is, is there any gain with your changes? Before we
> had  double->float conversion in the result, but now we have a
> double->float conversion in the argument to sinf/cosf, right?
>
> Iago
>
> > I guess we will need wrappers for sinf and cosf.
> >
> > Reviewed-by: Iago Toral Quiroga 
> >
> > >
> > > memcpy(m, Identity, sizeof(GLfloat)*16);
> > > optimized = GL_FALSE;
> > > @@ -859,7 +859,7 @@ _math_matrix_rotate( GLmatrix *mat,
> > > if (!optimized) {
> > >const GLfloat mag = sqrtf(x * x + y * y + z * z);
> > >
> > > -  if (mag <= 1.0e-4) {
> > > +  if (mag <= 1.0e-4F) {
> > >   /* no rotation, leave mat as-is */
> > >   return;
> > >}
> > > @@ -1070,7 +1070,7 @@ _math_matrix_scale( GLmatrix *mat, GLfloat x,
> GLfloat y, GLfloat z )
> > > m[2] *= x;   m[6] *= y;   m[10] *= z;
> > > m[3] *= x;   m[7] *= y;   m[11] *= z;
> > >
> > > -   if (fabsf(x - y) < 1e-8 && fabsf(x - z) < 1e-8)
> > > +   if (fabsf(x - y) < 1e-8F && fabsf(x - z) < 1e-8F)
> > >mat->flags |= MAT_FLAG_UNIFORM_SCALE;
> > > else
> > >mat->flags |= MAT_FLAG_GENERAL_SCALE;
> > > @@ -1206,7 +1206,7 @@ static void analyse_from_scratch( GLmatrix *mat )
> > > GLuint i;
> > >
> > > for (i = 0 ; i < 16 ; i++) {
> > > -  if (m[i] == 0.0) mask |= (1<<i);
> > > +  if (m[i] == 0.0F) mask |= (1<<i);
> > > }
> > >
> > > if (m[0] == 1.0F) mask |= (1<<16);
> > > @@ -1240,12 +1240,12 @@ static void analyse_from_scratch( GLmatrix
> *mat )
> > >mat->type = MATRIX_2D;
> > >
> > >/* Check for scale */
> > > -  if (SQ(mm-1) > SQ(1e-6) ||
> > > - SQ(m4m4-1) > SQ(1e-6))
> > > +  if (SQ(mm-1) > SQ(1e-6F) ||
> > > + SQ(m4m4-1) > SQ(1e-6F))
> > >  mat->flags |= MAT_FLAG_GENERAL_SCALE;
> > >
> > >/* Check for rotation */
> > > -  if (SQ(mm4) > SQ(1e-6))
> > > +  if (SQ(mm4) > SQ(1e-6F))
> > >  mat->flags |= MAT_FLAG_GENERAL_3D;
> > >else
> > >  mat->flags |= MAT_FLAG_ROTATION;
> > > @@ -1255,9 +1255,9 @@ static void analyse_from_scratch( GLmatrix *mat )
> > >mat->type = MATRIX_3D_NO_ROT;
> > >
> > >/* Check for scale */
> > > -  if (SQ(m[0]-m[5]) < SQ(1e-6) &&
> > > - SQ(m[0]-m[10]) < SQ(1e-6)) {
> > > -if (SQ(m[0]-1.0) > SQ(1e-6)) {
> > > +  if (SQ(m[0]-m[5]) < SQ(1e-6F) &&
> > > + SQ(m[0]-m[10]) < SQ(1e-6F)) {
> > > +if (SQ(m[0]-1.0F) > SQ(1e-6F)) {
> > > mat->flags |= MAT_FLAG_UNIFORM_SCALE;
> > >   }
> > >}
> > > @@ -1275,8 +1275,8 @@ static void analyse_from_scratch( GLmatrix *mat )
> > >mat->type = MATRIX_3D;
> > >
> > >/* Check for scale */
> > > -  if (SQ(c1-c2) < SQ(1e-6) && SQ(c1-c3) < SQ(1e-6)) {
> > > -if (SQ(c1-1.0) > SQ(1e-6))
> > > +  if (SQ(c1-c2) < SQ(1e-6F) && SQ(c1-c3) < SQ(1e-6F)) {
> > > +if (SQ(c1-1.0F) > SQ(1e-6F))
> > > mat->flags |= MAT_FLAG_UNIFORM_SCALE;
> > >  /* else no scale at all */
> > >}
> > > @@ -1285,10 +1285,10 @@ static void analyse_from_scratch( GLmatrix
> *mat )
> > >}
> > >
> > >/* Check for rotation */
> > > -  if (SQ(d1) < SQ(1e-6)) {
> > > +  if (SQ(d1) < SQ(1e-6F)) {
> > >  CROSS3( cp, m, m+4 );
> > >  SUB_3V( cp, cp, (m+8) );
> > > -if (LEN_SQUARED_3FV(cp) < SQ(1e-6))
> > > +if (LEN_SQUARED_3FV(cp) < SQ(1e-6F))
> > > mat->flags |= MAT_FLAG_ROTATION;
> > >  else
> > > mat->flags |= MAT_FLAG_GENERAL_3D;
> > > diff --git a/src/mesa/math/m_norm_tmp.h b/src/mesa/math/m_norm_tmp.h
> > > index d3ec1c2..6f1db8d 100644
> > > --- a/src/mesa/math/m_norm_tmp.h
> > > +++ b/src/mesa/math/m_norm_tmp.h
> > > @@ -80,7 +80,7 @@ TAG(transform_normalize_normals)( const GLmatrix
> *mat,
> > >}
> > > }
> > > else {
> > > -  if (scale != 1.0) {
> > > +  if (scale != 1.0f) {
> > >  m0 *= scale,  m4 *= scale,  m8 *= scale;
> > >  m1 *= scale,  m5 *= scale,  m9 *= scale;
> > >  m2 *= scale,  m6 *= scale,  m10 *= scale;
> >
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
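
The literal changes in the quoted hunks rely on a basic C rule: an unsuffixed floating constant has type double, so mixing it with a float operand converts the float to double first. A minimal sketch of that rule follows; the function names are made up, and sizeof is used only to report the type of the expression.

```c
#include <stddef.h>

/* sizeof does not evaluate its operand; it reports the size of the type
 * the expression has after the usual arithmetic conversions. */

/* Unsuffixed literal is a double, so the float operand is promoted. */
static size_t
toy_size_with_double_literal(float mag)
{
   return sizeof(mag * 1.0e-4);
}

/* F-suffixed literal is a float, so the expression stays in float. */
static size_t
toy_size_with_float_literal(float mag)
{
   return sizeof(mag * 1.0e-4F);
}
```

GCC can flag the first form with -Wdouble-promotion. Whether avoiding the promotion is a measurable gain in these code paths is the open question raised in the quoted text.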




Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix buffer overruns in MSAA MCS buffer clearing.

2014-04-15 Thread Mike Stroyan

git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>> > index 5996a1b..59700ed 100644
>> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
>> > @@ -1219,7 +1219,7 @@ intel_miptree_alloc_mcs(struct brw_context *brw,
>> >  * Note: the clear value for MCS buffers is all 1's, so we memset
>> to 0xff.
>> >  */
>> > void *data = intel_miptree_map_raw(brw, mt->mcs_mt);
>> > -   memset(data, 0xff, mt->mcs_mt->region->bo->size);
>> > +   memset(data, 0xff, mt->mcs_mt->region->height *
>> mt->mcs_mt->region->pitch);
>> > intel_miptree_unmap_raw(brw, mt->mcs_mt);
>> > mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_CLEAR;
>>
>> This does seem to fix the KWin problem, as well as the glxgears problem.
>>
>> I agree this is the correct amount of data to memset, and even if we
>> make the libdrm change I suggested, this seems worth doing.  bo->size
>> may have been rounded up beyond what we need, and memsetting that extra
>> space is wasteful (even if it did work).
>>
>> Reviewed-by: Kenneth Graunke 
>>
>> Thanks a ton for your help on this, Eric.  I was really stumped.
>>
>>
>> ___
>> mesa-stable mailing list
>> mesa-sta...@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-stable
>>
>>
>
>
> --
> Courtney Goeltzenleuchter
> LunarG
>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>



Re: [Mesa-dev] [PATCH] i965: Avoid dependency hints on math opcodes

2014-02-20 Thread Mike Stroyan

Ian,

  Here is a shader_test version.  It lacks the sparkling uncertainty of the
pixel values in the previous animated example program.
It just gets all vertices uniformly wrong when I run it.


On Thu, Feb 20, 2014 at 4:31 PM, Ian Romanick  wrote:

> On 02/12/2014 04:24 PM, m...@lunarg.com wrote:
> > From: Mike Stroyan 
> >
> >   Putting NoDDClr and NoDDChk dependency control on instruction
> > sequences that include math opcodes can cause corruption of channels.
> > Treat math opcodes like send opcodes and suppress dependency hinting.
>
> Since you've analyzed the failure in the real application, can you
> produce a minimal shader_runner test case that exhibits the same
> problem?  Eric mentioned to me that he'd like to play with it to better
> understand what's going on...
>
> > Signed-off-by: Mike Stroyan 
> > Tested-by: Tony Bertapelli 
> > ---
> >  src/mesa/drivers/dri/i965/brw_vec4.cpp | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > index dd23ed4..1c42ca8 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > @@ -717,6 +717,14 @@ vec4_visitor::opt_set_dependency_control()
> >  continue;
> >   }
> >
> > + /* Dependency control does not work well over math
> instructions.
> > +  */
> > + if (inst->is_math()) {
> > +memset(last_grf_write, 0, sizeof(last_grf_write));
> > +memset(last_mrf_write, 0, sizeof(last_mrf_write));
> > +continue;
> > +     }
> > +
> >   /* Now, see if we can do dependency control for this
> instruction
> >* against a previous one writing to its destination.
> >*/
> >
>
>




glsl-vs-math-dependency.shader_test
Description: Binary data

Re: [Mesa-dev] [PATCH] i965: Avoid dependency hints on math opcodes

2014-02-28 Thread Mike Stroyan

Matt,

  You haven't replied to my mail with an updated shader test that shows the
math instructions alone causing trouble.
Do you now agree that my patch avoiding math instructions in
opt_set_dependency_control is the appropriate fix?
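
For readers following the thread, here is a much simplified model of what the patch does. It is not the real vec4_visitor::opt_set_dependency_control(): the toy_* types, the single destination register per instruction, and the pairing rule are all assumptions made for illustration. The only part taken from the patch is that math instructions, like sends, reset the write tracking and receive no hints.

```c
#include <stdbool.h>
#include <string.h>

enum toy_op { TOY_MOV, TOY_MATH, TOY_SEND };

struct toy_inst {
   enum toy_op op;
   int dst_grf;        /* destination register number */
   bool no_dd_clear;   /* hint: leave the destination dependency set */
   bool no_dd_check;   /* hint: skip the destination dependency check */
};

#define TOY_NUM_GRF 128

/* Mark back-to-back writes to the same register with NoDDClr on the earlier
 * write and NoDDChk on the later one.  Sends always reset the tracking;
 * math instructions do so only when skip_math is true (the patched case). */
static void
toy_set_dependency_control(struct toy_inst *insts, int count, bool skip_math)
{
   struct toy_inst *last_write[TOY_NUM_GRF];
   memset(last_write, 0, sizeof(last_write));

   for (int i = 0; i < count; i++) {
      struct toy_inst *inst = &insts[i];

      if (inst->op == TOY_SEND || (skip_math && inst->op == TOY_MATH)) {
         memset(last_write, 0, sizeof(last_write));
         continue;
      }

      struct toy_inst *prev = last_write[inst->dst_grf];
      if (prev) {
         prev->no_dd_clear = true;
         inst->no_dd_check = true;
      }
      last_write[inst->dst_grf] = inst;
   }
}
```

Running three math writes to one register through this model with skip_math false yields the hint pattern NoDDClr, then NoDDClr plus NoDDChk, then NoDDChk. With skip_math true, none of the three carries a hint.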


On Fri, Feb 21, 2014 at 11:30 AM, Mike Stroyan  wrote:

> Matt,
>
>   I still see the math instructions as the troublemakers.  The NoDDChk bit
> is about write hazards and should not disable checking for read hazards.
>
>   Here is a slightly changed shader test that avoids using NoDDChk on the
> instruction that consumes the result of the 'math exp' instructions.  The
> only difference between the shaders when applying my patch is the presence
> of NoDDChk and NoDDClr on the math instructions.  And the z channel still
> gets almost always incorrect results when NoDDChk is used.
>
>
>
> On Thu, Feb 20, 2014 at 7:52 PM, Matt Turner  wrote:
>
>> On Thu, Feb 20, 2014 at 4:55 PM, Mike Stroyan  wrote:
>> > Ian,
>> >
>> >   Here is a shader_test version.  It lacks the sparkling uncertainty of
>> the
>> > pixel values in the previous animated example program.
>> > It just gets all vertices uniformly wrong when I run it.
>>
>> Thanks for the test Mike. I reproduced it locally.
>>
>> Comparing the vertex shader assembly before and after your patch, I
>> get the attached diff.
>>
>> The good news is that it doesn't look like the math instructions'
>> dependency control flags are broken. The bad news is that our
>> dependency control optimization has a bug. :)
>>
>> After three math exp instructions write to g8.{x,y,z}, we mov.sat
>> g8.xyz into g116.xyz which we use as a message register. The problem
>> is that we wrote a 1.0 into g116.w before the exp instructions, and
>> the dependency control optimization code recognized that the write to
>> g116.xyz would stall waiting for the write to g116.w and marked the
>> instructions with NoDDClr/NoDDChk. Unfortunately, marking the write to
>> g116.xyz with NoDDChk means that the instruction doesn't wait on the
>> writes to g8 to complete either!
>>
>
>
>
> --
>
>  Mike Stroyan - Software Architect
>  LunarG, Inc.  - The Graphics Experts
>  Cell:  (970) 219-7905
>  Email: m...@lunarg.com
>  Website: http://www.lunarg.com
>




[Mesa-dev] [PATCH] Release gl_debug_state when destroying context.

2014-03-11 Thread Mike Stroyan

Commit 6e8d04a caused a leak by allocating ctx->Debug but never freeing it.
Release the memory in _mesa_free_errors_data when destroying a context.
Use FREE to match CALLOC_STRUCT from _mesa_get_debug_state.
---
 src/mesa/main/errors.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
index 8ec6a8c..9151718 100644
--- a/src/mesa/main/errors.c
+++ b/src/mesa/main/errors.c
@@ -969,7 +969,7 @@ _mesa_init_errors(struct gl_context *ctx)
 
 /**
  * Loop through debug group stack tearing down states for
- * filtering debug messages.
+ * filtering debug messages.  Then free debug output state.
  */
 void
 _mesa_free_errors_data(struct gl_context *ctx)
@@ -980,6 +980,9 @@ _mesa_free_errors_data(struct gl_context *ctx)
   for (i = 0; i <= ctx->Debug->GroupStackDepth; i++) {
  free_errors_data(ctx, i);
   }
+  FREE(ctx->Debug);
+  /* set to NULL just in case it is used before context is completely gone. */
+  ctx->Debug = NULL;
}
 }
 
-- 
1.8.3.2
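
The free-then-NULL pattern in the patch can be shown standalone. This sketch uses plain calloc/free and toy_* names in place of Mesa's CALLOC_STRUCT/FREE and real structures, and it assumes the debug state is allocated on first use.

```c
#include <stdlib.h>

struct toy_debug_state {
   int group_stack_depth;
};

struct toy_context {
   struct toy_debug_state *debug;
};

/* Allocate the debug state on first use (assumed behavior). */
static struct toy_debug_state *
toy_get_debug_state(struct toy_context *ctx)
{
   if (!ctx->debug)
      ctx->debug = calloc(1, sizeof(*ctx->debug));
   return ctx->debug;
}

/* Tear down the debug state and clear the pointer, so a later call sees
 * NULL instead of a dangling pointer. */
static void
toy_free_errors_data(struct toy_context *ctx)
{
   if (ctx->debug) {
      /* per-group teardown would go here */
      free(ctx->debug);
      ctx->debug = NULL;
   }
}
```

Because the pointer is cleared, calling the teardown a second time is harmless, and any late lookup either reallocates or sees NULL rather than freed memory.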


Re: [Mesa-dev] [PATCH] Release gl_debug_state when destroying context.

2014-03-12 Thread Mike Stroyan

Brian,

  Please push that.  I haven't gotten commit access for myself yet.


On Tue, Mar 11, 2014 at 5:42 PM, Brian Paul  wrote:

> On 03/11/2014 05:07 PM, Mike Stroyan wrote:
>
>> Commit 6e8d04a caused a leak by allocating ctx->Debug but never freeing
>> it.
>> Release the memory in _mesa_free_errors_data when destroying a context.
>> Use FREE to match CALLOC_STRUCT from _mesa_get_debug_state.
>> ---
>>   src/mesa/main/errors.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
>> index 8ec6a8c..9151718 100644
>> --- a/src/mesa/main/errors.c
>> +++ b/src/mesa/main/errors.c
>> @@ -969,7 +969,7 @@ _mesa_init_errors(struct gl_context *ctx)
>>
>>   /**
>>* Loop through debug group stack tearing down states for
>> - * filtering debug messages.
>> + * filtering debug messages.  Then free debug output state.
>>*/
>>   void
>>   _mesa_free_errors_data(struct gl_context *ctx)
>> @@ -980,6 +980,9 @@ _mesa_free_errors_data(struct gl_context *ctx)
>> for (i = 0; i <= ctx->Debug->GroupStackDepth; i++) {
>>free_errors_data(ctx, i);
>> }
>> +  FREE(ctx->Debug);
>> +  /* set to NULL just in case it is used before context is
>> completely gone. */
>> +  ctx->Debug = NULL;
>>  }
>>   }
>>
>>
>>
> Reviewed-by: Brian Paul 
>
> Thanks, Mike!  Do you need me to push this for you?
>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>




Re: [Mesa-dev] [PATCH] i965: Avoid dependency hints on math opcodes

2014-03-19 Thread Mike Stroyan

Ken,

  The defect depends on details of the shader compiler reusing registers.
Old register values are sometimes left in some channels when the
write-after-write hints are used.
Further reducing the GLSL test case would be tricky because the reuse of g8
would be likely to be changed.
I did already spend quite a long time reducing a larger shader to this
relatively small example.
The GLSL is not anyone's source code.  It results from at least two levels
of machine translation and my own rewrite.

The difference in the resulting native code is now very simple.
The only difference from my patch is the presence of NoDDClr and NoDDChk
hints on the three math opcodes.
The results are then used by an opcode without any NoDDChk hinting.
With math data dependency hints in place the results are often incorrect.
They clearly don't work reliably.

   (assign  (x) (var_ref R_2)  (swiz x (expression vec4 exp2 (swiz (var_ref R_1) )) ))
0x0570: math exp(8) g8<1>.xF  g7<4,4,1>.xF  null  { align16 WE_normal NoDDClr 1Q };
   (assign  (y) (var_ref R_2)  (swiz y (expression vec4 exp2 (expression vec4 + (swiz  (var_ref R_1) )(constant float (0x1.5798eep-27)) ) ) ))
0x0580: add(8)  g81<1>F  g7<4,4,1>.yF  1e-08F  { align16 WE_normal 1Q };
0x0590: math exp(8) g8<1>.yF  g81<4,4,1>F  null  { align16 WE_normal NoDDClr,NoDDChk 1Q };
   (assign  (z) (var_ref R_2)  (swiz z (expression vec4 exp2 (swiz (var_ref R_1) )) ))
0x05a0: math exp(8) g8<1>.zF  g7<4,4,1>.zF  null  { align16 WE_normal NoDDChk 1Q };
   (assign  (xyz) (var_ref gl_FrontColor)  (expression vec3 * (swiz xyz (var_ref R_2) )(constant vec3 (0.99 0.99 0.99)) ) )
0x05b0: mul(8)  g4<1>.xyzF  g8<4,4,1>.xyzzF  0.99F  { align16 WE_normal 1Q };

On Tue, Mar 18, 2014 at 4:05 PM, Kenneth Graunke wrote:

> On 02/28/2014 02:35 PM, Mike Stroyan wrote:
> > Matt,
> >
> >   You haven't replied to my mail with an updated shader test that shows
> > the math instructions alone causing trouble.
>
> I don't think Matt has time to do that.  Could you please trim down your
> shader test to a smaller case which demonstrates the problem?  As is,
> it's pretty large.  It also looks like it was ripped directly out of
> someone's application, which makes us nervous about copyright infringement.
>
> > Do you now agree that my patch avoiding math instructions in
> > opt_set_dependency_control is the appropriate fix?
>
> I see no documentation indicating that there are bugs with dependency
> control and math instructions.  I also don't see any workarounds for
> that in the Windows driver.
>
> --Ken
>
>


Re: [Mesa-dev] [PATCH] i965: Avoid dependency hints on math opcodes

2014-03-21 Thread Mike Stroyan

 On Wed, Mar 19, 2014 at 3:16 PM, Matt Turner  wrote:
> - Does the broken behavior depend on the hardware generation? I.e.,
> broken on Ivy Bridge and Haswell but not Ironlake?
> - There are ~12 math ops. Are the dependency control hints broken for
> all of them, or just exp2?

The problem with NoDDClr and NoDDChk is the same on Ivy Bridge and Haswell.
It doesn't happen with Ironlake because compiling for Ironlake uses send
instead of math opcodes.
Using send means that it also avoids using NoDDClr and NoDDChk.

Mutating my piglit shader_runner test shows the same problem occurs for
math with exp2, log, rsq, sqrt, and sin.
I didn't create test cases for the two operand opcodes.  But I am not
optimistic about them.


[Mesa-dev] [PATCH] i965: Set dirty bit for NOS fragment shader change

2014-12-22 Thread Mike Stroyan

A fragment program can change because of Non-Orthogonal-State changes.
brw_update_texture_surfaces needs to run because of changed surface offsets.
Set BRW_NEW_FRAGMENT_PROGRAM dirty bit in brw_upload_wm_prog to signal that.
---
 src/mesa/drivers/dri/i965/brw_wm.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c
index e7939f0..c212892 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -601,7 +601,14 @@ brw_upload_wm_prog(struct brw_context *brw)
   (void) success;
   assert(success);
}
-   brw->wm.base.prog_data = &brw->wm.prog_data->base;
+   if (brw->wm.base.prog_data != &brw->wm.prog_data->base) {
+  /* Fragment program can change because of only NOS changes.
+   * Set dirty bit to signal that change.
+   * brw_update_texture_surfaces needs to run for changed surface offsets.
+   */
+  brw->wm.base.prog_data = &brw->wm.prog_data->base;
+  brw->state.dirty.brw |= BRW_NEW_FRAGMENT_PROGRAM;
+   }
 }
 
 
-- 
2.1.0


[Mesa-dev] [PATCH] i965: Set dirty bit for NOS fragment shader change

2014-12-22 Thread Mike Stroyan

This patch fixes a problem I reported as
[Bug 87619] Changes to state such as render targets change fragment shader 
without marking it dirty.

I sent a test that demonstrates the problem to the piglit mailing list as
fbo: Changing mrt binding with same shader source

The root cause of problem is rather generic.
brw_upload_wm_prog() calls brw_search_cache() to find the right
 fragment shader for a particular key from brw_wm_populate_key().
It does not set any dirty bit for changes to the shader.
There is a test in brw_upload_state() that checks for changes-

   if (brw->fragment_program != ctx->FragmentProgram._Current) {
  brw->fragment_program = ctx->FragmentProgram._Current;
  brw->state.dirty.brw |= BRW_NEW_FRAGMENT_PROGRAM;
   }

But that test is not looking for changes to NOS in the cache key.
It only sees more direct changes to the fragment program.

Setting BRW_NEW_FRAGMENT_PROGRAM in brw_upload_wm_prog() fixes the
particular program that I was debugging and the piglit test I created.
But I wonder how many other cases occur.  There are six other callers
of brw_search_cache() that may not be getting the right dirty bits
set when cache key changes.
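
The idea behind the proposed change, reduced to a toy: when the cache lookup hands back different program data than last time, flag it. The names and the single dirty word are illustrative only and are not the real brw_context layout.

```c
#include <stdint.h>

#define TOY_NEW_FRAGMENT_PROGRAM (1u << 0)

struct toy_wm_state {
   const void *prog_data;   /* program data selected by the last lookup */
   uint32_t dirty;          /* pending dirty bits */
};

/* Record the program data returned by the cache and set a dirty bit only
 * when it differs from what was in use before. */
static void
toy_upload_wm_prog(struct toy_wm_state *wm, const void *cached_prog_data)
{
   if (wm->prog_data != cached_prog_data) {
      wm->prog_data = cached_prog_data;
      wm->dirty |= TOY_NEW_FRAGMENT_PROGRAM;
   }
}
```

The same comparison would presumably be needed at any other lookup site whose result can change purely because of the cache key.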

Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Add missing BRW_NEW_*_PROG_DATA to texture/renderbuffer atoms.

2014-12-23 Thread Mike Stroyan

Reviewed-by: Mike Stroyan 

On Mon, Dec 22, 2014 at 9:28 PM, Chris Forbes  wrote:
>
> Reviewed-by: Chris Forbes 
>
> On Tue, Dec 23, 2014 at 3:58 PM, Kenneth Graunke 
> wrote:
> > This was probably missed when moving from a fixed binding table layout
> > to a dynamic one that changes based on the shader.
> >
> > Fixes newly proposed Piglit test fbo-mrt-new-bind.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87619
> > Signed-off-by: Kenneth Graunke 
> > Cc: Mike Stroyan 
> > Cc: "10.4 10.3" 
> > ---
> >  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++++++++++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > index 7361c2f..85a08d5 100644
> > --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > @@ -536,6 +536,7 @@ brw_update_null_renderbuffer_surface(struct
> brw_context *brw, unsigned int unit)
> > drm_intel_bo *bo = NULL;
> > unsigned pitch_minus_1 = 0;
> > uint32_t multisampling_state = 0;
> > +   /* BRW_NEW_FS_PROG_DATA */
> > uint32_t surf_index =
> >brw->wm.prog_data->binding_table.render_target_start + unit;
> >
> > @@ -621,6 +622,7 @@ brw_update_renderbuffer_surface(struct brw_context
> *brw,
> > uint32_t format = 0;
> > /* _NEW_BUFFERS */
> > mesa_format rb_format = _mesa_get_render_format(ctx,
> intel_rb_format(irb));
> > +   /* BRW_NEW_FS_PROG_DATA */
> > uint32_t surf_index =
> >brw->wm.prog_data->binding_table.render_target_start + unit;
> >
> > @@ -737,7 +739,8 @@ const struct brw_tracked_state
> brw_renderbuffer_surfaces = {
> > .dirty = {
> >.mesa = _NEW_BUFFERS |
> >_NEW_COLOR,
> > -  .brw = BRW_NEW_BATCH,
> > +  .brw = BRW_NEW_BATCH |
> > + BRW_NEW_FS_PROG_DATA,
> > },
> > .emit = brw_update_renderbuffer_surfaces,
> >  };
> > @@ -763,6 +766,8 @@ update_stage_texture_surfaces(struct brw_context
> *brw,
> > struct gl_context *ctx = &brw->ctx;
> >
> > uint32_t *surf_offset = stage_state->surf_offset;
> > +
> > +   /* BRW_NEW_*_PROG_DATA */
> > if (for_gather)
> >surf_offset +=
> stage_state->prog_data->binding_table.gather_texture_start;
> > else
> > @@ -824,9 +829,12 @@ const struct brw_tracked_state brw_texture_surfaces
> = {
> >.mesa = _NEW_TEXTURE,
> >.brw = BRW_NEW_BATCH |
> >   BRW_NEW_FRAGMENT_PROGRAM |
> > + BRW_NEW_FS_PROG_DATA |
> >   BRW_NEW_GEOMETRY_PROGRAM |
> > + BRW_NEW_GS_PROG_DATA |
> >   BRW_NEW_TEXTURE_BUFFER |
> > - BRW_NEW_VERTEX_PROGRAM,
> > + BRW_NEW_VERTEX_PROGRAM |
> > + BRW_NEW_VS_PROG_DATA,
> > },
> > .emit = brw_update_texture_surfaces,
> >  };
> > --
> > 2.2.1
> >
> > ___
> > mesa-stable mailing list
> > mesa-sta...@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-stable
>


-- 

 Mike Stroyan - Software Architect
 LunarG, Inc.  - The Graphics Experts
 Cell:  (970) 219-7905
 Email: m...@lunarg.com
 Website: http://www.lunarg.com

Re: [Mesa-dev] [PATCH] i965: Set dirty bit for NOS fragment shader change

2014-12-23 Thread Mike Stroyan

Ah, I see.
I was assuming that a necessary dirty bit had not been set.
But the right dirty bit had been set and was being ignored.
Watching the right dirty bit is going to look a bit different in 10.3 or
10.4.
There it will need to use ".cache = CACHE_NEW_WM_PROG" in brw_texture_surfaces
instead of adding BRW_NEW_FS_PROG_DATA to the brw mask.
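
To make the "set but ignored" case concrete, here is a minimal sketch of
the dirty-bit mechanism discussed in this thread. It is a toy model, not
the driver code: every toy_* / TOY_* name is hypothetical, and only the
idea (the cache search flags a bit when the selected entry changes, and an
atom runs only if its mask contains a flagged bit) comes from the
discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical dirty bits; the real driver defines many more. */
enum {
   TOY_NEW_BATCH        = 1u << 0,
   TOY_NEW_FS_PROGRAM   = 1u << 1, /* API-level program changed */
   TOY_NEW_FS_PROG_DATA = 1u << 2, /* selected cache entry changed */
};

/* Hypothetical state atom: a mask of bits it listens to plus an emit hook. */
struct toy_atom {
   uint32_t dirty;
   void (*emit)(void *state);
};

/* Example emit hook: counts how often the atom was re-emitted. */
static void
toy_count_emit(void *state)
{
   ++*(unsigned *)state;
}

/* Models the cache search: flag PROG_DATA only when the chosen entry moves. */
static uint32_t
toy_select_program(uint32_t *inout_offset, uint32_t found_offset)
{
   uint32_t flagged = 0;

   if (found_offset != *inout_offset) {
      flagged = TOY_NEW_FS_PROG_DATA;
      *inout_offset = found_offset;
   }
   return flagged;
}

/* Emit every atom whose mask intersects the pending dirty bits.
 * Returns the number of atoms that were emitted.
 */
static unsigned
toy_upload_state(const struct toy_atom *atoms, size_t count,
                 uint32_t dirty, void *state)
{
   unsigned emitted = 0;

   for (size_t i = 0; i < count; i++) {
      if (atoms[i].dirty & dirty) {
         assert(atoms[i].emit != NULL);
         atoms[i].emit(state);
         emitted++;
      }
   }
   return emitted;
}
```

An atom whose mask is only TOY_NEW_BATCH | TOY_NEW_FS_PROGRAM is skipped
when toy_select_program() reports a new cache entry, even though the bit
was flagged correctly. Adding TOY_NEW_FS_PROG_DATA to the mask is what
makes it re-emit.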


On Mon, Dec 22, 2014 at 8:09 PM, Kenneth Graunke 
wrote:

> On Monday, December 22, 2014 05:22:06 PM Mike Stroyan wrote:
> > This patch fixes a problem I reported as
> > [Bug 87619] Changes to state such as render targets change fragment
> shader without marking it dirty.
> >
> > I sent a test that demonstrates the problem to the piglit mailing list as
> > fbo: Changing mrt binding with same shader source
>
> Thanks for tracking this down (and writing a test)!
>
> > The root cause of the problem is rather generic.
> > brw_upload_wm_prog() calls brw_search_cache() to find the right
> >  fragment shader for a particular key from brw_wm_populate_key().
> > It does not set any dirty bit for changes to the shader.
>
> It actually does.  brw_search_cache() contains:
>
>if (item->offset != *inout_offset) {
>   brw->state.dirty.brw |= (1 << cache_id);
>   *inout_offset = item->offset;
>}
>
> (1 << cache_id) corresponds to the BRW_NEW_*_PROG_DATA dirty bits
> (formerly known as CACHE_NEW_*_PROG).
>
> Looking at a call of brw_search_cache, we see that inout_offset
> corresponds to brw->wm.base.prog_offset:
>
>if (!brw_search_cache(&brw->cache, BRW_CACHE_FS_PROG,
>  &key, sizeof(key),
>  &brw->wm.base.prog_offset, &brw->wm.prog_data)) {
>
> So, if brw->wm.base.prog_offset changes, we flag BRW_NEW_FS_PROG_DATA.
> In other words, whenever we search the cache, if we select a different
> cache entry than we were using on the previous draw, we flag
> BRW_NEW_*_PROG_DATA.
>
> I explained the difference between the two dirty bits in brw_context.h:
>
> /**
>  * BRW_NEW_*_PROG_DATA and BRW_NEW_*_PROGRAM are similar, but distinct.
>  *
>  * BRW_NEW_*_PROGRAM relates to the gl_shader_program/gl_program
> structures.
>  * When the currently bound shader program differs from the previous draw
>  * call, these will be flagged.  They cover brw->{stage}_program and
>  * ctx->{Stage}Program->_Current.
>  *
>  * BRW_NEW_*_PROG_DATA is flagged when the effective shaders change, from a
>  * driver perspective.  Even if the same shader is bound at the API level,
>  * we may need to switch between multiple versions of that shader to handle
>  * changes in non-orthogonal state.
>  *
>  * Additionally, multiple shader programs may have identical vertex shaders
>  * (for example), or compile down to the same code in the backend.  We
> combine
>  * those into a single program cache entry.
>  *
>  * BRW_NEW_*_PROG_DATA occurs when switching program cache entries, which
>  * covers the brw_*_prog_data structures, and brw->*.prog_offset.
>  */
>
> Here, the problem was that brw_upload_texture_surfaces was referring to
> brw->wm.base.prog_data, but not listening to BRW_NEW_FS_PROG_DATA.
> I've sent a patch to fix that (and other similar failures in the area).
>
> I've been slowly migrating the state upload code to only use
> brw_*_prog_data
> and BRW_NEW_*_PROG_DATA, and stop looking at the Mesa program structures
> (covered by BRW_NEW_{VERTEX,GEOMETRY,FRAGMENT}_PROGRAM).  In some cases we
> just refer to the core Mesa structures when the value doesn't change
> between
> NOS specializations (i.e. does the program ever read gl_FragCoord).  I'd
> like to stop doing that - looking at the effective program makes more
> sense,
> simplifies the code, and makes it harder to botch things like this.
>
> I've got a few more patches toward that end.
>
> > There is a test in brw_upload_state() that checks for changes-
> >
> >if (brw->fragment_program != ctx->FragmentProgram._Current) {
> >   brw->fragment_program = ctx->FragmentProgram._Current;
> >   brw->state.dirty.brw |= BRW_NEW_FRAGMENT_PROGRAM;
> >}
> >
> > But that test is not looking for changes to NOS in the cache key.
> > It only sees more direct changes to the fragment program.
> >
> > Setting BRW_NEW_FRAGMENT_PROGRAM in brw_upload_wm_prog() fixes the
> > particular program that I was debugging and the piglit test I created.
> > But I wonder how many other cases occur.  There are six other callers
> > of brw_search_cache() that may not be getting the right dirty bits
> > set when the cache key changes.
>



-- 

 Mike Stroyan - Software Architect
 LunarG, Inc.  - The Graphics Experts
 Cell:  (970) 219-7905
 Email: m...@lunarg.com
 Website: http://www.lunarg.com

Re: [Mesa-dev] Compiling of shader gets stuck in infinite loop

2014-09-11 Thread Mike Stroyan

swiz x (expression float + (swiz x (expression
> float + (swiz x (expression float + (var_ref col_y) (constant float
> (0.10)) ) )(constant float (0.10)) ) )(constant float
> (0.10)) ) )(constant float (0.10)) ) )(constant float
> (0.10)) ) )(constant float (0.10)) ) )(constant float
> (0.10)) ) )(constant float (0.10)) ) )(constant float
> (0.10)) ) )(constant float (0.10)) ) )(constant float
> (0.10)) ) )(constant float (0.10)) ) )(constant float
> (0.10)) ) )(constant float (0.10)) ) )(constant float
> (0.10)) ) )(constant float (0.10)) ) )(constant float
> (0.10)) ) )(constant float (0.10)) ) )(constant float
> (0.10)) ) )(constant float (0.10)) ) )(constant float
> (0.10)) ) )(constant float (0.10)) ) )(constant float
> (0.10)) ) )(constant float (0.10)) ) )(constant float
> (0.10)) ) )(constant float (0.10)) ) )(constant float
> (0.10)) ) )(constant float (0.10)) ) )(constant float
> (0.10)) ) )(constant float (0.50)) ) )
>
> And when we feed these to do_constant_folding() it takes forever to
> finish. For this shader in particular, removing the tree grafting pass
> from do_common_optimization eliminates the problem.
>
> Notice that small, seemingly irrelevant changes to the shader code, can
> make it so that this never happens. For example, if we initialize 'col'
> to something like vec4(0,0,0,0) instead of using the texture function,
> or we remove the division by 2.0 in the last assignment to 'col', these
> instructions are never produced and the shader compiles okay.
>
> The number of iterations in the loop is also important, if we have too
> many we do not unroll the loop and the problem never happens, if we have
> too few, rather than generating a super large tree of expressions like
> above, we generate something like this and the problem, again, does not
> happen: (notice how it adds 0.1 nine times to make 0.9 rather than
> chaining 9 add expressions for 10 iterations of the loop):
>
> (assign  (x) (var_ref flattening_tmp_y)  (expression float * (expression
> float + (constant float (0.90)) (var_ref col_y) ) (constant float
> (0.50)) ) )
>
> So it seems that whether we generate a huge chunk of expressions or not
> is subject to a number of factors, but when the right conditions are met
> we can generate code that can stall compilation forever.
>
> Reading what tree grafting is supposed to do, this does not seem to be
> an unexpected result though, so I wonder what would be the right way to
> fix this. It would look like we would want to do whatever we are doing
> when we only have a few iterations in the loop, but I don't know why we
> generate different code in that case and I am not familiar enough with
> all the optimization and lowering passes to assess what would make sense
> to do here... so, any suggestions?
>
> Iago
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>



-- 

 Mike Stroyan - Software Architect
 LunarG, Inc.  - The Graphics Experts
 Cell:  (970) 219-7905
 Email: m...@lunarg.com
 Website: http://www.lunarg.com

Re: [Mesa-dev] Compiling of shader gets stuck in infinite loop

2014-09-12 Thread Mike Stroyan

This extremely slow compilation is not actually an infinite loop.
But the compile time does increase with every unrolled loop step in the
shader.
The time to complete grows as 2^N, where N is the number of loop iterations.

The call to
 (*rvalue)->accept(this);
in ir_constant_folding_visitor::handle_rvalue is key to this.
Dropping that call for the case when rvalue is not a constant makes
compilation
finish very quickly.  And for at least this shader it produces exactly the
same results.  Constant folding is done very effectively for the y and z
channels.

But the x channel still produces a series of adds of constants instead of
one add with the sum.
That is a separate issue that could still be investigated.
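
To illustrate where the 2^N comes from, here is a small sketch. It is a
toy model and not the Mesa visitor: the toy_* names are hypothetical, and
it assumes only what is described above, namely that each expression level
ends up traversing its nested operand twice (once in the normal traversal
and once more through the extra accept() call).

```c
#include <assert.h>
#include <stdlib.h>

/* Toy expression node: a chain of adds, each with one nested operand. */
struct toy_expr {
   struct toy_expr *operand; /* NULL for the innermost leaf */
};

static unsigned long toy_visits;

/* Visit a node.  When revisit is nonzero, the operand is traversed a
 * second time, which models the extra accept() call on the rvalue.
 */
static void
toy_visit(const struct toy_expr *e, int revisit)
{
   toy_visits++;
   if (!e->operand)
      return;

   toy_visit(e->operand, revisit);    /* normal traversal */
   if (revisit)
      toy_visit(e->operand, revisit); /* extra traversal */
}

/* Build a chain with 'depth' nested operands and count the node visits. */
static unsigned long
toy_count_visits(unsigned depth, int revisit)
{
   struct toy_expr *nodes = calloc(depth + 1, sizeof(*nodes));

   assert(nodes != NULL);
   for (unsigned i = 0; i < depth; i++)
      nodes[i].operand = &nodes[i + 1];

   toy_visits = 0;
   toy_visit(&nodes[0], revisit);
   free(nodes);
   return toy_visits;
}
```

With a chain of depth 10, toy_count_visits(10, 0) reports 11 visits, while
toy_count_visits(10, 1) reports 2047. Each extra level doubles the work,
which matches the compile time doubling per loop iteration.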

On Thu, Sep 11, 2014 at 1:53 PM, Mike Stroyan  wrote:

> I have looked at this problem quite a bit but never got to the bottom of
> it.
> This problem really started to show with commit 857f3a6 - "glsl: Ignore
> loop-too-large heuristic if there's bad variable indexing."
> That commit makes many more loops unroll.
> Here is another example piglit shader_runner test that shows the problem.
> Changing the value of LOOP_COUNT and running this with "time shader_runner
> -auto"
> shows that the compile time doubles each time the loop count is
> incremented by one.
> Large values may seem to take forever.  But they do eventually finish.
> Loop counts over 32 will still prevent unrolling and avoid the slow
> compile.
>
> A key part of the problem is the assignment to "col.rgb" in your shader or
> "tmpvar_3.xyz" in this shader.
> The operation on only some channels results in splitting the vec4 into one
> temporary per channel.
> This comment from src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp is
> telling.
>  * If a vector is only ever referenced by its components, then
>  * split those components out to individual variables so they can be
>  * handled normally by other optimization passes.
>
> brw_do_vector_splitting creates the flattening_tmp_y and flattening_tmp_z
> temporaries.
> Operations on one of the channels are optimized quickly.
> But the other two channels are handled badly.
> The operations on the first channel prevent the same simplification of the
> expressions for the other two channels.
>
> Changing ir_vector_splitting_visitor::visit_leave to use "writemask = 1 <<
> i;" instead of "writemask = 1;"
> in the "if (lhs)" case makes the y and z channels get handled like the x
> channel.
> That results in something like
>   (assign  (y) (var_ref flattening_tmp_y)  (expression float * (swiz y
> (var_ref texture2D_retval) )(var_ref channel_expressions@8114) ) )
> It is very fast to compile, but produces bad code that hangs the GPU.
> It is putting the y channel float value into a non-existent "y" channel of
> a simple float temporary, then later reading the real x channel.
>
> [require]
> GLSL >= 1.10
>
> [vertex shader]
> #version 120
> attribute vec2 Tex0;
> attribute vec3 Position;
> void main ()
> {
>   vec4 inPos_1;
>   inPos_1.xy = Position.xy;
>   inPos_1.z = 1.0;
>   inPos_1.w = 1.0;
>   gl_Position = inPos_1;
>   vec4 tmpvar_2;
>   tmpvar_2.zw = vec2(0.0, 0.0);
>   tmpvar_2.xy = Tex0;
>   gl_TexCoord[0] = tmpvar_2;
> }
>
> [fragment shader]
> #version 120
> #define LOOP_COUNT 25
> uniform sampler2D u_sampler;
> void main ()
> {
>   vec2 tmpvar_1;
>   tmpvar_1 = gl_TexCoord[0].xy;
>   vec4 tmpvar_3;
>   tmpvar_3 = vec4(0.0, 0.0, 0.0, 1.0);
>   float weighting_5[LOOP_COUNT];
>   for (int i = 0; i < LOOP_COUNT; i++) {
>     float tmpvar_10;
>     tmpvar_10 = ((float(int(abs ((float(i) - 15.0))))) / 15.0);
>     float tmpvar_11;
>     tmpvar_11 = exp ((-(tmpvar_10) * tmpvar_10));
>     weighting_5[i] = tmpvar_11;
>   };
>   for (int k = 0; k < LOOP_COUNT; k++) {
>     tmpvar_3.xyz += (texture2D (u_sampler, tmpvar_1).xyz * weighting_5[k]);
>   };
>   gl_FragData[0] = tmpvar_3;
> }
>
> [test]
> draw rect -1 -1 2 2
> probe rgb 1 1 0.0 0.0 0.0
>
>
> On Thu, Sep 11, 2014 at 2:02 AM, Iago Toral Quiroga 
> wrote:
>
>> Hi,
>>
>> I have been looking into this bug:
>>
>> Compiling of shader gets stuck in infinite loop
>> https://bugs.freedesktop.org/show_bug.cgi?id=78468
>>
>> Although this occurs at link time when the Intel driver has run some of
>> its specific lowering passes, it looks like the problem could hit other
>> drivers if the right conditions are met, as the actual problem happens
>> inside common optimization passes.
>>
>> I reproduced the problem with a very simple shader li
