date:20150622

Re: [Mesa-dev] abundance of branches in mesa.git

2015-06-22 Thread Marek Olšák

On Mon, Jun 22, 2015 at 5:36 AM, Ilia Mirkin  wrote:
> On Sun, Jun 21, 2015 at 11:33 PM, Michel Dänzer  wrote:
>> On 22.06.2015 00:31, Ilia Mirkin wrote:
>>> On Sun, Jun 21, 2015 at 12:22 PM, Emil Velikov wrote:
>>>> On 20/06/15 10:01, Eirik Byrkjeflot Anonsen wrote:
>>>>> Ilia Mirkin  writes:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> There are a *ton* of branches in the upstream mesa git. Here is a full
>>>>>> list:
>>>>>>
>>>>> [...]
>>>>>> is there
>>>>>> any reason to keep these around with the exception of:
>>>>>>
>>>>>> master
>>>>>> $version (i.e. 9.0, 10.0, mesa_7_7_branch, etc)
>>>>>
>>>>> Instead of outright deleting old branches, it would be possible to set
>>>>> up an "archive" repository which mirrors all branches of the main
>>>>> repository. And then delete "obsolete" branches only from the main
>>>>> repository. Ideally, you would want a git hook to refuse to create a new
>>>>> branch (in the main repository) if a branch by that name already exists
>>>>> in the archive repository. Possibly with the exception that creating a
>>>>> same-named branch on the same commit would be allowed.
>>>>>
>>>>> (And the same for tags, of course)
>>>>>
>>>> Personally I am fine with either approach - stay/nuke/move. But I'm
>>>> thinking that having a mix of the two suggestions might be a nice middle
>>>> ground.
>>>>
>>>> Write a script that nukes branches that are merged in master (check the
>>>> top commit of the branch) and have an 'archive' repo that contains
>>>> everything else (minus the stable branches).
>>
>> Sounds good to me, FWIW.
>>
>>
>>> That still leaves a ton around, and curiously removes mesa_7_5 and mesa_7_6.
>>
>> I think the latter is expected, we were using a different branching
>> model back in those days.
>>
>>
>>>origin/amdgpu
>>
>> Note that this is a currently active branch, to be merged to master soon.
>
> Perhaps there's something I don't understand, but why is a feature
> branch made available on the shared tree? In my view of things the
> only branches on the shared mesa.git tree should be the version
> branches.

As you can see, a lot of feature branches are in the shared tree
already, so there is a precedent. Sharing a branch among people in
this way sometimes tends to be more convenient.

The reason here is that it's the only mesa repository where most
people from our team have commit access.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC] i965: Don't consider uniform value locations in program uploads

2015-06-22 Thread Pohjolainen, Topi

On Thu, Jun 04, 2015 at 05:35:11PM -0700, Ben Widawsky wrote:
> On Wed, Jun 03, 2015 at 09:32:55PM +0300, Pohjolainen, Topi wrote:
> > On Wed, Jun 03, 2015 at 09:21:11PM +0300, Topi Pohjolainen wrote:
> > > Shader programs are cached per stage (FS, VS, GS) using the
> > > corresponding shader source identifier and compile time choices
> > > as key. However, one not only stores the program binary but
> > > a pair consisting of program binary and program data. The latter
> > > represents the store of constants (such as uniforms) used by
> > > the program.
> > > 
> > > However, when programs are searched in the cache for reloading
> > > only the program key representing the binary is considered
> > > (see for example, brw_upload_wm_prog() and brw_search_cache()).
> > > Hence, when programs are re-loaded from cache the first program
> > > binary, program data pair is extracted without considering if
> > > the program data matches the currently in use uniform storage
> > > as well.
> > > 
> > > My reasoning for why this actually works is that the key
> > > contains the identifier of the corresponding gl_program that
> > > represents the source code for the shader program. Hence,
> > > two programs having identical source code still have unique
> > > keys.
> > > And therefore brw_try_upload_using_copy() never encounters
> > > a case where a matching binary is found but the program data
> > > doesn't match.
> > 
> > In fact, thinking some more I think this is possible when the
> > same, say fragment shader, is used with two different vertex
> > shaders. This results in there being matching binaries but
> > program data pointing to different storage. Looking at
> > brw_upload_cache() I still can't see how failing
> > brw_try_upload_using_copy() makes a difference. We only upload
> > the program binary again (even though that is the part that
> > actually matches). And then proceed the same way regardless
> > of the result of brw_try_upload_using_copy(). The program data
> > gets augmented with the key.
> > 
> > But the point remains that when a program is reloaded through
> > the brw_search_cache() only the key (and not the program data)
> > is considered returning the first matching pair.
> > 
> > I probably need to write a piglit test for this.
> > 
> > > 
> > > My ultimate goal is to stop storing pointers to the individual
> > > components of a uniform but to store only a pointer to the
> > > "struct gl_uniform_storage" instead, and allow
> > > gen6_upload_push_constants() to iterate over individual
> > > components and array elements. This is needed to be able to
> > > convert 32-bit floats to fp16 - otherwise there is only a
> > > pointer to 32 bits without knowing its type (int, float, etc)
> > > let alone its target precision.
> > > 
> > > No regression in jenkins. However, we talked about this with
> > > Ken and this doesn't really tell much as piglit doesn't really
> > > re-use shader sources during one execution.
> > > 
> > > Signed-off-by: Topi Pohjolainen 
> > > CC: Kenneth Graunke 
> > > CC: Tapani Pälli 
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_program.c | 6 ------
> > >  1 file changed, 6 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
> > > index e5c0d3c..7f5fde8 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_program.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_program.c
> > > @@ -576,12 +576,6 @@ brw_stage_prog_data_compare(const struct brw_stage_prog_data *a,
> > >     if (memcmp(a, b, offsetof(struct brw_stage_prog_data, param)))
> > >        return false;
> > >  
> > > -   if (memcmp(a->param, b->param, a->nr_params * sizeof(void *)))
> > > -      return false;
> > > -
> > > -   if (memcmp(a->pull_param, b->pull_param, a->nr_pull_params * sizeof(void *)))
> > > -      return false;
> > > -
> > >     return true;
> > >  }
> > >  
> 
> I am looking at a lot of this code for the first time, and I have a kind of
> wild guess.
> 
> The first time you upload a program, the program (kinda annoying that
> brw_upload_item_data doesn't seem to actually do that). Malloc a pointer (tmp,
> item->key), store the program and aux there. Set that pointer as the key.
> 
> The aux data lives at key + key_size.
> 
> Indeed search_cache() only checks the key. For WM it does contain the
> urb_entry data that I think would change if number of uniforms differed. So
> for your example above with 2 VS sharing an FS, if the number of uniforms are
> the same, then the program should be identical in the FS, right? Similarly for
> the GS with input_varyings. I think generally this is the behavior you'd want.
> 
> brw_try_upload_using_copy() seems correct to me as it does do the aux_compare
> (and falls back to memcmp).

Well, I've been looking at this quite a bit now, and I'm still somewhat confused
about what brw_upload_cache() tries to achieve with brw_try_upload_using_copy().

If you check brw_try_upload_using_copy() you
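
To make the concern in this thread concrete, here is a minimal sketch of the difference between a key-only cache search and one that also compares the auxiliary (program) data. It is not the i965 code: struct cache_item, search_by_key() and search_by_key_and_aux() are hypothetical names, and the real cache is a hash table rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified item; not the real i965 cache item layout. */
struct cache_item {
   const void *key;
   size_t key_size;
   const void *aux;   /* stands in for the program data */
   size_t aux_size;
};

/* Key-only search: the first item with a matching key wins,
 * the aux data is never consulted. */
static const struct cache_item *
search_by_key(const struct cache_item *items, size_t count,
              const void *key, size_t key_size)
{
   assert(key != NULL && key_size > 0);
   for (size_t i = 0; i < count; i++) {
      if (items[i].key_size == key_size &&
          memcmp(items[i].key, key, key_size) == 0)
         return &items[i];
   }
   return NULL;
}

/* Stricter search: both the key and the aux data must match. */
static const struct cache_item *
search_by_key_and_aux(const struct cache_item *items, size_t count,
                      const void *key, size_t key_size,
                      const void *aux, size_t aux_size)
{
   assert(key != NULL && key_size > 0);
   assert(aux != NULL && aux_size > 0);
   for (size_t i = 0; i < count; i++) {
      if (items[i].key_size == key_size &&
          items[i].aux_size == aux_size &&
          memcmp(items[i].key, key, key_size) == 0 &&
          memcmp(items[i].aux, aux, aux_size) == 0)
         return &items[i];
   }
   return NULL;
}
```

With two items that share a key but carry different aux data, the key-only search always hands back the first one, which is the situation described above for a reload through brw_search_cache().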

Re: [Mesa-dev] [PATCH] tgsi: handle indirect sampler arrays. (v2)

2015-06-22 Thread Roland Scheidegger

Should there be some clamping somewhere to prevent crashes due to
out-of-bound unit index?
In any case,
Reviewed-by: Roland Scheidegger 

Am 22.06.2015 um 05:18 schrieb Dave Airlie:
> This is required for ARB_gpu_shader5 support in softpipe.
> 
> v2: add support to txd/txf/txq paths.
> 
> Signed-off-by: Dave Airlie 
> ---
> 
>  src/gallium/auxiliary/tgsi/tgsi_exec.c | 42 
> ++
>  1 file changed, 38 insertions(+), 4 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
> index fde99b9..44000ff 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
> @@ -1988,6 +1988,35 @@ fetch_assign_deriv_channel(struct tgsi_exec_machine *mach,
> derivs[1][3] = d.f[3];
>  }
>  
> +static uint
> +fetch_sampler_unit(struct tgsi_exec_machine *mach,
> +   const struct tgsi_full_instruction *inst,
> +   uint sampler)
> +{
> +   uint unit;
> +
> +   if (inst->Src[sampler].Register.Indirect) {
> +  const struct tgsi_full_src_register *reg = &inst->Src[sampler];
> +  union tgsi_exec_channel indir_index, index2;
> +
> +  index2.i[0] =
> +  index2.i[1] =
> +  index2.i[2] =
> +  index2.i[3] = reg->Indirect.Index;
> +
> +  fetch_src_file_channel(mach,
> + 0,
> + reg->Indirect.File,
> + reg->Indirect.Swizzle,
> + &index2,
> + &ZeroVec,
> + &indir_index);
> +  unit = inst->Src[sampler].Register.Index + indir_index.i[0];
> +   } else {
> +  unit = inst->Src[sampler].Register.Index;
> +   }
> +   return unit;
> +}
>  
>  /*
>   * execute a texture instruction.
> @@ -2001,14 +2030,15 @@ exec_tex(struct tgsi_exec_machine *mach,
>   const struct tgsi_full_instruction *inst,
>   uint modifier, uint sampler)
>  {
> -   const uint unit = inst->Src[sampler].Register.Index;
> const union tgsi_exec_channel *args[5], *proj = NULL;
> union tgsi_exec_channel r[5];
> enum tgsi_sampler_control control =  tgsi_sampler_lod_none;
> uint chan;
> +   uint unit;
> int8_t offsets[3];
> int dim, shadow_ref, i;
>  
> +   unit = fetch_sampler_unit(mach, inst, sampler);
> /* always fetch all 3 offsets, overkill but keeps code simple */
> fetch_texel_offsets(mach, inst, offsets);
>  
> @@ -2107,12 +2137,13 @@ static void
>  exec_txd(struct tgsi_exec_machine *mach,
>   const struct tgsi_full_instruction *inst)
>  {
> -   const uint unit = inst->Src[3].Register.Index;
> union tgsi_exec_channel r[4];
> float derivs[3][2][TGSI_QUAD_SIZE];
> uint chan;
> +   uint unit;
> int8_t offsets[3];
>  
> +   unit = fetch_sampler_unit(mach, inst, 3);
> /* always fetch all 3 offsets, overkill but keeps code simple */
> fetch_texel_offsets(mach, inst, offsets);
>  
> @@ -2214,14 +2245,15 @@ static void
>  exec_txf(struct tgsi_exec_machine *mach,
>   const struct tgsi_full_instruction *inst)
>  {
> -   const uint unit = inst->Src[1].Register.Index;
> union tgsi_exec_channel r[4];
> uint chan;
> +   uint unit;
> float rgba[TGSI_NUM_CHANNELS][TGSI_QUAD_SIZE];
> int j;
> int8_t offsets[3];
> unsigned target;
>  
> +   unit = fetch_sampler_unit(mach, inst, 1);
> /* always fetch all 3 offsets, overkill but keeps code simple */
> fetch_texel_offsets(mach, inst, offsets);
>  
> @@ -2296,12 +2328,14 @@ static void
>  exec_txq(struct tgsi_exec_machine *mach,
>   const struct tgsi_full_instruction *inst)
>  {
> -   const uint unit = inst->Src[1].Register.Index;
> int result[4];
> union tgsi_exec_channel r[4], src;
> uint chan;
> +   uint unit;
> int i,j;
>  
> +   unit = fetch_sampler_unit(mach, inst, 1);
> +
> fetch_source(mach, &src, &inst->Src[0], TGSI_CHAN_X, TGSI_EXEC_DATA_INT);
>  
> /* XXX: This interface can't return per-pixel values */
> 
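
On the clamping question raised at the top of this reply: one possible shape for it is sketched below. This is only an illustration, not part of the patch. MAX_SAMPLER_UNITS and clamp_sampler_unit() are hypothetical names, and the real limit would have to come from whatever bound the executor uses for its sampler arrays.

```c
#include <assert.h>

/* Hypothetical limit, chosen only for this sketch. */
#define MAX_SAMPLER_UNITS 32

/* Add an indirect offset to a base sampler index and clamp the result
 * to [0, MAX_SAMPLER_UNITS - 1] so it can safely index a sampler array. */
static unsigned
clamp_sampler_unit(int base_index, int indirect_offset)
{
   assert(base_index >= 0 && base_index < MAX_SAMPLER_UNITS);

   /* Widen before adding so a large offset cannot overflow int. */
   long long unit = (long long)base_index + (long long)indirect_offset;

   if (unit < 0)
      return 0;
   if (unit >= MAX_SAMPLER_UNITS)
      return MAX_SAMPLER_UNITS - 1;
   return (unsigned)unit;
}
```

In fetch_sampler_unit() the equivalent check would apply to Register.Index + indir_index.i[0] before the value is returned.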


Re: [Mesa-dev] [PATCH 1/2] i965/gen9: Implement Push Constant Buffer workaround

2015-06-22 Thread Rantala, Valtteri

Ran multiple test cases, multiple times each, that had been triggering GPU
hangs. Applying this patch fixed the GPU hang issues on SKL.

Tested-by: Valtteri Rantala 

> -Original Message-
> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf
> Of Anuj Phogat
> Sent: Friday, June 19, 2015 4:27 AM
> To: Widawsky, Benjamin
> Cc: mesa-dev; Deak, Imre; Phogat, Anuj; Ben Widawsky
> Subject: Re: [Mesa-dev] [PATCH 1/2] i965/gen9: Implement Push Constant
> Buffer workaround
> 
> On Wed, Jun 3, 2015 at 9:35 PM, Ben Widawsky
>  wrote:
> > This implements a workaround (exact excerpt as a comment in the code).
> > The docs specify [clearly, after you struggle for a while] that the
> > offset isn't relative to state base. This actually makes sense.
> >
> > Buffer #0 is meant to be used for normal uniforms.
> > Buffer #1 is typically used for gather constants when using RS.
> > Buffer #1-#3 could be used to push a bunch of UBO data which would just be
> >   somewhere in memory, and not relative to the dynamic state.
> >
> > NOTE: I've moved away from the ternary operator for the new gen9
> > conditions. Admittedly it's probably not great to do this, but I really
> > want to fix this all up in the subsequent patch and doing it here makes
> > that diff a lot nicer. I want to split out the gen8/9 code to make the
> > function a bit more readable, but to keep this easily cherry-pickable
> > I am doing this fix first. If we decide not to merge the cleanup patch
> > then I can revisit this.
> >
> > Anuj ran this on his SKL and said there were no fixes or regressions.
> > There is some hope it fixes BXT issues.
> >
> > Cc: Imre Deak 
> > Cc: Neil Roberts 
> > Cc: Anuj Phogat 
> > Signed-off-by: Ben Widawsky 
> > ---
> >  src/mesa/drivers/dri/i965/gen7_vs_state.c | 48
> > ++-
> >  1 file changed, 41 insertions(+), 7 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/gen7_vs_state.c
> > b/src/mesa/drivers/dri/i965/gen7_vs_state.c
> > index 278b3ec..4b17d06 100644
> > --- a/src/mesa/drivers/dri/i965/gen7_vs_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen7_vs_state.c
> > @@ -43,18 +43,52 @@ gen7_upload_constant_state(struct brw_context *brw,
> > int dwords = brw->gen >= 8 ? 11 : 7;
> > BEGIN_BATCH(dwords);
> > OUT_BATCH(opcode << 16 | (dwords - 2));
> > -   OUT_BATCH(active ? stage_state->push_const_size : 0);
> > -   OUT_BATCH(0);
> > +
> > +   /* Workaround for SKL+ (we use option #2 until we have a need for more
> > +    * constant buffers). This comes from the documentation for 3DSTATE_CONSTANT_*
> > +    *
> > +    * The driver must ensure The following case does not occur without a flush
> > +    * to the 3D engine: 3DSTATE_CONSTANT_* with buffer 3 read length equal to
> > +    * zero committed followed by a 3DSTATE_CONSTANT_* with buffer 0 read length
> > +    * not equal to zero committed. Possible ways to avoid this condition
> > +    * include:
> > +    *     1. always force buffer 3 to have a non zero read length
> > +    *     2. always force buffer 0 to a zero read length
> > +    */
> > +   if (brw->gen >= 9 && active) {
> > +  OUT_BATCH(0);
> > +  OUT_BATCH(stage_state->push_const_size);
> > +   } else {
> > +  OUT_BATCH(active ? stage_state->push_const_size : 0);
> > +  OUT_BATCH(0);
> > +   }
> > /* Pointer to the constant buffer.  Covered by the set of state flags
> >  * from gen6_prepare_wm_contants
> >  */
> > -   OUT_BATCH(active ? (stage_state->push_const_offset | mocs) : 0);
> > -   OUT_BATCH(0);
> > -   OUT_BATCH(0);
> > -   OUT_BATCH(0);
> > -   if (brw->gen >= 8) {
> > +   if (brw->gen >= 9 && active) {
> > +  OUT_BATCH(0);
> > +  OUT_BATCH(0);
> > +  OUT_BATCH(0);
> > +  OUT_BATCH(0);
> > +  /* XXX: When using buffers other than 0, you need to specify the
> > +   * graphics virtual address regardless of INSPM/debug bits
> INSTPM
> > +   */
> > +  OUT_RELOC64(brw->batch.bo, I915_GEM_DOMAIN_RENDER, 0,
> > +  stage_state->push_const_offset);
> >OUT_BATCH(0);
> >OUT_BATCH(0);
> > +   } else if (brw->gen>= 8) {
> > +  OUT_BATCH(active ? (stage_state->push_const_offset | mocs) : 0);
> > +  OUT_BATCH(0);
> > +  OUT_BATCH(0);
> > +  OUT_BATCH(0);
> > +  OUT_BATCH(0);
> > +  OUT_BATCH(0);
> > +  OUT_BATCH(0);
> > +  OUT_BATCH(0);
> > +   } else {
> > +  OUT_BATCH(active ? (stage_state->push_const_offset | mocs) : 0);
> > +  OUT_BATCH(0);
> >OUT_BATCH(0);
> >OUT_BATCH(0);
> > }
> > --
> > 2.4.2
> >
> 
> Verified with the spec. LGTM.
> 
> Reviewed-by: Anuj Phogat 

Re: [Mesa-dev] abundance of branches in mesa.git

2015-06-22 Thread Tom Stellard

On Mon, Jun 22, 2015 at 12:23:54PM +0200, Marek Olšák wrote:
> As you can see, a lot of feature branches are in the shared tree
> already, so there is a precedent. Sharing a branch among people in
> this way sometimes tends to be more convenient.
> 
> The reason here is that it's the only mesa repository where most
> people from our team have commit access.
> 

Also, the shared git tree supports https access, which means it is
accessible when behind a firewall.

-Tom

> Marek

Re: [Mesa-dev] abundance of branches in mesa.git

2015-06-22 Thread Ilia Mirkin

On Mon, Jun 22, 2015 at 9:39 AM, Tom Stellard  wrote:
> Also, the shared git tree supports https access, which means it is
> accessible when behind a firewall.

OK, well if that's the prevailing attitude, then I'm on a fool's
errand, and I'll just drop this.

  -ilia

Re: [Mesa-dev] [PATCH 00/11] glapi fixes - build whole of mesa with

2015-06-22 Thread Jose Fonseca


On 19/06/15 23:09, Emil Velikov wrote:
> On 19 June 2015 at 21:26, Jose Fonseca wrote:
>> On 19/06/15 20:56, Emil Velikov wrote:
>>> Hi all,
>>>
>>> A lovely series inspired (more like 'was awakened to send these out') by
>>> Pal Rohár, who was having issues when building xlib-libgl (plus the now
>>> enabled gles*)
>>>
>>> So here, we teach the final two static glapi users about shared-glapi,
>>> plus some related fixes. After this is done we can finally start
>>> transitioning to shared-only glapi, with some more details as mentioned
>>> in one of the patches:
>>>
>>>   XXX: With this one done, we can finally transition with enforcing
>>>   shared-glapi, and
>>>
>>>    - link the dri modules against libglapi.so, add --no-undefined to
>>>   the LDFLAGS
>>>    - drop the dlopen(libglapi.so/libGL.so, RTLD_GLOBAL) workarounds
>>>   in the loaders - libGL, libEGL and libgbm.
>>>    - start killing off/cleaning up the dispatch ?
>>>
>>>   The caveats:
>>>   1) up to what stage do we care about static libraries
>>>    - libgl (either dri or xlib based)
>>>    - osmesa
>>>    - libEGL
>>>
>>>   2) how about other platforms (scons) ?
>>>    - currently the scons uses static glapi,
>>>    - would we need the dlopen(...) on windows ?
>>>
>>> Hope everyone is as excited about this one as I am :-)
>>
>> Maybe I missed the context of these changes, but why does this matter, or
>> why is it an improvement?
>
> If one goes the extra mile (which this series doesn't) - one configure
> option less, some substantial code de-duplication and consistent use
> of the code amongst all components provided. This way any
> improvements/cleanups made to the shared glapi will be available to
> osmesa/xlib-libgl.

I'm perfectly happy with removing the configure option.

And I understand the benefits of unified code paths, but I believe that
for this particular case, the difference in requirements really demands
the separate code paths.

>> In summary, having the ability of using a shared glapi sounds great, but
>> forcing shared glapi everywhere, sounds a bad idea.
>
> I'm suspecting that people might be keen on the following idea - use
> static glapi for osmesa/xlib-libgl and shared one everywhere else?

Yes, that sounds reasonable to me.  (Needs libgl-gdi too.)

> I fear that this will lead to further separation/bit-rot between the
> different implementations, but it seems like the best compromise.


I don't feel strongly between: a) using the same source code for both 
static/shared glapi (switched by a pre-processor define), or b) only 
share the interface but have shared/static glapi implementations.  I'm 
actually not that familiar with that code.
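
As a rough picture of option a), one source file can be built twice with a pre-processor define choosing the flavour. The sketch below is purely illustrative: HYPOTHETICAL_SHARED_GLAPI, GLAPI_EXPORT_SKETCH and glapi_flavour() are made-up names, not anything that exists in the glapi sources.

```c
#include <assert.h>

/* Hypothetical switch: defined by the build for the shared target,
 * left undefined for the static target. */
#ifdef HYPOTHETICAL_SHARED_GLAPI
#define GLAPI_EXPORT_SKETCH __attribute__((visibility("default")))
#define GLAPI_FLAVOUR "shared"
#else
#define GLAPI_EXPORT_SKETCH
#define GLAPI_FLAVOUR "static"
#endif

/* Same function body in both builds; only export and label differ. */
GLAPI_EXPORT_SKETCH const char *
glapi_flavour(void)
{
   const char *flavour = GLAPI_FLAVOUR;
   assert(flavour[0] == 's');
   return flavour;
}
```

Option b) would instead keep two separate implementation files behind one shared header, which is what would allow the static one to drop the assembly paths.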



Either way, we can have two glapi build targets (a shared-glapi and a 
static-glapi) side-by-side, so that there are no more source-wide 
configure flags.



I believe a lot of the complexity of that code comes from assembly.  I 
wonder if it's really justified nowadays (and even if it is, whether it 
would be better served with GNU C assembly). Furthermore, I believe on 
Windows we don't use any assembly, so if we split shared/static glapi source 
code, we could probably abandon assembly from the static-glapi.



Jose

[Mesa-dev] [PATCH] mesa: use _mesa_lookup_enum_by_nr() in print_array()

2015-06-22 Thread Brian Paul

Print GL_FLOAT, etc. instead of hex value.
---
 src/mesa/main/varray.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
index 7389037..ebdd9ea 100644
--- a/src/mesa/main/varray.c
+++ b/src/mesa/main/varray.c
@@ -2309,10 +2309,10 @@ print_array(const char *name, GLint index, const struct gl_client_array *array)
       fprintf(stderr, "  %s[%d]: ", name, index);
    else
       fprintf(stderr, "  %s: ", name);
-   fprintf(stderr, "Ptr=%p, Type=0x%x, Size=%d, ElemSize=%u, Stride=%d, Buffer=%u(Size %lu)\n",
- array->Ptr, array->Type, array->Size,
- array->_ElementSize, array->StrideB,
- array->BufferObj->Name, (unsigned long) array->BufferObj->Size);
+   fprintf(stderr, "Ptr=%p, Type=%s, Size=%d, ElemSize=%u, Stride=%d, Buffer=%u(Size %lu)\n",
+   array->Ptr, _mesa_lookup_enum_by_nr(array->Type), array->Size,
+   array->_ElementSize, array->StrideB, array->BufferObj->Name,
+   (unsigned long) array->BufferObj->Size);
 }
 
 
-- 
1.9.1
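
For anyone unfamiliar with what the lookup buys here, the idea is simply a number-to-name table with a hex fallback. The sketch below is a stand-alone illustration and not the Mesa implementation: lookup_type_name() and the table are hypothetical, and only a handful of GL data type values are listed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct enum_name {
   unsigned nr;
   const char *name;
};

/* A few GL data type enums with their standard values. */
static const struct enum_name type_names[] = {
   { 0x1400, "GL_BYTE" },
   { 0x1401, "GL_UNSIGNED_BYTE" },
   { 0x1402, "GL_SHORT" },
   { 0x1403, "GL_UNSIGNED_SHORT" },
   { 0x1404, "GL_INT" },
   { 0x1405, "GL_UNSIGNED_INT" },
   { 0x1406, "GL_FLOAT" },
   { 0x140A, "GL_DOUBLE" },
};

/* Return the symbolic name for nr, or format the value as hex into the
 * caller-supplied buffer when it is not in the table. */
static const char *
lookup_type_name(unsigned nr, char *fallback, size_t fallback_size)
{
   assert(fallback != NULL && fallback_size > 0);

   for (size_t i = 0; i < sizeof(type_names) / sizeof(type_names[0]); i++) {
      if (type_names[i].nr == nr)
         return type_names[i].name;
   }
   snprintf(fallback, fallback_size, "0x%x", nr);
   return fallback;
}
```

With this, a debug print can use "Type=%s" and show GL_FLOAT instead of 0x1406, which is the effect of the change above.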


Re: [Mesa-dev] [PATCH 5/6] mesa: don't rebind constant buffers after every state change if GS is active

2015-06-22 Thread Eero Tamminen


Hi,

On 06/18/2015 03:17 PM, Emil Velikov wrote:
> Strange I was under the impression that there are apps that make use
> of GS, albeit not too many.


So far I haven't seen any Steam game using GS, for example, but the Unreal 
Engine 4 demos:

https://wiki.unrealengine.com/Linux_Demos

do use them.  All 4 demos I checked compiled at least one geometry 
shader, and the Vehicle Game demo compiled three.


I didn't check what they use them for; in total they compile hundred(s) 
of shaders.



- Eero


On the perf side - I was thinking about the hardware (i.e. regardless
if the driver does extra state-tracking or not) - would there be the
optimisation mentioned, would there be a "stall" in the pipeline, due
to the "new" values being flushed/fetched/etc. Now that I think about
it, only a few of the HW guys may know the answer on this one, so
don't bother with this.

Thanks
Emil

On 16 June 2015 at 20:56, Marek Olšák  wrote:

There are probably 0 apps using GS, so the answer is 0.

The hardware doesn't ignore anything. It only does what it's told to do.

The radeonsi driver doesn't check if the state change is redundant or not.

Marek

On Tue, Jun 16, 2015 at 10:13 PM, Emil Velikov  wrote:

Hi Marek,

Out of curiosity:
Any rough idea of how much of a perf. improvement this might bring ?
Would the hardware ignore the newly (re)bound const. bufs, when the
values are unchanged ?

Thanks
Emil



Re: [Mesa-dev] [PATCH] mesa: use _mesa_lookup_enum_by_nr() in print_array()

2015-06-22 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

On Mon, Jun 22, 2015 at 10:33 AM, Brian Paul  wrote:
> Print GL_FLOAT, etc. instead of hex value.
> ---
>  src/mesa/main/varray.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
> index 7389037..ebdd9ea 100644
> --- a/src/mesa/main/varray.c
> +++ b/src/mesa/main/varray.c
> @@ -2309,10 +2309,10 @@ print_array(const char *name, GLint index, const 
> struct gl_client_array *array)
>fprintf(stderr, "  %s[%d]: ", name, index);
> else
>fprintf(stderr, "  %s: ", name);
> -   fprintf(stderr, "Ptr=%p, Type=0x%x, Size=%d, ElemSize=%u, Stride=%d, 
> Buffer=%u(Size %lu)\n",
> - array->Ptr, array->Type, array->Size,
> - array->_ElementSize, array->StrideB,
> - array->BufferObj->Name, (unsigned long) array->BufferObj->Size);
> +   fprintf(stderr, "Ptr=%p, Type=%s, Size=%d, ElemSize=%u, Stride=%d, 
> Buffer=%u(Size %lu)\n",
> +   array->Ptr, _mesa_lookup_enum_by_nr(array->Type), array->Size,
> +   array->_ElementSize, array->StrideB, array->BufferObj->Name,
> +   (unsigned long) array->BufferObj->Size);
>  }
>
>
> --
> 1.9.1
>

Re: [Mesa-dev] abundance of branches in mesa.git

2015-06-22 Thread Christian König


On 22.06.2015 15:41, Ilia Mirkin wrote:

On Mon, Jun 22, 2015 at 9:39 AM, Tom Stellard  wrote:

On Mon, Jun 22, 2015 at 12:23:54PM +0200, Marek Olšák wrote:

On Mon, Jun 22, 2015 at 5:36 AM, Ilia Mirkin  wrote:

On Sun, Jun 21, 2015 at 11:33 PM, Michel Dänzer  wrote:

On 22.06.2015 00:31, Ilia Mirkin wrote:

On Sun, Jun 21, 2015 at 12:22 PM, Emil Velikov  wrote:

On 20/06/15 10:01, Eirik Byrkjeflot Anonsen wrote:

Ilia Mirkin  writes:


Hello,

There are a *ton* of branches in the upstream mesa git. Here is a full list:


[...]

is there
any reason to keep these around with the exception of:

master
$version (i.e. 9.0, 10.0, mesa_7_7_branch, etc)

Instead of outright deleting old branches, it would be possible to set
up an "archive" repository which mirrors all branches of the main
repository. And then delete "obsolete" branches only from the main
repository. Ideally, you would want a git hook to refuse to create a new
branch (in the main repository) if a branch by that name already exists
in the archive repository. Possibly with the exception that creating a
same-named branch on the same commit would be allowed.

(And the same for tags, of course)


Personally I am fine with either approach - stay/nuke/move. But I'm
thinking that having a mix of the two suggestions might be a nice middle
ground.

Write a script that nukes branches that are merged in master (check the
top commit of the branch) and have an 'archive' repo that contains
everything else (minus the stable branches).

Sounds good to me, FWIW.



That still leaves a ton around, and curiously removes mesa_7_5 and mesa_7_6.

I think the latter is expected, we were using a different branching
model back in those days.



origin/amdgpu

Note that this is a currently active branch, to be merged to master soon.

Perhaps there's something I don't understand, but why is a feature
branch made available on the shared tree? In my view of things the
only branches on the shared mesa.git tree should be the version
branches.

As you can see, a lot of feature branches are in the shared tree
already, so there is a precedent. Sharing a branch among people in
this way sometimes tends to be more convenient.

The reason here is that it's the only mesa repository where most
people from our team have commit access.


Also, the shared git tree supports https access, which means it is
accessible when behind a firewall.

OK, well if that's the prevailing attitude, then I'm on a fool's
errand, and I'll just drop this.


I still think it would be a good idea to archive the branches after a 
while, because the current status is rather confusing if you search for 
something specific.


Regards,
Christian.



   -ilia

Re: [Mesa-dev] [PATCH 1/2] glsl: Specify the shader stage in linker errors due to too many in/outputs.

2015-06-22 Thread Ian Romanick

This patch is

Reviewed-by: Ian Romanick 

On 06/19/2015 06:08 AM, Jose Fonseca wrote:
> ---
>  src/glsl/link_varyings.cpp | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
> index 7b2d4bd..278a778 100644
> --- a/src/glsl/link_varyings.cpp
> +++ b/src/glsl/link_varyings.cpp
> @@ -1540,13 +1540,15 @@ check_against_output_limit(struct gl_context *ctx,
> const unsigned output_components = output_vectors * 4;
> if (output_components > max_output_components) {
>if (ctx->API == API_OPENGLES2 || prog->IsES)
> - linker_error(prog, "shader uses too many output vectors "
> + linker_error(prog, "%s shader uses too many output vectors "
>"(%u > %u)\n",
> +  _mesa_shader_stage_to_string(producer->Stage),
>output_vectors,
>max_output_components / 4);
>else
> - linker_error(prog, "shader uses too many output components "
> + linker_error(prog, "%s shader uses too many output components "
>"(%u > %u)\n",
> +  _mesa_shader_stage_to_string(producer->Stage),
>output_components,
>max_output_components);
>  
> @@ -1579,13 +1581,15 @@ check_against_input_limit(struct gl_context *ctx,
> const unsigned input_components = input_vectors * 4;
> if (input_components > max_input_components) {
>if (ctx->API == API_OPENGLES2 || prog->IsES)
> - linker_error(prog, "shader uses too many input vectors "
> + linker_error(prog, "%s shader uses too many input vectors "
>"(%u > %u)\n",
> +  _mesa_shader_stage_to_string(consumer->Stage),
>input_vectors,
>max_input_components / 4);
>else
> - linker_error(prog, "shader uses too many input components "
> + linker_error(prog, "%s shader uses too many input components "
>"(%u > %u)\n",
> +  _mesa_shader_stage_to_string(consumer->Stage),
>input_components,
>max_input_components);
>  
> 


Re: [Mesa-dev] [PATCH 1/2] glsl: handle conversions to double when comparing param matches

2015-06-22 Thread Ian Romanick

This seems believable... is there a piglit test?

On 06/17/2015 12:15 PM, Ilia Mirkin wrote:
> This allows mod(int, int) to become selected as float mod when doubles
> are supported.
> 
> Signed-off-by: Ilia Mirkin 
> Cc: "10.6" 
> ---
>  src/glsl/ir_function.cpp | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/src/glsl/ir_function.cpp b/src/glsl/ir_function.cpp
> index 2b2643c..1319443 100644
> --- a/src/glsl/ir_function.cpp
> +++ b/src/glsl/ir_function.cpp
> @@ -148,9 +148,11 @@ get_parameter_match_type(const ir_variable *param,
> if (from_type == to_type)
>return PARAMETER_EXACT_MATCH;
>  
> -   /* XXX: When ARB_gpu_shader_fp64 support is added, check for 
> float->double,
> -* and int/uint->double conversions
> -*/
> +   if (to_type->base_type == GLSL_TYPE_DOUBLE) {
> +  if (from_type->base_type == GLSL_TYPE_FLOAT)
> + return PARAMETER_FLOAT_TO_DOUBLE;
> +  return PARAMETER_INT_TO_DOUBLE;
> +   }
>  
> if (to_type->base_type == GLSL_TYPE_FLOAT)
>return PARAMETER_INT_TO_FLOAT;
> 


Re: [Mesa-dev] [PATCH 3/5] glcpp: Allow arithmetic integer expressions in #line

2015-06-22 Thread Antía Puentes

First, sorry for the late answer, I somehow missed your replies (I was
not in CC).

On mar, 2015-06-09 at 10:59 -0700, Ian Romanick wrote:
> On 06/09/2015 10:40 AM, Carl Worth wrote:
> > On Tue, Jun 09 2015, Ian Romanick wrote:
> >>> From section 3.4 ("Preprocessor") of the GLSL ES 3.00 specification:
> >>> "#line must have, after macro substitution, one of the following forms:
> >>>   #line line
> >>>   #line line source-string-number
> >>> where line and source-string-number are constant integral
> >>> expressions."
> > ...
> >>> From section 4.3.3 ("Constant Expressions") of the same specification:
> >>> "A constant integral expression is a constant expression that evaluates
> >>> to a scalar signed or unsigned integer."
> > 
> > Yes. That's an extremely unfortunate piece of the specification.
> > 
> > This, together with unary operators introduces inherent ambiguity into
> > the grammar. Just think about things like:

I forgot to mention in the patch's commit message that, because of the
ambiguity of the grammar, I made the assumption that one or more blanks
that are not part of a parenthesized expression act as parameter
separators. For the examples mentioned in the thread, the output would
then be:

#line 2-1+5 -> #line 6
#line 2 -1+5 -> #line 2 4
#line 2-1 +5 -> #line 1 5
#line 2-1+5 3 -> #line 6 3
#line 2 -1+5 3 -> compilation error
#line 2-1 +5 3 -> compilation error

#line 3 +3 -> #line 3 3
#line 3 (+3) -> #line 3 3

And for the parentheses the behavior is:

#line (2  -1)+5 -> #line 6
#line 3 (4+1)-1 -> #line 3 4
#line (3) ((4+1) -1) -> #line 3 4
#line 3 (4+1) -1 -> compilation error

> The spec was supposed to get updated to say that parsing is greedy, so
> we at least know what those should do.  I say "supposed to" instead of
> "was" because I don't know for sure that it was updated.

I am afraid that greedy parsing is not what this patch implements, as
#line 3 +3 4 will not be evaluated as #line 6 4, but will raise an error
instead.

> > But I'll also take a look at this patch. Thanks for bringing it to my
> > attention, Ian.



Re: [Mesa-dev] abundance of branches in mesa.git

2015-06-22 Thread Ilia Mirkin

On Mon, Jun 22, 2015 at 11:30 AM, Christian König
 wrote:
> On 22.06.2015 15:41, Ilia Mirkin wrote:
>>
>> On Mon, Jun 22, 2015 at 9:39 AM, Tom Stellard  wrote:
>>>
>>> On Mon, Jun 22, 2015 at 12:23:54PM +0200, Marek Olšák wrote:

 On Mon, Jun 22, 2015 at 5:36 AM, Ilia Mirkin 
 wrote:
>
> On Sun, Jun 21, 2015 at 11:33 PM, Michel Dänzer 
> wrote:
>>
>> On 22.06.2015 00:31, Ilia Mirkin wrote:
>>>
>>> On Sun, Jun 21, 2015 at 12:22 PM, Emil Velikov
>>>  wrote:

 On 20/06/15 10:01, Eirik Byrkjeflot Anonsen wrote:
>
> Ilia Mirkin  writes:
>
>> Hello,
>>
>> There are a *ton* of branches in the upstream mesa git. Here is a
>> full list:
>>
> [...]
>>
>> is there
>> any reason to keep these around with the exception of:
>>
>> master
>> $version (i.e. 9.0, 10.0, mesa_7_7_branch, etc)
>
> Instead of outright deleting old branches, it would be possible to
> set
> up an "archive" repository which mirrors all branches of the main
> repository. And then delete "obsolete" branches only from the main
> repository. Ideally, you would want a git hook to refuse to create
> a new
> branch (in the main repository) if a branch by that name already
> exists
> in the archive repository. Possibly with the exception that
> creating a
> same-named branch on the same commit would be allowed.
>
> (And the same for tags, of course)
>
 Personally I am fine with either approach - stay/nuke/move. But I'm
 thinking that having a mix of the two suggestions might be a nice
 middle
 ground.

 Write a script that nukes branches that are merged in master (check
 the
 top commit of the branch) and have an 'archive' repo that contains
 everything else (minus the stable branches).
>>
>> Sounds good to me, FWIW.
>>
>>
>>> That still leaves a ton around, and curiously removes mesa_7_5 and
>>> mesa_7_6.
>>
>> I think the latter is expected, we were using a different branching
>> model back in those days.
>>
>>
>>> origin/amdgpu
>>
>> Note that this is a currently active branch, to be merged to master
>> soon.
>
> Perhaps there's something I don't understand, but why is a feature
> branch made available on the shared tree? In my view of things the
> only branches on the shared mesa.git tree should be the version
> branches.

 As you can see, a lot of feature branches are in the shared tree
 already, so there is a precedent. Sharing a branch among people in
 this way sometimes tends to be more convenient.

 The reason here is that it's the only mesa repository where most
 people from our team have commit access.

>>> Also, the shared git tree supports https access, which means it is
>>> accessible when behind a firewall.
>>
>> OK, well if that's the prevailing attitude, then I'm on a fool's
>> errand, and I'll just drop this.
>
>
> I still think it would be a good idea to archive the branches after a while,
> because the current status is rather confusing if you search for something
> specific.

Yeah, but if the policy is "create random branches whenever you feel
like on the upstream mesa tree", then this same current situation will
happen again, so it's not really worth fixing now (at least for me).
I'm not aware of any other major project with this sort of branching
policy, but I guess there's always a first!

I don't really see why you wouldn't just use a shared tree in
someone's ~/foo, chgrp'd to mesa or some other convenient group, or
how https plays into things, but I'm sure there's some reason for it.
[Or why those amdgpu patches are on a branch in the first place rather
than in master.] If the final state isn't a tree with a policy of not
adding (non-release) branches, I don't think I'm particularly
interested in doing the legwork.

Cheers,

  -ilia

Re: [Mesa-dev] [PATCH 1/2] glsl: handle conversions to double when comparing param matches

2015-06-22 Thread Ilia Mirkin

http://patchwork.freedesktop.org/patch/52138/

I've already pushed this patch btw, Chris gave me a r-b over IRC. But
it seems I neglected to push the piglit patch, my bad.

On Mon, Jun 22, 2015 at 11:35 AM, Ian Romanick  wrote:
> This seems believable... is there a piglit test?
>
> On 06/17/2015 12:15 PM, Ilia Mirkin wrote:
>> This allows mod(int, int) to become selected as float mod when doubles
>> are supported.
>>
>> Signed-off-by: Ilia Mirkin 
>> Cc: "10.6" 
>> ---
>>  src/glsl/ir_function.cpp | 8 +---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/glsl/ir_function.cpp b/src/glsl/ir_function.cpp
>> index 2b2643c..1319443 100644
>> --- a/src/glsl/ir_function.cpp
>> +++ b/src/glsl/ir_function.cpp
>> @@ -148,9 +148,11 @@ get_parameter_match_type(const ir_variable *param,
>> if (from_type == to_type)
>>return PARAMETER_EXACT_MATCH;
>>
>> -   /* XXX: When ARB_gpu_shader_fp64 support is added, check for 
>> float->double,
>> -* and int/uint->double conversions
>> -*/
>> +   if (to_type->base_type == GLSL_TYPE_DOUBLE) {
>> +  if (from_type->base_type == GLSL_TYPE_FLOAT)
>> + return PARAMETER_FLOAT_TO_DOUBLE;
>> +  return PARAMETER_INT_TO_DOUBLE;
>> +   }
>>
>> if (to_type->base_type == GLSL_TYPE_FLOAT)
>>return PARAMETER_INT_TO_FLOAT;
>>
>

Re: [Mesa-dev] ARB_arrays_of_arrays GLSL ES

2015-06-22 Thread Eero Tamminen


Hi,

On 06/20/2015 03:32 PM, Timothy Arceri wrote:

The restrictions in ES make the extension easier to implement so
I thought I'd try to get this stuff reviewed and committed before finishing
up the full extension.
The bits that I'm still working on for the desktop version are AoA inputs,
outputs, and interface blocks.

The only thing I know is definitely missing in this series for ES is
support for indirect indexing of samplers, but that didn't seem like
something that should hold up the series.

Once the SSBO series lands (with a patch that restricts unsized arrays)
then all the AoA ES conformance tests will pass.

There are already a bunch of piglit tests in git but I've just sent a
series with all the patches still waiting review here:
http://lists.freedesktop.org/archives/piglit/2015-June/016312.html

I haven't made a patch marking this as done yet because currently
the i965 backend takes a very long time trying to optimise some of the
conformance tests. They still pass but they are taking 15-minutes+ just
to compile so this really needs to be sorted out first. If someone with
more knowledge in this area than me wants to take a look at this, I would
be grateful to be pointed in the right direction.


Are there individual shaders whose compilation takes several minutes?

Do you have any perf [1] or valgrind [2] tool output for compiling the 
slowest one?



- Eero

[1] # perf record -a
 ^C
# perf report -n
-> text output
[2] $ valgrind --tool=callgrind 
$ kcachegrind 
-> callgraphs, callee maps etc

Re: [Mesa-dev] [PATCH 2/2] glsl: Fix counting of varyings.

2015-06-22 Thread Ian Romanick

On 06/19/2015 06:08 AM, Jose Fonseca wrote:
> When input and output varyings started to be counted separately (commit
> 42305fb5), the is_varying_var function wasn't updated to return true for
> output varyings or input varyings for stages other than the fragment
> shader, effectively causing the varying limit never to be checked.

Without SSO, counting the varying inputs used by, say, the fragment
shader, should be sufficient.  With SSO, it's more difficult.

> With this change, color, texture coord, and generic varyings are now
> counted, but others are ignored.  It is assumed the hardware will handle
> special varyings internally (ie, optimistic rather than pessimistic), to
> avoid causing regressions where things were working somehow.
> 
> This fixes `glsl-max-varyings --exceed-limits` with softpipe/llvmpipe,
> which was asserting because we were getting varyings beyond
> VARYING_SLOT_MAX in st_glsl_to_tgsi.cpp.
> 
> It also prevents the assertion failure with
> https://bugs.freedesktop.org/show_bug.cgi?id=90539 but the test still
> fails due to the link error.
> 
> This change also adds a few assertions to catch this sort of error
> earlier, and potentially prevent buffer overflows in the future (no
> buffer overflow was detected here though).
> 
> However, this change causes several tests to regress:
> 
>   spec/glsl-1.10/execution/varying-packing/simple ivec3 array
>   spec/glsl-1.10/execution/varying-packing/simple ivec3 separate
>   spec/glsl-1.10/execution/varying-packing/simple uvec3 array
>   spec/glsl-1.10/execution/varying-packing/simple uvec3 separate

Wait... so the ivec3 and uvec3 tests fail, but the vec3 test passes?

>   spec/arb_gpu_shader_fp64/varying-packing/simple dmat3 array
>   spec/glsl-1.50/execution/geometry/max-input-components
>   spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec4-index-rd
>   spec/glsl-1.50/execution/variable-indexing/vs-output-array-vec4-index-wr-before-gs
> 
> But these all seem to be issues either in the way we count varyings
> (e.g., geometry inputs get counted multiple times) or in the tests
> themselves, or limitations in the varying packer, and deserve attention
> in their own right.

Do you have a feeling for which tests are which sorts of problems?

I'd like to run this through GLES3 conformance before it gets pushed.
I'm not too worried about the geometry shader issues, but the ivec /
uvec tests seem more problematic.

> ---
>  src/glsl/link_varyings.cpp | 70 
> --
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  2 +
>  2 files changed, 58 insertions(+), 14 deletions(-)
> 
> diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
> index 278a778..7649720 100644
> --- a/src/glsl/link_varyings.cpp
> +++ b/src/glsl/link_varyings.cpp
> @@ -190,6 +190,8 @@ cross_validate_outputs_to_inputs(struct gl_shader_program 
> *prog,
>*/
>   const unsigned idx = var->data.location - VARYING_SLOT_VAR0;
>  
> + assert(idx < MAX_VARYING);
> +
>   if (explicit_locations[idx] != NULL) {
>  linker_error(prog,
>   "%s shader has multiple outputs explicitly "
> @@ -1031,25 +1033,63 @@ varying_matches::match_comparator(const void 
> *x_generic, const void *y_generic)
>  /**
>   * Is the given variable a varying variable to be counted against the
>   * limit in ctx->Const.MaxVarying?
> - * This includes variables such as texcoords, colors and generic
> - * varyings, but excludes variables such as gl_FrontFacing and gl_FragCoord.
> + *
> + * OpenGL specification states:

Please use the canonical format.

* Section A.B (Foo Bar) of the OpenGL X.Y Whichever Profile spec
* says:

That enables later readers to more easily find the text in the spec.
Also, the language changes from time to time.

> + *
> + *   Each output variable component used as either a vertex shader output or
> + *   fragment shader input counts against this limit, except for the 
> components
> + *   of gl_Position. A program containing only a vertex and fragment shader

This bit about gl_Position is tricky... I believe this language has
changed more than once in the spec.  It's also the reason the varying
limit has changed from 64 components to 60 components.  I don't think
that affects this patch... it's just a thing I thought was worth
pointing out.

> + *   that accesses more than this limit's worth of components of outputs may
> + *   fail to link, unless device-dependent optimizations are able to make the
> + *   program fit within available hardware resources.
> + *
>   */
>  static bool
>  var_counts_against_varying_limit(gl_shader_stage stage, const ir_variable 
> *var)
>  {
> -   /* Only fragment shaders will take a varying variable as an input */
> -   if (stage == MESA_SHADER_FRAGMENT &&
> -   var->data.mode == ir_var_shader_in) {
> -  switch (var->data.location) {
> -  case VARYING_SLOT_POS:
> -  case VARYING_SLOT_FACE:
> -  ca

[Mesa-dev] [Bug 91044] piglit spec/egl_khr_create_context/valid debug flag gles* fail

2015-06-22 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=91044

--- Comment #1 from Emil Velikov  ---
Based on the patch date (17 July 2012) and the extension revision history I'd
say that things were changed/nuked in Version 12 or later. With Version 15
being the prime suspect.

As Intel is a Khronos member, you should have access to the SVN repo/history
for the exact details. I'd assume that it would be the better option.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] abundance of branches in mesa.git

2015-06-22 Thread Marek Olšák

I will happily remove the branch after the kernel driver lands.

I also wonder why all Mesa developers can force-push branches in Mesa
but not libdrm.

Marek

On Mon, Jun 22, 2015 at 5:39 PM, Ilia Mirkin  wrote:
> On Mon, Jun 22, 2015 at 11:30 AM, Christian König
>  wrote:
>> On 22.06.2015 15:41, Ilia Mirkin wrote:
>>>
>>> On Mon, Jun 22, 2015 at 9:39 AM, Tom Stellard  wrote:

 On Mon, Jun 22, 2015 at 12:23:54PM +0200, Marek Olšák wrote:
>
> On Mon, Jun 22, 2015 at 5:36 AM, Ilia Mirkin 
> wrote:
>>
>> On Sun, Jun 21, 2015 at 11:33 PM, Michel Dänzer 
>> wrote:
>>>
>>> On 22.06.2015 00:31, Ilia Mirkin wrote:

 On Sun, Jun 21, 2015 at 12:22 PM, Emil Velikov
  wrote:
>
> On 20/06/15 10:01, Eirik Byrkjeflot Anonsen wrote:
>>
>> Ilia Mirkin  writes:
>>
>>> Hello,
>>>
>>> There are a *ton* of branches in the upstream mesa git. Here is a
>>> full list:
>>>
>> [...]
>>>
>>> is there
>>> any reason to keep these around with the exception of:
>>>
>>> master
>>> $version (i.e. 9.0, 10.0, mesa_7_7_branch, etc)
>>
>> Instead of outright deleting old branches, it would be possible to
>> set
>> up an "archive" repository which mirrors all branches of the main
>> repository. And then delete "obsolete" branches only from the main
>> repository. Ideally, you would want a git hook to refuse to create
>> a new
>> branch (in the main repository) if a branch by that name already
>> exists
>> in the archive repository. Possibly with the exception that
>> creating a
>> same-named branch on the same commit would be allowed.
>>
>> (And the same for tags, of course)
>>
> Personally I am fine with either approach - stay/nuke/move. But I'm
> thinking that having a mix of the two suggestions might be a nice
> middle
> ground.
>
> Write a script that nukes branches that are merged in master (check
> the
> top commit of the branch) and have an 'archive' repo that contains
> everything else (minus the stable branches).
>>>
>>> Sounds good to me, FWIW.
>>>
>>>
 That still leaves a ton around, and curiously removes mesa_7_5 and
 mesa_7_6.
>>>
>>> I think the latter is expected, we were using a different branching
>>> model back in those days.
>>>
>>>
 origin/amdgpu
>>>
>>> Note that this is a currently active branch, to be merged to master
>>> soon.
>>
>> Perhaps there's something I don't understand, but why is a feature
>> branch made available on the shared tree? In my view of things the
>> only branches on the shared mesa.git tree should be the version
>> branches.
>
> As you can see, a lot of feature branches are in the shared tree
> already, so there is a precedent. Sharing a branch among people in
> this way sometimes tends to be more convenient.
>
> The reason here is that it's the only mesa repository where most
> people from our team have commit access.
>
 Also, the shared git tree supports https access, which means it is
 accessible when behind a firewall.
>>>
>>> OK, well if that's the prevailing attitude, then I'm on a fool's
>>> errand, and I'll just drop this.
>>
>>
>> I still think it would be a good idea to archive the branches after a while,
>> because the current status is rather confusing if you search for something
>> specific.
>
> Yeah, but if the policy is "create random branches whenever you feel
> like on the upstream mesa tree", then this same current situation will
> happen again, so it's not really worth fixing now (at least for me).
> I'm not aware of any other major project with this sort of branching
> policy, but I guess there's always a first!
>
> I don't really see why you wouldn't just use a shared tree in
> someone's ~/foo, chgrp'd to mesa or some other convenient group, or
> how https plays into things, but I'm sure there's some reason for it.
> [Or why those amdgpu patches are on a branch in the first place rather
> than in master.] If the final state isn't a tree with a policy of not
> adding (non-release) branches, I don't think I'm particularly
> interested in doing the legwork.
>
> Cheers,
>
>   -ilia

Re: [Mesa-dev] abundance of branches in mesa.git

2015-06-22 Thread Ian Romanick

On 06/22/2015 10:40 AM, Marek Olšák wrote:
> I will happily remove the branch after the kernel driver lands.
> 
> I also wonder why all Mesa developers can force-push branches in Mesa
> but not libdrm.

That's probably just historical.  We probably ought to restrict that on
Mesa as well.

It sounds like you guys have some requirements for a shared repo.  It
seems like a repo on fd.o could work.  I think you'd just need an
"amddevs" group and make the repo group rwx.  I thought fd.o GIT did
https (maybe just SSH?).

> Marek
> 
> On Mon, Jun 22, 2015 at 5:39 PM, Ilia Mirkin  wrote:
>> On Mon, Jun 22, 2015 at 11:30 AM, Christian König
>>  wrote:
>>> On 22.06.2015 15:41, Ilia Mirkin wrote:

 On Mon, Jun 22, 2015 at 9:39 AM, Tom Stellard  wrote:
>
> On Mon, Jun 22, 2015 at 12:23:54PM +0200, Marek Olšák wrote:
>>
>> On Mon, Jun 22, 2015 at 5:36 AM, Ilia Mirkin 
>> wrote:
>>>
>>> On Sun, Jun 21, 2015 at 11:33 PM, Michel Dänzer 
>>> wrote:

 On 22.06.2015 00:31, Ilia Mirkin wrote:
>
> On Sun, Jun 21, 2015 at 12:22 PM, Emil Velikov
>  wrote:
>>
>> On 20/06/15 10:01, Eirik Byrkjeflot Anonsen wrote:
>>>
>>> Ilia Mirkin  writes:
>>>
 Hello,

 There are a *ton* of branches in the upstream mesa git. Here is a
 full list:

>>> [...]

 is there
 any reason to keep these around with the exception of:

 master
 $version (i.e. 9.0, 10.0, mesa_7_7_branch, etc)
>>>
>>> Instead of outright deleting old branches, it would be possible to
>>> set
>>> up an "archive" repository which mirrors all branches of the main
>>> repository. And then delete "obsolete" branches only from the main
>>> repository. Ideally, you would want a git hook to refuse to create
>>> a new
>>> branch (in the main repository) if a branch by that name already
>>> exists
>>> in the archive repository. Possibly with the exception that
>>> creating a
>>> same-named branch on the same commit would be allowed.
>>>
>>> (And the same for tags, of course)
>>>
>> Personally I am fine with either approach - stay/nuke/move. But I'm
>> thinking that having a mix of the two suggestions might be a nice
>> middle
>> ground.
>>
>> Write a script that nukes branches that are merged in master (check
>> the
>> top commit of the branch) and have an 'archive' repo that contains
>> everything else (minus the stable branches).

 Sounds good to me, FWIW.


> That still leaves a ton around, and curiously removes mesa_7_5 and
> mesa_7_6.

 I think the latter is expected, we were using a different branching
 model back in those days.


> origin/amdgpu

 Note that this is a currently active branch, to be merged to master
 soon.
>>>
>>> Perhaps there's something I don't understand, but why is a feature
>>> branch made available on the shared tree? In my view of things the
>>> only branches on the shared mesa.git tree should be the version
>>> branches.
>>
>> As you can see, a lot of feature branches are in the shared tree
>> already, so there is a precedent. Sharing a branch among people in
>> this way sometimes tends to be more convenient.
>>
>> The reason here is that it's the only mesa repository where most
>> people from our team have commit access.
>>
> Also, the shared git tree supports https access, which means it is
> accessible when behind a firewall.

 OK, well if that's the prevailing attitude, then I'm on a fool's
 errand, and I'll just drop this.
>>>
>>>
>>> I still think it would be a good idea to archive the branches after a while,
>>> because the current status is rather confusing if you search for something
>>> specific.
>>
>> Yeah, but if the policy is "create random branches whenever you feel
>> like on the upstream mesa tree", then this same current situation will
>> happen again, so it's not really worth fixing now (at least for me).
>> I'm not aware of any other major project with this sort of branching
>> policy, but I guess there's always a first!
>>
>> I don't really see why you wouldn't just use a shared tree in
>> someone's ~/foo, chgrp'd to mesa or some other convenient group, or
>> how https plays into things, but I'm sure there's some reason for it.
>> [Or why those amdgpu patches are on a branch in the first place rather
>> than in master.] If the final state isn't a tree with a policy of not
>> adding (non-release) branches, I don't think I'm particularly
>> interested in doing the legwork.
>>
>> Cheers,
>>
>>   -ilia

Re: [Mesa-dev] [PATCH 11/11] android: egl: do not link against libglapi

2015-06-22 Thread Emil Velikov

Niiice, thank you. For most drivers (gallium, i965) this is
implemented, leaving nouveau_vieux, radeon, r200 and i915. Of these,
i915 does work with EGL, while nouveau_vieux dies miserably (missing
__DRI_IMAGE v7 iirc). How well do radeon/r200 fare ?

So as a nice starter task, one can modify EGL to use flush_with_flags
and fall back to glFlush. Hmm... seems perfect for Google Code-In
(junior GSoC). The application for mentoring orgs is around October,
perhaps we can give it a bash :-)
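
For anyone picking this up, the fallback could look roughly like the
sketch below. It is modelled in Python rather than C to keep it short.
The FlushExtension class, its fields and the minimum version of 4 are
assumptions made for illustration, not copied from the DRI headers.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumed threshold: the extension version from which flush_with_flags
# is taken to be available. Illustrative value only.
MIN_FLUSH_WITH_FLAGS_VERSION = 4


@dataclass
class FlushExtension:
    """Hypothetical stand-in for the driver's flush extension."""
    version: int
    flush_with_flags: Optional[Callable[[], None]] = None


def flush_context(ext: Optional[FlushExtension],
                  gl_flush: Callable[[], None]) -> str:
    """Prefer flush_with_flags; fall back to glFlush for older drivers."""
    if (ext is not None
            and ext.version >= MIN_FLUSH_WITH_FLAGS_VERSION
            and ext.flush_with_flags is not None):
        ext.flush_with_flags()
        return "flush_with_flags"
    # Older driver, or no flush extension at all: keep the old behaviour.
    gl_flush()
    return "glFlush"
```

The returned string is only there so the chosen path can be checked; a
real loader would simply call one function or the other.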

You did bring up a very nice topic though... until when are we going to
support every loader/dri module combination out there ?

Emil

On 21 June 2015 at 10:22, Marek Olšák  wrote:
> FWIW, flushing can be done through
> flush_with_flags(__DRI2_FLUSH_CONTEXT), so glFlush shouldn't be
> needed, but some drivers don't implement flush_with_flags and I've
> heard libEGL and libGL need to support DRI drivers from older Mesas too.
>
> Marek
>
> On Fri, Jun 19, 2015 at 9:56 PM, Emil Velikov  
> wrote:
>> The only reason we touch glapi is to dlopen it to:
>>  - make sure that the unresolved _glapi* symbols in the dri modules are
>> provided.
>>  - fetch glFlush() and use it at various stages in the dri2 driver.
>>
>> XXX: If anyone has suggestions why the latter is required (or can
>> recommend any reading material) I'm all ears.
>>
>> Cc: Chih-Wei Huang 
>> Cc: Eric Anholt 
>> Signed-off-by: Emil Velikov 
>> ---
>>  src/egl/main/Android.mk | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/src/egl/main/Android.mk b/src/egl/main/Android.mk
>> index 8f687e9..0ba7295 100644
>> --- a/src/egl/main/Android.mk
>> +++ b/src/egl/main/Android.mk
>> @@ -44,7 +44,6 @@ LOCAL_CFLAGS := \
>>         -D_EGL_OS_UNIX=1
>>
>>  LOCAL_SHARED_LIBRARIES := \
>> -       libglapi \
>>         libdl \
>>         libhardware \
>>         liblog \
>> --
>> 2.4.2
>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965/fs: Fix ir_txs in emit_texture_gen4_simd16().

2015-06-22 Thread Kenneth Graunke

We were not emitting the LOD, which led to message lengths of 1 instead
of 3.  Setting has_lod makes us emit the LOD, but I had to make changes
to avoid emitting the non-existent coordinate as well.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91022
Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 4770838..12253e4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -247,7 +247,7 @@ fs_visitor::emit_texture_gen4_simd16(ir_texture_opcode op, fs_reg dst,
                                      uint32_t sampler)
 {
    fs_reg message(MRF, 2, BRW_REGISTER_TYPE_F, dispatch_width);
-   bool has_lod = op == ir_txl || op == ir_txb || op == ir_txf;
+   bool has_lod = op == ir_txl || op == ir_txb || op == ir_txf || op == ir_txs;
 
    if (has_lod && shadow_c.file != BAD_FILE)
       no16("TXB and TXL with shadow comparison unsupported in SIMD16.");
@@ -264,14 +264,15 @@ fs_visitor::emit_texture_gen4_simd16(ir_texture_opcode op, fs_reg dst,
    fs_reg msg_end = offset(message, vector_elements);
 
    /* Messages other than sample and ld require all three components */
-   if (has_lod || shadow_c.file != BAD_FILE) {
+   if (vector_elements > 0 && (has_lod || shadow_c.file != BAD_FILE)) {
       for (int i = vector_elements; i < 3; i++) {
          bld.MOV(offset(message, i), fs_reg(0.0f));
       }
+      msg_end = offset(message, 3);
    }
 
    if (has_lod) {
-      fs_reg msg_lod = retype(offset(message, 3), op == ir_txf ?
+      fs_reg msg_lod = retype(msg_end, op == ir_txf ?
                               BRW_REGISTER_TYPE_UD : BRW_REGISTER_TYPE_F);
       bld.MOV(msg_lod, lod);
       msg_end = offset(msg_lod, 1);
-- 
1.7.10.4
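
The message lengths quoted in the commit message can be reproduced with
a small model of the payload layout. This is a sketch in Python, not
driver code. It assumes one header register in front of the payload and
two registers per SIMD16 component; both values are assumptions picked
to match the "1 instead of 3" figure above, and message_length() is a
hypothetical helper that ignores the shadow comparison case.

```python
HEADER_REGS = 1          # assumed: one header register precedes the payload
REGS_PER_COMPONENT = 2   # assumed: a SIMD16 component spans two registers


def message_length(op, vector_elements, patched=True):
    """Model the payload built by emit_texture_gen4_simd16()."""
    lod_ops = {"txl", "txb", "txf"}
    if patched:
        lod_ops.add("txs")          # the fix: txs carries an LOD too
    has_lod = op in lod_ops

    components = vector_elements
    if has_lod:
        # Before the patch the LOD always sat at component 3. After it,
        # the coordinate is padded only when there is a coordinate.
        if not patched or vector_elements > 0:
            components = 3
        components += 1             # the LOD follows the coordinate
    return HEADER_REGS + REGS_PER_COMPONENT * components
```

Under those assumptions ir_txs with no coordinate gives a length of 1
before the patch and 3 after it, while ir_txl with a two-component
coordinate stays at 9 either way.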


Re: [Mesa-dev] abundance of branches in mesa.git

2015-06-22 Thread Marek Olšák

It's not so important now that the amdgpu driver is about to be merged.

Speaking of other branches, I think removing the old feature branches
is a good idea.
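
A minimal sketch of how such a cleanup could decide what to do with each
branch, assuming the "merged into master" test is simply whether the
branch's top commit is reachable from master. classify_branches() and
the branch names used with it are hypothetical; in a real repository the
inputs could come from `git for-each-ref` and `git rev-list master`, or
the merged set directly from `git branch --merged master`.

```python
def classify_branches(tips, master_commits, stable):
    """Split branches into those to keep, delete and archive.

    tips           -- mapping of branch name to its top commit id
    master_commits -- set of commit ids reachable from master
    stable         -- names of release branches, which are always kept
    """
    plan = {"keep": [], "delete": [], "archive": []}
    for name in sorted(tips):
        if name == "master" or name in stable:
            plan["keep"].append(name)
        elif tips[name] in master_commits:
            plan["delete"].append(name)    # fully merged, nothing is lost
        else:
            plan["archive"].append(name)   # unmerged work goes to the archive
    return plan
```

Listing the release branches explicitly matters here: an old release
branch whose top commit happens to be in master would otherwise be
classified as merged and deleted.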

Marek

On Mon, Jun 22, 2015 at 8:02 PM, Ian Romanick  wrote:
> On 06/22/2015 10:40 AM, Marek Olšák wrote:
>> I will happily remove the branch after the kernel driver lands.
>>
>> I also wonder why all Mesa developers can force-push branches in Mesa
>> but not libdrm.
>
> That's probably just historical.  We probably ought to restrict that on
> Mesa as well.
>
> It sounds like you guys have some requirements for a shared repo.  It
> seems like a repo on fd.o could work.  I think you'd just need a
> "amddevs" group and make the repo group rwx.  I thought fd.o GIT did
> https (maybe just SSH?).
>
>> Marek

Re: [Mesa-dev] [PATCH 00/11] glapi fixes - build whole of mesa with

2015-06-22 Thread Ian Romanick

On 06/22/2015 07:01 AM, Jose Fonseca wrote:
> On 19/06/15 23:09, Emil Velikov wrote:
>> On 19 June 2015 at 21:26, Jose Fonseca  wrote:
>>> On 19/06/15 20:56, Emil Velikov wrote:

>>>> Hi all,
>>>>
>>>> A lovely series inspired (more like 'was awaken to send these out') by
>>>> Pal Rohár, who was having issues when building xlib-libgl (plus the now
>>>> enabled gles*)
>>>>
>>>> So here, we teach the final two static glapi users about shared-glapi,
>>>> plus some related fixes. After this is done we can finally start
>>>> transitioning to shared-only glapi, with some more details as mentioned
>>>> in one of the patches:
>>>>
>>>>    XXX: With this one done, we can finally transition with enforcing
>>>>    shared-glapi, and
>>>>
>>>>     - link the dri modules against libglapi.so, add --no-undefined to
>>>>       the LDFLAGS
>>>>     - drop the dlopen(libglapi.so/libGL.so, RTLD_GLOBAL) workarounds
>>>>       in the loaders - libGL, libEGL and libgbm.
>>>>     - start killing off/cleaning up the dispatch ?
>>>>
>>>>    The caveats:
>>>>    1) up to what stage do we care about static libraries
>>>>     - libgl (either dri or xlib based)
>>>>     - osmesa
>>>>     - libEGL
>>>>
>>>>    2) how about other platforms (scons) ?
>>>>     - currently the scons uses static glapi,
>>>>     - would we need the dlopen(...) on windows ?
>>>>
>>>> Hope everyone is excited about this one as I am :-)
>>>
>>>
>>> Maybe I missed the context of this changes, but why this matters or is an
>>> improvement?
>>>
>> If one goes the extra mile (which this series doesn't) - one configure
>> option less, substantial some code de-duplication and consistent use
>> of the code amongst all components provided. This way any
>> improvements/cleanups made to the shared glapi will be available to
>> osmesa/xlib-libgl.
> 
> I'm perfectly happy with removing the configure option.
> 
> And I understand the benefits of unified code paths, but I believe that
> for this particular case, the difference in requirements really demands
> the separate code paths.
> 
>>> In summary, having the ability of using a shared glapi sounds great, but
>>> forcing shared glapi everywhere, sounds a bad idea.
>>>
>> I'm suspecting that people might be keen on the following idea - use
>> static glapi for osmesa/xlib-libgl and shared one everywhere else?
> 
> Yes, that sounds reasonable for me.  (Needs libgl-gdi too.)
> 
>>
>> I fear that this will lead to further separation/bit-rot between the
>> different implementations, but it seems like the bester compromise.
> 
> I don't feel strongly between: a) using the same source code for both
> static/shared glapi (switched by a pre-processor define), or b) only
> share the interface but have shared/static glapi implementations.  I'm
> actually not that familiar with that code.
> 
> 
> Either way, we can have two glapi build targets (a shared-glapi and a
> static-glapipe) side-by-side, so that there are no more source-wide
> configure flags.
> 
> 
> I believe a lot of the complexity of that code comes from assembly.  I
> wonder if it's really justified nowadays (and even if it is, whether it
> would be better served with GNU C assembly.) Futhermore, I believe on
> Windows we use any assembly, so if we split shared/static glapi source
> code, we could probably abandon assembly from the static-glapi.

It comes from the intersection of the assembly and the myriad threading
options.  Having TLS and shared-glapi as the only "option" for DRI
builds would be terrific.  We have a couple of workloads that, especially
on Atom CPUs, are sensitive to any added overhead.  My recollection was
that GCC does not generate the code you want for the dispatch functions.

I feel like we keep coming around to the loader/driver interface
needing some significant work.  I certainly have a bunch of ideas for
how things could be improved.  I'll start working on a proposal.

> Jose


[Mesa-dev] [RFC] Compatibility between old dri modules and new loaders, and vice verse

2015-06-22 Thread Emil Velikov

Hi all,

As kindly hinted by Marek, currently we do have a wide selection of
supported dri <> loader combinations.

Although we like to think that things never break, we have to admit
that not many of us test every possible combination of dri modules
and loaders, with the chances getting smaller as the time gap (age)
between the two increases. As such I would like to ask if we're
interested in gradually deprecating combinations as the gap grows
beyond X years.

The rough idea that I have in my mind is:
- Check for obsolete extensions (requirements for such) - both in the
dri modules and the loaders (including the xserver).
- Add some WARN messages ("You're using an old loader/DRI module.
Update to XXX or later") when such a code path is hit.
- After X mesa releases, we remove the dri extension from the
module(s) and bump the requirement(s) in the loader(s).
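
To make the warn-then-remove steps above concrete, here is a rough
sketch in Python. check_extension(), its parameters and the extension
name "DRI_Example" are all hypothetical, and the message text is just
the one suggested above with a version filled in.

```python
import warnings


def check_extension(name, version, warn_below, removed=False):
    """Return True if an extension of this version may still be used.

    warn_below -- versions older than this are considered obsolete
    removed    -- set once the grace period is over and support is gone
    """
    if version >= warn_below:
        return True
    if removed:
        return False                 # step 3: requirement bumped
    warnings.warn(                   # step 2: still works, but complain
        f"You're using an old loader/DRI module ({name} v{version}). "
        f"Update to v{warn_below} or later"
    )
    return True
```

During the grace period the old combination keeps working and only
warns; flipping `removed` after X releases turns the same check into a
hard requirement.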

And now the more important question: why ?
 - Very rarely tested and not actively supported - if it works it
works, we only cover one stable branch.
 - Having a quick look at the "if extension && extension.version >= y"
maze does leave most of us speechless.
 - Will allow us to start removing a few of the nasty quirks/hacks
that we currently have lying around.

Worth mentioning:
 - Deprecation period will be based on the longest time frame set by
LTS versions of distros. For example if Debian A ships X and mesa 3
years apart, while Ubuntu's gap is ~2.5 and RedHat's ~2.8, we'll stick
with 3 years.
 - libGL dri1 support... it's been almost four years since the removal
of the dri1 modules. Since then the only activity that I've noticed has
been by Connor Behan on the r128 front, although it seems that he has
covered the ddx and is just looking at the kernel side of things. Should
we consider mesa X (10.6 ?) as the last one that supports such old
modules in its libGL and give it a much needed cleanup ?
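
The deprecation period rule in the first item is just a maximum over the
per-distro gaps. A tiny sketch, using the made-up figures from the
example above (they are illustrative, not measured); the function name
is hypothetical.

```python
def deprecation_window(gaps_in_years):
    """Return the distro with the largest xserver <> mesa gap, and that gap."""
    distro = max(gaps_in_years, key=gaps_in_years.get)
    return distro, gaps_in_years[distro]


# Example figures from the text above, not real data.
gaps = {"Debian": 3.0, "Ubuntu": 2.5, "RedHat": 2.8}
window = deprecation_window(gaps)   # the longest gap sets the policy
```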


How would people feel about this - do we have any strong ack/nack
about the idea ? Are there many people/companies that support distros
where the xserver <> mesa gap is over, say 2 years ?

Looking forward to any feedback,
Emil

[Mesa-dev] Building Mesa/LLVMpipe on Windows

2015-06-22 Thread Florian Link

Hi everyone,

I spent some time building Mesa/llvmpipe on Windows and created a Python
script that implements all the required steps (downloading/extracting all
prerequisites and sources, configuring and building LLVM and Mesa).

The script is available at:

https://github.com/florianlink/MesaOnWindows

I hope it helps some people struggling with the build details on Windows!
If you are interested, feel free to incorporate it into Mesa, I placed the
script into the public domain.

Best regards,
Florian

P.S. Is there any reason why there are no prebuilt Mesa opengl32.dll files
available on the web? I considered putting a current dll onto Github as
well, are there any reasons why I should not do that?

[Mesa-dev] [PATCH] radeon: Advertise correct GL_SAMPLES_PASSED value.

2015-06-22 Thread Ian Romanick

From: Ian Romanick 

Commit b765119c changed the default value of all the counter bits to
64.  However, older hardware only has 32 counter bits.

This has only been build-tested.  We don't have any tests that verify
the advertised value against implementation behavior, so I don't know
what additional testing could be done.

NOTE: It appears that many Gallium drivers (at least r300 and i915g)
have the same problem, but I don't see a way for the state-tracker to
determine the counter size.

Signed-off-by: Ian Romanick 
Cc: Marek Olšák 
Cc: Alex Deucher 
---
 .../drivers/dri/radeon/radeon_common_context.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/src/mesa/drivers/dri/radeon/radeon_common_context.c b/src/mesa/drivers/dri/radeon/radeon_common_context.c
index 9699dcb..3d0ceda 100644
--- a/src/mesa/drivers/dri/radeon/radeon_common_context.c
+++ b/src/mesa/drivers/dri/radeon/radeon_common_context.c
@@ -194,6 +194,29 @@ GLboolean radeonInitContext(radeonContextPtr radeon,
 
 	radeon_init_dma(radeon);
 
+	/* _mesa_initialize_context calls _mesa_init_queryobj which
+	 * initializes all of the counter sizes to 64.  The counters on r100
+	 * and r200 are only 32-bits for occlusion queries.  Those are the
+	 * only counters, so set the other sizes to zero.
+	 */
+	radeon->glCtx.Const.QueryCounterBits.SamplesPassed = 32;
+
+	radeon->glCtx.Const.QueryCounterBits.TimeElapsed = 0;
+	radeon->glCtx.Const.QueryCounterBits.Timestamp = 0;
+	radeon->glCtx.Const.QueryCounterBits.PrimitivesGenerated = 0;
+	radeon->glCtx.Const.QueryCounterBits.PrimitivesWritten = 0;
+	radeon->glCtx.Const.QueryCounterBits.VerticesSubmitted = 0;
+	radeon->glCtx.Const.QueryCounterBits.PrimitivesSubmitted = 0;
+	radeon->glCtx.Const.QueryCounterBits.VsInvocations = 0;
+	radeon->glCtx.Const.QueryCounterBits.TessPatches = 0;
+	radeon->glCtx.Const.QueryCounterBits.TessInvocations = 0;
+	radeon->glCtx.Const.QueryCounterBits.GsInvocations = 0;
+	radeon->glCtx.Const.QueryCounterBits.GsPrimitives = 0;
+	radeon->glCtx.Const.QueryCounterBits.FsInvocations = 0;
+	radeon->glCtx.Const.QueryCounterBits.ComputeInvocations = 0;
+	radeon->glCtx.Const.QueryCounterBits.ClInPrimitives = 0;
+	radeon->glCtx.Const.QueryCounterBits.ClOutPrimitives = 0;
+
 	return GL_TRUE;
 }
 
-- 
2.1.0
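
To illustrate why the advertised value matters, here is a small Python
model of an application that uses the counter size to decide whether a
result can have wrapped. Both helpers are hypothetical and only model
the arithmetic; they are not Mesa or GL calls.

```python
def hardware_count(samples, hw_bits):
    """Value a counter with hw_bits bits reports after `samples` samples."""
    return samples % (1 << hw_bits)


def result_is_trustworthy(max_samples, advertised_bits):
    """An application can rule out wrap-around only if its worst-case
    sample count fits in the advertised number of bits."""
    return max_samples < (1 << advertised_bits)


samples = 2**32 + 5   # more samples than a 32-bit counter can hold
```

With 64 bits advertised the application concludes that no wrap is
possible, yet a 32-bit counter reports 5 for the sample count above.
With 32 bits advertised the same check tells it the result may have
wrapped.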


Re: [Mesa-dev] [PATCH 11/11] android: egl: do not link against libglapi

2015-06-22 Thread Marek Olšák

Yes, I think we need to support every loader/driver combination, but
I'm not sure.

Ian, how much do we care about compatibility between loaders
(libGL, libEGL) and DRI drivers, please?

Thanks,

Marek


Re: [Mesa-dev] [PATCH 00/11] glapi fixes - build whole of mesa with

2015-06-22 Thread Emil Velikov

On 22 June 2015 at 15:01, Jose Fonseca  wrote:
> On 19/06/15 23:09, Emil Velikov wrote:
>>
>> On 19 June 2015 at 21:26, Jose Fonseca  wrote:
>>>
>>> On 19/06/15 20:56, Emil Velikov wrote:


>>>> Hi all,
>>>>
>>>> A lovely series inspired (more like 'was awaken to send these out') by
>>>> Pal Rohár, who was having issues when building xlib-libgl (plus the now
>>>> enabled gles*)
>>>>
>>>> So here, we teach the final two static glapi users about shared-glapi,
>>>> plus some related fixes. After this is done we can finally start
>>>> transitioning to shared-only glapi, with some more details as mentioned
>>>> in one of the patches:
>>>>
>>>>    XXX: With this one done, we can finally transition with enforcing
>>>>    shared-glapi, and
>>>>
>>>>     - link the dri modules against libglapi.so, add --no-undefined to
>>>>       the LDFLAGS
>>>>     - drop the dlopen(libglapi.so/libGL.so, RTLD_GLOBAL) workarounds
>>>>       in the loaders - libGL, libEGL and libgbm.
>>>>     - start killing off/cleaning up the dispatch ?
>>>>
>>>>    The caveats:
>>>>    1) up to what stage do we care about static libraries
>>>>     - libgl (either dri or xlib based)
>>>>     - osmesa
>>>>     - libEGL
>>>>
>>>>    2) how about other platforms (scons) ?
>>>>     - currently the scons uses static glapi,
>>>>     - would we need the dlopen(...) on windows ?
>>>>
>>>> Hope everyone is excited about this one as I am :-)
>>>
>>>
>>>
>>> Maybe I missed the context of this changes, but why this matters or is an
>>> improvement?
>>>
>> If one goes the extra mile (which this series doesn't) - one configure
>> option less, substantial some code de-duplication and consistent use
>> of the code amongst all components provided. This way any
>> improvements/cleanups made to the shared glapi will be available to
>> osmesa/xlib-libgl.
>
>
> I'm perfectly happy with removing the configure option.
>
> And I understand the benefits of unified code paths, but I believe that for
> this particular case, the difference in requirements really demands the
> separate code paths.
>
>>> In summary, having the ability of using a shared glapi sounds great, but
>>> forcing shared glapi everywhere, sounds a bad idea.
>>>
>> I'm suspecting that people might be keen on the following idea - use
>> static glapi for osmesa/xlib-libgl and shared one everywhere else?
>
>
> Yes, that sounds reasonable for me.  (Needs libgl-gdi too.)
>
Indeed. Everything gdi is built only via scons so we'll touch it only if needed.

>>
>> I fear that this will lead to further separation/bit-rot between the
>> different implementations, but it seems like the bester compromise.
>
>
> I don't feel strongly between: a) using the same source code for both
> static/shared glapi (switched by a pre-processor define), or b) only share
> the interface but have shared/static glapi implementations.  I'm actually
> not that familiar with that code.
>
>
> Either way, we can have two glapi build targets (a shared-glapi and a
> static-glapipe) side-by-side, so that there are no more source-wide
> configure flags.
>
In theory it should be fine, in practise... I'm rather cautious as
mapi is the most convoluted part in mesa, and with the
"subdir-objects" option being toggled soon things may go (albeit
unlikely) subtly haywire.

>
> I believe a lot of the complexity of that code comes from assembly.  I
> wonder if it's really justified nowadays (and even if it is, whether it
> would be better served with GNU C assembly.) Futhermore, I believe on
> Windows we use any assembly, so if we split shared/static glapi source code,
> we could probably abandon assembly from the static-glapi.
>
I'm not 100% sure but I'd suspect that Cygwin might use it when
combined with swrast_dri. Don't know what others use - iirc some of
the BSD folks are moving over to llvm. That aside, there is a massive
amount of #ifdef spaghetti, apart from the assembly code.

Can I have your ack/nack on the idea of having shared-glapi available
for xlib-libgl (patches 2, 3 and 4), until we have both glapis built
in parallel ? As mentioned originally, we currently fail to build
if one enables gles* and xlib-libgl, and adding another hack in
configure.ac feels like flogging a dead horse.

-Emil

Re: [Mesa-dev] [RFC] Compatibility between old dri modules and new loaders, and vice verse

2015-06-22 Thread Dave Airlie

>
> As kindly hinted by Marek, currently we do have a wide selection of
> supported dri <> loader combinations.
>
> Although we like to think that things never break, we have to admit
> that not many of us test every possible combinations of dri modules
> and loaders. With the chances getting smaller as the time gap (age)
> between the two increases. As such I would like to ask if we're
> interested in gradually depreciating as the gap grows beyond X years.
>
> The rough idea that I have in my mind is:
> - Check for obsolete extensions (requirements for such) - both in the
> dri modules and the loaders (including the xserver).
> - Add some WARN messages ("You're using an old loader/DRI module.
> Update to XXX or later") when such code path is hit.
> - After X mesa releases, we remove the dri extension from the
> module(s) and bump the requirement(s) in the loader(s).
>
> And now the more important question why ?
>  - Very rarely tested and not actively supported - if it works it
> works, we only cover one stable branch.
>  - Having a quick look at the the "if extension && extension.version
> >= y" maze does leave most of us speechless.
>  - Will allow us to start removing a few of the nasty quirks/hacks
> that we currently have laying around.
>
> Worth mentioning:
>  - Depreciation period will be based on the longest time frame set by
> LTS versions of distros. For example if Debian A ships X and mesa 3
> years apart, while Ubuntu does is ~2.5 and RedHat ~2.8, we'll stick
> with 3 years.
>  - libGL dri1 support... it's been almost four years since the removal
> of the dri1 modules. Since then the only activity that I've noticed by
> Connor Behan on the r128 front. Although it seems that he has covered
> the ddx and is just looking at the kernel side of things. Should we
> consider mesa X (10.6 ?) as the last one that supports such old
> modules in it's libGL and give it a much needed cleanup ?
>
>
> How would people feel about this - do we have any strong ack/nack
> about the idea ? Are there many people/companies that support distros
> where the xserver <> mesa gap is over, say 2 years ?

We still ship 7.11 based dri1 drivers in RHEL6, and there is still a
chance of us rebasing to newer Mesa in that depending on schedules.

ajax might have a different opinion, on how likely that is, but
that would be at least another year from now where we'd want DRI1
to work.

Dave.

Re: [Mesa-dev] Building Mesa/LLVMpipe on Windows

2015-06-22 Thread Jose Fonseca

On 22/06/15 19:40, Florian Link wrote:

> Hi everyone,
>
> I spent some time building Mesa/llvmpipe on Windows and created a Python
> script that implements all the required steps (downloading/extracting all
> prerequisites and sources, configuring and building LLVM and Mesa).
>
> The script is available at:
>
> https://github.com/florianlink/MesaOnWindows

Given you're building for MSVC, you could avoid MinGW by using 
http://winflexbison.sourceforge.net/ .

BTW, I've been playing with AppVeyor for building Mesa with MSVC. 
You can see the build logs at:

https://ci.appveyor.com/project/jrfonseca/mesa

It doesn't build everything -- it uses pre-compiled LLVM binaries --, 
and it also leverages a lot of software that is pre-installed in 
AppVeyor build images.

>
> I hope it helps some people struggling with the build details on Windows!
> If you are interested, feel free to incorporate it into Mesa,

Maybe this sort of script wouldn't be a bad idea indeed.

> I placed the script into the public domain.

Didn't know about unlicense.org . Interesting.  A bit off-topic, but I 
actually have been considering public domain for future personal pet 
projects, because when

> Best regards,
> Florian
>
> P.S. Is there any reason why there are no prebuilt Mesa opengl32.dll
> files available on the web? I considered putting a current dll onto
> Github as well, are there any reasons why I should not do that?

No particular reason other than nobody could be bothered.  Mesa doesn't 
ship compiled binaries for any OS, not just Windows.

Personally I don't have the time to prepare binaries.  If this was ever to 
happen it would have to be fully automated via something like AppVeyor 
(MSVC) or Travis-CI (mingw cross-compilers).

I also worry about people just downloading opengl32.dll without 
understanding what they are doing, running into all sorts of trouble, 
and flooding us with bug reports / support requests.

Jose

Re: [Mesa-dev] [PATCH 03/14] mesa: Fix conditions to test signed, unsigned integer format

2015-06-22 Thread Anuj Phogat

On Sun, Jun 21, 2015 at 11:25 PM, Iago Toral  wrote:
> On Fri, 2015-06-19 at 13:32 -0700, Anuj Phogat wrote:
>> On Thu, Jun 18, 2015 at 11:41 PM, Iago Toral  wrote:
>> > On Thu, 2015-06-18 at 09:19 -0700, Anuj Phogat wrote:
>> >> On Thu, Jun 18, 2015 at 7:09 AM, Iago Toral  wrote:
>> >> > On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote:
>> >> >> Signed-off-by: Anuj Phogat 
>> >> >> Cc: 
>> >> >> ---
>> >> >>  src/mesa/main/readpix.c | 2 ++
>> >> >>  1 file changed, 2 insertions(+)
>> >> >>
>> >> >> diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c
>> >> >> index caa2648..a9416ef 100644
>> >> >> --- a/src/mesa/main/readpix.c
>> >> >> +++ b/src/mesa/main/readpix.c
>> >> >> @@ -160,10 +160,12 @@ _mesa_readpixels_needs_slow_path(const struct gl_context *ctx, GLenum format,
>> >> >>srcType = _mesa_get_format_datatype(rb->Format);
>> >> >>
>> >> >>if ((srcType == GL_INT &&
>> >> >> +   _mesa_is_enum_format_integer(format) &&
>> >> >> (type == GL_UNSIGNED_INT ||
>> >> >>  type == GL_UNSIGNED_SHORT ||
>> >> >>  type == GL_UNSIGNED_BYTE)) ||
>> >> >>(srcType == GL_UNSIGNED_INT &&
>> >> >> +   _mesa_is_enum_format_integer(format) &&
>> >> >> (type == GL_INT ||
>> >> >>  type == GL_SHORT ||
>> >> >>  type == GL_BYTE))) {
>> >> >
>> >> > As far as I understand this code we are trying to see if we can use
>> >> > memcpy to directly copy the contents of the framebuffer to the
>> >> > destination buffer. In that case, as long as the src/dst types have
>> >> > different sign we can't just use memcpy, right? In fact it looks like we
>> >> > might need to expand the checks to include the cases where srcType is
>> >> > GL_(UNSIGNED_)SHORT and GL_(UNSIGNED_)BYTE as well.
>> >> >
>> >> srcType returned by _mesa_get_format_datatype() is one of:
>> >> GL_UNSIGNED_NORMALIZED
>> >> GL_SIGNED_NORMALIZED
>> >> GL_UNSIGNED_INT
>> >> GL_INT
>> >> GL_FLOAT
>> >> So, the suggested checks for srcType are not required.
>> >
>> > Oh, right, although I think that does not invalidate my point: can we
>> > memcpy from a GL_UNSIGNED_NORMALIZED to a format with type GL_FLOAT or
>> > GL_SIGNED_NORMALIZED? It does not look like these checks here are
>> > thorough.
>> >
>> Helper function _mesa_need_signed_unsigned_int_conversion() is
>> meant to do the checks only for integer formats. Maybe add another
>> function to do the missing checks for other formats?
>
> I have no concerns about the _mesa_need_signed_unsigned_int_conversion
> function that you add in a later patch for your PBO work, my concern is
> related to the fact that you are assuming that the checks that you need
> in the PBO path are the same that we have in
> _mesa_readpixels_needs_slow_path, so you make both the same when I think
> they are trying to address different things.
>
> In your PBO code, you can't handle signed/unsigned integer conversions,
> so you need to detect that and fall back to another path. That should be
> fine I guess and the function _mesa_need_signed_unsigned_int_conversion
> does what you need, so no problems there.
>
> However, in _mesa_readpixels_needs_slow_path I think we don't want to
> just do integer checking. The purpose of the function is to tell whether
> we can use memcpy to copy pixels from the framebuffer to the dst, and if
> we have types with different signs, *whether they are integer or not*,
> we can't, so limiting the check only to integer types does not look
> right to me. The key aspect here is that what this function needs to
> check is not specific to integer types, even if the current code only
> seems to check things when the framebuffer has an integer format.
>
>> > In any case, that's beyond the point of your patch. Talking specifically
>> > about your patch: can we memcpy, for example, from a _signed_ integer
>> > format like MESA_FORMAT_R_SINT8 to an _unsigned_ format (integer or
>> > not)? I don't think we can, in which case your patch would not look
>> > correct to me.
>> >
>> Reading an integer format into a non-integer format is not allowed in
>> glReadPixels. That's why those cases are not relevant here and
>> we just check for integer formats. From ext_texture_integer:
>> "INVALID_OPERATION is generated by ReadPixels if <format> is
>> an integer format and the color buffer is not an integer format, or
>>  if <format> is not an integer format and the color buffer is an
>>  integer format."
>
> Right, that was not a good example, but forget about integer types, what
> if the framebuffer is something like MESA_FORMAT_R8G8B8A8_UNORM and our
> dst format/type is GL_RGBA/GL_BYTE? These are not integer types but we
> can't memcpy anyway because the framebuffer is unsigned and the dst is
> signed so a conversion is needed.
>
> Of course, the current code in this function only cares about the
> framebuffer being an integer format, but for the reasons I explain
> above, I think that is wrong in this case, I think
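
To make the two positions in this thread concrete, here is a minimal sketch in C. The enum and both helper functions are hypothetical stand-ins written for illustration; they are not Mesa code and do not use the real GL enums. The first helper mirrors the integer-only check that the patch keeps, the second mirrors the broader "any sign mismatch rules out memcpy" argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the datatypes that
 * _mesa_get_format_datatype() is said to return in this thread. */
enum sketch_datatype {
   SKETCH_UNORM,   /* stands in for GL_UNSIGNED_NORMALIZED */
   SKETCH_SNORM,   /* stands in for GL_SIGNED_NORMALIZED */
   SKETCH_UINT,    /* stands in for GL_UNSIGNED_INT */
   SKETCH_SINT,    /* stands in for GL_INT */
   SKETCH_FLOAT    /* stands in for GL_FLOAT */
};

/* Integer-only check: flags a mismatch only when the source is an
 * integer format and the destination type has the opposite sign. */
static bool sketch_int_sign_mismatch(enum sketch_datatype src,
                                     bool dst_is_signed)
{
   assert(src <= SKETCH_FLOAT);
   if (src == SKETCH_SINT)
      return !dst_is_signed;
   if (src == SKETCH_UINT)
      return dst_is_signed;
   return false;
}

/* Broader check: also flags normalized sources whose sign differs
 * from the destination.  Float sources are left out of this sketch. */
static bool sketch_any_sign_mismatch(enum sketch_datatype src,
                                     bool dst_is_signed)
{
   assert(src <= SKETCH_FLOAT);
   if (src == SKETCH_SINT || src == SKETCH_SNORM)
      return !dst_is_signed;
   if (src == SKETCH_UINT || src == SKETCH_UNORM)
      return dst_is_signed;
   return false;
}
```

With an unsigned normalized source and a signed destination (the MESA_FORMAT_R8G8B8A8_UNORM to GL_BYTE case raised above), the integer-only helper reports no mismatch while the broader helper does, which is the gap being debated.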

Re: [Mesa-dev] [PATCH] radeon: Advertise correct GL_SAMPLES_PASSED value.

2015-06-22 Thread Marek Olšák

Reviewed-by: Marek Olšák 

For Gallium, a new PIPE_CAP or new get_xxx_param function will be needed.

Marek


On Mon, Jun 22, 2015 at 8:41 PM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> Commit b765119c changed the default value of all the counter bits to
> 64.  However, older hardware only has 32 counter bits.
>
> This has only been build-tested.  We don't have any tests that verify
> the advertised value against implementation behavior, so I don't know
> what additional testing could be done.
>
> NOTE: It appears that many Gallium drivers (at least r300 and i915g)
> have the same problem, but I don't see a way for the state-tracker to
> determine the counter size.
>
> Signed-off-by: Ian Romanick 
> Cc: Marek Olšák 
> Cc: Alex Deucher 
> ---
>  .../drivers/dri/radeon/radeon_common_context.c | 23 
> ++
>  1 file changed, 23 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/radeon/radeon_common_context.c 
> b/src/mesa/drivers/dri/radeon/radeon_common_context.c
> index 9699dcb..3d0ceda 100644
> --- a/src/mesa/drivers/dri/radeon/radeon_common_context.c
> +++ b/src/mesa/drivers/dri/radeon/radeon_common_context.c
> @@ -194,6 +194,29 @@ GLboolean radeonInitContext(radeonContextPtr radeon,
>
> radeon_init_dma(radeon);
>
> +/* _mesa_initialize_context calls _mesa_init_queryobj which
> + * initializes all of the counter sizes to 64.  The counters on r100
> + * and r200 are only 32-bits for occlusion queries.  Those are the
> + * only counters, so set the other sizes to zero.
> + */
> +radeon->glCtx.Const.QueryCounterBits.SamplesPassed = 32;
> +
> +radeon->glCtx.Const.QueryCounterBits.TimeElapsed = 0;
> +radeon->glCtx.Const.QueryCounterBits.Timestamp = 0;
> +radeon->glCtx.Const.QueryCounterBits.PrimitivesGenerated = 0;
> +radeon->glCtx.Const.QueryCounterBits.PrimitivesWritten = 0;
> +radeon->glCtx.Const.QueryCounterBits.VerticesSubmitted = 0;
> +radeon->glCtx.Const.QueryCounterBits.PrimitivesSubmitted = 0;
> +radeon->glCtx.Const.QueryCounterBits.VsInvocations = 0;
> +radeon->glCtx.Const.QueryCounterBits.TessPatches = 0;
> +radeon->glCtx.Const.QueryCounterBits.TessInvocations = 0;
> +radeon->glCtx.Const.QueryCounterBits.GsInvocations = 0;
> +radeon->glCtx.Const.QueryCounterBits.GsPrimitives = 0;
> +radeon->glCtx.Const.QueryCounterBits.FsInvocations = 0;
> +radeon->glCtx.Const.QueryCounterBits.ComputeInvocations = 0;
> +radeon->glCtx.Const.QueryCounterBits.ClInPrimitives = 0;
> +radeon->glCtx.Const.QueryCounterBits.ClOutPrimitives = 0;
> +
> return GL_TRUE;
>  }
>
> --
> 2.1.0
>
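
Why the advertised counter width matters can be shown with a small sketch in C. It assumes an application that samples a counter twice and wants the difference; the function name and the masking approach are illustrative assumptions, not code from Mesa or from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Difference between two samples of a counter that is `bits` wide.
 * Hypothetical helper: a width of 0 is treated as "counter not
 * supported", matching how the patch zeroes the unused sizes. */
static uint64_t sketch_counter_delta(uint64_t begin, uint64_t end,
                                     unsigned bits)
{
   assert(bits <= 64);

   if (bits == 0)
      return 0;

   /* Shifting a 64-bit value by 64 is undefined, so handle it apart. */
   uint64_t mask = (bits == 64) ? UINT64_MAX
                                : ((UINT64_C(1) << bits) - 1);

   /* Unsigned subtraction wraps; the mask keeps only the valid bits. */
   return (end - begin) & mask;
}
```

If a 32-bit counter wraps between the two samples, masking with the true width still gives the right delta, whereas assuming 64 bits would yield a huge bogus value.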

Re: [Mesa-dev] [Mesa-stable] [PATCH] egl/x11: Remove duplicate call to dri2_x11_add_configs_for_visuals

2015-06-22 Thread Chad Versace

On Thu 18 Jun 2015, Emil Velikov wrote:
> Hi Boyan,
> 
> On 13 June 2015 at 08:33, Boyan Ding  wrote:
> > The call to dri2_x11_add_configs_for_visuals (previously
> > dri2_add_configs_for_visuals) was moved downwards in commit f8c5b8a1,
> > but appeared again in its original position after its rename in
> > d019cd81. Remove it.
> >
> I believe you're spot on here. The latter commit mentions
> only the renaming, so it seems that the hunk crept back in when the
> patch was rebased. Adding Chad to the Cc list, just in case we've
> missed something :-)
> 
> Fwiw the patch is
> Reviewed-by: Emil Velikov 

Reviewed-by: Chad Versace 


[Mesa-dev] [RFC PATCH 8/8] nv50: enable GL_AMD_performance_monitor

2015-06-22 Thread Samuel Pitoiset

This exposes a group of global performance counters, which enables
GL_AMD_performance_monitor. All piglit tests pass.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nv50/nv50_query.c  | 35 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c |  1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.h |  6 +
 3 files changed, 42 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
b/src/gallium/drivers/nouveau/nv50/nv50_query.c
index 062d427..6638e82 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
@@ -1566,6 +1566,7 @@ nv50_screen_get_driver_query_info(struct pipe_screen 
*pscreen,
 
  info->name = cfg->event->name;
  info->query_type = NV50_HW_PM_QUERY(id);
+ info->group_id = NV50_HW_PM_QUERY_GROUP;
  info->max_value.u64 =
 (cfg->event->display == NV50_HW_PM_EVENT_DISPLAY_RATIO) ? 100 : 0;
  return 1;
@@ -1576,6 +1577,40 @@ nv50_screen_get_driver_query_info(struct pipe_screen 
*pscreen,
return 0;
 }
 
+int
+nv50_screen_get_driver_query_group_info(struct pipe_screen *pscreen,
+unsigned id,
+struct pipe_driver_query_group_info 
*info)
+{
+   struct nv50_screen *screen = nv50_screen(pscreen);
+   int count = 0;
+
+   // TODO: Check DRM version when nvif will be merged in libdrm!
+   if (screen->base.perfmon) {
+  count++; /* NV50_HW_PM_QUERY_GROUP */
+   }
+
+   if (!info)
+  return count;
+
+   if (id == NV50_HW_PM_QUERY_GROUP) {
+  if (screen->base.perfmon) {
+ info->name = "Global performance counters";
+ info->type = PIPE_DRIVER_QUERY_GROUP_TYPE_GPU;
+ info->num_queries = NV50_HW_PM_QUERY_COUNT;
+ info->max_active_queries = 1; /* TODO: get rid of this limitation! */
+ return 1;
+  }
+   }
+
+   /* user asked for info about non-existing query group */
+   info->name = "this_is_not_the_query_group_you_are_looking_for";
+   info->max_active_queries = 0;
+   info->num_queries = 0;
+   info->type = 0;
+   return 0;
+}
+
 void
 nv50_init_query_functions(struct nv50_context *nv50)
 {
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index f07798e..dfe20c9 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -746,6 +746,7 @@ nv50_screen_create(struct nouveau_device *dev)
pscreen->get_shader_param = nv50_screen_get_shader_param;
pscreen->get_paramf = nv50_screen_get_paramf;
pscreen->get_driver_query_info = nv50_screen_get_driver_query_info;
+   pscreen->get_driver_query_group_info = 
nv50_screen_get_driver_query_group_info;
 
nv50_screen_init_resource_functions(pscreen);
 
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
index 69127c0..807ae0e 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
@@ -114,6 +114,9 @@ nv50_screen(struct pipe_screen *screen)
return (struct nv50_screen *)screen;
 }
 
+/* Hardware global performance counters groups. */
+#define NV50_HW_PM_QUERY_GROUP 0
+
 /* Hardware global performance counters. */
 #define NV50_HW_PM_QUERY_COUNT  24
 #define NV50_HW_PM_QUERY(i)(PIPE_QUERY_DRIVER_SPECIFIC + (i))
@@ -146,6 +149,9 @@ nv50_screen(struct pipe_screen *screen)
 int nv50_screen_get_driver_query_info(struct pipe_screen *, unsigned,
   struct pipe_driver_query_info *);
 
+int nv50_screen_get_driver_query_group_info(struct pipe_screen *, unsigned,
+struct 
pipe_driver_query_group_info *);
+
 boolean nv50_blitter_create(struct nv50_screen *);
 void nv50_blitter_destroy(struct nv50_screen *);
 
-- 
2.4.4
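
The calling convention used by the patch above (a NULL info pointer returns the count, otherwise one entry is filled in and 1 or 0 is returned) can be sketched in standalone C. Every name here is hypothetical and the structure is deliberately simplified; it is not the Gallium interface itself.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for a driver query info struct. */
struct sketch_query_info {
   const char *name;
   unsigned query_type;
};

#define SKETCH_QUERY_COUNT 3

static const char *const sketch_names[SKETCH_QUERY_COUNT] = {
   "event_a", "event_b", "event_c"
};

/* Provider side: info == NULL asks "how many entries are there?". */
static int sketch_get_query_info(unsigned id, struct sketch_query_info *info)
{
   if (!info)
      return SKETCH_QUERY_COUNT;

   if (id < SKETCH_QUERY_COUNT) {
      info->name = sketch_names[id];
      info->query_type = id;
      return 1;
   }

   /* Unknown id: leave a recognizable placeholder and report failure. */
   info->name = "unknown";
   info->query_type = 0;
   return 0;
}

/* Caller side: ask for the count first, then fetch each entry. */
static int sketch_enumerate(struct sketch_query_info *out, int capacity)
{
   assert(out != NULL);

   int count = sketch_get_query_info(0, NULL);
   int filled = 0;

   for (int i = 0; i < count && filled < capacity; i++) {
      if (sketch_get_query_info((unsigned)i, &out[filled]))
         filled++;
   }
   return filled;
}
```

The single entry point keeps the interface small, at the cost of callers having to remember the NULL convention.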


[Mesa-dev] [RFC PATCH 5/8] nv50: prevent NULL pointer dereference with pipe_query functions

2015-06-22 Thread Samuel Pitoiset

This may happen when nv50_query_create() fails to create a new query.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nv50/nv50_query.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
b/src/gallium/drivers/nouveau/nv50/nv50_query.c
index 55fcac8..1162110 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
@@ -96,6 +96,9 @@ nv50_query_allocate(struct nv50_context *nv50, struct 
nv50_query *q, int size)
 static void
 nv50_query_destroy(struct pipe_context *pipe, struct pipe_query *pq)
 {
+   if (!pq)
+  return;
+
nv50_query_allocate(nv50_context(pipe), nv50_query(pq), 0);
nouveau_fence_ref(NULL, &nv50_query(pq)->fence);
FREE(nv50_query(pq));
@@ -152,6 +155,9 @@ nv50_query_begin(struct pipe_context *pipe, struct 
pipe_query *pq)
struct nouveau_pushbuf *push = nv50->base.pushbuf;
struct nv50_query *q = nv50_query(pq);
 
+   if (!pq)
+  return FALSE;
+
/* For occlusion queries we have to change the storage, because a previous
 * query might set the initial render conition to FALSE even *after* we re-
 * initialized it to TRUE.
@@ -218,6 +224,9 @@ nv50_query_end(struct pipe_context *pipe, struct pipe_query 
*pq)
struct nouveau_pushbuf *push = nv50->base.pushbuf;
struct nv50_query *q = nv50_query(pq);
 
+   if (!pq)
+  return;
+
q->state = NV50_QUERY_STATE_ENDED;
 
switch (q->type) {
@@ -294,9 +303,12 @@ nv50_query_result(struct pipe_context *pipe, struct 
pipe_query *pq,
uint64_t *res64 = (uint64_t *)result;
uint32_t *res32 = (uint32_t *)result;
boolean *res8 = (boolean *)result;
-   uint64_t *data64 = (uint64_t *)q->data;
+   uint64_t *data64;
int i;
 
+   if (!pq)
+  return FALSE;
+
if (q->state != NV50_QUERY_STATE_READY)
   nv50_query_update(q);
 
@@ -314,6 +326,7 @@ nv50_query_result(struct pipe_context *pipe, struct 
pipe_query *pq,
}
q->state = NV50_QUERY_STATE_READY;
 
+   data64 = (uint64_t *)q->data;
switch (q->type) {
case PIPE_QUERY_GPU_FINISHED:
   res8[0] = TRUE;
-- 
2.4.4
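
The last two hunks above move the read of q->data to after the NULL check. A minimal standalone sketch of that ordering follows; the struct and function are hypothetical simplifications, not the nv50 driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down query object. */
struct sketch_query {
   int state;
   uint64_t *data;
};

static bool sketch_query_result(struct sketch_query *q, uint64_t *result)
{
   uint64_t *data64;   /* declared up front, but not read from q yet */

   /* Creation may have failed, so the caller can hand us NULL. */
   if (!q || !result)
      return false;

   /* Assumed invariant: a successfully created query has storage. */
   assert(q->data != NULL);

   data64 = q->data;   /* safe only now that q is known to be valid */
   *result = data64[0];
   return true;
}
```

Initializing data64 in its declaration would dereference q before the guard runs, which is exactly what the patch avoids.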


[Mesa-dev] [RFC PATCH 2/8] nv50: allocate a software object class

2015-06-22 Thread Samuel Pitoiset

This will allow monitoring global performance counters through the
command stream of the GPU instead of using ioctls.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nv50/nv50_screen.c | 11 +++
 src/gallium/drivers/nouveau/nv50/nv50_screen.h |  1 +
 src/gallium/drivers/nouveau/nv50/nv50_winsys.h |  1 +
 3 files changed, 13 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 6583a35..c985344 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -367,6 +367,7 @@ nv50_screen_destroy(struct pipe_screen *pscreen)
nouveau_object_del(&screen->eng2d);
nouveau_object_del(&screen->m2mf);
nouveau_object_del(&screen->sync);
+   nouveau_object_del(&screen->sw);
 
nouveau_screen_fini(&screen->base);
 
@@ -437,6 +438,9 @@ nv50_screen_init_hwctx(struct nv50_screen *screen)
BEGIN_NV04(push, SUBC_3D(NV01_SUBCHAN_OBJECT), 1);
PUSH_DATA (push, screen->tesla->handle);
 
+   BEGIN_NV04(push, SUBC_SW(NV01_SUBCHAN_OBJECT), 1);
+   PUSH_DATA (push, screen->sw->handle);
+
BEGIN_NV04(push, NV50_3D(COND_MODE), 1);
PUSH_DATA (push, NV50_3D_COND_MODE_ALWAYS);
 
@@ -768,6 +772,13 @@ nv50_screen_create(struct nouveau_device *dev)
   goto fail;
}
 
+   ret = nouveau_object_new(chan, 0xbeef506e, 0x506e,
+NULL, 0, &screen->sw);
+   if (ret) {
+  NOUVEAU_ERR("Failed to allocate SW object: %d\n", ret);
+  goto fail;
+   }
+
ret = nouveau_object_new(chan, 0xbeef5039, NV50_M2MF_CLASS,
 NULL, 0, &screen->m2mf);
if (ret) {
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
index 881051b..69fdfdb 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
@@ -93,6 +93,7 @@ struct nv50_screen {
struct nouveau_object *tesla;
struct nouveau_object *eng2d;
struct nouveau_object *m2mf;
+   struct nouveau_object *sw;
 };
 
 static INLINE struct nv50_screen *
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_winsys.h 
b/src/gallium/drivers/nouveau/nv50/nv50_winsys.h
index e8578c8..5cb33ef 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_winsys.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_winsys.h
@@ -60,6 +60,7 @@ PUSH_REFN(struct nouveau_pushbuf *push, struct nouveau_bo 
*bo, uint32_t flags)
 #define SUBC_COMPUTE(m) 6, (m)
 #define NV50_COMPUTE(n) SUBC_COMPUTE(NV50_COMPUTE_##n)
 
+#define SUBC_SW(m) 7, (m)
 
 static INLINE uint32_t
 NV50_FIFO_PKHDR(int subc, int mthd, unsigned size)
-- 
2.4.4


[Mesa-dev] [RFC PATCH 3/8] nv50: allocate and map a notifier buffer object for PM

2015-06-22 Thread Samuel Pitoiset

This notifier buffer object will be used to read back global performance
counter results written by the kernel.

For each domain, we will store the handle of the perfdom object, an
array of 4 counters and the number of cycles. Like Gallium's HUD,
we keep a list of busy queries in a ring in order to prevent stalls
when reading queries.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nv50/nv50_screen.c | 29 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.h |  6 ++
 2 files changed, 35 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index c985344..3a99cc8 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -368,6 +368,7 @@ nv50_screen_destroy(struct pipe_screen *pscreen)
nouveau_object_del(&screen->m2mf);
nouveau_object_del(&screen->sync);
nouveau_object_del(&screen->sw);
+   nouveau_object_del(&screen->query);
 
nouveau_screen_fini(&screen->base);
 
@@ -699,9 +700,11 @@ nv50_screen_create(struct nouveau_device *dev)
struct nv50_screen *screen;
struct pipe_screen *pscreen;
struct nouveau_object *chan;
+   struct nv04_fifo *fifo;
uint64_t value;
uint32_t tesla_class;
unsigned stack_size;
+   uint32_t length;
int ret;
 
screen = CALLOC_STRUCT(nv50_screen);
@@ -727,6 +730,7 @@ nv50_screen_create(struct nouveau_device *dev)
screen->base.pushbuf->rsvd_kick = 5;
 
chan = screen->base.channel;
+   fifo = chan->data;
 
pscreen->destroy = nv50_screen_destroy;
pscreen->context_create = nv50_create;
@@ -772,6 +776,23 @@ nv50_screen_create(struct nouveau_device *dev)
   goto fail;
}
 
+   /* Compute size (in bytes) of the notifier buffer object which is used
+* in order to read back global performance counters results written
+* by the kernel. For each domain, we store the handle of the perfdom
+* object, an array of 4 counters and the number of cycles. Like for
+* the Gallium's HUD, we keep a list of busy queries in a ring in order
+* to prevent stalls when reading queries. */
+   length = (1 + (NV50_HW_PM_RING_BUFFER_NUM_DOMAINS * 6) *
+  NV50_HW_PM_RING_BUFFER_MAX_QUERIES) * 4;
+
+   ret = nouveau_object_new(chan, 0xbeef0302, NOUVEAU_NOTIFIER_CLASS,
+&(struct nv04_notify){ .length = length },
+sizeof(struct nv04_notify), &screen->query);
+   if (ret) {
+   NOUVEAU_ERR("Failed to allocate notifier object for PM: %d\n", ret);
+   goto fail;
+   }
+
ret = nouveau_object_new(chan, 0xbeef506e, 0x506e,
 NULL, 0, &screen->sw);
if (ret) {
@@ -845,6 +866,14 @@ nv50_screen_create(struct nouveau_device *dev)
nouveau_heap_init(&screen->gp_code_heap, 0, 1 << NV50_CODE_BO_SIZE_LOG2);
nouveau_heap_init(&screen->fp_code_heap, 0, 1 << NV50_CODE_BO_SIZE_LOG2);
 
+   ret = nouveau_bo_wrap(screen->base.device, fifo->notify, 
&screen->notify_bo);
+   if (ret == 0)
+  ret = nouveau_bo_map(screen->notify_bo, 0, screen->base.client);
+   if (ret) {
+  NOUVEAU_ERR("Failed to map notifier object for PM: %d\n", ret);
+  goto fail;
+   }
+
nouveau_getparam(dev, NOUVEAU_GETPARAM_GRAPH_UNITS, &value);
 
screen->TPs = util_bitcount(value & 0x);
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
index 69fdfdb..71a5247 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
@@ -59,6 +59,7 @@ struct nv50_screen {
struct nouveau_bo *txc; /* TIC (offset 0) and TSC (65536) */
struct nouveau_bo *stack_bo;
struct nouveau_bo *tls_bo;
+   struct nouveau_bo *notify_bo;
 
unsigned TPs;
unsigned MPsInTP;
@@ -89,6 +90,7 @@ struct nv50_screen {
} fence;
 
struct nouveau_object *sync;
+   struct nouveau_object *query;
 
struct nouveau_object *tesla;
struct nouveau_object *eng2d;
@@ -96,6 +98,10 @@ struct nv50_screen {
struct nouveau_object *sw;
 };
 
+/* Parameters of the ring buffer used to read back global PM counters. */
+#define NV50_HW_PM_RING_BUFFER_NUM_DOMAINS 8
+#define NV50_HW_PM_RING_BUFFER_MAX_QUERIES 9 /* HUD_NUM_QUERIES + 1 */
+
 static INLINE struct nv50_screen *
 nv50_screen(struct pipe_screen *screen)
 {
-- 
2.4.4
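
As a worked check of the size formula in the patch above, here is a small C sketch. The size function follows the patch's expression directly. The slot-offset helper is an assumption made for illustration: the patch does not say how queries and domains are ordered inside the buffer, nor what the leading word holds.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_DOMAINS      8  /* NV50_HW_PM_RING_BUFFER_NUM_DOMAINS */
#define SKETCH_MAX_QUERIES      9  /* NV50_HW_PM_RING_BUFFER_MAX_QUERIES */
#define SKETCH_WORDS_PER_DOMAIN 6  /* handle + 4 counters + cycle count */

/* Size in bytes, mirroring the patch:
 * (1 + (num_domains * 6) * max_queries) * 4 */
static uint32_t sketch_pm_size_bytes(uint32_t num_domains,
                                     uint32_t max_queries)
{
   uint32_t words = 1 + num_domains * SKETCH_WORDS_PER_DOMAIN * max_queries;
   return words * 4;
}

/* ASSUMED layout: one leading word, then queries stored back to back,
 * each holding num_domains slots of 6 words.  Returns a word offset. */
static uint32_t sketch_pm_slot_offset(uint32_t query, uint32_t domain,
                                      uint32_t num_domains)
{
   assert(domain < num_domains);
   return 1 + (query * num_domains + domain) * SKETCH_WORDS_PER_DOMAIN;
}
```

With the values from the patch this gives (1 + 8 * 6 * 9) * 4 = 1732 bytes, and under the assumed layout the last slot ends exactly at word 433, the end of the buffer.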


[Mesa-dev] [RFC PATCH 7/8] nv50: expose global performance counters to the HUD

2015-06-22 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nv50/nv50_query.c  | 41 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c |  1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.h |  3 ++
 3 files changed, 45 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
b/src/gallium/drivers/nouveau/nv50/nv50_query.c
index b9d2914..062d427 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
@@ -1535,6 +1535,47 @@ nv50_hw_pm_query_result(struct nv50_context *nv50, 
struct nv50_query *q,
return TRUE;
 }
 
+int
+nv50_screen_get_driver_query_info(struct pipe_screen *pscreen,
+  unsigned id,
+  struct pipe_driver_query_info *info)
+{
+   struct nv50_screen *screen = nv50_screen(pscreen);
+   int count = 0;
+
+   // TODO: Check DRM version when nvif will be merged in libdrm!
+   if (screen->base.perfmon) {
+  nv50_identify_events(screen);
+  count += NV50_HW_PM_QUERY_COUNT;
+   }
+
+   if (!info)
+  return count;
+
+   /* Init default values. */
+   info->name = "this_is_not_the_query_you_are_looking_for";
+   info->query_type = 0xdeadd01d;
+   info->type = PIPE_DRIVER_QUERY_TYPE_UINT64;
+   info->max_value.u64 = 0;
+   info->group_id = -1;
+
+   if (id < count) {
+  if (screen->base.perfmon) {
+ const struct nv50_hw_pm_query_cfg *cfg =
+nv50_hw_pm_query_get_cfg(screen, NV50_HW_PM_QUERY(id));
+
+ info->name = cfg->event->name;
+ info->query_type = NV50_HW_PM_QUERY(id);
+ info->max_value.u64 =
+(cfg->event->display == NV50_HW_PM_EVENT_DISPLAY_RATIO) ? 100 : 0;
+ return 1;
+  }
+   }
+
+   /* User asked for info about non-existing query. */
+   return 0;
+}
+
 void
 nv50_init_query_functions(struct nv50_context *nv50)
 {
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 53817c0..f07798e 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -745,6 +745,7 @@ nv50_screen_create(struct nouveau_device *dev)
pscreen->get_param = nv50_screen_get_param;
pscreen->get_shader_param = nv50_screen_get_shader_param;
pscreen->get_paramf = nv50_screen_get_paramf;
+   pscreen->get_driver_query_info = nv50_screen_get_driver_query_info;
 
nv50_screen_init_resource_functions(pscreen);
 
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
index 0449659..69127c0 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
@@ -143,6 +143,9 @@ nv50_screen(struct pipe_screen *screen)
 #define NV50_HW_PM_QUERY_TEX_CACHE_HIT  22
 #define NV50_HW_PM_QUERY_TEX_WAITS_FOR_FB   23
 
+int nv50_screen_get_driver_query_info(struct pipe_screen *, unsigned,
+  struct pipe_driver_query_info *);
+
 boolean nv50_blitter_create(struct nv50_screen *);
 void nv50_blitter_destroy(struct nv50_screen *);
 
-- 
2.4.4


[Mesa-dev] [RFC PATCH 6/8] nv50: add support for compute/graphics global performance counters

2015-06-22 Thread Samuel Pitoiset

This commit adds support for both compute and graphics global
performance counters which have been reverse engineered with
CUPTI (Linux) and PerfKit (Windows).

Currently, only one query type can be monitored at a time because
Gallium's HUD is not well suited to this. This will be improved later.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nv50/nv50_query.c  | 1057 +++-
 src/gallium/drivers/nouveau/nv50/nv50_screen.h |   35 +
 2 files changed, 1087 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
b/src/gallium/drivers/nouveau/nv50/nv50_query.c
index 1162110..b9d2914 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
@@ -27,6 +27,8 @@
 #include "nv50/nv50_context.h"
 #include "nv_object.xml.h"
 
+#include "nouveau_perfmon.h"
+
 #define NV50_QUERY_STATE_READY   0
 #define NV50_QUERY_STATE_ACTIVE  1
 #define NV50_QUERY_STATE_ENDED   2
@@ -51,10 +53,25 @@ struct nv50_query {
boolean is64bit;
struct nouveau_mm_allocation *mm;
struct nouveau_fence *fence;
+   struct nouveau_object *perfdom;
 };
 
 #define NV50_QUERY_ALLOC_SPACE 256
 
+#ifdef DEBUG
+static void nv50_hw_pm_dump_perfdom(struct nvif_perfdom_v0 *args);
+#endif
+
+static boolean
+nv50_hw_pm_query_create(struct nv50_context *, struct nv50_query *);
+static void
+nv50_hw_pm_query_destroy(struct nv50_context *, struct nv50_query *);
+static boolean
+nv50_hw_pm_query_begin(struct nv50_context *, struct nv50_query *);
+static void nv50_hw_pm_query_end(struct nv50_context *, struct nv50_query *);
+static boolean nv50_hw_pm_query_result(struct nv50_context *,
+struct nv50_query *, boolean, void *);
+
 static INLINE struct nv50_query *
 nv50_query(struct pipe_query *pipe)
 {
@@ -96,12 +113,18 @@ nv50_query_allocate(struct nv50_context *nv50, struct 
nv50_query *q, int size)
 static void
 nv50_query_destroy(struct pipe_context *pipe, struct pipe_query *pq)
 {
+   struct nv50_context *nv50 = nv50_context(pipe);
+   struct nv50_query *q = nv50_query(pq);
+
if (!pq)
   return;
 
-   nv50_query_allocate(nv50_context(pipe), nv50_query(pq), 0);
-   nouveau_fence_ref(NULL, &nv50_query(pq)->fence);
-   FREE(nv50_query(pq));
+   if ((q->type >= NV50_HW_PM_QUERY(0) && q->type <= NV50_HW_PM_QUERY_LAST))
+  nv50_hw_pm_query_destroy(nv50, q);
+
+   nv50_query_allocate(nv50, q, 0);
+   nouveau_fence_ref(NULL, &q->fence);
+   FREE(q);
 }
 
 static struct pipe_query *
@@ -130,6 +153,11 @@ nv50_query_create(struct pipe_context *pipe, unsigned 
type, unsigned index)
   q->data -= 32 / sizeof(*q->data); /* we advance before query_begin ! */
}
 
+   if ((q->type >= NV50_HW_PM_QUERY(0) && q->type <= NV50_HW_PM_QUERY_LAST)) {
+  if (!nv50_hw_pm_query_create(nv50, q))
+ return NULL;
+   }
+
return (struct pipe_query *)q;
 }
 
@@ -154,6 +182,7 @@ nv50_query_begin(struct pipe_context *pipe, struct 
pipe_query *pq)
struct nv50_context *nv50 = nv50_context(pipe);
struct nouveau_pushbuf *push = nv50->base.pushbuf;
struct nv50_query *q = nv50_query(pq);
+   boolean ret = TRUE;
 
if (!pq)
   return FALSE;
@@ -211,10 +240,13 @@ nv50_query_begin(struct pipe_context *pipe, struct 
pipe_query *pq)
   nv50_query_get(push, q, 0x10, 0x5002);
   break;
default:
+  if ((q->type >= NV50_HW_PM_QUERY(0) && q->type <= 
NV50_HW_PM_QUERY_LAST)) {
+ ret = nv50_hw_pm_query_begin(nv50, q);
+  }
   break;
}
q->state = NV50_QUERY_STATE_ACTIVE;
-   return true;
+   return ret;
 }
 
 static void
@@ -274,7 +306,9 @@ nv50_query_end(struct pipe_context *pipe, struct pipe_query 
*pq)
   q->state = NV50_QUERY_STATE_READY;
   break;
default:
-  assert(0);
+  if ((q->type >= NV50_HW_PM_QUERY(0) && q->type <= 
NV50_HW_PM_QUERY_LAST)) {
+ nv50_hw_pm_query_end(nv50, q);
+  }
   break;
}
 
@@ -309,6 +343,10 @@ nv50_query_result(struct pipe_context *pipe, struct 
pipe_query *pq,
if (!pq)
   return FALSE;
 
+   if ((q->type >= NV50_HW_PM_QUERY(0) && q->type <= NV50_HW_PM_QUERY_LAST)) {
+  return nv50_hw_pm_query_result(nv50, q, wait, result);
+   }
+
if (q->state != NV50_QUERY_STATE_READY)
   nv50_query_update(q);
 
@@ -488,6 +526,1015 @@ nva0_so_target_save_offset(struct pipe_context *pipe,
nv50_query_end(pipe, targ->pq);
 }
 
+/* === HARDWARE GLOBAL PERFORMANCE COUNTERS for NV50 === */
+
+struct nv50_hw_pm_source_cfg
+{
+   const char *name;
+   uint64_t value;
+};
+
+struct nv50_hw_pm_signal_cfg
+{
+   const char *name;
+   const struct nv50_hw_pm_source_cfg src[8];
+};
+
+struct nv50_hw_pm_counter_cfg
+{
+   uint16_t logic_op;
+   const struct nv50_hw_pm_signal_cfg sig[4];
+};
+
+enum nv50_hw_pm_query_display
+{
+   NV50_HW_PM_EVENT_DISPLAY_RAW,
+   NV50_HW_PM_EVENT_DISPLAY_RATIO,
+};
+
+enum nv50_hw_pm_query_count
+{
+   NV50_HW_PM_EVENT_COUNT_SIMPLE,
+   NV50_H

[Mesa-dev] [RFC PATCH 0/8] nv50: expose global performance counters

2015-06-22 Thread Samuel Pitoiset

Hello there,

This series exposes NVIDIA's global performance counters for Tesla through
Gallium's HUD and the GL_AMD_performance_monitor extension.

This adds support for 24 hardware events which have been reverse engineered
with PerfKit (Windows) and CUPTI (Linux). These hardware events will allow
developers to profile OpenGL applications.

To reduce latency and to improve accuracy, these global performance counters
are tied to the command stream of the GPU using a set of software methods
instead of ioctls. Results are then written by the kernel to a mapped notifier
buffer object that allows userspace to read them back.

However, the libdrm branch which implements the new nvif interface exposed by
Nouveau and the software methods interface are not upstream yet. I hope this
will be done in the next few days.

The code of this series can be found here:
http://cgit.freedesktop.org/~hakzsam/mesa/log/?h=nouveau_perfmon

The libdrm branch can be found here:
http://cgit.freedesktop.org/~hakzsam/drm/log/?h=nouveau_perfmon

The code of the software methods interface can be found here (last two commits):
http://cgit.freedesktop.org/~hakzsam/nouveau/log/?h=nouveau_perfmon

Another series which exposes global performance counters for Fermi and Kepler
will be submitted once I have received enough reviews for this one.

Feel free to review.

Thanks,
Samuel.

Samuel Pitoiset (8):
  nouveau: implement the nvif hardware performance counters interface
  nv50: allocate a software object class
  nv50: allocate and map a notifier buffer object for PM
  nv50: configure the ring buffer for reading back PM counters
  nv50: prevent NULL pointer dereference with pipe_query functions
  nv50: add support for compute/graphics global performance counters
  nv50: expose global performance counters to the HUD
  nv50: enable GL_AMD_performance_monitor

 src/gallium/drivers/nouveau/Makefile.sources   |2 +
 src/gallium/drivers/nouveau/nouveau_perfmon.c  |  302 +++
 src/gallium/drivers/nouveau/nouveau_perfmon.h  |   59 ++
 src/gallium/drivers/nouveau/nouveau_screen.c   |5 +
 src/gallium/drivers/nouveau/nouveau_screen.h   |1 +
 src/gallium/drivers/nouveau/nv50/nv50_query.c  | 1148 +++-
 src/gallium/drivers/nouveau/nv50/nv50_screen.c |   49 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.h |   51 ++
 src/gallium/drivers/nouveau/nv50/nv50_winsys.h |1 +
 9 files changed, 1612 insertions(+), 6 deletions(-)
 create mode 100644 src/gallium/drivers/nouveau/nouveau_perfmon.c
 create mode 100644 src/gallium/drivers/nouveau/nouveau_perfmon.h

-- 
2.4.4


[Mesa-dev] [RFC PATCH 4/8] nv50: configure the ring buffer for reading back PM counters

2015-06-22 Thread Samuel Pitoiset

To write data at the right offset, the kernel has to know some
parameters of this ring buffer, like the number of domains and the
maximum number of queries.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nv50/nv50_screen.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 3a99cc8..53817c0 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -441,6 +441,13 @@ nv50_screen_init_hwctx(struct nv50_screen *screen)
 
BEGIN_NV04(push, SUBC_SW(NV01_SUBCHAN_OBJECT), 1);
PUSH_DATA (push, screen->sw->handle);
+   BEGIN_NV04(push, SUBC_SW(0x0190), 1);
+   PUSH_DATA (push, screen->query->handle);
+   // XXX: Maybe add a check for DRM version here ?
+   BEGIN_NV04(push, SUBC_SW(0x0600), 1);
+   PUSH_DATA (push, NV50_HW_PM_RING_BUFFER_MAX_QUERIES);
+   BEGIN_NV04(push, SUBC_SW(0x0604), 1);
+   PUSH_DATA (push, NV50_HW_PM_RING_BUFFER_NUM_DOMAINS);
 
BEGIN_NV04(push, NV50_3D(COND_MODE), 1);
PUSH_DATA (push, NV50_3D_COND_MODE_ALWAYS);
-- 
2.4.4


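As a rough illustration of why the kernel needs both parameters from patch 4/8, here is a minimal sketch of how a slot offset could be derived from them. The layout (one 32-bit counter per domain, query slots stored back to back) and every name below are assumptions made for illustration only; the actual layout used by nouveau is not described in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring buffer geometry; the real values live in the driver. */
struct pm_ring_layout {
   uint32_t max_queries;  /* number of query slots in the ring */
   uint32_t num_domains;  /* counters stored per query slot */
};

/* Byte offset of the counter for (sequence, domain); the slot index wraps
 * around once max_queries slots have been used. */
static size_t
pm_ring_offset(const struct pm_ring_layout *l, uint32_t sequence,
               uint32_t domain)
{
   assert(l->max_queries > 0 && domain < l->num_domains);
   size_t slot = sequence % l->max_queries;
   return (slot * l->num_domains + domain) * sizeof(uint32_t);
}
```

Without both values the writer could not tell where one slot ends and the next begins, which is presumably why the patch pushes them to the software object.
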
[Mesa-dev] [RFC PATCH 1/8] nouveau: implement the nvif hardware performance counters interface

2015-06-22 Thread Samuel Pitoiset

This commit implements the base interface for hardware performance
counters that will be shared between nv50 and nvc0 drivers.

TODO: Bump the libdrm version required by mesa once nvif is merged.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/Makefile.sources  |   2 +
 src/gallium/drivers/nouveau/nouveau_perfmon.c | 302 ++
 src/gallium/drivers/nouveau/nouveau_perfmon.h |  59 +
 src/gallium/drivers/nouveau/nouveau_screen.c  |   5 +
 src/gallium/drivers/nouveau/nouveau_screen.h  |   1 +
 5 files changed, 369 insertions(+)
 create mode 100644 src/gallium/drivers/nouveau/nouveau_perfmon.c
 create mode 100644 src/gallium/drivers/nouveau/nouveau_perfmon.h

diff --git a/src/gallium/drivers/nouveau/Makefile.sources b/src/gallium/drivers/nouveau/Makefile.sources
index 3fae3bc..3da0bdc 100644
--- a/src/gallium/drivers/nouveau/Makefile.sources
+++ b/src/gallium/drivers/nouveau/Makefile.sources
@@ -10,6 +10,8 @@ C_SOURCES := \
nouveau_heap.h \
nouveau_mm.c \
nouveau_mm.h \
+   nouveau_perfmon.c \
+   nouveau_perfmon.h \
nouveau_screen.c \
nouveau_screen.h \
nouveau_statebuf.h \
diff --git a/src/gallium/drivers/nouveau/nouveau_perfmon.c b/src/gallium/drivers/nouveau/nouveau_perfmon.c
new file mode 100644
index 000..3798612
--- /dev/null
+++ b/src/gallium/drivers/nouveau/nouveau_perfmon.c
@@ -0,0 +1,302 @@
+/*
+ * Copyright 2015 Samuel Pitoiset
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include 
+
+#include "util/u_memory.h"
+
+#include "nouveau_debug.h"
+#include "nouveau_winsys.h"
+#include "nouveau_perfmon.h"
+
+static int
+nouveau_perfmon_query_sources(struct nouveau_perfmon *pm,
+  struct nouveau_perfmon_dom *dom,
+  struct nouveau_perfmon_sig *sig)
+{
+   struct nvif_perfmon_query_source_v0 args = {};
+
+   args.domain = dom->id;
+   args.signal = sig->signal;
+   do {
+      uint8_t prev_iter = args.iter;
+      struct nouveau_perfmon_src *src;
+      int ret;
+
+      ret = nouveau_object_mthd(pm->object, NVIF_PERFMON_V0_QUERY_SOURCE,
+                                &args, sizeof(args));
+      if (ret)
+         return ret;
+
+      if (prev_iter) {
+         args.iter = prev_iter;
+         ret = nouveau_object_mthd(pm->object, NVIF_PERFMON_V0_QUERY_SOURCE,
+                                   &args, sizeof(args));
+         if (ret)
+            return ret;
+
+         src = CALLOC_STRUCT(nouveau_perfmon_src);
+         if (!src)
+            return -ENOMEM;
+
+#if 0
+         debug_printf("id   = %d\n", args.source);
+         debug_printf("name = %s\n", args.name);
+         debug_printf("mask = %08x\n", args.mask);
+         debug_printf("\n");
+#endif
+
+         src->id = args.source;
+         strncpy(src->name, args.name, sizeof(src->name));
+         list_addtail(&src->head, &sig->sources);
+      }
+   } while (args.iter != 0xff);
+
+   return 0;
+}
+
+static int
+nouveau_perfmon_query_signals(struct nouveau_perfmon *pm,
+  struct nouveau_perfmon_dom *dom)
+{
+   struct nvif_perfmon_query_signal_v0 args = {};
+
+   args.domain = dom->id;
+   do {
+  uint16_t prev_iter = args.iter;
+  struct nouveau_perfmon_sig *sig;
+  int ret;
+
+  ret = nouveau_object_mthd(pm->object, NVIF_PERFMON_V0_QUERY_SIGNAL,
+&args, sizeof(args));
+  if (ret)
+ return ret;
+
+  if (prev_iter) {
+ args.iter = prev_iter;
+ ret = nouveau_object_mthd(pm->object, NVIF_PERFMON_V0_QUERY_SIGNAL,
+

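One detail in nouveau_perfmon_query_sources() may be worth a second look: strncpy() does not NUL-terminate the destination when the source fills the whole buffer. Whether that can actually happen depends on how the kernel fills args.name, which this thread does not say. A minimal sketch of a copy that always terminates follows; the helper name is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst (capacity dst_size bytes) and always NUL-terminate,
 * truncating src if it does not fit. */
static void
copy_name(char *dst, size_t dst_size, const char *src)
{
   assert(dst_size > 0);
   strncpy(dst, src, dst_size - 1);
   dst[dst_size - 1] = '\0';
}
```

Under that assumption, a call such as copy_name(src->name, sizeof(src->name), args.name) would take the place of the bare strncpy().
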
Re: [Mesa-dev] [Nouveau] [RFC PATCH 5/8] nv50: prevent NULL pointer dereference with pipe_query functions

2015-06-22 Thread Ilia Mirkin

If query_create fails, why would any of these functions get called?

On Mon, Jun 22, 2015 at 4:53 PM, Samuel Pitoiset
 wrote:
> This may happen when nv50_query_create() fails to create a new query.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_query.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_query.c
> index 55fcac8..1162110 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
> @@ -96,6 +96,9 @@ nv50_query_allocate(struct nv50_context *nv50, struct 
> nv50_query *q, int size)
>  static void
>  nv50_query_destroy(struct pipe_context *pipe, struct pipe_query *pq)
>  {
> +   if (!pq)
> +  return;
> +
> nv50_query_allocate(nv50_context(pipe), nv50_query(pq), 0);
> nouveau_fence_ref(NULL, &nv50_query(pq)->fence);
> FREE(nv50_query(pq));
> @@ -152,6 +155,9 @@ nv50_query_begin(struct pipe_context *pipe, struct 
> pipe_query *pq)
> struct nouveau_pushbuf *push = nv50->base.pushbuf;
> struct nv50_query *q = nv50_query(pq);
>
> +   if (!pq)
> +  return FALSE;
> +
> /* For occlusion queries we have to change the storage, because a previous
>  * query might set the initial render conition to FALSE even *after* we 
> re-
>  * initialized it to TRUE.
> @@ -218,6 +224,9 @@ nv50_query_end(struct pipe_context *pipe, struct 
> pipe_query *pq)
> struct nouveau_pushbuf *push = nv50->base.pushbuf;
> struct nv50_query *q = nv50_query(pq);
>
> +   if (!pq)
> +  return;
> +
> q->state = NV50_QUERY_STATE_ENDED;
>
> switch (q->type) {
> @@ -294,9 +303,12 @@ nv50_query_result(struct pipe_context *pipe, struct 
> pipe_query *pq,
> uint64_t *res64 = (uint64_t *)result;
> uint32_t *res32 = (uint32_t *)result;
> boolean *res8 = (boolean *)result;
> -   uint64_t *data64 = (uint64_t *)q->data;
> +   uint64_t *data64;
> int i;
>
> +   if (!pq)
> +  return FALSE;
> +
> if (q->state != NV50_QUERY_STATE_READY)
>nv50_query_update(q);
>
> @@ -314,6 +326,7 @@ nv50_query_result(struct pipe_context *pipe, struct 
> pipe_query *pq,
> }
> q->state = NV50_QUERY_STATE_READY;
>
> +   data64 = (uint64_t *)q->data;
> switch (q->type) {
> case PIPE_QUERY_GPU_FINISHED:
>res8[0] = TRUE;
> --
> 2.4.4
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/nouveau

Re: [Mesa-dev] [Nouveau] [RFC PATCH 5/8] nv50: prevent NULL pointer dereference with pipe_query functions

2015-06-22 Thread Samuel Pitoiset




On 06/22/2015 10:52 PM, Ilia Mirkin wrote:
> If query_create fails, why would any of these functions get called?

Because the HUD doesn't check whether query_create() fails, and it calls other
pipe_query functions with a NULL pointer instead of a valid query object.




On Mon, Jun 22, 2015 at 4:53 PM, Samuel Pitoiset
 wrote:

This may happen when nv50_query_create() fails to create a new query.

Signed-off-by: Samuel Pitoiset 
---
  src/gallium/drivers/nouveau/nv50/nv50_query.c | 15 ++-
  1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
b/src/gallium/drivers/nouveau/nv50/nv50_query.c
index 55fcac8..1162110 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
@@ -96,6 +96,9 @@ nv50_query_allocate(struct nv50_context *nv50, struct 
nv50_query *q, int size)
  static void
  nv50_query_destroy(struct pipe_context *pipe, struct pipe_query *pq)
  {
+   if (!pq)
+  return;
+
 nv50_query_allocate(nv50_context(pipe), nv50_query(pq), 0);
 nouveau_fence_ref(NULL, &nv50_query(pq)->fence);
 FREE(nv50_query(pq));
@@ -152,6 +155,9 @@ nv50_query_begin(struct pipe_context *pipe, struct 
pipe_query *pq)
 struct nouveau_pushbuf *push = nv50->base.pushbuf;
 struct nv50_query *q = nv50_query(pq);

+   if (!pq)
+  return FALSE;
+
 /* For occlusion queries we have to change the storage, because a previous
  * query might set the initial render conition to FALSE even *after* we re-
  * initialized it to TRUE.
@@ -218,6 +224,9 @@ nv50_query_end(struct pipe_context *pipe, struct pipe_query 
*pq)
 struct nouveau_pushbuf *push = nv50->base.pushbuf;
 struct nv50_query *q = nv50_query(pq);

+   if (!pq)
+  return;
+
 q->state = NV50_QUERY_STATE_ENDED;

 switch (q->type) {
@@ -294,9 +303,12 @@ nv50_query_result(struct pipe_context *pipe, struct 
pipe_query *pq,
 uint64_t *res64 = (uint64_t *)result;
 uint32_t *res32 = (uint32_t *)result;
 boolean *res8 = (boolean *)result;
-   uint64_t *data64 = (uint64_t *)q->data;
+   uint64_t *data64;
 int i;

+   if (!pq)
+  return FALSE;
+
 if (q->state != NV50_QUERY_STATE_READY)
nv50_query_update(q);

@@ -314,6 +326,7 @@ nv50_query_result(struct pipe_context *pipe, struct 
pipe_query *pq,
 }
 q->state = NV50_QUERY_STATE_READY;

+   data64 = (uint64_t *)q->data;
 switch (q->type) {
 case PIPE_QUERY_GPU_FINISHED:
res8[0] = TRUE;
--
2.4.4

___
Nouveau mailing list
nouv...@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau



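The exchange above comes down to where the NULL check lives: in the caller (the HUD) or in each driver entry point. Below is a minimal sketch of the callee-side guard that the patch adds. It uses a stand-in query type, not the real pipe_query or nv50_query structures, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver query object. */
struct query {
   int state;
};

enum { QUERY_IDLE = 0, QUERY_ACTIVE = 1 };

/* Callee-side guard: tolerate a NULL query (for example because creation
 * failed earlier) and report failure instead of dereferencing it. */
static bool
query_begin(struct query *q)
{
   if (!q)
      return false;

   q->state = QUERY_ACTIVE;
   return true;
}
```

The alternative raised in the question is to have the caller skip the query entirely when creation fails, which would keep the driver entry points free of such checks.
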
[Mesa-dev] [PATCH] i965: Don't count NIR instructions for shader-db.

2015-06-22 Thread Kenneth Graunke

Matt, Jason, and I haven't found this useful in a long time.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_nir.c |   31 ---
 1 file changed, 31 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
index c13708a..dffb8ab 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -57,28 +57,6 @@ nir_optimize(nir_shader *nir)
} while (progress);
 }
 
-static bool
-count_nir_instrs_in_block(nir_block *block, void *state)
-{
-   int *count = (int *) state;
-   nir_foreach_instr(block, instr) {
-  *count = *count + 1;
-   }
-   return true;
-}
-
-static int
-count_nir_instrs(nir_shader *nir)
-{
-   int count = 0;
-   nir_foreach_overload(nir, overload) {
-  if (!overload->impl)
- continue;
-  nir_foreach_block(overload->impl, count_nir_instrs_in_block, &count);
-   }
-   return count;
-}
-
 nir_shader *
 brw_create_nir(struct brw_context *brw,
const struct gl_shader_program *shader_prog,
@@ -178,15 +156,6 @@ brw_create_nir(struct brw_context *brw,
   nir_print_shader(nir, stderr);
}
 
-   static GLuint msg_id = 0;
-   _mesa_gl_debug(&brw->ctx, &msg_id,
-  MESA_DEBUG_SOURCE_SHADER_COMPILER,
-  MESA_DEBUG_TYPE_OTHER,
-  MESA_DEBUG_SEVERITY_NOTIFICATION,
-  "%s NIR shader: %d inst\n",
-  _mesa_shader_stage_to_abbrev(stage),
-  count_nir_instrs(nir));
-
nir_convert_from_ssa(nir);
nir_validate_shader(nir);
 
-- 
1.7.10.4


Re: [Mesa-dev] [PATCH] i965: Don't count NIR instructions for shader-db.

2015-06-22 Thread Matt Turner

Reviewed-by: Matt Turner 

Re: [Mesa-dev] [PATCH v2 26/82] glsl: Don't do copy propagation on buffer variables

2015-06-22 Thread Jordan Justen

24-26 once again make me wonder if these optimizations *can* be used
with SSBOs based on the same ext spec wording I referenced before:

"The ability to write to buffer objects creates the potential for
 multiple independent shader invocations to read and write the same
 underlying memory. The same issue exists with the
 ARB_shader_image_load_store extension provided in OpenGL 4.2, which
 can write to texture objects and buffers. In both cases, the
 specification makes few guarantees related to the relative order of
 memory reads and writes performed by the shader invocations."

In these patches "other threads" were specifically mentioned.

Did these patches also prevent bad things from happening in generated
code? (As mentioned for patch 23.)

-Jordan

On 2015-06-03 00:01:16, Iago Toral Quiroga wrote:
> Since the backing storage for these is shared we cannot ensure that the
> value won't change by writes from other threads.
> ---
>  src/glsl/opt_copy_propagation.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/glsl/opt_copy_propagation.cpp 
> b/src/glsl/opt_copy_propagation.cpp
> index 806027b..f206995 100644
> --- a/src/glsl/opt_copy_propagation.cpp
> +++ b/src/glsl/opt_copy_propagation.cpp
> @@ -330,7 +330,7 @@ ir_copy_propagation_visitor::add_copy(ir_assignment *ir)
>   */
>  ir->condition = new(ralloc_parent(ir)) ir_constant(false);
>  this->progress = true;
> -  } else {
> +  } else if (lhs_var->data.mode != ir_var_shader_storage) {
>  entry = new(this->acp) acp_entry(lhs_var, rhs_var);
>  this->acp->push_tail(entry);
>}
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

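To make the concern in the quoted commit message concrete, here is a small single-threaded simulation of what copy propagation does to a read of shared storage. The "other invocation" is simulated by a plain function call and all names are hypothetical; as the spec wording quoted above suggests, either result may well be permissible, so this only shows that the two programs can differ.

```c
#include <stdint.h>

/* Stand-in for a buffer variable whose backing storage is shared. */
static uint32_t shared_value;

/* Simulates a write performed by another shader invocation. */
static void
other_invocation_writes(uint32_t v)
{
   shared_value = v;
}

/* Original program: store x to the buffer, then read the buffer back. */
static uint32_t
without_copy_propagation(uint32_t x)
{
   shared_value = x;
   other_invocation_writes(7);   /* concurrent write lands here */
   return shared_value;          /* re-read observes the other write */
}

/* After copy propagation: the read of the buffer was replaced by x. */
static uint32_t
with_copy_propagation(uint32_t x)
{
   shared_value = x;
   other_invocation_writes(7);
   return x;                     /* never sees the other write */
}
```
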
Re: [Mesa-dev] [PATCH 2/2] draw/gallivm: add invocation ID support for llvmpipe.

2015-06-22 Thread Roland Scheidegger

For the series:
Reviewed-by: Roland Scheidegger 


Am 22.06.2015 um 06:01 schrieb Dave Airlie:
> From: Dave Airlie 
> 
> This extends the draw code to add support for invocations.
> 
> Signed-off-by: Dave Airlie 
> ---
>  src/gallium/auxiliary/draw/draw_gs.c| 3 ++-
>  src/gallium/auxiliary/draw/draw_llvm.c  | 5 -
>  src/gallium/auxiliary/draw/draw_llvm.h  | 3 ++-
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi.h | 1 +
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 5 +
>  5 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_gs.c 
> b/src/gallium/auxiliary/draw/draw_gs.c
> index 755e527..a1564f9 100644
> --- a/src/gallium/auxiliary/draw/draw_gs.c
> +++ b/src/gallium/auxiliary/draw/draw_gs.c
> @@ -391,7 +391,8 @@ llvm_gs_run(struct draw_geometry_shader *shader,
>(struct vertex_header*)input,
>input_primitives,
>shader->draw->instance_id,
> -  shader->llvm_prim_ids);
> +  shader->llvm_prim_ids,
> +  shader->invocation_id);
>  
> return ret;
>  }
> diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
> b/src/gallium/auxiliary/draw/draw_llvm.c
> index 9629a8a..90a31bc 100644
> --- a/src/gallium/auxiliary/draw/draw_llvm.c
> +++ b/src/gallium/auxiliary/draw/draw_llvm.c
> @@ -2069,7 +2069,7 @@ draw_gs_llvm_generate(struct draw_llvm *llvm,
> struct gallivm_state *gallivm = variant->gallivm;
> LLVMContextRef context = gallivm->context;
> LLVMTypeRef int32_type = LLVMInt32TypeInContext(context);
> -   LLVMTypeRef arg_types[6];
> +   LLVMTypeRef arg_types[7];
> LLVMTypeRef func_type;
> LLVMValueRef variant_func;
> LLVMValueRef context_ptr;
> @@ -2105,6 +2105,7 @@ draw_gs_llvm_generate(struct draw_llvm *llvm,
> arg_types[4] = int32_type;  /* instance_id */
> arg_types[5] = LLVMPointerType(
>LLVMVectorType(int32_type, vector_length), 0);   /* prim_id_ptr */
> +   arg_types[6] = int32_type;
>  
> func_type = LLVMFunctionType(int32_type, arg_types, Elements(arg_types), 
> 0);
>  
> @@ -2125,6 +2126,7 @@ draw_gs_llvm_generate(struct draw_llvm *llvm,
> num_prims = LLVMGetParam(variant_func, 3);
> system_values.instance_id = LLVMGetParam(variant_func, 4);
> prim_id_ptr   = LLVMGetParam(variant_func, 5);
> +   system_values.invocation_id = LLVMGetParam(variant_func, 6);
>  
> lp_build_name(context_ptr, "context");
> lp_build_name(input_array, "input");
> @@ -2132,6 +2134,7 @@ draw_gs_llvm_generate(struct draw_llvm *llvm,
> lp_build_name(num_prims, "num_prims");
> lp_build_name(system_values.instance_id, "instance_id");
> lp_build_name(prim_id_ptr, "prim_id_ptr");
> +   lp_build_name(system_values.invocation_id, "invocation_id");
>  
> variant->context_ptr = context_ptr;
> variant->io_ptr = io_ptr;
> diff --git a/src/gallium/auxiliary/draw/draw_llvm.h 
> b/src/gallium/auxiliary/draw/draw_llvm.h
> index 9565fc6..d48ed72 100644
> --- a/src/gallium/auxiliary/draw/draw_llvm.h
> +++ b/src/gallium/auxiliary/draw/draw_llvm.h
> @@ -298,7 +298,8 @@ typedef int
>  struct vertex_header *output,
>  unsigned num_prims,
>  unsigned instance_id,
> -int *prim_ids);
> +int *prim_ids,
> +unsigned invocation_id);
>  
>  struct draw_llvm_variant_key
>  {
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h 
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
> index 3f76b79..967373c 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
> @@ -165,6 +165,7 @@ struct lp_bld_tgsi_system_values {
> LLVMValueRef vertex_id_nobase;
> LLVMValueRef prim_id;
> LLVMValueRef basevertex;
> +   LLVMValueRef invocation_id;
>  };
>  
>  
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> index 092bd18..268379e 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> @@ -1532,6 +1532,11 @@ emit_fetch_system_value(
>atype = TGSI_TYPE_UNSIGNED;
>break;
>  
> +   case TGSI_SEMANTIC_INVOCATIONID:
> +  res = lp_build_broadcast_scalar(&bld_base->uint_bld, 
> bld->system_values.invocation_id);
> +  atype = TGSI_TYPE_UNSIGNED;
> +  break;
> +
> default:
>assert(!"unexpected semantic in emit_fetch_system_value");
>res = bld_base->base.zero;
> 


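For readers unfamiliar with the SoA code path: the new TGSI_SEMANTIC_INVOCATIONID case takes a single scalar and broadcasts it to every lane of the vector. A plain C sketch of that idea follows; the vector width and function name are assumptions for illustration, and the real code emits LLVM IR through lp_build_broadcast_scalar() rather than looping at run time.

```c
#include <stdint.h>

#define VECTOR_LENGTH 4  /* hypothetical SoA width */

/* Fill every lane with the same scalar, as a broadcast would. */
static void
broadcast_u32(uint32_t out[VECTOR_LENGTH], uint32_t scalar)
{
   for (int i = 0; i < VECTOR_LENGTH; i++)
      out[i] = scalar;
}
```
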
Re: [Mesa-dev] [PATCH v2 27/82] mesa: Add new IR node ir_ssbo_store

2015-06-22 Thread Jordan Justen

Reviewed-by: Jordan Justen 

On 2015-06-03 00:01:17, Iago Toral Quiroga wrote:
> Shader storage buffer objects (SSBO) require special handling: when we
> detect writes to any channel of a shader buffer variable we need to
> emit the corresponding write to memory. We will later add a lowering pass
> that detects these writes and injects ir_ssbo_store nodes in the IR so
> drivers can generate code for the memory writes.
> ---
>  src/glsl/ir.h  | 38 
> ++
>  src/glsl/ir_hierarchical_visitor.cpp   | 18 
>  src/glsl/ir_hierarchical_visitor.h |  2 ++
>  src/glsl/ir_hv_accept.cpp  | 23 
>  src/glsl/ir_print_visitor.cpp  | 12 
>  src/glsl/ir_print_visitor.h|  1 +
>  src/glsl/ir_rvalue_visitor.cpp | 21 ++
>  src/glsl/ir_rvalue_visitor.h   |  3 ++
>  src/glsl/ir_visitor.h  |  2 ++
>  src/glsl/nir/glsl_to_nir.cpp   |  7 +
>  src/mesa/drivers/dri/i965/brw_vec4.h   |  1 +
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  6 
>  src/mesa/program/ir_to_mesa.cpp|  7 +
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  7 +
>  14 files changed, 148 insertions(+)
> 
> diff --git a/src/glsl/ir.h b/src/glsl/ir.h
> index 1118732..2a0b28c 100644
> --- a/src/glsl/ir.h
> +++ b/src/glsl/ir.h
> @@ -78,6 +78,7 @@ enum ir_node_type {
> ir_type_discard,
> ir_type_emit_vertex,
> ir_type_end_primitive,
> +   ir_type_ssbo_store,
> ir_type_max, /**< maximum ir_type enum number, for validation */
> ir_type_unset = ir_type_max
>  };
> @@ -2407,6 +2408,43 @@ public:
> ir_rvalue *stream;
>  };
>  
> +/**
> + * IR instruction to write to a shader storage buffer object (SSBO)
> + */
> +class ir_ssbo_store : public ir_instruction {
> +public:
> +   ir_ssbo_store(ir_rvalue *block, ir_rvalue *offset, ir_rvalue *val,
> + unsigned write_mask)
> +  : ir_instruction(ir_type_ssbo_store),
> +block(block), offset(offset), val(val), write_mask(write_mask)
> +   {
> +  assert(block);
> +  assert(offset);
> +  assert(val);
> +  assert(write_mask != 0);
> +   }
> +
> +   virtual void accept(ir_visitor *v)
> +   {
> +  v->visit(this);
> +   }
> +
> +   virtual ir_ssbo_store *clone(void *mem_ctx, struct hash_table *ht) const
> +   {
> +  return new(mem_ctx) ir_ssbo_store(this->block->clone(mem_ctx, ht),
> +this->offset->clone(mem_ctx, ht),
> +this->val->clone(mem_ctx, ht),
> +this->write_mask);
> +   }
> +
> +   virtual ir_visitor_status accept(ir_hierarchical_visitor *);
> +
> +   ir_rvalue *block;
> +   ir_rvalue *offset;
> +   ir_rvalue *val;
> +   unsigned write_mask;
> +};
> +
>  /*@}*/
>  
>  /**
> diff --git a/src/glsl/ir_hierarchical_visitor.cpp 
> b/src/glsl/ir_hierarchical_visitor.cpp
> index adb6294..1aa5cc0 100644
> --- a/src/glsl/ir_hierarchical_visitor.cpp
> +++ b/src/glsl/ir_hierarchical_visitor.cpp
> @@ -349,6 +349,24 @@ ir_hierarchical_visitor::visit_leave(ir_end_primitive 
> *ir)
> return visit_continue;
>  }
>  
> +ir_visitor_status
> +ir_hierarchical_visitor::visit_enter(ir_ssbo_store *ir)
> +{
> +   if (this->callback_enter != NULL)
> +  this->callback_enter(ir, this->data_enter);
> +
> +   return visit_continue;
> +}
> +
> +ir_visitor_status
> +ir_hierarchical_visitor::visit_leave(ir_ssbo_store *ir)
> +{
> +   if (this->callback_leave != NULL)
> +  this->callback_leave(ir, this->data_leave);
> +
> +   return visit_continue;
> +}
> +
>  void
>  ir_hierarchical_visitor::run(exec_list *instructions)
>  {
> diff --git a/src/glsl/ir_hierarchical_visitor.h 
> b/src/glsl/ir_hierarchical_visitor.h
> index faa52fd..49dc37e 100644
> --- a/src/glsl/ir_hierarchical_visitor.h
> +++ b/src/glsl/ir_hierarchical_visitor.h
> @@ -139,6 +139,8 @@ public:
> virtual ir_visitor_status visit_leave(class ir_emit_vertex *);
> virtual ir_visitor_status visit_enter(class ir_end_primitive *);
> virtual ir_visitor_status visit_leave(class ir_end_primitive *);
> +   virtual ir_visitor_status visit_enter(class ir_ssbo_store *);
> +   virtual ir_visitor_status visit_leave(class ir_ssbo_store *);
> /*@}*/
>  
>  
> diff --git a/src/glsl/ir_hv_accept.cpp b/src/glsl/ir_hv_accept.cpp
> index be5b3ea..500ce4b 100644
> --- a/src/glsl/ir_hv_accept.cpp
> +++ b/src/glsl/ir_hv_accept.cpp
> @@ -429,3 +429,26 @@ ir_end_primitive::accept(ir_hierarchical_visitor *v)
>  
> return (s == visit_stop) ? s : v->visit_leave(this);
>  }
> +
> +
> +ir_visitor_status
> +ir_ssbo_store::accept(ir_hierarchical_visitor *v)
> +{
> +   ir_visitor_status s = v->visit_enter(this);
> +   if (s != visit_continue)
> +  return (s == visit_continue_with_parent) ? visit_continue : s;
> +
>

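The write_mask carried by ir_ssbo_store can be pictured as a per-channel store. The sketch below assumes that bit i of the mask selects channel i and that offsets are counted in 32-bit words; both are assumptions made for illustration, since the lowering pass and driver code that give the node its actual semantics are not part of this message.

```c
#include <assert.h>
#include <stdint.h>

/* Write the channels of val selected by write_mask (bit i = channel i)
 * into buffer, starting at offset_dwords. Unselected channels are left
 * untouched. */
static void
masked_store(uint32_t *buffer, unsigned offset_dwords,
             const uint32_t val[4], unsigned write_mask)
{
   assert(write_mask != 0 && (write_mask & ~0xfu) == 0);

   for (unsigned i = 0; i < 4; i++) {
      if (write_mask & (1u << i))
         buffer[offset_dwords + i] = val[i];
   }
}
```
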
Re: [Mesa-dev] [PATCH 2/5] i965/gen9: Plugin the code for selecting YF/YS tiling on skl+

2015-06-22 Thread Ben Widawsky

On Wed, Jun 10, 2015 at 03:30:47PM -0700, Anuj Phogat wrote:
> Buffers with Yf/Ys tiling end up using meta upload / download
> paths or the blitter for cases where they used tiled_memcpy paths
> in case of Y tiling. This has exposed some bugs in meta path. To
> avoid any piglit regressions on SKL this patch keeps the Yf/Ys
> tiling disabled at the moment.
> 
> V3: Make brw_miptree_choose_tr_mode() actually choose TRMODE. (Ben)
> Few cosmetic changes.
> V4: Get rid of brw_miptree_choose_tr_mode().
> Take care of all tile resource modes {Yf, Ys, none} for all
> generations at one place.
> 
> Signed-off-by: Anuj Phogat 
> Cc: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/brw_tex_layout.c | 97 
> --
>  1 file changed, 79 insertions(+), 18 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
> b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> index b9ac4cf..c0ef5cc 100644
> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> @@ -807,27 +807,88 @@ brw_miptree_layout(struct brw_context *brw,
> enum intel_miptree_tiling_mode requested,
> struct intel_mipmap_tree *mt)
>  {
> -   mt->tr_mode = INTEL_MIPTREE_TRMODE_NONE;
> +   const unsigned bpp = mt->cpp * 8;
> +   const bool is_tr_mode_yf_ys_allowed =
> +  brw->gen >= 9 &&
> +  !for_bo &&
> +  !mt->compressed &&
> +  /* Enable YF/YS tiling only for color surfaces because depth and
> +   * stencil surfaces are not supported in blitter using fast copy
> +   * blit and meta PBO upload, download paths. No other paths
> +   * currently support Yf/Ys tiled surfaces.
> +   * FIXME:  Remove this restriction once we have a tiled_memcpy()
> +   * path to do depth/stencil data upload/download to Yf/Ys tiled
> +   * surfaces.
> +   */

I think it's more readable to move this comment above the variable declaration.
Up to you though. Also I think "FINISHME" is the more appropriate classification
for this type of thing.

> +  _mesa_is_format_color_format(mt->format) &&
> +  (requested == INTEL_MIPTREE_TILING_Y ||
> +   requested == INTEL_MIPTREE_TILING_ANY) &&

This is where my tiling flags would have helped a bit since you should be able
to do flags & Y_TILED :P

> +  (bpp && is_power_of_two(bpp)) &&
> +  /* FIXME: To avoid piglit regressions keep the Yf/Ys tiling
> +   * disabled at the moment.
> +   */
> +  false;

Also, "FINISHME"

>  
> -   intel_miptree_set_alignment(brw, mt);
> -   intel_miptree_set_total_width_height(brw, mt);
> +   /* Lower index (Yf) is the higher priority mode */
> +   const uint32_t tr_mode[3] = {INTEL_MIPTREE_TRMODE_YF,
> +INTEL_MIPTREE_TRMODE_YS,
> +INTEL_MIPTREE_TRMODE_NONE};
> +   int i = is_tr_mode_yf_ys_allowed ? 0 : ARRAY_SIZE(tr_mode) - 1;
>  
> -   if (!mt->total_width || !mt->total_height) {
> -  intel_miptree_release(&mt);
> -  return;
> -   }
> +   while (i < ARRAY_SIZE(tr_mode)) {
> +  if (brw->gen < 9)
> + assert(tr_mode[i] == INTEL_MIPTREE_TRMODE_NONE);
> +  else
> + assert(tr_mode[i] == INTEL_MIPTREE_TRMODE_YF ||
> +tr_mode[i] == INTEL_MIPTREE_TRMODE_YS ||
> +tr_mode[i] == INTEL_MIPTREE_TRMODE_NONE);
>  
> -   /* On Gen9+ the alignment values are expressed in multiples of the block
> -* size
> -*/
> -   if (brw->gen >= 9) {
> -  unsigned int i, j;
> -  _mesa_get_format_block_size(mt->format, &i, &j);
> -  mt->align_w /= i;
> -  mt->align_h /= j;
> -   }
> +  mt->tr_mode = tr_mode[i];
> +  intel_miptree_set_alignment(brw, mt);
> +  intel_miptree_set_total_width_height(brw, mt);
>  
> -   if (!for_bo)
> -  mt->tiling = brw_miptree_choose_tiling(brw, requested, mt);
> +  if (!mt->total_width || !mt->total_height) {
> + intel_miptree_release(&mt);
> + return;
> +  }
> +
> +  /* On Gen9+ the alignment values are expressed in multiples of the
> +   * block size.
> +   */
> +  if (brw->gen >= 9) {
> + unsigned int i, j;
> + _mesa_get_format_block_size(mt->format, &i, &j);
> + mt->align_w /= i;
> + mt->align_h /= j;
> +  }

Can we just combine this alignment calculation into
intel_miptree_set_alignment()?

> +
> +  if (!for_bo)
> + mt->tiling = brw_miptree_choose_tiling(brw, requested, mt);

Perhaps (fwiw, I prefer break instead of returning within a loop, but that's
just me)?
/* If there is already a BO, we cannot affect tiling modes */
if (for_bo)
   break;

mt->tiling = brw_miptree_choose_tiling(brw, requested, mt);
if (is_tr_mode_yf_ys_allowed) {
...
}

This sort of reflects how I felt earlier about pushing the YF/YS decision into
choose tiling. The code is heading in that direction though, so I am content.


> +
> +  if (is_tr_mode_yf_ys_allowed

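The loop under review tries tile resource modes in priority order and falls back to the next one. A minimal sketch of that shape, using break rather than an early return as suggested above, is shown here. The acceptance test is reduced to a bitmask, and the enum and function names are hypothetical; the real criteria are in the part of the patch not shown in this message.

```c
#include <stdbool.h>
#include <stddef.h>

enum tr_mode { TRMODE_YF, TRMODE_YS, TRMODE_NONE };

/* Pick the highest-priority mode whose bit is set in ok_mask.
 * TRMODE_NONE is always acceptable, so the loop always terminates
 * with a valid choice. */
static enum tr_mode
choose_tr_mode(bool yf_ys_allowed, unsigned ok_mask)
{
   /* Lower index means higher priority. */
   static const enum tr_mode modes[] = { TRMODE_YF, TRMODE_YS, TRMODE_NONE };
   const size_t count = sizeof(modes) / sizeof(modes[0]);
   enum tr_mode chosen = TRMODE_NONE;

   /* When Yf/Ys is not allowed, start directly at the last entry. */
   for (size_t i = yf_ys_allowed ? 0 : count - 1; i < count; i++) {
      chosen = modes[i];
      if (chosen == TRMODE_NONE || (ok_mask & (1u << chosen)))
         break;
   }
   return chosen;
}
```
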
[Mesa-dev] [PATCH 3/3] i965: Initialize backend_shader::mem_ctx in its constructor.

2015-06-22 Thread Matt Turner

We were initializing it in each subclass's constructor for some
reason.
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 4 +---
 src/mesa/drivers/dri/i965/brw_shader.cpp   | 2 ++
 src/mesa/drivers/dri/i965/brw_shader.h | 1 +
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 3 +--
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 4770838..dc992dd 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1984,13 +1984,11 @@ fs_visitor::fs_visitor(struct brw_context *brw,
struct gl_shader_program *shader_prog,
struct gl_program *prog,
unsigned dispatch_width)
-   : backend_shader(brw, shader_prog, prog, prog_data, stage),
+   : backend_shader(brw, mem_ctx, shader_prog, prog, prog_data, stage),
  key(key), prog_data(prog_data),
  dispatch_width(dispatch_width), promoted_constants(0),
  bld(fs_builder(this, dispatch_width).at_end())
 {
-   this->mem_ctx = mem_ctx;
-
switch (stage) {
case MESA_SHADER_FRAGMENT:
   key_tex = &((const brw_wm_prog_key *) key)->tex;
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 545ec26..7a26939 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -757,6 +757,7 @@ brw_abs_immediate(enum brw_reg_type type, struct brw_reg *reg)
 }
 
 backend_shader::backend_shader(struct brw_context *brw,
+   void *mem_ctx,
struct gl_shader_program *shader_prog,
struct gl_program *prog,
struct brw_stage_prog_data *stage_prog_data,
@@ -769,6 +770,7 @@ backend_shader::backend_shader(struct brw_context *brw,
  shader_prog(shader_prog),
  prog(prog),
  stage_prog_data(stage_prog_data),
+ mem_ctx(mem_ctx),
  cfg(NULL),
  stage(stage)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h b/src/mesa/drivers/dri/i965/brw_shader.h
index da01d2f..e647749 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -215,6 +215,7 @@ class backend_shader {
 protected:
 
backend_shader(struct brw_context *brw,
+  void *mem_ctx,
   struct gl_shader_program *shader_prog,
   struct gl_program *prog,
   struct brw_stage_prog_data *stage_prog_data,
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 0a76bde..669f769 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -3691,7 +3691,7 @@ vec4_visitor::vec4_visitor(struct brw_context *brw,
shader_time_shader_type st_base,
shader_time_shader_type st_written,
shader_time_shader_type st_reset)
-   : backend_shader(brw, shader_prog, prog, &prog_data->base, stage),
+   : backend_shader(brw, mem_ctx, shader_prog, prog, &prog_data->base, stage),
  c(c),
  key(key),
  prog_data(prog_data),
@@ -3704,7 +3704,6 @@ vec4_visitor::vec4_visitor(struct brw_context *brw,
  st_written(st_written),
  st_reset(st_reset)
 {
-   this->mem_ctx = mem_ctx;
this->failed = false;
 
this->base_ir = NULL;
-- 
2.3.6


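The same idea can be expressed in plain C: let the base initializer own the shared field so that derived initializers cannot forget it. This is only a C analogue of the C++ constructor change, with hypothetical type and function names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical base and derived "classes". */
struct backend {
   void *mem_ctx;
};

struct fs_backend {
   struct backend base;
   unsigned dispatch_width;
};

/* The base initializer sets mem_ctx once, for every derived type. */
static void
backend_init(struct backend *b, void *mem_ctx)
{
   assert(b != NULL);
   b->mem_ctx = mem_ctx;
}

/* The derived initializer forwards mem_ctx instead of assigning it. */
static void
fs_backend_init(struct fs_backend *fs, void *mem_ctx, unsigned dispatch_width)
{
   backend_init(&fs->base, mem_ctx);
   fs->dispatch_width = dispatch_width;
}
```
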
[Mesa-dev] [PATCH 1/3] i965/cfg: Assert that cur_do/while/if pointers are non-NULL.

2015-06-22 Thread Matt Turner

Coverity sees that the functions immediately below the new assertions
dereference these pointers, but is unaware that an ENDIF always follows
an IF, etc.
---
 src/mesa/drivers/dri/i965/brw_cfg.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp b/src/mesa/drivers/dri/i965/brw_cfg.cpp
index 39c419b..f1f230e 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
@@ -231,6 +231,7 @@ cfg_t::cfg_t(exec_list *instructions)
  if (cur_else) {
 cur_else->add_successor(mem_ctx, cur_endif);
  } else {
+assert(cur_if != NULL);
 cur_if->add_successor(mem_ctx, cur_endif);
  }
 
@@ -299,6 +300,7 @@ cfg_t::cfg_t(exec_list *instructions)
  inst->exec_node::remove();
  cur->instructions.push_tail(inst);
 
+ assert(cur_do != NULL && cur_while != NULL);
 cur->add_successor(mem_ctx, cur_do);
 set_next_block(&cur, cur_while, ip);
 
-- 
2.3.6
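
For readers following the thread, here is a minimal standalone sketch of
the pattern this patch applies: stating an invariant with assert() just
before a dereference that a static analyzer cannot prove safe. The
struct and function names below are hypothetical stand-ins, not Mesa's
cfg_t/bblock_t code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a CFG block; not Mesa's bblock_t. */
struct block {
   int num_successors;
};

/* Records an edge; only the source block's counter is tracked here. */
static void add_successor(struct block *from, struct block *to)
{
   (void)to;
   from->num_successors++;
}

/* Links whichever block precedes an ENDIF to the ENDIF block. */
static void link_endif(struct block *cur_if, struct block *cur_else,
                       struct block *cur_endif)
{
   if (cur_else) {
      add_successor(cur_else, cur_endif);
   } else {
      /* An ENDIF always follows an IF, so cur_if is set by now. The
       * assert documents that invariant for readers and analyzers.
       */
      assert(cur_if != NULL);
      add_successor(cur_if, cur_endif);
   }
}
```

Keep in mind that assert() compiles to nothing when NDEBUG is defined,
so this documents the invariant rather than enforcing it in release
builds.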


[Mesa-dev] [PATCH 2/3] i965: Assert that the GL primitive isn't out of range.

2015-06-22 Thread Matt Turner

Coverity sees the if (mode >= BRW_PRIM_OFFSET (128)) test and assumes
that the else-branch might execute for mode up to 127, which would be
out of bounds.
---
 src/mesa/drivers/dri/i965/brw_draw.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index a7164db..b91597a 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -92,8 +92,10 @@ get_hw_prim_for_gl_prim(int mode)
 {
if (mode >= BRW_PRIM_OFFSET)
   return mode - BRW_PRIM_OFFSET;
-   else
+   else {
+  assert(mode < ARRAY_SIZE(prim_to_hw_prim));
   return prim_to_hw_prim[mode];
+   }
 }
 
 
-- 
2.3.6
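
A small standalone sketch of the bounds-asserted table lookup, for
anyone who wants to try the pattern outside the driver. The offset of
128 comes from the commit message above; the table contents and all
names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* 128 mirrors the value quoted in the commit message. */
#define PRIM_OFFSET 128

/* Hypothetical table; the real prim_to_hw_prim[] lives in brw_draw.c. */
static const unsigned prim_table[] = { 10, 11, 12, 13 };

/* Values at or above the offset are already hardware primitives;
 * everything else must index into the table.
 */
static unsigned lookup_prim(int mode)
{
   if (mode >= PRIM_OFFSET)
      return (unsigned)(mode - PRIM_OFFSET);

   /* Tell the reader (and the analyzer) that the index is in range. */
   assert(mode >= 0 && (size_t)mode < ARRAY_SIZE(prim_table));
   return prim_table[mode];
}
```

The explicit mode >= 0 check is an addition in this sketch; the patch
relies on the unsigned comparison against ARRAY_SIZE() instead.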


Re: [Mesa-dev] [PATCH 3/5] i965: Make a helper function intel_miptree_release_levels()

2015-06-22 Thread Ben Widawsky

I am shocked this is the only place we do this...

On Wed, Jun 10, 2015 at 03:30:48PM -0700, Anuj Phogat wrote:
> Signed-off-by: Anuj Phogat 
> Cc: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/brw_tex_layout.c | 17 -
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> index c0ef5cc..c185e41 100644
> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> @@ -801,6 +801,17 @@ intel_miptree_set_alignment(struct brw_context *brw,
> }
>  }
>  
> +static void
> +intel_miptree_release_levels(struct intel_mipmap_tree *mt)
> +{
> +   unsigned int level = 0;
> +
> +   for (level = mt->first_level; level <= mt->last_level; level++) {
> +  free(mt->level[level].slice);
> +  mt->level[level].slice = NULL;
> +   }
> +}
> +
>  void
>  brw_miptree_layout(struct brw_context *brw,
> bool for_bo,
> @@ -866,7 +877,6 @@ brw_miptree_layout(struct brw_context *brw,
>   mt->tiling = brw_miptree_choose_tiling(brw, requested, mt);
>  
>if (is_tr_mode_yf_ys_allowed) {
> - unsigned int level = 0;
>   assert(brw->gen >= 9);
>  
>   if (mt->tiling == I915_TILING_Y ||
> @@ -883,10 +893,7 @@ brw_miptree_layout(struct brw_context *brw,
>   /* Failed to use selected tr_mode. Free up the memory allocated
>* for miptree levels in intel_miptree_total_width_height().
>*/
> - for (level = mt->first_level; level <= mt->last_level; level++) {
> -free(mt->level[level].slice);
> -mt->level[level].slice = NULL;
> - }
> + intel_miptree_release_levels(mt);
>}
>i++;
> }
> -- 
> 1.9.3
> 
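
A standalone sketch of the helper's shape, using simplified hypothetical
types rather than struct intel_mipmap_tree. It frees each per-level
allocation and clears the pointer so a second call is harmless.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_LEVELS 4   /* hypothetical bound, for this sketch only */

/* Simplified stand-ins for the miptree and its per-level data. */
struct mip_level {
   void *slice;
};

struct mip_tree {
   unsigned first_level;
   unsigned last_level;
   struct mip_level level[MAX_LEVELS];
};

/* Frees every level's slice array in [first_level, last_level]. */
static void mip_tree_release_levels(struct mip_tree *mt)
{
   for (unsigned level = mt->first_level; level <= mt->last_level; level++) {
      free(mt->level[level].slice);
      /* Clearing the pointer makes a repeated call a no-op, since
       * free(NULL) does nothing.
       */
      mt->level[level].slice = NULL;
   }
}
```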

Re: [Mesa-dev] [PATCH 4/5] i965: Make a helper function intel_miptree_can_use_tr_mode()

2015-06-22 Thread Ben Widawsky

1-4 (with/without changes) are:
Reviewed-by: Ben Widawsky 

On Wed, Jun 10, 2015 at 03:30:49PM -0700, Anuj Phogat wrote:
> Signed-off-by: Anuj Phogat 
> Cc: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/brw_tex_layout.c | 30 
> +++---
>  1 file changed, 19 insertions(+), 11 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> index c185e41..39c6a39 100644
> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> @@ -812,6 +812,23 @@ intel_miptree_release_levels(struct intel_mipmap_tree *mt)
> }
>  }
>  
> +static bool
> +intel_miptree_can_use_tr_mode(const struct intel_mipmap_tree *mt)
> +{
> +   if (mt->tiling == I915_TILING_Y ||
> +   mt->tiling == (I915_TILING_Y | I915_TILING_X) ||
> +   mt->tr_mode == INTEL_MIPTREE_TRMODE_NONE) {
> +  /* FIXME: Don't allow YS tiling at the moment. Using 64KB tiling
> +   * for small textures might result in to memory wastage. Revisit
> +   * this condition when we have more information about the specific
> +   * cases where using YS over YF will be useful.
> +   */
> +  if (mt->tr_mode != INTEL_MIPTREE_TRMODE_YS)
> + return true;
> +   }
> +   return false;
> +}
> +
>  void
>  brw_miptree_layout(struct brw_context *brw,
> bool for_bo,
> @@ -879,17 +896,8 @@ brw_miptree_layout(struct brw_context *brw,
>if (is_tr_mode_yf_ys_allowed) {
>   assert(brw->gen >= 9);
>  
> - if (mt->tiling == I915_TILING_Y ||
> - mt->tiling == (I915_TILING_Y | I915_TILING_X) ||
> - mt->tr_mode == INTEL_MIPTREE_TRMODE_NONE) {
> -/* FIXME: Don't allow YS tiling at the moment. Using 64KB tiling
> - * for small textures might result in to memory wastage. Revisit
> - * this condition when we have more information about the specific
> - * cases where using YS over YF will be useful.
> - */
> -if (mt->tr_mode != INTEL_MIPTREE_TRMODE_YS)
> -   return;
> - }
> + if (intel_miptree_can_use_tr_mode(mt))
> +return;
>   /* Failed to use selected tr_mode. Free up the memory allocated
>* for miptree levels in intel_miptree_total_width_height().
>*/
> -- 
> 1.9.3
> 
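
The nested condition in the new helper can be read as one boolean
expression. Below is a standalone sketch under that reading; the enums
and struct are hypothetical stand-ins, not the i915/Mesa definitions,
and the numeric values are only chosen so that X | Y is distinct.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the tiling and tr_mode enums. */
enum tiling { TILING_NONE = 0, TILING_X = 1, TILING_Y = 2 };
enum tr_mode { TRMODE_NONE, TRMODE_YF, TRMODE_YS };

struct tr_tree {
   unsigned tiling;
   enum tr_mode tr_mode;
};

/* Same decision as the quoted helper: the layout is acceptable when the
 * tiling/tr_mode combination is a candidate and YS was not selected.
 */
static bool can_use_tr_mode(const struct tr_tree *mt)
{
   const bool candidate = mt->tiling == TILING_Y ||
                          mt->tiling == (TILING_Y | TILING_X) ||
                          mt->tr_mode == TRMODE_NONE;

   return candidate && mt->tr_mode != TRMODE_YS;
}
```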

[Mesa-dev] [PATCH] i965/fs: Don't mess up stride for uniform integer multiplication.

2015-06-22 Thread Matt Turner

If the stride is 0, the source is a uniform and we should not modify the
stride.

Cc: "10.6" 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91047
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 5563c5a..903624c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3196,10 +3196,16 @@ fs_visitor::lower_integer_multiplication()
src1_1_w.fixed_hw_reg.dw1.ud >>= 16;
 } else {
src1_0_w.type = BRW_REGISTER_TYPE_UW;
-   src1_0_w.stride = 2;
+   if (src1_0_w.stride != 0) {
+  assert(src1_0_w.stride == 1);
+  src1_0_w.stride = 2;
+   }
 
src1_1_w.type = BRW_REGISTER_TYPE_UW;
-   src1_1_w.stride = 2;
+   if (src1_1_w.stride != 0) {
+  assert(src1_1_w.stride == 1);
+  src1_1_w.stride = 2;
+   }
src1_1_w.subreg_offset += type_sz(BRW_REGISTER_TYPE_UW);
 }
 ibld.MUL(low, inst->src[0], src1_0_w);
@@ -3209,10 +3215,16 @@ fs_visitor::lower_integer_multiplication()
 fs_reg src0_1_w = inst->src[0];
 
 src0_0_w.type = BRW_REGISTER_TYPE_UW;
-src0_0_w.stride = 2;
+if (src0_0_w.stride != 0) {
+   assert(src0_0_w.stride == 1);
+   src0_0_w.stride = 2;
+}
 
 src0_1_w.type = BRW_REGISTER_TYPE_UW;
-src0_1_w.stride = 2;
+if (src0_1_w.stride != 0) {
+   assert(src0_1_w.stride == 1);
+   src0_1_w.stride = 2;
+}
 src0_1_w.subreg_offset += type_sz(BRW_REGISTER_TYPE_UW);
 
 ibld.MUL(low, src0_0_w, inst->src[1]);
-- 
2.3.6
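
To make the stride rule concrete, here is a tiny sketch with a
hypothetical register view that carries only a stride; it is not fs_reg.
My reading of the patch is that a stride of 0 means every channel reads
the same element (a uniform), so it has to stay 0, while a dword stride
of 1 becomes a word stride of 2 once the source is retyped to UW.

```c
#include <assert.h>

/* Hypothetical register view; only the field this patch touches. */
struct reg_view {
   unsigned stride;
};

/* Adjusts the stride when a 32-bit source is reinterpreted as 16-bit
 * words. Uniforms (stride 0) are left alone.
 */
static void view_as_words(struct reg_view *r)
{
   if (r->stride != 0) {
      /* The lowering only expects packed dword sources here. */
      assert(r->stride == 1);
      r->stride = 2;
   }
}
```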


[Mesa-dev] [PATCH] mesa: Delete unused ICEIL().

2015-06-22 Thread Matt Turner

Can't find any uses of it in git history.
---
Strangely, when it was moved to its current location in commit 27558a1,
it was moved from mmath.h... which seems to have been lost from git's
history. Searching further, git log --grep mmath.h shows that various
commit messages mention modifying mmath.h, yet none of the commits
actually do.

 src/mesa/main/imports.h | 32 
 1 file changed, 32 deletions(-)

diff --git a/src/mesa/main/imports.h b/src/mesa/main/imports.h
index c4d917e..9ffe3de 100644
--- a/src/mesa/main/imports.h
+++ b/src/mesa/main/imports.h
@@ -230,38 +230,6 @@ static inline int IFLOOR(float f)
 }
 
 
-/** Return (as an integer) ceiling of float */
-static inline int ICEIL(float f)
-{
-#if defined(USE_X86_ASM) && defined(__GNUC__) && defined(__i386__)
-   /*
-* IEEE ceil for computers that round to nearest or even.
-* 'f' must be between -4194304 and 4194303.
-* This ceil operation is done by "(iround(f + .5) + iround(f - .5) + 1) >> 1",
-* but uses some IEEE specific tricks for better speed.
-* Contributed by Josh Vanderhoof
-*/
-   int ai, bi;
-   double af, bf;
-   af = (3 << 22) + 0.5 + (double)f;
-   bf = (3 << 22) + 0.5 - (double)f;
-   /* GCC generates an extra fstp/fld without this. */
-   __asm__ ("fstps %0" : "=m" (ai) : "t" (af) : "st");
-   __asm__ ("fstps %0" : "=m" (bi) : "t" (bf) : "st");
-   return (ai - bi + 1) >> 1;
-#else
-   int ai, bi;
-   double af, bf;
-   fi_type u;
-   af = (3 << 22) + 0.5 + (double)f;
-   bf = (3 << 22) + 0.5 - (double)f;
-   u.f = (float) af; ai = u.i;
-   u.f = (float) bf; bi = u.i;
-   return (ai - bi + 1) >> 1;
-#endif
-}
-
-
 /**
  * Is x a power of two?
  */
-- 
2.3.6


Re: [Mesa-dev] [PATCH] mesa: Delete unused ICEIL().

2015-06-22 Thread Jordan Justen

Reviewed-by: Jordan Justen 

On 2015-06-22 14:58:27, Matt Turner wrote:
> Can't find any uses of it in git history.
> ---
> Strangely, when it was moved to its current location in commit 27558a1,
> it was moved from mmath.h... which seems to have been lost from git's
> history. Searching further git log --grep mmath.h shows that various
> commit messages mention modifying mmath.h and none of the commits
> actually do.
> 
>  src/mesa/main/imports.h | 32 
>  1 file changed, 32 deletions(-)
> 
> diff --git a/src/mesa/main/imports.h b/src/mesa/main/imports.h
> index c4d917e..9ffe3de 100644
> --- a/src/mesa/main/imports.h
> +++ b/src/mesa/main/imports.h
> @@ -230,38 +230,6 @@ static inline int IFLOOR(float f)
>  }
>  
>  
> -/** Return (as an integer) ceiling of float */
> -static inline int ICEIL(float f)
> -{
> -#if defined(USE_X86_ASM) && defined(__GNUC__) && defined(__i386__)
> -   /*
> -* IEEE ceil for computers that round to nearest or even.
> -* 'f' must be between -4194304 and 4194303.
> -* This ceil operation is done by "(iround(f + .5) + iround(f - .5) + 1) >> 1",
> -* but uses some IEEE specific tricks for better speed.
> -* Contributed by Josh Vanderhoof
> -*/
> -   int ai, bi;
> -   double af, bf;
> -   af = (3 << 22) + 0.5 + (double)f;
> -   bf = (3 << 22) + 0.5 - (double)f;
> -   /* GCC generates an extra fstp/fld without this. */
> -   __asm__ ("fstps %0" : "=m" (ai) : "t" (af) : "st");
> -   __asm__ ("fstps %0" : "=m" (bi) : "t" (bf) : "st");
> -   return (ai - bi + 1) >> 1;
> -#else
> -   int ai, bi;
> -   double af, bf;
> -   fi_type u;
> -   af = (3 << 22) + 0.5 + (double)f;
> -   bf = (3 << 22) + 0.5 - (double)f;
> -   u.f = (float) af; ai = u.i;
> -   u.f = (float) bf; bi = u.i;
> -   return (ai - bi + 1) >> 1;
> -#endif
> -}
> -
> -
>  /**
>   * Is x a power of two?
>   */
> -- 
> 2.3.6
> 

Re: [Mesa-dev] [RFC] Compatibility between old dri modules and new loaders, and vice verse

2015-06-22 Thread Ian Romanick

On 06/22/2015 11:54 AM, Dave Airlie wrote:
>>
>> As kindly hinted by Marek, currently we do have a wide selection of
>> supported dri <> loader combinations.
>>
>> Although we like to think that things never break, we have to admit
>> that not many of us test every possible combinations of dri modules
>> and loaders. With the chances getting smaller as the time gap (age)
>> between the two increases. As such I would like to ask if we're
>> interested in gradually depreciating as the gap grows beyond X years.
>>
>> The rough idea that I have in my mind is:
>> - Check for obsolete extensions (requirements for such) - both in the
>> dri modules and the loaders (including the xserver).
>> - Add some WARN messages ("You're using an old loader/DRI module.
>> Update to XXX or later") when such code path is hit.
>> - After X mesa releases, we remove the dri extension from the
>> module(s) and bump the requirement(s) in the loader(s).
>>
>> And now the more important question why ?
>>  - Very rarely tested and not actively supported - if it works it
>> works, we only cover one stable branch.
>>  - Having a quick look at the
>> "if extension && extension.version >= y" maze does leave most of us
>> speechless.
>>  - Will allow us to start removing a few of the nasty quirks/hacks
>> that we currently have laying around.
>>
>> Worth mentioning:
>>  - Depreciation period will be based on the longest time frame set by
>> LTS versions of distros. For example if Debian A ships X and mesa 3
>> years apart, while Ubuntu does is ~2.5 and RedHat ~2.8, we'll stick
>> with 3 years.
>>  - libGL dri1 support... it's been almost four years since the removal
>> of the dri1 modules. Since then the only activity that I've noticed by
>> Connor Behan on the r128 front. Although it seems that he has covered
>> the ddx and is just looking at the kernel side of things. Should we
>> consider mesa X (10.6 ?) as the last one that supports such old
>> modules in it's libGL and give it a much needed cleanup ?
>>
>>
>> How would people feel about this - do we have any strong ack/nack
>> about the idea ? Are there many people/companies that support distros
>> where the xserver <> mesa gap is over, say 2 years ?
> 
> We still ship 7.11 based dri1 drivers in RHEL6, and there is still a
> chance of us rebasing to newer Mesa in that depending on schedules.
> 
> ajax might have a different opinion, on how likely that is, but
> that would be at least another year from now where we'd want DRI1
> to work.

A time line would be good.  I think it will take a fair amount of time
to get a new loader<>driver interface in order.  If we can't change
anything for two years, then there's not a lot of point to thinking
about it now.  If it's a year or less away, that's a different story.

The other possibility would be for RHEL to ship more than one libGL...
one for DRI1 drivers and one for everything else.  I don't know how
horrible that would be.

> Dave.

Re: [Mesa-dev] [RFC] Compatibility between old dri modules and new loaders, and vice verse

2015-06-22 Thread Dave Airlie

On 23 June 2015 at 08:16, Ian Romanick  wrote:
> On 06/22/2015 11:54 AM, Dave Airlie wrote:
>>>
>>> As kindly hinted by Marek, currently we do have a wide selection of
>>> supported dri <> loader combinations.
>>>
>>> Although we like to think that things never break, we have to admit
>>> that not many of us test every possible combinations of dri modules
>>> and loaders. With the chances getting smaller as the time gap (age)
>>> between the two increases. As such I would like to ask if we're
>>> interested in gradually depreciating as the gap grows beyond X years.
>>>
>>> The rough idea that I have in my mind is:
>>> - Check for obsolete extensions (requirements for such) - both in the
>>> dri modules and the loaders (including the xserver).
>>> - Add some WARN messages ("You're using an old loader/DRI module.
>>> Update to XXX or later") when such code path is hit.
>>> - After X mesa releases, we remove the dri extension from the
>>> module(s) and bump the requirement(s) in the loader(s).
>>>
>>> And now the more important question why ?
>>>  - Very rarely tested and not actively supported - if it works it
>>> works, we only cover one stable branch.
>>>  - Having a quick look at the
>>> "if extension && extension.version >= y" maze does leave most of us
>>> speechless.
>>>  - Will allow us to start removing a few of the nasty quirks/hacks
>>> that we currently have laying around.
>>>
>>> Worth mentioning:
>>>  - Depreciation period will be based on the longest time frame set by
>>> LTS versions of distros. For example if Debian A ships X and mesa 3
>>> years apart, while Ubuntu does is ~2.5 and RedHat ~2.8, we'll stick
>>> with 3 years.
>>>  - libGL dri1 support... it's been almost four years since the removal
>>> of the dri1 modules. Since then the only activity that I've noticed by
>>> Connor Behan on the r128 front. Although it seems that he has covered
>>> the ddx and is just looking at the kernel side of things. Should we
>>> consider mesa X (10.6 ?) as the last one that supports such old
>>> modules in it's libGL and give it a much needed cleanup ?
>>>
>>>
>>> How would people feel about this - do we have any strong ack/nack
>>> about the idea ? Are there many people/companies that support distros
>>> where the xserver <> mesa gap is over, say 2 years ?
>>
>> We still ship 7.11 based dri1 drivers in RHEL6, and there is still a
>> chance of us rebasing to newer Mesa in that depending on schedules.
>>
>> ajax might have a different opinion, on how likely that is, but
>> that would be at least another year from now where we'd want DRI1
>> to work.
>
> A time line would be good.  I think it will take a fair amount of time
> to get a new loader<>driver interface in order.  If we can't change
> anything for two years, then there's not a lot of point to thinking
> about it now.  If it's a year or less away, that's a different story.
>
> The other possibility would be for RHEL to ship more than one libGL...
> one for DRI1 drivers and one for everything else.  I don't know how
> horrible that would be.

That would be worse than impossible. It's bad enough that nvidia
overwrites libGL; I don't want us to do it to ourselves as well :-)

Dave.
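
For context on the quoted proposal (check the extension version, warn
when it is too old), a rough sketch of what such a check could look like
follows. Everything here is hypothetical: the struct, the enum and the
function are illustrative only and are not the real DRI loader or
extension interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical extension record: a name plus a version number. */
struct ext_info {
   const char *name;
   int version;
};

enum ext_status { EXT_MISSING, EXT_TOO_OLD, EXT_OK };

/* Looks up an extension by name and compares its version against the
 * minimum the caller needs, warning on stderr when it is too old.
 */
static enum ext_status
check_extension(const struct ext_info *exts, size_t count,
                const char *name, int min_version)
{
   for (size_t i = 0; i < count; i++) {
      if (strcmp(exts[i].name, name) != 0)
         continue;

      if (exts[i].version >= min_version)
         return EXT_OK;

      fprintf(stderr,
              "warning: %s version %d is older than required version %d; "
              "consider updating the loader/DRI module\n",
              name, exts[i].version, min_version);
      return EXT_TOO_OLD;
   }

   return EXT_MISSING;
}
```

Centralizing the comparison in one helper like this is one way to avoid
repeating the "extension && extension.version >= y" test at every call
site.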

Re: [Mesa-dev] [PATCH] i965/fs: Don't mess up stride for uniform integer multiplication.

2015-06-22 Thread Kenneth Graunke

On Monday, June 22, 2015 02:58:36 PM Matt Turner wrote:
> If the stride is 0, the source is a uniform and we should not modify the
> stride.
> 
> Cc: "10.6" 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91047
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 20 
>  1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 5563c5a..903624c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -3196,10 +3196,16 @@ fs_visitor::lower_integer_multiplication()
> src1_1_w.fixed_hw_reg.dw1.ud >>= 16;
>  } else {
> src1_0_w.type = BRW_REGISTER_TYPE_UW;
> -   src1_0_w.stride = 2;
> +   if (src1_0_w.stride != 0) {
> +  assert(src1_0_w.stride == 1);
> +  src1_0_w.stride = 2;
> +   }
>  
> src1_1_w.type = BRW_REGISTER_TYPE_UW;
> -   src1_1_w.stride = 2;
> +   if (src1_1_w.stride != 0) {
> +  assert(src1_1_w.stride == 1);
> +  src1_1_w.stride = 2;
> +   }
> src1_1_w.subreg_offset += type_sz(BRW_REGISTER_TYPE_UW);
>  }
>  ibld.MUL(low, inst->src[0], src1_0_w);
> @@ -3209,10 +3215,16 @@ fs_visitor::lower_integer_multiplication()
>  fs_reg src0_1_w = inst->src[0];
>  
>  src0_0_w.type = BRW_REGISTER_TYPE_UW;
> -src0_0_w.stride = 2;
> +if (src0_0_w.stride != 0) {
> +   assert(src0_0_w.stride == 1);
> +   src0_0_w.stride = 2;
> +}
>  
>  src0_1_w.type = BRW_REGISTER_TYPE_UW;
> -src0_1_w.stride = 2;
> +if (src0_1_w.stride != 0) {
> +   assert(src0_1_w.stride == 1);
> +   src0_1_w.stride = 2;
> +}
>  src0_1_w.subreg_offset += type_sz(BRW_REGISTER_TYPE_UW);
>  
>  ibld.MUL(low, src0_0_w, inst->src[1]);
> 

Whoops.  Yeah, this makes sense.

Reviewed-by: Kenneth Graunke 



Re: [Mesa-dev] [PATCH 1/3] i965/cfg: Assert that cur_do/while/if pointers are non-NULL.

2015-06-22 Thread Jordan Justen

Series Reviewed-by: Jordan Justen 

On 2015-06-22 14:56:06, Matt Turner wrote:
> Coverity sees that the functions immediately below the new assertions
> dereference these pointers, but is unaware that an ENDIF always follows
> an IF, etc.
> ---
>  src/mesa/drivers/dri/i965/brw_cfg.cpp | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp b/src/mesa/drivers/dri/i965/brw_cfg.cpp
> index 39c419b..f1f230e 100644
> --- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
> @@ -231,6 +231,7 @@ cfg_t::cfg_t(exec_list *instructions)
>   if (cur_else) {
>  cur_else->add_successor(mem_ctx, cur_endif);
>   } else {
> +assert(cur_if != NULL);
>  cur_if->add_successor(mem_ctx, cur_endif);
>   }
>  
> @@ -299,6 +300,7 @@ cfg_t::cfg_t(exec_list *instructions)
>   inst->exec_node::remove();
>   cur->instructions.push_tail(inst);
>  
> + assert(cur_do != NULL && cur_while != NULL);
>  cur->add_successor(mem_ctx, cur_do);
>  set_next_block(&cur, cur_while, ip);
>  
> -- 
> 2.3.6
> 

[Mesa-dev] [PATCH v3 06/18] mesa/glformats: recognize ASTC formats as compressed

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

Reviewed-by: Anuj Phogat 
Signed-off-by: Nanley Chery 
---
 src/mesa/main/glformats.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index ac69fab..e7363b5 100644
--- a/src/mesa/main/glformats.c
+++ b/src/mesa/main/glformats.c
@@ -1262,6 +1262,35 @@ _mesa_is_compressed_format(const struct gl_context *ctx, GLenum format)
case GL_COMPRESSED_RGB_BPTC_UNSIGNED_FLOAT:
   return _mesa_is_desktop_gl(ctx) &&
  ctx->Extensions.ARB_texture_compression_bptc;
+   case GL_COMPRESSED_RGBA_ASTC_4x4_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_5x4_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_5x5_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_6x5_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_6x6_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_8x5_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_8x6_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_8x8_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_10x5_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_10x6_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_10x8_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_10x10_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_12x10_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_12x12_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x4_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x5_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x5_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x6_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x5_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x6_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x8_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x5_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x6_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x8_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x10_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x10_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR:
+  return ctx->Extensions.KHR_texture_compression_astc_ldr;
case GL_PALETTE4_RGB8_OES:
case GL_PALETTE4_RGBA8_OES:
case GL_PALETTE4_R5_G6_B5_OES:
-- 
2.4.2
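
As an aside, the 28 case labels could also be expressed as two range
checks, since as far as I can tell from the KHR extension spec the RGBA
and sRGB enum blocks are each contiguous. The sketch below is only an
alternative formulation, not what the patch does; the helper names and
the have_astc_ldr parameter are hypothetical, and the enum values are
copied in so the snippet builds on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* First and last enums of each 2D ASTC block, per the KHR spec. */
#define GL_COMPRESSED_RGBA_ASTC_4x4_KHR            0x93B0
#define GL_COMPRESSED_RGBA_ASTC_12x12_KHR          0x93BD
#define GL_COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR    0x93D0
#define GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR  0x93DD

/* True for any of the 14 RGBA or 14 sRGB 2D ASTC formats. */
static bool is_astc_2d_format(unsigned format)
{
   return (format >= GL_COMPRESSED_RGBA_ASTC_4x4_KHR &&
           format <= GL_COMPRESSED_RGBA_ASTC_12x12_KHR) ||
          (format >= GL_COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR &&
           format <= GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR);
}

/* Mirrors the patch's gating: an ASTC enum only counts as a compressed
 * format when the LDR extension is enabled. have_astc_ldr stands in
 * for ctx->Extensions.KHR_texture_compression_astc_ldr.
 */
static bool astc_format_is_compressed(unsigned format, bool have_astc_ldr)
{
   return is_astc_2d_format(format) && have_astc_ldr;
}
```

Explicit case labels, as in the patch, have the advantage of not
depending on the numeric layout of the enums.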


[Mesa-dev] [PATCH v3 03/18] mesa: disable online compression for ASTC formats

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

Reviewed-by: Anuj Phogat 
Signed-off-by: Nanley Chery 
---
 src/mesa/main/texcompress.c | 22 ++
 src/mesa/main/teximage.c| 28 
 2 files changed, 50 insertions(+)

diff --git a/src/mesa/main/texcompress.c b/src/mesa/main/texcompress.c
index 0fd1a36..1654fc6 100644
--- a/src/mesa/main/texcompress.c
+++ b/src/mesa/main/texcompress.c
@@ -229,6 +229,28 @@ _mesa_gl_compressed_format_base_format(GLenum format)
  *what GL_NUM_COMPRESSED_TEXTURE_FORMATS and
  *GL_COMPRESSED_TEXTURE_FORMATS return."
  *
+ * The KHR_texture_compression_astc_hdr spec says:
+ *
+ *"Interactions with OpenGL 4.2
+ *
+ *OpenGL 4.2 supports the feature that compressed textures can be
+ *compressed online, by passing the compressed texture format enum as
+ *the internal format when uploading a texture using TexImage1D,
+ *TexImage2D or TexImage3D (see Section 3.9.3, Texture Image
+ *Specification, subsection Encoding of Special Internal Formats).
+ *
+ *Due to the complexity of the ASTC compression algorithm, it is not
+ *usually suitable for online use, and therefore ASTC support will be
+ *limited to pre-compressed textures only. Where on-device compression
+ *is required, a domain-specific limited compressor will typically
+ *be used, and this is therefore not suitable for implementation in
+ *the driver.
+ *
+ *In particular, the ASTC format specifiers will not be added to
+ *Table 3.14, and thus will not be accepted by the TexImage*D
+ *functions, and will not be returned by the (already deprecated)
+ *COMPRESSED_TEXTURE_FORMATS query."
+ *
  * There is no formal spec for GL_ATI_texture_compression_3dc.  Since the
  * formats added by this extension are luminance-alpha formats, it is
  * reasonable to expect them to follow the same rules as
diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index 3d85615..86ef407 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -1778,6 +1778,34 @@ compressedteximage_only_format(const struct gl_context *ctx, GLenum format)
case GL_PALETTE8_R5_G6_B5_OES:
case GL_PALETTE8_RGBA4_OES:
case GL_PALETTE8_RGB5_A1_OES:
+   case GL_COMPRESSED_RGBA_ASTC_4x4_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_5x4_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_5x5_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_6x5_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_6x6_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_8x5_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_8x6_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_8x8_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_10x5_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_10x6_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_10x8_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_10x10_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_12x10_KHR:
+   case GL_COMPRESSED_RGBA_ASTC_12x12_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x4_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x5_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x5_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x6_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x5_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x6_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x8_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x5_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x6_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x8_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x10_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x10_KHR:
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR:
   return GL_TRUE;
default:
   return GL_FALSE;
-- 
2.4.2


[Mesa-dev] [PATCH v3 02/18] glapi: add support for KHR_texture_compression_astc_ldr

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

v2: correct the spelling of the sRGB variants.
remove spaces around "=" when setting the enum value.

Reviewed-by: Anuj Phogat 
Signed-off-by: Nanley Chery 
---
 .../glapi/gen/KHR_texture_compression_astc.xml | 40 ++
 src/mapi/glapi/gen/Makefile.am |  1 +
 src/mapi/glapi/gen/gl_API.xml  |  2 +-
 3 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 src/mapi/glapi/gen/KHR_texture_compression_astc.xml

diff --git a/src/mapi/glapi/gen/KHR_texture_compression_astc.xml b/src/mapi/glapi/gen/KHR_texture_compression_astc.xml
new file mode 100644
index 000..7b5864d
--- /dev/null
+++ b/src/mapi/glapi/gen/KHR_texture_compression_astc.xml
@@ -0,0 +1,40 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index 5b163b0..53edab5 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -187,6 +187,7 @@ API_XML = \
INTEL_performance_query.xml \
KHR_debug.xml \
KHR_context_flush_control.xml \
+   KHR_texture_compression_astc.xml \
NV_conditional_render.xml \
NV_primitive_restart.xml \
NV_texture_barrier.xml \
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 2f33075..8df58a3 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8162,7 +8162,7 @@
 
 http://www.w3.org/2001/XInclude"/>
 
-
+http://www.w3.org/2001/XInclude"/>
 
 http://www.w3.org/2001/XInclude"/>
 
-- 
2.4.2


[Mesa-dev] [PATCH v3 04/18] mesa: return bool instead of GLboolean in compressedteximage_only_format()

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

In agreement with the coding style, functions that aren't directly visible
to the GL API should prefer the use of bool over GLboolean.

Suggested-by: Ian Romanick 
Signed-off-by: Nanley Chery 
---
 src/mesa/main/teximage.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index 86ef407..0e0488a 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -1763,7 +1763,7 @@ _mesa_test_proxy_teximage(struct gl_context *ctx, GLenum target, GLint level,
 /**
  * Return true if the format is only valid for glCompressedTexImage.
  */
-static GLboolean
+static bool
 compressedteximage_only_format(const struct gl_context *ctx, GLenum format)
 {
switch (format) {
@@ -1806,9 +1806,9 @@ compressedteximage_only_format(const struct gl_context *ctx, GLenum format)
case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x10_KHR:
case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x10_KHR:
case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR:
-  return GL_TRUE;
+  return true;
default:
-  return GL_FALSE;
+  return false;
}
 }
 
-- 
2.4.2


[Mesa-dev] [PATCH v3 07/18] mesa/texcompress: enable translation between MESA and GL ASTC formats

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

Reviewed-by: Anuj Phogat 
Signed-off-by: Nanley Chery 
---
 src/mesa/main/texcompress.c | 114 
 1 file changed, 114 insertions(+)

diff --git a/src/mesa/main/texcompress.c b/src/mesa/main/texcompress.c
index 1654fc6..203a065 100644
--- a/src/mesa/main/texcompress.c
+++ b/src/mesa/main/texcompress.c
@@ -471,6 +471,63 @@ _mesa_glenum_to_compressed_format(GLenum format)
case GL_COMPRESSED_RGB_BPTC_UNSIGNED_FLOAT:
   return MESA_FORMAT_BPTC_RGB_UNSIGNED_FLOAT;
 
+   case GL_COMPRESSED_RGBA_ASTC_4x4_KHR:
+  return MESA_FORMAT_ASTC_4x4_RGBA;
+   case GL_COMPRESSED_RGBA_ASTC_5x4_KHR:
+  return MESA_FORMAT_ASTC_5x4_RGBA;
+   case GL_COMPRESSED_RGBA_ASTC_5x5_KHR:
+  return MESA_FORMAT_ASTC_5x5_RGBA;
+   case GL_COMPRESSED_RGBA_ASTC_6x5_KHR:
+  return MESA_FORMAT_ASTC_6x5_RGBA;
+   case GL_COMPRESSED_RGBA_ASTC_6x6_KHR:
+  return MESA_FORMAT_ASTC_6x6_RGBA;
+   case GL_COMPRESSED_RGBA_ASTC_8x5_KHR:
+  return MESA_FORMAT_ASTC_8x5_RGBA;
+   case GL_COMPRESSED_RGBA_ASTC_8x6_KHR:
+  return MESA_FORMAT_ASTC_8x6_RGBA;
+   case GL_COMPRESSED_RGBA_ASTC_8x8_KHR:
+  return MESA_FORMAT_ASTC_8x8_RGBA;
+   case GL_COMPRESSED_RGBA_ASTC_10x5_KHR:
+  return MESA_FORMAT_ASTC_10x5_RGBA;
+   case GL_COMPRESSED_RGBA_ASTC_10x6_KHR:
+  return MESA_FORMAT_ASTC_10x6_RGBA;
+   case GL_COMPRESSED_RGBA_ASTC_10x8_KHR:
+  return MESA_FORMAT_ASTC_10x8_RGBA;
+   case GL_COMPRESSED_RGBA_ASTC_10x10_KHR:
+  return MESA_FORMAT_ASTC_10x10_RGBA;
+   case GL_COMPRESSED_RGBA_ASTC_12x10_KHR:
+  return MESA_FORMAT_ASTC_12x10_RGBA;
+   case GL_COMPRESSED_RGBA_ASTC_12x12_KHR:
+  return MESA_FORMAT_ASTC_12x12_RGBA;
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR:
+  return MESA_FORMAT_ASTC_4x4_SRGB8_ALPHA8;
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x4_KHR:
+  return MESA_FORMAT_ASTC_5x4_SRGB8_ALPHA8;
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x5_KHR:
+  return MESA_FORMAT_ASTC_5x5_SRGB8_ALPHA8;
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x5_KHR:
+  return MESA_FORMAT_ASTC_6x5_SRGB8_ALPHA8;
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x6_KHR:
+  return MESA_FORMAT_ASTC_6x6_SRGB8_ALPHA8;
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x5_KHR:
+  return MESA_FORMAT_ASTC_8x5_SRGB8_ALPHA8;
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x6_KHR:
+  return MESA_FORMAT_ASTC_8x6_SRGB8_ALPHA8;
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x8_KHR:
+  return MESA_FORMAT_ASTC_8x8_SRGB8_ALPHA8;
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x5_KHR:
+  return MESA_FORMAT_ASTC_10x5_SRGB8_ALPHA8;
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x6_KHR:
+  return MESA_FORMAT_ASTC_10x6_SRGB8_ALPHA8;
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x8_KHR:
+  return MESA_FORMAT_ASTC_10x8_SRGB8_ALPHA8;
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x10_KHR:
+  return MESA_FORMAT_ASTC_10x10_SRGB8_ALPHA8;
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x10_KHR:
+  return MESA_FORMAT_ASTC_12x10_SRGB8_ALPHA8;
+   case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR:
+  return MESA_FORMAT_ASTC_12x12_SRGB8_ALPHA8;
+
default:
   return MESA_FORMAT_NONE;
}
@@ -561,6 +618,63 @@ _mesa_compressed_format_to_glenum(struct gl_context *ctx, mesa_format mesaFormat
case MESA_FORMAT_BPTC_RGB_UNSIGNED_FLOAT:
   return GL_COMPRESSED_RGB_BPTC_UNSIGNED_FLOAT;
 
+   case MESA_FORMAT_ASTC_4x4_RGBA:
+  return GL_COMPRESSED_RGBA_ASTC_4x4_KHR;
+   case MESA_FORMAT_ASTC_5x4_RGBA:
+  return GL_COMPRESSED_RGBA_ASTC_5x4_KHR;
+   case MESA_FORMAT_ASTC_5x5_RGBA:
+  return GL_COMPRESSED_RGBA_ASTC_5x5_KHR;
+   case MESA_FORMAT_ASTC_6x5_RGBA:
+  return GL_COMPRESSED_RGBA_ASTC_6x5_KHR;
+   case MESA_FORMAT_ASTC_6x6_RGBA:
+  return GL_COMPRESSED_RGBA_ASTC_6x6_KHR;
+   case MESA_FORMAT_ASTC_8x5_RGBA:
+  return GL_COMPRESSED_RGBA_ASTC_8x5_KHR;
+   case MESA_FORMAT_ASTC_8x6_RGBA:
+  return GL_COMPRESSED_RGBA_ASTC_8x6_KHR;
+   case MESA_FORMAT_ASTC_8x8_RGBA:
+  return GL_COMPRESSED_RGBA_ASTC_8x8_KHR;
+   case MESA_FORMAT_ASTC_10x5_RGBA:
+  return GL_COMPRESSED_RGBA_ASTC_10x5_KHR;
+   case MESA_FORMAT_ASTC_10x6_RGBA:
+  return GL_COMPRESSED_RGBA_ASTC_10x6_KHR;
+   case MESA_FORMAT_ASTC_10x8_RGBA:
+  return GL_COMPRESSED_RGBA_ASTC_10x8_KHR;
+   case MESA_FORMAT_ASTC_10x10_RGBA:
+  return GL_COMPRESSED_RGBA_ASTC_10x10_KHR;
+   case MESA_FORMAT_ASTC_12x10_RGBA:
+  return GL_COMPRESSED_RGBA_ASTC_12x10_KHR;
+   case MESA_FORMAT_ASTC_12x12_RGBA:
+  return GL_COMPRESSED_RGBA_ASTC_12x12_KHR;
+   case MESA_FORMAT_ASTC_4x4_SRGB8_ALPHA8:
+  return GL_COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR;
+   case MESA_FORMAT_ASTC_5x4_SRGB8_ALPHA8:
+  return GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x4_KHR;
+   case MESA_FORMAT_ASTC_5x5_SRGB8_ALPHA8:
+  return GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x5_KHR;
+   case MESA_FORMAT_ASTC_6x5_SRGB8_ALPHA8:
+  return GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x5_KHR;
+   case MESA_FORMAT_ASTC_6x6_SRGB8_ALPHA8:
+

[Mesa-dev] [PATCH v3 05/18] mesa: add ASTC extensions to the extensions table

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

v2: alphabetize the extensions.
remove OES ASTC extension.

Reviewed-by: Anuj Phogat 
Signed-off-by: Nanley Chery 
---
 src/mesa/main/extensions.c | 2 ++
 src/mesa/main/mtypes.h | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index 4176a69..adbeecc 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -337,6 +337,8 @@ static const struct extension extension_table[] = {
/* KHR extensions */
{ "GL_KHR_debug",   o(dummy_true),  GL, 2012 },
{ "GL_KHR_context_flush_control",   o(dummy_true),  GL   | ES2, 2014 },
+   { "GL_KHR_texture_compression_astc_hdr",   o(KHR_texture_compression_astc_hdr),   GL   | ES2, 2012 },
+   { "GL_KHR_texture_compression_astc_ldr",   o(KHR_texture_compression_astc_ldr),   GL   | ES2, 2012 },
 
/* Vendor extensions */
{ "GL_3DFX_texture_compression_FXT1",   o(TDFX_texture_compression_FXT1),   GL, 1999 },
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 983b9dc..6a5d15f 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3772,6 +3772,8 @@ struct gl_extensions
GLboolean ATI_fragment_shader;
GLboolean ATI_separate_stencil;
GLboolean INTEL_performance_query;
+   GLboolean KHR_texture_compression_astc_hdr;
+   GLboolean KHR_texture_compression_astc_ldr;
GLboolean MESA_pack_invert;
GLboolean MESA_ycbcr_texture;
GLboolean NV_conditional_render;
-- 
2.4.2

[Mesa-dev] [PATCH v3 09/18] mesa/formats: store whether or not a format is sRGB in gl_format_info

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

v2: remove extra newline.
v3: use bool instead of GLboolean.

Reviewed-by: Anuj Phogat 
Signed-off-by: Nanley Chery 
---
 src/mesa/main/format_info.py |  2 ++
 src/mesa/main/formats.c  | 28 
 2 files changed, 6 insertions(+), 24 deletions(-)

diff --git a/src/mesa/main/format_info.py b/src/mesa/main/format_info.py
index 40104a2..8134e8e 100644
--- a/src/mesa/main/format_info.py
+++ b/src/mesa/main/format_info.py
@@ -191,6 +191,8 @@ for fmat in formats:
bits = [ get_channel_bits(fmat, name) for name in ['l', 'i', 'z', 's']]
print '  {0},'.format(', '.join(map(str, bits)))
 
+   print '  {0:d},'.format(fmat.colorspace == 'srgb')
+
print '  {0}, {1}, {2},'.format(fmat.block_width, fmat.block_height,
int(fmat.block_size() / 8))
 
diff --git a/src/mesa/main/formats.c b/src/mesa/main/formats.c
index 745fd8c..1f5a2b9 100644
--- a/src/mesa/main/formats.c
+++ b/src/mesa/main/formats.c
@@ -65,6 +65,8 @@ struct gl_format_info
GLubyte DepthBits;
GLubyte StencilBits;
 
+   bool IsSRGBFormat;
+
/**
 * To describe compressed formats.  If not compressed, Width=Height=1.
 */
@@ -553,30 +555,8 @@ _mesa_is_format_color_format(mesa_format format)
 GLenum
 _mesa_get_format_color_encoding(mesa_format format)
 {
-   /* XXX this info should be encoded in gl_format_info */
-   switch (format) {
-   case MESA_FORMAT_BGR_SRGB8:
-   case MESA_FORMAT_A8B8G8R8_SRGB:
-   case MESA_FORMAT_B8G8R8A8_SRGB:
-   case MESA_FORMAT_A8R8G8B8_SRGB:
-   case MESA_FORMAT_R8G8B8A8_SRGB:
-   case MESA_FORMAT_L_SRGB8:
-   case MESA_FORMAT_L8A8_SRGB:
-   case MESA_FORMAT_A8L8_SRGB:
-   case MESA_FORMAT_SRGB_DXT1:
-   case MESA_FORMAT_SRGBA_DXT1:
-   case MESA_FORMAT_SRGBA_DXT3:
-   case MESA_FORMAT_SRGBA_DXT5:
-   case MESA_FORMAT_R8G8B8X8_SRGB:
-   case MESA_FORMAT_ETC2_SRGB8:
-   case MESA_FORMAT_ETC2_SRGB8_ALPHA8_EAC:
-   case MESA_FORMAT_ETC2_SRGB8_PUNCHTHROUGH_ALPHA1:
-   case MESA_FORMAT_B8G8R8X8_SRGB:
-   case MESA_FORMAT_BPTC_SRGB_ALPHA_UNORM:
-  return GL_SRGB;
-   default:
-  return GL_LINEAR;
-   }
+   const struct gl_format_info *info = _mesa_get_format_info(format);
+   return info->IsSRGBFormat ? GL_SRGB : GL_LINEAR;
 }
 
 
-- 
2.4.2
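
For anyone reading this outside the tree, the change boils down to swapping a hand-maintained switch for a per-format table lookup. Below is a minimal standalone sketch of that idea, not the Mesa code itself: `demo_format`, `demo_format_info`, `demo_format_table` and `demo_get_color_encoding` are hypothetical names, and the three-entry table is written by hand purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for mesa_format and for GL_LINEAR / GL_SRGB. */
enum demo_format {
   DEMO_FORMAT_RGBA8,
   DEMO_FORMAT_SRGB8_ALPHA8,
   DEMO_FORMAT_ASTC_4x4_SRGB8_ALPHA8,
   DEMO_FORMAT_COUNT
};

enum demo_encoding { DEMO_LINEAR, DEMO_SRGB };

struct demo_format_info {
   const char *name;
   bool is_srgb;   /* plays the role of IsSRGBFormat in the patch */
};

/* Written by hand here; in the patch the equivalent column is printed by
 * format_info.py from each format's colorspace.
 */
static const struct demo_format_info demo_format_table[DEMO_FORMAT_COUNT] = {
   [DEMO_FORMAT_RGBA8]                 = { "RGBA8", false },
   [DEMO_FORMAT_SRGB8_ALPHA8]          = { "SRGB8_ALPHA8", true },
   [DEMO_FORMAT_ASTC_4x4_SRGB8_ALPHA8] = { "ASTC_4x4_SRGB8_ALPHA8", true },
};

/* One lookup replaces the per-format switch. */
static enum demo_encoding
demo_get_color_encoding(enum demo_format format)
{
   assert(format < DEMO_FORMAT_COUNT);
   return demo_format_table[format].is_srgb ? DEMO_SRGB : DEMO_LINEAR;
}
```

The practical benefit is that a newly added sRGB format (such as the ASTC ones in this series) only needs a correct entry in the format description; there is no second list to keep in sync.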

[Mesa-dev] [PATCH v3 00/18] Enable support for 2D ASTC (LDR and HDR modes) in SKL

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

This patch series enables support for the KHR_texture_compression_astc_{ldr,hdr}
extensions on Skylake machines. This revision includes developer suggestions and
fixes rendering issues on previously untested systems. The sRGB issues were
fixed and determined to be unrelated to this patchset.

The Piglit tests for this extension can be found here:
cgit.freedesktop.org/~nchery/piglit

Nanley Chery (18):
  mesa/formats: define the 2D ASTC formats
  glapi: add support for KHR_texture_compression_astc_ldr
  mesa: disable online compression for ASTC formats
  mesa: return bool instead of GLboolean in
compressedteximage_only_format()
  mesa: add ASTC extensions to the extensions table
  mesa/glformats: recognize ASTC formats as compressed
  mesa/texcompress: enable translation between MESA and GL ASTC formats
  mesa/teximage: return the base internal format of the ASTC formats
  mesa/formats: store whether or not a format is sRGB in gl_format_info
  i965/surface_formats: add support for 2D ASTC surface formats
  mesa/macros: add power-of-two assertions for alignment macros
  mesa/macros: move ALIGN_NPOT to macros.h
  i965: use ALIGN_NPOT for setting ASTC mipmap layouts
  i965: correct mt->align_h for 2D textures on Skylake
  i965: change the meaning of cpp for compressed textures
  i965: enable ASTC support for Skylake
  i965: refactor miptree alignment calculation code
  swrast: add a new macro, FETCH_COMPRESSED

 .../glapi/gen/KHR_texture_compression_astc.xml |  40 +++
 src/mapi/glapi/gen/Makefile.am |   1 +
 src/mapi/glapi/gen/gl_API.xml  |   2 +-
 src/mesa/drivers/dri/i965/brw_defines.h|  32 +++
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   |   2 +-
 src/mesa/drivers/dri/i965/brw_surface_formats.c|  80 ++
 src/mesa/drivers/dri/i965/brw_tex_layout.c | 105 
 src/mesa/drivers/dri/i965/intel_copy_image.c   |  19 +-
 src/mesa/drivers/dri/i965/intel_extensions.c   |   5 +
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c  |  15 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h  |   2 +-
 src/mesa/drivers/dri/i965/intel_upload.c   |   6 -
 src/mesa/main/extensions.c |   2 +
 src/mesa/main/format_info.py   |   5 +
 src/mesa/main/formats.c| 158 ++--
 src/mesa/main/formats.csv  |  31 +++
 src/mesa/main/formats.h|  30 +++
 src/mesa/main/glformats.c  |  29 +++
 src/mesa/main/macros.h |  22 +-
 src/mesa/main/mtypes.h |   2 +
 src/mesa/main/texcompress.c| 136 +++
 src/mesa/main/teximage.c   |  70 +-
 src/mesa/swrast/s_texfetch.c   | 269 ++---
 23 files changed, 736 insertions(+), 327 deletions(-)
 create mode 100644 src/mapi/glapi/gen/KHR_texture_compression_astc.xml

-- 
2.4.2

[Mesa-dev] [PATCH v3 01/18] mesa/formats: define the 2D ASTC formats

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

Includes definition of the formats, updates to functions likely to be used, as
well as changes necessary for compilation.

Reviewed-by: Anuj Phogat 
Signed-off-by: Nanley Chery 
---
 src/mesa/main/format_info.py |   3 +
 src/mesa/main/formats.c  | 130 +++
 src/mesa/main/formats.csv|  31 +++
 src/mesa/main/formats.h  |  30 ++
 src/mesa/swrast/s_texfetch.c |  32 ++-
 5 files changed, 225 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/format_info.py b/src/mesa/main/format_info.py
index 3bae57e..40104a2 100644
--- a/src/mesa/main/format_info.py
+++ b/src/mesa/main/format_info.py
@@ -130,6 +130,9 @@ def get_channel_bits(fmat, chan_name):
   elif fmat.layout == 'bptc':
  bits = 16 if fmat.name.endswith('_FLOAT') else 8
  return bits if fmat.has_channel(chan_name) else 0
+  elif fmat.layout == 'astc':
+ bits = 16 if fmat.name.endswith('_RGBA') else 8
+ return bits if fmat.has_channel(chan_name) else 0
   else:
  assert False
else:
diff --git a/src/mesa/main/formats.c b/src/mesa/main/formats.c
index baeb1bf..745fd8c 100644
--- a/src/mesa/main/formats.c
+++ b/src/mesa/main/formats.c
@@ -667,6 +667,48 @@ _mesa_get_srgb_format_linear(mesa_format format)
case MESA_FORMAT_BPTC_SRGB_ALPHA_UNORM:
   format = MESA_FORMAT_BPTC_RGBA_UNORM;
   break;
+   case MESA_FORMAT_ASTC_4x4_SRGB8_ALPHA8:
+  format = MESA_FORMAT_ASTC_4x4_RGBA;
+  break;
+   case MESA_FORMAT_ASTC_5x4_SRGB8_ALPHA8:
+  format = MESA_FORMAT_ASTC_5x4_RGBA;
+  break;
+   case MESA_FORMAT_ASTC_5x5_SRGB8_ALPHA8:
+  format = MESA_FORMAT_ASTC_5x5_RGBA;
+  break;
+   case MESA_FORMAT_ASTC_6x5_SRGB8_ALPHA8:
+  format = MESA_FORMAT_ASTC_6x5_RGBA;
+  break;
+   case MESA_FORMAT_ASTC_6x6_SRGB8_ALPHA8:
+  format = MESA_FORMAT_ASTC_6x6_RGBA;
+  break;
+   case MESA_FORMAT_ASTC_8x5_SRGB8_ALPHA8:
+  format = MESA_FORMAT_ASTC_8x5_RGBA;
+  break;
+   case MESA_FORMAT_ASTC_8x6_SRGB8_ALPHA8:
+  format = MESA_FORMAT_ASTC_8x6_RGBA;
+  break;
+   case MESA_FORMAT_ASTC_8x8_SRGB8_ALPHA8:
+  format = MESA_FORMAT_ASTC_8x8_RGBA;
+  break;
+   case MESA_FORMAT_ASTC_10x5_SRGB8_ALPHA8:
+  format = MESA_FORMAT_ASTC_10x5_RGBA;
+  break;
+   case MESA_FORMAT_ASTC_10x6_SRGB8_ALPHA8:
+  format = MESA_FORMAT_ASTC_10x6_RGBA;
+  break;
+   case MESA_FORMAT_ASTC_10x8_SRGB8_ALPHA8:
+  format = MESA_FORMAT_ASTC_10x8_RGBA;
+  break;
+   case MESA_FORMAT_ASTC_10x10_SRGB8_ALPHA8:
+  format = MESA_FORMAT_ASTC_10x10_RGBA;
+  break;
+   case MESA_FORMAT_ASTC_12x10_SRGB8_ALPHA8:
+  format = MESA_FORMAT_ASTC_12x10_RGBA;
+  break;
+   case MESA_FORMAT_ASTC_12x12_SRGB8_ALPHA8:
+  format = MESA_FORMAT_ASTC_12x12_RGBA;
+  break;
case MESA_FORMAT_B8G8R8X8_SRGB:
   format = MESA_FORMAT_B8G8R8X8_UNORM;
   break;
@@ -741,6 +783,36 @@ _mesa_get_uncompressed_format(mesa_format format)
case MESA_FORMAT_BPTC_RGB_UNSIGNED_FLOAT:
case MESA_FORMAT_BPTC_RGB_SIGNED_FLOAT:
   return MESA_FORMAT_RGB_FLOAT32;
+   case MESA_FORMAT_ASTC_4x4_RGBA:
+   case MESA_FORMAT_ASTC_5x4_RGBA:
+   case MESA_FORMAT_ASTC_5x5_RGBA:
+   case MESA_FORMAT_ASTC_6x5_RGBA:
+   case MESA_FORMAT_ASTC_6x6_RGBA:
+   case MESA_FORMAT_ASTC_8x5_RGBA:
+   case MESA_FORMAT_ASTC_8x6_RGBA:
+   case MESA_FORMAT_ASTC_8x8_RGBA:
+   case MESA_FORMAT_ASTC_10x5_RGBA:
+   case MESA_FORMAT_ASTC_10x6_RGBA:
+   case MESA_FORMAT_ASTC_10x8_RGBA:
+   case MESA_FORMAT_ASTC_10x10_RGBA:
+   case MESA_FORMAT_ASTC_12x10_RGBA:
+   case MESA_FORMAT_ASTC_12x12_RGBA:
+   return MESA_FORMAT_RGBA_FLOAT16;
+   case MESA_FORMAT_ASTC_4x4_SRGB8_ALPHA8:
+   case MESA_FORMAT_ASTC_5x4_SRGB8_ALPHA8:
+   case MESA_FORMAT_ASTC_5x5_SRGB8_ALPHA8:
+   case MESA_FORMAT_ASTC_6x5_SRGB8_ALPHA8:
+   case MESA_FORMAT_ASTC_6x6_SRGB8_ALPHA8:
+   case MESA_FORMAT_ASTC_8x5_SRGB8_ALPHA8:
+   case MESA_FORMAT_ASTC_8x6_SRGB8_ALPHA8:
+   case MESA_FORMAT_ASTC_8x8_SRGB8_ALPHA8:
+   case MESA_FORMAT_ASTC_10x5_SRGB8_ALPHA8:
+   case MESA_FORMAT_ASTC_10x6_SRGB8_ALPHA8:
+   case MESA_FORMAT_ASTC_10x8_SRGB8_ALPHA8:
+   case MESA_FORMAT_ASTC_10x10_SRGB8_ALPHA8:
+   case MESA_FORMAT_ASTC_12x10_SRGB8_ALPHA8:
+   case MESA_FORMAT_ASTC_12x12_SRGB8_ALPHA8:
+   return MESA_FORMAT_A8B8G8R8_SRGB;
default:
 #ifdef DEBUG
   assert(!_mesa_is_format_compressed(format));
@@ -1253,6 +1325,34 @@ _mesa_format_to_type_and_comps(mesa_format format,
case MESA_FORMAT_BPTC_SRGB_ALPHA_UNORM:
case MESA_FORMAT_BPTC_RGB_SIGNED_FLOAT:
case MESA_FORMAT_BPTC_RGB_UNSIGNED_FLOAT:
+   case MESA_FORMAT_ASTC_4x4_RGBA:
+   case MESA_FORMAT_ASTC_5x4_RGBA:
+   case MESA_FORMAT_ASTC_5x5_RGBA:
+   case MESA_FORMAT_ASTC_6x5_RGBA:
+   case MESA_FORMAT_ASTC_6x6_RGBA:
+   case MESA_FORMAT_ASTC_8x5_RGBA:
+   case MESA_FORMAT_ASTC_8x6_RGBA:
+   case MESA_FORMAT_ASTC_8x8_RGBA:
+   case MESA_FORMAT_ASTC_10x5_RGBA:
+   case MESA

[Mesa-dev] [PATCH v3 14/18] i965: correct mt->align_h for 2D textures on Skylake

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

In agreement with commit 4ab8d59a23, vertical alignment values are equal to
four times the block height on Gen9+.

v2: add newlines to separate declarations, statements, and comments.

Reviewed-by: Anuj Phogat 
Reviewed-by: Neil Roberts 
Signed-off-by: Nanley Chery 
---
 src/mesa/drivers/dri/i965/brw_tex_layout.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c b/src/mesa/drivers/dri/i965/brw_tex_layout.c
index 4007697..ade2940 100644
--- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
+++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
@@ -270,9 +270,14 @@ intel_vertical_texture_alignment_unit(struct brw_context *brw,
 * Where "*" means either VALIGN_2 or VALIGN_4 depending on the setting of
 * the SURFACE_STATE "Surface Vertical Alignment" field.
 */
-   if (_mesa_is_format_compressed(mt->format))
-  /* See comment above for the horizontal alignment */
-  return brw->gen >= 9 ? 16 : 4;
+if (_mesa_is_format_compressed(mt->format)) {
+   unsigned int i, j;
+
+   _mesa_get_format_block_size(mt->format, &i, &j);
+
+   /* See comment above for the horizontal alignment */
+   return brw->gen >= 9 ? j * 4 : 4;
+}
 
if (mt->format == MESA_FORMAT_S_UINT8)
   return brw->gen >= 7 ? 8 : 4;
-- 
2.4.2
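
To make the arithmetic concrete, here is a small standalone sketch of the rule the patch applies to compressed formats. It is only an illustration: `demo_compressed_valign` is a hypothetical helper that takes the hardware generation and the block height directly, rather than a brw_context and a miptree.

```c
#include <assert.h>

/* Vertical alignment for a compressed format, following the expression in
 * the patch: four times the block height on Gen9+, otherwise 4.
 */
static unsigned
demo_compressed_valign(int gen, unsigned block_height)
{
   assert(block_height > 0);
   return gen >= 9 ? block_height * 4 : 4;
}
```

For formats with 4-row blocks the result on Gen9 is 16, the same as the constant being replaced. The two only diverge for other block heights, for example an ASTC 12x12 block gives 48.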

[Mesa-dev] [PATCH v3 17/18] i965: refactor miptree alignment calculation code

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

Remove redundant checks and comments by grouping our calculations for
align_w and align_h wherever possible.

v2: reintroduce brw.
don't include functional changes.
don't adjust function parameters or create a new function.

Signed-off-by: Nanley Chery 
---
 src/mesa/drivers/dri/i965/brw_tex_layout.c | 85 +++---
 1 file changed, 30 insertions(+), 55 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c b/src/mesa/drivers/dri/i965/brw_tex_layout.c
index 840a069..493ed4f 100644
--- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
+++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
@@ -123,12 +123,6 @@ intel_horizontal_texture_alignment_unit(struct brw_context *brw,
   return 16;
 
/**
-* From the "Alignment Unit Size" section of various specs, namely:
-* - Gen3 Spec: "Memory Data Formats" Volume, Section 1.20.1.4
-* - i965 and G45 PRMs: Volume 1, Section 6.17.3.4.
-* - Ironlake and Sandybridge PRMs: Volume 1, Part 1, Section 7.18.3.4
-* - BSpec (for Ivybridge and slight variations in separate stencil)
-*
 * +--+
 * || alignment unit width  ("i") |
 * | Surface Property   |-|
@@ -146,32 +140,6 @@ intel_horizontal_texture_alignment_unit(struct brw_context *brw,
 * On IVB+, non-special cases can be overridden by setting the SURFACE_STATE
 * "Surface Horizontal Alignment" field to HALIGN_4 or HALIGN_8.
 */
-if (_mesa_is_format_compressed(mt->format)) {
-   /* The hardware alignment requirements for compressed textures
-* happen to match the block boundaries.
-*/
-  unsigned int i, j;
-  _mesa_get_format_block_size(mt->format, &i, &j);
-
-  /* On Gen9+ we can pick our own alignment for compressed textures but it
-   * has to be a multiple of the block size. The minimum alignment we can
-   * pick is 4 so we effectively have to align to 4 times the block
-   * size
-   */
-  if (brw->gen >= 9)
- return i * 4;
-  else
- return i;
-}
-
-   if (mt->format == MESA_FORMAT_S_UINT8)
-  return 8;
-
-   if (brw->gen >= 9 && mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE) {
-  uint32_t align = tr_mode_horizontal_texture_alignment(brw, mt);
-  /* XY_FAST_COPY_BLT doesn't support horizontal alignment < 32. */
-  return align < 32 ? 32 : align;
-   }
 
if (brw->gen >= 7 && mt->format == MESA_FORMAT_Z_UNORM16)
   return 8;
@@ -248,12 +216,6 @@ intel_vertical_texture_alignment_unit(struct brw_context *brw,
   const struct intel_mipmap_tree *mt)
 {
/**
-* From the "Alignment Unit Size" section of various specs, namely:
-* - Gen3 Spec: "Memory Data Formats" Volume, Section 1.20.1.4
-* - i965 and G45 PRMs: Volume 1, Section 6.17.3.4.
-* - Ironlake and Sandybridge PRMs: Volume 1, Part 1, Section 7.18.3.4
-* - BSpec (for Ivybridge and slight variations in separate stencil)
-*
 * +--+
 * || alignment unit height ("j") |
 * | Surface Property   |-|
@@ -270,23 +232,6 @@ intel_vertical_texture_alignment_unit(struct brw_context *brw,
 * Where "*" means either VALIGN_2 or VALIGN_4 depending on the setting of
 * the SURFACE_STATE "Surface Vertical Alignment" field.
 */
-if (_mesa_is_format_compressed(mt->format)) {
-   unsigned int i, j;
-
-   _mesa_get_format_block_size(mt->format, &i, &j);
-
-   /* See comment above for the horizontal alignment */
-   return brw->gen >= 9 ? j * 4 : 4;
-}
-
-   if (mt->format == MESA_FORMAT_S_UINT8)
-  return brw->gen >= 7 ? 8 : 4;
-
-   if (mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE) {
-  uint32_t align = tr_mode_vertical_texture_alignment(brw, mt);
-  /* XY_FAST_COPY_BLT doesn't support vertical alignment < 64 */
-  return align < 64 ? 64 : align;
-   }
 
/* Broadwell only supports VALIGN of 4, 8, and 16.  The BSpec says 4
 * should always be used, except for stencil buffers, which should be 8.
@@ -780,6 +725,13 @@ brw_miptree_layout(struct brw_context *brw,
 
mt->tr_mode = INTEL_MIPTREE_TRMODE_NONE;
 
+   /**
+* From the "Alignment Unit Size" section of various specs, namely:
+* - Gen3 Spec: "Memory Data Formats" Volume, Section 1.20.1.4
+* - i965 and G45 PRMs: Volume 1, Section 6.17.3.4.
+* - Ironlake and Sandybridge PRMs: Volume 1, Part 1, Section 7.18.3.4
+* - BSpec (for Ivybridge and slight variations in separate stencil)
+*/
if (brw->gen == 6 && mt->array_layout == ALL_SLICES_AT_EACH_LOD) {
   const GLenum base_format = _mesa_

[Mesa-dev] [PATCH v3 11/18] mesa/macros: add power-of-two assertions for alignment macros

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

The comments for ALIGN and ROUND_DOWN_TO both state that the alignment value
passed into the macro must be a power of two. Add assertions to verify that
this is the case.

v2: use static inline functions instead of gcc-specific statement expressions.

Signed-off-by: Nanley Chery 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  2 +-
 src/mesa/main/macros.h   | 16 +---
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 59081ea..1a57784 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -134,7 +134,7 @@ fs_visitor::nir_setup_outputs(nir_shader *shader)
: var->type->vector_elements;
 
   if (stage == MESA_SHADER_VERTEX) {
- for (int i = 0; i < ALIGN(type_size(var->type), 4) / 4; i++) {
+ for (unsigned int i = 0; i < ALIGN(type_size(var->type), 4) / 4; i++) {
 int output = var->data.location + i;
 this->outputs[output] = offset(reg, 4 * i);
 this->output_components[output] = vector_elements;
diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h
index 0608650..4a640ad 100644
--- a/src/mesa/main/macros.h
+++ b/src/mesa/main/macros.h
@@ -684,7 +684,7 @@ minify(unsigned value, unsigned levels)
  * Note that this considers 0 a power of two.
  */
 static inline bool
-is_power_of_two(unsigned value)
+is_power_of_two(uintptr_t value)
 {
return (value & (value - 1)) == 0;
 }
@@ -700,7 +700,12 @@ is_power_of_two(unsigned value)
  *
  * \sa ROUND_DOWN_TO()
  */
-#define ALIGN(value, alignment)  (((value) + (alignment) - 1) & ~((alignment) - 1))
+static inline uintptr_t
+ALIGN(uintptr_t value, uintptr_t alignment)
+{
+  assert(is_power_of_two(alignment));
+  return (((value) + (alignment) - 1) & ~((alignment) - 1));
+}
 
 /**
  * Align a value down to an alignment value
@@ -713,7 +718,12 @@ is_power_of_two(unsigned value)
  *
  * \sa ALIGN()
  */
-#define ROUND_DOWN_TO(value, alignment) ((value) & ~(alignment - 1))
+static inline uintptr_t
+ROUND_DOWN_TO(uintptr_t value, uintptr_t alignment)
+{
+  assert(is_power_of_two(alignment));
+  return ((value) & ~(alignment - 1));
+}
 
 
 /** Cross product of two 3-element vectors */
-- 
2.4.2
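
A standalone sketch of the checked helpers, for experimenting outside of Mesa. The `demo_` names are hypothetical so the snippet does not clash with the real macros.h; the bodies follow the functions in the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when at most one bit is set. Note that 0 also passes this test. */
static inline bool
demo_is_power_of_two(uintptr_t value)
{
   return (value & (value - 1)) == 0;
}

/* Round value up to the next multiple of a power-of-two alignment. */
static inline uintptr_t
demo_align(uintptr_t value, uintptr_t alignment)
{
   assert(demo_is_power_of_two(alignment));
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Round value down to the previous multiple of a power-of-two alignment. */
static inline uintptr_t
demo_round_down_to(uintptr_t value, uintptr_t alignment)
{
   assert(demo_is_power_of_two(alignment));
   return value & ~(alignment - 1);
}
```

Calling `demo_align(13, 12)` trips the assertion in a debug build instead of silently returning a value that is not a multiple of 12. Since 0 passes the power-of-two test, an alignment of 0 is not caught.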

[Mesa-dev] [PATCH v3 08/18] mesa/teximage: return the base internal format of the ASTC formats

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

This is necessary to initialize the gl_texture_image struct.

From the KHR_texture_compression_astc_ldr spec:
  "Added to Section 3.8.6, Compressed Texture Images

   Add the tokens specified above to Table 3.16, Compressed Internal Formats.
   In all cases, the base internal format will be RGBA. The encoding allows
   images to be encoded with fewer channels, but this is always presented as
   RGBA to the sampler."

Reviewed-by: Anuj Phogat 
Signed-off-by: Nanley Chery 
---
 src/mesa/main/teximage.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index 0e0488a..8de0c11 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -565,6 +565,42 @@ _mesa_base_tex_format( struct gl_context *ctx, GLint internalFormat )
   }
}
 
+   if (ctx->Extensions.KHR_texture_compression_astc_ldr) {
+  switch (internalFormat) {
+  case GL_COMPRESSED_RGBA_ASTC_4x4_KHR:
+  case GL_COMPRESSED_RGBA_ASTC_5x4_KHR:
+  case GL_COMPRESSED_RGBA_ASTC_5x5_KHR:
+  case GL_COMPRESSED_RGBA_ASTC_6x5_KHR:
+  case GL_COMPRESSED_RGBA_ASTC_6x6_KHR:
+  case GL_COMPRESSED_RGBA_ASTC_8x5_KHR:
+  case GL_COMPRESSED_RGBA_ASTC_8x6_KHR:
+  case GL_COMPRESSED_RGBA_ASTC_8x8_KHR:
+  case GL_COMPRESSED_RGBA_ASTC_10x5_KHR:
+  case GL_COMPRESSED_RGBA_ASTC_10x6_KHR:
+  case GL_COMPRESSED_RGBA_ASTC_10x8_KHR:
+  case GL_COMPRESSED_RGBA_ASTC_10x10_KHR:
+  case GL_COMPRESSED_RGBA_ASTC_12x10_KHR:
+  case GL_COMPRESSED_RGBA_ASTC_12x12_KHR:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x4_KHR:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x5_KHR:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x5_KHR:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x6_KHR:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x5_KHR:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x6_KHR:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x8_KHR:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x5_KHR:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x6_KHR:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x8_KHR:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x10_KHR:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x10_KHR:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR:
+ return GL_RGBA;
+  default:
+ ; /* fallthrough */
+  }
+   }
+
if (_mesa_is_gles3(ctx) || ctx->Extensions.ARB_ES3_compatibility) {
   switch (internalFormat) {
   case GL_COMPRESSED_RGB8_ETC2:
-- 
2.4.2
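
The spec text quoted above means every 2D ASTC token, linear or sRGB, reports RGBA as its base internal format. A compact standalone sketch of that rule follows. It is an illustration only: the patch uses an explicit switch, whereas this uses range checks, relying on the 2D ASTC tokens being two contiguous blocks of enum values (0x93B0..0x93BD and 0x93D0..0x93DD). The `demo_`/`DEMO_` names and the -1 "not ASTC" return value are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_GL_RGBA          0x1908
#define DEMO_ASTC_RGBA_FIRST  0x93B0   /* GL_COMPRESSED_RGBA_ASTC_4x4_KHR */
#define DEMO_ASTC_RGBA_LAST   0x93BD   /* GL_COMPRESSED_RGBA_ASTC_12x12_KHR */
#define DEMO_ASTC_SRGB_FIRST  0x93D0   /* GL_COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR */
#define DEMO_ASTC_SRGB_LAST   0x93DD   /* GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR */

/* True for any of the 28 2D ASTC internal formats. */
static bool
demo_is_astc_2d(unsigned internal_format)
{
   return (internal_format >= DEMO_ASTC_RGBA_FIRST &&
           internal_format <= DEMO_ASTC_RGBA_LAST) ||
          (internal_format >= DEMO_ASTC_SRGB_FIRST &&
           internal_format <= DEMO_ASTC_SRGB_LAST);
}

/* Base internal format for ASTC tokens; -1 for anything else (sketch only). */
static int
demo_base_tex_format(unsigned internal_format)
{
   return demo_is_astc_2d(internal_format) ? DEMO_GL_RGBA : -1;
}
```

The explicit switch in the patch is easier to grep and matches the style of the surrounding function, which is presumably why it was chosen over a range test.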

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3 16/18] i965: enable ASTC support for Skylake

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

v2: remove OES ASTC extension reference.

Signed-off-by: Nanley Chery 
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 365b4b8..cc793e5 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -354,6 +354,11 @@ intelInitExtensions(struct gl_context *ctx)
   ctx->Extensions.ARB_stencil_texturing = true;
}
 
+   if (brw->gen >= 9) {
+  ctx->Extensions.KHR_texture_compression_astc_ldr = true;
+  ctx->Extensions.KHR_texture_compression_astc_hdr = true;
+   }
+
if (ctx->API == API_OPENGL_CORE)
   ctx->Extensions.ARB_base_instance = true;
if (ctx->API != API_OPENGL_CORE)
-- 
2.4.2

[Mesa-dev] [PATCH v3 12/18] mesa/macros: move ALIGN_NPOT to macros.h

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

Aligning with a non-power-of-two number is a general task that can be used in
various places. This commit is required for the next one.

Signed-off-by: Nanley Chery 
---
 src/mesa/drivers/dri/i965/intel_upload.c | 6 --
 src/mesa/main/macros.h   | 6 ++
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_upload.c b/src/mesa/drivers/dri/i965/intel_upload.c
index 870aabc..deaae6c 100644
--- a/src/mesa/drivers/dri/i965/intel_upload.c
+++ b/src/mesa/drivers/dri/i965/intel_upload.c
@@ -44,12 +44,6 @@
 
 #define INTEL_UPLOAD_SIZE (64*1024)
 
-/**
- * Like ALIGN(), but works with a non-power-of-two alignment.
- */
-#define ALIGN_NPOT(value, alignment) \
-   (((value) + (alignment) - 1) / (alignment) * (alignment))
-
 void
 intel_upload_finish(struct brw_context *brw)
 {
diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h
index 4a640ad..4a08130 100644
--- a/src/mesa/main/macros.h
+++ b/src/mesa/main/macros.h
@@ -708,6 +708,12 @@ ALIGN(uintptr_t value, uintptr_t alignment)
 }
 
 /**
+ * Like ALIGN(), but works with a non-power-of-two alignment.
+ */
+#define ALIGN_NPOT(value, alignment) \
+   (((value) + (alignment) - 1) / (alignment) * (alignment))
+
+/**
  * Align a value down to an alignment value
  *
  * If \c value is not already aligned to the requested alignment value, it
-- 
2.4.2
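
A quick standalone comparison of the two rounding forms, in case it helps to see why a separate macro is needed. The `DEMO_`/`demo_` names are hypothetical; `DEMO_ALIGN_NPOT` has the same body as ALIGN_NPOT above, and `DEMO_ALIGN_MASK` is the bit-mask form without any assertion, kept only to show what goes wrong.

```c
#include <assert.h>

/* Divide-and-multiply form: valid for any non-zero alignment. */
#define DEMO_ALIGN_NPOT(value, alignment) \
   (((value) + (alignment) - 1) / (alignment) * (alignment))

/* Mask form: only valid when alignment is a power of two. */
#define DEMO_ALIGN_MASK(value, alignment) \
   (((value) + (alignment) - 1) & ~((alignment) - 1))

static unsigned
demo_align_npot(unsigned value, unsigned alignment)
{
   assert(alignment > 0);
   return DEMO_ALIGN_NPOT(value, alignment);
}

static unsigned
demo_align_mask(unsigned value, unsigned alignment)
{
   assert(alignment > 0);
   return DEMO_ALIGN_MASK(value, alignment);
}
```

With value 13 and alignment 12 (a width that shows up with 12-wide ASTC blocks), the divide form gives 24 while the mask form gives 16, which is not a multiple of 12. For power-of-two alignments the two agree.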

[Mesa-dev] [PATCH v3 18/18] swrast: add a new macro, FETCH_COMPRESSED

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

This patch creates a new macro, FETCH_COMPRESSED - similar in nature
to the other FETCH_* macros. This reduces repetition in the code that
deals with compressed textures.

Reviewed-by: Anuj Phogat 
Signed-off-by: Nanley Chery 
---
 src/mesa/swrast/s_texfetch.c | 239 ---
 1 file changed, 41 insertions(+), 198 deletions(-)

diff --git a/src/mesa/swrast/s_texfetch.c b/src/mesa/swrast/s_texfetch.c
index 14e5293..92a4a37 100644
--- a/src/mesa/swrast/s_texfetch.c
+++ b/src/mesa/swrast/s_texfetch.c
@@ -116,6 +116,14 @@ static void fetch_null_texelf( const struct swrast_texture_image *texImage,
   NULL  \
}
 
+#define FETCH_COMPRESSED(NAME)  \
+   {\
+  MESA_FORMAT_ ## NAME, \
+  fetch_compressed, \
+  fetch_compressed, \
+  fetch_compressed  \
+   }
+
 /**
  * Table to map MESA_FORMAT_ to texel fetch/store funcs.
  */
@@ -344,214 +352,49 @@ texfetch_funcs[] =
FETCH_NULL(RGBX_SINT32),
 
/* DXT compressed formats */
-   {
-  MESA_FORMAT_RGB_DXT1,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_RGBA_DXT1,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_RGBA_DXT3,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_RGBA_DXT5,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
+   FETCH_COMPRESSED(RGB_DXT1),
+   FETCH_COMPRESSED(RGBA_DXT1),
+   FETCH_COMPRESSED(RGBA_DXT3),
+   FETCH_COMPRESSED(RGBA_DXT5),
 
/* DXT sRGB compressed formats */
-   {
-  MESA_FORMAT_SRGB_DXT1,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_SRGBA_DXT1,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_SRGBA_DXT3,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_SRGBA_DXT5,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
+   FETCH_COMPRESSED(SRGB_DXT1),
+   FETCH_COMPRESSED(SRGBA_DXT1),
+   FETCH_COMPRESSED(SRGBA_DXT3),
+   FETCH_COMPRESSED(SRGBA_DXT5),
 
/* FXT1 compressed formats */
-   {
-  MESA_FORMAT_RGB_FXT1,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_RGBA_FXT1,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
+   FETCH_COMPRESSED(RGB_FXT1),
+   FETCH_COMPRESSED(RGBA_FXT1),
 
/* RGTC compressed formats */
-   {
-  MESA_FORMAT_R_RGTC1_UNORM,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_R_RGTC1_SNORM,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_RG_RGTC2_UNORM,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_RG_RGTC2_SNORM,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
+   FETCH_COMPRESSED(R_RGTC1_UNORM),
+   FETCH_COMPRESSED(R_RGTC1_SNORM),
+   FETCH_COMPRESSED(RG_RGTC2_UNORM),
+   FETCH_COMPRESSED(RG_RGTC2_SNORM),
 
/* LATC1/2 compressed formats */
-   {
-  MESA_FORMAT_L_LATC1_UNORM,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_L_LATC1_SNORM,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_LA_LATC2_UNORM,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_LA_LATC2_SNORM,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
+   FETCH_COMPRESSED(L_LATC1_UNORM),
+   FETCH_COMPRESSED(L_LATC1_SNORM),
+   FETCH_COMPRESSED(LA_LATC2_UNORM),
+   FETCH_COMPRESSED(LA_LATC2_SNORM),
 
/* ETC1/2 compressed formats */
-   {
-  MESA_FORMAT_ETC1_RGB8,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_ETC2_RGB8,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_ETC2_SRGB8,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_ETC2_RGBA8_EAC,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_ETC2_SRGB8_ALPHA8_EAC,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_ETC2_R11_EAC,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_ETC2_RG11_EAC,
-  fetch_compressed,
-  fetch_compressed,
-  fetch_compressed
-   },
-   {
-  MESA_FORMAT_ETC2_SIGNED_R11_EAC,
-  fetch_compressed,
-  fetch_compressed,
-

[Mesa-dev] [PATCH v3 13/18] i965: use ALIGN_NPOT for setting ASTC mipmap layouts

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

ALIGN is changed to ALIGN_NPOT because alignment values are sometimes not
powers of two when working with ASTC.

Signed-off-by: Nanley Chery 
---
 src/mesa/drivers/dri/i965/brw_tex_layout.c| 12 ++--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  4 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
b/src/mesa/drivers/dri/i965/brw_tex_layout.c
index 998d8c4..4007697 100644
--- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
+++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
@@ -367,7 +367,7 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
mt->total_width = mt->physical_width0;
 
if (mt->compressed) {
-   mt->total_width = ALIGN(mt->physical_width0, mt->align_w);
+   mt->total_width = ALIGN_NPOT(mt->physical_width0, mt->align_w);
}
 
/* May need to adjust width to accommodate the placement of
@@ -379,10 +379,10 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
unsigned mip1_width;
 
if (mt->compressed) {
-  mip1_width = ALIGN(minify(mt->physical_width0, 1), mt->align_w) +
- ALIGN(minify(mt->physical_width0, 2), bw);
+  mip1_width = ALIGN_NPOT(minify(mt->physical_width0, 1), mt->align_w) +
+ ALIGN_NPOT(minify(mt->physical_width0, 2), bw);
} else {
-  mip1_width = ALIGN(minify(mt->physical_width0, 1), mt->align_w) +
+  mip1_width = ALIGN_NPOT(minify(mt->physical_width0, 1), mt->align_w) +
  minify(mt->physical_width0, 2);
}
 
@@ -398,7 +398,7 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
 
   intel_miptree_set_level_info(mt, level, x, y, depth);
 
-  img_height = ALIGN(height, mt->align_h);
+  img_height = ALIGN_NPOT(height, mt->align_h);
   if (mt->compressed)
 img_height /= bh;
 
@@ -415,7 +415,7 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
   /* Layout_below: step right after second mipmap.
*/
   if (level == mt->first_level + 1) {
-x += ALIGN(width, mt->align_w);
+x += ALIGN_NPOT(width, mt->align_w);
   } else {
 y += img_height;
   }
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 6aa969a..b47f49d0 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1213,8 +1213,8 @@ intel_miptree_copy_slice(struct brw_context *brw,
if (dst_mt->compressed) {
   unsigned int i, j;
   _mesa_get_format_block_size(dst_mt->format, &i, &j);
-  height = ALIGN(height, j) / j;
-  width = ALIGN(width, i);
+  height = ALIGN_NPOT(height, j) / j;
+  width = ALIGN_NPOT(width, i);
}
 
/* If it's a packed depth/stencil buffer with separate stencil, the blit
-- 
2.4.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
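
As a rough illustration of why patch 13/18 switches macros, the sketch below contrasts a power-of-two-only round-up with one that accepts any non-zero alignment. The helper names align_pot() and align_npot() are made up for this example and are not the Mesa macros themselves; only the mask expression is taken from the ALIGN definition quoted elsewhere in this thread.

```c
#include <assert.h>

/* Round up using the bit-mask trick. Only valid when alignment is a
 * power of two, which is why the precondition is asserted here.
 * (Hypothetical helper, named for this sketch only.) */
static unsigned align_pot(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Round up using integer division. Works for any non-zero alignment,
 * including ASTC block dimensions such as 5, 6, 10 and 12.
 * (Hypothetical helper, named for this sketch only.) */
static unsigned align_npot(unsigned value, unsigned alignment)
{
   assert(alignment != 0);
   return (value + alignment - 1) / alignment * alignment;
}
```

For a width of 13 and a block width of 5, align_npot(13, 5) gives 15, while the mask expression (13 + 5 - 1) & ~(5 - 1) evaluates to 17, which is not a multiple of 5. For power-of-two alignments the two forms agree.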

[Mesa-dev] [PATCH v3 10/18] i965/surface_formats: add support for 2D ASTC surface formats

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

Intel surface formats default to LDR unless there is hardware
support for HDR and the texture is able to be processed in HDR mode.

v2: remove extra newlines.
v3: follow existing coding style in translate_tex_format().

Signed-off-by: Nanley Chery 
---
 src/mesa/drivers/dri/i965/brw_defines.h | 32 ++
 src/mesa/drivers/dri/i965/brw_surface_formats.c | 80 +
 2 files changed, 112 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index bfcc442..da5d434 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -504,6 +504,38 @@
 #define BRW_SURFACEFORMAT_R8G8B8_UINT0x1C8
 #define BRW_SURFACEFORMAT_R8G8B8_SINT0x1C9
 #define BRW_SURFACEFORMAT_RAW0x1FF
+
+#define GEN9_SURFACE_ASTC_HDR_FORMAT_BIT 0x100
+
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_4x4_U8sRGB 0x200
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_5x4_U8sRGB 0x208
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_5x5_U8sRGB 0x209
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_6x5_U8sRGB 0x211
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_6x6_U8sRGB 0x212
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_8x5_U8sRGB 0x221
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_8x6_U8sRGB 0x222
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_8x8_U8sRGB 0x224
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_10x5_U8sRGB0x231
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_10x6_U8sRGB0x232
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_10x8_U8sRGB0x234
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_10x10_U8sRGB   0x236
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_12x10_U8sRGB   0x23E
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_12x12_U8sRGB   0x23F
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_4x4_FLT16  0x240
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_5x4_FLT16  0x248
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_5x5_FLT16  0x249
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_6x5_FLT16  0x251
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_6x6_FLT16  0x252
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_8x5_FLT16  0x261
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_8x6_FLT16  0x262
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_8x8_FLT16  0x264
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_10x5_FLT16 0x271
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_10x6_FLT16 0x272
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_10x8_FLT16 0x274
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_10x10_FLT160x276
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_12x10_FLT160x27E
+#define BRW_SURFACEFORMAT_ASTC_LDR_2D_12x12_FLT160x27F
+
 #define BRW_SURFACE_FORMAT_SHIFT   18
 #define BRW_SURFACE_FORMAT_MASKINTEL_MASK(26, 18)
 
diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c 
b/src/mesa/drivers/dri/i965/brw_surface_formats.c
index 0501606..a896b79 100644
--- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
+++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
@@ -307,6 +307,34 @@ const struct surface_format_info surface_formats[] = {
SF( x,  x,  x,  x,  x,  x,  x,  x,  x, ETC2_EAC_SRGB8_A8)
SF( x,  x,  x,  x,  x,  x,  x,  x,  x, R8G8B8_UINT)
SF( x,  x,  x,  x,  x,  x,  x,  x,  x, R8G8B8_SINT)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_4x4_FLT16)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_5x4_FLT16)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_5x5_FLT16)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_6x5_FLT16)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_6x6_FLT16)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_8x5_FLT16)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_8x6_FLT16)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_8x8_FLT16)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_10x5_FLT16)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_10x6_FLT16)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_10x8_FLT16)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_10x10_FLT16)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_12x10_FLT16)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_12x12_FLT16)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_4x4_U8sRGB)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_5x4_U8sRGB)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_5x5_U8sRGB)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_6x5_U8sRGB)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_6x6_U8sRGB)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_8x5_U8sRGB)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_8x6_U8sRGB)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_8x8_U8sRGB)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_10x5_U8sRGB)
+   SF(90, 90,  x,  x,  x,  x,  x,  x,  x, ASTC_LDR_2D_10x6_U8sRGB)
+   SF(90, 90,  x,  x,  x,  x,  x

[Mesa-dev] [PATCH v3 15/18] i965: change the meaning of cpp for compressed textures

2015-06-22 Thread Nanley Chery

From: Nanley Chery 

An ASTC block takes up 16 bytes for all block width and height configurations.
This size is not integrally divisible by all ASTC block widths. Therefore cpp
is changed to mean bytes per block if the texture is compressed.

Because the original definition was bytes per block divided by block width, all
references to the mipmap width must be divided by the block width. This keeps the
address calculation formulas consistent. For example, the units for miptree_level
x_offset and miptree total_width have changed from pixels to blocks.

v2: reuse preexisting ALIGN_NPOT macro located in an i965 driver file.
v3: move ALIGN_NPOT into separate commit.
simplify cpp assignment in copy_image_with_blitter().
update miptree width and offset variables in: intel_miptree_copy_slice(),
intel_miptree_map_gtt(), and brw_miptree_layout_texture_3d().

Signed-off-by: Nanley Chery 
---
 src/mesa/drivers/dri/i965/brw_tex_layout.c| 15 +--
 src/mesa/drivers/dri/i965/intel_copy_image.c  | 19 +--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 13 +++--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  2 +-
 4 files changed, 14 insertions(+), 35 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
b/src/mesa/drivers/dri/i965/brw_tex_layout.c
index ade2940..840a069 100644
--- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
+++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
@@ -396,6 +396,7 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
}
}
 
+   mt->total_width /= bw;
mt->total_height = 0;
 
for (unsigned level = mt->first_level; level <= mt->last_level; level++) {
@@ -420,7 +421,7 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
   /* Layout_below: step right after second mipmap.
*/
   if (level == mt->first_level + 1) {
-x += ALIGN_NPOT(width, mt->align_w);
+x += ALIGN_NPOT(width, mt->align_w) / bw;
   } else {
 y += img_height;
   }
@@ -582,12 +583,14 @@ static void
 brw_miptree_layout_texture_3d(struct brw_context *brw,
   struct intel_mipmap_tree *mt)
 {
-   unsigned yscale = mt->compressed ? 4 : 1;
-
mt->total_width = 0;
mt->total_height = 0;
 
unsigned ysum = 0;
+   unsigned bh, bw;
+
+   _mesa_get_format_block_size(mt->format, &bw, &bh);
+
for (unsigned level = mt->first_level; level <= mt->last_level; level++) {
   unsigned WL = MAX2(mt->physical_width0 >> level, 1);
   unsigned HL = MAX2(mt->physical_height0 >> level, 1);
@@ -604,9 +607,9 @@ brw_miptree_layout_texture_3d(struct brw_context *brw,
  unsigned x = (q % (1 << level)) * wL;
  unsigned y = ysum + (q >> level) * hL;
 
- intel_miptree_set_image_offset(mt, level, q, x, y / yscale);
- mt->total_width = MAX2(mt->total_width, x + wL);
- mt->total_height = MAX2(mt->total_height, (y + hL) / yscale);
+ intel_miptree_set_image_offset(mt, level, q, x / bw, y / bh);
+ mt->total_width = MAX2(mt->total_width, (x + wL) / bw);
+ mt->total_height = MAX2(mt->total_height, (y + hL) / bh);
   }
 
   ysum += ALIGN(DL, 1 << level) / (1 << level) * hL;
diff --git a/src/mesa/drivers/dri/i965/intel_copy_image.c 
b/src/mesa/drivers/dri/i965/intel_copy_image.c
index f4c7eff..93a64b5 100644
--- a/src/mesa/drivers/dri/i965/intel_copy_image.c
+++ b/src/mesa/drivers/dri/i965/intel_copy_image.c
@@ -41,7 +41,6 @@ copy_image_with_blitter(struct brw_context *brw,
 {
GLuint bw, bh;
uint32_t src_image_x, src_image_y, dst_image_x, dst_image_y;
-   int cpp;
 
/* The blitter doesn't understand multisampling at all. */
if (src_mt->num_samples > 0 || dst_mt->num_samples > 0)
@@ -86,16 +85,6 @@ copy_image_with_blitter(struct brw_context *brw,
   src_y /= (int)bh;
   src_width /= (int)bw;
   src_height /= (int)bh;
-
-  /* Inside of the miptree, the x offsets are stored in pixels while
-   * the y offsets are stored in blocks.  We need to scale just the x
-   * offset.
-   */
-  src_image_x /= bw;
-
-  cpp = _mesa_get_format_bytes(src_mt->format);
-   } else {
-  cpp = src_mt->cpp;
}
src_x += src_image_x;
src_y += src_image_y;
@@ -111,18 +100,12 @@ copy_image_with_blitter(struct brw_context *brw,
 
   dst_x /= (int)bw;
   dst_y /= (int)bh;
-
-  /* Inside of the miptree, the x offsets are stored in pixels while
-   * the y offsets are stored in blocks.  We need to scale just the x
-   * offset.
-   */
-  dst_image_x /= bw;
}
dst_x += dst_image_x;
dst_y += dst_image_y;
 
return intelEmitCopyBlit(brw,
-cpp,
+src_mt->cpp,
 src_mt->pitch,
 src_mt->bo, src_mt->offset,
 src_mt->tiling,
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipma
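
The arithmetic behind the commit message of patch 15/18 can be sketched as follows. This is a minimal example under stated assumptions, not code from the patch: blocks_across() and row_bytes() are hypothetical helpers, and the only facts taken from the message are that an ASTC block is 16 bytes and that widths are now counted in blocks.

```c
#include <assert.h>

/* Size of one ASTC block in bytes, as stated in the commit message. */
enum { ASTC_BLOCK_BYTES = 16 };

/* Number of blocks needed to cover width_px texels when each block is
 * bw texels wide. (Hypothetical helper for this sketch.) */
static unsigned blocks_across(unsigned width_px, unsigned bw)
{
   assert(bw != 0);
   return (width_px + bw - 1) / bw;
}

/* Bytes in one row of blocks, with cpp meaning "bytes per block".
 * (Hypothetical helper for this sketch.) */
static unsigned row_bytes(unsigned width_px, unsigned bw, unsigned cpp)
{
   return blocks_across(width_px, bw) * cpp;
}
```

With a 5-texel-wide block, 16 / 5 is not an integer, so a "bytes per block divided by block width" value cannot be stored exactly. Counting in blocks avoids the problem: a 20-texel row is 4 blocks, or 64 bytes.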

Re: [Mesa-dev] [PATCH v3 11/18] mesa/macros: add power-of-two assertions for alignment macros

2015-06-22 Thread Brian Paul


On 06/22/2015 05:02 PM, Nanley Chery wrote:

From: Nanley Chery 

The comments for ALIGN and ROUND_DOWN_TO both state that the alignment value
passed into the macro must be a power of two. Adding assertions verifies that
this is the case.

v2: use static inline functions instead of gcc-specific statement expressions.

Signed-off-by: Nanley Chery 
---
  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  2 +-
  src/mesa/main/macros.h   | 16 +---
  2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 59081ea..1a57784 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -134,7 +134,7 @@ fs_visitor::nir_setup_outputs(nir_shader *shader)
 : var->type->vector_elements;

if (stage == MESA_SHADER_VERTEX) {
- for (int i = 0; i < ALIGN(type_size(var->type), 4) / 4; i++) {
+ for (unsigned int i = 0; i < ALIGN(type_size(var->type), 4) / 4; i++) {
  int output = var->data.location + i;
  this->outputs[output] = offset(reg, 4 * i);
  this->output_components[output] = vector_elements;
diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h
index 0608650..4a640ad 100644
--- a/src/mesa/main/macros.h
+++ b/src/mesa/main/macros.h
@@ -684,7 +684,7 @@ minify(unsigned value, unsigned levels)
   * Note that this considers 0 a power of two.
   */
  static inline bool
-is_power_of_two(unsigned value)
+is_power_of_two(uintptr_t value)
  {
 return (value & (value - 1)) == 0;
  }
@@ -700,7 +700,12 @@ is_power_of_two(unsigned value)
   *
   * \sa ROUND_DOWN_TO()
   */
-#define ALIGN(value, alignment)  (((value) + (alignment) - 1) & ~((alignment) - 1))
+static inline uintptr_t
+ALIGN(uintptr_t value, uintptr_t alignment)
+{
+  assert(is_power_of_two(alignment));
+  return (((value) + (alignment) - 1) & ~((alignment) - 1));


Looks like more than 3-space indentation here and below.

-Brian


+}

  /**
   * Align a value down to an alignment value
@@ -713,7 +718,12 @@ is_power_of_two(unsigned value)
   *
   * \sa ALIGN()
   */
-#define ROUND_DOWN_TO(value, alignment) ((value) & ~(alignment - 1))
+static inline uintptr_t
+ROUND_DOWN_TO(uintptr_t value, uintptr_t alignment)
+{
+  assert(is_power_of_two(alignment));
+  return ((value) & ~(alignment - 1));
+}


  /** Cross product of two 3-element vectors */




Re: [Mesa-dev] [PATCH 28/46] glsl: don't lower variable indexing on non-patch tessellation inputs/outputs

2015-06-22 Thread Kenneth Graunke

On Wednesday, June 17, 2015 01:01:24 AM Marek Olšák wrote:
> From: Marek Olšák 
> 
> There is no way to lower them, because the array sizes are unknown
> at compile time.
> 
> Based on a patch from: Fabian Bieler 

I'm a bit confused by the justification given for this patch.

TCS/TES per-vertex inputs:
--

...are always fixed-size arrays of length gl_MaxPatchVertices, because:

"The length of gl_in is equal to the implementation-dependent maximum
 patch size (gl_MaxPatchVertices)."

"Similarly to the built-in inputs, each user-defined input variable has
 a value for each vertex and thus needs to be declared as arrays or
 inside input blocks declared as arrays.  Declaring an array size is
 optional.  If no size is specified, it will be taken from the
 implementation-dependent maximum patch size (gl_MaxPatchVertices).
 If a size is specified, it must match the maximum patch size;
 otherwise, a link-error will occur."

This same text exists for both TCS inputs and TES inputs.  Since we
always know the array size, I don't see why we can't do lowering in
this case.

I'm pretty new to tessellation shaders, so am I missing something?

TCS per-patch inputs:
-

...don't exist AFAICT.

TES per-patch inputs:
-

...do exist, require no special handling.

TCS per-vertex outputs:
---

...are arrays whose size is known at link time, but not necessarily
compile time.

"The length of gl_out is equal to the output patch size specified in the
 tessellation control shader output layout declaration."

"A tessellation control shader may also declare user-defined per-vertex
 output variables. User-defined per-vertex output variables are declared
 with the qualifier out and have a value for each vertex in the output
 patch. Such variables must be declared as arrays or inside output blocks
 declared as arrays. Declaring an array size is optional. If no size is
 specified, it will be taken from the output patch size declared in the
 shader."

Apparently, the index must also be gl_InvocationID when writing:

"While per-vertex output variables are declared as arrays indexed by
 vertex number, each tessellation control shader invocation may write only
 to those outputs corresponding to its output patch vertex. Tessellation
 control shaders must use the input variable gl_InvocationID as the
 vertex number index when writing to per-vertex output variables."

So we clearly don't want to do lowering on writes.  But for reads, it
seems like we could do lowering when the array size is known (such as
post-linking).  I'm not sure whether or not it's beneficial...

It might be nice to add a comment explaining why it makes no sense to
lower variable indexing on TCS output writes (with the above spec
citation).

TES outputs:

...require no special handling.
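
For readers unfamiliar with the lowering being discussed, here is a conceptual sketch in plain C, not GLSL IR. It only illustrates why the array length has to be known: the lowered form needs one constant-index case per element. The real pass in lower_variable_index_to_cond_assign.cpp works on IR and may structure the generated conditions differently; read_indexed() and read_lowered() are hypothetical names.

```c
#include <assert.h>

/* Direct form: the index is only known at run time. */
static float read_indexed(const float v[4], unsigned i)
{
   assert(i < 4);
   return v[i];
}

/* Lowered form: every array access uses a constant index, and the
 * run-time index only selects which assignment takes effect. Writing
 * this out requires knowing that the array has exactly 4 elements. */
static float read_lowered(const float v[4], unsigned i)
{
   float result = 0.0f;
   if (i == 0) result = v[0];
   if (i == 1) result = v[1];
   if (i == 2) result = v[2];
   if (i == 3) result = v[3];
   return result;
}
```

Both functions return the same value for every in-range index, which is the property a lowering like this has to preserve.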

> ---
>  src/glsl/ir_optimization.h   |  5 +--
>  src/glsl/lower_variable_index_to_cond_assign.cpp | 43 
> +---
>  src/glsl/test_optpass.cpp|  3 +-
>  src/mesa/drivers/dri/i965/brw_shader.cpp |  8 +++--
>  src/mesa/program/ir_to_mesa.cpp  |  2 +-
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp   |  2 +-
>  6 files changed, 42 insertions(+), 21 deletions(-)
> 
> diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
> index 688a5e1..a174c96 100644
> --- a/src/glsl/ir_optimization.h
> +++ b/src/glsl/ir_optimization.h
> @@ -114,8 +114,9 @@ bool lower_discard(exec_list *instructions);
>  void lower_discard_flow(exec_list *instructions);
>  bool lower_instructions(exec_list *instructions, unsigned what_to_lower);
>  bool lower_noise(exec_list *instructions);
> -bool lower_variable_index_to_cond_assign(exec_list *instructions,
> -bool lower_input, bool lower_output, bool lower_temp, bool lower_uniform);
> +bool lower_variable_index_to_cond_assign(gl_shader_stage stage,
> +exec_list *instructions, bool lower_input, bool lower_output,
> +bool lower_temp, bool lower_uniform);
>  bool lower_quadop_vector(exec_list *instructions, bool dont_lower_swz);
>  bool lower_const_arrays_to_uniforms(exec_list *instructions);
>  bool lower_clip_distance(gl_shader *shader);
> diff --git a/src/glsl/lower_variable_index_to_cond_assign.cpp 
> b/src/glsl/lower_variable_index_to_cond_assign.cpp
> index d878cb0..b6421f5 100644
> --- a/src/glsl/lower_variable_index_to_cond_assign.cpp
> +++ b/src/glsl/lower_variable_index_to_cond_assign.cpp
> @@ -335,12 +335,14 @@ struct switch_generator
>  
>  class variable_index_to_cond_assign_visitor : public ir_rvalue_visitor {
>  public:
> -   variable_index_to_cond_assign_visitor(bool lower_input,
> -  bool lower_output,
> -  bool lower_temp,
> -  bool lower_uniform)
> +   variable_index_to_cond_assign_visitor(gl_shader_stage stage,
> +

Re: [Mesa-dev] [PATCH 5/5] i965/gen9: Allocate YF/YS tiled buffer objects

2015-06-22 Thread Ben Widawsky

On Wed, Jun 10, 2015 at 03:30:50PM -0700, Anuj Phogat wrote:
> In case of I915_TILING_{X,Y} we need to pass the tiling format to libdrm
> using drm_intel_bo_alloc_tiled(). But in case of YF/YS tiled buffers
> libdrm need not know about the tiling format, because these buffers
> don't have hardware support to be tiled or detiled through a fenced
> region. libdrm still needs to know the buffer alignment value for its use
> in the kernel when resolving the relocation.
> 
> Using drm_intel_bo_alloc_for_render() for YF/YS tiled buffers
> satisfies both of the above conditions.
> 
> Signed-off-by: Anuj Phogat 
> Cc: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 86 
> +--
>  1 file changed, 80 insertions(+), 6 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 615cbfb..d4d9e76 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -522,6 +522,65 @@ intel_lower_compressed_format(struct brw_context *brw, 
> mesa_format format)
> }
>  }
>  
> +/* This function computes Yf/Ys tiled bo size and alignment. */

It also computes pitch for the yf/ys case

> +static uint64_t
> +intel_get_yf_ys_bo_size(struct intel_mipmap_tree *mt, unsigned *alignment)
> +{
> +   const uint32_t bpp = mt->cpp * 8;
> +   const uint32_t aspect_ratio = (bpp == 16 || bpp == 64) ? 2 : 1;
> +   uint32_t tile_width, tile_height;
> +   const uint64_t min_size = 512 * 1024;
> +   const uint64_t max_size = 64 * 1024 * 1024;

Where do min/max come from? Add a comment?

> +   uint64_t i, stride, size, aligned_y;
> +
> +   assert(mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE);
> +
> +   switch (bpp) {
> +   case 8:
> +  tile_height = 64;
> +  break;
> +   case 16:
> +   case 32:
> +  tile_height = 32;
> +  break;
> +   case 64:
> +   case 128:
> +  tile_height = 16;
> +  break;
> +   default:
> +  tile_height = 0;

make this unreachable()

> +  printf("Invalid bits per pixel in %s: bpp = %d\n",
> + __FUNCTION__, bpp);
> +   }

I think ideally you should roll this logic into intel_miptree_get_tile_masks().

> +
> +   if (mt->tr_mode == INTEL_MIPTREE_TRMODE_YS)
> +  tile_height *= 4;
> +
> +   aligned_y = ALIGN(mt->total_height, tile_height);
> +
> +   stride = mt->total_width * mt->cpp;
> +   tile_width = tile_height * mt->cpp * aspect_ratio;
> +   stride = ALIGN(stride, tile_width);
> +   size = stride * aligned_y;
> +
> +   if (mt->tr_mode == INTEL_MIPTREE_TRMODE_YF) {
> +  *alignment = 4096;
> +  size = ALIGN(size, 4096);
> +   } else {
> +  *alignment = 64 * 1024;
> +  size = ALIGN(size, 64 * 1024);
> +   }

Hmm. I think the above calculation for size is redundant since you already
aligned to tile_width and height, above. Right? assert((size % 64K) == 0);

> +
> +   if (size > max_size) {
> +  mt->tr_mode = INTEL_MIPTREE_TRMODE_NONE;
> +  return 0;
> +   } else {
> +  mt->pitch = stride;
> +  for (i = min_size; i < size; i <<= 1)
> + ;
> +  return i;

I don't understand this. Why don't you just return size? It seems incredibly
wasteful both to start at 512K and to increment by powers of 2. Did I miss
something?

Also, I don't understand max_size. I must be missing something in the spec with
the min/max values, can you point me to them?
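
To make the question concrete, the quoted loop can be isolated as below. This is just the loop from the patch wrapped in a hypothetical function, round_up_bucket(), with an added overflow guard; the reason the patch rounds this way is not stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the smallest value of the form 512 KiB * 2^n that is >= size.
 * (Hypothetical wrapper around the loop quoted from the patch.) */
static uint64_t round_up_bucket(uint64_t size)
{
   const uint64_t min_size = 512 * 1024;
   uint64_t i;

   /* Guard added for this sketch so the shift below cannot overflow. */
   assert(size <= (UINT64_C(1) << 62));

   for (i = min_size; i < size; i <<= 1)
      ;
   return i;
}
```

Under this reading, a 4 KiB request still returns 512 KiB, and a 3 MiB request returns 4 MiB, which is the overhead being questioned above.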

> +   }
> +}
>  
>  struct intel_mipmap_tree *
>  intel_miptree_create(struct brw_context *brw,
> @@ -575,12 +634,27 @@ intel_miptree_create(struct brw_context *brw,
>  
> unsigned long pitch;
> mt->etc_format = etc_format;
> -   mt->bo = drm_intel_bo_alloc_tiled(brw->bufmgr, "miptree",
> - total_width, total_height, mt->cpp,
> - &mt->tiling, &pitch,
> - (expect_accelerated_upload ?
> -  BO_ALLOC_FOR_RENDER : 0));
> -   mt->pitch = pitch;
> +
> +   if (mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE) {
> +  unsigned alignment = 0;
> +  unsigned long size;
> +  size = intel_get_yf_ys_bo_size(mt, &alignment);
> +
> +  /* intel_get_yf_ys_bo_size() might change the tr_mode. */
> +  if (size > 0 && mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE) {
> + mt->bo = drm_intel_bo_alloc_for_render(brw->bufmgr, "miptree",
> +size, alignment);
> +  }
> +   }
> +
> +   if (mt->tr_mode == INTEL_MIPTREE_TRMODE_NONE) {
> +  mt->bo = drm_intel_bo_alloc_tiled(brw->bufmgr, "miptree",
> +total_width, total_height, mt->cpp,
> +&mt->tiling, &pitch,
> +(expect_accelerated_upload ?
> + BO_ALLOC_FOR_RENDER : 0));
> +  mt->pitch = pitch;
> +   }
>  
> /* If the BO is too large to fit in the apertu

Re: [Mesa-dev] [PATCH 2/5] i965/gen9: Plugin the code for selecting YF/YS tiling on skl+

2015-06-22 Thread Anuj Phogat

On Mon, Jun 22, 2015 at 2:53 PM, Ben Widawsky  wrote:
> On Wed, Jun 10, 2015 at 03:30:47PM -0700, Anuj Phogat wrote:
>> Buffers with Yf/Ys tiling end up using meta upload / download
>> paths or the blitter for cases where they used tiled_memcpy paths
>> in case of Y tiling. This has exposed some bugs in meta path. To
>> avoid any piglit regressions on SKL this patch keeps the Yf/Ys
>> tiling disabled at the moment.
>>
>> V3: Make brw_miptree_choose_tr_mode() actually choose TRMODE. (Ben)
>> Few cosmetic changes.
>> V4: Get rid of brw_miptree_choose_tr_mode().
>> Take care of all tile resource modes {Yf, Ys, none} for all
>> generations at one place.
>>
>> Signed-off-by: Anuj Phogat 
>> Cc: Ben Widawsky 
>> ---
>>  src/mesa/drivers/dri/i965/brw_tex_layout.c | 97 
>> --
>>  1 file changed, 79 insertions(+), 18 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
>> b/src/mesa/drivers/dri/i965/brw_tex_layout.c
>> index b9ac4cf..c0ef5cc 100644
>> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
>> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
>> @@ -807,27 +807,88 @@ brw_miptree_layout(struct brw_context *brw,
>> enum intel_miptree_tiling_mode requested,
>> struct intel_mipmap_tree *mt)
>>  {
>> -   mt->tr_mode = INTEL_MIPTREE_TRMODE_NONE;
>> +   const unsigned bpp = mt->cpp * 8;
>> +   const bool is_tr_mode_yf_ys_allowed =
>> +  brw->gen >= 9 &&
>> +  !for_bo &&
>> +  !mt->compressed &&
>> +  /* Enable YF/YS tiling only for color surfaces because depth and
>> +   * stencil surfaces are not supported in blitter using fast copy
>> +   * blit and meta PBO upload, download paths. No other paths
>> +   * currently support Yf/Ys tiled surfaces.
>> +   * FIXME:  Remove this restriction once we have a tiled_memcpy()
>> +   * path to do depth/stencil data upload/download to Yf/Ys tiled
>> +   * surfaces.
>> +   */
>
> I think it's more readable to move this comment above the variable
> declaration. Up to you though. Also I think "FINISHME" is the more
> appropriate classification for this type of thing.
>
Sure.
>> +  _mesa_is_format_color_format(mt->format) &&
>> +  (requested == INTEL_MIPTREE_TILING_Y ||
>> +   requested == INTEL_MIPTREE_TILING_ANY) &&
>
> This is where my tiling flags would have helped a bit since you should be able
> to do flags & Y_TILED :P
>
Yes, I will do a follow up patch to make use of that.
>> +  (bpp && is_power_of_two(bpp)) &&
>> +  /* FIXME: To avoid piglit regressions keep the Yf/Ys tiling
>> +   * disabled at the moment.
>> +   */
>> +  false;
>
> Also, "FINISHME"
>
>>
>> -   intel_miptree_set_alignment(brw, mt);
>> -   intel_miptree_set_total_width_height(brw, mt);
>> +   /* Lower index (Yf) is the higher priority mode */
>> +   const uint32_t tr_mode[3] = {INTEL_MIPTREE_TRMODE_YF,
>> +INTEL_MIPTREE_TRMODE_YS,
>> +INTEL_MIPTREE_TRMODE_NONE};
>> +   int i = is_tr_mode_yf_ys_allowed ? 0 : ARRAY_SIZE(tr_mode) - 1;
>>
>> -   if (!mt->total_width || !mt->total_height) {
>> -  intel_miptree_release(&mt);
>> -  return;
>> -   }
>> +   while (i < ARRAY_SIZE(tr_mode)) {
>> +  if (brw->gen < 9)
>> + assert(tr_mode[i] == INTEL_MIPTREE_TRMODE_NONE);
>> +  else
>> + assert(tr_mode[i] == INTEL_MIPTREE_TRMODE_YF ||
>> +tr_mode[i] == INTEL_MIPTREE_TRMODE_YS ||
>> +tr_mode[i] == INTEL_MIPTREE_TRMODE_NONE);
>>
>> -   /* On Gen9+ the alignment values are expressed in multiples of the block
>> -* size
>> -*/
>> -   if (brw->gen >= 9) {
>> -  unsigned int i, j;
>> -  _mesa_get_format_block_size(mt->format, &i, &j);
>> -  mt->align_w /= i;
>> -  mt->align_h /= j;
>> -   }
>> +  mt->tr_mode = tr_mode[i];
>> +  intel_miptree_set_alignment(brw, mt);
>> +  intel_miptree_set_total_width_height(brw, mt);
>>
>> -   if (!for_bo)
>> -  mt->tiling = brw_miptree_choose_tiling(brw, requested, mt);
>> +  if (!mt->total_width || !mt->total_height) {
>> + intel_miptree_release(&mt);
>> + return;
>> +  }
>> +
>> +  /* On Gen9+ the alignment values are expressed in multiples of the
>> +   * block size.
>> +   */
>> +  if (brw->gen >= 9) {
>> + unsigned int i, j;
>> + _mesa_get_format_block_size(mt->format, &i, &j);
>> + mt->align_w /= i;
>> + mt->align_h /= j;
>> +  }
>
> Can we just combine this alignment calculation into
> intel_miptree_set_alignment()?
>
No. intel_miptree_set_total_width_height() called after
intel_miptree_set_alignment() needs align_w and align_h values in
pixels. We do the division later to directly use mt->align_w and
mt->align_h while setting the surface state which needs the values
in number of blocks. I have a cleanup patch moving this code to
surface state setup.

>> +
>> +

Re: [Mesa-dev] [PATCH 3/4] i965/gen9: Don't use encrypted MOCS

2015-06-22 Thread Ben Widawsky

On Thu, Jun 18, 2015 at 03:41:50PM -0700, Kenneth Graunke wrote:
> On Wednesday, June 17, 2015 03:50:13 PM Ben Widawsky wrote:
> > On gen9+ MOCS is an index into a table. It is 7 bits, and AFAICT, bit 0 is
> > for doing encrypted reads.
> > 
> > I don't recall how I decided to do this for BXT. I don't know whether this
> > patch was ever needed, since it seems nothing is broken today on SKL.
> > Furthermore, this patch may no longer be needed because of the ongoing
> > changes with MOCS setup. It is what is being used/tested, so it's included
> > in the series.
> > 
> > The chosen values are the old values left shifted. That was also an
> > arbitrary choice.
> > 
> > Cc:  Francisco Jerez 
> > Signed-off-by: Ben Widawsky 
> > ---
> >  src/mesa/drivers/dri/i965/brw_defines.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> > b/src/mesa/drivers/dri/i965/brw_defines.h
> > index bfcc442..5358edc 100644
> > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > @@ -2495,8 +2495,8 @@ enum brw_wm_barycentric_interp_mode {
> >   * cache settings.  We still use only either write-back or write-through; and
> >   * rely on the documented default values.
> >   */
> > -#define SKL_MOCS_WB 9
> > -#define SKL_MOCS_WT 5
> > +#define SKL_MOCS_WB 0x12
> > +#define SKL_MOCS_WT 0xa
> 
> 
> Yeah, it looks like Kristian made these defines the indices into the
> table, but may have missed that the MOCS field puts that table index in
> [6:1] and bit 0 is something else.
> 
> So shifting left by 1 seems like a good plan.  Perhaps write it as
> 
> #define SKL_MOCS_WB (0b000101 << 1)
> #define SKL_MOCS_WT (0b001001 << 1)
> 

You meant this, right (you reversed it, I think)?
#define SKL_MOCS_WB (0b001001 << 1)
#define SKL_MOCS_WT (0b000101 << 1)


> so the index value is written like it is in the documentation, and the
> shift 1 indicates moving it into the right place for MOCS?
> 
> Either way,
> Reviewed-by: Kenneth Graunke 
> 
> Incidentally...the WT value (index 5) appears to skip eLLC - the target
> cache is 01b = "LLC only".  That doesn't seem desirable.  We probably
> want index 6 instead (0b000110 << 1) which uses both LLC and eLLC.
> 
> That said, we shouldn't ever be using WT in the driver - we want to use
> the PTE value.  (krh even added a FINISHME comment to that effect.)
> 
> I think a proper value for that would be:
> #define SKL_MOCS_PTE (0b10 << 1)
> (Default: 0b10,
>  LeCC = 0x00 - use cacheability controls from page table / ...
>  TC = LLC/eLLC allowed)
> 
> We could either fix the _WT define or just delete it.
> 
> >  
> >  #define MEDIA_VFE_STATE 0x7000
> >  /* GEN7 DW2, GEN8+ DW3 */
> > 

I'll get on this too. Thanks.
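
For reference, the encoding discussed in this thread can be sketched in C.
This is only an illustration: mocs_from_index() is a hypothetical helper that
does not appear in the patch, and the index values 9 (WB) and 5 (WT) are the
ones quoted above. The sketch assumes, as stated in the review, that the
table index occupies bits [6:1] of the MOCS field and that bit 0 is left
clear.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patch): move a six-bit table index
 * into MOCS bits [6:1], leaving bit 0 clear. */
static inline uint32_t mocs_from_index(uint32_t index)
{
   assert(index < 64);   /* bits [6:1] hold a six-bit index */
   return index << 1;
}
```

With the indices from the thread, mocs_from_index(9) gives 0x12 and
mocs_from_index(5) gives 0xa, which match the new SKL_MOCS_WB and
SKL_MOCS_WT values in the diff.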
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 00/16] i965: Finish removing brw_context from the compiler

2015-06-22 Thread Jason Ekstrand

I started working on this project some time ago to remove brw_context from
the backend compiler.  I got a bunch of refactoring done but eventually got
stuck on shader_time and some debug logging stuff.  I've finally gotten
around to finishing it and here it is.

Jason Ekstrand (15):
  i965: Replace some instances of brw->gen with devinfo->gen
  i965: Plumb compiler debug logging through a function pointer in
    brw_compiler
  i965: Remove the dependance on brw_context from the generators
  i965: Move INTEL_DEBUG variable parsing to screen creation time
  i965/fs: Make no16 non-variadic
  i965/fs: Do the no16 perf logging directly in fs_visitor::no16()
  i965/fs: Plumb compiler debug logging through brw_compiler
  i965: Add compiler options to brw_compiler
  i965: Use a single index per shader for shader_time.
  i965: Pull calls to get_shader_time_index out of the visitor
  i965/fs: Add a do_rep_send flag to run_fs
  i965/vs: Pass the current set of clip planes through run() and
    run_vs()
  i965/vec4: Turn some _mesa_problem calls into asserts
  i965/vec4_vs: Add an explicit use_legacy_snorm_formula flag
  i965: Remove the brw_context from the visitors

Kenneth Graunke (1):
  mesa: Add a va_args variant of _mesa_gl_debug().

 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp|   3 +-
 src/mesa/drivers/dri/i965/brw_context.c|  54 ++---
 src/mesa/drivers/dri/i965/brw_context.h|  15 +--
 src/mesa/drivers/dri/i965/brw_cs.cpp   |  17 ++-
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 127 -
 src/mesa/drivers/dri/i965/brw_fs.h |  28 +++--
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  21 ++--
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |   1 -
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   |  30 ++---
 src/mesa/drivers/dri/i965/brw_program.c|  67 ---
 src/mesa/drivers/dri/i965/brw_shader.cpp   | 100 +++-
 src/mesa/drivers/dri/i965/brw_shader.h |  13 ++-
 src/mesa/drivers/dri/i965/brw_vec4.cpp |  49 
 src/mesa/drivers/dri/i965/brw_vec4.h   |  23 ++--
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   |  22 ++--
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp  |  32 --
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h|   5 +-
 .../drivers/dri/i965/brw_vec4_reg_allocate.cpp |   1 -
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  16 +--
 src/mesa/drivers/dri/i965/brw_vec4_vp.cpp  |   9 +-
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp  |  16 +--
 src/mesa/drivers/dri/i965/brw_vs.h |   8 +-
 src/mesa/drivers/dri/i965/gen6_gs_visitor.h|   7 +-
 src/mesa/drivers/dri/i965/intel_debug.c|  13 +--
 src/mesa/drivers/dri/i965/intel_debug.h|   4 +-
 src/mesa/drivers/dri/i965/intel_screen.c   |   3 +
 src/mesa/main/errors.c |  29 +++--
 src/mesa/main/errors.h |   9 ++
 28 files changed, 379 insertions(+), 343 deletions(-)

-- 
2.4.3


[Mesa-dev] [PATCH 02/16] mesa: Add a va_args variant of _mesa_gl_debug().

2015-06-22 Thread Jason Ekstrand

From: Kenneth Graunke 

This will be useful for wrapper functions.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/main/errors.c | 29 +
 src/mesa/main/errors.h |  9 +
 2 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
index 16f10dd..b340666 100644
--- a/src/mesa/main/errors.c
+++ b/src/mesa/main/errors.c
@@ -1413,6 +1413,26 @@ should_output(struct gl_context *ctx, GLenum error, 
const char *fmtString)
 
 
 void
+_mesa_gl_vdebug(struct gl_context *ctx,
+GLuint *id,
+enum mesa_debug_source source,
+enum mesa_debug_type type,
+enum mesa_debug_severity severity,
+const char *fmtString,
+va_list args)
+{
+   char s[MAX_DEBUG_MESSAGE_LENGTH];
+   int len;
+
+   debug_get_id(id);
+
+   len = _mesa_vsnprintf(s, MAX_DEBUG_MESSAGE_LENGTH, fmtString, args);
+
+   log_msg(ctx, source, type, *id, severity, len, s);
+}
+
+
+void
 _mesa_gl_debug(struct gl_context *ctx,
GLuint *id,
enum mesa_debug_source source,
@@ -1420,17 +1440,10 @@ _mesa_gl_debug(struct gl_context *ctx,
enum mesa_debug_severity severity,
const char *fmtString, ...)
 {
-   char s[MAX_DEBUG_MESSAGE_LENGTH];
-   int len;
va_list args;
-
-   debug_get_id(id);
-
va_start(args, fmtString);
-   len = _mesa_vsnprintf(s, MAX_DEBUG_MESSAGE_LENGTH, fmtString, args);
+   _mesa_gl_vdebug(ctx, id, source, type, severity, fmtString, args);
va_end(args);
-
-   log_msg(ctx, source, type, *id, severity, len, s);
 }
 
 
diff --git a/src/mesa/main/errors.h b/src/mesa/main/errors.h
index e6dc9b5..24f234f 100644
--- a/src/mesa/main/errors.h
+++ b/src/mesa/main/errors.h
@@ -76,6 +76,15 @@ extern FILE *
 _mesa_get_log_file(void);
 
 extern void
+_mesa_gl_vdebug(struct gl_context *ctx,
+GLuint *id,
+enum mesa_debug_source source,
+enum mesa_debug_type type,
+enum mesa_debug_severity severity,
+const char *fmtString,
+va_list args);
+
+extern void
 _mesa_gl_debug(struct gl_context *ctx,
GLuint *id,
enum mesa_debug_source source,
-- 
2.4.3
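
As an illustration of why a va_list variant helps wrapper functions, here is
a minimal standalone sketch of the same split. None of these names come from
Mesa: debug_vlog(), debug_log(), compiler_debug_log() and last_msg are
hypothetical stand-ins, and the message is only formatted into a static
buffer rather than sent to a real debug-output path.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MAX_MSG 128

/* Hypothetical sink: holds the most recently formatted message. */
static char last_msg[MAX_MSG];

/* Core routine taking a va_list, analogous in shape to a "v" variant. */
static int debug_vlog(const char *fmt, va_list args)
{
   return vsnprintf(last_msg, sizeof(last_msg), fmt, args);
}

/* Variadic front end: only starts and ends the argument list. */
static int debug_log(const char *fmt, ...)
{
   va_list args;
   va_start(args, fmt);
   int len = debug_vlog(fmt, args);
   va_end(args);
   return len;
}

/* A wrapper with its own "..." cannot forward it to debug_log(), but it can
 * hand its va_list to debug_vlog(). */
static int compiler_debug_log(const char *fmt, ...)
{
   va_list args;
   va_start(args, fmt);
   int len = debug_vlog(fmt, args);
   va_end(args);
   return len;
}
```

A call such as compiler_debug_log("%s took %d cycles", "VS", 42) formats the
message through the shared va_list routine, which is the kind of forwarding
a variadic-only entry point cannot offer.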


[Mesa-dev] [PATCH 04/16] i965: Remove the dependance on brw_context from the generators

2015-06-22 Thread Jason Ekstrand

---
 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp   | 2 +-
 src/mesa/drivers/dri/i965/brw_cs.cpp  | 2 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 2 +-
 src/mesa/drivers/dri/i965/brw_fs.h| 4 +++-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp| 5 +++--
 src/mesa/drivers/dri/i965/brw_vec4.cpp| 4 ++--
 src/mesa/drivers/dri/i965/brw_vec4.h  | 4 +++-
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 3 ++-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 2 +-
 9 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
index 9c04137..789520c 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
@@ -29,7 +29,7 @@
 brw_blorp_eu_emitter::brw_blorp_eu_emitter(struct brw_context *brw,
bool debug_flag)
: mem_ctx(ralloc_context(NULL)),
- generator(brw->intelScreen->compiler,
+ generator(brw->intelScreen->compiler, brw,
mem_ctx, (void *) rzalloc(mem_ctx, struct brw_wm_prog_key),
(struct brw_stage_prog_data *) rzalloc(mem_ctx, struct 
brw_wm_prog_data),
NULL, 0, false, "BLORP")
diff --git a/src/mesa/drivers/dri/i965/brw_cs.cpp 
b/src/mesa/drivers/dri/i965/brw_cs.cpp
index f93ca2f..0833404 100644
--- a/src/mesa/drivers/dri/i965/brw_cs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cs.cpp
@@ -128,7 +128,7 @@ brw_cs_emit(struct brw_context *brw,
   return NULL;
}
 
-   fs_generator g(brw->intelScreen->compiler,
+   fs_generator g(brw->intelScreen->compiler, brw,
   mem_ctx, (void*) key, &prog_data->base, &cp->Base,
   v8.promoted_constants, v8.runtime_check_aads_emit, "CS");
if (INTEL_DEBUG & DEBUG_CS) {
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 2b892f0..615c2f1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4069,7 +4069,7 @@ brw_wm_fs_emit(struct brw_context *brw,
   prog_data->no_8 = false;
}
 
-   fs_generator g(brw->intelScreen->compiler,
+   fs_generator g(brw->intelScreen->compiler, brw,
   mem_ctx, (void *) key, &prog_data->base,
   &fp->Base, v.promoted_constants, v.runtime_check_aads_emit, 
"FS");
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 7414b65..1d52ff0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -398,7 +398,7 @@ public:
 class fs_generator
 {
 public:
-   fs_generator(const struct brw_compiler *compiler,
+   fs_generator(const struct brw_compiler *compiler, void *log_data,
 void *mem_ctx,
 const void *key,
 struct brw_stage_prog_data *prog_data,
@@ -494,6 +494,8 @@ private:
bool patch_discard_jumps_to_fb_writes();
 
const struct brw_compiler *compiler;
+   void *log_data; /* Passed to compiler->*_log functions */
+
const struct brw_device_info *devinfo;
 
struct brw_codegen *p;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index d98a40d..2ed0bac 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -121,7 +121,7 @@ brw_reg_from_fs_reg(fs_reg *reg)
return brw_reg;
 }
 
-fs_generator::fs_generator(const struct brw_compiler *compiler,
+fs_generator::fs_generator(const struct brw_compiler *compiler, void *log_data,
void *mem_ctx,
const void *key,
struct brw_stage_prog_data *prog_data,
@@ -130,7 +130,8 @@ fs_generator::fs_generator(const struct brw_compiler 
*compiler,
bool runtime_check_aads_emit,
const char *stage_abbrev)
 
-   : compiler(compiler), devinfo(compiler->devinfo), key(key),
+   : compiler(compiler), log_data(log_data),
+ devinfo(compiler->devinfo), key(key),
  prog_data(prog_data),
  prog(prog), promoted_constants(promoted_constants),
  runtime_check_aads_emit(runtime_check_aads_emit), debug_flag(false),
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 5e549c4..572bc17 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1910,7 +1910,7 @@ brw_vs_emit(struct brw_context *brw,
  return NULL;
   }
 
-  fs_generator g(brw->intelScreen->compiler,
+  fs_generator g(brw->intelScreen->compiler, brw,
  mem_ctx, (void *) &c->key, &prog_data->base.base,
  &c->vp->program.Base, v.promoted_constants,
  v.runtime_check_aads_emit, "VS");
@@ -1948,7 +1948,7 @@ brw_vs_emit(struct brw_

Re: [Mesa-dev] [PATCH 4/5] i965/gen9: Add XY_FAST_COPY_BLT support to intelEmitCopyBlit()

2015-06-22 Thread Ben Widawsky

On Fri, Jun 19, 2015 at 02:41:50PM -0700, Anuj Phogat wrote:
> On Wed, Jun 10, 2015 at 3:34 PM, Anuj Phogat  wrote:
> > This patch enables using XY_FAST_COPY_BLT only for Yf/Ys tiled buffers.
> > It can be later turned on for other tiling patterns (X,Y) too.
> >
> > V3: Flush in between sequential fast copy blits.
> > Fix src/dst alignment requirements.
> > Make can_fast_copy_blit() helper.
> > Use ffs(), is_power_of_two()
> > Move overlap computation inside intel_miptree_blit().
> >
> > V4: Use _mesa_regions_overlap() function.
> > Simplify horizontal and vertical alignment computations.
> >
> > Signed-off-by: Anuj Phogat 
> > Cc: Ben Widawsky 
> > ---
> >  src/mesa/drivers/dri/i965/intel_blit.c   | 295 
> > ++-
> >  src/mesa/drivers/dri/i965/intel_blit.h   |   2 +
> >  src/mesa/drivers/dri/i965/intel_copy_image.c |   2 +
> >  src/mesa/drivers/dri/i965/intel_reg.h|  16 ++
> >  4 files changed, 268 insertions(+), 47 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/intel_blit.c 
> > b/src/mesa/drivers/dri/i965/intel_blit.c
> > index 5afc771..800ed7e 100644
> > --- a/src/mesa/drivers/dri/i965/intel_blit.c
> > +++ b/src/mesa/drivers/dri/i965/intel_blit.c
> > @@ -27,6 +27,7 @@
> >
> >
> >  #include "main/mtypes.h"
> > +#include "main/blit.h"
> >  #include "main/context.h"
> >  #include "main/enums.h"
> >  #include "main/colormac.h"
> > @@ -43,6 +44,23 @@
> >
> >  #define FILE_DEBUG_FLAG DEBUG_BLIT
> >
> > +#define SET_TILING_XY_FAST_COPY_BLT(tiling, tr_mode, type)   \
> > +({   \
> > +   switch (tiling) { \
> > +   case I915_TILING_X:   \
> > +  CMD |= type ## _TILED_X;   \
> > +  break; \
> > +   case I915_TILING_Y:   \
> > +  if (tr_mode == INTEL_MIPTREE_TRMODE_YS)\
> > + CMD |= type ## _TILED_64K;  \
> > +  else   \
> > + CMD |= type ## _TILED_Y;\
> > +  break; \
> > +   default:  \
> > +  unreachable("not reached");\
> > +   } \
> > +})
> > +
> >  static void
> >  intel_miptree_set_alpha_to_one(struct brw_context *brw,
> > struct intel_mipmap_tree *mt,
> > @@ -75,6 +93,10 @@ static uint32_t
> >  br13_for_cpp(int cpp)
> >  {
> > switch (cpp) {
> > +   case 16:
> > +  return BR13_32323232;
> > +   case 8:
> > +  return BR13_16161616;
> > case 4:
> >return BR13_;
> >break;
> > @@ -89,6 +111,66 @@ br13_for_cpp(int cpp)
> > }
> >  }
> >
> > +static uint32_t
> > +get_tr_horizontal_align(uint32_t tr_mode, uint32_t cpp, bool is_src) {
> > +   /* Alignment tables for YF/YS tiled surfaces. */
> > +   const uint32_t align_2d_yf[] = {64, 64, 32, 32, 16};
> > +   const uint32_t align_2d_ys[] = {256, 256, 128, 128, 64};

If you move the alignment stuff from the other patch series to a more generic
place, you could reuse it here. Also, as you pointed out in that other patch,
ys = 4 * yf

> > +   const uint32_t bpp = cpp * 8;
> > +   const uint32_t shift = is_src ? 17 : 10;
> > +   uint32_t align;
> > +   int i = 0;
> > +
> > +   if (tr_mode == INTEL_MIPTREE_TRMODE_NONE)
> > +  return 0;
> > +
> > +   /* Compute array index. */
> > +   assert (bpp >= 8 && bpp <= 128 && is_power_of_two(bpp));
> > +   i = ffs(bpp / 8) - 1;
> > +
> > +   align = tr_mode == INTEL_MIPTREE_TRMODE_YF ?
> > +   align_2d_yf[i] :
> > +   align_2d_ys[i];
> > +
> > +   assert(is_power_of_two(align));
> > +
> > +   /* XY_FAST_COPY_BLT doesn't support horizontal alignment of 16. */
> > +   if (align == 16)
> > +  align = 32;
> > +
> > +   return (ffs(align) - 6) << shift;
> > +}
> > +
> > +static uint32_t
> > +get_tr_vertical_align(uint32_t tr_mode, uint32_t cpp, bool is_src) {
> > +   /* Vertical alignment tables for YF/YS tiled surfaces. */
> > +   const unsigned align_2d_yf[] = {64, 32, 32, 16, 16};
> > +   const unsigned align_2d_ys[] = {256, 128, 128, 64, 64};
> > +   const uint32_t bpp = cpp * 8;
> > +   const uint32_t shift = is_src ? 15 : 8;
> > +   uint32_t align;
> > +   int i = 0;
> > +
> > +   if (tr_mode == INTEL_MIPTREE_TRMODE_NONE)
> > +  return 0;
> > +
> > +   /* Compute array index. */
> > +   assert (bpp >= 8 && bpp <= 128 && is_power_of_two(bpp));
> > +   i = ffs(bpp / 8) - 1;
> > +
> > +   align = tr_mode == INTEL_MIPTREE_TRMODE_YF ?
> > +   align_2d_yf[i] :
> > +
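
To make the index and shift arithmetic in the quoted
get_tr_horizontal_align() easier to follow, here is a simplified standalone
sketch. It is not the patch's code: tr_horizontal_align_bits() is a
hypothetical name, the tr_mode enum is replaced by a bool, and the YS
alignment is derived as four times the YF entry, which is what the two
quoted tables show entry for entry. ffs() is the POSIX function from
<strings.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <strings.h>   /* ffs(), POSIX */

/* Hypothetical simplification of the quoted helper. */
static uint32_t tr_horizontal_align_bits(bool is_ys, uint32_t cpp, bool is_src)
{
   /* YF horizontal alignments for 8, 16, 32, 64 and 128 bpp. */
   static const uint32_t align_2d_yf[] = {64, 64, 32, 32, 16};
   const uint32_t shift = is_src ? 17 : 10;

   /* cpp must be a power of two between 1 and 16 bytes per pixel. */
   assert(cpp >= 1 && cpp <= 16 && (cpp & (cpp - 1)) == 0);
   const int i = ffs((int)cpp) - 1;   /* 1 -> 0, 2 -> 1, ..., 16 -> 4 */

   uint32_t align = align_2d_yf[i] * (is_ys ? 4 : 1);

   /* The quoted patch bumps an alignment of 16 up to 32. */
   if (align == 16)
      align = 32;

   /* 32 -> 0, 64 -> 1, 128 -> 2, 256 -> 3, moved into the field. */
   return (uint32_t)(ffs((int)align) - 6) << shift;
}
```

For example, a YS destination with cpp = 1 has an alignment of 256, so the
encoded field is 3 << 10, while a YF source with cpp = 16 is bumped from 16
to 32 and encodes as 0.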

[Mesa-dev] [PATCH 10/16] i965: Use a single index per shader for shader_time.

2015-06-22 Thread Jason Ekstrand

Previously, each shader took 3 shader time indices which were potentially
at arbitrary points in the shader time buffer.  Now, each shader gets a
single index which refers to 3 consecutive locations in the buffer.  This
simplifies some of the logic at the cost of having a magic 3 in a few places.
---
 src/mesa/drivers/dri/i965/brw_context.h   | 14 +
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 28 --
 src/mesa/drivers/dri/i965/brw_fs.h|  3 +-
 src/mesa/drivers/dri/i965/brw_program.c   | 67 +++
 src/mesa/drivers/dri/i965/brw_vec4.cpp| 18 +++---
 src/mesa/drivers/dri/i965/brw_vec4.h  | 10 +---
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  3 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp|  8 +--
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  2 +-
 9 files changed, 53 insertions(+), 100 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index d8fcfff..a7d83f8 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -821,20 +821,10 @@ struct brw_tracked_state {
 enum shader_time_shader_type {
ST_NONE,
ST_VS,
-   ST_VS_WRITTEN,
-   ST_VS_RESET,
ST_GS,
-   ST_GS_WRITTEN,
-   ST_GS_RESET,
ST_FS8,
-   ST_FS8_WRITTEN,
-   ST_FS8_RESET,
ST_FS16,
-   ST_FS16_WRITTEN,
-   ST_FS16_RESET,
ST_CS,
-   ST_CS_WRITTEN,
-   ST_CS_RESET,
 };
 
 struct brw_vertex_buffer {
@@ -979,6 +969,8 @@ enum brw_predicate_state {
BRW_PREDICATE_STATE_USE_BIT
 };
 
+struct shader_times;
+
 /**
  * brw_context is derived from gl_context.
  */
@@ -1503,7 +1495,7 @@ struct brw_context
   const char **names;
   int *ids;
   enum shader_time_shader_type *types;
-  uint64_t *cumulative;
+  struct shader_times *cumulative;
   int num_entries;
   int max_entries;
   double report_time;
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 460120d..c1bfe86 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -578,38 +578,30 @@ fs_visitor::emit_shader_time_begin()
 void
 fs_visitor::emit_shader_time_end()
 {
-   enum shader_time_shader_type type, written_type, reset_type;
+   enum shader_time_shader_type type;
switch (stage) {
case MESA_SHADER_VERTEX:
   type = ST_VS;
-  written_type = ST_VS_WRITTEN;
-  reset_type = ST_VS_RESET;
   break;
case MESA_SHADER_GEOMETRY:
   type = ST_GS;
-  written_type = ST_GS_WRITTEN;
-  reset_type = ST_GS_RESET;
   break;
case MESA_SHADER_FRAGMENT:
   if (dispatch_width == 8) {
  type = ST_FS8;
- written_type = ST_FS8_WRITTEN;
- reset_type = ST_FS8_RESET;
   } else {
  assert(dispatch_width == 16);
  type = ST_FS16;
- written_type = ST_FS16_WRITTEN;
- reset_type = ST_FS16_RESET;
   }
   break;
case MESA_SHADER_COMPUTE:
   type = ST_CS;
-  written_type = ST_CS_WRITTEN;
-  reset_type = ST_CS_RESET;
   break;
default:
   unreachable("fs_visitor::emit_shader_time_end missing code");
}
+   int shader_time_index = brw_get_shader_time_index(brw, shader_prog, prog,
+ type);
 
/* Insert our code just before the final SEND with EOT. */
exec_node *end = this->instructions.get_tail();
@@ -639,20 +631,20 @@ fs_visitor::emit_shader_time_end()
 * trying to determine the time taken for single instructions.
 */
ibld.ADD(diff, diff, fs_reg(-2u));
-   SHADER_TIME_ADD(ibld, type, diff);
-   SHADER_TIME_ADD(ibld, written_type, fs_reg(1u));
+   SHADER_TIME_ADD(ibld, shader_time_index, 0, diff);
+   SHADER_TIME_ADD(ibld, shader_time_index, 1, fs_reg(1u));
ibld.emit(BRW_OPCODE_ELSE);
-   SHADER_TIME_ADD(ibld, reset_type, fs_reg(1u));
+   SHADER_TIME_ADD(ibld, shader_time_index, 2, fs_reg(1u));
ibld.emit(BRW_OPCODE_ENDIF);
 }
 
 void
 fs_visitor::SHADER_TIME_ADD(const fs_builder &bld,
-enum shader_time_shader_type type, fs_reg value)
+int shader_time_index, int shader_time_subindex,
+fs_reg value)
 {
-   int shader_time_index =
-  brw_get_shader_time_index(brw, shader_prog, prog, type);
-   fs_reg offset = fs_reg(shader_time_index * SHADER_TIME_STRIDE);
+   int index = shader_time_index * 3 + shader_time_subindex;
+   fs_reg offset = fs_reg(index * SHADER_TIME_STRIDE);
 
fs_reg payload;
if (dispatch_width == 8)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index cffedc0..55a9722 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -278,7 +278,8 @@ public:
void emit_shader_time_begin();
void emit_shader_time_end();
void SHADER_TIME_ADD(const brw::fs_builder &bld,
-
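
The indexing scheme described in the commit message of this patch (one index
per shader, three consecutive slots) can be sketched as a small standalone
helper. This is an illustration under stated assumptions:
shader_time_offset() and the ST_SLOT_* names are hypothetical, and the value
64 for SHADER_TIME_STRIDE is assumed here for the sake of the example. The
slot order (0 = accumulated time, 1 = written count, 2 = reset count)
follows the SHADER_TIME_ADD calls in the diff.

```c
#include <stdint.h>

/* Assumed stride in bytes between shader-time slots. */
#define SHADER_TIME_STRIDE 64

/* Hypothetical names for the three consecutive slots of one shader. */
enum shader_time_slot {
   ST_SLOT_TIME    = 0,   /* accumulated cycle count */
   ST_SLOT_WRITTEN = 1,   /* number of times a valid time was written */
   ST_SLOT_RESET   = 2,   /* number of times the timer was reset */
};

/* Byte offset into the shader-time buffer for a shader index and slot. */
static inline uint32_t shader_time_offset(int shader_time_index,
                                          enum shader_time_slot slot)
{
   return (uint32_t)(shader_time_index * 3 + (int)slot) * SHADER_TIME_STRIDE;
}
```

Naming the slots is one way to keep the "magic 3" in a single place; whether
that is worth it for the driver is a design choice this sketch does not
settle.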

[Mesa-dev] [PATCH 14/16] i965/vec4: Turn some _mesa_problem calls into asserts

2015-06-22 Thread Jason Ekstrand

---
 src/mesa/drivers/dri/i965/brw_vec4_vp.cpp | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
index 92d1085..dcbd240 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
@@ -381,8 +381,7 @@ vec4_vs_visitor::emit_program_code()
  break;
 
   default:
- _mesa_problem(ctx, "Unsupported opcode %s in vertex program\n",
-   _mesa_opcode_string(vpi->Opcode));
+ assert(!"Unsupported opcode in vertex program");
   }
 
   /* Copy the temporary back into the actual destination register. */
@@ -574,15 +573,13 @@ vec4_vs_visitor::get_vp_src_reg(const prog_src_register 
&src)
  break;
 
   default:
- _mesa_problem(ctx, "bad uniform src register file: %s\n",
-   _mesa_register_file_name((gl_register_file)src.File));
+ assert(!"Bad uniform in src register file");
  return src_reg(this, glsl_type::vec4_type);
   }
   break;
 
default:
-  _mesa_problem(ctx, "bad src register file: %s\n",
-_mesa_register_file_name((gl_register_file)src.File));
+  assert(!"Bad src register file");
   return src_reg(this, glsl_type::vec4_type);
}
 
-- 
2.4.3
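
For readers unfamiliar with the assert(!"message") idiom used in this patch,
here is a minimal standalone sketch. The enum and function names are
hypothetical and unrelated to the vec4 visitor. A string literal is never a
null pointer, so !"..." is always 0 and the assertion always fires in debug
builds, with the text showing up in the failure message. In builds with
NDEBUG defined, the assert compiles away and the fallback return is used
instead.

```c
#include <assert.h>

/* Hypothetical register files, for illustration only. */
enum reg_file { FILE_TEMP, FILE_INPUT, FILE_BAD };

static int reg_base(enum reg_file file)
{
   switch (file) {
   case FILE_TEMP:
      return 0;
   case FILE_INPUT:
      return 16;
   default:
      /* Always fails when assertions are enabled. */
      assert(!"Bad src register file");
      return -1;   /* reached only when NDEBUG is defined */
   }
}
```

The trade-off compared with a runtime warning is that release builds stay
silent on an unexpected value, so the fallback return still has to be safe.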


[Mesa-dev] [PATCH 11/16] i965: Pull calls to get_shader_time_index out of the visitor

2015-06-22 Thread Jason Ekstrand

---
 src/mesa/drivers/dri/i965/brw_cs.cpp  |  8 +++-
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 55 ---
 src/mesa/drivers/dri/i965/brw_fs.h|  7 ++-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  |  7 ++-
 src/mesa/drivers/dri/i965/brw_vec4.cpp| 25 ++-
 src/mesa/drivers/dri/i965/brw_vec4.h  |  7 ++-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 18 +---
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h   |  3 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp|  4 +-
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  5 ++-
 src/mesa/drivers/dri/i965/brw_vs.h|  3 +-
 src/mesa/drivers/dri/i965/gen6_gs_visitor.h   |  5 ++-
 12 files changed, 75 insertions(+), 72 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cs.cpp 
b/src/mesa/drivers/dri/i965/brw_cs.cpp
index 0833404..fa8b5c8 100644
--- a/src/mesa/drivers/dri/i965/brw_cs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cs.cpp
@@ -88,10 +88,14 @@ brw_cs_emit(struct brw_context *brw,
cfg_t *cfg = NULL;
const char *fail_msg = NULL;
 
+   int st_index = -1;
+   if (INTEL_DEBUG & DEBUG_SHADER_TIME)
+  st_index = brw_get_shader_time_index(brw, prog, &cp->Base, ST_CS);
+
/* Now the main event: Visit the shader IR and generate our CS IR for it.
 */
fs_visitor v8(brw, mem_ctx, MESA_SHADER_COMPUTE, key, &prog_data->base, 
prog,
- &cp->Base, 8);
+ &cp->Base, 8, st_index);
if (!v8.run_cs()) {
   fail_msg = v8.fail_msg;
} else if (local_workgroup_size <= 8 * brw->max_cs_threads) {
@@ -100,7 +104,7 @@ brw_cs_emit(struct brw_context *brw,
}
 
fs_visitor v16(brw, mem_ctx, MESA_SHADER_COMPUTE, key, &prog_data->base, 
prog,
-  &cp->Base, 16);
+  &cp->Base, 16, st_index);
if (likely(!(INTEL_DEBUG & DEBUG_NO16)) &&
!fail_msg && !v8.simd16_unsupported &&
local_workgroup_size <= 16 * brw->max_cs_threads) {
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index c1bfe86..252196a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -578,31 +578,6 @@ fs_visitor::emit_shader_time_begin()
 void
 fs_visitor::emit_shader_time_end()
 {
-   enum shader_time_shader_type type;
-   switch (stage) {
-   case MESA_SHADER_VERTEX:
-  type = ST_VS;
-  break;
-   case MESA_SHADER_GEOMETRY:
-  type = ST_GS;
-  break;
-   case MESA_SHADER_FRAGMENT:
-  if (dispatch_width == 8) {
- type = ST_FS8;
-  } else {
- assert(dispatch_width == 16);
- type = ST_FS16;
-  }
-  break;
-   case MESA_SHADER_COMPUTE:
-  type = ST_CS;
-  break;
-   default:
-  unreachable("fs_visitor::emit_shader_time_end missing code");
-   }
-   int shader_time_index = brw_get_shader_time_index(brw, shader_prog, prog,
- type);
-
/* Insert our code just before the final SEND with EOT. */
exec_node *end = this->instructions.get_tail();
assert(end && ((fs_inst *) end)->eot);
@@ -631,16 +606,16 @@ fs_visitor::emit_shader_time_end()
 * trying to determine the time taken for single instructions.
 */
ibld.ADD(diff, diff, fs_reg(-2u));
-   SHADER_TIME_ADD(ibld, shader_time_index, 0, diff);
-   SHADER_TIME_ADD(ibld, shader_time_index, 1, fs_reg(1u));
+   SHADER_TIME_ADD(ibld, 0, diff);
+   SHADER_TIME_ADD(ibld, 1, fs_reg(1u));
ibld.emit(BRW_OPCODE_ELSE);
-   SHADER_TIME_ADD(ibld, shader_time_index, 2, fs_reg(1u));
+   SHADER_TIME_ADD(ibld, 2, fs_reg(1u));
ibld.emit(BRW_OPCODE_ENDIF);
 }
 
 void
 fs_visitor::SHADER_TIME_ADD(const fs_builder &bld,
-int shader_time_index, int shader_time_subindex,
+int shader_time_subindex,
 fs_reg value)
 {
int index = shader_time_index * 3 + shader_time_subindex;
@@ -3823,7 +3798,7 @@ fs_visitor::run_vs()
assign_common_binding_table_offsets(0);
setup_vs_payload();
 
-   if (INTEL_DEBUG & DEBUG_SHADER_TIME)
+   if (shader_time_index >= 0)
   emit_shader_time_begin();
 
emit_nir_code();
@@ -3833,7 +3808,7 @@ fs_visitor::run_vs()
 
emit_urb_writes();
 
-   if (INTEL_DEBUG & DEBUG_SHADER_TIME)
+   if (shader_time_index >= 0)
   emit_shader_time_end();
 
calculate_cfg();
@@ -3871,7 +3846,7 @@ fs_visitor::run_fs()
} else if (brw->use_rep_send && dispatch_width == 16) {
   emit_repclear_shader();
} else {
-  if (INTEL_DEBUG & DEBUG_SHADER_TIME)
+  if (shader_time_index >= 0)
  emit_shader_time_begin();
 
   calculate_urb_setup();
@@ -3906,7 +3881,7 @@ fs_visitor::run_fs()
 
   emit_fb_writes();
 
-  if (INTEL_DEBUG & DEBUG_SHADER_TIME)
+  if (shader_time_index >= 0)
  emit_shader_time_end();
 
   calculate_cfg();
@@ -3950,7 +3925,7 @@ fs_visitor::run_cs()
 
setu

[Mesa-dev] [PATCH 01/16] i965: Replace some instances of brw->gen with devinfo->gen

2015-06-22 Thread Jason Ekstrand

---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++--
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 8 
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 5563c5a..ac65202 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3187,7 +3187,7 @@ fs_visitor::lower_integer_multiplication()
  fs_reg high(GRF, alloc.allocate(dispatch_width / 8),
  inst->dst.type, dispatch_width);
 
- if (brw->gen >= 7) {
+ if (devinfo->gen >= 7) {
 fs_reg src1_0_w = inst->src[1];
 fs_reg src1_1_w = inst->src[1];
 
@@ -3616,7 +3616,7 @@ fs_visitor::setup_vs_payload()
 void
 fs_visitor::setup_cs_payload()
 {
-   assert(brw->gen >= 7);
+   assert(devinfo->gen >= 7);
 
payload.num_regs = 1;
 }
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 4770838..cafe64a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1344,7 +1344,7 @@ fs_visitor::emit_interpolation_setup_gen6()
struct brw_reg g1_uw = retype(brw_vec1_grf(1, 0), BRW_REGISTER_TYPE_UW);
 
fs_builder abld = bld.annotate("compute pixel centers");
-   if (brw->gen >= 8 || dispatch_width == 8) {
+   if (devinfo->gen >= 8 || dispatch_width == 8) {
   /* The "Register Region Restrictions" page says for BDW (and newer,
* presumably):
*
@@ -1623,7 +1623,7 @@ fs_visitor::emit_single_fb_write(const fs_builder &bld,
   /* On pre-SNB, we have to interlace the color values.  LOAD_PAYLOAD
* will do this for us if we just give it a COMPR4 destination.
*/
-  if (brw->gen < 6 && exec_size == 16)
+  if (devinfo->gen < 6 && exec_size == 16)
  load->dst.reg |= BRW_MRF_COMPR4;
 
   write = ubld.emit(FS_OPCODE_FB_WRITE);
@@ -1934,7 +1934,7 @@ fs_visitor::emit_urb_writes()
 void
 fs_visitor::emit_cs_terminate()
 {
-   assert(brw->gen >= 7);
+   assert(devinfo->gen >= 7);
 
/* We are getting the thread ID from the compute shader header */
assert(stage == MESA_SHADER_COMPUTE);
@@ -1956,7 +1956,7 @@ fs_visitor::emit_cs_terminate()
 void
 fs_visitor::emit_barrier()
 {
-   assert(brw->gen >= 7);
+   assert(devinfo->gen >= 7);
 
/* We are getting the barrier ID from the compute shader header */
assert(stage == MESA_SHADER_COMPUTE);
-- 
2.4.3


[Mesa-dev] [PATCH 13/16] i965/vs: Pass the current set of clip planes through run() and run_vs()

2015-06-22 Thread Jason Ekstrand

Previously, these were pulled out of the GL context conditionally based on
whether we were running ff/ARB or a GLSL program.  Now, we just pass them
in so that the visitor doesn't have to grab them itself.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp  |  4 ++--
 src/mesa/drivers/dri/i965/brw_fs.h|  8 
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  | 11 +--
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  8 
 src/mesa/drivers/dri/i965/brw_vec4.h  |  4 ++--
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  4 ++--
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp|  4 +---
 7 files changed, 20 insertions(+), 23 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index bf04e26..23f60c2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3791,7 +3791,7 @@ fs_visitor::allocate_registers()
 }
 
 bool
-fs_visitor::run_vs()
+fs_visitor::run_vs(gl_clip_plane *clip_planes)
 {
assert(stage == MESA_SHADER_VERTEX);
 
@@ -3806,7 +3806,7 @@ fs_visitor::run_vs()
if (failed)
   return false;
 
-   emit_urb_writes();
+   emit_urb_writes(clip_planes);
 
if (shader_time_index >= 0)
   emit_shader_time_end();
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 4db5a91..e0a8984 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -84,8 +84,8 @@ public:
 
fs_reg vgrf(const glsl_type *const type);
void import_uniforms(fs_visitor *v);
-   void setup_uniform_clipplane_values();
-   void compute_clip_distance();
+   void setup_uniform_clipplane_values(gl_clip_plane *clip_planes);
+   void compute_clip_distance(gl_clip_plane *clip_planes);
 
uint32_t gather_channel(int orig_chan, uint32_t sampler);
void swizzle_result(ir_texture_opcode op, int dest_components,
@@ -104,7 +104,7 @@ public:
void DEP_RESOLVE_MOV(const brw::fs_builder &bld, int grf);
 
bool run_fs(bool do_rep_send);
-   bool run_vs();
+   bool run_vs(gl_clip_plane *clip_planes);
bool run_cs();
void optimize();
void allocate_registers();
@@ -271,7 +271,7 @@ public:
  fs_reg src0_alpha, unsigned components,
  unsigned exec_size, bool use_2nd_half = 
false);
void emit_fb_writes();
-   void emit_urb_writes();
+   void emit_urb_writes(gl_clip_plane *clip_planes);
void emit_cs_terminate();
 
void emit_barrier();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 9ce8491..395394c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1715,9 +1715,8 @@ fs_visitor::emit_fb_writes()
 }
 
 void
-fs_visitor::setup_uniform_clipplane_values()
+fs_visitor::setup_uniform_clipplane_values(gl_clip_plane *clip_planes)
 {
-   gl_clip_plane *clip_planes = brw_select_clip_planes(ctx);
const struct brw_vue_prog_key *key =
   (const struct brw_vue_prog_key *) this->key;
 
@@ -1731,7 +1730,7 @@ fs_visitor::setup_uniform_clipplane_values()
}
 }
 
-void fs_visitor::compute_clip_distance()
+void fs_visitor::compute_clip_distance(gl_clip_plane *clip_planes)
 {
struct brw_vue_prog_data *vue_prog_data =
   (struct brw_vue_prog_data *) prog_data;
@@ -1760,7 +1759,7 @@ void fs_visitor::compute_clip_distance()
if (outputs[clip_vertex].file == BAD_FILE)
   return;
 
-   setup_uniform_clipplane_values();
+   setup_uniform_clipplane_values(clip_planes);
 
const fs_builder abld = bld.annotate("user clip distances");
 
@@ -1781,7 +1780,7 @@ void fs_visitor::compute_clip_distance()
 }
 
 void
-fs_visitor::emit_urb_writes()
+fs_visitor::emit_urb_writes(gl_clip_plane *clip_planes)
 {
int slot, urb_offset, length;
struct brw_vs_prog_data *vs_prog_data =
@@ -1796,7 +1795,7 @@ fs_visitor::emit_urb_writes()
 
/* Lower legacy ff and ClipVertex clipping to clip distances */
if (key->base.userclip_active && !prog->UsesClipDistanceOut)
-  compute_clip_distance();
+  compute_clip_distance(clip_planes);
 
/* If we don't have any valid slots to write, just do a minimal urb write
 * send to terminate the shader. */
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 093802c..9c45034 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1706,7 +1706,7 @@ vec4_visitor::emit_shader_time_write(int shader_time_subindex, src_reg value)
 }
 
 bool
-vec4_visitor::run()
+vec4_visitor::run(gl_clip_plane *clip_planes)
 {
sanity_param_count = prog->Parameters->NumParameters;
 
@@ -1728,7 +1728,7 @@ vec4_visitor::run()
base_ir = NULL;
 
if (key->userclip_active && !prog->UsesClipDistanceOut)
-  setup_uniform_clipplane_values();
+  setup_uniform_clipplane_values(clip_pla

[Mesa-dev] [PATCH 09/16] i965: Add compiler options to brw_compiler

2015-06-22 Thread Jason Ekstrand

This creates the options at screen creation time and then we just copy them
into the context at context creation time.  We also move is_scalar to the
brw_compiler structure.

We also end up manually setting some values that the core would have set by
default for us.  Fortunately, there are only two non-zero shader compiler
option defaults that we aren't overriding anyway so this isn't a big deal.
---
 src/mesa/drivers/dri/i965/brw_context.c  | 46 ++
 src/mesa/drivers/dri/i965/brw_context.h  |  1 -
 src/mesa/drivers/dri/i965/brw_shader.cpp | 49 +++-
 src/mesa/drivers/dri/i965/brw_shader.h   |  3 ++
 src/mesa/drivers/dri/i965/brw_vec4.cpp   |  2 +-
 src/mesa/drivers/dri/i965/intel_screen.c |  1 +
 6 files changed, 56 insertions(+), 46 deletions(-)
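
A rough, self-contained sketch of the pattern the commit message describes:
the per-stage options are computed once on a per-screen compiler object and
each context merely copies them. All type, field, and function names below
are hypothetical stand-ins rather than the actual Mesa API; the option values
simply mirror the lines this patch removes from brw_context.c, and the debug
flag that can disable the scalar VS path is left out for brevity.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real Mesa types; illustration only. */
enum stage {
   STAGE_VERTEX,
   STAGE_GEOMETRY,
   STAGE_FRAGMENT,
   STAGE_COMPUTE,
   STAGE_COUNT
};

struct stage_options {
   unsigned max_if_depth;
   bool emit_no_indirect_temp;
   bool optimize_for_aos;
};

/* Per-screen object: owns the canonical copy of the options. */
struct compiler {
   bool scalar_vs;
   struct stage_options options[STAGE_COUNT];
};

struct screen {
   struct compiler compiler;
};

/* Per-context object: only holds a copy. */
struct context {
   struct stage_options options[STAGE_COUNT];
};

/* Runs once, at screen creation time. */
void
compiler_init(struct compiler *c, int gen)
{
   assert(c != NULL);

   c->scalar_vs = gen >= 8;

   for (int i = 0; i < STAGE_COUNT; i++) {
      c->options[i].max_if_depth = gen < 6 ? 16 : UINT_MAX;
      c->options[i].emit_no_indirect_temp = (i == STAGE_FRAGMENT);
      c->options[i].optimize_for_aos =
         (i == STAGE_VERTEX || i == STAGE_GEOMETRY);
   }

   /* The scalar vertex-shader backend needs different settings. */
   if (c->scalar_vs) {
      c->options[STAGE_VERTEX].emit_no_indirect_temp = true;
      c->options[STAGE_VERTEX].optimize_for_aos = false;
   }
}

/* Runs for every context: a plain struct copy, no per-context decisions. */
void
context_init(struct context *ctx, const struct screen *scr)
{
   assert(ctx != NULL && scr != NULL);

   for (int i = 0; i < STAGE_COUNT; i++)
      ctx->options[i] = scr->compiler.options[i];
}
```

Because the decision about the scalar backend is made once in the
initialisation function, every context created from the same screen sees
identical options, which is presumably why the per-context scalar_vs flag can
go away.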

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 327a668..33cdbd2 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -50,6 +50,7 @@
 
 #include "brw_context.h"
 #include "brw_defines.h"
+#include "brw_shader.h"
 #include "brw_draw.h"
 #include "brw_state.h"
 
@@ -68,8 +69,6 @@
 #include "tnl/t_pipeline.h"
 #include "util/ralloc.h"
 
-#include "glsl/nir/nir.h"
-
 /***
  * Mesa's Driver Functions
  ***/
@@ -558,48 +557,12 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxInputComponents = 128;
}
 
-   static const nir_shader_compiler_options nir_options = {
-  .native_integers = true,
-  /* In order to help allow for better CSE at the NIR level we tell NIR
-   * to split all ffma instructions during opt_algebraic and we then
-   * re-combine them as a later step.
-   */
-  .lower_ffma = true,
-  .lower_sub = true,
-   };
-
/* We want the GLSL compiler to emit code that uses condition codes */
for (int i = 0; i < MESA_SHADER_STAGES; i++) {
-  ctx->Const.ShaderCompilerOptions[i].MaxIfDepth = brw->gen < 6 ? 16 : UINT_MAX;
-  ctx->Const.ShaderCompilerOptions[i].EmitCondCodes = true;
-  ctx->Const.ShaderCompilerOptions[i].EmitNoNoise = true;
-  ctx->Const.ShaderCompilerOptions[i].EmitNoMainReturn = true;
-  ctx->Const.ShaderCompilerOptions[i].EmitNoIndirectInput = true;
-  ctx->Const.ShaderCompilerOptions[i].EmitNoIndirectOutput =
-(i == MESA_SHADER_FRAGMENT);
-  ctx->Const.ShaderCompilerOptions[i].EmitNoIndirectTemp =
-(i == MESA_SHADER_FRAGMENT);
-  ctx->Const.ShaderCompilerOptions[i].EmitNoIndirectUniform = false;
-  ctx->Const.ShaderCompilerOptions[i].LowerClipDistance = true;
+  ctx->Const.ShaderCompilerOptions[i] =
+ brw->intelScreen->compiler->glsl_compiler_options[i];
}
 
-   ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = true;
-   ctx->Const.ShaderCompilerOptions[MESA_SHADER_GEOMETRY].OptimizeForAOS = true;
-
-   if (brw->scalar_vs) {
-  /* If we're using the scalar backend for vertex shaders, we need to
-   * configure these accordingly.
-   */
-  ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectOutput = true;
-  ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
-  ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = false;
-
-  ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = &nir_options;
-   }
-
-   ctx->Const.ShaderCompilerOptions[MESA_SHADER_FRAGMENT].NirOptions = &nir_options;
-   ctx->Const.ShaderCompilerOptions[MESA_SHADER_COMPUTE].NirOptions = &nir_options;
-
/* ARB_viewport_array */
if (brw->gen >= 6 && ctx->API == API_OPENGL_CORE) {
   ctx->Const.MaxViewports = GEN6_NUM_VIEWPORTS;
@@ -832,9 +795,6 @@ brwCreateContext(gl_api api,
if (INTEL_DEBUG & DEBUG_AUB)
   drm_intel_bufmgr_gem_set_aub_dump(brw->bufmgr, true);
 
-   if (brw->gen >= 8 && !(INTEL_DEBUG & DEBUG_VEC4VS))
-  brw->scalar_vs = true;
-
brw_initialize_context_constants(brw);
 
ctx->Const.ResetStrategy = notify_reset
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 58119ee..d8fcfff 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1137,7 +1137,6 @@ struct brw_context
bool has_pln;
bool no_simd8;
bool use_rep_send;
-   bool scalar_vs;
 
/**
 * Some versions of Gen hardware don't do centroid interpolation correctly
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 3ac5ef1..683946b 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -84,6 +84,53 @@ brw_compiler_create(void *mem_ctx, const struct brw_device_info *devinfo)
brw_fs_alloc_reg_sets(compiler);
brw_vec4_alloc_reg_set(compiler);
 
+

[Mesa-dev] [PATCH 07/16] i965/fs: Do the no16 perf logging directly in fs_visitor::no16()

2015-06-22 Thread Jason Ekstrand

While we're at it, we'll drop the note about 10-20% performance loss.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 13 ++---
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a9d9f37..40e2c44 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -710,12 +710,7 @@ fs_visitor::no16(const char *msg)
} else {
   simd16_unsupported = true;
 
-  if (brw->perf_debug) {
- if (no16_msg)
-ralloc_strcat(&no16_msg, msg);
- else
-no16_msg = ralloc_strdup(mem_ctx, msg);
-  }
+  perf_debug("SIMD16 shader failed to compile: %s", msg);
}
 }
 
@@ -4042,14 +4037,10 @@ brw_wm_fs_emit(struct brw_context *brw,
  /* Try a SIMD16 compile */
  v2.import_uniforms(&v);
  if (!v2.run_fs()) {
-perf_debug("SIMD16 shader failed to compile, falling back to "
-   "SIMD8 at a 10-20%% performance cost: %s", v2.fail_msg);
+perf_debug("SIMD16 shader failed to compile: %s", v2.fail_msg);
  } else {
 simd16_cfg = v2.cfg;
  }
-  } else {
- perf_debug("SIMD16 shader unsupported, falling back to "
-"SIMD8 at a 10-20%% performance cost: %s", v.no16_msg);
   }
}
 
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
